
Arnold Geulincx (1624—1669)

Arnold (or Arnout) Geulincx was an early-modern Flemish philosopher who initially taught at Leuven (Louvain) University, but fled the Catholic Low Countries after being dismissed from his post in 1658. He settled at Leiden, in the Protestant North, where he worked under the patronage of the Cartesian Calvinist theologian Abraham Heidanus (1597-1678) and tried to obtain a post at Leiden University. Geulincx was never to secure a steady position in his new surroundings, and ultimately died in poverty as a victim of the 1669 Leiden plague. On the basis of Descartes’ philosophy, he developed a range of philosophical ideas that sometimes closely resemble Spinoza’s, but always have a particular flavour of their own. His contributions in the fields of logic, metaphysics and ethics have earned him a place not only in the history of Dutch Cartesianism, but in Western intellectual history at large.

As a result of accusations that he had been a Spinozist in disguise, Geulincx’ name was almost erased from history after 1720, but nineteenth-century historians rehabilitated him as a forerunner of Immanuel Kant. Nowadays, Arnold Geulincx is primarily known as a representative of seventeenth-century “occasionalism”, and as an original thinker in between Descartes and Spinoza. Despite a certain impact on his immediate Leiden pupils, such as the Dutch Cartesians Cornelis Bontekoe (c. 1644-1685) and Johannes Swartenhengst (1644-1711), on the English philosopher Richard Burthogge (1638-1705), and on a number of enlightened members of the Dutch Calvinist clergy during the last quarter of the seventeenth century, Geulincx’ most significant influence in intellectual history to date has been on the novels and plays of Samuel Beckett (1906-1989), as well as, through Beckett, on late twentieth-century French philosophy.

Table of Contents

  1. Life
  2. Logic and Method
  3. Metaphysics
  4. Ethics
  5. Anti-Aristotelianism
  6. A Philosophy of Wonder
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life

Arnold Geulincx was born in the city of Antwerp, which, despite having lost its former glory as a hub of world trade and a centre of the arts, had regained new vigour as the home of Counter-Reformation culture in the Southern Netherlands—a new spirit evidenced in the paintings of Peter-Paul Rubens, Jacob Jordaens and Anthonie van Dijck, as well as in the Baroque church of Saint Carolus Borromeus, a Jesuit monument consecrated in 1625. Geulincx’ father apparently did well as the city’s messenger to Brussels, since he bought a large house just around the corner from Saint Carolus Borromeus’ Church when Arnold was about thirteen years of age, and an adjacent one a year later. While Jan Geulincx, one of Arnold’s younger brothers, studied with Jacob Jordaens for some time, Arnold was destined for an academic career and left Antwerp for university in January 1640. In Leuven, he studied arts and philosophy at the College of the Lily, obtaining his licentiate on November 19, 1643, ranking second in a class of 159 students. After reading theology for some time, Geulincx was appointed junior professor in philosophy at Lily College in December 1646.

Not much is known about Geulincx’ early career, but it is reasonable to assume that he made a strong impression with his rhetorical skills, founded on the remarkable proficiency in Latin he had already exhibited during his Antwerp school days. In the autumn of 1649, Geulincx’ career prospects seemed secure enough for his parents to give up their life in Antwerp and join their son in Leuven, the area they originally came from. Three years later, Geulincx became senior professor and was asked to deliver a series of speeches during the end-of-the-year Saturnalia festivities. Protests against the nova philosophia at Leuven may have been prompted by Geulincx’ opening address of December 16, 1652, in which he aired Baconian ideas and outlined recommendations for changes to the university curriculum. Initially, however, this did not in any way hinder the successful continuation of his academic career.

Disgrace and downfall came only in 1658, when Geulincx was dismissed, presumably on account of his plan to marry his cousin Susanna Strickers, in breach of the rule of celibacy for university professors—marriage being a privilege that had been granted to his Lily tutor William Philippi (1600-1665) in 1630 only after mediation by the Brabant Council. Reportedly, the 1630 agreement had been made on the explicit condition that this would be the last time. A later eighteenth-century source mentions disputes with his colleagues and debts as reasons for Geulincx’ dismissal, but these have been impossible to trace. Since there is no evidence that the committee that sacked him had any problems with Geulincx personally, it may well have been the case that he simply had to choose either not to marry or to leave.

Religious considerations, however, may also have played a part. A letter of recommendation signed on May 3, 1658 by the three Leiden theologians Abraham Heidanus, Johannes Coccejus (1603-1669) and Johannes Hoornbeek (1617-1666) not only indicates that Geulincx had turned his back on the Catholic faith after he had taken in St. Augustine’s theory of grace, but also that he had initially visited Holland of his own accord in January 1658, “under the pretext of another trip to this province” (Eekhof, 1919: 19). Upon his return to Leuven, Geulincx had found that a successor had been appointed in his place. Although, as the text tells us, he had already decided to give up his position at Leuven, he had not expected the hostile reaction he was met with, since the letter specifies that “he barely escaped a life sentence.” If it is indeed the case that Geulincx confronted his colleagues in January 1658 with the embarrassing fact that he had left Leuven intending to convert to Protestantism, it is likely they treated his case with the utmost efficiency and discretion. Such intentions would have come at a very untimely moment. Amidst condemnations of Jansenism issued by the Vatican, and declarations of political freedom made by the Brabant Council, there was very little room to manoeuvre for Leuven’s university professors. They may well have been happy to explain Geulincx’ dismissal, if at all, in terms of marriage plans rather than dogmatic preferences. Rumours about Geulincx’ debts, moreover, may have had their origin in the fact that, under these circumstances, Geulincx had to leave everything behind in a hurry and flee to Leiden penniless.

In his new home town, Geulincx was to graduate in medicine on September 17, 1658, no doubt in order to be able to earn a living. He married Susanna on December 8. His ambition, however, was not to practise as a doctor but to resume his career as a professor of philosophy. After a series of appointments and dismissals, Geulincx was finally appointed as junior lecturer late in 1662, with the help of Heidanus, first in logic and then in metaphysics. He was temporarily appointed as Professor extraordinarius in 1665, but was allowed to teach ethics only in February 1667. From June to November 1669, Geulincx was again newly appointed, this time to teach rhetoric. He died in poverty in November 1669, having failed to pay any rent for the apartment he had shared with Susanna since October 1668. When Susanna died around the New Year, there was only some furniture left to compensate the couple’s creditors.

2. Logic and Method

Geulincx had nevertheless started off his Leiden career in a positive mood, with two works on the subject of logic. In the first of these, his Logica suis fundamentis restituta (Logic Restored to its Foundations, 1662), Geulincx interprets negation mainly as propositional negation, that is, as acting on a proposition as a whole rather than on its terms. The other book, Methodus inveniendi argumenta (A Method for Finding Arguments, 1663), used set-theoretical relations to demonstrate logical principles. On account of this approach, the Dutch philosopher Gabriël Nuchelmans (1922-1996) would later describe Geulincx’ logic as a “containment theory of logic”, in which relations of containment illustrate how statements are implied by other statements. Containment may thus explain logical consequence: a statement q follows from a statement p whenever the propositional content of q is contained in that of p, just as, according to Geulincx, every proposition p will entail any number of further statements implied by p. Interpreting the way in which subjects relate to predicates in terms of relations of containment as well, Geulincx considered subjects as the denumerable “parts” of conceptual “wholes”, and considered the connection between subjects and predicates to be made on the basis of the “relation in which they stand to one another within the hierarchical structure of a conceptual field” (Nuchelmans, 1988: 40).
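Rendered schematically in modern notation (the symbols are introduced here purely for illustration and are not Geulincx’ own; he stated his logic in words), the containment idea amounts roughly to the following: writing C(p) for the conceptual content of a proposition p,

\[ C(q) \subseteq C(p) \quad \Longrightarrow \quad p \models q , \]

so that a proposition entails whatever is conceptually contained in it, and a predicate attaches to a subject according to where the two stand within the same conceptual hierarchy.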

Producing a modernised summary of Geulincx’ propositional logic in the 1939 issue of Erkenntnis, the Swiss logician Karl Dürr (1888-1970) portrayed Geulincx as an early representative of symbolic logic. Geulincx presented his logical principles in a purely conceptual form and evidently depended on earlier scholastic traditions, as in his formulation of De Morgan’s laws, which reproduces the fifteenth-century account John Versor offered in his commentary on Peter of Spain. Yet, according to Dürr, Geulincx’ logic contained all the elements of a mathematical logic, complete with variables and logical constants, as well as other remarkable features, such as a Tarskian definition of truth.
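The laws in question, given here in modern propositional notation merely for the reader’s convenience (Geulincx, like his scholastic sources, formulated them verbally rather than symbolically), state that the negation of a conjunction is the disjunction of the negations, and conversely:

\[ \neg(p \wedge q) \equiv \neg p \vee \neg q , \qquad \neg(p \vee q) \equiv \neg p \wedge \neg q . \]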

To weigh the sophistication of a seventeenth-century system of logic against its medieval forerunners, or to assess its significance for the development of later formal logic is, however, a complicated matter. In a later study, Dürr compared Geulincx’ achievement with similar works in logic, such as the Port-Royal Logic (1662), and works by Johannes Clauberg (1622-1665), Leibniz and Girolamo Saccheri (1667-1733). Dürr came to the conclusion that, especially in the area of propositional logic, Geulincx’ system was richer than that of most of his contemporaries, whilst in the field of term logic, his basic rules for the formal validity of syllogisms surpassed even those of Leibniz in elegance and precision (Dürr, 1965).

As a senior professor in Leuven, Geulincx had previously shown an interest in Baconian philosophy and had proposed to revise the university’s curriculum in such a way that natural philosophy might be studied as a separate field that also included logic and mathematics, as well as forms of experimentation. It is unknown whether Geulincx developed Cartesian views in Leuven as well, as did his Leuven colleague William van Gutschoven (c. 1618-1667) and his tutor William Philippi (c. 1600-1665) at some point. However this may be, it was only in Leiden that Geulincx began to develop a Cartesian line of argument in natural philosophy, metaphysics and ethics, and expounded views on God’s causal role in nature that would later be interpreted as “occasionalist”.

3. Metaphysics

The appeal to God’s causal activity would become a central feature of both Geulincx’ metaphysics and his ethics, but the way in which he justified and explained the need for a divine administration of the activities normally attributed to “secondary causes”—that is to say, to individual persons and things—differs markedly from the arguments seen in the works of medieval Islamic “occasionalists” and Cartesian contemporaries such as Louis de la Forge, Géraud de Cordemoy and Nicolas Malebranche. Rather than developing the theological view that God exercises full power over man’s causal and epistemological functions; or questioning the metaphysically problematic notion of an exchange of accidents between substances; or, finally, dismissing the possibility that purely corporeal bodies might have a power to move either themselves or other bodies, Geulincx developed his so-called “occasionalist” position on the basis of an interpretation that grounds the idea of causality on the inner experience of active involvement (Renz & Van Ruler, 2010). What may pass for causality in the strictest sense is revealed by what human beings are familiar with, and what they experience within themselves as their own activities: the conscious awareness of “doing” things. Geulincx thus turns the Cartesian focus on human awareness, with its potential for deliberate and conscious activity, into the bedrock of a metaphysics of causal activity. With the notion of activity being linked to states of mental awareness, causality itself becomes the privilege of conscious minds, and a phenomenon for which the subject “doing” things is uniquely responsible.

At the same time, the scope of human activity is greatly reduced on the basis of such a criterion. Since the Cogito, or human consciousness, realises that there are many thoughts (cogitationes) that do not depend on the subject having them, Geulincx very early on in his Metaphysica Vera drew the conclusion that, “[t]here is a knowing and willing being distinct from me” (Geulincx, 1892: 150). It is this being, God, who arouses in us, through his manipulation of matter, the thoughts for which, not knowing how they come about, we cannot claim responsibility ourselves. On the basis of this consideration, Geulincx came to formulate the maxim that has become known as the first axiom of his philosophy: Quod nescis quomodo fiat, id non facis, in other words: “What you do not know how to do, is not your action” (Geulincx, 1892: 150).

In the Metaphysica Vera, or True Metaphysics, first published posthumously in 1691, the focus on the various causal roles of God and man gives rise to a tripartition of the discipline into an Autologia, a philosophy of the Self; a Somatologia, or a metaphysics of the World; and, finally, a Theologia, on God. To include a discussion of the physical universe in an exposition on metaphysics is something that would have been uncharacteristic of Descartes, but it is a move towards a deeper, metaphysical, understanding of nature that Geulincx shares with Spinoza. In fact, although the Metaphysica Vera is an unfinished text that was never authorized and leaves many questions unanswered, it testifies to the way in which various ontological conceptualisations in Spinozism have their antecedents in Geulincx. One of these is the distinction of causal levels into substantial and modal spheres. A significant aspect of Geulincx’ understanding of physical reality is his duplication of the world into a world of “becoming” and a world of “being”—a distinction Geulincx relates to Plato. According to this view, all individual bodies, with their states of “presence” and “absence”, belong to the world of becoming. Based on the idea that a world of mere effects cannot be all there is, Geulincx’ Platonic interpretation of the Cartesian universe introduces the notion of a Body-as-such, in which these effects find their ontological foundation. Carefully avoiding any reintroduction of the Aristotelian terminology of “substance” and “accident”, Geulincx thereby reintroduces the idea of an ontological distinction between the enduring entities of Mind and Body on the one hand, and their varying “modal”, that is, spatio-temporal manifestations on the other. Formulated in Platonic terms in Geulincx and in Aristotelian terms in Spinoza, this quasi-scholastic strategy to distinguish substantial from accidental levels of being results in a metaphysical interpretation of reality in terms of a diversity of ontological spheres – an interpretation that goes well beyond Descartes, but that we find in both Geulincx’ Metaphysica vera and Spinoza’s Principia Philosophiae Cartesianae, Short Treatise and Ethics (Van Ruler, 2009). In Geulincx, moreover, Descartes’ indistinct metaphysical categorization, in which a single universal matter occurs next to a set of countless individual minds and a single God, is transformed into a strict metaphysical dualism according to which there are only two things: God, or Mind, on the one hand, and World, or Matter, on the other. Placing human minds in God, Geulincx also prefigured Spinoza in arguing that human minds, like human bodies, are parcels of a larger field, or “modes”.

4. Ethics

Parallels with Spinozistic ways of thinking equally occur in Geulincx’ treatment of the subject of ethics. In both authors, Descartes’ natural philosophy serves as a new basis for the neo-Stoic view that morality should primarily be seen as a way of mentally dealing with inevitable patterns of causality in nature and human social life. According to Geulincx, moreover, the application of reason to all areas of experience is the practical upshot of a mental attitude focused on a “love of God”. Unlike Spinoza, Geulincx had no qualms about the idea that one is free whether or not to align oneself mentally with the necessary course of things. To put it in Geulincx’ own words: whereas one always obeys God, one has the option whether or not to obey reason – and this is what constitutes the criterion of morality.

With his focus on reason, Geulincx conforms to a general tendency within Renaissance moral philosophy. At the same time, he interprets what is reasonable in his own peculiar way, introducing a new set of four cardinal virtues, namely diligence, obedience, justice and humility, in place of the old quadriga of temperance, fortitude, justice and prudence. These virtues are all aimed at reason. Accordingly, rather than being directed towards other human beings, what Geulincx prescribes as obedience is an obedience to reason, just as humility is a mental humility in the face of reason, diligence involves a diligent attention to reason, and justice is the acceptance of a just and reasonable mean.

Reason should always be followed, but in the context of such encouragements to mental subservience, the example with which Geulincx illustrates obedience is easily misread. Even the wretched life of a slave, Geulincx argues, may be lived in freedom, as long as the slave is able to direct his will to the call of reason and to endure even “an appalling and cruel slavery” by obeying orders not because it is the will of his master, but because it is his own (Geulincx, 1986: 82; See also 1893: 23; 2006: 24). Despite its awkward way of seemingly sanctioning slavery, this argument only carries to the extreme another conception predominant in both classical and Renaissance traditions of Western moral philosophy, and most straightforwardly expressed in (neo-)Stoic sources: the notion that mental freedom does not depend on the relative force of outward circumstances, but is brought about exclusively by an inner consent to the demands of reason.

In combination with his metaphysical view on the limitations of human causal activity, such a radical endorsement of intellectualist and indifferentist arguments would seem inevitably to lead to a moral position emphasising a passive or even submissive attitude. Geulincx, however, did not preach quietism. The complete text of his Ethics was published only posthumously in 1675, presumably by Bontekoe, under the title of Gnōthi seauton, or Know Thyself, but Geulincx had already issued a Dutch version of the first of its six “Treatises” as Van de Hooft-deuchden (“On the Cardinal Virtues”) in 1664. Far from teaching resignation, the book contains an exceptionally practical list of ethical maxims and reads like a self-help manual in popular psychology rather than a moral treatise in the traditional sense of the word. What, according to Geulincx, is reasonable for a human being to do in the light of the “human condition”— a concept he may have taken over from the French moralist Pierre Charron, or from his Leuven professor in theology Libert Froidmont (1587-1653)— is to abide by seven moral guidelines, or “obligations”: to accept death, to avoid suicide, to take care of one’s health and of that of one’s species, to learn a trade, to earn a living, to relax now and again, and never to curse one’s ancestry or day of birth.

With respect to all of these guidelines, Pierre Charron’s De la Sagesse (1601; revised edition 1603) may have provided Geulincx with a model for the kinds of things a moral treatise should prescribe (De Vleeschauwer, 1974). Geulincx, however, explained his obligations on the basis of a quasi-Cartesian metaphysical groundwork that at first sight seems to undermine rather than to support them. Denying, like Spinoza, the possibility of any interaction between the body and the mind, Geulincx comes to the conclusion that the human being is only an onlooker, a “spectator” of the outside universe: “I am a mere spectator of a machine whose workings I can neither adjust nor readjust” (Geulincx, 1893: 33; 2006: 34). This would seem to make all human activity not only irrelevant, but downright impossible. Geulincx, however, argues that we should nevertheless be mindful to fulfil certain actions we know from experience God wishes us to perform. We have to search for food, for instance, in order to survive, and we should try to comply in so far as we are able with such evident commitments. Indeed, an attentiveness to the basic facts of life is what links the two aspects of what Geulincx presents as his ethics of ‘humility’. On the one hand, there is the “occasionalist” Inspection of Oneself, which tells us that we find ourselves in a situation we neither control nor really understand; on the other, there is the list of “Obligations” that mark the obvious tasks we have to fulfil, and that thus comprise a Disregard of Oneself. We should always choose what we know to be best. The only thing we should not do, according to Geulincx, is to bother about the outcomes of our wishes, all of which are ultimately up to God. Thus, in the end, it is only our intentions that matter. In a famous example, Geulincx argued that it is for God to decide whether or not one is killed by the dagger with which one pierces one’s heart. How one’s volitions are matched by activities produced in the material sphere “outside” is necessarily beyond us.

Geulincx does not speculate on the question of how far we may rely on God’s resolve. Since the way in which God links physical to mental states is unknown to us, it is unclear whether Geulincx himself expected God either to have established a permanent world order or to produce an incessant series of miracles. As German commentators argued in late nineteenth-century debates on the possible impact of Geulincx on Leibniz, the analogy of two independent but synchronised clocks that Geulincx introduced in order to explain the relation between body and mind seems to accentuate the Cartesian idea of a law-like regularity in nature. This is a position consistent with the emphasis laid on the notion of reason in Geulincx’ ethics. Yet where human volitions are in play, such as in Geulincx’ example of the dagger, or in his references to the phenomenon of paralysis, it would seem that God might have a more immediate role to play.

In the end, a solution to such metaphysical questions is not Geulincx’ primary concern in the context of ethics. As far as morality is concerned, it does not matter whether God makes a singular decision or whether he lets all physical conditions play their proper roles whenever one wishes to pierce one’s heart with a dagger. The moral point is that this should not have been one’s intention in the first place. In this sense, the example is not so much meant to elucidate a metaphysical viewpoint as it is indicative of Geulincx’ preoccupation with questions of life and death, and with the idea that the realm of the moral is defined by the mental attitude one takes with respect to preserving the condition that one finds oneself in as a conscious being. This is also the way in which to read the ethical axiom that Geulincx introduced as a counterpart to his earlier metaphysical maxim. The slogan Ubi nihil vales, ibi nihil velis, “Wherein you have no power, therein you should not will” (Geulincx 1893: 164; 2006: 178), applies not so much to any specific activities as to human existence—“the human condition”—as such.

Although it is very likely that Spinoza (whose friend Lodewijk Meyer studied with Geulincx) at least knew Geulincx by name, and although one may trace many parallels in their works, there is no convincing evidence that the two men either knew each other or knew each other’s work (Van Ruler, 2006). Likewise, it is unknown to what extent Geulincx’ moral philosophy may have inspired Spinoza. Spinoza may have been thinking of Geulincx, for instance, when, in his own Ethics, he explicitly denied that humility is a virtue, since this was the single most important of the four cardinal virtues for Geulincx. Spinoza may, on the other hand, also have wished simply to make an unequivocal statement against the traditional glorification of humility in Christian theological contexts.

Unlike Spinoza, Geulincx presented his own moral philosophy as a philosophy compatible with Christian views. His Christian philosophy was unorthodox in that it interpreted certain Christian themes in purely naturalistic ways, such as by taking the “devil” solely to stand for a mental propensity to persist in inconsiderate behaviour, but it was also paradoxical in various respects. He considered his moral philosophy to be an ethics exclusively founded on reason. Still, God’s word, or so Geulincx argued, had worked for him like a microscope: once Scripture had revealed the truth, he was thereafter able to decide questions of right and wrong without its help – in other words, purely on the basis of reason. The obvious implication of this is that pagan philosophers could never have been able to find their way in matters of moral philosophy, and this is indeed what Geulincx concluded. True spiritual redemption was open only to philosophers acquainted with what Scripture had shown to be reasonable: the idea that one has no title to one’s life and that this insight should bear fruit in an attitude of humility. Geulincx might still include Platonic, Stoic and even Aristotelian concepts, and stick to classical forms of philosophical analysis in his ethics, but he dismissed all pagan philosophies for having been developed on the basis of inappropriate motivations. All pagans had urged “for the Land of Cockaigne”; they had craved pleasure rather than searching for God (Geulincx 1893: 52-54; 1966: 116-118). The pagans, in other words, had consciously aimed at achieving happiness, when all they should have been doing was to look for what is right. The difference between these two roads, Geulincx admits, is a very subtle one: since reason and Christianity themselves lead to happiness, one has to be extremely careful to avoid the pitfall of self-centered motivations even as a Christian philosopher.

If, as Geulincx argues, one has to flee happiness in order to pursue it, it may seem tempting to try to flee happiness precisely in order to acquire it. In that case, however, Geulincx argues, happiness “will not pursue you” (Geulincx, 1893: 58; 2006: 57). In other words, while one knows happiness will result from the fulfilment of a duty, one still needs to fulfil the duty without aiming thereby at acquiring happiness, or it will not work. The notions of “Obligation” and “Law” may help to avert any psychological dilemmas here. Laws, according to Geulincx, never correspond to obvious forms of self-interest, or they would not be laws. If only we direct our mind “to refer nothing of what we do or do not do to our Happiness, but everything to our Obligation” and thus “pledge” ourselves “wholly to God” (Geulincx, 1893: 58 and 57, respectively; 2006: 57 and 56), there is no problem. Libertas will be the immediate, if paradoxical, effect of obedience; and happiness, Felicitas (or beatitude, Beatitudo, a word Geulincx uses for Felicitas only when explaining matters in the accepted scholastic terminology) will present itself automatically as the mental bonus for abiding by the way of virtue. Simply doing what God and reason demand, the wise man is able to disconnect his mind from sensory impressions, and to assent to what happens in God’s universe not according to what is most agreeable to him, but according to the way in which reason presents things as they are.

The complicated dialectics of receiving happiness in return for virtue caused Geulincx to touch upon theological questions as well. If Geulincx became a Jansenist in Leuven after having imbibed Augustine’s theory of grace, and a Calvinist later in Leiden, he must at some point have become aware that the whole idea of devising a Protestant moral philosophy was inherently problematic. Theologically speaking, there could be no question of a Jansenist or Calvinist God distributing happiness in return for our effort. Geulincx was well aware of this, and therefore attempted to deny that he ever implied that God acts in reply to our achievement: “But mark: I did not say that the Humble first love God, and are then loved in return by God. Certainly not, I did not say this, and this should suffice” (Geulincx, 1893: 64; 2006: 63). Philosophically speaking, the rewards of virtue were nevertheless exactly this: God’s love in return for our love of God and reason. In line with a wider tendency in Dutch Cartesianism, Geulincx inevitably had to argue for a strict separation between philosophy and theology in order to save the practical relevance of his moral philosophy.

Besides classical, Christian and Cartesian themes, there may also have been biographical factors involved in shaping Geulincx’ ethics. The precarious living conditions of his Leiden years in particular seem to be reflected in his preoccupation with the insecurities of life and with the possibility of suicide, both of which topics are central to his ethics. Yet such interests may also have had their origin in a special talent for the experience of wonder in Geulincx, as well as an exceptionally subtle philosophical imagination.

5. Anti-Aristotelianism

It is in foreshadowing quasi-Kantian themes that Geulincx’ philosophical discernment appears most conspicuously. Essentially a criticism of Aristotelian ways of thinking, Geulincx’ Metaphysica ad mentem peripateticam, a book that was published only posthumously in 1691, argued that there was an illusory quality to thinking, quite apart from the illusoriness of sense perception. Not only was it true that our senses, as Descartes had argued, yield a subjective view of the world; according to Geulincx, our intellectual “ways of thinking” (modi cogitandi) distort our conception of reality just as much. Indeed, with intelligible species we impose our ways of thinking on outside things, just as with sensible species we impose features on the world that do not apply to things as they are in themselves. Either way, we “always attribute the phantasms (phasmata) of sense and intellect to things themselves”—even if “there is something divine in us that always tells us it is not so” (Geulincx, 1892: 301).

Once more giving a stricter format to Cartesian intuitions than Descartes himself would have done, and prefiguring Spinoza on both counts, Geulincx distinguished four different kinds of knowledge and drew a sharp distinction between the realm of “imaginations” and the realm of “ideas”. Holding on to a classical notion of scientia that limits the notion of “idea” to the knowledge of the “essence” of a thing, Geulincx interpreted the gradual development of epistemological stages in Platonic rather than in Aristotelian terms and classified the respective levels of knowledge as (1) sense perception; (2) knowledge, or cognitio; (3) scientia, or knowledge with an account; and, finally, (4) the ultimate kind of scientia that is called sapientia or wisdom, which is available only to whoever is accountable for the thing known. Thus offering a seemingly Augustinian-inspired understanding of “ideas” as the kind of things in God’s mind that we must somehow have access to in order to intuit the essences of things, Geulincx in fact denied man any wisdom apart from the wisdom related to his own mental activities, such as our mental activities of love and hate, affirmation and negation and so forth, the reason for this being that to understand these and to will are, in the end, the only things one can actually “do”.

Wisdom accordingly presents itself in Geulincx mainly in a negative way; that is to say, in the form of a recognition that our intellectual capacities are extremely limited with respect to understanding things that occur outside the realm of consciousness. Although the mind knows that all things are either minds or bodies and that infinite mind and infinite extension (that is, God and Body) are ultimately all there is, our “modes of thinking”, in other words, our ways of apprehending reality, misrepresent things as they are in themselves by seeing them as separate “beings” that may function as the subject of predication. Yet we have to see them in this way, if we wish to say something about them.

Although there is an immediate Cartesian context as well as a Scotist terminological background to these arguments, and although, like Geulincx, authors such as Clauberg and Johannes de Raey (1622-1702) had also tried to come to terms with the indistinct manner in which Descartes had discussed general metaphysical concepts in the Principia Philosophiae (Aalderink, 2009), Geulincx’ position stands out for the way in which it emphasizes how the human intellect is liable to characterize the outside world in terms of forms of propositional content that portray whatever there is as being divided into objects possessing certain properties. As Geulincx himself remarks (Geulincx, 1892: 199), “few people seem to observe” that this logical mould introduces ontological classifications for which there is actually no basis in reality itself.

Geulincx thus came to criticise a philosophical viewpoint that had been almost universally shared since Aristotle, the idea, namely, that ontological concepts such as the concept of “substance” may function in parallel ways in metaphysics and logic. His criticism of this view (which is not merely a Peripatetic, but in fact a virtually universal human assumption) launched the epistemologically radical idea that the linguistic and logical ways in which our concepts function within our intellectual representations of the outside world, should actually be a warning against taking them seriously in metaphysical terms. According to Geulincx, logical and linguistic distinctions do not necessarily represent things as they are in themselves. Indeed, notions such as “being (ens), substance, accident, relation, subject, predicate, whole and part” only illustrate how we think about objects. As modes of thought we use these notions to express what we mean when we distinguish a thing from its activity or from our judgement of it. Our manner of understanding, however, should not be confused with the way things are structured and organised independently of our representations of them. Nor should we uncritically build philosophical systems on the categories and logical forms that help us to analyse what we experience.

Because of the way in which he gave prominence to, and ultimately dealt with, the question of the knowability of “things as they are in themselves”, Geulincx’ position has often been associated with the critical philosophy of Immanuel Kant. Ernst Cassirer (1874-1945), for instance, saw both Geulincx’ thesis of the unknowability of “things in themselves” (translated into German as Dinge an sich) and his view that all human understanding is dependent on “forms of thought” brought in by ourselves, as prefigurations of the Kantian position. Although he remained careful not to deny the differences between Geulincx and Kant, the Flemish Geulincx scholar (and former Nazi sympathiser in exile) Herman de Vleeschauwer (1899-1986) agreed in 1957 that if one defines “Criticism” as the theory according to which “we know things only by the medium of our forms of thought”, one could no longer “regard it as the personal discovery of Kant” (De Vleeschauwer, 1957: 63).

In general terms, Geulincx’ alertness to the possible incongruity between the logic of our thoughts and the structure of the outside world may indeed be compared to Kant’s. It may even be extended beyond Kant to serve as a comparison between the Flemish Cartesian’s criticisms of scholastic views and Wittgensteinian, as well as postmodern, censures of the metaphysical suggestion that logical forms reflect an ontological structure of things. At the same time, Geulincx stood closer to other seventeenth-century denunciations of Aristotelianism inspired by Descartes, such as John Locke’s. Exposing scholastic metaphysics as a logical scheme functional only within the domain of our daily interaction with macroscopic objects, Geulincx’ evaluation of Peripatetic metaphysics, although it is cast in a rather scholastic terminology itself, anticipates Locke’s view in so far as it confirms the idea that there is a purely nominal aspect to the Aristotelian manner of metaphysical categorisation.

And yet Geulincx’ Metaphysica ad mentem peripateticam creates a sense of epistemological alienation that goes far beyond Locke’s criticism of the notion of substance. If, as a direct consequence of Cartesian natural philosophy, Geulincx argued that scholastic types of analysis in metaphysics might be exposed as logico-linguistic frameworks only, this not only meant that there is a certain contingency to the “essences” derived from mere experience; it also meant that the logic of substance itself was mistaken, and that, accordingly, the search for “substantiality” was ill-conceived. Geulincx did, of course, accept the existence of a universal “Body”, but for him, this idea was not dependent on the vague conceivability of a substantial substrate to which one might attach accidental properties. For Geulincx, the notion of Body-as-such may simply be deduced from the fact that one finds many “thoughts” (cogitationes) in one’s conscious experience that do not depend on oneself. Accordingly, there is something out there, something orchestrated by God. This is the World itself, no less—but there is no sense in continuing, like Locke, to see this World as a substance with properties, or to lament the indistinctness of this “something”. With respect to substantiality, we should rather be aware that we are misled by our own intellect into searching for it in the everyday world of things. In the Principia philosophiae, Descartes himself had already argued against trying to conceive of substantial beings behind the forms of “extension” and “thought” that we find in nature. Geulincx drew the ultimate conclusion by arguing that the search for a universal “something” of which the property of being extended is an “accident”, arises from the mistaken belief that the world is structured along the lines of our “modes of thought”.

As a consequence, Geulincx does indeed come close to Kant in the sense that his emphasis on the unknowability of things is bound up with the idea that the world as it is “in itself” remains hidden from our observation and eludes our limited epistemological capacity to grasp what is actually there. Still, Geulincx’ arguments are very different from Kant’s. According to the Flemish philosopher, our intellect imposes a grid on our experience on account of which we necessarily envision the external world as a world of “things”. In doing so, our metaphysical imagination follows the linguistic and logical habit of distinguishing substantives from adjectives in language and subjects from predicates in logic. The problem with scholastic metaphysics is that it draws ontological conclusions from such cognitive ways of dealing with reality. Just as we attribute our sense impressions to the outside world even though, at least at a certain level of mental development, we become aware that such attributions are incorrect, so too should we, with respect to our intellectual understanding of things, come to doubt the way in which we attribute our cogitationes to things in themselves.

According to Geulincx, there is hardly a way to avoid this, and we cling to the idea of distinguishing beings from properties with even more tenacity than we adhere to the idea of attributing mentally experienced qualities to external things in sense experience. Posing the question of how we come to conclude that there is a real basis for distinguishing between subjects and predicates, Geulincx rejects the common scholastic ways of arguing for an actual relation of “inherence” between them. He does, however, offer an alternative ground for our habit of seeing things this way. Whenever we refer to things either as “beings” or as “properties”, it may be that we do so because of the relative stability of our various sense impressions: “The real cause (…) may be, that people see some things as more firm, stable and lasting, others as more fluid, fleeting and frail. Thus (…) light and darkness, colours and sounds and all similar things are regarded as more fluid than body or extension” (Geulincx, 1892: 305). What, in other words, modern psychology and evolutionary biology might consider to be innate propensities, Geulincx was tempted to explain on empiricist grounds. Repeatedly confirming that fluid impressions find their support in firmer ones, rather than that firm marks rest on fleeting signals, our senses will encourage our intellect to follow suit and conceptualise the world in terms of independent beings and their dependent properties.

Rather than to Locke, Kant, or Wittgenstein, Geulincx accordingly compares best to Geulincx himself. A similar interpretation of the way in which the human intellect conceptually rearranges sense experience is found only in the works of his pupil Richard Burthogge. According to Burthogge, the senses give us “external qualia, which reason interprets as predicable of substances or subjects”; a position that, in the terminology of analytical philosophy, has been interpreted as a form of “idealism” (Ayers, 2005: 195). As with so many other statements of this underestimated English philosopher, however, this particular view derives straight from Geulincx’ Metaphysica ad mentem Peripateticam.

Again coming closer to Kant than to Locke, Geulincx developed his epistemological arguments vis-à-vis Aristotelianism not so much in order to make room for a new understanding of nature, but rather in order to heighten our philosophical awareness of the fact that we are fundamentally ignorant of what the world is like independently of our experience. As with Kant, moreover, there is a certain religious susceptibility at play in Geulincx’ philosophical concerns. Exhibiting a mental predisposition coloured by Augustinianism in all of his works, Geulincx would always keep wondering at the ineffable character of God’s universe and our position in it.

6. A Philosophy of Wonder

If Geulincx hardly compares to other philosophers in the Western tradition, others did take their inspiration from Geulincx. Having developed an interest in seventeenth-century philosophy during his assistantship at the École Normale Supérieure in Paris from 1928 to 1930, the Irish poet and novelist Samuel Beckett would take up a close study of Geulincx’ works (the Metaphysica Vera and Ethics in particular) at Trinity College Dublin, in the spring of 1936. As a direct result of this interest, Arnold Geulincx was to play a crucial role in Beckett’s Murphy (finished in June 1936 and published in 1938)—a book whose leading character prefers to sit naked in his London apartment, tied to a teakwood rocking chair. Implicit references to Spinoza and explicit references to Geulincx accompany the way in which Murphy’s inner experience is detailed and further explained in later chapters.

It has been well established how Geulincxian imagery, such as that of the cradle (which Geulincx used to explain the relationship between our will and God’s, and which Beckett turned into a rocking chair), the two synchronised clocks, and the passenger walking on the deck of a ship against the vessel’s direction (an image Geulincx himself may have derived from Justus Lipsius), was continually reused by Beckett well beyond Murphy; how Geulincxian expressions, such as “coming hither, acting here, departing hence”, turn into metaphorically rich elements of literary structure in Beckett; and how Geulincx’ overall theme of power and impotency would continue to resonate in Beckett’s plays, prose and cinematographic works. If it is true that “[what] chiefly endured for Beckett from Geulincx was his acceptance of ignorance as the basic human condition, his ethic of humility and his advocacy for ascetic withdrawal and rigorous self-examination” (Herren, 2013: 195), it is also clear why Geulincx might come to function as a replacement for Descartes in Beckett, and as “a philosopher who spoke to [Beckett] as no other had” (Cordingley, 2013: 49). The contrast between Geulincx and Descartes may also serve to accentuate that there was a Geulincxian conceptual background to what twentieth-century philosophers may have derived from Beckett’s plays. On account of the element of ineffability that Geulincx added to Cartesianism, it has been argued that the notion of an “absence of self-presence”, particularly in thinking and in authorship (a theme taken up by French philosophers such as Blanchot, Foucault and Derrida), found a Geulincxian inspiration in Beckett (Uhlmann, 2006: 113).

The great difference, however, between Geulincx on the one hand and twentieth-century French philosophers inspired by Beckett’s absurdist plays on the other is that Geulincx—like Beckett himself, for that matter—had no inclination to diminish the importance of subjective experience. Indeed, it is precisely in this respect that Geulincx’ “experiential” defence of occasionalist arguments was squarely at odds with Malebranche’s alternative notion of God’s pre-ordination of human minds. Later commentators have been surprised by such disparities within occasionalist philosophy (Nadler, 1999), or have even drawn the misguided conclusion that Geulincx was an inconsistent occasionalist (Terraillon, 1912; Rousset, 1999). In fact, rather than explaining away human mental activity on the grounds of theological or determinist dogma, Geulincx not only took the inner world of consciousness as his starting point in philosophy, but also saw it as a cause for wonder at the singularity of the human condition. If things prove themselves to be ineffable, it is to the human subject that they do so. Similarly, if outside things remain inscrutable, it is only of inner experience itself that our knowledge is genuine and absolute.

Samuel Beckett is believed to have broken away from making further dogmatic use of philosophy after his post-war realisation that “All I am is feeling” (Uhlmann, 2006: 72). His interest in Geulincx, however, did not suffer from this. If a mood of estrangement, coupled with a painstaking examination of the inner life, is what Beckett found familiar in Geulincx, it is significant that Beckett never studied the quasi-Kantian arguments from the Metaphysica ad mentem peripateticam, arguably Geulincx’ most radical philosophical text. Apart from some transcriptions taken from the Metaphysica vera, Beckett took his notes mainly from Geulincx’ Ethics. A kinship of viewpoints must have been obvious to Beckett in these texts as well, which may add to our conviction that Beckett’s prolonged interest in Geulincx was based primarily on an affection that went beyond specific images or doctrines of philosophy.

There was obviously “something of a friendship across centuries” between Beckett and Geulincx (Tucker, 2012: 181), apparently motivated by the articulation of a shared experience that Beckett cherished in Geulincx, and that presumably involved a recognition of something very intimate and relatively rare, even if it had been expressed in such technical philosophical contexts as a religiously motivated metaphysics and a theory of ethics combining classical and Christian themes.

The ultimate secret to Geulincx’ appeal may be that his philosophical texts, despite their traditional setting, have a captivating strangeness to them, which is linked to the alienating topics they address. Whatever his philosophy may have done for Beckett’s artistic development, it is beyond doubt that, just as in Samuel Beckett’s case, Geulincx’ Baroque blend of Augustino-Cartesianism will continue to impress likeminded readers by its unique evocation of the timeless motif of human metaphysical ignorance, as well as by its humbling expression of amazement at the mystery of existence.

7. References and Further Reading

a. Primary Sources

  • Geulincx, Arnold, Opera philosophica, vol. 1, ed. J.P.N. Land, The Hague: Martinus Nijhoff, 1891.
    • Geulincx’ Orationes and the Logica restituta
  • Geulincx, Arnold, Opera philosophica, vol. 2, ed. J.P.N. Land, The Hague: Martinus Nijhoff, 1892.
    • The Methodus, as well as the metaphysical and physical works
  • Geulincx, Arnold, Opera philosophica, vol. 3, ed. J.P.N. Land, The Hague: Martinus Nijhoff, 1893.
    • Ethics, ethical disputations and Notes on Descartes
  • Geulincx, Arnold, Sämtliche Schriften in fünf Bänden, ed. H.J. de Vleeschauwer, Stuttgart-Bad Cannstatt: Frommann-Holzboog, 1965–1968.
    • A handy reprint (in 3 vols.) of the Opera Philosophica
  • Geulincx: Présentation, choix de textes et traduction, ed. Alain De Lattre, Philosophes de tous les temps vol. 69, Paris: Seghers, 1970.
    • A selection of Geulincx’ texts in French
  • Geulincx, Arnout, Van de hoofddeugden. De eerste tuchtverhandeling, ed. Cornelis Verhoeven, Baarn: Ambo, 1986.
    • The first part of the Ethics in a modern version of the Dutch original
  • Geulincx, Arnold, Metaphysics, ed. Martin Wilson, Wisbech: Christoffel Press, 1999.
    • First English edition of the Metaphysica vera
  • Geulincx, Arnold, Ethics, With Samuel Beckett’s Notes, ed. Han van Ruler, Anthony Uhlmann and Martin Wilson. Leiden and Boston: Brill, 2006.
    • The complete Ethics in English with a transcription of Beckett’s notes
  • Geulincx, Arnold, Éthique, ed. Hélène Bah-Ostrowiecki, Turnhout: Brepols, 2010.
    • The Ethics in a modern French edition

b. Secondary Sources

  • Aalderink, Mark, ‘Spinoza and Geulincx on the human condition, passions, and love’, Studia Spinozana vol. 15 / Wiep van Bunge (ed.), Spinoza and Dutch Cartesianism, Würzburg: Königshausen & Neumann, 2006, pp. 67-87.
    • On the Augustinian concept of love and its impact on Geulincx and Spinoza
  • Aalderink, Mark, Philosophy, Scientific Knowledge, and Concept Formation in Geulincx and Descartes, Utrecht: Zeno, 2010.
    • Published dissertation on the epistemological differences between Descartes and Geulincx
  • Armogathe, Jean-Robert, and Vincent Carraud, ‘The First Condemnation of Descartes’ Œuvres: Some Unpublished Documents from the Vatican Archives’, in: Daniel Garber and Steven Nadler (eds.), Oxford Studies in Early Modern Philosophy, vol. 1, Oxford: Clarendon, 2003, pp. 67-109.
    • Contains the only known reference to Geulincx’ marriage plans as a reason for his dismissal
  • Ayers, M.R., ‘Richard Burthogge and the Origins of Modern Conceptualism’, in: Tom Sorell and G.A.J. Rogers (eds.), Analytic Philosophy and History of Philosophy, Oxford: Clarendon, 2005, pp. 179-200.
    • On Geulincx’ most important pupil in epistemology
  • Cassirer, Ernst, Das Erkenntnisproblem in der Philosophie und Wissenschaft der neueren Zeit (Berlin, 1906-1923), ed. Dagmar Vogel in 2 vols., Hamburg: Meiner, 1999.
    • On Geulincx and Kant
  • Cooney, Brian, ‘Arnold Geulincx: A Cartesian Idealist’, Journal of the History of Philosophy, vol. 16 (1978), pp. 167-180.
    • English language introduction to Geulincx
  • Cordingley, Anthony, ‘École Normale Supérieure’, in: Anthony Uhlmann (ed.), Samuel Beckett in Context, Cambridge: Cambridge U.P., 2013, pp. 42-52.
    • On Samuel Beckett’s intellectual development during the late 1920s and early 1930s
  • Dürr, Karl, ‘Die mathematische Logik des Arnold Geulincx’, The Journal of Unified Science (Erkenntnis), vol. 8 (1939-40), pp. 361-8.
    • A translation of Geulincx’ logic in modern terms
  • Dürr, Karl, ‘Arnold Geulincx und die klassische Logik des 17. Jahrhunderts’, Studium Generale 18 (1965-8), pp. 520-541.
    • Geulincx’ logic in the context of other seventeenth-century sources in the field
  • Eekhof, A., ‘De wijsgeer Arnoldus Geulincx te Leuven en te Leiden’, in Nederlandsch Archief voor Kerkgeschiedenis, new series, vol. 15 (1919), pp. 1-24.
    • On the Letter of Recommendation of 3 May 1658
  • Herren, Graley, ‘Working on Film and Television’, in: Anthony Uhlmann (ed.), Samuel Beckett in Context, Cambridge: Cambridge U.P., 2013, pp. 192-202.
    • On Beckett’s psychology and Geulincx’ influence on his screenplays
  • Kossmann, E.F., ‘De laatste woning van Arnold Geulincx’, in Bijdragen voor Vaderlandsche Geschiedenis en Oudheidkunde 7-3, pp. 136-138.
    • On Geulincx’ last residence and debts
  • Land, J.P.N., ‘Arnold Geulincx te Leiden (1658-1669)’, in Verslagen en Mededeelingen der Koninklijke Akademie van Wetenschappen, Afdeeling Letterkunde, 3rd series, vol. 3 (1887), pp. 277-327.
    • On Geulincx’ Leuven dismissal and Leiden career
  • Land, J.P.N., ‘Aanteekeningen betreffende het leven van Arnold Geulincx’, in Verslagen en Mededeelingen der Koninklijke Akademie van Wetenschappen, Afdeeling Letterkunde, 3rd series, vol. 10 (1894), pp. 99-119.
    • On Geulincx’ life in Flanders
  • Land, J.P.N., Arnold Geulincx und seine Philosophie. The Hague: Martinus Nijhoff, 1895.
    • A Geulincx biography.
  • Lattre, A. de, L’occasionalisme d’Arnold Geulincx, Paris: Les Editions de Minuit, 1967.
    • Published dissertation on Geulincx’ philosophy
  • McCracken, J.D., Thinking and Valuing: An Introduction, Partly Historical, to the Study of the Philosophy of Value. London: Macmillan, 1950.
    • Interpretation of Descartes, Geulincx and Spinoza as a particular school of ethics
  • Monchamp, Georges, Histoire du Cartésianisme en Belgique, Bruxelles et St. Trond: F. Hayez, 1886.
    • An as yet unsurpassed history of Cartesianism in the Southern Netherlands
  • Nadler, Steven, ‘Knowledge, Volitional Agency and Causation in Malebranche and Geulincx’, British Journal for the History of Philosophy 7 (1999-2), pp. 263-274.
    • On similarities and differences between Malebranche and Geulincx
  • Nuchelmans, Gabriël, Geulincx’ Containment Theory of Logic, Amsterdam: Koninklijke Nederlandse Akademie van Wetenschappen / Noord-Hollandsche Uitgevers Maatschappij, 1988.
    • Detailed account of Geulincx’ use of set theory in logic
  • Paquot, Jean Noël, Mémoires pour servir à l’histoire littéraire des dix-sept provinces des Pays-Bas, de la principauté de Liège, et de quelques contrées voisines, vol. 13, Louvain: De l’imprimerie académique, 1768.
    • Reference to Geulincx’ presumed Leuven quarrels and debts
  • Pfleiderer, Edmund, Leibniz und Geulincx: Mit besonderer Beziehung auf ihr beiderseitiges Uhrengleichniss, Tübingen: Tübinger Universitäts-Schriften, 1884.
    • Start of the controversy on the image of the synchronised clocks in Geulincx and Leibniz
  • Renz, Ursula, and Han van Ruler, ‘Okkasionalismus’, in: Hans Jörg Sandkühler (ed.), Enzyklopädie Philosophie, Hamburg: Felix Meiner, 2010, vol. 2, pp. 1843-1846.
    • On the diversity of occasionalisms
  • Rousset, Bernard, Geulincx entre Descartes et Spinoza, Paris: Vrin, 1999.
    • Posthumously published monograph on Geulincx
  • Ruler, Han van, ‘“Something, I know not what.” The Concept of Substance in Early Modern Thought’, in Lodi Nauta and Arjo Vanderjagt (eds.), Between Imagination and Demonstration. Essays in the History of Science and Philosophy Presented to John D. North, Leiden: Brill, 1999, pp. 365-93.
    • On Geulincx, Locke and the notion of individuality in scholastic and Cartesian thought
  • Ruler, Han van, ‘Geulincx, Arnold (1624-1669)’, in Wiep van Bunge, Henri Krop, Bart Leeuwenburgh, Han van Ruler, Paul Schuurman and Michiel Wielema (eds.), The Dictionary of Seventeenth and Eighteenth-Century Dutch Philosophers, in 2 vols, Bristol: Thoemmes, 2003, vol. 1, pp. 322-331.
    • Extended dictionary entry on Geulincx and his works
  • Ruler, Han van, ‘Geulincx and Spinoza: Books, Backgrounds and Biographies’, in Studia Spinozana 15 / Wiep van Bunge (ed.), Spinoza and Dutch Cartesianism. Würzburg: Königshausen & Neumann, 2006, pp. 89-106.
    • On whether Geulincx and Spinoza knew each other or each other’s work
  • Ruler, Han van, ‘Spinozas doppelter Dualismus’, transl. Andreas Fliedner, in: Deutsche Zeitschrift für Philosophie 57 (2009-3), pp. 399-417.
    • On parallel forms of dualism in Geulincx and Spinoza
  • Terraillon, Eugène, La morale de Geulincx dans ses rapports avec la philosophie de Descartes, Paris: Alcan, 1912.
    • Short work on Geulincx’ occasionalism
  • Thijssen-Schoute, C. Louise, Nederlands Cartesianisme, Amsterdam: Noord-Hollandsche Uitgevers Maatschappij, 1954; new ed. by Theo Verbeek, Utrecht: HES, 1989.
    • Source book on Dutch Cartesianism
  • Tucker, David, Samuel Beckett and Arnold Geulincx: Tracing ‘a literary fantasia’, London: Continuum, 2012.
    • Detailed study and interpretation of all of Beckett’s references to Geulincx
  • Uhlmann, Anthony, Samuel Beckett and the Philosophical Image, Cambridge: Cambridge U.P., 2006.
    • On Beckett’s use of philosophical themes and their literary and philosophical impact
  • Uhlmann, Anthony, Chris Conti and Andrea Curr (eds.), Arnold Geulincx Resource Site, funded by the Australia Research Council: www.geulincx.org
    • A website dedicated to Geulincx research by The Beckett and Geulincx Research Project
  • Uhlmann, Anthony (ed.), Samuel Beckett in Context, Cambridge: Cambridge U.P., 2013.
    • A volume of articles on Beckett’s intellectual biography
  • Vander Haeghen, Victor, Geulincx. Étude sur sa vie, sa philosophie et ses ouvrages, Diss. Liège, Gent: Vanderhaeghen, 1886.
    • Complete intellectual biography
  • Vanpaemel, Geert, Echo’s van een wetenschappelijke revolutie. De mechanistische natuurwetenschap aan de Leuvense Artesfaculteit, Brussel: KAWLSK, 1986.
    • On Leuven University’s curriculum and Geulincx’ proposals for change
  • Verbeek, Theo, ‘Geulincx, Arnold (1624-69)’, in Edward Craig (ed.), Routledge Encyclopedia of Philosophy, vol. 4, London: Routledge, 1998, pp. 59-61.
    • Concise account of Geulincx’ philosophy and its relation to Descartes
  • Vleeschauwer, Herman J. de, Three Centuries of Geulincx Research, Mededelings van die Universiteit van Suid-Afrika / Communications of the University of South Africa, Pretoria 1957.
    • Bibliographical outline of Geulincx-interpretations
  • Vleeschauwer, Herman J. de, ‘Ha Arnold Geulincx letto il « De la Sagesse » de Pierre Charron?’, Filosofia 25 (1974-2 and 1974-4), pp. 117-134 and 373-388.
    • On Charron’s De la Sagesse as a model for Geulincx’ Ethics.

 

Author Information

Han van Ruler
Email: vanruler@fwb.eur.nl
Erasmus University
The Netherlands

Neocolonialism

The term “neocolonialism” generally represents the actions and effects of certain remnant features and agents of the colonial era in a given society. Post-colonial studies have shown extensively that, despite achieving independence, the influences of colonialism and its agents are still very much present in the lives of most former colonies. Practically every aspect of the ex-colonized society still harbors colonial influences. These influences, their agents, and their effects constitute the subject matter of neocolonialism.

Jean-Paul Sartre’s Colonialism and Neocolonialism (1964) contains the first recorded use of the term neocolonialism. The term has become an essential theme in African philosophy, most especially in African political philosophy. In the book, Sartre argued for France’s immediate disengagement from its ex-colonies and for total emancipation from the continued influence of French policies on those colonies, particularly Algeria. However, it was at one of the All African People’s Conferences (AAPC), a movement of political groups from African countries under colonial rule which held conferences in Accra, Ghana, in the late 1950s and early 1960s, that the term was first officially used in Africa. In the AAPC’s “1961 Resolution on Neocolonialism,” the term was given its first official definition: the deliberate and continued survival of the colonial system in independent African states, achieved by turning these states into victims of political, mental, economic, social, military and technical forms of domination carried out through indirect and subtle means that did not include direct violence. With the publication of Kwame Nkrumah’s Neo-colonialism: The Last Stage of Imperialism in 1965, the term finally came to the fore. Neocolonialism has since become a theme in African philosophy around which a body of literature has evolved, written and studied by scholars in sub-Saharan Africa and beyond. As a theme of African philosophy, neocolonialism calls for critical reflection upon the present socio-economic and political state of Africa after independence from colonial rule, and upon the continued influence of the ex-colonizers’ socio-economic and political ideologies in Africa.

Table of Contents

  1. Introduction
  2. History of Neocolonialism
  3. Neocolonialism: Related Concepts
    a. Colonialism
    b. Imperialism
    c. Decolonization
  4. Neocolonialism: The Last Stage of Imperialism
  5. The Myth of Neocolonialism
  6. Neocolonialism Today in Africa: The Era of Globalization
  7. Conclusion
  8. References and Further Reading

1. Introduction

Neocolonialism can be described as the subtle propagation of socio-economic and political activity by former colonial rulers aimed at reinforcing capitalism, neo-liberal globalization, and the cultural subjugation of their former colonies. In a neocolonial state, the former colonial masters ensure that the newly independent colonies remain dependent on them for economic and political direction. The dependency and exploitation of the socio-economic and political lives of the now independent colonies are carried out for the economic, political, ideological, cultural, and military benefit of the colonial masters’ home states. This is usually achieved through indirect control of the economic and political practices of the newly independent states, rather than through the direct military control of the colonial era.

Conceptually, the idea of neocolonialism can be said to have developed from the writings of Karl Marx (1818-1883), in his influential critique of capitalism as a stage in the socio-economic development of human society. The continued relevance of Marxist socio-economic philosophy in contemporary times cannot be denied. The model of society as structured by an economic basis, legal and political superstructures, and a definite form of social consciousness, which Marx presented both in Capital (1972) and in the Preface to the Critique of Political Economy (1977), remains important to socio-economic theory. Marx presents theories which explain a certain kind of evil in capitalism. Today, capitalism has produced multinational corporations that can assemble far more effective intelligence behind their often nefarious designs than any nation’s government can assemble to try to hold them at bay. As things now stand with the capitalist system, there is some indication of foresight in Marx’s prognostications. The world seems to continue to acquiesce in the vast control of economic and political resources by the wealthiest 1%. No doubt, Marx’s prognostications have been vindicated in more ways than they have been refuted.

Sartre’s subsequent analysis, in his critique of French economic policies towards Algeria, was an attempt to combine his existentialist idea of human freedom with Marx’s economic philosophy in order to better establish his opposition to France’s economic colonization of Algeria. The proper coinage of the term neocolonialism in Africa, however, is attributed to Nkrumah, who used it in his 1963 preamble to the Charter of the Organisation of African Unity (OAU) and later as the title of his 1965 book, Neo-colonialism: The Last Stage of Imperialism.

Put simply, neocolonialism is a class name for all policies, infrastructures and agents actively at work in a society which indirectly serve to grant continuity to the practices of the colonial era. The essence of neocolonialism is that while the state appears to be independent and to have total control over its dealings, it is in fact controlled by outside economic and political influences (Nkrumah 1965, 7). The loss of control over the machinery of the state to the neocolonialists forms the basis of Nkrumah’s discourse.

In his article “Philosophy and Post-Colonial Africa”, Tsenay Serequeberhan explicates the nature of neocolonialism in Africa in a manner that reveals how Europe propagates its policy of socio-economic and political dominance in post-colonial Africa. For Serequeberhan, neocolonialism in Africa is that which internally replicates, in a disguised manner, what was carried out during the colonial period. This disguised form constitutes the nature of European neocolonial subjugation as it concerns the politics of economic, cultural, and scientific subordination of African states (Serequeberhan 1998, 13). With this, we can describe the general nature of neocolonialism as a disparity in national power—political, economic, or military—which the dominant power uses rather lopsidedly to subtly compel the dominated sectors of the dominated society to do its bidding. The method and praxis of neocolonialism lie in the guise of enjoining leaders of the independent colonies to accept developmental aid and support through which the imperial powers continue to penetrate and control their ex-colonies. Under the guise of developmental aid and support, and of technological and scientific assistance, the ex-colonial masters impose their hegemonic political and cultural control in the form of neocolonialism (Serequeberhan 1998, 13). In such a situation, the leaders of the seemingly independent African states become minions to the whims and caprices of the ex-colonial lords or their multinational corporations in the management of the affairs of the new states. Prima facie, it would seem that the neocolonial state is free of the influence of imperialists and governed completely by its own indigenes. In truth, though, the state remains under its former colonial masters and their accomplices. Under the continued impression that the former colonialists are superior and more civilized, the leaders of the supposedly independent new states continue to practice the ways and cultural practices of the imperialists, and to encourage their people to imbibe them, together with, more essentially, the imperialists’ economic control.

Within a neocolonial situation, therefore, the imperialists usually maintain their influence in as many sectors of the former colony as possible, making it less of an independent state and more of a neo-colony. To this end, in politics, economics, religion, and even education, the state looks up to the imperialists rather than improving upon its own indigenous culture and practices. Through neocolonialism, the more technologically advanced nations ensure their involvement with low-income nations in such a way that this relationship practically annihilates the potential for the development of the smaller states and contributes to the capital gain of the technologically advanced nations (Parenti 2011, 24).

In On the Postcolony, Achille Mbembe further examines the nature of neocolonialism in Africa and says that the underpinning theory on which neocolonialism rests consists of bald assertions with no tenable arguments to support them. Evidently, in his view, after colonialism ended in Africa, the West did not consider that Africans were capable of organizing themselves socially, economically and politically. To Mbembe, the reason for holding such ideas and advancing them is simply that the African is believed to be intellectually poor and is reducible to the level of irrationality. In his words, the capacity of Africans to rationally organize themselves is “understood through a negative interpretation” (Mbembe 2001, 1). This interpretation reveals the African as never possessing things and attributes that are properly part of human nature, or rather (even if reluctantly granted the status of the human) those things and attributes are generally of lesser value, little importance, and of poor quality (Mbembe 2001, 1). In other words, since Africans and other people who differ in race, language, and culture from the West do not possess the power, the rigour, the quality, and the intellectual analytical abilities that characterize Western philosophical and political traditions (Mbembe 2001, 2), it is then difficult to assume that they would have the rational capacity to organize themselves socially, economically and politically. In a rejoinder to this bald assertion and negative interpretation, Mbembe retorts that the West has always had insurmountable difficulties accepting an African theory of the “experience of the Other”, or of the issue of the “I” of others, which the West seems to perceive as foreign to itself. In other words, the typical Western tradition has always denied the existence of any “self” but its own. It has always denied the idea of a common human nature, such that “a humanity shared with others, long posed, and still poses, a problem for Western consciousness” (Mbembe 2001, 2).

Fundamentally, this denial is not peculiar to the period of neocolonialism alone. It has a history that dates back to the period of the trans-Atlantic slave trade and colonialism. In his book The Invention of Africa, V. Y. Mudimbe asserts that there are three methods representative of the colonial structure in Africa: the domination of physical space, the reformation of natives’ minds, and the integration of local economic histories into the Western perspective. This structure constitutes the three complementary aspects of the colonial organization, which embraces the physical, human, and spiritual elements of the colonizing experience (Mudimbe 1988, 2). This colonial structure is aimed at emphasizing a historicity that promotes discourses on African primitiveness, which is used to justify why the continent needed to be conquered and colonized in the first place (Mudimbe 1988, 20). Citing what Ignacy Sachs calls “europeocentricism”, Mudimbe says this model of colonialism continues to “dominate our thought and, given its projection on the world scale by the expansion of capitalism… it marks contemporary culture imposing itself as a strongly conditioning model for some and forced deculturation for others” (Sachs 1971, 22). In all of this, according to Mudimbe, europeocentricism is anchored on the denial of the “Other” in European consciousness. It continues to be a denial in spite of the assertion of the “existence of the other” in Paul Ricoeur’s meditation on the irruption of the other: “when we discover that there are several cultures instead of just one… be it illusory or real, we are threatened with destruction by our own discovery. Suddenly, it becomes possible that there are just others, that we ourselves are an ‘other’ among others” (Ricoeur 1965, 278). To this end, if one accepts Ricoeur’s assertion of the existence of cultural pluralism, one would be able to affirm the foundation of Mudimbe’s submission in The Idea of Africa that in each continent, for example Africa, “there are natural features, cultural characteristics, and, probably, values that contribute to its reality different from those of, say, Asia and Europe” (Mudimbe 1994, xv).

It is based on this distinct reality of each culture that William Abraham, in The Mind of Africa, examines the problems and challenges that face post-colonial Africa vis-à-vis the continent’s interaction with Europe. Abraham acknowledges the existence of neocolonialism in Africa, but proposes an integrative form of culture whereby certain positive aspects of Western culture may be integrated with African culture in order to forge a common bond (Abraham 1962, 83). Abraham, however, emphasizes that in spite of the ongoing social, economic and political change in Africa due to the impact of neocolonialism, Africa’s culture must be guarded from being eroded by Western influence and civilization, or what he refers to as “the externality of an outsider” (Abraham 1962, iv).

The above description of the nature of neocolonialism and its different dimensions briefly elaborates on the themes of subjugation and the apparent imposition of a hegemonic economic, political, and social order, mostly in the guise of trade relations or developmental aid grants by the imperialists. The attendant implication concerns how post-colonial African states have seemingly failed to apply themselves to the problems of self-maintenance.

2. History of Neocolonialism

From the late nineteenth century to the latter half of the twentieth century, some European countries, such as Britain, France, Belgium, and Portugal, colonized a large number of African nations, setting up economic systems that allowed for extensive exploitation. In the decades after World War II, these European nations granted political independence to their colonies in Africa, but still found ways to retain their economic influence and power over the former colonies. From the 1950s, when many African colonies began to gain independence, they soon realized that the actual liberation they had anticipated remained elusive. In spite of Africans’ assumption of political leadership positions, the economic and political spheres were still under some form of control by the former colonial masters. By implication, post-colonial Africa continued to experience the domination of the Western-style economic model that had prevailed during the period of colonialism. It does appear that the former colonial masters only wanted to grant political independence to their former colonies, and did not want them to be liberated from colonialism. This is why it is inferred that the situation which informs the ideological implementation of neocolonialism in Africa began immediately after the political independence of most African states.

In postcolonial Africa, events and situations have revealed how neocolonialism was nurtured from the moment independence was granted. The elements of neocolonial influence apparent within the interactions that continually exist between former colonial masters and their former colonies attest to this assertion. The point could be made, for example, with regard to the ongoing interactions between France and Francophone African countries such as Cameroon, Togo and Ivory Coast, as well as between Britain and Anglophone African countries such as Ghana, Nigeria and the Gambia.

In the case of Cameroon, particularly after the amalgamation of French Cameroon with Southern British Cameroon in 1961, the granting of political independence to Cameroon by France was made dependent on certain negotiations on matters of defense, foreign policy, finance, and the economy, as well as technical assistance. This resulted in the adoption and institutionalization of the French Fifth Republic constitutional model, alongside French political, economic, monetary, and cultural dominance in the new Cameroon (Martin 1985, 192). Following the creation of the French franc zone, which established the CFA franc as the general currency for all Francophone countries, the West African colonies became tied in a fixed parity of 50:1 to the French franc, automatically granting the French government control over all financial and budgetary activities (Goldsborough 1979, 72). France also continued its military presence in Cameroon after independence, establishing military and defense assistance agreements with the new state (Martin 1985, 193). Furthermore, the French institutionalized linguistic and cultural links with all of their former colonies under the heading of “La Francophonie”, which served as a platform for reinforcing the assimilation of the French language, culture and ideology (Martin 1985, 198).

Although Britain may have continued to maintain an indirect economic influence on its former colonies through multinational corporations, the direct effects of Britain’s neocolonial socio-political ideologies have diminished significantly over the years. However, the West in general maintains an indirect form of domination over all developing African countries through means such as loans from the World Bank or the International Monetary Fund (IMF). This form of neocolonialism works through foreign aid or foreign direct investment to which strict or severe financial conditionalities are attached. Such conditionality often renders the neocolonial state subservient to the economic and sometimes political will of the foreign donor.

Clearly, from this brief history of neocolonialism in Africa, one can see that colonialism itself had an epochal impact on the history of contemporary Africa. This is why the study of neocolonialism has become critical to the study of African history and politics. It is, however, even more crucial to the field of African philosophy because of the need to reflect on the socio-political and moral impacts of neocolonialism on Africa.

3. Neocolonialism: Related Concepts

Included among the themes of African philosophy, especially African social and political philosophy, are the related concepts of neocolonialism, colonialism, imperialism and decolonization. This is rightly so because of the events and activities of the European expansionist agenda, especially in Africa’s modern history. Although these epochal events succeeded one another, the methods and praxis of colonialism, imperialism, and neocolonialism are only slightly different. Their common denominators include the social, economic, political, and cultural subjugation of the colonized. The concept of decolonization, however, differs essentially from the others. While the others are conceptually linked with exploitation and domination, decolonization is a means of liberation, which could take the form of a social, cultural, political and economic revolution.

a. Colonialism

Broadly construed, the term colonialism can be described as the deliberate imposition of the rules and policies of one nation on another. Its strategy is the forced placement of one nation over another, which creates the opportunity to exploit the colonized nation in order to facilitate the economic development of the colonizing state.

A definition by Ronald J. Horvath sees colonialism as a “form of domination – the control by individuals or groups over the territory and/or behavior of other individuals or groups” (Horvath 1972, 46). Clearly, colonialism is a tool for expansion and a form of exploitation on all fronts. This is why Robert Young’s view of colonialism is that it “involved an extraordinary range of different forms and practices carried out with respect to radically different cultures, over many centuries” (Young 2001, 17).

The idea behind colonialism is basically the conquest and rule of one country or region by another, allowing for the exploitation of the resources of the conquered for the profit of the conqueror. Colonialism is an instrumental process through which a state acquires and maintains colonies in another territory. The outcome of this, the colonial stage of society, alters mildly or altogether the economic, political, social and even intellectual structure of the conquered state.

Between the 1860s and the 1900s, Africa as a whole was subjected to various forms of aggression from Europe, ranging from diplomatic pressure to military invasion, until almost all African states were finally conquered and colonized. The process of colonization reached its completion with the invasion of the political, economic and socio-cultural spheres of African societies.

The first attempts at colonization occurred when Europeans began to pursue trade outside their own continent and thus discovered that many other nations, particularly in Africa, had wealth in natural resources which held potential for their own economic gain.

We can simply say that the nature of colonialism involves a forced relationship between an indigenous majority and a minority of foreign invaders. Its history can, of course, be traced to slavery, when indigenous people, particularly of Africa, were forcibly and violently taken as slaves to plantations in Europe and the Americas. Through slavery, Africa’s sons and daughters were in large numbers violently seized and taken to Europe as sellable commodities (Nwolize 2001, 25). However, as slavery was ending in the 1850s, Europe was preparing another round of violent visitation upon Africa. This invasion took off in earnest after the Berlin Conference of 1884-1885. Colonialism came with further violence: vandalism, murder, torture, looting, rape, death, and destruction were the order of the day (Afisi 2009a, 62).

Certain perceived basic assumptions seem to have informed the colonial construction of African savagery, which was used to justify the nature of colonial warfare. Works of Enlightenment philosophers such as G.W.F. Hegel’s (1770-1831) Lectures on the Philosophy of World History (1975) and Immanuel Kant’s Anthropology from a Pragmatic Point of View (1798, 2006) essentially informed these assumptions. Hegel speculates about the continent of Africa and asserts that Africa proper “is enveloped in the dark mantle of Night”. To Hegel, “The peculiarly African character is difficult to comprehend, for the very reason that in reference to it, we must give up the principle which naturally accompanies all our ideas–the category of Universality” (1975, 174). Hegel here states that Africans lack the category of Universality and, also, situates the African at the level of irrationality. “The Negro,” Hegel writes, “exhibits the natural man in his completely wild and untamed state” (174). The African was, to Hegel, a complete moron who had no idea of decency and could not distinguish his right from his left. Similarly, the racist proclivities of Kant lie in his denial of any intellectual endowments and rational abilities to non-white races.

In a further attempt to rationalize colonialism, Lucien Lévy-Bruhl (1985, 63) standardizes the colonial discourse by claiming rationality as a Western signature and ascribing what he terms mystic or pre-logical thinking to non-Western peoples. These denigrating words refer in particular to the African. Such arguments were used to justify the colonialists’ actions and their reasons for invading and conquering the territories of the perceived Dark Continent. With this invasion, the entirety of the lives of the indigenous majority came to depend solely upon the powerful invaders. The fundamental decisions affecting the lives of the indigenes were made by the colonial masters. The colonialists gradually penetrated the socio-economic and political spheres of the state and, finally, the minds of the people. The conquered were made to believe that they were inferior and that, as such, only the ways of the colonialists were worth imbibing.

In his article “Modern Western Philosophy and African Colonialism”, E. Chukwudi Eze queries the rationality underlying the thoughts and assumptions of the European Enlightenment philosophers who, on the one hand, promoted the ideals of individual freedom and the dignity of the human person and, on the other, were associated with the thought and promotion of slavery and colonialism (Eze 1998, 217). For the European Enlightenment philosophers, Africans were not in the same logical set as normal humans; their advocacy of the ideals of humanity and democracy therefore did not apply to Africans. This justified their arguments for the promotion of “imperial and colonial subjugation of non-European peoples” (Eze 1998, 218). This suggests that there is a distinction for the Enlightenment philosophers between, in Cornel West’s words, “sterling rhetoric and lived reality” or, in Abiola Irele’s, between “word and deed” (Eze 1998).

Refutations of these assumptions have been made, suggesting that Africa, in the words of Walter Rodney (1972), was developing at its own pace before the advent of colonialism. However, due to the debilitating effects of colonial rule, African scholars and political thinkers were faced with the serious challenges of the socio-political and cultural reconstruction of Africa. The colonialists had imposed European beliefs and values on Africa. Thus, European languages, belief systems, and social, economic, and political systems replaced pre-colonial African ones. As a reaction to the effects of colonialism, there was the need to find an alternative ideology for decolonization. The reflective attitude and the thought process in the search for this ideology resulted in the abstraction of different philosophical ideas and the development of theories in political philosophy. Consequently, what is known as African social and political philosophy started as a reaction to colonialism. This explains why colonialism is an important theme in African philosophy.

b. Imperialism

Imperialism dates further back in history, being traced as far back as the Roman Empire. Imperialism can be described as an orientation which holds that a country can gain political or economic power over another through imposed sovereignty or more indirect mechanisms of control. Imperialism does not focus only on political dominance, but also on conquest and expansion. It is particularly focused on the acquisition of power by a state over another group of people. It is also described as a state policy, practice or advocacy of extending power and dominion, especially by direct territorial acquisition or by gaining political and economic control of other areas. As Michael Parenti describes it, Western European imperialism first took place against other Europeans, as when Ireland became the first colony of what later became known as the British Empire (Parenti 2011, 11). However, those who have chiefly borne the brunt of the European, North American, and Japanese imperial powers have been states in Africa, Asia, and Latin America (Parenti 2011, 13).

An understanding of the basic modus operandi of imperialism suggests that foreign governments can govern a territory without significant settlement, quite unlike colonialism in which settlement is a key feature. Imperialism is merely an exercise of power over the conquered regions without immigration of any form.

In his book Decolonising the Mind, Ngugi wa Thiong’o explicates the nature of imperialism, particularly as it affects the culture and language of the African. Wa Thiong’o asserts that imperialism has absolute effects on the economic, political, military, cultural and psychological wellbeing of the people affected. He describes the effect of imperialism on Africa from two main perspectives. First is the socio-economic and political effect of an imperialist tradition resting on “consolidated finance capital” (Wa Thiong’o 1986, 2). He maintains that the subjugation of Africa’s economic life is carried out through the use of multinational corporations, and particularly through the way most African countries have been lured into accepting loans from the International Monetary Fund (IMF). Wa Thiong’o’s concern about the IMF is that the economic life of every worker and peasant of the countries that have taken such loans is mortgaged forever. This is because, as such countries continue to service the IMF loans, the organization is entrusted with the power to dictate the direction of the economic policies of those states. The same can be said of the imperialist domination of politics, where it is ensured that African states rely on Western models of politics, policing, judiciary practice, and education.

Wa Thiong’o’s second perspective on the consequences of imperialism relates to what he calls the “effect of cultural bomb” (Wa Thiong’o 1986, 3). According to him, imperialism uses a cultural bomb to isolate people and estrange them from their identity. This is done by alienating the people from their heritage, their environment, their names and, above all, their language. Wa Thiong’o asserts that language remains the most essential vehicle through which the human soul can be held captive. The imperialists are fully aware of this and deliberately use “language as a means of spiritual subjugation” (Wa Thiong’o 1986, 9). In Wa Thiong’o’s submission, this cultural and psychological form of imperialism remains the biggest weapon that undermines the value of the human person and erodes the dignity of the people’s identity. This form of imperialism has the tendency to make people embrace the imperialists’ alien culture, language, and way of life, and to become far removed from their indigenous heritage and identity.

The study of the dignity of the African in every form is central to African philosophy. The need to embrace the value of Africa’s cultural heritage that is devoid of any form of imperialist subjugation is essential for the promulgation of African philosophy. It is in the light of this that, as a theme of African philosophy, the study of imperialism remains crucial to understanding its methods and its effects on the socio-economic, political, and cultural life of the African.

c. Decolonization

As a theme in African philosophy, the term decolonization connotes an ideology of true emancipation in post-colonial Africa. We talk of emancipation from the cultural, economic, political, and psychological forms of colonialism. African philosophers have consistently been concerned with the issue of the liberation of the mind, spirit and body, as well as with the emancipation of the African from all elements and influences of colonialism. It is as a result of these concerns that the study of decolonization is essential to the project of African philosophy.

The term decolonization can be described as the abolition of colonialism and the enthronement of a people’s or nation’s power over its own territory. It typically refers not merely to independence from colonialism but to total liberation from the influences and powers of imperial neocolonialism. It describes a situation in which a new state acts under its own volition, free from the direct control of foreign actors. Decolonization refers to the ability or willingness of a previously colonized nation to become free from imperial rule in order to control its own domestic and international affairs. It is also the mechanism and/or ability of a people to be liberated from the cultural and psychological domination of foreign influences.

In Africa, for instance, many theoretical assumptions informed the perceived need to decolonize. In principle, we can trace the spark of the ideology of decolonization to the rise of communism in the former Soviet Union. The teachings of Marx, Friedrich Engels (1820-1895), and Vladimir Lenin (1870-1924) against the exploitation of the masses remain the backdrop of this decolonization. The influence of the teachings of Frantz Fanon on decolonization cannot be gainsaid either. These teachings and influences seem to have led to the conviction that informed the early African political thinkers of the need to radically decolonize and end the influences of neocolonialism. Post-independence African thinkers such as Leopold Sedar Senghor of Senegal, Sekou Toure of Guinea, Julius Nyerere of Tanzania, Obafemi Awolowo of Nigeria, and Kwame Nkrumah of Ghana, among others, were faced with the serious challenges of the socio-economic, political, and cultural reconstruction of the postcolonial African states. They were faced with the task of liberating Africa from the imposition of neocolonial European values, languages, belief systems, and social, economic, and political systems, which seemed to have replaced the pre-colonial African ones. Consequently, the principle of individualism, believed to be a European signature, seemed to have replaced the African cultural context of brotherhood, which suggests a welfare system of communalism, collectivism, and egalitarianism; hence the need for a search for an ideology for decolonization (Afisi 2009b, 33).

As noted above, among the writers on decolonization in Africa, Fanon was one of the prominent figures. In fact, his writings are notably extensive on the process and methods of decolonization and of true liberation. Central to Fanon is the idea that only through decolonization can there be true liberation. Fanon’s strong advocacy of decolonization results from his commitment to the preservation of individual human dignity.

Fanon’s works Black Skin, White Masks (1952) and The Wretched of the Earth (1962) contain radical critiques of French colonization. He views colonialism as the forcible control of another state, with the word ‘force’ as key. Fanon accuses the colonizers of using force to exploit raw materials and labour from colonized countries. To justify their actions, however, the colonial masters proclaimed that the natives were savages and that European culture was the ideal one for them to adopt.

Fanon claims that the colonial situation is by definition a violent one. He condemns the violence inflicted on the colonized by the colonizer, but he distinguishes a threefold categorization of violence: physical, structural and psychological violence. Physical violence is the somatic injury inflicted on human beings, the most radical manifestation of which is the killing of an individual. Structural violence reflects the fact of exploitation and the institutional form it necessarily takes in the colonial situation. Psychological violence is the injury or harm done to the human psyche (Fanon 1952, 39); this third category includes indoctrination of various kinds and threats which tend to diminish the victims’ mental potentialities. As a way out of all of this, Fanon advocates a reprisal use of violence against the settlers to enable the colonized to regain their self-respect. To him, since the colonial situation is itself a violent one, the colonized masses can only achieve liberation through a replicated form of violence. True liberation, according to Fanon, must be accompanied by violence. His submission is that for liberation to be total and objectively achieved, it has to be accompanied by violence (Fanon 1962, 102).

For Fanon, decolonization requires violence on the part of the colonized; violence plays a critical role in the decolonization struggle. The colonized must see violence in decolonization as that which leads not to retrogression but to liberation. Fanon sees decolonization as the implementation of the concept of ‘the last shall be the first’. It is a psycho-social process, a historical process that changes the order of the world. Decolonization involves a struggle for the mental elevation of the colonized African people (Fanon 1962, 116). From all of this, Fanon contends that Africa is in need of true liberation, which can only result from decolonization. In his submission, resisting a colonial power using only politics cannot be effective; violence is the best way to attain decolonization.

Arguing from a similar position, Kwasi Wiredu, in his Conceptual Decolonization in African Philosophy, contends that colonialism has affected not only Africa’s political society but also its mental reasoning. He advocates the need for Africans to go through a process of mental decolonization. Wiredu recommends a process of decolonization resting on two conceptual analyses: first, ‘avoiding through a critical conceptual self-awareness the unexamined assimilation in our thought… of the conceptual frameworks embedded in foreign philosophical tradition’, and second, ‘exploiting as much as is judicious the resources of our own indigenous conceptual schemes in our philosophical meditations…’ (Wiredu 1998, 117). For Wiredu, the most important function of post-colonial philosophy is what he refers to as “conceptual decolonization”. This simply implies “divesting African philosophical thinking of all undue influences emanating from our colonial past” (Wiredu 1998). Wiredu sees decolonization as a necessary tool for developing an authentic African philosophy that is devoid of any neo-positivist influences.

Wa Thiong’o, in his own thinking, believes that decolonization can only take place in Africa when the “cultural bomb” is defused. This process begins when “writing” is done in the various indigenous African languages. Such writings, which would enhance the renaissance of African cultures, must also carry with them the spirit and content of anti-imperialist struggles. This would ultimately help in liberating the minds of the people from foreign control (Wa Thiong’o 1986, 29). For total decolonization to occur, Wa Thiong’o enjoins writers in African languages to form a revolutionary vanguard in the struggle to decolonize the minds of Africans from imperialism.

4. Neocolonialism: The Last Stage of Imperialism

After the independence of most African nations, Africans soon began to notice that their countries were being subjected to a new form of colonialism, waged by their former colonialists and some other developed nations. It is pertinent to mention that even though neocolonialism is a subtle propagation of the socio-economic and sometimes political activities of former colonial overlords in their ex-colonies, documented evidence has shown that a country that was never colonized can also become a neocolonial state. Countries such as Liberia and Ethiopia, which never experienced colonialism in the classical sense, have become neocolonial states by dint of their reliance on international finance capital, courtesy of their fragile economic structures (Attah 2013, 71). It is on this basis that neocolonialism can be said to be a new form of colonial exploitation and control of the newly independent states of Africa and of other African states with fragile economies.

Nkrumah views neocolonialism as a new form of subjugation of the economic, social, cultural, and political life of the African. His postulation is that European imperialism in Africa has passed through several stages, from slavery to colonization and subsequently to neocolonialism, the last stage of the imperialist subjugation and exploitation process. Nkrumah’s (1965) classic, Neo-colonialism: The Last Stage of Imperialism, is an analysis of neocolonialism in relation to imperialism. The book emphasizes the need to recognize that colonialism had yet to be abolished in Africa; rather, it had evolved into what he calls neocolonialism. Nkrumah reveals the methods that the West used in its shift in tactics from colonialism to neocolonialism. In his words: “without a qualm it dispenses with its flags, and claims that it is ‘giving’ independence to its former subjects, to be followed by ‘aid’ for their development. Under cover of such phrases however, it devises innumerable ways to accomplish objectives formerly achieved by naked colonialism” (Nkrumah 1965). This describes the condition under which a nation is continually enslaved by the fetters of neocolonialism: while independent in theory and possessing all the outward trappings of international sovereignty, it is actually directed politically and economically from the outside.

Nkrumah contends that neocolonialism is usually exercised through economic or monetary means. Among the methods of control in a neocolonial state, imperialist power and control over the state are gained through contributions to the cost of running the state, through the promotion of civil servants into positions that allow them to dictate and wield power, and through monetary control of foreign exchange by the imposition of a banking system that favors the imperial system.

Nkrumah further explains that neocolonialism results in the exploitation of different sectors of the nation, using different forms and methods: “[t]he result of neo-colonialism is that foreign capital is used for the exploitation rather than for the development of the less developed parts of the world. Investment under neocolonialism increases rather than decreases the gap between the rich and the poor countries of the world” (Nkrumah 1965).

On the link between neocolonialism and imperialism, Nkrumah writes that neocolonialism is the worst and most heightened form of imperialism: for those who practice it, it ensures power without responsibility; for those who suffer it, unchecked exploitation. He explains that neocolonialist exploitation is implemented in the political, religious, ideological, economic, and cultural spheres of society. He further provides details of the infiltration and manipulation of organized labour by agencies of the West in African countries. He discusses how the mass media are used as an instrument of neocolonialism in the following statement: “[w]hile Hollywood takes care of fiction, the enormous monopoly press, together with the outflow of slick, clever, expensive magazines attends to what it chooses to call ‘news’” (Nkrumah 1965). Religion too, according to Nkrumah, is distorted and used to support the cause of neocolonialism.

Nkrumah’s submission, however, is the projection that, as dangerous as neocolonialism is to the future of Africa, it will eventually, like colonialism, be defeated by the unity of all those who are oppressed and exploited. He prescribes unity and awareness among all Africans.

Buttressing the above submission, Noah Echa Attah, in his paper “The historical conjuncture of neo-colonialism and underdevelopment in Nigeria” (2013), traces the root of underdevelopment in Africa, particularly in Nigeria, to the effects of neocolonialism. In his assertion, African countries have never been truly independent since the end of colonialism, because the idea of partnering with the ex-colonialists has continued to guide state economic policies. Foreign firms have continued to dominate the business sectors of the economy, such that a relatively few large and integrated foreign firms, otherwise called multinational corporations, have made themselves indispensable to the growth or otherwise of the economy. Local industries in Africa are extensions of metropolitan firms, such that the raw materials needed for these industries depend on a very high import content of over 90% from the capitalist economies (Attah 2013, 76). Thus, the continued dependence of industrial investment in Africa on capital-intensive technology is strictly aimed at further developing the metropolitan economies.

Attah explicates how Western neocolonialists have collaborated with the local bourgeoisie in Africa to perpetuate the exploitation of the people and of state economies in Africa. According to him, most of the local bourgeois collaborators are not committed to national interest and development, and their aim is to ensure the continued reproduction of foreign domination of the African economic space. The local bourgeoisie are bereft of ideas capable of engendering growth and development. The objective of foreign capital, therefore, is to continue to co-opt the weak and nascent local bourgeoisie into its operations. “The co-optation of the local bourgeoisie into the network of foreign capital condemned the former to the position of ‘comprador’” (Attah 2013, 77).

Following from the above exposition, Attah also asserts that neocolonialism is a new form of imperial rule characterized by the domination of foreign capital. His claim is that instead of real independence, what Africa has is pseudo-independence with the trappings of the illusion of freedom. To him, neocolonialism in Africa is made possible by the roles and actions of the local bourgeoisie in collusion with foreign capital. He is concerned that the different African economies have become willing tools in the hands of the West because of their fragility. In many cases, the African states have inadvertently authorized the dependency of African economies on foreign capital, which provides the legitimacy necessary for neocolonialism. Neocolonialism, Attah submits, leads to underdevelopment, in which the local bourgeoisie and foreign capital are interested in the economy for personal accumulation rather than for the national development of the neocolonial state.

5. The Myth of Neocolonialism

In his paper “Neocolonialism in Africa?”, Thomas Molnar (1965) asserts that African nations continue to depend on the Western industrial nations for economic aid, loans, investment, markets, and other technical assistance because they require this dependency for their development. He acknowledges that the colonial regime left Africa in destitution, not only materially but also in terms of education and technical training. Molnar also affirms that nobody will deny that the colonialist period sanctioned abuses and exploitation in Africa (Molnar 1965, 177).

However, in spite of the end of colonialism in Africa, Molnar is concerned that African economies have not been able to function properly independently of foreign aid and investment. He claims that the economic presence of the West is imperative for the future of Africa’s socio-economic progress and political stability. The assumption is that only fruitful economic arrangements with Western industrialized countries can guarantee Africa’s future (Molnar 1965, 182).

Further, Molnar asserts that the call for decolonization in Africa took place in a hurried, haphazard way. African nations were unready and immature for economic and political independence at the time they achieved it. As a result of this situation, the West is under an obligation in post-colonial Africa to keep up its aid, not as a tribute paid for the past colonial situation but as one half of a two-way process of cooperation (Molnar 1965, 183). For its development, Africa needs the West. Since the newly independent African countries will continue to be economically dependent on the West, neocolonialism is not a negative term; in fact, “neo-colonialism is the only way of getting Africa to the take-off stage” (Molnar 1965, 183).

In reaction to Molnar’s glorification of neocolonialism in Africa, Tunde Obadina’s article “The Myth of Neocolonialism” (2000), a critical analysis of the colonial situation in Africa and of the myth surrounding indigenous growth and development in post-colonial Africa, takes issue with the ‘apologist’ claim about the positive influences of colonialism. According to Obadina, these apologists contend that, despite the exploitation of resources perpetrated by the colonialists, their overall influence on African society, in terms of reducing the economic gap between Africa and the West, was positive. The argument here is that colonialism improved the living conditions of Africans, providing necessary tools of civilization such as formal education, modern medicine, and enlightenment, as well as shaping political organization. Obadina notes that, in spite of these apologetic claims, Africa is still today considered a continent in economic and political crisis. The apologists, however, are usually quick to point out that the failure of Africa’s development is due to the conscious or unconscious refusal to adopt the legacies of the colonialists. These apologists even go to the extent of saying that Africa is in such a state because its nations gained independence sooner than they were ready for it.

Citing D.K. Fieldhouse as one of the apologists, Obadina mentions that Fieldhouse is of the view that it would be difficult to imagine what would have become of African countries had colonial rule not come. Fieldhouse had contended that pre-colonial Africa by itself lacked the capacity and the social and economic organization to transform itself into modern states with advanced economies. According to Fieldhouse, African states today would be a direct replica of what they were in the primitive days if they had not encountered European culture and civilization.

Fieldhouse’s Eurocentric orientations result from the contention that Africans are by nature irrational, incompetent, and unable to produce anything useful. Such orientations have gone to great lengths to undermine Africa’s indigenous culture, tradition, religion and even philosophy. Even Gene Blocker opines that ‘the more philosophical African philosophy becomes, the less African it is in content and the more African it becomes, the less philosophical it is in content’ (Blocker 1987, 2).

In contrast to these apologetic arguments, Obadina points out that many nationalist African scholars have criticized the apologists’ positions on the basis that Africa would definitely have developed in its own unique way, different from the European system. Obadina asserts that colonialism bore nothing but negative effects on Africa. According to him, Africans lived more ‘enriched’ lives before colonialism, and would have continued in that manner had they not been colonized. The effect of colonial rule has left the continent in a more dilapidated state; it has compromised the nations’ capabilities to develop. Obadina cites Walter Rodney’s assessments of how colonialism only succeeded in making Africa underdeveloped and, worse still, dependent on Western nations. The colonial juxtaposition of peoples from different cultures, which ignored already established borders and redrew Africa’s map, created various degrees of ethnic conflict that persist today. Furthermore, colonialism undermined pre-colonial political systems which had been effective for Africans and imposed foreign political concepts, including multi-party democracy. This, according to Rodney and many other critics, has left Africa in serious social and political crises. Obadina takes Nigeria as an example, a country whose great population and natural resources seem eventually to be leading to its destruction. Party politics, introduced by the colonialists, was according to Obadina the major cause of ethnic conflicts in Africa.

Obadina acknowledges the difficulty of providing an objective analysis of the impact of colonialism in Africa. Despite this, he avers that colonialism in Africa may have had some positives. What cannot be denied, however, is the fact that it was something imposed, with no regard for the existing structures already in place. Furthermore, colonial rule was not an idea geared towards the development of the colonized states in any way, but something established solely for the benefit of the colonial states.

Furthermore, Obadina forthrightly asserts that African nations are to blame for their continued reliance on their former colonial lords for economic and political direction. This neocolonial situation poses a serious danger to the evolution of indigenous-based economic growth and, at the same time, has adverse effects on political stability. It has, according to him, hampered the growth of movements geared towards change. He believes that African nations, after independence, should have shut the door against imports and exports from the West and sought to develop themselves using their own resources rather than depending on foreign corporations. This, he says, would have improved Africa’s infrastructure and its levels of economic and political growth. In Obadina’s view, if African nations had pursued this independent economic agenda, they would have survived, just as Cuba did. Obadina opines that the traditional agenda that came with colonialism was the false ‘idea of progress’. With this idea as the fundamental gospel, Africans were made to believe that their living conditions could be positively altered. This, among other things, smoothed the colonizers’ way into the continent since, after all, people desire material improvement. It created in Africans the desire for Western civilization; but the West failed to hand over to Africans the tools for realizing such civilization.

Africa in the early 21st century is a neocolonial continent, according to Obadina. Africa continues to face the problem of dealing with the overbearing presence of Western civilization. In the quest for modernization, the focus is mostly on the Western world, with little or no attention to the urgent need for internal change. Although colonial rule in Africa ended only late in the twentieth century, Obadina submits that African nations at the beginning of the 21st century have the responsibility to develop themselves by making changes in their internal structures using indigenous knowledge, while at the same time learning all they can from the influence of the Western world and putting it to use for their own benefit.

In all of this, African philosophers have continued to interrogate the idea of neocolonialism and its effects on Africa’s development. The outcomes of such interrogations continue to form content that needs to be taught and studied within the project of African philosophy.

6. Neocolonialism Today in Africa: The Era of Globalization

The heavy dependence on foreign aid and the activities of multinational corporations in Africa reveal that Africa at the beginning of the 21st century remains in a neocolonial stage of development. The activities of these corporations, particularly those from Europe and America, amount to nothing short of economic exploitation and cultural domination. Early 21st century Africa is witnessing neocolonialism on several fronts, from the influence of trans-national corporations from Europe and America to a newly imperial China, to which many African governments now seem obligated. The multinational corporations, and more recently the Chinese interests operating in Africa through Chinese companies, appear to exist mainly for the benefit of the neocolonialists’ home economies rather than to infuse local African economies with cash that would stimulate growth and increase local capacity.

In the Africa of the early 21st century, some scholars, such as Ali Mazrui, have opined that the new form of neocolonialism is globalization. Just as neocolonialism has been variously described, Mazrui describes how globalization “allows itself to be a handmaiden to ruthless capitalism, increases the danger of warfare by remote control, deepens the divide between the haves and have-nots, and accelerates damage to our environment” (Mazrui 2002, 59). This negative perspective on globalization, particularly as it relates to extreme capitalism, essentially corroborates Michael Maduagwu’s assertion that “globalization is only the latest stage of European economic and cultural domination of the rest of the world which started with colonialism, went through imperialism and has now arrived at the globalization stage” (Maduagwu 1999, 65).

Looking at globalization in this way, Oseni Afisi likewise consigns it to the corridor of neocolonialism and cultural subjugation. Globalization becomes the imposition of a particular culture and value system upon other nations with the direct intent of exploitation. What this indicates is that globalization is the engine room for the propagation of neocolonialism and new imperialism on African soil. While colonialism has formally ended, the reality on the ground is that political independence in many African states has not culminated in the much-desired economic and cultural freedom (Afisi 2011, 5).

Afisi further opines that the negative impact of globalization in Africa rests primarily on the erosion of Africa’s cultural heritage, upon which hinges the political, economic, social, and educational subjugation of the continent. The forcible integration of Africa into globalization through slavery and colonialism has produced problems of personal identity and cultural dilemma for the African. Africa has had to depend upon Europe and America, and, more recently, upon China, for its development and, one might add, for the development of its identity and culture.

By contrast, Olufemi Taiwo, in Africa Must Be Modern: A Manifesto, takes a radical position intended to turn Africa’s attention to the inherent benefits of globalization. Taiwo uncompromisingly defends globalization and insists that its benefits must be harnessed by Africans. He berates the hostility that Africa has shown towards modernity, pointing to the regrettable impact of such hostility on the economic, social, and political development of the continent (Taiwo 2014). For Taiwo, Africa must address the challenges of modernity and globalization by embracing them rather than remaining hostile to them; Africa needs to engage fully with globalization and its attendant capitalist democracy and derive what benefits it can from them (Taiwo 2014).

In a similar vein, D. A. Masolo, in African Philosophy in Search of Identity, remarks that the needs and experiences of Africans today are conditioned by their peculiar cultural circumstances. The nature of these circumstances is that the African of the post-colonial period has embraced modernity, science, and technology as part of his or her culture. Such Africans understand the world around them by being open-minded and by freeing themselves of anything that might rigidly mold their thinking. This is the nature of African identity after colonialism, an identity which will aid in the intellectual construction of a modern African philosophy (Masolo 1994, 251).

7. Conclusion

As a theme of African philosophy, the term neocolonialism became widespread in use, particularly in reference to Africa, as soon as the process of decolonization began on the continent. Its use spread because Africans realized that, even after independence, their countries were still being subjected to a new form of colonialism. The challenges that neocolonialism poses to Africa concern the socio-economic, cultural, and political development of the people and states of the continent, and commentators have attributed to it both positive and negative impacts.

On the whole, this article is an exposition of the theme of neocolonialism within the project of African philosophy. The introduction offers a cursory look at the term “neocolonialism” with a view to clarifying its basic concept. The section on the history of neocolonialism analyzes its beginnings in Africa, revealing how the idea was nurtured before independence was granted to most African states. The term neocolonialism is, no doubt, closely related to other concepts, which explains why colonialism, imperialism, decolonization, and globalization are essential to a better understanding of it. The article also presents discussions of the negatives, and some positives, of colonialism and of its offshoot, neocolonialism, in Africa. These discussions suggest that neocolonial elements may continue to be an integral part of Africa’s socio-economic, cultural, and political existence. Some of the social and political philosophical questions which may continue to preoccupy the minds of Africans include: Can neocolonialism be abolished from Africa? Can the positives of neocolonialism outweigh the negatives, or vice versa, in terms of their impact on the African economy? Will Africa ever be truly decolonized?

8. References and Further Reading

  • Abraham, William. The Mind of Africa. Chicago: University of Chicago Press, 1962.
    • A discourse on neocolonialism and integrative form of culture in Africa.
  • Afisi, Oseni Taiwo. “Tracing Contemporary Africa’s Conflict Situation to Colonialism: A Breakdown of Communication among Natives”. Philosophical Papers and Review Vol.1 (4): 59-66 (2009a).
    • A historico-philosophical analysis of colonialism as the root of tribal conflicts in Africa.
  • Afisi, Oseni Taiwo. “Human Nature in Marxism-Leninism and African Socialism”, Thought and Practice: A Journal of the Philosophical Association of Kenya, New Series. Vol. 1(2): 25-40 (2009b).
    • A comparative analysis of the nature of man in the philosophical ideologies of Marxism and African philosophy.
  • Afisi, Oseni Taiwo. “Globalization and Value System”. LUMINA. Vol. 22.(2): 1-12 (2011).
    • A discourse on the nature of globalization: its negatives and benefits on Africa’s value system.
  • Attah, Noah Echa. “The historical conjuncture of neo-colonialism and underdevelopment in Nigeria”. Journal of African Studies and Development. Vol.5 (5): 70-79 (2013).
    • A historical analysis of the effect of colonialism in Africa.
  • Blocker, Gene. “African Philosophy”. African Philosophical Inquiry. Vol.1(2): 1- 12 (1987).
    • A critical discussion on the idea and content of African Philosophy.
  • Lévy-Bruhl, Lucien. How Natives Think. Princeton, N.J.: Princeton University Press, 1985.
    • A discussion on the distinction between the mindset of the primitive and the mindset of the civilized human beings.
  • Eze, E. Chukwudi. “Modern Western Philosophy and African Colonialism”. In E. Chukwudi Eze (ed.), African Philosophy: An Anthology. Massachusetts: Blackwell Publishers Ltd, 1998.
    • A discourse on the contradictory nature of the European Enlightenment and its simultaneous promotion of slavery and colonialism.
  • Fanon, Frantz. Black Skin, White Masks. London: MacGibbon, 1952.
    • An analysis of the psychology of the colonial situation.
  • Fanon, Frantz. The Wretched of the Earth. New York: Grove Press, 1962.
    • A critical analysis of colonialism, cultural and political decolonization.
  • Goldsborough, J. “Dateline Paris: Africa’s Policeman”. Foreign Policy 33. (1979).
    • An analysis of the French African economic policy.
  • Hegel, G.W.F. Lectures on the Philosophy of World History. Trans. H. B. Nisbet. Cambridge: Cambridge University Press, 1975.
    • Hegel’s philosophical analysis of world history.
  • Horvath, Ronald J. “A Definition of Colonialism”. Current Anthropology 13 (1): 45-57 (1972).
    • A general discussion on definition and classification of colonialism.
  • Kant, Immanuel. Anthropology from a Pragmatic Point of View. Robert B. Louden, ed. Introduction by Manfred Kuehn. Cambridge: Cambridge University Press, 2006.
    • A collection of Kant’s lectures on Anthropology and the nature of man.
  • Lenin, Vladimir. Imperialism: The Highest Stage of Capitalism. Moscow: Progress Publishers, 1916.
    • An exposition of the nature and the process of the imperialist’s financial capital in generating greater profits from their colonies.
  • Maduagwu, Michael O. “Globalization and its Challenges to National Cultures and Values: A Perspective from Sub-Saharan Africa”. Paper presented at the International Roundtable on the Challenges of Globalization, University of Munich, 18-19 March 1999.
    • A presentation of globalization as a tool of European economic and cultural domination of the rest of the world.
  • Martin, Guy. “The Historical, Economic and Political Bases of France’s African Policy”. The Journal of Modern African Studies 23 (2): 189-208 (1985).
    • An analysis of France’s continued influence and power on its former African colonies.
  • Marx, Karl. Capital: The Process of Capitalist Production. Trans. Fowkes. Knopf Doubleday. 1972.
    • A critical analysis of the economic law of capitalist mode of production.
  • Marx, Karl. Economy, Class and Social Revolution. London: Nelson Publishers, 1977.
    • Marx’s selected writings on capitalism and the process of social revolution.
  • Masolo, D.A. African Philosophy in Search of Identity. Bloomington: Indiana University Press, 1994.
    • A discussion on the debate over what constitutes the study of African philosophy.
  • Mazrui, Ali A. “Nkrumanizm and the Triple Heritage in the Shadow of Globalisation”. Paper presented at the Aggrey-Fraser-Guggisberg Memorial Lectures, University of Ghana, Legon, Accra, 2002.
    • A presentation of the effects of globalization on Africa.
  • Mbembe, Achille. On the Postcolony. Berkeley: University of California Press, 2001.
    • A discourse on the nature of neocolonialism and its negative impact in Africa.
  • Molnar, Thomas. “Neocolonialism in Africa?” Modern Age. Spring (1965).
    • An analysis of nature of neocolonial economy in Africa and its positive impacts.
  • Mudimbe, V.Y. The Invention of Africa: Gnosis, Philosophy and the Order of Knowledge. Bloomington and Indianapolis: Indiana University Press, 1988.
    • A discourse on the interplay of Western colonialism in Africa, and its denial of the existence of “the other” in European consciousness.
  • Mudimbe, V.Y. The Idea of Africa. Bloomington: Indiana University Press, 1994.
    • A discourse on the distinct cultural values that constitute the African reality.
  • Ngugi wa Thiong’o. Decolonising the Mind: The Politics of Language in African Literature. London: James Currey, 1986.
    • A discourse on cultural imperialism and on written African languages as a vehicle for Africa’s decolonization.
  • Nkrumah, Kwame. Neo-Colonialism: The Last Stage of Imperialism. London: Heinemann, 1965.
    • An analysis of the nature of neocolonial economy and its relationship with Imperialism.
  • Nwolize, O.B.C. “The Fate of Women, Children and the Aged in Contemporary Africa’s Conflict Theatres”. Paper delivered at the Public Annual Lecture of the National Association of Political Science Students, University of Ibadan, 2001.
    • A presentation of the effects of colonialism and conflict situations in contemporary Africa.
  • Obadina, Tunde. “The Myth of Neo-colonialism”. Africa Economic Analysis, 2000.
    • A critical analysis of colonialism and neocolonialism in Africa.
  • Parenti, Michael. The Face of Imperialism. New York: Paradigm Publishers, 2011.
    • An exposition of the role of multinational corporations in the imperialist conquests.
  • Sartre, Jean-Paul. Colonialism and Neocolonialism. Translated by Steve Brewer, Azzedine Haddour, and Terry McWilliams. London: Routledge, 2001 (first published 1964).
    • A critical analysis of French colonial policies on Africa, especially in Algeria.
  • Serequeberhan, Tsenay. “Philosophy and Post-Colonial Africa”. In E. Chukwudi Eze (ed.), African Philosophy: An Anthology. Massachusetts: Blackwell Publishers Ltd, 1998.
    • A discussion on the nature of neocolonialism in Africa.
  • Taiwo, Olufemi. Africa Must Be Modern: A Manifesto. Indianapolis: Indiana University Press, 2014.
    • A discourse on globalization and modernity in Africa.
  • Wiredu, Kwasi. Conceptual Decolonization in African Philosophy: 4 Essays. Ibadan: Hope Publications, 1998.
    • A discourse on decolonization and the development of an authentic African philosophy.
  • Young, Robert. Postcolonialism: An Historical Introduction. Oxford: Blackwell, 2001.
    • A discussion on the historical and theoretical origins of postcolonial theory.

 

Author Information

Oseni Taiwo Afisi
Email: oseni.afisi@lasu.edu.ng
Lagos State University
Nigeria

Artistic Medium

Artistic medium is an art critical concept that first arose in 18th century European discourse about art. Medium analysis has historically attempted to identify that out of which works of art and, more generally, art forms are created, in order to better articulate norms or standards by which works of art and art forms can be evaluated. Since the 19th century, medium analysis has emerged in two different forms of critical and theoretical discourse about art. Within traditional art forms, such as painting and sculpture, modernist artists and critics began to interrogate art forms and the history of their possibilities in order to discover the necessary conditions for instances of those forms. This modernist interest in medium aimed to strip away unnecessary traditional artistic conventions in order to identify that which is essential to the form. Within newly emergent forms of popular art, such as movies, comics, and video games, artists and critics have attempted to articulate the ways the norms of these forms of popular art arise from new material and technological modes of creating and interacting with reproducible images.

The possibilities for an art form, whether traditional or newly emergent, can only be discovered by artists in acts of artistic creation. For this reason, the relation between art forms and their media develops and changes as the art forms continue to be discovered and reimagined by artists.

Table of Contents

  1. Introduction
  2. The Challenge of Medium Skepticism
    1. Carroll’s Medium Skepticism
    2. The Need for the Concept of Artistic Medium
  3. Theorizing About Art Forms Before the Emergence of the Concept of Artistic Medium
    1. Aristotle
    2. Aristotle and Horace as Models for Theorizing Art
    3. Music
  4. Gotthold Lessing and the Problem of Art in the 18th Century
    1. Art in the 18th Century
    2. Lessing on Painting and Poetry
    3. Herder and Hegel
  5. The Invention of Photography and the Discovery of Its Artistic Possibilities
    1. The Etymology of the Term “Artistic Medium”
    2. The Challenge of Photography
    3. Accounting for Photography’s Artistic Possibilities
  6. Modernism as the Discovery of Medium
    1. The Emergence of Modernism
    2. Modernism and 20th Century Music
    3. Fried on the Value of Modernism
    4. Postmodernism
  7. New Forms of Popular Art in the 20th Century
    1. Movies
    2. Comics
    3. Video Games
  8. Conclusion
  9. References and Further Reading

1. Introduction

Artistic medium is a term that is used by artists and art critics to refer to that out of which a work of art or, more generally, a particular art form, is made. There are, generally speaking, two related ways of using artistic medium in critical or artistic discourse. On the one hand, we often talk about an artistic medium by reference to the material out of which a work of art is made. Works of art in museums or galleries will often have the medium listed along with the title and the artist’s name on the display card. A painting might have “oil on canvas” or “watercolor” listed along with the artist’s name and the work’s title; a sculpture might have “marble,” “steel,” or “papier-mâché” listed in the same way. On the other hand, we also talk about medium to refer to the way a work of art organizes its audience’s experience in space and time. An actor might talk about the differences in performing on television and on film as performing in two different artistic media. Or a critic might describe television as a “writer’s medium” and movies as a “director’s medium.” Sometimes there may be no interesting differences regarding the material out of which the work was made; for this way of using medium, the crucial differences have to do with the spatiotemporal organization of the audience’s experience of the work of art.

Much of the critical and theoretical interest in the concept of artistic medium stems from a belief that analyzing the material conditions that underlie a particular art form allows us to articulate its norms and standards. Often critics and theorists who make use of the concept of artistic medium do so in order to connect an analysis of an art form’s material basis and conditions with some claim about what artistic norms or standards are proper to the art form. Because the connection between a description of a medium, an art form’s material basis, and the artistic experiences appropriate to that medium is a matter of some controversy, clarification of the philosophical insights and confusions associated with the concept of artistic medium must start not by arriving at its comprehensive definition, but rather by noting the characteristic forms of reasoning in which the concept is used.

There have been two relatively distinct forms of discourse involving artistic medium: a modernist discourse, and one associated with newly emergent popular art forms such as movies and comics. The uses of artistic medium in these discursive traditions have shared important similarities, especially a reliance on the concept to identify what is distinctive about a particular art form and an interest in grounding the norms governing a particular art form in the form’s material basis. But there are important differences as well. Modernist uses of the concept appeal to artistic medium as a way of justifying avant-garde approaches to traditional art forms by making clear how contemporary experimental instances of a form are genuine instances of that form because they inherit the tradition in question by purifying it of all that is inessential and accidental. Proponents of newly emergent popular art forms, on the other hand, are interested in articulating what is unique about the new forms in order to locate their possibilities in distinction from traditional or older forms and to demonstrate how their best instances are works of art.

In recent years, some analytic philosophers of art have suggested that the concept of artistic medium is necessarily a confused one and should be abandoned in favor of other art-critical concepts such as style or genre. For example, Noël Carroll, in his theorization of film in the 1980s and 90s, suggested abandoning medium as a critically inert and confused category. More recently, Carroll has found critical uses for the concept of artistic medium, especially in the analysis of exemplary instances of avant-garde film. Though Carroll does now recognize legitimate applications of the concept of artistic medium in film criticism and theory, it is nonetheless worthwhile to take seriously his initial radical skeptical challenge to critical and theoretical uses of the concept of artistic medium. Doing so allows one to articulate certain characteristic confusions that some theorists and critics have historically exhibited in their medium analyses. But, equally, it allows for the opportunity to clarify what, historically, has characterized the richest and most insightful critical and theoretical uses of artistic medium. As we shall see, these kinds of confusions are apt to arise when the theorist or critic does not remember that artistic medium is an art critical concept. As an art critical concept, what a medium for an art form is can only be known through artists discovering its possibilities in the creation of works within the form.

In general, confusions arise in using artistic medium when theorists and critics do not treat the concept as a critical one, but instead picture a medium as something that could be identified prior to and independently of any particular artistic uses to which it is put. In Art as Experience (1934), John Dewey attempts to combat this possibility for confusion by distinguishing between an artistic medium and raw material. When we identify some collection of matter prior to and independent of any particular artistic context, then we have identified some raw material, which may, it is true, be put to use for various artistic ends. But we cannot specify what artistic possibilities are available to artists by identifying and analyzing that material. Rather, when a given vehicle is taken up and explored within a particular artistic problematic or tradition, artists discover it as an artistic medium. It is thus through the work of artists that the artistic possibilities of an artistic medium can be discovered, and not by analyzing the material in isolation. In this sense artistic medium essentially is a critical concept. What is possible within a particular medium is discovered by artists as they attempt to explore a particular artistic problematic or inherit a particular artistic tradition. For this reason, what the medium of an art form is, as Theodor Adorno insists in Philosophy of New Music (1948), is a historical question. There is no fixed, ahistorical answer to the question, “What are the material conditions for painting, or music, or any particular art form?”

In order to clarify the nature of the concept artistic medium, this article takes two different, although closely related, lines of approach. This article will first clarify the roles artistic medium can rightfully play within critical and theoretical discourses by responding to the challenge of medium skepticism, which takes the concept to be necessarily confused. Then, it will outline the history of artistic medium’s emergence by describing the forms of critical reasoning in which the concept has been characteristically used. In so doing, the article will articulate why the concept has been so important for the development of new forms of popular art and for avant-garde and modernist experimentation, and why the concept has been vulnerable to characteristic confusions.

The first section of this article will engage with the challenge of medium skepticism. Medium skepticism, a position recently prominent in the philosophy of art, holds that artistic medium gives rise to a set of characteristic confusions because the concept is both essentializing and one that grounds its reasoning in a priori reflection upon the nature of the material basis of an art form. As we shall see, those two theoretical temptations are not inherent in the concept but are dangers only given a certain picture of how we determine what the medium is.

Then, employing Adorno’s thought that our understanding of what a medium is must be located in the history of the development of its art form, the article describes the emergence of the concept of artistic medium and the history of its critical and theoretical uses in the development of modern arts. First, there is a brief account of how philosophers and critics in the ancient world and the European tradition theorized artistic possibilities relative to a given art form prior to the emergence of artistic medium as a critical and theoretical category: namely, by identifying an art form and its norms and standards by specifying its proper experience. Then, the two sites of emergence for the concept of artistic medium are described: first, in the 18th century, in the critical work of Gotthold Lessing and, most importantly, his reflections on the differences between painting and poetry; second, most decisively, in the 19th century, in response to the invention of photography, its potential as a new art form, and its relation to painting. This complex historical field within which the concept of artistic medium emerged allows us to locate the centrality of the concept in 20th-century artistic discourses and also the philosophical confusions associated with it.

2. The Challenge of Medium Skepticism

a. Carroll’s Medium Skepticism

In the late 1980s and early 1990s, Noël Carroll issued what we may call the challenge of medium skepticism: he argued that medium analysis of film is necessarily confused because, although film’s medium can be identified, it has no artistically normative implications. Carroll’s interest in the concept of artistic medium originated from within film theory, but his claims about medium analysis being an essentializing discourse, and his recommendations that theorists and critics abandon the concept of artistic medium in favor of other art theoretic concepts like genre and style, apply to the use of medium as an art theoretic concept in general. More recently, Carroll has moved away from his medium skepticism and has acknowledged uses for artistic medium, especially in describing and evaluating avant-garde or structural film. Nonetheless, evaluating the challenge of medium skepticism is valuable in order to clarify how critics and theorists use artistic medium in characteristically confused ways and how the concept can be used in ways that avoid those confusions. In responding to the challenge of medium skepticism, we articulate the value of artistic medium.

At the root of Carroll’s concern is his contention that medium analysis ultimately depends, either implicitly or explicitly, on an illicit judgment from the nature of material conditions underlying the work of art to a set of norms and prescriptions meant to govern an art form. Carroll contends that medium analysis must slip into a theory of medium specificity in which each art form has a single medium and each medium has a distinctive feature that does and should characterize artistic creation within the form: the medium’s distinctive feature or power provides the aim the art form and its practitioners should pursue. Carroll’s rejection of medium specificity, which he sees as the inescapable heart of medium analysis, consists of two related objections: first, that medium analysis necessarily essentializes by identifying an art form with a single medium and a medium with a unique characteristic; and second, that medium analysis is structured around a priori reflection on the nature of the medium, yielding normative prescriptions about which artistic experiences are appropriate that in fact merely reflect one’s theoretical biases or idiosyncratic tastes.

b. The Need for the Concept of Artistic Medium

It is true that much medium analysis essentializes, but critical or theoretical discourse involving medium is not necessarily essentializing. It may seem that the modernist tradition, for example, supports Carroll’s contention that use of artistic medium is necessarily essentializing, since exploration of an art form by means of its media often meant stripping away all that could be dispensed with in order to discover what was essential to the art form. However, the modernist question of what is essential to the art form is itself a particular, historically-located question about artistic medium, and whatever answers modernist artists generated need not be taken as definitive of the timeless and unchanging essence of some particular art form. In fact, rejection of essentializing claims is revealed as necessary for sound medium analysis, since there is no independent grasp on what counts as an artistic medium outside of the context provided by a particular artistic problem or concern. This can be somewhat obscured for us because of the importance of modernism for our understanding of artistic medium as a concept; it is characteristic of modernist artists to take an art form itself as a question or problem to be explored. The modernist question of what, for example, constitutes the conditions of painting is part of an artistic project that takes as its starting point the history of painting and looks to inherit that tradition by stripping away all that is inessential to painting. But modernist artists and critics identify and explore shape or surface or color as they arise as problems or conditions for painting at a particular historical moment, not because of some timeless understanding they have of the nature of the media as such.

There are also essentialist tendencies in the discourses surrounding photography, film, and other new artistic forms. These essentialist claims often arise because the medium analysis starts from the problem of the new technological bases for these art forms. In this tradition of medium analysis, theorists and critics start with reflection on the nature of, say, photography as a new technological, productive process and draw aesthetic prescriptions or standards from the ontological structure of photographic experiences. Rudolf Arnheim, a film theorist prominent in the 1930s and after, offers a perspicuous example of a commitment to medium specificity in his theorization of film. Arnheim identifies the characteristic differences between a black and white photographic moving image and reality, and prescribes their accentuation as the basis for film art. Arnheim’s theoretical commitments to the purity of film as a medium led him to reject the development of color film and sound film as detracting from the artistic possibilities of silent black and white movies. As an example of medium analysis gone wrong, Arnheim’s restrictive prescriptions exemplify the reasons for Carroll’s rejection of medium specificity theories, for Arnheim’s commitment to the purity of silent film and rejection of the possibilities of sound movies reflects his own theoretical views about the unique characteristics of film itself, but voiced as an essentialist understanding of film. Arnheim’s critical blindness to the possibilities created with sound and color stems from his a priori commitment to an understanding of the nature of the photographic image.

But if Arnheim’s use of the concept artistic medium is subject to the confusions of medium specificity that Carroll warns of, other paradigmatic instances of medium analysis for emergent popular art forms do not fall prey to them. The critic and philosopher Walter Benjamin, in his “Little History of Photography” (1931), for example, articulates unique characteristics of photography that make possible artistic expression. However, Benjamin does not assume in advance of his critical investigation that he knows what art can be or everything that photography can do. Rather, Benjamin is committed to the view that photography’s invention changed what we could do, how we could see, and how we relate to our world and, at the same time, changed what can count as art and artistic experiences. Benjamin starts not with an essentializing, a priori analysis of the nature of photography in itself and for any possible use, but with a critically honed appreciation for the photographer Eugène Atget’s work and how that work discovers and explores certain characteristic possibilities for photography. Benjamin aims to identify the artistic possibilities within emergent artistic practices, what he identifies in his essay “The Work of Art in the Age of Its Technological Reproducibility” as art’s developmental tendencies. He does not attempt to offer an analysis of the medium independent of its emerging artistic uses and prescribe what those uses should be. Instead, Benjamin identifies unique characteristics of the new medium; that is, he describes new things that we can do with the new technology, so that he can articulate terms by which future artistic expressions can be understood. This is a critical judgment, to be evaluated in light of past and future instances of photography and photographic arts.

Carroll, then, worries about certain forms of theoretical confusion and critical blindness that can arise around commitments to medium specificity. Sometimes critics and theorists concerned with emergent popular art forms, like Arnheim, do succumb to the temptation towards medium specificity and prescribe some range of appropriate artistic experiences based on a supposed a priori, essentialist understanding of the nature of the technological basis for the art form. But the best critics and theorists engaged in medium analysis are not attempting to prescribe to artists which experiences are proper to the medium; rather, they are critically evaluating works of art in order to offer an analysis of why the art works on its audience in the ways that it does, and to articulate new possibilities that change what art can be.

Within the modernist tradition of medium analysis, most artists, critics, and theorists do not begin with an a priori, essentialist analysis of the medium in order to identify how it should be used in particular art forms. Instead, it is characteristic of the modernist tradition that the art form itself is taken up by artists as a problem or a question. Such artists, theorists, and critics do not draw out artistic prescriptions based on an understanding of the medium in isolation from any particular use. Rather, questions of shape and color are explored so as to find ways in which shape and color are conditions of possibility for painting. As we shall see, some postmodernist theorists and critics, such as Rosalind Krauss, share a version of Carroll’s worry that interest in exploring a medium demonstrates a commitment to some form of medium specificity: they object to the role of medium exploration in modernist arts, believing that modernist artists continually rediscover the same few automatisms or forms of repetition that they then explore as if they have, through the originality of their creation, taken on the art form itself.

It is common in critical discussions of art to use art form and medium more or less interchangeably and to talk about, for example, “the medium of painting.” It is worth distinguishing between an art form as a particular form of experience and a medium as the material conditions that underlie that form of experience and make it possible. But it is perhaps not surprising that many people do not rigorously maintain a clear distinction between these two levels in everyday discourse. More importantly, widespread talk about the medium of an art form need not indicate an implicit commitment to there being a specific material uniquely associated with each particular art form. Instead, we may take talk about the medium of an art form as indicating a level of analysis. Furthermore, what is picked out by some claim about medium is a contextual question, relative to the history of the art form and the particular artistic problematic the work in question explores. If talking about the medium of an art form can indicate a level of analysis rather than a commitment to the medium specificity thesis, then we are open to thinking about the medium as a shifting collection of automatisms or forms of productive repetition that can evolve through the history of the art form.

Though Carroll worries that confusions can arise in using the concept of artistic medium, he acknowledges in his later work that critics and theorists have found productive critical and theoretical uses of the concept. In general, medium analysis that is grounded in exploring how characteristic experiences constituting a work of art or an art form are structured and achieved does not fall prey to the dangers of medium specificity that worried Carroll. For medium analysis pursued in this manner is critical, articulating the automatisms that underlie ongoing artistic practices. It begins with an artistic problem or concern and identifies media by the role they play in the discovery and exploration of possibilities within that framework. It does not start with reflection on the nature of the medium itself in order to deduce what characteristic properties or features are appropriate to exploit for artistic ends. Theorists and critics are most likely to fall victim to medium specificity confusions when they picture the medium of an art form as some type of raw material that can be grasped independently of and prior to its uses within an art form. Instead, our grasp on what a medium is and can be only arises within the context established by an artistic problem or tradition, to be discovered by artists and articulated by critics.

Furthermore, Carroll warns against attempting to identify an art form with a single artistic medium. There is no reason to think that there is some single material condition that constitutes the possibilities within an entire art form. Instead, within art forms, there are different media, different forms of repetition or automatisms, which have significance in structuring characteristic instances of the form. This means that, for example, film itself, considered as a technological innovation, may not always provide the most productive level of unity for medium analysis. Rather, we can ask: What are the various media at work in popular movies, in documentaries, in cartoons, and in particular film arts? How do those automatisms allow for the discovery of the artistic possibilities within particular forms of film art? Approaching medium analysis in this way allows one to locate the concept of artistic medium as a critical tool for understanding modern art experiences in which art is taken to have its distinct forms of experience that have their own norms and standards without necessarily committing to any essentialist assumptions.

3. Theorizing About Art Forms Before the Emergence of the Concept of Artistic Medium

Prior to the 18th century, theorists of art articulated the norms and standards characteristic of a given art form without any reference to a medium or the material conditions that underlie the work of art. Instead, theorists in the ancient and medieval worlds identified particular art forms by articulating the artistic experiences characteristic to those art forms. In so doing, they then could develop an account of what was and was not appropriate within an art form, given the experience at which the art form necessarily aimed.

a. Aristotle

Aristotle’s account of tragedy in his Poetics is one of the earliest examples of this way of theorizing about an art form and is a paradigmatic instance of it. Aristotle develops his account of tragedy as an art form by identifying the characteristic experience—catharsis—instances of the art form provide for audiences. He is able to develop a number of claims about how a tragedy should be structured in order to achieve this characteristic experience. Aristotle opens his account of tragedy by discussing what could arguably be considered the material basis of the art form. He identifies rhythm, melody, and verse as the means by which the effects fundamental to tragedy are achieved. However, he immediately notes that other art forms also utilize those same means and argues that what is specific to tragedy as an art form cannot be identified by analysis of the means by which instances of the art form are achieved. Instead, he turns his attention to the features of tragedy that are specific to the art form by analyzing the experience that structures the form.

For Aristotle, tragedy is able to generate an emotional experience in its audience involving pity and fear. His name for this experience is catharsis. The precise nature of this Aristotelian catharsis has been and continues to be a matter of great debate: it has been taken to be an experience of emotional discharge, of emotional purification, and of moral education, to name just a few interpretations. Fortunately, we do not need to determine what exactly Aristotle took catharsis to be in order to note the general shape of his reasoning about tragedy as an art form.

That tragedy aims for catharsis, a specific emotional experience for those who watch the drama, defines the nature of tragedy as an art form for Aristotle and organizes his analysis of how tragedies are structured. Most importantly, for Aristotle, tragedy can best achieve an experience of catharsis in its audience because it, unlike history or epic poetry, has a dramatic form. The events of a tragedy unfold as the audience watches; the audience apprehends events as they unfold, constituting the unity of the plot as a single action. The audience of a history or epic poem, on the other hand, learns of many actions and events, and their interrelation and history, by means of a narrative rather than dramatic form. Because the audience members for a tragedy witness the events of the drama as they play out in front of them, they are able to understand how those events, on the one hand, have an inexorable logic and, on the other, arise from choices that the protagonist makes that could have been otherwise. It is tragedy’s dramatic form that allows the audience to experience the choices made by the protagonist as both highly contingent, in that other options or other paths are always available, and necessary, in that the protagonist is such that the central action of the tragedy is characteristic of him. This tension between the contingency and the inevitability of the events depicted in a tragedy arises because the audience witnesses the protagonist make a choice and come to live with, or be destroyed by, its consequences. There is, therefore, an intimate connection between the dramatic structure of tragedy as an art form and the emotional cathartic experience available only to tragedy’s audiences.

Tragedy’s characteristic experience of emotional catharsis for the audience accounts for a number of features of the art form. Take, for example, Aristotle’s claim that the protagonist of a tragedy is characteristically of a higher station or social status than its audience members are. On his view, audience members who recognize the protagonist as their social better are in position to achieve the appropriate emotional catharsis because the events that unfold are understood to be the result of the protagonist’s choices and not merely to be the result of intractable or unfortunate circumstances; the heights from which the protagonist falls clarify the consequences of his action. Whether or not we agree with Aristotle about the reasons he offers for the fact that Greek tragedies characteristically centered on royal figures or even demigods, we can see that he is identifying the art form’s characteristic experience in order to explain why the art form has the features it does. Other features of tragedy that Aristotle accounts for include the extent to which the central action is grounded within basic familial structures and tensions and the role of the protagonist’s ignorance in the completion of the action.

Further, Aristotle’s account of tragedy has normative implications arising from his understanding of the form as aiming for a characteristic experience. In articulating the characteristic experience for the audience at which the art form aims, Aristotle identifies and explains why certain features, such as the high social status of its protagonist and the protagonist’s ignorance, aid in producing catharsis. Thus, his account of tragedy and how it is structured outlines a set of norms and standards that arise from the aim of achieving the art form’s characteristic experience.

b. Aristotle and Horace as Models for Theorizing Art

Aristotle’s analysis of tragedy as structured by the aim of achieving a characteristic experience in its audience is paradigmatic of artistic analysis and theory in the ancient Greek and Roman world and then again in Europe up through the 18th century. Horace’s Ars Poetica, for example, identifies poetry by its characteristic experience, namely, an experience of apprehending unity of action. Beginning in the Renaissance, Europeans began more extensive theorizing about art and art forms, and they relied on Horace and, to a lesser extent, Aristotle, in order to justify a variety of accounts of art forms and the characteristic experiences that establish their norms and standards. For example, literary theorists such as Lodovico Castelvetro and Julius Caesar Scaliger, writing in 16th century Italy, both defended poetry as an imitative art (following Horatian principles) and argued for the legitimacy of literature produced in popular, vernacular languages. Such arguments were soon adopted across Europe by theorists such as Joachim du Bellay and Philip Sidney, who developed largely Horatian-style defenses of vernacular poetry as an imitative art. In the 17th century, Aristotle’s Poetics achieved a kind of prominence as a model for theorizing the norms of art forms; especially in France, dramatists such as Pierre Corneille and literary figures associated with the Académie française took Aristotle to argue for the unity of action, of place, and of time as fundamental to dramatic structure. Although it is certainly possible to identify appeals to medium within this tradition of literary criticism inspired by Aristotle and Horace, most obviously in the arguments made in favor of works written in vernacular language as a legitimate means of artistic expression, by and large this Horatian tradition of poetics centers around analysis of the art form’s norms and standards by reference to an artistic aim, characteristically imitation, which structures the particular type of artistic experience.

c. Music

By contrast, theorizing about music, both in the ancient world and in the European tradition prior to the 18th century, did make appeal to what we might think of as the medium of music in articulating the norms underlying musical practices. However, what was identified as music’s medium itself changed over time. This is not a weakness of these theoretical accounts but rather an illustration of Adorno’s contention, noted above, that our understanding of the nature of the medium of some art form is itself a product of the history of that form: there is no fixed, ahistorical characterization of music’s medium because what musical experiences are is not fixed but continually discovered in composing and playing music.

The earliest theories about the nature of music within the Western tradition held that music is the expression of a set of natural ratios that are equally expressed macroscopically in the movement of the celestial spheres and microscopically with the harmonies of the human soul. Pythagoras and his followers placed mathematical and musical knowledge at the center of their studies and influenced Plato and other ancient philosophers. One prominent example of this ancient tradition of theorizing music as an expression of natural harmonies is Boethius, a Roman statesman and Neoplatonist philosopher from the 6th century C.E. In his De institutione musica, Boethius distinguishes between three types of music: music of the spheres, music of the human spirit, and instrumental music. Boethius’ account of music as the expression of the natural macro and microscopic harmonies of the universe was important through the European Middle Ages.

As the musical possibilities within the European tradition developed from monophony to polyphony in the late Middle Ages and Renaissance and then, during the Baroque era, to more complex contrapuntal forms of composition, theorizing about the nature of music also changed. Much medieval theorizing was directly inspired by Boethius and oriented around practical instructional concerns for composers and musicians. Indeed, much of the history of Western musical theory is bound up with theories of tuning. By the 16th century, there emerged, in the work of theorists such as Gioseffo Zarlino, a new theoretical category—temperament—which made possible accounts better suited to describe the innovations in polyphony and counterpoint composition. Another strand of European music theory that emerged during and after the Renaissance drew on concepts from ancient rhetorical theories in order to describe the space of musical possibilities. By the end of the 18th century, such rhetorical analysis had largely been supplanted by a new type of analysis of musical forms, such as the sonata, in the work of Heinrich Koch and others. These theoretical developments around the nature of musical experience were responsive to the evolving nature and increasing complexity of music composition and performance in the 18th and 19th centuries.

The preceding is not meant to be a comprehensive overview of theorizing about art and art forms in the ancient and European traditions prior to the 18th century. Importantly, theorizing about artistic medium did not have the central place in theorizing about art forms more generally that it came to occupy beginning in the 18th and 19th centuries. As art established itself as a relatively autonomous region of experience, the concept of medium emerged as a critical means of understanding distinct types of artistic experience. Thus while we can, in retrospect, identify candidates for music’s medium, there is something anachronistic about thinking of them as examples of theorizing about artistic medium, since theorists like Boethius, for example, did not categorize music as one type of artistic experience among others. As Europeans began to conceive of art as a distinct region of experience, the Aristotelian and Horatian model for articulating the norms of an art form by reflection on its overall aim was fundamentally modified with the introduction of sustained consideration of the medium as the means for achieving a particular type of artistic experience.

4. Gotthold Lessing and the Problem of Art in the 18th Century

With his Laocoön: An Essay on the Limits of Painting and Poetry, published in 1766, Gotthold Ephraim Lessing is often identified as the first theorist or critic to engage in medium analysis. In that essay, he articulates the standards by which painting and poetry should depict bodies in action through an analysis of the spatiotemporal conditions under which the art forms are experienced. Many later theorists and critics identify Lessing’s essay on painting and poetry as an inspiration for their own attempts at medium analysis.

a. Art in the 18th Century

Before examining the particulars of Lessing’s own analysis of painting and poetry, it will be helpful to note something of the wider context of art and art criticism in 18th century Europe. By the mid-18th century in Europe, art had become its own distinct realm of experience, the culmination of a long and complex process in which artistic creation gradually decoupled from religious expression. It is a reflection of art’s new status as a distinct form of experience that the 18th century in Europe saw the emergence of art history and art criticism as intellectual practices and aesthetic experience became an important topic for philosophers. Johann Winckelmann, for instance, developed the first comprehensive account of ancient art, distinguishing between Greek, Greco-Roman, and Roman art, and explicitly took up the Greeks in particular as a model for contemporary artists. Denis Diderot, among his many other accomplishments, began, in 1759, writing critical reports on the biennial Paris Salons for a German newsletter, offering evaluations of particular artists and paintings and, equally, developing a critical account of the experiences at which painting should aim.

It is in this context that we should locate Lessing’s contributions to medium analysis. In writing his Laocoön essay, Lessing shared with Winckelmann and Diderot (all three writing more or less concurrently in the 1750s and 1760s) an awareness of art as a distinct form of experience and thus as posing its own particular questions and distinct problems. Lessing, like Winckelmann, held that art, inasmuch as it was distinct from religious experience, should take beauty as its ultimate aim. Further, the ancient Greek, Hellenic, and Roman artists provide the best model for contemporary artists in large part because, being pre-Christian, their work reflects an unadulterated focus on artistic beauty for beauty’s sake. Later Christian artists were, on this view, required to maintain a sort of double allegiance to the demands of beauty and the teachings of the Church, to the detriment of their work artistically.

Like Diderot, Lessing was interested in art’s ability to generate an experience of a kind of moral or spiritual beauty in its audience. Both Diderot and Lessing believed that painting, for example, can show moments of beauty that are not exclusively visual in nature by encouraging audiences to imagine moral and spiritual possibilities that we do not ordinarily encounter or recognize in our everyday lives. The aesthetic aim means that painters should choose a revelatory moment within the action depicted that offers the chance to think through the nature of that action.

b. Lessing on Painting and Poetry

Lessing’s work of medium analysis in the Laocoön essay begins with an art historical question: did the Laocoön Group, a statue excavated in Rome in 1506 and currently on display at the Vatican, precede or come after Virgil’s account of Laocoön and his death in the Aeneid? In the Aeneid, Virgil recounts the story of Laocoön, a Trojan priest, who warns against bringing the Greek offering of a giant horse statue into Troy. Snakes sent by the gods kill Laocoön and his sons; the Trojans interpret this as a sign from the gods that Laocoön should not be heeded and bring the offering into the city, ensuring their ultimate doom. The question Lessing sets out to answer, whether the sculptors inspired the poet or vice versa, serves as a jumping-off point for Lessing’s broader interest in establishing the different norms and standards that govern painting and poetry.

Lessing characterizes painting and poetry in quite abstract and capacious terms. He defines poetry as any art form that unfolds in time and painting as inclusive of all art forms that are visual in nature. Distinctions between the different materials out of which works of art are made (between marble and oil paint on canvas, say) are not relevant for Lessing’s analysis: he abstracts away from what will later be thought of by some critics and theorists as distinct artistic mediums in order to characterize painting and poetry in terms of the spatiotemporal experience of the audience in apprehending the work. This stands in contrast with later analysis of artistic medium, which often centers on the particular matter out of which works of art are made.

In fact, though often (rightly) credited as the first critic to offer an analysis of artistic medium, Lessing himself does not describe painting and poetry as artistic mediums. It is worth noting that there was no widespread appeal to a concept of artistic medium until the middle of the 19th century, when a term that had its home in scientific contexts was extended into artistic contexts. So, although Lessing is correctly credited with the first developed medium analysis, he describes painting and poetry as different methods for achieving a particular artistic experience.

Lessing’s account of painting and poetry as distinct methods for achieving a particular artistic experience modifies the mode of analysis within which theorists of art since Aristotle had worked. That Aristotle-inspired mode of analysis developed an account of the norms and standards governing an art form by identifying the experience characteristic of the form and explaining its features in light of their contribution to the overall experience aimed at. Similarly, Lessing’s analysis of painting and poetry takes as its starting point a particular artistic aim, namely, the audience’s imaginative apprehension of bodies in action as beautiful, and distinguishes between two different methods for achieving that experience. Unlike Aristotle, however, Lessing has in mind a general type of experience that can be achieved by two different methods. Lessing is able to offer an account of the different artistic norms governing painting and poetry because he takes them up as different means by which a kind of artistic experience can be achieved, where each means is constituted by a distinct spatiotemporal structure.

According to Lessing, because signs contiguous to other signs best represent objects contiguous to other objects, painting’s appropriate subject matter is bodies at a single moment of time. Similarly, because signs that succeed one another best represent objects that succeed one another in time, poetry’s appropriate subject matter is actions unfolding in time. Lessing argues that the material conditions of the method determine what is appropriate to that art form:

If it is true that in its imitations painting uses completely different means or signs than does poetry, namely figures and colors in space rather than articulated sounds in time, and if these signs must indisputably bear a suitable relation to the thing signified, then signs existing in space can express only objects whose wholes or parts coexist, while signs that follow one another can express only objects whose wholes or parts are consecutive.

Objects or parts of objects which exist in space are called bodies. Accordingly, bodies with their visible properties are the true subjects of painting.

Objects or parts of objects which follow one another are called actions. Accordingly, actions are the true subject of poetry. (78)

Some subsequent critics and theorists have read Lessing here as offering two distinct tasks for painting and poetry, depicting bodies and depicting action. But Lessing considers these to be two different methods for achieving a single effect; namely, getting the audience to imagine bodies in action. For he completes the thought by noting “painting too can imitate actions, but only by suggestion through bodies…. Poetry also depicts bodies, but only by suggestion through action.” (78) Because Lessing is interested in painting and poetry inasmuch as they are different methods for encouraging audiences to experience beauty through imagining bodies in action, he immediately goes on to outline the norms that govern these different methods. Poets should construct their descriptions of actions by referencing each body participating in the overall action in terms of a single characteristic as it makes its contributions to the action. That allows the audience to imagine each body’s role in the overall action and so not be distracted by unnecessary detail or description. Painters should, according to Lessing, choose to depict the single moment of an overall action that encourages the audience to best imagine the action and one that offers the audience particular insight into what is at stake in the action.

Homer and the sculptors who created the Laocoön Group are exemplary artists in Lessing’s view because they grasp the norms at work in their respective art forms. Homer’s mastery in part stems from his ability to offer descriptions of scenes that are centered around and grow out of an action, as when he describes Agamemnon’s armor and regalia as he is in the act of donning it. According to Lessing, Homer’s usual practice is not to linger on description for its own sake but only to describe objects in the midst of action and only in terms of a single distinct characteristic, so as to encourage the audience’s imagination: for example, he evokes black-prowed ships skimming over a wine-dark sea. Likewise, the sculptors of the Laocoön Group are exemplary inasmuch as they have chosen a moment before the snakes have crushed and broken Laocoön and he has started to scream. Instead, they show his resistance, the way in which he endures the pain that will inevitably overwhelm him. In this way, the audience is able to imagine both the enormity of his suffering and the spiritual beauty of his resistance in the face of it.

Lessing’s analysis of painting and poetry, therefore, identifies them as distinct methods for exploring a shared artistic problematic and considers them insofar as they constitute the aimed-for artistic experience. He is able to articulate a set of critical norms and standards for the art forms by reflecting on their different underlying spatiotemporal conditions. Lessing’s analysis establishes the dependence of the norms of painting and poetry on the spatiotemporal conditions of the experiences of those works of art. This dependence is the result of his choice to begin not with the material conditions of the art forms, but by locating those art forms as participating in a particular artistic aim, that is, the demand that painting and poetry encourage their audiences’ imaginative apprehension of the beauty possible for bodies in action. In turn this aim determines painting and poetry as methods and gives Lessing’s normative recommendations the force they have.

c. Herder and Hegel

Lessing’s theorization of artistic medium proved influential as questions of art and aesthetic experience moved to the center of philosophical thought at the end of the 18th and the beginning of the 19th century. Johann Herder, for example, in his Sculpture: Some Observations on Shape and Form from Pygmalion’s Creative Dream (1778), extends and complicates Lessing’s approach to artistic medium by denying Lessing’s claim that painting and sculpture can be understood in the same terms because both offer up a single moment of an action to the audience for contemplation. Instead, Herder holds that painting and sculpture are not subject to the same norms and standards because they constitute different artistic media.

Most decisively, the concept of artistic medium plays a central role in the thought of Georg Wilhelm Friedrich Hegel, arguably the most influential philosopher of the early 19th century. On Hegel’s view, the norms and ideals that structure human interactions develop out of and so are made explicit in the history of human political development, intellectual development, and artistic development. Artistic production in its various forms is humanity’s attempt to express the ideals structuring human life, especially those associated with beauty. But if this is the case, then the question arises: Why are there different art forms, given that they all are structured around a shared, if at times inchoate, desire? Hegel’s answer to this question, most developed in his Lectures on Aesthetics (published posthumously in 1835), is that different artistic media serve as the basis for and so give rise to the possible forms of expression within particular art forms. In Hegel’s work, we have perhaps the clearest example of the consequences of conceiving of art as a distinct form of human experience, separate from religion and politics, for example. In thinking of art as a general field of experience within which we can distinguish distinct art forms, the concept of artistic medium is critical in allowing Hegel to maintain the unity of art in general while still distinguishing clearly between particular art forms and the norms and standards that govern them.

5. The Invention of Photography and the Discovery of Its Artistic Possibilities

a. The Etymology of the Term “Artistic Medium”

As noted above, Lessing does not describe his analysis of painting and poetry as an analysis of artistic medium. He talks about painting and poetry as different methods for achieving an experience. Indeed, it was not until the middle of the 19th century, almost 90 years after Lessing’s Laocoön essay, that medium began to be used in artistic contexts, referring to the material conditions out of which works of art are made. The Oxford English Dictionary notes that the earliest use of medium in an artistic context, signifying the raw material out of which a work of art is made, is from 1861. This new use of medium in an artistic context grew out of an earlier use that describes the substance (such as oil or water) that painters mix with pigment to create paint. We still speak of oil paint as a distinct medium that differs from watercolor; this use is an extension from an earlier one that identifies oil and water as media in which pigments are mixed.

b. The Challenge of Photography

While the first uses of medium in artistic contexts often referenced the material out of which paint and then paintings were made, the timing of this development is likely connected to a radical problem that gripped the art world of the 19th century; namely, the emergence of technologies that reproduced images: first lithography, and then photography, which reproduces images of our world, now past. In the 1820s, Nicéphore Niépce developed the first photoetching process; in the 1830s, a number of inventors, most prominently Niépce’s partner Louis Daguerre and William Fox Talbot, worked independently on developing photographic processes that were capable of capturing images mechanically with much shorter exposures. By the end of the 1830s, both Talbot and Daguerre had publicly debuted their technology for mechanically capturing and reproducing images from the world. The invention of photography was widely felt as a challenge to the received understanding of what could be art and what artistic experiences were proper to painting specifically. But the debates surrounding photography and painting in the 19th century largely centered on whether or in what ways photography could serve as a means of artistic expression.

The new photographic technology was, in many cases, quickly distinguished from processes by which works of art were produced and dismissed as being incapable of producing art. The most prominent argument made against the possibility that photography could be art was based on the mechanical nature of the photographic process. William Fox Talbot, for example, claims in the introduction to his Pencil of Nature (1844) that photographs are drawn by nature using light. Talbot’s view that photography was the production of natural images by mechanical means alone, without the intervention of any human artistry, was widely shared in the 19th century and taken as grounds for concluding that photography could not ultimately be an artistic medium. If the photograph is made by the interaction of natural processes of light and chemicals, then it cannot be a work of art, any more than a tree or a sunset could be.

The 19th century debates about the possibility of photographic art seem, from our 21st century vantage point, hopelessly misguided. But this is largely because we are the recipients of an understanding of what can count as art that has been altered by the development of reproductive technologies like photography and film. Throughout most of the 19th century, it seemed obvious to a large number of critics that the mechanical nature of photography excluded it straightforwardly from consideration as art. Charles Baudelaire, for example, in his Salon of 1859, worries that the public is starting to confuse photography for art by mistakenly taking a mechanical means for image reproduction as capable of inspiring imagination. Photographers, on this view, were mere technicians, only capable of reproducing natural images by exploiting the laws of nature. Because there was no human creativity at work in producing the photographic images, those images could not be art and the photographers were not artists.

Those who argued that photography could be art generally took two lines of response. On the one hand, many held that, while photography was ultimately a mechanical process, it could be artistic inasmuch as it is able to mimic painting and the artistic experiences of which painting is capable. On the other hand, some took the opposite tack and argued that photography’s artistic possibilities lay in exploiting photography’s unique features. On the first line of response, photography was artistic to the extent that it was able to look like painting or otherwise reproduce it. On the second, photography was seen as artistic to the extent that it distinguished itself from painting’s artistic possibilities by taking advantage of the features that only it possessed.

Photographers attempted to imitate painting in several ways. One of the earliest uses to which photography was put was the reproduction of paintings in order to disseminate widely what would otherwise require a pilgrimage to see. Further, some early photographers took photographs that essentially reproduced the subject matter of earlier paintings by staging scenes reminiscent of those paintings. These genre photographs are the primary objects of Baudelaire’s denunciation of photography as essentially lacking in human creativity. Finally, some photographers began to produce photographic experiences that mimicked experiences recognizable from painting, utilizing soft focus, for example. Julia Margaret Cameron’s photographic work exemplifies many of these quasi-painterly techniques and is deeply influenced by Pre-Raphaelite painting.

On the other hand, many photographers and critics, beginning with the earliest instances of photography, emphasized aspects of photographic production that were taken to be unique to it and therefore unlike painting. In announcing Daguerre’s invention to the French Academy of Sciences in 1839, François Arago emphasized the precision of the photographic image and its ability to allow its audience to see aspects of their world that they would otherwise be unable to see. During the 19th century, photographic practices evolved to capture moments and aspects of the world that are fleeting. Eugène Atget, for example, developed his photographic practice around capturing aspects of Parisian life that were disappearing as the city continually modernized, memorializing scenes from a form of life that was fading away. Walter Benjamin, looking back at the development of 19th and early 20th century photography from the vantage point of the 1930s, articulates photography’s unique features in terms of its ability to allow us to become conscious of what otherwise passes before our eyes unrecognized. On Benjamin’s view, photography’s ability to reveal what he terms the optical unconscious constitutes its distinctive power and its value as an artistic medium.

c. Accounting for Photography’s Artistic Possibilities

The problem posed by photography and its relation to art generally and painting specifically led to an approach towards questions of artistic medium distinct from the approach pioneered by Lessing. Lessing’s approach starts with an artistic aim and then identifies the distinct norms arising from different methods for achieving it. Photography instead presented itself as a problem: the question is not how best to achieve a particular artistic experience with a mode of expression or set of material conditions, but instead, to what extent, if any, can this new mode of expression be artistic? The emergence of this new technology raised a pressing question: Can it serve as the basis for artistic creation and, if so, what aspects or features of it are most appropriate for creating art? This approach to medium analysis begins by identifying a particular medium and its unique features or characteristics and determining what artistic experiences artists using the medium should pursue.

Both the early defenders of and objectors to photography’s artistic value follow the same basic template for analyzing medium. First, the critic identifies the medium (the photographic technology that constitutes a new mode of expression) and determines its unique features. Then, the theorist or critic evaluates the ways in which those unique features can generate artistic experiences. There are two ways this evaluation can happen. One can reflect on the nature of the medium and the unique features of its productive process and try to deduce an absolute a priori claim about its artistic possibilities independently of critical examination of the works produced; this approach generates confusions or, at best, tendentious critical prescriptions. Alternately, one can engage in a critical investigation of work that utilizes the unique features of the medium in order to articulate how new artistic possibilities are being generated. Baudelaire’s dismissal of photography’s potential for artistic value exemplifies the former; Benjamin’s critical examination of Atget’s work exemplifies the latter.

6. Modernism as the Discovery of Medium

a. The Emergence of Modernism

As modernism transformed almost all traditional art forms more or less simultaneously during the first half of the 20th century, artistic medium became one of the crucial art critical concepts not just for theorists and critics but for artists as well. For modernist artists, inheriting traditional art forms meant querying the conditions of possibility underlying the art form in order to determine, through discovery and exploration, the necessary conditions for contemporary instances of the form. For this reason, modernist arts often seemed to critics and some artists to be exercises in shedding, as things earlier taken to be essential to the form were discovered to be mere conventions and thus no longer conditions for contemporary instances of the art form.

There are too many important modernist artists across a wide variety of traditional art forms to give a comprehensive survey of them here. However, it is worth identifying a few of them in order to emphasize the modernist concerns that were deeply shared by artists across a broad swath of different art forms. In dance, for example, Isadora Duncan, the American dancer, rejected the inherited tradition of ballet techniques and thought of her own practice as the exploration of dance’s medium, the human body in its freedom of movement and gesture. In his “Ornament and Crime” (1913), Adolf Loos, an Austrian architect, critic, and theorist, argued against unnecessary architectural ornamentation in ways that heralded the modernist emphasis on purity of form and design. The modernist architectural commitment to form following function in design and the general dictum that buildings should be “machines for living” culminated in the work of architects such as Le Corbusier, Walter Gropius, and Ludwig Mies van der Rohe. In literature, the work of Gertrude Stein, James Joyce, and Franz Kafka, to name only a few, all differently exemplify modernist commitments. In Joyce’s work, for example, we can see a broad modernist development from an exploration of the history of forms of literary expression in Ulysses (1922) to an obsessive examination of the expressive possibilities of language itself in his final book, Finnegans Wake (1939).

b. Modernism and 20th Century Music

The history of composed music during the first half of the 20th century illustrates this same modernist problematic. By the 1920s, a number of composers began to explore new and unconventional forms of composition, including serial and atonal composition. Among the most prominent of these modernist composers was Arnold Schoenberg, who developed the twelve-tone technique, or row composition. Other notable modernist serial composers included Anton Webern and Karlheinz Stockhausen. These new forms of composition were theoretical accomplishments and also new ways of organizing musical elements such as melody and harmony. Not only did these developments in composition provide new systems of musical organization, but they readjusted audiences’ understanding of older forms of composition, now thought of as, for example, limited to tonal relations in contrast with modernist atonality. As Theodor Adorno observes, Schoenberg’s twelve-tone technique stands in contrast with the practice of 19th century composers, who manipulate and transform the repetition of certain basic musical relations. In Schoenberg’s compositions, however, there is no room for such repetition. Rather, the composer moves through a series of distinct relations between pitches. Usually these rows of related pitches are not repeated but explored once, and then a new row is generated. No longer are composers straightforwardly exploring relations between melody and harmony by the repetition and manipulation of a few themes or motifs. Instead, serial composers such as Schoenberg generate new relations between pitch intervals without recourse to the repeated exploration of some theme. Thus, in the modernist exploration of music’s possibilities, the medium of music itself undergoes radical developments. In other words, modernist composers no longer took for granted compositional techniques or assumptions that had seemed obvious or unproblematic to prior generations. Instead, they aimed to generate an entire system of composition and, with it, a theoretical articulation of the constraints and rules by which that system operates. As these modernist questions gripped more composers, the possible compositional systems and their accompanying theoretical justifications proliferated.
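
To make the bare combinatorial idea behind row composition concrete (this is only an illustrative sketch, not anything drawn from Adorno’s or Schoenberg’s own presentations), a twelve-tone row can be thought of as a permutation of the twelve pitch classes, from which standard transformations such as the retrograde and the inversion are derived. In Python, for instance:

    import random

    PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

    def make_row(seed=None):
        # A row uses each of the 12 pitch classes exactly once.
        rng = random.Random(seed)
        row = list(range(12))
        rng.shuffle(row)
        return row

    def retrograde(row):
        # The row read backwards.
        return row[::-1]

    def inversion(row):
        # Each interval from the first pitch is mirrored (mod 12).
        first = row[0]
        return [(2 * first - p) % 12 for p in row]

    row = make_row(seed=1)
    print([PITCH_CLASSES[p] for p in row])              # the prime form of the row
    print([PITCH_CLASSES[p] for p in retrograde(row)])  # its retrograde
    print([PITCH_CLASSES[p] for p in inversion(row)])   # its inversion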

c. Fried on the Value of Modernism

Why modernism should have taken hold in a number of traditional art forms more or less simultaneously in the early part of the 20th century remains an important question, one that cannot be answered directly in this article. We will simply note that it did happen and that although many artists and critics embraced the modernist moment within traditional art forms as the promise of clarifying what was truly necessary for those arts, the modernist moment also clearly marked a kind of crisis for those traditional art forms, in which what had previously been accepted as the possible basis for serious work within the form no longer satisfied artists or audiences.

The logic of modernism is important for understanding the concept of artistic medium because the exploration of the art form’s medium in its purity was central to it. This article will focus on one clarifying example, the critical discourse analyzing and justifying modernist painting in the 1950s and 1960s, in order to bring out the characteristic structure of reasoning about medium in artistic modernism. Three critics in particular, Clement Greenberg, writing in the 1940s and 1950s, and Stanley Cavell and Michael Fried, writing in the 1960s, championed the modernist project in American painting and sculpture; their work offers a perspicuous example of the logic of modernism as an exploration of artistic medium. Greenberg’s “Modernist Painting,” in particular, is an early statement of critical purpose, justifying the modernist project of medium exploration for its own sake. Greenberg saw the modernist project as akin to the Kantian commitment to critical philosophy: like Kant, modernist artists, on Greenberg’s view, engage in a project of criticism, reflecting on the nature of the form in its purity by discovering and articulating its limits. Cavell, in his “A Matter of Meaning It” (1969), identifies modernism as the realization of an art form’s artistic media through the discovery of its contemporary conditions of possibility. The work of the modernist artist, according to Cavell, is to find the criteria for an instance of an art form in the act of inheriting that form.

The dominance of this modernist problematic was challenged in the 1960s as minimalist or conceptual art on the one hand and pop art on the other developed alternative artistic possibilities to be explored. These alternative artistic programs competed with modernist painting by rejecting painting and sculpture altogether as forms for artistic expression. Instead, the aim was to cultivate forms of experience in ways not bound by painting’s forms, its problematics, and its media. For example, pop art was interested in exploring the image and contemporary experiences of images as such, rather than posing the image as a problem situated merely within the history of painting.

This confrontation between modernism on the one hand and pop art, minimalism, or conceptual art on the other was felt by a number of artists and critics as a crisis involving the very existence of painting and sculpture as art forms. Michael Fried, in “Art and Objecthood” (1967), offers perhaps the strongest critical polemic on behalf of modernist painting and sculpture. Fried identifies recent developments in painting as responding to a conflict between minimalists and modernists about how shape should function as an artistic medium:

What is at stake in this conflict is whether the paintings or objects in question are experienced as paintings or objects, and what decides their identity as painting is their confronting of the demand that they hold as shapes. Otherwise they are experienced as nothing more than objects. This can be summed up by saying that modernist painting has come to find it imperative that it defeat or suspend its own objecthood, and that the crucial factor in this undertaking is shape, but shape that must belong to painting—it must be pictorial, not, or not merely, literal. (151)

Fried here identifies the minimalist project as taking what he terms a literal approach to shape, in which shape on its own is apparently explored for its artistic possibilities. By contrast, for Fried the modernist project takes the art form itself as an artistic problematic or a contemporary question, and its exploration of medium is in service of the exploration of that problematic: What now are the conditions of painting?

Fried’s objection in condemning conceptual artists and minimalists as literalists is that exploration of the medium as such loses connection with what is possible within traditional art forms like painting and sculpture; namely, an aesthetic experience. Painting and sculpture aim at the production in their audiences of a shared moment of judgment, a moment of judgment that audiences together take pleasure in extending and contemplating. The literalists, on the other hand, construct for their audiences experiences that cannot be shared in a single moment of judgment but are necessarily individual explorations of objects within a space over some duration. Fried thinks conceptual and minimal artists offer audiences theatricalized experiences, unfolding for each individual in time without the possibility of a shared moment of aesthetic judgment. For Fried, it is not possible to arrive at the unity of an aesthetic experience simply by the exploration of material conditions in themselves, cut loose from any artistic problematic or aim. In contrast, modernist artists committed to the traditional art forms are interested in discovering the material conditions for experiences that demand aesthetic judgment. The modernist worry, articulated by Fried and Cavell, is that the possibility of authentic experiences of art is lost when questions of artistic medium no longer arise in relation to an existing art form and its traditions. Substituting theatricalized experiences for serious artistic experiences will mean that people no longer have experiences that are both aesthetic and ascetic. Grounding one’s explorations within the history of an art form in order to offer a contemporary instance of it calls for an appropriately serious response on the part of the beholder, a response that demands self-work in ways that enrich both the experience and the beholder. In contrast to aesthetic literalism, the modernist project thus involves the cultivation of aesthetic judgment; through contemplation, a better understanding of the relation between the present instance of the form and the history of the form is achieved.

d. Postmodernism

For those artists and critics committed to modernist art, the task at hand was the survival of traditional art forms through a radical exploration of what is most essential to a particular form. In so doing, the modernist artist aims to continue the art form by making an original contribution to the tradition and by creating work that discovers artistic possibilities on behalf of the form. But to those artists and critics who emerged in the wake of the modernist moment, this stance of the heroic artist revealing possibilities for an art form through creating new instances of the form came to seem inappropriate and a bit self-aggrandizing. Postmodern critics and artists in the 1970s and after developed new approaches to the history of traditional art forms. Rosalind Krauss, in “The Originality of the Avant-Garde” (1981), argues that modernists and avant-garde artists imagine that they make themselves the new origin of the art form as they continuously discover its essential conditions. Such modernist artists continually rediscover a few prominent automatisms, forms of repetition, as if they were the essence of painting and their discovery were an act of artistic originality. Krauss argues that rather than discover the essential material conditions of the art, modernist artists returned again and again to a fundamental form of repetition activated throughout the history of painting; namely, the grid. From this point of view, avant-garde and modernist artists do nothing but treat the various forms of repetition and automatisms that constitute the history of the art form as original individual discoveries of the grid and its possibilities for painting.

Postmodern art is characterized by a change in relation to an art form’s tradition. Rather than attempt to investigate the necessary material conditions for contemporary expression within the art form, the postmodernist attempts to activate a tradition’s discarded conventions in other ways, through exaggeration, juxtaposition, and unabashed repetition. Modernism’s approach to tradition is to strip away everything conventional and inessential in order to discover the fundamental conditions of the art form. Postmodernism instead approaches an art form’s tradition as a collection of automatisms to be explored and activated again through conscious repetition. Rather than discarding all that is unnecessary, the postmodern artist juxtaposes or exaggerates disparate conventions and so hopes to rediscover possibilities within forgotten automatisms that modernism would have discarded. From the point of view of postmodernism, Fried’s attachment to the experience of working on oneself in order to better behold modernist works of art looks like a self-aggrandizing response analogous to the modernist artist’s “genius” in laying bare the conditions of painting as such. On the other hand, if Benjamin is correct that the invention of lithography and subsequent technologies that produce and reproduce images irrevocably accelerated European art’s transition from forms of experience centered around cult value to forms of experience centered around exhibition value, then modernist commitments to the possibilities of aesthetic-ascetic experience in inheriting traditional art forms may be seen as late attempts to sustain the possibilities of art with cult value.

Modernism coheres around the concept of artistic medium, for the aims of modernism are to discover the possibilities, and thus the limits, the strengths, the tensions, and the contradictions, within an art form and its history by showing how the form’s material conditions can be transformed into new and newly definitive instances of the form. Because postmodern artists return to the history of the form to recover its discarded conventions and automatisms rather than discarding them, they no longer think of media in terms of the essence of an art form. But although medium is not central to postmodern art, it is nonetheless still useful in critically evaluating works of art. Postmodernism emerged in the wake of modernism; the break with the history of traditional art forms that constituted the modernist moment was a break from conventions that no longer provided conviction for artistic expression. Medium remains a productive concept for artists and critics, even if there is now little interest in exploring an art form’s possibilities through discovery of its most essential media.

7. New Forms of Popular Art in the 20th Century

The 20th century saw the emergence of a succession of new forms of popular art, including movies, comics, and video games. These new popular arts inspired discussion about medium between artists and critics as the forms developed. Especially early in the lives of new popular art forms, questions of medium and medium analysis seem pressing to both artists and critics. This is because new art forms grow by borrowing artistic problems and aims from related earlier forms and by exploring a different material basis that makes new forms of artistic expression possible and in which artistic questions and interests can be pursued, critiqued, or otherwise engaged.

a. Movies

Movies and film criticism are an exemplary instance of a new form of popular art generating elaborate and often productive discourses about medium. Much of the early history of film criticism and film theory is marked out by exploration of a number of questions related to film as a medium. As film theory began to establish itself as an academic field of interest in the early 1970s, interest shifted away from questions of medium. But early in the development of the movies as an art form, film’s potential for artistic expression was a prominent topic of critical conversation between theorists and artists.

In the 1920s, Soviet filmmakers and critics such as Sergei Eisenstein, Vsevolod Pudovkin, and Dziga Vertov were engaged, both in their writing and in their movies, in a critical discourse about film’s potential as a popular art. Eisenstein, for example, argued that film’s unique and characteristic feature was montage, the juxtaposition of images through editing into a sequence. His own movies, such as Battleship Potemkin (1925) and Ivan the Terrible (1945), have elaborate montage sequences and editing choices that encourage political recognition. Likewise, Pudovkin claims that montage and the juxtaposition of images through editing can change the meaning of images. For both, the emphasis on montage as film’s unique and characteristic feature, central to its possible artistic experiences, stems from their interest in the ways in which juxtaposing images can generate both abstract judgments and strong emotional responses. Vertov’s exploration of the photographic and mechanical basis of the film image was central to his artistic and political project of discovering new ways of allowing his audiences to understand the world around them. He emphasized film’s ability to reveal minute features and gestures, otherwise unseen or unnoticed, so that the audience is able to recognize them as characteristic of the overall environment. One exemplary instance of Vertov’s exploration of the nature of film as a medium of expression is his experimental movie Man with a Movie Camera (1928).

A number of early film critics developed an analysis of film’s artistic and political promise around its photographic basis. In “The Work of Art in the Age of Its Technological Reproducibility” (1939), Walter Benjamin emphasized photography’s ability to make visible minute aspects and gestures so as to display the character of people and environments. Similarly, popular movies offer the opportunity to develop new habits of perception that allow audiences to recognize fraught, meaningful gestures. Benjamin’s medium analysis is exemplary, for he explicitly asks of all reproductive technologies developed after lithography not, “What features of these material conditions are unique and thus capable of artistic experiences that take advantage of those features?” but rather, “How does the existence of these new reproductive technologies change what art can be?” Benjamin’s form of medium analysis is historically and critically grounded in successful instances of emergent forms of popular art.

Rudolf Arnheim, also writing film criticism in the 1920s and 1930s, offered an analysis of film’s potential as an artistic medium. Unlike Benjamin, who emphasizes film’s ability to reveal the optical unconscious, Arnheim identifies the ways in which the film image differs from everyday images and derives from those features the norms that should serve as the basis for film art. Arnheim’s critical blinders and his commitment to an idea of medium purity led him to argue against the possibility of film art that includes sound, because film and sound are distinct media and should not be mixed.

Another early theorist committed to a medium analysis of film and photography is Siegfried Kracauer. Kracauer’s critical articulation of film’s artistic possibilities stems, like Benjamin’s, from the unique capabilities of photography and film, and so he identifies and encourages film’s documentary and democratizing impulses.

André Bazin, a midcentury French film critic who championed the Italian neorealists and cultivated a generation of French film critics and filmmakers, developed an analysis of film’s photographic basis. Bazin emphasizes photography’s ability to satisfy absolutely the desire to preserve the world as the basis for a critical understanding of film and its possibilities. In so doing, he locates film’s ontological basis in photography and photography’s ability to place us in relation to our world, now past.

This tradition of exploring the meaning of film’s ontological status as photographic includes Stanley Cavell’s work, especially his The World Viewed: Reflections on the Ontology of Film (1971). Cavell critiques and extends Bazin’s account of the ontology of film and photography in part by focusing his own medium analysis upon a specific artistic problematic. The World Viewed offers a medium analysis not of film as such, but of popular Hollywood movies. On Cavell’s analysis of the medium of movies, prior to the 1960s popular movies explored the possibilities and tensions within a problematic of modern action that emerged in the 19th century, concerning the possibilities for urbane, stylish, and productive action; by the late 1960s, however, a new problematic concerning the contemporary possibilities for action simpliciter was emerging. The World Viewed was written in observance of this transition within popular movies and draws on the conceptual tools of medium analysis in order to register the fact of this transformation.

If the beginning of the 1970s saw the emergence of a new problematic for popular movies to discover and explore, it also saw the establishment of film theory as an academic discipline. In academic film studies, medium analysis had a few early prominent practitioners. Leo Braudy’s The World in a Frame: What We See in Films (1976), for example, offers an analysis of film’s artistic possibilities by distinguishing the ways in which movie worlds are both closed off from and open to, even interpenetrating with, our world. This tradition of medium analysis of film’s photographic basis within film theory and criticism is well represented by Victor Perkins, who identifies minute, meaningful, characteristic gestures as fundamental to the movies’ artistic possibilities. Perkins’ commitment to the fundamental role that artistic medium has within his critical practice points to an intimate nexus of considerations of medium and artistic experience within the creation of movies and artistic practices more generally.

However, soon after film theory and criticism found an academic home within film studies in the 1970s, theorists and critics moved away from sustained medium analysis of film or the film arts. Instead, academics developed alternative interpretative frameworks, prominently Lacanian, feminist, and Marxist ones, that displaced the prominence of medium analysis within film theory. In analytic philosophy, as philosophy and film established itself as a domain of inquiry, instances of medium analysis gave way primarily to cognitive science approaches to theorizing film and film experiences. Medium analysis depends on the unity of the aesthetic experience to which the medium in question is able to contribute. A cognitive science approach to the effects possible in certain modes of filmmaking need not concern itself with the unity of aesthetic experience.

In the 1990s, Noël Carroll, a leading advocate of the cognitive-science approach in analytic philosophy and film, argued that medium was necessarily a confused category and should be eschewed by philosophers interested in theorizing movies and other film arts. In the mid-2000s, Carroll adjusted his view and acknowledged uses for the concept of medium, especially in describing the practices of certain experimental or avant-garde film artists. Regardless, from its inception until the mid-1970s, one prominent form that film theory took was medium analysis.

b. Comics

Comics, as they have developed as an art form, have also developed critical and theoretic discourses that participate in some form of medium analysis. Much of the most prominent medium analysis has been by artists adopting a critical and theoretic stance with respect to their own artistic practices. Prominent instances of this medium analysis of comics by comics artists include Will Eisner’s Comics and Sequential Art (1985) and Scott McCloud’s Understanding Comics (1993). Both Eisner and McCloud offer paradigmatic instances of medium analysis, in that both are theorizing the particular ways in which comics, as an art form, are able to achieve forms of aesthetic unity in relating image and action.

c. Video Games

Video games and 21st century gaming offer another instance of an emergent popular art form that has inspired early practitioners, critics, and theorists to engage in medium analysis. Much of the academic discourse analyzing gaming grows out of film studies and necessitates some medium consideration as terms and interpretative frameworks are applied in new contexts or, alternatively, as theorists attempt to distinguish clearly between experiences that are proper to movies and other narrative visual forms and experiences that are proper to games and gaming. Medium analysis has been an important aspect of developing theoretic and critical discourses about gaming in which game creators and theorists are in conversation.

8. Conclusion

Currently within academia, medium analysis is largely pursued in media studies and disciplines exploring the emergence of new media. In philosophy, medium analysis has recently been utilized in numerous ways within the philosophy of gaming and video games. Given the ways in which screens and screen technology continue to interpenetrate contemporary reality, we can anticipate further recourse to medium analysis in theorizing these new forms of experience. Even if the collapse of interest in modernist projects in the arts has moved contemporary theorizing about art away from medium as a central concept, academic theorists of new media and new popular arts still participate in a discourse of medium analysis.

Artistic medium also continues to be a productive critical concept for working artists and critics interested in articulating the means by which an artistic experience is structured and organized. That theorists and critics should run into characteristic confusions in defining and theorizing medium stems from the picture they share of medium as an object. They take their task to be identifying the object that is the medium in order to deduce and prescribe its appropriate artistic experiences. But working critics and artists are less likely to think of artistic medium as an object to be studied for its own sake. For such critics and artists, thinking about medium is thinking about how something functions in creating a particular effect or in structuring a particular form of experience. What photography, for example, can be is to be discovered by artists as they pursue particular projects or lines of exploration. In this sense, artistic medium is a critical concept; we can only say what media are constitutive of an art form by critically examining instances of the form. This way of approaching medium analysis, as necessarily a critical pursuit, conceives of artistic media as capacities for organizing and structuring the audience’s experience, capacities that serve as the means of exploring and discovering the possibilities and tensions within an artistic problematic. These capacities for organizing artistic experience are forms of repetition, or automatisms, that have significance as the means by which a form of artistic experience is structured.

Medium analysis emerged with, and has developed in response to, modern art. It took shape as critics and theorists in the 18th century began to argue that art and the aesthetic constitute a distinct form of experience, independent of art’s former subservience to religion and free to dedicate itself to beauty alone. First developed in Lessing’s work, medium analysis is a critical tool for understanding the norms constituting a capacity for structuring an artistic experience. The value of artistic medium in theoretical and critical discourse is realized when medium is approached not as some raw material to be investigated in advance of its possible artistic uses, but as a means by which artists discover and explore possibilities within a particular artistic problematic. A medium’s possibilities are discovered by artists in creating new instances of an art form, and by audiences and critics in experiencing particular instances of the form. Theorists can avoid confusion by remembering that medium is an essentially critical concept: what is possible within a medium is discovered by artists as they continue to create and explore.

9. References and Further Reading

  • Adorno, T. W. (2006). Philosophy of new music. R. Hullot-Kentor (Ed.). Minneapolis: University of Minnesota Press.
    • First published in 1947, Adorno here analyzes the work of Schoenberg and Stravinsky as exemplary of the new possibilities in 20th century music and identifies the medium of music as a historical phenomenon.
  • Adorno, T. W. (2014). Current of music. Cambridge, United Kingdom: Polity.
    • This is a collection of Adorno’s work on radio, much of it unpublished during his lifetime, which analyzes how radio determines possibilities for experiencing music.
  • Aristotle. (2014). Poetics. In J. Barnes (Ed.). Complete works of Aristotle: The revised Oxford translation (Vol. 2). Princeton, NJ: Princeton University Press.
    • This is the standard English translation of Aristotle’s analysis of tragedy as an art form.
  • Arnheim, R. (1957). Film as art. Berkeley: University of California Press.
    • Arnheim argues that film’s artistic potential is best realized by taking advantage of the features unique to the medium.
  • Atget, E., & Abbott, B. (1964). The world of Atget. New York, NY: Horizon.
    • This is a collection of the work of Eugene Atget, whose photographs of Paris streets, according to Walter Benjamin, exemplify artistic possibilities for photography as an artistic medium.
  • Baudelaire, C. (1980). The modern public and photography. In A. Trachtenberg (Ed.), Classic essays on photography (pp. 83–90). Stony Creek, CT: Leete’s Island Books.
    • In this essay, Baudelaire argues that photography cannot be an artistic medium because it does not engage the imagination appropriately.
  • Bazin, A. (1968). The ontology of the photographic image. In H. Gray (Trans.), What is cinema? (Vol. 1, pp. 9–16). Berkeley: University of California Press.
    • Bazin holds that photography satisfies once and for all the desire to preserve reality and thus opens up new artistic possibilities.
  • Benjamin, W. (1999). Little history of photography. In M. W. Jennings, H. Eiland, and G. Smith (Eds.), Selected writings (Vol. 2, Part 2, pp. 507–530). Cambridge, MA: Harvard University Press.
    • Benjamin describes the history of 19th and early 20th century photographic theories and practices.
  • Benjamin, W. (2003). The work of art in the age of its technological reproducibility: Third version. In H. Eiland and M. W. Jennings (Eds.), Selected writings (Vol. 4, pp. 251–283). Cambridge, MA: Harvard University Press.
    • In this seminal essay, Benjamin identifies the transition to technological reproducibility as a fundamental shift in the nature of art and argues that film constitutes a new mode of perception.
  • Blasius, L. (2006). Mapping the terrain. In T. Christensen (Ed.), The Cambridge history of Western music theory (pp. 27–45). Cambridge, United Kingdom: Cambridge University Press.
    • This is a helpful overview of the history of Western music theories.
  • Bower, C. (2006). The transmission of ancient music theory into the Middle Ages. In T. Christensen (Ed.), The Cambridge history of Western music theory (pp. 136–167). Cambridge, United Kingdom: Cambridge University Press.
    • This essay describes the reception of ancient music theory during the European Middle Ages.
  • Braudy, L. (2002). The world in a frame: What we see in films. Chicago, IL: University of Chicago Press.
    • Originally published in 1976, Braudy describes the artistic possibilities particular to popular movies.
  • Burnham, S. (2006). Form. In T. Christensen (Ed.), The Cambridge history of Western music theory (pp. 880–906). Cambridge, United Kingdom: Cambridge University Press.
    • This essay gives an overview of the emergence of musical form as a central theoretical category in the 18th and 19th centuries.
  • Carroll, N. (1985). The specificity of media in the arts. Journal of Aesthetic Education, 19(4), 5–20.
    • This essay is the earliest instance of Carroll’s critique of medium specificity.
  • Carroll, N. (1996). Theorizing the moving image. Cambridge, United Kingdom: Cambridge University Press.
    • In the opening chapters of this book, Carroll offers his most developed criticism of medium specific theories and his most sustained skepticism about the coherence of the concept of artistic medium.
  • Carroll, N. (2006). Philosophizing through the moving image: The case of Serene Velocity. The Journal of Aesthetics and Art Criticism, 64(1), 173–185.
    • In this essay, Carroll acknowledges the need for the concept of artistic medium in describing the experience of structural films such as Serene Velocity.
  • Cavell, S. (1969). A matter of meaning it. In Must we mean what we say?: A book of essays (pp. 213–237). Cambridge, United Kingdom: Cambridge University Press.
    • In this early essay, Cavell articulates an understanding of artistic medium as something discovered and explored by artists as they create.
  • Cavell, S. (1979). The world viewed: Reflections on the ontology of film. Cambridge, MA: Harvard University Press.
    • In this seminal work in the philosophy of film, Cavell articulates the medium of film as a succession of automatic world projections.
  • Cavell, S. (1981). Pursuits of happiness: The Hollywood comedy of remarriage. Cambridge, MA: Harvard University Press.
    • Cavell develops an account of movie genre as artistic medium by articulating Hollywood remarriage comedies as a distinct genre.
  • Cavell, S. (1996). Contesting tears: The Hollywood melodrama of the unknown woman. Chicago, IL: University of Chicago Press.
    • Cavell returns to the concept of genre as medium by developing an account of a companion genre, the melodrama of the unknown woman.
  • Cavell, S. (2014). The fact of television. In Themes out of school: Effects and causes (pp. 235–281). Chicago, IL: University of Chicago Press.
    • Cavell develops an account of television as medium by contrast with his account of film in The World Viewed.
  • Christensen, T. (2006a). Introduction. In The Cambridge history of Western music theory (pp. 1–26). Cambridge, United Kingdom: Cambridge University Press.
    • This is a helpful précis of the history of Western music theory.
  • Christensen, T. (Ed.). (2006b). The Cambridge history of Western music theory. Cambridge, United Kingdom: Cambridge University Press.
    • This collection offers in-depth essays on various crucial aspects of the history of Western music theory.
  • Cox, J., & Ford, C. (2003). Julia Margaret Cameron: The complete photographs. Los Angeles, CA: Getty.
    • This is a comprehensive collection of Julia Margaret Cameron’s photography.
  • Curtis, W. J. (1996). Modern architecture since 1900. London, United Kingdom: Phaidon.
    • This is a good overview of architecture in the 20th century, including modernist architecture.
  • de Font-Reaulx, D. (2013). Painting and photography: (1839–1914). Paris, France: Flammarion.
    • This describes the complicated relations and lines of influence between painting and photography in the 19th and early 20th centuries.
  • Dewey, J. (2005). Art as experience. New York, NY: Penguin.
    • In his classic text on art as a form of experience, Dewey distinguishes between an artistic medium and the raw material out of which works of art are made.
  • Diderot, D. (1995). Diderot on art, volume 1. The salon of 1765 and notes on painting. J. Goodman (Ed.). New Haven, CT: Yale University Press.
    • This collection contains much of Diderot’s critical writing on painting.
  • Duncan, I. (2013). My life (revised and updated). New York, NY: Liveright.
    • Duncan’s autobiography also includes her reflections on her dance practices and commitments.
  • Eisenstein, S. (2014). Film form: Essays in film theory. Chicago, IL: Houghton Mifflin Harcourt.
    • This is a collection of Eisenstein’s essays on film technique and theory.
  • Eisner, W. (2008). Comics and sequential art: Principles and practices from the legendary cartoonist (Will Eisner instructional books). New York, NY: W. W. Norton.
    • Eisner offers an account of the principles underlying his approach to comics based on his decades as a comic artist.
  • Frascina, F., Harrison, C., & Paul, D. (Eds.). (1982). Modern art and modernism: A critical anthology. London, United Kingdom: Sage.
    • This is a good collection of critical writings about modernist arts.
  • Fried, M. (1980). Absorption and theatricality: Painting and beholder in the age of Diderot. Berkeley: University of California Press.
    • Fried offers an account of the artistic problematic articulated by Diderot and explored in 18th century painting.
  • Fried, M. (1998a). Art and objecthood. In Art and objecthood: Essays and reviews (pp. 148–172). Chicago, IL: University of Chicago Press.
    • Fried’s essay argues that the turn from modernism to minimalism and conceptual art in the 1960s depends on a misunderstanding of what exploring an artistic medium can be.
  • Fried, M. (1998b). Manet’s Modernism: Or, the face of painting in the 1860s. Chicago, IL: University of Chicago Press.
    • In this book, Fried argues that Manet’s exploration of the artistic problematic he inherited from 18th and 19th century French painting constituted a form of modernism.
  • Fried, M. (2008). Why photography matters as art as never before. New Haven, CT: Yale University Press.
    • Fried argues that the cross-pollination of painting and photography has led to the most important artistic developments of the late 20th and early 21st centuries.
  • Greenberg, C. (1982). Modernist painting. In F. Frascina, C. Harrison, & D. Paul (Eds.), Modern art and modernism: A critical anthology (pp. 5–10). London, United Kingdom: Sage.
    • Greenberg’s essay argues that modernist painting approaches the medium of painting on the model of Kantian criticism, in order to discover it in its purity.
  • Greenberg, C. (1984a). Modernist sculpture, its pictorial past. In Art and culture: Critical essays (pp. 158–163). Boston, MA: Beacon.
    • In this essay, Greenberg outlines the ways in which modernist sculpture distinguishes itself from painting.
  • Greenberg, C. (1984b). Art and culture: Critical essays. Boston, MA: Beacon.
    • This is a collection of many of Greenberg’s essays on the project of modernism in painting and sculpture.
  • Halliwell, S. (1998). Aristotle’s Poetics. Chicago, IL: University of Chicago Press.
    • This is an insightful critical analysis of Aristotle’s Poetics.
  • Harrison, C., Wood, P., & Gaiger, J. (Eds.). (1998). Art in theory, 1815–1900: An anthology of changing ideas. Oxford, United Kingdom: Blackwell
    • This is a comprehensive collection of 19th century art theory.
  • Harrison, C., Wood, P., & Gaiger, J. (Eds.). (2000). Art in theory 1648–1815: An anthology of changing ideas. Oxford, United Kingdom: Blackwell.
    • This is a comprehensive collection of Western art theories prior to the 19th century.
  • Harrison, C., & Wood, P. (Eds.). (2003). Art in theory, 1900–2000: An anthology of changing ideas. Oxford, United Kingdom: Blackwell.
    • This is a comprehensive collection of 20th century art theories.
  • Hegel, G. W. F. (1975a). Hegel’s aesthetics: Lectures on fine art (Vol. 1). (T. M. Knox, Trans.). Oxford, United Kingdom: Oxford University Press.
    • Hegel’s lectures on art begin with reflections on the idea of art and its relation to thought.
  • Hegel, G. W. F. (1975b). Hegel’s aesthetics: Lectures on fine art (Vol. 2). (T. M. Knox, Trans). Oxford, United Kingdom: Oxford University Press.
    • Hegel’s lectures on art conclude with reflections on the differentiation and development of particular art forms.
  • Herder, J. G., & Gaiger, J. (2002). Sculpture: Some observations on shape and form from Pygmalion’s creative dream. Chicago, IL: University of Chicago Press.
    • Herder’s reflections on the nature of sculpture develop a critical response to Lessing’s account of painting as inclusive of sculpture.
  • Horace. (1989). Ars Poetica. In N. Rudd (Ed.). Horace: Epistles book II and Ars Poetica (Vol. 2). Cambridge, United Kingdom: Cambridge University Press.
    • Horace describes the form of poetic experience as structured by imitation.
  • Joyce, J. (1986). Ulysses. New York, NY: Vintage Books.
    • Joyce’s classic work of modernism, first published in 1922, explores the history of literature and its forms.
  • Joyce, J. (1999). Finnegans wake. New York, NY: Penguin.
    • Joyce’s final work explores the nature of language and its expressive possibilities.
  • Kracauer, S. (1997). Theory of film: The redemption of physical reality. Princeton, NJ: Princeton University Press.
    • Kracauer’s account of film identifies the medium’s central feature to be its ability to connect us with reality.
  • Krauss, R. (1986a). The originality of the avant-garde. In The originality of the avant-garde and other modernist myths (pp. 151–170). Cambridge, MA: MIT Press.
    • Krauss’ essay calls into question the cult of originality underlying the critical reception of modernist art and articulates the theoretical framework for postmodernist responses.
  • Krauss, R. (1986b). The originality of the avant-garde and other modernist myths. Cambridge, MA: MIT Press.
    • This is a collection of Krauss’ critical essays arguing for postmodern artistic possibilities.
  • Lear, J. (1992). Katharsis. In Essays on Aristotle’s Poetics (pp. 315–340). Princeton, NJ: Princeton University Press.
    • Lear’s essay offers an insightful interpretation of the role of catharsis in Aristotle’s account of tragedy.
  • Lessing, G. E. (1962). Hamburg dramaturgy. (H. Zimmern, Trans.). Mineola, NY: Dover.
    • This volume collects Lessing’s theatrical criticism and contains his reflections on Aristotle’s account of tragedy.
  • Lessing, G. E. (1984). Laocoön: An essay on the limits of painting and poetry. (E. A. McCormick, Trans.). Baltimore, MD:Johns Hopkins University Press.
    • Lessing’s essay is arguably the classic work of medium analysis; in it, he distinguishes between painting and poetry as two different methods for imagining action.
  • Levenson, M. (Ed.). (2011). The Cambridge companion to modernism. Cambridge, United Kingdom: Cambridge University Press.
    • This is a collection of critical essays reflecting on modernist art.
  • Loos, A. (1998). Ornament and crime. In A. Opel (Ed.), Ornament and crime: Selected essays (pp. 167–177). Riverside, CA: Ariadne.
    • Loos’ essay is a polemic for architectural modernism and against unnecessary ornamentation.
  • McCloud, S. (1994). Understanding comics: The invisible art. New York, NY: William Morrow.
    • McCloud’s book is the articulation of the nature of the medium of comics by a contemporary comic artist.
  • Medium, n. and adj. 2016. OED Online. Web.
    • This article tracks the etymology and evolution of the term “artistic medium.”
  • Perkins, V. F. (1990). Must we say what they mean? Film criticism and interpretation. Movie34(5), 1–6.
    • In this essay, Perkins defends the critical value of the concept of medium and identifies the ability to capture minute but meaningful gestures at the heart of the medium of the movies.
  • Perkins, V. F. (1993). Film as film: Understanding and judging movies. Boston, MA: Da Capo.
    • Perkins’ book articulates normative standards specific to movies and draws on the concept of medium to do so.
  • Perron, B., & Wolf, M. J. (Eds.). (2009). The video game theory reader. New York, NY: Routledge.
    • This collection of recent essays of video game theory includes a number of essays that use the concept of medium in order to articulate the experiences specific to video game play.
  • Pudovkin, V. I. (2013). Film technique and film acting: The cinema writings of V. I. Pudovkin. Redditch, United Kingdom: Read Books.
    • This collection of Pudovkin’s writing on film theory includes many reflections on what is particular to the medium of film.
  • Rasch, R. (2006). Tuning and temperament. In T Christensen (Ed.). The Cambridge history of Western music theory. Cambridge, United Kingdom: Cambridge University Press.
    • This essay gives a helpful account of the emergence of temperament as a music theoretic category in the European music tradition.
  • Rorty, A. (1992). The psychology of Aristotelian tragedy. In A. Rorty (Ed.), Essays on Aristotle’s Poetics (pp. 1–22). Princeton, NJ: Princeton University Press.
    • This essay describes Aristotle’s views on the audience’s experience of tragedy.
  • Talbot, W. H. F. (1969). The pencil of nature. Boston, MA: Da Capo.
    • This essay is a reflection on the nature of photography by one of its inventors.
  • Trachtenberg, A. (1980). Classic essays on photography. Stony Creek, CT: Leetes Island Books.
    • This collection contains a number of important essays on photography and its possibilities from the 19th century.
  • Vertov, D. (1984). Kino-eye: The writings of Dziga Vertov. Berkeley: University of California Press.
    • Vertov’s writings reflect on the radical possibilities for contemporary perception offered by film.
  • Vertoy, D. (1998). Man with a movie camera [Motion Picture]. United States: Image Entertainment.
    • Released in 1929, Vertov’s experimental film explores the range of documentary possibilities for film.
  • Wack, D. (2013). Medium and the end of myths: Transformation of the imagination in The world viewedConversations: The Journal of Cavellian Studies, 1, 39–58.
    • This essay describes the transformation in the medium of movies that Cavell identifies in The World Viewed.
  • Wack, D. (2014). How movies do philosophy. Film and Philosophy, 18.
    • This essay argues that movies, documentaries, structural films, cartoons, and so on all constitute distinct artistic mediums and identifies the medium of the movies as structured around the apprehension of action.
  • Wellbery, D. E. (1984). Lessing’s Laocoön: Semiotics and aesthetics in the age of reason. Cambridge, United Kingdom: Cambridge University Press.
    • Wellbery’s book is an insightful and sustained interpretation of Lessing’s Laocoön essay.
  • Winckelmann, J. J., & Potts, A. (2006). History of the art of antiquity. Los Angeles, CA: Getty.
    • Winckelmann’s book on ancient art was widely influential in the 18th century and definitive for the development of art history as an intellectual discipline. 

 

Author Information

Daniel Wack
Email: dwack@knox.edu
Knox College
U. S. A.

Political Revolution

Revolutions are commonly understood as instances of fundamental socio-political transformation. Since “the age of revolutions” in the late 18th century, political philosophers and theorists have developed approaches aimed at defining what forms of change can count as revolutionary (as opposed to, for example, reformist types of change) as well as at determining if and under what conditions such change can be justified by normative arguments (for example, with recourse to human rights). Although the term has its origins in the fields of astrology and astronomy, “revolution” has witnessed a gradual politicization since the 17th century. Over the course of significant semantic shifts that often mirrored concrete political events and experiences, the aspect of regularity, originally central to the meaning of the term, was lost: Whereas in the studies of, for example, Nicolaus Copernicus, “revolution” expressed the invariable movements of the heavenly bodies and, thus, the repetitive character of change, in its political usage the term particularly stresses the moments of irregularity, unpredictability, and uniqueness.

In light of the marked heterogeneity of the ways in which thinkers such as Thomas Paine (1737-1809), J.A.N. de Condorcet (1743-1794), Immanuel Kant (1724-1804), G.W.F. Hegel (1770-1831), Mikhail Bakunin (1814-1876), Karl Marx (1818-1883), Hannah Arendt (1906-1975), and Michel Foucault (1926-1984) reflect on the possibilities and conditions of radically transforming political and social structures, this article concentrates on a set of key questions confronted by all these theories of revolution. Most notably, these questions pertain to the problems of the new, of violence, of freedom, of the revolutionary subject, of the revolutionary object or target, and of the temporal and spatial extension of revolution. In covering these problems in turn, it is the goal of this article to outline substantial arguments, analyses, and aporias that shape modern and contemporary debates and, thereby, to indicate important conceptual and normative issues concerning revolution.

This article is divided into three main sections. The first section briefly reconstructs the history of the concept “revolution.” The second section gives an overview of the most important strands of politico-philosophical thought on revolution. The third section examines paradigmatic positions developed by theorists with respect to the central problems mentioned above. As the majority of thinkers who address revolution do not elaborate comprehensive theories and as there is comparatively little thematic secondary literature on the subject, this part proposes a framework for individually situating and systematically relating the differing approaches.

Table of Contents

  1. History of the Concept
  2. Three Traditions of Thought
    1. The Democratic Tradition
    2. The Communist Tradition
    3. The Anarchist Tradition
  3. Concepts of Revolution
    1. The Question of Novelty
    2. The Question of Violence
    3. The Question of Freedom
    4. The Question of the Revolutionary Subject
    5. The Question of the Revolutionary Object
    6. The Question of the Extension of Revolution
  4. Conclusion
  5. References and Further Reading

1. History of the Concept

In preparation for the presentation of the different philosophical approaches to revolution in the remainder of this article, this section provides a concise outline of the history of the concept. In so far as “revolution” is employed to describe political transformation, conceptual historians understand its origins to be genuinely modern. Critically informed by the experience of the revolutions in England, America, and France, the term in common usage designates the epitome of political change, that is, change not only in laws, policies, or government but in the established order that is both profound and durable. Earlier conceptions of political change lack the notions of a people’s autonomous ability to act or of its right to emancipation. Further, the absence of two structural preconditions explains why revolution in the sense of fundamental politico-social transformation is not conceived prior to modernity. On the historical level, it is the formation of the “strong” state that is conducive to a political imagination of radical liberation from state oppression and the subsequent founding of an essentially different order. The extent of the Hobbesian type of the state’s disciplining power and the impossibility of direct political participation thus lay the ground for revolutionary projects. On the conceptual level, the supersession of cyclical conceptions of history, as advocated by Aristotle, Polybius, Cicero, or Machiavelli, by linear models of thought allows for the idea of irreversible progress in politics and society. In the course of this shift in historical thinking, revolution eventually comes to be looked upon as a catalyzing, even enabling factor of progress. Since history is no longer understood as dependent on forces beyond human control (such as, for example, divine providence), human agency comes to be regarded as the decisive factor in shaping its course (compare Koselleck, 1984 and 2004; for arguments that revolution, both as a concept and a phenomenon, does have pre-modern origins, compare Rosenstock-Huessy, 1993 [1938]; Berman, 1985).

The history of political thought largely attests to the assessment that the idea of revolution as structural, justifiable change is unknown prior to modernity. Aristotle’s reflections on political change (metabolé tes politeías) in books III and IV of Politics show that the alterations he takes into consideration do not amount to the complete breakdown of an existing order, its organizing hierarchy, and its principles of inclusion/exclusion. Despite certain arguable similarities to modern concepts (for instance, with respect to the element of violence), conceptual predecessors of “revolution” such as stasis and kinesis in the Greek tradition or seditio, secessio, and tumultus in the Roman tradition have strong negative connotations. In ancient and medieval political thought, they are primarily related to anarchy and civil war. Even in the works of an early-modern thinker like Machiavelli, the idea of an absolute hiatus, a fundamental rupture on the continuum of politics, is not fully developed. Although he is occupied with political change, key concepts related to the topic (most importantly, rinovazione, mutazione, and alterazione) are overridden by the conviction that all shifts as to forms of constitutions ultimately do not break out of a cycle of historical recurrence. In short, the notion of a world-shaping human “power to interrupt” and “to begin” (compare Merleau-Ponty, 2005 [1945]) and the corresponding “pathos of novelty” (compare Arendt, 2006 [1963]) remain alien to pre-modern thought.

In the 17th and 18th century, the discovery of revolution as a relevant political category is reflected and supported by political and moral philosophy. John Locke, in his Second Treatise on Civil Government (1689), develops an influential defense of the right of resistance, rebellion, and even revolution. Going beyond Thomas Hobbes’s considerations on a subject’s right to defend herself against the sovereign if her life is under threat, his social contract theory presents this protective right against state coercion and oppression as a necessary political concretization of the individuals’ inalienable natural right to “life, liberty, and estate.” Jean-Jacques Rousseau, in the Discourse on the Origin of Inequality (1755) and the Social Contract (1762), aims at exposing the morally degenerate, politically illegitimate state of the Ancien Régime and proposing a liberal, egalitarian political and legal constitution to replace it. According to Rousseau, the “general will” ousts the particular will of the monarch as the guideline in politics, thereby implying that the people attain autonomy, sovereignty, and, thus, the status of full political subjectivity. Locke’s and Rousseau’s considerations thus importantly add to a revaluation of acts of protest and insurrection: Such acts can no longer be dismissed as the work of political offenders or public enemies as was the case prior to the undermining of the “political theology” of absolutism and feudalism, which was largely based on the doctrine of divine right (compare Kantorowicz, 1997 [1957]; Walzer, 1992). Instead, thanks to the political thought of the enlightenment in general and to Lockean and Rousseauian social contract and natural rights theory in particular, such acts can now be interpreted as an exercise of rationally and morally justifiable political self-determination. Although neither Locke nor Rousseau present elaborated theories of revolution, they develop positions that are inherently critical of any political order that is not built on the principles of consent and trust and, thus, potentially revolutionary. Their reflections on legitimate governance and on citizens’ rights go beyond earlier discussions of justified resistance to monarchs—such as the 1579 Vindiciae contra Tyrannos, published under the pseudonym Stephen Junius Brutus—which rely on expertocratic leadership as opposed to political self-determination of the people. Their works thus prepare the ground for the two main ideas of the revolutionary age: “natural” human rights and national sovereignty (compare Habermas, 1990; Menke/Raimondi, 2011).

Resulting from a plethora of intellectual and material factors, the distinctly modern understanding of “revolution” takes shape on the eve of the historical revolutions of the late 18th century: It is both a “combat term” (R. Koselleck) in political praxis and an “essentially contested concept” (W.B. Gallie) in political theory. It is in the works of thinkers like Condorcet, Kant, or Marx that this contest is henceforth held and that the specific political and philosophical meaning of the term is spelled out, albeit in widely differing ways.

2. Three Traditions of Thought

Before turning to a detailed examination of important conceptual and normative issues concerning revolution, this section aims at giving an overview of three dominant lines of thought on revolution. Given the considerable discontinuities and breaks within each of these strands on the one hand and the numerous overlaps and interchanges between them on the other, the lines of thought presented here have to be understood as ideal types. Although it is likely that there are alternative perspectives, very few theories of revolution resist classification into one of these strands.

a. The Democratic Tradition

A primarily democratic strand of theory is influenced by the works of Locke, takes shape in Thomas Jefferson’s and J.A.N. de Condorcet’s thinking, and is further developed in Kant’s reflections on gradual, yet profound transformation. Throughout the 19th and 20th century, it is continued selectively in the late writings of Friedrich Engels or in Hannah Arendt’s and Jürgen Habermas’s considerations of the subject. This strand is characterized by a strong emphasis on non-violent, legal means and on politico-legal liberty and equality as the essential aims of revolution. Its representatives understand revolution as a continuing project or task that cannot reach a point of completion and satisfaction. Correspondingly, these thinkers, for the most part, reject notions of instantaneous rupture and absolute novelty, thereby undermining rigid distinctions between revolutionary and reformist change. Key elements of this tradition resonate in the work of a contemporary thinker like Etienne Balibar. He suggests an understanding of revolution as a progressive power that operates from within the democratic system. Instead of aiming at the radical overthrow of this system, democratic citizens assume the role of the revolutionary subject by advocating constant additions to and revisions of the existing order and its institutions—for example, an extension of what Arendt calls “the right to have rights” to non-citizens, increased possibilities for political participation, or a more consistent adherence to human rights—allowing for its continued legitimacy (compare Balibar, 2014).

b. The Communist Tradition

A primarily communist line of revolutionary theory begins with the works of Rousseau. This line is elaborated decisively in the thinking of Karl Marx and Friedrich Engels. Significant modifications notwithstanding, it is continued in the writings of Vladimir Lenin and Jean-Paul Sartre during the 20th century. The majority of its representatives share the belief in the possibility of revolutions being finalized and completed. Although they offer different suggestions as to justifiable forms and degrees of violence, they further share the idea that violence, in general, can function as an acceptable means of revolution. They also agree that the realization of material liberty and equality (as opposed to merely “formal,” that is, legal liberty and equality) in the social sphere is its main goal. As this sphere includes apolitical institutions such as the market, substantial revolutionary transformation cannot satisfy itself with abstract political principles but needs to affect the concrete conditions in which a society exists (for example, the relations of production). In addition, the notion of solidarity is central to these thinkers’ vision of revolutionary action and of a post-revolutionary society that is realized through these actions. Key elements of this strand of revolutionary thought shape the works of contemporary theorists such as Alain Badiou and Slavoj Zizek. Because they interpret existing democratic orders as regimes of radical immanence, it is evident to them that genuine transcendence (a “communism to come”) has to manifest itself as a supersession of this order. To overcome the inherently bourgeois structures and discourses of power that are ceaselessly reproduced by late-capitalist democracies, radical disruptions are needed. Taking the form of acts of “terror” or “subtraction,” such disruptions express the “eternal truths” of the suffering of the masses (compare Badiou, 2012; Zizek, 2012).

c. The Anarchist Tradition

An anarchist tradition of revolutionary theory has its sources in 19th century America (Josiah Warren), France (Pierre-Joseph Proudhon), and in the thought of the Russian theorists Mikhail Bakunin and Peter Kropotkin. This tradition is later taken up in the works of, for example, Emma Goldman, Rosa Luxemburg, and Paul Goodman. Although these thinkers differ considerably in their assessment of revolutionary violence, they converge as to the crucial emancipatory aim of revolution: As any form of institutionalized authority is considered incompatible with human autonomy, their vision is the creation of a society independent of “imperial institutions” in the economic, social, and political realms. Consequently, they do not content themselves with a redistribution of political power, however radical, within the framework of the state, but aim at its abolition instead. David Graeber, in his contemporary reformulation of anarchism, describes the way in which the envisaged revolutionary abolition of vertical structures is linked to the emergence of new forms of horizontal relations, that is, of communal existence. These forms are no longer organized by the logic of dominance and of cost/benefit; instead, they are shaped by the principles of mutual aid and free cooperation, which are not guided by instrumental rationality (compare Graeber, 2004).

3. Concepts of Revolution

The following section discusses central questions addressed in the works of theorists from these main strands: The questions of novelty, violence, freedom, the revolutionary subject, the revolutionary object or target, and the extension of revolution. As it is neither possible to comprehensively discuss relevant concepts of revolution proposed by political philosophers and theorists nor to comprehensively include thematic considerations of the theorists presented here, this section contents itself with highlighting certain crucial features. Since this article is concerned with concepts of revolution as developed by political philosophers and theorists, important historical (compare Furet/Ozouf, 1989; Hobsbawm, 1996 [1962]; Palmer, 2014 [1959]), sociological (compare Skocpol, 1979), and political-science (compare DeFronzo, 2011) studies that primarily concentrate on the phenomenon of revolution, its empirical forms and causes, are not taken into account. Further, a number of theoretical explorations of revolution are also not taken into consideration. This applies to the works of partisans of revolution such as, for example, Georges Sorel or Georg Lukács as well as to the works of critics of revolution such as, for example, Edmund Burke, Jeremy Bentham, Joseph de Maistre, or Carl Schmitt.

The exclusive focus on the six questions mentioned above is justified by the fact that they constantly appear in the theoretical debates regarding revolution as criteria in determining (a) if and under what conditions political change can be considered as revolutionary and (b) if and under what conditions such revolutionary change can be considered as legitimate. Despite the differing historical settings as well as the differing political and philosophical commitments of the individual thinkers, these questions thus constitute the common themes that connect their heterogeneous approaches to revolution. For each of these questions, the intent is to display the extremes of the spectrum on which important theorists of revolution operate and to indicate paradigmatic stances they take on this spectrum. It is with the help of this analytical framework that the various approaches to revolution since its intellectual discovery can be individually situated and systematically related to one another: The original revolutionary experience in the context of the American and French Revolution as reflected in the writings of Jefferson, Paine, Sieyès, and Condorcet; its reception in German Idealism; the further development of revolutionary thought in different versions of Marxism; its application to the problem of colonialism in the 20th century; and, finally, contemporary debates about the relevance and meaning of revolution informed, among other things, by the crises of late capitalism and representative democracy.

a. The Question of Novelty

The question of novelty pertains to the degree of revolutionary transformation and to the mode in which such transformation is achieved. Whereas some theorists of revolution argue that the post-revolutionary state needs to be absolutely new and different in comparison to the pre-revolutionary state, others hold that revolution is conceivable as a realization of relative novelty. Similarly, whereas some theorists argue that transformation needs to take place in a historically disruptive or discontinuous fashion in order to be revolutionary in character, others hold that effective revolutionary change can unfold in a continuous or stepwise manner.

For Thomas Paine, there can be no doubt that the American revolutionary struggle for independence from colonial rule, understood as a practical application of enlightenment thought, amounts to a radical break in history. According to his remarks, the liberation of the colonies from monarchical government must be seen as the unique and irreversible establishment of a fundamentally new political order. Employing nature as a timeless criterion for revolution, he describes monarchy not only as an anachronistic, unjustifiable “absurdity” but as a grave violation of natural law. In Paine’s view, its supersession by consent-based, liberal, and egalitarian republicanism is therefore tantamount to “begin[ning] the world over again” (Paine, 2000: 44). In contrast to Paine’s considerations that often oscillate between conceptual analyses and calls to revolutionary action (and, thus, indicate the difficulty inherent in addressing the subject of revolution in an objective, non-partisan manner), his contemporary Condorcet suggests an understanding of revolution that is not informed by a comparably strong concept of novelty. Condorcet’s understanding becomes particularly apparent in his stance towards the trial and execution of Louis XVI. Rejecting the extra-legalism advocated by, among others, Robespierre and Saint-Just, he develops a theoretical position that argues for the compatibility of profound change of the political system and historical continuity: For him, the largely unprecedented challenge of bringing the king to court can only be met by taking recourse to elements of previous politico-legal systems. What is more, it is precisely such elements that—under the condition that they are not just imitated, but innovatively rearranged—make the necessary “regulation” of revolutionary dynamics possible and, thus, guarantee revolutionary progress (compare Condorcet, 2012; Walzer, 1992). Instead of interpreting novelty in terms of the political creation of a “new world” without historical parallel, the new, here, is comprehended in terms of a reconfiguration of constitutive parts of the old, that is, of the pre-revolutionary world.

As represented here by Paine and Condorcet, the axis of the new, crucial for conceptually grasping revolution, runs between the extremes of absolute and relative rupture or inception. The ends of this spectrum are reflected in numerous later theories of revolution. For instance, Friedrich Engels (1820-1895), in late works such as his introduction to the reprint of Marx’s The Class Struggles in France, describes revolutionary struggle as ongoing and procedural in character. For Engels, this struggle cannot be detached from existing political, legal, and economic conditions, meaning that radical revolutionary breaks or leaps are inconceivable. As his moderate understanding of the new allows for minor modifications of the state of affairs to be labeled as revolutionary, it is inclined to tie revolution closely to reform. This propensity is reflected in his programmatic idea of a re-appropriation of universal suffrage, which turns it from a means of bourgeois dominance into an ultimately revolutionary means of proletarian liberation. As opposed to Engels’s approach to the question of the new, Walter Benjamin (1892-1940), in On the Concept of History, propounds an understanding of revolution as a state of exception in which the continuum of history is “burst open.” According to his “messianic” concept of novelty, revolutions are unforeseeable, kairological events that suspend the regular, chronological order of time: They constitute a leap into an epoch that is incommensurable with what has previously existed.

Immanuel Kant, in his thoughts on revolution, attempts to avoid similarly one-sided answers to the question of the new. Rather, his complex considerations on progressive transformation aim at undermining the dichotomy between emphatic and deflationary notions of the new by closely associating “complete change” or “complete revolution” (völlige Umwälzung) and “thorough reform” (gründliche Reform) (compare Kant, 2006c [1795/96]). Yet, Kant’s remarks on the subject of political or politico-moral change—scattered over writings such as What is Enlightenment?, Toward Perpetual Peace, The Metaphysics of Morals, and The Contest of the Faculties—seem marked by a tension between a reformist bias and revolutionary tendencies. Whereas the former is expressed in his privileging of enlightened monarchs such as Frederick II of Prussia as the agents of change or in his explicit criticism of the French Revolution on the grounds of excessive use of violence, the latter becomes apparent in his comments on the “enthusiasm” with which contemporary Europeans observe the revolutionary events in France or in his reflections on the radical switch from “despotism” to “republicanism,” that is, from the old absolutist order to a new order of freedom and morality. Kant evidently considers the difference between the two types of order to be tremendous: An order responsible for the heteronomous subjugation of the individual by the ruler is overcome by an order primarily characterized by the proliferation of individual autonomy and political participation as well as the decrease of armed conflict and war. Kant appears to resolve what presents itself as a tension between differing, even incompatible concepts of the new by taking into account the specific temporal constitution of profound political change: For him, such change is grasped adequately only as a process that is mediated in multiple ways, but not as a sudden gestalt switch. Rejecting the sharp, static dichotomy between relative and absolute novelty (and, with it, the dichotomy between reform and revolution) and integrating the two instead, Kant shows that there is no necessary interdependence between the suddenness and the depth of political change. He thus does not accept the common assumption among theorists of revolution and active revolutionaries alike that only abrupt, immediate transformation can count as profound and progressive in a relevant sense. Although republican states, according to Kant, are fundamentally different from despotic states, the principles of which are superseded entirely, the emancipatory transition from heteronomy to autonomy is achieved stepwise. Kant’s idea of “complete change” reflects his teleological understanding of history as an imperfect, yet steady development “from worse to better” as expounded in his considerations on the conditions of the possibility of progress in Idea for a Universal History from a Cosmopolitan Perspective and Conjectural Beginning of Human History; it crystallizes in concepts such as “gradualness” (Allmählichkeit) and “approximation” (Annäherung) used by Kant to illustrate his notion of progressive transformation. It follows that, for Kant, the new cannot be conceived in the theologically charged terms of the miracle or the “event.” Yet, the terminal phase of this gradual, indeterminate transition, for him, does mark the inception of a genuinely new age in the history of humanity, which is not only “an age of ‘enlightenment’” but “an enlightened age” (compare Kant, 2006a [1784]). Politically, the latter manifests itself in consent-based republican systems essentially guided by the humanity formulation of the Categorical Imperative and, thus, in a “political body the likes of which the earlier world has never known” (Kant, 2006b [1784]: 14).

Within the theoretical debates, further problems arise that are immediately tied to the question of revolutionary novelty. For instance, several theorists of revolution do not merely reflect upon the new in terms of its degree and its mode but also investigate its sources: The new is conceived as a result made possible by acts of re-appropriation (as expressed, for example, in Jefferson’s recourse to classical antiquity), by acts of reconfiguration (as expressed, for example, in Condorcet’s approach to assembling individual elements of various previous and present legal systems), or by acts of creation (as expressed, for example, in Bakunin’s idea of creative destruction by revolutionary “bandits”).

b. The Question of Violence

The question of violence pertains to legitimate means of revolutionary transformation. While some thinkers of revolution approve of violence as an essential vehicle for bringing about radical change and assert its creative capacities, others advocate its unreserved exclusion from the realm of progressive politics and turn to right and law instead. Again, numerous intermediate positions between the extremes of permissive and prohibitive attitudes toward violence can be found in which theorists try to identify specific conditions under which the use of violence is legitimate (for example, if violence contributes to a measurable increase in freedom) or to determine specific forms of violence that are justifiable (for example, violence against property). In addition, this section focuses on prevalent strategies for justifying revolutionary violence with recourse to, among other things, utilitarian and politico-theological arguments.

Anarchist theorist and activist Mikhail Bakunin, in his thoughts on radical socio-political transformation, stresses the creative power of humans in general and the creative potential of violence in particular. For him, revolution begins with the forcible destruction of the old (statist) order, which prepares the “fertile” ground for a fundamentally new (non-statist) order (compare Bakunin, 1990 [1873]). Even though Bakunin declares the institutions that constitute the political and economic centers of power to be the primary target of acts of revolutionary “bandits,” he holds that such violence can also legitimately affect the persons who are present at these centers. In order to justify the use of revolutionary violence, Bakunin argues for an understanding of such violence as reactive and necessary: Confronted with the repressive violence of the state, its police and military units, partisans of the “social revolution” must resort to violence. In his view, such violence is justified both as an act of self-defense and as a means of a progressive politics that transcends a deeply unjust status quo in which autonomy is made impossible by the existence and the authority of the state. Thus, for Bakunin, violence is not merely an extreme alternative in case non-violent (for example, legal) vehicles of transformation fail. Instead, it is an inherent factor of revolution. In his comments on revolution, provoked by the experience of the Iranian Revolution, Michel Foucault agrees with this assessment insofar as he considers manifestations of violence an important motor of transformative politics (compare Foucault, 2005 [1978-79]). Based on irreconcilable concepts of the political and further fueled by resentment, intolerance, and hatred, a quasi-Schmittian fighting position between “friends” and “enemies” of the revolution, that is, between the supporters of the “saint” (Ayatollah Khomeini) and the “king” (Shah Mohammad Reza Pahlavi) emerges. This fighting position, for Foucault, is to be seen as an inevitable element of radical change. Despite his constative judgment that violent conflict essentially enables revolutionary dynamics, he does not present an elaborate justification of revolutionary violence.

Contrary to Bakunin and Foucault, Kant understands violence as neither a necessary nor a justifiable element of revolution. Not only do his remarks reveal a pronounced reservation resulting from empirical observations of the cruelties committed in the course of the revolution in France (cf. Kant, 1991 [1798]). What is more, his rejection of the idea that violence could be considered a legitimate means of progress is a matter of principle. His position becomes particularly manifest in his reflections on the trial against Louis XVI as presented in the Doctrine of Right (compare Kant, 1996 [1797]). From the standpoint of his practical philosophy, there can be no doubt that the execution of the previous monarch is not acceptable. For Kant, this form of legally regulated and sanctioned regicide differs from historically well-known simple regicide, that is, the killing of a king on impulse or motivated by political power strategies: For in the trial, the established political principle of the inviolable nature of sovereign power is undermined and ultimately replaced by the principle of violence. Since the prosecution, in trying and finally executing the former king of France, does not appeal to a singular, exceptional situation but, instead, lends general juridical character to it, violent revolutionary insurrection against the sovereign is turned into a principle or Grundsatz of politics. To understand the right to violent resistance and revolution as a political principle (as is the case in the trial), for Kant, anticipates the Great Terror of 1793/94 (for a similar critique of the trial and execution of Louis XVI, compare Camus, 1991 [1951]). More importantly, it passes off the violent protest against sovereign governments as generally permissible and problematically normalizes it. As a consequence of the legalization of permanent insurrection, the consolidation of political (and, with it, moral) order is considerably complicated while civil disorder and war, in Kant’s view the key impediment to politico-moral progress, become the rule. A comparably unambiguous rejection of violence as an instrument of revolution can be found in Arendt’s On Revolution where she describes violence as a “limit” of the realm of the political: For her, the revolutionary praxis of violence (as exercised in the revolutions in France and Russia) as well as theoretical justifications of revolutionary violence (as given by, for example, Bakunin) are inherently anti-political.

Condorcet is among the thinkers who neither understand violence as an integral part of revolution, giving carte blanche to its use, nor completely rule out that it can serve as a justifiable means in processes of radical transformation. His intermediate position crystallizes in his considerations on the trial against Louis XVI: Representing the standpoint of the Girondins, he argues that the charges against the former king (or, rather, the “citizen Louis Capet”) cannot be based on “enmity” as suggested by Jacobins like Robespierre and Saint-Just, but have to refer to “treason” instead. The binary logic of the Jacobins, according to which any monarch has to either rule or die, and their corresponding attempt to apply the laws of war in the trial against the king are thus curbed. The position suggested by Condorcet allows for an at least tentative maintenance of the rule of law and of the validity of principles of justice. Like any other laws and measures, revolutionary laws and measures as developed in the course of the trial are subject to the rules of justice (compare Condorcet, 2012). In stark contrast to the Jacobins’ enthusiasm for unrestricted, extralegal, and decisionist self-authorization, what is emphasized here is the necessity of revolutionary self-restraint. According to Condorcet, the exceptional, unprecedented situation of the revolutionary trial has to be modeled on the ideal of due process of law if it is to remain distinguishable from mere revolutionary terror. Thus, revolutionary violence as it manifests itself in the eventual execution of the former king is not categorically rejected. However, it can only be considered as justified if it is legally channeled and, as a result, compatible with certain demands of justice. Insisting on the significance of revolutionary justice (however imperfect in its practical realization) in the exercise of legally qualified violent acts, Condorcet avoids the common opposition of either violence or law as the decisive tools of transformation. On the one hand, this treatment of the representatives of the old system, in not suspending the law, sets an example for the new order and for the way in which it interprets law and justice. It thus contributes to the transformation of revolutionary violence into legitimate authority. On the other hand, this treatment of the collapsed regime contributes to facilitating the peaceful co-existence of partisans and opponents of the revolution in a post-revolutionary society: Instead of declaring the former king to be a “moral monster” to be immediately “annihilated” and instead of declaring war against supporters of the monarchy and all other “enemies of freedom” as suggested by Robespierre and Saint-Just, Condorcet’s insistence on legal equality aims at finding peaceful trading zones and common ground between the factions so that previous political opponents can be repositioned as potential future partners.

Intermediate positions between the extremes of approval and rejection of violence as an instrument of revolution are also developed by Walter Benjamin, Herbert Marcuse, and, more recently, by Slavoj Zizek. With regard to the question of justification, these thinkers propose alternatives to Condorcet’s idea of legalized and, thus, legitimate revolutionary violence. Benjamin, taking recourse to political theology, interprets and justifies revolutionary movements as inner-worldly manifestations of unmediated “divine violence” that overcomes the oppressive “mythical violence” exercised by the state. With respect to the content and effect of “divine violence,” Benjamin’s remarks remain sketchy. On the one hand, the notion can be taken to imply the use of force against representatives of the state’s “mythical” authority; on the other hand, it can be interpreted as resulting in a fundamental transformation of the law which becomes critical of itself by recognizing and counter-balancing its inherent violent potential. At any rate, revolutionary movements, for Benjamin, represent a form of justice that incommensurably exceeds the existing legal order. If they are successful, they cathartically suspend the “serfdom” and “barbarity” characteristic of human history and realize the possibility of the fundamentally new (compare Benjamin, 1999 [1921]). Marcuse (1898-1979), in contrast, proposes a quasi-utilitarian justification of revolutionary violence. In Ethics and Revolution (1964), he argues that only a “brutal calculus” can determine whether a specific revolutionary project is legitimate. The suggested calculus amounts to a cost-benefit analysis of the probable number of victims on the one hand and the probable gains in human progress on the other (in terms of, for example, tolerance or human rights). For Marcuse, the historical events in England, America, and France prove the dialectical character of revolutionary violence, that is, the fact that violent conflict can contribute decisively to substantial economic and social, political and moral improvements. However, he insists that such violence is justifiable only if its use (a) is directly and recognizably tied to specific moral goals and (b) ceases at the earliest possible stage of the revolutionary process. Zizek (b. 1949) attributes a central role to violence as an instrument to break out of the absolutely immanent “deadlock” represented by the current order of liberal democracy and market economy. His reflections concentrate on the revolutionary capacities of passive forms of violence, which he presents as particularly justifiable. Most importantly, he suggests a “Bartlebian politics” of refusal and withdrawal that undermines the discursive power of the dominant system. Such a politics, which has an expressive, communicative function, rejects the prevailing “hegemonic” language and counters the existing system’s power to name with subversive silence. For Zizek, political forms of direct non-action, guided by Bartleby’s maxim of “I would prefer not to,” allow for a first negative step in the revolutionary process by creating a “vacuum” of effective power which, in a second step, can be filled with positive content. Thus, in arguing that, in the present circumstances, “doing nothing is the most violent thing to do” (an idea that also informs the traditions of strikes, pickets, and silent vigils), he designates radical non-action as a justifiable mode of revolutionary violence (compare Zizek, 2008).

Debates within and around contemporary movements with fundamentally transformative social and political agendas attest to the continued significance of violence, of its permissibility and justifiability, as the central normative problem in the context of revolution. Supporters of the Occupy movement deny the legitimacy of physical violence and, in particular, of physical violence directed against persons, as a means of revolutionary change. Instead, they largely subscribe to a “Bartlebian” revolutionary politics of non-violent violence, that is, a politics of subversive silence and, respectively, creative re-naming. The adherence to this kind of inactive, discursive violence was expressed performatively during the 2013 Gezi Park protests in Istanbul. Whereas the “standing man” actions enacted a “bodily politics” of obstruction (compare Butler, 2015) and an attitude of refusal through silence and passivity, the derogatory term çapulcu (looter, marauder) used by government officials to discredit the protesters was creatively appropriated by them and re-interpreted as an honorific title. In Egypt, supporters of the Arab Spring movement took recourse to certain strands within the Islamic legal tradition when considering the question of violence. It was not only in terms of human rights and democratic governance but also in terms of the Islamic law of rebellion and of war that the question of violence was discussed. Although the positions of the main legal schools of thought differ considerably in their assessment of the question, there is a pronounced tendency to attempt to avoid or, at least, limit violence in internal conflicts and to consider it justifiable only if all other means of bringing about change have been exhausted (compare El Fadl, 2006; Al Dawoody, 2011).

c. The Question of Freedom

The question of freedom pertains to the primary objective of revolutionary transformation. Here, the spectrum established by theorists of revolution spans between the poles of freedom as liberation from oppression (that is, negative revolutionary freedom) and of freedom as the foundation and realization of a new political order (that is, positive revolutionary freedom).

Post-colonial theorist Frantz Fanon (1925-1961), in his reflections on revolutionary change, primarily concentrates on the aspect of liberation. For Fanon, whose work attests to the de-Europeanization of revolution during the 20th century, decolonization is to be understood as a process of “rehabilitation” of the suppressed that importantly implies a justifiably violent moment of radically dismantling the structural cornerstones of political, social, economic, and cultural domination and exploitation. Revolutionary liberation thus leads to the creation of a “tabula rasa,” which is the precondition for the subsequent development of a new institutional order and, what is more, the emergence of “sovereign” forms of post-colonial subjectivity (compare Fanon, 1967 [1961]). A comparable focus on revolutionary freedom as freedom from oppression characterizes the thinking of critical theorist Herbert Marcuse. For him, breaking free from the existing order is the essential element of a revolution. He argues that, in light of the extent to which an inherently “repressive” socio-political order, the order of late capitalism, clearly dominates, strategies of resisting and undermining it have to be considered before anything else. As Marcuse makes clear in Ethics and Revolution, such strategies of liberation do not only include forms of passive resistance as indicated in the concept of “the great refusal” but also the use of violence. Both can serve as a means to unsettle the systemic “paralysis” or blockage of human needs and potentials in industrialized Western societies. Consequently, Marcuse’s understanding of freedom is shaped by the idea of emancipation from a system of extreme immanentism that produces entirely controlled, uniform, “one-dimensional” humans. In spite of the emphasis on revolutionary freedom as liberation from prevailing modes of materialistic existence and instrumentally rational thought, he also points to a more positive notion of freedom: With explicit recourse to the thought of Jean-Paul Sartre, he discusses the necessity of “projects” that allow for forms of free (for example, artistic) action to be released (compare Marcuse, 1991 [1964]).

As opposed to Fanon and Marcuse, Hannah Arendt holds that the content of revolutionary freedom is “participation in public affairs,” that is, the positive freedom to act politically. In historical terms, this kind of freedom is exemplified for Arendt in the American Revolution where the foundation of a new political constitution—a republican constitution which codifies participatory citizenship—is itself achieved by participatory, autonomous “speech and action.” Although Arendt admits that an element of negative freedom is integral to thorough transformation, she is unequivocal in qualifying the “desire for liberation” as an insufficient objective of revolution if the latter is to be genuinely “political” as opposed to merely “social.” Employing the term “political” in a normative rather than a descriptive way, she appropriates the Aristotelian distinction between “political” and “despotic” forms of constitutional order and transposes it to the problem of revolutionary disorder. Consequently, in Arendt’s view, not every revolution can automatically be considered political. Instead, processes of profound, sustainable transformation have to meet certain conditions if they are to be labeled as political. The delineation Arendt suggests is essentially based on two criteria: For her, a revolution is apolitical or even anti-political if (a) what she calls “the social question” is its essential driving force and if (b) violence plays a central role in bringing about a new order (compare Arendt, 2006 [1963]). Similarly, Thomas Jefferson (1743-1826), himself a central intellectual and political figure of the American Revolution, insists on the importance of positive aspects of revolutionary freedom. This becomes apparent when he directly relates the idea of an “empire for liberty” to the notion of “self-government.” It is underlined in his remarks on resistance and rebellion: Despite their potential legitimacy and their “refreshing” effects on the “tree of liberty,” such attempts to be free from forms of “despotism” and “tyranny” remain insufficient in that they fail to found an alternative order that reliably rests on a constitution conducive to the realization of “life, liberty, and the pursuit of happiness” (compare Jefferson, 2004).

Karl Marx endeavors to relativize the opposition between negative and positive freedom as definitive of revolutionary freedom. For him, revolution has to be conceived as a temporal process spanning over different stages. In this process, an element of liberation plays a crucial role at the beginning of radical change insofar as it contributes to the liquefaction of an existing, oppressive system (such as the system of bourgeois, capitalist “class rule”). However, Marx’s theory of revolution expounds that this deconstructive element needs to be complemented by a reconstructive element once, in the later stages of the revolutionary process, the solidification of its transformative dynamics, that is, the formation of a new system becomes the essential task. The final paragraph of the 1848 Communist Manifesto paradigmatically reveals Marx’s (and Engels’s) understanding of revolutionary freedom as necessarily encompassing both negative and positive moments: The communist revolution both casts off the “chains” and “wins” a new, classless “world.” According to Marx, such a world makes possible the exercise of “real freedom” in the positive sense of individual “self-realization” that is embedded in a community and manifested in labor. As already stated in On the Jewish Question (1843/44), freedom thus understood differs from the bourgeois conception of freedom which is based on a “monadic” view of humans who only relate to each other in terms of competition. Marx argues that under the guise of this strictly individualist and merely formal kind of freedom, it is exclusively capital, not humans, that can be considered free. Thus, it is the idea of commitment, grounded in the communal and practical orientation of his notion of “self-realization,” which, against the background of his criticism of capitalist society, characterizes Marx’s concept of post-revolutionary freedom. In his understanding, the indeterminacy or openness of this concept as regards content guarantees that the spontaneity constitutive of freedom is not prefigured and, thereby, inhibited or even suppressed: For Marx, it is evident that the precise results of authentically free human action and interaction cannot be predicted. Thus, the significance of his vision of a future free society, in which the difference between oppressors and oppressed is overcome, is underlined in his deliberate refusal to further specify its shape.

d. The Question of the Revolutionary Subject

The question of the revolutionary subject pertains to the primary agent of radical transformation. Here, the spectrum ranges from history unfolding largely independent of man’s decisions and actions on the one end to autonomous, history-shaping man on the other. In the latter case, the agent can take a variety of forms ranging from exceptional individuals to a transnational “multitude,” from a distinct avant-garde to an amorphous crowd.

G.W.F. Hegel’s concept of revolution is thoroughly determined by his concept of history. Radicalizing Kant’s teleological conception, Hegel understands history as a rational process in which the “idea of freedom” successively realizes itself. According to his macro-perspective, this progressive development, the self-actualization of objective “spirit,” unfolds based on the principle of dialectics. It becomes manifest in the “oriental” civilizations of China, India, and Persia, in ancient Greece, in the Roman Empire, and, finally, in the “Germanic” age of reformation and enlightenment which supersedes the “dark night” of the Middle Ages, Renaissance, and the era of feudalism (compare Hegel, 1991 [1832-45]). From this it follows that the revolutions in the United States and France or the 1791 slave uprising in Haiti on which Hegel comments have to be interpreted as indicative of the current stage of development of the idea of freedom. As a consequence, revolutions, for Hegel, cannot be “made” by humans as autonomous agents. Rather, they mark epochal transitions in the “necessary” progression of history, which finds expression in the thoughts and deeds of humans. Hegel’s remarks on the French Revolution reveal that revolutionary achievements (most importantly, man’s historically unparalleled attempt to govern reality through ideas) and revolutionary failures (most importantly, the “abstract,” “subjective,” and, thus, deficient understanding of freedom which leads to the Terreurs) are to be seen primarily as reflections of the imperfect level reached by “spirit” thus far (compare Hegel, 1977 [1807]).

In opposition to Hegel’s accentuation of the progressive dynamics inherent to history, a wide range of theorists emphasize the principal role of human action with regard to the question of revolutionary subjectivity. However, these thinkers suggest various concretizations of man as the driving force of profound transformation: Bakunin emphasizes the world-changing potential of individual “bandits” (compare Bakunin, 1990 [1873]); Lenin points to a revolutionary avant-garde of limited size (compare Lenin, 1987 [1902]); Foucault attributes this role to the entirety of a people united by an experience of “political spirituality” (compare Foucault, 2005 [1978-79]); Fanon understands revolutionary subjectivity to be actualized by the “wretched” victims of colonialism (compare Fanon, 1967 [1961]; Sartre, 1967); Marcuse sees the heterogeneous group of the marginalized and “hopeless” both within and without Western societies as the key agent of revolution (compare Marcuse, 1991 [1964]); finally, contemporary theorists like Michael Hardt and Antonio Negri present a global “multitude” as the only political unit capable of realizing a revolution against the system of late capitalism (compare Hardt/Negri, 2004; Negri, 2011).

In Marx’s thought, the dichotomy between the idea that revolution is the effect of history’s independent development and the idea that revolution is the immediate product of human action is put into question. On the one hand, Marx’s position is strongly influenced by Hegelian philosophy: Despite modifying Hegel’s dialectics materialistically, he reiterates the thought of an internal logic to history (for Marx, the logic of “class struggle”) on the basis of which all processes of transformation can be explained as “necessary.” Yet, on the other hand, a specific social class is needed to concretely carry out such processes. In the historical context of the 19th century this social class is the “proletariat,” which is presented as the decisive factor of revolutionary change (compare Marx/Engels, 2012 [1848]). Thus, although Marx and Engels hold that revolution cannot be “made” thanks to human will and action alone, it cannot become manifest without human will and action. With respect to the problem of the revolutionary subject, a similar interplay between history’s inaccessible movement and self-determined human agency is described by theorists concerned with the kairos, that is, the right moment or timing for radical change. Rousseau, for instance, argues that specific historical constellations (“crises”) are necessary for humans (here, a people) to successfully initiate revolutions (compare Rousseau, 2012 [1762]). For Jefferson, such constellations—“precious occasions” beyond human planning and control—are the precondition for successfully consolidating the progress thus far achieved by bringing to a halt the revolutionary dynamics before it escalates into continuing violence and irreversible political, social decomposition (compare Jefferson, 2010). In both cases, human will and action is autonomous. Yet, according to Rousseau and Jefferson, revolutionary subjectivity is strongly affected and limited by what historical situations grant or deny respectively.

Further questions arise once theorists have identified man as the subject to actively make revolution. For instance, it is to be determined whether the revolutionary subject’s capacity to act in a world-transforming way is the result of minute “organization” as argued by Lenin for example, or whether it emerges “spontaneously” as, for example, Kropotkin claims. Another debate in this context concerns the driving motivational forces behind revolutionary subjectivity. Here, some theorists emphasize material, that is, social or economic factors, while others understand immaterial, that is, intellectual or spiritual factors, to be decisive. This tension between “being” and “consciousness” is reflected in the controversy between Jean-Paul Sartre and Maurice Merleau-Ponty: Whereas the former understands the revolutionary subject’s actions as caused by a concrete material “situation” of oppression (compare Sartre, 1955 [1946]), the latter insists that such actions constitute a form of “significance” (“Sinn-gebung”), that is, a form of freely creating meaning through revolutionary projects, which is irreducible to materialist causality (compare Merleau-Ponty, 2005 [1945]). Finally, the positions diverge with respect to the attitudes that are considered particularly conducive to effective individual or collective revolutionary action. Foucault, based on his observations of the overthrow of the Shah, underlines the influence of the “profane register” of indignation, resentment, even hatred that crucially fuels the revolutionary movement in Iran (compare Foucault, 2005 [1978-79]). Pointing to the deeply transformative political projects of Mahatma Gandhi, Martin Luther King, and Nelson Mandela, Martha Nussbaum attributes their success to an attitude that overcomes negative, destructive emotions and is committed to “non-anger” instead (compare Nussbaum, 2013). In her view, this mental commitment to non-anger is more decisive for revolutionary justice and for post-revolutionary reconciliation between former opponents than the practical commitment to non-violence.

e. The Question of the Revolutionary Object

The question of the revolutionary object pertains to the primary target of revolutionary change. Two predominant strands can be distinguished: While some theorists hold that revolutions should primarily aim at converting the attitudes, convictions, belief systems and world-views of individuals, others argue that the material, institutional frameworks within which humans act and interact constitute the main object or site of revolutionary change. Once more, a variety of positions can be found in between these extremes. Such positions hold both dimensions not only to be necessary conditions of radical change but also to mutually affect each other.

Fanon is one of the thinkers who argue that revolution cannot be limited to a remaking of the external world, that is, to the establishment of a different political, economic, social, and cultural order. Instead, full transformation is only achieved by an internal process of “creation” in which the carriers of the revolution, individually as well as collectively, re-humanize themselves in their struggle for liberation from systemically de-humanizing colonial rule. According to Fanon’s politico-psychological theory of revolution, the inner sphere of attitudes towards oneself, one’s community, and one’s former oppressors is the essential locus of revolutionary change: It is there that a radical transformation of the revolutionaries’ status occurs which turns them from an “animalized” and “objectified,” anonymous and disposable mass into “sovereign” subjects capable not only of self-determination but also of self-respect (compare Fanon, 1967 [1961]).

In contrast, the anarchist theorists Mikhail Bakunin and Peter Kropotkin (1842-1921) point to the institutional conditions as the main target of the “social revolution” they advocate. In their understanding, it is above all the institution of the state that has to be destroyed if freedom, morality, and solidarity are to be realized among humans: Being a source of “artificial” authority, any state, independent of its specific form, makes the unrestricted, free flourishing of men impossible (compare Bakunin, 2009 [1871]). Therefore, conquering freedom in its totality is tantamount to establishing an order that abolishes every political or religious institution that exercises authority. Such a society organizes itself according to the principles of decentralization, social diversity, and horizontal interconnectedness, which allow for harmony and happiness on both the subjective and inter-subjective level (compare Kropotkin, 2008 [1892]). This line of thought, which emphasizes the primacy of institutional transformation, is also represented by Kant. Far from suggesting the abolition of the state, however, Kant marks the essential institutions of the state—its politico-legal constitution and system of law—as the decisive lever to unhinge despotism and promote progress with respect to freedom, rationality, and morality in a process of “complete revolution.” In his view, a program of political pedagogy that aims at directly transforming the way in which humans understand themselves and the world is not only empirically unreliable, but also categorically insufficient. What is needed instead is a progressive shift as to systemic conditions that make it possible for a “spirit of freedom” to unfold successively. It is conditions founded on principles of right that will eventually lead to a fuller realization of the individuals’ moral and rational potential (compare Kant, 1991 [1798]; 1996 [1797]).

Insisting on the comprehensive character of revolution, Rousseau, when thinking about its adequate object or target, attempts to avoid comparable predeterminations. He argues that both the modus operandi of individual humans (that is, their ways of thinking, feeling, and acting) and of political institutions (that is, their ways of being structured and of acting upon citizens) have to be tackled for thorough transformation to occur. Consequently, if “moral” and “civil” liberty and equality are to be realized, it takes the contribution of education, as elaborated in Emile, as well as of institutional restructuring, as elaborated in the Social Contract: According to Rousseau, both the individual and the framework of politico-legal institutions constitute necessary targets of revolutionary change. Rousseau’s considerations thus underline the interdependence of both transformative dimensions.

f. The Question of the Extension of Revolution

This question pertains to (a) the temporality or, more narrowly, the duration and (b) the expansion of revolutionary transformation. Theorists dissent considerably as to whether such transformation has to be conceived as momentary, procedural, or permanent; they also disagree whether revolutions are to be understood as local, national, international, or global instances of profound, lasting politico-social change.

On the basis of his “messianic” conception of time and history that rejects the conventional understanding of time as “empty” (that is, as continuous and homogenous), Benjamin interprets revolution as a “shock” that kairologically disrupts the prevailing chronological and, with it, social and political order. For him, revolution thus constitutes a momentary event that makes a switch from a state of historical normalcy to a state of historical exception possible. This switch is as radical as it is sudden: “Every second” has the potential to serve as the gate through which “the messiah” can enter to fundamentally transform the world (compare Benjamin, 2009 [1940/42]). As opposed to Benjamin, thinkers like Hegel or Antonio Gramsci (1891-1937) understand revolution as a process that unfolds over time before it leads to substantial, intelligible change, that is, to new political, legal, economic, cultural, linguistic, and aesthetic principles being implemented and effectively taking root. Although Hegel describes the French Revolution as a “glorious dawn,” it is evident for him that the political events of the late 1780s and early 1790s are belated, derivative effects of a long-lasting historical epoch of revolution that encompasses the ages of the reformation and the enlightenment (compare Hegel, 1991 [1832-45]). Discussing revolution in more narrowly political terms, Gramsci describes its realization as a tedious “war of position” against “hegemonic” power structures: It is only by means of persistently working their way through numerous struggles with the opponents of revolution over time that its carriers can hope to supersede an established order (compare Gramsci, 1992 [1929-35]). Similarly, Marx and Engels put emphasis on the aspect of duration. Modeling their understanding of revolution on the Israelites’ exodus from Egypt (compare Walzer, 1985), they attribute great significance to the interval that lies between the status quo at the time of the failed revolutions of 1848 and the future actualization of a classless society. Given the considerable distance between the initial and the terminal point of revolution, they propose a notion and practical program of “permanent revolution” that links immature democratic revolution to mature “proletarian” revolution. In modernizing and democratizing this idea, Étienne Balibar (b. 1942) expounds an understanding of revolution as a continuous, open-ended task. According to his view, revolution cannot hope for a final stage of satisfaction and completion (compare Balibar, 2014). Instead, it means an ongoing exercise in responsible citizenship and in “democratizing democracy.” This exercise allows for an ever-increasing inclusion of groups and individuals who, heretofore, have been denied the ability to “take part,” that is, for their unrestricted recognition as full subjects of “equaliberty,” which is a hybrid term indicating the two main trajectories of modern emancipatory politics: On one side, the Lockean liberal and individualist strand and, on the other, the Rousseauian socialist and collectivist strand, which Balibar takes to be interdependent and co-constitutive elements of democratic revolution.

Other thinkers discuss revolution primarily in terms of its spatial extension. Contemporary anarchist theorist David Graeber (b. 1961) argues that revolutionary projects can be pursued by the creation of “autonomous spaces” on a local scale. Within such spheres, alternatives to dominant forms of coexistence and interaction, of politics and economy can be practiced whereby the existing order is unmasked as contingent. What is more, in drawing on exemplary practices from other epochs and cultures, the contours of an order devoid of institutions such as the state or capitalism and of repressive convictions such as racism and misogyny are “pre-figured.” For Graeber, the narrow spatial limits of these alternative micro-worlds characterized by autonomy, mutual aid, and direct democracy do not negatively affect their subversive, transformative capacities (compare Graeber, 2004). Whereas thinkers such as, for example, Sieyès and Foucault see the nation state as the adequate space for revolution to occur (compare Sieyès, 2003 [1789]; Foucault, 2005 [1978-79]), others claim that this is too limited a scope for radical transformation to have profound and lasting impact. For instance, Lenin, not unlike Sartre in his “revolutionary humanism” (compare Sartre, 1955 [1946]), follows Marx in emphasizing the transnational implications of revolution even if its scope, especially in its early phases, has to be national for reasons of mere practicability. According to Lenin, emancipatory projects carried out by a “revolutionary people” send shockwaves across neighboring as well as distant countries. Thus, it is evident to him that the Russian Revolution ultimately represents the “interests of world socialism,” which outweigh mere national interests (compare Lenin, 1987 [1902]; 1978 [1917]). This position takes up the universalism inherent in the American and French Revolutions, which finds its expression in pronounced references to the “rights of man” in the writings and speeches of Paine or Mirabeau as well as in the essential political documents of the revolutionary period: the 1776 Declaration of Independence and the 1789 Declaration of the Rights of Man and of the Citizen.

4. Conclusion

Even when the plurality of manners in which “revolution” is used in the domains of technology and science, culture and art, is left aside and when the term is applied in the domain of politics only, the heterogeneity and contested nature of understandings remain considerable. In spite of the wide range of specific approaches, arguments, and agendas characteristic of the individual theories of political revolution, they can be situated within one multifaceted, yet unified intellectual space: From the theoretical enablers and “inventors” of revolution like Rousseau, Paine, or Kant to contemporary thinkers of revolution like Balibar or Graeber, their theories have been confronted with a number of central problems and questions which open up, shape, and sustain this space. It is primarily in terms of these central questions that they have attempted to conceptually grasp revolution. Six of these questions have been outlined in the above sections: (1) the question of revolutionary novelty, which is discussed on a spectrum between the extremes of absolute and relative notions of rupture and beginning; (2) the question of revolutionary violence and its legitimacy, discussed on the spectrum between unqualified approval and unreserved exclusion as a means of revolution; (3) the question of revolutionary freedom, discussed on the spectrum between negative (liberation) and positive (foundation) concepts of freedom as the aim of revolution; (4) the question of the revolutionary subject, discussed on the spectrum between individual doers on the one end and a global “multitude” on the other; (5) the question of the revolutionary object or target, discussed on the spectrum between political, social institutions and individual, subjective attitudes, convictions, and beliefs; and, (6), the question of the temporal and spatial extension of revolution, discussed on the spectrum between momentary and local on the one end, permanent and global on the other. Despite their pronounced heterogeneity and their attempts to periodically redefine revolution, it is with respect to these key questions that the theories presented here share family resemblances with one another.

Determining whether political change can be considered revolutionary constitutes the conceptual issue at the core of these theories. In particular, they aim at circumscribing revolution with regard to related, yet distinct concepts such as revolt, rebellion, and reform, whereby the questions of the new, of liberty, and of the legitimacy of violence serve as the most relevant criteria for demarcation. The first two criteria play a central role in the distinction between revolution on the one hand, revolt and rebellion on the other. As a consequence of the underlying main goal of casting off an unjust, oppressive regime, both revolt and rebellion are based on limited notions of novelty and liberty. Thus, in comparison to revolutionary change, the specific kind of change they aspire to is more marginal in its scope. However, once revolution is not conceived as momentary but as procedural (as is the case in Kant’s or Marx’s considerations), drawing such a clear conceptual line seems less feasible: If revolution is understood as a temporal sequence that encompasses multiple stages, an initial “revolting” or “rebellious” phase is conceivable, for which the aspect of durable foundation of a new order is secondary. For the differentiation of revolution and reform, the criteria of novelty and violence are central. Whereas the criterion of violence reliably allows for a demarcation, temporalized understandings of revolution entail the blurring of a seemingly obvious difference with respect to the aspect of novelty: Here, a concluding “reformist” phase of revolution is thinkable in which the configuration of an institutional order or the establishment of a common ground with former “enemies of the revolution” takes precedence. Accordingly, when Kropotkin links revolution and revolt or when Kant explicitly associates revolution with reform, the relatedness between these concepts, not to mention the phenomena, is reflected. In light of these resemblances, attempts at a precise conceptual demarcation of revolution, one that distinguishes it sharply from revolt, rebellion, or reform, remain heuristic in character.

Determining if and under what conditions revolutionary action and, especially, revolutionary violence are morally justified constitutes the normative issue at the core of theories of revolution. Although revolution represents the most radical expression of dissent and protest, the determination of its legitimacy reveals points of contact with debates on less extreme forms of a politics of resistance and transformation such as, for example, civil disobedience (compare Rawls, 1999). Despite the differences as to, inter alia, the scope of the envisaged transformation, their legitimacy essentially depends on the underlying cause and motivation. Revolutionary action and, with it, at least temporary political disorder, can only be considered legitimate if it aims at overcoming severe and systematic violations of the basic rights of specific groups or entire nations by the regime in power. While conflict between ruling powers and revolutionary movements typically takes place within the context of a state, broader issues independent of the policies of a specific state can also be invoked as a justified cause to engage in radically transformative politics. The Occupy movement and its appeal to the inequalities brought about by the current global economic system is a case in point. Within and beyond the context of the state, the intention to right the wrongs—that is, the injustices as to dignity, liberty, and equality—committed by a regime and secured by unjust political, legal, social, or economic institutions is the primary precondition for a revolutionary project’s justifiability.

Furthermore, the (il)legitimacy of revolutionary politics is determined by the heavily disputed question of the permissibility of revolutionary violence. In relation to this question, the focus is not on the just cause, the right reason and intention of such a politics, but on the conduct in the course of its realization. The dispute pertains to different dimensions: It concerns the general issue whether violence can be considered a politically and, more importantly, morally justifiable means of revolution, in other words, whether, based on strategic or principled considerations, its use can be justified at all. In addition, it concerns more specific issues such as its justifiable form (for example, violence against property), scope (for example, violence limited to early stages of the revolutionary process), and status (for example, violence as a last resort once all peaceful alternatives have failed). Here, the discussion on revolution resembles theoretical debates on just war (Arendt, 2006 [1963]; Walzer, 2006 [1977]). For instance, much like in the case of the ius in bello, attempts to formulate essential criteria of acceptable revolutionary conduct aim at ensuring the proportionality of the use of violence, at discriminating between legitimate and illegitimate targets, and at prohibiting hostile acts which are “vile in themselves” (compare Kant, 2006c [1795/96]). Besides the perspectives of cause (in analogy to the terminology of just war theory: ius ad revolutionem) and conduct (ius in revolutione), there is a third critical perspective, in terms of which the legitimacy of revolutionary action and violence is determined. This perspective focuses on the ius post revolutionem, that is, on the final stage of a revolution, and assesses its capacity to terminate the state of exception in order to transition into a new and stable political order. Thereby, the stability of such a reconstitution is largely predicated on reconciliation with and inclusion of former adversaries. It is mainly thanks to the criteria of cause, conduct, and reconstitution that revolutionary violence becomes distinguishable from the violence used by criminals and, especially, terrorists. However, largely on the basis of formative historical experiences of excessive revolutionary violence—of revolutions not only harming their enemies, but also “devouring their children” —as well as of Gandhi’s or Mandela’s successful transformative projects, non-violent revolutionary action generally has a greater claim to justification.

A further relevant issue with regard to just revolution theory pertains to the self-authorization of revolutionary movements, which raises the questions whom such movements speak for and whose interests they represent. This issue crystallizes in revolutionary declarations that often appeal to “the people” (compare Habermas, 1990; Derrida, 2002). In this case, the legitimacy of a revolutionary project depends, among other things, on whether the revolutionaries’ political power and the sovereignty of the regime they establish is based on force or on discourse, that is, on oppression or persuasion of the majority.

To conclude, this article provides a sample of the rich theoretical discourse surrounding the contested concept of revolution. While the positions developed within the three dominant schools of thought (democratic, communist, and anarchist) are strongly shaped by broader commitments to the underlying political philosophies and often indebted to other debates (for example, on war), this discourse has distinctive features due to the specificity of its object of investigation and the controversial exchange of views between the different traditions. Given both its width and unsettledness, there are significant conceptual and normative issues for philosophers to address. It is not only in light of the often problematic history of revolutions that it is expedient to theoretically “provide yardsticks and measurements” (Hannah Arendt); a thorough analysis and critical assessment of transformative concepts, agendas, and strategies is also required because of the contemporary re-emergence of movements with revolutionary aspirations from the Zapatistas to the Arabellion, Occupy, or the Indignados.

5. References and Further Reading

  • Arendt, H., 2006, On Revolution [1963], New York: Penguin.
  • Badiou, A., 2012, The Rebirth of History, trans. G. Elliott, London/New York: Verso.
  • Balibar, É., 2014, Equaliberty: Political Essays, trans. J. Ingram, Durham: Duke University Press.
  • Bakunin, M., 2009, God and the State [1871], New York: Cosimo.
  • Bakunin, M., 1990, Statism and Anarchy [1873], trans. & ed. M.S. Shatz, Cambridge: Cambridge University Press.
  • Benjamin, W., 2009, On the Concept of History [1940/42], New York: Classic Books America.
  • Benjamin, W., 1999, Zur Kritik der Gewalt [1921], in Walter Benjamin Gesammelte Schriften, vol. II.1, eds. R. Tiedemann & H. Schweppenhauser, Frankfurt: Suhrkamp, 179–204.
  • Berman, H., 1985, Law and Revolution: The Formation of the Western Legal Tradition, Cambridge, MA: Harvard University Press.
  • Butler, J., 2015, Notes Toward a Performative Theory of Assembly, Cambridge, MA: Harvard University Press.
  • Camus, A., 1991, The Rebel: An Essay on Man in Revolt [1951], trans. A. Bower, New York: Vintage Books.
  • Condorcet, J.A.N. de, 2012, Political Writings, eds. S. Lukes & N. Urbinati, Cambridge, UK/New York: Cambridge University Press.
  • Dawoody, M., 2011, The Islamic Law of War: Justifications and Regulations, London: Palgrave Macmillan.
  • DeFronzo, J., 2011, Revolution and Revolutionary Movements, Boulder: Westview Press.
  • Derrida, J., 2002, “Declarations of Independence”, in Negotiations: Interventions and Interviews 1971-2001, ed. & trans. E. Rottenberg, Stanford: Stanford University Press, 46–54.
  • Engels, F., 1969, Germany: Revolution and Counter-Revolution [1851/52], with the collaboration of Karl Marx, ed. E. Marx, London: Lawrence & Wishart.
  • Fadl, K., 2006, Rebellion and Violence in Islamic Law, Cambridge: Cambridge University Press.
  • Fanon, F., 1967, The Wretched of the Earth [1961], trans. C. Farrington, Harmondsworth: Penguin.
  • Foucault, M., 2005, “Writings on the Iranian Revolution” [1978-79], in Foucault and the Iranian Revolution: Gender and the Seductions of Islamism, eds. J. Afary & K.B. Anderson, Chicago: University of Chicago Press, 179–277.
  • Furet, F., and, M. Ozouf (eds.), 1989, A Critical Dictionary of the French Revolution, Cambridge, MA: Belknap Press.
  • Graeber, D., 2004, Fragments of an Anarchist Anthropology, Chicago: Prickly Paradigm Press.
  • Gramsci, A., 1992–, Prison Notebooks [1929-35], New York: Columbia University Press.
  • Habermas, J., 1990, “Naturrecht und Revolution”, in Theorie und Praxis, Frankfurt: Suhrkamp, 89–127.
  • Hardt, M., and A. Negri, 2004, Multitude: War and Democracy in the Age of Empire, New York: Penguin Press.
  • Hegel, G.W.F., 1977, Phenomenology of Spirit [1807], trans. A.V. Miller, Oxford: Clarendon Press.
  • Hegel, G.W.F., 1991, The Philosophy of History [1832-45], trans. J. Sibree, Buffalo, NY: Prometheus Books.
  • Hobsbawm, E., 1996, The Age of Revolution: Europe 1789-1848 [1962], New York: Vintage Books.
  • Jefferson, Th., 2004–, The Papers of Thomas Jefferson: Retirement Series, ed. J.J. Looney, Princeton: Princeton University Press.
  • Jefferson, Th., 2010, The Selected Writings of Thomas Jefferson: Authoritative Texts, Contexts, Criticism, ed. W. Franklin, New York: W. W. Norton & Co.
  • Kant, I., 2006a, “An Answer to the Question: What is Enlightenment?” [1784] in Toward Perpetual Peace and other Writings on Politics, Peace, and History, ed. P. Kleingeld, trans. D.L. Colclasure, New Haven: Yale University Press, 17–23.
  • Kant, I., 2006b, “Idea for a Universal History from a Cosmopolitan Perspective” [1784], in Toward Perpetual Peace and other Writings on Politics, Peace, and History, ed. P. Kleingeld, trans. D.L. Colclasure, New Haven: Yale University Press, 3–16.
  • Kant, I., 1991, “The Contest of Faculties” [1798], in Kant: Political Writings, ed. H. Reiss, Cambridge: Cambridge University Press, 176–190.
  • Kant, I., 1996, The Metaphysics of Morals [1797], trans. & ed. M. Gregor, Cambridge/New York: Cambridge University Press.
  • Kant. I, 2006c, “Toward Perpetual Peace: A Philosophical Sketch” [1795/96], in Toward Perpetual Peace and other Writings on Politics, Peace, and History, ed. P. Kleingeld, trans. D.L. Colclasure, New Haven: Yale University Press, 67–109.
  • Kantorowicz, E., 1997, The King’s two Bodies: A Study in Medieval Political Theology [1957], Princeton: Princeton University Press.
  • Koselleck, R., 2004, Futures Past: On the Semantics of Historical Time, trans. K. Tribe, New York: Columbia University Press.
  • Koselleck, R. et al., 1984, “Revolution (Rebellion, Aufruhr, Bürgerkrieg)”, in Geschichtliche Grundbegriffe. Historisches Lexikon zur politisch-sozialen Sprache in Deutschland, volume 5, eds. O. Brunner, W. Conze, & R. Koselleck, Stuttgart: Klett-Cotta, 653–788.
  • Kropotkin, P., 2008, The Conquest of Bread [1892], Oakland, CA: AK Press.
  • Kropotkin, P., 1972, Mutual Aid: A Factor of Evolution [1902], ed. P. Avrich, New York: New York University Press.
  • Lenin, V.I., 1987, Essential Works of Lenin: What is to be done? and other writings, ed. H.M. Christman, New York: Dover Publications.
  • Lenin, V.I., 1978, State and Revolution: Marxist Teaching about the Theory of the State and the Tasks of the Proletariat in the Revolution [1917], Westport: Greenwood Press.
  • Locke, J., 1986, Second Treatise on Civil Government [1689], Amherst, NY: Prometheus Books.
  • Marcuse, H., 1984, “Ethik und Revolution” [1964], in Herbert Marcuse Schriften, volume 8, Frankfurt: Suhrkamp, 100–114.
  • Marcuse, H., 1991, One-dimensional Man: Studies in the Ideology of Advanced Industrial Society [1964], Boston: Beacon Press.
  • Marx, K., 2001a, Capital: A Critique of Political Economy. Vol. I, Book One, The Process of Production of Capital [1867], trans. S. Moore & E. Aveling, London: Electric Book Co.
  • Marx, K., 2001b, The Class Struggles in France [1850], London: Electric Book Co.
  • Marx, K., 2012, The Communist Manifesto [1848], with Friedrich Engels, London: Verso.
  • Menke, Ch., and, F. Raimondi (eds.), 2011, Die Revolution der Menschenrechte. Grundlegende Texte zu einem neuen Begriff des Politischen, Berlin: Suhrkamp.
  • Merleau-Ponty, M., 2005, Phenomenology of Perception [1945], London/New York: Routledge.
  • Nail, T., 2012, Returning to Revolution: Deleuze, Guattari, and Zapatismo, Edinburgh: Edinburgh University Press.
  • Nussbaum, M., 2013, Political Emotions: Why Love Matters for Justice, Cambridge, MA: Harvard University Press.
  • Paine, Th., 2012, Common Sense [1776], ed. R. Beeman, New York: Penguin.
  • Paine, Th., 2000, Political Writings, ed. B. Kuklick, Cambridge/New York: Cambridge University Press.
  • Paine, Th., 1992, Rights of Man [1791], ed. G. Claeys, Indianapolis: Hackett Pub. Co.
  • Palmer, R., 2014, The Age of Democratic Revolution: A Political History of Europe and America 1760-1800 [1959], Princeton: Princeton University Press.
  • Rawls, J., 1999, “The Justification of Civil Disobedience”, in John Rawls: Collected Papers, S. Freeman (ed.), Cambridge, MA: Harvard University Press, 176–189.
  • Rosenstock-Huessy, E., 1993, Out of Revolution: Autobiography of Western Man [1938], Providence: Berg Publishers.
  • Rousseau, J.-J., 1992, Discourse on the Origin of Inequality [1755], trans. D.A. Cress, Indianapolis: Hackett Pub. Co.
  • Rousseau, J.J., 2012, Of the Social Contract and other Political Writings [1762], ed. C. Bertram, trans. Q. Hoare, London: Penguin.
  • Sartre, J.-P., 1962, “Materialism and Revolution” [1946], in Literary and Philosophical Essays, transl. A. Michelson, New York: Criterion Books, 185–239.
  • Sartre, J.-P., 1967, “Preface”, in The Wretched of the Earth, F. Fanon, Harmondsworth: Penguin.
  • Sieyès, E.J., 2003, Political Writings: Including the Debate between Sieyès and Tom Paine, with a Translation of What is the Third Estate?, ed. M. Sonenscher, Indianapolis: Hackett Pub. Co.
  • Skocpol, T., 1979, States and Social Revolutions, Cambridge/New York: Cambridge University Press.
  • Walzer, M., 1985, Exodus and Revolution, New York: Basic Books.
  • Walzer, M., 2006, Just and Unjust Wars: A Moral Argument with Historical Illustrations [1977], New York: Basic Books.
  • Walzer, M., 1992, Regicide and Revolution: Speeches at the Trial of Louis XVI [1972], New York/Oxford: Columbia University Press.
  • Zizek, S., 2012, The Year of Dreaming Dangerously, London/Brooklyn, NY: Verso.
  • Zizek, S., 2008, Violence: Six Sideways Reflections, New York: Picador.

Author Information

Florian Grosser
Email: florian.grosser@unisg.ch
University of St. Gallen
Switzerland

Everettian Interpretations of Quantum Mechanics

Between the 1920s and the 1950s, the mathematical results of quantum mechanics were interpreted according to what is often referred to as “the standard interpretation” or the “Copenhagen interpretation.” This interpretation is known as the “collapse interpretation” because it supposes that an observer external to a system causes the system, upon observation, to collapse from a quantum mechanical state to a state in which the elements of the system appear to have a determinate value for the property measured. Although this interpretation is largely successful at explaining our experiences of the world, it fails in that it gives rise to what has become known as the measurement problem, a problem described in section 2.

In addition to this problem, there is another problem tied to the role of the observer. In the 1950s, Hugh Everett III (1930-1982) was considering quantum mechanics as it might apply to the entirety of the universe. Surely if quantum mechanics were true on the local level of laboratories and experiments taken as closed systems, it would also be true for the entire universe taken as a closed system. The problem with this approach is that there is no external observer available at the scale of the entire universe to cause a collapse of the quantum state, a state that the laws of quantum mechanics say the universe would be in, were it unobserved. Thus, Everett suggested that we abandon the notion of an observer-caused collapse and we consider all quantum states to be always non-collapsed.

Everett published one short paper in 1957—his doctoral dissertation (Everett 1957a, 1957b)—and after that, he left academia. He later published the longer, original version of his dissertation at the request of Bryce DeWitt and Neill Graham (Everett 1973). In his dissertation, Everett develops the mathematical theory that is the foundation of Everettian quantum mechanics [EQM]; but many people have believed the theory itself needs interpretation. Although Everett was not interested in the philosophical implications of his work, there has been great interest among philosophers in trying to interpret what EQM implies about the metaphysical structure of the world.

This article surveys the various ways philosophers have attempted to interpret Everett. To begin, the standard interpretation, as well as its attendant problems, is discussed briefly. Following that, the bare theory, the single and many minds theories, and versions of a many worlds theory are discussed. The article closes by discussing two relational interpretations of Everett.

Table of Contents

  1. Preamble
  2. The Standard Interpretation
  3. Everettian Interpretations
    1. Everett Plus Nothing (The Bare Theory)
    2. Everett Plus Minds
    3. Everett Plus Worlds (DeWitt’s Splitting Worlds)
      1. Problems with the Notion of Splitting
      2. The Preferred Basis Problem
  4. The Evolution of Many Worlds Interpretations
    1. The Problem of Probability and Graham’s Attempt at Adding a Measure
    2. The Oxford MWI
    3. Objections to the Oxford MWI
  5. Relational Interpretations of Everettian Quantum Mechanics
    1. Simon Saunders’ Relational Interpretation
    2. The Relative Facts Interpretation
  6. Conclusion
  7. References and Further Reading

1. Preamble

Before beginning to survey the various ways philosophers have attempted to interpret Everett, we must address the question of whether or not there even are rival interpretations of Everett. Some of the most influential physicists and philosophers working on EQM have either taken it as fact (DeWitt 1970) or explicitly argued (Deutsch 2010; Wallace 2012) that there is only one “interpretation” of EQM: some version of the many worlds interpretation [MWI].

Bryce DeWitt (1923-2004) took it to be the case that the only way one could “interpret” Everett was through a many worlds theory. He wrote, “The mathematical formalism of the quantum theory is capable of yielding its own interpretation” (DeWitt 1970: 160). And that formalism “forces us to believe in the reality of all the simultaneous worlds represented in the superposition described by equation [(2), below], in each of which the measurement has yielded a different outcome” (DeWitt 1970: 161).

David Deutsch (1953- ) and David Wallace (1976- ) have argued that there are no rival “interpretations” of Everett: “Other ‘interpretations’ . . . are really alternative physical theories . . .” (Wallace 2012: 382, Wallace’s emphasis). They see the “Everett interpretation” to be “just quantum mechanics itself, read literally, straightforwardly—naively, if you will—as a direct description of the physical world, just like any other microphysical theory” (Wallace 2012: 2). Deutsch writes:

. . . insisting that parallel universes are ‘only an interpretation’ and not a – what? a scientifically established fact or something (as if there were such a thing) – has the same logic as those stickers that they paste in some American biology textbooks, saying that evolution is ‘only a theory’, by which they mean precisely that it’s just an ‘interpretation’. Or, in terms of the analogy that Everett used in his famous exchange of letters with Bryce DeWitt, it’s like claiming that the motion of the Earth about its axis is only an ‘interpretation’ that we place on our observations of the sky (2010: 543, Deutsch’s emphasis).

Wallace writes:

The ‘Everett interpretation of quantum mechanics’ is just quantum mechanics itself, ‘interpreted’ the same way we have always interpreted scientific theories in the past: as modelling the world. Someone might be right or wrong about the Everett interpretation – they might be right or wrong about whether it succeeds in explaining the experimental results of quantum mechanics, or in describing our world of macroscopically definite objects, or even in making sense – but there cannot be multiple logically possible Everett interpretations any more than there are multiple logically possible interpretations of molecular biology or classical electrodynamics (2012: 38, Wallace’s emphasis).

The arguments Deutsch and Wallace provide may be persuasive to some readers. But the purpose of the current article is to survey what has historically been done by philosophers attempting to draw metaphysical pictures from Everett’s pure wave mechanics. Wallace is explicit about the fact that he is not attempting to do historical exegesis of Everett’s views (2012: 2), and whether Everett would be sympathetic to or supportive of an MWI is an open question. (Again see Barrett 2010, Barrett and Byrne 2012, and Bevers 2011.) So whether or not a version of the MWI is the correct interpretation of Everett, or even the only interpretation of Everett, is a question that can be adjudicated in other venues. Our purpose here is to consider the ways in which philosophers have attempted to interpret Everett’s pure wave mechanics, and so, after one final preliminary note, it is to this that we shall turn.

There is one other debate that ought to be considered before embarking on our project. And that is the debate over the appropriate way to explain the results of quantum mechanical experiments. Everett’s proposal for pure wave mechanics is but one way physicists explain what seem to be counterintuitive outcomes of quantum mechanical experiments. Other ways include Bohmian mechanics (de Broglie 1928, Bohm 1952) and GRW (Ghirardi, Rimini & Weber 1986). Whether or not the unitary dynamics proposed by Everett are the correct laws for describing the world is a question that is far from decided. But as this article is concerned with the question of Everett interpretation, a full rehearsal of this debate goes beyond its scope. There is no assumption made here about what the correct theory of the world is; rather there is only a historical discussion of the way people have interpreted Everett.

For more on interpretations of quantum mechanics, see “Interpretations of Quantum Mechanics” in this encyclopedia and also (Lewis 2016).

2. The Standard Interpretation

In Schrödinger’s cat thought experiment (Schrödinger, 1935), there is a cat locked inside a box along with a glass vial of cyanide; a hammer set to potentially break the vial; and a Geiger counter inside of which there is a sample of a radioactive substance small enough that there is a 50% chance of one of the atoms decaying in the course of one hour and a 50% chance that none will. If an atom decays, the Geiger counter will click, which causes the hammer to fall, the vial to break, the cyanide gas to be released, and the cat to die. If an atom does not decay, the cat remains alive. If the inside of the box is not observed during this hour, Schrödinger took the formalism of quantum mechanics to imply that the cat would be in a superposition of being alive and dead.

Superpositions are states of systems that are represented mathematically by a weighted sum of the possible values for the property in question. Each summand represents one of the possible values for the property and is accompanied by a complex number coefficient; when this coefficient is multiplied by its complex conjugate (in other words, when its norm is squared), the standard interpretation takes the result to be the probability of the system collapsing into that value for the property. So in our cat example, there will be two terms in the superposition of the state of the system that includes the cat, each with a coefficient of 1/√2 which, when its norm is squared, gives us a ½ probability of the system collapsing into the state that includes the cat being alive and a ½ probability of its collapsing into the state of the system that includes the cat being dead.
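As a rough illustration of how these coefficients yield probabilities, the following sketch (my own example in Python with NumPy, not part of the original discussion; the {alive, dead} basis ordering is an assumption) models the cat system as a two-component state vector and reads each probability off as the coefficient times its complex conjugate.

```python
import numpy as np

# Illustrative sketch (not from the article): the cat system as a state
# vector in an assumed {alive, dead} basis, each term weighted by 1/sqrt(2).
state = np.array([1 / np.sqrt(2), 1 / np.sqrt(2)], dtype=complex)

# The standard interpretation reads the collapse probability of each term
# as the coefficient times its complex conjugate, that is, its squared norm.
probabilities = (state * state.conj()).real
print(probabilities)  # [0.5 0.5] -> 1/2 alive, 1/2 dead
```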

We never seem to observe cats (or any other macroscopic objects) as being in superpositions. The standard interpretation assumes that when an observer interacts with a system, that observation causes a collapse of the superposition, and the objects in the system take on definite values for the property being measured. So when the box is opened and the observer looks into it, the system randomly and instantaneously collapses into either cat alive or cat dead with a 50% probability of finding either.

Now we can transition from cats to electrons. Electrons have a property called “angular momentum” that can take a definite value of either spin up or spin down along the x, y or z axis. These values are mutually exclusive in the sense that if an electron has a definite value for one of the properties, it does not have a definite value for any of the others. It has been experimentally determined that when electrons have a definite spin property along one axis, they are in a superposition of having spin up and spin down along both of the other axes. So, for example, when an electron is determinately x-spin up, it is in a superposition of being y-spin up and y-spin down, and it is in a superposition of being z-spin up and z-spin down.

The standard interpretation tells us that when we observe electrons that are in such superpositions, they instantaneously and randomly collapse from the superposition they were in to one of the definite properties that make up that superposition. So when we take an electron that is x-spin up, for example, and measure its z-spin, the standard interpretation tells us that it collapses from being in a superposition of being z-spin up and z-spin down into either being z-spin up or being z-spin down. In the standard interpretation, this collapse explains the determinate measurement records that we get in experiments with quantum mechanical systems. The standard interpretation also tells us that if we do not observe quantum particles, then that collapse will not happen and they will remain in their superpositions.
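The same arithmetic can be spelled out for spin. The sketch below (again my own illustration; the column-vector encoding of the z-spin states is an assumption made for the example) writes a determinately x-spin-up electron as an equal superposition of z-spin states and simulates the random collapse that the standard interpretation associates with a z-spin measurement.

```python
import numpy as np

# Assumed encoding of the z-spin basis for this illustration.
up_z = np.array([1, 0], dtype=complex)
down_z = np.array([0, 1], dtype=complex)

# A determinately x-spin-up electron is an equal superposition of z-spin states.
up_x = (up_z + down_z) / np.sqrt(2)

# Born probabilities for a z-spin measurement: |<up_z|psi>|^2 and |<down_z|psi>|^2.
p_up = abs(np.vdot(up_z, up_x)) ** 2
p_down = abs(np.vdot(down_z, up_x)) ** 2

# Standard interpretation: observation triggers a random, instantaneous collapse.
outcome = np.random.choice(["z-spin up", "z-spin down"], p=[p_up, p_down])
print(p_up, p_down, outcome)  # 0.5, 0.5, and one definite record per run
```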

The difference in these empirical results is captured in two of the laws that are part of the theory of quantum mechanics:

  1. When no measurement, or other observation, is made of a system, then that system evolves in a deterministic and linear fashion.
  2. When a measurement, or other observation, is made of a system, then that system instantaneously and non-deterministically collapses into a definite value for the property being measured.

In the standard interpretation, the second law accounts for the fact that when we measure any property of an object, it has a definite value.
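A minimal sketch of the two laws side by side (my construction; the particular unitary matrix is arbitrary and chosen only for illustration) shows the first as a deterministic linear map on the state vector and the second as a chancy jump to a definite value.

```python
import numpy as np

# Law 1: an unobserved system evolves linearly and deterministically.
# Any unitary matrix serves the illustration; this one is arbitrary.
U = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
psi = np.array([1, 0], dtype=complex)      # start determinately z-spin up
psi_evolved = U @ psi                      # same output on every run

# Law 2: observation collapses the state, with Born-rule chances.
probs = np.abs(psi_evolved) ** 2
index = np.random.choice([0, 1], p=probs)  # random, instantaneous jump
psi_collapsed = np.eye(2, dtype=complex)[index]

print(psi_evolved)    # a superposition: the deterministic output of Law 1
print(psi_collapsed)  # a definite z-spin state: the chancy output of Law 2
```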

These two laws are not compatible, and there is no clear explanation of when one is to be used instead of the other.  In other words, there is no explanation of what constitutes an act of observation in the standard interpretation.  In addition to this, there is the problem that if we take measurement devices to be physical systems like any other, then the standard interpretation says that the quantum system that makes up the measuring device will evolve deterministically, but the second law says that it will take on a definite value with a certain probability.  In other words, it would have to follow both a deterministic law and a law governed by chance. This is not logically possible.  This, in short, is the quantum measurement problem.  It is part of what has driven the search for different interpretations of quantum mechanics.

Another part of what drove the search for a new interpretation of quantum mechanics was an interest in being able to apply quantum mechanics to the entire universe. This is what, at least in part, led Hugh Everett III to suggest that instead of having two mutually incompatible laws for the description of the evolution of states, we drop the law that is used when systems are observed (Everett 1957a, 1957b). The implication of this is that while the standard interpretation suggests that when a measurement is made, the superposition of a quantum particle collapses and the particle has a determinate measurable (and measured) property, the so-called “Everett interpretation” claims that there is no such collapse.  All the theories that have sprung from Everett’s “pure wave mechanics” have come to be known as “no-collapse” theories since they propose that there is no collapse of the superpositions.

One difficulty with no-collapse theories is making sense of how it is that we seem to have determinate measurement records for quantum particles even though those particles do not have a determinate value for the property measured, since they never collapse out of their superpositions. Another is the question of probability in a universe in which everything happens. Various interpretations of Everett have answered these issues differently. It is to a discussion of these various interpretations that we now turn.

3. Everettian Interpretations

a. Everett Plus Nothing (The Bare Theory)

Everett’s pure wave mechanics suggests that there is generally no determinate fact about the everyday properties of the objects in our world, since the equations that are supposed to describe such properties are such that they describe superpositions of those properties. Rather, Everett takes there to be only “relative states” and thus “relative properties” of quantum systems.

To see what he means by “relative states” and “relative properties,” consider the following. When we want to learn the value of a property for some system, we measure for that property.  But Everett treats measuring devices just as he would any other system with which the object system interacts, and so the measuring device will become correlated with the system that it is measuring.  In order to learn anything about one subsystem, even the reading on a measuring device, one must make reference to the complement of the subsystem:

As a result of the [measurement] interaction the state of the measuring apparatus is no longer capable of independent definition. It can be defined only relative to the state of the object system. In other words, there exists only a correlation between the two states of the two systems. It seems as if nothing can ever be settled by such a measurement . . . There is no longer any independent system state or observer state, although the two have become correlated in a one-one manner (Everett 1957b: 144, 146).

Everett explains “correlation” this way: “If one makes the statement that two variables, X and Y, are correlated, what is basically meant is that one learns something about one variable when he is told the value of the other” (Everett 2012: 61; this definition also shows up in Everett 1973: 17). Even in this case, it only “seems” as if nothing can be settled because of this correlation between an object system and a measuring device.  In fact, one can settle matters with the use of relative states:

. . . a constituent subsystem cannot be said to be in any single well-defined state, independently of the remainder of the composite system.  To any arbitrarily chosen state for one subsystem there will correspond a unique relative state for the remainder of the composite system.  This relative state will usually depend upon the choice of state for the first subsystem.  Thus the state of one subsystem does not have an independent existence, but is fixed only by the state of the remaining subsystem.  In other words, the states occupied by the subsystems are not independent, but correlated.  Such correlations between systems arise whenever systems interact (Everett 1957b: 142; Everett’s emphasis).

Consider again what happens when we measure the z-spin of an x-spin up electron. Before the measurement interaction, the state of the system consisting of the electron, e, and the measuring device, m, can be expressed in this way:

(1)        |m>ready 1/√2(|↑z>e + |↓z>e)

Where “|m>ready” is what we use to express that the measuring device is ready to make a measurement, “|↑z>e” expresses that the electron is z-spin up and “|↓z>e” expresses that it is z-spin down. When we measure the z-spin, the measuring device interacts with the system it is measuring and becomes a part of the system. After that interaction, the state of the system that consists of the measuring device and the electron can be expressed in this way:

(2)        1/√2(|↑z>e |“↑”z>m + |↓z>e |“↓”z>m)

Where “|“↑”z>m” expresses that the measuring device has recorded z-spin up and “|“↓”z>m” expresses that the measuring device has recorded z-spin down.

In this state, the measuring device is in a superposition of reading z-spin up and z-spin down. Everett writes, “one can . . . look upon the total wave function . . . as a superposition of pairs of subsystem states, each element of which has a definite q value . . . for each of which the apparatus has recorded a definite value . . .” (Everett 1957a: 58, 59; Everett’s emphasis). So the way we get an explanation of our determinate measurement records is by understanding that they are records of relative states of a system.
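To make the structure of state (2) concrete, here is a small computational sketch (my own illustration, with an assumed column-vector encoding of the electron and pointer states; it is not Everett’s notation). The post-measurement state is built in the tensor-product space, and its nonzero amplitudes pair each electron state with exactly one pointer record, which is the correlation that talk of relative states trades on.

```python
import numpy as np

# Assumed encoding: |up_z> = [1, 0], |down_z> = [0, 1] for the electron,
# and the same two-dimensional convention for the pointer record.
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

# State (2): 1/sqrt(2)(|up_z>_e |"up"_z>_m + |down_z>_e |"down"_z>_m)
state = (np.kron(up, up) + np.kron(down, down)) / np.sqrt(2)

# Reshape so rows index the electron and columns index the pointer record.
amplitudes = state.reshape(2, 2)
print(amplitudes)
# Roughly [[0.707, 0], [0, 0.707]]: the off-diagonal terms vanish, so relative
# to the electron being z-spin up the device reads "up", and relative to
# z-spin down it reads "down".
```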

Using the concept of relative states, we can say two things: (1) relative to the electron’s being z-spin up, the measuring device recorded “z-spin up” as the value of the electron’s z-spin; (2) relative to the electron’s being z-spin down, the measuring device recorded “z-spin down” as the value of the electron’s z-spin. This explains our determinate measurement results because if we ask a reliable observer whether he got a determinate measurement result for an experiment, he will always say “yes,” even when his measurement result is a superposition of outcomes and not one outcome determinately (Albert 1992, Barrett 1999). Here is why:

Let us say that there is a reliable observer who is about to measure the z-spin of an x-spin up electron. By “reliable” we mean that when we ask him whether he has a measurement record, if he has one then he will answer that he does; if he does not have one, he will answer that he does not. Let us also say that our observer is truthful.  He will always truthfully report what he has as a measurement record—if he recorded “z-spin down,” then he will answer “z-spin down” when asked what he recorded. Recall that x-spin up electrons are in a superposition of being z-spin up and z-spin down. So when we describe the state of the electron mathematically there will be one summand that describes the electron as z-spin up and one that describes it as z-spin down.

According to Everett’s pure wave mechanics, when our observer makes a measurement of the electron he does not cause a collapse, but instead becomes correlated with the electron. What this means is that where once we had a system that consisted of just an electron, there is now a system that consists of the electron and the observer. The mathematical equation that describes the state of the new system has one summand in which the electron is z-spin up and the observer measured “z-spin up” and another in which the electron is z-spin down and the observer measured “z-spin down.” In both summands our observer got a determinate measurement record, so in both, if we ask him whether he got a determinate record, he will say “yes.” If, as in this case, all summands share a property (in this case the property of our observer saying “yes” when asked if he got a determinate measurement record), then that property is determinate.
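
The bare theory's central move can also be checked directly. In the following toy sketch (my own illustration, with a three-state observer whose "ready" state would answer "no" and whose two record states would answer "yes"), the post-measurement state is an eigenstate of the "would you answer 'yes'?" observable but not of the "which record do you have?" observable:

    import numpy as np

    up_z, down_z = np.eye(2)
    ready, rec_up, rec_dn = np.eye(3)        # toy observer: no record, "up" record, "down" record

    # Post-measurement state of electron + observer, as described in the text
    psi = (np.kron(up_z, rec_up) + np.kron(down_z, rec_dn)) / np.sqrt(2)

    # "Would you answer 'yes' if asked whether you have a determinate record?"
    # (1 = yes for both record states, 0 = no for the ready state)
    A_yes = np.kron(np.eye(2), np.diag([0.0, 1.0, 1.0]))

    # "Which record do you have?" (+1 for "up", -1 for "down", 0 for none)
    R = np.kron(np.eye(2), np.diag([0.0, 1.0, -1.0]))

    print(np.allclose(A_yes @ psi, psi))                            # True: the "yes" answer is determinate
    print(np.allclose(R @ psi, psi) or np.allclose(R @ psi, -psi))  # False: no particular record is

This is exactly the situation described above: asked whether he has a determinate record, the modeled observer would determinately answer "yes," even though neither particular record is determinate.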

This is strange because he did not in fact get a determinate measurement record; he instead recorded a superposition of two outcomes. After our observer measures an x-spin up electron's z-spin, he will not have determinately gotten either "z-spin up" or "z-spin down" as his record. Rather, he will have determinately gotten "z-spin up or z-spin down," since his state will have become correlated with the state of the electron due to his interaction with it through measurement. Everett believed he had explained determinate experience through the use of relative states (Everett 1957b: 146; Everett 1973: 63, 68–70, 98–9). It is largely agreed among Everettians that he did not succeed.

This sparse interpretation of Everett, adding no metaphysics or special assumptions to the theory, has come to be known as the "bare theory." One might say that the bare theory predicts disjunctive outcomes, since the observer will report that she got "either z-spin up or z-spin down"—without any determinate classical outcome—without being in a state where she would determinately report that she got "z-spin up" or determinately report that she got "z-spin down" (Barrett 1999). So, if the problem is to explain how we end up with determinate measurement results, the bare theory does not provide us with that explanation. Something must be added to Everett's account.

For more on the bare theory see Albert and Loewer 1988, Albert 1992 and Barrett 1999. That Everett was uninterested in the philosophical implications of his work has been argued by Barrett 2010, Barrett and Byrne 2012 and Bevers 2011, though Deutsch 2010 and Wallace 2012 dispute this conclusion.

b. Everett Plus Minds

One suggestion about what to add to Everett’s account is to suppose that every time we are faced with an entangled state such as the state in which our observer found himself in the last section, we conclude that he got a determinate result from a particular perspective. To see what motivates this, consider what happens when an observer goes from being ready to read the result on a measuring device:

(3) |o>ready 1/√2(|↑z>e |“↑”z>m + |↓z>e|“↓”z>m)

to having read the device:

(4) 1/√2(|↑z>e |“↑”z>m |“↑”z>o + |↓z>e|“↓”z>m|“↓”z>o).

Here we use “|“↑”z>o” to represent the state of the observer having read “z-spin up” off the measuring device’s pointer. If the observer forms beliefs about the z-spin of the electron based on the results of the experiment (as it seems reasonable to presume), then the state of the system that now contains the observer will be the following:

(5) 1/√2(|↑z>e |“↑”z>m|“↑”z>o |BEL“↑”z>o + |↓z>e|“↓”z>m|“↓”z>o |BEL“↓”z>o)

Here “|BEL“↑”z>o” expresses that the observer believes that the electron is z-spin up. This state of the system implies that our observer is in a superposition of belief states (Albert and Loewer, 1988: 197). But our observer doesn’t feel like he is in a superposition of belief states. He feels like he has a definite result for the z-spin measurement of the electron. David Albert and Barry Loewer set out to produce an interpretation of Everett’s pure wave mechanics that “explains how it is that we always ‘see’ (mistakenly so . . .) macroscopic objects as not being in superpositions and never experience ourselves as in superpositions” (Albert and Loewer, 1988: 203). They propose that it is a function of the evolution of one’s mental state that explains one’s experiences (Albert and Loewer, 1988; Albert 1992).

Albert and Loewer begin by supposing that mental states supervene on particular brain states and are accurately accessible to introspection (204). They take it that the state of our observer believing that he read "z-spin up" is identical with a physical brain state that is distinct from the physical brain state that is associated with our observer believing he read "z-spin down" (204). Albert and Loewer do not explain this supervenience relation; they merely state it as a given fact. But, they argue, this fact is inconsistent with taking our observer to be reliable and with his holding the belief that there was a determinate measurement record obtained when he measured the z-spin of an x-spin up electron. This is because, in the state expressed by (5), our observer will answer "yes" when asked, "Does the electron have a determinate z-spin?" since it does have a determinate z-spin in each term. But our observer does not believe that the electron is z-spin up, nor does he believe that it is z-spin down, since his brain is not in the state that corresponds to either of those beliefs; his brain is in a superposition of states (Albert and Loewer, 1988: 204). So Albert and Loewer give up the connection between brain states and belief states. They accept a "modest . . . non-physicalism" (205).

The first way they explain this is with what they call the single mind view. This view adds to quantum theory the principle that the evolution of an observer’s mental states is probabilistic. In the state expressed by (4), our observer starts out with no beliefs about the z-spin of the electron. But when he measures the z-spin, the probability is 50% that he ends up believing it is z-spin up and 50% that he ends up believing it is z-spin down. The association of belief states with physical states is dictated by probability and is determined by the quantum evolution of the system that consists of him, the electron and the measuring device (Albert and Loewer, 1988: 205–206). Thus, mental states are never in superpositions, even if physical brain states are.

Albert and Loewer immediately recognized certain problems with this view (1988: 206). The dualism it involves is particularly problematic, since it means that mental states do not supervene on brain states, or on any physical states at all: "one cannot tell from the state of a brain what its single mind believes" (206). Additionally, in the superposition that describes the state of the system in question, all the terms but one will represent "mindless brains" (often referred to as "mindless hulks"), and which term has a mind associated with it is impossible to determine from the quantum formalism or from experiment. Jeffrey Barrett describes this problem nicely (1999). If two observers are measuring the z-spin of our x-spin up electron, then the system of the observers, the measuring device and the electron will evolve from this state

(6) |o1>ready|o2>ready 1/√2(|↑z>e |“↑”z>m + |↓z>e|“↓”z>m)

to this one

(7) 1/√2(|↑z>e |“↑”z>m |“↑”z>o1|“↑”z>o2 + |↓z>e|“↓”z>m|“↓”z>o1|“↓”z>o2).

There is nothing in the dynamics proposed by Albert and Loewer that prevents the mental states of the two observers from being associated with the same term or with different terms, and nothing that tells us which is the case. In each term of (7) there is a physical brain state for each observer, but each observer's single mind is associated with only one term, and the two minds need not be associated with the same one. Thus, it is possible that observer 1 is associated with the first term of (7) and observer 2 with the second, but that neither of them realizes it. As such, observer 1 will (correctly) remember having gotten "z-spin up" as the result of her measurement but will (incorrectly) believe that her friend got the same result. The same will hold true, mutatis mutandis, for observer 2. There is no way to determine whether or not one is speaking to a mindless hulk rather than to a set of physical states with which there is associated a mental state (Barrett, 1999: 189–190).
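
To illustrate Barrett's worry, here is a toy Python simulation. It is an illustration of my own, and it deliberately adds something the single minds dynamics itself leaves open, namely that the two minds are assigned to terms independently of one another:

    import numpy as np

    rng = np.random.default_rng(0)

    # Norm-squared weights of the two terms in (7)
    branch_weights = np.array([0.5, 0.5])

    # Single minds view, read (for illustration) as assigning each observer's mind to one
    # term, with the norm-squared weights as probabilities and with nothing correlating
    # the two assignments
    n_runs = 100_000
    mind_1 = rng.choice(2, size=n_runs, p=branch_weights)
    mind_2 = rng.choice(2, size=n_runs, p=branch_weights)

    # Fraction of runs in which the two minds sit in different terms of (7): each observer
    # is then, on this reading, conversing with a mindless hulk of the other
    print(np.mean(mind_1 != mind_2))        # approximately 0.5

On this reading, half the time the two observers' minds inhabit different terms, and neither observer could ever detect that this is so.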

Barrett also points out that the single minds view predicts that when an observer repeats a measurement she may get a different result from what she first got and falsely remember having gotten a first measurement that matches her second (1999: 187–188). So this observer is in a position where she cannot trust that what she remembers is what actually happened. Here is why. Consider again the state represented in (4). If our observer were to repeat her measurement of the z-spin of the electron, then the second measurement would be identical to the first. The state would then be

(8) 1/√2(|↑z>e |"↑"z"↑"z>m |"↑"z"↑"z>o + |↓z>e |"↓"z"↓"z>m |"↓"z"↓"z>o).

Even if our observer's mind ended up in the term in which she read "z-spin up" for her first measurement, there is still a 50/50 chance of her mental state being associated with each term in (8) after the second measurement. Thus, there is a 50% chance that her mind will end up associated with the second term of (8), in which case she will (correctly) believe that the results of her two measurements agreed, but she will (incorrectly) believe that she got "z-spin down" each time.

To avoid some of these problems, Albert and Loewer propose what they call the many minds view. This view takes it that every observer is associated with an infinite set of minds. This immediately solves the mindless brains problem since there are minds associated with each term in the state of a system that includes an observer. It also solves the problem of dualism in the sense that “mental states are determined by or supervenient on brain (or brain + environment) states” (206), though the minds are still non-physical in that they are not subject to the rules of quantum mechanics for their evolution. This is a benefit of the many minds view since mental states will never (in accordance with our experience) be in a superposition. But this benefit leads to an additional difficulty in that it retains a certain element of dualism. The only mental state that does any supervening on an observer’s physical state is what might be called the “global” mental state. This is the state that is associated with the entire set of minds. It is the only thing that evolves according to quantum mechanical dynamics. Importantly, what might be called the “local” state, the state to which an observer has direct introspective access, does not so supervene. If it did, it would also evolve according to the dynamics, but Albert and Loewer take it as a benefit to the many minds view that it does not. (For more on this objection see Barrett, 1999: 194–196 and Lockwood, 1996: 174.)

Determinate experience is just one part of the dilemma of Everett interpretation. The other is probability. The many minds view interprets the norm squared of the coefficients on the terms as giving the proportion of minds associated with each term in the state of the system. In this view, probabilities "refer . . . to sequences of states of individual minds" (208). In (4) our observer is in a brain state that has a mind associated with it that has no belief about the z-spin of the electron, but she can predict what the probability is of ending up with a mind that believes she observed z-spin up or z-spin down; she can predict what the sequence of her mental states will be, according to the norm squared of the coefficients. Knowing that (3) evolves into (4) after a measurement, and accepting the reasonable claim that beliefs are formed when the observer looks at the measuring device, our observer knows that half her minds will believe that she measured the electron as being z-spin up and half will believe she measured it as z-spin down.

It is debatable whether Albert and Loewer’s many minds view succeeds in answering the questions of determinate experience and probability that come with any interpretation of Everett and whether the dualism that is inherent in their view is a price worth paying. Michael Lockwood (1996) believed that it was not a price worth paying and so worked to develop a competing many minds view that did not suffer this problem.

Lockwood’s version of many minds argues that “associated with a sentient being at any given time, there is a multiplicity of distinct conscious points of view . . .  it is these conscious points of view or ‘minds’ . . . that are to be conceived as literally dividing or differentiating over time” (1996: 170). For him there is a sense in which our observer can regard herself as having just one mind. He calls this her “multimind” or “Mind,” and it consists of all the minds (lower-case “m”) that are described in the terms of the state of the system of which the observer is a part (1996: 177). Each mind has a “maximal experience” that describes its complete state of consciousness, but it should not be identified with a state of the Mind of the observer. Lockwood takes it that there is “complete supervenience of the mental on the physical” and so he avoids the dualism that plagues Albert and Loewer’s version (1996: 184). To do this, though, he agrees with Albert and Loewer that one must give up “the assumption that . . . there is a uniquely correct way of linking earlier and later maximal experiences of the same Mind together to form persisting minds . . .” but he does not take that to be fatal to his project (1996: 183–184). For Lockwood, each mind makes up one subset of the Mind and “each stands in an equal relation of succession to the given . . . maximal experience . . . with which we started” (Lockwood, 1996: 183). To go back to our example, there is a mind associated with each term in (4), and these go to make up the Mind of the observer. Each of these minds has equal claim to be the successor to the mind in (3) that was in the ready state. So it is unclear who the observer in (3) should expect to be once the measurement is made.

Diachronic personal identity leads to the problem of how to interpret probabilities from this viewpoint. Lockwood believes that he has a meaningful interpretation of probability as “a naturally preferred measure on sets of simultaneous maximal experiences” (1996: 182). He wants to analogize this to duration so that in analogy to “this pain lasted twice as long as the last one” he can say something like, “this pain is, superpositionally speaking, twice as extensive as that” (182; Lockwood’s emphasis). The extensiveness of an experience is a function of its “temporal ‘length’ and of superpositional ‘width’” in the sense that it has a higher coefficient associated with its term (182). It is largely agreed that Lockwood does not have a sufficient explanation for probability. Loewer (1996) and Barrett (1999) both argue that Lockwood’s conception of probability is insufficient for explaining how we are to take probabilities to be guides to rational decision making and prediction. Lockwood cannot explain how the norm squared of the coefficient ought to guide an observer’s predictions about what her experience will be in the future since there is no way to track her over time. In our example above, when we ask what the probability will be of our observer in (3) seeing “z-spin up” upon measurement, we have to say it will be 1. The same is true for “z-spin down,” since in some term there is a mind that goes to make up the Mind of the observer that observes each measurement result.

Barrett (1999) argues that aside from this problem, Lockwood has a “very unusual notion of probability in mind” (209). He wants to “introduce a (sic) entirely new notion of probability,” and one that Barrett finds puzzling and insufficient for explaining an observer’s determinate experiences (209–210).

There are several other interpretations of Everett that are in the same vein as the single minds and many minds views of Albert and Loewer and Lockwood, but none of them are quite as fully worked out as those presented here. The interested reader may want to look at Zeh 1981, Squires 1987, Donald 1990, Squires 1991, Stapp 1991, Squires 1993, Donald 1993, Donald 1995, and Page 1995.

c. Everett Plus Worlds (DeWitt’s Splitting Worlds)

What is arguably the most common interpretation of Everett’s formulation is what Bryce DeWitt called the many-worlds interpretation [MWI] (DeWitt 1968 and Wheeler 1998: 269–70). In his 1967 lecture on what he calls the “Everett-Wheeler interpretation” of quantum mechanics, DeWitt takes Everett’s claim that:

. . . with each succeeding observation (or interaction), the observer state “branches” into a number of different outcomes of the measurement . . . for the object-system state.  All branches exist simultaneously in the superposition after any given sequence of observations (Everett 1957a: 25–6; Everett 1957b: 146).

to imply that we are forced:

. . . to believe in the ‘reality’ of all the simultaneous ‘worlds’ represented in the superposition [in which we find the universe after a measurement interaction] . . . in each of which the measurement has yielded a different outcome (DeWitt 1968: 326).

The first published version of Everett’s dissertation began to gain popularity when it was reprinted in The Many Worlds Interpretation of Quantum Mechanics (DeWitt and Graham 1973).  In his papers in this volume, DeWitt explicitly refers to the “reality of all simultaneous worlds” that he believes are implied by his reading of Everett and to the “reality composed of many worlds” that he takes Everett’s formalism to have taught us is true about our universe (DeWitt 1970: 161, DeWitt 1971: 182). So from this point on, Everett’s interpretation has most commonly been known as the “many worlds interpretation” of quantum mechanics.

The branching that occurs in DeWitt’s MWI can be interpreted in several different ways. One possibility is DeWitt’s way, which suggests that:

[o]ur universe must be viewed as constantly splitting into a stupendous number of branches, all resulting from the measurementlike interactions between its myriads of components . . . every quantum transition taking place on every star, in every galaxy, in every remote corner of the universe is splitting our local world on earth into myriads of copies of itself (DeWitt 1970: 178).

DeWitt takes a strong realist position with regard to the worlds that result from the branches splitting.  He takes each branch to be "a possible universe-as-we-actually-see-it" (DeWitt 1970: 163) and believes that in spite of the fact that "all branches must be regarded as equally real" (DeWitt 1970: 178; see also Everett 1957b: note added in proof), we inhabit only one of the worlds that go to make up reality and we have no access to other worlds (DeWitt 1970: 182).

In order to see how this fares with regards to the question of determinate measurement results, let us make a distinction between a local I, IL, and a global I, IG. IL is the self who experiences determinate measurement results when conducting quantum mechanical experiments; IL sees only one outcome of every experiment. IL is who we generally consider ourselves to be. So when DeWitt says that we inhabit only one world of the many possible worlds, we can take him to be referring to weL as a collection of ILs. We then can understand him to believe that there will be an IL for each world that is created in the split. In contrast, there is always only one IG. We can think of the IG as the self, were it to exist, which has access to all the worlds. IG can be thought of as someone outside the theory who can see the branching structure of the world and who knows every outcome of quantum mechanical experiments, someone who has a “god’s-eye view” of the universe as a whole. We as ILs do not seem to have access to the experiences of IG. Thus, the only thing that needs to be explained in terms of determinate measurement results is the perspective of IL since this is the only perspective we seem to have, and according to Everett, it is the only perspective that is of any importance to an observer (Everett 1973: 99).

Everett argues that relative states provide a way to understand an observerL‘s determinate measurement record. The only place where the indeterminacy shows up is in the global perspective of a state; the local, relative perspective will always be determinate. Everett argues that an observer will never have access to the global state and therefore will always and only have determinate measurement results relative to his state (Everett 1973).

DeWitt’s branching worlds follow along the same line. For him, there are many different branches of the universe and each branch splits every time there is an observation or correlation made. On each branch is an observerL who will split every time the branch does. Each observerL gets a determinate record of whatever outcome occurs on his branch, because that is exactly what happened on that branch.

i. Problems with the Notion of Splitting

However, objections have been raised to DeWitt's proposal that we take each branch to be "a possible universe-as-we-actually-see-it." The main objection has been that the idea that we in some way split or branch seems absurd given our experience in the world (Everett 1957c: 2; DeWitt 1971: 179; DeWitt and Graham 1973: 161; Xavier 1962: Tuesday Morning, p. 20).

Additionally, one might object to what turns out to be an infinite (possibly uncountable) number of worlds (Healey 1984; Saunders 1997). Some early many-worlds interpreters have taken Everett to be implying such a profligacy of worlds when he writes: "[a]ll branches exist simultaneously in the superposition after any given sequence of observations" (Everett 1957b: 146), and so "[f]rom the viewpoint of the theory all elements of a superposition (all ‘branches’) are ‘actual’, none any more ‘real’ than the rest" (Everett 1957b: note added in proof; Everett’s emphasis).

There appear to be two distinct issues raised.  The first is that a multitude of worlds, always splitting, seems to defy our sense that we are not splitting.  The second is that it seems to run counter to intuition to propose that there are multiple copies of the world, and of people, that constitute the universe.  Let us see how these objections are addressed one at a time as doing so will shed light on Everett’s thoughts about the relative state formulation.

The first objection, that we do not feel the splitting, is addressed by Everett both in his response to DeWitt’s letter in 1957 (Everett 1957c) and ultimately in the short dissertation (Everett 1957b).  In the letter to DeWitt he writes:

I must confess that I do not see this “branching process” as the “vast contradiction” that you do.  The theory is in full accord with our experience (at least insofar as ordinary quantum mechanics is).  It is in full accord just because it is possible to show that no observer would ever be aware of any “branching,” which is alien to our experience as you point out.

The whole issue of the “transition from the possible to the actual” is taken care of in a very simple way – there is no such transition, nor is such a transition necessary for the theory to be in accord with our experience.

From the viewpoint of the theory, all elements of a superposition (all "branches") are "actual," none any more "real" than another.  It is completely unnecessary to suppose that after an observation somehow one element of the final superposition is selected to be awarded with a mysterious quality called "reality" and the others condemned to oblivion – they won’t cause any trouble anyway because all the separate elements of the superposition ("branches") individually obey the wave equation with complete indifference to the presence or absence ("actuality" or not) of any other elements.

This is only to say that the theory manages to avoid the difficulty of the “transition from possible to actual” – and I consider this to be not a weakness, but rather a great strength of the theory.  The theory is isomorphic with experience when one takes the trouble to see what the theory itself says our experience will be.  Little more can be asked of it without exposing a naked philosophic prejudice of one kind or another (Everett 1957c: 3; Everett’s emphasis).

In the short dissertation this becomes:

Arguments that the world picture presented by this theory is contradicted by experience, because we are unaware of any branching process, are like the criticism of the Copernican theory that the mobility of the earth as a real physical fact is incompatible with the common sense interpretation of nature because we feel no such motion.  In both cases the argument fails when it is shown that the theory itself predicts that our experience will be what it in fact is.  (In the Copernican case the addition of Newtonian physics was required to be able to show that the earth’s inhabitants would be unaware of any motion of the earth.) (Everett 1957b: note added in proof).

Everett had no problem with an observerL not feeling the split because his theory predicts that the splitting of worlds is not something of which IL could be aware.  An observerL will never have access to the global state of a system and so will never be able to observe anything but the state of his branch.  It requires a perspectiveG to observe any branching event.  Since there can never be any observation of a branching event, there can never be a physical record (memory) of the event.

The second objection comes from those who want the most economical metaphysics possible.  Such philosophers might argue that to have an infinite number of worlds after a split is metaphysically extravagant.  While some theorists who first encounter this idea initially balk at it, ultimately most do not find anything terribly objectionable in it.  In fact, most MWI theorists embrace the notion of a multitude of existing worlds.  If the MWI as proposed by DeWitt solves the quantum measurement problem, then this metaphysical extravagance may be worth the cost.

For more on the story of the evolution from the long to the short thesis, and on Everett’s life, see Byrne 2010.

ii. The Preferred Basis Problem

DeWitt’s MWI fails to provide us with any explanation of when worlds split.  Unfortunately, saying when they do is just as difficult as saying when a collapse occurs, which is one way to understand the quantum measurement problem. To determine when a world splits, we would first need to know in which basis we should write the universal state. If we knew this, then we would know that a split has occurred because a new term would show up in the global state when it is written in the preferred basis.  The choice of basis also determines which properties in the universe are determinate—namely, those that are represented by vectors that are on the axes of the basis—and which worlds exist after a measurement interaction. DeWitt does not provide any way to choose a particular basis, and any way that we might suggest in the context of his MWI would be blatantly ad hoc.  This problem has come to be known as the preferred basis problem.

Even though DeWitt’s MWI seems to provide us with an explanation of why we get determinate measurement records, that explanation assumes that there is a basis in which the universal state has been written that guarantees that those measurement records are in fact determinate.  In order to be able to assert the determinateness of records in a particular branch, the basis that we choose ought to be one in which those records are determinate.  But the basis in which the pointer on our measuring device has a determinate position and the basis in which our mental states are determinate are not necessarily the same.  It is not clear that we can choose a basis in which they will both be determinate, not to mention all the other things that we want to have as determinate properties in order to be conscious beings capable of successfully completing a quantum mechanical experiment.

Without a preferred basis in which to write the universal state, one is free to choose any basis one likes with the result being that in different bases there will be different decompositions of the universal state vector.  If each term in the expansion of the universal state is taken to represent a different possible world, or a different description of a world, then with each choice of basis there will be different terms and so different worlds. Thus, without a preferred basis, there is no fact of the matter as to when splits occur, no fact of the matter as to which properties in the universe are determinate and no fact of the matter as to which worlds go to make up the universe; rather, these are all a function of the choice of basis.
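
A small numerical check (an illustration of my own, using a pair of spin-half particles rather than the electron-plus-device example above) makes the basis ambiguity vivid: one and the same entangled state decomposes into two branches with determinate z-spin or into two branches with determinate x-spin, depending entirely on the basis chosen.

    import numpy as np

    up_z, down_z = np.eye(2)
    up_x = (up_z + down_z) / np.sqrt(2)
    down_x = (up_z - down_z) / np.sqrt(2)

    # One and the same two-particle state ...
    state_in_z_terms = (np.kron(up_z, up_z) + np.kron(down_z, down_z)) / np.sqrt(2)

    # ... also decomposes as "both particles x-spin up" plus "both particles x-spin down"
    state_in_x_terms = (np.kron(up_x, up_x) + np.kron(down_x, down_x)) / np.sqrt(2)

    print(np.allclose(state_in_z_terms, state_in_x_terms))   # True

Which property the "worlds" make determinate, and how many terms the splitting produces in more complicated cases, is thus fixed only once a basis has been chosen.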

So we are left without answers to several questions: What worlds are there?  Which properties are determinate?  When do worlds split?  Solving this is as difficult as solving the original quantum measurement problem and in fact is a version of it (Barrett 1999: 176).  Thus, no progress has been made toward solving the measurement problem, insofar as this is a problem that could be solved by saying when collapses or splits occur.  So the cost of DeWitt’s MWI is the reintroduction of the measurement problem, something Everett’s pure wave mechanics was developed to avoid.  Fortunately for those sympathetic to Everett’s ideas, quantum mechanics without the collapse postulate has evolved in the decades since DeWitt’s writing.  In what follows we will see how the concept of “many worlds” has been developed within the interpretation of Everett’s physics.

4. The Evolution of Many Worlds Interpretations

a. The Problem of Probability and Graham’s Attempt at Adding a Measure

Something that the standard interpretation of quantum mechanics provides that is missing in pure wave mechanics is a way to account for probabilistic claims. In any MWI of Everett’s relative state formulation, there is in some sense a different world in which every possible outcome of an experiment occurs; every world is real, and therefore every outcome occurs.  We lose the ability to say, “Event e happens with probability p” where p is less than 1. Everett writes, “In order to establish quantitative results, we must put some sort of measure (weighting) on the elements of the final superposition” (Everett 1957b: 147).

In the standard interpretation, this is done with the use of the Born rule (Born 1926). The Born rule is what physicists use to assign probabilities to the outcomes of quantum mechanical experiments. For each term in the state of the system, written in some basis, there is associated with it a complex number, the amplitude of that region of the wave function. The Born rule says to take that number and square its norm in order to get the probability of that term being the outcome of a measurement. But in EQM, a deterministic theory in which everything happens, the Born rule does not prima facie seem to be applicable.
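
For reference, the rule itself is a one-line computation. In the following Python sketch the particular amplitudes are arbitrary; the point is only the recipe of squaring the norm of each coefficient:

    import numpy as np

    # Amplitudes on the terms of a state written in some basis (an arbitrary normalized example)
    amplitudes = np.array([0.6, 0.8j, 0.0])

    # Born rule: the probability of each outcome is the squared norm of its amplitude
    probabilities = np.abs(amplitudes) ** 2
    print(probabilities)          # [0.36 0.64 0.  ]
    print(probabilities.sum())    # 1.0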

To be able to derive probability from within his theory, Everett first needed to define what he meant by a "typical element of the final superposition" since "typical" presupposes some notion of probability. Presumably he meant an element in which the predictions of quantum mechanics are borne out. But if he determined what is "typical" by counting up all the branches, taking there to be one world for each term in the expansion of the universal state when it is written in the preferred basis, and calling a world "typical" when it displays the same results as most of the others, then there is a problem because in a large majority of the worlds, by this measure, the quantum statistics will not even be close to true (Graham 1973: 235).

Neill Graham was the first to suggest that there was something missing from Everett's derivation of a probability measure. He writes that the worlds that display the proper relative frequency are "in a numerical minority" and "any attempt to show that the probability interpretation holds in the majority of the resulting Everett worlds is doomed to failure" (Graham 1973: 236). (See also Barrett 1999: 168–70 for a very clear explanation of why the statistics fail to work for any coefficient other than 1/√2.) But Everett does not simply propose a counting measure; he believes that he has shown that the "only choice of measure" is the square amplitude measure (Everett 1957b: 147). A typical world is then one for which the value of the square amplitude measure is high.  But Everett's use of this measure is not a solution to the original problem. The original problem was to derive probability from a deterministic theory in which everything happens. To solve this one needs to add the assumption that after a measurement one should expect to find oneself in a "typical" world, that typicality being determined by the norm-square of the coefficient on the term that describes that world; the higher the value, the more "typical" the world. But this is akin to claiming that the Born rule holds.  (For more on this see Barrett 1999: 168–173, Wallace 2007 and 2012.)
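
Graham's complaint can be checked directly. The following Python sketch is an illustration of mine; it assumes one branch per term and an electron whose squared amplitudes for z-spin up and down are 0.9 and 0.1, and it compares a naive branch-counting measure with the square amplitude measure over a run of one hundred measurements:

    from math import comb

    n = 100                  # number of z-spin measurements recorded in one branch history
    p_up = 0.9               # squared amplitude of the "up" outcome on each measurement

    count_near, weight_near = 0, 0.0
    for k in range(n + 1):                         # k = number of "up" results in a branch
        n_branches = comb(n, k)                    # branches showing exactly k "up" results
        branch_weight = (p_up ** k) * ((1 - p_up) ** (n - k))
        if abs(k / n - p_up) <= 0.05:              # branches whose statistics look Born-like
            count_near += n_branches
            weight_near += n_branches * branch_weight

    print(count_near / 2 ** n)   # vanishingly small (of order 10^-13): by counting, almost no branches look Born-like
    print(weight_near)           # roughly 0.93: by square amplitude, almost all the weight does

By sheer branch count, almost every branch shows relative frequencies near one half rather than near 0.9; only when branches are weighted by the square amplitude measure do Born-like statistics come out as typical.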

Early 21st century MWI theorists take a very different view of the multiplicity of the world, thereby solving some of the problems inherent in earlier MWIs. This view is in line with what David Wallace has proposed, namely that the multiple branches of the universe that arise from the mathematical theory of quantum mechanics are emergent (Wallace 2012). Because most of those who have developed and are proponents of Wallace’s emergent branching universe view are or have been located around Oxford, we will call this view the “Oxford MWI.”

b. The Oxford MWI

The main difference between Wallace's emergent branching universe view (the Oxford MWI) and the MWIs that came before is that prior MWI theorists, in large part, understand the wave function to be a real entity, leading to a real multiplicity of worlds at the fundamental level of the theory. (For more on the question of realism regarding the wave function, see Ney and Albert 2013.) Wallace, on the other hand, sees these worlds as emergent from the underlying microphysical description of the universe. They are no less real, but they are structural facts that are instantiated within EQM (Wallace 2012: Chapter 2). Wallace explains how he conceptualizes these structures with what he calls "Dennett's criterion": "A macro-object is a pattern, and the existence of a pattern as a real thing depends on the usefulness – in particular, the explanatory power and predictive reliability – of theories which admit that pattern in their ontology" (Wallace 2010a: 58, Wallace 2012: 50).

Wallace asks us to consider a tiger and its hunting patterns. We can describe both in terms of electrons and atoms, but in that description we do not see the patterns that emerge when we consider them at the macroscopic level. The tiger atoms and the swirl of atoms and energy that make up the hunting patterns are real, objective parts of the microphysical system, but they are not practically useful for predicting how tigers behave in the wild. Zoology cannot be reduced to physics. Rather, physics instantiates zoology. Wallace illustrates the instantiation relation with the example of the relation between quantum mechanics and a classical conception of the solar system. Classical mechanics is instantiated by quantum mechanics and is applicable to the solar system because some of the solar system’s properties “approximately instantiate a classical-mechanical dynamical system” (Wallace 2012: 56).

Applying these considerations to EQM, one can say that the microphysical description of the universe contains descriptions of states of affairs that are structured like the macroscopic objects we encounter in the world. When two of these states are superposed with one another, the quantum state of the system instantiates two different macroscopic systems at once. Thus, one can say that two different worlds emerge from the microphysical-level description that instantiates them. Wallace writes that “there are entities whose existence is entailed by the theory which deserve the name ‘worlds’” (Wallace 2010a: 68); thus we should take these worlds to be real entities.

Wallace takes decoherence to be the only consideration that should determine how the universe ought to be carved up. A system is said to "decohere" when it becomes correlated with something in its environment, thereby destroying the interference effects that would otherwise have been present had the system remained isolated in a pure superposition. Decoherence theorists take the destruction of the interference effects to be all that is required to explain determinate experience. Because such correlations are ubiquitous and extremely fast, and because it is, in practice, impossible to isolate a macroscopic system from its environment sufficiently to prevent such correlations (even the best isolated system, if it is above absolute zero, will radiate heat and therefore interact with its environment), these correlations will produce results that seem to indicate that a collapse has occurred, not that the system is still in a superposition. The property that decoherence picks out is very close to position when one is considering an experiment that requires us to see the position of a pointer on a measuring device (as pointing either "up" or "down," for example). (For more on decoherence see the original formulations of it in Zeh 1970, Zeh 1973, Zurek 1981, and Zurek 1982. In Zurek 1991, Zeh 1995, Giulini et al 1996, Butterfield 2001, Zurek 2002 and Schlosshauer 2007 the reader can find very accessible introductions to and discussions of decoherence.)
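
A toy model shows the characteristic effect. The following Python sketch is my own illustration, not Wallace's or the decoherence theorists'; the "environment" is just a collection of qubits that each imperfectly monitor the system. Tracing out the environment, the interference (off-diagonal) terms of the system's reduced state shrink rapidly as more of the environment becomes correlated with it:

    import numpy as np

    def reduced_density_matrix(psi, dim_sys, dim_env):
        """Trace the environment out of a pure system-plus-environment state."""
        m = psi.reshape(dim_sys, dim_env)
        return m @ m.conj().T

    ket0, ket1 = np.eye(2)
    a, b = 1 / np.sqrt(2), 1 / np.sqrt(2)        # the system starts in a|0> + b|1>

    # Each environment qubit ends up in |e0> if the system is |0> and in |e1> if it is |1>;
    # the monitoring is imperfect, since the overlap <e0|e1> is 0.8
    e0 = ket0
    e1 = 0.8 * ket0 + 0.6 * ket1

    for n_env in [0, 1, 5, 20]:
        env0, env1 = np.array([1.0]), np.array([1.0])
        for _ in range(n_env):                   # build the n-qubit environment states
            env0, env1 = np.kron(env0, e0), np.kron(env1, e1)
        psi = a * np.kron(ket0, env0) + b * np.kron(ket1, env1)
        rho = reduced_density_matrix(psi, 2, 2 ** n_env)
        print(n_env, round(abs(rho[0, 1]), 4))   # 0.5, 0.4, 0.1638, 0.0058: interference fades

The off-diagonal entries never reach exactly zero, which is one way of putting the objection, taken up below, that decoherence suppresses interference without ever producing a strictly determinate record.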

Given that there is no preferred way to carve up the universe, aside from decoherence considerations, and no particular “most-fine-grained” way to describe the quantum structure of the universe, there is also no fact of the matter about how many branches there are, but Wallace does not see this as a problem. Rather he sees it as misguided to even ask the question “How many worlds are there?” much as it is misguided to ask “How many experiences did you have yesterday?” (Wallace 2010a: 67–8, Wallace 2012: 99–102, 120).

Recall that part of the job of Everett interpretation is to explain how it is that we get determinate measurement records when we do quantum mechanics experiments. Wallace argues that “the emergence of a classical world from quantum mechanics is to be understood in terms of the emergence from the theory of certain sorts of structures and patterns” (2003: 5), and it is these structures that are in superpositions, not the emergent macroscopic objects. To see what he means here, it is worth quoting him at some length:

To see in a different way how the ideas of Sections 4-5 resolve the problem of macroscopic indefiniteness, consider the following sketch of the problem.

  1. After the experiment, there is a linear superposition of a live cat and a dead cat.
  2. Therefore, after the experiment the cat is in a linear superposition of being alive and being dead.
  3. Therefore, the macroscopic state of the cat is indefinite.
  4. This is either meaningless or refuted by experiment.

But (1) does not imply (2). The belief that it does is based upon an oversimplified view of the quantum formalism, in which there is a Hilbert space of cat states such that any vector in the space is a possible state of the cat. This is superficially plausible in view of the way that we treat microscopic subsystems: an electron or proton, for instance, is certainly understood this way, and any superposition of electron states is another electron state. But any state of a cat is actually a member of a Hilbert space containing states representing all possible macroscopic objects made out of the cat’s subatomic constituents. Because of Dennett’s criterion, this includes states which describe

a live cat;
a dead cat;
a dead dog;
this paper . . .

We can say (if we want, and within nonrelativistic quantum mechanics) that the particles which used to make up the cat are now jointly in a linear superposition of being a live cat and being a dead cat. But cats themselves are not the sort of things which can be in superpositions. Cats are by definition “patterns which behave like cats”, and there are definitely two such patterns in the superposition.

The point can be made more generally:

It makes sense to consider a superposition of patterns, but it is just meaningless to speak of a given pattern as being in a superposition (Wallace 2003: 12; Wallace’s emphases).

In the Oxford MWI, decoherence gives rise to the branching structures that make up the various worlds. And it is these structures that give rise to the patterns from which macroscopic objects emerge. Decoherence also causes the interference between branches to “wash out,” and so systems appear to have determinate values in the decoherence basis. (For more on the Oxford MWI see Albert 2010, Kent 2010, Maudlin 2010, Price 2010, Vaidman 2014 and Bacciagaluppi and Ismael 2015.)

There has been quite a bit of work on developing a theory of probability in the context of the Oxford MWI (Saunders 1995, Saunders 1998, Vaidman 1998, Wallace 2002, Wallace September 2003, Saunders 2005, Wallace 2006, Greaves January 2007, Wallace 2007, Greaves and Myrvold 2010, Saunders 2010, Wallace 2010b, Tappenden 2011, Vaidman 2012, Wallace 2012, Wilson 2013). The work generally focuses on two different concerns in probability theory: explaining how one can recover uncertainty from a deterministic world, and explaining how we can make sense of the fact that we seem to be able to take branch weights to be related to probability, as the Born rule suggests that we can in a standard collapse interpretation.

Simon Saunders presents a tripartite view of what role chance plays in standard single world views of probability:

(i) Chance is measured by statistics, and perhaps, among observable quantities, only statistics, but only with high chance. (ii) Chance is quantitatively linked to subjective degrees of belief, or credences: all else being equal, one who believes that the chance of E is p will set his credence in E equal to p (the so-called “principal principle”). (iii) Chance involves uncertainty; chance events, prior to their occurrence, are uncertain (Saunders 2010: 181).

Linking chance to statistics (as in (i)) was originally argued for by Everett (1973), as we have seen above. But while this might explain certain aspects of probability, it does not explain why we ought to take branch weights to be probabilities. The arguments for (ii) and (iii) aim to answer this question.

Wallace and others have argued that we can make sense of probability in a theory in which all possible outcomes occur (what has been called the “incoherence problem” in Greaves 2004, Wallace 2005, Wallace 2006, Baker 2007, Lewis 2007, and Saunders and Wallace 2008a; and what has been alternatively termed “Subjective Certainty” by Greaves 2004, Baker 2007, Greaves 2007, Lewis 2007, and Greaves and Myrvold 2010). Wallace’s justification for the claim that we can make sense of probability in a determinate universe is the emergent structure that we have discussed above. It is only at the fundamental level that EQM is deterministic; at the emergent level it is not (Wallace 2012: 115). Once this has been established, two other problems remain to be solved, what Wallace calls the “practical problem” and the “epistemic problem” (Wallace 2012: 158). The first of these is how to justify allowing branch weights to play the role in decision making that ordinary probability plays in a classical context ((ii), above). The second asks how we justify taking branch weights to play the role of probability in showing that the results of experiments support quantum mechanics. Albert (2010) explains this general concern clearly when he writes, “Why (for example) should it come as a surprise, on a picture like [EQM], to see what we would ordinarily consider a low-probability string of experimental results? Why should such a result cast any doubt on the truth of this theory (as it does, in fact, cast doubt on quantum mechanics)” (Albert 2010: 356)?

The general principle of rationality on which Wallace’s argument is founded is David Lewis’ Principal Principle (Lewis 1980). In Wallace’s terminology: “For any real number x, a rational agent’s personal probability of an event E conditional on the objective probability of E being x, and on any other background information, is also x” (Wallace 2012: 140). Wallace’s goal is to show that there is an “Everett-specific derivation of the Principle” and,

to prove, rigorously and from general principles of rationality, that a rational agent, believing that (Everett-interpreted) quantum mechanics correctly gives the structure and dynamics of the world and that the quantum state of his branch is |ψ>, will act for all intents and purposes as if he ascribed probabilities in accordance with the Born Rule, as applied to |ψ> (Wallace 2012: 150, 159).

To provide this rigorous proof, Wallace builds on an argument of David Deutsch’s (1997, 1999). He uses Deutsch’s results and furthers them to argue that branch weight non-circularly serves as objective probability. As the details of both Deutsch’s and Wallace’s arguments are quite technical, I leave it to the interested reader to investigate more fully. (More on the Deutsch-Wallace derivation of probability and refinements to their work can be found in Wallace 2002, Wallace September 2003, Greaves 2004, Saunders 2005, Greaves January 2007, Greaves March 2007, Wallace 2007, Greaves and Myrvold 2010.)

The arguments for (iii), that chance involves uncertainty, are the focus of quite a number of papers. Saunders believes that to have a fully worked out view of probability, one must explain uncertainty in EQM (2010: 189–90).  A view called Subjective Uncertainty [SU] is described by Saunders (1998). There he argues that there is uncertainty even in the deterministic physics of Everettian quantum mechanics. He asks us to consider a pre-measurement observer at time t1. Call her she1. When she1 measures the x-spin of a z-spin particle, this results in a branching of her world, and she1 ends up with two successors at time t2: one who sees "up" as her measurement result, and one who sees "down." The question is, "Who should she1 expect to become?"

There seem to be three possibilities: She can expect to become one of them, both of them or neither of them (Saunders 1998: 383). Saunders claims that it is nonsense to suggest that she1 should expect to become neither of them. In the emergent branching view, branching events are ubiquitous and yet we have the experience of continuing to move through time. She would not expect to become both of them because we do not have the experience of seeing two measurement results of our experiments. So the only remaining option is that she1 should expect to become one of her successors, but she does not know which one. This is the basis for the SU attitude about uncertainty. (For more on the link between uncertainty and probability in EQM see Ismael 2003, Greaves 2004, Wallace 2006, Baker 2007, Greaves January 2007, Greaves March 2007, Lewis 2007, and Wallace 2007.)

Saunders links uncertainty to questions of personal identity in EQM (2010). If an agent, Alice, does not know what branch she is on, that can account for a type of uncertainty in EQM. This implies that some of the questions of uncertainty, and therefore probability, will depend upon answers to the question of diachronic personal identity in the context of EQM. (For more on self-locating uncertainty and diachronic identity in EQM see Vaidman 1998, Wallace 2005, Lewis 2007, Lewis 2008, Saunders and Wallace 2008a, 2008b, Tappenden 2008, Wallace 2012, Conroy 2016, and Sebens and Carroll 2016.)

c. Objections to the Oxford MWI

There are of course objections to the Oxford MWI. The first set to consider are those that focus on the use of decoherence to solve the preferred basis problem—a problem that must be solved to explain determinate measurement records and probability.

Recall that a system is said to decohere when it becomes correlated with something in its environment and thereby destroys the interference effects that would otherwise have been present if the system were in a pure state. When one traces over the environment (that is, when one essentially ignores the degrees of freedom belonging only to the environment), the norm-squared coefficients on the remaining terms approximately match the probabilities one obtains from the standard collapse formulation. But even though decoherence theorists believe they have solved the problem of accounting for what seem to be determinate measurement records (Zeh 1997), such interactions do not produce determinate results (Barrett 1999). The interaction between, say, a pointer on a measuring device and its environment does not produce a determinate position for the pointer. Decoherence destroys the interference effects, but it does not produce determinate measurement records. It is as if a collapse has occurred, but all we really end up with is a more complex entangled superposition (Albert 1992, Barrett 1999).

Related to this worry is the fact that using decoherence considerations to choose a preferred basis does not clarify exactly which basis is chosen. In every interaction the property that decoheres most quickly and completely will always be one that is close to position, but it will not always be the same property each time (Barrett 1999: 242–4). This is not going to be troubling to a MWI theorist like Wallace, however, as he does not take there to be a particular fine-grained way to carve up the universe—provided that it is done in a way that preserves the emergence of macroscopic entities from the underlying structure (Wallace 2012).

An additional problem with taking a property very close to position to be the preferred basis in which to write the state of the system is that this adds a further principle to quantum mechanics, one that says that whatever decoheres most quickly and completely is the preferred observable chosen to be determinate (Barrett 1999). This does not concern most Oxford MWI theorists, though, as they do not take themselves to be doing Everett exegesis.

A more pressing problem that faces the MWI theorist who relies on decoherence is the question of circularity. David Baker (2007) and Ruth Kastner (2014) have both argued that the use of decoherence begs the question regarding the derivation of probability because the use of decoherence presupposes a concept of objective probability. As Baker puts it, “proofs of decoherence depend implicitly upon the truth of the Born rule. Without first justifying the probability rule, Everettians cannot establish the existence of a preferred basis or the division of the wave function into branches. But without a preferred basis or a specification of branches, there can be no assignment of probabilities to measurement outcomes” (Baker 2007: 3). Kastner puts her concern this way: “the goal of decoherence is to obtain vanishing of the off-diagonal terms, which corresponds to the vanishing of interference and the selection of the observable R as the one with respect to which the universe purportedly ‘splits’ in an Everettian account . . . the vanishing of the off-diagonal terms is crucially dependent on an assumption that makes the derivation circular” (Kastner 2014: 57). Both Baker and Kastner are pointing to the fact that in order to ignore the off-diagonal terms, the crucial step in decoherence that provides the preferred basis, one must already have a conception of probability.

The best that Oxford MWI theorists can do, according to Baker, is to show that because the off-diagonal terms are incredibly close to zero, the observer can ignore them as part of her decision making (she ought not care much about them). If, however, that is justified by saying that it is because those terms are very unlikely to occur, then one is bringing in an illegitimate notion of probability (Baker 2007: 19ff). Kastner points to a different problem, the problem of the arbitrariness of the division between system, measuring device and environment (Kastner 2014: 57). It is crucial that one be able to ignore the effects of the environment, but if the division is arbitrary, then there is no non-circular way of isolating only the environment.

There are at least two other objections raised against the Oxford MWI that are in this vein, Barnum et al 2000 and Hemmo and Pitowsky 2007. The former points to a hidden assumption in Deutsch's proof, one that "is not just a minor addition to Deutsch's list of assumptions, but rather a major conceptual shift. The assumption is akin to applying Laplace's Principle of Insufficient Reason to a set of indistinguishable alternatives, an application that requires acknowledging a priori that amplitudes are related to probabilities" (Barnum et al 2000: 1180). Hemmo and Pitowsky criticize the use of the Everettian Principal Principle, essential to Wallace's derivation of the Born rule, on the grounds that it is incoherent to claim that observers should treat a term with a measure of zero as if it had a probability of zero. They also argue that there is no non-question-begging reason to believe that rational observers would be justified in believing the statistical predictions of quantum mechanics if they also believed the Oxford MWI (Hemmo and Pitowsky 2007: 334). Some of these objections were addressed by Wallace and others in the intervening years. But other criticisms have also been raised. Most of these criticize the use of decision theory to guide the derivations of probability in EQM.

David Albert (2010) argues that the entire program undertaken by Deutsch and Wallace (and others supporting them) misses the point. He writes, "the question out of which this entire program arises, seems like the wrong question. The questions to which this program is addressed are questions of what we would do if we believe that the fission hypothesis were correct. But the question at issue here is precisely whether to believe that the fission hypothesis is correct" (359)! The "fission hypothesis" is the hypothesis that the Schrödinger equation is the complete story of the evolution of the world and that each branch that results from a branching event has an observer with an actual experience. He goes on to say, "The decision-theoretic program seems to act as if what primarily and in the first instance stands in need of being explained about the world is why we bet the way we do" (359). And this is not what Albert believes needs to be explained, but rather, "What we need is an account of our actual empirical experiences of frequencies" (360).

Peter Lewis (2010) has argued that the decision-theoretic considerations that guide both Deutsch’s and Wallace’s arguments are not sufficient to show that the only rational guide to an observer’s decision-making procedures is the Born rule. Lewis argues that there is a gap in Deutsch’s proof by showing that there are other rules that an observer can follow and still be consistent with the rationality constraints assumed by Deutsch. Lewis acknowledges that Wallace has filled the gaps in Deutsch’s argument by adding a new axiom of rationality, but that unlike Deutsch’s axioms, Wallace’s addition is “not an innocuous and general axiom of rationality . . . [rather] it is a substantive claim about decision-making in the specific context of Everettian quantum mechanics, and so requires a substantive justification” (Lewis 2010: 21). Lewis does not believe that the justification that Wallace provides is sufficient. Alastair Wilson (2013, 2015) has worked to counter this criticism of Lewis’ by proposing new principles that tie the physics of EQM to modal metaphysics, thereby helping to provide the justification for some of Wallace’s most contentious claims.

Adrian Kent (2010) argues that Wallace’s attempt to axiomatize rational decision making in EQM, using decision theory as its model, is incoherent, and that in fact “Wallace’s axioms are not constitutive of rationality either in Everettian quantum theory or in theories in which branchings and branch weights are precisely defined” (307). Huw Price (2010) argues that because in the Oxford MWI view there are multiple future successors to an observer that she ought to care about, it is irrelevant to the observer (as far as rationality requirements go) which future branch she will subjectively occupy (378–79). Following this and further detailed argument, Price concludes that “there seems little prospect that a Deutschian decision rule can be a constraint of rationality, in a manner analogous to the classical case” (389).

The Oxford MWI has had a great deal of influence on many philosophers, and so work on the question of probability has not stopped. For further work on probability in EQM, see Dizadji-Bahmani 2013, Dawid and Thébault 2014, Wilson 2015, Jansson 2016, and Sebens and Carroll 2016.

5. Relational Interpretations of Everettian Quantum Mechanics

Given the centrality of Everett’s notion of relative states, it is worth considering those interpretations that place that notion at their core. In section 3a we considered the bare theory interpretation of Everett. The two interpretations considered in this section are Simon Saunders’ relational interpretation and the relative facts interpretation.

a. Simon Saunders’ Relational Interpretation

Simon Saunders developed an interpretation of Everett in which values for systems can only be defined relative to a point of view, or a context of interest or relevance.  He proposes that just as we understand facts about tense to be relations, we should also understand facts about the properties of systems to be relations (Saunders 1993, 1995, 1996a, 1996b, 1997, 1998).

Let us assume that what appears to be the case is in fact the case, and that we are tracking the truth about the world when we make statements about how the world appears to us.  Then, there must be something about the world that makes true the proposition

 (9)  My coffee cup is on my desk.

This is what is known as the truthmaker principle.  Saunders has proposed that we understand the truthmaker for (9) analogously to the way B-theorists about time understand the truthmaker for a proposition like

(10)  My coffee cup was at home.

A B-theorist takes the truthmaker for (10) to be a fundamentally relational fact. For the B-theorist a statement like (10) can be reduced to a statement like

(11) The event of my coffee cup’s being at home is earlier than now.

For a B-theorist, relations such as the one in (11) are permanent dyadic relations that order positions in time. Other B-series relations include simultaneous with, earlier than, and later than. (For more on the B-theory of time, see McTaggart 1908, Maudlin 2002 and Markosian 2008 and the article on time in this encyclopedia.)

Saunders draws an analogy between what he considers the nature of the truthmaker for a proposition about the property of an object and what a B-theorist considers the nature of the truthmaker for a proposition about the property of an event like

(12) Event e is past.

In both cases the truthmaker is some fundamentally relative fact (Saunders 1995).

So now suppose that there are two non-concurrent events, E and E’. Then the following two statements, while both true, appear to be contradictory:

(13) Event E is now.

(14) Event E’ is now.

However, if we introduce two other events, W and W’, that are not identical and not concurrent, then we can resolve the apparent contradiction by instead saying:

(15) Event E is now relative to event W.

(16) Event E’ is now relative to event W’.

And these two statements are not contradictory.

Saunders suggests that we extend this analogy to the consideration of truthmakers for propositions like (9).  If we do, then a seeming contradiction in pure wave mechanics has a solution that is analogous to the solution to the apparent contradiction in tense metaphysics (Saunders 1995).

In a relational interpretation of EQM, almost every physical system_G will typically have almost every possible relative value for a property, just as every event has all of the qualities of past, present and future.  So, it is true to say:

 (17) X has value x.

and

(18) X has value x’.

even if x ≠ x’ and the two are mutually exclusive.  But if so, then (17) and (18) appear to be contradictory.

However, if we now introduce two parameters, Y and Y’, that can take values y and y’ respectively and are such that Y ≠ Y’, we can restate (17) and (18) as:

(19) X has value x relative to Y having value y.

(20) X has value x’ relative to Y’ having value y’.

It is clear that (19) and (20) are not contradictory.
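
To make the relativizing move vivid, the following minimal sketch records values as assigned to pairs of a system and a relativizing parameter, rather than to the system alone. The names X, Y, Y’ and the dictionary layout are purely illustrative inventions, not anything drawn from Saunders or the relative facts literature.

```python
# Schematic illustration only: relativized value assignments.
# Keys pair a system with a relativizing parameter and its value; an absolute
# assignment such as {"X": "x"} could record only one of the two claims.
relative_values = {
    ("X", ("Y", "y")): "x",     # (19): X has value x relative to Y having value y
    ("X", ("Y'", "y'")): "x'",  # (20): X has value x' relative to Y' having value y'
}

print(relative_values[("X", ("Y", "y"))])    # -> x
print(relative_values[("X", ("Y'", "y'"))])  # -> x'
```

Both entries coexist without conflict because each value is indexed by a different relativizer, which is just the point of moving from (17) and (18) to (19) and (20).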

In this view, an event’s having a seemingly particular (tensed) time is analogous to an object having a seemingly particular (determinate) value for a property.  An event is past (or future) relative to another time.  Likewise, an object_L has a determinate value relative to some parameter.

By relativizing the property of an object, what we are doing is explicitly changing the focus of our discussion from that of the properties of an object_G to that of the properties of an object_L.  So in (17) and (18) it is to X_G that we are referring, but in (19) and (20) it is to X_L that we are referring.  In the tense case, we change our focus from one concept of “now” in the pair (13) and (14) to a different, relative concept of “now” in the pair (15) and (16).

The question then becomes, what are the relativizing parameters Y and Y’?  For Saunders, the parameters are worlds, or branches, at a particular time. In light of this, consider again (9).  In analogy with the B-theory, one recourse for explaining its truth (when it is in fact true) is to say that it is true because there is a determinate relative fact that consists in the coffee cup being on my desk.  Each possible fact about the value for the coffee cup_G’s position occurs in a different world.  So Saunders makes sense of the truth of a proposition like (9) by relativizing the cup_L’s position value to the world in which it finds itself.  Relative to being in this world, the coffee cup_L is on my desk; relative to being in a different world, the coffee cup_L is in the Mariana Trench; relative to being in yet another world, the coffee cup_L is in my mother’s kitchen.  Thus, the fact that makes (9) true is a relation between the position value for the coffee cup_L and the world in which the coffee cup_L finds itself. (For more discussion and criticism of Saunders’ relational interpretation see Barrett 1999, Conroy 2010, Laudisa and Rovelli 2008.)

b. The Relative Facts Interpretation

Another attempt to read a metaphysical picture from EQM is the relative facts interpretation [RFI]. It takes seriously Everett’s claim of the centrality of the notion of relative states and adds no additional principles to his pure wave mechanics. It bears a great deal of similarity to Saunders’ relational interpretation, but whereas Saunders’ view is a many-worlds view, the RFI takes there to be just one world in which all objects generally have relational properties (Conroy 2010, 2012, 2016).

Consider the problem described in the last section of resolving the contradiction between (17) and (18). The RFI can do so by defining the parameters Y and Y’ as the (relative) state of the complement of the system that we are considering.  So we can make sense of the truth of a proposition like (9) by relativizing a coffee cup_L’s position to the state of the complement of the system of which it is a part.  My coffee cup_L is on my desk relative to my having put it there, to my desk being in my office, to my not having knocked the cup off, and so forth.  The fact that my desk is in my office is relative to some other collection of relative facts about the complement of its system (the movers that put it there being a part of that), and this goes on ad infinitum.  Likewise, my coffee cup_L is in the Mariana Trench relative to my having decided to take a cruise in the Pacific, and my having dropped my coffee cup over the side of the ship at the right time, and so on. The truthmaker for a proposition like (9), in the RFI, is a relative fact.
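
The formal notion underlying both views is Everett’s relative state: for a composite state of two subsystems, the state of one subsystem relative to a given state of its complement is (up to normalization) the partial inner product of that state with the composite state. The sketch below computes this for a simple two-qubit entangled state chosen only for illustration; the state, the variable names, and the helper function relative_state are assumptions for the example, not a reconstruction of either interpretation’s full machinery.

```python
import numpy as np

# |psi> = (|0>|0> + |1>|1>) / sqrt(2), stored so that psi[i, j] = <i j|psi>.
psi = np.array([[1.0, 0.0],
                [0.0, 1.0]]) / np.sqrt(2)

def relative_state(psi, phi):
    """State of subsystem 2 relative to subsystem 1 being in |phi>,
    obtained by contracting over the first index and normalizing."""
    rel = phi.conj() @ psi
    return rel / np.linalg.norm(rel)

zero = np.array([1.0, 0.0])
one = np.array([0.0, 1.0])

print(relative_state(psi, zero))  # [1. 0.] -> relative to |0> on S1, S2 is in |0>
print(relative_state(psi, one))   # [0. 1.] -> relative to |1> on S1, S2 is in |1>
```

Relative to the complement being in one state, the subsystem has one determinate value; relative to another, it has a different one, which is the structure the RFI exploits.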

That there can be a metaphysics of relative facts has been argued in the context of quantum mechanics in a general sense. Because entanglement is an inescapable part of the quantum mechanical world, several philosophers have argued that non-separable, entangled quantum mechanical states imply that there are relations that fail to be supervenient upon or be grounded by non-relational properties of their relata, and that this leads to quantum holism and a metaphysics consisting of non-reducible relations. Thus it seems reasonable to take a relational metaphysics and apply it to the interpretation of Everett, given the importance he places on the notion of relative states. (For more on the development of a relational metaphysics in light of considerations of quantum entanglement see Cleland 1984, Teller 1986, Teller 1989, French 1989, Healey 1991, Esfeld 2003, Esfeld 2004, Schaffer 2010, Calosi 2014, McKenzie 2014, and Esfeld 2016.)

Both the RFI and Saunders’ relational interpretation can explain determinate measurement results by saying that an observer has a determinate relative result. While Saunders has worked on solving the problem of probability (see section 4b above), the RFI still lacks a development of a clear sense of how probability is meant to function.

6. Conclusion

Although there is no consensus about the correct way to interpret the outcome of quantum mechanical experiments, one very influential way of doing so is due to Hugh Everett III, who suggested that we drop the collapse postulate and take the universe to be such that its quantum state is an incredibly complex superposition that never collapses. The work that Everett did in the late 1950s has led many philosophers and philosophers of physics to attempt to build a metaphysical picture of the world based on the physics that he proposed. This article has surveyed the major developments in the work that was inspired by Everett’s 1957 dissertation. Just as there is no consensus on whether or not Everett has the best physics for the description of the world, there is no consensus on the best way to read a metaphysical picture off of the world Everett described. It is left to the reader to adjudicate these debates.

7. References and Further Reading

  • Albert, David Z. Quantum Mechanics and Experience.  Cambridge: Harvard University Press, 1992.
  • Albert, David Z. “Probability in the Everett Picture.” In Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.).  New York: Oxford University Press. 2010: 355–368.
  • Albert, David Z. and Barry Loewer.  “Interpreting the Many Worlds Interpretation.”   Synthese 77 (1988): 195–213.
  • Bacciagaluppi, Guido and Jenann Ismael. “Book Review: The Emergent Multiverse: Quantum Theory According to the Everett Interpretation.” Philosophy of Science 82, no. 1 (January 2015): 129–148.
  • Baker, David J. “Measurement Outcomes and Probability in Everettian Quantum Mechanics.” Studies in History and Philosophy of Modern Physics 38 (2007): 153–69.
  • Barnum, Howard, Carlton M. Caves, Jerome Finkelstein and Ruediger Schack. “Quantum Probability from Decision Theory.” Proceedings of the Royal Society of London A, 456 (2000): 1175–1182.
  • Barrett, Jeff.   The Quantum Mechanics of Minds and Worlds.  New York: Oxford University Press, 1999.
  • Barrett, Jeff.  “Ithaca Interpretation of Quantum Mechanics.” In Compendium of Quantum Physics. Greenberger, Daniel, Klaus Hentschel and Friedel Weinert (eds.). Berlin: Springer, 2009: 325–6.
  • Barrett, Jeff. “A Structural Interpretation of Pure Wave Mechanics.” Humana Mente 13  (April 2010): 225–235.
  • Barrett, Jeffrey and Peter Byrne. The Everett Interpretation of Quantum Mechanics: Collected Works 1955-1980 with Commentary. Princeton, NJ: Princeton University Press, 2012.
  • Bevers, Brett. “Everett’s ‘Many Worlds’ Proposal.” Studies in History and Philosophy of Modern Physics, 42 (1) (February 2011): 3–12.
  • Birkhoff, Garrett and John von Neumann.  “The Logic of Quantum Mechanics.” Annals of Mathematics 37 (1936): 823–843.
  • Birman, Fernando.  “Quantum Mechanics, Correlations, and Relational Probability.”  CRÍTICA, Revista Hispanoamericana de Filosofía 41, no. 121 (April 2009): 3–22.
  • Bohm, David. “A suggested interpretation of the quantum theory in terms of ‘hidden’ variables, I and II.” Physical Review, 85 (1952): 166–193.
  • Bohr, Niels.  “Quantum Mechanics and Physical Reality.” Nature 136 (1935): 1025–1026. Reprinted in Quantum Theory and Measurement.  Wheeler, John. A., and Wojciech. H. Zurek, (eds.).  Princeton: Princeton University Press, 1983: 144.
  • Bohr, Niels.  “Causality and Complementarity.” Philosophy of Science 4, no. 3 (July 1937): 289–98.
  • Bohr, Niels.  “Quantum Physics and Philosophy: Causality and Complementarity” (1958a).  In Essays 1958-1962 on Atomic Physics and Human Knowledge.  Vol. 3 of The Philosophical Writings of Niels Bohr.  Woodbridge, Conn.: Ox Bow Press, 1987: 1–7.
  • Bohr, Niels.  Atomic Physics and Human Knowledge.  New York: Wiley, 1958b.
  • Born, Max. “Zur Quantenmechanik der Stoßvorgänge.”   Zeitschrift für Physik 37, No. 12 (December 1926): 863–67. English translation, “On the Quantum Mechanics of Collisions.” In Quantum Theory and Measurement.  Wheeler, John. A., and Wojciech. H. Zurek, (eds.).  Princeton: Princeton University Press, 1983: 52–55.
  • Brown, Matthew. J.  “Relational Quantum Mechanics and the Determinacy Problem.” British Journal for the Philosophy of Science 60 (2009): 679–95.
  • Butterfield, Jeremy. “Some Worlds of Quantum Theory.” In Scientific Perspectives on Divine Action. Robert John Russell, Philip Clayton, Kirk Wegter-McNelly, and John Polkinghorne (eds.). Vatican City: Vatican Observatory Publications, 2001.
  • Byrne, Peter. The Many Worlds of Hugh Everett III.  New York: Oxford University Press, 2010.
  • Calosi, Claudio. “Quantum Mechanics and Priority Monism.” Synthese 191, no. 5 (2014): 915–28.
  • Cleland, Carol. E.  “Space:  An Abstract System of Non-Supervenient Relations.” Philosophical Studies 46, no. 1 (July 1984): 19–40.
  • Clifton, Rob.  Ed.  Perspectives on Quantum Reality.  Boston: Kluwer Academic Press, 1996.
  • Conroy, Christina. “A relative facts interpretation of Everettian quantum mechanics.” PhD Thesis. Irvine: University of California, 2010.
  • Conroy, Christina. “The Relative Facts Interpretation and Everett’s note added in proof.” Studies in History and Philosophy of Modern Physics 43 (2012): 112–120.
  • Conroy, Christina.  “Branch-Relative Identity.” In Individuals Across the Sciences. Guay, Alexandre and Thomas Pradeu, (eds.). New York: Oxford University Press, 2016: 250–271.
  • Dawid, Richard and Karim Thébault. “Against the Empirical Viability of the Deutsch-Wallace-Everett Approach to Quantum Mechanics.” Studies in the History and Philosophy of Modern Physics, 47 (2014): 55–61.
  • Dizadji-Bahmani, Foad. “The probability problem in Everettian quantum mechanics persists.” British Journal for the Philosophy of Science (2013): 1–27.
  • de Broglie, Louis. Solvay Congress (1927). Electrons et Photons: Rapports et Discussions du Cinquième Conseil de Physique tenu à Bruxelles du 24 au 29 Octobre 1927 sous les Auspices de l’Institut International de Physique Solvay, Paris: Gauthier-Villars, 1928.
  • Deutsch, David. The Fabric of Reality. New York: Penguin Books, 1997.
  • Deutsch, David. “Quantum Theory of Probability and Decisions.” Proceedings of the Royal Society of London A458 (1999): 2911–23.
  • Deutsch, David. “Apart from Universes.” 2010. In Many Worlds? Everett, Quantum Theory, & Reality. Saunders, Simon, Jonathan Barrett, Adrian Kent and David Wallace, (eds.). New York: Oxford University Press: 542-552.
  • DeWitt, Bryce. S. Letter to John Wheeler. 1957. Reprinted in The Everett Interpretation of Quantum Mechanics: Collected Works 1955-1980 with Commentary. Barrett, Jeffrey A. and Peter Byrne (eds.). Princeton: Princeton University Press, 2012: 242–251.
  • DeWitt, Bryce. “Everett-Wheeler Interpretation of Quantum Mechanics.”  1968. In Battelle Rencontres. DeWitt, Cecile M. and John. A. Wheeler, (eds.).  New York: Benjamin, 1 January 1968.
  • DeWitt, Bryce.  1970. “Quantum Mechanics and Reality.”  Physics Today 23, no. 9.  Reprinted in The Many-Worlds Interpretation of Quantum Mechanics.  DeWitt, Bryce. S., and Neill Graham, (eds.).  Princeton: Princeton University Press, 1973.
  • DeWitt, Bryce.  “The Many-Universes Interpretation of Quantum Mechanics.”   Foundations of Quantum Mechanics.  New York: Academic Press Inc., 1971.  Reprinted in The Many-Worlds Interpretation of Quantum Mechanics.  DeWitt, Bryce. S., and Neill Graham, (eds.).  Princeton: Princeton University Press, 1973.
  • DeWitt, Bryce. S., and Neill Graham, (eds.).  The Many-Worlds Interpretation of Quantum Mechanics. Princeton: Princeton University Press, 1973.
  • DeWitt, Cecile M. and John. A. Wheeler, eds.  Battelle Rencontres. New York: Benjamin, 1 January 1968.
  • Esfeld, Michael.  “Do Relations Require Underlying Intrinsic Properties?  A Physical Argument for a Metaphysics of Relations.” Metaphysica: International Journal for Ontology and Metaphysics 4, no. 1 (2003): 5–25.
  • Esfeld, Michael.  “Quantum Entanglement and a Metaphysics of Relations.” Studies in History and Philosophy of Modern Physics 35 (2004): 601–617.
  • Esfeld, Michael. “The Reality of Relations: The Case from Quantum Physics.” In The Metaphysics of Relations. Marmadoro, Anna and David Yates, (eds.). New York: Oxford University Press, 2016: 218–34.
  • Everett, Hugh III. “On the Foundations of Quantum Mechanics.”  PhD Thesis.  Princeton University. 1957a.  Reprinted in The Many-Worlds Interpretation of Quantum Mechanics.  DeWitt, Bryce. S., and Neill Graham, (eds.).  Princeton: Princeton University Press, 1973: 3–140.
  • Everett, Hugh III.  “’Relative State’ Formulation of Quantum Mechanics.” Reviews of Modern Physics 29 (1957b): 454–462. Reprinted in The Many-Worlds Interpretation of Quantum Mechanics.  DeWitt, Bryce. S., and Neill Graham, (eds.).  Princeton: Princeton University Press, 1973: 141–49.
  • Everett, Hugh III. Letter to Bryce DeWitt.  May 31, 1957c. Reprinted in The Everett Interpretation of Quantum Mechanics: Collected Works 1955-1980 with Commentary. Barrett, Jeffrey A. and Peter Byrne, (eds.). Princeton: Princeton University Press, 2012: 252–256.
  • Everett, Hugh III. “The Theory of the Universal Wave Function.” 1973. In The Many-Worlds Interpretation of Quantum Mechanics.  DeWitt, Bryce. S., and Neill Graham, (eds.).  Princeton: Princeton University Press, 1973: 3–140.
  • Everett, Hugh III. “Quantitative Measure of Correlation.” Printed in The Everett Interpretation of Quantum Mechanics: Collected Works 1955-1980 with Commentary. Barrett, Jeffrey A. and Peter Byrne, (eds.). Princeton: Princeton University Press, 2012: 61–63.
  • French, Steven.  “Individuality, supervenience and Bell’s Theorem.” Philosophical Studies 55 (1989): 1–22.
  • Ghirardi, Giancarlo, Alberto Rimini and Tullio Weber. “Unified dynamics for microscopic and macroscopic systems.” Physical Review D 34 (1986): 470.
  • Graham, Neill.  “The Measurement of Relative Frequency.” 1973. In The Many-Worlds Interpretation of Quantum Mechanics.  DeWitt, Bryce. S., and Neill Graham, (eds.).  Princeton: Princeton University Press, 1973: 229–253.
  • Greaves, Hilary. “Understanding Deutsch’s Probability in a Deterministic Multiverse.” Studies in History and Philosophy of Modern Physics 35 (2004): 423–56.
  • Greaves, Hilary. “Probability in the Everett Interpretation.” Philosophy Compass 2, 1 (January 2007): 109–128.
  • Greaves, Hilary. “On the Everettian Epistemic Problem.” Studies in History and Philosophy of Modern Physics 38, 1 (March 2007): 120–152.
  • Greaves, Hilary and Wayne Myrvold. “Everett and Evidence.” In Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.). New York: Oxford University Press. 2010: 264–304.
  • Healey, Richard A.  “How Many Worlds?” Noûs 18 (1984): 591–616.
  • Healey, Richard A.  “Holism and Nonseparability.” Journal of Philosophy 88 (1991): 393–421.
  • Hemmo, Meir and Itamar Pitowsky. “Quantum Probability and Many Worlds.” Studies in the History and Philosophy of Modern Physics 38 (2007): 333–350.
  • Hooker, Clifford A. The Logico-Algebraic Approach to Quantum Mechanics, vol. 1.  Dordrecht: D. Reidel, 1975.
  • Ismael, Jenann. “How to Combine Chance and Determinism: Thinking About the Future in an Everett Universe.” Philosophy of Science 70 (2003): 776–90.
  • Jammer, Max.  The Philosophy of Quantum Mechanics.  New York: McGraw Hill, 1974.
  • Jansson, Lina. “Everettian Quantum Mechanics and Physical Probability: Against the Principle of ‘State Supervenience’.” Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics 53 (February 2016): 45–54.
  • Joos, Erich, H. Dieter Zeh, Claus Kiefer, Domenico Giulini, Joachim Kupsch, and Ion-Olimpiu Stamatescu, (eds.). Decoherence and the Appearance of a Classical World in Quantum Theory. Berlin: Springer; second revised edition, 2003.
  • Kastner, Ruth E. “‘Einselection’ of Pointer Observables: The New H-Theorem?” Studies in History and Philosophy of Modern Physics 48 (2014): 56–58.
  • Kent, Adrian. “One World Versus Many: The Inadequacy of Everettian Accounts of Evolution, Probability and Scientific Confirmation” in Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.).  New York: Oxford University Press. 2010: 307–354.
  • Laudisa, Federico and Carlo Rovelli.  “Relational Quantum Mechanics.” Stanford Encyclopedia of Philosophy, (2008). http://plato.stanford.edu/entries/qm-relational/.
  • Lewis, David. “A Subjectivist’s Guide to Objective Chance” in Studies in Inductive Logic and Probability, Volume II. Richard C. Jeffrey, (ed.). Berkeley: University of California Press. 1980: 263–293.
  • Lewis, Peter. “Interpretations of Quantum Mechanics.” Internet Encyclopedia of Philosophy. https://www.iep.utm.edu/int-qm/.
  • Lewis, Peter. “Uncertainty and Probability for Branching Selves.” Studies in History and Philosophy of Modern Physics 38 (2007): 1–14.
  • Lewis, Peter. “Probability, Self-Location and Quantum Branching.” Philosophy of Science 76 (5), (December 2009): 1009–1019.
  • Lewis, Peter.  “Probability in Everettian Quantum Mechanics.” Manuscrito, 33 (2010): 285–306.
  • Lewis, Peter. Quantum Ontology. New York: Oxford University Press, 2016.
  • Mackey, George W.  “Quantum Mechanics and Hilbert Space.” American Mathematics Monthly 64 (1957): 45–57.
  • Mackey, George W.  Foundations of Quantum Mechanics.  New York: W. A. Benjamin, 1963.
  • Markosian, Ned.  “Time.” Stanford Encyclopedia of Philosophy (2008). http://plato.stanford.edu/entries/time.
  • Maudlin, Tim.  Quantum Non-Locality and Relativity: Metaphysical Intimations of Modern Physics. Oxford: Blackwell, 2002.
  • Maudlin, Tim. “Can the World Be Only Wavefunction?” in Many Worlds? Everett, Quantum Theory, & Reality. Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.).  New York: Oxford University Press. 2010: 121–143.
  • McKenzie, Kerry. “Priority and Particle Physics: Ontic Structural Realism as a Fundamentality Thesis.” British Journal for the Philosophy of Science 65 no. 2, (2014): 353–80.
  • McTaggart, J. M. Ellis. “The Unreality of Time.” Mind 17 (October 1908): 457–474.
  • Mermin, David. “What is Quantum Mechanics Trying to Tell Us?” American Journal of Physics 66 (1998): 753–767.
  • Ney, Alyssa and David Albert. The Wave Function. New York: Oxford University Press, 2013.
  • Price, Huw. “Decisions, Decisions, Decisions: Can Savage Salvage Everettian Probability?” in Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.).  New York: Oxford University Press. 2010: 369–390.
  • Putnam, Hilary. “Is Logic Empirical?” Boston Studies in the Philosophy of Science, vol. 5. Coehn, Robert S. and Marx W. Wartofsky, (eds.) Dordrecht: D. Reidel, 1968: 216–241.
  • Reichenbach, Hans. Philosophical Foundations of Quantum Mechanics. Berkeley: University of California Press, 1944.
  • Rovelli, Carlo. “Relational quantum mechanics.” International Journal of Theoretical Physics 35, no. 8 (1995): 1637–1678.
  • Saunders, Simon.  “Decoherence, Relative States, and Evolutionary Adaptation.” Foundations of Physics 23, no. 12 (1993): 1553–1585.
  • Saunders, Simon.  “Time, Quantum Mechanics, and Decoherence.” Synthese 102, no. 2 (1995): 235–266.
  • Saunders, Simon.  “Relativism.”  (1996a). In Perspectives on Quantum Reality. Clifton, Rob, (ed.).  Boston: Kluwer Academic Press, 1996.
  • Saunders, Simon.  “Time, Quantum Mechanics and Tense.” Synthese 107 (1996b): 19–53.
  • Saunders, Simon. “Naturalizing Metaphysics.”  The Monist 80. no. 1 (1997): 44–69.
  • Saunders, Simon.  “Time, Quantum Mechanics and Probability.” Synthese 114 (1998): 373–404.
  • Saunders, Simon. “What is Probability?” In Quo Vadis Quantum Mechanics. Avshalom C. Elitzur, Shahar Dolev, and Nancy Kolenda, (eds.). Berlin: Springer-Verlag, 2005.
  • Saunders, Simon. “Chance in the Everett Interpretation” in Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.). New York: Oxford University Press. 2010: 181–205.
  • Saunders, Simon, Jonathan Barrett, Adrian Kent and David Wallace (eds.).  Many Worlds? Everett, Quantum Theory, & Reality. New York: Oxford University Press. 2010.
  • Saunders, Simon and David Wallace. “Branching and Uncertainty.” (2008a). British Journal for the Philosophy of Science 59, (3): 293–305.
  • Saunders, Simon and David Wallace. “Reply.” (2008b). British Journal for the Philosophy of Science 59, (3): 315–37.
  • Schaffer, Jonathan. “Monism: The Priority of the Whole.” Philosophical Review 119 no.1 (2010): 31–76. Reprinted in Spinoza on Monism. Goff, Philip, (ed.). New York: Palgrave, 2012: 149–66.
  • Schaffer, Jonathan and Jenann Ismael. “Quantum Holism: Nonseparability as Common Ground.” Synthese (2016): online published first.
  • Schlosshauer, Maximilian. Decoherence and the Quantum-to-Classical Transition. Heidelberg: Springer. 2007.
  • Schrödinger, Erwin. “Die gegenwärtige Situation in der Quantenmechanik.” Naturwissenschaften, 23 (1935): 807–812, 823–828, 844–849; English translation by Trimmer, J. D. “The Present Situation in Quantum Mechanics: A Translation of Schrödinger’s ‘Cat Paradox’ Paper.” Proceedings of the American Philosophical Society, 124: 323–338, reprinted in Wheeler and Zurek (1983).
  • Sebens, Charles and Sean M. Carroll. “Self-Locating Uncertainty and the Origin of Probability in Everettian Quantum Mechanics.” British Journal for the Philosophy of Science. First published online July 5, 2016.
  • Tappenden, Paul. “Saunders and Wallace on Everett and Lewis” (2008). British Journal for the Philosophy of Science, 59: 307–314.
  • Tappenden, Paul. “Evidence and Uncertainty in Everett’s Multiverse,” British Journal for the Philosophy of Science, 62 (2011): 99–123.
  • Teller, Paul.  “Relational Holism and Quantum Mechanics.” The British Journal for the Philosophy of Science 37, no. 1 (March 1986): 71–81.
  • Teller, Paul. “Relativity, Relational Holism and the Bell Inequalities.” In Philosophical Consequences of Quantum Theory: Reflections on Bell’s Theorem. Cushing, James and Ernan McMullin, (eds.).  South Bend, IN: University of Notre Dame Press. (1989): 208–23.
  • Vaidman, Lev. “On the Schizophrenic Experiences of the Neutron or Why We Should Believe in the Many-Worlds Interpretation of Quantum Theory.” International Studies in the Philosophy of Science 12, 3 (1998): 245–261.
  • Vaidman, Lev. “Probability in the Many-Worlds Interpretation of Quantum Mechanics.” In Probability in Physics, The Frontiers Collection XII. Yemima Ben-Menahem and Meir Hemmo, (eds.). Berlin: Springer. (2012): 299–311.
  • Vaidman, Lev. “Review: David Wallace The Emergent Multiverse: Quantum Theory According to the Everett Interpretation.” The British Journal for the Philosophy of Science, 0 (2014): 1–4.
  • van Fraassen, Bas C. “Rovelli’s World.” Foundations of Physics 40, no. 4 (2009): 390–417.
  • Wallace, David. “Quantum Probability and Decision Theory Revisited.” (2002). http://arxiv.org/abs/quant-ph/0211104
  • Wallace, David. “Everettian Rationality: Defending Deutsch’s Approach to Probability in the Everett Interpretation.” Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics 34, 3 (September 2003): 415–439.
  • Wallace, David. “Everett and Structure.” (2003). Studies in History and Philosophy of Science, Part B 34, (1): 87–105.
  • Wallace, David. “Language Use in a Branching Universe.”(2005). Preprint: http://philsci-archive.pitt.edu/archive/00002554/
  • Wallace, David. “Epistemology Quantized: Circumstances in Which We Should come to Believe in the Everett Interpretation.” (2006). British Journal for the Philosophy of Science 57, (4): 655–689.
  • Wallace, David. “Quantum Probability from Subjective Likelihood: Improving on Deutsch’s Proof of the Probability Rule.” (2007) Studies in History and Philosophy of Modern Physics 38, 311–32.
  • Wallace, David. “Decoherence and Ontology” in Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.). New York: Oxford University Press. 2010a: 53–72.
  • Wallace, David. “How to Prove the Born Rule” in Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.). New York: Oxford University Press. 2010b: 227–263.
  • Wallace, David. The Emergent Multiverse: Quantum Theory According to the Everett Interpretation. New York: Oxford University Press, 2012.
  • Wheeler, John A.  Geons, Black Holes and Quantum Foam.  New York: W. W. Norton, 1998.
  • Wheeler, John. A., and Wojciech. H. Zurek, (eds.). Quantum Theory and Measurement. Princeton: Princeton University Press, 1983.
  • Wilson, Alastair. “Objective Probability in Everettian Quantum Mechanics.” British Journal for the Philosophy of Science 64 (4) (2013): 709–737.
  • Wilson, Alastair. “The Quantum Doomsday Argument.” British Journal for the Philosophy of Science 0 (2015): 1–19.
  • Xavier University Conference Transcript. October 1–5, 1962.  Cincinnati, Ohio. Printed in The Everett Interpretation of Quantum Mechanics: Collected Works 1955-1980 with Commentary. Barrett, Jeffrey A. and Peter Byrne, (eds.). Princeton: Princeton University Press, 2012: 267–279.
  • Zeh, H. Dieter. “On the Interpretation of Measurement in Quantum Theory.” (1970). Foundations of Physics 1: 69–76. Reprinted in Quantum Theory and Measurement. Wheeler, John A., and Wojciech H. Zurek, (eds.). Princeton: Princeton University Press, 1983: 342–49.
  • Zeh, H. Dieter. “Toward a Quantum Theory of Observation.” (1973). Foundations of Physics 3: 109–16.
  • Zeh, H. Dieter. “Basic Concepts and Their Interpretation.” (1995). In Decoherence and the Appearance of a Classical World in Quantum Theory. Giulini, Domenico, Erich Joos, Claus Kiefer, Joachim Kupsch, Ion-Olimpiu Stamatescu, and H. Dieter Zeh, (eds.). Berlin: Springer, 2003: Chapter 2.
  • Zurek, Wojciech H. “Pointer Basis of Quantum Apparatus: Into what Mixture does the Wave Packet Collapse?” (1981). Physical Review D (24): 1516–1525.
  • Zurek, Wojciech H. “Environment-Induced Superselection Rules.” (1982). Physical Review D (26): 1862–1880.
  • Zurek, Wojciech H. “Decoherence and the Transition from Quantum to Classical.” (1991). Physics Today 44 (October): 36–44.
  • Zurek, Wojciech H. “Decoherence and the Transition from Quantum to Classical – Revisited.” (2002). Los Alamos Science 27: 2–25.

 

Author Information

Christina Conroy
Email: c.conroy@moreheadstate.edu
Morehead State University
U. S. A.

Integrated Information Theory of Consciousness

Integrated Information Theory (IIT) offers an explanation for the nature and source of consciousness. Initially proposed by Giulio Tononi in 2004, it claims that consciousness is identical to a certain kind of information, the realization of which requires physical, not merely functional, integration, and which can be measured mathematically according to the phi metric.

The theory attempts a balance between two different sets of convictions. On the one hand, it strives to preserve the Cartesian intuitions that experience is immediate, direct, and unified. This, according to IIT’s proponents and its methodology, rules out accounts of consciousness such as functionalism, which explain experience as a system’s operating in a certain way, as well as any eliminativist theories that deny the existence of consciousness. On the other hand, IIT takes neuroscientific descriptions of the brain as a starting point for understanding what must be true of a physical system in order for it to be conscious. (Most of IIT’s developers and main proponents are neuroscientists.) IIT’s methodology involves characterizing the fundamentally subjective nature of consciousness and positing the physical attributes necessary for a system to realize it.

In short, according to IIT, consciousness requires a grouping of elements within a system that have physical cause-effect power upon one another. This in turn implies that only reentrant architecture consisting of feedback loops, whether neural or computational, will realize consciousness. Such groupings make a difference to themselves, not just to outside observers. This constitutes integrated information. Of the various groupings within a system that possess such causal power, one will do so maximally. This local maximum of integrated information is identical to consciousness.

IIT claims that these predictions square with observations of the brain’s physical realization of consciousness, and that, where the brain does not instantiate the necessary attributes, it does not generate consciousness. Bolstered by these apparent predictive successes, IIT generalizes its claims beyond human consciousness to animal and artificial consciousness. Because IIT identifies the subjective experience of consciousness with objectively measurable dynamics of a system, the degree of consciousness of a system is measurable in principle; IIT proposes the phi metric to quantify consciousness.

Table of Contents

  1. The Main Argument
    1. Cartesian Commitments
      1. Axioms
      2. Postulates
    2. The Identity of Consciousness
      1. Some Predictions
    3. Characterizing the Argument
  2. The Phi Metric
    1. The Main Idea
    2. Some Issues of Application
  3. Situating the Theory
    1. Some Prehistory
    2. IIT’s Additional Support
    3. IIT as Sui Generis
    4. Relation to Panpsychism
      1. Relation to David Chalmers
  4. Implications
    1. The Spectrum of Consciousness
    2. IIT and Physics
    3. Artificial Consciousness
      1. Constraints on Structure/Architecture
      2. Relation to “Silent Neurons”
  5. Objections
    1. The Functionalist Alternative
      1. Rejecting Cartesian Commitments
      2. Case Study: Access vs. Phenomenal Consciousness
      3. Challenging IIT’s Augmentation of Naturalistic Ontology
    2. Aaronson’s Reductio ad Absurdum
    3. Searle’s Objection
  6. References and Further Reading

1. The Main Argument

IIT takes certain features of consciousness to be unavoidably true. Rather than beginning with the neural correlates of consciousness (NCC) and attempting to explain what about these sustains consciousness, IIT begins with its characterization of experience itself, determines the physical properties necessary for realizing these characteristics, and only then puts forward a theoretical explanation of consciousness, as identical to a special case of information instantiated by those physical properties. “The theory provides a principled account of both the quantity and quality of an individual experience… and a calculus to evaluate whether a physical system is conscious” (Tononi and Koch, 2015).

a. Cartesian Commitments

IIT takes Descartes very seriously. Descartes located the bedrock of epistemology in the knowledge of our own existence given to us by our thought. “I think, therefore I am” reflects an unavoidable certainty: one cannot deny one’s own existence as a thinker even if one’s particular thoughts are in error. For IIT, the relevance of this insight lies in its application to consciousness. Whatever else one might claim about consciousness, one cannot deny its existence.

i. Axioms

IIT takes consciousness as primary. Before speculating on the origins or the necessary and sufficient conditions for consciousness, IIT gives a characterization of what consciousness means. The theory advances five axioms intended to capture just this. Each axiom articulates a dimension of experience that IIT regards as self-evident.

First, following from the fundamental Cartesian insight, is the axiom of existence. Consciousness is real and undeniable; moreover, a subject’s consciousness has this reality intrinsically; it exists from its own perspective.

Second, consciousness has composition. In other words, each experience has structure. Color and shape, for example, structure visual experience. Such structure allows for various distinctions.

Third is the axiom of information: the way an experience is distinguishes it from other possible experiences. An experience specifies; it is specific to certain things, distinct from others.

Fourth, consciousness has the characteristic of integration. The elements of an experience are interdependent. For example, the particular colors and shapes that structure a visual conscious state are experienced together. As we read these words, we experience the font-shape and letter-color inseparably. We do not have isolated experiences of each and then add them together. This integration means that consciousness is irreducible to separate elements. Consciousness is unified.

Fifth, consciousness has the property of exclusion. Every experience has borders. Precisely because consciousness specifies certain things, it excludes others. Consciousness also flows at a particular speed.

ii. Postulates

In isolation, these axioms may seem trivial or overlapping. IIT labels them axioms precisely because it takes them to be obviously true. IIT does not present them in isolation. Rather, they motivate postulates. Sometimes the IIT literature refers to phenomenological axioms and ontological postulates. Each axiom leads to a corresponding postulate identifying a physical property. Any conscious system must possess these properties.

First, the existence of consciousness implies a system of mechanisms with a particular cause-effect power. IIT regards existence as inextricable from causality: for something to exist, it must be able to make a difference to other things, and vice versa. (What would it even mean for a thing to exist in the absence of any causal power whatsoever?) Because consciousness exists from its own perspective, the implied system of mechanisms must do more than simply have causal power; it must have cause-effect power upon itself.

Second, the compositional nature of consciousness implies that its system’s mechanistic elements must have the capacity to combine, and that those combinations have cause-effect power.

Third, because consciousness is informative, it must specify, or distinguish one experience from others. IIT calls the cause-effect powers of any given mechanism within a system its cause-effect repertoire. The cause-effect repertoires of all the system’s mechanistic elements taken together, it calls its cause-effect structure. This structure, at any given point, is in a particular state. In complex structures, the number of possible states is very high. For a structure to instantiate a particular state is for it to specify that state. The specified state is the particular way that the system is making a difference to itself.

Fourth, consciousness’s integration into a unified whole implies that the system must be irreducible. In other words, its parts must be interdependent. This in turn implies that every mechanistic element must have the capacity to act as a cause on the rest of the system and to be affected by the rest of the system. If a system can be divided into two parts without affecting its cause-effect structure, it fails to satisfy the requirement of this postulate.

Fifth, the exclusivity of the borders of consciousness implies that the state of a conscious system must be definite. In physical terms, the various simultaneous subgroupings of mechanisms in a system have varying cause-effect structures. Of these, only one will have a maximally irreducible cause-effect structure; this is called the maximally irreducible conceptual structure, or MICS. The other subgroupings will have smaller cause-effect structures, at least when reduced to non-redundant elements. Precisely the MICS is the conscious state.

b. The Identity of Consciousness

IIT accepts the Cartesian conviction that consciousness has immediate, self-evident properties, and outlines the implications of these phenomenological axioms for conscious physical systems. This characterization does not exhaustively describe the theoretical ambition of IIT. The ontological postulates concerning physical systems do not merely articulate necessities, or even sufficiencies, for realizing consciousness. The claim is much stronger than this. IIT identifies consciousness with a system’s having the physical features that the postulates describe. Each conscious state is a maximally irreducible conceptual structure, which just is and can only be a system of irreducibly interdependent physical parts whose causal interaction constitutes the integration of information.

An example may help to clarify the nature of IIT’s explanation of consciousness. Our experience of a cue ball integrates its white color and spherical shape, such that these elements are inseparably fused. The fusion of these elements constitutes the structure of the experience: the experience is composed of them. The nature of the experience informs us about whiteness and spherical shape in a way that distinguishes it from other possible experiences, such as of a blue cube of chalk. This is just a description of the phenomenology of a simple experience (perhaps necessarily awkward, because it articulates the self-evident). Our brain generates the experience through neurons physically communicating with one another in systems linked by cause-effect power. IIT interprets this physical communication as the integration of information, according to the various constraints laid out in the postulates. The neurobiology and phenomenology converge.

Theories of consciousness need to account for what is sometimes termed the “binding problem.” This concerns the unity of conscious experience. Even a simple experience like viewing a cue ball unites different elements such as color, shape, and size. Any theory of consciousness will need to make sense of how this happens. IIT’s account of the integration of information may be understood as a response to this problem.

According to IIT, the physical state of any conscious system must converge with phenomenology; otherwise the kind of information generated could not realize the axiomatic properties of consciousness. We can understand this by contrasting two kinds of information. First, there is Shannon information: When a digital camera takes a picture of a cue ball, the photodiodes operate in causal isolation from one another. This process does generate information; specifically, it generates observer-relative information. That is, the camera generates the information of an image of a cue ball for anyone looking at that photograph. The information that is the image of the cue ball is therefore relative to the observer; such information is called Shannon information. Because the elements of the system are causally isolated, the system does not make a difference to itself. Accordingly, although the camera gives information to an observer, it does not generate that information for itself. By contrast, consider what IIT refers to as intrinsic information: Unlike the digital camera’s photodiodes, the brain’s neurons do communicate with one another through physical cause and effect; the brain does not simply generate observer-relative information, it integrates intrinsic information.  This information from its own perspective just is the conscious state of the brain. The physical nature of the digital camera does not conform to IIT’s postulates and therefore does not have consciousness; the physical nature of the brain, at least in certain states, does conform to IIT’s postulates, and therefore does have consciousness.
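
The contrast can be put crudely in code. In the sketch below, the wiring dictionaries and the helper function makes_difference_to_itself are hypothetical illustrations, not IIT’s formalism: a four-photodiode “camera” carries up to four bits of Shannon information for an observer, but its elements have no causal connections to one another, whereas a “brain-like” wiring does make a difference to itself.

```python
import math

# Four independent binary photodiodes: 2**4 distinguishable images,
# i.e. up to 4 bits of observer-relative (Shannon) information.
observer_bits = math.log2(2 ** 4)

# Hypothetical causal wiring among the elements themselves:
camera = {0: set(), 1: set(), 2: set(), 3: set()}   # photodiodes causally isolated
brain_like = {0: {1}, 1: {2}, 2: {3}, 3: {0}}        # elements affect one another

def makes_difference_to_itself(edges):
    """True if any element causally affects another element of the same system."""
    return any(targets for targets in edges.values())

print(observer_bits)                            # 4.0
print(makes_difference_to_itself(camera))       # False -> no intrinsic information
print(makes_difference_to_itself(brain_like))   # True
```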

To identify consciousness with such physical integration of information constitutes an ontological claim. The physical postulates do not describe one way or even the best way to realize the phenomenology of consciousness; the phenomenology of consciousness is one and the same as a system having the properties described by the postulates. It is even too weak to say that such systems give rise to or generate consciousness. Consciousness is fundamental to these systems in the same way as mass or charge is basic to certain particles.

i. Some Predictions

IIT’s conception of consciousness as mechanisms systematically integrating information through cause and effect lends itself to quantification. The more complex the MICS, the higher the level of consciousness: the corresponding metric is phi. Sometimes the IIT literature uses the term “prediction” to refer to implications of the theory whose falsifiability is a matter of controversy. This section will focus on more straightforward cases of prediction, where the evidence is consistent with IIT’s claims. These cases provide corroborative evidence that enhance the plausibility of IIT.

Deep sleep states are less experientially rich than waking ones. IIT predicts, therefore, that such sleep states will have lower phi values than waking states. For this to be true, analysis of the brain during these contrasting states would have to show a disparity in the systematic complexity of non-redundant mechanisms. On IIT, this disparity of MICS complexity directly implies a disparity in the amount of conscious integrated information, because the MICS is identical to the conscious state. The neuroscientific findings bear out this prediction.

IIT cites similar evidence from the study of patients with brain damage. For example, we already know that among vegetative patients, there are some whose brain scans indicate that they can hear and process language: When researchers prompt such patients to think about playing tennis, the appropriate areas of the brain become activated. Other vegetative patients do not respond this way. Naturally, this suggests that the former have a richer degree of consciousness than the latter. When analyzed according to IIT’s theory, the former have a higher phi metric than the latter; once again, IIT has made a prediction that receives empirical confirmation. IIT also claims that findings in the analysis of patients under anesthesia corroborate its claims.

In all these cases, one of two things happens. First, as consciousness fades, cortical activity may become less global. This reversion to local cortical activity constitutes a loss of integration: The system no longer is communicating across itself in as complex a way as it had. Second, as consciousness fades, cortical activity may remain global, but become stereotypical, consisting in numerous redundant cause-effect mechanisms, such that the informational achievement of the system is reduced: a loss of information. As information either becomes less integrated or becomes reduced, consciousness fades, which IIT takes as empirical support of its theory of consciousness as integrated information.

c. Characterizing the Argument

IIT combines Cartesian commitments with claims about engineering that it interprets, in part by citing corroborative neuroscientific evidence, as identifying the nature of consciousness. This borrows from recognizable traditions in the field of consciousness studies, but the structure of the argument is novel. While IIT’s proponents strive for clarity in the exposition of their work by breaking it down into the simpler elements of axioms, propositions, and identity claims, the nature of the relations between these parts remains largely implicit in the IIT literature. To evaluate the explanatory success or failure of IIT, it should prove helpful to attempt an explication of these logical relations. This requires characterizing the relationship of the axioms with the postulates, and of the identity claims with the axioms, postulates, and supporting neuroscientific evidence.

The axioms, of course, count as premises. These premises seem to lead to the postulates: each postulate flows from a corresponding axiom. At the same time, IIT describes these postulates as unproven assumptions, which seems at odds with their being conclusions drawn from the axioms. Consider the first axiom and its postulate, concerning existence. The axiom states that consciousness exists, and more specifically, exists intrinsically; the postulate holds that this requires a conscious system to have cause-effect power, and more specifically, to have this power over itself. The link involves, in part, the claim that existence implies cause-effect power. This claim that for a thing to exist, it must be able to make a difference, is plausible, but not self-evident. Nor does the axiomatic premise alone deductively imply this postulate. Epiphenomenalists, for example, claim that conscious mental states, although existent and caused, do not cause further events; they do not make a difference. Epiphenomenalists certainly do not go on to identify consciousness with physically causal systems, as IIT does.

Tononi (2015) adopts the position that the move from the axioms to the postulates is one of inference to the best explanation, or abduction. On this line, while the axioms do not deductively imply the postulates, the postulates have more than mere statistical inductive support. For example, consider the observation that human brains, which on IIT are conscious systems, have cause-effect power over themselves. Minimally, this offers a piece of inductive support for describing conscious systems in general as having such a power. Tononi takes a stronger line, claiming that a system’s property of having cause-effect power over itself most satisfyingly explains its intrinsic existence. So, what makes the brain a system at all, capable of having its own consciousness, is its ability to make a difference to itself. This illustrates the relation of postulates such as the first, concerning cause-effect power, to axioms such as the first axiom, concerning intrinsic existence, by appeal to something like explanatory fit, or satisfactoriness, which is to characterize that relation abductively.

In any case, IIT moves from the sub-conclusion about the postulates to a further conclusion, the identity claim: consciousness is identical to a system’s having the physical properties laid out by the postulates, which realize the phenomenology described by the axioms. Here again, the abductive interpretation remains an option. On this interpretation, the conjunction of the physical features of the postulates provides the most satisfactory explanation for the identity of consciousness.

This breakdown of the argument reveals the separability of the two parts. A less ambitious version of IIT might have limited itself to the first part, claiming that the physical features described by the postulates are the actual and/or best ways of realizing consciousness, or more strongly that they are necessary and/or sufficient, without going on to say that consciousness is identical to a system having these properties. The foregoing paragraphs outlined the possible motivation for the identification claim as lying in the abductive interpretation.

The notion of best explanation is notoriously slippery, but also ubiquitous in science. From an intuitive point of view one might regard the content of the conjunction of the postulates as apt for accounting for the phenomenology, but one might, motivated by theoretical conservatism, stop short of describing this as an identity relation. One clue as to why IIT does not take this tack may lie in IIT’s methodological goal of parsimony, something the literature mentions with some regularity. Perhaps the simplicity of identifying consciousness with a system’s having certain intuitively apt physical properties outweighs the non-conservatism of the claim that consciousness is fundamental to such systems the way mass is fundamental to a particle.

2. The Phi Metric

a. The Main Idea

IIT strives, among other things, not just to claim the existence of a scale of complexity of consciousness, but to provide a theoretical approach to the precise quantification of the richness of experience for any conscious system. This requires calculating the maximal amount of integrated information in a system. IIT refers to this as the system’s phi value, which can be expressed numerically, at least in principle.

Digital photography affords particularly apt illustrations of some of the basic principles involved in quantifying consciousness.

First, a photodiode exemplifies integrated information in the simplest way possible. A photodiode is a system of two elements, which together render it sensitive to two states only: light and dark. After initial input from the environment, the elements communicate input physically with one another, determining the output. So, the photodiode is a two-element system that integrates information. A photodiode not subsumed in another system of greater phi value is the simplest possible example of consciousness.

This consciousness, of course, is virtually negligible. The photodiode’s experience of light and dark is not rich in the way that ours is. The level of information of a state depends upon its specifying that state as distinct from others. The repertoire of the photodiodes allows only for the most limited differentiation (“this” vs. “that”), whereas the repertoire of a complex system such as the brain allows for an enormous amount of differentiation. Even our most basic experience of darkness distinguishes it not only from light, but from shapes, colors, and so forth.

Second, a digital camera’s photodiodes’ causal arrangement neatly exemplifies the distinction between integrated and non-integrated information. Putting to one side that each individual photodiode integrates information as simply as possible, those photodiodes do not take input or give output to one another, so the information does not get integrated across the system. For this reason, the camera’s image is informative to us, but not to itself.

Each isolated photodiode has integrated information in the most basic way, and would therefore have the lowest possible positive value of phi. The camera’s photodiodes taken as a system do not integrate information and have a phi value of zero.

IIT attributes consciousness to certain systems, or networks. These can be understood abstractly as models. A computer’s hardware may be modelled by logic circuits, which represent its elements and their connections as interconnected logic gates. The way a particular connection within this network mediates input and output determines what kind of logic gate it is. For example, consider a connection that takes two inputs, either of which can be True or False, and then gives one output, True or False. The AND logic gate gives an output of True only if both inputs are True. In other words, only if both the one AND the other input are True does the AND gate give a True output. Such modelling captures the dynamics of binary systems, with “True” corresponding to 1 and “False” to 0. The arrangement of a network’s various logic gates (which include not only AND, but also OR, NOT, XOR, among others) determines how any particular input to the system at time-step 1 will result in output at time-step 2, and so on for that system. The brain can be modelled this way too. Input can come from a prior brain state, or from other parts of the nervous system, through the senses. The input causes a change in brain state, depending on the organization of the particular brain, which can be articulated in abstract logic.
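
As a concrete illustration, the following sketch models a three-element binary network as a logic circuit. The particular gates and wiring are invented for the example and are not drawn from the IIT literature.

```python
def AND(a, b): return int(a and b)
def OR(a, b): return int(a or b)
def XOR(a, b): return int(a != b)

# Hypothetical wiring: A reads B and C through an OR gate, B reads A and C
# through an AND gate, and C reads A and B through a XOR gate.
def step(state):
    a, b, c = state
    return (OR(b, c), AND(a, c), XOR(a, b))

state = (1, 0, 0)   # state at time-step 1
print(step(state))  # state at time-step 2 -> (0, 0, 1)
```

Iterating step traces out the system’s dynamics, which is the kind of abstract description of input and output that IIT takes as its starting point.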

In order to measure the level of consciousness of a system, IIT must describe the amount of its integrated information. This is done by partitioning the system in various ways. If the digital camera’s photodiodes are partitioned, say, by dividing the abstract model of its elements in half, no integrated information is lost, because all the photodiodes are in isolation from each other, and so the division does not break any connections. If no logically possible partition of the system results in a loss of connection, the conclusion is that the system does not make a difference to itself. So, in this case, the system has no phi.

Systems of interest to IIT will have connections that some partitions sever and others do not. Some partitions will sever from the system elements that are comparatively low in their original degree of connectivity to the system, in other words elements whose (de)activation has few causal consequences for the (de)activation of other elements. A system in which all or most elements have this property will have low phi. The lack of strong connectivity may result from relative isolation, or locality, where an element links, directly or indirectly, to few other elements. Or it may result from stereotypicality, where an element’s causal connections overlap in a largely redundant way with the causal connections of other elements. A system whose elements are connected more globally and non-redundantly will have higher phi. A partition that separates out the elements that, through isolation or redundancy, make no difference to the rest of the system, together with those whose weaker causal connectivity lowers the overall level of integration, thereby picks out the maximally irreducible conceptual structure (MICS), which according to IIT is conscious. The degree of that consciousness, its phi, depends upon its elements’ level of causal connectivity. This is determined by how much information integration would be lost by the least costly further partition, or, in other words, how much the cause-effect structure of the system would be reduced by eliminating the least causally effective element within the MICS.
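
The logic of partitioning can be illustrated with a deliberately crude sketch. Here the “cost” of a bipartition is simply the number of connections it severs, and the cheapest bipartition is found by exhaustive search; IIT’s actual phi calculus measures lost cause-effect information rather than counting edges, and the example network is invented:

```python
import itertools

# A toy directed network, invented for illustration: elements 0, 1, 2 form a loop,
# while element 3 is connected to nothing. edges[i] lists the elements to which
# element i sends output.
edges = {0: [1], 1: [2], 2: [0], 3: []}

def severed(partition):
    """Number of connections cut when the elements are split into the two given parts.
    A crude stand-in for the integration lost under a partition."""
    part_of = {e: i for i, part in enumerate(partition) for e in part}
    return sum(1 for src, dsts in edges.items()
                 for dst in dsts if part_of[src] != part_of[dst])

elements = list(edges)
cheapest = min((severed((set(a), set(elements) - set(a))), a)
               for r in range(1, len(elements))
               for a in itertools.combinations(elements, r))
print(cheapest)   # (0, (0, 1, 2)): keeping the loop together and splitting off the
                  # isolated element 3 severs nothing, so the four-element system as a
                  # whole is reducible and its "integration" is zero
```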

It is important to note that not every system with phi has consciousness. A sub- or super-system of an MICS may have phi but will not have consciousness.

If we were to take a non-MICS subsystem of a network, which in isolation still has causal power over itself, articulable as a logic circuit, then that would have phi. Were it indeed in isolation, it would have its own MICS, and its phi would correspond to that system’s degree of consciousness. It is, however, not in isolation, but rather part of a larger system.

IIT interprets the exclusion axiom—that any conscious system is in one conscious state only, excluding all others—as implying a postulate that holds that, at the level of the physical system, there be no “double counting” of consciousness. So, although a system may have multiple subsystems with phi, only the MICS is conscious, and only the phi value of the MICS (sometimes called phi max) measures conscious degree. The other phi values measure degrees of non-conscious integrated information. So, for example, each of a person’s visual cortices does not enjoy its own consciousness, but parts of each belong to a single MICS, which is the person’s one unitary consciousness.

If we were to take a supersystem of an MICS, one that includes the MICS and also other associated elements with lower connectivity, we could again assign it a phi value, but this would not measure the local maximum of integrated information. The supersystem integrates information, but not maximally, and its phi is therefore not a measure of consciousness. This is probably best understood by example: a group of people in a discussion integrate information, but the connective degree among them is lower than the degree of connectivity within each individual. The group as such has no consciousness, but each individual person—or, more properly, the MICS of each—does. The individuals’ MICSs are local maxima of integrated information and therefore conscious.
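
Schematically, the exclusion idea can be rendered as a selection procedure: enumerate candidate subsystems, score each with something standing in for phi, and keep only the local maximum. The sketch below uses an invented four-element network and a toy phi (the cost of the cheapest bipartition), so it shows only the shape of the procedure, not IIT’s mathematics:

```python
import itertools

def toy_phi(subsystem, edges):
    """Toy stand-in for phi: the number of connections severed by the cheapest
    bipartition of the subsystem. Not IIT's real measure."""
    nodes = sorted(subsystem)
    best = None
    for r in range(1, len(nodes)):
        for part in itertools.combinations(nodes, r):
            part = set(part)
            cut = sum(1 for src in nodes for dst in edges[src]
                      if dst in subsystem and ((src in part) != (dst in part)))
            best = cut if best is None else min(best, cut)
    return best if best is not None else 0

# Invented network: elements 0, 1, 2 form a densely looped core; element 3 hangs on
# by a single connection.
edges = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [0]}
elements = list(edges)

candidates = [set(c) for r in range(2, len(elements) + 1)
                     for c in itertools.combinations(elements, r)]
mics = max(candidates, key=lambda s: toy_phi(s, edges))
print(sorted(mics), toy_phi(mics, edges))   # [0, 1, 2] 4 -- the full four-element system
                                            # scores only 1, so on the exclusion principle
                                            # only this core would count as conscious
```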

b. Some Issues of Application

The number of possible partitions of a system, called Bell’s number, grows immensely as the number of elements increases. For example, the tiny nematode, a simple species of worm, has 302 neurons, “and the number of ways that this network can be cut into parts is the hyperastronomical 10 followed by 467 zeros” (Koch, 2012). Calculating phi precisely for much more complex systems such as brains is therefore practically, though not in principle, beyond computation. In the absence of precise phi computation, IIT employs mathematical “heuristics, shortcuts, and approximations” (ibid.). The IIT literature includes several different mathematical interpretations of phi calculation, each intended to replace the last; it is not yet clear that IIT has a settled account of it. Proponents of IIT hold that the mathematical details will enable the application, but not bear on the merits, of the deeper theoretical claims. At least one serious objection to IIT, however, attempts a reductio ad absurdum of those deeper claims via analysis of the mathematical implications.
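
For a concrete sense of that growth, Bell’s number can be computed with the standard Bell-triangle recurrence; the values below are exact, and even a modest system quickly outruns any exhaustive search over its partitions:

```python
def bell(n: int) -> int:
    """Return the n-th Bell number: the number of ways to partition a set of n elements,
    computed with the Bell-triangle recurrence."""
    row = [1]
    for _ in range(n - 1):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[-1]

for n in (2, 5, 10, 15):
    print(n, bell(n))   # 2, 52, 115975, 1382958545 -- and for 302 elements the count
                        # runs to hundreds of digits, as the quoted passage notes
```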

It is clear that, whatever the mathematical details, the basic principles of phi imply that biological nervous systems such as the brain will be capable of having very high phi, because neurons often have thousands of connections to one another. On the other hand, a typical circuit in a standard CPU only makes a few connections to other circuits, limiting the potential phi value considerably. It is also clear that even simple systems will have at least some phi value and that, provided they constitute local maxima, they will have a corresponding measure of consciousness.

IIT does not intend phi as a measure of the quality of consciousness, only of its quantity. Two systems may have the same phi value but different MICS organizations. In this case, each would be conscious to the same degree, but the nature of the conscious experience would differ. The phi metric captures one dimension, the amount of integrated information, of a system. IIT does address the quality of consciousness abstractly, although not with phi. A system’s model includes its elements and their connections, whose logic can be graphed as a constellation of (de)activated points with lines between them representing (de)activated connections. This is, precisely, its conceptual structure. Recall that the maximally irreducible conceptual structure, or MICS, is, on IIT, conscious. A graph of the MICS, according to IIT, captures its unique shape in quality space, the shape of that particular conscious state. In other words, this is the abstract form of the quality of the experience, or the experience’s form “seen from the outside.” The perspective “from the inside” is available only to the system itself, whose intrinsically making differences to itself, integrating information in one of many possible forms, just is its experience. The phenomenological nature of the experience, its “qualia,” is evident only from the perspective of the conscious system, but the logical graph of its structure is a complete representation of its qualitative properties.

3. Situating the Theory

a. Some Prehistory

IIT made its explicit debut in the literature in 2004, but has roots in earlier work. Giulio Tononi, the theory’s founder and major proponent, worked for many years with Gerald Edelman. Their work rejected the notion that mental events such as consciousness will ever find full explanation by reference to the functioning of a system. Such functionalism, to them, ignores the crucial issue of the physical substrate itself. They especially emphasized the importance of re-entry. To them, only a system composed of feedback loops, where input may also serve as output, can integrate information. Feed-forward systems, then, do not integrate information. Even before the introduction of IIT, Tononi was claiming that integrated information was essential to the creation of a scene in primary consciousness.

Christof Koch, now a major proponent of IIT, collaborated for a long time with Francis Crick. Much of their earlier work focused on identifying the neural correlates of consciousness (NCC), especially in the visual system. While such research advances particular knowledge about the mechanisms of one set of conscious states, Crick and Koch came to see this work as failing to address the deeper problems of explaining consciousness generally. Koch also rejects the idea that identifying the functional dynamics of a system aptly treats what makes that system conscious. He came to regard information theory as the correct approach for explaining consciousness.

So, the two thinkers who became IIT’s chief advocates arrived at that position after close neuroscientific research with Nobel Laureates who eschewed functional approaches to consciousness, favoring investigation of the relation of physical substrate to information generation.

b. IIT’s Additional Support

As of 2016, Tononi, IIT’s creator, runs the Center for Sleep and Consciousness at the University of Wisconsin–Madison. The Center has more than forty researchers, many of whom work in its IIT Theory Group. Koch, a major supporter of IIT, heads the prestigious Allen Institute for Brain Science. The Institute has links with the White House’s Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative, as well as with the European Human Brain Project (HBP). IIT’s body of literature continues to grow, often in the form of publications associated with the Center and the Institute. The public reputation of these organizations, as well as of Tononi and Koch’s earlier work, lends a certain authority or celebrity to IIT. The theory has enjoyed ample attention in mainstream media. Nevertheless, IIT remains a minority position among neuroscientists and philosophers.

c. IIT as Sui Generis

IIT does not fit neatly into any other school of thought about consciousness; there are points of connection to and departures from many categories of consciousness theory.

IIT clearly endorses a Cartesian interpretation of consciousness as immediate; such an association is unusual for a self-described naturalistic or scientific theory of consciousness. Cartesian convictions do inform IIT’s axioms and so motivate its overall methodological approach. To label IIT as a Cartesian theory generally, however, would be misleading. For one thing, like most modern theories of consciousness, it dissociates itself from the idea of a Cartesian theatre, or single point in the brain that is the seat of consciousness. Moreover, it is by no means clear how IIT stands in relation to dualism. Certainly, IIT does not advertise itself as positing a mental substance that is separate from the physical. At the same time, it draws an analogy between its identification of consciousness as a property of certain integrated information systems and physics’ identification of mass or charge as a property of particles. One might interpret such introduction of immediate experience into the naturalistic ontology as having parallels with positing a new kind of mental substance.  The literature will occasionally describe IIT as a form of materialism. It is true that IIT theorists do focus on the material substrate of informational systems, but again, one might challenge whether a theory that asserts direct experience as fundamental to substrates with particular architectural features is indeed limiting itself to reference to material in its explanation.

In describing the features of conscious systems, IIT will make reference to function, but IIT rejects functionalism outright. To IIT theorists, articulating the functional dynamics of a system alone will never do justice to the immediate nature of experience.

d. Relation to Panpsychism

The IIT literature, not only from Tononi but also from Koch and others, refers with some regularity to panpsychism—broadly put, any metaphysical system that attributes mental properties to basic elements of the world—as sharing important ground with IIT. Panpsychism comes in different forms, and the precise relationship between it and IIT has yet to be established. Both IIT and panpsychism strongly endorse Cartesian commitments concerning the immediate nature of experience. IIT, however, only attributes mental properties to re-entrant architectures, because it claims that only these will integrate information; this is inconsistent with any version of panpsychism that insists upon attributing mental properties to even more basic elements of the structure of existence.

i. Relation to David Chalmers

Of the various contemporary philosophical accounts of consciousness, IIT intersects perhaps most frequently with the work of David Chalmers. This makes sense, not only given Chalmers’s panpsychist leanings, but also given the express commitment of both to a Cartesian acceptance of the immediacy of experience, and a corresponding rejection of functionalist attempts to explain consciousness.

Moreover, Chalmers’s discussion of the relation of information to consciousness strongly anticipates IIT. Before the introduction of IIT, Chalmers had already endorsed the view of information as involving specification, or reduction of uncertainty. IIT often echoes this, especially in connection with the third axiom and postulate. Chalmers also characterizes information as making a difference, which relates to IIT’s first postulate especially. These notions of specification and difference-making are familiar from the standard Shannon account of information, but the point is that both Chalmers and IIT choose to understand consciousness partly by reference to these concepts rather than to information about something.

A major theme in Chalmers’s work involves addressing the problem of the precise nature of the connection between physical systems and consciousness. Before IIT, Chalmers speculated that information seems to be the connection between the two. If this is the case, then phenomenology is the realization of information. Chalmers suggests that information itself may be primitive in the way that mass or charge is. Now, this does not directly align with IIT’s later claims of how consciousness is fundamental to certain informational systems in the same way that mass or charge is fundamental to particles, but the parallels are clear enough. Similarly, Chalmers’s description of a “minimally sufficient neural system” as the neural correlate of consciousness resembles IIT’s discussion of the MICS. Both also use the term “core” in this context. It comes as no surprise that IIT and Chalmers have intersected when we read Chalmers’s earlier claims: “Perhaps, then, the intrinsic nature required to ground the information states is closely related to the intrinsic nature present in phenomenology. Perhaps one is even constitutive of the other” (Chalmers, 1996).

Still, the relationship between Chalmers’s work and IIT is not one of simple alliance. Despite the apparent similarity of their positions on what is fundamental, there is an important disagreement. Chalmers takes the physical to derive from the informational, and grounds the realization of phenomenal space—the instantiation of conscious experience—not upon causal “differences that make a difference,” but upon the intrinsic qualities of and structural relations among experiences. IIT regards consciousness as being intrinsic to certain causal structures, which might be read as the reverse of Chalmers’s claim.  In describing his path to IIT, Koch endorses Chalmers as “a philosophical defender of information theory’s potential for understanding consciousness” while faulting Chalmers’s work for not addressing the internal organization of conscious systems (Koch, 2012). To Koch, treating the architecture is necessary because consciousness does not alter in simple covariance with change in amounts of bits of information. Because IIT addresses the physical organization, it struck Koch as superior.

There has been some amount of cross-reference between IIT and Chalmers in the literature, although important differences are apparent. For example, Chalmers famously discusses the possibility of a zombie: a functioning physical duplicate of a human that has no experience. On IIT, functional zombies are possible, but not zombies whose nervous connections duplicate our own. In other words, if a machine were built to imitate the behavior of a human perfectly, but its hardware involved feed-forward circuits, then it would generate no phi, or more likely only low, local phi, rather than the high phi of human consciousness. But if we posit that the machine replicates the connections of the human down to the level of the hardware, then it would follow that the system would generate the same level of phi and would be equally conscious.

Chalmers (2016) writes that IIT “can be construed as a form of emergent panpsychism,” which is true in a sense, but requires qualification. By “emergent panpsychism” Chalmers means that IIT posits consciousness as fundamental not to the merest elements of existence but to certain structures that emerge at the level of particles’ relations to one another.  This is a fair assessment, whether or not IIT’s advocates choose to label the theory this way. But in the difference between emergent and non-emergent lies the substance of IIT: what precisely makes one structure “emerge” conscious and another not is what IIT hopes to explain. Non-emergent panpsychism of the sort associated with Chalmers by definition pitches its explanation at a different level. Indeed, it does not necessarily grant the premise that there exist non-conscious elements, let alone structures, in the first place. Despite the similarities between IIT and some of Chalmers’s work, the two should not be confused.

4. Implications

a. The Spectrum of Consciousness

It is widely accepted that humans experience varying degrees of consciousness. In sleep, for example, the richness of experience diminishes, and sometimes we do not experience at all. IIT implies that brain activity during this time will generate either less information or less integrated information, and interprets experimental results as bearing this out. On IIT, the complexity of physical connections in the MICS corresponds to the level of consciousness. By contrast, the cerebellum, which has many neurons, but neurons that are not complexly interconnected and so do not belong to the MICS, does not generate consciousness.

There does not exist a widely accepted position on non-human consciousness. IIT counts among its merits that the principles it uses to characterize human consciousness can apply to non-human cases.

On IIT, consciousness happens when a system makes a difference to itself at a physical level: elements causally connected to one another in a re-entrant architecture integrate information, and the subset of these with maximal causal power is conscious. The human brain offers an excellent example of re-entrant architecture integrating information, capable of sustaining highly complex MICSs, but nothing in IIT limits the attribution of consciousness to human brains only.

Mammalian brains share similarities in neural and synaptic structure: the human case is not obviously exceptional. Other, non-mammalian species demonstrate behavior associated in humans with consciousness. These considerations suggest that humans are not the only species capable of consciousness. IIT makes a point of remaining open to the possibility that many other species may possess at least some degree of consciousness. At the same time, further study of non-human neuroanatomy is required to determine whether and how this in fact holds true. As mentioned above, even the human cerebellum does not have the correct architecture to generate consciousness, and it is possible that other species have neural organizations that facilitate complex behavior without generating high phi. The IIT research program offers a way to establish whether these other systems are more like the cerebellum or the cerebral cortex in humans. Of course, consciousness levels will not correspond completely to species alone. Within conscious species, there will be a range of phi levels, and even within a conscious phenotype, consciousness will not remain constant from infancy to death, wakefulness to sleep, and so forth.

IIT claims that its principles are consistent with the existence of cases of dual consciousness within split-brain patients. In such instances, on IIT, two local maxima of integrated information exist separately from one another, generating separate consciousness. IIT does not hold that a system need have only one local maximum, although this may be true of normal brains; in split-brain patients, the re-entrant architecture has been severed so as to create two. IIT also takes its identification of MICSs through quantification of phi as a potential tool for assessing other actual or possible cases of multiple consciousness within one brain.

Such claims also allow IIT to rule out instances of aggregate consciousness. The exclusion principle forbids double-counting of consciousness. A system will have various subsystems with phi value, but only the local maxima of phi within the system can be conscious. A normal waking human brain has only one conscious MICS, and even a split-brain patient’s conscious systems do not overlap but rather are separate. One’s conscious experience is precisely what it is and nothing else. All this implies that, for example, the United States of America has no superordinate consciousness in addition to the consciousness of its individuals. The local maxima of integrated information reside within the skulls of those individuals; the phi value of the connections among them is much lower.

Although IIT allows for a potentially very wide range of degrees of consciousness and conscious entities, this has its limits. Some versions of panpsychism attribute mental properties to even the most basic elements of the structure of the world, but the simplest entity that IIT admits as conscious would have to be a system of at least two elements that have cause-effect power over one another. Otherwise no integrated information exists. Objects such as rocks and grains of sand have no phi, whether in isolation or heaped into an aggregate, and so no consciousness.

IIT’s criteria for consciousness are consistent with the existence of artificial consciousness. The photodiode, because it integrates information, has a phi value; if not subsumed into a system of higher phi, this will count as local maximum: the simplest possible MICS or conscious system. Many or most instances of phi and consciousness may be the result of evolution in nature, independent of human technology, but this is a contingent fact. Often technological systems involve feed-forward architecture that lowers or possibly eliminates phi, but if the system is physically re-entrant and satisfies the other criteria laid out by IIT, it may be conscious. In fact, according to IIT, we may build artificial systems with a greater degree of consciousness than humans.

b. IIT and Physics

IIT has garnered some attention in the physics literature. Even if one accepts the basic principles of IIT, it remains possible to offer a different account of the physical particulars. Tononi and the other proponents of IIT coming from neuroscientific backgrounds tend to offer description at a classical grain. They frame integrated information by reference to neurons and synapses in the case of brains, and to re-entrant hardware architecture in the case of artificial systems. Such descriptions stay within the classical physics paradigm. This does not exhaust the theoretical problem space for characterizing integrated information.

One alternative (Barrett, 2014) proposes that consciousness comes from the integration of information intrinsic to fundamental fields. This account calls for reconceiving the phi metric, which in its 2016 form applies only to discrete systems, and not to electromagnetic fields.  Another account (Tegmark, 2015) also proposes non-classical physical description of conscious, integrated information. This generalizes beyond neural or neural-type systems to quantum systems, suggesting that consciousness is a state of matter, whimsically labelled “perceptronium.”

c. Artificial Consciousness

IIT’s basic arguments imply, and the IIT literature often explicitly claims, certain important constraints upon artificial conscious systems.

i. Constraints on Structure/Architecture

At the level of hardware, computation may process information with either feed-forward or re-entrant architecture. In feed-forward systems, information gets processed in only one direction, taking input and giving output. In re-entrant systems, which consist of feedback loops, signals are not confined to movement in one direction only; output may operate as input also.

IIT interprets the integration axiom (the fourth axiom, which says that each experience’s phenomenological elements are interdependent) as entailing the fourth postulate, which claims that each mechanism of a conscious system must have the potential to relate causally to the other mechanisms of that system. By definition, in a feed-forward system, mechanisms cannot act as causes upon those parts of the system from which they take input. A purely feed-forward system would have no phi, because although it would process information, it would not integrate that information at the physical level.

One implication for artificial consciousness is immediately clear: Feed-forward architectures will not be conscious. Even a feed-forward system that perfectly replicated the behavior of a conscious system would only simulate consciousness. Artificial systems would need to have re-entrant structure to generate consciousness.
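
Assuming an architecture is represented as a directed graph of connections, the structural difference is just the presence or absence of a cycle: a purely feed-forward wiring diagram is acyclic, while a re-entrant one contains at least one loop. The sketch below makes that check explicit; the two example networks are invented:

```python
# A minimal sketch: deciding whether a wiring diagram is purely feed-forward
# (no element's output can ever loop back to it) or re-entrant. On IIT's reading,
# an acyclic, feed-forward architecture integrates no information (phi = 0).

def is_reentrant(edges: dict) -> bool:
    """Return True if the directed connection graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in edges}

    def visit(node):
        color[node] = GREY
        for nxt in edges[node]:
            if color[nxt] == GREY or (color[nxt] == WHITE and visit(nxt)):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in edges)

feed_forward = {0: [1], 1: [2], 2: []}    # strictly one-directional: no phi
recurrent    = {0: [1], 1: [2], 2: [0]}   # a feedback loop: can integrate information
print(is_reentrant(feed_forward), is_reentrant(recurrent))   # False True
```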

Furthermore, re-entrant systems may still generate very low levels of phi. Conventional CPUs have transistors that communicate with only a few others. By contrast, each neuron of the conscious network of the brain connects with thousands of others, a far more complex re-entrant structure, making a difference to itself at the physical level in such a way as to generate a much higher phi value. For this reason, brains are capable of realizing much richer consciousness than conventional computers. The field of artificial consciousness, therefore, would do well to emulate the neural connectivity of the brain.

Still another constraint applies, this one associated with the fifth postulate, the postulate of exclusion. A system may have numerous phi-generating subsystems, but according to IIT, only the network of elements with the greatest cause-effect power to integrate information—the maximally irreducible conceptual structure, or MICS—is conscious. Re-entrant systems may have local maxima of phi, and therefore small pockets of consciousness. Those attempting to engineer high degrees of artificial consciousness need to focus their design on creating a large MICS, not simply small, non-overlapping MICSs.

If IIT is correct in placing such constraints upon artificial consciousness, deep convolutional networks such as GoogLeNet and advanced projects like Blue Brain may be unable to realize high levels of consciousness.

ii. Relation to “Silent Neurons”

IIT’s third postulate has a somewhat counterintuitive implication. The third axiom claims that each conscious experience is precisely what it is; that is, it is distinct from other experiences. The third postulate claims that, in order to realize this feature of consciousness, a system must have a range of possible states, describable by reference to the cause-effect repertoires of its mechanistic elements. The system realizes a specific conscious state by instantiating one of those particular physical arrangements.

An essential component of the phenomenology of a conscious state is the degree of specificity: a photodiode that registers light specifies one of only two possible states. IIT accepts that such a simple mechanism, if not subsumed under a larger MICS, must be conscious, but only to a negligible degree. On the other hand, when a human brain registers light, it distinguishes it from countless other states; not only from dark, but from different shades of color, from sound, and so forth. The brain state is correspondingly more informative than the photodiode state.

This means that not only active neuronal firing, but also neuronal silence, determines the nature of a conscious state. Inactive parts of the complex, as well as active ones, contribute to the specification at the physical level, which IIT takes as the realization of the conscious state.

It is important not to conflate silent, or inactive, neurons of this kind with de-activated neurons. Only neurons that genuinely fall within the cause-effect repertoires of the mechanistic elements of the system count as contributing to specification, and this applies to inactive as well as to active ones. If a neuron was incapable all along of having causal power within the MICS, its inactivity plays no role in generating phenomenology. Likewise, IIT predicts that should neurons otherwise belonging to cause-effect repertoires of the system be rendered incapable of such causation (for example, by optogenetics), their inactivity would not contribute to phenomenology.

5. Objections

a. The Functionalist Alternative

According to functionalism, mental states, including states of consciousness, find explanation by appeal to function. The nature of a certain function may limit the possibilities for its physical instantiation, but the function, and not the material details, is of primary relevance. IIT differs from functionalism on this basic issue: on IIT, the conscious state is identified with the way in which a system embodies the physical features that IIT’s postulates describe.

Their opposing views concerning constraints upon artificial consciousness nicely illustrate the contrast between functionalism and IIT. For the functionalist, any system that functions identically to, for example, a conscious human, will by definition have consciousness. Whether the artificial system uses re-entrant or feed-forward architecture is a pragmatic matter. It may turn out that re-entrant circuitry more efficiently realizes the function, but even if the system incorporates feed-forward engineering, so long as the function is achieved, the system is conscious. IIT, on the other hand, expressly claims that a system that performed in a way completely identical to a conscious human, but that employed feed-forward architecture, would only simulate, but not realize consciousness. Put simply, such a system would operate as if it were integrating information, but because its networks would not take output as input, would not actually integrate information at the physical level. The difference would not be visible to an observer, but the artificial system would have no conscious experience.

i. Rejecting Cartesian Commitments

Those who find functionalism unsatisfactory often take it as an inadequate account of phenomenology: no amount of description of functional dynamics seems to capture, for example, our experience of the whiteness of a cue ball. Indeed, IIT entertains even broader suspicions. Beginning with descriptions of physical systems may never lead to explanations of consciousness. Rather, IIT’s approach begins with what it takes to be the fundamental features of consciousness. These self-evident, Cartesian descriptors of phenomenology then lead to postulates concerning their physical realization; only then does IIT connect experience to the physical.

This methodological respect for Cartesian intuitions has a clear appeal, and the IIT literature largely takes this move for granted, rather than offering outright justification for it. In previous work with Edelman (2000), Tononi discusses machine-state functionalism, an early form of functionalism that identified a mental state entirely with its internal, “machine” state, describable in functional terms. Noting that Putnam, machine-state functionalism’s first advocate, came to abandon the theory because meanings are not sufficiently fixed by internal states alone, Tononi rejects functionalism generally. More recently, Koch (2012) describes much work in consciousness as “models that describe the mind as a number of functional boxes” where one box is “magically endowed with phenomenal awareness.” Koch confesses to being guilty of this in some of his earlier work. He then points to IIT as an exception.

Functionalism is not receiving a full or fair hearing in these instances. Machine-state functionalism is a straw man: contemporary versions of functionalism do not commit to an entirely internal explanation of meaning, and not all functionalist accounts are subject to the charge of arbitrarily attributing consciousness to one part of a system. The success or failure of functionalism turns on its treatment of the Cartesian intuitions we all have that consciousness is immediate, unitary, and so on. Rather than taking these intuitions as evidence of the unavoidable truth of what IIT describes in its axioms, functionalism offers a subtle alternative. Consciousness indeed seems to us direct and immediate, but functionalists argue that this “seeming” can be adequately accounted for without positing a substantive phenomenality beyond function. Functionalists claim that the seeming immediacy of consciousness receives sufficient explanation as a set of beliefs and dispositions to believe that consciousness is immediate. The challenge lies in giving a functionalist account of such beliefs: no mean feat, but not the deep mystery that non-functionalists construe consciousness as posing. If functionalism is correct in this characterization of consciousness, it undercuts the very premises of IIT.

ii. Case Study: Access vs. Phenomenal Consciousness

Function may be understood in terms of access. If a conscious system has cognitive access to an association or belief, then that association or belief is conscious. In humans, access is often taken to be demonstrated by verbal reporting, although other behaviors may indicate cognitive access. Functionalists hold that cognitive access exhaustively describes consciousness (Cohen and Dennett, 2012). Others hold that subjects may be phenomenally conscious of stimuli without cognitively accessing them.

Interpretation of the relevant empirical studies is a matter of controversy. The phenomenon known as “change blindness” occurs when a subject fails to notice subtle differences between two pictures, even while reporting thoroughly perceiving each. Dennett’s version of functionalism, at least, interprets this as the subject not having cognitive access to the details that have changed, and moreover as not being conscious of them. The subject overestimates the richness of his or her conscious perception. Certain non-functionalists claim that the subject does indeed have the reported rich conscious phenomenology, even though cognitive access to that phenomenal experience is incomplete. Block (2011), for instance, holds this interpretation, claiming that “perceptual consciousness overflows cognitive access.” On this account, phenomenal consciousness may occur even in the absence of access consciousness.

IIT’s treatment of the role of silent neurons aligns with the non-functionalist interpretation. On IIT, a system’s consciousness grows in complexity and richness as the number of elements that could potentially relate causally within the MICS grows. Such elements, even when inactive, contribute to the specification of the integrated information, and so help to fix the phenomenal nature of the experience. In biological systems, this means that silent but potentially active neurons matter to consciousness.

Such silent neurons are not accessed by the system. These non-accessed neurons still contribute to consciousness. As in Block’s non-functionalism, access is not necessary for consciousness. On IIT, it is crucial that these neurons could potentially be active, so they must be accessible to the system. Block’s account is consistent with this in that he claims that the non-accessed phenomenal content need not be inaccessible. (Koch, separately from his support of IIT, takes the non-functionalist side of this argument (Koch and Tsuchiya, 2007); so do Fahrenfort and Lamme (2012); for a functionalist response to the latter, see Cohen and Dennett (2011, 2012).)

Non-functionalist accounts that argue for phenomenal consciousness without access make sense given a rejection of the functionalist claim that phenomenality may be understood as a set of beliefs and associations, rather than a Cartesian, immediate phenomenology beyond such things.

iii. Challenging IIT’s Augmentation of Naturalistic Ontology

Any account of consciousness that maintains that phenomenal experience is immediately first-personal stands in tension with naturalistic ontology, which holds that even experience in principle will receive explanation without appeal to anything beyond objective, or third-personal, physical features. Among theories of consciousness, those versions of panpsychism that attribute mental properties to basic structural elements depart perhaps most obviously from the standard scientific position. Because IIT limits its attribution of consciousness to particular physical systems, rather than to, for example, particles, it constitutes a somewhat more conservative position than panpsychism. Nevertheless, IIT’s claims amount to a radical reconception of the ontology of the physical world.

IIT’s allegiance to a Cartesian interpretation of experience from the outset lends itself to a non-naturalistic interpretation, although not every step in IIT’s argumentation implies a break from standard scientific ontology. IIT counts among its innovations the elucidation of integrated information, achieved when a system’s parts make a difference intrinsically, to the system itself. This differs from observer-relative, or Shannon, information, but by itself stays within the confines of naturalism: for example, IIT could have argued that integrated information constitutes an efficient functional route to realizing states of awareness.

Instead, IIT makes the much stronger claim that such integrated information, provided it is locally maximal, is identical to consciousness. The IIT literature is quite explicit on this point, routinely offering analogies to other fundamental physical properties. Consciousness is fundamental to integrated information in the same way as it is fundamental to mass that space-time bends around it. The degree and nature of any given phenomenal feeling follow basically from the particular conceptual structure that is the integrated information of the system. Consciousness is not a brute property of physical structure per se, as it is in some versions of panpsychism, but it is inextricable from physical systems with certain properties, just as mass or charge is inextricable from some particles. So, IIT is proposing an addition to what science admits into its ontology.

The extraordinary nature of the claim does not necessarily undermine it, but it may be cause for reservation. One line of objection to IIT might claim that this augmentation of naturalistic ontology is non-explanatory, or even ad hoc. We might accept that biological conscious systems possess neurology that physically integrates information in a way that converges with phenomenology (as outlined in the relation of the postulates to the axioms) without taking this as sufficient evidence for an identity relation between integrated information and consciousness.  In response, IIT advocates might claim that the theory’s postulates give better ontological ground than functionalism for picking out systems in the first place.

b. Aaronson’s Reductio ad Absurdum

The computer scientist Scott Aaronson (on his blog Shtetl-Optimized; see Horgan (2015) for an overview) has compelled IIT to admit a counterintuitive implication. Certain systems, which are computationally simple and seem implausible candidates for consciousness, may have values of phi higher even than those of human brains, and would count as conscious on IIT. Aaronson’s argument is intended as a reductio ad absurdum; the IIT response has been to accept its conclusion, but to deny the charge of absurdity. Aaronson’s basic claim involves applying phi calculation. Advocates of IIT have not questioned Aaronson’s mathematics, so the philosophical relevance lies in the aftermath.

IIT refers to richly complex systems such as human brains or hypothetical artificial systems in order to illustrate high phi value. Aaronson points out that systems that strike us as much simpler and less interesting will sometimes yield a high phi value. The physical realization of an expander graph (his example) could have a higher phi value than a human brain. A graph has points that connect to one another, making the points vertices and the connections edges. This may be thought of as modelling communication between points. Expander graphs are “sparse,” having relatively few edges per vertex, yet their vertices are nonetheless highly connected, and this connectivity means that the points have strong communication with one another. In short, such graphs have the right properties for generating high phi values. Because it is absurd to accept that a physical model of an expander graph could have a higher degree of consciousness than a human being, the theory that leads to this conclusion, IIT, must be false.
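
To give a rough feel for “sparse yet highly connected” (a toy comparison only, not a genuine expander construction), the sketch below compares a plain 64-node ring with the same ring augmented by one long-range chord per node. Both graphs are sparse, but in the second every node can reach every other in far fewer steps:

```python
from collections import deque

def diameter(adj):
    """Longest shortest-path distance between any two nodes (graph assumed connected)."""
    worst = 0
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        worst = max(worst, max(dist.values()))
    return worst

n = 64
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}                 # degree 2
chords = {i: ring[i] + [(i + 27) % n, (i - 27) % n] for i in range(n)}   # degree 4
print(diameter(ring), diameter(chords))   # 32 versus a much smaller number: the chorded
                                          # graph is still sparse but far better connected
```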

Tononi (2014) responds directly to this argument, conceding that Aaronson has drawn out the implications of IIT and phi fairly, even ceding further ground: a two-dimensional grid of logic gates, even simpler than an expander graph, would have a high phi value and would, according to IIT, have a high degree of consciousness. Tononi has already argued that a photodiode has minimal consciousness; to him, accepting where Aaronson’s reasoning leads is just another case of the theory producing surprising results. After all, science must be open to theoretical innovation.

Aaronson’s rejoinder challenges IIT by arguing that it implicitly holds inconsistent views on the role of intuition. In his response to Aaronson’s original claims, Tononi disparages intuitions regarding when a system is conscious: Aaronson should not be as confident as he is that expander graphs are not conscious. Indeed, the open-mindedness here suggested seems in line with the proper scientific attitude. Aaronson employs a thought-experiment to draw out what he takes to be the problem. Imagine that a scientist announces that he has discovered a superior definition of temperature and has constructed a new thermometer that reflects this advance. It so happens that the new thermometer reads ice as being warmer than boiling water. According to Aaronson, even if there is merit to the underlying scientific work, it is a mistake for the scientist to use the terms “temperature” or “heat” in this way, because it violates what we mean by those terms in the first place: “heat” means, partly, what ice has less of than boiling water. So, while IIT’s phi metric may have some merit, it is not in measuring consciousness degree, because “consciousness” means, partly, what humans have and expander graphs and logic gates do not have.

One might, in defense of IIT, respond by claiming that the cases are not as similar as they seem, that the definition of heat necessitates that ice has less of it than boiling water and that the definition of consciousness does not compel us to draw conclusions about expander graphs’ non-consciousness, strange as that might seem. Aaronson’s argument goes further, however, and it is here that the charge of inconsistency comes into play. Tononi’s answer to Aaronson’s original reductio argument partly relies upon claiming that facts such as that the cerebellum is not conscious are totally well-established and uncontroversial. (IIT predicts this because the wiring of the cerebellum yields a low phi and is not part of the conscious MICS of the brain.) Here, argues Aaronson, Tononi is depending upon intuition, but it is possible that although the cerebellum might not produce our consciousness, it may have one of its own. Aaronson is not arguing for the consciousness of the cerebellum, but rather pointing out an apparent logical contradiction. Tononi rejects Aaronson’s claim that expander graphs are not conscious because it relies on intuition, but here Tononi himself is relying upon intuition. Nor can Tononi here appeal to common sense, because IIT’s acceptance of expander graphs and logic gates as conscious flies in the face of common sense.

It is possible that IIT might respond to this serious charge by arguing that almost everyone agrees that the brain is conscious, and that IIT has more success than any other theory in accounting for this while preserving many of our other intuitions (that animals, infants, certain patients with brain-damage, and sleeping adults all have dimmer consciousness than adult waking humans, to give several examples). Because this would accept a certain role for intuitions, it would require walking back the gloss on intuition that Tononi has offered in response to Aaronson’s reductio. Moreover, Aaronson’s arguments show that such a defense of the overall intuitive plausibility of IIT will face difficult challenges.

c. Searle’s Objection

In one of very few published discussions of IIT by a philosopher, John Searle (2013a) has come out against it, criticizing its emphasis on information as a departure from the more promising “biological approach.” His objections may be divided into two parts; Koch and Tononi (2013) have offered a response.

First, Searle claims that in identifying consciousness with a certain kind of information, IIT has abandoned causal explanation. Appeal to cause should be the proper route for scientific explanation, and Searle maintains, as he has throughout his career, that solving the mystery of consciousness will depend upon the explication of the causal powers special to the brain that give rise to experience. Information fails aptly to address the problem because it is observer-relative; indeed, it is relative to the conscious observer. A book or computer, to take typical examples of objects associated with information, does not contain information except insofar as it is endowed with it by a conscious subject. Information is in the eye of the beholder. The notion of information presupposes consciousness, rather than explaining it.

Second, according to Searle, IIT leads to an absurdity, namely panpsychism, which is sufficient reason to reject it. He interprets IIT as imputing consciousness to all systems with causal relations, and so it follows on IIT that consciousness is “spread over the universe like a thin veneer of jam.” A successful theory of consciousness will have to appreciate that consciousness “comes in units” and give a principled account of how and why this is the case.

Koch and Tononi’s response (2013) addresses both strands of Searle’s argument. First, they agree that Shannonian information is observer-relative, but point out that integrated information is non-Shannonian. IIT defines integrated information as necessarily existing with respect to itself, which they understand in expressly causal terms, as a system whose parts make a difference to that system. Integrated information systems therefore exist intrinsically, rather than relative to observers. Not only does IIT attend to the observer-relativity point, then, but also does so in a way that, contrary to Searle’s characterization, crucially incorporates causality.

Second, they deny that IIT implies the kind of panpsychism that Searle rejects as absurd. As they point out, IIT only attributes consciousness to local maxima of integrated information (MICS), and although that implies that some simple systems such as the isolated photodiode have a minimal degree of consciousness, it provides a principle to determine which “units” are conscious, and which are not. As Tononi had already put it, before Searle’s charges: “How close is this position to panpsychism, which holds that everything in the universe has some kind of consciousness? Certainly, the IIT implies that many entities, as long as they include some functional mechanisms that can make choices between alternatives, have some degree of consciousness. Unlike traditional panpsychism, however, the IIT does not attribute consciousness indiscriminately to all things. For example, if there are no interactions, there is no consciousness whatsoever. For the IIT, a camera sensor as such is completely unconscious…” (Tononi, 2008).

Although Searle offers a rejoinder (2013b) to Tononi and Koch’s response, it largely rehearses the original claims. Regardless of whether IIT is true, Tononi and Koch have given good reason to read it as addressing precisely the concerns that Searle raises. Arguably, then, Searle might have reason to embrace IIT as a theory of consciousness that at least attempts a principled articulation of the special causal powers of the brain, which Searle has regarded for many years as the proper domain for explaining consciousness.

6. References and Further Reading

  • Barrett, Adam. “An Integration of Integrated Information Theory with Fundamental Physics.” Frontiers in Psychology, 5 (63). 2014.
    • Calls for a re-conception of phi with respect to electromagnetic fields.
  • Block, Ned. “Perceptual Consciousness Overflows Cognitive Access.” Trends in Cognitive Science, 15 (12). 2011.
    • Argues for the distinction between access and phenomenal consciousness.
  • Chalmers, David. The Conscious Mind. New York: Oxford University Press. 1996.
    • A major work, relevant here for its discussions of information, consciousness, and panpsychism.
  • Chalmers, David. “The Combination Problem for Panpsychism.” In L. Jaskolla and G. Brüntrup (Eds.), Panpsychism. Oxford University Press. 2016.
    • Updated take on a classical problem; Chalmers makes reference to IIT here.
  • Cohen, Michael, and Daniel Dennett. “Consciousness Cannot be Separated from Function.” Trends in Cognitive Science, 15 (8). 2011.
    • Argues for understanding phenomenal consciousness as access consciousness.
  • Cohen, Michael, and Daniel Dennett. “Response to Fahrenfort and Lamme: Defining Reportability, Accessibility and Sufficiency in Conscious Awareness.” Trends in Cognitive Science, 16 (3). 2012.
    • Further defends understanding phenomenal consciousness as access consciousness.
  • Dennett, Daniel. Consciousness Explained. Little, Brown and Co. 1991.
    • Classic, comparatively accessible teleofunctionalist account of consciousness.
  • Dennett, Daniel. Sweet Dreams: Philosophical Obstacles to a Science of Consciousness. London, England: The MIT Press. 2005.
    • A concise but wide-ranging, updated defense of functionalist explanation of consciousness.
  • Edelman, Gerald. The Remembered Present: A Biological Theory of Consciousness. New York: Basic Books. 1989.
    • Influential upon Tononi’s early thinking.
  • Edelman, Gerald, and Giulio Tononi. A Universe of Consciousness: How Matter Becomes Imagination. New York: Basic Books. 2000.
    • Puts forward many of the arguments that later constitute IIT.
  • Fahrenfort, Johannes, and Victor Lamme. “A True Science of Consciousness Explains Phenomenology: Comment on Cohen and Dennett.” Trends in Cognitive Science, 16 (3). 2012.
    • Argues for the access/phenomenal division supported by Block.
  • Horgan, John. “Can Integrated Information Theory Explain Consciousness?” Scientific American. 1 December 2015. http://blogs.scientificamerican.com/cross-check/can-integrated-information-theory-explain-consciousness/
    • Gives an overview of an invitation-only workshop on IIT at New York University that featured Tononi, Koch, Aaronson, and Chalmers, among others.
  • Koch, Christof. Consciousness: Confessions of a Romantic Reductionist. The MIT Press. 2012.
    • Intellectual autobiography, in part detailing the author’s attraction to IIT.
  • Koch, Christof, and Naotsugu Tsuchiya. “Phenomenology Without Conscious Access is a Form of Consciousness Without Top-down Attention.” Behavioral and Brain Sciences, 30 (5-6), 509-10. 2007.
    • Also argues for the access/phenomenal division supported by Block.
  • Koch, Christof, and Giulio Tononi. “Can a Photodiode Be Conscious?” New York Review of Books, 7 March 2013.
    • Responds to Searle’s critique.
  • Oizumi, Masafumi, Larissa Albantakis, and Giulio Tononi. “From the Phenomenology to the Mechanisms of Consciousness: Integrated Information Theory 3.0.” PLOS Computational Biology, 10 (5). 2014. Doi: 10.1371/journal.pcbi.1003588.
    • Technically-oriented introduction to IIT.
  • Searle, John. “Minds, brains and programs.” In Hofstadter, Douglas and Daniel Dennett, (Eds.). The Mind’s I: Fantasies and Reflections on Self and Soul (pp. 353-373). New York: Basic Books. 1981.
    • Searle’s classic paper on intentionality and information.
  • Searle, John. The Rediscovery of Mind. Cambridge, MA: The MIT Press. 1992.
    • Fuller explication of Searle’s views on intentionality and information.
  • Searle, John. “Can Information Theory Explain Consciousness?” New York Review of Books, 10 January 2013(a).
    • Objects to IIT.
  • Searle, John. “Reply to Koch and Tononi.” New York Review of Books. 7 March 2013(b).
    • Rejoinder to Koch and Tononi’s response to his objections to IIT.
  • Tegmark, Max. “Consciousness as a State of Matter.” Chaos, Solitons & Fractals. 2015.
    • Proposes a re-conception of phi, at the quantum level.
  • Tononi, Giulio. “An Information Integration Theory of Consciousness.” BMC Neuroscience, 5:42. 2004.
    • The earliest explicit introduction to IIT.
  • Tononi, Giulio. “Consciousness as Integrated Information: A Provisional Manifesto.” Biological Bulletin, 215: 216–242. 2008.
    • An early overview of IIT.
  • Tononi, Giulio. “Integrated Information Theory.” Scholarpedia, 10 (1). 2015. http://www.scholarpedia.org/w/index.php?title=Integrated_information_theory&action=cite&rev=147165.
    • A thorough synopsis of IIT.
  • Tononi, Giulio, and Gerald Edelman. “Consciousness and Complexity.” Science, 282 (5395). 1998.
    • Anticipates some of IIT’s claims.
  • Tononi, Giulio, and Christof Koch. “Consciousness: Here, There and Everywhere?” Philosophical Transactions of the Royal Society B, 370 (1668). 2015. Doi: 10.1098/rstb.2014.0167
    • Perhaps the most accessible current introduction to IIT, from the perspective of its founder and one of its chief proponents.

 

Author Information

Francis Fallon
Email: Fallonf@stjohns.edu
St. John’s University
U. S. A.

Scientific Realism and Antirealism

Debates about scientific realism concern the extent to which we are entitled to hope or believe that science will tell us what the world is really like. Realists tend to be optimistic; antirealists tend not to be. To a first approximation, scientific realism is the view that well-confirmed scientific theories are approximately true; the entities they postulate do exist; and we have good reason to believe their main tenets. Realists often add that, given the spectacular predictive, engineering, and theoretical successes of our best scientific theories, it would be miraculous were they not to be approximately correct. This natural line of thought has an honorable pedigree yet has been subject to philosophical dispute since modern science began.

In the 1970s, a particularly strong form of scientific realism was advocated by Putnam, Boyd, and others. When scientific realism is mentioned in the literature, usually some version of this is intended. It is often characterized in terms of these commitments:

  • Science aims to give a literally true account of the world.
  • To accept a theory is to believe it is (approximately) true.
  • There is a determinate mind-independent and language-independent world.
  • Theories are literally true (when they are) partly because their concepts “latch on to” or correspond to real properties (natural kinds, and the like) that causally underpin successful usage of the concepts.
  • The progress of science asymptotically converges on a true account.

 

Table of Contents

  1. Brief History before the 19th Century
  2. The 19th Century Debate
    1. Poincaré’s Conventionalism
    2. The Reality of Forces and Atoms
    3. The Aim of Science: Causal Explanation or Abstract Representation?
  3. Logical Positivism
    1. General Background
    2. The Logical Part of Logical Positivism
    3. The Positivism Part of Logical Positivism
  4. Quine’s Immanent Realism
  5. Scientific Realism
    1. Criticisms of the Observational-Theoretical Distinction
    2. Putnam’s Critique of Positivistic Theory of Meaning
    3. Putnam’s Positive Account of Meaning
    4. Putnam’s and Boyd’s Critique of Positivistic Philosophy of Science
    5. Inference to the Best Explanation
  6. Constructive Empiricism
    1. The Semantic View of Theories and Empirical Adequacy
    2. The Observable-Unobservable Distinction
    3. The Argument from Empirically Equivalent Theories
    4. Constructive Empiricism, IBE, and Explanation
  7. Historical Challenges to Scientific Realism
    1. Kuhn’s Challenge
    2. Laudan’s Challenge: The Pessimistic Induction
  8. Semantic Challenges to Scientific Realism
    1. Semantic Deflationism
    2. Pragmatist Truth Surrogates
    3. Putnam’s Internal Realism
  9. Law-Antirealism and Entity-Realism
  10. NOA: The Natural Ontological Attitude
  11. The 21st Century Debates
    1. Structuralism
    2. Stanford’s New Induction
    3. Selective Realism
  12. References and Further Reading

1. Brief History before the 19th Century

The debate begins with modern science. Bellarmine advocated an antirealist interpretation of Copernicus’s heliocentrism—as a useful instrument that saved the phenomena—whereas Galileo advocated a realist interpretation—the planets really do orbit the sun. More generally, 17th century protagonists of the new sciences advocated a metaphysical picture: nature is not what it appears to our senses—it is a world of objects (Descartes’ matter-extension, Boyle’s corpuscles, Huygens’ atoms, and so forth) whose primary properties (Cartesian extension, or the sizes, shapes, and hardness of atoms and corpuscles, and/or forces of attraction or repulsion, and so forth) are causally responsible for the phenomena we observe. The task of science is “to strip reality of the appearances covering it like a veil, in order to see the bare reality itself” (Duhem 1991).

This metaphysical picture quickly led to empiricist scruples, voiced by Berkeley and Hume. If all knowledge must be traced to the senses, how can we have reason to believe scientific theories, given that reality lies behind the appearances (hidden by a veil of perception)? Indeed, if all content must be traced to the senses, how can we even understand such theories? The new science seems to postulate “hidden” causal powers without a legitimate epistemological or semantic grounding. A central problem for empiricists becomes that of drawing a line between objectionable metaphysics and legitimate science (portions of which seem to be as removed from experience as metaphysics seems to be). Kant attempted to circumvent this problem and find a philosophical home for Newtonian physics. He rejected both a veil of perception and the possibility of our representing the noumenal reality lying behind it. The possibility of making judgments depends on our having structured what is given: experience of x qua object requires that x be represented in space and time, and judgments about x require that x be located in a framework of concepts. What is real and judgable is just what is empirically real—what fits our system of representation in the right way—and there is no need for, and no possibility of, problematic inferences to noumenal goings-on. In pursuing this project Kant committed himself to several claims about space and time—in particular that space must be Euclidean, which he regarded as both a priori (because a condition of the possibility of our experience of objects) and synthetic (because not derivable from analytical equivalences)—which became increasingly problematic as 19th century science and mathematics advanced.

2. The 19th Century Debate

Many features of the contemporary debates were fashioned in 19th century disputes about the nature of space and the reality of forces and atoms. The principals of these debates—Duhem, Helmholtz, Hertz, Kelvin, Mach, Maxwell, Planck, and Poincaré—were primarily philosopher-physicists. Their separation into realists and antirealists is complicated, but Helmholtz, Hertz, Kelvin, Maxwell, and Planck had realist sympathies and Duhem, Mach, and Poincaré had antirealist doubts.

a. Poincaré’s Conventionalism

By the late 19th century several consistent non-Euclidean geometries, mathematically distinct from Euclidean geometry, had been developed. Euclidean geometry has a unique-parallel axiom and triangles whose angles sum to exactly 180º, whereas, for example, spherical geometry has a zero-parallel axiom and triangles whose angles sum to more than 180º. These geometries raise the possibility that physical space could be non-Euclidean. Empiricists think we can determine whether physical space is Euclidean through experiments. For example, Gauss allegedly attempted to measure the angles of a triangle between three mountaintops to test whether physical space is Euclidean. Realists think physical space has some determinate geometrical character even if we cannot discover what character it has. Kantians think that physical space must be Euclidean because only Euclidean geometry is consistent with the form of our sensibility.
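As an illustrative worked equation (added here for clarity; it is standard spherical geometry rather than part of the original exposition), Girard’s theorem makes explicit why an angle-sum measurement could, under suitable auxiliary assumptions, probe the geometry of space. For a triangle on a sphere of radius R with angles α, β, γ (in radians) and area A:

\[ \alpha + \beta + \gamma \;=\; \pi + \frac{A}{R^{2}} \]

The excess over π (that is, over 180º) grows with the triangle’s area and shrinks as R grows, so a triangle of mountaintop scale would show only a minute deviation even if space were curved.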

Poincaré (1913) argued that empiricists, realists, and Kantians are wrong: the geometry of physical space is not empirically determinable, factual, or synthetic a priori. Suppose Gauss’s experiment gave the angle-sum of a triangle as 180º. This would support the hypothesis that physical space is Euclidean only under certain presuppositions about the coordination of optics with geometry: that the shortest path of an undisturbed light ray is a Euclidean straight line. Instead, for example, the 180º measurement could also be accommodated by presupposing that light rays traverse shortest paths in spherical space but are disturbed by a force, so that physical space is “really” non-Euclidean: the true angle-sum of the triangle is greater than 180º, but the disturbing force makes it “appear” that space is Euclidean and the angle-sum of the triangle is 180º.

Arguing that there is no fact of the matter about the geometry of physical space, Poincaré proposed conventionalism: we decide conventionally that geometry is Euclidean, forces are Newtonian, light travels in Euclidean straight lines, and we see if experimental results will fit those conventions. Conventionalism is not an “anything-goes” doctrine—not all stipulations will accommodate the evidence—it is the claim that the physical meaning of measurements and evidence is determined by conventionally adopted frameworks. Measurements of lines and angles typically rely on the hypothesis that light travels shortest paths. But this lacks physical meaning unless we decide whether shortest paths are Euclidean or non-Euclidean. These conventions cannot be experimentally refuted or confirmed since experiments only have physical meaning relative to them. Which group of conventions we adopt depends on pragmatic factors: other things being equal, we choose conventions that make physics simpler, more tractable, more familiar, and so forth. Poincaré, for example, held that, because of its simplicity, we would never give up Euclidean geometry.

b. The Reality of Forces and Atoms

Ever since Newton, a certain realist ideal of science was influential: a theory that would explain all phenomena as the effects of moving atoms subject to forces. By the 1880s many physicists came to doubt the attainability of this ideal since classical mechanics lacked the tools to describe a host of terrestrial phenomena: “visualizable” atoms that are subject to position-dependent central forces (so successful for representing celestial phenomena) were ill-suited for representing electromagnetic phenomena, “dissipative” phenomena in heat engines and chemical reactions, and so forth. The concepts of atom and force became questionable. The kinetic theory of gases lent support to atomism, yet no consistent models could be found (for example, spectroscopic phenomena required atoms to vibrate while specific heat phenomena required them to be rigid). Moreover, intermolecular forces allowing for internal vibration and deformation could not be easily conceptualized as Newtonian central forces. Newtonian action-at-a-distance forces also came under pressure with the increasing acceptance of Maxwell’s theory of electromagnetism, which attributed electromagnetic phenomena to polarizations in a dielectric medium propagated by contiguous action. Many thought that physics had become a disorganized patchwork of poorly understood theories, lacking coherence, unity, empirical determinacy, and adequate foundations. As a result, physicists became increasingly preoccupied with foundational efforts to put their house in order. The most promising physics required general analytical principles (for example, conservation of energy and action, Hamilton’s principle) that could not be derived from Newtonian laws governing systems of classical atoms. The abstract concepts (action, energy, generalized potential, entropy, absolute temperature) needed to construct these principles could not be built from the ordinary intuitive concepts of classical mechanics. They could, however, be developed without recourse to “hidden mechanisms” and independently of specific hypotheses about the reality underlying the phenomena. Most physicists continued to be realists: they believed in a deeper reality underlying the phenomena that physics can meaningfully investigate; for them, the pressing foundational problem was to articulate the concepts and develop the laws that applied to that reality. But some physicists became antirealists. Some espoused local antirealism (antirealist about some kinds of entities, as Hertz (1956) was about forces, while not espousing antirealism about physics generally).

c. The Aim of Science: Causal Explanation or Abstract Representation?

Others espoused global antirealism. Like contemporary antirealists, they questioned the relationship among physics, common sense and metaphysics, the aims and methods of science, and the extent to which science, qua attempt to fathom the depth and extent of the universe, is bankrupt. While their realist colleagues hoped for a unified, explanatorily complete, fundamental theory as the proper aim of science, these global antirealists argued on historical grounds that physics had evolved into its current disorganized mess because it had been driven by the unattainable metaphysical goal of causal explanation. Instead, they proposed freeing physics from metaphysics, and they pursued phenomenological theories, like thermodynamics and energetics, which promised to provide abstract, mathematical organizations of the phenomena without inquiring into their causes. To justify this pursuit philosophically, they proposed a re-conceptualization of the aim and scope of physics that would bring order and clarity to science and be attainable. The proposed aims varied: economy of thought (science is a useful instrument without literal significance (Mach 1893)); the discovery of real relations between hidden entities underlying the phenomena (Poincaré 1913); or the discovery of a “natural classification” of the phenomena (a mathematical organization of the phenomena that is the reflection of a hidden ontological order (Duhem 1991)). These affinities between 19th century global antirealism and 20th century antirealism mask fundamental differences. The former is driven by methodological considerations concerning the proper way to do physics whereas the latter is driven by traditional metaphysical or epistemological concerns (about the meaningfulness and credibility of claims about goings-on behind the veil of appearances).

3. Logical Positivism

Logical positivism began in Vienna and Berlin in the 1910s and 1920s and migrated to America after 1933, when many of its proponents fled Nazism. The entire post-1960 conversation about scientific realism can be viewed as a response to logical positivism. More a movement than a position, the positivists adopted a set of philosophical stances: pro-science (including pro-verification and pro-observation) and anti-metaphysics (including anti-cause, anti-explanation, anti-theoretical entities). They are positivists because of their pro-science stance; they are logical positivists because they embraced and used the formal logic techniques developed by Frege, Russell, and Wittgenstein to clarify scientific and philosophical language.

a. General Background

As physics developed in the early 20th century, many of the 19th century methodological worries sorted themselves out: Perrin’s experiments with Brownian motion persuaded most of the reality of atoms; special relativity unified mechanics and electromagnetism and signaled the demise of traditional mechanism; general relativity further unified gravity with special relativity; quantum mechanics produced an account of the microscopic world that allowed atoms to vibrate and was spectacularly supported empirically. Moreover, scientific developments undermined several theses formerly taken as necessarily true. Einstein’s famous analysis of absolute simultaneity showed that Newtonian absolute space and time were incorrect and had to be replaced by the space-time structure of Special Relativity. His Theory of General Relativity introduced an even stranger notion of space-time: a space-time with a non-Euclidean structure of variable curvature. This undermined Kant’s claims that space has to be Euclidean and that there is synthetic a priori knowledge. Moreover, quantum mechanics, despite its empirical success, led to its own problems, since quantum particles have strange properties—they cannot have both determinate position and momentum at a given time, for example—and the quantum world has no unproblematic interpretation. So, though everyone was converted to atomism, no one understood what atoms were.

Logical positivism developed within this scientific context. Nowadays the positivists are often depicted as reactionaries who developed a crude, ahistorical philosophical viewpoint with pernicious consequences (Kuhn 1970, Kitcher 1993). In their day, however, they were revolutionaries, attempting to come to grips with the profound changes that Einstein’s relativity and Bohr’s quantum mechanics had wrought on the worldview of classical physics and to provide firm logical foundations for all science.

Logical positivism’s philosophical ancestry used to be traced to Hume’s empiricism (Putnam 1962, Quine 1969). On this interpretation, the positivist project provides epistemological foundations for problematic sentences of science that purport to describe unobservable realities, such as electrons, by reducing sentences employing these concepts to unproblematic sentences that describe only observable realities. Friedman (1999) offers a different Kantian interpretation: their project provides objective content for science, as Kant had attempted, by showing how it organizes our experience into a structured world of objects, but without commitment to scientifically outdated aspects of Kant’s apparatus, such as synthetic a priori truths or the necessity of Euclidean geometry. Whichever interpretation is correct, the logical positivists clearly began with traditional veil-of-perception worries (§1) and insisted on a distinction that both Hume and Kant advocated—between meaningful science and meaningless metaphysics.

b. The Logical Part of Logical Positivism

This distinction rests on their verificationist theory of meaning, according to which the meaning of a sentence is its verification conditions and understanding a sentence is knowing its verification conditions. For example, knowing the meaning of “This is blue” is being able to pick out the object referred to by “this” and to check that it is blue. While this works only for simple sentences built from terms that directly pick out their referents and predicates with directly verifiable content, it can be extended to other sentences. To understand “No emerald is blue” one need only know the verification conditions for “This is an emerald”, “This is blue” and the logical relations of such sentences to “No emerald is blue” (for example, that “no emerald is blue” implies “if this is an emerald, then this is not blue”, and so forth). Simple verification conditions plus some logical knowledge buys a lot. But it does not buy enough. For example, what are the verification conditions expressed by “This is an electron”,  where “this” does not pick out an ostendible object and where “is an electron” does not have directly verifiable content?

To deal with this, the positivists, especially Carnap, hit upon an ingenious program. First, they distinguished two kinds of linguistic terms: observational terms (O-terms), like “is blue”, which have relatively unproblematic, directly verifiable content, and theoretical terms (T-terms), like “is an electron”, which have more problematic content that is not directly verifiable. Second, they proposed to indirectly interpret the T-terms, using logical techniques inherited from Frege and Russell, by deductively connecting them within a theory to the directly interpreted O-terms. If each T-term could be explicitly defined using only O-terms, just as “x is a bachelor” can be defined as “x is an unmarried male human”, then one would understand the verification conditions for a T-term just by understanding the directly verifiable content of the O-terms used to define it, and a theory’s theoretical content would be just its observational content.

Unfortunately, the content of “is an electron” is open-ended and outstrips observational content so that no explicit definition of it in terms of a finite list of O-terms can be given in first-order logic. From the 1930s to the 1950s, Carnap (1936, 1937, 1939, 1950, 1956) struggled with this problem by using ever more elaborate logical techniques. He eventually settled for a less ambitious account: the meaning of a T-term is given by the logical role it plays in a theory (Carnap 1939). Although T-terms cannot be explicitly defined in first-order logic, the totality of their logical connections within the theory to other T-terms and O-terms specifies their meaning. Intuitively, the meaning of a theoretical term like “electron” is specified by: “electron” means “the thing x that plays the Θ-role”, where Θ is the theory of electrons. This idea can be rendered precisely in second-order logic by a “Ramseyified” definition: “electron” means “the thing x such that Θ(x)”, where “Θ(x)” is the result of taking the theory of electrons Θ (understood as the conjunction of a set of sentences) and replacing all occurrences of “is an electron” with the (second-order) variable “x” (Lewis 1970).
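Schematically (a compact restatement added for clarity, using the notation just introduced), the Lewis-style recipe separates the theory’s factual claim from the definition of the term:

\[ \text{Ramsey sentence: } \exists x\, \Theta(x) \qquad\qquad \text{“electron”} =_{df} \text{the unique } x \text{ such that } \Theta(x) \]

Roughly, the Ramsey sentence carries the theory’s factual content, while the definitional clause is analytic; that division matters for the analytic-synthetic distinction discussed in the next paragraph.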

Two features of this theory of meaning lay groundwork for later discussion. First, the meaning of any T-term is theory-relative since it is determined by the term’s deductive connections within a theory. Second, the positivists distinguished analytic truths (sentences true in virtue of meaning) and synthetic truths (sentences true in virtue of fact). “All bachelors are unmarried” and “All electrons have the property of being the x such that Θ(x)” are analytic truths, whereas “Kant was a bachelor” and “Electrons exist” are synthetic truths. The positivists inherited this distinction from Kant, but, unlike Kant, they rejected synthetic a priori truths. For them, there are only analytic a priori truths (all pure mathematics, for example) and synthetic a posteriori truths (all statements to the effect that a given claim is verified).

c. The Positivism Part of Logical Positivism

The positivists distinguished legitimate positive science, whose aim is to organize and predict observable phenomena, from illegitimate metaphysics, whose aim is to causally explain those phenomena in terms of underlying unobservable processes. We should restrict scientific attention to the phenomena we can know and banish unintelligible speculation about what lies behind the veil of appearances. This distinction rests on the observational-theoretical distinction (§3b): scientific sentences (even theoretical ones like “Electrons exist”) have meaningful verifiable content; sentences of metaphysics (like “God exists”) have no verifiable content and are meaningless.

Because of their hostility to metaphysics, the positivists “diluted” various concepts that have a metaphysical ring. For example, they replaced explanations in terms of causal powers with explanations in terms of law-like regularities so that “causal” explanations become arguments. According to the deductive-nomological (DN) model of explanation, pioneered by Hempel (1965), “Event b occurred because event a occurred” is elliptical for an argument like: “a is an event of kind A, b is an event of kind B, and if any A-event occurs, a B-event will occur; a occurred; therefore b occurred”. The explanandum logically follows from the explanantia, one of which is a law-like regularity.
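A minimal schematic instance of the DN model (added for illustration; the predicate letters are placeholders rather than the article’s own example): take the law “every A-event is followed by a B-event” together with the particular fact that a is an A-event; the explanandum then follows deductively from the explanantia:

\[ \forall x\,(Ax \rightarrow Bx), \quad Aa \;\;\vdash\;\; Ba \]

The first premise plays the role of the law-like regularity, the second the initial condition, and the conclusion the event to be explained.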

Because they advocated a non-literal interpretation of theories, the positivists are considered to be antirealists. Nevertheless, they do not deny the existence or reality of electrons: for them, to say that electrons exist or are real is merely to say that the concept electron stands in a definite logical relationship to observable conditions in a structured system of representations. What they deny is a certain metaphysical interpretation of such claims—that electrons exist underlying and causing but completely transcending our experience. It is not that physical objects are fictions; rather, all there is to being a real physical object is its empirical reality—its system of relations to verifiable experience.

4. Quine’s Immanent Realism

Quine, an early critic of logical positivism, nevertheless endorsed the positivists’ rejection of transcendental questions such as “Do electrons really exist (as opposed to being just useful fictions)?” Our evidence for molecules is similar to our evidence for everyday bodies, he argued; in each case we have a theory that posits an arrangement of objects that organizes our experience in a way that is simple, familiar, predictive, covering, and fecund. This is just what it is to have evidence for something. So, if we have such an organizing theory for molecules, then we can no more doubt the existence of molecules than we can doubt the existence of ordinary physical bodies (Quine 1955). Quine thus arrived at a realism not unlike the empirical realism of the logical positivists.

However, Quine rejected their theory of meaning and its central analytic-synthetic distinction, arguing that theoretical content cannot be analytically welded to observational content. The positivists, he argued, confuse the event of positing with the object posited. Yes, scientists conventionally introduce posits (an event) as Stoney introduced the term “electron” in 1894: “electron” means “the fundamental unit of electric charge that permanently attaches to atoms”. But no, scientists do not treat the conventions as analytic truths that cannot be revised without a change of meaning. Scientists did not treat Stoney’s definition as binding analytic truth and “Electrons exist” as a synthetic hypothesis whose truth must be verified. More generally, Quine argued, once the explicit definitional route had failed and Carnap allowed the meaning of “electron” to be a function of the totality of its logical connections within a theory, Carnap had already adopted meaning holism, according to which one cannot separate the analytic sentences, whose truth-values are determined by the contribution of language, from the synthetic sentences, whose truth-values are determined by the contribution of fact.

Quine accepted meaning holism together with another thesis, epistemological holism, a doctrine often called “the Quine-Duhem Thesis”, because Duhem used it to argue against Poincaré’s conventionalism. The Quine-Duhem thesis says that only a group of hypotheses can be falsified because only a group of hypotheses has observational consequences. If a single hypothesis, H, implies an observational consequence O and we get evidence for not-O, then we can deduce not-H. But a single hypothesis will typically not imply any observational consequence. Take, for example, Gauss’s supposed mountaintop triangulation experiment to test whether space is Euclidean (§2a). Let H = “Space is Euclidean” and O = “The measured angle-sum of the triangle equals 180º”. Clearly H does not entail O without auxiliary assumptions: for example, A1 = “Light travels the shortest Euclidean paths”, A2 = “No physical force appreciably disturbs the light”, A3 = “The triangle is large enough for deviations from rectilinear paths to be experimentally detectable”, and so forth. Consequently, if the experiment yields not-O = “The measured angle-sum of the triangle is not equal to 180º”, we cannot deduce not-H = “Space is non-Euclidean”. We can only deduce not-(H and A1 and A2 and A3 and so forth); that is, we can only deduce that one or more of the hypothesis and the auxiliary assumptions is false—perhaps space is Euclidean but some force is distorting the light paths to make it look non-Euclidean. Poincaré and the positivists reply that it is conventional or analytic that space is Euclidean; there is no fact of the matter. In rejecting conventionalism, Duhem and Quine claim that we may keep H and reject one of the Ai to accommodate not-O: any statement may be held true in light of disconfirming experience. [It is misleading, however, to call epistemological holism “the Quine-Duhem thesis”. For Duhem, epistemological holism holds only for physical theories for rather special reasons; it does not extend to mathematics or logic and is not connected with theses about meaning. Quine extends epistemological holism from physics to all knowledge, including all knowledge traditionally regarded as a priori, including allegedly analytic statements.] Quine, but not Duhem, believed that our reluctance to revise mathematics and logic (because of their centrality to our belief-systems) does not entail their a prioricity (irrevisability based on evidence).
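The logical point can be put in a single line (a schematic restatement of the paragraph above, not an additional claim): what the negative result refutes is only the conjunction.

\[ (H \wedge A_1 \wedge A_2 \wedge A_3) \rightarrow O, \quad \neg O \;\;\vdash\;\; \neg(H \wedge A_1 \wedge A_2 \wedge A_3) \]

And ¬(H ∧ A1 ∧ A2 ∧ A3) is compatible with H being true while one of the auxiliary assumptions is false, which is why any of them, rather than H, may be given up.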

Moreover, if the analytic-synthetic distinction collapses, so too does the positivist separation of metaphysics from science. For Quine, metaphysical questions are just the most general and abstract questions we ask and are decided on the grounds we use to decide whether electrons exist. All questions are “internal” in the sense that they must be formulated in our home language and answered with our standard procedures for gathering and weighing evidence. In particular, questions about the reality of some putative objects are to be answered in terms of whether they contribute to a useful organization of experience and whether they withstand the test of experience.

5. Scientific Realism

In the 1970s, a particularly strong form of scientific realism (SR) was advocated by Putnam, Boyd, and others (Boyd 1973, 1983; Putnam 1962, 1975a, 1975b). When scientific realism is mentioned in the literature, usually some version of SR is intended. SR is often characterized in terms of two commitments (van Fraassen 1980):

SR1     Science aims to give a literally true account of the world.

SR2     To accept a theory is to believe it is (approximately) true.

However, scientific realists’ arguments and their interpretation of SR1 and SR2 often presuppose further commitments:

SR3     There is a determinate mind-independent and language-independent world.

SR4     Theories are literally true (when they are) partly because their concepts “latch on to” or correspond to real properties (natural kinds, and the like) that causally underpin successful usage of the concepts.

SR5     The progress of science asymptotically converges on a true account.

a. Criticisms of the Observational-Theoretical Distinction

Critics of positivism argued that there is no workable, well-motivated distinction between observational and theoretical vocabulary that would make the former unproblematic and the latter problematic (for example, Putnam 1962; Maxwell 1962; van Fraassen 1980). First, O-terms apply to apparently theoretical entities (for example, red corpuscle) and T-terms apply to apparently observable entities (for example, the moon is a satellite). Second, if T-terms were epistemologically or semantically problematic, that would have to be due to the unobservable nature of their referents. But in the continuous gradation between seeing with the unaided eye, with binoculars, with an optical microscope, with an electron microscope, and so on, there is no sharp cut-off between being observable and being unobservable where we could non-arbitrarily say: beyond this we cannot trust the evidence of our senses or apply terms with confidence. Third, the “able” in “observable” cannot be specified in a way that motivates a plausible distinction. Most “theoretical” entities can be detected (like electrons) with scientific instruments or theoretically calculated (like lunar gravity). The positivist may respond that they cannot be directly sensed, and are thus unobservable, but why should being directly sensed be the criterion for epistemological or semantic confidence? Fourth, observation is theory-infected: what we can both observe and employ as evidence is a function of the language, concepts, and theories we possess. A primitive Amazonian may observe a tennis ball (he notices it), but without the relevant concepts he cannot use it as evidence for any claims about tennis. Such arguments undermine a central distinction of the positivist program.

b. Putnam’s Critique of Positivistic Theory of Meaning

Putnam (1975a, 1975b) provides a general argument against all theories of meaning (Frege, Russell, Carnap, Kuhn), including positivist theories, which are classical in the relevant sense. Classical concepts have two characteristics: they determine their extensions in the world, and we can “grasp” them. To know the meaning of a directly interpretable O-term is to associate it with a concept (verification condition) which determines the term’s extension. In turn, to know the meaning of an indirectly interpretable T-term is to know its logical connections to directly interpretable terms. These two features of the classical view are:

(1)  To know the meaning of F is to be in a certain psychological state (of grasping F’s associated concept and knowing it is the meaning of “F”);

(2)  The meaning of F determines the extension of F in the sense that, if two terms have the same meaning, they must have the same extension.

If the meaning of “water” is the concept the clear, tasteless, potable, nourishing liquid found in lakes and rivers, then by (1) I must associate that concept with “water” if I’m to know its meaning and by (2) something will be water just in case it satisfies that concept.

Putnam’s famous Twin Earth argument (Putnam 1975b) is intended to show that all classical theories fail because (1) and (2) are not co-tenable. Suppose the year is 1740 when speakers did not know that water is H2O. Suppose too that another planet, Twin-Earth, is just like Earth except that a different liquid, whose chemical nature is XYZ, is the clear, tasteless, potable, nourishing liquid found in lakes and rivers. Suppose finally that Earthling Oscar and Twin-Earthling Twin-Oscar are duplicates and share the very same internal psychological states so that Oscar thinks “water is the clear, tasteless, potable, nourishing liquid found in lakes and rivers” if and only if Twin-Oscar thinks “water is the clear, tasteless, potable, nourishing liquid found in lakes and rivers”. In other words, they grasp the same meaning and associate it with the word “water”; (1) is satisfied. But then (2) cannot be satisfied: meaning does not determine extension because the extension of “water” (in English) = H2O yet XYZ = the extension of “water” (in Twin-English). If (1), then not-(2). Conversely, if meaning does determine extension, then since the extension of “water” (on Earth) is not the extension of “water” (on Twin-Earth), Oscar and Twin-Oscar must associate different meanings with the term. Consequently, either (1) or (2) must go. Putnam keeps (2) and revises (1).

c. Putnam’s Positive Account of Meaning

How is extension determined, if not classically? Putnam develops a causal-historical account of reference for natural kind terms (“water”) and physical magnitude terms (“temperature”). Think of these terms being introduced into the language via an introducing event or baptism. The introducer points to an object (or phenomenon) and intones: “let ‘t’ apply to all and only objects that are relevantly similar (same kind, same magnitude) to this sample (or to whatever is the cause of this phenomenon)”. Later t-users learn conditions that normally pick out the referent of t, use these conditions to triangulate their usage with that of others and with extra-linguistic conditions, and intend their t-utterances to conform to the t-practices initiated in the introducing event. The term passes through the community so that reference is preserved. Then, on Putnam’s view, the extension of the term is part of the meaning of the term, the kind or magnitude that the term “locked on to” in the course of its introduction and historical development. So H2O is part of the English meaning of “water” and (2) is satisfied: meaning determines extension since extension is part of the meaning. This gives an intuitively plausible reading of the Twin-Earth scenario: Oscar is talking about water (H2O) and Twin-Oscar is talking about Twin-water (XYZ).

On classical accounts, a speaker S correctly uses a term “t” to refer to an object x only if x uniquely satisfies a concept, description, verification procedure, or theory that S associates with “t”. In the 1740s English-speakers lacked such uniquely identifying knowledge, though we would naturally say they were using “water” as we do — to refer to H2O. On Putnam’s account, S correctly uses t to refer to x only if S is a member of a linguistic community whose t-usage (via their linguistic and extra-linguistic interactions) is causally or historically tied to the things or stuff that are of the same kind as x. Realistic semantics ties correct usage to things in the world using causal relations. Because truth is defined in terms of reference (for example, “a is F” is true if and only if the referent of “a” has the property expressed by “F”), truth on Putnam’s account is also a causal notion.

We now see why SR is committed to SR3 and SR4 above. Clearly SR1 requires SR3: science can aim at a literally true account of the world only if the world is some determinate way that an account can be literally true of. But Putnam’s semantics requires more: that there be natural kinds and magnitudes that our terms lock onto, which is SR4. Note SR5 also seems to require SR3 and SR4. To many realists who accept SR3, SR4 seems extravagant and mysterious. Natural kinds seem to be an unnecessary traditional philosophical apparatus imposed on realism without the support of, and indeed undermined by, science. Our best science suggests that natural kinds do not exist: water, for example, is not a simple natural kind, H2O, but a more complicated structure of constantly changing polymeric variations, and biological species are anything but simple kinds. And even if there were natural kinds, it seems unreasonable to expect that language could neatly lock onto them: why should our accidental encounters with various samples in our limited part of the universe put us in a position to lock onto universal kinds? Continuity of reference of the kind advocated by Putnam may be too crude. More fine-grained accounts have been proposed (Kitcher 1993; Wilson 1982, 2006) which acknowledge the complicated evolution of science and language yet avoid metaphysical extravagance.

d. Putnam’s and Boyd’s Critique of Positivistic Philosophy of Science

A common argument for SR is the following:

  1. An acceptable philosophy of science should be able to explain standard scientific practice and its instrumental success.
  2. Only SR can explain standard scientific practice and its instrumental success.
  3. Thus SR is the only acceptable philosophy of science.

This is an instance of inference to the best explanation (§5e). Here we look at premise 2, which follows logically from:

2a. There are only two contending explanations: SR and Idealism.

2b. Idealism fails to explain the practice and its success, while SR succeeds.

Premise 2a: For Putnam the distinction between realism and idealism is fundamentally semantic. In realist (or externalist) semantics the world leads and content follows: content is determined causally and historically by the way the world is; the content of “water” is H2O. In idealist (or internalist) semantics content leads and the world follows: the world is whatever satisfies the descriptive content of our thoughts; the content of “water” is the clear, tasteless, potable, nourishing liquid found in lakes and rivers. Idealism is a blanket category covering any account of meaning (including positivist and Kuhnian and pragmatist accounts (§§7-8)) in the family of classical theories (§5b).

Premise 2b: Idealism fails to explain scientific practice and success in several ways: (i) For the positivist, “Electrons exist” means “Θi implies ‘electrons exist’ and Θi is observationally correct” and “‘electron’ refers to x” means “x is a member of the kind X such that Θi(X)” (§3b). Existence, reference, and truth are all theory-relative. Take “electron” in Thomson’s 1898 theory, in Bohr’s 1911 theory, and in full quantum theory (late 1920s). Since the meaning of “electron” changes from theory to theory and meaning determines reference, the referent of “electron” changes from theory to theory. So, Thomson, early Bohr, later Bohr, Heisenberg, and Schrödinger were (a) talking about a different entity and (b) changing the meaning of “electron”. Putnam argues that this is a bizarre re-description of what we would normally say: they were (a) talking about the same entity and (b) making new discoveries about it. By contrast, realist truth and reference are trans-theoretic: once “electron” was introduced into the language by Stoney, it causally “locked onto” the property being an electron; then the various theorists were talking about that entity and making new discoveries about it. So realism, unlike positivism, saves our ordinary ways of talking and acting.

(ii) The conjunction objection: in practice we conjoin theories we accept. Realist truth has the right kind of properties, such as closure under the logical operation of conjunction (if T1 is true and T2 is true, then (T1 and T2) is true), to underwrite this conjunction practice. But positivist surrogates for truth, reference, and acceptance cannot underwrite this practice. From “T1 is observationally correct” and “T2 is observationally correct”, it does not follow that (T1 and T2) is observationally correct—their theoretical parts could contradict each other, for example, so that their conjunction would imply all observational sentences, both true and false. Again realism, but not positivism, succeeds. Similarly, the practice of conjoining auxiliary hypotheses with a theory to extend and test the theory cannot be accounted for by positivism. In {Newton’s theory of gravitation + there is no transneptunian planet}, “gravitation” has one meaning; in {Newton’s theory of gravitation + there are transneptunian planets}, it has another meaning. But the discovery that the latter was true and the former false should not be described as a change of meaning or reference of the word “gravitation”. Again realism succeeds where positivism fails.
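Schematically (an added illustration; “OC”, abbreviating “is observationally correct”, is a placeholder label rather than the article’s notation): truth is closed under conjunction, but observational correctness is not.

\[ \mathrm{True}(T_1) \wedge \mathrm{True}(T_2) \;\Rightarrow\; \mathrm{True}(T_1 \wedge T_2) \qquad\qquad \mathrm{OC}(T_1) \wedge \mathrm{OC}(T_2) \;\not\Rightarrow\; \mathrm{OC}(T_1 \wedge T_2) \]

The right-hand failure occurs because T1 and T2 may contradict each other in their theoretical parts, in which case their conjunction entails every observational sentence, true and false alike.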

(iii) The No-Miracles Argument (NMA): everyone agrees that science is instrumentally successful and increasingly so. Scientists believe that newly proposed theories stand a better chance of success if they resemble current successful theories or if they are tested by methods informed by such theories, and they construct scientific instruments, experiments, and applications relying on current theories. Moreover, scientists are getting better at doing this—consider improvements in microscopy over the past three centuries. Their actions are successful and rely on their beliefs that current theories can be depended upon to produce a likelihood of success. These successes are a miracle on positivist principles. Why should reliance on observationally correct theories be expected to produce success, unless we believe what they say about unobservables? In contrast, SR explains these successes: scientists’ actions rely upon their belief that the theories they use are approximately true; those actions have a high degree of success; the best explanation of their success is that the theories relied upon are approximately true.

e. Inference to the Best Explanation

Argument 1-3 (§5d) is an instance of inference to the best explanation (IBE), an inferential principle that realists endorse and antirealists reject. IBE is the rule that we should infer the truth of the theory (if there is one) that best explains the phenomena. Thus we should infer SR because it best explains scientific practice and its instrumental success.

First, a few clarifications of IBE are in order. If IBE is to be non-trivial, “best” cannot simply mean “antecedently most likely”, since of course we should infer the truth of the most likely explanation. Rather the best explanation must be characterized in terms of properties like “loveliest” or “most explainey” (Lipton 2004). Traditional examples of such properties are: it has wide scope and precision; it appeals to plausible mechanisms; it is simple, smooth, elegant, and non-ad hoc; and it underwrites contrasts (why this rather than that). Then IBE says we should accept the theory that optimizes such explanatory virtues when explaining the phenomena. The caveat “if there is one” blocks inferences to the best of a bad lot: the best explanation may not reach a minimally acceptable threshold. Finally, like any inferential principle that amplifies our knowledge, conclusions inferred by IBE are fallible: while they are more likely to be true, they could be false. Second, the “justification” for IBE is two-fold. (1) It is needed for science. Simple enumerative induction (which entitles us to move probabilistically from “All observed As are Bs” to “All As are Bs”) cannot handle inferences from observed phenomena to their “hidden” causes. For example, we cannot inductively infer “Galaxy X is receding” from “Light from Galaxy X is red-shifted”, but we can infer by IBE that Galaxy X is receding because that is the best explanation of why its light is red-shifted. More strongly, Harman (1965) argues that IBE is needed to warrant straight enumerative induction: we are entitled to make the induction from “All observed As are Bs” to “All As are Bs” only if “All As are Bs” provides the best explanation of our total evidence. (2) Scientific uses of IBE are grounded in, and are just sophisticated applications of, a principle we use in everyday inferential practice. If I see nibbled cheese and little black deposits in my kitchen and hear scratching noises in the walls, I reasonably infer that I have mice, because that best explains my evidence. IBE thus needs no more justification than does modus ponens—each is part of the very practices that constitute what rational inference is.

Realists employ IBE at different levels. At the ground-level, they observe surprising regularities like the phenomenological gas laws relating pressure, temperature, and volume. These cannot be just cosmic coincidences. Realists argue that observed gas behavior is as it is because of underlying molecular behavior; we have reason to believe the molecular hypothesis (by IBE) because it best explains the observed gas behavior. At this level, antirealist rejections of IBE seem stretched: it seems unsatisfactory to say either that we do not need an explanation (since it appears to be a guiding aim of inquiry to explain regularities where possible) or that observed gas behavior is as it is because gases behave as if they are composed of molecules (since ordinary and scientific practice distinguishes genuine explanations from just-so stories).

Realists also employ IBE at a meta-level (§5d): we should be realists about our current theories because only realism can explain how our methodological reliance on them leads to the construction of empirically successful theories (Boyd) or only realism can explain the way in which scientific theories succeed each other and the methodological constraints scientists impose on themselves when constructing new theories (Putnam). Relativity theorists felt bound to have Newton’s theory derivable in the limit from Einstein’s theory. Why? The realist answer is: “because a partially correct account of a theoretical object (as the gravitational field) must be replaced by a better account of the same theory-independent object (as the metric structure of spacetime)”. Similarly, realists claim that scientific progress is best explained by SR5, the thesis that science is converging on a true account of the world. As Putnam says, realism is the only hypothesis that does not make the success of science a miracle. At the meta-level, the alleged phenomenon is that our best scientific traditions and theories are instrumentally and methodologically successful; SR is alleged to be the best (or only) explanation of that phenomenon; thus we should infer SR. As we will see (§§6d, 7, 11b), it is not clear that these uses of IBE are legitimate, because the alleged phenomenon itself is questionable, or the SR-“explanation” does not explain, or no explanation may be needed, or alternative antirealist explanations may be better.

6. Constructive Empiricism

Van Fraassen (1980) proposed constructive empiricism (CE), arguing that we can preserve the epistemological spirit of positivism without subscribing to its letter. Van Fraassen’s is an antirealism concerning unobservable entities. Recognizing the difficulties of basing antirealism on a “broken-backed” linguistic distinction between O-terms and T-terms, he allows our judgments about unobservables to be literally construed but, he argues, our evidence can never entitle us to our beliefs about unobservables. CE is consistent with SR3 and SR4 (though it does not commit to them, it has no quarrels with realist objectivity or semantics) but replaces SR1, SR2, and SR5 respectively with:

CE1     Science aims to provide empirically adequate theories of the phenomena.

CE2     To accept a theory is to believe it is empirically adequate, but acceptance has further non-epistemic/pragmatic features.

CE5     The progress of science produces increasing empirical adequacy.

A theory T is empirically adequate if and only if what T says about all actual observable things and events is true (that is, T saves all the phenomena, or T has a model that all actual phenomena fit in). Empirical adequacy is logically weaker than truth: T’s truth entails its empirical adequacy but not conversely. But it is still quite strong: an empirically adequate theory must correctly represent all the phenomena, both observed and unobserved. CE2 distinguishes epistemic and pragmatic aspects of acceptance. Epistemic acceptance is belief; beliefs are either true or false. Pragmatic acceptance involves non-epistemic commitments to use the theory in certain ways (basing research, experiments, and explanations on it, for example); commitments are neither true nor false; they are either vindicated or not. CE5 acknowledges that there is instrumental progress without trying to explain it. CE concedes a realist semantics (“electron”-talk is not highly derived talk about observables) but preserves the spirit of positivism by recommending agnosticism about a theory’s literal claims about unobservables.

a. The Semantic View of Theories and Empirical Adequacy

On the positivist view, a theory T is a syntactic object: T is the set of theorems in a language generated from a set of axioms (the laws of T) and derivation rules. The empirical content (the entire literal content) of T is T/O, the theorems expressible in the observational vocabulary. A theory T is empirically (observationally) adequate if every sentence in T/O is true. Two theories, T and T’, are empirically (observationally) equivalent if T/O = T’/O. Since such theory pairs have the same literal content and differ only in their non-literal, theoretical content, they are merely inter-definable variants of a common observational basis: they say the same thing but express it differently. There is no fact of the matter whether T or T’ is true (both are or neither are), and whether we work with T or T’ is purely a pragmatic matter concerning which is simpler, more convenient, and so forth. For SR and CE there is a fact of the matter: at most one of T, T’ can be true. For SR there may be reasons to believe one of T, T’. For CE there can be no epistemic reason to believe one over the other, though there may be pragmatic reasons to accept (commit to using) one over the other. Van Fraassen needs a different account of theories if he is to agree with realists about literal content and there being a fact of the matter about empirically equivalent theories.

For him, a theory T is a semantic object, the class of models, A = <D, R1, R2, …, Rn>, that satisfy its laws (where D is a set of objects and Ri are properties and relations defined on them). For example, D might contain billiards and molecules; the property is elastic in A might be instantiated by both billiards and molecules, is a molecule by some members of D, and is a billiard ball by others. Now let A’ = <D’, R’1, R’2, …, R’m> (where m < n, D’ is a proper subset of D, and R’i = Ri/D’ (Ri restricted to D’)). Intuitively A’ is obtained from A by removing all unobservables, so D’ would contain billiard balls but not molecules, is elastic would now be restricted to billiard balls, is a molecule would not be instantiated, and so forth. Then A’ is an empirical substructure of A, the result of restricting the original domain to observables and its properties and relations accordingly. T is empirically adequate if and only if T has an empirical substructure that all observables fit in. Two theories, T and T’, are empirically equivalent if all the observables in a model of T are isomorphic to the observables in a model of T’. Such theory pairs agree in what they say about observables but may disagree in what they say about unobservables. Thus CE can agree with SR that at most one of T, T’ can be true and to be a realist about that theory is to believe it is true (SR2). Yet CE can preserve the spirit of positivism by holding that we can never have reason to believe a theory; at most we have reason to believe it is empirically adequate. Friedman (1982) questions whether van Fraassen achieves this.
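A toy instance (purely illustrative; the elements are invented to match the paragraph’s own example): let a model of T be A = <D, is elastic, is a molecule> with D = {b1, b2, m1}, where b1 and b2 are billiard balls and m1 is a molecule, is elastic holding of all three and is a molecule holding only of m1. Restricting to observables gives the empirical substructure A’ = <{b1, b2}, is elastic restricted to {b1, b2}, the empty relation>. T is then empirically adequate just in case all the actual observable objects and events can be mapped isomorphically into some such empirical substructure of one of T’s models.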

b. The Observable-Unobservable Distinction

Since CE recommends agnosticism about unobservables but permits belief about observables, the policy requires an epistemologically principled distinction between the two. Though rejecting the positivists’ distinction between T-terms and O-terms, van Fraassen defends a distinction between observable and unobservable objects and properties, a distinction that grounds his policy of agnosticism concerning what science tells us about unobservables. There is a fact of the matter about what is observable-for-humans: given the nature of the world and of the human sensory apparatus, some objects/events/properties possess the property is observable-for-humans; others lack that property; the former are observables, the latter unobservables. For example, Jupiter’s moons are observable because a human could travel close enough to see them unaided, but electrons are unobservable because a human could never see one (that is just the nature of humans and electrons). Van Fraassen also claims that the limits of observation are disclosed by empirical science and not by philosophical analysis—what is observable is simply a fact disclosed by science. It should be noted that the distinction, as he draws it, has no a priori ontological implications: flying horses are observable but do not exist; electrons may exist but are unobservable.

Critics (Churchland 1985; Musgrave 1985; Fine 1986; Wilson 1985) complain that this distinction cannot ground a sensible epistemological policy. First, van Fraassen runs together different notions, none of which has special epistemological relevance. What is observable is variously taken as: what is detectable by human senses without instruments (Jupiter’s moons); what can be “directly” measured as opposed to “indirectly” calculated; what is detectable by humans-qua-natural-measuring-instruments (as thermometers measure temperature, humans “measure” observables). Critics ask why any of these should divide the safe from the risky epistemic bet. Why is it legitimate to infer the presence of mice from casual observation of their tell-tale signs but illegitimate to infer the presence of electrons from careful and meticulous observation of their tell-tale ionized cloud-chamber tracks?

Second, many critics find van Fraassen’s agnosticism about unobservables unwarrantedly selective. CE claims that we ought to believe what science tells us about all observables (both observed and unobserved) but not about unobservables. In each case there is a gap between our evidence (what has been observed) and what science arrives at (claims about all observables (CE) or claims about all observables and unobservables (SR)). Why is it legitimate to infer from what we have observed in our spatiotemporally limited surroundings to everything observable but not to what is unobservable (though detectable with reliable instruments or calculable with reliable theories)? Our experience is limited in many ways, including lacking direct access to: medium-sized events in spatiotemporally remote regions, events involving very small or very large dimensions, very small or very large mass-energy, and so forth. Why should inductions to claims about the first be legitimate but not to claims about the others?

Third, CE’s epistemic policy is pragmatically self-defeating or incoherent. Suppose a scientific theory T tells us “A is unobservable by humans”. In order to use T to set our epistemic policy we must accept T; that is, believe what T tells us about observables, but we should be agnostic about what T tells us about unobservables, including whether A is observable or unobservable. But if we should be agnostic about A’s observability, then we do not know whether or not we should believe in As. A consistent constructive empiricist will have trouble letting science determine what is unobservable and using that determination to guide her epistemic policy—often she will not know what not to believe.

Finally, if we interpret the language of science literally (as van Fraassen does), then we ought to accept that we see tables if and only if we see collections of molecules subject to various kinds of forces. But then if we are willing to assert there are tables we should be willing to assert that there are collections of molecules (Friedman 1982; Wilson 1985).

c. The Argument from Empirically Equivalent Theories

As realists rely on IBE, antirealists rely on the argument from empirically equivalent theories (EET):

  1. If T and T’ are empirically equivalent, then any evidence E confirms/infirms T to degree n if and only if E confirms/infirms T’ to degree n.
  2. If (E confirms/infirms T to degree n if and only if E confirms/infirms T’ to degree n), then we have no reason to believe T rather than T’ or vice versa.
  3. For any T, there exists a distinct empirically equivalent T’.
  4. Thus, for any theory T, we have no reason to believe it rather than its empirically equivalent rivals.

The argument appears to be valid, but each of its premises can be challenged (Boyd 1973; Laudan and Leplin 1991). Premise 1 is under-specified. Any abstract, sufficiently general theory (for example, Newton’s theory of gravitation) has no empirical consequences on its own. Trivially, two such theories are empirically equivalent since each has no empirical consequences; so any evidence equally confirms/infirms each. But no realist will worry about this. In order to give Premise 1 bite, the theories must have empirical consequences, which they will have only with the help of auxiliary hypotheses, A (§4). But then Premise 1 becomes:

1A. If (T and A) and (T’ and A) are empirically equivalent, then any evidence E confirms/infirms T to degree n if and only if E confirms/infirms T’ to degree n.

Whether 1A is plausible depends on what A is. If A is any hypothesis which has been accepted to date, then 1A is false because current empirical indistinguishability does not entail perpetual empirical indistinguishability, since evidence and auxiliary hypotheses change over time as we discover new instruments, methods, and knowledge. But if A is any hypothesis whatsoever, then there is no reason to think that the antecedent of Premise 1A is true, and thus 1A is again a trivial, vacuous truth. Moreover, the connection between empirical equivalence (agreement about observables in the sense of §6a) and evidential support is questionable (Laudan and Leplin 1991). Premise 1 presupposes that all and only what a theory says or implies about observables is evidentially relevant to that theory. But this is false: Brownian motion, though not an empirical consequence of atomic theory, supported it. Thus T and T’ could be empirically equivalent, yet one could have better evidential support than the other; for example, T, but not T’, might be derivable from a more comprehensive theory that entails evidentially well-supported hypotheses.

Some IBE-realists resist Premise 2: T and T’ may be equally confirmed by the evidence, yet one of them may possess superior explanatory virtues (§5e) that make it the best explanation of the evidence and thus, by IBE, more entitled to our assent—especially if the other is a less natural, ad hoc variant of the “nice” theory. The success of this response depends on whether explanatorily attractive theories are more likely to be true—why should nature care that we prefer simpler, more coherent, more unified theories?—and on whether a convincing case can be made for the claim that we are evolutionarily equipped with cognitive abilities that tend to select theories that are more likely to be true because their explanatory virtues appeal to us (Churchland 1985).

The very strong, very general conclusion of EET, however, depends on the very strong, very general Premise 3, which, critics argue, is typically supported either by “toy” examples of theory-pairs from the history of physics, by contrived examples of theories, one of which is transformed from the other by a general algorithm (Kukla 1998), or by some tricks of formal logic or mathematics. None is likely to convince any realist (Musgrave 1985; Stanford 2001).

d. Constructive Empiricism, IBE, and Explanation

For van Fraassen, a theory’s explanatory virtues (simplicity, unity, convenience of expression, power) are pragmatic—a function of its relationship to its users. This implies that explanatory power is not a rock bottom virtue like consistency (Newton could decline to explain gravity, but he could not decline to be consistent) and does not confer likelihood of truth or empirical adequacy (Newton’s theory explained lots of phenomena but is neither true nor empirically adequate). The fact that a theory satisfies our pragmatic desiderata has no implications for its being true or empirically adequate, contrary to what IBE-realists maintain.

IBE is a rule guiding rational choice among rival hypotheses. But there is always the option of declining to choose, of remaining agnostic. To undercut this general option, van Fraassen argues, the realist must commit to some claim like: every regularity and coincidence must be explained. Van Fraassen challenges this alleged requirement. First, the quest for explanation has to stop somewhere; even “realist” explanations must bottom out in brute fundamental laws; so, why cannot an antirealist bottom out in brute phenomenological laws? Second, scientists do not consider themselves bound by a principle that demands that every correlation be explained. In quantum mechanics, for example, spin states of entangled particles are perfectly correlated, yet every reasonable explanation-candidate has failed, and scientists no longer insist that they must be explained, contrary to what realists allegedly require (Fine 1986). However, these arguments may be directed at a straw man, since no realist is likely to require that every regularity be explained. Musgrave (1985), for example, suggests that these arguments confuse realism (the view that science aims to explain the phenomena where possible) with essentialism (the view that science aims to find theories that are fundamentally self-explanatory): it is not antirealist to claim that Newton explained a host of phenomena in terms of gravity but declined to explain gravity itself.

Van Fraassen also denies that only realism can explain the phenomena. There are rival explanations that are compatible with CE, and some of them are more plausible than realism. In §5e we distinguished ground-level and meta-level uses of IBE and suggested that this strategy might be more promising for the latter than the former. Recall the realists’ reasoning: there is a surprising phenomenon—our current scientific theories, scientific methodology, and the history of modern science, are surprisingly successful—which cries out for explanation; the only explanation is that the theories are approximately true; thus, by IBE, realism. But there is a more mundane explanation: many very smart people construct our scientific theories and methods, throwing out the unsuccessful ones (which we tend to ignore (Magnus and Callender 2004)) and refining and keeping only the successful ones. A variant of this success-by-design-and-trial-and-error is explanation of success in Darwinian terms: just as the mouse’s running away from its enemy the cat is better explained in Darwinian terms (only flight-successful mice survive and pass their genes along) than in representational terms (the mouse “sees” that the cat is his enemy and therefore runs), so too the instrumental success of science is better explained in Darwinian terms (only the successful theories survive) than in realist terms (they are successful because they are approximately true). These rival antirealist explanations of success are controversial, however (Musgrave 1985). The success-by-design explanation does not seem right, since scientists often construct theories that make completely unexpected, novel predictions.

7. Historical Challenges to Scientific Realism

A range of arguments attempts to show that scientific realism commits us to an implausible reading of the history of science. (In what follows, T* and T are successor and predecessor theories in a sequence of theories; for example, think of the sequence <Aristotelian physics, Medieval physics, Cartesian physics, Newtonian physics, (Newtonian + Maxwellian physics), Special Theory of Relativity (STR), (General Theory of Relativity (GTR) + Quantum Mechanics (QM)), …> as ordered under the relation T* succeeds T.) Both realists and empiricists think of science as being cumulative and progressive. For empiricists, cumulativeness requires at least that T* have more true (and perhaps fewer false) observational consequences than T. Since the content of a theory on logical positivists’ views is exhausted by its observational consequences, if T* has more true observational consequences than T, then T* is “more true than” T. However, SR-realists require more. Because of SR5, they are committed to a historical thesis: that science asymptotically converges on the truth. Because of their externalist semantics, they are committed to theses about reference: theoretical terms genuinely refer, reference is trans-theoretic, and reference is preserved in T-T* transitions (so that “electron” in Bohr’s earlier and later theories refers to the same object and the later theory provides a more adequate conception of that object). Finally, their meta-level appeals to IBE commit them to SR5 on the grounds that it best explains the instrumental success of our best theories and the increasing instrumental success of sequences of theories (where T* is more successful than T because T* is closer to the truth than T), and so forth.

a. Kuhn’s Challenge

According to Kuhn (1970), the standard view of science as steadily cumulative (presupposed by both positivism and realism) rests on a myth that is inculcated by science education and fostered by Whiggish historiography of science. When the myth is deconstructed, we see science as historically unfolding through stable cycles of cumulativeness, punctuated by periods of crisis and revolution.

During periods of normal science, practitioners subscribe to a paradigm. They have the same background beliefs about: the world, its fundamental ontology, processes, and laws (statements that are not to be given up); correct mathematical and linguistic expression; scientific values, goals, and methods; scientifically relevant questions and problems; and experimental and mathematical techniques. Within a given paradigm P—for example, Newtonian physics—there is a relatively stable background: a world of Newtonian particles moving in space and time subject to Newtonian forces (like gravity) and obeying Newton’s laws. There are exemplary methods and techniques—for example, to solve a problem of motion, bring it under the equation, F = ma, which manifests itself across the board and is treated as counterexample-free. And there are shared values—for example, unified mathematical representation of phenomena—and problems (for example, the solution of the arbitrary n-body problem for a system of gravitationally attracting bodies or the resolution of the anomaly in the orbit of Uranus) that require further articulation of the theory. In normal science, cumulativeness occurs: the theory becomes extended to answer its own questions and cover its phenomena. (Kuhn thinks that clean views of history come from focusing too much on normal science.) But sooner or later anomalies crop up that the paradigm cannot handle (for example, the failure to bring electromagnetism, black body radiation, and Mercury’s orbit under the Newtonian scheme). There is a crisis that only a revolutionary new paradigm (for example, STR, QM, and GTR) can handle. Once in place, the new paradigm P* provides a radically new way of looking at the world.

Kuhn (1970) was interpreted (wrongly, but with some justice given his sometimes incautious language) as arguing for an extremely radical constructivist/relativist position: P and P* are incommensurable in the sense that they are so radically distinct that they cannot be compared; the P and P* scientists work “in different worlds”, “see different things”, use different maps (theories and conceptual schemes) and also have different rules for map-making (methods), different languages, and different goals and values. As a result, during the transition, scientists have to learn a new way of seeing and understanding phenomena—Kuhn likens the experience to a “gestalt switch” or “religious conversion”. There is no commonality—in ontology, methodology, observational base, or goals/values—that P and P* scientists can use to rationally adjudicate their disagreements. There is no paradigm-independent reason for preferring P* over P, since such reasons would have to appeal to something common (common observations, methods, or norms), and they share no commonality. Even more strongly, there is no paradigm-independent, objective fact of the matter concerning which of them is correct. If this were true, then all standard theses about progress would be undermined. There is no referential or meaning continuity across paradigms; no sense can attach to theses like T* is more true than T, T is a limiting case of T*; or T* preserves all T’s true observational consequences, since such theses presuppose T-T* commensurability.

Critics have pointed out that this view is too extreme (McMullin 1991). The history of science shows more continuity and fewer radical revolutions than this account attributes to it. Scientists make rational choices between “paradigms” (for example, most scientists who were skeptical of atoms came to reasonably believe in them as a result of Perrin’s experiments). Many scientists work within two traditions without experiencing gestalt shifts (for example, 19th century energetics and molecular theories). T and T* advocates often argue, criticize each other, and rationally persuade each other that one of the two is incorrect. How could this be, if the radical interpretation of Kuhn were correct?

Kuhn clearly did not intend the radical reading, and in later writings (1970 Postscript, 1977) he distinguishes his views from such radical, subjectivist, and relativist interpretations. Paradigm transitions and incommensurability, he argues, are never as total as the radical interpretation assumes: enough background (history, instrumentation, and every-day and scientific language) is shared by P- and P*-adherents to underwrite good reasons they can employ to mount persuasive arguments. Moreover, he lists several properties any theory should have—accuracy (of description of experimental data), consistency (internal and with accepted background theories), scope (T should apply beyond original intended applications), fecundity (T should suggest new research strategies, questions, problems), and simplicity (T should organize complex phenomena in a simple tractable structure). Application of these criteria accounts for progress and theory choice. However, these are “soft” values that guide choices rather than “hard” rules that determine choices. Unlike rules, (i) they are individually imprecise and incomplete, and (ii) they can collectively conflict (and there is no a priori method to break ties or resolve conflicts). Moreover, Kuhn argues, an individual’s choice is guided by a mixture of objective (accuracy, and so forth) and subjective (individual preferences like cautiousness and risk-taking, and so forth) factors, the latter influencing her interpretation and weighing of the criteria. A cautious scientist may be unwilling to risk a high probability of being wrong for a small probability of being informative in novel ways, and vice versa for the risk-taker. In this way Kuhn (1977) offers a middle ground between theory choices being completely subjective and being objective (qua being determined by rules applied to evidence). This “softer” view of science, he argues, enables new theories to get off the ground: progress can be made only if there are values to allow rational discussion and argument but not hard rules that would pre-determine an answer (because then everyone would conform to the rule and not risk proposing new alternatives).

Kuhn has shown that evidence and reasons are sometimes incapable of deciding between P and P*. But a realist may concede that hard choices occur: at most one of P or P* is correct, and we may have to wait and see which, if either, pans out. Temporary gridlock need not amount to permanent undecidability: the lack of decisive reasons at a time does not imply that there will be no decisive reasons forever; when more evidence is acquired and its relevance better understood, convincing reasons usually emerge. Realists should concede these points; many in the 21st century do. But no SR-realist can accept the thesis, never abandoned by Kuhn, that there is no fact of the matter whether P or P* is correct.

b. Laudan’s Challenge: The Pessimistic Induction

Although it is widely agreed that our best theories are instrumentally successful and many T-T* sequences show increasing success, Laudan (1981) disputes that success and progress are to be explained in realist SR5-terms of increasing approach to the truth. The history of science, Laudan argues, shows that referential success is neither necessary nor sufficient for empirical success: not necessary because the central terms of many successful theories did not refer (19th century ether, caloric, and phlogiston theories, for example); not sufficient because the central terms of many failing theories did, by our lights, refer (18th century chemical atomism, Prout’s hypothesis for most of the 19th century, Wegener’s theory of continental drift in the first half of the 20th century, and so forth).

Moreover, realist notions of approximate truth and convergence-to-the-truth are problematic. Despite best efforts, no satisfactory metric has emerged that would characterize distance from the truth or the truth-distance between T and T* (Laudan 1981; Miller 1974; Niiniluoto 1987). For some T-T* sequences in mathematical physics, there are limit theorems whereby T can be derived as a special case of T* under appropriate limiting conditions. For example, special relativity passes asymptotically into Newtonian mechanics as (v/c)² approaches 0. Such theorems suggest that Newtonian mechanics yields close to correct answers for applications near that limit (speeds small compared with the speed of light). In this way realists can appeal to them to argue that T* extends and improves upon T. However, for many T-T* sequences there are no analogous limit theorems: Lavoisier’s oxygen theory is a progressive successor of Priestley’s phlogiston theory, yet there is no neat mathematical relationship indicating that phlogiston theory is a limiting case of oxygen theory. Moreover, even for cases where T* approaches T as some parameter approaches a limit, it is controversial what to conclude. If reference is determined by meaning (§5b), then “mass” as it figures in Newton’s theory and “mass” as it figures in Einstein’s refer to different things, and the fact that there is a derivation of classical mass-facts from relativistic mass-facts under certain conditions does nothing to show that T* provides a more global, more accurate description of mass-facts than T (since they’re talking about different things); the limit theorems show at most that some structure of abstract relations but not semantic content gets preserved in the T-T* transition (§11a). But if reference is determined by causal-historical relations (§5c), then the references of some key terms of T get lost in the transition to T*—“ether” was a key referring term of classical physics, but there is no ether in special relativity; so how can classical physics capture part of the same facts that special relativity captures when all its claims about the ether are either plainly false or truth-valueless?
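
To see what such a limit theorem looks like in the simplest case, here is a standard expansion (a textbook sketch, not drawn from the works cited above) of the relativistic kinetic energy in powers of (v/c)²; the Newtonian expression is recovered as the leading term:

```latex
% Sketch: the Newtonian limit of the relativistic kinetic energy.
\[
  E_{\mathrm{kin}}
  = mc^{2}\left(\frac{1}{\sqrt{1-(v/c)^{2}}}-1\right)
  = \frac{1}{2}mv^{2} + \frac{3}{8}\,\frac{mv^{4}}{c^{2}} + \cdots
  \;\longrightarrow\; \frac{1}{2}mv^{2}
  \quad\text{as } (v/c)^{2}\to 0 .
\]
```

The correction terms indicate how quickly Newtonian answers degrade as speeds grow, which is why the theorem licenses only the claim that Newtonian mechanics is close to correct for slow-moving systems.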

These are serious challenges to SR. On one hand, it is hard to shake the idea that theories are successful because they are “onto something”. Yes, we build them to be successful, but their scope and novel predictions generally greatly outstrip our initial intentions. Realists tend to see the history of science as supporting an optimistic meta-induction: since past theories were successful because they were approximately true and their core terms referred, so too current successful theories must be approximately true and their central terms refer. On the other hand, skeptics see the history of science as supporting a pessimistic meta-induction: since some (many, most) past successful theories turned out to be false and their core terms not to refer, so too current successful theories may (are likely to) turn out to be false and their key terms not to refer. Realists must be careful not to interpret history of science blindly (ignoring the successes of ether theories and the failures of early atomic theories, for example) or Whiggishly (begging questions by wrongly attributing to our predecessors our referential intentions—by assuming, for example, that Newton’s “gravity” referred to properties of the space-time metric).

8. Semantic Challenges to Scientific Realism

Realist truth and reference are word-world/thought-world correspondences (SR4), an intuitively plausible view with a respectable pedigree going back to Aristotle. Moreover, some IBE-realists argue that real correspondences are needed to explain the successful working of language and science: we use representations of our environment to perform tasks; our success depends on the representations causally “tracking” environmental information; truth is a causal-explanatory notion. Several philosophical positions challenge this idea.

a. Semantic Deflationism

Tarski showed how to define the concept is true-in-L (where L is a placeholder for some particular language). Treating “is true” as predicated of sentences in a formal language, he provided a definition of the concept that builds it up recursively from a primitive reference relation that is specified by a list correlating linguistic items syntactically categorized with extra-linguistic items semantically categorized. Thus, for example, a clause like “‘electron’ refers to electrons” would be on this list if the language were English. Although Tarski’s definition is technically sophisticated, the main points for our purposes are these. First, it satisfies an adequacy condition (referred to as Convention T): for every sentence P (of L), when P is run through the procedure specified by the definition, “P” is true (in L) if and only if P. Thus, for example, “Electrons exist” is true-in-English if and only if electrons exist, and so forth. Second, truth and reference are disquotational devices: because of the T-equivalences, to assert that “snow is white” is true (in English) is just to assert that snow is white; similarly, to assert that “snow” refers (in English) to some stuff is just to assert that the stuff is snow.
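
The recursive shape of such a definition can be conveyed with a toy model. The following Python sketch is only an illustration, not Tarski’s definition: the mini-language, its vocabulary, and the encoding of sentences as tuples are invented for the example.

```python
# Toy illustration of a Tarski-style truth definition: truth is built up
# recursively from a list-like reference relation, and the checked instances
# of Convention T ("P" is true-in-L iff P) then fall out.
# The mini-language and its vocabulary are hypothetical.

names = {"snow": "snow", "milk": "milk"}                      # name -> referent
predicates = {"white": {"snow", "milk"}, "electron": set()}   # predicate -> extension

def true_in_L(sentence):
    """Recursive truth clauses: atomic predication, negation, conjunction, existential."""
    op = sentence[0]
    if op == "atom":     # ("atom", "white", "snow"): the name's referent falls under the predicate
        _, pred, name = sentence
        return names[name] in predicates[pred]
    if op == "not":
        return not true_in_L(sentence[1])
    if op == "and":
        return true_in_L(sentence[1]) and true_in_L(sentence[2])
    if op == "exists":   # ("exists", "electron"): the predicate's extension is non-empty
        return bool(predicates[sentence[1]])
    raise ValueError(f"unknown operator: {op}")

assert true_in_L(("atom", "white", "snow"))      # "Snow is white" is true-in-L iff snow is white
assert not true_in_L(("exists", "electron"))     # in this toy model there happen to be no electrons
assert true_in_L(("and", ("atom", "white", "milk"),
                  ("not", ("exists", "electron"))))
print("The toy truth predicate verifies the Convention T instances checked above.")
```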

Semantic deflationists (Fine 1996; Horwich 1990; Leeds 1995, 2007) argue that Tarski’s theory provides a complete account of truth and reference: truth and reference are not causal explanatory notions; they are merely disquotational devices that are uninformative though expressively indispensable—useful predicates that enable us to express certain claims (like “Everything Putnam said is true”) that would be otherwise inexpressible. So long as a truth theory satisfies Convention T, these things will be expressible, and a trivial list-like definition of reference (“P” refers to x iff x is P) will suffice to generate the T-sentences. As native speakers, we know, without empirical investigation, that “electron” refers to electrons just by having mastered the word “refers” in our language. Our beliefs about electrons could be mistaken, but not our belief that “electron” applies to electrons. In particular, we cannot coherently suppose that “electron” does not refer to electrons because this is but a step away from a formal contradiction—some electrons are not electrons. Deflationists argue that such “thin” concepts and trivial relations cannot bear the explanatory burdens that scientific realists expect of them.

Deflationism is a controversial position. Field, before he endorsed deflationism, argued that Tarski merely reduced truth to a list-like definition of reference, but such a definition is physicalistically unacceptable (Field 1972). Chemical valence was originally defined by a list pairing chemical elements with their valence numbers, but later this definition was unified in terms of the number of outer shell electrons in the element’s atoms. Field argued that reference should be similarly reduced to physical notions. While this seems an implausibly strong requirement, many philosophers think it obvious that the success of action depends on the truth of the actors’ beliefs: John’s success in finding rabbits in the upper field, they argue, depends on his rabbit-beliefs corresponding to the local rabbits (Liston 2005). Deflationists respond that John’s success is explained by there being rabbits there (no need to mention ‘true’), but deflated explanations become strained when the hunter is not an English thinker but, say, a French thinker, Jean: the sentences Jean holds true (‘Des lapins habitent le champ supérieur’—‘Rabbits live in the upper field’) must first be translated into sentences we hold true and then disquoted—a strategy known as extended disquotationalism—and it is difficult to see why Jean’s success has anything to do with his sentences translating into ours.

Deflationists reject SR4 and SR5, but this does not mean they cannot believe what our best scientific theories tell us: deflationists can and typically do accept SR3 as well as all the object-level inferences that science uses, including object-level IBE (Leeds 1995, 2007). It means only that deflationists reject the meta-level IBE deployed by realists (§5e)—such inferences must be rejected if truth is not an explanatory notion.

b. Pragmatist Truth Surrogates

Pragmatists question metaphysical realism (SR3): it presupposes a relation between our representations (to which we have access) and a mind-independent world (to which we lack access), and there cannot be such a relation, because mind-independent objects are in principle beyond our cognitive reach. Thus SR3 (and correspondence truth) is either vacuous or unintelligible. For them, word-world relations are between words and objects-as-conceived by us. If we cannot reach out to mind-independent objects, we must bring them into our linguistic and conceptual range.

Pragmatists also tend to supplement Tarski’s understanding of truth, like philosophers in a broadly idealist tradition (including Hume, Kant, the positivists, and Kuhn) who employ truth-surrogates that structure the “world” side of the correspondence relation in some way (impressions, sense data, phenomena, a structured given) that would render the correspondence intelligible. Depending on the kind of idealism adopted, “p is true” might be rendered “p is warrantedly assertible”, “p is derivable from theory Θ”, or “p is accepted in paradigm P”, all of the form “p is E” where E is some epistemic surrogate for “true”. We have already seen (§5d) how realists object to this move: it assigns to the concepts truth and reference the wrong properties (it makes them intra-theoretic rather than trans-theoretic) and thus cannot properly capture key features of practice. More generally, Putnam argues, truth cannot be identified with any epistemic notion E: take any revisable proposition p that satisfies E: we already know that p might not be true; so being E does not amount to being true. For example, that Venus has CO2 in its atmosphere is currently warrantedly assertible, but future investigation could lead us to discover that it is not true. Thus, Putnam thinks, truth is epistemically transcendent: it cannot be captured by any epistemic surrogate (Putnam 1978).

c. Putnam’s Internal Realism

In his SR period, Putnam held that only real word-world correspondences could capture the epistemic transcendence and causal explanatory features of truth. In the late 1970s Putnam came to doubt SR3, reversed his position, and proposed a new program, internal realism (Putnam 1981). IR has negative and positive components.

The main negative component rejects metaphysical realism (SR3) and the associated thesis that truth and reference are word-world correspondences (SR4). The primary argument for this rejection is Putnam’s model-theoretic argument (Merrill 1980; Putnam 1978, 1981). Take our language and total theory of the world. Suppose the intended reference scheme (which correlates our word uses with objects in the world) is that which satisfies all the constraints our best theory imposes. This supposition is problematic because those constraints would fix at best the truth conditions of every sentence of our language; they would not determine a unique assignment of referents for our terms. Proof: Assume there are n individuals in the world W, and our theory T is consistent. Model theory tells us that since T is consistent it has a model M of cardinality n; that is, all the sentences of T will be true-in-M. Now define a 1-1 mapping f from the domain of M, D(M), to the domain of W, D(W), and use f to define a reference relation R* between L(T) (the language of our theory) and objects in D(W) as follows: if x is an object in D(W) and P is a predicate of L(T), then P refers* to x if and only if P refers-in-M to f⁻¹(x). Then any sentence S will be true* (of W) if and only if S is true-in-M. Intuitively, truth* and reference* are not truth and reference but gerrymandered relations that mimic truth-in-M and refers-in-M, where M can be entirely arbitrary, provided it has enough objects in its domain. Unfortunately, anything we do to specify the correct reference scheme for our language and incorporate it into our total theory is subject to this permutation argument. One might object, for example, that a necessary condition for (real) reference is that P refer to x only if x causes P and P is not causally related to the objects it refers* to (Lewis 1984). But if we add this condition to our theory, then we can redeploy a permutation whereby “x causes* P (in W)” will mimic “f⁻¹(x) causes P (in-M)”; and instead of failing to fix the real reference relation we will be failing to fix the real causal relations. This formal result is the basis of Putnam’s argument that even our best theory must fail to single out its intended model (reference scheme). The permutation move is so global that no matter what trick X one uses to distinguish reference from reference*, the argument will be redeployed so that if X relates to cats in a way that it does not to cats*, then X* (a permutation of X) will relate to cats* in the same sort of way, and there will be no way of singling out whether we’re referring to X or X*.
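
A finite toy model conveys the flavor of the permutation move, though it is nothing like a proof of Putnam’s result; the domain, predicates, and test sentences in the Python sketch below are invented for the illustration.

```python
# Toy illustration of the permutation move: permuting the domain of a finite
# interpretation (and the predicate extensions along with it) leaves every
# purely quantificational test sentence with the same truth value, so those
# sentences cannot single out the "intended" reference scheme.
from itertools import permutations

domain = ["alpha", "beta", "gamma"]

intended = {                       # the "intended" interpretation
    "Cat": {"alpha"},
    "OnMat": {"alpha", "beta"},
}

def permuted(interp, perm):
    """The deviant interpretation: P* holds of perm(x) exactly when P holds of x."""
    mapping = dict(zip(domain, perm))
    return {pred: {mapping[x] for x in ext} for pred, ext in interp.items()}

def test_sentences(interp):
    return (
        any(x in interp["Cat"] for x in domain),                          # something is a cat
        all(x in interp["OnMat"] for x in domain if x in interp["Cat"]),  # every cat is on the mat
        sum(x in interp["OnMat"] for x in domain) == 2,                   # exactly two things are on the mat
    )

# Every permutation of the domain yields a gerrymandered reference scheme that
# verifies exactly the same test sentences as the intended scheme does.
for perm in permutations(domain):
    assert test_sentences(permuted(intended, perm)) == test_sentences(intended)
print("All permuted interpretations agree with the intended one on the test sentences.")
```

Because the test sentences quantify over the domain without naming particular objects, their truth values depend only on the pattern of extensions, and any permutation preserves that pattern.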

The positive component of internal realism replaces SR3 and SR4 with IR3 and IR4:

IR3 We can understand a determinate world only as a world containing objects and properties as it would be described in the ideal theory we would reach at the limit of human inquiry;

IR4 Theories are true, when they are, partly because their concepts correspond to objects and properties that the ideal theory carves out;

and it reinterprets references to truth in SR1, SR2, and SR5 in terms of IR3 and IR4.

IR3 replaces allegedly problematic, inaccessible mind-independent objects with unproblematic, accessible objects that would be produced by the conceptual scheme we would reach in the ideal theory, and IR4 relates our words to the world as it would be carved up according to the ideal theory. When truth, reference, objects, and properties are thus relativized to the ideal theory, then IR1, IR2, and IR5 are just IR counterparts of their SR analogs: we aim to give accounts that would be endorsed in the ideal theory; to accept a theory is to believe it approximates the ideal theory; science (trivially) progresses toward the ideal theory. Putnam believes he can avoid unintelligible correspondences to an inaccessible, God-eye view of the world yet still have a concept of truth that is explanatory and epistemically transcendent. While truth-in-the-ideal-limit is an epistemic concept—it is relativized to what humans can know—it transcends any particular epistemic context; so we can have the best reasons to believe that Venus has CO2 in its atmosphere though it may be false (for it may turn out not to be assertible in the ideal theory).

Objects and properties, according to IR3, are as much made as discovered. To many realists, this seems to be an extravagant solution to a non-problem (Field 1982): extravagant to claim we have a hand in making stars or dinosaurs; a non-problem, because many realists think the content of metaphysical realism (SR3) is just that there is a mind-independent world in the sense that stars and dinosaurs exist independently of what humans say, do, or think. The problem is not how to extend our epistemic and semantic grasp to objects separated from us by a metaphysical chasm; it is the more ordinary, scientific problem of how to extend our grasp from nearby middle-sized objects with moderate energies to objects that are very large, very small, very distant from us spatiotemporally, and so forth. (Kitcher 2001; Liston 1985). Moreover, realists point out, true-in-the-ideal-theory falls short of true. We know that either string theory is true and the material universe is composed of tiny strings or this is not the case. But it is conceivable that no amount of human inquiry, even taken to the ideal limit, will decide which; so though one disjunct is true, neither may be assertible in the ideal limit. Consequently, internalist truth lacks the properties of truth. (It is noteworthy that Putnam recanted internalist truth in his last writing on these matters (Putnam 2015)).

Rorty is another pragmatist who rejects, in a far more radical manner than Putnam, the fundamental presuppositions of the realist-antirealist debate (Rorty 1980).

9. Law-Antirealism and Entity-Realism

Cartwright (1983) and Hacking (1983) represent a mix of antirealism about theoretical laws and realism about theoretical entities. The kind of account that Cartwright rejects has three main components. First is the facticity view of fundamental physical laws: adequate fundamental laws must be (approximately) true. The basic equations of Newton, Maxwell, Einstein (STR/GTR), quantum mechanics, relativistic quantum mechanics, and so forth, are typical examples of such laws. Second is the covering law (or DN) model of explanation (Hempel 1965, §3c): a correct explanation of a phenomenon or phenomenological law is a sound deduction of the explanandum from fundamental laws together with statements describing, for example, compositional details of the system, boundary and initial conditions, and so forth. The deduction renders the explanandum intelligible by showing it to be a special case of the general laws. Thus, for example, Galileo’s law of free fall is explained as a special case of Newtonian fundamental laws by its derivation from Newton’s gravitational theory plus background conditions close to the earth’s surface. Third is IBE: the success of DN-explanations in rendering large classes of phenomena intelligible can justify our inferring the truth of the covering laws. The fact that Galileo’s law, Kepler’s laws, the ideal gas laws, tidal phenomena, the behavior of macroscopic solids, liquids, and gases all find a deductive home under Newton’s laws provides warrant for belief in the facticity of Newton’s laws.

Cartwright rejects all three components. She begins by challenging the first two components: there is a trade-off between facticity and explanatory power. Newton’s law of gravitation, FG = Gm1m2/r², tells us what the gravitational force between two massive bodies separated by a distance r is. Coulomb’s law, FC = kq1q2/r², tells us what the electrostatic force between two charged bodies separated by a distance r is. Each law gives the total force only for bodies where no other forces are acting. But most actual bodies are charged and massive and have other forces acting on them; thus the laws either are not factive (if read literally) or do not cover (if read as subject to the ceteris paribus modifier “provided no other forces are acting”). In physics, we explain by combining the forces: the actual force acting on a charged massive body is FA = FG + FC, the vector-sum of the Newton and Coulomb forces, which determines the actual acceleration and path. Cartwright objects that (a) we lack general laws of interaction allowing us to add causal influences in this way, (b) there is no reason to think that we can get super-laws that will be true and cover, and (c) in nature there is only the actual cause and resultant trajectory. But if the facticity and explanatory components clash in this way, the third component is in trouble also. Realists cannot appeal to IBE to justify belief in factive fundamental covering laws because good explanations that cover a host of phenomena rarely proceed from true (factive) laws. Consequently, the explanatory success of fundamental laws cannot be cited as evidence for their truth.
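
A minimal numerical sketch of the practice Cartwright is describing may help; the two-body set-up and the values below are purely illustrative. Neither law by itself gives the force on a body that is both massive and charged, so one computes the combination FA = FG + FC.

```python
# Illustrative only: the force between two bodies that are both massive and
# charged is obtained by combining the Newtonian and Coulombic contributions,
# not by applying either law on its own.
G = 6.674e-11   # gravitational constant, N m^2 kg^-2
k = 8.988e9     # Coulomb constant, N m^2 C^-2

def gravitational_force(m1, m2, r):
    """Magnitude of the attractive force given by Newton's law of gravitation."""
    return G * m1 * m2 / r**2

def coulomb_force(q1, q2, r):
    """Magnitude of the electrostatic force given by Coulomb's law."""
    return k * q1 * q2 / r**2

# Two small charged masses one metre apart (illustrative numbers).
m1, m2 = 1.0, 1.0      # kg
q1, q2 = 1e-6, 1e-6    # C (like charges, so the Coulomb force is repulsive)
r = 1.0                # m

F_G = gravitational_force(m1, m2, r)
F_C = coulomb_force(q1, q2, r)
net = F_C - F_G        # signed sum along the line joining the bodies (repulsion positive)

print(f"F_G = {F_G:.3e} N (attractive), F_C = {F_C:.3e} N (repulsive), net = {net:.3e} N")
```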

Cartwright’s own account has three corresponding components. First, fundamental laws are non-factive: they describe idealized objects in abstract mathematical models, not natural systems. In nature there are no purely Newtonian gravitational systems or purely electromagnetic systems. These are mathematical idealizations. Only messy phenomenological laws (describing empirical regularities and fairly directly supported by experiment) truly describe natural systems. Second, we should replace the DN model of explanation with a simulacrum account: explanations confer intelligibility by fitting staged mathematical descriptions of the phenomena to an idealized mathematical model provided by the theory by means of modeling techniques that are generally “rigged” and typically ignore (as negligible) disturbing forces or mathematically incorporate them (often inconsistently). To explain a phenomenon is to fit it in a theory so that we can derive fairly simple analogs of the messy phenomenological laws that are true of it. Intelligibility, not truth, is the goal of theoretical explanation. Third, although we should reject IBE, we should embrace inference to the most likely cause (ILC). Whereas theoretical explanations allow acceptable alternatives and need not be true, causal explanations prohibit acceptable alternatives and require the cause’s existence. ILC, on Cartwright’s view, can justify belief in unobservables that are experimentally detectable as the causes of phenomena. Thus, for example, Perrin’s experiments showed that the most likely cause of Brownian motion was molecular collisions with the Brownian particles; Rutherford’s experiments showed that the most likely cause of backward scattering of α-particles fired at gold foil was collisions with the nuclei of the gold atoms.

The laws of physics lie, Cartwright claims, and the hope of a true, unified, explanatory theory of physics is either based on a misunderstanding of physics practice or a vestige of 17th century metaphysical hankering for a neatly designed mechanical universe. The practice of physicists, she argues, indicates that we ought to be antirealists about fundamental laws and points instead to a messy, untidy universe that physicists cope with by constructing unified abstract stories (Cartwright 1999). Thus Cartwright is anti-realist about fundamental laws: contrary to realists, they are not (even approximately) true; contrary to van Fraassen, she is not recommending agnosticism—we now know they are non-factive. On the other hand, also contrary to van Fraassen, scientific practice indicates that we should be realists about “unobservable” entities that are the most likely causes of the phenomena we investigate.

Critics complain that Cartwright confuses metaphysics and epistemology: even if we lack general laws of interaction, it does not follow that there are none. Cartwright replies that the unifying ideal of such super-laws is merely a dogma. However, practice seems Janus-faced here: the history of modern physics is one of disunity leading to unity leading to disunity, and so forth. Each time distinct fundamental laws resist combination, a new unifying theory emerges that combines them: electrodynamics and eventually Einstein’s theories succeeded in combining Newton and Coulomb forces. The quest for unity is a powerful force guiding progress in physics, and, while the ideal of a unified “theory of everything” continues to elude us, Cartwright’s examples hardly show that it is a vain quest. Moreover, Cartwright arguably conflates different kinds of laws: in classical settings, the fundamental laws are Newton’s laws of motion, and his F = ma is the super-law that combines Newton’s gravitational and Coulomb’s electrostatic laws (Wilson 1998).

Cartwright’s distinction between “theoretical” and “causal” explanations has also been criticized. Nothing about successful theoretical explanations, she claims, requires their truth, whereas successful causal explanations require the existence of the cause. To many this move seems fallacious—if “successful” means correct, then the truth of the former follows as much as the existence of the latter; if “successful” does not mean correct, then neither follows. Presumably, in the IBE context, “successful” does not entail truth, but similarly in the ILC context, “successful” does not entail existence: the most likely cause could turn out not to exist (for example, caloric flow or phlogiston escape) just as the best explanation could turn out to be false (caloric or phlogiston theory).

10. NOA: The Natural Ontological Attitude

Fine (1996, 1986) presented NOA, an influential response to the debates amounting to a complete rejection of their presuppositions. We generally trust what our senses tell us and take our everyday beliefs as true. We should similarly trust what scientists tell us: they can check what is going on behind the appearances using instruments that extend our senses and methods that extend ordinary methods. This is NOA: we should accept the certified results of science on a par with homely truths. Both realists and antirealists accept this core position, but each adds an unnecessary and flawed philosophical interpretation to it.

Realists add to the core position the redundant word “REALLY”: “electrons REALLY exist”. SR realists add substantive word-world correspondences, a policy that serves no useful purpose. The only correct notion of correspondence is the disquotational one: “P” refers to (or is true of) x if and only if x is P. Realist appeals to IBE are problematic for two reasons. First, they beg the question against antirealists, who ab initio question any connection between explanatory success and approximate truth. Moreover, there is no inferential principle that realists could employ and antirealists would accept. Straight induction will not work: we can induce from the observed to the unobserved, because the unobserved can be later observed to check the induction; but we cannot induce to unobservables, because there can be no such independent check (according to the antirealist). Second, IBE does not work without some logical connection between success and (approximate) truth. But the inference from success to (approximate) truth is either invalid if read as a deductive move (because many successful theories turned out to be false (§7b)), weak if read as an inductive move (because nearly all successful past theories turned out to be false), or circular if read as a primitive IBE move. The antirealist, by contrast, has a ready answer: if a scientific theory or method worked well in the past, tinker with it, and try it again. Finally, Fine argues, contrary to what realists often claim, realism blocks rather than promotes scientific progress. In the Einstein-Bohr methodological debates about the completeness of quantum mechanics, the realist Einstein saw QM as a degenerate theory, while the instrumentalist Bohr saw QM as a progressive theory. Subsequent history favored Bohr over Einstein.

However, antirealism is no better off. Empiricists attempt to set limits: we should believe only what science tells us about observables. Fine criticizes these limits for reasons given in §5a and §6b—the observable-unobservable distinction cannot be drawn in a manner that would motivate skepticism or agnosticism about unobservables but not about observables. We have standard ways of cross-checking to ensure that what we are “seeing” with an instrument or calculating with a theory is reliable even if not “directly” observable. Fine concludes that the checks that science itself uses should be the ones we appeal to when in doubt. Pragmatists and constructivists react to the inaccessible, unintelligible word-world correspondences posited by realists by pulling back and trying to reformulate the correspondences in terms of some accessible surrogate for truth and reference (§8). Fine reiterates the criticisms of §5d and §8: truth has properties that any epistemic truth-surrogate lacks.

Both realists and antirealists view science as a practice in need of a philosophical interpretation. In fact, science is a self-interpreting practice that needs no philosophical interpretation. It has local aims and goals, which are reconfigured as science progresses. Asking about the (global) aim of science is like asking about the meaning of life: it has no answer and needs none. NOA takes science on its own terms, a practice whose history and methods are rooted in, and are extensions of, everyday thinking (Miller 1987). NOA accepts ordinary scientific practices but rejects apriorist philosophical ideas like the realist’s God’s-Eye view and antirealist’s truth-surrogates.

Critics see NOA as a flight from, rather than a response to, the scientific realism question (Musgrave 1989). The core position, they argue, is difficult to characterize in a philosophically neutral manner that does not invite a natural line of philosophical questioning. Once one accepts that science delivers truths and explanations, it is natural to ask what that means, and realist and antirealist replies will naturally ensue—as they always have, since these interpretations are as old as philosophy itself. Moreover, it may be difficult to characterize NOA non-tendentiously: ground-level IBE and correspondence truth, for example, are arguably rooted in common sense and ought to be included in NOA; but then any antirealism that rejects them is incompatible with NOA.

11. The 21st Century Debates

Between 1990 and 2016 new versions of the debates, many focusing on Laudan’s PI (§7b), have emerged.

a. Structuralism

Structural Realism claims that: science aims to provide a literally true account only of the structure of the world (StR1); to accept a theory is to believe it approximates such an account (StR2); the world has a determinate and mind-independent structure (StR3); theories are literally true only if they correctly represent that structure (StR4); and the progress of science asymptotically approaches a correct representation of the world’s structure (StR5). (Here we replace each SR thesis in §5 with an analogous StR thesis.)

Structuralism comes from philosophy of mathematics. Consider the abstract structure <ω, o, ξ>, where ω is an infinite sequence of objects, o an initial object, and ξ a relation that well-orders the sequence. This structure is distinct from its many exemplifications: for example, the natural numbers ordered under successor, <0, 1, 2, 3, …>; the even natural numbers in their natural order, <0, 2, 4, 6, …>; and so forth. We can similarly consider the offices of the U.S. President, Vice-President, Speaker of the House, and so forth, as a collection of objects defined by the structure of relations given in the U.S. Constitution, distinct from its particular exemplars at a given time: Bush, Cheney, Pelosi (January 2007), and Obama, Biden, Boehner (January 2011). Similarly, structuralists suggest, the structure of relations that obtain between scientific objects is distinct from the nature of those objects themselves. The structure of relations is typically expressed (at least in physics) by mathematical equations of the theory (Frigg and Votsis 2011). For example, Hooke’s law, F = -ks, describes a structure, the set of all pairs of reals <x, y> in R² such that y = -kx, which is distinct from any of its concrete exemplifications like the direct proportionality between the restoring force F for a stretched spring and its elongation s. If the world is a structured collection of objects (StR3), then StR1 says that science aims to describe only the structure of the objects but not their intrinsic natures.
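
The structure/exemplification distinction can be made vivid with a small check; the encoding below is a hypothetical illustration rather than anything drawn from the structuralist literature. It verifies, on a finite initial segment, that the naturals under successor and the even naturals under +2 exemplify the same ordered structure, as witnessed by the structure-preserving map n ↦ 2n.

```python
# Illustrative only: two distinct exemplifications of the same abstract
# successor structure <omega, o, xi>, linked by a structure-preserving map.
N = 20                                  # check a finite initial segment

naturals = list(range(N))               # 0, 1, 2, 3, ...
evens = [2 * n for n in range(N)]       # 0, 2, 4, 6, ...

def succ_nat(n):                        # succession in the first exemplification
    return n + 1

def succ_even(n):                       # succession in the second exemplification
    return n + 2

iso = {n: 2 * n for n in naturals}      # the candidate structure-preserving map

# The map sends the initial object to the initial object ...
assert iso[0] == evens[0]
# ... and commutes with succession: iso(succ(n)) = succ*(iso(n)).
for n in naturals[:-1]:
    assert iso[succ_nat(n)] == succ_even(iso[n])

print("On the segment checked, the naturals and the evens exemplify the same successor structure.")
```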

Structuralism is not new: precursors include Poincaré and Duhem in the 19th century (§2c), Russell (1927), Ramseyfied-theory versions of logical positivism (§3b), Quine (§4), and Maxwell (1970). Russell claimed that we can directly know (by acquaintance) only our percepts, but we can indirectly know (by structural description) the mind-independent objects that give rise to them. This approach presupposes a problematic distinction between acquaintance and description and a problematic isomorphism between the percept and causal-entity structures. Worse, it runs afoul of a devastating critique by the mathematician M.H.A. Newman (1928), closely related to Putnam’s model-theoretic argument (§8c), and never satisfactorily answered by Russell. Newman argues that a fixed structure of percepts can be mapped 1-1 onto a host of different causal-entity structures provided there are enough objects in the latter; thus the structural knowledge that science allegedly delivers is trivial—it merely amounts to a claim that the world has a certain cardinality, the size of the percept-structure. (The Ramseyfied-theory approach encounters similar problems (Psillos 2001).)

Contemporary proponents, beginning with Worrall (1989), hold that structuralism steers a middle path between standard versions of scientific realism and antirealism. StR, they argue, provides the best of both worlds by acknowledging and reconciling the pull of both pessimistic and optimistic inductions on the history of science. Pessimistic inductions (PI) argue against SR (§7b): the ontology of our current best theories (quarks, for example) will likely be discarded just like that of past best theories (for example, ether). Optimistic inductions (like the NMA) argue for SR (§5d): because past successful theories must have been approximately true, current more successful theories must be closer to the truth. Structuralists respond that, though ontologies come and go, our grip on the underlying structure of the world steadily improves. Underlying ontology need not be (and is not) preserved in theory change, but the mathematical structure is both preserved and improved upon: Fresnel’s correct claims about the structure of light (as a wave phenomenon) were retained in later theories, while his incorrect claims about the nature of light (as a mechanical vibration in a mechanical medium, the ether) were later discarded. Structuralists can also resist the argument from empirically equivalent theories (§6c)—to the extent that the theories are structurally equivalent they would capture the same structural facts, which is all a theory needs to capture—and do so without embracing a particular realist ontology occupying the nodes of the structure.

But can the needed distinction between structure and nature be drawn and can structures be rendered intelligible without the ontology that gives them flesh (Psillos 1995, 1999, 2001)? Two possible StR answers are suggested.

First, there is epistemological structural realism (EStR), endorsed by Poincaré, Worrall, and logical positivists in the Ramseyfied-theory tradition: electrons are objects as Obama is an object, but, unlike Obama, science can never discover anything about electrons’ natures other than their structural relations. For EStR to be a realist position, it will not suffice to say: we can know only observable objects (like Obama) and their (observable) structural relations; we must be agnostic about unobservable objects and their relations. This is merely a CE version of structuralism, as van Fraassen points out (2006, 2008), and inherits many problems of CE (§6). To be a realist position, EStR has to presuppose that, in addition to the structure of the phenomena whose objects are knowable, there is a mind-independent, knowable “underlying” structure, whose objects are unknowable. But now one must distinguish Obama from electrons so that Obama’s nature is knowable but electrons’ natures are not; the problematic observable-unobservable distinction (§§5a, 6b) has returned.

Critics argue that there is no sharp, epistemologically significant distinction between form (structure) and content (nature) of the kind needed for EStR. First, our knowledge of the nature of electrons is bound up with our knowledge of their structural relations so that we come to know them together: saying what an electron is includes saying how it is structured; our knowledge of its nature forms a continuum with our knowledge of its structure. Second, EStR requires a variant of the NMA (restricted to retention of structure) to uphold StR5. But this requires that, in progressive theory-change, structure (retained and improved) is what explains increased empirical success. But structure alone (without auxiliary hypotheses describing non-structural features of the world) never suffices to derive new empirical content. Finally, critics object to structuralists’ interpretations of the history. Worrall, for example, argues that Fresnel’s structural claims about light (the mathematics) were retained, but not his commitments to a mechanical ether; his critics question whether Fresnel could have been “just” right about the structure of light-propagation and completely wrong about the nature of light.

Second, there is ontological structural realism (OStR), advocated by Ladyman and others (Ladyman and Ross 2007) and similar to Quine’s realism (§4). OStR bites the bullet: we can know only structure because only structure exists. Obama is no more an object than electrons are; each is itself a structure; more strongly, everything is structure. Some of the attraction of this strange metaphysical position comes from its promise to handle problems in quantum mechanics that are orthogonal to our debates. Its proponents argue that it can account, for example, for apparently indistinguishable particles in entangled quantum states. In the context of our debates, OStR is supposed to avoid the epistemological problems of EStR: qua objects understood as structural nodes, electrons are in principle no more unknowable (or knowable) than Obama or ordinary physical objects. However, it runs into its own metaphysical problems, since it threatens to lose touch with concrete reality altogether. Even if God created nothing concrete, it would still be a structural (mathematical) fact that neutrons and protons, if they exist, form an isospin doublet related by SU(2) symmetry. For this to be a concrete (physical) fact, God would have had to create some objects—nucleons with symmetrically related isospin states or some more fundamental objects that compose nucleons—to occupy the neutron- and proton-nodes of the SU(2) group-structure. Even if those objects had only structural properties, they would have to have one non-structural property—existence (van Fraassen 2006, 2008). So, not everything is structure; there is a distinction between empty mathematical structures and realized physical structures; OStR can not capture that distinction.

b. Stanford’s New Induction

Kyle Stanford’s new induction provides the latest historical challenge to SR (Stanford 2001, 2006, 2015). Following Duhem (1991) Stanford poses what he calls the Problem of Unconceived Alternatives (PUA): for any fundamental domain of inquiry at any given time t there are alternative scientific hypotheses not entertained at t but which are consistent with (and even equally confirmed by) all the actual evidence available at t. PUA, were it true, would seem to create a serious underdetermination problem for SR: we opt for our current best confirmed theory, but there is a distinct alternative that is equally supported by all the evidence we possess, but which we currently lack the imagination to think of. (Two things about PUA are worth noting. First, it concerns the actual evidence we have at a time; it is not that the theory and the alternatives are underdetermined by all possible evidence; the underdetermination may be transient; future evidence may decide that the theory we have selected is not correct. Second, the unconceived alternative hypotheses are ordinary scientific hypotheses, not recherché philosophical hypotheses involving brains-in-vats, and so forth.)

Stanford argues that PUA is our general predicament. His New Induction on the history of science, he argues, shows that our epistemic situation is one of recurrent, transient underdetermination. Virtually all T-T* transitions in the past were affected by PUA: the earlier T-theorists selected T as the best supported theory of the available alternatives; they did not conceive of T* as an alternative; T* was conceived only later yet T* is typically better supported than T. At any given time, we could only conceive a limited set of hypotheses that were confirmed by all the evidence then available, yet subsequent inquiry revealed distinct alternatives that turned out to be equally or better confirmed by that evidence. We thus have good inductive reasons to believe we are now in the same predicament—our current best theories will be replaced by incompatible and currently unconceived successors that account for all the currently available evidence.

Stanford proposes a new instrumentalism. Like van Fraassen’s (§6), his instrumentalism is epistemic: it distinguishes claims we ought literally to believe from claims we ought only to accept as instrumentally reliable and argues that instrumental acceptance suffices to account for scientific practice. Unlike van Fraassen, Stanford bases his distinction, not on an observable-unobservable dichotomy, but on whether our access to a domain is based primarily on eliminative inference subject to PUA challenges: if it is, then we should adopt an instrumentalist stance; if it is not (as, for example, our access to the common sense world is not), then we may literally believe.

c. Selective Realism

Many debates in the early 21st century focus on historical inductions, especially on what representative basis would warrant an inductive extrapolation. Putnam and Boyd were aware that care was needed with the NMA and sometimes restricted their claims to mature theories so that we discount ab initio some theories on Laudan’s troublesome list—like the theory of crystalline spheres or of humoral medicine. Mature theories (with the credentials to warrant optimistic induction) must have passed a “take-off” point: there must be background beliefs that indicate their application boundaries and guide their theoretical development; their successes must be supported by converging but independent lines of inquiry and so forth. Moreover, many realists argue, a theory is suitable for optimistic induction only if it has yielded novel predictions; otherwise it could just have been rigged to fit the phenomena. Roughly, a prediction P (whether known or unexpected) is novel with respect to a theory T if no P-information is needed for the construction of T and no other available theory predicts P. Thus, for example, Newton’s prediction of tidal phenomena was novel because those phenomena were not used in (and not needed for) Newton’s construction of his theory and no other theory predicted the tides (Leplin 1997; Psillos 1999). Nevertheless, even thus restricted, the induction will not meet Laudan’s challenge, for that challenge includes an undermining argument (Stanford 2003a): many discarded yet empirically successful theories were mature and yielded novel predictions—for example, Newton’s theory, caloric theory, and Fresnel’s theory of light—so, if our current theories are correct, these theories were false.

More recent responses to these counterexamples attempt to steer a middle course between optimistic inductions like Putnam’s NMA (§5d) and pessimistic inductions like Laudan’s and Stanford’s (§§7b, 11b). These responses typically have a two-part normal form: (1) they concede to the pessimists that some parts of past empirically successful theories are discarded, yet (2) they argue with the optimists that some parts of past successful theories are retained, improved upon, and explain the successes of the old theories. Advocates of this “divide and conquer” strategy (Psillos 1999) try to have their cake and eat it too.

Variants of the strategy depend on how one separates those “good” features of past theories that are preserved, that explain empirical success, and that warrant optimistic induction from those “bad” features that are discarded. Structuralists, we saw, argue that structure (form), but not nature (content), is what is both preserved and responsible for success. Kitcher (1993) distinguishes a theory’s working and presuppositional posits. The term “light-wave” in Fresnel’s usage referred to light, no matter what its constitution is, in some contexts and to what satisfies the description “the oscillations of ethereal molecules” in other contexts. In the former contexts, “light-wave” referred to high frequency electromagnetic waves, a mode of reference that was doing explanatory and inferential work and was retained in later theories. In the latter contexts, “light-wave” referred to the ether (that is, nothing), a mode of reference that was presupposed yet empty, idle, and not retained in later theories.

Other variants rely on the causal theory of reference. Hardin and Rosenberg (1982) exploit the idea that one can successfully refer to X (by being suitably causally linked to X) while having (largely) false beliefs about X. Thus, Fresnel and Maxwell were referring to the electromagnetic field when they used the term “ether”, and, though they had many false beliefs about it (that it was a mechanical medium, for example), the electromagnetic field was causally responsible for their theories’ success and was retained in later theories. A big problem with this response is that referential continuity does not suffice for partial or approximate truth (Laudan 1984; Psillos 1999). Psillos (1999) employs causal descriptivism to deal with this problem: “ether” in 19th century theories refers to the electromagnetic field, since that (and only that) object has the properties (medium of light-propagation that is the repository of energy and transmits it locally) that are causally responsible for the relations between measurements we get when we perform optical experiments. By contrast, “phlogiston” does not refer since nothing has the properties that the phlogiston theorists mistakenly believed to be responsible for the body of information they had about oxidation of metals, and so forth. During theory change, the causal-theoretical descriptions of some terms are retained and thereby their references also; these are the essential parts of the theory that contribute to its success; but this is consistent with less central parts being completely wrong.

The latest twist to these divide and conquer strategies is Chakravartty’s doctrine of semirealism (Chakravartty 1998, 2007). Taking his cue from Hacking-Cartwright (§9), Chakravartty distinguishes detection and auxiliary properties. The former are causal properties of objects (and the structure of real relations between them) that are well-confirmed by experimental manipulation because they underwrite the causal interactions we and our instruments exploit in experimental set-ups; the latter are merely theoretical and inferential aids. The former are retained in later theories; the latter are not. Past theories that were on the right track were so because they mathematically coded in systematic ways the detection properties (as opposed to the idle auxiliary properties).

Any of these strategies must meet two further challenges, emphasized in (Stanford 2003a, 2003b). First, they must answer the undermining challenge (above) in a way that is not ad hoc, question-begging, or transparently Whiggish. Simply arguing (with Hardin and Rosenberg) for preservation of reference via preservation of causal role is too easy: do Aristotle’s natural place, Newton’s gravitational action, and Einstein’s space-time curvature all play the same causal role in explaining free-fall phenomena? And if we tighten the account by claiming that continuity requires retention of core causal descriptions (Psillos) or detection property clusters (Chakravartty), are we engaged in a self-serving enterprise? Are we using our own best theories to determine the core causal properties/descriptions and then “reading” those back into the past discarded theories?

Second, they must respond to the trust argument. Divide and conquer strategies argue that successful past theories were right about some things but wrong about others. But then we should expect our own theories to be right about some things and wrong about others. Though perhaps an advance, this does not provide us with a good reason to trust any particular part of our own theories, especially any particular assessment we make (from our vantage point) of the features of a past discarded theory that were responsible for its empirical success. We judge that X-s in a past theory were working posits (Kitcher), essentially contributing causes of success (Psillos), or detection properties (Chakravartty), while Y-s in that theory were merely presuppositional posits, idle, or auxiliary properties. But the past theorists were generally unable to make these discriminations, so why do we think we can now make them in a reliable manner? Stanford argues that realists can avoid this problem only if they can provide prospectively applicable criteria of selective confirmation—criteria that past theorists could have used to distinguish the good from the bad in advance of future developments and that we could now use—but they did not have such criteria, nor do we.

12. References and Further Reading

  • Boyd, R. (1973), “Realism, Underdetermination and the Causal Theory of Evidence”, Nous 7, 1-12.
  • Boyd, R. (1983), “On the Current Status of the Issue of Scientific Realism”, Erkenntnis, 19, 45–90.
  • Carnap, R. (1936), “Testability and Meaning”, Philosophy of Science 3, 419-471.
  • Carnap, R. (1937), “Testability and Meaning–Continued”, Philosophy of Science 4, 1-40.
  • Carnap, R. (1939), “Foundations of Logic and Mathematics”, International Encyclopedia of Unified Science 1(3), Chicago: The University of Chicago Press.
  • Carnap, R. (1950), “Empiricism, Semantics and Ontology”, Revue Internationale de Philosophie 4, 20-40.
  • Carnap, R. (1956), “The Methodological Character of Theoretical Concepts”, in H. Feigl and M. Scriven (eds), Minnesota Studies in the Philosophy of Science I, Minneapolis: University of Minnesota Press.
  • Cartwright, N. (1983), How the Laws of Physics Lie. Oxford: Clarendon Press.
  • Cartwright, N. (1999), The Dappled World. Cambridge: Cambridge University Press.
  • Chakravartty, A. (1998), “Semirealism”, Studies in the History and Philosophy of Science 29 (3), 391-408.
  • Chakravartty, A. (2007), A Metaphysics for Scientific Realism: Knowing the Unobservable. Cambridge: Cambridge University Press.
  • Churchland, P. (1985), “The Ontological Status of Observables: In Praise of the Superempirical Virtues”, in Churchland and Hooker 1985.
  • Churchland, P. and C. Hooker (eds) (1985), Images of Science: Essays on Realism and Empiricism, (with a reply from Bas van Fraassen). Chicago: University of Chicago Press.
  • Duhem, P. (1991/1954/1906), The Aim and Structure of Physical Theory, trans. P. Wiener, intro. Jules Vuillemin. Princeton: Princeton University Press.
  • Field, H. (1972), “Tarski’s Theory of Truth”, Journal of Philosophy 69 (13), 347-375.
  • Field, H. (1982), “Realism and Relativism”, Journal of Philosophy 79 (10), 553-567.
  • Fine, A. (1996/1986), The Shaky Game. Chicago: University of Chicago Press.
  • Friedman, M. (1982), “Review of The Scientific Image”, Journal of Philosophy 79 (5), 274-283.
  • Friedman, M. (1999), Reconsidering Logical Positivism. Cambridge: Cambridge University Press.
  • Frigg, R. and I. Votsis. (2011), “Everything You Always Wanted to Know about Structuralism but Were Afraid to Ask”, European Journal for the Philosophy of Science 1, 227-276.
  • Hacking, I. (1983), Representing and Intervening. Cambridge: Cambridge University Press.
  • Hardin, C. and A. Rosenberg. (1982), “In Defense of Convergent Realism”, Philosophy of Science 49, 604-615.
  • Harman, G. (1965), “The Inference to the Best Explanation”, The Philosophical Review 74, 88–95.
  • Hempel, C. G. (1965), Aspects of Scientific Explanation. New York: Free Press.
  • Hertz, H. (1956), The Principles of Mechanics. New York: Dover.
  • Horwich, P. (1990), Truth. Oxford: Blackwell.
  • Kitcher, P. (1993), The Advancement of Science. Oxford: Oxford University Press.
  • Kitcher, P. (2001). “Real Realism: The Galilean Strategy”, The Philosophical Review 110 (2), 151-197.
  • Kuhn, T.S. (1970/1962), The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
  • Kuhn, T.S. (1977/1974), The Essential Tension. Chicago: University of Chicago Press.
  • Kukla, A. (1998), Studies in Scientific Realism. Oxford: Oxford University Press.
  • Ladyman, J. and D. Ross. (2007), Every Thing Must Go: Metaphysics Naturalized. Oxford: Oxford University Press.
  • Laudan, L. (1981), “A Confutation of Convergent Realism”, Philosophy of Science, 48, 19–48.
  • Laudan, L. (1984), “Realism without the Real”, Philosophy of Science, 51, 156-162.
  • Laudan, L. and J. Leplin. (1991), “Empirical Equivalence and Underdetermination”, Journal of Philosophy 88 (9), 449-472.
  • Leeds, S. (1995), “Truth, Correspondence, and Success”, Philosophical Studies 79 (1), 1-36.
  • Leeds, S. (2007), “Correspondence Truth and Scientific Realism”, Synthese 159, 1–21.
  • Leplin, J. (1997), A Novel Defence of Scientific Realism. Oxford: Oxford University Press.
  • Lewis, D. (1970), “How to Define Theoretical Terms”, Journal of Philosophy 67, 427-446.
  • Lewis, D. (1984), “Putnam’s Paradox”, Australasian Journal of Philosophy 62, 221-236.
  • Lipton, P. (2004/1991), Inference to the Best Explanation. London: Routledge.
  • Liston, M. (1985), “Is a God’s-Eye-View an Ideal Theory?”, Pacific Philosophical Quarterly 66 (3-4), 355-376.
  • Liston, M. (2005), “Does ‘Rabbit’ refer to Rabbits?”, European Journal of Analytic Philosophy 1, 39-56.
  • Mach, E. (1893), The Science of Mechanics, trans. T. J. McCormack, 6th edition, La Salle: Open Court.
  • Magnus, P.D. and C. Callender. (2004), “Realist Ennui and the Base Rate Fallacy”, Philosophy of Science 71, 320–338.
  • Maxwell, G. (1962), “On the Ontological Status of Theoretical Entities”, in H. Feigl and G. Maxwell (eds.), Minnesota Studies in the Philosophy of Science III, Minneapolis: University of Minnesota Press.
  • Maxwell, G. (1970), “Structural Realism and the Meaning of Theoretical Terms”, in S. Winoker and M. Radner (eds.), Minnesota Studies in the Philosophy of Science IV, Minneapolis: University of Minnesota Press.
  • McMullin, E. (1991), “Rationality and Theory Change in Science”, in P. Horwich (ed.), Thomas Kuhn and the Nature of Science. Cambridge: MIT Press.
  • Merrill, G. H. (1980), “The Model-Theoretic Argument Against Realism”, Philosophy of Science 47, 69-81.
  • Miller, D. (1974), “Popper’s Qualitative Theory of Verisimilitude”, British Journal for the Philosophy of Science 25, 166–177.
  • Miller, R. (1987), Fact and Method. Princeton: Princeton University Press.
  • Musgrave, A. (1985), “Realism vs Constructive Empiricism”, in Churchland and Hooker 1985.
  • Musgrave, A. (1989), “Noa’s Ark–Fine for Realism”, Philosophical Quarterly 39, 383–398.
  • Newman, M. H. A. (1928), “Mr. Russell’s ‘Causal Theory of Perception’”, Mind 37, 137-148.
  • Niiniluoto, I. (1987), Truthlikeness. Dordrecht: Reidel.
  • Poincaré, H. (1913), The Foundations of Science. New York: The Science Press.
  • Psillos, S. (1995), “Is Structural Realism the Best of Both Worlds?”, Dialectica 49, 15-46.
  • Psillos, S. (1999), Scientific Realism: How Science Tracks Truth. London: Routledge.
  • Psillos, S. (2001), “Is Structural Realism Possible?”, Philosophy of Science 68, S13–S24.
  • Putnam, H. (1962), “What Theories Are Not”, in Putnam 1975c.
  • Putnam, H. (1975a), “Explanation and Reference”, in Putnam 1975d.
  • Putnam, H. (1975b), “The Meaning of ‘Meaning’”, in Putnam 1975d.
  • Putnam, H. (1975c), Philosophical Papers 1: Mathematics, Matter and Method. Cambridge: Cambridge University Press.
  • Putnam, H. (1975d), Philosophical Papers 2: Mind, Language and Reality. Cambridge: Cambridge University Press.
  • Putnam, H. (1978), Meaning and the Moral Sciences. London: Routledge.
  • Putnam, H. (1981), Reason, Truth and History. Cambridge: Cambridge University Press.
  • Putnam, H. (2015), “Naturalism, Realism, and Normativity”, Journal of the American Philosophical Association 1(2), 312-328.
  • Quine, W.V. (1955), “Posits and Reality”, in W. V. Quine, The Ways of Paradox and Other Essays. Cambridge: Harvard University Press (1976), 246-254.
  • Quine, W.V. (1969), “Epistemology Naturalized”, in W. V. Quine, Ontological Relativity and Other Essays. New York: Columbia University Press (1969): 69-90.
  • Rorty, R. (1980), Philosophy and the Mirror of Nature. Princeton: Princeton University Press.
  • Russell, B. (1927), The Analysis of Matter. London: Routledge, Kegan-Paul.
  • Stanford, P. K. (2001), “Refusing the Devil’s Bargain: What Kind of Underdetermination Should We Take Seriously?”, Philosophy of Science 68 (3), S1-S12.
  • Stanford, P.K. (2003a), “Pyrrhic Victories for Scientific Realism”, Journal of Philosophy 100 (11), 553-572.
  • Stanford, P.K. (2003b), “No Refuge for Realism: Selective Confirmation and the History of Science”, Philosophy of Science 70, 917-925.
  • Stanford, P.K. (2006), Exceeding our Grasp. Oxford: Oxford University Press.
  • Stanford, P.K. (2015), “‘Atoms Exist’ Is Probably True, and Other Facts That Should Not Comfort Scientific Realists”, Journal of Philosophy 112 (8), 397-416.
  • van Fraassen, B. (1980), The Scientific Image. Oxford: Clarendon Press.
  • van Fraassen, B. (2006), “Structure: its Shadow and Substance”, British Journal for the Philosophy of Science 57, 275-307.
  • van Fraassen, B. (2008), Scientific Representation. Oxford: Clarendon Press.
  • Wilson, M. (1982), “Predicate Meets Property”, Philosophical Review 91(4), 549-589.
  • Wilson, M. (1985), “What can Theory Tell us about Observation?”, in Churchland and Hooker 1985.
  • Wilson, M. (1998), “Mechanics, Classical”, in Edward Craig (ed.), The Routledge Encyclopedia of Philosophy Vol. 6, 251-259, London: Routledge.
  • Wilson, M. (2006), Wandering Significance. Oxford: Oxford University Press.
  • Worrall, J. (1989), “Structural Realism: The Best of Both Worlds?”, Dialectica 43, 99–124.

 

Author Information

Michael Liston
Email: mnliston@uwm.edu
University of Wisconsin-Milwaukee
U. S. A.

Stoicism

Stoa of Attalus in Athens

Stoicism originated as a Hellenistic philosophy, founded in Athens by Zeno of Citium (modern-day Cyprus), c. 300 B.C.E. It was influenced by Socrates and the Cynics, and it engaged in vigorous debates with the Skeptics, the Academics, and the Epicureans. The name comes from the Stoa Poikile, or painted porch, an open market in Athens where the original Stoics used to meet and teach philosophy. Stoicism moved to Rome where it flourished during the period of the Empire, alternately being persecuted by Emperors who disliked it (for example, Vespasian and Domitian) and openly embraced by Emperors who attempted to live by it (most prominently Marcus Aurelius). It influenced Christianity, as well as a number of major philosophical figures throughout the ages (for example, Thomas More, Descartes, Spinoza), and in the early 21st century saw a revival as a practical philosophy associated with Cognitive Behavioral Therapy and similar approaches. Stoicism is a type of eudaimonic virtue ethics, asserting that the practice of virtue is both necessary and sufficient to achieve happiness (in the eudaimonic sense). However, the Stoics also recognized the existence of “indifferents” (to eudaimonia) that could nevertheless be preferred (for example, health, wealth, education) or dispreferred (for example, sickness, poverty, ignorance), because they had (respectively, positive or negative) planning value with respect to the ability to practice virtue. Stoicism was very much a philosophy meant to be applied to everyday living, focused on ethics (understood as the study of how to live one’s life), which was in turn informed by what the Stoics called “physics” (nowadays, a combination of natural science and metaphysics) and what they called “logic” (a combination of modern logic, epistemology, philosophy of language, and cognitive science).

Table of Contents

  1. Historical Background
    1. Philosophical Antecedents
    2. Greek Stoicism
    3. Roman Stoicism
    4. Debates with Other Hellenistic Schools
  2. The First Two Topoi
    1. “Logic”
    2. “Physics”
  3. The Third Topos: Ethics
  4. Apatheia and the Stoic Treatment of Emotions
  5. Stoicism after the Hellenistic Era
  6. Contemporary Stoicism
  7. Glossary
  8. References and Further Readings

1. Historical Background

Classically, scholars recognize three major phases of ancient Stoicism (Sedley 2003): the early Stoa, from Zeno of Citium (the founder of the school, c. 300 B.C.E.) to the third head of the school, Chrysippus; the middle Stoa, including Panaetius and Posidonius (late II and I century B.C.E.); and the Roman Imperial period, or late Stoa, with Seneca, Musonius Rufus, Epictetus and Marcus Aurelius (I through II century C.E.). Of course, Stoicism itself originated as a modification of previous schools of thought (Schofield 2003), and its influence extended well beyond the formal closing of the ancient philosophical schools by the Byzantine Emperor Justinian I in 529 C.E. (Verbeke 1983; Colish 1985; Osler 1991).

a. Philosophical Antecedents

Stoicism is a Hellenistic eudaimonic philosophy, which means that we can expect it to be influenced by its immediate predecessors and contemporaries, as well as to be in open critical dialogue with them. These include Socratic thinking, as it has come down to us mainly through the early Platonic dialogues; the Platonism of the Academic school, particularly in its Skeptical phase; the Aristotelianism of the Peripatetic school; Cynicism; Skepticism; and Epicureanism. It is worth noting, in order to put things into context, that a quantitative study of extant records concerning known philosophers of the ancient Greco-Roman world (Goulet 2013) estimates that the leading schools of the time were, in descending order: Academics-Platonists (19%), Stoics (12%), Epicureans (8%), and Peripatetics-Aristotelians (6%).

Eudaimonia was the term for a life worth living, nowadays often translated as “happiness” in the broad sense or, more appropriately, as flourishing. For the Greco-Romans this often involved—but was not necessarily entirely defined by—excellence at the moral virtues. The idea is therefore closely related to that of virtue ethics, an approach most famously associated with Aristotle and his Nicomachean Ethics (Broadie & Rowe 2002), and revived in modern times by a number of philosophers, including Philippa Foot (2001) and Alasdair MacIntyre (1981/2013).

Stoicism is best understood in the context of the differences among some of the similar schools of the time. Socrates had argued—in the Euthydemus, for instance (McBrayer et al. 2010)—that virtue, and in particular the four cardinal virtues of wisdom, courage, justice and temperance, are the only good. Everything else is neither good nor bad in and of itself. By contrast, for Aristotle the virtues (of which he listed a whopping twelve) were necessary but not sufficient for eudaimonia. One also needed a certain degree of positive goods, such as health, wealth, education, and even a bit of good looks. In other words, Aristotle expounded the rather commonsensical notion that a flourishing life is part effort, because one can and ought to cultivate one’s character, and part luck, in the form of the physical and cultural conditions that affect and shape one’s life.

Contrast this to the rather extreme (even for the time) take of the Cynics, who not only thought that virtue was the only good, like Socrates, but that the additional goods that Aristotle was worried about were actually distractions and needed to be positively avoided. Cynics like Diogenes of Sinope were famous for their ascetic and, shall we say, rather eclectic lifestyle, as epitomized by a story about him told by Diogenes Laertius (VI.37): “One day, observing a child drinking out of his hands, he cast away the cup from his wallet with the words, ‘A child has beaten me in plainness of living.’”

Diogenes and the boy without a cup

One way to think of this is that the Aristotelian approach comes across as a bit too aristocratic: if one does not have certain privileges in life, one cannot achieve eudaimonia. By contrast, the Cynics were preaching a rather extreme minimalist lifestyle, which is hard for most human beings to practice. What the Stoics tried to do, then, was to strike a balance in the middle, by endorsing the twin crucial ideas, on which I will elaborate later, that virtue is the only true good, in itself sufficient for eudaimonia regardless of one’s circumstances, but also that other things—like health, education, wealth—may be rationally preferred (Proēgmena) or “dispreferred” (Apoproēgmena), as in the case of sickness, ignorance, and poverty, as long as one did not confuse them with things of inherent value.

b. Greek Stoicism

The “Greek” phase of the Stoa covers the first and second periods, from the founding of the school by Zeno to the shifting of the center of gravity from Athens to Rome in the time of Posidonius, in the I Century B.C.E. Posidonius became a friend of Cicero—not a Stoic himself, but one of our best indirect sources on early Stoicism. Stoicism was not just born but flourished in Athens, even though most of its exponents originated from the Eastern Mediterranean: Zeno from Citium (modern Cyprus), Cleanthes from Assos (modern Western Turkey), and Chrysippus from Soli (modern Southern Turkey), among others. According to Sedley (2003), this pattern is simply a reflection of the dominant cultural dynamics of the time, affected as they were by the conquests of Alexander.

From the beginning Stoicism was squarely a “Socratic” philosophy, and the Stoics themselves did not mind such a label. Zeno began his studies under the Cynic Crates, and Cynicism always had a strong influence on Stoicism, all the way to the later writings of Epictetus. But Zeno also counted among his teachers Polemo, the head of the Academy, and Stilpo, of the Megarian school founded by Euclid of Megara, a pupil of Socrates. This is relevant because Zeno came to elaborate a philosophy that was both of clear Socratic inspiration (virtue is the Chief Good) and a compromise between Polemo’s and Stilpo’s positions, as the first one endorsed the idea that there are external goods—though they are of secondary importance—while the second one claimed that nothing external can be good or bad. That compromise consisted in the uniquely Stoic notion that external goods are of ethically neutral value, but are nonetheless the object of natural pursuit.

Zeno established the tripartite study of Stoic philosophy (see the three topoi) comprising ethics, physics and logic. The ethics was basically a moderate version of Cynicism; the physics was influenced by Plato’s Timaeus (Taran 1971) and encompassed a universe permeated by an active (that is, rational) and a passive principle, as well as a cosmic web of cause and effect; the logic included both what we today refer to as formal logic and epistemology, that is, a theory of knowledge, which for the Stoics was decidedly empiricist-naturalistic.

The Stoics after Zeno disagreed on a number of issues, often interpreting Zeno’s teachings differently. Perhaps the most important example is provided by the dispute between Cleanthes and Chrysippus about the unity of the virtues: Zeno had talked about each virtue in turn being a kind of wisdom, which Cleanthes interpreted in a strict unitary sense (that is, all virtues are one: wisdom), while Chrysippus understood it in a more pluralistic fashion (that is, each virtue is a “branch” of wisdom).

The early Stoics could also be stubbornly anti-empirical in their apologetics of Zeno’s writings, as when Chrysippus insisted on defending the idea that the heart, not the brain, is the seat of intelligence. This went against pretty conclusive anatomical evidence that was already available in the Hellenistic period, and earned the Stoics the scorn of Galen (for example, Tieleman 2002), though later Stoics did update their beliefs on the matter.

Despite this faux pas, Chrysippus was arguably the most influential Stoic thinker, responsible for an overhaul of the school, which had declined under the guidance of Cleanthes, a broad systematization of its teachings, and the introduction of a number of novel notions in logic—the aspect of Stoicism that has had the most technical philosophical impact in the long run. Famously, Diogenes Laertius (2015, VII.183) wrote that “But for Chrysippus, there had been no Porch.”

In the six decades following Chrysippus there were just two heads of the Stoa, Zeno of Tarsus (south-central Turkey) and Diogenes of Babylon, whose contributions were rather less significant than those of Chrysippus himself. We have to wait until 155 B.C.E. for the next impactful event, when the heads of the three major schools in Athens—the Stoics, the Academics and the Peripatetics—were sent by the city to Rome in order to help with diplomatic efforts. (It is interesting to note, as does Sedley (2003), that the fourth large school, the Epicurean one, was missing, in keeping with their stance of political non-involvement.) The philosophers in question, including the Stoic Diogenes of Babylon, made a huge impression on the Roman public with their performances (and, apparently, an equally worrisome one on the Roman elite, thus beginning a long tradition of tension between philosophers and high-level politicians that characterized especially the post-Republican empire), paving the way for the later shift of philosophy from Athens to Rome, as well as other centers of learning, like Alexandria.

Beginning with Antipater of Tarsus, and then more obviously Panaetius (late II Century B.C.E.) and Posidonius (early I Century B.C.E.), the Stoics revisited their relationship with the Academy, especially in light of the above mentioned importance of the Timaeus for Stoic cosmology. Apparently, what particularly interested Posidonius was the fact that Plato’s main character in the dialogue is a Pythagorean, a school that Posidonius somewhat anachronistically managed to link to Stoicism.

It appears that the broader project pursued by both Panaetius and Posidonius was one of seeking common ground (Sedley 2003 uses the term “syncretism”) among Academicism, Aristotelianism and Stoicism itself, that is, the three branches of Socratic philosophy. This process seems to have been in part responsible for the further success of Stoicism once the major philosophers of the various schools moved from Athens to Rome, after the diaspora of 88-86 B.C.E.

c. Roman Stoicism

If the visit to Rome by the heads of various philosophical schools in 155 B.C.E. was crucial for bringing philosophy to the attention of the Romans, the political events of 88-86 B.C.E. changed the course of Western philosophy in general, and Stoicism in particular, for the remainder of antiquity.

At that time philosophers, particularly the Peripatetic Athenion and—surprisingly—the Epicurean Aristion, were politically in charge at Athens, and made the crucial mistake of siding with Mithridates against Rome (Bugh 1992). The defeat of the King of Pontus, and consequently of Athens, spelled disaster for the latter and led to a diaspora of philosophers throughout the Mediterranean.

To be fair, we have no evidence of the continuation of the Stoa as an actual school in Athens after Panaetius (who often absented himself to Rome anyway), and we know that Posidonius taught in Rhodes, not Athens. However, according to Sedley (2003), it was the events of 88-86 B.C.E. that finally and permanently moved the center of gravity of Stoicism away from its Greek cradle to Rome, Rhodes (where an Epicurean school also flourished), and Tarsus, where a Stoic was at one point chosen by Augustus to govern the city.

Most crucially, however, Stoicism became important in Rome during the fraught time of the transition between the late Republic and the Empire, with Cato the Younger eventually becoming a role model for later Stoics because of his political opposition to the “tyrant” Julius Caesar. Sedley highlights two Stoic philosophers of the late First Century B.C.E., Athenodorus of Tarsus and Arius Didymus, as precursors of one of the greatest and most controversial Stoic figures, Seneca. Both Athenodorus and Arius were personal counselors to the first emperor, Augustus, and Arius even wrote a letter of consolation to Livia, Augustus’ wife, addressing the death of her son, which Seneca later hailed as a reference work of emotional therapy, the sort of work he himself engaged in and became famous for.

Once we get to the Imperial period (Gill 2003), we see a decided shift away from the more theoretical aspects of Stoicism (the “physics” and “logic,” see below) and toward more practical treatments of the ethics. However, as Gill points out, this should not lead us to think that the vitality of Stoicism had taken a nose dive by then: we know of a number of new treatises produced by Stoic writers of that period, on everything ranging from ethics (Hierocles’ Elements of Ethics) to physics (Seneca’s Natural Questions), and the Summary of the Traditions of Greek Theology by Cornutus is one of a handful of complete Stoic treatises to survive from any period of the history of the school. Still, it is certainly the case that the best known Stoics of the time were either teachers like Musonius Rufus and Epictetus, or politically active, like Seneca and Marcus Aurelius, thus shaping our understanding of the period as a contrast to the foundational and more theoretical one of Zeno and Chrysippus.

Importantly, it is from the late Republic and Empire that we also get some of the best indirect sources on Stoicism, particularly several books by Cicero (2014; for example, Paradoxa Stoicorum, De Finibus Bonorum et Malorum, Tusculanae Quaestiones, De Fato, Cato Maior de Senectute, Laelius de Amicitia, and De Officiis) and Diogenes Laertius’ Lives of the Eminent Philosophers (Book VII, 2015). And this literature went on to influence later writers well after the decline of Stoicism, particularly Plotinus (205-270 C.E.) and even the 6th Century C.E. Neoplatonist Simplicius.

All of the above notwithstanding, what is most vital about Stoicism during the Roman Imperial period is also what arguably made the philosophy’s impact reverberate throughout the centuries, eventually leading to two revivals, the so-called Neostoicism of the Renaissance, and the current “modern Stoicism” movement to which I will turn at the end of this essay. The sources of such vitality were fundamentally two: on the one hand charismatic teachers like Musonius and Epictetus, and on the other hand influential political figures like Seneca and Marcus. Indeed, Musonius was, in a sense, both: not only was he a member of the Roman “knight” class and the teacher of Epictetus, he was also politically active, openly criticizing the policies of both Nero and Vespasian, and getting exiled twice as a result. Others were not so lucky: Stoic philosophers suffered a series of persecutions from displeased emperors, which resulted in murders or exile for a number of them, especially during the reigns of Nero, Vespasian and Domitian. Seneca famously had to commit suicide on Nero’s orders, and Epictetus was exiled to Greece (where he established his school at Nicopolis) by Domitian.

It is also important to appreciate different “styles” of being Stoic among the major Roman figures. As Gill (2003) points out, Epictetus was rather strict, harking back to the Cynic model of quasi-asceticism (see, for instance, his “On Cynicism” in Discourses III.22). Musonius was a sometimes odd combination of “conservative” and “progressive” Stoic, advocating the importance of marriage and family, but also stating very clearly that women are just as capable of practicing virtue and philosophizing as men are, and moreover that it is hypocritical of men to consider their extramarital sexual activities differently from those of women! Seneca was not only more open to the pursuit of “preferred indifferents” (he was a wealthy Senator, but it seems unfair to accuse him of endorsing a simplistic self-serving philosophy: see the nuanced biographies by Romm 2014 and Wilson 2014), but explicitly stated that he was critical of some of the doctrines of the early Stoics, and that he was open to learning from other schools, including the Epicureans. Famously, Marcus Aurelius was open—one would almost want to say agnostic—about theology, at several points in the Meditations (1997) explicitly stating the two alternatives of “Providence” (Stoic doctrine) or “Atoms” (the Epicurean take), for instance: “Either there is a fatal necessity and invincible order, or a kind Providence, or a confusion without a purpose and without a director. If then there is an invincible necessity, why do you resist? But if there is a Providence that allows itself to be propitiated, make yourself worthy of the help of the divinity. But if there is a confusion without a governor, be content that in such a tempest you have yourself a certain ruling intelligence” (XII.14); or: “With respect to what may happen to you from without, consider that it happens either by chance or according to Providence, and you must neither blame chance nor accuse Providence” (XII.24). More is said about this specific topic in the section on Stoic metaphysics and teleology.

There is ample evidence, then, that Stoicism was alive and well during the Roman period, although the emphasis did shift—somewhat naturally, one might add—from laying down the fundamental ideas to refining them and putting them into practice, both in personal and social life.

d. Debates with Other Hellenistic Schools

One should understand the evolution of all Hellenistic schools of philosophy as being the result of continuous dialogue amongst themselves, a dialogue that often led to partial revisions of positions within any given school, or to the adoption of a modified notion borrowed from another school (Gill 2003). To have an idea of how this played out for Stoicism, let us briefly consider a few examples, related to the interactions between Stoicism and Epicureanism, Aristotelianism, and Platonism—without forgetting the direct influence that Cynicism had on the very birth of Stoicism and all the way to Epictetus.

Epictetus is pretty explicit about his—negative—opinions of the Epicureans, drawing as sharp a contrast as possible between the latter’s concern with pleasure and pain and the Stoic focus on virtue and integrity of character. For example, Discourses I.23 is entitled “Against Epicurus,” and begins: “[1] Even Epicurus realizes that we are social creatures by nature, but once he has identified our good with the shell, he cannot say anything inconsistent with that. [2] For he further insists—rightly—that we must not respect or approve anything that does not share in the nature of what is good.” “The shell” here is the body, a reference to the Epicureans’ insistence on pleasure and the absence of pain as what leads to ataraxia, or tranquillity of mind—a term interestingly different from the one preferred by the Stoics, apatheia, or lack of disturbing emotions, as shall be seen below.

A longer section, II.20, is entitled “Against the Epicureans and the Academics,” at the beginning of which Epictetus calls the bluff, in his mind, on the rivals’ theories, which he understands as clearly impractical and contrary to common sense: “[1] Even people who deny that statements can be valid or impressions clear [that is, the Skeptics] are obliged to make use of both. You might almost say that nothing proves the validity of a statement more than finding someone forced to use it while at the same time denying that it is sound.” Epictetus even goes so far as suggesting that Epicurus is incoherent, as he advises a life of retired tranquility away from society, and yet bothers to write books about it, thus showing himself to be concerned about the welfare of society after all: “[15] What urged him to get out of bed and write the things he wrote was, of course, the strongest element in a human being—nature—which subjected him to her will despite his loud resistance.”

Attacking the Skeptics among the Academics, Epictetus turns up the rhetoric significantly: “What a travesty! [28] What are you doing? You prove yourself wrong on a daily basis and still you won’t give up these idle efforts. When you eat, where do you bring your hand—to your mouth, or to your eye? What do you step into when you bathe? When did you ever mistake your saucepan for a dish, or your serving spoon for a skewer?” And he sees his invective as justified—in sure Stoic fashion—not on theoretical grounds, but on practical ones: “[35] We could give adulterers grounds for rationalizing their behavior; such arguments could provide pretexts to misappropriate state funds; a rebellious young man could be emboldened further to rebel against his parents. So what, according to you, is good or bad, virtuous or vicious—this or that?”

Even so, not all Stoics rejected either Academic or Epicurean ideas altogether. I have mentioned Marcus Aurelius’ relative “agnosticism” about Providence vs. Atoms (though he clearly preferred the first option, in line with standard Stoic teaching), and Seneca is often sympathetic to Epicurean views, though, as Gill (2003, note 58) comments, this is in the spirit of showing that even some of the rival school’s ideas are congruent with Stoic ones. He very clearly states, however, in Natural Questions: “I do not agree with [all] the views of our school” (2014, VII.22.1).

Cicero, in Book III of De Finibus, provides us with some glimpses of the disagreement between Stoics and Aristotelians, by way of his imaginary dialogue with Cato the Younger. At [41] he writes: “Carneades never ceased to contend that on the whole so-called ‘problem of good and evil,’ there was no disagreement as to facts between the Stoics and the Peripatetics, but only as to terms. For my part, however, nothing seems to me more manifest than that there is more of a real than a verbal difference of opinion between those philosophers on these points.” He continues: “The Peripatetics say that all the things which under their system are called goods contribute to happiness; whereas our school does not believe that total happiness comprises everything that deserves to have a certain amount of value attached to it,” referring to the different treatment of “external goods” between Aristotelians and Stoics.

There are well-documented examples of Stoic opinions changing in direct response to challenges from other schools, for instance the modified position on determinism that was adopted by Philopator (80-140 C.E.), a result of criticism from both the Peripatetic and the Middle Platonist philosophers. We also have clear instances of Stoic ideas being incorporated by other schools, as in the case of Antiochus of Ascalon (130-69 B.C.E.), who introduced Stoic notions in his revision of Platonism, justifying the move by claiming that Zeno (and Aristotle, for that matter) developed ideas that were implicit in Plato (Gill 2003). Finally, Stoicism found its way into Christianity via Middle Platonism, at least since Clement of Alexandria (150-215 C.E.).

2. The First Two Topoi

A fundamental aspect of Stoic philosophy is the twofold idea that ethics is central to the effort, and that the study of ethics is to be supported by two other fields of inquiry, what the Stoics called “logic” and “physics.” Together, these form the three topoi of Stoicism.

We will take a closer look at each topos in turn, but it is first important to see why and how they are connected. Stoicism was a practical philosophy, the chief goal of which was to help people live a eudaimonic life, which the Stoics identified with a life spent practicing the cardinal virtues (next section). Later in the Roman period the emphasis shifted somewhat to the achievement of apatheia, but this too was possible because of the practice of the topos of ethics.

This, in turn, was to be supported by the study of the other two topoi, “logic,” which was more expansive than the modern technical meaning of the term, including logic sensu stricto, but also a theory of knowledge (that is, epistemology), as well as cognitive science, and “physics,” by which the Stoics meant roughly what we would today identify as a combination of natural science and metaphysics (the latter including theology). Roughly, then, “logic” means the study of how to reason about the world, while “physics” means the study of that world.

Logic and Physics are related to Ethics because Stoicism is a thoroughly naturalistic philosophy. Even when the Stoics are talking about “God” or “soul,” they are referring to physical entities, respectively identified with the rational principle embedded in the universe itself and with whatever makes human rationality possible. Stoics often invoked creative imagery to explain the relationship among Physics, Logic and Ethics, as found in Diogenes Laertius (VII.39), for instance. Perhaps the most famous of such analogies is the one using an egg, where the shell is the Logic, the white the Ethics, and the red part (the yolk) the Physics. However, given how the three topoi were meant to relate to each other, this is probably misleading, possibly due to a misunderstanding of the biology of eggs (the Physics is supposed to be nurturing the Ethics, which means that the former should be the white and the latter the red part of the egg). The best simile, to my mind, is that of a garden: the fence is the Logic—defending the precious inside and defining its boundaries; the fertile soil is the Physics—providing the nutritive power by way of knowledge of the world; and the resulting fruits are the Ethics—the actual focal objective of Stoic teachings.

While the Stoics disagreed on the sequence in which the three topoi should be presented to students (that is, just like faculty in a modern university, they had contrasting opinions about the merits of different curricula!), the crucial point is that of a naturalistic philosophy where there is no sharp distinction between “is” and “ought,” as assumed in much modern moral philosophy, because what an agent ought to do (Ethics) is in fact closely informed by that agent’s knowledge of the workings of the world (Physics) as well as her capacity to reason correctly (Logic). This section describes the first two topoi; the next describes Ethics.

a. “Logic”

Stoics made important early contributions to both epistemology (Hankinson 2003) and logic proper (Bobzien 2003), and much has, deservedly, been written about both. While Stoics held that the Sage, who was something of an ideal figure, could achieve perfect knowledge of things, in practice they relied on a concept of cognitive progress, as well as moral progress, since both logic and physics are related to, and indeed function in the service of, ethics. They referred to this idea as prokopê (making progress), and they engaged in a long-running dispute with Academic Skeptics about just how defensible this notion actually is.

Unlike the Epicureans, Stoics did not maintain that all impressions are true, but rather that some of them were “cataleptic” (that is, leading to comprehension) and others were not. Diogenes Laertius explains the difference (VII.46): “the cataleptic, which [the Stoics] hold to be the criterion of matters, is that which comes from something existent and is in accordance with the existent thing itself, and has been stamped and imprinted; the non-cataleptic either comes from something non-existent, or if from something existent then not in accordance with the existent thing; and it is neither clear, nor distinct.”

So the Stoics did admit that one’s perception can be wrong, as in cases of hallucinations, or dreams, or other sources of phantasma (that is, impressions on the mind, the result of automatic—we would say unconscious—judgment), but also that proper training allows one to make progress in distinguishing cataleptic from non-cataleptic impressions (that is, impressions to which we may reasonably give or withhold assent). Chrysippus even suggested that it is important to absorb a number of impressions, since it is the accumulation of impressions that leads to concept-formation and to making progress. In this sense, the Stoic account of knowledge was eminently empiricist in nature, and—especially after relentless Skeptical critiques—relied on something akin to what moderns call inference to the best explanation (Lipton 2003), as in their conclusion that our skin must have holes based on the observation that we sweat.

It is important to realize that a cataleptic impression is not quite knowledge. The Stoics distinguished among opinion (weak, or false), apprehension (characterized by an intermediate epistemic value), and knowledge (which is based on firm impressions unalterable by reason). Giving assent to a cataleptic impression is a step on the way to actual knowledge, but the latter is more structured and stable than any single impression could be. In a sense, then, the Stoics held to a coherentist view of justification (for example, Angere 2007), and ultimately, like all ancients, to a correspondence theory of truth (for example, O’Connor 1975).

Hankinson (2003) comments on an interesting aspect of the dispute between Stoics and Academic Skeptics, concerning the epistemic warrant to be granted to cataleptic impressions. What, precisely, makes them “clear and distinct,” Stoic terminology that clearly anticipates Descartes (who, obviously, was not an empiricist)? If clarity and distinctness are internal features of cataleptic impressions, then these are phenomenal features, and it is easy to come up with counterexamples where they do not seem to work (for instance, the common occurrence of mistaking one member of a pair of twins for the other one).

This is where we encounter one of the many episodes of growth of Stoic thought in response to external pressure. Cicero tells us (2014, in Academica II.77) that Zeno was aware that the same impression could derive from something that did or did not exist, so he modified his stance (as Diogenes Laertius reports: VII.50), adding the following clause: “of such a type as could not come from something non-existent.” Of course this does not solve the issue, but it builds on the Stoic metaphysical assumption that there cannot be two things that are exactly alike, as much as at times it may appear so to us. Frede (1983) advanced the further view that what makes a cataleptic impression clear and distinct is not any internal feature of that impression, but rather an external causal feature related to its origin. According to this account, then, Stoic epistemology is externalist (for example, Almeder 1995), rather than internalist (for example, Goldman 1980). Indeed, there is evidence that they became—again as a result of criticism from the Skeptics—reliabilists about knowledge (Goldman 1994). Athenaeus tells of the story of Sphaerus, a student of Cleanthes and colleague of Chrysippus, who was shown at a banquet what turned out to be birds made of wax. After he reached to pick one up he was accused of having given assent to a false impression. To which he—rather cleverly, but indicatively—replied that he had merely assented to the proposition that it was reasonable to think of the objects as actual birds, not to the stronger claim that they actually were birds.

When it comes to the area of Stoic “logic” that is closest to our, much narrower, conception of the field, the school made major contributions. Their system of syllogistics recognized that not all valid arguments are syllogisms and significantly differs from Aristotle’s, having more in common with modern-day relevance logic (Bobzien 2006). To simplify quite a bit (but see Bobzien 2003 for a somewhat in-depth treatment), Stoic syllogistics was built on five basic types of syllogisms, and complemented by four rules for arguments that could be deployed to reduce all other types of syllogisms to one of the basic five.
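By way of illustration, the five basic types of syllogisms (the “indemonstrables”) are usually reconstructed along the following lines in modern notation. This is a minimal sketch of the standard scholarly reconstruction, not a quotation from any Stoic source: the schematic letters p and q are a modern convenience (the Stoics themselves used ordinal expressions such as “the first” and “the second”), and the disjunction in the fourth and fifth forms is read exclusively.

```latex
% A standard modern reconstruction of the five Stoic "indemonstrables".
% The schematic letters p and q replace the Stoics' own "the first", "the second";
% \therefore requires the amssymb package, and \vee is read as exclusive disjunction.
\begin{align*}
&1.\quad p \to q,\; p \;\therefore\; q               && \text{(modus ponens)}\\
&2.\quad p \to q,\; \neg q \;\therefore\; \neg p     && \text{(modus tollens)}\\
&3.\quad \neg(p \wedge q),\; p \;\therefore\; \neg q && \\
&4.\quad p \vee q,\; p \;\therefore\; \neg q         && \text{(exclusive disjunction)}\\
&5.\quad p \vee q,\; \neg p \;\therefore\; q         &&
\end{align*}
```

The four rules for arguments mentioned above then licensed the reduction of more complex valid arguments to chains of these five basic forms.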

The broader Stoic approach to logic has been characterized as a type of propositional logic, anticipating aspects of Frege’s work (Beaney 1997). Stoic logic made a fundamental distinction between “sayables” and “assertibles.” The former are a broader category that includes assertibles as well as questions, imperatives, oaths, invocations and even curses. The assertibles then are self-complete sayables that we use to make statements. For instance, “If Zeno is in Athens then Zeno is in Greece” is a conditional composite assertible, constructed out of the individual simple assertibles “Zeno is in Athens” and “Zeno is in Greece.” A major difference between Stoic assertibles and Fregean propositions is that the truth or falsehood of assertibles can change with time: “Zeno is in Athens” may be true now but not tomorrow, and it may become true again next month. It is also important to note that truth or falsehood are properties of assertibles, and indeed that being either true or false is a necessary and sufficient condition for being an assertible (that is, one cannot assert, or make statements about, things that are neither true nor false).
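As a rough modern gloss (and not a Stoic formulation), this time-sensitivity can be captured by relativizing truth to a time of evaluation; the valuation symbol v_t below is shorthand introduced here purely for illustration.

```latex
% A rough modern gloss (not a Stoic formulation): the truth value of an
% assertible is relative to a time of evaluation t.
% Here A stands for the assertible "Zeno is in Athens".
\[
  v_{t}(A) \in \{\mathrm{T}, \mathrm{F}\},
  \qquad
  v_{t_1}(A) = \mathrm{T}
  \quad \text{while} \quad
  v_{t_2}(A) = \mathrm{F}
  \qquad (t_1 \neq t_2),
\]
% whereas a Fregean proposition receives a single, time-independent truth value.
```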

The Stoics were concerned with the validity of arguments, not with logical theorems or truths per se, which again is understandable in light of their interest in using logic to guard the fruits of their garden, the ethics. They also introduced modality into their logic, most importantly the modal properties of necessity, possibility, non-possibility, impossibility, plausibility and probability. This was a very modern and practically useful approach, as it directed attention to the fact that some assertibles induce assent even though they may be false, as well as to the observation that some assertibles have a higher likelihood of being true than not. Finally, the Stoics, and Chrysippus in particular, were sensitive to and attempted to provide an account of logical paradoxes such as the Liar and Sorites cases along lines that we today recognize as related to a semantics of vagueness (Tye 1994).

b. “Physics”

The Stoic topos of Physics includes what we today would classify as natural science (White 2003), metaphysics (Brunschwig 2003), and theology (Algra 2003). Let us briefly look at each in turn.

When it comes to natural science and cosmology, recall that the Stoics sought to “live according to nature,” which requires us to make our best efforts to understand nature. This also implies a very different view of natural science from the modern one: its study is not an end in itself, but rather subordinate to helping us live a eudaimonic life.

Stoics thought that everything real, that is, everything that exists, is corporeal—including God and soul. They also recognized a category of incorporeals, which included things like the void, time, and the “sayables” (meanings, which played an important role in Stoic Logic). This may appear to be a contradiction, given the staunchly materialist nature of Stoic philosophy, but is really no different from a modern philosophical naturalist who nonetheless grants that one can meaningfully talk about abstract concepts (“university,” “the number four”) which are grounded in materialism because they can only be thought of by corporeal beings such as ourselves.

They embraced what we might call a “vitalist” understanding of nature, which is permeated by two principles: an active one (identified with reason and God, referred to as the Logos) and a passive one (substance, matter). The active principle is un-generated and indestructible, while the passive one—which is identified with the four classical elements of water, fire, earth and air—is destroyed and recreated at every, eternally recurring, cosmic conflagration, a staple of Stoic cosmology. The cosmos itself is a living being, and its rational principle (Logos) is identified with aether, or the Stoic Fire (not to be confused with the elemental fire that is part of the passive principle). Consequently, God is immanent in the universe, and it is in fact identified with the creative cosmic Fire. This also means that the Stoics, unlike the Aristotelians, did not recognize the concept of a prime mover, nor of a Christian-type God outside of time and space, on the ground that something incorporeal cannot act on things, because it has no causal powers. From all of this, as White (2003) puts it, emerges a biological, rather than a mechanical picture of causation, which is significantly different from post-Cartesian and Newtonian mechanical philosophy.

Cosmic conflagrations, for the Stoics, repeat themselves in exactly the same manner, apparently because God/Nature laid out things in the best possible way the previous time around, and there is therefore no reason to change (though one would get the same outcome from an entirely deterministic causal model of the universe). It is interesting to muse about the fact that some modern cosmological models also predict either identical or varied recurring universes (Unger and Smolin 2014), but of course do away with the concept of Providence altogether. According to Eusebius (quoted by White), during the phase of cosmic conflagration, the creative Fire is “a sort of seed, which possesses the principles of all things and the causes of all things that have occurred, are occurring, and will occur—the interweaving and ordering of which is fate, knowledge, truth, and a certain inevitable and inescapable law of the things that exist.”

Cicero, in De Fato, lays out the Stoic theory of causality and actually equates fate with antecedent causes. Chrysippus had argued that there is no possibility of motion without causes, deducing that therefore everything has a cause. This concept of universal causality led the Stoics to accept divination as a branch of physics, not a superstition, as explained again by Cicero in De Divinatione, and this makes sense once one understands the Stoic view of the cosmos: predicting the future is not something that one does by going outside the laws of physics, but by intelligently exploiting such laws.

Metaphysically the Stoics were determinists (Frede 2003). Here is Cicero: “[the Stoics] say, that it is impossible, when all the circumstances surrounding both the cause and that of which it is a cause are the same, that things should not turn out a certain way on one occasion but that they should turn out that way on some other occasion” (De Fato, 199.22-25). The Stoics did have a concept of chance, but they thought of it (much like modern scientists) as a measure of human ignorance: random events are simply events whose causes are not understood by humans.

The consequences of Stoic physics for their ethics are clear, and are summarized again by Cicero, when he says that Chrysippus aimed at a middle position between what we today would call strict incompatibilism and libertarianism (Griffith 2013). White (2003) interestingly notes in this respect that—just like Spinoza—the Stoics shifted the emphasis from moral responsibility to moral worth and dignity.

In terms of fundamental ontology, the Stoics were anti-corpuscularian (unlike the pre-Socratic Atomists and the Stoics’ chief rivals, the Epicureans), on the grounds that the idea of atoms violated their concept of a seamless unity of the cosmos. It is tempting to see this as in the same ballpark as modern quantum mechanical theories that see the entire universe as constituted of a single “wave function” (Ladyman and Ross 2009), but of course this would be an anachronistic interpretation.

3. The Third Topos: Ethics

Stoic Ethics was not just another theoretical subject, but an eminently practical one. Indeed, especially for the later Stoics, ethics—understood as the study of how to live one’s life—was the point of doing philosophy. It was no easy task: Epictetus famously said (in Discourses III.24.30): “The philosopher’s lecture room is a hospital: you ought not to walk out of it in a state of pleasure, but in pain—for you are not in good condition when you arrive!” The starting point for Epictetus was the famous dichotomy of control, as expressed at the very beginning of the Enchiridion: “We are responsible for some things, while there are others for which we cannot be held responsible” (also translated as “Some things are up to us, other things are not up to us”).

The early Stoics were somewhat more theoretical in their approach, with Zeno, Cleanthes and Chrysippus attempting both to systematize their doctrines and to defend them from critiques from Epicurean and, especially, Academic-Skeptic quarters. The early Stoa’s famous motto in ethics was “follow nature” (or “live according to nature”), by which they meant both the rational-providential aspect of the cosmos (see Physics above) and more specifically human nature, which they conceived as that of a social animal capable of bringing rational judgment to bear on problems posed by how to live one’s life. (It appears that Zeno’s original articulation of the principle was “live consistently”, to which Cleanthes added the clarifying clause “with nature”: Schofield 2003.) Tightly related to this idea of following (human) nature was the Stoic concept of oikeiôsis, often translated as affinity, or appropriation. For the Stoics human beings have natural propensities to develop morally, propensities that begin as what we today would call instincts and can then be greatly refined with the onset of the age of reason in childhood and beyond. It is interesting to note that this naturalistic account of the roots of virtuous/moral behavior is highly compatible with modern findings in both evolutionary and cognitive science (for example, Putnam and others 2014).

Specifically, we naturally: (i) behave in such a fashion as to advance our interests and goals (health, wealth, and so forth); (ii) identify with other people’s interests (initially our parents, then friends, then countrymen); (iii) figure out ways to practically navigate the vicissitudes of life. The Stoics related these propensities directly to the four cardinal virtues of temperance, courage, justice and practical wisdom. Temperance and courage are required to pursue our goals, justice is a natural extension of our concern for an ever-increasing circle of people, and practical wisdom (phronêsis) is what best allows us to deal with whatever happens.

This brings us to the matter of how the virtues are related to each other. To begin with, the Stoics recognized the above-mentioned four cardinal virtues, but also a number of more specific ones within each major category (complete list in Sharpe 2014, derived from Stobaeus): for instance, practical wisdom included good judgment, discretion, resourcefulness; temperance could be broken down into propriety, sense of honor, self-control; courage was divided into perseverance, confidence, magnanimity; and justice comprised piety, kindness, sociability. Even so, they held to a view of virtue that is much more unitary than it may come across from this kind of list (Schofield 2003). The cardinal virtues are derived from Socrates, especially in Plato’s Republic, and so is a certain unifying way of considering the virtues. Justice can be conceptualized as practical wisdom applied to social living; courage as wisdom concerning endurance; and temperance as wisdom with regard to matters of choice. Chrysippus further elaborated this idea of pluralism within an underlying unity, making the virtues essentially inseparable, so that, say, one cannot be courageous and yet intemperate—in the Stoic sense of those words.

Hadot (1998) draws a series of parallels between the four virtues, the three topoi and what are referred to as the three Stoic disciplines: desire, action, and assent. The discipline of desire, sometimes referred to as Stoic acceptance, is derived from the study of physics, and in particular from the idea of universal cause and effect. It consists in training oneself to desire what the universe allows and not to pursue what it does not allow. A famous metaphor here, used by Epictetus, is that of a dog leashed to a cart: the dog can either fight the cart’s movement at every inch, thus hurting himself and ending up miserable; or he can decide to gingerly go along with the ride and enjoy the panorama. This is a version of what Nietzsche eventually called amor fati (love of fate), and is encapsulated in Epictetus’ phrase “endure [what the universe throws your way] and renounce [what the universe does not allow]” (Fragments 10). Consequently, according to Hadot, the discipline of desire is linked to the virtues of courage (to follow the order of the cosmos) and temperance (to be able to control one’s desires).

The second discipline, of action, is also called Stoic “philanthropy” and is the most prosocial of the three disciplines. The basic idea is that human beings ought to develop their natural concern for others in a way that is congruent with the exercise of the virtue of justice. Here the area of study most directly connected to the discipline is that of ethics itself. A representative quote is perhaps the one found in Marcus Aurelius’ Meditations (VIII.59): “Men exist for the sake of one another. Teach them then or bear with them.” The first sentence is a statement of philanthropy (in the Stoic, not modern, sense), while the second one makes it clear that for the emperor this was a duty to be performed either by engaging other people positively or at the very least by suffering their non-virtuous behavior, if need be.

The last discipline is that of assent, referred to as Stoic “mindfulness” (not to be confused with the variety of Buddhist concepts by the same name, especially the Zen one). I will get back to the concept of assent in the next section, as it is related to the Stoic treatment of the (moral) psychology of emotions, but for now suffice it to say that the discipline concerns the necessity of making decisions about what to accept or reject of our experience of the world, that is, how to make proper judgments. It is therefore linked to the virtue of practical wisdom, as well as to the area of study of logic. If we had to summarize it in a single sentence, Seneca’s “bring the mind to bear upon your problems” (On Tranquility of Mind, X.4) may be appropriate.

As we have seen so far, Stoic ethics is concerned exclusively with the concept of virtue (and associated disciplines)—whether understood as a unitary thing with a number of facets or otherwise. In this the Stoics were akin to the Cynics and unlike the Peripatetics, who instead allowed that a number of other things are necessary for a eudaimonic life, including (some) wealth, health, education, and so forth. The Peripatetics would not have assented to the idea of a eudaimonic Sage on the rack, a classic Stoic concept.

However, Stoic ethics actually attempts to strike a balance between the asceticism of the Cynics and the somewhat elitist views of the Peripatetics. It does so through the introduction of the Stoic concept of preferred and dispreferred “indifferents” briefly mentioned at the beginning. This is found already in Zeno’s book on Ethics, which is now lost, but which we know about from Diogenes Laertius (VII.4). Zeno distinguished between indifferents that have value (axia) and those that have disvalue (apaxia). The first group included things like health, wealth and education, while the second group comprised things like sickness, poverty and ignorance. The move was a brilliant one: as I argued above, it allowed the Stoics to get the best of both the Cynic and the Peripatetic worlds: yes, it is true that—if they don’t get in the way of practicing virtue—some indifferents are preferred; but they are called indifferents for a reason: they do not truly matter for the pursuit of the (moral) eudaimonic life. In other words, while it is undeniable that people naturally and rationally seek the preferred indifferents, it is also the case that one can be a person of moral integrity, achieving eudaimonia, regardless of one’s material circumstances.

There is much more to be said about Stoic ethics, of course, but before closing this introductory sketch let me comment on an issue that does not fail to come up, and which I have already briefly mentioned above: the connection between the undeniably teleological-providential views of the cosmos advanced by Stoic physics and the actual practice of Stoic ethics. The issue is this: given that the Stoics themselves insisted that the study of physics (and of logic) influences how we understand ethics, and given that they believed in the providential nature of the cosmos, does that mean that only people who accept the latter view can pursue eudaimonia? The generally accepted answer is no.

Gregory Vlastos (referred to in Schofield 2003) convincingly argued that what he called the “theocratic” principle does affect one’s conception of the relation between virtue and the order of the cosmos, specifically because it tells us that being virtuous is in agreement with such order. Crucially, however, Vlastos maintains that this does not change the content of virtue, nor does it affect one’s conception of eudaimonia. This is so because although the “physics” (which, remember, is a combination of natural sciences and metaphysics, and hence theology) does inform the ethics, it does so in what modern philosophers would call an underdetermined fashion: while ethics is not independent of physics (or logic), in the Stoic system, it also cannot be read directly off it. Stoic ethics is naturalistic, and thus very modern in nature, but it—to put it in rather anachronistic terms—does not simplistically erase Hume’s is/ought divide.

Vlastos’ position finds plenty of textual support from a number of Stoic sources, perhaps nowhere more obviously than in Marcus, as already reported. There are, however, other passages in the classical Stoic literature that do not lend themselves to a clear-cut position on the matter, such as this one from Epictetus: “What does it matter to me […] whether the universe is composed of atoms or uncompounded substances, or of fire and earth? Is it not sufficient to know the true nature of good and evil, and the proper bounds of our desires and aversions, and also of our impulses to act and not to act; and by making use of these as rules to order the affairs of our life, to bid those things that are beyond us farewell? It may very well be that these latter things are not to be comprehended by the human mind, and even if one assumes that they are perfectly comprehensible, well what profit comes from comprehending them? And ought we not to say that those men trouble in vain who assign all this as necessary to the philosopher’s system of thought? […] What Nature is, and how she administers the universe, and whether she really exists or not, these are questions about which there is no need to go on to bother ourselves” (Fragments 1). Please remember that for the Stoics “nature” was synonymous with “god.”

Indeed, it is because of this and other passages that Ferraiolo (2015), for instance, concludes that: “metaphysical doctrines about the nature and existence of God, and a rationally governed cosmos, are rather cleanly separable from Stoic practical counsel, and its conductivity to a well-lived, eudaimonistic life. Stoicism may have developed within a worldview infused with presuppositions of a divinely-ordered universe … but the efficacy of Stoic counsel is not dependent upon creation, design, or any form of intelligent cosmological guidance.”

On balance, it seems fair to say that the ancient Stoics did believe in a (physical) god that they equated with the rational principle organizing the cosmos, and which was distributed throughout the universe in a way that can be construed as pantheistic. While it is the case that they maintained that an understanding of the cosmos informs the understanding of ethics, construed as the study of how to live one’s life, it can also be reasonably argued that Stoic metaphysics underdetermined—on the Stoics’ own conception—their ethics, thus leaving room for a “God or Atoms” position that may have developed as a concession to the criticisms of the Epicureans, who were atomists.

4. Apatheia and the Stoic Treatment of Emotions

The naturalistic system of ethics developed by the Stoics bridges what would later be referred to as the is/ought gap by way of a sophisticated account of human developmental moral psychology (Brennan 2003). This section focuses on a related, major difference between Stoics and Epicureans, which begins with the respective use of two key terms indicating a desirable state of mind according to the two schools, and continues with a broader discussion of the Stoic classification of emotions (or “passions”).

As we have seen, Epictetus explains in a number of places where the Stoa differs from the Garden (for example, “Against Epicurus,” Discourses I.23), while Seneca tells his friend Lucilius that he happily borrows from Epicurus when it makes sense, as it is his “custom to cross even into the other camp, not as a deserter but as a spy” (Letter II, A beneficial reading program, in the new translation by Graver and Long 2015).

Recall that the Stoics thought the pivotal thing in life is virtue and its cultivation, while the Epicureans thought that the point was to seek moderate pleasure and especially avoid pain. Nonetheless, both schools thought that a crucial component of eudaimonia (the flourishing life) was something very similar, which the Stoics referred to as apatheia and the Epicureans as ataraxia. There are, however, some differences between the two concepts, especially in the way the two schools taught how one could achieve, or at least approximate, the respective states of mind.

The IEP article on Epictetus defines the two terms in the following fashion:

apatheia: freedom from passion, a constituent of the eudaimôn life

ataraxia: imperturbability, literally “without trouble,” sometimes translated as “tranquillity”; a state of mind that is a constituent of the eudaimôn life

So, both apatheia and ataraxia are components of the eudaimonic life, and indeed, while the second term is usually associated with the Epicureans, both schools used it.

As far as the Stoics are concerned, however, it is good to remember that “passion” did not mean what we now mean by that term, and indeed it did not even exactly overlap with the term “emotion” in the modern sense of the word. That is why it is grossly incorrect to say that the Stoics aimed at a passionless life, or at the suppression of emotions. Rather, the Stoics divided the “passions” into unhealthy and healthy ones. The first group included pain, fear, craving, and pleasure. The second one included “discretion,” “willing,” and “delight.” The latter are the positive counterparts of the first group, except for pain, which does not have a positive counterpart. Here is a summary:

The Stoic passions and their healthy counterparts:

  • Unhealthy passions (pathê): pain, fear, craving, pleasure.
  • Healthy passions (eupatheiai): discretion (counterpart of fear), willing (counterpart of craving), delight (counterpart of pleasure); pain has no healthy counterpart.

For the Stoics, then, the “passions” are not automatic, instinctive reactions that we cannot avoid experiencing. Instead, they are the result of a judgment, giving “assent” to an “impression.” So even when you read a familiar word like “fear,” don’t think of the fight-or-flight response that is indeed unavoidable when we are suddenly presented with a possible danger. What the Stoics meant by “fear” was what comes after that: your considered opinion about what caused said instinctive reaction. The Stoics realized that we have automatic responses that are not under our control, and that is why they focused on what is under our control: the judgment rendered on the likely causes of our instinctive reactions, a judgment rendered by what Marcus Aurelius called the ruling faculty (in modern cognitive science terminology: the executive function of the brain).

The Stoic view of emotions finds close parallels in modern neuroscience. For instance, Joseph LeDoux (2015) makes the important, if often neglected, point that there is a difference between what neuroscientists mean by “emotion” and what psychologists mean. Neuroscientifically, fear, for example, is the result of a defense and reaction mechanism that is involuntary and nonconscious, and whose major neural correlate is the amygdala. But what psychologists refer to when they talk of “fear” is a more complex emotion, constructed in part from the basic defense and reaction mechanism, to which the conscious mind adds cognitive interpretation, something very similar to the Stoic concept. The two meanings are not in contradiction, but are rather complementary. The cognitive interpretation of the raw emotion of fear, then, is brought about by a combination of one’s memories, cultural upbringing, deliberative thinking, and so forth. What the Stoics called “passions” clearly corresponds to the psychological, not the neuroscientific, meaning of emotion, and LeDoux’s own research seems to support the Stoic account and the practicability of their discipline of assent, seen in the previous section.

Going back to the above summary: pain is not the simple sensation of pain, but the failure to avoid something that we mistakenly judge bad. Similarly for the other pathê: fear is the irrational expectation of something bad or harmful; craving is the irrational striving for something mistakenly judged as good; and pleasure is the irrational elation over something that is actually not worth choosing. Contrariwise, the eupatheiai are the result of a rational aversion to vice and harmful things (discretion), a rational desire for virtue (willing), and a rational elation over virtue (delight). (It should be clear now why there is no such thing as a rational emotional pain.)

All of the above is why apatheia is best construed as equanimity in the face of what the world throws at us: if we apply reason to our experience, we will not be concerned with the things that do not matter, and we will correspondingly rejoice in the things that do matter.

There is another crucial difference between the two schools to be highlighted here: they get to apatheia/ataraxia by very different routes. The Epicureans sought ataraxia as a goal, achieved most of all through the avoidance of pain, which meant especially to withdraw from social and political life. It was good, for Epicurus, to cultivate one’s close friendships, but attempting to play a full role in the polis was a sure way to experience pain (physical or mental), and therefore it was to be avoided. For the Stoics, on the contrary, the goal was the exercise of virtue, which led them to embrace their social role. Marcus Aurelius, for instance, constantly writes in the Meditations that we need to get up in the morning and do the job of a human being, which he interprets to mean to be useful to society. Hierocles elaborated on the Cynic/Stoic concept of cosmopolitanism. The motto of the school was “follow nature,” by which was meant, as we have seen, the human nature of a social animal capable of rational judgment. And of course one of the four virtues examined in the previous section is justice, and one of the three disciplines is that of action—both explicitly prosocial. Apatheia, then, was not a goal for the Stoics, but an advantageous byproduct (a preferred indifferent, so to speak) of living the virtuous life.

5. Stoicism after the Hellenistic Era

As Long (2003) has remarked, Stoicism has had a pervasive, yet largely unacknowledged influence on Western philosophical thought throughout the Middle Ages, Renaissance, and into modern times. Among the philosophers that he lists as being directly or indirectly affected by Stoicism are Augustine, Thomas More, Descartes, Spinoza, Leibniz, Rousseau, Adam Smith, and Kant, to whom we can easily add David Hume. During the Renaissance both Stoic books, particularly Epictetus’ Enchiridion and Seneca’s Letters, and books favorable to Stoicism, like Cicero’s De Officiis, were widely read.

Christianity was far more sympathetic to Stoicism than to its main rival, Epicureanism (and it also absorbed elements of Platonism in its “neo” form). The Epicurean emphasis on pleasure, as well as their metaphysics of cosmic chaos, were prima facie incompatible with Christian theology. The case of Stoicism was more complex. On the one hand, the Stoic insistence on materialism and pantheism was criticized and rejected; on the other hand, the idea of the Logos could easily be adapted—if in a fashion that the Stoics themselves would not have recognized—and the emphasis on virtue was often seen as pretty much the best that people could manage before the coming of Christ.

This is why we find an interestingly mixed record of Christian attitudes toward Stoicism. Augustine initially wrote favorably about it, while later on he was more critical. Tertullian was positively inclined toward Stoicism, and versions of the Enchiridion were commonly used (with Paul replacing Socrates) in monasteries. Peter Abelard and John of Salisbury were influenced by Stoic ethics too, while Thomas Aquinas was critical, especially of an early attempt at reviving Stoicism made by David of Dinant at the beginning of the 13th century.

A major revival of Stoicism did eventually take place, during the Renaissance, largely because of the work of Justus Lipsius (1547-1606). He was a humanist and classical philologist who published critical editions of Seneca and Tacitus. His major opus was De Constantia (1584), where he argued that Christians can draw on the resources of Stoicism during troubled times, while at the same time carefully pointing out aspects of Stoicism that are unacceptable for a Christian. Lipsius also drew on Epictetus, whose Enchiridion had first been translated into English a few years earlier. Other Neostoics included the French statesman Guillaume Du Vair, the churchman Pierre Charron, the Spanish author Francisco de Quevedo, and most importantly Michel de Montaigne, who wrote one of his essays in defense of Seneca.

The reception of Neostoicism was mixed. Even before Lipsius, Calvin had strongly criticized the “novi Stoici” for their revival of the idea of apatheia, and later critics included Pascal. In part in order to preempt such reactions, according to Sellars, one of the Neostoic texts began with the following cautious endorsement: “philosophie in generall is profitable unto a Christian man, if it be well and rightly used: but no kinde of philosophie is more profitable and neerer approaching unto Christianitie than the philosophie of the Stoicks.” Despite the interest in Stoicism displayed by other Renaissance figures, even outside of philosophy (for example, the poet Petrarch), Neostoicism never really became a movement, and its import largely rests on the impact of Lipsius’ writings, and perhaps on the influence of Montaigne.

Arguably the most important modern philosopher to be influenced by Stoicism is Spinoza, who was in fact accused by Leibniz of being a leader of the “sect” of the new Stoics, together with Descartes (Long 2003). There are indeed a number of striking similarities between the Stoic conception of the world and Spinoza’s. In both cases we have an all-pervasive God that is identified with Nature and with universal cause and effect. While it is true that the Stoic understanding of the cosmos was essentially dualistic—in contrast with Spinoza’s monism—the Stoic “active” and “passive” principles were nonetheless completely entwined, ultimately yielding an essentially unitary reality. Long points out, however, that a major difference was Spinoza’s concept of God’s infinite attributes and extension, in marked contrast to the finite (if eternal) God of the Stoics: “the upshot of both systems is a broadly similar conception of reality—monistic in its treatment of God as the ultimate cause of everything, dualistic in its two aspects of thought and extension, hierarchical in the different levels or modes of God’s attributes in particular beings, strictly determinist and physically active through and through.” He goes on to remark that the similarities are even more marked in terms of ethics, and “Spinoza’s ethics becomes transparently and profoundly Stoic.” That said, another major difference is that Spinoza did not believe in an underlying teleology to the world. For him Nature has no aim and God does not direct the cosmic drama. Indeed, as Long puts it: “If the Stoics had taken Spinoza’s route of denying divine providence, they would have avoided a battery of objections brought against them from antiquity onward.” In an important sense, perhaps, one can think of Spinoza as updating the Stoic system to modern times, a project that is currently seeing a number of concerted efforts.

Finally, there is also a connection between the Stoics and Kant, particularly in their shared concept of duty, which transcends the specific consequences of one’s action. But as Long again points out, the differences are also quite striking: while Kant arrived at his system by a priori reasoning, the Stoics were eminently naturalistic and empiricist at heart. This is a major distinction between a deontological system like Kant’s and a eudaimonistic one like the Stoic, and it is only with the recent resurgence of virtue ethics in contemporary philosophy (Foot 1978, 2001; MacIntyre 1981/2013; Nussbaum 1994) that the ground was laid for yet another revival of Stoicism as a practical moral philosophy.

6. Contemporary Stoicism

The 21st century is seeing yet another revival of virtue ethics in general and of Stoicism in particular. The already mentioned work by philosophers like Philippa Foot, Alasdair MacIntyre, and Martha Nussbaum, among others, has brought back virtue ethics as a viable alternative to the dominant Kantian-deontological and utilitarian-consequentialist approaches, so much so that a survey of professional philosophers by David Bourget and David Chalmers (2013) shows that deontology is (barely) the leading endorsed framework (26% of respondents), followed by consequentialism (24%) and not too far behind by virtue ethics (18%), with a scatter of other positions gathering less support. Of course ethics is not a popularity contest, but these numbers indicate the resurgence of virtue ethics in contemporary professional moral philosophy.

When it comes more specifically to Stoicism, new scholarly works and translations of classics, as well as biographies of prominent Stoics, keep appearing at a sustained rate. Examples include the superb Cambridge Companion to the Stoics (Inwood 2003), individual chapters of which have been cited throughout this entry; an essay on the concept of Stoic sagehood (Brouwer 2014); a volume on Epictetus (Long 2002); a contribution on Stoicism and emotion (Graver 2007); the first new translation of Seneca’s letters to Lucilius in a century (Graver and Long 2015); a new translation of Musonius Rufus (King 2011); a biography of Cato the Younger (Goodman 2012); one of Marcus Aurelius (McLynn 2009); and two of Seneca (Romm 2014 and Wilson 2014); and the list could continue.

In parallel with the above, Stoicism is, in some sense, returning to its roots as practical philosophy, as the ancient Stoics very clearly meant their system to be primarily of guidance for everyday life, not a theoretical exercise. Indeed, Epictetus especially is very clear in his disdain for purely theoretical philosophy: “We know how to analyze arguments, and have the skill a person needs to evaluate competent logicians. But in life what do I do? What today I say is good tomorrow I will swear is bad. And the reason is that, compared to what I know about syllogisms, my knowledge and experience of life fall far behind” (Discourses, II.3.4-5). Or consider Marcus’ famous injunction: “No longer talk at all about the kind of man that a good man ought to be, but be such” (Meditations, X.16).

The Modern Stoicism movement traces its roots to Viktor Frankl’s (Sahakian 1979) logotherapy, as well as to early versions of Cognitive Behavioral Therapy, for instance in the work of Albert Ellis (Robertson 2010). But Stoicism is a philosophy, not a therapy, and it is in the works of philosophers such as William Irvine (2008), John Sellars (2003), and Lawrence Becker (1997) that we find articulations of 21st century Stoicism, though the more self-help-oriented contribution by CBT therapist Donald Robertson (2013) is also worthy of note. All of these authors attempt to distance the philosophical meaning of “Stoic”—even in a modern setting—from the common English word “stoic,” indicating someone who goes through life with a stiff upper lip, so to speak. While there are commonalities between “Stoic” and “stoic,” for instance the emphasis on endurance, the latter is a diminished version of the former, and the two should accordingly be kept distinct.

Perhaps the most comprehensive and scholarly attempt to update (as opposed to simply explain) Stoicism for modern audiences comes from Becker (1997), though a more accessible treatment is offered by Irvine (2008). One of Irvine’s major contributions is shifting from Epictetus’ famous dichotomy of control to a more reasonable trichotomy: some things are up to us (chiefly, our judgments and actions), some things are not up to us (major historical events, natural phenomena), but on a number of other things we have partial control. Irvine recasts the third category in terms of internalized goals, which makes more sense of the original dichotomy. Consider his example of playing a tennis match. The outcome of the game is under your partial control, in the sense that you can influence it; but it is also the result of variables that you cannot control, such as the skill of your opponent, the fairness of the referee, or even random gusts of wind interfering with the trajectory of the ball. Your goal, then, suggests Irvine, should not be to win the game—because that is not entirely within your control. Rather, it should be to play the best game you can, since that is within your control. By internalizing your goals you can therefore make good sense of even the original Epictetean dichotomy. As for the outcome, it should be accepted with equanimity.

Becker (1997) is more comprehensive and even includes a lengthy appendix in which he demonstrates that the formal calculus he deploys for his normative Stoic logic is consistent, suggesting also that it is complete. There are three important differences between his New Stoicism and the ancient variety: (i) Becker defends an interpretation of the inherent primacy of virtue in terms of maximization of one’s agency, and builds an argument to show that this is, indeed, the preferred goal of agents that are relevantly constituted like a normal human being; (ii) he interprets the Stoic dictum, “follow nature,” as “follow the facts” (that is, abide by whatever picture of the universe our best science allows), consistently with Stoic sources attesting to their respect for what we would today call scientific inquiry, as well as with an updated Stoic approach to epistemology; and (iii) Becker does away with the ancient Stoic teleonomic view of the cosmos, precisely because it is no longer supported by our best scientific understanding of things. This is also what leads him to make his argument for virtue-as-maximization-of-agency referred to in (i) above. Whether Becker’s (or Irvine’s, or anyone else’s) attempt will succeed or not remains to be seen in terms of further scholarship and the evolution of the popular movement.

That movement has grown significantly in the early 21st century, manifesting itself in a number of forms. There are a good number of high-quality blogs devoted to practical modern Stoicism. There is also a significant presence on social networks, for instance the Stoicism Group on Facebook.

7. Glossary

The Stoics were well known (some would say infamous) for having developed a rich technical vocabulary. Cicero, in book III of De Finibus, explicitly says that Zeno invented a number of new terms, and he feels that Latin is not a sufficiently sophisticated tongue to render all the subtleties of Greek thought. Below are some of the major Stoic terms and their meanings.

Andreia = courage, fortitude, one of the four Stoic cardinal virtues.

Apatheia = tranquility, overcoming disturbing desires and emotions.

Apoproēgmena = dispreferred indifferents: externals, lying outside of virtue, that—other things being equal—should be avoided.

Aretê = virtue, excellence at one’s function. For Becker this is equivalent to the perfection of agency.

Ataraxia = absence of fear, largely an Epicurean concept, but also adopted by the Stoics.

Dikaiosynê = justice, integrity, one of the Stoic cardinal virtues.

Eudaimonia = flourishing, by means of living an ethical life.

Eupatheiai = the healthy passions cultivated by the Sage.

Hormê = the discipline of action.

Kathēkon = appropriate, rational action; the thing one ought to do.

Logos = rational principle governing the universe.

Oikeiôsis = something properly yours, leading to Hierocles’ circle of expanding affection, Stoic cosmopolitanism.

Orexis = the discipline of desire.

Philanthrôpia = love of mankind, related to the concept of Oikeiôsis.

Phronêsis = practical wisdom, one of the Stoic cardinal virtues.

Proēgmena = preferred indifferents: externals, lying outside of virtue, that—other things being equal—can be pursued unless they compromise one’s virtue.

Propatheiai = involuntary emotional reactions, to which one has not yet given or withdrawn assent.

Prosochê = applying key ethical precepts to the present moment, mindfulness.

Sôphrosynê = self-discipline, temperance, one of the Stoic cardinal virtues.

Sunkatathesis = the discipline of assent.

 

8. References and Further Readings

  • Algra, K. (2003) Stoic theology. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Almeder, R. (1995) Externalism and justification. Philosophia 24:465-469.
  • Angere, S. (2007) The defeasible nature of coherentist justification. Synthese 157:321-335.
  • Beaney, M. (1997) The Frege Reader. Blackwell.
  • Becker, L.C. (1997) A New Stoicism. Princeton University Press.
  • Bobzien, S. (2003) Logic. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Bobzien, S. (2006) Ancient logic. Stanford Encyclopedia of Philosophy, http://plato.stanford.edu/entries/logic-ancient/ (accessed on 22 December 2015)
  • Bourget, D. and Chalmers, D.J. (2013) What do philosophers believe? Philosophical Studies 3:1-36.
  • Brennan, T. (2003) Stoic moral psychology. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Broadie, S. & Rowe, C. (eds.) (2002) Aristotle — Nicomachean Ethics. Oxford University Press.
  • Brouwer, R. (2014) The Stoic Sage: The Early Stoics on Wisdom, Sagehood and Socrates. Cambridge University Press.
  • Brunschwig, J. (2003) Stoic metaphysics. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Bugh, G.R. (1992) Athenion and Aristion of Athens. Phoenix 46:108-123.
  • Cicero, M.T. (2014) Complete Works. Delphi Classics.
  • Colish, M. (1985) The Stoic Tradition from Antiquity to the Early Middle Ages. E.J. Brill.
  • Diogenes Laertius (trans. by R.D. Hicks) (2015) Lives of the Eminent Philosophers. Delphi Classics.
  • Epictetus (trans. by R. Dobbin) (2008) Discourses and Selected Writings. Penguin.
  • Ferraiolo, W. (2015) God or Atoms: Stoic Counsel With or Without Zeus. International Journal of Applied Philosophy 29:199-205.
  • Foot, P. (1978) Virtues and Vices. Blackwell.
  • Foot, P. (2001) Natural Goodness. Clarendon Press.
  • Frede, M. (1983). Stoics and skeptics on clear and distinct impressions. In: M.F. Burnyeat (ed.), The Skeptical Tradition. University of California Press, pp.65–93.
  • Frede, D. (2003) Stoic determinism. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Gill, C. (2003) The School in the Roman Imperial period. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Goldman, A. (1980) The internalist conception of justification. Midwest Studies in Philosophy 5:27-52.
  • Goldman, A. (1994) Naturalistic epistemology and reliabilism. Midwest Studies in Philosophy 19:301-320.
  • Goodman, R. (2012) Rome’s Last Citizen: The Life and Legacy of Cato, Mortal Enemy of Caesar. Thomas Dunne.
  • Goulet, R. (2013) Ancient philosophers: a first statistical survey. In: M. Chase, R.L. Clark, and M. McGhee (eds.) Philosophy as a Way of Life: Ancients and Moderns — Essays in Honor of Pierre Hadot. John Wiley & Sons.
  • Graver, M. (2007) Stoicism and Emotion. University of Chicago Press.
  • Graver, M. and Long, A.A. (translators) (2015) Letters on Ethics: To Lucilius. University of Chicago Press.
  • Griffith, M. (2013) Free Will: The Basics. Routledge.
  • Hadot, P. (1998) The Inner Citadel: the Meditations of Marcus Aurelius. Trans. by M. Chase, Cambridge University Press.
  • Hankinson, R.J. (2003) Stoic epistemology. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Inwood, B. (ed.) (2003) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Irvine, W.B. (2008) A Guide to the Good Life: The Ancient Art of Stoic Joy. Oxford University Press.
  • King, C. (2011) Musonius Rufus: Lectures and Sayings. CreateSpace.
  • Ladyman, J. and Ross, D. (2009) Every Thing Must Go: Metaphysics Naturalized. Oxford University Press.
  • LeDoux, J. (2015) Anxious: Using the Brain to Understand and Treat Fear and Anxiety. Viking.
  • Lipton, P. (2003) Inference to the Best Explanation. Routledge.
  • Long, A.A. (2002) Epictetus: A Stoic and Socratic Guide to Life. Oxford University Press.
  • Long, A.A. (2003) Stoicism in the philosophical tradition: Spinoza, Lipsius, Butler. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • MacIntyre, A. (1981/2013) After Virtue. A&C Black.
  • Marcus Aurelius (trans. by G. Long) (1997) Meditations. Dover.
  • McBrayer, G.A., Nichols, M.P., and Schaeffer, D. (2010) Euthydemus. Focus.
  • McLynn, F. (2009) Marcus Aurelius: A Life. Da Capo Press.
  • Nussbaum, M. (1994) The Therapy of Desire: Theory and Practice in Hellenistic Ethics. Princeton University Press.
  • O’Connor, D.J. (1975) The Correspondence Theory of Truth. Hutchinson.
  • Osler, M.J. (1991) Atoms, Pneuma, and Tranquillity: Epicurean and Stoic Themes in European Thought. Cambridge University Press.
  • Putnam, H., Neiman, S., and Schloss, J.P. (eds.) (2014) Understanding Moral Sentiments: Darwinian Perspectives? Transaction Publishers.
  • Robertson, D. (2010) The Philosophy of Cognitive-behavioural Therapy (CBT): Stoic Philosophy as Rational and Cognitive Psychotherapy. Karnac Books.
  • Robertson, D. (2013) Stoicism and the Art of Happiness – Ancient Tips For Modern Challenges: Teach Yourself. Teach Yourself.
  • Romm, J. (2014) Dying Every Day: Seneca at the Court of Nero. Knopf.
  • Sahakian, W.S. (1979) Logotherapy’s Place in Philosophy. In: Logotherapy in Action. J. Fabry, R. Bulka, and W.S. Sahakian (eds.), foreword by Viktor Frankl. Jason Aronson.
  • Schofield, M. (2003) Stoic ethics. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Sedley, D. (2003) The School, from Zeno to Arius Didymus. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Sellars, J. (2003) The Art of Living: The Stoics on the Nature and Function of Philosophy. Ashgate.
  • Seneca, L.A. (2014) Complete Works of Seneca the Younger. Delphi Classics.
  • Sharpe, M. (2014) Stoic virtue ethics. In: S. van Hooft and N. Athanassoulis (eds.), The Handbook of Virtue Ethics. Acumen Publishing.
  • Taran, L. (1971) The creation myth in Plato’s Timaeus. In: J.P. Anton, G.L. Kustas and A. Preus (eds.) Essays in Ancient Greek Philosophy. State University of New York Press, 372-407.
  • Tieleman, T. (2002) Galen on the seat of the intellect: anatomical experiment and philosophical tradition. In: C. Tuplin (ed.) Science and Mathematics in Ancient Greek Culture. Oxford University Press, pp. 256-273.
  • Tye, M. (1994) Sorites paradoxes and the semantics of vagueness.  Philosophical Perspectives 8:189-206.
  • Unger, R.M. and Smolin, L. (2014) The Singular Universe and the Reality of Time: A Proposal in Natural Philosophy. Cambridge University Press.
  • Verbeke, G. (1983) The Presence of Stoicism in Medieval Thought. Catholic University of America Press.
  • White, M.J. (2003) Stoic natural philosophy. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Wilson, E. (2014) The Greatest Empire: A Life of Seneca. Oxford University Press.

 

Author Information

Massimo Pigliucci
Email: mpigliucci@ccny.cuny.edu
City University of New York
U. S. A.

Aesthetics in Continental Philosophy

Paul Klee, Angelus Novus, 1920

Although aesthetics is a significant area of research in its own right in the analytic philosophical tradition, it frequently seems to be accorded less value than philosophy of language, logic, epistemology, metaphysics, and other areas of value theory such as ethics and political philosophy. Many of the most prominent analytic philosophers have not written on aesthetics at all. Matters stand very differently in continental philosophy, where aesthetics has been given an important place by nearly every major thinker and tradition. There are undoubtedly important extra-philosophical reasons for this—such as the importance of art in European education and tradition and the French model of the philosophe as philosopher-writer—but there are also clearly philosophical reasons. In the analytic tradition, meaning and truth are frequently thought to be exemplified by logic, science, and the formal structures of language, whereas in continental philosophy, art has often taken this role of exemplifying meaning and truth. As such, art becomes akin to a philosophical activity insofar as it is thought to produce meaning and truth, and aesthetics takes an important place because it is seen as a branch of philosophy which gives access to some of philosophy’s perennially central concerns. Moreover, while the analytic tradition tends to abstract aesthetic questions from other concerns, the continental tradition tends to think about aesthetics in relation to epistemology and metaphysics, to emphasise art’s historical and social situatedness, and to ask questions concerning its role and value in culture, politics, and everyday life. However, and in further contrast to analytic aesthetics, there is no general consensus concerning central topics of debate in continental aesthetics. Instead, and following a method of organisation typical of continental philosophy, this area of aesthetics may be approached according to major traditions and thinkers. This article gives a synoptic overview of these traditions and thinkers in the twentieth and twenty-first centuries. The ideas developed by each often remain highly distinctive, yet they have also influenced and reacted against each other (and these points of contact are marked within the article). Most of these developments have taken place in critical relation with modern and nineteenth-century aesthetics, especially as exemplified by the works of Immanuel Kant, G.W.F. Hegel, and Friedrich Nietzsche. Kant’s Critique of the Power of Judgement (1790) has been particularly important in shaping debates in later continental aesthetics, since on the one hand it stakes out aesthetics as a domain autonomous in relation to other areas of philosophical concern, such as epistemology and practical philosophy, and on the other it shows how this domain has relevance for other areas. (In Kant’s system, aesthetics provides a model for how judgement acts as a power that can unify the other branches of philosophical interest.)

Table of Contents

  1. The Position of Aesthetics in Continental Philosophy
  2. Phenomenology and Existentialism
    1. Introduction
    2. Heidegger
    3. Merleau-Ponty
  3. Hermeneutics
    1. Introduction
    2. Gadamer
  4. Psychoanalysis
    1. Introduction
    2. Freud
    3. Lacan
  5. Critical Theory
    1. Introduction
    2. Benjamin
    3. Adorno
  6. Poststructuralism
    1. Introduction
    2. Derrida
    3. Lyotard
    4. Deleuze
  7. Developments in the Early 21st Century
    1. Introduction
    2. Rancière
    3. References and Further Reading

1. The Position of Aesthetics in Continental Philosophy

The importance and scope of aesthetics in continental philosophy may be indicated at the outset by taking the relatively ‘canonical’ example of Heidegger’s reading of Nietzsche on art. While the specific views about art and aesthetics expressed in this reading do not extend their influence to all traditions and thinkers within continental philosophy, the example gives a good indication of the dominant role aesthetics frequently takes in such traditions. During his first lecture course on Nietzsche, ‘The Will to Power as Art’, Heidegger sets out five statements on art:

  1. Art is the most perspicuous and familiar configuration of will to power;
  2. Art must be grasped in terms of the artist;
  3. According to the expanded concept of artist, art is the basic occurrence of all beings; to the extent that they are, beings are self-creating, created;
  4. Art is the distinctive countermovement to nihilism;
  5. Art is worth more than ‘the truth’.

In addition to the above, and taking preeminent place as an expression of Nietzsche’s entire thinking on art, Heidegger adds the following major statement on art: art is the greatest stimulant of life.

These theses indicate that for (Heidegger’s) Nietzsche, art is far more than a pleasant diversion; it has profound ontological, cultural, political, and existential significance, and is even worth more than truth itself. Heidegger expands these theses as follows. Nietzsche’s ontology is that of the ‘will to power’, in which Being as a whole is understood in terms of shifting relations of confluent and conflictual forces, producing the creation and destruction of particular beings. Art is privileged both as an expression of will to power, and as a being which gives us special insight into the nature of Being as a whole as will to power.  The first statement suggests that of all types of being, art is that which is most clearly accessible to us in its essence. Moreover, art does not simply illuminate itself as a particular type of being, but illuminates Being as a whole. The second statement shifts the significance of art from its reception to its creation, and this shift opens up the ontological scope of aesthetics. Art, considered from the perspective of the artist, is then understood in terms of the creative act itself, which illuminates the way that beings in general are ‘brought forth’. Aesthetics, as meditation on art, may then be understood not simply as a consideration of beautiful things, but as ontology, the thinking of the essence of Being as a whole. The third statement tells us that, according to Nietzsche’s ontology, the will to power becomes visible as and with art, understood as paradigmatic for all creation, or productive ‘bringing forth’.

The fourth and fifth statements give art a practical dimension in Nietzsche’s philosophy. Heidegger insists that ‘truth’ in the fifth statement (and in all of Nietzsche’s philosophy) must be understood in a specifically Platonic sense as referring to the supposedly true supersensuous world of the Ideas, in contrast with the untrue sensuous world of mere appearances. For Nietzsche, the old values he associates with nihilism—the decadence of culture and the devaluation of life—are essentially grounded in this Platonic conception of truth through its dominance in the religion, morality, and philosophy of the Western tradition. Art then operates as a countermovement to nihilism on two essential points: first, as a sensuous thing, its very nature is to affirm the value of the sensuous world that nihilism denies; and second, as a paragon expression of will to power it helps us to understand what Nietzsche posits as the necessary grounding principle for the creation of new, non-nihilistic values (the will to power itself). Heidegger develops this reading of Nietzsche’s philosophy of art in order to critique it (see section 2. b. below). Nevertheless, in its clear emphasis on the ontological and practical roles of art, this reading indicates well the significance and scope that aesthetics (understood as philosophical reflection on art in general) has had for twentieth and twenty-first century continental philosophers.

2. Phenomenology and Existentialism

 a. Introduction

Phenomenology is a philosophical method which focuses on the close examination of phenomena, that which appears. In its contemporary sense, it was founded by Edmund Husserl in the early part of the twentieth century. Many philosophers influenced by Husserl’s work developed phenomenology in ways which contributed significantly to aesthetics, including Martin Heidegger, Roman Ingarden, Jean-Paul Sartre, Mikel Dufrenne, Maurice Merleau-Ponty, and Michel Henry. While developed in varying ways by each of these thinkers, in general phenomenology has offered an approach which displaces the traditional aesthetic categories of subject and object. Instead, phenomenology has focused on the examination of aesthetic experience and the work of art in terms of appearance and the conditions of appearance, thought as prior to the categorisation of ourselves and the world into subject and object. Moreover, phenomenological aesthetics has examined the ontology of the work of art as a special kind of thing which appears, with a distinctive character marking it out from the rest of appearances. Art has frequently been given a privileged status as affording us special insight into the way in which things in general come to appear, and how meaning as such is constituted. In this way, for many phenomenologists aesthetics has been closely connected with ontology, epistemology, and value theory.

Existentialism, while stemming from nineteenth century thinkers such as Søren Kierkegaard and Nietzsche, intersects with phenomenology in the thought of many of its prominent twentieth century exponents (for example, Heidegger, Sartre, de Beauvoir, and Merleau-Ponty). Existentialism focuses on the concrete existence of human life. It rejects the adequacy of traditional philosophical thought, which proceeds by way of abstract essences and general categories, to do justice to the lived experience of the individual. For existentialists, art—and, in particular, literature (many existentialist philosophers were also literary authors)—has an advantage over philosophy insofar as it is able to dramatise concrete experience through singular imaginary examples and ‘indirectly communicate’ existential truths. Art is also better able to evoke the irrational—such as sensation, affect, feeling, mood, and everyday, non-theoretical modes of thinking—which existentialists believe is necessary to do justice to the full range of human experience. Existentialism has typically emphasised human freedom, especially the freedom to create values, and art has been taken as a testament to, and an exemplary model for, such creative activity. I will develop some of these themes in further detail here by focusing on two of the most well-known and influential contributors to aesthetics in the tradition of existential phenomenology: Heidegger and Merleau-Ponty.

 b. Heidegger

Edmund Husserl’s most prominent student, Martin Heidegger, combined the method of phenomenology with a deep attention to the history of philosophy, especially that of ancient Greece, to forge one of the most influential philosophical bodies of work of the twentieth century. Heidegger turns his attention to art and aesthetics as part of his wider philosophical project, which seeks to uncover the meaning and truth of Being. Heidegger contends that the history of philosophy, and of western culture generally, has seen a decline with respect to Being, such that today Being has practically become nothing. (This is how he interprets Nietzsche’s thesis of nihilism, the negation of meaning and value as such.) Heidegger understands Being as the way particular beings (or ‘entities’) come to appear as what they are, with the meaning that they have, within particular historical epochs. For Heidegger, Being is a historical process, such that beings appear differently in different epochs. He specifies three main epochs, the ancient, medieval, and modern, to each of which corresponds a leading meaning of Being; that is, a main way in which beings are revealed. Heidegger’s reflections on art and aesthetics appear within his critique of the modern epoch and his attempts to retrieve a deeper understanding of Being from the near-oblivion into which he believes it has fallen.

Heidegger specifies that ‘aesthetics’ is itself a part of this modern tradition: a particular way of viewing art which first became explicit with Alexander Baumgarten’s Aesthetica (1750), then was quickly taken up in successive influential formulations, in particular by Kant (Critique of the Power of Judgement, 1790) and Hegel (Lectures on Fine Art, c. 1818–29). For Heidegger, the distinctive feature of the modern way of revealing beings is to give them the character of subject and object, as per Descartes’ philosophy. The aesthetic view of art follows suit by positioning the artwork as an object, ‘experienced’ by the one who takes in and appreciates it, positioned as subject. Heidegger wants to critique aesthetics as one aspect of modern philosophy, which he critiques in general, in order to allow a different view of art to emerge. According to Heidegger, the modern worldview covers over a more primordial relation of being-in-the-world in which we are immersed in and alongside other beings in the world. Heidegger’s own form of phenomenological philosophy aims to be attentive to beings as they appear in order to overcome the modern presupposition of thinking, which distributes things according to the subject/object divide before we truly encounter them, and to reveal the more primordial character of things which such presuppositions hide. For Heidegger, the modern tendency is particularly pernicious because it views beings according to a framework he calls Gestell, which determines them as resources available to be put to use (Bestand). Such a view distracts us from being attentive to the many other ways beings can be revealed and impoverishes the meaningfulness of the world we inhabit by reducing everything—including human beings themselves—to this narrow scope.

‘The Origin of the Work of Art’ is the most significant and well-known of Heidegger’s essays in which he tries to draw attention to the enigma of art and to sketch an alternative ontology of the artwork. Here, the artwork is described as the setting-to-work of truth. Truth must be understood not in its usual meaning as thought corresponding to facts in the world, but in Heidegger’s specific determination as aletheia, meaning disclosure, uncovering, or revealing. For Heidegger, truth means the way in which beings come to light as what they are and with the meaning they have. Art for him, therefore, has a privileged relation to truth because great art can be an occurrence of truth; that is, it has the power to reveal not just itself, but other beings in a particular way. He expresses this by saying that art ‘sets up a world’ and ‘sets forth the earth’, and that the artwork enacts the ‘strife’ which is the turbulent relation between world and earth. As with much of Heidegger’s writing, these terms are more suggestive than determinate, and remain open to contesting interpretations. They may, however, be roughly glossed as follows. ‘World’ is the network or system of interpretations in which beings appear as what they are and form an open but interrelated whole. The world, in short, is the set of shared meanings of a historical culture. It is the meaning of the artwork as interpreted in that culture, and the effect on cultural meaning in general that the work has. ‘Earth’, on the other hand, refers to the material and sensory thing that the work necessarily must be, and to its capacity to hold dimensions of potential meaning currently concealed but which might, in the future, come to be revealed. The earth is the inexhaustibility of the work, its irreducibility to interpretation by a critic or a culture. For Heidegger, every artwork, at least if it is ‘great’, contains both these dimensions, which exist in a state of tension or strife: world opening up to reveal meaning, earth drawing back to keep meaning opaque. While these terms themselves may seem opaque, arguably they do successfully describe something of the mysterious character of artworks in their ability to simultaneously reward and thwart our desire to understand them. Heidegger also asserts in this essay that the essence of all art is poetry. ‘Poetry’ needs to be understood here in a specific sense, not simply as the pleasing combination of words, but insofar as Heidegger understands poetry as the essence of language, which again has a significant relation to truth. ‘Poetry’ in this sense designates the capacity of language to reveal beings and determine them as what they are in their specific character by naming them, and he asserts here that poetry has a privileged place in the system of the arts by virtue of being that art which most exemplifies the ontological power of all art to reveal. In sum, Heidegger’s phenomenological approach to art aims to subvert the aesthetic tradition’s determination of art as the object of aesthetic experience, and to uncover a deeper meaning in which art may be understood as a site of ontological revealing. This is consonant with his wider project of overcoming the metaphysical tradition, and modern philosophy more specifically, in order to rethink the meaning and truth of Being.

 c. Merleau-Ponty

Albeit in a different way, Merleau-Ponty also seeks to understand art in the context of an ontology which moves beyond the subject/object division characteristic of modern philosophy. This is most evident in his late essay ‘Eye and Mind’ (1960), while the earlier essay ‘Cézanne’s Doubt’ (1945) presents a more existentialist perspective. For Merleau-Ponty, it is as if painting were phenomenological research pursued by other means. In fact—and this contra Sartre—he holds art in a privileged position over philosophy and literature, and accords painting a higher position than music at least, which for him remains too amorphous to properly render phenomenological insight. The most distinctive feature of Merleau-Ponty’s own brand of phenomenology is the emphasis he puts on the embodied nature of human existence, which was only noted in passing by Heidegger. The body takes on primary importance for Merleau-Ponty as he sees it as the fundamental condition for the appearing of a world. Such appearing takes place through the body’s perceptual system. For Merleau-Ponty, science and philosophy have distorted both our idea of the body and the way perception works by freezing them in idealised third-person representations. The aim of his phenomenological ontology is to return us to a more primordial understanding of our first-person experience of the body as it is lived by us, and of perception as it reveals the world. He gives painting a privilege over science and philosophy because he sees the painter as performing a kind of natural epoche, the phenomenological ‘reduction’ which suspends our commonsense beliefs about things, while trying to see how things genuinely appear to vision and to render what is seen on canvas. Moreover, Merleau-Ponty emphasises the necessarily bodily activity of painting, claiming that it would be impossible for a pure mind to paint. This is because painting revels in the bodily conditions that abstract thought tends to idealise and cover, such as the position of the body in space from which any sight is necessarily seen (the viewpoint), the movement of the eye, prevailing lighting conditions affecting the quality of visual perception, and so on.

As well as science and philosophy in general, Merleau-Ponty’s critical targets are Descartes’ theory of vision in his Dioptrics (1637), and the ‘linear perspective’ method of constructing the space of painting developed in the Italian Renaissance, which employs precise geometric rules for creating the illusion of three-dimensional space on the two-dimensional surface of the painting. Neither is condemned outright by Merleau-Ponty. Rather, what he criticises in both is a partial truth which abstracts certain elements from our perceptual experience, and elevates them to the pretension of sole and exhaustive truth. Descartes’ theory of vision treats space as a homogeneous expanse equally accessible at all points: an abstraction useful for thought, but impossible for any actual body to experience. Linear perspective, for its part, has for centuries been treated as the formula of realism, and has been supposed to present things as they actually appear to the eye. Merleau-Ponty contests this, citing various abstractions and exclusions which operate on real perception in order to construct linear perspective. Two distinctive features on which he concentrates are depth and movement.

First, perspective presents the illusion of depth by varying the sizes of objects relative to ‘parallel’ lines which converge at a vanishing point. Because this method was presented as rendering the true nature of visual space, the theoreticians of the Renaissance had to deny the theorem of Euclid’s Geometry which states that parallel lines never converge. Second, Merleau-Ponty notes that static art such as photography, painting, and sculpture, no matter how supposedly realistic, falsifies reality by excluding time, and hence, motion. Following a suggestion made by Auguste Rodin, he asserts that the phenomenology of movement is best expressed by a paradoxical arrangement in which different aspects of the figure in motion, which would be visible at different times in real life, are presented simultaneously in the artwork. According to his analysis, the truth of movement is better expressed by (for example) Théodore Géricault’s anatomically incorrect painting of racing horses Epsom Derby (1821) than by the gaits of horses photographically captured by Étienne-Jules Marey. What the painter is able to capture, Merleau-Ponty asserts, is not the outside of the object of motion, but motion’s ‘secret cipher’: time rendered visible in an indirect, stylistic manner.

In general for Merleau-Ponty, it is a focus on the primary qualities (those which can be specified with exactness: quantified and rationally calculated, such as extension and form) which is responsible for the intellectual abstractions that distort our understanding of the body’s perception. The supposedly secondary qualities, especially colour, are what reveal more primordial truths about perception in painting.  For him, painting is capable of rendering visible the birth of perception, the way that fully-formed, recognisable objects emerge from a deeper, more primordial, inchoate visual field. This is evident in Paul Cézanne’s painting in the way that rather than beginning with lines which give form to objects and then adding colour, the reverse procedure is evident—dashes of graduated colour build up and give shape to forms. In this way, Merleau-Ponty sees Cézanne and artists in general as performing phenomenological work: they are able to reveal the conditions and processes of perception, which are usually covered over by our focus on the end product, or what is perceived.

Merleau-Ponty’s essay ‘Cézanne’s Doubt’ displays aspects of an existentialist aesthetic by giving attention to the relationship between the life of the artist and the meaning of the artist’s work. This takes place in conversation with Sigmund Freud’s psychoanalytic treatment of Leonardo da Vinci (see section 4. b. below), which is often taken to excessively reduce the meaning of the work to the artist’s psychopathology. Echoing positions worked out in The Phenomenology of Perception (1945) in response to Sartre’s views on radical human freedom, Merleau-Ponty develops a nuanced view of the relationship between freedom and concrete situation in human existence. He asserts that the meaning of the work takes its bearings from the artist’s life but cannot be wholly reduced to it, just as the free transcendence of the subject must take its bearings from the facts of the life which it is born into but to which it is not reducible. For Merleau-Ponty, the individual’s freedom is necessarily determined in relation to facts over which we have no choice (for example, Leonardo’s being abandoned by his father in early childhood), but how we respond to such facts retains an element of choice. At the same time, the artwork produced by an individual will be informed by facts about the artist’s life, yet the meaning of such works will nevertheless defy any attempt to reduce it to such facts. Thus, Merleau-Ponty develops a qualified defence of the psycho-biographical approach to art characteristic of psychoanalytical criticism. According to him, the doubt which plagued Cézanne’s life and work stems from a general feature of situated human existence (albeit one which he felt more keenly than most): our freedom to create meaning is attended by no guarantee that it will become meaningful in the fullest sense of which it is capable; that is, to be accepted in the eyes of others, to transform the world that others inhabit, and to contribute to the store of human culture. In these various ways, then, Merleau-Ponty’s reading of Cézanne exemplifies the concerns of an existentialist aesthetic: to see the meaning of the artwork in relation to the life of the individual artist and to see art itself as exemplifying general features of human existence.

Merleau-Ponty’s essay ‘Eye and Mind’ develops many of the themes above in the context of his late ontology of ‘the visible and the invisible’, which attempts to overcome the subject/object division by invoking new concepts such as ‘flesh’. Against the transparent lucidity of Cartesian consciousness, flesh invokes the thickness and density of the perceptual field: the ambiguous region of the mutual imbrication of the perceiving self and the perceived world in a space in which both overlap without clear dividing line or boundary, but also without either being able to be reduced to the other. The perceiver and perceived cross over and inhabit the same ambiguous space, but this space is non-homogeneous, and includes breaks, gaps, and undersides where the two fail to meet, or at least to do so in mutual accord. Merleau-Ponty gives the helpful example of seeing the world from the bottom of a pool of water. In this case, the water itself is like flesh, and the ‘distortions’ of our perception it introduces appear to be between subject and object insofar as they condition how we see objects beyond the pool. For Merleau-Ponty, our whole perceptual field, and our body’s being-in-the-world, have this in-between character in all cases, even when this is less evident. Although in this later work he abandons much of the language of ‘classical’ phenomenology, Merleau-Ponty continues the phenomenological project of bringing to light the conditions of appearances which themselves are usually not apparent. Art holds for him a privileged status in its capacity to do this. Merleau-Ponty’s ontology of painting differs markedly from Heidegger’s philosophy of art by disputing the latter’s location of the essence of all art in poetry, and insisting that there is a difference between the kind of meaning apparent in language and that which emerges with the visual. (This difference is taken up and developed by Jean-François Lyotard—see section 6. c.)

3. Hermeneutics

 a. Introduction

Hermeneutics is a theory of interpretation and understanding which has roots in Biblical exegesis and developments in German romanticism, but which emerged as a significant branch of contemporary continental philosophy primarily through the work of one of Heidegger’s most prominent students: Hans-Georg Gadamer. Under Gadamer’s influence, hermeneutics has been developed in differing ways by various other leading proponents, such as Paul Ricoeur in France and Gianni Vattimo in Italy. However, it is Gadamer’s work which remains at the core of this tradition, and continues to be most influential in aesthetics.

 b. Gadamer

As a philosophical heir to Heidegger, Gadamer took up aspects of the phenomenological tradition but focused on further developing reflections on interpretation and understanding found in Heidegger’s Being and Time. The relevance of Gadamer’s hermeneutics for aesthetics can be understood to have two parts. First, in his 1960 magnum opus Truth and Method, art and aesthetic experience are used as an example to defend the kind of understanding appropriate to the human sciences from methodological scientism and to develop a general theory of hermeneutics. Second, in various later essays, Gadamer wrote more explicitly about the hermeneutic character of the experience of the artwork and of specific literary and artistic works. Of these later essays, the most significant for developing a hermeneutic philosophy of art is ‘The Relevance of the Beautiful’ (1977).

Gadamer’s general approach to hermeneutics developed in Truth and Method seeks to defend the human sciences from reduction to the kind of methodology appropriate for the natural sciences. Rather than trying to gain objective knowledge about reality in the way natural sciences do, Gadamer argues that the human sciences aim at understanding works (theoretical or literary texts, artworks, and other cultural products), which themselves cannot be understood to be entirely separate from the one who seeks to understand. Rather, both are inscribed within a common horizon of tradition. Tradition (Überlieferung), he insists, must not be understood in the static sense of conserving what already exists but rather as a transmission through which works from a world different to the one we inhabit are passed down to us. Understanding then becomes something like a translation, in which the aim is a ‘fusion of horizons’ of the world of the work and the world we inhabit. Both ourselves, as interpreters, and the work interpreted are transformed by these acts of interpretation, so that we cannot speak of understanding in the human sciences on the model of a subject correctly representing external natural objects to itself. Tradition, for Gadamer, is a matter of constant change and transformation of understanding through acts of interpretation within a continuous overarching horizon.  In the first section of Truth and Method, aesthetic experience and the artwork stand as paradigm examples to show why understanding in the human sciences differs from explanation in the natural sciences, as propaedeutic to working out a general theory of hermeneutics.

Turning to the later essays and Gadamer’s more explicit ‘aesthetics’, we continue to see Heidegger’s influence as well as a number of significant differences. Like Heidegger, Gadamer seeks to challenge and overcome what has been called aesthetics in modern philosophy but for different reasons and with a different aim in mind. While Heidegger concerns himself primarily with the kind of monumental art that can open and sustain a world, Gadamer is interested in any and all kinds of aesthetic experiences, including more mundane ones. Gadamer objects to Kant’s thesis of the disinterested nature of aesthetic experience, insisting instead that such experience brings a cognitive content which connects artworks with our understanding of other features of the world, and other types of experience, including our ‘interests’. Artworks are encountered and understood as part and parcel of the general fabric of interpretations we weave as we encounter the wider world. It is this supposedly disinterested nature of aesthetic experience which Gadamer believes needs to be overcome in modern aesthetics in order to make way for a more genuine understanding of, and possibility for, encountering artworks. In this way, he asserts, modern philosophical aesthetics should be overcome by being absorbed into hermeneutics.

In ‘The Relevance of the Beautiful’, Gadamer applies his hermeneutic approach to illuminate the nature of the work of art. The problem he sets himself to understand is the nature of the artwork considered trans-historically, so that we may understand what ‘art’ means such that the same word refers to the works of the ancient world and to contemporary experimental arts, such as non-objective painting. He proposes that we may do this with reference to the anthropological basis of our experience of art, which he develops through three key ideas: play, symbol, and festival. Like Freud (see section 4. b. below), though without any direct reference, Gadamer links the phenomenon of art as a human activity to that of play. What is significant about play for Gadamer is that it is an intentional activity involving a mere repetition without any real purpose or goal. Applied to the work of art, the concept of play implies that there is no real separation between the work itself and the one who receives it: the work is constituted through a kind of playful activity of the receiver with the work. This activity constitutes the work; it brings the work’s different aspects together through the synthetic activity of interpretation and constitutes the unity of the work. (This understanding of the work again challenges the modern aesthetic tradition, which maintains a distinction between the subject and object in aesthetic experience.)

Gadamer uses the notion of the symbol to explain the way in which an artwork should be thought to be meaningful. As we have seen, Gadamer wants to underline the cognitive dimension of artworks: what they are about and the connections to our interests and involvements that they imply. At the same time, he objects to Hegel’s idealist account of artworks, which reduces them entirely to conceptual content. For Hegel, art has ‘ended’ because the conceptual content artworks were best able to express in sensuous form in classical times has been superseded by philosophy’s more perspicuous conceptual articulation. Here Gadamer again takes inspiration from Heidegger, and reiterates the latter’s idea that every revealing is also a concealing; every setting up of a world also sets forth an earth, so that every artwork always maintains something concealed within it which resists the current interpretation. Gadamer translates this into every encounter with a work of art, so that the artwork is imputed with an inexhaustible excess of meaning, gradually revealed through repeated engagements with the work (like a conversation), without the prospect that such meaning might ever be wholly revealed or exhausted.

The symbol, for Gadamer, then expresses the way that an artwork can have a meaning which is cognitive and quasi-linguistic yet excessive and inexhaustible. For Gadamer, language stands as a paragon for all experience of meaning, so that even apparently non-linguistic experiences such as encounters with artworks must be understood according to a linguistic model. Art speaks to us, it says something to us, and understanding a work of art is all about working out what it has to say. This means learning to listen to or read a work of art, which in turn means learning to understand the kind of language it speaks. He notes how both ancient and contemporary works can challenge us by appearing to speak a language we do not understand and how interpreting them requires a process of learning the appropriate language in order to approach a meaning which will, however, never be able to be summed up in a simple conceptual determination or linguistic formulation. In this manner, understanding a work of art will be an interminable affair involving repeated encounters and acts of interpretation.

Finally, the festival reveals something about the temporal character of the artwork and its affinity with human community. Like the experience of a festival or holiday, the artwork invites us into an experience of time which differs from the quantitative, calculative experience of time we have when we are engaged in work (and similar everyday activities). Gadamer calls this ‘fulfilled’ or ‘autonomous’ time; it is time which has a certain unity and cannot be dissolved into separate moments, and which stands apart—and stands us apart, as we experience it—from everyday concerns. It is here that we see the phenomenological character of Gadamer’s aesthetics: like Heidegger, he is concerned with the way that art invites us to ‘let things be’, to be open to the way they reveal themselves. It is also through festive experiences that community is formed by dissolving the usual hierarchical distances which divide citizens according to social roles.  Art, Gadamer proposes, can also be experienced in a way which does not appeal to any particular social class, but unites people in sharing the same kind of experience. While hermeneutics has sometimes been characterised as conservative because of its emphasis on tradition and community, it must be emphasised that Gadamer sees both in terms of openness and transformation. And while he reminds us of the trans-historical importance of great works such as Ancient Greek tragedy, it is notable that Gadamer also asserts the legitimacy and importance of contemporary experimental arts, of happenings and anti-art, and even of pop music.

4. Psychoanalysis

 a. Introduction

Psychoanalysis, which received more mainstream acceptance in continental Europe than in English-speaking countries, produced a body of theory which has become a major current feeding into continental philosophy, including its aesthetic reflections. Sigmund Freud, the father of psychoanalysis, extended his theoretical model of the human psyche beyond the clinical setting to develop many aspects of a general philosophical anthropology. One area he treated in a number of essays was art. In his writings, we find both general reflections on the psychological significance of creative activity and interpretations of a number of specific painters and writers, including Leonardo, Michelangelo, Fyodor Dostoevsky, and E.T.A. Hoffmann. Following Freud, other psychoanalysts, notably C.G. Jung, Melanie Klein, and Jacques Lacan, have included ruminations on art in their distinctive developments of psychoanalytic theory. Psychoanalysis has inspired many art critics to adopt its ideas and methods, and artists themselves have also been subject to its influence (most notably in the Surrealist movement). Aspects of psychoanalytic theory have been taken up by some continental philosophers, such as Julia Kristeva, Jean-François Lyotard, and Slavoj Zizek, in their own aesthetics and philosophies of art. Such philosophers have engaged deeply with the writings of psychoanalysts themselves, treating their ideas as philosophical theories, and we will do the same here, outlining some of the prominent contributions to aesthetics we find in the works of Freud and Lacan.

 b. Freud

Freud develops the outlines of a general approach to aesthetics that he calls variously ‘pathography’ or ‘psychobiography’. He finds the origins of the artist’s creative activity in children’s play, in which the child imaginatively creates a world of his or her own. He suggests that this creative activity then takes the form of phantasies in adulthood, where phantasies are understood as imaginative fulfilments of desires which remain unfulfilled in reality. Artists, he contends, are particularly neurotic people who are especially incapable of fulfilling their desires in reality and find a substitute sense of fulfilment by externalising their phantasies in works of art. However, artists are also especially gifted: their talents allow them to represent such desires in a way which makes them acceptable to others, whereas the desires, confronted directly in themselves, would be repulsive (because of their brute, animalistic character—which is why they are often repressed). Artists’ own activities have a therapeutic value for them because artistic expression acts as a release for the pressures of desires which are unfulfilled and/or repressed. Moreover, Freud contends that the real reason we enjoy art is that it serves the same function for the aesthetic spectator. The formal or properly aesthetic qualities, he suggests, merely have an initial ‘incentive value’ to draw us to the work, while the real enjoyment comes from the release we feel by sharing with the artist the phantasy of fulfilling unfulfilled or repressed desires.

In general, the desires which artists represent are of two main kinds: ambitious (the desire for power, achievement, and security) and erotic (the desire for love and sexual pleasure). However, according to Freud, artists express their own unique phantasies with enough specificity that, with the help of biographical knowledge of the artist’s life, we may interpret artworks like symptoms (hence, ‘pathography’, meaning ‘marks of illness’) in order to reconstruct a picture of the artist’s psychological life (hence ‘psychobiography’). Freud’s most famous example here comes from Leonardo. He sees in Leonardo’s painting The Virgin and Child with St. Anne (c. 1503) the outlined figure of a vulture in the Virgin’s clothing and uses this as a clue to a psychoanalytic interpretation which also draws on Leonardo’s diaries and on biographical accounts. He notes a passage in the diaries which seems to underline the significance of the vulture, where Leonardo writes of a memory from early childhood in which a vulture repeatedly struck him on his open mouth with its tail. Interpreting this as a fellatio fantasy, and drawing it together with a number of other interpretive elements, Freud diagnoses Leonardo as a passive homosexual who did not actively pursue his homosexual desires but sublimated them into his work. This case has proven infamous because of Freud’s misinterpretation of the key word in Leonardo’s writings—he in fact refers to a kite, not a vulture—but the psychoanalytic approach does not stand or fall with a single example. The general approach illustrated here, that of psychobiography, has since been taken up and developed by numerous other writers in relation to many other examples. More significantly, and aside from the many criticisms to which psychoanalysis in general has been subject, this form of psychoanalytic aesthetics has been criticised for reducing the value of the work to the life of the artist (see section 2. c.) and attempting psychoanalysis without the benefit of a living and present analysand.

 c. Lacan

Jacques Lacan is arguably the most influential psychoanalyst after Freud, both in general and in aesthetics and art theory. Lacan’s contribution to aesthetics is at once less and more ambitious than Freud’s: less insofar as he did not produce any lengthy psycho-biographical studies of artists, yet more insofar as he moved beyond Freud’s hesitant tone where art was concerned to confidently pronounce on the motive of artistic creation and the power of visual art to fascinate. Moreover, his writings and seminars are peppered with examples from the arts used to explain psychoanalytic concepts, an approach which has proven influential on many other art writers and cultural theorists, who (questionably) assume that the reverse also holds true: that such psychoanalytic concepts tell us something about the artworks used to illustrate them. Lacan’s distinctive approach to psychoanalysis may be glossed as an attempt to update and formalise Freud’s teachings by applying the concepts and methods of structural linguistics (see section 6. a.). Freud’s own works lend themselves to this because of their extensive references to the importance of language in psychic functioning. Lacan’s approach is well indicated by his famous dictum: ‘the unconscious is structured like a language’.

While retaining Freud’s categories (the unconscious, id, ego, and so forth), Lacan added three ‘registers’ of psychological functioning to his model of the mind: the imaginary, the symbolic, and the real. The imaginary concerns thinking in images, the symbolic is the register of language and formal symbolisation but is also associated with the law and social custom, while the real designates what falls outside the limits of the imaginary and the symbolic. Together, the imaginary and symbolic constitute what we think of as ‘reality’, which contrasts with ‘the real’. The real has its origins in the experience of infantile life, before the imaginary and symbolic mental functions have developed, but it returns as a kind of surplus energy to make its presence felt in those registers throughout life. Two further key ideas are necessary to understand Lacan’s specific contributions to aesthetics: the famous ‘mirror stage’ and symbolic castration. Lacan postulates that at around 6–18 months of age, the child develops an awareness of their separation from the mother’s body, but experiences their own body as disorganised, lacking unity. This unity is provided by identification with other people (as in a mirror image) but at the cost of a fundamental split and lack: contra Descartes, the thinking subject is not a self-sufficient unity but gains an identity only through a fundamental identification with an other. This identification with an other structures our own desire, as we learn to desire what the other desires (by imitation) but also in relation to what the other desires (we ask ourselves: what does the other want from me, or want me to be?).

Lacan transforms Freud’s theories of the Oedipus complex and castration anxiety by suggesting that the Oedipus complex is in fact resolved through the accomplishment of a symbolic castration. This occurs as the child learns language and also learns to attenuate their desires in relation to the law and social custom. This is attended by a further alienation and feeling of lack, as the child develops the unconscious idea of a pleasurable plenitude that preceded both the alienation of the mirror stage and symbolic castration. Lacan postulates that things or objects can take on the role of what he calls the ‘object petit a’ (the object designated with a little ‘a’, for autre, the French word for ‘other’). Human beings are principally motivated by a ‘fundamental fantasy’ which is the fantasy of fulfilling our lack through the object petit a, which we unconsciously see as a lost object (symbolically, the phallus which is lost through castration), the regaining of which would make us whole. The central aim of Lacanian psychoanalysis is to help the analysand ‘traverse the fantasy’—to realise the fundamental fantasy for what it is (precisely that, an impossible fantasy), and to accept the inevitable necessity of symbolic castration.

With these basic points in place, we are in a position to understand some of the key ideas Lacan outlines in the text which has been his most influential in aesthetics, the section called ‘Of the Gaze as object petit a’ in the transcript of his Seminar Book XI: The Four Fundamental Concepts of Psychoanalysis (delivered in 1964). Coincidentally, the first of these seminars was given in the same week that Merleau-Ponty’s posthumous work The Visible and the Invisible appeared, and Lacan takes the phenomenologist’s work as the basis from which he develops a psychoanalytic theory of art (see the section on Phenomenology and Existentialism above). At the same time, it must be understood that Lacan develops these ideas in a direction which, by focusing on the unconscious and a structuralist approach, seeks to provide an alternative account of the visual to that of existential phenomenology. The latter, for Lacan, does not adequately account for the decentred and split nature of subjectivity because it identifies it too strongly with the conscious ego. Lacan begins from Merleau-Ponty’s contention that the seeing subject is not the most primordial aspect of the visual, as such, and develops a deeper conception of the visible, and the invisible which conditions it, in terms of a distinction between ‘the eye’ and ‘the gaze’. ‘The eye’ designates the seeing subject, associated with the Cartesian analysis of vision, and the ideal point of the observer in single-point perspective painting. By contrast, ‘the gaze’ designates something in the visual field which escapes the eye’s ability to see it clearly and, in so doing, is the presence in the visual of something which escapes the Cartesian subject’s supposedly transparent self-consciousness and mastery of the objects it surveys. The skewed perspective in the visual field that the gaze suggests is famously exemplified by Lacan with Hans Holbein’s anamorphic painting The Ambassadors (1533). The skull at the bottom of the picture can only be seen clearly by viewing the rest of the painting from an extreme angle, illustrating both the lack of a single perspective which can master the visual field and the lack at the heart of the subject (exemplified by the skull as the symbol of death, the memento mori of the painting’s traditional interpretation).

The primordial model of this ‘gaze’ is the gaze of the mother, which evokes in the child questions that persist in our unconscious and shape our experience of vision: what does the (m)other want from me? What does she want me to be? As a seeing subject (‘the eye’), Lacan argues, my visual field is haunted by something which looks back at me, but which I cannot clearly see. This haunting of the visual field by an other which decentres my own point of vision is what Lacan understands by ‘the gaze’. As patterned on our mother’s look, the gaze is an example of the object petit a, and it evokes in us the sense of a lack which might be filled by regaining the appropriate lost object. Lacan answers the question ‘What is a picture?’ by suggesting that a picture is something the artist produces in the hope of appeasing or pacifying the gaze: it is an attempt to give the mother what she wants in the hope of regaining the fantasised, lost, utopian, perfect relationship with her, and thus fulfilling our own lack. This then leads to Lacan’s pronouncements on the meaning of all painting (and by extension, all visual art): it is ‘a trap for the gaze’.

For Lacan, visual art has a particular psychological function which works specifically in the register of the imaginary: it acts as a ‘lure’ for desire, inviting us to fantasise about the overcoming of alienation and the regaining of the lost object. This is because it is not only the painter who seeks to fulfil his or her desire by giving the other what it wants with the picture but the picture itself, as an image of self-enclosed completeness, which satisfies the spectator’s desire to fulfil the fundamental fantasy. It is in this way, Lacan suggests, that the picture can have a taming, civilizing, and fascinating power. Some interpreters (such as Lyotard; see section 6. c.) have taken this to be Lacan’s last word on art, which leaves it at a level inferior to the symbolic and the possibility of traversing the fantasy. Others, however, have suggested that Lacan’s work is open to the reading that some visual art can work at the symbolic level by deconstructing the illusion of painting from within, showing how the supposed realism of the painting is a product of artifice. Desire is then prevented from fantasising about its own fulfilment in the supposed unity and wholeness of the image and is forced to confront the arbitrary constructions of the symbolic order. This seems to be suggested, for example, by Lacan’s reading of Diego Velázquez’s Las Meninas (1656) in his May 1966 seminar. Notably, this was a response to Michel Foucault’s examination of the same painting in his immediately popular and now classic 1966 work The Order of Things, where it is examined as a representation of how representation itself was understood in what the French call the Classical period. However it is interpreted, Lacan’s idea of the gaze makes a fascinating contribution to aesthetics by suggesting that our experience of the visual is not a simple given, reducible to geometric analysis, but is conditioned by our split subjectivity and the intrigues of our desire. His ideas have been particularly influential in film theory, where through various formulations they have dominated the field for several decades. A key development for film theory, and beyond it, is the way that Lacan’s ideas were taken up by Louis Althusser to develop a structuralist theory of ideology according to which our subjectivity is structured through its capture in the gaze of the big Other, the symbolic authority figure which conditions social reality in any given society.

5. Critical Theory

 a. Introduction

The tradition known as critical theory is associated with the Frankfurt School, more formally known as the Institute for Social Research (Institut für Sozialforschung), which was founded in Frankfurt in 1923, moved its base of operations to New York in 1934, then returned to Frankfurt in 1951. With its unique combination of sociology and philosophy, Critical Theory is arguably the most prominent strand of Western Marxism. A number of philosophers and cultural critics working in this tradition have made contributions to aesthetics that have been highly influential throughout continental philosophy and in wider aesthetic and art-critical contexts. These thinkers have been concerned with both the fate of art under the conditions of industrial capitalism and the potential of art to critique such conditions. These themes were treated by Walter Benjamin in his highly influential essay ‘The Work of Art in the Age of Mechanical Reproduction’ (1936), and by Theodor W. Adorno in his Aesthetic Theory (1970) and numerous shorter works.

 b. Benjamin

Benjamin’s essay hypothesises that the function of the artwork has changed as the conditions of production under industrial capitalism have changed. Following Marx but seeking to update his analyses, he suggests that it has taken some time for the changes at the superstructural (cultural) level to manifest the implications of the changes at the substructural (economic) level, which Marx analysed in the nineteenth century. The key factor in this change is the technique of mechanical reproduction. Benjamin concedes that reproducibility as such has been a feature of art since antiquity and points to developments in the history of reproducibility such as the printing press and lithography. However, he classifies these techniques as forms of manual reproduction and asserts that with mechanical reproduction we see the development of something significantly new. The main artforms he has in mind, and discusses at length in the essay, are photography and film. With these, the process of the production of images itself is largely mechanical and reproduction can no longer be said to simply copy an original. Benjamin famously claims that what artworks previously had, which they lose through mechanical reproduction, is what he calls aura. The aura of a work is the unique ‘presence’ which the original exudes in occupying a distinct time and place. It is what gives a work its authenticity and makes it possible to distinguish between an authentic original and a forgery: the authentic original has occupied a unique series of times and places, which constitutes its history. Benjamin rightly notes that with arts such as photography and film, it no longer makes sense to draw such distinctions: one reproduction from a photographic negative, for example, is no more or less authentic than another. Instead of existing as a unique object, the work of art in the age of mechanical reproduction now exists as a multiplicity of copies.

According to Benjamin, this change also has the effect of extracting the artwork from tradition. For him, the uniqueness of a work means that it is embedded in a fabric of tradition. This traditional uniqueness is associated with the anthropological basis of artworks in ritual, and Benjamin uses Marx’s categories of use value and exchange value to suggest that ritual or cult value is the original use value of an artwork. While such value becomes secularised in the ‘cult of beauty’ that is modern aesthetics and the art world, something of the use value of the work persists in the emphasis on its authenticity. However, Benjamin argues that with the advent of mechanical reproduction, artworks are finally liberated from this cult value and instead take on an ‘exhibition value’. Copies are put into mass circulation, exhibited far more widely than would be possible with an authentic original. For Benjamin, this accords with a broader social and cultural phenomenon of ‘the mass’, a sense of the universal equivalence and exchangeability of all things in the social domain. Mechanical reproduction feeds the desire of the masses for things to be brought close, as distinct from the unique work of art which is always at a distance (even when one is ‘present’ to it in a gallery or other setting). According to Benjamin, the quantitative transformation of artworks demanded by the masses also leads to a qualitative transformation, as the nature and function of art comes to be understood according to the model of the arts of mechanical reproduction. Thus, art is transformed by its loss of aura.

Moreover, Benjamin asserts that human modes of perception are historically transformable, and the arts of mechanical reproduction are altering our perceptions of the world. Techniques in photography and film such as the close-up and slow motion are not simply reproducing what we previously knew of the world but introducing new perceptions and forms of knowledge as they capture things entirely unknown to the naked eye. Benjamin suggests that as art loses its ritual or cult value it takes on a political value, and while the essay lacks a clear political programme or set of prescriptions, he asserts that the concepts it develops are useful for a revolutionary communist politics of art. For him, older aesthetic traditions based around the aura can be seen as culpable in their co-optation by fascist regimes, while the transformations wrought in the arts by industrial technologies open more promising possibilities for the politicisation of art through the democratic communication of ideas.

 c. Adorno

While Adorno’s reflections on art and culture developed to some extent from a critical disagreement with Benjamin over the democratising potentials of radio and film, this disagreement grew into a highly productive engagement. Adorno agrees with Benjamin that contemporary developments in society and the arts radically challenge modern and romantic aesthetics but has a far more pessimistic view of the mass media ‘culture industry’. For Adorno, popular culture (including most radio and film) is complicit with the contemporary social system of capitalist exploitation, which he analyses as a culmination of the logic of ‘instrumental rationality’ devolving from the Enlightenment. In this system, human beings are radically alienated from nature through the project of manipulation and control of the natural world, which has not resulted in the hoped-for emancipation of human beings, but in a ‘new barbarism’ in which we are psychologically dominated by the very system which was supposed to set us free. This system is one which determines everything according to a logic of rational specification and calculation, leaving little room for understanding ourselves or relating to the world other than in terms of what is instrumentally useful or productive.

The culture industry acts as an ideological support of this system, keeping people blind to the real conditions of their existence. However, Adorno saw more positive potentials in ‘genuine’ art, in particular experimental modernism. Through extensive writings on musicology, literature, and the visual arts, Adorno contributed one of the most important bodies of work in continental aesthetics. These reflections culminated in Adorno’s last book, Aesthetic Theory, completed but not finally edited by him.

Aesthetic Theory deliberately employs strategies which resist any easy summation of the work into simply stated concepts and theses. Adorno developed a paratactic style of writing on the model of atonal music, in which sentences clash with each other to dissonant effect, rather than developing a clear line of argument. Moreover, Adorno deploys his own ‘negative’ dialectical style of thinking, in which pairs of contrasting concepts ‘constellate’ around topics of discussion without resolving into static propositions and conclusions. These difficulties are far from arbitrary and are, in fact, highly motivated by Adorno’s own views on how critical thought can best resist the system in which it operates. This includes a demand that thought take time and be difficult, in contrast to the expectations of quick and easy consumption which dominate in the culture industry. Nevertheless, there are several key themes the work develops which are readily appreciable in the context of the broader tradition of aesthetics in continental philosophy we are outlining here.

First, Aesthetic Theory is a critical interrogation of the tradition of philosophical aesthetics itself, especially as exemplified by the works of Kant and Hegel. Adorno’s view is that many of the categories of traditional aesthetics are outmoded because of developments in both society and the arts. Nevertheless, he seeks to rethink such categories critically rather than simply abandon the legacy of the aesthetic tradition. To take one easily appreciable example, Adorno draws attention to the limitations of the aesthetic tradition’s focus on the beautiful in the face of the apparent ugliness and dissonance characteristic of much modernist art. Second, Adorno (somewhat infamously) asserts the autonomy of the artwork. (Here he follows Kant on the autonomy of aesthetic judgement but insists that this autonomy should also be ascribed to the art object itself.) This is an insistence that the aesthetic value of an artwork is independent of other values which might be ascribed to it, such as epistemological or ethical value. Unfortunately, this claim has often been (mis)interpreted to mean that artworks should be understood to be entirely unrelated to their social context or political value. Adorno’s view is more complex and, in fact, strongly asserts the relevance of cultural context and the political import of art.

As a third major point, then, for Adorno, artworks may be understood as ‘monads’ (a concept drawn from Gottfried Leibniz): while they are independent, self-enclosed entities, they are products of the social conditions in which they are created and mirror these social conditions within them. Following on from Marx’s framework of analysis, Adorno sees the conditions of contemporary capitalist society as fundamentally contradictory, and it is these contradictions which the artwork embodies. Adorno argues that the most politically relevant artworks are not ones with explicit political content, but those which best reflect the deeply conflicted conditions of contemporary culture (such as the atonal compositions of Arnold Schoenberg or the absurdist literature of Samuel Beckett). Such works have a ‘truth content’ but not one which could be stated in clear propositions with cognitive value. Moreover, the autonomy of art in fact gives it a function of political resistance: while the artwork, like everything else under the conditions of contemporary capitalism, has a commodity form, it also resists incorporation into the instrumentally rationalised system of production and consumption through its very uselessness. For Adorno, modernist experimental art is a privileged site of politics in the contemporary world, as it can both reflect and resist the difficulties and contradictions of contemporary existence better than explicit political discourse. Nevertheless, due to its opacity, art still needs a philosophical aesthetics to aid in its comprehension, and the complex arguments of Aesthetic Theory attempt to rework the concepts of the aesthetic tradition so that they become adequate to the task. While there was no outright dialogue between them, there are palpable and interesting resonances between Adorno’s aesthetic theory and the dominant modernist school of art theory which was developed in America by Clement Greenberg, Michael Fried, and others in the twentieth century. As we find in Adorno’s writings, this brand of aesthetic modernism also combined a concern with formalism, autonomy, and experimentation in the arts with a belief in the socially and politically critical relevance of such works.

6. Poststructuralism

 a. Introduction

Poststructuralism is the name given in the English-speaking world to a loose collection of influential French philosophers and theorists working in the wake of structuralism, a movement which itself deserves some mention for its impact on aesthetics in continental philosophy. Structuralism came to prominence in France in the nineteen-fifties and -sixties, rivalling and, to some extent, succeeding phenomenology and existentialism as a leading methodological approach in the human sciences. It applies some basic tenets of Ferdinand de Saussure’s structural linguistics to phenomena other than language, such as the unconscious (Lacan, as we have seen above in section 4. c.), myth and ritual (Claude Lévi-Strauss), and history (Michel Foucault). Most significantly for aesthetics, Roland Barthes applied structuralist principles to literary criticism, and developed Saussure’s suggestion of a ‘semiology’, a study of signs in general (broader than the study of linguistic signs alone), applying such an approach to various forms of art and culture. Simply put, structuralism views the meaningful content of any phenomenon as given in the structured relations between basic units (signs). This structure is taken to be hidden (or deep), and interpretation of an artwork or cultural product then becomes a matter of making the structure which informs it explicit. Because of its formalism and methodological rigour, structuralism was touted by its supporters as a more ‘scientific’ method for studying the phenomena of the human sciences (that is, ‘meaningful’ phenomena), and it swept through the French academy like a revolution.

To some extent, poststructuralism can be understood as a philosophical reaction to the excessive zeal for formal method that structuralism exhibited. Most poststructuralists continued to draw on the phenomenological tradition, as well as psychoanalytic theory, and adopted aspects of structuralism while critiquing others. In short, poststructuralists tend to argue that meaning is not reducible to static structures and cannot be uncovered using a formal method. Generalising greatly, we might say that poststructuralists insist upon the necessity of some element of indeterminacy (which accounts for the genesis of the structure) that operates within the structure to generate meaning, and that constitutes an instability which threatens the coherence of the structure and may disrupt it and cause it to change. Understood as an interplay between structure and the element of indeterminacy (often called ‘the event’), meaning cannot be uncovered using a formal method, and poststructuralists have had recourse to highly unorthodox, experimental modes of thinking and writing in theorising and demonstrating those aspects of meaning or effect they believe structuralism misses. Art and aesthetics have been significant topics for all poststructuralists because, as the philosophical tradition attests, aesthetic meaning or effect seems to be a paradigm case of a kind of meaning which is not ‘scientific’. I will summarise here some of the key ideas of the two poststructuralists who have been most influential in aesthetics (Jacques Derrida and Gilles Deleuze) as well as those of the philosopher in this tradition who has engaged most extensively with art, Jean-François Lyotard.

 b. Derrida

Derrida and his philosophy of deconstruction have had an enormous influence on literary criticism and some influence as well in the wider arts and aesthetic theory. Notoriously difficult to summarise, Derrida’s works may be approached for our purposes through the observation that he develops a quasi-transcendental theory of meaning, which has implications for how meaning is understood to operate in philosophy, literature, and the arts. In post-Kantian, contemporary continental philosophy, ‘transcendental’ refers to the ‘conditions of possibility’ for a thing. The ‘quasi’ in Derrida’s case notes that while traditional transcendental thinking posits a priori structures that are taken to be universal and necessary, Derrida follows Heidegger in positing that the way things become meaningful is a function of time and subject to temporal and historical change. Derrida’s ‘principle’ of meaning, which claims to capture something of these conditions for anything whatsoever being meaningful, is ‘arche-writing’. This term indicates that it is some of the key features or properties of writing, as it has been understood in the metaphysical tradition, which are quasi-transcendental conditions of the possibility of meaning, rather than writing as such. These features are indicated by Derrida’s well-known term différance, which indicates spatial differing and temporal deferring.

This idea of différance contests the principle of meaning which has, according to Derrida, dominated throughout the Western tradition, which he calls the ‘metaphysics of presence’. This theory proposes an origin or full presence of ‘pure’ meaning in an idea held in the mind, which is then progressively corrupted by being put into spoken, then written, discourse. This supposed corruption of meaning corresponds with the spatial and temporal differing and deferring which, Derrida contends, are in fact the conditions of anything being meaningful in the first place. According to Derrida, there is no possibility of a pure, simple, original meaningful presentation, and every apparently original presentation is always already a repetition or a re-presentation. His arguments are extremely complex but may be treated summarily by noting how they draw on the traditions of phenomenology and structural linguistics. Engaging with the Husserlian phenomenological tradition, which takes consciousness as the transcendental condition of meaning, Derrida reads Husserl to show that conscious experience requires a synthesis of different temporal moments, such that any ‘presence’ of something to consciousness is already subject to the passing of time, that is, temporal difference. From the structural linguistics of de Saussure, Derrida draws the idea that every linguistic meaning only functions because of the possibility of its reiteration, or what Derrida calls its ‘iterability’. Every linguistic usage draws from an already-existing store of linguistic meaning (the virtual structure of language as a whole), and in that sense is already a reiteration. Moreover, every use presupposes the possibility of the listener or reader reiterating the use in another context, because the very nature of linguistic competence—and thus, the capacity to understand—depends upon the ability to use language in this appropriative and citational manner.

While Derrida has often been seen as collapsing the distinction between philosophy and literature, he is in fact drawn to the latter and deploys it to contaminate and complicate the former because of the differences he sees between them. While he seeks to deconstruct any simple opposition between philosophy and literature, such a deconstruction would not be possible without also insisting on the differences between them. Philosophy has traditionally set itself up in opposition to the ‘merely’ literary, claiming truth to be its own exclusive competence and categorising literature as belonging to the fictional or untrue. Philosophical texts have typically been tightly structured according to the metaphysics of presence, deployed in structures of binary oppositions which set up hierarchies of meaning, such as truth/falsity, essence/appearance, form/matter, presence/absence, and so on. By contrast, although Derrida sees all meaning and all texts as to some degree structured by the metaphysics of presence, he sees the virtue of literature (and especially the works of experimental writers such as Stéphane Mallarmé, James Joyce, or Antonin Artaud) as asserting and developing the ambiguities, contradictions, aporias, and playfulness of meaning that philosophical texts and modes of writing strive to suppress. Deconstruction, for Derrida, is a strategy of reading and writing which aims to identify and subvert the binary oppositions structuring a text, showing how the privileged term is in fact parasitic on the underprivileged one, and opening up the space for a play of meaning beyond simple oppositions by inventing concepts (such as the trace, différance, the hymen, and so on) which are ‘undecidable’ from the point of view of such oppositions. What Derrida finds in literature are such undecidables already in play to a much greater extent than in philosophical texts, and he affirms and reinscribes these in his own writings. Derrida strove to emulate literary modes of writing in his philosophical texts precisely in order to open them to a freer play of meaning. Through Derrida’s influential association with prominent literary critics such as Paul de Man and J. Hillis Miller, deconstruction became enormously popular in literary criticism from the nineteen-seventies to -nineties, often taking the form of a reductive methodology for exposing contradictions internal to a text, which Derrida himself would never have approved of.

When Derrida turns his attention to the visual arts, in texts such as The Truth in Painting, he develops concepts (such as the trait, the parergon, and the subjectile) which essentially follow the same differential logic as arche-writing. Derrida suspects any supposition of a pure presence of meaning in an image and works in various ways to complicate this, showing that images depend on an ambiguous play between concepts and categories such as the inside and outside of the frame, the visible and the invisible, word and image, single artwork and entire oeuvre, and so on. These playful movements are processes of spatial differing and temporal deferring, which work against the metaphysics of presence and underline a differential form of meaning in the visual which is similar to that which he sees operating in the written text. Moreover, Derrida also seeks to complicate any simple opposition between visual and textual meaning, seeing such an opposition as itself implying a metaphysics of presence. This complication is notably played out in his text on the Italian artist Valerio Adami in The Truth in Painting, where attention is given to the communication and interplay of meaning between Adami’s images and the text he places within, outside, and in transgression of the frame of his visual works.

 c. Lyotard

Lyotard’s Discourse, Figure (1971) stages a significant encounter between phenomenology, structuralism, and psychoanalysis, with the aim of doing justice to the aesthetic event, and in particular the visual. Lyotard insists—against structuralism, hermeneutics, and indeed much of the literature of art history and visual culture—that the visual has its own kind of meaning, which differs from and cannot be reduced to linguistic meaning. For him, it is wrong to say that a picture can be ‘read’. Instead, he tries to account for how art can leave us with the feeling of being ‘lost for words’. The first part of Discourse, Figure carefully compares the kind of meaning proper to perception, as developed in Merleau-Ponty’s phenomenology (see section 2. c.), with the kind of meaning operative in language according to structuralism. While it is clear that Lyotard thinks Merleau-Ponty gives a more adequate account of the kind of meaning specific to the visual, he also finds phenomenology ultimately inadequate. He argues that Merleau-Ponty’s notion of the ‘flesh’ remains a too-harmonious interface with the world at the level of conscious perception, and so turns to psychoanalysis to try to find in the unconscious the source of the radical creativity and sheer unexpectedness characteristic of avant-garde art.

Lyotard objects to Lacan’s structuralist reading of the unconscious, however, and believes that the latter’s interpretation of art as lodged in the register of the imaginary, acting as a lure for desire, is an affront to the grandeur of art (see section 4. c.). Employing a close reading of Freud, he develops an alternative view of the unconscious, which emphasises plastic transformations (rather than linguistic operations) of its contents. Lyotard also objects to much of Freud’s own explicit aesthetics, however, and argues that the meaning of an artwork is not to be found in the pathology of the artist. Instead, he develops Freud’s theory of the unconscious and desire, along with Merleau-Ponty’s phenomenology, to give a complex account of the artwork: it is neither simply the impression of conscious perceptions, nor the expression of unconscious desires (fantasies), but a mutual deconstruction of one by the other, which produces a new and unrecognisable form. He refers to this deconstructive element as the figural.

In Lyotard’s later work, he reconfigures the traditional aesthetic category of the sublime to account for and defend avant-garde art and the significance of the aesthetic in the contemporary world. Lyotard now follows Adorno in postulating a crisis of traditional aesthetics, both in relation to the conditions of (post)industrial capitalism and developments in the arts, and tries to update aesthetics in response (see section 5. c.). For Lyotard, there is a crisis of meaning on the level of perception in the contemporary world, because—following the analyses of Heidegger, Benjamin, Adorno, and others—scientific and technological developments, operating in tandem with capitalism, have mutated the perceptual bearings by which we coordinated ourselves in the world. Sciences and technologies have both extended our sensory capacities (seeing and hearing at a distance, through television and telephones, for example), and revealed a reality beyond our body’s capacities for sensory awareness (atoms, microbes, nebulae, and so on). According to Lyotard, these changes have meant that the basic forms of sensory experience—time and space—have been thrown into uncertainty.

Lyotard sees avant-garde art of the twentieth century as having pursued an analogous exploration of this crisis of perception. Traditionally, aesthetics has been concerned with the beautiful, understood in the arts as an ideal fit between the form and matter of a work. Lyotard sees avant-garde art, especially minimalism and abstraction, as moving away from a concern with ‘good form’ and towards an exploration of matter. Following Kant, Lyotard understands ‘matter’ as something which defies rational calculation and specification: for example, colour in painting and timbre in music. While Kant himself only saw the sublime in art in depictions of sublime scenes in nature (mountains, storms at sea, and so forth), Lyotard suggests that the sublime is the aesthetic category appropriate to art which is less interested in exploring formal structure than in experimenting with matter, precisely because the sublime concerns feeling in relation to something ‘formless’. Lyotard characterises the sublime stake of art as ‘presenting the unpresentable’, because for him the aesthetic event is something which cannot be reduced to a ‘presentation’, understood in the Kantian sense as a ‘good form’ given to a sensation. Rather, art-events evoke thoughts and feelings in relation to works which surprise us and which we cannot make sense of at the level of either perception or concept: works that leave us feeling moved but lost for words. In his later works, the sublime is the aesthetic category which Lyotard thinks best names this feeling. However, he also seeks to update this category in relation to the way it was understood in romantic or modern aesthetics. While in such aesthetics it had invoked the Idea of the absolute through a nostalgic feeling of loss for something transcendent which is missing, Lyotard posits a postmodern, immanent sublime in which the absolute is nothing other than the formless work itself.

 d. Deleuze

Both in his writings with Félix Guattari and on his own, Gilles Deleuze made important and influential contributions to the philosophy of film, painting, literature, and music. Many of his reflections on aesthetic issues are summarised in his last book with Guattari, What Is Philosophy?, where they are accompanied by a criticism of the phenomenological approach to aesthetics and Merleau-Ponty’s notion of the ‘flesh’ in particular. As is characteristic of all of Deleuze’s work, he sees the level of perception with which phenomenologists are preoccupied as insufficiently deep to provide a full account of reality. It is on this level that he and Guattari situate, for example, Mikel Dufrenne’s a priori of aesthetic experience, and Merleau-Ponty’s flesh (see section 2. c.). Deleuze and Guattari delve deeper to give an account of art and aesthetic experience grounded in a metaphysical description of reality, where ‘sensation’ becomes the key aesthetic issue. Sensation is posited in a register prior to the distinction between subject and object, and consists of two main types: percepts and affects. Understood in this specific sense, they are perceptions and feelings considered independently of the lived experience which reveals them and raised to the level of independent metaphysical existence. For Deleuze and Guattari, a work of art is a ‘being of sensation’, a compound of percepts and affects, which is a ‘monument’ that preserves the sensation in and as the material from which the work is made. While for them the artist undoubtedly has a role in creating the work and the spectator or auditor a role in appreciating it, the emphasis is on the independent ontological status of the work as embodying that aspect of being which is sensation. They associate Merleau-Ponty’s flesh with the lived experience which reveals sensation, but insist on two further, deeper, and necessary conditions for sensation: the ‘house’, and ‘cosmic forces’. (While terms such as these may appear strange in the context of philosophical discourse, these and others are inspired by the writings of artists and other non-philosophers, and their use indicates a characteristic poststructuralist desire to think art with artists and art itself, rather than construct an independent theory about it, on the model of traditional aesthetics.)

Briefly glossed, the ‘house’ is the structure that gives sensations some consistency, such as the frames of paintings or the walls of architectural constructions but also more abstract principles of composition. ‘Cosmic forces’ are the basic physical and metaphysical forces constituting the real. Deleuze and Guattari list gravity, heaviness, rotation, the vortex, explosion, expansion, germination, and time. The main point here is that Deleuze and Guattari want to connect the activity of art with things usually considered extraneous to art and, indeed, with the universe as a whole. One notable way this placement of art within a broad metaphysical view plays out is the claim that animals can be artists through their exploitation of the expressive qualities of materials in marking territory, attracting mates, and so on. However, Deleuze and Guattari also insist that art must be considered to be a form of thinking which thinks with sensations, just as philosophy thinks with concepts and science thinks with functions. Art thinks against the common opinions, doxa, or clichés of our perceptions and feelings and adds new varieties of sensation to the world. This insistence gives art a legitimacy equal to that of philosophy and science, again indicating the importance accorded to the aesthetic which is characteristic of continental philosophy.

7. Developments in the Early 21st Century

 a. Introduction

Contemporary continental philosophy continues to see contributions to aesthetics which build on all of the traditions previously discussed. Some twentieth-century contributions to continental aesthetics, such as Adorno’s Aesthetic Theory or Lyotard’s extensive writings on the arts, still await much-needed interpretation and discussion before the potential of their influence can be made manifest. In addition, contributing to the broad, pluralist landscape of aesthetics in continental philosophy, most of the more prominent continental philosophers in the early 21st century have written on the arts, including figures such as Slavoj Zizek, Alain Badiou, Giorgio Agamben, Jean-Luc Nancy, Michel Serres, Peter Sloterdijk, and Bernard Stiegler. Perhaps the most notable of these, however, is Jacques Rancière, whose distinctive works in aesthetics during the first fifteen years of the 21st century have revivified thinking on the relations between art and politics. We may therefore take Rancière’s work as an indicative example of early 21st century developments in continental aesthetics, while keeping in mind that this is just one of many important developments.

 b. Rancière

Rancière has become known for the idea of the ‘distribution of the sensible’, which suggests that systems of inclusion and exclusion, and of political relationships generally, do not operate only on the conceptual or cognitive level but also on the sensory level. The idea of the distribution of the sensible captures the way the world is divided up according to sensations and the political implications of this. Rancière suggests that within communities there is a dimension of the sensible that is held in common by all members, allowing a common participation in the community as such, but that this is subdivided into parts, dividing members according to different areas of participation and non-participation. The distribution of the sensible concerns the circulation of words and images, the demarcation of spaces and times, and the forms of activity. It concerns the way that certain things are held to be meaningful or even self-evident to sense perception, while others are dismissed as meaningless noise. What is held to be meaningful on a sensorial level in a given context then affects what can meaningfully be thought, said, made, or done in that context. According to Rancière, social inequalities are in large part a result of this sensible distribution. A key implication of this idea is that art can be understood to be directly political on the level of the sensible (rather than indirectly, as simply representing ideas about social and political issues). Rancière’s politics is one of a non-utopian ideal of democratic emancipation, understood as the constant process of intervening in the current order to broaden spaces of participation and to open potentials of inclusion and participation where these are closed to parts of the community through the existing distribution of the sensible. Art can play an important political role by intervening in the existing order of distributions and helping to redistribute the sensible.

Rancière has also made a notable contribution to aesthetics in contesting the category of ‘modernism’, which has dominated much of the discourse around art history and aesthetic theory in the early 21st century. According to Rancière, modernism attempts to impose a single meaning and a single historical narrative on the course of developments in the arts, a course which he sees as more complex, involving multiple meanings and temporalities. Moreover, modernism abstracts developments in the arts from other social and cultural forms of collective experience, which, on the contrary, Rancière sees as co-determining them. In place of categories which organise artistic developments according to a simple linear historical progression, Rancière proposes three ‘regimes’ of the arts. These regimes operate to some degree in a historically periodising fashion—as different regimes have predominated in different historical periods—but they also complicate and cut across such periodisation. This is because they are not most fundamentally historical categories but, rather, ways that art is thought to operate or be significant, which can function in any historical period. Significantly, more than one regime of art can be operative at a single time. These regimes of art are 1. the ethical regime of images; 2. the representative (or poetic) regime of art; and 3. the aesthetic regime of art.

The ethical regime of images predominated in Ancient Greece and is exemplified by Plato’s discussion of images. Art does not emerge as a category here. Images are understood in relation to their effect on the ethos, or mode of behaviour, of members of the community, and they are interrogated according to their origin and their end, function, or purpose. In this regime, there are images which are thought to be truer or falser and to have a beneficial or detrimental effect on the ethical community. The representative regime of art predominated from the Renaissance to the nineteenth century. Here the idea not just of art but of a system of the arts emerged. Arts were thought of in terms of poetics; that is, sets of rules which determine the different forms of expression and arrange them in a hierarchy, and which also determine which forms of expression (arts, genres) are suitable for particular types of content. It is called the representative regime because this system of categorisation of the arts is organised around the key idea of representation, or mimesis, understood as a fit between form of expression and type of content. Finally, the aesthetic regime of the arts roughly corresponds with the experimentations more usually categorised with terms such as ‘modernism’ or ‘the avant-garde’. With this regime, the idea of art as something truly unique and singular emerges. However, this singularity is involved in a paradox, insofar as the rules governing the arts characteristic of the representative regime also break down. In the aesthetic regime, art is asserted as a special kind of activity but, since anything can now count as art, there are no longer any criteria for distinguishing it from other forms of activity or production. While the aesthetic regime predominates in the contemporary world with its highly pluralist art scene, Rancière insists that all three regimes are still operative today to some degree.

8. References and Further Reading

  • Adorno, Theodor W. Aesthetic Theory. Edited by Gretel Adorno and Rolf Tiedemann, translated by Robert Hullot-Kentor. London and New York: Continuum, 1997.
    • Adorno’s major work on aesthetics.
  • Benjamin, Walter. ‘The Work of Art in the Age of Mechanical Reproduction’. In Illuminations, edited by Hannah Arendt, translated by Harry Zorn, 211–44. London: Pimlico, 1999.
    • The best known of the several versions of this highly influential essay, in which Benjamin develops the concept of artistic ‘aura’.
  • Cazeaux, Clive, ed. The Continental Aesthetics Reader, 2nd ed. London and New York: Routledge, 2011.
    • A collection of classic readings in aesthetics across the major traditions in continental philosophy, accompanied by insightful introductory essays.
  • Deleuze, Gilles, and Félix Guattari. What Is Philosophy? Translated by Hugh Tomlinson and Graham Burchell. London: Verso, 1994.
    • The chapter ‘Percept, Affect, and Concept’ condenses many aspects of Deleuze’s more extensive treatments of painting, film, and literature, and positions art in relation to philosophy and science.
  • Derrida, Jacques. Acts of Literature. Edited by Derek Attridge. London and New York: Routledge, 1992.
    • An edited collection of some of Derrida’s most important writings on literary topics, including essays on Mallarmé, Joyce, Kafka, Ponge, and Celan, and an interview with Derrida on literature.
  • Derrida, Jacques. The Truth in Painting. Translated by Geoff Bennington and Ian McLeod. Chicago and London: University of Chicago Press, 1987.
    • Derrida’s most well-known application of deconstructive strategies to aesthetic topics beyond literature. Contains the essay on Valerio Adami, ‘+R (Into the Bargain)’.
  • Freud, Sigmund. Writings on Art and Literature. Stanford: Stanford University Press, 1997.
    • A selection of Freud’s writings on aesthetic topics collected from James Strachey’s Standard Edition, which however does not include the two important texts listed below.
  • Freud, Sigmund. ‘Creative Writers and Day-Dreaming’. In Jensen’s ‘Gradiva’ and Other Works. Vol. 9 of The Standard Edition of the Complete Psychological Works of Sigmund Freud, edited by James Strachey, 141–54. London: Hogarth Press, 1959.
    • Freud, Sigmund. ‘Leonardo da Vinci and a Memory of His Childhood’. In Five Lectures on Psycho-Analysis, Leonardo da Vinci, and Other Works. Vol. 11 of The Standard Edition of the Complete Psychological Works of Sigmund Freud, edited by James Strachey, 63–138. London: Hogarth Press, 1957.
  • Gadamer, Hans-Georg. Truth and Method, 2nd revised ed. Translated by Joel Weinsheimer and Donald G. Marshall. London: Bloomsbury, 2013.
  • Gadamer, Hans-Georg. The Relevance of the Beautiful and Other Essays. Edited by Robert Bernasconi, translated by Nicholas Walker. Cambridge: Cambridge University Press, 1986.
  • Heidegger, Martin. Nietzsche: Volumes One and Two. Translated by David Farrell Krell. San Francisco: Harper Collins, 1991.
    • Volume one, ‘The Will to Power as Art’, presents Nietzsche’s view of art as holding a privileged ontological status and a value higher than truth.
  • Heidegger, Martin. ‘The Origin of the Work of Art’. In Off the Beaten Track, edited and translated by Julian Young and Kenneth Haynes, 1–56. Cambridge: Cambridge University Press, 2002.
    • Heidegger’s most extensive, significant, and well-known contribution to aesthetics.
  • Kearney, Richard, and David Rasmussen, eds. Continental Aesthetics: Romanticism to Postmodernism: An Anthology. London: Wiley-Blackwell, 2001.
    • A useful collection of classic readings, though unaccompanied by any guiding text.
  • Lacan, Jacques. The Seminar of Jacques Lacan, Book XI: The Four Fundamental Concepts of Psychoanalysis. Edited by Jacques-Alain Miller, translated by Alan Sheridan. New York and London: W.W. Norton & Company, 1978.
    • Transcripts of Lacan’s seminar delivered in 1964 and first published in French in 1973. Contains the most influential of Lacan’s works relating to aesthetics, ‘Of the Gaze as Objet petit a.’ A problematic translation, but still the only one available.
  • Lyotard, Jean-François. Discourse, Figure. Translated by Antony Hudek and Mary Lydon. Minneapolis: University of Minnesota Press, 2011.
    • The definitive statement of Lyotard’s early aesthetics, which stages an encounter between structuralist, phenomenological, and psychoanalytic approaches.
  • Lyotard, Jean-François. Writings on Contemporary Art and Artists. Edited by Herman Parret. Leuven: Leuven University Press, 2013.
    • Extensive, though still not exhaustive, bilingual (French and English) collection of Lyotard’s writings on aesthetic topics.
  • Merleau-Ponty, Maurice. The Merleau-Ponty Aesthetics Reader. Edited by Michael B. Smith. Evanston, Illinois: Northwestern University Press, 1993.
    • Contains the essays ‘Cézanne’s Doubt’, ‘Indirect Language and the Voices of Silence’, and ‘Eye and Mind’, along with introductory essays on each and a collection of critical essays on Merleau-Ponty’s philosophy of art.
  • Merleau-Ponty, Maurice. The Visible and the Invisible. Edited by Claude Lefort, translated by Alphonso Lingis. Evanston: Northwestern University Press, 1968.
    • Merleau-Ponty’s final, unfinished book. Contains the chapter ‘The Intertwining—The Chiasm’, which outlines the ontology developed in relation to painting in ‘Eye and Mind.’
  • Rancière, Jacques. Dissensus: On Politics and Aesthetics. Edited and translated by Steven Corcoran. London: Bloomsbury, 2010.
    • A collection of some of Rancière’s most important essays on the relationship of politics and aesthetics.
  • Rancière, Jacques. The Politics of Aesthetics. Edited and translated by Gabriel Rockhill. London: Bloomsbury, 2013.
    • A brief, accessible introduction to some of Rancière’s most important ideas in aesthetics, such as the distribution of the sensible, the critique of ‘modernism’ as an aesthetic category, and the three regimes of art.
  • Rancière, Jacques. Aisthesis: Scenes from the Aesthetic Regime of Art. Translated by Zakir Paul. London: Verso, 2013.
    • Rancière’s most complete and definitive work on aesthetics to date.

 

Author Information

Ashley Woodward
Email: a.z.woodward@dundee.ac.uk
University of Dundee
United Kingdom

Epistemology and Relativism

Epistemology is, roughly, the philosophical theory of knowledge, its nature and scope. What is the status of epistemological claims? Relativists regard the status of (at least some kinds of) epistemological claims as, in some way, relative—that is to say, the truths to which (some kinds of) epistemological claims aspire are relative truths. Self-described relativists vary, sometimes dramatically, in how they think about relative truth and what a commitment to it involves. Section 1 outlines some of these key differences and distinguishes between broadly two kinds of approaches to epistemic relativism. Proposals under the description of traditional epistemic relativism are the focus of Sections 2-4. These are: (i) arguments that appeal in some way to the Pyrrhonian problematic; (ii) arguments that appeal to apparently irreconcilable disagreements (for example, as in the famous dispute between Galileo and Bellarmine); and (iii) arguments that appeal to the alleged incommensurability of epistemic systems or frameworks. New (semantic) epistemic relativism, a linguistically motivated form of epistemic relativism defended with the most sophistication by John MacFarlane (for example, 2014), is the focus of Sections 5-6. According to MacFarlane’s brand of epistemic relativism, whether a given knowledge-ascribing sentence is true depends on the epistemic standards at play in what he calls the context of assessment, which is the context in which the knowledge ascription (for example, ‘Galileo knows the earth revolves around the sun’) is being assessed for truth or falsity. Because the very same knowledge ascription can be assessed for truth or falsity from indefinitely many perspectives, knowledge-ascribing sentences do not get their truth values absolutely, but only relatively. The article concludes by canvassing some of the potential ramifications this more contemporary form of epistemic relativism has for projects in mainstream epistemology.

Table of Contents

  1. Relativism in Epistemology: Two Approaches
  2. Traditional Arguments for Epistemic Relativism: The Pyrrhonian Argument
  3. Traditional Arguments for Epistemic Relativism: Non-Neutrality
  4. Traditional Arguments for Epistemic Relativism: Incommensurability and Circularity
  5. New (Semantic) Epistemic Relativism: Assessment-Sensitive Semantics for ‘Knows’
  6. New (Semantic) Epistemic Relativism: Issues and Implications in Epistemology
  7. References and Further Reading

1. Relativism in Epistemology: Two Approaches

“Relativism” is notoriously difficult to define. There are however some core insights about relativism that are more or less embraced across the board amongst self-described relativists. One such insight is negative, framed in terms of what relativists are characteristically united in denying. Take for example the following epistemological claims:

  a. Copernicus’s belief that the earth revolves around the sun is justified.
  b. Edmund does not know that the man who will get the job has ten coins in his pocket.
  c. Knowledge is not factorable into component parts.
  d. Beliefs formed on the basis of direct observation are better justified than beliefs formed on the basis of drug-induced wishful thinking.

Relativists of all stripes typically deny at least one—if not all—of the following: that the truth of claims like (a)-(d) is applicable to all times and frameworks; that they are objective (for example, at most trivially dependent on our judgments or beliefs) and monistic (for example, in the sense that competing claims are mutually exclusive) (see Baghramian and Carter (2015)). In some cases—a notable example here is Richard Rorty (1979)—philosophers have been labelled relativists primarily on the basis of their distinctive denial(s) of such claims about the status of these kinds of judgments.

Moreover, along with denying the sorts of claims characteristic of metaepistemological realism (for example, Cuneo 2007: Ch 3), the epistemic relativist is also committed to denying the metaepistemological analogues of non-relativist positions that are familiar territory in contemporary metaethics.

For example, contra epistemic error theory (for example Olson 2009), which insists that claims like (a)-(d), which attribute epistemic properties, are categorically false, the epistemic relativist maintains that some claims like (a)-(d), which attribute epistemic properties, are true—albeit true in a way that is in some interesting sense ‘relative’. Likewise, contra the epistemic expressivist (for example Chrisman 2007; Gibbard 1990; Field 1998) who insists that claims like (a)-(d) are expressions of attitude, the relativist is a cognitivist. Accordingly, the relativist maintains that (a)-(d) are truth-apt, while adding that the truth-aptness is not to be thought of as the realist thinks of it; expressions like (a)-(d) are relatively truth-apt in that the truths they aspire to are relative truths. (We consider shortly what this might involve—as the point is highly controversial amongst relativists).

Another core insight about relativism, generally construed, is co-variance (for example Baghramian 2004; 2014 and Swoyer 2014). Co-variance is the idea that some object, x, depends on some underlying, independent variable, y, such that, in some suitably specified sense, change in the latter results in a change in the former. In embracing relativism about some class of truths, one thereby embraces some kind of co-variance claim. For example, a cultural relativist about epistemic justification tells us that the truth of claims like (a) and (b) varies with local cultural norms, and in doing so holds that a change in cultural norms brings with it a change in what one counts as knowing, justifiably believing, and so forth.

Beyond these mostly uncontroversial ingredients of a relativist proposal—or necessary conditions for being a relativist—the matter of what is sufficient for a view to count as a relativist view is controversial. One influential approach to characterizing relativism has been put forward by Paul Boghossian (2006a). As Boghossian sees things, we can attribute to the epistemic relativist the following package of three claims: epistemic non-absolutism, epistemic relationism and epistemic pluralism.

Epistemic Relativism (Boghossian’s Formulation)

  A. There are no absolute facts about what belief a particular item of information justifies. (Epistemic non-absolutism)
  B. If a person, S’s, epistemic judgments are to have any prospect of being true, we must not construe his utterances of the form ‘E justifies belief B’ as expressing the claim E justifies belief B but rather as expressing the claim: According to the epistemic system C, that I, S, accept, information E justifies belief B. (Epistemic relationism)
  C. There are many fundamentally different, genuinely alternative epistemic systems, but no facts by virtue of which one of these systems is more correct than any of the others. (Epistemic pluralism)

Boghossian’s model is often called the replacement model for formulating epistemic relativism. This is largely due to the inclusion of claim (B), the epistemic relationism thesis. In attributing relationism to the epistemic relativist, Boghossian (2006a: 84) regards the relativist as effectively endorsing a replacement of unqualified epistemic claims with explicitly relational ones. As he puts it:

[…] the relativist urges, we must reform our talk so that we no longer speak simply about what is justified by the evidence, but only about what is justified by the evidence according to the particular epistemic system that we happen to accept, noting, all the while, that there are no facts by virtue of which our particular system is more correct than any of the others.

One of the central moves Boghossian makes against the epistemic relativist in his monograph Fear of Knowledge is to argue that epistemic relativism—formulated as such—is ultimately an incoherent position. In response, some critics—notably Martin Kusch (2010)—have replied that epistemic relativism, formulated in accordance with the replacement model, is not incoherent for the reasons Boghossian suggests—or, at least, in Kusch’s case, that there is a version of this view that is defensible.

A comparatively deeper issue, however, and one that is prior to whether the replacement model leads to incoherence, is whether the inclusion of the relationist clause is an apt way of representing the relativist’s view. Though Boghossian and Kusch disagree on the matter of whether epistemic relativism formulated within the replacement model is tractable, both think that the framework is capable of characterising the epistemic relativist’s core position.

But this point is highly controversial. Crispin Wright (2008: 383), for instance, says of Boghossian’s inclusion of the relationist clause in formulating epistemic relativism:

We can envision an epistemic relativist feeling very distant from this characterisation and of its implicit perception of the situation.

Wright’s complaint, in the main, is that insisting on the relationist clause is tantamount to insisting that the only way the relativist (who must reject absolute facts about what justifies what) can make sense of how claims of the form ‘S is justified in believing X’ are true (at all) is by construing their content in an explicitly relational way, so that the explicitly relational truths (for example, ‘S is justified in believing X, according to system A’) are themselves candidates for absolute truth.

But this, Wright says:

[…] is just to fail to take seriously the thesis that claims such as [sic … S is justified in believing X] can indeed be true or false, albeit, only relatively so. (Ibid., 383, my italics).

Wright’s complaint, as quoted in this passage, gestures to what is probably the most substantial divide in the contemporary landscape in relation to epistemic relativism. There are really two important and connected ideas that need unpacking here. The first has to do with charity, and the second has to do with inclusiveness.

Regarding charity: to the extent that one insists that epistemic relationism is an indispensable component of epistemic relativism, one is de facto excluding (by viewing as tacitly unintelligible) the thought that non-explicitly relational claims (for example S is justified in believing p) can be true or false, albeit, only relatively so. And so if it turns out that this excluded possibility is a viable one, then the attribution to the relativist of the relationist clause is not a suitably charitable way of formulating the relativist’s position.

New (semantic) relativists—whose motivations draw from analytic philosophy of language—regard this excluded possibility as not only viable but, moreover, as the only legitimate way to capture a philosophically interesting kind of relativist position. The rationale for thinking this way has been articulated most notably by John MacFarlane (for example 2007, 2011, 2014). MacFarlane’s work over the past decade has stressed that simply relativizing propositional truth to what seem like exotic parameters (for example, parameters other than worlds and times, such as judges, perspectives, or standards, including epistemic standards) is not in itself ‘enough to make one a relativist about truth in the most philosophically interesting sense’. This is because such relativization is compatible with truth absolutism, and MacFarlane’s position is that philosophically interesting relativism must part ways with the absolutist.

Consider, for example, that the epistemic contextualist (for example Cohen 1988; DeRose 1992, 2009) insists that whether ‘S knows that p’ is true can shift with different standards at play in different contexts in which the sentence ‘S knows that p’ is used. This is because, for the contextualist, my utterance of “Keith knows the bank is open” can express different propositions depending on the context in which I use this sentence. If I use the sentence in a context in which it doesn’t matter to me whether the bank is open, what I’ve asserted can be true even if the very same sentence, uttered in a context in which it is extremely important to me that the bank is open, would come out false—and for the contextualist, this is so even if all other epistemically relevant features of Keith’s situation (for example what evidence Keith has for thinking the bank is open) are held fixed across these contexts of use. When knowledge ‘is relative to an epistemic standard’ in the way that the contextualist relativizes knowledge to an epistemic standard, it remains that a particular occurrence of ‘knows’, used in a particular context, gets its truth value absolutely. A philosophically interesting relativist, as MacFarlane sees it, denies this. The line, according to MacFarlane, between the (genuine) relativist and the non-relativist is best understood as a line ‘between views that allow truth to vary with the context of assessment and those that do not’ (2014, vi). A context of assessment is a possible situation in which a use of a sentence might be assessed, where the agent of the context is the assessor of the use of a sentence. This view is described in more detail in Section 5.

This brings us to the point about inclusiveness. From the perspective of a new (semantic) relativist like MacFarlane, the kind of position described by Boghossian as epistemic relativism is not really an interesting relativist position. Boghossian’s epistemic relativist, modelled on Gilbert Harman’s (1975) moral relativism, is (by MacFarlane’s lights) best understood as a version of contextualism (see MacFarlane (2014: 33, fn. 5)). After all, (à la epistemic relationism) the explicitly relational claims which Boghossian regards the relativist as being in the market to put forward as true are candidates for absolute truth.

This article does not attempt to adjudicate which kind of approach to thinking about relativism, more generally, is the right one. Rather, the article is divided into two main parts: in short, (i) arguments for epistemic relativism which do not give a context of assessment a significant semantic role (Sections 2-4)—which are termed traditional arguments for epistemic relativism—and (ii) arguments that do—which are termed new (semantic) epistemic relativism (Sections 5-6). The former kinds of arguments are not primarily motivated by considerations to do with how we use language, whereas the latter kind of argument strategy (the focus of Sections 5-6) is.

2. Traditional Arguments for Epistemic Relativism: The Pyrrhonian Argument

One influential argument strategy under the banner of epistemic relativism takes as a starting point a famous philosophical puzzle traditionally associated with Pyrrhonian skepticism—that is to say, the Pyrrhonian problematic. The most famous version of the puzzle, the ‘regress’ version of the problematic, goes as follows—the simple presentation here owes to John Greco (2013, 179). Suppose you claim to know that p is true but you are asked to provide a good reason for p. If it is granted that good reasons—for example the sort of reasons good enough to epistemically justify a belief—are non-arbitrary reasons, reasons that we have good reason to believe, then a regress threatens: any reason offered in support of p must itself be backed by a further good reason, which in turn must be backed by another, and so on without end. The idea is that, at least with the above assumptions in place, it looks as though knowledge as well as epistemic justification require an infinite number of good reasons. But it seems that this is something we do not have, and thus, as the puzzle goes, it looks like we do not know or justifiably believe anything. With reference to this puzzle, the sceptic effectively places the onus on her non-sceptical adversary to reject one or more of the assumptions underwriting the puzzle. Foundationalism, coherentism and infinitism are typically distinguished from one another with reference to which of these assumptions they reject.

Against this background, Howard Sankey (2010; 2011; 2012) has argued, in a series of papers, that the Pyrrhonian problematic offers the tools to capture the most compelling argument strategy available to the epistemic relativist; in one place, he writes that the ancient Pyrrhonian argument “constitutes the foundation for contemporary epistemic relativism” (Sankey 2012, 184, my italics).

Sankey’s argument comes primarily in two parts: a negative part and a positive part. Before outlining the negative part, some terminology is helpful. Sankey (2013: 3) defines epistemic relativism in a restricted way: as a view about epistemic norms, where he defines an epistemic norm as ‘a criterion or rule that may be employed to justify a belief’. Epistemic relativism is then defined as the thesis that there are no epistemic norms over and above the variable epistemic norms operative in different (local) cultural settings or contexts, where these local contexts are defined as always including at least a system of beliefs and a set of norms (Sankey 2012, 187). For Sankey’s relativist, whether a belief is justified, or counts as knowledge, depends on epistemic norms, and so, given that different epistemic norms can operate in different contexts, the same belief might count as rational, justified, or knowledge relative to one context but not to another.

Sankey’s ‘negative’ argument on behalf of the relativist appeals to the Pyrrhonian puzzle to generate the intermediate conclusion that all epistemic norms are on equal standing; his positive argument moves from the equal standing claim established by the negative argument to the conclusion that epistemic relativism (as he has defined it) is true. The negative argument can be summarized as follows: Take an epistemic norm, N1. Question: how is N1 to be justified? With reference to the Pyrrhonian puzzle, the options don’t look very promising. One option is to justify N1 by appealing to a further epistemic norm N2. Another option is to justify N1 by appealing to N1. Sankey says neither of these options satisfactorily justifies N1; the former generates an infinite regress, the latter is viciously circular. Now: take any other epistemic norms, N3, N4 … Nn. By running through this same line of thinking with any of N3, N4 … Nn in an attempt to justify any of these norms, we end up in the same place. That is, N1 and N3, N4 … Nn are all equally lacking in justification. From here, Sankey’s positive move (for example see Sankey 2011 §3, esp. pp. 564-566) on behalf of the relativist goes as follows:

If no norm is better justified than any other, all norms have equal standing. Since it is not possible to provide an ultimate grounding for any set of norms, the only possible form of justification is justification on the basis of a set of operative norms. Thus, the norms operative within a particular context provide justification for beliefs formed within that context. Those who occupy a different context in which different norms are operative are justified by the norms which apply in that context… the relativist is now in a position to claim that epistemic justification is relative to locally operative norms.

Sankey himself, not a relativist, pursues a naturalistically motivated overriding strategy in response to the argument—one which grants the relativistic challenge as legitimate and then attempts to meet it (2010). Carter (2016) and Seidel (2013), by contrast, have proposed undercutting responses which call into question whether the relativist can viably use the argument strategy which Sankey regards as the epistemic relativist’s strongest play. Carter (2016, Ch. 3) challenges the first (negative) part of the argument by noting that the intermediate conclusion (that all norms are equally justified) is one the would-be relativist is entitled to only if it is already granted that foundationalism, coherentism and infinitism are all unsuccessful. But Sankey’s relativist offers no positive case for this and instead takes it for granted.

Carter (2016) and Markus Seidel (2013, 137) have both expressed worries that, even if the first part of the argument were granted (and so, even if it were granted that the Pyrrhonian strategy is effective in establishing that all epistemic norms are on equal epistemic standing), it’s not clear how relativism is to be motivated over scepticism. As Seidel puts it, Sankey’s relativist actually travels so far down the road with the sceptic that the relativist is “at pains to provide us with reasons [for the relativist to] part company” (137). That is: once it has been claimed that all norms are equally unjustified—no norm is more justified than any other in any way—it is not apparent, as Seidel observes, how locally credible epistemic norms are supposed to have any positive epistemic status, positive status the relativist wants to preserve when insisting that epistemic norms aspire to relative justification.

For an alternative perspective on how relativism might be better motivated than scepticism—generally speaking—see Michael Williams (for example, 1991; 2001), who defends an anti-sceptical form of relativism (though he rejects this label), specifically a Wittgensteinian-inspired brand of contextualism (compare DeRose 1992), as an alternative to both scepticism and metaepistemological realism.

3. Traditional Arguments for Epistemic Relativism: Non-Neutrality

Another kind of argument for traditional epistemic relativism is what Harvey Siegel (2011: 205) has termed the non-neutrality argument. A much-discussed reference point for this argument strategy is Rorty’s (1979) discussion of the famous dispute between Galileo and Cardinal Bellarmine about Copernican heliocentrism. In short, Galileo and Cardinal Bellarmine could not agree about the truth of Copernican heliocentrism, but even more, they also could not agree about what evidential standards were even relevant to settling the matter. Galileo had argued for the Copernican picture on the basis of telescopic evidence. Cardinal Bellarmine dismissed Galileo’s suggestion that Earth revolves around the sun as heretical, by appeal to Scripture. From these disparate starting points, Rorty noted, it looked as though neither was in a position to appeal to neutral ground in the service of rational adjudication—each was operating within a different “grid which determines what sorts of evidence there could be for statements about the movements of the planets” (Rorty 1979: 330-331).

Siegel (2011: 205-206) captures, with reference to this case, the relativist’s reasoning as follows:

The relativist here claims that there can be no non-relative resolution of the dispute concerning the existence of the moons, precisely because there is no neutral, non-question-begging way to resolve the dispute concerning the standards. Any proposed meta-standard that favors regarding naked eye observation, Scripture, or the writings of Aristotle as the relevant standard by which to evaluate “the moons exist” will be judged by Galileo as unfairly favoring his opponents since he thinks he has good reasons to reject the epistemic authority of all these proposed standards; likewise, any proposed metastandard that favors Galileo’s preferred standard, telescopic observation, will be judged to be unfair by his opponents, who claim to have good reasons to reject that proposed standard. In this way, the absence of neutral (meta-) standards seems to make the case for relativism.

The pro-relativist argument that is motivated by the Galileo/Bellarmine dispute, which Siegel (2011: 206) calls “No Neutrality, Therefore Relativism”, as represented in Siegel’s passage, can be pared down to the following argument:

“No Neutrality, Therefore Relativism”

  1. There can be a non-relative resolution of the dispute concerning the existence of the moons, only if there is an appropriately neutral meta-norm available.
  2. In the context of the dispute between Galileo and Bellarmine, no such meta-norm is available.
  3. Therefore, it is not the case that there can be a non-relative resolution of the dispute concerning the existence of the moons.
  4. Therefore, epistemic relativism is true.

As stated, the argument is not valid. In order to make the argument valid, a further ‘bridge’ premise (or premises) would be needed to get from (3)—the premise that there can be no non-relative resolution of the dispute concerning moons [or some similar such dispute]—to the conclusion that epistemic relativism is true (4).

What are the prospects of ‘bridging’ (3) and (4)? The viability of a no-neutrality therefore relativism-style argument rests importantly on this question. Steven Hales (2014) defends a version of the no-neutrality therefore relativism argument which attempts to bridge the gap (between (3) and (4)) via a process of elimination. Hales argues, with reference to a case involving a similarly deadlocked dispute concerning the nature of the human soul (between interlocutors who adhere to analytic philosophy of mind and to the Catechism, respectively), that—from their irreconcilable positions—the salient options for resolving the dispute are: (i) keep arguing until capitulation, (ii) compromise, (iii) locate an ambiguity or contextual factors, (iv) accept scepticism, or (v) adopt relativism (Hales 2014: 63). Relativism is defended by Hales as the most satisfactory option.

Carter (2015, Ch. 4) has criticised this strategy. For one thing, appealing to relativism’s success as a disagreement-resolution strategy doesn’t obviously help move one from (3) to (4). For example, even if both parties can easily resolve their disagreement by adopting the belief that relativism is true, relativism might just as well be false. More generally, that interlocutors’ accepting some claim X is efficacious in resolving a dispute is not satisfactory grounds for thinking X is true or even probably true. Furthermore, Hales’ process-of-elimination strategy dismisses skepticism out of hand as “throwing in the towel.” However, this just reinvites the issue of why relativism should be regarded (in the face of the no-neutrality, therefore relativism argument) as motivated over skepticism. As with Sankey’s redeployment of the Pyrrhonian argument considered in Section 2, it is not clear how this is so.

It is worth noting that the no-neutrality therefore relativism argument is but one way philosophers have attempted to motivate relativism by pointing to disagreements. Another route is to appeal to what Max Kölbel (2003) calls “faultless disagreements” (for example, apparently genuine disagreements in some discretionary area of discourse where it seems neither party to the disagreement has made a mistake). These faultless disagreement strategies which appeal to disagreements to motivate relativism, and the neutrality-based strategy considered in this section, are only superficially similar. Unlike the no-neutrality, therefore relativism argument, faultless-disagreement arguments simply do not regard properties of any particular disagreement (for example, the disagreement between Bellarmine and Galileo) as in the market for establishing epistemic relativism. Faultless disagreement-style arguments reason from semantic and pragmatic evidence about disagreement patterns, much more generally, to the conclusion that a relativist semantics (in certain domains where we find such disagreements) best explains our practices of attributing certain terms. This kind of argument is discussed in more detail in Section 5, as it is an argument strategy used by new (semantic) epistemic relativists.

4. Traditional Arguments for Epistemic Relativism: Incommensurability and Circularity

A third kind of argument which has motivated versions of epistemic relativism appeals to incommensurability and epistemic circularity. The idea is that, upon confronting radically different epistemic systems (for example, radically different Kuhnian paradigms, Wittgensteinian framework propositions or individuals who employ what Ian Hacking (1982) calls alien ‘styles of reasoning’), we are called upon to justify not just ordinary beliefs as we usually do, but rather the very epistemic system (that is, the set of epistemic principles or rules) within which our epistemic evaluations are made. However, once we begin to attempt to justify our own epistemic system, epistemic circularity threatens. Michael Williams (2007: 3-4) expresses the idea on behalf of the relativist as follows:

In determining whether a belief—any belief—is justified, we always rely, implicitly or explicitly, on an epistemic framework: some standards or procedures that separate justified from unjustified convictions. But what about the claims embodied in the framework itself: are they justified? In answering this question, we inevitably apply our own epistemic framework. So, assuming that our framework is coherent and does not undermine itself, the best we can hope for is a justification that is epistemically circular, employing our epistemic framework in support of itself. Since this procedure can be followed by anyone, whatever his epistemic framework, all such frameworks, provided they are coherent, are equally defensible (or indefensible).

There are really two ‘key moves’ in this line of thinking. The first key move contends that—in the face of radically different epistemic systems from our own—our activity of attempting to justify our own epistemic system will lead to epistemic circularity. The second key move adverts to the claim that all attempts to justify epistemic systems result in epistemic circularity and, from this claim, draws the relativist-friendly conclusion that all epistemic systems are equally defensible, or on a par.

The first move, stated more carefully, seems to be that, when an individual S is in a position where S is trying to justify S’s own epistemic framework or system, X, by attempting to justify the claims that comprise the system (x1 … xn), then: (i) S must (inevitably) apply that system (X); and (ii) the application, by S, of a system X to justify the claims (x1 … xn) of that very system, X, is sufficient for leaving S’s epistemic justification for the claims of X (x1 … xn) circular.

From here, it is helpful to note three central issues which are relevant to the success of this kind of ‘pro-relativist’ strategy, in so far as the kind of epistemic circularity that is supposed to materialise via the application of a system in its own defence is itself of a sort that will leave all epistemic systems equally defensible. The first two issues concern the first key move and the third concerns the second key move.

Firstly, note that it seems in principle possible to pre-empt epistemic circularity altogether by simply rejecting that the justification of S’s epistemic framework depends on S’s ability to non-circularly justify that framework. Consider, for example, the line an externalist reliabilist might take. The process reliabilist (for example, Goldman 1979) might say that the epistemic principles constituting S’s epistemic system (X) are justified simply provided they are reliable and regardless of whether one can successfully justify or know that they are reliable. Compare here the reliabilist’s commitment to basic knowledge—that is to say, that S can know p even though S has no antecedent knowledge that the process R that produced S’s belief is reliable. Likewise, as this idea goes—at greater generality—the reliabilist is in a position to submit that whatever positive epistemic status the belief that our own epistemic principles are correct enjoys does not depend on any antecedent appreciation, on our part, that they have this status. The reliabilist thus attempts to undercut the circularity objection by rendering it moot.

Two salient replies to this line of reasoning have to do with assertion and bootstrapping, respectively. Regarding assertion: as Mikkel Gerken (2012, 379) has suggested, although some conversational contexts are ones where “S may assert something although S is unable to provide any reason for it”, other contexts may not be permissive in this way. Discursive contexts are, on Gerken’s view, ones where “interlocutors share a presupposition that an asserter must be able to back up unqualified assertions by reasons… and in which ‘being a cooperative speaker involves being sensitive to reasons for and against what is asserted’” (2012, 379). Gerken’s position is that, in such contexts, epistemically appropriate assertion must be discursively justified, where discursive justification is something S possesses only if S is able to articulate some epistemic reasons for believing that p. But if this is right, then there is a case to make that while an externalist line such as the one sketched above cuts epistemic circularity off at the pass, it does so in a way that would effectively leave one in no position to claim (in the face of a challenge from an interlocutor with a radically different epistemic system) to know that one’s own system is correct.

A second salient kind of reply to the externalist move is to suggest, in short, that even if (with reference to the Williams passage quoted above) it looks as though epistemic circularity materialises only once one uses the epistemic principles constituting one’s own epistemic system in the service of justifying it, this might be misleading. The idea here is that if one attempts to cut this kind of epistemic circularity off at the pass, by opting for the reliabilist move sketched above, then one at the same time (at least, potentially) encounters what is allegedly another malignant form of epistemic circularity in the form of bootstrapping (for example, Vogel 2000)—that is to say, that one would be in a position to acquire track-record evidence via the deliverances of applying one’s own epistemic principles that the application of one’s own epistemic principles is reliable. This point, in conjunction with the previous point about assertion, suggests that the kind of circularity problem Williams intimates cannot simply be circumvented by ‘going externalist’ without also incurring some further challenges.

So the prospect of blocking epistemic circularity ex ante by “going externalist” was the first of the three issues to highlight as relevant to the viability of the kind of argument strategy Williams describes. The second issue concerns the nature of the epistemic circularity in question, which on this line of argument is said to materialise when one attempts to justify one’s epistemic system by appealing to it. Consider that there are in fact two very different ways in which one might apply an epistemic principle or rule in the service of justifying one’s epistemic system (where, again, the epistemic system is understood as a set of epistemic principles).

Firstly, one might apply a principle by simply following it (for example, as when one follows an inference rule in the service of justifying that inference rule, or perhaps of justifying the epistemic system of which the inference rule is a part); see Boghossian (2001). Secondly, just as a judge might apply a rule (consider the rule that ‘one must drive only with a license’) not by following the rule but by invoking its authority (for example, McCallum 1966), one might also apply one’s own epistemic principle or principles not by following them but by invoking their authority. For example, one might attempt to justify inference to the best explanation (IBE) by invoking the authority of the wider system of epistemic principles within which IBE belongs: Western Science.

The overarching point here is that the kind of epistemic circularity that materialises as a function of one’s appealing to one’s epistemic system in the service of justifying it can take on different shapes—with different kinds of premise-conclusion dependence relations. Accordingly, an argument that attempts to move from epistemic circularity to relativism must thus be appropriately sensitive to these different shapes epistemic circularity can potentially take on when one applies one’s own epistemic system in the service of justifying it. This is because it is not obvious that all such shapes are equally epistemically objectionable. (For discussion on this point, see Pryor 2004 and Wright 2007).

The third issue to raise concerns the second ‘key move’ in the sequence Williams describes: the move that is supposed to get us from circularity to relativism. Even on the assumption that the kind of epistemically circular justification one is left with for one’s own epistemic principles (and, more generally, one’s epistemic system) places all epistemic principles on an ‘equal footing’, this equal-footing option is compatible with scepticism as well as with relativism. An argument successfully establishes epistemic relativism from the position described only if it provides a non-arbitrary reason to embrace relativism over scepticism.

5. New (Semantic) Epistemic Relativism: Assessment-Sensitive Semantics for ‘Knows’

One recurring objection-type to traditional arguments for epistemic relativism (of the sort surveyed in §§2-4) is that these arguments face a shared difficulty when it comes to showing why, in light of the philosophical considerations adverted to, relativism is at the end of the day a more attractive option than scepticism. New (semantic) epistemic relativism does not face this kind of challenge. This is because new (semantic) relativism (hereafter, new relativism) is motivated on the basis of very different kinds of philosophical considerations than the argument strategies considered in §§2-4.

The present section is organised as follows: two preliminary points about new relativism are first noted, and then MacFarlane’s most substantial (2014) argument for an assessment-sensitive semantics for “knows” is outlined; it is an argument that depends on two key premises, and MacFarlane’s rationale for defending these premises is discussed in some depth. Note that while there are other ways of motivating semantic relativism that do not appeal explicitly to ‘contexts of assessment’ (for example, Richard 2004; Egan 2007), which is MacFarlane’s distinctive terminology, the focus in what follows is on MacFarlane’s presentation, as it is the most developed.

That said, the first preliminary point to note concerns the relationship between epistemic contextualism and relativism. As was noted in section 1, epistemic contextualism is—by MacFarlane’s lights—not on the interesting side of the line between absolutism and relativism. The point to stress here is that while the contextualist can, no less than the relativist, recognize a ‘standards’ parameter (and in this respect can allow the extension of “knows” to vary with standards), for the contextualist, its value will be supplied by the context of use, whereas the relativist (proper) takes it to be supplied completely independently of the context of use, by the context of assessment.

The second preliminary remark concerns the rationale for embracing a MacFarlane-style relativist semantics for “knows”, which should be understood as differing from the kind of rationale we find in Lewis’s (1980) and Kaplan’s (1989) foundational work in semantics, according to which sentence truth was relativized to familiar parameters such as worlds, times and locations. The important point here is that while Lewis’s and Kaplan’s reasons for “proliferating” parameters were primarily based on considerations to do with intensional operators, the more contemporary reasons (for example, as appealed to by MacFarlane and other ‘new relativists’) for adding a standards parameter (that is, one fixed in the context of assessment) are often to do with respecting linguistic use data, for example disagreement data (see, for example, Baghramian and Carter 2015). For instance, those who endorse truth-relativism about predicates of personal taste (for example, Lasersohn 2005; Kölbel 2003; MacFarlane 2014) take a truth-relativist semantics to better explain our patterns of using terms like “tasty” than do competing contextualist and (sensitive or insensitive) invariantist semantics. Accordingly, defending new relativism typically involves, for some area of discourse D, a philosophical comparison of the costs and benefits of the competing semantic approaches to the relevant D expressions, together with a case for thinking that the truth-relativist semantics performs best all things considered. One familiar advantage claimed by a MacFarlane-style truth-relativist is that the kind of ‘subjectivity’ (for example, standards-dependence) which the contextualist says the traditional invariantist cannot explain can be captured by the relativist without—or so the relativist tells us—“losing disagreement”, where losing disagreement is a stock objection to contextualism in areas where disagreements appear genuine.

In three different places, MacFarlane (2005, 2009, 2014) has argued that knowledge attributions of the form “S knows that p” are assessment-sensitive. The focus of his presentation has varied across these three defenses of the view, but one core strand of thought resurfaces each time.

For convenience, we can call this core strand MacFarlane’s “master argument” for an assessment-sensitive semantics for knowledge attributions.

Master Argument for Assessment Sensitive Semantics for Knowledge Attributions

(1) Standard invariantism, contextualism and SSI all have advantages and weaknesses.

(2) Relativism preserves the advantages while avoiding the disadvantages.
(3) Therefore, prima facie, we should be relativists about knowledge attributions.

The remainder of this section attempts to show why MacFarlane thinks that premises (1) and (2) of the master argument are true, and thus why he thinks we should embrace a relativist treatment of “knows”. The discussion to this end draws primarily on MacFarlane’s latest (2014) presentation of the view, one which gives the notion of relevant alternatives a central place.

Question: Why should we think (1) is true? As MacFarlane sees things, each of the three standard views of the semantics of knowledge-attributions—standard invariantism, contextualism and subject-sensitive invariantism (SSI)—has a grain of truth to it, as well as an “Achilles heel: a residuum of facts about our use of knowledge attributions that it can explain only with special pleading” (2005, 197).

His latest way of making this point relies on a kind of sceptical “conundrum”, one which arises in light of our ordinary practices of attributing knowledge, and which he uses as a frame of reference for magnifying what he regards as the salient weaknesses of the three standard views.

MacFarlane’s Conundrum: If you ask me whether I know that I have two dollars in my pocket, I will say that I do. I remember getting two dollar bills this morning as change for my breakfast; I would have stuffed them into my pocket, and I haven’t bought anything else since. On the other hand, if you ask me whether I know that my pockets have not been picked in the last few hours, I will say that I do not. Pickpockets are stealthy; one doesn’t always notice them. But how can I know that I have two dollars in my pocket if I don’t know that my pockets haven’t been picked? After all, if my pockets were picked, then I don’t have two dollars in my pocket. It is tempting to concede that I don’t know that I have two dollars in my pocket. And this capitulation seems harmless enough. All I have to do to gain the knowledge I thought I had is check my pockets. But we can play the same game again. I see the bills I received this morning. They are right there in my pocket. But can I rule out the possibility that they are counterfeits? Surely not. I don’t have the special skills that are needed to tell counterfeit from genuine bills. How, then, can I know that I have two dollars in my pocket? After all, if the bills are counterfeit, then I don’t have two dollars in my pocket (2014: 177).

MacFarlane articulates the form of the conundrum-argument as follows:

(i) p obviously entails q. [premise]

(ii) If a knows that p, then a could come to know that q without further empirical investigation. [i, Closure]

(iii) a does not know that q and could not come to know that q without further empirical investigation. [premise]

(iv) Hence a does not know that p. [ii, iii, modus tollens]

Standard (insensitive) invariantism, the view that the epistemic standards that must be met for “S knows p” to be true are not (in any way) context sensitive, faces two central problems, by MacFarlane’s lights. Both problems are familiar. Firstly, standard invariantism has trouble making sense of the variability of our willingness to attribute knowledge. Secondly, standard invariantism seems stuck with an unhappy three-way choice: embracing scepticism (if the invariantist simply accepts (iv)), embracing dogmatism (if the invariantist tries to avoid the sceptical conclusion (iv) by rejecting (iii)), or rejecting the closure principle which licenses the move from (i) to (ii)—that is to say, the principle that (as MacFarlane states it): ‘if a knows that p, and p obviously entails q, then a could come to know q without further empirical investigation’ (2014, 177).

By contrast, contextualism offers a way to avoid each of these problems facing standard invariantism. Unlike the invariantist, whose position is in tension with data about the variability of our willingness to attribute knowledge, the contextualist has an explanation to offer for this variability: namely, our willingness to attribute knowledge varies across contexts because what is meant by “knows” is sensitive to the context in which it is used. As MacFarlane writes, “on the most natural form of this view, ‘knowing’ that p requires being able to rule out contextually relevant alternatives to p. Which alternatives are relevant depends on the context”. For instance, and with reference to MacFarlane’s Conundrum, when I’m first asked whether I know (p)—that I have two dollars in my pocket—‘knowing’ that p requires only that I be able to rule out very basic alternatives (for example, that I didn’t already spend the $2); I needn’t also be able to rule out that my pockets have been picked in order to count as ‘knowing’ (Ibid., 177). When someone then asks me whether my pockets have been picked, however, ‘knowing’ requires ruling out this alternative, and if I can’t, then the standard required for ‘knowing’ in this context is not met. Contextualism can thus make sense of the variability of our willingness to attribute knowledge, and it also avoids the unpalatable choice facing standard invariantism: reject closure, embrace scepticism, or embrace dogmatism. As the standard line goes, contextualists needn’t be tarred as sceptics or dogmatists because they can in fact preserve closure, at least within any one context of use. So contextualism is looking pretty good.

However, although treating “knows” like “tall”—where the meaning of “knows” depends on the context in which it is being used—offers a nice escape route (vis-à-vis MacFarlane’s Conundrum), there are other respects in which treating “knows” like “tall” raises new problems. For example, an apparent disagreement between A and B about whether Michael Jordan is tall is quickly revealed to be no disagreement at all once it is clear to both parties that A means “tall for a given person” and B means “tall for an NBA player”. However, as MacFarlane notes, things are different with “know”. He writes:

If I say “I know that I have two dollars in my pocket,” and you later say, “You didn’t know that you had two dollars in your pocket, because you couldn’t rule out the possibility that the bills were counterfeit,” I will naturally take your claim to be a challenge to my own, which I will consider myself obliged either to defend or to withdraw. It does not seem an option for me to say, as the contextualist account would suggest I should: “Yes, you’re right, I didn’t know. Still, what I said was true, and I stick by it. I only meant that I could rule out the alternatives that were relevant then.” Similarly, the skeptic regards herself as disagreeing with ordinary knowledge claims—otherwise skepticism would not be very interesting. But if the contextualist is right, this is just a confusion (Ibid., p. 181; compare, Vogel 1990).

And here is where the special pleading comes in. The contextualist can attempt to say that our taking each other to agree or disagree in the relevant kinds of cases is just a mistake of some sort. But, as MacFarlane sees it, this is a double-edged sword: the more speaker error the contextualist must posit to explain the way we use “knows”, the less the contextualist can rely on the way we use “knows” to support contextualism. So while contextualism does better than standard invariantism in avoiding the three-way choice sketched above, standard invariantism makes better sense of disagreement.

By contrast with insensitive invariantism and contextualism, subject-sensitive invariantism (‘SSI’) might have the best offer to make yet. According to SSI, whether my utterance of “Archie knows that his car is in the parking lot” is true does depend on context, though in a different sense than it does for the contextualist: rather than depending on which alternatives I (the utterer of the sentence) can rule out (for example, whether or not I know there are no thieves lurking nearby), what matters on SSI is whether Archie, the subject of the knowledge attribution, can rule out the alternatives relevant to his practical environment. This proposal has some advantages. For one thing, the ‘SSIist’ looks well-positioned to make sense of disagreement, given that ‘knows’ is not being treated like ‘tall’. Further, the SSIist, unlike the insensitive invariantist, can make sense of variability in willingness to attribute knowledge. Where the special pleading comes in concerns temporal and modal embedding.

The alleged problem (see, for example, Blome-Tillmann 2009) for SSIists is this: temporal and modal operators shift the circumstances of evaluation in such a way that, if SSI is true, we should expect that (in cases of temporal and modal embeddings of “know”) knowledge attributions will track whether the subject can rule out alternatives relevant in the subject’s practical environment in the (temporally or modally shifted) circumstance of evaluation. But this prediction doesn’t seem to pan out, as speakers are inclined to regard the same alternatives as relevant when evaluating non-embedded and embedded uses of “know”.

As MacFarlane sees it, I will not be inclined to say either of the following, which the SSIist predicts I should be willing to say:

Temporal embedding: I know that I had two dollars in my pocket after breakfast, but I didn’t know it this morning, when the possibility of counterfeits was relevant to my practical deliberations—even though I believed it then on the same grounds that I do now.

Modal embedding: I know that I have two dollars in my pocket, but if the possibility of counterfeiting were relevant to my practical situation, I would not know this—even if I believed it on the same grounds as now.

The moral of the story—though see Stanley (2016) for a reply on behalf of the SSIist—is supposed to be that, while each of the three leading competitor views does better than the others in some respects, none of them can make sense of our use of knowledge attributions without some sort of Achilles heel. And that is, more or less, MacFarlane’s defense of (1) in the master argument.

What about premise (2)? Premise (2) of the master argument, recall, says that:

(2) Relativism preserves the advantages while avoiding the disadvantages.

In the service of defending (2), MacFarlane suggests that what we want is a semantics for knowledge attributions that satisfies the following three key desiderata, desiderata such that (as he takes himself to have established in defending (1)) none of the three leading contender views can satisfy all of them:

Alternative variation: the semantics would explain how the alternatives one must rule out to count as knowing vary with context (otherwise, the view faces the three-way choice facing insensitive invariantism with respect to MacFarlane’s conundrum).

Alternative variation, but not with the context of use: the alternatives one must rule out to count as knowing must not vary with the context of use (otherwise disagreement cannot be preserved, a la contextualism).

Alternative variation, but not with the circumstances of the subject: the alternatives one must rule out to count as knowing must not vary with the circumstances of the subject to whom knowledge is ascribed (otherwise temporal and modal embeddings cannot be made sense of, a la SSI).

Here is where the relativist is said to come to the rescue. The first step is to preserve alternative variation by taking the relevant alternatives to be determined by the context of assessment. As MacFarlane puts it:

The resulting view would agree with contextualism in its predictions about when speakers can attribute knowledge, since when one is considering whether to make a claim, one is assessing it from one’s current context of use. So it would explain the variability data as ably as contextualism does, and offer the same way of rescuing closure from the challenge posed by the conundrum. But it would differ from contextualism in its predictions about truth assessments of knowledge claims made by other speakers, and about when knowledge claims made earlier must be retracted. Moreover … it would vindicate our judgments about disagreement between knowledge claims across contexts (MacFarlane 2014, 188).

What about the temporal and modal embedding problem that faced SSI? Relativism, MacFarlane argues, dodges this problem because the set of contextually relevant alternatives is added to the index as a parameter distinct from the world and time parameters, such that shifting the world and time parameters (for example, when “knows” is temporally or modally embedded) does not also shift the relevant-alternatives parameter (Ibid., 188). On the resulting view, as he puts it:

the relation “knows” expresses does not vary with the context—there is just a single knowing relation—but the extension of that relation varies across relevant alternatives. As a result, it makes sense to ask about the extension of “knows” only relative to both a context of use (which fixes the world and time) and a context of assessment (which fixes the relevant alternatives). (Ibid., 189).
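
As a rough schematic gloss (the notation here is illustrative, not MacFarlane’s own formalism), the competing proposals can be compared as follows, writing R(c) for the set of alternatives made relevant by a context c:

Insensitive invariantism: “S knows p” is true just in case S can rule out some fixed set of alternatives, whatever the context.

Contextualism: “S knows p”, as used at a context c1, is true just in case S can rule out the alternatives in R(c1).

SSI: “S knows p”, as used at c1, is true just in case S (the subject of the attribution) can rule out the alternatives relevant in S’s practical situation at the circumstance of evaluation.

Relativism: “S knows p”, as used at c1 and assessed from a context c2, is true just in case S can rule out, at the world and time fixed by c1, the alternatives in R(c2).

On the last schema, temporal or modal embedding shifts the world and time while leaving R(c2) fixed, and two assessors occupying different contexts of assessment can return different verdicts on one and the same knowledge claim, which is how the view aims to respect both the embedding data and genuine disagreement.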

MacFarlane takes the view he has proposed to be one that escapes the sceptical conundrum while threading the needle so as to avoid both the disagreement problem that faces contextualists and the temporal and modal embedding problem that faces SSI. At this stage, we can see why MacFarlane thinks his view has all the advantages and none of the disadvantages of its rivals. This concludes the presentation of MacFarlane’s defense of premise (2) of the master argument. And from (1) and (2), the conclusion (3) follows: prima facie, we should give “knows” a relativist treatment.

6. New (Semantic) Epistemic Relativism: Issues and Implications in Epistemology

Is MacFarlane’s argument sound? Interestingly, this is relatively new terrain. The above line of argument is from 2014, so there has yet to be substantial criticism in the literature of this new form of relativism. See, however, Carter (2016, Ch. 7) for criticisms of MacFarlane’s (2014) view to the effect that it generates the wrong results in cases of environmental epistemic luck and normative defeaters.

In this section, however, the focus is on the implications in epistemology of embracing an assessment-sensitive semantics for “knows.” MacFarlane concludes his 2009 defense of an assessment-sensitive semantics for “knows” with a section entitled “Questions for the Relativist.” One question he asks, in light of his recommendation to extend a truth-relativist semantics to “knows”, is: “are there other expressions for which a relativist treatment is needed? How does know relate to them?” (MacFarlane 2009: 16). A more specific version of this question is: if “know” gets a truth-relativist semantics, then, since knowledge relates intimately with other epistemic concepts, do any other epistemic concepts need a relativist treatment? This is an important question and one which has obvious implications for the wider shape new epistemic relativism would take.

In tracing out epistemological ramifications of a relativist treatment of ‘knows’ in epistemology, it is helpful to begin with especially tight conceptual connections (between knowledge and other epistemic standings) and move outward from there. This section takes as a starting point two such connections: namely, connections between propositional knowledge and (i) evidence; and (ii) knowledge-how (for a more detailed discussion, see Carter 2017).

Firstly, evidence. Consider, as an example case, Williamson’s (2000) knowledge-evidence equivalence: E=K. Suppose, for reductio, that E=K, and further, that the truth-conditions for evidence ascriptions are not assessment-sensitive, but the truth-conditions for knowledge ascriptions are. The resulting position would be at best untenable and at worst contradictory. While Williamson’s view is of course controversial, it seems that if Williamson is right that our evidence is what we know, and thus that S’s evidence includes E if, and only if, S knows E, then one who embraces a relativist semantics for (propositional) knowledge ascriptions should be willing to embrace the view that evidence ascriptions are assessment-sensitive.
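
One schematic way of making the tension explicit runs as follows (the numbering and wording here are a reconstruction for illustration, not a quotation of Williamson or MacFarlane):

(a) S’s evidence includes E if and only if S knows E. [E=K]

(b) “S knows E” is assessment-sensitive: its truth-value can vary across contexts of assessment. [relativist treatment of “knows”]

(c) “S’s evidence includes E” is not assessment-sensitive. [assumption for reductio]

(d) Given (a), the knowledge attribution and the corresponding evidence attribution stand or fall together; so, by (b), there are contexts of assessment that deliver different verdicts on “S’s evidence includes E”, contradicting (c).

(e) Hence, one who accepts both E=K and an assessment-sensitive semantics for “knows” should treat evidence ascriptions as assessment-sensitive too.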

Of course, E=K is a controversial position. The above point, however, was meant to illustrate one very direct sense in which a commitment to giving a relativist treatment to “knows” would have a straightforward implication in epistemological theory.

Let us move from a straightforward equivalence thesis (as E=K was) to a reductivist thesis. We need look no further than the most standard contemporary version of intellectualism about knowledge-how. Reductivist versions of intellectualism (compare Bengson & Moffett (2011)) insist that knowing how to do something is just a species of propositional knowledge (Stanley 2010, 207). As Stanley puts it:

[…] you know how to ride a bicycle if and only if you know in what way you could ride a bicycle. But you know in what way you could ride a bicycle if and only if you possess some propositional knowledge, viz. knowing, of a certain way w which is a way in which you could ride a bicycle, that w is a way in which you could ride a bicycle (Ibid., 209).

Like Williamson’s E=K thesis, Stanley’s reduction of knowledge-how to a kind of knowledge-that is also controversial, though very much a live and increasingly popular view in contemporary epistemology. Suppose, for reductio, that knowing how to do something is (a la Stanley) just a kind of propositional knowledge, and further, that the truth-conditions for knowing how to do something (for example, as in the case of attributions of the form “Hannah knows how to ride a bike”) are not assessment-sensitive, but the truth-conditions for propositional knowledge are, such that “Hannah knows p” is assessment-sensitive, where p is a proposition specifying, of a certain way w which is a way in which Hannah could ride a bicycle, that w is a way in which Hannah could ride a bicycle. Again, the resulting position would be at best untenable and at worst contradictory.

What the foregoing brief consideration of evidence and knowledge-how indicates is that, at least for those with certain substantive commitments in epistemology on which epistemic standings other than knowledge are either identified with or in some way reduced to (a kind of) propositional knowledge, an extension of an assessment-sensitive semantics to these standings as well looks potentially unavoidable. One interesting future direction of research will be to trace out the implications of a relativist semantics for “knows” even further, by moving outward to epistemic standings with (perhaps) looser but not insignificant conceptual connections to knowledge, such as justification, rationality, understanding and intellectual virtue. See Carter (2014; 2015, Ch. 8) for some discussion here. A further, complementary direction for future research will be to consider how other notions besides “knows” for which a relativist semantics has been proposed might have implications in epistemology. A natural candidate expression here is “ought” (for example, Kolodny and MacFarlane 2010; MacFarlane 2014, Ch. 11). In short, if the moral “ought” gets a relativist treatment, it is hard to see how the epistemic “ought” would not receive one likewise. And if the epistemic “ought” is relative, then this has ramifications for epistemic normativity more generally. For example, if whether one ought to believe something is a relative matter, then plausibly whether one is justified in believing something is a relative matter. Likewise, if epistemic oughts are relative, then presumably so are the epistemic norms which generate them.

A relativist treatment of “knows” also stands to have interesting implications for epistemologists concerned with how the function the concept of knowledge plays might inform our theory of knowledge. A flourishing contemporary research program within mainstream epistemology, one which Robin McKenna (2013) has called the “functional turn” in epistemology, takes as a starting point that “a successful analysis of knowledge must also fit with an account of the distinctive function or social role that the concept plays in our community […] Call this the ‘functional turn’ in epistemology” (McKenna 2013: 335-336). Participants in the functional turn appeal to practical explications of the concept of knowledge, on the basis of which they identify a function, where that function is regarded as generating an ex ante constraint on an analysis of knowledge (or on a semantics of knowledge attributions). Henderson (2009; 2011), McKenna (2013; 2014), Pritchard (2012) and Hannon (2013; 2014; 2015) have, for instance, defended views about the concept of knowledge (or knowledge ascriptions) inspired by Craig’s (1990) favoured account of the function of knowledge as that of identifying good informants. By contrast, Kappel (2010), Kelp (2011) and Rysiew identify the closure of inquiry as the relevant function and regard this, rather than Craig’s tracking-good-informants function, as generative of an ex ante constraint for theorizing about knowledge and its truth-conditions. For Krista Lawlor (2013), the relevant function is identified (a la Austin) as that of providing assurance.

Can “knows”, given a relativist treatment, potentially play (any of) these widely identified functional roles—that is, identifying good informants, marking the closure of inquiry, or providing assurance? This is an open question for future research.

Finally, and much more generally, semantic (new) relativism about “knows” raises some interesting metaepistemological issues. Mainstream epistemologists, by and large, take for granted within epistemological theory that the explanandum under the description “knowledge” is not relative. If the ordinary concept of knowledge does require a relativist treatment, however, then this raises the complicated issue of whether the ordinary concept of knowledge and the concept of interest to epistemologists are the same, and (even more generally) just how knowledge attributions should inform the theory of knowledge.

7. References and Further Reading

  • Baghramian, Maria. Relativism. London: Routledge, 2004.
  • Baghramian, Maria. The Many Faces of Relativism. London: Routledge, 2014.
  • Baghramian, Maria and Carter, J. Adam. “Relativism.” Stanford Encyclopedia of Philosophy, 2015. http://plato.stanford.edu/entries/relativism/
  • Blome-Tillmann, Michael. “Contextualism, Subject-Sensitive Invariantism, and the Interaction of ‘Knowledge’-Ascriptions with Modal and Temporal Operators.” Philosophy and Phenomenological Research (2009): 315-331.
  • Boghossian, Paul. “How are Objective Epistemic Reasons Possible?” Philosophical Studies 106, no. 1 (2001): 1-40.
  • Boghossian, Paul. “Epistemic Relativism.” The Routledge Companion to Epistemology, 2011. doi:10.4324/9780203839065.ch8.
  • Boghossian, Paul. Fear of Knowledge: Against Relativism and Constructivism. Oxford: Clarendon Press, 2006.
  • Carter, J. Adam. Metaepistemology and Relativism. Palgrave Macmillan, 2016.
  • Carter, J. Adam. “Disagreement, Relativism and Doxastic Revision.” Erkenntnis 79, no. S1 (February 2013): 155–72. doi:10.1007/s10670-013-9450-7.
  • Carter, J. Adam. “Relativism, Knowledge and Understanding.” Episteme 11, no. 01 (April 2013): 35–52. doi:10.1017/epi.2013.45.
  • Carter, J. Adam. “Epistemological Implications of Relativism.” In J.J. Ichikawa (ed.) Routledge Handbook of Contextualism, 2017, London: Routledge.
  • Chrisman, Matthew. “From Epistemic Contextualism to Epistemic Expressivism.” Philosophical Studies 135, no. 2 (2006): 225–54. doi:10.1007/s11098-005-2012-3.
  • Cohen, Stewart. “How to Be a Fallibilist.” Philosophical Perspectives 2 (1988): 91-123. doi:10.2307/2214070.
  • Craig, Edward. Knowledge and the State of Nature: An Essay in Conceptual Synthesis. Oxford University Press, 1990.
  • Cuneo, Terence. The Normative Web: An Argument for Moral Realism. Oxford: Oxford University Press, 2007.
  • DeRose, Keith. The Case for Contextualism. Oxford: Oxford Univ. Press, 2009.
  • DeRose, Keith. “Contextualism and Knowledge Attributions.” Philosophy and Phenomenological Research 52, no. 4 (1992): 913-929. doi:10.2307/2107917.
  • Derrida, Jacques. Of Grammatology. Baltimore: Johns Hopkins University Press, 1976.
  • Egan, Andy. “Epistemic Modals, Relativism and Assertion.” Philosophical Studies 133, no. 1 (2007): 1-22.
  • Gerken, Mikkel. “Discursive Justification and Skepticism.” Synthese 189, no. 2 (2012): 373-394.
  • Gibbard, Allan. Wise Choices, Apt Feelings: A Theory of Normative Judgment. Cambridge, MA: Harvard University Press, 1990.
  • Greco, John. “Reflective Knowledge and the Pyrrhonian Problematic.” Virtuous Thoughts: The Philosophy of Ernest Sosa, 2013, 179–91. doi:10.1007/978-94-007-5934-3_10.
  • Hacking, Ian. “Language, Truth and Reason.” In Rationality and Relativism, edited by Martin Hollis and Steven Lukes, 48-66. Cambridge, MA: MIT Press, 1982.
  • Hales, Steven D. “Motivations for Relativism as a Solution to Disagreements.” Philosophy 89, no. 01 (September 2013): 63–82. doi:10.1017/s003181911300051x.
  • Hales, Steven D. Relativism and the Foundations of Philosophy. Cambridge, MA: MIT Press, 2006.
  • Harman, Gilbert. “Moral Relativism Defended.” The Philosophical Review 84, no. 1 (1975): 3-22. doi:10.2307/2184078.
  • Kaplan, David. “Demonstratives.” In Themes from Kaplan, edited by Joseph Almog, John Perry, and Howard Wettstein, 481-563. Oxford University Press, 1989.
  • Kölbel, Max. “Faultless Disagreement.” Proceedings of the Aristotelian Society 104, no. 1 (2004): 53-73. doi:10.1111/j.0066-7373.2004.00081.x.
  • Kolodny, Niko, and John MacFarlane. “Ifs and Oughts.” The Journal of philosophy 107, no. 3 (2010): 115-143.
  • Lammenranta, Markus. “The Pyrrhonian Problematic.” In The Oxford Handbook of Skepticism, edited by John Greco. Oxford: Oxford University Press, 2008. doi:10.1093/oxfordhb/9780195183214.003.0002.
  • Lasersohn, Peter. “Context Dependence, Disagreement, and Predicates of Personal Taste.” Linguistics and Philosophy 28, no. 6 (2005): 643-686.
  • Lewis, David. “Index, Context, and Content.” In Philosophy and Grammar, edited by Stig Kanger and Sven Öhman, 79-100. Dordrecht: Reidel, 1980.
  • MacFarlane, John. Assessment Sensitivity: Relative Truth and Its Applications. Oxford: Oxford University Press, 2014.
  • MacFarlane, John. “The Assessment Sensitivity of Knowledge Attributions.” Oxford Studies in Epistemology 1 (2005): 197-233.
  • MacFarlane, John. “Relativism and Knowledge Attributions.” In The Routledge Companion to Epistemology, 2011. doi:10.4324/9780203839065.ch49.
  • MacFarlane, John. “Making Sense of Relative Truth.” Proceedings of the Aristotelian Society 105, no. 1 (2005): 305-323. doi:10.1111/j.0066-7373.2004.00116.x.
  • McKenna, Robin. “Knowledge Ascriptions, Social Roles and Semantics.” Episteme 10, no. 4 (2013): 335-350.
  • Meiland, Jack and Michael Krausz. (eds). Relativism, Cognitive and Moral. Notre Dame, Indiana: University of Notre Dame Press, 1982.
  • Olson, Jonas. “Error Theory and Reasons for Belief.” Reasons for Belief, 2011, 75–93. doi:10.1017/cbo9780511977206.006.
  • Pryor, James. “What’s Wrong with Moore’s Argument?” Philosophical Issues 14, no. 1 (2004): 349-378.
  • Richard, Mark. “Contextualism and Relativism.” Philosophical Studies 119, no. 1 (2004): 215-242.
  • Rorty, Richard. Philosophy and the Mirror of Nature. Princeton: Princeton University Press, 1979.
  • Sankey, Howard. “Scepticism, Relativism and the Argument from the Criterion.” Studies In History and Philosophy of Science Part A 43, no. 1 (2012): 182–90. doi:10.1016/j.shpsa.2011.12.026.
  • Sankey, Howard. “Witchcraft, Relativism and the Problem of the Criterion.” Erkenntnis 72, no. 1 (2009): 1-16. doi:10.1007/s10670-009-9193-7.
  • Seidel, Markus. Epistemic Relativism: A Constructive Critique. Palgrave MacMillan, 2014.
  • Seidel, Markus. “Scylla and Charybdis of the Epistemic Relativist: Why the Epistemic Relativist Still Cannot Use the Sceptic’s Strategy.” Studies in History and Philosophy of Science Part A 44, no. 1 (2013): 145–49. doi:10.1016/j.shpsa.2012.10.004.
  • Seidel, Markus. “Why The Epistemic Relativist Cannot Use the Sceptic’s Strategy. A Comment on Sankey.” Studies In History and Philosophy of Science Part A 44, no. 1 (2013): 134–39. doi:10.1016/j.shpsa.2012.06.004.
  • Stanley, Jason. “On a Case for Truth Relativism.” Philosophy and Phenomenological Research 92, no. 1 (2016): 179-188.
  • Vogel, Jonathan. “Are There Counterexamples to the Closure Principle?” In Doubting: Contemporary Perspectives on Skepticism, edited by Michael David Roth and Glenn Ross, 13-29. Dordrecht: Kluwer, 1990.
  • Vogel, Jonathan. “Reliabilism Leveled.” The Journal of Philosophy 97 (2000): 602-623.
  • Williams, Michael. Unnatural Doubts: Epistemological Realism and the Basis of Skepticism. Cambridge, MA: Blackwell, 1991.
  • Williams, Michael. Problems of Knowledge. Oxford and New York: Oxford University Press, 2001.
  • Williams, Michael. “Why (Wittgensteinian) Contextualism Is Not Relativism.” Episteme 4, no. 1 (2007): 93-114. doi:10.3366/epi.2007.4.1.93.
  • Williamson, Timothy. Knowledge and Its Limits. Oxford: Oxford University Press, 2000.
  • Wright, Crispin. “The Perils of Dogmatism.” In Themes from G. E. Moore: New Essays in Epistemology and Ethics, edited by Susana Nuccetelli and Gary Seay. Oxford: Oxford University Press, 2007.
  • Wright, Crispin. “New Age Relativism and Epistemic Possibility: The Question of Evidence.” Philosophical Issues 17, no. 1 (2007): 262-283.
  • Wright, Crispin. “Fear of Relativism?” Philosophical Studies 141, no. 3 (2008): 379-390. doi:10.1007/s11098-008-9280-7.

 

Author Information

J. Adam Carter
Email: jadamcarter@gmail.com
University of Edinburgh
United Kingdom

Tu Weiming (1940—)

Tu Weiming (pinyin: Du Weiming) is one of the most famous Chinese Confucian thinkers of the 20th and 21st centuries. As a prominent member of the third generation of “New Confucians,” Tu stressed the significance of religiosity within Confucianism. Inspired by his teacher Mou Zongsan as well as his decades of study and teaching at Princeton University, the University of California, and Harvard University, Tu aimed to renovate and enhance Confucianism through an encounter with Western (in particular American) social theory and Christian theology. His writings about Confucianism have served as critical links between Western philosophy and religious studies and the world of modern Confucian thought. Tu asserted that Confucianism can learn something from Western modernity without losing recognition of its own heritage. By engaging in such “civilizational dialogue,” Tu hoped that different religions and cultures can learn from each other in order to develop a global ethic. From Tu’s perspective, the Confucian ideas of ren (“humaneness” or “benevolence”) and what he calls “anthropocosmic unity” can make powerful contributions to the resolution of issues facing the contemporary world.

While Tu’s particular presentation of Confucian thought has proven to be both intelligible and popular among Westerners, his use of Western religious concepts and terminology to describe Confucianism has also generated controversy in the Chinese Confucian world. In particular, the cultural hybridity and explicit spirituality that are key elements of Tu’s Confucianism have been criticized by some other contemporary Chinese Confucian thinkers, who—like modern Chinese philosophy in general—have been more influenced by nationalism and secularism than Tu. Nonetheless, Tu’s influence on contemporary Confucian philosophy cannot be overestimated, especially where its reception in the West is concerned.

Table of Contents

  1. Biography
  2. Confucianism as Religious Humanism
    1. Continuity of Being
    2. Anthropocosmic Unity
  3. Selfhood as Creative Transformation
    1. Ren
    2. Li
    3. The Junzi or Profound Person
    4. Fiduciary Community and Filial Piety
    5. Embodied Knowing
      1. Cheng
      2. Fourfold Human Nature
  4. Confucianism and Modernity
    1. Critique of Confucian Tradition
    2. Critique of Western Enlightenment Thought
    3. Civilizational Dialogue
  5. Influence and Criticism
  6. References and Further Reading

1. Biography

Tu was born to well-educated parents in Kunming, Yunnan Province, China in 1940. He has described the nanny who helped to care for him as uneducated, yet expressive of Confucian values in her daily words and actions. Thus, although Tu did not study the Confucian classics during his childhood, he was brought up immersed within a Confucian cultural environment. In 1949, he moved with his family to Taiwan and studied at Taipei Municipal Jianguo High School. At that time, the Taiwan government was promoting national moral education, which included a heavy emphasis on Confucianism—a subject about which the young Tu became enthusiastic. Among his teachers was Zhou Wenjie, a student of the “New Confucian” philosopher Mou Zongsan. After completing high school, Tu enrolled in Taiwan’s Tunghai University, where he studied directly with Mou as well as Mou’s fellow “New Confucian” thinker, Xu Fuguan. Tu’s undergraduate studies with Mou and Xu led to his being awarded a Harvard-Yenching Institute scholarship to study at Harvard University in the United States. Here, he completed courses taught by luminaries of Western social thought, such as the sociologists Talcott Parsons and Robert N. Bellah, and the historian of religions Wilfred Cantwell Smith, earning both an M.A. (1963) and a Ph.D. (1968) in East Asian studies. Beginning in 1967, Tu served as a faculty member in a series of prestigious U.S. universities, initially at Princeton University and later at the University of California at Berkeley, from which he went on to become a professor at Harvard University (1981-2010). As of 2016, Tu held two academic positions, serving as both the Chair Professor of Humanities and Founding Director of the Institute for Advanced Humanistic Studies at Beijing University and as Research Professor and Senior Fellow of the Asia Center at Harvard University.

2. Confucianism as Religious Humanism

Tu’s understanding of selfhood is intertwined with his understanding of religiosity. Indeed, one of the distinctive features of Tu’s Confucianism is his emphasis on the religiousness of Confucianism. Many Confucian scholars stress that Confucianism is a cultural tradition or a philosophy rather than a religion. However, Tu insists that Confucianism also involves religiosity. Unlike other religions, Confucianism is not an institutional religion; yet, like other religions, Confucianism has its ultimate concern, namely, the creative transformation of the self. According to Tu, ethics (norms for behavior), aesthetics (theory of value), and religiosity (commitment to engagement with an ultimate concern) are inseparable from one another for Confucianism, and for Chinese traditional thought in general. Thus, for Tu, being religious is equivalent to learning to be fully human. It is an infinite developmental process, and the ultimate ideal is to achieve the oneness of Heaven (or Tian—the cosmic source of ethical and aesthetic values for Confucians) and humanity—in Chinese, Tianren heyi. This transformational ideal of self-cultivation is at the heart of Tu’s understanding of Confucianism as a form of religious humanism.

According to Tu, Confucian self-cultivation is a long and strenuous process; it is also a process of ceaseless creative self-transformation. What is important is one’s conscious determination to undertake it. In a key early work, Centrality and Commonality: An Essay on Confucian Religiousness (1989; later republished in a bilingual Chinese/English edition as Zhongyong dongjian [An Insight into Zhongyong], 2008), Tu highlights the distinction between “learning for the sake of the self” and “learning for the sake of others.” From a Confucian perspective, one’s motivation for self-cultivation should be for the sake of the self—that is, it should be seen as an intrinsic good, not undertaken for reasons that are purely external to oneself. As Tu states, “A decision to turn our attention inward to come to terms with our inner self, the true self, is the precondition for embarking on the spiritual journey of ultimate self-transformation”. Learning only for the sake of others is an inauthentic motivation, because it is driven by consideration of others’ opinions rather than a genuine desire to cultivate oneself. However, Tu does not argue that Confucianism aims at cultivating a private ego. Rather, for Tu, Confucianism crucially emphasizes the cultivation of a true self. Here, Tu sees himself as making a claim similar to that of the classical Confucian thinker Mencius (Mengzi), for whom the aim of knowing the true self is to recognize the inner “great body” (dati), which signifies the true self that can form a unity with Heaven, Earth and the myriad things. Thus, the ultimate aim of self-cultivation is the unity of Heaven and humans.

One important approach to recognizing the great body is to identify one’s sense of commiseration, that is, the sense of sympathy and empathy, as the unique characteristic of human nature distinguishing us from animals. To recognize the great body, as Tu understands this, also means to establish the will (lizhi), to make decisions to act in accordance with the great body. We must deliberately transcend the temptation of learning for the sake of others in order to learn for ourselves. The willingness to do this would help us to access the inexhaustible inner resources of self-transformation and to achieve a state of being truly “self-possessed” (zide) (to use Mencius’ term)—that is, the authentic way of learning to be human.

Drawing on his study of Mencius and the neo-Confucians, Tu argues that Confucianism encourages “human perfectibility through self-effort.” This means that the source of self-actualization lies within humans, rather than being available only through the mediation of some supernatural agent. It assumes that everyone possesses sufficient internal resources for ultimate self-transformation. And this assumption is in turn based on a Chinese cosmology that, in his Confucian Thought: Selfhood as Creative Transformation (1985), Tu calls the “continuity of being.”

a. Continuity of Being

Tu believes that the aim of unity of Heaven and humans is perceivable and realizable through self-effort because of what Confucians assume is a kind of continuity between the self, others, and Heaven and Earth, in which humans, by nature, share a certain reality of Heaven. Therefore, we are obligated to cultivate ourselves to actualize this moral ideal. Thus, Tu argues that Chinese cosmology does not reject the idea of a Creator, but that Confucianism, unlike Christianity, does reject the dichotomy of creator and creature. For Confucianism, it is inconceivable that man can be alienated from Heaven in any essential way. For this reason, Tu’s Confucian religiosity possesses no equivalent to the Christian idea of original sin and divine grace.

According to Tu, within Chinese culture the cosmos is perceived as an unfolding of all-embracing continuous creativity in which all modalities are organically connected. It consists of a dynamic energy field in which all parts of the cosmos mutually interact in a “spontaneously self-generating life process.” While the nature of the cosmos is impersonal, it is not inhuman. It is impartial to all modalities of being, and this picture rejects any kind of anthropocentrism. While human beings are part of this, our minds and consciousness make us unique in that we are able to probe “the transcendental anchorage of our nature,” and to achieve “sympathetic accord with the myriad things in nature.” This sense of the continuity of being also gives rise to a sense of deep awe towards nature and Heaven within Chinese culture, and an aspiration to sustain harmony with nature. Thus, Tu argues, self-knowledge is both a necessary and sufficient condition of knowing the Way (Dao), that is, the way of actualization of authentic human nature. As humans are endowed by Heaven with the “centrality” of the universe—the most refined quality of the universe—Heaven sends humans on the mission of transforming the cosmos into its actualization.

It is important to note that here Tu is taking liberties with the English translation of the Chinese phrase zhongyong, the title of one of the “Four Books” of classical Confucianism, often rendered as “doctrine of the mean” in scholarship prior to Tu’s work on the text. Instead, Tu understands zhong as “centrality” and yong as “commonality.”

“Centrality” denotes the most refined and irreducible quality inherent in human beings. Therefore, the way that comes from “centrality” cannot be separated from us. It is an ontological condition of being, in which one’s mind is unperturbed by external forces. According to ancient Chinese thought, human beings have embodied the centrality of Heaven and Earth. Therefore, human beings are united with Heaven and Earth by inherent centrality. Centrality is a state of the self in which emotions are not yet aroused. When our emotions are aroused to the due measure, it is called harmony. Thus, while centrality is the great foundation of the universe, harmony is an appropriate, unfolding expression of the inner self. Tu reminds us that although the great foundation is inherent in human beings, it does not guarantee that everyone can attain the harmonious state. It is a heavy burden and a long road to attain centrality and harmony; one needs the determination of self-cultivation. By “commonality,” Tu means activities that are ordinary and common, such as eating and walking, which are deployed to describe the Way in Zhongyong. Thus, Tu emphasizes, human beings must accept their embeddedness in the world as the condition of self-transcendence. As the Way (Dao) is inseparable from our ordinary life, we must realize our humanity through our ordinary daily existence. It is tantamount to fulfilling our Heavenly-ordained mission.

b. Anthropocosmic Unity

For Tu, the Heaven-human relationship is not simply that of creator and creature, but “one of mutual fidelity; and the only way for man to know Heaven is to penetrate deeply into his own ground of being.” Thus, Tu argues, the ideal of the Heaven-human relationship is neither theocentric (God-centered or, in Confucian terms, Heaven-centered) nor anthropocentric (human-centered), but “anthropocosmic unity.” By this, Tu means that through a continuous interaction between the two, “the human way necessitates a transcendent anchorage for the existence of man and an immanent confirmation for the course of Heaven.” Because of the mutuality of Heaven and humans, the Confucian Heaven-human relationship is also a kind of “transcendent as immanent.”

From the above analysis, we can see that Confucian religiousness involves three interrelated dimensions: (1) the self-transformation of the person, (2) the communal act of the community and (3) the dialogical response to the transcendent. Confucianism aims at forming a kind of organismic unity between humans and Heaven. Thus, Tu stresses that Confucianism is a kind of “inclusive humanism” with “an anthropocosmic idea.” Unlike secular humanism, Tu’s Confucian inclusive humanism is very much concerned with the transcendent. Indeed, Tu claims that if Confucianism is to continue to develop, it must develop its spiritual tradition in a religious form, so that it can continue to contribute to society. This is because Tu considers religiosity to be that which provides human beings with a sense of deep awe towards Heaven and nature; without this sense of transcendence, human life would be shallow. Moreover, for Tu, although human beings by nature are earthbound, they have an aspiration to transcend themselves and to join with Heaven.

Given Tu’s ideas about the “continuity of being” that properly leads to the development of “anthropocosmic unity,” we can see why Tu’s thesis of “learning for the sake of the self” is not a kind of ego-centrism. As there is continuity of being between the self, the others and the cosmos, learning for the sake of the self can be said to be also for the sake of others, as well as the whole world. Furthermore, in the following section, it is explained that Tu does not reject the motivation of self-cultivation driven by our sincere feeling towards our family members. Thus, what Tu really wants to reject seems to be the motivation of “learning for the recognition from others” rather than “learning for the sake of others.”

3. Selfhood as Creative Transformation

While Tu’s Confucianism emphasizes learning for the sake of the self, this does not mean that one can learn to be fully human in separation from the community. Tu asserts that people generally establish a “fruitful communication with the transcendent through communal participation.” Alongside communal participation, humans must develop a “constant dialogical relationship with Heaven” and are transformed through “a faithful dialogical response to the transcendent” in self-cultivation. For Tu, implicit in Confucian thought is a “covenant” with Heaven, for our moral duty is to achieve the highest human aspiration of forming a “trinity with Heaven and Earth” through self-realization and community-perfecting. Thus, on Tu’s account, Confucian self-cultivation is a gradual process of incorporating all levels of community into self-transformation, from the family to the neighborhood, clan, race, nation, world, and finally to the universe and the cosmos.

Following the work of his teacher, Bellah, Tu argues that one of the perennial human problems in modernity is individualism. In order to respond to individualism, Tu investigates the possibility of “a new vision of the self which is rooted in the reality of a shared life together with other human beings and inseparable from the truth of transcendence” (CT, 8) and his answer is “Confucian selfhood as creative transformation” (CT, 7). Tu stresses that the main concern of Confucianism is how to become a sage or how to become fully realized as an authentic human being. Confucianism believes that man is perfectible by self-effort and one has to achieve self-realization through self-cultivation.

a. Ren

Tu understands the classical Confucian term ren (benevolence or humaneness) as the highest human achievement of self-cultivation and the fullest manifestation of humanity. A man of humanity (renren), who embodies love in his daily life, represents the most authentic realization of human beings from Tu’s Confucian perspective. However, in actual practice, there is a gradual process of extension of love, and ren is exemplified first and foremost in our caring toward our relatives (qinqin). This is further discussed in the following section. For Tu, ren is primarily concerned with the self, although human relations are also crucial to it. Tu conceives ren as a concept of personal morality, as a principle of the inward process of self-fulfilment. It is not simply a personal virtue, but also a metaphysical moral mind that is, at the same time, equivalent to the cosmic mind. Thus, ren is both the moral and the metaphysical foundation of self-cultivation. While we are earthbound and limited, we can also participate in the communal and divine enterprise of self-transformation, that is, enlarging one’s humanity so that humanity as a whole, shared by every human being, will be enriched. The driving force of such anthropological and cosmological assumptions enables Confucianism to function as an ethico-religious system in Chinese society despite its lack of institutional religious character.

b. Li

According to Tu, li (ritual) is the externalization of ren in a specific context. Li implies the existence of human relationships. For Tu as a Confucian, the self and sociality are not separable. Society is conceived as an extended self. Confucian self-transformation must be manifested in the context of sociality, and li points to a concrete manifestation by which one enters into proper relations with others. Tu conceives li as a dynamic process of humanization. It is a process of individual development from (1) cultivating personal life to (2) regulating familial relations, and then (3) ordering social affairs, before finally (4) bringing peace to the world. It assumes forms of integration between personality, family, state, and the world. Thus, for Tu, Confucian self-cultivation is a gradual process of inclusion. Without li, there would be no concrete manifestation of ren. However, without ren as its internal basis, li could become coercive and distort human nature. While ren denotes metaphysical reality, li denotes the standard of this world. Thus, Tu argues, there is a creative tension between the Confucian concepts of ren and li, and it is important to seek the balance between ren and li in a dynamic process.

c. The Junzi or Profound Person

Just as with the phrase zhongyong, Tu takes liberties in translating another classical Confucian term. Junzi, which typically is rendered as “gentleman” or “superior person” in other English-language scholarship, becomes “profound person” in Tu’s usage of the term. This translation reflects Tu’s particular concern with the inner nature of morally superior persons. Basically, junzi is the paradigmatic model of Confucian personality. For Tu, the self-knowledge of the profound person is deeper than that of average people. As the Way (Dao) is inherent in human nature, the actualization of the Way very much depends on our self-knowledge. While the Way is deeply rooted in Heaven-endowed human nature, it is manifested in ordinary life. Thus, the profound person must be sensitive to his own interiority, through which true humanity is manifested as the Way. However, following Confucian tradition, Tu claims that only a few have the inner strength to fully actualize what is inherent in them.

To be a profound person means that one can be (1) sensitive to the outside world, (2) at ease with oneself and (3) courageous. The important way of self-cultivation to achieve self-knowledge is “vigilant solitariness” (shentu) or “self-watchfulness when alone,” that is, a kind of continuous vigilance on one’s own. It is “a process toward an ever-deepening subjectivity.” On the one hand, through continuous critical self-examination, one is sensitive to the subtle manifestation of one’s inner feelings. On the other hand, by such inner penetration of the self, one is able to reach the reality underlying common humanity and to realize the true nature of human-relatedness. Tu argues that in practising vigilant solitariness, one can hear one’s authentic self as expressing the quality of Heaven-ordained nature. As our innermost core is common to all human beings, our self-understanding of Heaven-ordained nature would help us to know the “great foundation” (tapen) of the cosmos.

The uniqueness of the profound person lies in how he integrates the Way with his daily life; therefore, the way of the profound person is also understood as the common way. However, such commonness cannot be defined in terms of an abstract formula by which everyone can learn to become a profound person. Thus, while the Way is common, it is also dynamic in nature. As Tu states, “its meaning can never be fully comprehended and its potential never exhausted. There is always something ‘hidden’ in its commonness.” In the face of constantly changing social situations, a profound person can still be at peace with himself. This is because he “rectifies himself and seeks nothing from others”, and can recognize the possibilities in his fate and bring himself into harmony with the situation. Thus, it is a kind of creative transformation not only of the self, but also of the world, in the spirit of Zhongyong. Therefore, in practising vigilant solitariness, one must also be courageous in order to be truthful to oneself and to resist the temptation of seeking recognition from others as a motivation for self-cultivation. One must possess the inner strength to continue the long and strenuous task of self-cultivation at one’s own pace, undisturbed by the changing environment.

d. Fiduciary Community and Filial Piety

As the Way (Dao) starts with human-relatedness, the idea of community is also an important concern for Confucianism. For Tu, the ideal Confucian society is conceived as “a fiduciary community based on mutual trust.” Thus, the goal of politics is not simply the establishment of law and social order, but also the development of a fiduciary community through moral persuasion, beginning with the rectification of the ruler’s character as a moral leader.

The self, in the process of creative transformation, embodies a network of continuously expanding and deepening human relationships. According to Tu, this network is “a series of concentric circles,” each of which involves different kinds of structural limitations, such as gender, race, and social background. However, through self-cultivation, the structural limitations of each circle can be transformed into an instrument of self-transcendence, and we can extend ourselves continuously to achieve unity with Heaven, Earth, and the myriad things. When we have achieved unity with the most generalized commonality, we also reconfirm the centrality of the self. Tu states, “This broadening and deepening of the self can be characterized, in Mencian terminology, as the manifestation of the ‘great self’ and the concomitant dissolution of the ‘small self’.”

In Confucianism, filial piety is considered the prime virtue of the community underlying the anthropocosmic vision. Like most other Confucians, Tu defines filial piety in terms of “transmission and continuity.” He understands the exercise of filial piety as honoring one’s parents by transmitting the wisdom and values they exemplify and by continuing their unfinished tasks. The underlying idea is that children owe their development to the tradition that is the origin of their existence; thus they are responsible for transmitting the wisdom of the old and honoring the forefathers in their ancestral line. As Tu points out,

The centrality of filial piety in Confucian ethics is predicated on the belief that human beings become aware of themselves by responding naturally to the loving care of those around them. Such a reciprocal response, laden with rich symbolic significance for the transmission and continuity of humanity, is seen by the Confucians as the way to provide a solid basis for personal growth: filial piety and brotherly love are roots (pen) of humanity. (C&C, 140)

Tu states that Confucians reject the idea of identity formation through isolation, because our sincere feelings towards our family members provide the motivation for our self-transformation. When we have nurtured our heart/mind so that it is capable of regulating our family, we can then open ourselves and transcend our egocentrism. However, familialism may degenerate into nepotism and become a kind of structural limitation. Thus, for Tu, we have to establish meaningful relationships with people outside our family and transcend the nepotism of familialism. Furthermore, in order to avoid narrow-minded nepotism, ren must go with righteousness (yi)—that is, a sense of judgment and an honoring of the worthy. Thus, li is not only the externalization of ren; it also signifies a structure of yi realized in the context of human relations. It is primarily concerned with establishing an authentic way of human-relatedness.

Indeed, Tu is thoroughly Confucian insofar as he considers filial piety to be the foundation of political virtue. Traditionally, a filial son is perceived as likely to be vigilant about his personal conduct, diligent about family affairs, compassionate in regard to social obligations, and therefore qualified for political assignments. Historically, filial sons often ended up becoming loyal ministers. According to Tu, Confucianism often uses the family as a metaphor for the country and the world, addressing the emperor as the son of Heaven and the magistrate as the “father-mother official,” because it perceives a kind of transcendent vision implied by such familial nomenclature. This implies that the self is not egoistic; rather, the self is enriched through cultivating one’s relationships within the family. The emphasis on family is not equivalent to nepotism; rather, it is concerned with a developmental process toward organismic unity with the country and the world. Thus, Tu claims that Confucianism perceives self-cultivation and the regulation of the family as the root, and governing the country and bringing peace to the world as the branches. And “the dichotomy of root and branch conveys the sense of a dynamic transformation from self to family, to community, to state, and to the world as a whole.” (C&C, 148).

Rituals and ceremonies (li), as manifestations of the ethico-religiosity of community, are also important elements of moral education and social solidarity. In particular, through ancestral worship the old are respected and the dead are honored for their past contributions, and the community’s yearning for its forefathers also establishes its communal identity. Indeed, with the anthropocosmic vision, we are responsible not only to our ancestors, but also to future generations, for they will in turn transmit our values and continue our tasks.

It is well known that Confucianism is very much concerned with the five cardinal human relationships. According to Tu, reciprocity (shu) is the fundamental principle of the five cardinal human relationships and of human-environmental relationships. Reciprocity helps us to harmonize social relationships, interact sympathetically with nature, and establish a dialogical relationship with Heaven. “Through reciprocity, humanity becomes interfused with the cosmic transformation and thus, as a co-creator, forms a trinity with Heaven and Earth. Humanity, in this perspective, stands as the filial son and daughter of the cosmos.” (C&C, 134).

Another important virtue is reverence toward Heaven. Tu states that filial piety and reverence toward Heaven are “parallel principles in the Confucian anthropocosmic worldview.” Neither simply serves political purposes; both fundamentally bear a cosmological concern, that is, a concern with “bringing peace and harmony to the universe.” Thus, these two virtues attempt to establish “a pattern of mutual dependence and organismic unity” between Heaven and humans.

e. Embodied Knowing

For Tu, moral reasoning is a kind of “embodied knowing” (tizhi). Following Neo-Confucianism, Tu argues that Confucian ontology rejects Kantian metaphysics, which assumes the objectivity of the moral will and excludes any emotional dimension. Tu’s Confucianism also rejects the study of humanity by means of the scientific method. Following Zhang Zai, Tu argues for a distinction between moral knowledge and empirical knowledge. While empirical knowledge derives ideas from our detached observations, moral knowledge cannot be grasped in a disengaged way. Rather, moral knowledge is a kind of bodily, experiential knowledge based on reflection on one’s bodily practice and experience. Crucial to Tu’s understanding of embodied knowing are his interpretations of two key classical Confucian ideas: cheng (sincerity) and the fourfold division of human nature into body (shen), heart/mind (xin), soul (ling), and spirit (shen).

i. Cheng

Unlike Christianity, Tu points out, Confucian thought does not regard revelation as the foundation for perceiving the moral order, but rather looks to “our common experience.” In order to perceive this ultimate reality and to attain our true humanity, we must learn to be cheng, that is, to be sincere, true, and real to our common experience. According to Zhongyong, cheng leads to enlightenment (ming). Cheng also allows us to fully realize ourselves and to understand that Heaven is inherent in human nature. Tu stresses that the possibility of being cheng does not rest on divine grace; rather, it is based on the idea of Heaven-endowed human nature, by which the identification of human nature with the reality of Heaven becomes possible. According to Tu, cheng means not only sincerity but also authenticity. To be cheng signifies not only what a person should be in an ultimate sense, but also a process of actualizing the ultimate reality in ordinary life. The program of self-cultivation that leads to the development of cheng is “a process toward an ever-deepening subjectivity” which involves “a deep penetration into one’s own ground of existence” and includes and embraces others as “an integral part of one’s quest for self-realization.” For this reason, says Tu, Confucianism gives enormous value to the practice of vigilant solitariness.

Ontologically speaking, the expression of the moral subject is a priori true and sincere. In reality, however, if we do not maintain the exercise of self-cultivation, we cannot be sufficiently cheng, and the moral knowledge of the subject will finally be exhausted. This is because our moral knowledge and action are intertwined. Our bodily, experiential knowledge can lead to self-transformation, that is, the enhancement of moral knowledge and the renovation of one’s disposition. Thus, for Confucianism, embodied knowing must necessarily lead to the enhancement of moral practice. The highest human achievement through moral self-cultivation is to become an authentic moral person, which is what Tu calls a “renren,” that is, a man who embodies ren; such embodiment must have a concrete manifestation in the observance of rites (li).

The ultimate concern of self-cultivation is not simply the self, but also the manifestation of our humanity. As human nature shares the ultimate reality of Heaven, human nature is potentially a manifestation of the reality of Heaven. The manifestation of authentic humanity implies that human beings, as co-creators, participate in the creative process of the cosmos. Tu stresses that such a process is not creation ex nihilo (out of nothing). Rather, humans are “capable of assisting the transforming and nourishing process of heaven and earth.” In short, the creative transformation of the sage is derived from human inner nature through a focus on our common human experience. As Tu states, “The way of the sage therefore is centered on the commonality of human nature.”

Moreover, Tu’s vision of Confucian embodied knowing and self-cultivation is to be realized through social practice within a complicated social network. The aim of Confucian self-cultivation is not only to establish oneself, but also to establish others. Thus, our moral knowledge is derived not simply through our own self-understanding, but also by knowing others. Knowing others occurs not through a kind of disengaged introspection, but rather through participating in a network of mutual trust, knowing others’ dispositions and characters through dialogue and interaction with them. Thus, embodied knowing is also a kind of empathic sensual perception; it rejects the objectification of others, whether things or humans. It can accommodate and integrate everything in the world, letting all these things become non-objectified in our minds. Finally, as there is continuity between the being of Heaven and that of humans, embodied knowing is also related to the framework of unity and harmony between humans and Heaven (Tianren heyi). Everyone can be connected to Heaven through embodied knowing of one’s own nature.

ii. Fourfold Human Nature

Tu’s idea of embodied knowing is based on his fourfold division of human nature, perhaps better understood as four levels of subjectivity. According to Tu, the foundation of Confucian morality is an embodied person with sensitivity and emotion, comprising four different levels: body (shen), heart/mind (xin), soul (ling), and spirit (shen). Our embodied knowing is an experiential knowledge based on the integration of these four levels.

Unlike self-cultivation in those religions concerned only with nurturing the human soul, Confucian self-cultivation is very much concerned with the human body. Tu reminds us that Confucian self-cultivation literally means “nourishing the body” (xiushen). As our body includes the five sense organs, Confucian teaching traditionally emphasizes nourishing our bodily senses through the Six Arts (liuyi) – the six ancient disciplines of ritual, music, archery, charioteering, calligraphy, and mathematics, which were foundational to the classical Confucian curriculum. The aim is to aestheticize human life through the practice of rituals (li) and music (yue), to cultivate one’s disposition, to facilitate one’s thinking and the control of one’s emotions, and finally to achieve one’s embodiment of the virtues (yishen tizhi).

While our body contains the sense organs, our heart/mind is a rational faculty integrating our different sensual experiences. Here, Tu is building on the Neo-Confucian interpretation of Mencius’ anthropology as the study of the heart/mind (xin). Our heart/mind includes spiritual resources endowed by Heaven as the defining feature of being human. It also provides the theoretical and practical foundations of our self-cultivation. These spiritual resources are the four germinations (siduan) and the power of the will for self-realization. The four germinations are four kinds of universal predispositions, shown in our sense of (1) commiseration, (2) shame, (3) reverence, and (4) right and wrong. By cultivating these four germinations, we can acquire the four Confucian virtues: benevolence (ren), righteousness (yi), observance of rites (li), and wisdom (zhi). By doing our best with our mind, we can extend our virtuous sense not only toward other persons; it also “flows abroad, above and beneath, like that of Heaven and Earth” (Mencius 7A:13). The power of the will is what Mencius calls “vast, flowing qi” (haoranzhiqi). This power is great and strong; it can lie forever latent, but is never totally lost. If one can nourish this power with uprightness, “it will fill the space between Heaven and earth” (Mencius 2A:2). Thus, profound persons should focus on tapping their own internal energy in the process of realizing humanity.

Apart from body and mind, there are also soul and spirit in human beings. Tu claims that the soul is the extension of the mind, a kind of awareness in the existential situation. Spirit is the state of transcendence, the ultimate goal of self-cultivation. The relation between soul and spirit is like that of body and mind, where the former is concrete, definite, and fixed in shape, while the latter is indefinite and hardly traceable. Tu argues that we can find the Confucian stages of self-transcendence in the Mencius: the stage from goodness to realness corresponds to the ascent from body to mind; the stage from beauty to greatness corresponds to the ascent from mind to soul; and the stage of sage-spirit corresponds to the ascent from soul to spirit. Thus, the process of self-elevation inevitably involves one’s embodiment through ritual practice, in which one learns about one’s mind, becomes aware of one’s soul, and finally rises to the level of spirit. Through such Confucian self-cultivation, we can experience a kind of union with the cosmos; thereby, the embodiment of virtue and a harmonious world cultivated through ritual and music can finally be realized. Here and throughout Tu’s presentation of Confucianism, the role of religiosity in the Confucian humanistic project remains central.

4. Confucianism and Modernity

In Confucian China and its Modern Fate (1958), Joseph R. Levenson argues that, owing to the degeneration of the feudal society which nourished it, Confucianism inevitably faded into the background of modernity. Tu disagrees, criticizing Levenson’s analysis for failing to distinguish between “Confucian China” and the “Confucian tradition.” For Tu, “Confucian China” denotes the politicized Confucian ideology of traditional Chinese feudal society. By the “Confucian tradition,” on the other hand, Tu means the main spirit of Confucian Chinese culture by which Chinese people traditionally have governed their lives. To a certain extent, the Confucian tradition is responsible for the problems of Confucian China. However, Tu believes that as long as we can eliminate the adverse influences of Confucian China, the Confucian tradition can be revived.

In order to refute Levenson’s claims for Confucianism’s demise, Tu explores the possibility of the “third epoch” of Confucianism. Tu considers Confucianism from earliest times through the Han dynasty (202 B.C.E.-220 C.E.) as the first epoch, Neo-Confucianism from the Song (960-1279 C.E.) through the Ming (1368-1644 C.E.) dynasties as the second epoch, and Confucian thought from the May Fourth New Cultural Movement (1919 C.E.) to the present as the third epoch. Historically, the development of Confucianism occurred through dialogue and debate with other traditions in China. For Tu, the challenges to Confucianism in its third epoch come not only from Western science, democracy, psychology, and religiosity, but also from the more universal question of how Confucianism as an inclusive humanism can address the perennial human problems of the world–that is, how Confucianism can be a new philosophical anthropology for humanity as a whole. Tu is not unique in formulating contemporary Confucianism’s intellectual and spiritual situation in such terms, as similar formulations can be found in the work of his teacher, Mou, as well as the writings of Mou’s associates, Tang Junyi and Xu Fuguan. Where Tu distinguishes himself from other “New Confucians” is in his insistence that contemporary Confucians must not only (1) reflect on the past problems of traditional Confucianism, but also (2) communicate with different civilizations so that Confucian thinkers can benefit from such dialogues and thus contribute to the global world.

a. Critique of Confucian Tradition

Regarding the controversies about traditional Confucianism as it existed prior to modernity, Tu thinks that one crucial problem was its political integration with despotic regimes. He calls this kind of politicized Confucianism “Confucian China,” by which he means a conservative political ideology used by the powerful to control and oppress people, including the internalized oppression whereby Chinese minds became accustomed to rejecting any proposals for cultural change or reform. Tu’s critique thus is very much in sympathy with the May Fourth New Cultural Movement, but he argues that some May Fourth intellectuals went too far in their criticisms of Confucianism by uncritically embracing Western Enlightenment thought, which had problems of its own. Pointing out that the West has its vices and China has its virtues, culturally and intellectually speaking, Tu suggests that Chinese interaction with Western culture should be grounded in a deep understanding of the Chinese cultural heritage and a realistic awareness of the pitfalls of Western-style modernity. Finally, Tu insists that there can be different kinds of modernity; Western modernity is not the only way, especially for China.

b. Critique of Western Enlightenment Thought

Tu is very much appreciative of the values and institutions of Western modernity brought by the Enlightenment, such as reason, freedom, equality, individuality, the rule of law, democracy, science, and capitalism, and he thinks that China can learn from these in its modernization. However, Tu criticizes the Enlightenment for inducing scientism, anthropocentrism, and individualism. Anthropocentrism and scientism bring a kind of disenchantment and ultimately lead to the exploitation of nature and the destruction of the environment. Anthropocentrism, together with the rise of capitalism, also leads to the marketization of the economy, politics, academia, and education, and, what Tu deems worse, the marketization of religion. Capitalist society has been dominated by instrumental reason, which, for Tu, is disembodied, unsympathetic, and even cruel. Individualism has also caused the prevalence of egocentrism and the disintegration of community. Thus, the Enlightenment has ultimately precipitated antagonism among humans and between humans and nature. In these respects, Tu thinks that Confucian values, such as sympathy, distributive justice, a strong sense of communal responsibility, propriety, collectivism, and an emphasis on self-restraint, should also be universalized so that they can contribute to the modern world.

c. Civilizational Dialogue

In response to the many Western social theorists who regard Western modernity as the future trend of human development, Tu points out the Eurocentric chauvinism present in their work. Inspired by the existentialist philosopher Karl Jaspers’ concept of the “Axial Age”—the idea that modern civilizations have been shaped by different cultural and intellectual traditions that arose between the 700s and the 200s B.C.E., including Confucianism—Tu argues that the rise of East Asian industrialization has dispelled the myth of Westernization as the single model of modernization. In the case of Singapore, for example, Tu cites how Confucian virtues—such as frugality, industriousness, self-discipline, loyalty, and active participation in collective welfare—have proven constitutive of the success of the ethnically Chinese city-state. In Tu’s view, Singapore—and by extension all of industrialized East Asia—could not have proceeded to modernity by a route other than its own (Chinese) cultural traditions, and cannot be judged by the standards of other cultural traditions. Thus, for Tu, the development of modernity is a pluralistic phenomenon that proceeds by different cultural pathways.

Moreover, Tu identifies the twenty-first century cultural-historical moment as a kind of new “Axial Age,” in which conditions of cultural and religious pluralism can foster constructive dialogue between traditions and civilizations. Throughout this conversation, each civilization should open itself to learning from others, on the one hand, while at the same time maintaining a strong sense of self-recognition so that it will develop, rather than dissipate, as a result of dialogue. Tu’s hope is that such civilizational dialogue will aid in the search for a global ethic. In Way, Learning, and Politics: Essays on the Confucian Intellectual (1993, hereafter abbreviated as WLP), Tu writes: “If the well-being of humanity is its central concern, Confucian humanism in the third epoch cannot afford to be confined to East Asian culture. A global perspective is needed to universalize its concerns. Confucians can benefit from dialogue with Jewish, Christian, and Islamic theologians, with Buddhists, with Marxists, and with Freudian and post-Freudian psychologists.” (WLP, 158-9).

This emphasis on inter-cultural, inter-religious, inter-disciplinary, and inter-civilizational dialogue is one of the defining features of Tu’s Confucianism. It also represents a response to the theory of the “clash of civilizations” proposed by Samuel P. Huntington. Huntington argues that international conflict in the post-Cold War era is driven primarily by clashes between people’s cultural and religious identities, and he anticipates that the greatest future conflict will be between the West and what he calls the Confucian and Islamic civilizations. Tu does not deny the existence of civilizational clashes; nevertheless, he criticizes Huntington’s thesis as one-sided and static in its assumptions about the structure of civilizations. In contrast, Tu finds that the development of civilizations is dynamic and mutually influential, and he argues that the trend of global development should be the promotion of civilizational dialogue rather than the anticipation of civilizational conflict. From Tu’s Confucian perspective, this posture reflects a distinctively Chinese faith that true harmony may be achieved by respecting differences, as stated in Lunyu (Analects) 13:23: “The superior man is conciliatory but does not identify himself with others.” Such civilizational dialogue is crucial for the future development of Confucianism as Tu envisions it, just as Confucianism in the past developed out of dialogue with Buddhism in China. If contemporary Confucians want their tradition to continue to grow, argues Tu, then they must face the challenge of Western civilization. Tu goes so far as to say that, without such civilizational dialogue, there can be no future for Confucianism as a living tradition.

But the value of civilizational dialogue does not lie only in its benefits to the Confucian tradition. In Tu’s mind, Confucianism is not simply for people in China or East Asia, but for the whole world. Tu thinks that the third epoch of Confucianism must respond to four challenges from the West: (1) the spirit of scientific inquiry, (2) democracy, (3) Western religiosity and its sense of transcendence, and (4) the Freudian psychological exploration of human nature. Thus, Confucianism must open itself, reach out to the world, and learn from the West about the ideas of freedom, equality, science, democracy, human rights, and the rule of law, while rejecting the Western tendencies toward either radical individualism or radical collectivism. Confucianism must reject its own past elements of authoritarianism, hierarchalism, and androcentrism (male-centeredness) while conserving the value to be found in traditional Confucian aesthetics, morality, and religiosity. Through such civilizational dialogue, Tu believes, Confucianism can both renew itself and become a valuable resource for the world.

5. Influence and Criticism

One of the important and influential contributions of Tu’s Confucianism is his ongoing dialogue with non-Confucian religions and social theories. Tu’s many years of living in the United States have made such dialogue possible and enabled him to become the foremost spokesperson for Confucian thought in the West. However, his prolonged expatriate status also renders him vulnerable to criticism from Confucian thinkers in China and elsewhere outside the West.

While Tu’s particular presentation of Confucian thought has proven to be both intelligible and popular among Westerners, his use of Western religious concepts and terminology to describe Confucianism also has generated controversy in the Chinese Confucian world. In particular, the cultural hybridity and explicit spirituality that are key elements of Tu’s Confucianism have been criticized by some other contemporary Chinese Confucian thinkers, who—like modern Chinese philosophy in general—have been more influenced by nationalism and secularism than a diaspora thinker such as Tu. For example, some Chinese Confucian thinkers have pointed out that the idea of a covenant with Heaven can scarcely be found in Confucian texts, even in implicit terms. Others have questioned whether Confucianism truly possesses a concept of dialogue between human beings and Heaven, given that Confucius is recorded as having taught that Heaven does not say anything (Analects 17:19). Finally, Tu’s philosophical anthropology—particularly his distinction between soul (ling) and spirit (shen), which is not always clear in his writings—has come under fire from some other Confucian critics.

Despite such controversies, however, Tu’s enormous impact and legacy as a modernizer of Confucian thought and champion of Confucian engagement with non-Confucian traditions must be acknowledged.

6. References and Further Reading

  • Hung, Andrew T. W. “Tu Wei-Ming and Charles Taylor on Embodied Moral Reasoning.” Philosophy, Culture, and Traditions 3 (2013): 199-216.
  • Huntington, Samuel P. “The Clash of Civilizations?” Foreign Affairs 72/3 (Summer 1993): 22-49.
  • Levenson, Joseph R. Confucian China and its Modern Fate: The Problem of Intellectual Continuity, Volume 1. Berkeley: University of California Press, 1958.
  • Tu, Wei-ming. Neo-Confucian Thought in Action: Wang Yang-ming’s Youth (1472-1509). Berkeley: University of California Press, 1976.
  • Tu, Wei-ming. Humanity and Self-Cultivation: Essays in Confucian Thought. Berkeley: Asian Humanities Press, 1978.
  • Tu, Wei-ming. Confucian Ethics Today: The Singapore Challenge. Singapore: Federal Publications, 1984.
  • Tu, Wei-ming. Confucian Thought: Selfhood as Creative Transformation. Albany: State University of New York Press, 1985.
  • Tu, Wei-ming. Centrality and Commonality: An Essay on Confucian Religiousness. Albany: State University of New York Press, 1989. Later published as Zhongyong dongjian (An Insight into Zhongyong), bilingual (Chinese and English) edition, trans. Duan Dezhi (Beijing: People’s Publishing House, 2008).
  • Tu, Wei-ming. Way, Learning, and Politics: Essays on the Confucian Intellectual. Albany: State University of New York Press, 1993.
  • Tu, Wei-ming, and Mary Evelyn Tucker, eds. Confucian Spirituality. 2 vols. New York: The Crossroad Publishing Company, 2003-04.


Author Information

Hung Tsz Wan Andrew
Email: ccandrew@hkcc-polyu.edu.hk
Hong Kong Community College, The Hong Kong Polytechnic University
China

Thick Concepts

A term expresses a thick concept if it expresses a specific evaluative concept that is also substantially descriptive. It is a matter of debate how this rough account should be unpacked, but examples can help to convey the basic idea. Thick concepts are often illustrated with virtue concepts like courageous and generous, action concepts like murder and betray, epistemic concepts like dogmatic and wise, and aesthetic concepts like gaudy and brilliant. These concepts seem to be evaluative, unlike purely descriptive concepts such as red and water. But they also seem different from general evaluative concepts. In particular, thick concepts are typically contrasted with thin concepts like good, wrong, permissible, and ought, which are general evaluative concepts that do not seem substantially descriptive. When Jane says that Max is good, she appears to be evaluating him without providing much description, if any. Thick concepts, on the other hand, are evaluative and substantially descriptive at the same time. For instance, when Max says that Jane is courageous, he seems to be doing two things: evaluating her positively and describing her as willing to face risk. Because of their descriptiveness, thick concepts are especially good candidates for evaluative concepts that pick out properties in the world. Thus they provide an avenue for thinking about ethical claims as being about the world in the same way as descriptive claims.

Thick concepts became a focal point in ethics during the second half of the twentieth century. At that time, discussions of thick concepts began to emerge in response to certain disagreements about thin concepts. For example, in twentieth-century ethics, consequentialists and deontologists hotly debated various accounts of good and right. It was also claimed by non-cognitivists and error-theorists that these thin concepts do not correspond to any properties in the world. Dissatisfaction with these viewpoints prompted many ethicists to consider the implications of thick concepts. The notion of a thick concept was thought to provide insight into meta-ethical questions such as whether there is a fact-value distinction, whether there are ethical truths, and, if there are such truths, whether these truths are objective. Some ethicists also theorized about the role that thick concepts can play in normative ethics, such as in virtue theory. By the beginning of the twenty-first century, the interest in thick concepts had spread to other philosophical disciplines such as epistemology, aesthetics, metaphysics, moral psychology, and the philosophy of law.

Nevertheless, the emerging interest in thick concepts has sparked debates over many questions: How exactly are thick concepts evaluative? How do they combine evaluation and description? How are thick concepts related to thin concepts? And do thick concepts have the sort of significance commonly attributed to them? This article surveys various attempts at answering these questions.

Table of Contents

  1. Background and Preliminaries
  2. Significance of Thick Concepts
    1. Foot’s Argument against the Is-Ought Gap
    2. McDowell’s Disentangling Argument
    3. Williams on Ethical Truth
    4. Thick Concepts in Normative Ethics
  3. How Do Thick Concepts Combine Evaluation and Description?
    1. Reductive Views
    2. Non-Reductive Views
  4. How Do Thick and Thin Differ?
    1. In Kind: Williams’ View
    2. Only in Degree: The Continuum View
    3. In Kind: Hare’s View
  5. Are Thick Terms Truth-Conditionally Evaluative?
    1. Pragmatic View
    2. Semantic View
  6. Broader Applications
  7. References and Further Reading

1. Background and Preliminaries

Bernard Williams first introduced the phrase ‘thick concept’ in his 1985 book, Ethics and the Limits of Philosophy. Williams used this phrase to classify a number of ethical concepts that are plausibly controlled by the facts, such as treachery, brutality, and courage. His use of the phrase was adapted from Clifford Geertz’ notion of a thick description—an anthropologist’s tool for describing “a multiplicity of complex conceptual structures, many of them superimposed upon or knotted into one another” (1973). Incidentally, Geertz borrowed the phrase ‘thick description’ from Gilbert Ryle, who took thick description to be a way of categorizing actions and personality traits by reference to intentions, desires, and beliefs (1971). Although Geertz’ and Ryle’s notions of thick description influenced Williams’ terminology, their notions did not necessarily involve evaluation. By contrast, Williams’ notion of a thick concept is bound up with both evaluation and description. Or, in Williams’ terms, thick concepts are both “action-guiding” and “guided by the world”. They are action-guiding in that they typically indicate the presence of reasons for action, and they are world-guided in that their correct application depends on how the world is (1985: 128, 140-41).

Although the phrase ‘thick concept’ first appeared in Williams’ Ethics and the Limits of Philosophy, there was a distinction between thick and thin that predated Williams’ 1985 book. In R.M. Hare’s The Language of Morals, published in 1952, Hare distinguished between primarily evaluative words and secondarily evaluative words (121-2). Hare later identified the former with thin terms, and the latter with thick terms (1997:54). So, the idea of a thick term was present in ethics well before Williams’ terminology.

Hare’s distinction between thick and thin is explicitly about words, and it makes no mention of concepts.  But, in general, the literature on the thick speaks about both thick concepts and thick terms.  Very roughly, concepts are on the level of propositions and meanings (broadly construed), whereas terms are the linguistic entities used to express these items.  In this entry, expressions with single-quotes, for example ‘chaste’, will be used to designate terms.  Italicized expressions, for example chaste, will be used to designate concepts.  Thick concepts, then, can approximately be seen as the meanings of thick terms.

Thick and thin terms are two distinct subclasses of the evaluative. However, readers should exercise caution when encountering the phrase ‘thin term’, since some theorists allow wholly descriptive terms like ‘red’, ‘grass’, and ‘green’ to count as thin (for example, Elgin 2008: 372). Their usage of ‘thin’ diverges from prevailing philosophical jargon, where the thin is seen as a subclass of the evaluative. This article uses the prevailing jargon: on this way of speaking, there are no wholly descriptive thin terms.

It is typically claimed that thick terms are in some sense evaluative and descriptive, but what do these notions mean? The evaluative and the descriptive are normally meant to distinguish between two classes of terms. Descriptive terms can be illustrated with words like ‘red’, ‘solid’, ‘small’, ‘tall’, ‘water’, ‘cat’, and ‘hydrogen’. Paradigmatic evaluative terms include thin terms like ‘good’, ‘bad’, and ‘best’, as well as normative words like ‘ought’, ‘should’, ‘right’, and ‘wrong’. Although some theorists deny that there is any substantive difference between the descriptive and the evaluative (for example, Jackson 1998:120), on the face of it there is a difference. Various attempts have been made to account for this putative difference. Two general approaches are relevant for our purposes.

One approach stems from traditional non-cognitivism. On this view, descriptive terms express beliefs and are capable of picking out properties and facts. But paradigmatic evaluative terms, such as ‘good’ and ‘right’, have neither of these features. These terms do not express beliefs and are incapable of picking out properties and facts. Instead, the function of an evaluative term is to express and induce attitudes, or to commend, condemn, and instruct. Basically, for traditional non-cognitivism, descriptive expressions are capable of representing properties and facts, whereas evaluative ones express attitudes or imperatives that cannot represent properties or facts. Since this version of the distinction denies that evaluations can be factual, we can call it the strong distinction.

The strong distinction is also known as the fact/value distinction. Thick terms are often seen as a problem for this distinction because they seem both descriptive and evaluative. Indeed, Williams holds that the world-guidedness of thick concepts “is enough to refute the simplest oppositions of fact and value” (1985:150).

There is also a weak distinction between description and evaluation, which is neutral on the question of whether evaluations can be factual. Proponents of this weak distinction may agree with the strong distinction regarding the primary function of evaluative terms. For example, they might agree that evaluative terms function to express and induce attitudes, or to commend, condemn, and instruct. However, they do not rule out the possibility that evaluations are also factual. What then distinguishes the evaluative from the descriptive? Simply put, descriptive terms are all the other predicates within a language—that is, descriptive terms just are non-evaluative. Since this distinction allows that evaluations can be factual, we can call it the weak distinction.

Thick terms are seen as significant because they straddle the above distinctions—they have something in common with both the evaluative and the descriptive. Consequently, thick terms raise interesting questions about whether there is value in the world and whether value claims can be inferred from factual ones. The main arguments and views in this vicinity are from Philippa Foot, John McDowell, and Bernard Williams, which are discussed next.

2. Significance of Thick Concepts

a. Foot’s Argument against the Is-Ought Gap

David Hume is often interpreted as holding that one cannot derive an ‘ought’ from an ‘is’, or more generally, that one cannot derive an evaluative statement from a purely descriptive statement. This view has some intuitive appeal. The basic thought is that evaluative statements can condemn, commend, and instruct, whereas descriptive statements can do none of these things. So any inference from purely descriptive premises to an evaluative conclusion would involve a conclusion with content nowhere expressed in its premises; hence the inference must be invalid. If we do have a valid inference to an evaluative conclusion, then the premises must somehow involve evaluative content, perhaps covertly. This claim that there are no conceptually valid inferences from purely descriptive premises to evaluative conclusions is known as the “is-ought gap”.

The is-ought gap may seem plausible when the evaluative conclusion employs thin terms like ‘right’ and ‘wrong’. Philippa Foot rejects the is-ought gap by focusing instead on an evaluative conclusion that employs a thick term: ‘rude.’ She points out that ‘rude’ should count as evaluative, because it seems to express an attitude, or to condemn, much like ‘bad’ and ‘wrong’. But, according to Foot, this evaluation can be derived from a description. Consider the description D1: that x causes offence by indicating a lack of respect. Can one accept D1 as true, but deny that x is rude? Foot thinks this denial would be inconsistent. If she is right, then a thick evaluative claim—that x is rude—can be derived from a descriptive claim (Foot 1958).

Foot’s argument is primarily aimed at non-cognitivists, like Hare.  But Hare replies by considering an analogous inference involving a racial slur.  To demonstrate Hare’s point, consider a racial slur like ‘gringo’, ‘kraut’, or ‘honky’.  Most of us disagree with the attitude of contempt that is expressed by the slur ‘kraut’.  But, according to Hare, an analogous inference would logically require us to accept that attitude—that is, to despise Germans.  And this is absurd.  Consider the descriptive claim D1*: that x is a native of Germany.  Is it logically consistent for one to accept D1* as true but deny that x is a kraut?  If the denial in Foot’s example is inconsistent, then, according to Hare’s thinking, it should also be inconsistent in this example.  So, by Foot’s reasoning, D1* should entail ‘x is a kraut’.  Moreover, ‘kraut’ is a term of contempt, which means that this conclusion entails that one must despise x.  By the transitivity of entailment, it follows that an acceptance of D1* requires one to despise x, which is an unintuitive result.  According to Hare, the two inferences are “identical in form.”  So, there must be something wrong with both inferences (1963:188).

Where do the above inferences go wrong? Hare holds that people who reject the attitude associated with ‘kraut’ will substitute it with an evaluatively-neutral expression, such as ‘German’, which does not commit them to the attitude of contempt. So, people who accept D1* are not required to use the evaluative word ‘kraut’ in expressing the conclusion of the inference. Analogous points hold for ‘rude’. Hare concedes that we rarely have evaluatively-neutral expressions corresponding to paradigmatic thick terms, but he thinks such expressions are at least possible. After all, we could use ‘rude’ with a certain tone of voice or with scare-quotes around it, thereby indicating that we mean it in a purely descriptive sense (1963:188-89).

Hare’s response to Foot assumes that slurs are evaluative in the same way as thick terms. This, however, has been taken by some to be unintuitive. But Hare could modify his reply: instead of employing slurs he could use thick terms like ‘chaste’, ‘blasphemous’, ‘perverse’, and ‘lewd’, which are often called “objectionable thick terms”. Objectionable thick terms are terms that embody values that ought to be rejected. It is, of course, a matter of debate whether these thick terms really are objectionable. But Hare could run his argument by using thick terms that are commonly regarded as objectionable. Such terms seem to be evaluative in much the same way as ‘rude’. And, much like slurs, there are many people who reject the values embodied by such terms; these people are consequently reluctant to use the term in question. Notice that arguments like Foot’s would require such people to accept the values embodied by the thick terms they regard as objectionable, and this seems equally implausible. So, Hare’s basic reply need not assume any fundamental similarity between thick terms and slurs.

However, it is unclear that Hare has shown what he needs to show—that the relevant is-ought inferences are invalid. In particular, he has not shown that it’s possible for D1 to be true while ‘x is rude’ is false, or for D1* to be true while ‘x is a kraut’ is false. The mere fact that reluctant speakers will substitute the evaluative conclusions with neutral ones does not show that the evaluative conclusions are false. For instance, you may hate the word ‘prune’ and prefer to substitute it with ‘dried plum’, but that doesn’t mean it’s false that the thing in question is a prune (Foot 1958:509).

Hare could claim that the is-ought gap only exists on the level of concepts or propositions, not on the level of terms or sentences. This makes a difference because Hare holds that the evaluations of thick terms are detachable in the sense that there could be evaluatively-neutral expressions that are propositionally equivalent to sentences involving thick terms. Detachability is the upshot of Hare’s view that ‘German’ can be substituted for ‘kraut’. If the evaluations of thick terms are detachable, then they attach only to the terms, and are not entailed by the propositions expressed by such terms. So, there is no breach of the is-ought gap on the level of propositions. From D1*, one can infer the proposition that x is German. And this proposition can also be expressed by using the term ‘kraut’. But one cannot infer a negative evaluation from this proposition; the negative evaluation is only inferable from uses of the term ‘kraut’, not from the proposition expressed by such uses.

Hare’s view that the evaluations of thick terms are detachable has led to debates over how exactly thick terms are evaluative. For example, is the evaluation merely pragmatically associated with the term in a way that would make it detachable? Or are these evaluations part of its truth-conditions? These debates are discussed in section 5.

b. McDowell’s Disentangling Argument

Even if Foot is right that there are descriptions that are sufficient for the correct application of thick terms, it need not be the case that these descriptions are necessary. John McDowell’s Disentangling Argument is believed to show that there could not be a description that is both necessary and sufficient for the correct application of a thick term. In this argument McDowell is primarily arguing against non-cognitivists, such as Hare, who accept the strong distinction between description and evaluation.

Before diving into the argument, recall that thick terms seem to straddle the strong distinction. For example, the claim that OJ committed murder seems to aim at stating a fact, which is a feature of descriptive claims (on the strong distinction). But this claim also seems evaluative; by calling it murder, rather than killing, we seem to be evaluating OJ’s action negatively. Thus, thick terms such as ‘murder’ call into doubt the strong distinction because they seem to be both descriptive and evaluative. This issue does not present any obvious challenge to those who accept only the weak distinction, since a term’s being evaluative on the weak-distinction does not preclude it from being fact-stating. How do proponents of the strong distinction meet this challenge?

A.J. Ayer is one non-cognitivist who holds that thick terms like ‘hideous’, ‘beautiful’, and ‘virtue’ are solely on the evaluative side of the strong distinction—they are purely non-factual, evaluative concepts (1946:108-13). Ayer’s view is counterintuitive, and, if generalized, would oddly entail that there is no fact as to whether OJ committed murder.

Most non-cognitivists disagree with Ayer, and claim that thick terms have hybrid meanings that contain two different kinds of content: a descriptive content and an evaluative content. This kind of view is called a Reductive View because it reduces the meaning of a thick term to a descriptive content along with a more basic evaluative content (for example, a thin concept).

McDowell’s Disentangling Argument targets a specific kind of Reductive View, one that is coupled with the strong distinction between description and evaluation. We may thus call his target “the Strong Reductive View”. McDowell assumes that Strong Reductive Views must hold that the thick concept’s descriptive content completely determines the thick concept’s extension. It does so by identifying a property that completely determines what does and does not fall within the thick concept’s extension. The evaluative content plays no role in determining what property the thick concept picks out, but is instead an attitudinal or prescriptive tag that explains the concept’s evaluative perspective.

McDowell’s argument against Strong Reductive Views invites us to consider the epistemic position of an outsider who does not share the evaluative perspective associated with a given thick term. Consider, for example, someone who fails to understand the sexual mores associated with ‘chaste’. Will this person be able to anticipate what this term applies to in new cases? Initially, one might think this is possible: is this not what anthropologists are trained to do? Williams, a proponent of McDowell’s argument, says that the anthropologist must at least “grasp imaginatively” the evaluative point of ‘chaste’ (1985:142). That is, she must imagine that she accepts the evaluative point of this term, at least for the purposes of anticipating its usage. Even this might be a problem for Strong Reductive Views. If a Strong Reductive View is correct, then there would be no need for an outsider to grasp the evaluative content of ‘chaste’, even imaginatively. After all, the descriptive content is supposedly what drives the extension of ‘chaste’, which means that an unsympathetic outsider could master its extension just by grasping the descriptive content and observing that it applies to all and only the features that the insiders call ‘chaste’. So, the Strong Reductive View seems to predict that an unsympathetic outsider could anticipate the insider’s usage of ‘chaste’. Many find this implausible.

McDowell’s argument has two premises: (1) If a Strong Reductive View is true of ‘chaste’, then an unsympathetic outsider could master the extension of ‘chaste’ (that is, she could know what things ‘chaste’ would apply to in new cases) without having any grasp of its evaluative content. But, (2) an unsympathetic outsider surely could not achieve this—she could not anticipate its usage if she stands completely outside the evaluative perspective of those who employ the concept. Therefore, the Strong Reductive View is not true of ‘chaste’ (1981:201-3). This sort of argument could be advanced with respect to any thick term and perhaps even thin ones.

The Disentangling Argument is sometimes thought to be a distinctive problem for all Reductive Views, not just Strong Reductive Views. But this is a mistake. Consider Reductive Views that accept only a weak distinction between description and evaluation: call such views “Weak Reductive Views”. Weak Reductive Views can allow that the evaluative content of ‘chaste’ picks out a property, and can therefore allow that this evaluative content plays a role in determining the extension of ‘chaste’. For example, if morally good is the evaluative content associated with ‘chaste’, and morally good picks out a property, then morally good can also play a role in determining the extension of ‘chaste’. This means that its extension need not be completely determined by its descriptive content. In this case, an unsympathetic outsider would be a very strange person—that is, someone who does not accept the evaluative point of morally good, even imaginatively. But such a person does not seem impossible.

To be sure, Weak Reductive Views could be vulnerable to the Disentangling Argument if they accept an additional claim, namely, that chaste is coextensive with a descriptive concept that is perhaps not encoded within the content of chaste. Consider an analogy: it is plausible that water and H2O are coextensive, even though neither concept is encoded within the other. If Weak Reductive Views hold that this situation is true of chaste and some descriptive concept D, which is not encoded in the content of chaste, then the Disentangling Argument could be run against these views. This type of Weak Reductive View predicts that an outsider could master the extension of ‘chaste’ just by grasping D and observing that insiders apply ‘chaste’ to all and only things that are D. Thus, the Disentangling Argument could be run against Weak Reductive Views if they accept the additional claim that chaste is coextensive with a descriptive concept.

However, the same problem also arises for Non-Reductive Views that accept this additional claim. Non-Reductive Views hold that thick concepts cannot be divided into distinct contents (more on this in section 3). And, strictly speaking, Non-Reductive Views are compatible with the additional claim just mentioned—that chaste is coextensive with a descriptive concept. If Non-Reductivists accept this additional claim—which would be uncharacteristic, though not inconsistent—then the combined view would also be vulnerable to the Disentangling Argument.

So, the Disentangling Argument can be used to target any view, Reductive or Non-Reductive, that holds thick concepts to be coextensive with descriptive concepts. It is thus a mistake to think the Disentangling Argument is a problem for all and only Reductive Views. The reason McDowell’s argument targets Strong Reductive Views is that these views appear fit to accept the problematic claim—that thick concepts are coextensive with descriptive concepts.

Most opponents of the Disentangling Argument reject premise (1), by showing that Strong Reductive Views can allow that an unsympathetic outsider could not master the extension of thick terms. This approach is discussed in section 3a. But Hare takes a different approach. He accepts the Strong Reductive View but rejects premise (2). Recall Hare’s way of arguing that there could be a descriptive concept that is extensionally equivalent to a thick concept. One can express this descriptive concept by muting the thick term’s evaluative content in one of two ways: either by using the thick term with a certain tone of voice or by placing scare-quotes around the term. Suppose that these methods successfully show that there is a purely descriptive concept—call it des-chaste—which is coextensive with chaste. In this case, an outsider could employ des-chaste to track the insider’s usage of ‘chaste’.

One might object that Hare’s two methods of uncovering des-chaste reveal that this concept cannot be grasped without already grasping chaste. So, the interpreter in question would not be a genuine outsider. However, although Hare’s methods of uncovering des-chaste require a grasp of chaste, there is no automatic reason to assume that there could not be another method of uncovering des-chaste without grasping chaste—for example, by learning des-chaste independently of any encounter with insiders or their value system.

Would the outsider’s grasp of des-chaste help her anticipate the insider’s use of ‘chaste’ in new cases? Hare thinks so. According to Hare, the outsider could anticipate its use in new cases because she could observe similarities between old cases and new cases, and infer on the basis of those similarities that ‘chaste’ would or would not apply in the new cases (1997: 61). Of course, McDowell and his followers would not be convinced by this claim, since they hold that the similarities between such cases are evaluative. In other words, they accept what is known as “the shapelessness hypothesis”—that the extensions of thick terms are unified only by evaluative similarity relations.

The fundamental disagreement between Hare and McDowell concerns whether the shapelessness hypothesis is true. Is there any reason to accept shapelessness? This hypothesis is sometimes supported by the fact that it can explain why premise (2) of the Disentangling Argument is plausible—that is, it can explain why an unsympathetic outsider could not master the extension of ‘chaste’ (Roberts 2013: 680). Of course, this idea won’t convince someone like Hare who rejects premise (2). Something more should be said. In section 5b, further support for shapelessness is discussed.

It is worth noting that there is a common thread running through Hare’s replies to both Foot and McDowell. In both replies Hare claims that there could be a descriptive expression that is extensionally equivalent to a thick term. Without this claim he could not hold that des-chaste is coextensive with chaste. He also could not escape Foot’s objection to the is-ought gap by claiming that the evaluations of thick terms are detachable.

c. Williams on Ethical Truth

If successful, McDowell’s argument would show that there could not be a wholly descriptive expression coextensive with a thick term. Furthermore, if we assume that utterances involving thick terms are sometimes true, McDowell’s argument might show that there are evaluative facts—facts that can only be characterized in evaluative terms. But this is too quick. After all, sentences involving thick terms might only be true in a minimalist sense. On the minimalist theory of truth, to say that ‘lying is dishonest’ is true is equivalent to saying simply that lying is dishonest. This is all that can be significantly said about the truth of this sentence. Since nothing more can be said, its mere truth does not entail the existence of a fact to which the sentence corresponds. So, even if McDowell’s argument succeeds, the truth of sentences involving thick terms does not guarantee that there are facts that can only be characterized in evaluative terms.

To get a fuller picture of how the truth of such sentences might support the existence of evaluative facts, we must turn to Bernard Williams. According to Williams, utterances involving thick terms show promise of being more than just minimally true, whereas utterances involving thin terms do not. The main difference, according to Williams, is that thick terms bear a close connection to the concept of knowledge and to the notion of a helpful advisor.

Consider the connection between knowledge and thick concepts. There is a precedent for thinking that certain epistemic difficulties arise for thin concepts but not thick ones. For example, how exactly can one come to know that lying is sometimes wrong? This plausible truth, which involves a thin concept, seems to be neither analytic nor a posteriori. Some ethicists have thus held that it is synthetic a priori and is knowable by a special faculty of the mind, such as moral intuition. But many ethicists find this view implausible and have instead turned to thick concepts for an account of ethical knowledge. It may seem more plausible that we can know a posteriori that a thick concept applies, for example, that a certain action is cowardly. According to Mark Platts, we can know such truths “by looking and seeing,” without any special faculty, such as moral intuition (1988: 285).

Williams agrees that thick ethical knowledge is more feasible than thin ethical knowledge. His reasons, however, are different from Platts’. Williams holds that the concept of knowledge is associated with the notion of a helpful informant or advisor, and that there are only such advisors with regard to the application of thick concepts, not thin ones. According to Williams, a helpful advisor is someone who is better than others at seeing that a certain outcome, policy, or action falls under a concept. And Williams holds that there are helpful advisors with regard to thick concepts. For example, the advice that a certain action would be cowardly “can offer the person who is being advised a genuine discovery” (1993: 217). Are there helpful advisors with regard to thin concepts? Not according to Williams—“not many people are going to say ‘Well, I didn’t understand the professor’s argument for his conclusion that abortion is wrong, but since he is qualified in the subject, abortion probably is wrong’” (1995: 235). Thus, according to Williams, utterances involving thick terms show promise of being more than minimally true, given that thick terms have this association with knowledge and helpful informants.

Even though Williams holds that utterances involving thick terms can be more than minimally true, he does not think these utterances can be objectively true—that is, true independently of particular perspectives. To illustrate this, Williams asks us to compare ethics with science. Although there are disagreements in science, there is at least some chance of scientists converging on a perspective-free account of the world, and this convergence would be best explained by the correctness of that account. But Williams thinks our ethical opinions stand no chance of converging on an account of how the world really is independently of particular perspectives—at any rate, if they do converge, this will not be because these opinions have tracked how the world is independently of perspective (1985: 135-6). So, on Williams’ view, the truth of ethical opinions depends upon perspective, and hence such opinions are not objectively true.

How exactly is the truth of utterances involving thick terms dependent upon perspective? Williams illustrates his view by asking us to envision a hyper-traditional society which is maximally homogenous and minimally reflective. Williams holds that ethical reflection primarily employs thin concepts, and that this hyper-traditional society is unreflective because it only employs thick concepts. According to Williams, their utterances involving thick terms can be true in their language L, which is distinct from our language since L does not express thin concepts whereas our language does. Williams thinks it is undeniable that the thick concepts expressed in L need not be expressible with our language, which means that we may be unable to use our language to assert or deny what insiders say with their thick terms. Of course, it is possible for a sympathetic outsider, such as an anthropologist, to understand and speak L. But, according to Williams, the outsider cannot formulate an equivalent utterance in his own language because “the expressive powers of his own language are different from those of the native language precisely in the respect that the native language contains an ethical concept which his doesn’t” (1995: 239).

To explain Williams’ view further, we can borrow an example from Allan Gibbard (1992). Imagine that gopa is a positive thick concept expressible in L but not expressible in our language. Although a reflective outsider cannot assert that x is gopa in her own language, she can presumably reject the proposition that x is good, which involves a thin concept. And if the local’s thick concept gopa entails good, then the outsider could reject the insider’s statement as false by denying that x is good. So, it looks like the insider’s statement can be assessed as false from an outside perspective. However, Williams does not accept that the insider’s concept gopa entails good. A judgment involving a thin concept, such as good, “is essentially the product of reflection” which comes about “when someone stands back from the practices of the society and its use of the concepts and asks… whether these are good ways in which to assess actions…” (1985: 146). But this hyper-traditional society is unreflective, which means its members do not employ the thin concept good. So, according to Williams, there is no reason to assume that their concept gopa entails good, which means the outsider’s denial that x is good poses no clear threat to the truth of ‘x is gopa’.

On Williams’ view, if a person from the hyper-traditional society has knowledge that x is gopa, but later reflects and draws the conclusion that x is good, this reflection may unseat his previous knowledge by making it so that this person no longer possesses the traditional concept gopa (1995: 238). In this way, “rejection can destroy knowledge,” because the one who reflects may thereby cease to possess their traditional thick concepts (1985: 148).

Williams has here outlined a possibility in which utterances involving thick terms could be true in a way that is dependent upon perspective—in particular, the perspective of a person who speaks a certain ethical language, such as L. Opposition to Williams comes from at least two fronts.

First, McDowell (1998) and Hilary Putnam (1990) have both objected to Williams’ conception of science as providing a perspective-free account of the world. They hold that science is perspective-dependent. Although this objection would destroy Williams’ contrast between science and ethics, it would not mean that ethical truth is perspective-free, but only that science and ethics are both perspective-dependent, which leaves ethics in good company.

A second source of opposition is known as Thin Centralism—the view that thin concepts are conceptually prior to, and independent of, thick concepts. If good is conceptually prior to gopa, then the locals cannot grasp gopa without also grasping good. This would mean that Williams is wrong to claim that the locals may lose the concept gopa when they draw the reflective inference from x is gopa to x is good. Furthermore, the outsider who denies that x is good is required, by way of this inference, to deny that x is gopa. So, the truth or falsity of ‘x is gopa’ would not depend on what is knowable solely from the local’s perspective, contrary to Williams’ view. It may depend partly on whether x is good, which may be discernible from the outsider’s perspective.

Williams rejects Thin Centralism, though he does not give any arguments against it (1995: 234). He is plausibly a Thick Centralist, holding that thick concepts are conceptually prior to, and independent of, thin concepts. That is, one cannot grasp a thin concept without grasping some thick concept or other, but not vice versa. This view can be understood by way of a color analogy. The concept color is a very general concept that, according to Susan Hurley, cannot be understood independently of specific color concepts, such as red, green, etc. (1989: 16). And according to Thick Centralism, thin concepts like good cannot be grasped independently of specific thick concepts like courageous, kind, and so on.

It might be true that the grasp of color requires the grasp of some specific color concept (for example, red), but is the converse also true? Does the grasp of red require a grasp of color? If so, then the color analogy would actually support what is known as the No-Priority View—thick and thin concepts are conceptually interdependent, with neither being prior to the other (Dancy 2013). It is worth noting that the No-Priority View is not available to Williams, since this view would mean that the local’s grasp of gopa requires the grasp of a thin concept, and this presents the same problem that Thin Centralism presents for Williams’ view.

d. Thick Concepts in Normative Ethics

It is often urged that ethicists should focus less on thin concepts and should expand or shift attention towards the thick (Anscombe 1958; Williams 1985). As a result, there has been much attention paid to thick concepts within meta-ethics, primarily regarding the issues discussed above. Have thick concepts also played a substantive role in normative ethics? They have, to some extent. Normative ethics is partly concerned with the question of what kind of person one should be. And the virtue and vice concepts, which are paradigmatic thick concepts, have played a significant role in these discussions.

However, normative ethics is also concerned with the question of how one should act, and in this context it is common to focus on thin concepts, like right, wrong, and good. Of course, there are some thick concepts, such as just and equitable, that figure into these discussions, but it is not immediately clear why it would matter whether these concepts are thick, rather than thin or purely descriptive. There is at least one attempt at giving thick concepts a substantive role in a theory of how to act. This comes from Rosalind Hursthouse’s virtue theory of right action.

Virtue theory is sometimes criticized for being unable to provide a theory of right action. To be sure, the fact that virtues are character traits of persons does not mean that virtue concepts cannot be applied to actions: actions can also be honest, courageous, patient, and so on. The problem is that these characterizations of actions do not clearly tell us anything about rightness, and an inability to account for rightness would be a major flaw in a normative ethical theory.

Hursthouse meets this criticism by providing a theory of right action in terms of virtue. She holds that an action is right just in case it is what a virtuous agent would characteristically do in the circumstances (1999: 28). The virtuous agent is one who has the virtuous character traits and exercises them. And a virtue is a character trait that a human being needs to flourish or live well. These particular virtues must be enumerated, but the list typically includes paradigmatic thick terms, such as ‘courage’, ‘honesty’, ‘patience’, ‘generosity’, and so on. Hursthouse explicitly claims that the virtue terms are thick (1996: 27). Does it matter for her view whether the virtue terms are thick? Hursthouse’s theory faces an objection, and it is in response to this objection that it might matter.

The objection alleges that the virtue theory of right action cannot provide clear action-guidance, whereas rival normative theories, such as deontology and utilitarianism, can provide clear action-guidance by generating rules, such as “Don’t lie” or “Maximize happiness.” According to this objection, the virtue theory of right action can only generate a very unhelpful rule: “Do what a virtuous person would do.” This rule is not likely to provide action-guidance. If you are a fully virtuous person, you will already know what to do and so would not require the rule. If you are less than fully virtuous, you may have no idea what a virtuous person would do in the circumstances, especially if you don’t know of anyone who is fully virtuous (indeed, such a person might be purely hypothetical). So, according to this objection, the virtue theory of right action cannot provide action-guidance.

In response, Hursthouse points out that every virtue generates positive instruction on how to act—do what is honest, charitable, generous, and so on. And every vice generates a prohibition—do not do what is dishonest, uncharitable, mean, and so on (1999: 36). So, one can get action-guidance without reflecting on what a hypothetical virtuous agent would do in the circumstances. According to Hursthouse, “the agent may employ her concepts of the virtues and vices directly, rather than imagining what some hypothetical exemplar would do” (1991: 227). For example, the agent may reason “I must not tell this lie, since it would be dishonest.” And since dishonesty is a vice, which no virtuous person would have, this agent will be directed towards right action.

Thus, it’s important for Hursthouse’s view that the virtue concepts are at least action-guiding. After all, imagine that the virtue concepts were wholly descriptive concepts of character traits, like slow, calm, or quiet. These descriptive concepts would not generate any prohibitions or positive instruction.

Does it matter whether the virtue concepts are thick rather than thin concepts? Hursthouse does not speak directly to this question, though she does claim that, if we are unclear on what to do in a circumstance, we can seek advice from people who are morally better than ourselves (1999: 35). And, here, Williams’ point about helpful advisors might be useful. If the virtue concepts were thin, then on Williams’ view there would be no helpful advisor with regard to whether the virtue concepts apply. But such advice is possible if the virtue concepts are thick. In short, it is important for Hursthouse that the virtue concepts are action-guiding. And, if Williams is right, it may also matter whether the virtue concepts are thick.

One potential challenge to Hursthouse’s reply might contest the traditional list of virtues, and claim that there is no reason to think this list, when properly enumerated, will contain thick action-guiding concepts. For example, why should we think that courageous will be on the list of virtues rather than a similar concept that rarely generates positive instruction (for example, gutsy)? In considering this objection, readers are advised to consult Hursthouse’s approach to enumerating the virtues (1999: Ch. 8).

Another potential challenge may come from Thin Centralism. Suppose that right is conceptually prior to, and independent of, courageous. In this case, it might be argued that the positive instruction generated by courage (for example, “Do what is courageous”) is wholly due to the action-guidingness of right. The latter is precisely what we wanted to explain, which means that Hursthouse’s reply might be uninformative. However, Hursthouse does not account for particular virtue concepts in terms of right. Furthermore, even if Thin Centralism is true, it could still be claimed that some other thin concept, such as good, is conceptually prior to thick virtue concepts. So, Hursthouse’s account cannot be deemed uninformative merely on the basis of Thin Centralism.

3. How Do Thick Concepts Combine Evaluation and Description?

Thin Centralists typically accept Reductive Views of the thick, which aim to analyze the meanings of thick terms by citing more fundamental concepts (for example, thin concepts and descriptive concepts). Proponents of these Reductive Views often aim at escaping the Disentangling Argument. In particular, they aim to reject premise (1) of that argument by showing that Reductive Views can consistently claim that an outsider could not grasp the extension of a thick term. This strategy proceeds by providing different versions of the Reductive View, which shall be discussed below.

It is worth noting that Reductive Views are typically neutral on whether the weak or strong distinction ought to be accepted. They also tend to be neutral on whether cognitivism or non-cognitivism is true. To be sure, Reductivism is often associated with non-cognitivists, like Hare, but there are some traditional cognitivists, like Henry Sidgwick and G.E. Moore, who hold Reductive Views of the thick (Hurka 2011: 7).

Those who reject Thin Centralism and accept the Disentangling Argument normally accept Non-Reductive Views, holding that the meanings of thick terms are evaluative and descriptive in some sense, though they cannot be divided into distinct contents. The basic disagreement between Reductive and Non-Reductive Views concerns whether thick concepts are fundamental evaluative concepts or are complexes built up from more fundamental concepts (for example, thin concepts). These two approaches are compared in the following sections.

a. Reductive Views

In general, Reductive Views understand the meaning of a thick term as the combination of a descriptive content with an evaluative content. Different Reductive Views can be distinguished based on how they specify this general account. There are three main types of Reductive Views: (i) some views specify the sort of descriptive content within the analysis; (ii) some views specify the relation between evaluative and descriptive contents; and (iii) other views specify what the evaluative content is. There are also various ways of combining (i)-(iii).

Consider type (i) first. Daniel Elstein and Thomas Hurka provide two patterns of analysis that explain the descriptive content of a thick term in two different ways. On their first pattern of analysis, the descriptive content of a thick term is not fully specified within the meaning of the thick term. The meaning of the thick term may only specify that there are some good-making descriptive properties of a general type, without specifying exactly what these good-making properties are. For example, on their view, ‘x is just’ means ‘x is good, and there are properties XYZ (not specified) that distributions have as distributions, such that x has XYZ and XYZ make any distribution that has them good’. Elstein and Hurka hold that this kind of Reductive View is not a Strong Reductive View, because the thick concept does not have a fully specified descriptive content that determines the thick concept’s extension. Still, their view is available to non-cognitivists who accept the strong distinction. Most importantly, Elstein and Hurka believe their view allows non-cognitivists to claim that an outsider could not grasp the extension of ‘just’. Grasping that extension requires determining which properties of the general type are the good-making ones, and doing this requires evaluative judgments that the outsider is not equipped to make (2009: 521-2).

Elstein and Hurka’s second pattern of analysis involves an additional evaluation, which is embedded within the descriptive content. Many virtue and vice concepts are supposed to fit into this second pattern of analysis. For example, on their view, ‘an act x is courageous’ means roughly ‘x is good, and x involves an agent’s accepting risk of harm for himself for the sake of goods greater than the evil of that harm, where this property makes any act that has it good’ (2009: 527). The reference to goods is an embedded evaluation, and it is impossible to determine the extension of ‘courageous’ without determining what can count as goods—but determining this requires an evaluation which the outsider is not equipped to make (2009: 526).

Stephen Burton offers an account of type (ii) by clarifying the relationship between descriptive and evaluative contents of a thick concept. A simple way of expressing the relationship between a thick term’s evaluative and descriptive contents is as follows: ‘x is D and therefore x is E’, where D is a description and E is an evaluation that follows from that description. The trouble is that this simple formula entails that D is coextensive with the thick term itself, and this makes the simple formula vulnerable to the Disentangling Argument. So, Burton modifies the account so that the thick term is not coextensive with D. Burton proposes that a thick term’s meaning can be analyzed as follows: ‘x is E in virtue of some particular instance of D’. For example, ‘courageous’ means ‘(pro tanto) good in virtue of some particular instance of sticking to one’s guns despite great personal risk’. Here, the thick term only groups together those cases in which a thing is E in virtue of some particular instance of D. But D does not entail E, and so is not coextensive with the thick term. Thus, an outsider’s ability to track D will not be enough for her to track the insider’s use of the thick term. But what does it mean for E to depend upon a particular instance of D? For Burton, this means that E “depends on the various different characteristics and contexts” of D, and so D alone is not sufficient for E. Various different characteristics and contexts, which are not encoded in the meaning of the thick term, also need to obtain (1992: 31).

Now consider a view of type (iii). Most Reductive Views hold that thick concepts inherit their evaluative-ness from a constituent thin concept. However, Christine Tappolet proposes that they are instead evaluative on account of specific affective concepts, like admirable, pleasant, desirable, and amusing. These concepts are not thin concepts, but Tappolet holds that they are the basic evaluative constituents of thick concepts, like courageous and generous. For example, Tappolet’s analysis of courageous goes like this: ‘x is courageous’ means ‘x is D and x is admirable in virtue of this particular instance of D’, where D is a description. Essentially, Tappolet accepts Burton’s account of the relation between descriptive and evaluative contents, but modifies the account so that it incorporates affective concepts instead of thin concepts. In doing so, she parts company with other Reductivists by rejecting Thin Centralism. She rejects Thin Centralism because she holds that understanding a thin concept, such as good, requires an understanding of certain specific concepts such as pleasant and admirable (2004: 216).

An objection may arise: affective concepts are also thick concepts, but they do not fit into Tappolet’s analyses of thick concepts. This is because one affective concept, such as admirable, cannot be defined in terms of another, such as pleasant. How then should we account for these affective concepts? Tappolet’s answer is that affective concepts are to be treated differently from other thick concepts, like courageous. In particular, she treats positive affective concepts as determinates of the determinable good, and she holds that determinates cannot be analyzed in terms of their determinables. Roughly, the determinable/determinate relation is a relation of general concepts to more specific ones, where the general determinables are common to each specific determinate, but there is nothing distinguishing the determinates from each other except for the determinates themselves—for example, the only thing that distinguishes red from other colors is redness itself.

Edward Harcourt and Alan Thomas (2013) have pointed to a tension between Tappolet’s treatment of affective concepts and her treatment of other thick concepts. What reason is there to think courageous is analyzable but not admirable? Tappolet holds that admirable is unanalyzable because there is no way of stating the relevant descriptive content associated with admirable (2004: 217). In response, Harcourt and Thomas claim that this is just as much a problem for her analyses of other thick concepts. For example, it is far from clear what should be substituted for ‘D’ within Tappolet’s analysis of courageous. This objection leads Harcourt and Thomas to a Non-Reductive View, according to which all thick concepts are treated as determinates of thin concepts like good and bad (2013: 25-9).

One problem is that there is reason to think both parties to this dispute are mistaken in claiming that affective concepts cannot be analyzed. There is a simple Reductive account of the meaning of ‘admirable’, which is not represented by any of the above views—‘admirable’ just means ‘worthy of admiration’. Similar accounts can be given for other affective concepts. If this simple analysis is correct, then Tappolet, as well as Harcourt and Thomas, are mistaken about the unanalyzability of thick affective concepts.

Some of the analyses provided above may not withstand potential counterexamples. But it is worth pointing out that our inability to state an adequate analysis for a given thick term does not show that its meaning is unanalyzable. Analyses can only be attempted by using a language, and it is possible that our language’s vocabulary does not contain the expressions needed for providing an adequate analysis of the thick term’s meaning. Reductive Views are only committed to the view that the meanings of thick terms involve appropriately related evaluative and descriptive contents; they are not committed to there being any actual language that can express these contents in a way that counts as a satisfactory analysis.

What then is the point in providing these patterns of analysis? The point is to illustrate the general ways in which descriptive and evaluative contents can be combined within the meanings of thick terms. Typically, Reductive Views only commit to the possibility of there being a certain general type of analysis and do not commit to the particular details of their sample analyses (for example, Elstein and Hurka, 2009: 531).

Are there any advantages to Reductivism about the thick? According to Hurka, Reductivism allows cognitivists to explain the difference between virtues and their cognate vices (2011: 7). For example, both courage and foolhardiness involve a willingness to face risk for a cause. What then differentiates courage from foolhardiness? It is plausible that courage requires that the cause be good enough to justify the risk, whereas foolhardiness does not require this. This explanation appeals to a thin concept—good—that many Reductivists are perfectly willing to cite as a constituent of courage. However, there is nothing forbidding Non-Reductivists from also claiming that courage requires a good enough cause, provided they do not take this content to be a constituent of courage. So, this may be no clear advantage for Reductivism.

Another potential advantage is that Reductivism allows us to explain a wide variety of evaluative concepts by recognizing only a few basic ones, such as ought or good. Moreover, if a successful analysis can be achieved, then Non-Reductivists are committed to positing two meanings where Reductivists can posit only one. For example, if the meaning of ‘admirable’ can be analyzed with ‘worthy of admiration’, then Reductivists can claim that the meanings of these two expressions are identical, whereas Non-Reductivists must hold that these meanings are distinct. Lastly, Reductive Views can explain how a thick term is both evaluative and descriptive, since the evaluative-ness of a thick term’s meaning is inherited from a constituent content that is paradigmatically evaluative (for example, a thin concept); and the descriptiveness of its meaning is inherited from a constituent descriptive content. In the next section, we shall examine whether Non-Reductivists can provide a comparable explanation.

b. Non-Reductive Views

Non-Reductive Views hold that the meanings of thick terms are both descriptive and evaluative, although these features are not due to constituent contents within those meanings. In slogan form, thick concepts are irreducibly thick. For example, the thick term ‘brutal’ expresses a sui generis evaluative concept, which is not a combination of bad or wrong along with some descriptive content. As noted, Reductive Views explain how a thick term’s meaning can be both evaluative and descriptive by appealing to constituent contents. The challenge for Non-Reductive Views is to explain this without appealing to constituent contents.

This challenge must be framed with care, given that our notions of the descriptive and the evaluative are theoretically loaded. Non-Reductive theorists do not accept the strong distinction between description and evaluation, because they hold that thick terms are both evaluative and capable of picking out properties. The strong distinction precludes this possibility, unless the content of the thick term is built up from constituents, which Non-Reductivists reject. Non-Reductivists typically accept some version of the weak distinction, but the present challenge cannot be framed in terms of this distinction. On the weak distinction, the descriptive is identical to the non-evaluative. This means that Non-Reductivists are being asked to explain how the meanings of thick terms are both evaluative and not evaluative, which is plainly contradictory. How then are we to understand the challenge faced by Non-Reductive Views?

The challenge can be framed in a two-fold way: (I) Non-Reductive Views need to explain what the meanings of thick terms have in common with the meanings of thin terms—this would explain the evaluative-ness of the thick term’s meaning. And (II) they also need to explain what the meanings of thick terms have in common with the meanings of paradigmatic descriptive terms—this would explain the descriptiveness of the thick term’s meaning.

Starting with (I), Jonathan Dancy holds that both thick and thin terms express concepts that have “practical relevance,” a feature that descriptive concepts lack. To see what he means, consider how thick and thin concepts differ from descriptive concepts like water. The latter can make a practical difference in some circumstances: water may be something to seek when stranded in a desert. But in this case, we must explain the practical relevance of water by citing other properties in the particular situation, such as being thirsty, in a desert, and so forth. By contrast, there is nothing to be explained when a thick or thin concept makes a practical difference, since their practical relevance “is to be expected.” For example, it is expected that courage is something to aspire to and admire, and this does not require explanation by citing other concepts. Dancy expands upon this by claiming that competence with a thick concept requires not only an ability to determine when the concept applies, but also an ability to determine what practical relevance its application has in the circumstances. Competence with a descriptive concept requires only the former, not the latter (2013: 56).

At this point, Reductive theorists may emphasize a potential benefit of their view—they have a simple explanation for why competence with a thick concept requires an ability to determine its practical relevance. In particular, competence with a thick concept requires an ability to determine its practical relevance because its constituent thin concept is practically relevant. But Dancy and other Non-Reductivists cannot appeal to this explanation. How then can they explain the practical relevance of thick concepts?

Conceptual competence can surely be explained without appealing to constituent concepts; otherwise, competence with a simple concept would be inexplicable. One potential explanation, which does not appeal to constituent concepts, comes from Harcourt and Thomas (2013: 24-7). Harcourt and Thomas hold that thick concepts are related to good and bad analogously to how red is related to colored. On their view, colored is not a constituent of red, since there are no other concepts that can be combined with colored to yield red. Instead, red is a determinate of the determinable colored. Similarly, the thin concept bad is not a constituent of the thick concept brutal—according to Harcourt and Thomas, there is no other concept that can be combined with bad to yield brutal. Instead, brutal is a determinate of the determinable bad. Moreover, given that brutal is a determinate of bad, it can be claimed that the practical relevance of brutal is inherited from the practical relevance of bad, even though the latter is not a constituent of the former.

Debbie Roberts provides another explanation of what the meanings of thick terms have in common with thin terms. Many ethicists claim that thick and thin terms express and induce attitudes, or condemn, commend, and instruct. Roberts takes a different approach. On her view, a concept is evaluative in virtue of ascribing an evaluative property. A concept ascribes a property if and only if the real definition of the property it refers to is given by the content of that concept. What then is an evaluative property? According to Roberts, a property P is evaluative if (i) P is intrinsically linked to human concerns and purposes; (ii) there are various lower-level properties that can each make it the case that P is instantiated (that is, P is multiply-realizable); but (iii) these lower-level properties do not necessitate that P is instantiated (that is, other features must also obtain). Roberts holds that both thick and thin concepts ascribe properties that satisfy (i)-(iii) (2013).

One potential problem is that there might be some paradigmatically descriptive properties that satisfy (i)-(iii).  Consider a particular mental state with moral content, such as the belief that lying is wrong.  The property of being in this state is intrinsically linked to human concerns and purposes, since it is a moral belief.  And if belief-states are multiply realizable, then this property will satisfy (ii) as well.  And finally, if there are lower-level brain states that make it the case that someone has this belief, without necessitating it, then (iii) will be satisfied as well.  Thus, certain mental properties may satisfy (i)-(iii), even though they seem descriptive. Roberts could reply by holding that the above-mentioned moral belief is not linked to human concerns and purposes in the right sort of way.

Turning to (II): What do the meanings of thick terms have in common with paradigmatic descriptive terms? Recall that a key point about paradigmatic descriptive terms is that these terms are capable of representing properties. Non-Reductive theorists can point out that thick terms also seem capable of representing properties. This, in fact, was the fundamental motivation for focusing on thick terms to begin with. And nearly all ethicists (except for Ayer) would agree that this is true. It plainly seems true that ‘courage’ is capable of picking out a property, and in this way ‘courage’ shares something in common with paradigmatic descriptive terms like ‘red’ and ‘water’.

Another key point about descriptive terms is that they are intuitively different from thin terms like ‘wrong’ and ‘good’. Indeed, a central motivation for classifying terms as descriptive is to exclude thin terms like ‘good’ and ‘wrong’ from paradigmatically descriptive expressions. How then do thick terms share this feature with the descriptive—that of being different from thin terms? There are two general answers that Non-Reductivists provide. On one approach, thick and thin differ in kind. On the other, thick and thin differ only in degree but not in kind. These general approaches are discussed in the next section. Reductivist theories are also discussed under each approach.

4. How Do Thick and Thin Differ?

a. In Kind: Williams’ View

Williams is a Non-Reductive theorist who holds that thick and thin differ in kind. On his view, thick terms are both world-guided and action-guiding. For Williams, a world-guided term is one whose usage is “controlled by the facts”—that is, there are conditions for its correct application and competent users can largely agree that it does or does not apply in new situations. An action-guiding term is one that is “characteristically related to reasons for action” (1985: 140-1). For Williams, thick terms are both world-guided and action-guiding, whereas thin terms are action-guiding but “do not display world-guidedness” (1985: 152).

There are some potential problems for Williams’ distinction. First, Williams’ claim that thin terms “do not display world-guidedness” seems to commit him to something controversial—namely, that non-cognitivism is true of thin terms. Other Non-Reductivists accept Williams’ characterization of thick terms, but hold that thin terms are also world-guided and action-guiding (Dancy 2013: 56). If they are right, then Williams’ distinction between thick and thin is compromised.

Nevertheless, there is a straightforward way of distinguishing between thick and thin, which does not assume non-cognitivism about thin terms. On this view, thin terms express wholly evaluative concepts, whereas thick terms express concepts that are partly evaluative and partly descriptive. This straightforward distinction gives us a difference in kind between thick and thin. The trouble is that it too appears to be theoretically loaded (much like Williams’ distinction). This straightforward distinction presupposes a Reductive View, since it holds that thick concepts are built up from evaluative and descriptive components. Another potential problem is that it is not clear whether thin concepts are wholly evaluative. For example, it looks as though the thin concept ought implies the descriptive concept can, assuming the ought-implies-can principle (Väyrynen 2013: 7). Of course, as Dancy points out, the mere fact that one concept entails another does not mean that the latter is a constituent of the former—cow entails not-a-horse, but neither is a constituent of the other (2013: 49).

A second potential problem for Williams’ view, and the straightforward view just mentioned, comes from Samuel Scheffler. Scheffler points out that there are many evaluative terms that are hard to classify as either thick or thin. Consider ‘just’, ‘fair’, ‘impartial’, ‘rights’, ‘autonomy’, and ‘consent’. Upon reflecting on such concepts, Scheffler suggests that world-guidedness is a matter of degree and that a division of ethical concepts into thick and thin is a “considerable oversimplification” (1987: 417-8).

In a later essay, Williams replies to Scheffler by agreeing that thickness comes in degrees, and that “there is an important class of concepts that lie between the thick and the thin” (1995: 234). This reply, however, does not entail that Williams must reject his earlier view.  Assume that thickness and thinness each come in degrees, and that thick and thin do not exhaust all evaluative concepts. These two claims do not entail that the difference between thick and thin is merely a matter of degree. This can be seen via analogy: belief that P and disbelief that P are exclusive categories that each come in degree, and which do not exhaust all doxastic states since suspension of judgment is also possible. But the difference between belief that P and disbelief that P is not merely a matter of degree. These states are different in kind, assuming the former is about the affirmative proposition P while the latter is about the negation ¬P. Similarly, thick and thin could also differ in kind, even if they are exclusive degree categories that do not exhaust all evaluative concepts. Thus, Scheffler’s considerations and Williams’ concessions do not entail that Williams’ earlier view is false.

b. Only in Degree: The Continuum View

Still, many theorists have seized upon Scheffler’s point and have claimed that thick and thin differ only in degree, not in kind. Some consider this to be the standard view (Väyrynen 2008: 391). On this view, thin and thick lie on opposite ends of a continuum of evaluative concepts, with no sharp dividing line between them. For example, good and bad might lie on one end of the continuum, with kind, compassionate, and cruel on the other end. There are at least two gradable notions that can serve to distinguish the ends of this continuum—degrees of specificity or amounts of descriptive content. Greater specificity, or greater amounts of descriptive content, provides a thicker concept with a narrower range of application. Non-Reductive theorists typically focus on the greater specificity of thick terms. Reductive theorists can choose either path; indeed, they can explain the greater specificity of a thick concept in terms of how much descriptive content it has as a constituent. In general, a concept must have enough specificity, or enough descriptive content, for it to reside on the thicker end of the continuum.

Support for the continuum view may come from several considerations. First, consider that some thin concepts have narrower ranges of application than other thin concepts. For example, good can apply to actions, people, food, cars, and so on, whereas right cannot apply to all these things. This may suggest that there are degrees of thinness. Second, as already noted, some thin concepts have descriptive entailments—for instance, the thin concept ought entails the descriptive concept can. Even if can is not a constituent of ought, this entailment at least narrows down the range of application for ought, which could bring it closer to the thick end of the spectrum, even if it is still fairly thin. Thirdly, there seems to be a vague area between thick and thin—for example, it is not clear whether just has enough specificity or enough descriptive content for it to count as thick, but it is also hard to classify this concept as thin. So, perhaps just is a borderline case between thick and thin.

Given these considerations, one may be tempted to hold that thick and thin do not differ in kind. But the above considerations do not strictly entail this. Again, analogous considerations hold for both belief and disbelief—some beliefs have narrower ranges of application than other beliefs (for example, rabbits cannot have complex mathematical beliefs though they can have perceptual beliefs). There is also a vague area between belief and disbelief, yet these two doxastic states differ in kind. So, these considerations only seem to support the Continuum View if there is no way of drawing a distinction in kind. But Hare has provided a distinction in kind that has largely escaped notice.

c. In Kind: Hare’s View

Hare is a Reductivist who holds that thick and thin are distinct in kind, not merely in degree. He holds that thick terms have both descriptive and evaluative meanings associated with them. Interestingly, Hare holds that this is also true of thin terms. Thus, for Hare, thin terms are not wholly evaluative, contrary to the straightforward view mentioned in 4a.

What then is the difference between thick and thin? The difference has to do with the relationship that the two meanings bear to the term in question. A thin term is one whose evaluative meaning is “more firmly attached” to it than its descriptive meaning. And a thick term is one whose descriptive meaning is “more firmly attached” than its evaluative meaning (1963: 24-5). Although Hare agrees that being firmly attached is “only a matter of probability and degree” (1989: 125), this does not mean that the distinction between thick and thin is only a matter of degree. Indeed, Hare’s phrase “more firmly attached” actually marks out a difference in kind. Consider an analogy: a child who is more firmly attached to her mother than to her father is different in kind from a child who is more firmly attached to her father than to her mother. Both are different in kind from a child who is equally attached to both parents. So, the language that Hare uses actually suggests three possible categories of evaluative terms—thick, thin, and neither. Although Hare never mentions the third category, it is at least a potential category for Scheffler’s examples of the neither thick nor thin.

What does Hare mean by “more firmly attached”? For Hare, the more firmly attached meaning is the one that is less likely to change when language users alter their usage of the term. For example, it is less likely that ‘right’ will eventually be used to evaluate actions negatively (or neutrally) than that it will be used to describe lying, promise-breaking, killing, torture, and so forth. The reason is that, if we start using ‘right’ to evaluate actions negatively (or neutrally), there is a great chance that we will be misunderstood or accused of misusing the word. In this sense, the evaluative meaning of ‘right’ is more firmly attached than its descriptive meaning. But just the opposite is the case for thick terms like ‘generous’. If we start using ‘generous’ to evaluate actions negatively, we will not be misunderstood (for example, Ebenezer Scrooge could use ‘generous’ negatively and we would still understand him). Yet, if we started using ‘generous’ to describe selfish acts, for example, then we will be misunderstood or accused of misusing the term. In this sense, the descriptive meaning of ‘generous’ is more firmly attached than its evaluative meaning (1989: 125).

Hare frames his distinction in terms of descriptive and evaluative meanings, which assumes a Reductive View. But his distinction and thought experiment can be formulated without assuming a Reductive View. Rather than talking about descriptive and evaluative meanings, we could instead speak of two different speech acts—describing and evaluating—that are commonly performed through ordinary uses of the terms. Hare’s thought experiment can be formulated by changing the speech acts that we typically perform with the term. For example, although we ordinarily use ‘generous’ to perform a speech act of positive evaluation, a speaker who uses it to evaluate negatively would still be understood.

At the outset it was said that thick concepts are evaluative concepts that are substantially descriptive. Thin concepts, by contrast, are not substantially descriptive. Exactly what ‘substantially descriptive’ means can now be clarified, depending on which of the above three views is accepted. On Williams’ view, being substantially descriptive is a matter of being world-guided. On the Continuum View, being substantially descriptive is a matter of having enough specificity or enough descriptive content. On Hare’s view, being substantially descriptive is a matter of having a descriptive meaning that is more firmly attached than its evaluative meaning.

5. Are Thick Terms Truth-Conditionally Evaluative?

The putative significance of the thick depends upon a crucial assumption about how thick terms are evaluative. Several of the arguments and hypotheses discussed in 2.a-c assume that thick terms are evaluative as a matter of truth-conditions—that is, the conditions that must obtain for utterances involving thick terms to express true propositions.

To see how this assumption is made, first recall Foot’s argument. If ‘x is rude’ were not evaluative as a matter of truth-conditions, then its truth would not require anything evaluative, and there would not be anything evaluative following from the purely descriptive claim that x causes offense by indicating lack of respect. Hare’s response, that the evaluation of ‘rude’ is detachable, is a denial of the assumption that ‘rude’ is evaluative in its truth-conditions. Next, consider premise (2) of McDowell’s Disentangling Argument. It is often assumed that the only reason an outsider could not master the extension of ‘chaste’ must be that the truth-conditions associated with ‘chaste’ incorporate something evaluative, which the outsider cannot track. Moreover, the shapelessness hypothesis states that the extensions of thick terms are only unified by evaluative similarity relations. This suggests that something evaluative must obtain for utterances involving thick terms to express true propositions.

Nevertheless, it is controversial that thick terms are evaluative as a matter of truth-conditions. Generally, ethicists agree that thick terms are somehow associated with evaluative contents, but not all agree that these contents are part of the truth-conditions of utterances involving thick terms. How else can a thick term be associated with evaluative content, if not by way of truth-conditions?

Our use of language can communicate lots of information that is not part of the truth-conditions of what we say. In each of the following cases, a speaker B communicates a proposition that is not part of the truth-conditions of B’s utterance. In this first example, the proposition is communicated by way of presupposition:

B: “I don’t regret going to the party.”

Presupposition: that B went to the party.

Plausibly, B’s utterance could express a true proposition even if its presupposition is false—one way to have no regrets about going to a party is by simply not going. This presupposition can plausibly be a part of the background of the conversation at hand, but not part of the truth-conditions of B’s utterance.

Now consider a slightly different example, involving a phone conversation between A and B. In this case, B communicates a proposition by way of conversational implicature:

A: “Is Bob there?”

B: “He’s in the shower.”

Conversational Implicature: that Bob cannot talk on the phone right now.

This proposition is not part of the truth-conditions of B’s utterance—it is obviously possible that Bob can talk on the phone while in the shower. Instead, this proposition is inferred from B’s utterance by relying on conversational maxims and observations from context (for example, that A and B are having a phone conversation, and that B would not provide irrelevant information about Bob’s showering unless he is trying to convey that Bob cannot talk).

Now consider a third example, where B communicates a proposition by way of conventional implicature:

B: “Sue is British but brave.”

Conventional Implicature: that Sue’s bravery is unexpected given that she is British.

The proposition communicated in this example is not part of the truth-conditions of B’s utterance. One way to see this is by comparing B’s utterance with “Sue is British and brave.” These two utterances would seem to be true in all the same circumstances. But the latter does not communicate the implicature in question. This implicature is detachable, in the sense that a truth-conditionally equivalent statement need not have the implicature in question. Although Hare does not mention conventional implicature, his view about the detachability of a thick term’s evaluation could be explained in terms of conventional implicature.

In short, there are many ways to communicate information without it being part of the truth-conditions of the utterance. There are three widely-discussed pragmatic mechanisms—presupposition, conversational implicature, and conventional implicature—and there are others as well. Some hold that utterances involving thick terms do not convey evaluations that are part of their truth-conditions, but instead convey them via some pragmatic mechanism. This view is known as the Pragmatic View. It should be noted that some proponents of the Pragmatic View write as though their view entails that there are no thick concepts (Blackburn 1992). In other words, if the only evaluation associated with courage is pragmatically associated with it, then these philosophers will say that courage is not really a thick concept. Still, others who accept the Pragmatic View are happy to talk of these concepts as thick (Väyrynen 2013). This article does so as well.

The traditional view, however, is that thick terms are evaluative as a matter of their truth-conditions—this view is known as the Semantic View. The following two sections discuss the Pragmatic View and the Semantic View, respectively.

a. Pragmatic View

Sometimes the Pragmatic View is supported by the idea that thick terms are variable in what evaluations they express. Typically, a given thick term conveys a particular evaluation that is either positive or negative, but not both. It is natural to assume that the term conveys this evaluation (whichever it is) in all assertive contexts. But it turns out that many paradigmatic thick terms can be used to evaluate something negatively in some contexts while positively in others. There are two ways of illustrating this variability.

The first involves combining a thick term with comparative constructions, such as ‘too’ or ‘not…enough’. For example, ‘lewd’ is typically negative, but it appears to convey something positive in the following quote: “this year’s carnival was not lewd enough” (Blackburn 1992: 296). Similarly, ‘tidy’ is typically positive, but can be used to convey something negative if one is criticized as “too tidy” (Hare 1952: 121). These considerations may be taken to show that thick terms are not evaluative as a matter of truth-conditions—if these thick terms have an evaluation as part of their truth-conditions, one might think ‘not lewd enough’ and ‘too tidy’ should be semantically awkward, but they are not.

Opponents of this argument may claim that the atypical evaluation in each case can be explained solely by reference to ‘too’ and ‘not… enough’, without claiming that ‘lewd’ and ‘tidy’ express an atypical evaluation. Consider that ‘too F’ and ‘not F enough’ are evaluative even when F is a wholly non-evaluative expression—for example, one might say that a color sample is too red, which seems to characterize the sample negatively. Here, the negative evaluation is due solely to ‘too’, and so it should be no surprise if this word generates a negative evaluation when combined with ‘tidy’. Furthermore, there are contexts where F is clearly seen as a positive quality even when it is combined with ‘too’. Borrowing an example from Väyrynen, a military commander could count a soldier as too courageous to waste on a simple mission, and instead select him for a more formidable mission where his courage would be needed (2011: 7). Thus, it appears the atypical evaluation can be attributed to the modifiers ‘too’ and ‘not…enough’ rather than the thick term itself.

The second sort of example pertains to utterances that convey an atypical evaluation without employing a comparative construction. For example, even though ‘cruel’ typically conveys a negative evaluation, it might be that the cruelty of an action was “just what made it such fun” (Hare 1981: 73). Or, even though ‘frugal’ typically conveys a positive evaluation, a person could be condemned as frugal if his “main job is dispensing hospitality” (Blackburn 1992: 286).

A worry associated with examples of this second sort is that the atypical evaluation can be explained in ways that are consistent with the Semantic View. For instance, it might be claimed that the examples involve non-literal uses of the thick term, or that they only convey the alternative evaluation by way of speaker meaning, and not word meaning. Alternatively, one could hold that thick terms are context-sensitive, and that there are several different evaluations conveyed by the thick term depending on the context of utterance. In this case, the thick term would be evaluative as a matter of truth-conditions—it is just that those truth-conditions incorporate different evaluations in different contexts (Väyrynen 2011: 8-14).

Another argument for the Pragmatic View comes from Pekka Väyrynen (2013), who focuses on objectionable thick terms. Recall that objectionable thick terms embody values that ought to be rejected. Potential examples include ‘lewd’, ‘perverse’, ‘blasphemous’, and ‘chaste’. The last of these terms seems to embody the view that a certain kind of sexual restraint is praiseworthy. Those who reject this view regard ‘chaste’ as objectionable—one can refer to such individuals as chastity-objectors. Chastity-objectors tend to exhibit interesting linguistic behavior. They would obviously be reluctant to assert that, say, John is chaste; but they are also reluctant to utter non-affirmative sentences like the following:

(a) John is not chaste.

(b) Is John chaste?

(c) Possibly, John is chaste.

(d) If John is chaste, then so is Mary.

None of these utterances imply that the truth-conditions of ‘chaste’ are satisfied. So, if chastity-objectors are reluctant to utter (a-d), this leads us to expect that the evaluation projects outside of the truth-conditions of ‘chaste’. It is worth noting that this argument is not restricted to examples like ‘chaste’, ‘perverse’, ‘lewd’, and ‘blasphemous’. According to Väyrynen, virtually any thick term could be regarded as objectionable, at least in principle, which means that his argument should extend even to examples like ‘courageous’ and ‘murder’.

In addition to arguing against the Semantic View, proponents of the Pragmatic View need to explain what pragmatic mechanism is responsible for the evaluations of thick terms. The three pragmatic mechanisms cited above can provide potential explanations, but Väyrynen rejects these explanations in favor of an alternative view. He proposes that the evaluative implications of paradigmatic thick terms are “not-at-issue” in normal contexts. Roughly, an implication is at-issue if it is part of the main point of the conversation at hand, and it is not-at-issue if it is part of the background (2013: ch. 5). Väyrynen takes this pragmatic view to be “superior to its rivals by standard methodological principles” from the philosophy of language and linguistics (2013: 10).

Proponents of the Semantic View can provide at least two lines of response to Väyrynen’s argument involving objectionable thick terms. For the first, it is important to note that Väyrynen explains the objector’s reluctance by holding that (a-d) all project the same evaluation beyond the truth-conditions of ‘chaste’. But one might hold that there is no single evaluation projected by all of (a-d)—instead, there are at least two different claims implied throughout (a-d). For example, just as ‘not happy’ conversationally implicates ‘unhappy’, it is equally plausible that ‘not chaste’ conversationally implicates ‘unchaste’. Since chastity-objectors clearly do not want to imply that John is unchaste, they are reluctant to assert (a). Moreover, (b-d) conversationally imply (or assert) that John might be chaste. If chastity-objectors believe it is impossible for anyone to be chaste, then they will be reluctant to assert (b-d). This piecemeal approach calls into doubt Väyrynen’s claim that a single evaluation projects beyond the truth-conditions of ‘chaste’. Moreover, these ways of explaining the reluctance of chastity-objectors are perfectly consistent with the Semantic View (Kyle 2013a: 13-19).

For a second response, it can be pointed out that even Väyrynen’s preferred explanation is consistent with the Semantic View (Kyle 2015). The mere fact that an evaluation projects outside of the truth-conditions of ‘chaste’ does not entail that there is no evaluation within those truth-conditions. For example, it is possible that ‘John is chaste’ conveys two evaluations, one that is part of its truth-conditions, and another that projects outside of them. This possibility can be illustrated with the affirmative sentence ‘It is good that Sue is moral’, which has an evaluative content within its truth-conditions and also projects one outside those truth-conditions. Consider the corresponding non-affirmative sentences:

(a′) It is not good that Sue is moral.

(b′) Is it good that Sue is moral?

(c′) Possibly, it’s good that Sue is moral.

(d′) If it is good that Sue is moral, then we should applaud her.

An evaluative content—that Sue is moral—is implied by each of (a′-d′), as well as the affirmative sentence. So this evaluation projects much like the evaluation that Väyrynen thinks is projected by ‘chaste’. But none of this precludes the affirmative statement from having a different evaluation as part of its truth-conditions, namely the evaluation associated with ‘good’. Of course, this doubling of evaluation will have no purchase unless there is reason to think there is an evaluation within the truth-conditions of ‘chaste.’

b. Semantic View

What reason is there to think the evaluations of thick terms might be part of their truth-conditions? One potential reason stems from considering additional linguistic data. Notice that the following claim seems highly awkward:

(e) Sue is generous and not good in any way.

Similar statements can be provided using negative thick terms and ‘not bad in any way’. The Semantic View provides a straightforward explanation of the awkwardness of (e). This view can claim that (e) is a contradiction, assuming goodness-in-a-way is a part of the truth-conditions associated with ‘generous’ (Kyle 2013a). This, of course, is only one potential explanation—there may be other ways of explaining the awkwardness of (e), for example, by claiming that ‘generous’ presupposes or conventionally implicates an evaluation that incorporates goodness-in-a-way. Just as before, the issue must be decided by figuring out which view is the best explanation of this and other linguistic data. The matter is up for debate.

Still, one might object that the Semantic View is ill-suited to explain the oddity of (e), since this view mistakenly predicts that (e) would sound odd in every context, yet there are some unusual contexts in which (e) would not sound odd. For example, imagine Ebenezer Scrooge uttering (e) in a context where generosity is seen as a bad thing. (e) might not seem awkward in this context. However, it is a mistake to think that the Semantic View predicts that (e) is awkward in all contexts. Consider that its second conjunct involves a quantifier expression—‘any’—and quantifiers are notoriously context-sensitive. In one context, it might be true to say ‘O.J. Simpson is not good in any way’; but in other contexts, where being a good athlete is relevant to discussion, an utterance of the same sentence could be false. Similarly, the second conjunct in (e) is true or false relative to context. And it is only in contexts where generosity is a relevant way of being good that (e) should sound contradictory. In contexts where generosity is not a relevant way of being good—such as Scrooge’s context—(e) should not sound contradictory. In those contexts, the first part of (e) could be true while the second part is false (Kyle 2013a).

It is worth emphasizing that the linguistic data about thick terms does not strictly entail either the Semantic View or the Pragmatic View. Rather, the proponents of such views only claim that their respective view is part of the best explanation of a wide body of linguistic data involving thick terms. This matter has only been explored in recent years, with proponents on each side (Kyle 2013a; Väyrynen 2013).

Another way of supporting the Semantic View stems from the shapelessness hypothesis, the hypothesis that the extensions of thick terms are unified only by evaluative similarity relations. If the shapelessness hypothesis is true, then the truth-conditions associated with thick terms must be at least partly evaluative. But what reason is there to accept the shapelessness hypothesis?

The main support for shapelessness comes from the idea that thick terms seem to “outrun” any descriptive characterizations we can give to the items in their extensions (Kirchin 2010). Consider the various types of action that can be considered kind—shoveling snow for a neighbor, giving chocolate to a child, adopting a stray cat, standing up to someone’s bully, paying a compliment to a friend, and so on. Furthermore, there are some actions that would be considered kind in some circumstances even though the opposite action would be considered kind in other circumstances—for example, telling the truth is sometimes kind but so is telling a white lie. Can these various actions be descriptively classified in a way that allows us to correctly characterize kind actions in new cases? The descriptive classification might be a long disjunction of unrelated features, a shapeless classification that would be unhelpful in confronting new cases.

One way of opposing this argument is to show that the various actions mentioned above can be unified under a shapely descriptive classification. For example, each of the above actions seems to benefit others by treating them as ends in themselves. This shapely classification helps us classify at least some new cases of kind action (for example, giving food to a homeless person). And benefit can be understood in purely descriptive terms—for example, as the increasing of happiness. So, it is not obvious that shapelessness is actually supported by the outrunning data given above, although other data could perhaps be provided.

Suppose that thick terms do outrun our ability to provide wholly descriptive characterizations of their extensions. Väyrynen argues that this does not support the Semantic View, because our inability to give a descriptive classification for a thick term can be explained even if the Semantic View is false. Consider that, for some terms T, the extension of T cannot be unified under relations that are expressible in independently intelligible T-free terms (that is, terms that can be understood independently of T). For example, it might be that the extension of ‘pain’ across all sentient animals cannot be unified without employing the word ‘pain’ itself. Now return to the question of what explains our inability to give a wholly descriptive characterization of the extension of ‘kind’. It might be that the only way to characterize its extension is by employing the word ‘kind’ itself. But this would not be a descriptive classification, even if the Pragmatic View were true of ‘kind’; it is just that the evaluativeness of ‘kind’ would then be explained by a pragmatic mechanism, rather than its truth-conditions. So, if ‘kind’ is a term like T, then one could not give a wholly descriptive classification of the extension of ‘kind’, even if the Semantic View is false (2013: 193-201).

6. Broader Applications

Thick concepts have been of interest primarily to ethicists, although they have also entered discussions in other areas, such as aesthetics, metaphysics, philosophy of law, moral psychology, and epistemology. This section focuses mainly on the recent discussions of thick concepts in epistemology, since these have been the most extensive (outside of ethics). But the discussions in the first four areas are briefly summarized first.

In aesthetics, there is much discussion about thick aesthetic concepts, like gaudy, elegant, delicate, and brilliant. However, many of these discussions are centered on the question of whether there are any thick aesthetic concepts at all. In this context, it is often assumed that an aesthetic concept is not thick if it is only pragmatically associated with evaluative content (recall that this assumption is sometimes made in ethics as well). So, the discussions over whether there are any thick aesthetic concepts often mirror the discussions in ethics on whether thick concepts are only pragmatically evaluative, or whether they are evaluative as a matter of truth-conditions (Bronzon 2009; Zangwill 2001). It is worth noting that Burton’s Reductive View (discussed in 3a) is explicitly aimed at accounting for thick aesthetic concepts, as well as ethical ones.

In metaphysics, Gideon Yaffe criticizes two competing views on the nature of freedom of will—one that equates freedom of will with self-expression and one that equates it with self-transcendence. Yaffe then holds that the debates between these approaches have proceeded from the (at least implicit) assumption that freedom of will is a descriptive concept. He argues that there are facts about freedom of will that are best explained if freedom of will is instead assumed to be thick. According to Yaffe, the descriptive content of this concept would correspond to the features that make the agent either self-expressive or self-transcendent. But, according to Yaffe, this is not enough to account for freedom of will. We must also determine whether “the agent has choices that come about through worthwhile processes, processes possessing a certain kind of value” (2000: 219-20). And it is this kind of value that corresponds to the evaluative content of freedom of will.

In the philosophy of law David Enoch and Kevin Toh (2013) point out that legal statements often straddle the divide between the descriptive and the evaluative. They put forth the hypothesis that many legal statements express thick concepts. Potential examples of such thick concepts may include crime, constitutional, inheritance, and infringement, though Enoch and Toh focus primarily on the concept legal, which they argue is a thick concept. The descriptive content of legal consists in its representation of certain social facts, and its evaluative content is a kind of endorsement. They do not assert that the thickness of legal can by itself settle debates over the nature of law. But they hold that its classification as thick can situate these debates within a broader philosophical context analogous to that of ethics, and can introduce new options for thinking about the nature of law.

Thick concepts also play a role in Gabriel Abend’s critique of current moral psychology and neuroscience. Abend argues that moral psychologists and neuroscientists unwarrantedly restrict their research to thin ethical concepts and ignore thick ones. In particular, these scientists attempt to understand the psychological or neural bases for moral judgments, and they do so by testing various subjects’ judgments involving thin concepts like right and permissible. But judgments involving thick ethical concepts, like cruel and courageous, have scarcely been featured in these experiments. This is no small oversight, given “that thick concepts appear in some or much of people’s moral lives” (2011: 150). Abend also argues that this problem cannot be fixed merely by expanding psychological and neurological research to include thick ethical judgments, because thick concepts “challenge the conception of a hardwired and universal moral capacity in a way that thin concepts do not” (2011: 145-6). In advancing this last point, Abend relies on the Disentangling Argument and the shapelessness hypothesis, as well as the claim that thick concepts presuppose institutional and cultural facts that do not hold universally.

Outside of ethics, the most extensive discussions on thick concepts occur in epistemology. In 2008, a special issue of the journal Philosophical Papers (vol. 37 no. 3) was devoted to thick concepts in epistemology. Examples of thick epistemic concepts include concepts like intellectual curiosity, truthfulness, open-mindedness, and dogmatic; these are contrasted with thin epistemic concepts, which are typically illustrated with concepts like justification, rationality, and knowledge. The editors of this issue hold that “traditional epistemology has tended towards using the thin concepts in theorizing,” but “these thin epistemic concepts are far less prevalent in everyday discourse than the thick epistemic” (2008: 342). The overarching question of this collection is whether epistemology would benefit from substantive investigations of thick epistemic concepts.

One way of addressing this question is simply to do a substantive investigation of a particular thick epistemic concept, and to show that epistemological theories are enhanced by the investigation. Two contributors to the Philosophical Papers collection take this approach. Catherine Elgin focuses on the concept trustworthy. She argues that trustworthiness does not reduce to justified or reliable true belief, but can help to explain why justified or reliable true beliefs are valuable (2008: 371-87). Harvey Siegel considers whether education is an epistemic virtue concept, and whether it makes sense to classify it as thick. Siegel is skeptical about the helpfulness of the thick/thin distinction, but, to the extent that this distinction is viable, he maintains that education is “more thick than thin.” He also seeks to clarify the relationship between education and virtue epistemology (2008: 467).

The other contributors focus on general issues concerning thick epistemic concepts. Heather Battaly’s contribution seeks to address an objection advanced by Simon Blackburn against Non-Reductive Views. According to Blackburn, Non-Reductive Views mistakenly imply that the differences in how we respond to, say, lewdness would not count as genuine disagreements, because the disputants would be employing different concepts and therefore talking past one another. For example, a person who thinks the carnival was not lewd enough might be employing a different concept from someone who disvalues lewdness, because there would be “no detachable description, no ‘semantic anchor,’ that they can share” (1992: 297-99). (This is an expanded version of the variability objection discussed in section 5a). Battaly responds by arguing that certain thick epistemic concepts, such as open-minded, are subject to combinatorial vagueness—these concepts have several independent conditions of application, but there is no sharp distinction between the conditions that are necessary or sufficient and those conditions that are neither. Battaly holds that disputants can share the same vague concept, while disagreeing over which conditions are necessary and/or sufficient. Battaly maintains that this allows them to have genuine disagreements about whether the concept refers to an epistemic virtue. According to Battaly, this at least shows that virtue epistemologists can disagree about the epistemic virtues without talking past one another (2008: 435-54).

Väyrynen’s contribution focuses on whether thick and thin epistemic concepts can be distinguished in ways comparable to thick and thin ethical concepts, and on whether a focus on thick epistemic concepts can lead to a preferable epistemology. Regarding the first issue, he argues that the way thick and thin concepts are typically distinguished in ethics provides no straightforward distinction between thick and thin epistemic concepts (2008: 390-95). Regarding the second, he argues that neither semantics nor substantive epistemological theory provides a basis for assigning thick epistemic concepts theoretical priority over thin epistemic concepts (2008: 395-408). Väyrynen concludes by claiming that we so far lack good reasons for taking a theoretical turn to a thick epistemology.

Despite Väyrynen’s final conclusion, the Philosophical Papers collection contains an explicit defense of the view that epistemology should expand its focus to thick epistemic concepts. Bernard Williams pushed for a similar expansion in the ethical sphere. And the comparison here brings up an important question, which is the main question of Alan Thomas’ contribution: Can Williams’ treatment of thick ethical concepts be applied analogously in the epistemic sphere? Recall that Williams holds that utterances involving thick ethical concepts cannot be objectively true. Thus, if epistemologists want to model their theory of thick epistemic concepts after Williams’ view in ethics, it appears they will not be able to claim that there are objectively true claims involving thick epistemic concepts. Thomas, however, points out that Williams’ non-objectivism in ethics “is based on the assumption that there are a variety of social worlds, structured by plural sets of thick ethical concepts.” But Williams’ view of thick epistemic concepts, such as truthfulness, allows for the possibility of “only one epistemic world.” According to Thomas, truthfulness is “such a central need of human life that it can be abstractly modeled in a way that… [is] culturally invariant” (2008: 368).

Guy Axtell and J. Adam Carter focus on outlining a positive account for how thick epistemic concepts could play a central role in epistemological theory. The account begins by claiming that epistemic value should be a central focus in epistemology, and that not all epistemic values can be reduced to the value of truth (or to some other single epistemic good). Other values, such as open-mindedness, “can be useful in articulating our epistemic aims,” even if they cannot be so reduced (2008: 418). Axtell and Carter also reject Thin Centralism in the epistemic sphere—the view that “general concepts like ‘justified’ and ‘ought’ are logically prior to and independent of specific reason-giving thick epistemic concepts of virtue and vice” (2008: 418). In the epistemic sphere, ‘justified’ and ‘ought’ are primarily used to evaluate beliefs, but the rejection of Thin Centralism allows that there could be fundamental ways of evaluating agents with virtue and vice concepts, and that these evaluations may not be reducible to belief evaluations. The above tenets open up the possibility of what Axtell and Carter call a “second-wave of virtue epistemology.” This contrasts with the “first-wave,” which takes thick epistemic concepts to be theoretically important primarily because they play a role in analyzing knowledge. On the second wave of virtue epistemology, thick epistemic concepts are “a subject for research in their own right, apart from whatever role they might have in explaining knowledge” (2008: 427).

Many authors in the Philosophical Papers collection take knowledge to be a paradigmatic example of a thin epistemic concept (Battaly 2008: 435; Axtell and Carter 2008: 427; Thomas 2008: 363; Väyrynen 2008: 392; Kotzee and Wanderer 2008: 339). It is not immediately clear whether their arguments rely substantively on this assumption, but the assumption has been contested in a separate context. Brent Kyle argues that knowledge is actually a thick concept. According to Kyle, knowledge is best accounted for as a close relation between a descriptive content—true belief—and an evaluative content—justification. If successful, this argument would establish that traditional epistemology has already focused on at least one thick concept—namely knowledge. But Kyle’s main goal is not to defend the traditional focus of epistemological theories. Instead, he aims to argue that the thickness of knowledge can explain why the Gettier Problem arises. He does so by arguing that the Gettier Problem is a specific instance of a general problem about analyzing thick concepts. It is worth noting that his argument takes no stand on whether thick concepts can be analyzed, or on whether the Gettier Problem is resolvable (Kyle 2013b).

Generally speaking, thick concepts have become a source of optimism for many philosophers who find traditional research within normative disciplines to be myopic, stagnant, or misdirected. Nevertheless, it is still a matter of debate whether a plausible theory of thick concepts actually has the implications typically hoped for. In particular, the literature on thick concepts still contains lively debates regarding fundamental issues such as the Disentangling Argument, the shapelessness hypothesis, non-reductivism, and the Semantic View. And if proponents of the significance of thick concepts make assumptions regarding these controversial issues, then their views will be met with significant opposition, at least until these issues are resolved. But, on the flip side, if opposing theorists account for all normativity with thin concepts, and take these concepts to be non-factual, they too will meet significant opposition. The recent debates about thick concepts are largely responsible for this. Ultimately, whatever approach one takes to these fundamental issues, it is clear that theories of value and normativity cannot be complete unless they give some attention to the thick.

7. References and Further Reading

  • Abend, G.  2011, “Thick Concepts and the Moral Brain,” European Journal of Sociology 52, 143-72.
  • Anscombe, G.E.M.  1958, “Modern Moral Philosophy,” Philosophy 33, 1-19.
  • Axtell, G. and A. Carter, 2008, “Just the Right Thickness: A Defense of Second-Wave Virtue Epistemology,” Philosophical Papers 37, 413–434.
  • Ayer, A.J.  1946, Language, Truth, and Logic, Dover, New York.
  • Battaly, H.  2008, “Metaethics Meets Virtue Epistemology: Salvaging Disagreement about the Epistemically Thick,” Philosophical Papers 37, 435–454.
  • Blackburn, S.  1992, “Through Thick and Thin,” Proceedings of the Aristotelian Society, suppl. vol. 66, 285-99.
  • Bronzon, R.  2009, “Thick Aesthetic Concepts,” The Journal of Aesthetics and Art Criticism 67:2, 191-99.
  • Burton, S.  1992, “‘Thick’ Concepts Revisited,” Analysis 52, 28-32.
  • Dancy, J.  2013, “Practical Concepts,” in S. Kirchin (ed.) Thick Concepts, Oxford University Press, Oxford.
  • Dancy, J.  1995, “In Defense of Thick Concepts,” Midwest Studies in Philosophy 20, 263-79.
  • Elgin, C.  2008, “Trustworthiness,” Philosophical Papers 37, 371-87.
  • Elstein, D. and T. Hurka, 2009, “From Thick to Thin: Two Moral Reduction Plans,” The Canadian Journal of Philosophy 39, 515-35.
  • Enoch, D. and K. Toh 2013, “Legal as a Thick Concept,” in W.J. Waluchow and S. Sciaraffa (eds.), Philosophical Foundations of the Nature of Law, Oxford University Press, Oxford.
  • Foot, P.  1958, “Moral Arguments,” Mind 67, 502-13.
  • Geertz, C.  1973, “Thick Description: Toward an Interpretive Theory of Culture,” The Interpretation of Cultures: Selected Essays, Basic Books, New York, 3-30.
  • Gibbard, A.  1992, “Thick Concepts and Warrant for Feelings,” Proceedings of the Aristotelian Society, suppl. vol. 66, 267-83.
  • Harcourt E. and A. Thomas, 2013, “Thick Concepts, Analysis, and Reductionism,” in S. Kirchin (ed.) Thick Concepts, Oxford University Press, Oxford.
  • Hare, R.M.  1952, The Language of Morals, Oxford University Press, Oxford.
  • Hare, R.M.  1963, Freedom and Reason, Oxford University Press, Oxford.
  • Hare, R.M.  1981, Moral Thinking: Its Levels, Method, and Point, Clarendon Press, Oxford.
  • Hare, R.M.  1989, Essays in Ethical Theory, Oxford University Press, Oxford.
  • Hare, R.M.  1997, Sorting Out Ethics, Clarendon Press, Oxford.
  • Hurka, T.  2011, “Common Themes from Sidgwick to Ewing,” in T. Hurka (ed.) Underivative Duty, Oxford University Press, Oxford, 6-25.
  • Hurley, S.  1989, Natural Reasons, Oxford University Press, Oxford.
  • Hursthouse, R. 1991, “Virtue Theory and Abortion,” Philosophy and Public Affairs 20, 223-46.
  • Hursthouse, R.  1996, “Normative Virtue Ethics,” in Roger Crisp (ed.) How Should One Live?  Oxford University Press, Oxford, 19-36.
  • Hursthouse, R. 1999, On Virtue Ethics, Oxford University Press, Oxford.
  • Jackson, F.  1998, From Metaphysics to Ethics: A Defence of Conceptual Analysis, Oxford University Press, Oxford.
  • Kirchin, S.  2010, “The Shapelessness Hypothesis,” Philosophers’ Imprint 10, 1-28.
  • Kotzee, B. and J. Wanderer, 2008, “Introduction: A Thicker Epistemology?” Philosophical Papers 37, 337-343.
  • Kyle, B.  2013a, “How Are Thick Terms Evaluative?” Philosophers’ Imprint 13, 1-20.
  • Kyle, B.  2013b, “Knowledge as a Thick Concept: Explaining Why the Gettier Problem Arises,” Philosophical Studies 165, 1-27.
  • Kyle, B. 2015, “Review of ‘The Lewd, the Rude, and the Nasty: A Study of Thick Concepts in Ethics’ by Pekka Väyrynen,” The Philosophical Quarterly 65, 576-82.
  • McDowell, J.  1981, “Non-Cognitivism and Rule-Following,” in S. Holtzman and C. Leich (eds.) Wittgenstein: To Follow a Rule, Routledge, London.
  • McDowell, J. 1998, “Aesthetic Value, Objectivity, and the Fabric of the World,” in Mind, Value, and Reality, Harvard University Press, Cambridge, Mass.
  • Platts, M.  1988, “Moral Reality,” in G. Sayre-McCord (ed.) Essays on Moral Realism, Cornell University Press, Ithaca, NY.
  • Putnam, H. 1990, “Objectivity and the Science/Ethics Distinction,” in J. Conant (ed.) Realism with a Human Face, Harvard University Press, Cambridge, Mass.
  • Roberts, D.  2011, “Shapelessness and the Thick,” Ethics 121, 489-520.
  • Roberts, D.  2013, “It’s Evaluation, Only Thicker,” in S. Kirchin (ed.) Thick Concepts, Oxford University Press, Oxford.
  • Ryle, G.  1971, “The Thinking of Thoughts: What is ‘Le Penseur’ Doing?” in Collected Papers 2, Routledge, London, 480-83.
  • Scheffler, S.  1987, “Morality through Thick and Thin: A Critical Notice of Ethics and the Limits of Philosophy,” Philosophical Review 96, 411-34.
  • Siegel, H.  2008, “Is ‘Education’ a Thick Epistemic Concept?”  Philosophical Papers 37, 455-469.
  • Tappolet, C.  2004, “Through Thick and Thin: Good and Its Determinates,” Dialectica 58, 207-21.
  • Thomas, A.  2008, “The Genealogy of Epistemic Virtue Concepts,” Philosophical Papers 37, 345–369.
  • Väyrynen, P.  2008, “Slim Epistemology with a Thick Skin,” Philosophical Papers 37, 389–412.
  • Väyrynen, P. 2011, “Thick Concepts and Variability,” Philosophers’ Imprint 11, 1-17.
  • Väyrynen, P.  2013, The Lewd, the Rude, and the Nasty: A Study of Thick Concepts in Ethics, Oxford University Press, Oxford.
  • Williams, B.  1985, Ethics and the Limits of Philosophy, Harvard University Press, Cambridge, Mass.
  • Williams, B.  1993, “Who Needs Ethical Knowledge?” in A. Phillips Griffiths (ed.), Royal Institute of Philosophy, suppl. vol. 35, 213-22.
  • Williams, B.  1995, “Truth in Ethics,” Ratio 8(3), 227-42.
  • Yaffe, G.  2000, “Free Will and Agency at Its Best,” Philosophical Perspectives 14, 203-29.
  • Zangwill, N.  2001, “The Beautiful, the Dainty, and the Dumpy,” in Metaphysics of Beauty, Cornell University Press, Ithaca, NY.

 

Author Information

Brent G. Kyle
Email: brent.kyle@usafa.edu
United States Air Force Academy
U. S. A.

Desiderius Erasmus (1468?—1536)

Desiderius Erasmus was one of the leading activists and thinkers of the European Renaissance. His main activity was to write letters to the leading statesmen, humanists, printers, and theologians of the first three and a half decades of the sixteenth century. Erasmus was an indefatigable correspondent, controversialist, self-publicist, satirist, translator, commentator, editor, and provocateur of Renaissance culture. He was perhaps above all renowned and repudiated for his work on the Christian New Testament. He was not a systematic thinker, and he did not found a system or school of philosophy. In fact, his profound contempt for the scholastic philosophers of the Middle Ages and Renaissance puts him at odds with the institution of philosophy. Perhaps Erasmus’ most important role in the history of philosophy is to have challenged and expanded the disciplinary boundaries of the field. He did so by propounding his philosophy of Christ, which displays some affinities for prior traditions including Platonism and Epicureanism, but which depends primarily on the understanding that philosophy is not an exclusive university discipline, but rather a moral obligation incumbent upon all believers. In this context he founded an ethics of speech to guide himself and others to what he regarded as the true love of wisdom.

Table of Contents

  1. Life and Works
  2. Erasmus and Philosophy
    1. Humanism
    2. Platonism
    3. Philosopher Kings
    4. Paradox
    5. Epicureanism
  3. The Word
    1. Humanist Theology
    2. The Ethics of Speech
  4. A Controversial Legacy
    1. Canon Formation
    2. Censorship
    3. Scholarship
  5. References and Further Reading
    1. Editions
    2. Studies

1. Life and Works

Erasmus was born in the city of Rotterdam in the late 1460s and was educated by the Brethren of the Common Life, first at Deventer and then at s’Hertogenbosch. Orphaned at an early age, he took monastic vows and entered the Augustinian monastery at Steyn in 1486. In 1492 he was ordained a priest and in 1493 he entered the service of Hendrik van Bergen, the Bishop of Cambrai, who had just been named chancellor of the order of the Golden Fleece by the court of Burgundy. Service as secretary to an ambitious prelate delivered Erasmus from the tedium of monastic life and offered the prospect of travel and advancement. When the bishop’s career stalled, Erasmus left to study theology at the University of Paris in 1495, where he remained long enough to contract a lifelong aversion to the professional study of theology and its addiction to dialectic.

It was in Paris that Erasmus became attached to his first important patron, William Blount, Lord Mountjoy, whom he accompanied to England as tutor in 1499. This first English sojourn, though brief, proved crucial to Erasmus’ subsequent career since it was during this visit that he became acquainted with Thomas More and John Colet, founder of St. Paul’s School in London. Erasmus returned to Paris in 1500 to publish his first collection of proverbs, the Adagiorum Collectanea, whose dedicatory epistle, addressed to Mountjoy, remains a crucial statement of Erasmian poetics. After further itineracy in France and the Low Countries, he returned to England, where he was the guest of Thomas More, with whom he collaborated on a translation of selected dialogues by Lucian of Samosata. He embarked in 1506 on a long-awaited voyage to Italy. In Venice, Erasmus worked with the humanist printer Aldus Manutius to publish the first great collection of adages, the Adagiorum Chiliades, in 1508. It was completed with the generous collaboration of numerous Italian humanists, as gratefully recorded in the adage Festina lente. From Italy, he went back to England, where he stayed long enough to compose the Praise of Folly (1511) and several educational writings including the De ratione studii of 1511, a preliminary version of his manual on letter writing, De conscribendis epistolis, which was not published until 1522, and the completed version of De copia or On abundance in style (1512).

Having returned to the European continent in 1514, Erasmus began his association with the Swiss printer Johann Froben, for whom he prepared an expanded version of the adages in 1515. The following year brought forth from the Froben press (of Basel, Switzerland) the two works which Erasmus regarded as the twin masterpieces of his career. First came the Novum Instrumentum consisting of the Greek text and Erasmus’ own Latin translation of the New Testament, with Erasmus’ annotations keyed to the Latin text of the Vulgate. Next came, in nine volumes, the complete works of Saint Jerome, including four volumes of Jerome’s letters, edited by Erasmus himself and dedicated to William Warham, the Archbishop of Canterbury. In the beginning of Volume One stands Erasmus’ Life of Jerome. In 1517 Erasmus took up residence in Louvain. There he quickly became embroiled in a controversy with the faculty of theology at the university, over the role of the three languages–Greek, Latin, and Hebrew–in the study of theology. This was upon the immediate occasion of the foundation of a trilingual college in Louvain from the bequest of Erasmus’ friend Jérôme de Busleyden. Erasmus championed humanist theology, based on study of ancient languages, against the reactionary stance of the Louvain theologians who were intent on preserving their professional prerogatives. At the same time Erasmus launched another important scholarly venture, the Paraphrases on the New Testament, starting with the Epistle to the Romans in 1517.

In 1521, Erasmus moved to Basel where he collaborated closely with the Froben press on a succession of expanded editions of the Adages while continuing the Paraphrases on the New Testament. As the decade wore on Erasmus became involved in a reluctant and debilitating quarrel with Martin Luther over the competing doctrines of free will and predestination. Erasmus published his Diatribe on Free Will in 1524, to which Luther answered in 1525 with his treatise The Enslaved Will, which elicited from Erasmus the Hyperaspistes or Shieldbearer issued in two parts in 1526 and 1527. From this quarrel, Richard Popkin dated the advent of modern skepticism in his authoritative History of Scepticism. Having alienated many Catholic clerics with his trenchant criticism of Church hierarchy and Catholic devotion, Erasmus refused to join the Protestant reformers and found himself increasingly isolated as an advocate of Church unity through conciliation rather than persecution or reform. Towards the end of the decade, in the wake of the 1527 Sack of Rome, the Spanish Inquisition convened a conference in the city of Valladolid in order to deliberate on suspect passages in Erasmus’ work and to determine whether his books should be banned in Spain. Though the plague interrupted the Conference of Valladolid in August 1527 before it could reach a verdict, this did not deter Erasmus from composing a lengthy Apology addressed to the Spanish monks who had challenged his orthodoxy. The following year, 1528, Erasmus published his dialogue Ciceronianus (1528), attacking the pagan instincts of Cicero’s strictest humanist disciples, thereby ensuring himself continued notoriety among European men of letters. Erasmus finally left Basel in 1529 when the city officially declared its allegiance to the Reform, and took up residence in the Catholic city of Freiburg. In the few moments of leisure left to him by his interminable polemics and his voluminous correspondence, Erasmus composed his last masterpiece, his treatise on the rhetoric of preaching, entitled Ecclesiastes, which he completed and published in 1535. Erasmus’ travels came to an end on July 12, 1536 in Basel, where he had stopped on his way back to the Low Countries.

2. Erasmus and Philosophy

Scholarship has long recognized Erasmus’ problematic standing in the history of philosophy. For Craig Thompson, Erasmus cannot be called a philosopher in the technical sense, since he disdained formal logic and metaphysics and cared only for moral philosophy. Similarly, John Monfasani reminds us that Erasmus never claimed to be a philosopher, was not trained as a philosopher, and wrote no explicit works of philosophy, although he repeatedly engaged in controversies that crossed the boundary from philosophy to theology. His relation to philosophy bears further scrutiny.

a. Humanism

To evaluate Erasmus’ relationship to philosophy, we have to understand his identity as a humanist. One of his earliest works, begun in his monastic youth, though not published until 1520, was the Antibarbari. It proposes a defense of the humanities, then essentially the study of classical languages and literature, against detractors who were scorned as barbarians. One of the key themes of the work is the vital role of classical culture in a Christian society, and this theme entails a redefinition of philosophy, in contrast to the prevailing university discipline of philosophy. Everything of value in pagan culture, insists the main interlocutor in the dialogue, was intended by Christ to enrich Christian society, and this includes philosophy, since Christ himself was “the very father of philosophy” (CWE 23:102; ASD I-1:121). The philosophy he fathered is the philosophy that Erasmus professed throughout his life and work, the philosophia Christi. This philosophy is not so much a set of dogmas as it is a way of life or an ethical commitment.

b. Platonism

One of Erasmus’ first published works and one of his most enduringly popular ones was the Enchiridion militis Christiani, or Manual of the Christian Soldier, of 1503. It was written for an anonymous friend at court who had asked Erasmus to compose for him a guide to life, or ratio vivendi, that would lead him to a state of mind worthy of Christ. The Enchiridion gained immediate notoriety for its repudiation of monasticism and its insistence that true piety consists not in outward ceremonies but in inward conversion. In the course of events, these themes would become associated with the Protestant Reformation. The Enchiridion espouses a philosophy of duality, the duality of body and soul, letter and spirit, that is explicitly modeled on Platonism. The author deplores the fact that professional philosophers, obsessed with Aristotle, have banished Platonists and Pythagoreans from the classroom, and he cites approvingly St. Augustine’s preference for Plato over Aristotle. Of the two classical philosophers, Plato’s doctrines are closer to Christianity and his allegorical style is better suited to the exposition of scripture.

When the Enchiridion defines philosophy, it invokes a Socratic precedent. Socrates, in Plato’s dialogue Phaedo, said that philosophy was nothing other than meditation on death because it gradually alienates us from the material and corporeal concerns of life. Philosophy takes us out of the world, the mundus, and into Christ. This withdrawal from the world is not just for religious professionals, such as priests or monks, but for everyone in every walk of life. Philosophy is a spiritual state rather than a professional identity.

c. Philosopher Kings

Among the many Platonic topoi that appealed to Erasmus, none feature more consistently in his work than the ideal of the philosopher king, first introduced in Plato’s Republic. Often this serves as an epideictic topos, as when Erasmus hails Charles V’s brother Ferdinand as a philosopher king (in the conclusion to his treatise on the Christian concord De sarcienda ecclesiae concordia) (ASD V-3:313), or when he similarly acclaims King Sigismund of Poland in a letter to a Polish correspondent (Ep. 2533). Yet, the topic also sponsors some provocative thinking on philosophy. When Erasmus published his Education of a Christian Prince in 1516, he dedicated the treatise to Prince Charles, King of Spain, who was soon to succeed his grandfather Maximilian as Holy Roman Emperor, under the title Charles V. The dedicatory epistle evokes the familiar Platonic claim that no republic will be fortunate until philosophers are kings or kings embrace philosophy. By philosophy, Erasmus understands not the Aristotelian physics and metaphysics that dominated the university curriculum, but rather that kind of philosophy that frees our mind from errors and vices, and demonstrates correct government on the model of the eternal power. This model, or aeterni numinis exemplar, (ASD IV-1:133) is Christ, and the new philosopher king is a disciple of Christ.

Erasmus returns to the philosopher king in the text of his treatise on political philosophy when he again qualifies the meaning of “philosophy.”  To be a philosopher does not mean to be skilled at dialectic or physics, but rather to prefer truth to illusion. In sum, to be a philosopher is the same as to be a Christian: “idem esse philosophum et esse Christianum” (ASD IV-1:145). This is the most compact and emphatic statement of Erasmus’s philosophy of Christ. His most mature and complete statement can be found in the Paraclesis, one of the forewords or prefaces he composed for his edition of the New Testament, also from 1516. In its opening lines, the Paraclesis exhorts all mortals to the holiest and most salubrious study of Christian philosophy, insisting that this type of wisdom can be learned from fewer books and with less effort than the arcane doctrines of Aristotle. The philosophy of Christ is a straight road open to all who are endowed with pure and simple faith, and not an exclusive discipline reserved for specialists. In this context, Erasmus adds a controversial endorsement of vernacular translations of the Bible, so that everyone can share in the message of Christ.

d. Paradox

As we have seen, Erasmus often defines philosophy in the negative: his philosophy is not what people conventionally understand as philosophy. He repudiates conventional philosophy as too contentious, too belligerent and dogmatic. He prefers instead to experiment with a non-assertive form of philosophy that relies on paradox and on the neutralizing force of opposing arguments. His best known experiment in extended paradox, and his best claim to permanence in school curriculum, is the Praise of Folly, first published in Paris in 1511, and accompanied in subsequent editions by a commentary attributed to Gerhard Lister but thought to have been dictated by Erasmus himself. Folly, or Moria, delivers her own encomium, proudly invoking the inspiration of the ancient Greek sophists and seemingly disqualifying her every claim, except perhaps for her satire of the clergy and the learned professions; this is followed by a deceptively earnest exposition of Pauline spirituality, where we detect the same stylistic devices and profusion of proverbs as in the rest of the text. Erasmus labeled his text a declamation, in the sense of a thesis meant to provoke a counter-thesis rather than to assert a dogma. After all, the speaker is Folly, a notoriously unreliable authority on all matters secular and religious and from whom it is no dishonor to differ in views. Rather, we should be ashamed to agree with her. By resorting to this subterfuge, rather than positively asserting his beliefs in his own name, Erasmus was able to intervene in a number of intellectual, political, and spiritual debates in contemporary Christendom without affirming or denying anything.

Though voiced by Folly herself, the critique of church and clergy contained in the Praise of Folly provoked a bitter resentment among doctors of theology, as we know from correspondence between Erasmus and the Louvain theologian Martin Dorp. In a letter dated September 1514, Dorp testifies to the hostile reaction of the professional theologians to the Moriae Encomium and their apprehension of Erasmus’ new project to edit the New Testament, of which word had already begun to circulate largely due to Erasmus’ own public relations campaign. Erasmus’ response to Dorp in epistle 337 recapitulates many of the themes of his life-long tirade against scholasticism. One of the key themes here is the stark contrast drawn between the recent style of theology exemplified by the scholastics, and the old style of theology associated with the Church Fathers, many of whose works Erasmus edited. The passage from old to new has hardly been an improvement. The upstarts or recentiores are so engrossed in their factional disputes and dialectical quibbles that they do not have time to read the Bible. Erasmus makes his case most succinctly when he asks, “What does Aristotle have to do with Christ?” In effect, he accuses the schoolmen of idolatry in their substitution of pagan for Christian authority. The letter to Dorp, which was revised for publication with the Praise of Folly beginning with the Basel edition of 1516, offers a powerful and insidious repudiation of university theology and philosophy.

e. Epicureanism

Erasmus thus had little enthusiasm for the various philosophic orthodoxies prevailing among the ancients or the moderns. However, fairly late in life, and only belatedly acknowledged by criticism, he did turn in sympathy to one of the Hellenistic schools of philosophy, namely Epicureanism. Erasmus never espoused Epicureanism as a comprehensive system of thought, and he could not endorse the central tenet of the mortality of the human soul. He was, however, attracted to the Epicurean ideal of peace of mind through the retreat from worldly cares and the cultivation of a clear moral conscience. As Reinier Leushuis writes, Erasmus’ very last familiar colloquy, The Epicurean, published in 1533, contains the startling claim that no one better deserves the name “Epicurean” than the revered prince of Christian philosophy, Christ himself (CWE 40:1086; ASD I-3:731). Christ teaches his followers how to attain a state of complete tranquility, and freedom from the torments of a guilty conscience, that corresponds to the Epicurean ideal of ataraxia. This ideal, it is worth noting, is not the same as Stoic apathy, which Erasmus carefully disassociates from Christianity in various places including the Ecclesiastes. He points out that apathy would defeat the purpose of the Christian preacher who tries to arouse the emotions of the audience. The Christian ought to combine compassion with a clear conscience in order to achieve tranquility. This understanding of Christianity has little in common with the sterner tenor of Protestant thought, and may explain why Martin Luther labeled Erasmus an Epicurean in the vulgar sense of an atheist or unbeliever. Finally, it may not be out of place here to point out that, through his mature, non-dogmatic embrace of Epicureanism, Erasmus shows some affinity for the late Renaissance prose writer Michel de Montaigne.

3. The Word

Speech, for Erasmus, is not only a defining attribute of humanity but also a key to the relation between humanity and divinity, which was a central preoccupation of his thought. The Gospel of John declares that in the beginning was the logos, which Erasmus famously, or infamously, retranslated as sermo in preference to the reading of the Vulgate, verbum. Erasmus felt justified in changing the reading of the Vulgate, but only in the second edition of his New Testament published in 1519, because the received text of scripture is not divine. The words are human, or rather, a mediation between the human and the divine. The Bible is speech, and as such it must be read, interpreted, and understood according to the arts of speech. Moreover, the arts of speech are the only means we have of approaching divinity. Speech brings man close to God.

a. Humanist Theology

It is not entirely clear under what circumstances Erasmus first took an interest in biblical scholarship. His meeting with John Colet in England is known to have kindled his interest. Another factor, and perhaps an independent factor, was his discovery, in the summer of 1504 at the Praemonstratensian abbey of Parc just outside the walls of Louvain, of the manuscript of Lorenzo Valla’s Annotations on the New Testament, completed fifty years earlier. Erasmus prepared the editio princeps in Paris in 1505 with a prefatory epistle addressed to Christopher Fisher, papal protonotary and doctor of canon law. This preface in defense of Valla can be read as a sort of paradoxical encomium, or praise of invective, since Valla had a controversial reputation as a harsh critic and bitter polemicist. Erasmus compares Valla to Zoilus, a proverbial figure of odious slander for his presumptuous critique of Homeric epic, as we are reminded in the Adages; but here, in the preface he carries a positive connotation as a heroic censor. In his masterpiece on the elegance of the Latin language, Valla rescued literature from barbarity by administering the harsh medicine of criticism to the inveterate disease of scholastic Latin. Erasmus published an epitome of Valla’s work in 1531 based on an abstract he had composed many years earlier in the monastery at Steyn.  Valla had collated the Vulgate with the Greek text of the New Testament and emended the translation according to grammatical criteria, including the criterion of elegance. Surely, Erasmus asks, translation, whether of sacred or profane texts, is the purview of grammar. A translator is not a prophet. The prophet requires the gift of the Holy Spirit, but the translator needs grammar and rhetoric, the arts of language. In effect, since the word of God reaches us through human intermediaries using human language, theology cannot dispense with skill in language or linguarum peritia. This is the program of humanist theology.

The fervor with which the letter to Fisher defends Valla’s biblical philology suggests that Erasmus had already begun a similar enterprise himself. Indeed, the text that Erasmus edited in 1505 was the model and impulse for his own Annotations on the New Testament first published as part of the Novum Instrumentum in 1516. Jacques Chomarat has demonstrated convincingly how much Erasmus owed to Valla’s precedent and how little he acknowledged it, often expressing severe criticism of Valla’s choices, which can be taken as a sort of mimetic tribute to Valla, the modern Momus or god of criticism. Even the substitution of sermo for verbum, which embroiled Erasmus in such endless quarrels, and earned him a denunciation from the pulpit of St. Paul’s Cathedral in London, was anticipated by Valla in his notes on the Gospel of John. Often the Erasmian Annotations revisits the themes first broached in the prefatory epistle to Fisher, including the notion that the Bible, like all human language, is immersed in historical time.

Chomarat draws our attention to Erasmus’ annotation on Acts 10.38, where the apostle Peter is preaching to the Roman soldier Cornelius. Peter tells Cornelius that God anointed Jesus, which the Vulgate renders with a turn of phrase that elicits a long commentary from Erasmus, lengthened in successive editions. First of all, he observes, though the New Testament is written in Greek, it’s not clear whether Peter was speaking Greek or Hebrew, that is to say the local dialect of Hebrew. If he spoke Greek, he spoke it as a foreign language inflected by his own vernacular, which accounts for stylistic irregularities. After all, the apostles were only human, subject to error and ignorance like the rest of us. This got Erasmus in a lot of trouble. He rounds off his note with a disclaimer: “I’m not the oracle; you can take my opinion or leave it.”  This pronouncement coincides with similar disclaimers throughout his biblical scholarship and his apologies. To his implacable foe Edward Lee, Erasmus insists that his opinions are not dogmas to be taken on faith but only ideas to be debated: excutienda propono non sequenda (ASD IX-4:46). In humanist theology, both the text and the exegete are fallible.

b. The Ethics of Speech

Erasmus’ main statement on human speech comes at the beginning of his Paraphrase on the Gospel of John, published in 1523 between the third and fourth editions of his Annotations. Speaking in the voice of the evangelist, he acknowledges that the nature of God surpasses human understanding, and exceeds all of our powers of representation. Consequently, it is better to believe in God than to try to understand him through human reason. Christian philosophy is a kind of fideism, and not a speculative theology. But, in order to convey some understanding of things that are neither intelligible to us nor explainable by us, we have to use names of things that are familiar to our senses, though there is nothing in reality that is truly analogous to God. Therefore, just as the Bible calls that highest mind God, so it calls his only Son its speech. For the Son, though not identical to the Father, nevertheless resembles him through perfect similitude. In the same way, speech is the true mirror of the mind. In this sense, John’s logos is the paradigm of human speech. There is something miraculous about human speech which, arising from the inner recesses of the mind and passing through the ears of the listener, by a kind of occult energy transfers the soul of the speaker to the soul of the listener. This “occult energy” is not divine but nevertheless insinuates the proximity of the human and the divine. Christ is called the logos because God wanted to make himself known to us through him, so that we might be saved. Speech is the gateway to eternity.

The topos of the mirror of speech runs throughout Erasmus’ work as a guiding thread of his ethical and religious thought. It is the figure of speech that animates his philosophy. We find the mirror in the Praise of Folly, in several adages, and in the Lingua. In the latter, it is associated with a saying of Socrates, “Speak that I may see you,” which reappears in the Apophthegmata, whose prefatory epistle to William of Cleves cites Plutarch’s claim that, more than any deeds, sayings are the true mirror of the mind. Erasmus’ last work, the Ecclesiastes, devotes a substantial discussion to the exalted role of speech as the true mirror and image of the soul. In all these instances, we should recognize the expression not of a linguistic theory but rather of a moral imperative. Our society and our salvation depend, according to Erasmus’ favorite figure of speech, on the coincidence of our words and thoughts.

Erasmus brings out this dimension of his theme best in the Lingua of 1525, which identifies the tongue as the source of our greatest benefits and ills. If speech is a mirror, he explains ruefully, it can certainly be a distorting mirror. Christ wanted to be called the logos and the truth, so that we would be ashamed of lying, but now even Christians have become so accustomed to lying that they don’t even realize they are lying. This sad state of affairs is the occasion for a very solemn admonition: “Once lying becomes acceptable, then we can have no more trust, and without trust we lose all human society” (CWE 29:316; ASD IV-1A:83). True speech is the foundation of society, and once this foundation is cracked, the social edifice collapses. At the end of the sixteenth century, Michel de Montaigne would make the very same claim in the same prophetic tones, first in his essay “On liars” and again in the essay “On giving the lie.” Both writers experienced their age as a crisis of truth and language. If we want to do justice to the prolific and multifarious achievement of Desiderius Erasmus, we might say that he devoted his life to the ethics of speech.

4. A Controversial Legacy

a. Canon Formation

Since his death in 1536, Erasmus can hardly be said to have rested in peace. His adversaries and detractors were unappeased by his earthly disappearance, and his partisans and disciples shared some of his enthusiasm for polemic. One way to assess Erasmus’ legacy is to trace the publication history of his work, or what we might call the making of his canon. Erasmus himself was thinking about an edition of his complete works as early as 1523, as we know from a letter he sent to Johann von Botzheim in which he drew up a preliminary catalogue of his works to be edited in nine separate tomes with a tenth tome pending. The first tome was to include everything that concerns literature and education, including all his translations from Lucian, Euripides, and Libanius. Tome two was reserved for the adages, and tome three for his correspondence. Volume four would be devoted to moral philosophy, including his various translations from Plutarch’s Moralia, and his original works such as the Praise of Folly, the Education of a Christian Prince, and the Complaint of Peace. Volume five was to handle works of religious instruction such as the Enchiridion, his psalm commentaries, and all the prefaces to his New Testament. Eventually, his Ecclesiastes would be included in this group. Volume six would consist of the New Testament and the Annotations, while volume seven was for the Paraphrases. Volume eight was supposed, on a conservative estimate, to hold all the Apologies, or polemical writings. Volume nine was for the Letters of St. Jerome, as if Erasmus wrote them himself, and time permitting, he promised to write a commentary on Paul’s Epistle to the Romans to fill volume ten. Erasmus revised his plan in a letter addressed to the Scottish historian Hector Boece in 1530, partly to accommodate all he had written in the intervening seven years. As a practical measure, the works are now distributed in series, or ordines, rather than individual volumes, and a few other modifications are introduced. The Paraphrases join the New Testament in ordo six to make room for all of Erasmus’ translations of the Greek Fathers in ordo seven. The ninth series now includes several Latin Fathers in addition to Jerome, and the plans for a tenth volume have been shelved. When the posthumous Opera omnia was finally issued from the Froben press of Basel in the course of 1538 to 1540, the publishers followed Erasmus’ plan, but once again separated the New Testament and the Paraphrases into orders six and seven, while the projected volume of Latin Fathers was canceled. The Apologies occupy the ninth and final order. This is substantially the same organization adopted by the French Protestant refugee Jean Le Clerc in his ten-volume edition of the Opera omnia published in Leiden during the first decade of the eighteenth century. This edition is known as LB from the Latin name for the place of publication, Lugduni Batavorum. Le Clerc had to add a tenth volume to accommodate all the Apologies. As a further innovation, he even added an Index Expurgatorius, or repertory of all the passages in Erasmus’ work that were marked out for expurgation by the Spanish and the Roman censors, keyed to the pagination of the Basel edition. From this index it appears that the censors were particularly interested in the correspondence, and that only ordo eight, consisting of Latin translations of the Greek Fathers, escaped their vigilance.

b. Censorship

Le Clerc’s index is an opportune reminder that during the century and a half intervening between the Basel edition and LB, the history of the reception of Erasmus is largely a history of censorship. The Counter-Reformation devoted a fair amount of its institutional effort to the suppression of Erasmus’ legacy in Italy. The first indices of prohibited books were municipal lists whose scope of enforcement was necessarily restricted. In 1555 the Congregation of the Index promulgated the first papal Index of Prohibited Books, which included several titles of Erasmus such as the Annotations on the New Testament, the Colloquies, the Praise of Folly, the Enchiridion, and some writings on prayer and on celibacy. This index was suspended shortly after its promulgation, but it did deter Italian printers from publishing new editions of Erasmus’ work. Then in 1559 the papacy promulgated a new index which inscribed Erasmus in the first class of heretics, those whose works were banned in their entirety. This index in turn provoked a major episode of book burning in the capital of European printing, Venice. Pope Pius IV issued a revised index in the wake of the Council of Trent, the Tridentine Index of 1564, which relaxed the comprehensive ban on Erasmus’ work. However, it also included new provisions for enforcement that resulted in the confiscation and destruction of a considerable number of Erasmus editions in the stock of Italian booksellers. The Tridentine Index also included a new provision for the expurgation of those works by Erasmus that were not banned outright. This provision yielded, after some delay, one of the more important printing and publishing ventures of the Counter-Reformation, namely the official expurgated edition of Erasmus’ Adages. This edition was commissioned by the Council of Trent, prepared by Paolo Manuzio, prefaced by his son Aldo, and published in Florence by the Giunti in 1575 bearing the imprimatur of Pope Gregory XIII. It was purged of anything that could give offense to pious ears, and all traces of its author, Desiderius Erasmus, were so scrupulously removed that the work is often catalogued as the Adages of Paolo Manuzio. It is fortunate that readers beyond Italy had access to the unexpurgated edition of the Adages, for if Montaigne had had to rely on Paolo Manuzio’s version, he never would have invented the essay form. Subsequent revisions of the Index of Prohibited Books gradually abandoned the project of further expurgation. The index of Pope Clement VIII in 1596 turned over the task of expurgation to individual readers relying on their own conscience or someone else’s directions. Eventually the papacy resigned itself to the presence of some copies of Erasmus’ literary and educational writings in private libraries in Italy.

c. Scholarship

Just as the era preceding the LB was an era of censorship, so the succeeding era has been, for the study of Erasmus, an era of burgeoning scholarship. The twentieth century in particular produced several monuments of modern scholarship. From 1906 to 1958, P. S. Allen and his collaborators published with Oxford University Press, in twelve volumes, the complete correspondence of Erasmus, the Opus epistolarum Desiderii Erasmi Roterodami, which has completely superseded the third ordo of the Opera omnia. In 1933, Wallace Ferguson published, as a supplement to the LB, his Erasmi opuscula, including Erasmus’ biography of St. Jerome. In the same year Hajo Holborn published an important edition of several key texts, including what remains today the standard edition of the Enchiridion. In 1969 an international committee based in the Netherlands began publishing the critical edition of the complete works of Erasmus known as ASD. Finally, in 1974, the University of Toronto Press launched its complete works of Erasmus in English translation, known as CWE, for Collected Works of Erasmus. Both projects are still under way in the spirit of that royal adage Festina lente. Every year, under the auspices of the Erasmus of Rotterdam Society (which also publishes the journal Erasmus Studies, formerly the Erasmus of Rotterdam Society Yearbook), the Margaret Mann Phillips Lecture is delivered in the spring and the Birthday Lecture in the fall in order to commemorate and conserve the cultural and ethical legacy of Erasmus. The Society has created an Erasmus website at erasmussociety.org.

5. References and Further Reading

a. Editions

  • Opus epistolarum Desiderii Erasmi Roterodami. Ed. P.S. Allen et al. 12 vols. Oxford: Clarendon Press, 1906-1958.
  • Opera omnia Desiderii Erasmi Roterodami. Amsterdam, 1969–. (ASD)
  • Desiderii Erasmi Roterodami opera omnia. Ed. Jean Le Clerc. 10 vols. Leiden: P. van der Aa, 1703-1706. (LB)
  • Ausgewählte Werke. Ed. Hajo Holborn. Munich: C.H. Beck’sche, 1933.
  • Erasmi opuscula. Ed. Wallace Ferguson. The Hague: Martin Nijhoff, 1933.
  • Collected Works of Erasmus. Toronto: University of Toronto Press, 1974-. (CWE)

b. Studies

  • Boyle, Marjorie O’Rourke. Erasmus on Language and Method in Theology. Toronto: University of Toronto Press, 1977.
  • Chomarat, Jacques. “Les Annotations de Valla, celles d’Erasme et la grammaire.” In Histoire de l’exégèse au XVIe siècle. Geneva: Droz, 1978.
  • Grendler, Paul. “The Survival of Erasmus in Italy.” Erasmus in English 8 (1976): 2-22.
  • Leushuis, Reinier. “The Paradox of Christian Epicureanism in Dialogue: Erasmus’ Colloquy The Epicurean.” Erasmus Studies 35 (2015): 113-136.
  • Meer, Tineke ter. “A True Mirror of the Mind: Some Observations on the Apophthegmata of Erasmus.” Erasmus of Rotterdam Society Yearbook 23 (2003): 67-93.
  • Monfasani, John. “Erasmus and the Philosophers.” Erasmus of Rotterdam Society Yearbook 32 (2012): 47-68.
  • Popkin, Richard. The History of Scepticism from Erasmus to Descartes. New York: Harper and Row, 1968.
  • Thompson, Craig. “Introduction.” In Erasmus, Literary and Educational Writings. Vol. 1. Toronto: University of Toronto Press, 1978. CWE 23.

 

Author Information

Eric MacPhail
Email: macphai@indiana.edu
Indiana University
U. S. A.

Parmenides of Elea (Late 6th cn.—Mid 5th cn. B.C.E.)

Parmenides of Elea was a Presocratic Greek philosopher. As the first philosopher to inquire into the nature of existence itself, he is incontrovertibly credited as the “Father of Metaphysics.” As the first to employ deductive, a priori arguments to justify his claims, he competes with Aristotle for the title “Father of Logic.” He is also commonly thought of as the founder of the “Eleatic School” of thought—a philosophical label ascribed to Presocratics who purportedly argued that reality is in some sense a unified and unchanging singular entity. This has often been understood to mean there is just one thing in all of existence. In light of this questionable interpretation, Parmenides has traditionally been viewed as a pivotal figure in the history of philosophy: one who challenged the physical systems of his predecessors and set forth for his successors the metaphysical criteria any successful system must meet. Other thinkers, also commonly thought of as Eleatics, include: Zeno of Elea, Melissus of Samos, and (more controversially) Xenophanes of Colophon.

Parmenides’ only written work is a poem entitled, supposedly, but likely erroneously, On Nature. Only a limited number of “fragments” (more precisely, quotations by later authors) of his poem are still in existence, which have traditionally been assigned to three main sections—Proem, Reality (Alétheia), and Opinion (Doxa). The Proem (prelude) features a young man on a cosmic (perhaps spiritual) journey in search of enlightenment, expressed in traditional Greek religious motifs and geography. This is followed by the central, most philosophically-oriented section (Reality). Here, Parmenides positively endorses certain epistemic guidelines for inquiry, which he then uses to argue for his famous metaphysical claims—that “what is” (whatever is referred to by the word “this”) cannot be in motion, change, come-to-be, perish, lack uniformity, and so forth. The final section (Opinion) concludes the poem with a theogonical and cosmogonical account of the world, which paradoxically employs the very phenomena (motion, change, and so forth) that Reality seems to have denied. Furthermore, despite making apparently true claims (for example, the moon gets its light from the sun), the account offered in Opinion is supposed to be representative of the mistaken “opinions of mortals,” and thus is to be rejected on some level.

All three sections of the poem seem particularly contrived to yield a cohesive and unified thesis. However, discerning exactly what that thesis is supposed to be has proven a vexing, perennial problem since ancient times. Even Plato expressed reservations as to whether Parmenides’ “noble depth” could be understood at all—and Plato possessed Parmenides’ entire poem, a blessing denied to modern scholars. Although there are many important philological and philosophical questions surrounding Parmenides’ poem, the central question for Parmenidean studies is addressing how the positively-endorsed, radical conclusions of Reality can be adequately reconciled with the seemingly contradictory cosmological account Parmenides rejects in Opinion. The primary focus of this article is to provide the reader with sufficient background to appreciate this interpretative problem and the difficulties with its proposed solutions.

Table of Contents

  1. Life
  2. Parmenides’ Poem
    1. The Proem
    2. Reality
    3. Opinion
    4. Positive Aletheia. Negative Opinion?
  3. Interpretative Treatments
    1. Reception of the Proem
    2. The A-D Paradox: Select Interpretative Strategies and their Difficulties
      1. Strict Monism and Worthless Opinion
      2. Two-World (or Aspectual) Views
      3. Essentialist (or Meta-Principle) Views
      4. Modal Views
  4. Parmenides’ Place in the Historical Narrative
    1. Influential Predecessors?
      1. Anaximander/Milesians
      2. Aminias/Pythagoreanism
      3. Xenophanes
      4. Heraclitus
    2. Parmenides’ Influence on Select Successors
      1. Eleatics: Zeno and Melissus of Samos
      2. The Pluralists and the Atomists
      3. Plato
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life

As with other ancient figures, little can be said about Parmenides’ life with much confidence. It is certain that his hometown was Elea (Latin: Velia)—a Greek settlement along the Tyrrhenian coast of the Apennine Peninsula, just south of the Bay of Salerno, now located in the modern municipality (comune) of Ascea, Italy. Herodotus reports that Greek settlers from Phocaea established this settlement ca. 540-535 B.C.E., and thus Parmenides was of Ionian stock (1.167.3). Parmenides’ father, a wealthy aristocrat named Pyres, was probably one of the original colonizers (Coxon Test. 40-41a, 96, 106).

When exactly Parmenides was born is far more controversial. There are two competing accounts for dating Parmenides’ birth, to either 540 (Diogenes Laertius) or 515 (Plato) B.C.E. Neither account is clearly convincing in itself, and scholars are divided on their reliability and veracity.

The earlier birthdate (c. 540 B.C.E.) is based upon a relatively late (3rd cn. C.E.) doxographical account by Diogenes, who relied on Apollodorus’ (2nd cn. B.C.E.) writings. This account claims Parmenides “flourished”—a term conventionally understood to correspond with having reached forty years of age, and/or the height of one’s intellectual career—during the sixty-ninth Olympiad (between 504-500 B.C.E.). The reliability of this account is esteemed for the historical focus (as opposed to any philosophical agenda) of these authors. However, the lateness of the account can be considered a weakness, and the “flourishing” system of dating is quite artificial, vague, and imprecise.

The later birthdate (515 B.C.E.) is based upon the opening of Plato’s Parmenides (127a5-c5). The narrative setting describes a young Socrates (about 20) conversing with Parmenides, who is explicitly described as being “about 65.” Since Socrates was born c. 470 B.C.E., adding the roughly 45-year age difference yields a birthdate for Parmenides of c. 515 B.C.E. Some have taken Plato’s precise mention of Parmenides’ age as indicative of veracity. However, Plato is also known for including other entirely fictitious, clearly anachronistic yet precise details in his dialogues. In fact, the very conversation reported in the dialogue would have been impossible, as it depends upon views Plato developed late in his life, which are certainly not “Socratic” at all. Plato is not necessarily a reliable historical source.

Choosing between these accounts can have significant historical implications regarding Parmenides’ possible relationship to other thinkers, particularly Heraclitus. For instance, if one accepts Plato’s later date, this would seem to require denying that Parmenides influenced Heraclitus (540-480 B.C.E., also based upon Diogenes’ reports) as Plato suggests (Sophist 242d-e). On the other hand, if one accepts the earlier dating by Diogenes, it makes it very unlikely Heraclitus’ work could have influenced Parmenides, as there would not have been sufficient time for his writings to become known and travel across the Greek world from Ephesus, Ionia.

Whenever Parmenides was born, he must have remained a lifelong citizen and permanent resident of Elea—even if he traveled late in life, as Plato’s accounts in Parmenides and Theaetetus suggest. This is first indicated by the evident notoriety he gained for contributions to his community. Several sources attest that he established a set of laws for Elea, which remained in effect and sworn to for centuries after his death (Coxon Test. 16, 116). A 1st cn. C.E. pedestal discovered in Elea is dedicated to him, with an inscription crediting him not only as a “natural philosopher,” but as a member (“priest”) of a local healing cult/school (Coxon 41; Test. 106). Thus, he likely contributed to the healing arts as a patron and/or practitioner. Finally, if Parmenides really was a personal teacher of Zeno of Elea (490-430 B.C.E.), Parmenides must have been present in Elea well into the mid-fifth century B.C.E. Ultimately, however, when and where Parmenides died is entirely unattested.

2. Parmenides’ Poem

Ancient tradition holds that Parmenides produced only one written work, which was supposedly entitled On Nature (Coxon Test. 136). This title is suspect, as it had become common even by Sextus’ time to attribute this generic title to all Presocratic works (Coxon 269-70; Test. 126). No copy of the original work has survived, in any part. Instead, scholars have collected purported quotations (or testimonia) from a number of ancient authors and attempted to reconstruct the poem by arranging these fragments according to internal and external (testimonia) evidence. The result is a rather fragmentary text, constituted by approximately 154 dactylic-hexameter lines (some are only partial lines, or even only one word). This reconstructed arrangement has then been traditionally divided into three distinct parts: an introductory section known as the Proem; a central section of epistemological guidelines and metaphysical arguments (Aletheia, Reality); and a concluding “cosmology” (Doxa, or Opinion).

The linear order of the three main extant sections is certain, and the assignment of particular fragments (and internal lines) to each section is generally well-supported. However, it must be admitted that confidence in the connectedness, completeness, and internal ordering of the fragments in each section decreases significantly as one proceeds through the poem linearly: Proem-Reality-Opinion. Furthermore, many philological difficulties persist throughout the reconstruction. There are conflicting transmissions regarding which Greek word to read, variant punctuation possibilities, concerns surrounding adequate translation, ambiguities in the poetical form, and so forth. Given all of this, any serious engagement with Parmenides’ work should begin by acknowledging the incomplete status of the text and recognizing that interpretative certainty is generally not to be found.

a. The Proem

The Proem (C/DK 1.1-32) is by far the most complete section available of Parmenides’ poem. This is due entirely to Sextus Empiricus, who quoted Lines 1-30 of the Proem (C1) as a whole and explicitly reported that they began the poem (Coxon Test. 136). Not only are the bulk of these lines (1.1-28) not quoted by any other ancient source, but their content is not even mentioned in passing. In short, modern scholars would have no idea the Proem ever existed were it not for Sextus. Simplicius quotes lines C/DK 1.31-32 immediately after quoting lines very similar to Sextus’ 1.28-30, and thus these are traditionally taken to end the Proem.

Nevertheless, there is some controversy regarding the proper ending of the Proem. While Lines 1.28-30 are reported by several additional sources (Diogenes Laertius, Plutarch, Clement, and Proclus), Simplicius alone quotes lines 1.31-32. In contrast, Sextus continued his block quotation of the Proem after line 1.30 with the lines currently assigned to C/DK 7.2-7, as if these immediately followed. Diels-Kranz separated Sextus’ quotation into distinct fragments (1 and 7) and added Simplicius’ lines to the end of C/DK 1. The vast majority of interpreters have followed both these moves. However, there may be good reasons to challenge this reconstruction (compare Bicknell 1968; Kurfess 2012, 2014).

The Proem opens mid-action, with a first-person account of an unnamed youth (generally taken to be Parmenides himself) traveling along a divine path to meet a didactic (also unnamed) goddess. The youth describes himself riding in a chariot with fire-blazing wheels turning on pipe-whistling axles, which seems to be traversing the heavens. The chariot is drawn by mares, steered by the Daughters of the Sun (the Heliades), who began their journey at the House of Night. The party eventually arrives at two tightly-locked, bronze-fitted gates—the Gates of Night and Day. In order to pass through these “aethereal” gates, the Heliades must persuade Justice to unlock the doors with soft words. After successfully passing through this portal and driving into the yawning maw beyond, the youth is finally welcomed by the unnamed goddess, and the youth’s first-person account ends.

Many have thought the chariot journey involved an ascent into the heavens/light as a metaphor for achieving enlightenment/knowledge and for escaping from darkness/ignorance. However, it would seem that any chariot journey directed by sun goddesses is best understood as following the ecliptic path of the sun and Day (also, that of the moon and Night). This is further confirmed given the two geographical locations explicitly named (the “House of Night” and the “Gates of Night and Day”), both of which are traditionally located in the underworld by Homer and Hesiod. Thus, the chariot journey is ultimately circular, ending where it began (compare C2/DK5). From the House of Night—far below the center of the Earth—the Heliades would follow an ascending arc to the eastern edge of the Earth, where the sun/moon rise. The journey would then continue following the ecliptic pathway upwards across the heavens to apogee, and then descend towards sunset in the West. At some point along this route over the Earth they would collect their mortal charge. Following this circular path, the troupe would eventually arrive back in the underworld at the Gates of Night and Day. Not only are these gates traditionally located immediately in front of the House of Night, but the mention of the chasm that lies beyond them is an apt poetical description of the completely dark House of Night. On this reading, rather than a metaphorical ascent towards enlightenment, the youth’s journey is actually a didactic katabasis (a descent into the underworld). It also suggests a possible identification of the anonymous spokes-goddess—Night (compare Palmer 2009).

The rest of the poem consists of a narration from the perspective of the unnamed goddess, who begins by offering a programmatic outline of what she will teach and what the youth must learn (1.28b-30):

…And it is necessary for you to learn all things,

Both the still heart of persuasive reality

And the opinions of mortals, in which there is no genuine reliability.

That the youth is supposed to learn some truth about “reality” (aletheia) is uncontroversial and universally understood to be satisfied by the second major section of the poem, Reality (C 2-8.50). It is also uncontroversial that the “opinions of mortals” will be taught in Opinion (C 8.51-C 20) and that this account will be inferior to the account of Aletheia in some way—certainly epistemically and perhaps also ontologically. The standard reconstruction of the Proem then concludes with the two most difficult and controversial lines in Parmenides’ poem (C 1.31-32):

ἀλλ’ ἔμπης καὶ ταῦτα μαθήσεαι ὡς τὰ δοκεῦντα

χρῆν δοκίμως εἶναι διὰ παντὸς πάντα περῶντα [περ ὄντα].

The suspicion that these lines might help shed light on the crucial relationship between Reality and Opinion is well-warranted. However, there are numerous possible readings (both in the Greek transmission and in the English translation), and selecting a translation for these lines requires extensive philological considerations, as well as an interpretative lens through which to understand the overall poem—the lines themselves are simply too ambiguous to settle the matter. Thus, it is quite difficult to offer a translation or summary here that does not strongly favor one interpretation of Parmenides over another. The following is an imperfect attempt at doing so, while remaining as interpretatively uncommitted as possible.

But nevertheless, you shall also learn “these things,” how the “accepted/seeming things” should/would have had (to be) to be acceptably, passing through [just being] all things, altogether/in every way.

Commentators have tended to understand these lines in several general ways. First, Parmenides might be offering an explanation for why it is important to learn about mortal opinions if they are so untrustworthy/unreliable, as line 1.30 argues. Another common view is that Parmenides might be telling the youth he will learn counterfactually how the opinions of mortals (or the objects of such opinions) would or could have been correct (even though they were not and are not now). Alternatively, Parmenides might be pointing to some distinct, third thing for the youth to learn, beyond just Reality and Opinion. This third thing could be, but is not limited to, the relationship between the two sections, which does not seem to have been explicitly outlined in the poem (at least, not in the extant fragments). In any case, these lines are probably best dealt with once one already has settled upon an interpretative stance for the overall poem given the rest of the evidence. If nothing else, whether a selected interpretation can be coherently and convincingly conjoined with these lines can provide a sort of final “test” for that view.

b. Reality

Immediately following the Proem (C/DK 1), the poem moves into its central philosophical section: Reality (C 2-C 8.49). In this section, Parmenides’ positively endorsed epistemic and metaphysical claims are outlined. Though lengthy quotations strongly suggest a certain internal structure, there is certainly some room for debate with respect to proper placement, in particular amongst the shorter fragments that do not share any common content/themes with the others. In any case, due to the overall relative completeness of the section and its clearly novel philosophical content—as opposed to the more mythical and cosmological content found in the other sections—these lines have received far more attention from philosophically-minded readers, in both ancient and modern times.

In Reality, the unnamed youth is first informed that there are only two logically possible “routes of inquiry” one might embark upon in order to understand “reality” (C 3/DK 2). Parmenides’ goddess endorses the first route, which recognizes that “what-is” is, and that it must be (it is not to not be), on the grounds that it is completely trustworthy and persuasive. On the other hand, the goddess warns the youth away from the route which posits “what-is-not and necessarily cannot be,” as it is a path that can neither be known nor spoken of. The reasoning seems to be that along this latter route, there is no concept to conceive of, no subject there to refer to, and no properties that can be predicated of— “nothingness.”

Arguably, a third possible “route of inquiry” may be identified in C 5/DK 6. Here, the goddess seems to warn the youth away from following the path which holds being and not-being (or becoming and not-becoming) to be both the same and not the same. This is the path that mortals are said to wander “without judgment,” on a “backwards-turning journey.” Not only is confusing “what is” and “what is not” different from positing necessary being and non-being (C 3/DK 2), and thus a distinct “route,” but this description also seems to correspond far better to the cosmogony of opposites found in Opinion than to the route of “what is not and necessarily cannot be.”

The injunction to follow only the path that posits “what is” is further complicated by the fragmentary report that there is some sort of close relationship between thinking (or knowing) and being (what exists, or can exist, or necessarily exists): “…for thinking and being are the same thing,” or “…for the same thing is for thinking as is for being” (C 4/DK 3). Scholars are divided as to what the exact meaning of this relationship is supposed to be, leading to numerous mutually exclusive interpretative models. Does Parmenides really mean to make an identity claim between the two—that thinking really is numerically one and the same as being, and vice-versa? Or, is it that there is some shared property(-ies) between the two? Is Parmenides making the rather problematic claim that whatever can be thought, exists (compare Gorgias “On Nature, or What-is-Not”)? Or, more charitably, only that whatever does exist can in principle be thought of without contradiction, and thus is understandable by reason—unlike “nothingness”? Perhaps both? Most commonly, Parmenides has been understood here as anticipating Russellian concerns with language and how meaning and reference must be coextensive with, and even preceded by, ontology (Owen 1960).

In any case, from these epistemic considerations, the goddess’ conclusions in C/DK 8 are supposed to follow with certainty from deductive, a priori reasoning. By studiously avoiding thinking in any way which entails thinking about “what-is-not,” via reductio, the subject of Reality is concluded to be: truly eternal—ungenerated and imperishable (8.5-21), a continuous whole (8.21-25), unmoved and unique (8.21-33), perfect and uniform (8.42-49). For instance, since coming-to-be involves positing “not-being” in the past, and mutatis mutandis for perishing, and since “not-being” cannot be conceived of, “what is” cannot have either property. In a similar vein, spatial motion entails that a thing was not at its current location at some time in the past, and thus also involves “not-being”; motion is therefore also denied. This line of reasoning can be readily advanced to deny any sort of change at all.

In the end, what is certain about Reality (whatever the subject, scope, or number of this “reality” is supposed to be) is that there is purportedly at least one thing (or perhaps one kind of thing) that must possess all the aforementioned “perfect” properties, and that these properties are supposed to follow from some problem with thinking about “what is not.” It has been commonly inferred from this that Parmenides advocated that there is actually just one thing in the entire world (that is, strict monism), and that this entity necessarily possesses the aforementioned properties.

c. Opinion

 

Opinion has traditionally been estimated to be far longer than the previous two sections combined. Diels even estimated that 9/10 of Reality, but only 1/10 of Opinion, are extant, which would have the poem spanning some 800-1000 lines. This degree of precision is highly speculative, to say the least. The reason Opinion has been estimated to be so much larger is the fragmentary nature of the section (only 44 verses, largely disjointed or incomplete, are attested) and the apparently wide array of different topics treated—which would seem to require a great deal of exposition to flesh out properly.

The belief that Opinion would have required a lengthy explication in order to adequately address its myriad of disparate topics may be overstated. As Kurfess has recently argued, there is nothing in the testimonia indicating any significant additional content belonging to the Opinion beyond that which is explicitly mentioned in the extant fragments (2012). Thus, though Opinion would still be far longer than the quite limited sampling that has been transmitted, it need not have been anywhere near as extensive as has been traditionally supposed, or all that much longer than Reality. Regardless of its original length, the incompleteness of this section allows for substantially less confidence regarding its arrangement and even less clarity concerning the overall meaning of the section. As a result, the assignment of certain fragments to this section has faced more opposition (compare Cordero 2010 for a recent example). Nevertheless, the internal evidence and testimonia provide good reasons to accept the traditional assignment of fragments to this section, as well as their general arrangement.

 

Following the arguments of Aletheia, the goddess explicitly ends her “trustworthy account and thought about reality” and commands the youth from there on, “hearing the deceptive arrangement of her words,” to learn mortal opinions (C 8.50-52).

The range of content in this section includes: metaphysical critiques of how mortals err in “naming” things, particularly in terms of a Light/Night duality (C 8.51-61, 9, 20); programmatic passages promising a detailed account of the origin of celestial bodies (C 10, 11); a theogonical account of a goddess who rules the cosmos and creates other deities, beginning with Love (C 12, 13); cosmogonical and astronomical descriptions of the moon and its relationship to the sun (C 14, 15), along with an apparent description of the foundations of the earth (C 16); some consideration of the relationship between the mind and body (C 17); and even accounts related to animal/human procreation (C 18-19).

The error of mortals is grounded in their “naming” (that is, providing definite descriptions and predications) the subject of Reality in ways contrary to the conclusions previously established about that very subject. As a result, mortals have grounded their views on an oppositional duality of two forms—Light/Fire and Night—when in fact it is not right to do so (8.53-54). Admittedly, the Greek is ambiguous about what exactly it is not right for mortals to do. It is common amongst scholars to read these passages as claiming it is either wrong for mortals to name both Light and Night, or that naming just one of these opposites is wrong and the other acceptable. This reading tends to suggest that Parmenides is either denying the existence of the duality completely, or accepting that only one of them properly exists. “Naming” only one opposite (for example, Light) seems to require thinking of it in terms of its opposite (for example, “Light” is “not-dark”), which is contrary to the path of only thinking of “what is,” and never “what is not” (compare Mourelatos 1979). The same holds if only Night is named. Thus, it would not seem appropriate to name only one of these forms. This problem is only doubled if both forms are named. Thus, it would seem that mortals should not name either form, and thus both Light and Night are denied as proper objects of thought. The Greek can also be read as indicating that it is the confusion of thinking both “what is” and “what is not” that results in this “naming error,” and that thinking both of these judgments (“what is” and “what is not”) simultaneously is the true error, not “naming” in-itself.

Mortal “naming” is treated as problematic overall in other passages as well. This universal denigration is first introduced at C 8.34-41 on the traditional reconstruction (For a proposal to relocate these lines to Opinion, see Palmer’s 2009 discussion of “Ebert’s Proposal”). Here, the goddess dismisses anything mortals erroneously think to be real, but which violate the perfect predicates of Reality, as “names.” C 11 expounds upon this “naming error,” arguing that Light and Night have been named and the relevant powers of each have been granted to their objects, which have also been named accordingly. C 20 appears to be a concluding passage for both Opinion and the poem overall, stating that only according to (presumably mistaken) belief, things came-to-be in the past, currently exist, and will ultimately perish and that men have given a name to each of these things (and/or states of existence). If this is truly a concluding passage, the apparently disparate content of Opinion is unified as a treatment of mortal errors in naming, which the section uncontroversially began with. From these grounds, the other fragments traditionally assigned to Opinion can be linked (directly or indirectly) to this section, based upon parallels in content/imagery and/or through contextual clues in the ancient testimonia.

Both C 9/DK 10 and C 10/DK 11 variously promise that the youth will learn about the generation/origins of the aether, along with many of its components (sun, moon, stars, and so forth). C/DK 12 and C/DK 13 then deliver on this cosmogonical promise by developing a theogony:

C 12: For the narrowest rings became filled with unmixed fire,

The outer ones with night, along which spews forth a portion of flame.

And in the middle of these is a goddess, who governs all things.

For in every way she engenders hateful birth and intercourse

Sending female to mix with male, and again in turn,

male to mix with the more feminine.

C 13: [and she] contrived Love first, of all the gods.

C 14 and C 15 then describe the cosmology that results from the theogonical arrangement, expounding the properties of the moon as, respectively, “an alien, night-shining light, wandering around the Earth,” which is “always looking towards the rays of the sun.” Similarly, C 16 is a single word (ὑδατόριζον), meaning “rooted in water,” and the testimonia explicitly claim that this refers to the Earth.

In many ways, the theogonical cosmology presented so far is quite reminiscent of Hesiod’s own Theogony and, at times, of certain Milesian cosmologies. However, C 17-19 are more novel, focusing on the relationship between the mind and body (C 17/DK 16), as well as sexual reproduction in animals—which side of the uterus the different sexes are implanted on (C 18/DK 17) and the necessary conditions for a viable, healthy fetus (C 19/DK 18). These passages can be tied to the previous fragments in that they are an extension of the theogonical/cosmogonical account, which has moved on to offer an account of earthly matters—the origin of animals and their mental activity—which would still be under the direction of the “goddess who governs all things” (C 12). This is clearly the case with respect to C 18-19, as the governing goddess is explicitly said to direct male-female intercourse in C 12.

d. Positive Aletheia. Negative Opinion?

Given the overall reconstruction of the poem as it stands, there appears to be a counter-intuitive account of “reality” offered in the central section (Reality)—one which describes some entity (or class of such) with specific predicational perfections: eternal—ungenerated, imperishable, a continuous whole, unmoving, unique, perfect, and uniform. This is then followed by a more intuitive cosmogony, suffused with traditional mythopoetical elements (Opinion)—a world full of generation, perishing, motion, and so forth., which seems incommensurable with the account in Reality. It is uncontroversial that Reality is positively endorsed, and it is equally clear that Opinion is negatively presented in relation to Aletheia. However, there is significant uncertainty regarding the ultimate status of Opinion, with questions remaining such as whether it is supposed to have any value at all and, if so, what sort of value.

While most passages in the poem are consistent with a completely worthless Opinion, they do not necessitate that valuation; even the most obvious denigrations of Opinion itself (or mortals and their views) are not entirely clear regarding the exact type or extent of its failings. Even more troubling, there are two passages which might suggest some degree of positive value for Opinion—however, the lines are notoriously difficult to understand. How the passages outlined below are read and interpreted largely determines what degree and kind (if any) of positive value should be ascribed to Opinion. Thus, it is helpful to examine more closely the passages where the relationship between the sections is most directly treated.

Consider the goddess’ programmatic outline for the rest of the poem at the end of the Proem:

C 1:   …And it is necessary for you to learn all things, (28b)
Both the still-heart of persuasive reality,
And the opinions of mortals, in which there is no trustworthy persuasion. (30)

From the very beginning of her speech, the goddess presents the opinions of mortals (that is, Opinion) negatively in relation to Reality. However, it does not necessarily follow from these lines that Opinion is entirely false or valueless. At most, all that seems entailed here is a comparative lack of epistemic certainty in relation to Reality. However, the transition from Reality to Opinion, where the goddess ends her “trustworthy account and thought about reality” and, in contrast, charges the youth to “learn about the opinions of mortals, hearing the deceptive arrangement of my words,” implies falsity (C/DK 8.50-52). This deceptive arrangement could be understood to apply only to the goddess’ presentation of the account. However, as Aletheia is described as a “trustworthy account,” and there seems to be no doubt that it is the content (as well as the presentation) that is trustworthy, the parallel should hold for Opinion as well. Even accepting that it is the content of Opinion that is deceptive, one of the most difficult interpretative questions regarding Opinion remains. Is the extent of the deception supposed to apply to: a) every proposition within Opinion (for example, Parmenides wants to say it is actually false that the moon reflects sunlight), or b) only some significant aspects of its content (for example, basing an account on opposites like Light/Night)? Either way, C/DK 1.30 and 8.50-52 make it clear that Opinion and the “opinions of mortals” are lacking in both veracity and epistemic certainty—at least to some extent.

Mortal beliefs are also unequivocally derided in between these bookends to Reality, though in slightly different terms. At C 5, the goddess warns the youth away from the path of inquiry upon which “…mortals with no understanding stray two-headed…by whom this has been accepted as both being and not being the same” (Coxon’s translation). C 5 not only claims mortal views are in error, it identifies the source of their error—confusing being and non-being. C/DK 7 then further identifies the reason mortals tend to fall into this confusion—by relying upon their senses, rather than rational accounts. “But do keep your thought from this way of enquiry. And let not habit do violence to you on the empirical way of exercising an unseeing eye and a noisy ear and tongue, but decide by discourse the controversial test enjoined by me” (Coxon’s translation).

Finally, the goddess’ criticism of the “naming error” of mortals—which seems to be the primary criticism offered in Opinion—furthers the case for Opinion’s apparently complete lack of veracity. At the first mention of the “naming error” on the traditional arrangement, the goddess says, “…To these things all will be a name, which mortals establish, having been persuaded they are real: to come to be and to perish, to be and not to be, and to change location and exchange bright color throughout” (C/DK 8.38b-41; Coxon’s translation). That mortals are merely “persuaded” of the reality of the objects and phenomena they “name” clearly implies that the contrary is the case: any such phenomena which do not correspond with the properties advocated in Reality are not real. Immediately following the goddess’ transition to her “deceptive account,” C/DK 8.53-56 makes it clear that the activity of “naming” and “distinguishing things in opposition,” contrary to the unity of Reality, is the initial mistake of mortals: “For they established two forms to describe (“name”) their two judgments (of which one should not be—the one by which mortals are and have been misled), and they distinguished two opposites in substance, and established predicates separating each from one another.”

Given the passages outlined so far in this section, there appears to be quite a substantial case for taking Opinion to be entirely false and lacking any value whatsoever. Nevertheless, this may not be the entire story. It is important to stress that while these passages seem to strongly suggest (and one may argue that they even entail) Opinion is false, the goddess never actually says it is “false” (ψευδής). Furthermore, there is at least some textual evidence that might be understood to suggest Opinion should not be treated as negatively as the passages considered so far would suggest.

As noted in the summary of the Proem above, there are two particularly difficult lines (C 1.31-32) which may be understood as suggesting some positive value for Opinion, despite its deficiencies in comparison to Reality. However, even if the Greek is read along these lines, it remains to be determined whether this value is based upon some substantial value in the account itself (there is some sense or perspective in which it is true), or merely some pragmatic and/or instructive value (for example, it is worthwhile to know what is wrong and why, so as to avoid falling into such errors).

In any case, even if there is some positive reason for learning Opinion provided in these lines, this could hardly contradict the epistemic inferiority (“no trustworthy persuasion”) just asserted at C/DK 1.30, just as it is quite difficult to deny the falsity implied from lines C/DK 8.50-52. At most, these lines could only soften the negative treatment of mortal views. Nonetheless, the possibility should be admitted that upon certain variant readings of C/DK 1.31-32, the status of Opinion and its value could be more complex and ambivalent than other passages suggest.

Only one further extant passage remains which might offer some reason to think Opinion maintains some positive value, and this is the passage most commonly appealed to for this purpose. At C/DK 8.60-61, the goddess seems to offer an explicit rationale for providing the youth with her “deceptive” account: “I declare to you this entirely ‘likely’ (ἐοικότα) arrangement so that you shall never be surpassed by any judgment of mortals.” The key word here is the apparently positive participle ἐοικότα, which does not obviously reconcile with the otherwise negative treatment of Opinion. The participle ἐοικότα can have the general sense of “likely,” in the sense of “probable,” as well as “fitting/seemly” in the sense of “appropriate.” Either translation could suggest at least the possibility of veracity and/or value in Opinion. That is, the account in Opinion could “likely” be true, though it is epistemically uncertain whether it is or not. Or, the account could be “fitting,” given the type of account it is—one which seeks to explain the world as it appears to the senses, which is still worth knowing, even if it is not consistent with the way the world truly is. On either of these readings, Opinion could be “deceptive,” yet still be worthwhile in itself. On these readings, though Opinion is inferior to Aletheia in some regards, it can be positively endorsed in its own right, as Parmenides’ own version of “mortal-style” accounts. If it is then understood that Parmenides’ “cosmology” is superior to all other possible mortal accounts of this kind, the goddess’ promise to the youth that learning this account will ensure he is never surpassed by any other mortal judgments can be explained (C/DK 8.60b-61).

On the other hand, it is just as easy to understand Opinion as being “likely” in the sense that it is indicative of the sort of account a deranged mortal relying upon their senses might be prone (“likely”) to offer, which is hardly an endorsement. Since mortals are incorrect in their accounts, the particular account offered in Opinion is representative of such accounts, and is presented didactically—as an example of the sorts of accounts that should not be accepted. If the youth can learn to recognize what is fundamentally mistaken in this representative account (Opinion), any alternative or derivative account offered by mortals which includes the same fundamental errors can be recognized and resisted. This seems far more consistent with the treatment of Opinion overall, and C/DK 8.60b-61 arguably better fits with this interpretation. Furthermore, it is quite difficult to defend a fundamental premise required for the alternative, more positive view outlined above—that the cosmology offered in Parmenides’ Opinion is intended to be superior to all other mortal views. Not only is this in tension with the clear negative treatment of Opinion throughout the text, it is implausible on more general grounds, as any account grounded upon fundamentally incorrect assumptions cannot be “superior” in any substantial sense. While some have attempted to claim that Opinion satisfies this on account of its dualistic nature, which is second-best to Reality’s monistic claims, this approach fails to account for how Opinion could possibly be superior to any other dualistic account.

Given all of this, it is undeniable that Opinion is lacking in comparison to Aletheia, and certainly treated negatively in comparison. It should also be taken as well-founded that Opinion is epistemically inferior. That Opinion is also inferior in terms of veracity seems most likely—though again, it is not certain whether this means Opinion is entirely lacking in value, and the extent of its deceptiveness (all of its content, or only its fundamental premises and assumptions) is still an open question. Navigating the Scylla and Charybdis of a) taking the negative, yet often ambiguous and/or ambivalent, treatment of Opinion in the text seriously, while b) avoiding apparently absurd interpretative outcomes, is what makes understanding its relationship to Reality, and thus developing an acceptable interpretation of the poem overall, so very difficult.

3. Interpretative Treatments

This section provides a brief overview of: (a) some of the most common and/or influential interpretative approaches to the Proem itself, as well as (b) the relationship between Reality and Opinion, and/or the poem overall. The purpose is to provide the reader with a head-start on how scholars have tended to think about these aspects of the poem, and some of the difficulties and objections these views have faced. The treatment is not meant to be exhaustive, nor to advocate any particular view over another.

a. Reception of the Proem

The only ancient response to the content of the Proem is from the Pyrrhonian Skeptic Sextus Empiricus (2nd cn. C.E.). In an attempt to demonstrate how Parmenides rejected opinions based upon sensory evidence in favor of infallible reason, Sextus set forth a detailed allegorical account in which most details described in the Proem are supposed to possess a particular metaphorical meaning relating to this epistemological preference. Sextus describes the chariot ride as a journey towards knowledge of all things, with Parmenides’ irrational desires and appetites represented as mares, and the path of the goddess upon which he travels as representative of the guidance provided by philosophical reasoning. Sextus also identifies the charioteer-maidens with Parmenides’ sense organs. However, he then strangely associates the wheels of the chariot with Parmenides’ ears/hearing, and even more strangely, the Daughters of the Sun with his eyes/sight—as if Sextus failed to recognize the numerical identity between the “charioteer-maidens” and the “Daughters of the Sun.” Similarly, he identifies Justice as “intelligence” and then erroneously seems to think that Justice is the very same goddess who subsequently greets and instructs Parmenides, when the journey clearly leaves Justice behind to meet with a new goddess. In his attempt to make nearly every aspect of the story fit a particular metaphorical model, Sextus clearly overreaches the evidence and falls into obvious mistakes. Even more evidently problematic, the division of the soul into these distinct parts and the accompanying metaphorical identifications are clearly anachronistic, borrowing directly from the chariot journey described in Plato’s Phaedrus. For these reasons, no modern scholar takes Sextus’ particular account seriously.

Modern allegorical treatments of the Proem have generally persisted in understanding the cosmic journey as an “allegory of enlightenment” (for a recent representative example, see Thanassas 2007). This treatment is possible no matter what one takes the geometry/geography of the chariot ride to be—whether an ascension “into the light” as a metaphor for knowledge as opposed to ignorance/darkness, or a circular journey resulting in a chthonic katabasis along Orphic lines. Though the particular details will vary from one allegorical account to the next, they tend to face objections similar to those faced by Sextus’ treatment. The metaphorical associations are often strained at best, if not far beyond any reasonable speculation, particularly when one attempts to find metaphorical representations in every minor detail. More theoretically problematic, determining some aspects to be allegorical while other details are not would seem to require some non-arbitrary methodology, which is not readily forthcoming. Due to ambiguity in, and variant possible readings of, the text, there is room for many variants of allegorical interpretations—all equally “plausible,” as it seems none will be convincing on the evidence of the Proem alone. Recognition of this has led some to claim that while the Proem is certainly allegorical, we are so far distant from the cultural context as to have no hope of reliably accessing its metaphorical meanings (for example, Curd 1998). Finally, the allegorical accounts available tend to offer little if any substantive guidance or interpretative weight for reading the poem overall. For these reasons, allegorical treatment has become less common (for extensive criticism of Sextus’ account and allegorical treatments of the Proem in general, see Taran 1965; Palmer 2009).

With the decline of allegorical treatments, an interest in parsing the Proem in terms of possible shared historical, cultural, and mythical themes has arisen. For instance, a fair amount has been written on the parallels between the chariot’s path and Babylonian Sun-mythology, as well as how the Proem supposedly contains Orphic and/or Shamanistic themes. However, while Greek sun mythology may well have ancient Babylonian roots, the cultural origins do not seem at all relevant to Parmenides’ own cultural understanding at his time, nor that of any likely listener or reader of his work. Shamanistic influences are more suspect and can easily be dismissed as a literary device designed to get the reader’s attention (Lombardo 2010). Purported Orphic parallels turn on Orphism’s revelatory journeys to the underworld, as well as initiations led by Night, and such influences are far more likely to have been relevantly parallel. Unfortunately, little is known about this mystery tradition overall, particularly at Parmenides’ time. Thus, it is overly speculative to hang very much on this purported influence with any confidence. Also, the theme of knowledge gained via chthonic journey, while consistent with Orphism, would not seem to be unique to that tradition, and the kind of “revelations” Parmenides’ youth undergoes are very different. The youth does not learn about any topics Orphism itself focuses on: moral truths, the nature of the soul itself, or what the afterlife was like. Furthermore, Parmenides’ unnamed youth learns a rational account based upon argumentation that can (and should) be tested and applied (compare C/DK 7.5-6), which is very different from the more “revelatory” nature of Orphism.

Overall, the Proem has far more commonly been minimized, dismissed as irrelevant, and/or entirely ignored by ancients and moderns alike, probably because they saw no immediately obvious philosophical content or guidance for understanding the rest of the poem within it. A select few advocate that the reader is merely supposed to recognize that Parmenides is here indicating that his insights were the product of an actual spiritual experience he underwent. However, there is no real evidence for this, and some against. The verb forms (optative and imperfect) suggest ongoing, indefinite action—a journey that is repeated over and over, or at least repeatable—which cuts against a description of a one-off event that would be characteristic of a “spiritual awakening.” Even more problematic, the rationalistic account/argumentation of the goddess—which she demands the listener/reader judge by reason (logos)—would thus be superfluous, if not undermined (C/DK 7.5-6). The same objection holds for attempts to dismiss the Proem as a mere nod to tradition, whereby epic poets traditionally invoke divine agents (usually, the Muses) as a source of inspiration and/or revelatory authority. It has also been common to reduce the Proem to a mere literary device, introducing nothing of relevance except the “unnamed Goddess” as the poem’s primary speaker.

While the Proem may be enigmatic, any summary dismissal which suggests that the Proem is entirely irrelevant to understanding Parmenides’ philosophical views is likely too hasty. There are very close similarities between the imagery and thematic elements in the Proem and those found throughout the rest of the poem, especially Opinion. For instance, the Proem clearly contrasts light/fire/day imagery with darkness/night, just as the two fundamental opposing principles underlying the cosmogony/cosmology in Opinion are also Fire/Light and Night. Both the Proem and the theogonical cosmology in Opinion introduce an anonymous goddess. In fact, in contrast to Reality, both sections have extensive mythological content, which scholars have regularly overlooked. The obvious pervasive female presence in the Proem (and the rest of the poem), particularly in relation to divinity, can also hardly be a coincidence, though its importance remains unclear. Once considered at greater length, the parallels between the Proem and Opinion seem far too numerous and carefully contrived to be coincidental and unimportant. This suggests a stronger relationship between the Proem and Opinion than has commonly been recognized, and the need for a much more holistic interpretative approach to the poem overall, in contrast to the more compartmentalized analyses that have been so pervasive. Further scholarly consideration along these lines would likely prove quite fruitful.

b. The A-D Paradox: Select Interpretative Strategies and their Difficulties

The central issue for understanding the poem’s overall meaning seems to require reconciling the paradoxical accounts offered in Reality and Opinion. That is, how to reconcile: a) the positively endorsed metaphysical arguments of Reality, which describe some unified, unchanging, motionless, and eternal “reality,” with b) the ambiguously negative (or perhaps, ambivalent) treatment of the ensuing “cosmology” in Opinion, which incorporates the very principles Reality denies. Based upon the Greek terms for these respective sections (Aletheia and Doxa), I will refer to this as the “A-D Paradox.”

In this section, some of the more common and/or influential approaches by modern scholars to address this paradox are considered, along with general objections to each strategy. This approach provides a more general view of the A-D Paradox than taking on any selection of authors as foils, allowing the reader a broad appreciation for why various interpretative approaches to the poem have yet to yield a convincing resolution to this problem.

i. Strict Monism and Worthless Opinion

The most persistent approach to understanding the poem is to accept that for some reason—perhaps merely following where logic led him, no matter how counterintuitive the results—Parmenides has concluded that all of reality is really quite different from how it appears to our senses. On this view, when Parmenides talks about “what is,” he is referring to what exists, in a universal sense (that is, all of reality), and drawing a cosmological conclusion on metaphysical grounds—that all that exists is truly a single, unchanging, unified whole. This conclusion is arrived at through a priori logical deduction rather than empirical or scientific evidence, and is thus certain, following necessarily from avoiding the nonsensical positing of “what is not.” Any description of the world that is inconsistent with this account defies reason, and is thus false. That mortals erroneously believe otherwise is a result of relying on their fallible senses instead of reason. Thus, the account in Opinion lacks any intrinsic value and its inclusion in the poem must be explained in some practical way. It can be explained dialectically, as an exercise in explicating opposing views (Owen 1960). It can also be explained didactically, as an example of the sort of views that are mistaken and should be rejected (Taran 1965). This strict monism has been the most common way of understanding Parmenides’ thesis, from early times into the mid-twentieth century.

This reading is certainly understandable. The text repeatedly sets forth its claims in seemingly universal and/or exhaustive contexts (for example, “It is necessary for you to learn all things…” C/DK 1.28b, “And only one story of the way remains, that it-is…,” C/DK 8.1a-2). The arguments of C/DK 8 all describe a singular subject, in a way that naturally suggests there is only one thing that can possibly exist. There is even one passage which is commonly translated and interpreted in such a way that all other existence is explicitly denied (“for nothing else either is or will be except what is…” C/DK 8.36b); however, the broader context surrounding this line undercuts this interpretation, on either selection of the variant Greek transmission. The broad range of topics in Opinion seems to be intended as an exhaustive (though mistaken) account of the world, which the abstract and singular subject of Reality stands in corrective contrast to. Perhaps the most significant driving force for understanding Parmenides’ subject in this way is Plato’s ascription to him of the thesis that “all is one” and Aristotle’s subsequent similar treatment.

While this view is pervasive and perhaps even defensible, many have found it hard to accept given its radical and absurd entailments. Not only is the external world experienced by mortal senses denied reality, the very beings who are supposed to be misled by their senses are also denied existence, including Parmenides himself! Thus, this view results in the “mad,” self-denying position of rejecting the very thing Descartes would later famously show we can never deny as thinkers—our own existence. If there is to be any didactic purpose to the poem overall—that is, the youth is to learn how to not fall into the errors of other mortals—the existence of mortals must be a given; since this view entails they do not exist, the poem’s apparent purpose is entirely undercut. Surely this blatant contradiction could not have escaped Parmenides’ notice.

It is also difficult to reconcile the apparent length and detailed specificity characteristic of the account offered in Opinion (as well as the Proem), if it is supposed to be entirely lacking in veracity. Providing such a detailed exposition of mortal views in a traditional cosmology just to dismiss it entirely, rather than continue to argue against mortal views by deductively demonstrating their principles to be incorrect, would be counterintuitive. If the purpose is didactic, the latter approach would certainly be sufficient and far more succinct. The view that Parmenides went to such lengths to provide a dialectical opposition to his central thesis seems weak: a convenient ad hoc motivation which denies any substantial purpose for Opinion, implying a lack of unity to the overall poem.

Though the strict monist view remains pervasive in introductory texts, contemporary scholars have tended to abandon it on account of these worrisome entailments. Yet, there seems to be no way to avoid these entailments if Parmenides’ subject is understood as: i) making a universal existential claim, and if ii) the account offered in Opinion is treated as inherently worthless. Thus, alternative accounts tend to challenge one or both of these assumptions.

ii. Two-World (or Aspectual) Views

If the problems of strict monism are to be avoided while maintaining the apparent universal, existential subject (that is, “all of reality”), it makes sense to seek some redemptive value for Opinion so that Parmenides neither: a) denies the existence of the world as mortals know it, nor b) provides an extensively detailed account of that world just to dismiss it as entirely worthless. The primary strategy for redeeming Opinion’s value has been to emphasize its epistemic inferiority, while denying its complete lack of veracity. Such approaches also tend to simultaneously downplay any ontological/existential claims made in the poem.

Emphasizing the epistemic distinctions, it can be pointed out that the conclusions offered in Reality are reached through a priori, deductive reasoning—a methodology which can provide certainty of the conclusion, given the premises. The Greeks tended to associate such knowledge with divinity, and thus the conclusions in Reality can also be understood as “divine” (note that it is narratively achieved via divine assistance, the poem’s spokes-goddess). On the other hand, there is no “true trust” or reliability to mortal accounts (C 1.30), either in the traditional divine v. mortal distinction or in Parmenides’ poem. Parmenides attributes this failing to the fact that mortals rely entirely upon fallible, a posteriori sense experience. However, while mortal accounts may be fallible, as well as epistemically inferior to divine (or deductive) knowledge, such accounts may still be true. By passing along the goddess’ logos via his poem, Parmenides has shown how mortals can overcome the traditional division between divine (certain) and mortal (fallible) knowledge. If it is just that Opinion is uncertain, and not completely false, then it can have intrinsic value. The account in Opinion could thus be “likely” in the sense that it is the best account that can be offered, even though the mortal approach does not yield certainty like divine methodology does. It is for these reasons that Parmenides provides his own, purportedly superior, cosmology.

Emphasizing the epistemological differences between these sections is not altogether wrong, as the explicit epistemic contrasts between these accounts in the poem are undeniable. However, holding the sole failing of Opinion to be its lack of epistemic certainty can hardly be the entire story. The conclusions offered in Reality remain irreconcilable with the account in Opinion, and the entailment persists that, if the divine account is true, mortals still do not really exist to learn from Parmenides’ poem. Furthermore, other aspects of the poem are not adequately addressed at all. How is Opinion a “deceptive” account, other than that we might be misled by our fallible senses (though the account might also be true) and simply cannot be certain? How do mortals err by accepting being and not-being to both be actual, and by “naming opposites”? Even if it is granted that reliance on senses can result in these errors, it seems that any lack of error on these points would once again lead back to strict monism (if “what is” remains existential and universal) and its world-denying problems.

Attempts to resolve these issues have tended to rely upon positing an ontological hierarchy to complement the epistemic hierarchy. The account revealed by the divine methodology of logical deduction in Reality reveals what the world, or at least Being, must fundamentally be like. However, the world as it appears also exists in some ontologically inferior manner. Though any account of it cannot be truly correct, since mortals actually live in this lower ontological level, learning the best account of reality at that level remains important. In short, such views trade upon a distinction between: a) an unexperienced though genuine reality, which corresponds with divine epistemic certainty (Reality), in contrast to b) a lower-level of “reality,” accounts of which are epistemically uncertain, as well as deceptive in that they tend to obscure deeper ontological truths (especially if they are taken to describe all that there is).

A number of objections can be raised to this interpretative approach. However, they tend to boil down to worries about anachronism: the “Platonization” of Parmenides by Plato and his successors, even down to the Neoplatonist Simplicius. The ontological gradations posited on this view (in addition to anachronistic translations of Parmenides’ Greek along such lines) would suggest that Parmenides very closely anticipated the ontological and epistemological distinctions normally taken to be first developed in Plato’s Theory of Forms. While Parmenides certainly made some very basic yet pioneering advances in epistemic distinctions—advances which very likely in turn influenced Plato—the far more refined distinctions and conceptions required for this interpretation of Parmenides are almost certainly the result of interpreters reading Platonic distinctions back into Parmenides (as Plato himself seems to have done), rather than the distinctions genuinely being present in Parmenides’ own thought. The pervasiveness of such “two-world” interpretative accounts likely says far more about Plato’s extensive influence, as well as the importance of finding some way out of the world-denying entailments, than it does about Parmenides’ own novelty.

It is also quite difficult to offer a convincing explanation for what possible grounds Parmenides could have for ascribing superiority to his own account of the apparent world offered in Opinion, in comparison to any other mortal offering of his time. The content certainly does not appear to be superior. The echoes of other accounts, such as Anaximander’s and Hesiod’s, are rather obvious and not at all novel. While his cosmological claims may contain some novel truths (for example, that the moon gets its light from the sun), these claims are still cast in a deceptive framework—the “naming error” of mortals. The defense that Parmenides’ own account is superior on the grounds that Opinion is the simplest account possible, relying upon a dualism of conflicting opposites, fails to explain how it would be superior to any similar dualistic account. Furthermore, the methodology does not appear to be superior in any way—Parmenides abandons his pioneering deduction in Reality, resorting to a traditional mythopoetic approach in Opinion.

iii. Essentialist (or Meta-Principle) Views

A promising suggestion by some recent commentators is that, rather than drawing ontological conclusions about the entirety of existence, Parmenides was instead focused on more abstract metaphysical considerations. Such approaches impute a primarily predicative (rather than existential) usage of the Greek word “to be” by Parmenides, particularly with respect to the deductive argumentation found in Reality. Such approaches result in C/DK 8.1-50 revealing the nature, or essence, of what any fundamental or genuine entity must be like.

Mourelatos was the first to advocate that Parmenides employed the Greek verb “to be” in a particular predicative sense—“the ‘is’ of speculative predication.” Mourelatos takes Parmenides to be attempting an exhaustive account of the necessary and essential properties for any fundamental ontological entity. That is, to say “X is Y” in this way is to predicate of X all the properties that necessarily belong to X, given the sort of thing X is (Mourelatos 1970, 56-67). Nehamas (1981) and Curd (1998) have both developed more recent proposals along similar lines.

A common upshot of Essentialist views is that, while it remains true that every fundamental entity that exists must be eternal, motionless, a unified whole, etc., this is consistent with the existence of a plurality of such fundamental beings. Parmenides’ view would thus not be quite so radical as seen under the ontological, strict monist approaches. Furthermore, this view can have welcome implications for the narrative of how Parmenides was received by his immediate successors (that is, Anaxagoras, Empedocles, and the early Atomists). Rather than directly rejecting Parmenides’ strict monism in developing their pluralistic systems, they would be able to freely accept his conclusions regarding the nature of fundamental entities and move on to develop pluralistic systems that respected this nature while simultaneously explaining our perception of the world. Such a change in narrative is an improvement if, as Curd argues, one thinks the lack of any explicit argumentation against Parmenides’ strict monism by his successors is problematic, as it would entail later thinkers were guilty of “begging the question” against Parmenides.

Whatever the merits of this more limited and abstract thesis of Reality, such interpretations continue to face very similar, if not the same, problematic entailments and worries related to the value of Opinion. First, there is a substantial objection particular to such accounts. If Parmenides were truly providing an account of what any genuine being should be like, and this in turn outlines the requirements any acceptable cosmology must meet, it would be expected that Parmenides’ own cosmology (Opinion) would make use of these very principles. At the very least, one should expect some hint at how such an essentialist account of being could be consistent with mortal accounts. However, there is not even a hint of this in Opinion. Furthermore, though the arguments in Reality are now consistent with a plurality of fundamental perfect beings, there seems to be no way such entirely motionless and changeless entities could be consistent with, or productive of, the contrary phenomena found in the world of mortal experience. Thus, it remains difficult to see how Opinion could be true in any way, and the existence of mortals and Parmenides is still under threat, along with the implications that follow. The purpose of the poem is frustrated if mortals and Parmenides cannot exist. If Opinion is still entirely worthless, then the objections concerning its length and specificity also remain. Any attempts to introduce a “two-world” distinction still face charges of anachronism, and attempts to explain Opinion away as Parmenides’ own “best account” of the world (even though it is false) continue to be lacking in justification.

iv. Modal Views

While the presence of modal language in Parmenides’ works has long been recognized, this fact has largely escaped scholarly attention other than to evaluate whether any fallacies have been committed (compare Lewis 2009; Owen 1960, 94 fn. 2). Only recently has its presence been taken seriously enough to warrant a full-fledged interpretative account that addresses the relationship between Reality and Opinion (Palmer 2009). This approach is quite similar in some ways to the Essentialist approach. The account in Reality is still intended to provide a thorough analysis of the essential properties of some kind of being. However, the kind of being is more narrowly prescribed. Rather than an account of what any fundamental entity must be like, Parmenides is taken to explicate in Reality what any necessary being must necessarily be like, qua necessary being.

The inspiration for this approach is found in C 3/DK 2, where Parmenides introduces the initial two paths of inquiry. The first is the way “that is, and that is not to not be.” Something that is “not to not be” is equivalent to “must be.” Therefore, the first path concerns that which necessarily exists, or necessary being. The second path is clearly the contrary—“that it is not and must not be,” or necessary non-being. Though Parmenides does not use this exact formulation later in the poem, on the reasonable hypothesis that this construction is awkward (even in prose, let alone poetry), it is posited that “what is” and “what is not” are to be taken as shorthand for referring to these modes of being.

Adopting this understanding provides new and compelling perspectives on a number of issues in Reality. At the end of C 3/DK 2, the path that follows “what is not” is dismissed as one that can neither be apprehended nor spoken of. Rather than importing the likely anachronistic parallels to modern philosophy of language, particularly Russellian concerns with negative existential statements, the difficulty can be taken to be the impossibility of conceiving of necessarily non-existent things (for example, square-circles), which is a far more likely problem to have been recognized given the historical context. It is also readily understood why knowledge along these lines is entirely trustworthy, as any necessary entity must have certain essential properties given the sort of thing it is and its mode of existence.

This view also offers a very different perspective on the third way of inquiry introduced in C 5/DK 6. This is the “mixed” path of mortals, who, knowing nothing and depending entirely upon their senses, erroneously think “to be and not to be are the same and not the same.” If Parmenides’ central thesis is to explicate the essential characteristics of necessary being (and reject necessary non-being as that which cannot be conceived at all), it is fitting for him to recognize that there are other beings as well: contingent beings. Mortals who have not used reason to conceive of what necessary being must be like are stuck only contemplating and believing in contingent beings, which can and often do change their aspects and existential status—or, are at least perceived to in certain contexts—leading to a “wandering” understanding that lacks the unchanging knowledge inherent in understanding necessary being qua necessary being.

Perhaps most compelling, the properties deduced in C/DK 8 as necessarily characteristic of “what is” make far more sense on the modal approach. Clearly, the provisions against coming-to-be and perishing are far more intuitive on this model than they are on models which simply disallow “what is not” to be part of its conception. More telling, while it is still certainly possible to justify some of these properties on the grounds that thinking “what is not” is not allowed in the conception, others are far more problematic. For instance, “what is” is argued to be “limited” in spatial extent and uniform throughout (C/DK 8.42-49). Thinking of something as motionless and limited in spatial extent, and uniform throughout itself, seems to require thinking of it as “not being” in other places than it actually is and of its own properties as not extending beyond it—that is, thinking “not being” in both instances. On the other hand, if Parmenides’ thesis is to explicate what a necessary being must be like on account of its modality alone, it is perfectly acceptable to think of a (spatially extended, material) necessary being as a discrete entity, which must possess its modal nature uniformly throughout. In short, were we even now to construct a list of properties essential for any necessary (spatially extended) being, such a list would closely match Parmenides’ own.

Though the modal view seems compelling in many ways with respect to Reality, the same might be said of other views considered above. The real question is whether it can resolve the “A-D Paradox,” while providing a compelling and meaningful answer for the inclusion of Opinion. Since Reality explicates the nature of necessary being, and this is a very different sort of thing from the contingent beings described in Opinion, the tension between these accounts has already been largely eliminated. Thus, the modal view generally succeeds in resolving the “A-D Paradox,” as it restricts Reality to such a narrow scope that Opinion can no longer be about the same entities. However, there can still be better and worse explanations for what Parmenides’ intention is with Opinion, and this still involves resolving whether its status is ultimately positive or negative, and in what sense. While Palmer has offered a very insightful and important contribution to Parmenidean studies, it is not beyond reproach or objection.

Palmer’s own view on Opinion is quite positive. Palmer takes the error of mortals to be thinking that contingent beings are all there is in the world, by relying solely upon their senses. It is not that the objects in Opinion do not exist; it is that they do not share the same unwavering epistemic account as necessary being does, as the contingent objects and phenomena found in Opinion are in a certain way, and then they are not—as they change, move, come to be, perish, and so forth. Thus, mortal knowledge remains “wandering,” while the (divine) knowledge of necessary being that Parmenides imparts is certain and unchanging. Nevertheless, the contingent world does exist, so there is value in knowing what one can about it. Thus, Palmer avers that Opinion is Parmenides’ own best attempt to explain the world of contingent being, which does not admit knowledge via the deductive methodology used in Reality. In this way, Palmer has succeeded in developing an interpretation that requires only an epistemic hierarchy between Reality and Opinion, without the additional ontological hierarchy of Two-World views and the anachronistic worries that accompany them.

While the modal view does allow the existence of contingent beings and thus an account of them would be valuable in-itself, it does not necessarily follow that this is what Parmenides was attempting in Opinion. Such a positive treatment still seems to be in tension with the overarching negative treatment of Opinion throughout the poem. Furthermore, Palmer’s attempts to portray passages about Opinion in a more positive light are far less compelling than his modal treatment of Reality.

One of the more problematic attempts to cast Opinion in a more positive light is Palmer’s corrective emendation of the missing verb at the end of line C 5.3/DK 6.3, changing the goddess’ words from warning the youth away from the “wandering path” of mortals (εἴργω) to a programmatic promise (ἄρχω) to begin an explication of that path later on (Palmer 2009, 65-67). The problem with this emendation is that in Greek the active verb ἄρχω normally means “rule”—the verb typically carries the meaning of “begin” only in its middle form (ἄρχομαι). This objection is not decisive, however, as Palmer’s overall view does not require this emendation.

What is fundamentally damning is Palmer’s view that Opinion is Parmenides’ best account of contingent being. First, Palmer faces the challenges noted above of explaining why Parmenides would be entitled to think his own mythopoetic account in Opinion would be superior to any other mortal account. Palmer is likely entitled to the view that the cosmology is “Parmenides’ own,” in the sense that it is his own construction and not borrowed from someone else, on the grounds that it contains novel cosmological truths (that the moon gets its light from the sun). The account could even be “superior” in that it contains novel cosmological truths that past accounts failed to include. However, this would require that Parmenides really think there could be no further discoveries that would then surpass his own knowledge. More importantly, there are no grounds for supposing that his theogonical content is superior to Hesiod’s.

Second, and most importantly, Palmer’s positive account of Opinion fails to explain how mortals could possibly be mistaken about the subject of Reality, as the text clearly requires.

to it all things have been given as names (8.38b)
all that mortals have established in their conviction that they are genuine,
both coming to be and perishing, both being and not
and altering place and exchanging brilliant colour. (8.41)

“What is” is the subject of 8.38b-41, which is uncontroversially the entity described in Aletheia, and thus necessary being on Palmer’s view. It is to that entity mortals have “given as names” all the attributions listed: coming to be, perishing, and so forth. That mortals “in their conviction” believe these names to pick out something “genuine” with respect to the subject of Aletheia clearly implies that mortals are in error to do so—that these phenomena do not correctly specify the nature of Aletheia’s subject. This is the central claim about mortal errors by the goddess, and it is undeniable that, in order for mortals to incorrectly specify the nature of Aletheia’s subject, mortals must have some conception of and familiarity with that relevant object (or type of entity). Palmer recognizes this himself, asking “How can mortals describe or misconceive What Is [necessary being] when they in fact have no grasp of it?” (167). The answer is, of course, that they cannot. The naming error of mortals requires an account of how mortals get the nature of Reality’s subject wrong.

According to Palmer, Parmenides’ task is to explicate the essential nature of necessary being, qua necessary being. Since mortals have only ever relied upon their sense perceptions rather than deductive logic, they have never conceived of the essential nature of any necessary entity. Thus, their failure is to have believed that all of reality consisted entirely of contingent beings. However, if mortals have never conceived of necessary being, then they certainly could not ever have been wrong about it, and incorrectly predicated motion, change, coming-to-be, perishing, and so forth of it. Palmer himself recognizes this tension and attempts to explain it away as follows:

Apparently because mortals are represented by the goddess as searching, along their own way of inquiry, for trustworthy thought and understanding, but they mistakenly suppose that this can have as its object something that comes to be and perishes, is and is not (what is), and so on. Again, the goddess represents mortals fixing their attention on entities that fall short of the mode of being she has indicated is required of a proper object of thought. (172)

However, this is no solution. Erroneously thinking that contingent beings can provide “trustworthy thought and understanding” may indeed be an error of mortals. Yet, this is certainly not the same error as mortals thinking that which is explicated in Aletheia can be properly described in ways contrary to its nature (that is, coming to be, perishing, and so forth), which is precisely the error the goddess insists they commit. Palmer’s view of Opinion simply cannot satisfy this textual requirement.

However, Palmer’s modal view of Reality can be readily modified to be consistent with a more negative treatment of Opinion. In fact, a more negative treatment of Opinion seems necessary in order to avoid this fatal flaw. A ready solution is available, which Palmer himself considered at some length, but ultimately rejected—identifying “what is” not only as necessary being, but also as divine being. This allows mortals to have a familiar subject (divine being) which they have up until now misunderstood through the mythopoetic tradition, failing to recognize that any such being would have to be a necessary being, and as such could not be born, die, move, change, or even be anthropomorphic.

In explicating the essential nature of the divine qua necessary being in Reality, Parmenides can be understood as continuing the Xenophanean agenda of criticizing traditional, mythopoetic views of the divine, though he uses metaphysical and deductive argumentation, rather than the ethical appeals of his predecessor. This should not be surprising, given Parmenides’ historical context. Incorporating naturalistic elements or principles that are supposed to be divine, in contrast to anthropomorphic conceptions from the mythopoetical tradition, was otherwise pervasive amongst the Presocratics. The Milesians tended to treat their fundamental and eternal arche as divine entities. Pythagoras, perhaps more of a religious mystic in the first place, certainly included his own views on divinity. Identification of divine entities certainly does not end after Parmenides either, as the systems of the Pluralists and Atomists continue to associate their fundamental “Parmenidean” entities with divinity. Most importantly, of course, Xenophanes’ conception of his supreme (and perhaps only) deity very closely parallels Parmenides’ description of “what is,” and recognizing “what is” as a necessary being would only seem to advance this metaphysical treatment of divinity even further. Nor should this be at all surprising, given the extensive evidence for Xenophanes’ role as a strong influence, or even personal teacher, of Parmenides (compare 4.a.iii below). Thus, even if Parmenides never (at least, not in the extant fragments) refers to “what is” as a god/divine thing, that he was thinking along those lines and paralleling the properties others had ascribed to their conception of deity is hard to deny, and readily makes the modal view at least tenable, and perhaps compelling.

4. Parmenides’ Place in the Historical Narrative

a. Influential Predecessors?

No more certain than the rest of his general biography is Parmenides’ intellectual background, with questions arising regarding whether he was a pupil of, or at least heavily influenced by, some particular thinker(s). If so, the question remains whether he sought to further refine or challenge such views—or perhaps both. This section broadly analyzes the evidence for ascribing particular intellectual influences and teachers to Parmenides.

i. Anaximander/Milesians

Theophrastus alone asserted that Parmenides was a “pupil” of Anaximander (Coxon Test. 41, 41a). However, this is historically impossible—even with the earliest birthdate, Anaximander was long dead before Parmenides ever engaged in philosophical contemplation. Thus, Parmenides could never have been personally instructed by Anaximander. At most, one could argue that Anaximander’s views influenced and/or provided a particular target for Parmenides to reject, as some modern scholars have suggested.

It is quite likely Parmenides would have been familiar with Anaximander’s works. At least, there is no good reason to doubt this. Also, there are certainly parallel conceptions and opposing contrasts that can be drawn between these two thinkers. Both can be read as understanding the cosmos to be operating in terms of some “necessity,” in accordance with “justice” (compare Miller 2006). Anaximander’s dualistic opposites consist of “hot” and “cold” interacting with each other in generative fashion. Similarly, Parmenides’ dualistic “cosmology” names “Light/Fire” and “Night” as the primordial opposites that are found in all other things, and Aristotle and Theophrastus both explicitly associate these Parmenidean opposites with “hot and cold” (Coxon Test. 21, 25, 26, 35, 45). Both Parmenides and Anaximander describe cosmological light as rings, or circles, of fire. They both think that there are deathless and eternal things (Parmenides’ “what is,” and Anaximander’s divine arche, the apeiron). Both can be understood as drawing upon a rudimentary “principle of sufficient reason,” concluding that if there is no sufficient reason for something to move in one direction or manner versus another, then it must necessarily be at rest (Parmenides’ “what is,” and Anaximander’s description of Earth). Perhaps most importantly, Anaximander suggests opposites arise from an “indefinite” or “boundless” (apeiron) eternal substance. Parmenides’ metaphysical deductions can be understood as a direct denial of this, either because nothing could ever arise from something so indefinite in qualities as the apeiron (as such a thing would essentially be nothing; that is, Parmenides denies creatio ex nihilo), and/or that Anaximander’s apeiron is not a proper unity (one distinct thing) if opposites with entirely different and distinct properties (hot and cold) can arise from it (Curd 1998, 77-79; Palmer 2009, 12).

However, closer inspection of these supposed parallels tends to undermine the thesis that Anaximander is particularly influential upon, or a specific target for, Parmenides. First, many of the particulars of Anaximander’s views are noticeably absent from, or distinct from, the supposed parallels in Parmenides. While Anaximander’s “necessity” is probably best understood as physical laws, Parmenides’ conception appears to rely on logical consistency. Whereas Anaximander envisions “justice” as a regulatory, ontological compensation arising from the competition for existential pervasiveness amongst opposites, this conception of natural balance via justice is entirely absent in Parmenides’ works. The Aristotelian identification of Parmenides’ Light/Night dualism with Anaximander’s Hot/Cold opposition is also highly suspect. No extant fragments of Parmenides make this connection. Furthermore, the Peripatetics mistakenly refer to Parmenides’ two primary principles in Opinion as “fire and earth” instead of “fire (or light) and night.” Anaximander’s description of the cosmic fire rings as “tubes with vent-holes” is lacking in Parmenides’ “rings of fire.” In short, none of these supposed parallels clearly identifies Anaximander as an influence/target, and they can all be understood as rather common conceptions of cosmology and physics in philosophically-oriented Greek minds during Parmenides’ time (Cordero 2004, 20; Curd 1998, 116-126). The same can be said for the apparently shared views on the existence of divine/eternal beings, appeals to some “principle of sufficient reason,” as well as the denial of creatio ex nihilo, and the impossibility of distinct pluralities arising from a properly indistinct unity.

If Parmenides is not directly targeting Anaximander in particular, it is possible that he could be understood as responding to Milesian physics and cosmology in general, though this too is unlikely. On the one hand, Parmenides seems to be engaged in a very different sort of endeavor. Whereas the Milesians sought to explain cosmology and physics by identifying the arche (“origin” or “first principle”) from which all things originated (and possibly, remained constituted by), the only section of Parmenides’ poem that could provide an alternative or competing cosmological account (Opinion) is supposed to be fundamentally and deeply flawed, and offered for rejection on some grounds. The grounds upon which this cosmology is flawed are the point of Parmenides’ overall project, which seems far broader than denying Milesian views in particular. Though Parmenides may very well be challenging fundamental assumptions made by the Milesians—that things exist; that our senses provide knowledge of existing things; that there must be a primordial, foundational element(s)—these assumptions are hardly unique to them (Cordero 2004, 20). Thus, while the Milesians should most likely be listed amongst “the mortals” whose opinions about the world Parmenides thinks are fundamentally mistaken, and while Milesian views may even be paradigmatic examples of such mistaken views, Parmenides’ criticism would seem to include the common “man on the street” as well. Parmenides’ thesis is broader, his focus more metaphysical and logically-driven, than can be explained by ascribing the more historically-based motivation of challenging the Milesians (compare Owen 1960; Palmer 2009, 8-29). He is challenging everyone’s understanding.

ii. Aminias/Pythagoreanism

Ancient sources provide very limited support for imputing a significant Pythagorean influence upon Parmenides. Sotion alone attests that Parmenides, though admittedly a student of Xenophanes, did not follow him (presumably, in his way of thinking)—but was instead urged to take up the Pythagorean “life of stillness” by Aminias (Coxon Test. 96). Though Sotion goes on to describe how Parmenides built a hero shrine to poor Aminias upon his death, nothing else is known of Aminias himself. It is also reported by Nicomachus of Gerasa that both Parmenides and Zeno attended the Pythagorean school (Coxon Test. 121). However, even if this were true, it does not necessarily follow that either adopted or sought to challenge Pythagorean views in their later thought. Finally, Iamblichus mentions Parmenides amongst a list of “known Pythagoreans,” though no defense of, nor basis for, this attribution is provided (Coxon Test. 154).

Similarly, only a relatively small minority of contemporary scholars has been committed to defending a general Pythagorean influence upon Parmenides, and even fewer are willing to grant credence to the claim that Aminias was his teacher. The vast majority of contemporary Parmenidean scholars reject the Pythagorean influence entirely, or at least hold that it was not directly or substantially significant in informing Parmenides’ own mature views. When contemporary commentators have attempted to demonstrate the presence of Pythagorean elements within Parmenides’ text itself, the attempts have been quite strained, at best—particularly given the general lack of good information about early Pythagoreanism itself. Most telling against this purported influence is the fact that even amongst modern scholars who agree that Parmenides does demonstrate Pythagorean influences, the details and purported parallels differ entirely from one commentator to the next. As a result, even those who agree that there is a Pythagorean influence cannot agree at all on what exactly that influence consists of, or what counts as evidence for it.

It would seem that the real reason for the persistence of this association is far more dependent upon geographical considerations than is often let on.  In fact, this is probably the best argument for thinking any Pythagorean influence upon Parmenides is likely in the first place, as the primary Pythagorean school was founded in Croton, just over 200 miles SSE from Elea. While geographical considerations make it virtually certain that Parmenides was aware of the Pythagorean school, and even had interactions with Pythagoreans, there is simply no compelling evidence for any significant influence by this tradition in his mature work.

iii. Xenophanes

That Parmenides was either a direct disciple of Xenophanes, or at least heavily influenced by him in developing his own views, is pervasive amongst ancient sources. Plato is the first, claiming that there is an “Eleatic tribe,” which commonly held that “all things are one,” and that this view was first advanced by Xenophanes—and even thinkers before him! While this offhand remark by Plato may not be intended to be taken seriously in pushing Eleaticism back beyond Xenophanes, the idea that there is some real sense in which the philosophical views of these two are closely related is suggestive. Both Aristotle and his student Theophrastus explicitly claim that Parmenides was a direct personal student of Xenophanes. This is further attested by several later doxographers: Aetius (2nd-1st cn. B.C.E.) and “Pseudo-Plutarch” (1st cn.? B.C.E.). Numerous other ancient sources from variant traditions and spanning nearly a millennium could be listed here, all of which attest to a strong intellectual relationship between these two thinkers, on several different interpretative bases. While some are skeptical of this relationship (for example, Cordero 2004), most modern scholars are willing to grant some degree of influence between these thinkers, and the overall evidence is perhaps suggestive of a far deeper relationship than is normally admitted.

It is often superficially recognized that both Xenophanes and Parmenides wrote in verse rather than prose. It is also common to point to stark differences on this point. First, it is commonly claimed that Xenophanes was a philosophically-oriented poet, in contrast to Parmenides—a “genuine philosopher” who simply used poetry as a vehicle for communicating his thoughts. This seems to be based primarily upon the fact that Xenophanes also wrote silloi (satirical poetry), which do not always have obvious or exclusively philosophical themes. Also, even when both do make use of the epic dactylic-hexameter meter, there is a difference in vocabulary and syntax; Parmenides extensively and deliberately imitates language, phrasing, and imagery from Homer and Hesiod, while Xenophanes does not.

On the other hand, it is unfair to dismiss Xenophanes’ philosophical focus based upon his elegiac poems, which often do contain philosophical elements—for example, criticism of traditional religious views, the value of philosophy for the state, and how to live correctly. Furthermore, aside from these silloi, the majority of the extant fragments appear to be part of one major extended work by Xenophanes, all of which are in the epic style. More importantly, the content and general structure of this work bear substantial similarities to Parmenides’ own poem. Xenophanes begins by explicitly challenging the teachings of Homer and Hesiod in particular and of mortals in general regarding their understanding of the gods. Parmenides’ Proem clearly opens with a journey grounded in Homeric/Hesiodic mythology, and one of the main things the youth is to learn is why the opinions of mortals (including, or perhaps even especially, Homer and Hesiod) are misguided. Xenophanes draws a distinction between divine and mortal knowledge which mortals cannot overcome; Parmenides’ poem also seems to acknowledge this distinction, though he may very well be suggesting this divide can be overcome through logical inquiry, in contrast to Xenophanes. Xenophanes claims that the misunderstanding of the gods is the result of mortals relying upon their own subjective perceptions and imputing similar qualities to divine nature. Similarly, Parmenides’ spokes-goddess ascribes the source of mortal error to reliance on sense-perception, in contrast to logical inquiry, and may also have a divine subject in mind.

Having identified his intellectual targets, Xenophanes seems to move from criticism of others to providing a positively-endorsed, corrective account of divine nature. There is one (supreme or only?) god, which is not anthropomorphic in form or thought. Though this being does have some sort of sensory perception (hearing and seeing) and thinking abilities, it is different from how mortals experience these states—if in no other way than that this supreme god sees, hears, and knows all things. This being is unchanging and motionless, though it affects things with its mind. It cannot be denied that the description of Xenophanes’ (supreme/only) god bears many of the same qualities as Parmenides’ “what is”—the only question is whether Parmenides was directly influenced in this matter by Xenophanes’ views.

There are also extant fragments from Xenophanes which seem to provide an extended cosmology and physics. It is possible these constituted the end of Xenophanes’ major epic work. If so, there would again be at least a superficial similarity in structure between his poem and Parmenides’ own. More substantially, there may be parallel passages that could suggest the cosmology/physics on offer in each is not to be trusted. Consider Xenophanes’ injunction to believe things he has described as “resembling the truth” (Xenophanes B35). It is unlikely that he would be undercutting his positively-endorsed account of his “one god” in such a way; thus, this likely refers to his physics/cosmology. When Parmenides’ spokes-goddess tells us why she is providing her “deceptive” account of mortal opinions (which in itself implies it should not be taken as correct or real), she uses the exact same Greek adjective (ἐοικότα) to describe this mortal account as one which is “entirely fitting” for the youth to learn (C/DK 8.60). The context here seems to be that the deceptive account offered in Opinion, which shares the mistakes any mortal account might possess and/or makes the failure of mortal accounts most evident, is worth learning so that the youth may best know how to avoid the mistakes other mortals make.

Finally, if geographical proximity is grounds for imputing a likely intellectual influence, then the case for a Xenophanean influence on Parmenides is just as strong, if not superior to, the Pythagorean association considered above. Though Xenophanes was originally a native of Colophon, an Ionian city which lies on the opposite side of the ancient Greek world from Elea, evidence suggests he spent significant time in or at least near Elea. Xenophanes describes himself as having spent sixty-seven years traveling and sharing his teachings after leaving Colophon at the age of twenty-five. Xenophanes’ writings clearly demonstrate familiarity with Pythagoras himself, and thus implies familiarity with his school in southern Italy. Diogenes explicitly reports that Xenophanes lived at two locations in Sicily (near Elea) and that Xenophanes even wrote a poem on the founding of Elea, as well as his native Colophon. What would make these two cities worthy of odes, and no others? Likely, both were important to Xenophanes in the same respect—he identified with both as “home.” Finally, Aristotle explicitly reports that the citizens of Elea sought Xenophanes’ guidance about religious matters on at least one occasion (Rhetoric II.23 1400b6). On these grounds, in addition to the ancient reports that Parmenides was Xenophanes’ student and the parallels found in their major works, it is very likely Xenophanes lived near or in Elea during his philosophical maturity, likely at just the right time to influence Parmenides in his own philosophical development.

iv. Heraclitus

No ancient source attests that Heraclitus influenced Parmenides. In fact, the only ancient source to suggest any relationship between the thinkers is Plato, who would have Parmenides influencing Heraclitus instead. Plato’s claim is almost universally rejected today, especially since Heraclitus does not hesitate to criticize other thinkers, and he never mentions Parmenides. However, it was quite common throughout much of the twentieth-century for modern scholars to argue that Parmenides was directly challenging Heraclitus’ views, and introductory textbooks continue to regularly draw interpretative parallels between them. The interpretative comparison generally relies upon the highly questionable ascription of a motionless, strict-monism to Parmenides, in contrast with understanding Heraclitus as the “philosopher of flux,” who advocated a pluralistic universe constantly undergoing motion and change. I leave consideration of the interpretative adequacy of these views aside here. While ahistorical interpretative comparisons can certainly be worthwhile exercises in-themselves, the question here is whether there are actually any good historical grounds to think Parmenides was directly challenging Heraclitus. The thesis that Heraclitus influenced Parmenides faces serious chronological challenges. All evidence suggests Heraclitus wrote his major work near the end of his sixty-year lifetime. Evidence also suggests Parmenides could not have written much after Heraclitus’ own death. This leaves little time in between, if any, for Parmenides to become aware of or be inspired to challenge Heracliteanism.

That Heraclitus wrote late in his lifetime is evident from his explicit criticism of other thinkers. In one passage, Heraclitus criticizes the Ephesians for exiling his friend Hermodorus, which would have occurred at the very end of the sixth century (B121). In another passage, he denigrates Hesiod, Pythagoras, Xenophanes, and Hecataeus as failing to understand anything, despite their studiousness (B40). This list of names is in chronological order, and Heraclitus’ use of the past tense may be taken as indicating they are all dead by the time of his writing. If so, as Hecataeus’ lifetime is estimated as c. 550-485 B.C.E., Heraclitus would have to have completed his work in the last few years of his life. Admittedly, Heraclitus’ use of the past tense here is not decisive, as it certainly does not require all those named be dead. It only requires that the named persons have, on Heraclitus’ view, demonstrated their lack of understanding in the past (likely through written works). Nevertheless, this passage still supports a late composition date. Though Pythagoras established his school in Croton early in Heraclitus’ life (c. 530 B.C.E.), it would seem to require a significant amount of time for the arcane teachings of Pythagoreanism to have made their way to Ephesus. In addition, since Pythagoras himself did not write anything, any written works in the Pythagorean tradition that were disseminated must have been written by his followers—again, probably after Pythagoras’ own death (post-500 B.C.E.). Finally, even if Hecataeus was still alive at the time Heraclitus wrote, Hecataeus almost certainly wrote his own works late in his lifetime, after his travels (that is, post-500 B.C.E.).

Given this evidence, it is reasonable to estimate the earliest composition of Heraclitus’ book at c. 490 B.C.E., plus or minus five years. If Plato’s claims are correct that Parmenides was the personal teacher of a young Zeno (490-430 B.C.E.), and that Zeno wrote his own book in defense of his master’s views while very young, then Parmenides must have written prior to 470 B.C.E. This estimate is reasonable even if the details of Plato’s Parmenides are not reliable and if one accepts Diogenes’ account, as Parmenides would be seventy by this point. This leaves a rather short window—less than twenty years—for Heraclitus’ views to spread across the Greek world to Elea and inspire Parmenides. Though not impossible, this is unlikely.

Furthermore, when further biographical details are considered, the window arguably completely vanishes. Diogenes reports that upon completion of his book, Heraclitus deposited it (apparently the only copy) in the Temple of Artemis (the Artemisium). Access to the work in the temple’s storeroom would almost certainly have been limited and available only to particularly privileged persons. Tradition further holds that Heraclitus himself did not have any students, but that a following eventually arose amongst those who studied his book and named themselves Heracliteans. Under these circumstances, in conjunction with Heraclitus’ deliberate obscurity, the time required to study, discuss, teach, and disseminate Heraclitus’ views into the rest of the Greek world would be substantial—not years, but decades. This inference seems supported by the lack of any records of Heracliteans in the early fifth century. In fact, the very earliest evidence of a Heraclitean outside of Ephesus is of Cratylus—a thinker to whom Plato dedicated an eponymous dialogue and whom Aristotle reports as the first teacher whose views Plato adopted (Metaphysics i.6, 987a29-35). If Cratylus was the first to spread Heracliteanism beyond Ionia, and if he indeed taught Plato before Socrates did (pre-410 B.C.E.), Heracliteanism would only have first arrived in Athens (let alone Elea) sometime after 450 B.C.E., far too late to influence Parmenides’ work (pre-470 B.C.E.), and almost certainly after his death.

All this may still be objected to, and the unlikely possibility apparently confirmed, on the grounds of supposedly clear textual evidence. Parmenides is commonly thought to have made a clear allusion to Heraclitus, describing mortals with no understanding as simultaneously accepting that “things both are and are not, are the same and not the same” (C 5/DK 6). This can be understood as referring directly to Heraclitus’ paradoxical aphorisms, which describe things like rivers and roads as being both simultaneously the same, but yet not (B39; B60). Proponents may even go on to point out that Parmenides describes those who hold such mistaken beliefs as being on a “backward-turning (παλίντροπος) journey,” which is the same adjective Heraclitus uses to describe the unity of opposites that mortals fail to appreciate. While this may initially seem compelling, closer examination of these textual claims reveals their inadequacy. Furthermore, a broader examination of the texts reveals that the apparent attractiveness largely depends upon selective cherry-picking.

First, it is not even clear whether Heraclitus wrote παλίντροπος—some sources report παλίντονος (“backward-stretching”) instead, and there is no scholarly consensus on which is correct. Even if both did write παλίντροπος, imputing an intellectual influence from this is rather weak; it could simply be a coincidental usage of a relatively common term or idiom. In any case, the apparently similar philosophical usage—in relation to mortal cognitive failures—holds only on the surface. In neither case is the point that mortals themselves are, in their cognitive failures, “backward-turning.” Instead, Parmenides uses the term metaphorically to describe a way of inquiring that leads to contradiction. He is using this image to describe a way in which mortals should not think about things. On the other hand, while Heraclitus’ use is also metaphorical, he is advocating his view of how opposites should be thought of. Here, nature itself is “backward-turning,” and the failure by mortals is the failure to recognize this fortunate confluence of opposites, which can result in the complementary unity and harmony found in, for example, the bent sapling and taut string that form a bow. Parmenides is not here denying Heraclitus’ more limited claim—that opposites can work together to produce some new harmony. Rather, he seems to be claiming that even thinking of opposites requires thinking in terms of a more fundamental distinction—“what is” and “what is not”—and that this inevitably leads to contradiction. While Parmenides’ claims would certainly refute Heraclitus, his view is aimed at an error found in all prior philosophical systems, and even in the most common mortal beliefs. Thus, to take this passage, and thereby Parmenides’ overall poem, as intentionally designed to challenge Heraclitus in particular is to risk missing Parmenides’ larger project for what it is.

Perhaps more telling, the apparent direct challenges and differences between these thinkers are belied by the similarities. Heraclitus describes the divine Logos as eternal and unchanging, much as Parmenides describes “what is.” Properly understanding the Logos is supposed to lead to the conclusion that “all is one,” and Parmenides has often been thought to be advocating similar monistic conclusions regarding “what is.” Similarly, both can be read as advocating that there is no distinction between night and day—that they are both one, and that both are also divine. Both are critical of common mortal views, and both seem to acknowledge a distinction between mortal and divine knowledge.

In the end, these similarities should no more be taken as indicative of direct influence than the apparent critical differences—the chronology makes both problematic. How then might the apparent interaction/influence between these thinkers be explained, when they were almost certainly writing in isolation and ignorance of each other on opposite sides of the Greek world? The better explanation here is to seek a common influence which would explain the similarities in doctrine and critical themes and which would have been widely spread by the end of the sixth century. The most obvious common influence in this context is Xenophanes.

b. Parmenides’ Influence on Select Successors

As this article has set out to demonstrate, understanding the meaning Parmenides intended in his poem is quite difficult, if not impossible. Given that any historical narrative of Parmenides’ legacy is directly determined by how the poem is supposed to be understood, there are almost as many plausible accounts of the former as of the latter. Such considerations are further complicated by the tendency of ancient philosophical authors to use prior views to serve their own interests and purposes, with relatively little regard for historical accuracy. Even if ancient authors did conscientiously attempt to portray earlier thinkers faithfully, there is no guarantee they properly understood the original author’s intended meaning—and this is particularly the case given Parmenides’ enigmatic style. Thus, the question “how did Parmenides’ own views influence later thinkers?” may in many cases be a mistake, as the more relevant question could be “how did each of Parmenides’ successors understand Parmenides and/or choose to make use of his work in their own?” Given these considerations, it is especially difficult to speak about Parmenides’ influence upon his successors in any detail without first adopting a particular interpretative outlook. However, there are some general observations that can be advanced which are, at least, highly suggestive. Some of these are briefly sketched out below.

i. Eleatics: Zeno and Melissus of Samos

Though it is highly questionable whether Parmenides himself argued for strict monism, the views found in the writings of his immediate followers can be taken as advocating it. It is important to realize that this does not show that Parmenides was in agreement on this point—it could be that this is the way they developed his thoughts and logical method further. Whatever Parmenides himself held, however, it is clear that his writings did lead some to adopt this view.

Tradition holds that Zeno of Elea was a student of Parmenides from a young age. He is famous for writing a short book of “paradoxes,” which are designed to demonstrate the absurdity of positing a plurality of beings, as well as the associated conceptions of change and motion. Zeno certainly adopts and improves upon Parmenides’ deductive, reductio ad absurdum argumentative style in his prose. However, whether the denial of pluralism was Zeno’s own addition to his teacher’s views, or whether he is truly and faithfully defending Parmenides’ own account, as Plato represents him to be (Parmenides 128c-d), is not clear.

It is uncontroversial that there was at least one Greek thinker who did adopt and defend the radical metaphysical hypothesis of strict monism: Melissus of Samos. Melissus clearly (largely because he wrote in prose) adopts Parmenides’ own language and argumentative styles, especially from C/DK 8, and expands upon them. Most qualities of Parmenides’ “what is” remain the same in Melissus’ account—it is eternal, motionless, a uniform and complete entity, and so forth. However, Melissus describes his sole being as unlimited in extension, rather than limited. He further adds that “what is” cannot undergo psychological changes (such as pain, distress, or health) and explicitly denies the existence of void. He also expressly denies the existence of the things mortals believe in, yet fails to realize the entailment that mortals—including himself—thus also would not exist.

ii. The Pluralists and the Atomists

The Pluralists include Anaxagoras and Empedocles. Each posits a very different cosmological account, based upon different fundamental entities and processes. Anaxagoras posits an extensive number of fundamental and eternal seeds, every kind of which is found in even the smallest portion of matter, and which give rise to objects of perception according to whichever kind of seed dominates the mixture at a particular spatio-temporal location, in accordance with the will of Nous.

Empedocles (who may have been a student of Parmenides) makes the four Greek “elements” (earth, air, fire, water), as well as two basic forces (Love and Strife), his fundamental entities. The “elements,” in conjunction with these basic forces, are continuously mixed together to become one and subsequently separated entirely in the eternal cyclical nature of the cosmos.

Despite the radical surface differences, the Pluralists share some basic ideas. Both hold that the world as it appears to mortals is an outcome of the mixture and separation of fundamental entities—entities which satisfy the description of “what is” set forth by Parmenides in Reality, except for the motion they undergo. This is almost certainly no accident, and generally indicative of Parmenides’ influence on Greek thought overall. However, it is again not clear whether the Pluralists are best understood as agreeing with, or rejecting, Parmenides’ own views. Even if passages ascribed to these thinkers are seen to be rejecting Eleaticism, the rejection may need to be taken as directed against Melissus, not Parmenides himself.

The situation is very similar with respect to the early Atomists—Leucippus and Democritus. The Atomists posit two fundamental entities, one that corresponds to “what is” (atoms), and one that corresponds to “what is not” (void). The void is simply the absence of “what is,” and is necessary for motion. The Atomists do provide arguments for the existence of void, which can seem to be a direct challenge to Parmenides’ claim that “what is not” necessarily cannot be. However, the rejection is likely more indirect, as only Melissus explicitly argued against the existence of void (though since Leucippus and Melissus were contemporaries, Melissus could instead be responding to Leucippus, unlikely as that is). The atoms are infinite in number and kind, indivisible, uncuttable, whole, eternal, and unchanging with respect to themselves. As with the Pluralists, the macro-objects found in human perception are formed by combining these fundamental micro-objects in particular arrangements. Overall, the fundamental entities of atomism again closely correspond in many ways to Parmenides’ description of “what is,” with the primary exception being motion. Also, though it is clear that the Atomists are aware of Eleaticism, whether they are best described as agreeing with and/or rejecting Parmenides’ own views is unclear, given the clear target Melissus has provided in explicitly adopting strict monism and denying the existence of void.

iii. Plato

If there is any Presocratic whom Plato consistently treats with deep reverence, it is Parmenides, whom he describes as possessing a “noble depth,” and as being “venerable and awesome.” The deep influence of Eleatic thought upon Plato is clear in his regular use of a main character known as the “Eleatic Stranger” in several late dialogues, as well as in his eponymous dialogue, Parmenides, in which the coherence of the Theory of Forms is examined. It is within Plato’s Theory of Forms, and in his accompanying epistemological and ontological hierarchies, that Parmenides’ influence can be most readily seen. Similar to “what is” in Parmenides, each Form is an eternal, unchanging, complete, perfect, unique, and uniform whole. The Forms can only be understood via reason, and understanding them is the highest epistemic level attainable, wherein one possesses certain knowledge. They also occupy the most fundamental ontological level, being always true and real in all contexts. In contrast, the account in Opinion, understood as the mistake mortals make due to their senses—thinking that being and not-being are both the same and different, and erroneously concluding that they have thus grasped the way the world truly is—provides a parallel to Plato’s description of the “world of appearances.” For Plato, this corresponds to a typical epistemological level at which human beings, relying upon their senses, can only have opinions and not knowledge. It is also a typical ontological level, at which objects and phenomena perceived simultaneously “are and are not,” as they are imperfect imitations of the more fundamental reality found in the Forms. Beyond the Theory of Forms, there are also interesting epistemic and allegorical comparisons and contrasts to be drawn between Parmenides’ poem (particularly the Proem) and Plato’s “Allegory of the Cave.”

5. References and Further Reading

This article uses the revised edition (2009) of Coxon’s seminal monograph as the standard reference for the study of Parmenides in English. For ease of reference, citations of fragments of Parmenides’ poem give Coxon’s numbering (C) first, followed by Diels-Kranz’s (DK); thus, the same fragment is indicated by (C 2/DK 5). References to all ancient testimonia regarding Parmenides are based on Coxon’s arrangement and numbering and are listed with “Test.” preceding the relevant number (for example, Coxon Test. 1). References to ancient sources concerning Presocratics other than Parmenides are based on DK’s arrangement.

a. Primary Sources

  • Austin, Scott. Parmenides: Being, Bounds, and Logic. New Haven: Yale, 1986.
    • A work focused solely on explaining the logical aspects of Reality.
  • Cordero, Néstor-Luis. By Being, It Is. Las Vegas: Parmenides, 2004.
    • Cordero provides a new perspective on Parmenides’ reasoning and method by focusing on Parmenides’ use of the verb “to be” and its enigmatic subject, in addition to the number and meaning of the “paths of inquiry” considered in the poem.
  • Coxon, A. H. The Fragments of Parmenides: Revised and Expanded Edition. Ed. and Trans. Richard McKirahan. Las Vegas: Parmenides, 2009.
    • The sixth edition of Diels-Kranz’s (DK) Die Fragmente der Vorsokratiker (1952) has long been the standard text for referencing Presocratic fragments in a numbered arrangement, along with related testimonia. However, it is somewhat dated, has long been out of print, the German commentary is relatively brief, it does not contain all the available or pertinent testimonia, and no translations of the testimonia are offered. Coxon’s arrangement of Parmenides is far more accessible to most readers of this article (in English, and easily available in print), and it provides a more comprehensive list of testimonia, with English translations. The recent revisions by McKirahan have also kept it up to date with recent advances in scholarship. For these reasons, Coxon’s is now, or should be, the standard text for Parmenidean studies. All references in this article to fragments of Parmenides’ poem and relevant ancient testimonia follow Coxon’s arrangement.
  • Diels, Hermann, and Walther Kranz. Die Fragmente der Vorsokratiker. 6th ed. Berlin: Weidmann, 1952.
  • Kirk, G.S., Raven, J.E., Schofield, M. The Presocratic Philosophers. 2nd ed. Cambridge: Cambridge, 2011.
    • A valuable introductory work on the Presocratics which provides all fragments of Parmenides’ poem in Greek with their English translation, in the midst of a running interpretative commentary. Select testimonia related to Parmenides are also provided in both Greek and English.
  • Palmer, John A. Parmenides and Presocratic Philosophy. Oxford: Oxford, 2009.
    • In perhaps the most comprehensive contemporary monograph on Parmenides’ poem, Palmer’s most novel contribution consists of fully developing a modal perspective for understanding Reality and Opinion, as well as the relationship between both sections.
  • Sider, David, and Henry W. Johnstone, Jr.  The Fragments of Parmenides. Bryn Mawr Commentaries. Bryn Mawr: Bryn Mawr College, 1986.
    • An essential resource for students who want to study Parmenides in the original Greek.
  • Tarán, Leonardo. Parmenides. New Jersey: Princeton, 1965.
    • One of the seminal works in the field advocating Parmenides’ strict monism.

b. Secondary Sources

  • Baird, Forrest E., and Walter Kaufmann, eds. Ancient Philosophy. 4th ed. Philosophy Classics. Vol. 1. New Jersey: Prentice Hall, 2003.
    • An introductory text with an incomplete translation of the poem, which paints Parmenides as a strict monist and presents his position as radically opposed to Heraclitus’ “philosophy of flux.”
  • Bicknell, P. J. “A New Arrangement of Some Parmenidean Verses,” Symbolae Osloenses 42.1 (1968): 44-50.
    • Challenges the arrangement of Diels-Kranz. Bicknell suggests a new arrangement of some Parmenidean fragments, primarily based upon Sextus’ report that the lines which began the poem (C/DK 1) were followed by lines currently assigned to fragments C/DK 7 and 8.
  • Bowra, C. M. “The Proem of Parmenides.” Classical Philology 32.2 (1937): 97-112.
    • One of the earliest and most influential treatments of Parmenides’ Proem, particularly focusing on its similarities to Pindar.
  • Cherubin, Rose. “Light, Night, and the Opinions of Mortals: Parmenides B8.51-61 and B9.” Ancient Philosophy 25.1 (2005): 1-23.
    • An insightful discussion of the dualistic principles found in Opinion.
  • Cohen, Marc, Patricia Curd, and C.D.C. Reeve, eds. Readings in Ancient Greek Philosophy: From Thales to Aristotle. 4th ed. Indianapolis: Hackett, 2011.
    • An excellent, if limited, introductory text with a full translation of Parmenides’ poem. In the brief introduction to Parmenides, the likelihood of a Xenophanean influence is stressed, the difficulty in reconciling Reality and Opinion is raised, and Parmenides’ ultimate position is left open.
  • Cordero, Néstor-Luis. “The ‘Opinion of Parmenides’ Dismantled.” Ancient Philosophy 30.2 (2010): 231-246.
    • Cordero advances several theses in this paper. First, he avers that the standard arrangement of the fragments is based solely upon perceived content, which ultimately depends upon imputing anachronistic, Platonic distinctions to Parmenides. Having challenged this status quo, he goes on to advocate a new arrangement for the poem, moving some passages which make true cosmological claims out of Opinion, and into Reality.
  • Cordero, Néstor-Luis. Ed. Parmenides, Venerable and Awesome. Proc. of International Symposium, Buenos Aires, 10/29-11/2/2007. Las Vegas: Parmenides, 2011.
    • A collection of scholarly essays, many of which engage with each other, presented at the first international conference for Parmenidean studies.
  • Curd, Patricia. The Legacy of Parmenides. New Jersey: Princeton, 1998.
    • Curd argues that Parmenides intended to argue for “predicational monism”—that whatever exists as a genuine entity must be one specific, basic kind of thing. Curd’s primary argument is that none of Parmenides’ immediate successors offers any argument for the possibility of metaphysical pluralism. Thus, if Parmenides had held the strict-monist view, later thinkers would be begging the question against him, and Curd thinks this fallacious move unlikely. Since predicational monism allows for a plurality of entities, there would be no reason for his successors to argue for the possibility of pluralism, and thus their failure to do so is no longer fallacious.
  • DeLong, Jeremy. “Rearranging Parmenides: B 1.31-32 and a Case for an Entirely Negative Opinion.” Southwest Philosophy Review 31.1 (2015): 177-186.
    • In addition to considering the meaning of C 1.31-32 and how Opinion should be taken negatively, this article directly takes on and challenges Cordero’s proposed rearrangement of the fragments.
  • Diels, Hermann. Parmenides Lehrgedicht: Griechisch und Deutsch. Berlin: Georg Reimer, 1897.
  • Granger, Herbert. “Parmenides of Elea: Rationalist or Dogmatist?” Ancient Philosophy 30.1 (2010): 15-38.
    • An article that helpfully sets interpretative approaches to Parmenides in the context of commentators’ background assumptions regarding Parmenides’ “rationalism.”
  • Hermann, Arnold. To Think Like a God: Pythagoras and Parmenides—The Origins of Philosophy. Las Vegas: Parmenides, 2004.
    • While ultimately denying any significant historical influence by Pythagoreans upon Parmenides, Hermann traces how the ancient conceptual distinction between divine and mortal knowledge led to the development of these diametrically opposed views. Whereas this distinction essentially led the Pythagoreans to develop a religious cult, it inspired Parmenides (stepping up to Xenophanes’ challenges) to become the first true philosopher, relying upon logic and reasoning to arrive at metaphysical conclusions, and thus achieving a sort of divine knowledge as a mortal.
  • Kingsley, Peter. Reality. Inverness: Golden Sufi Center, 2003.
    • Kingsley advocates rejecting that Parmenides attempted to communicate any epistemic or metaphysical truths in his poem—at least, not in any rationalistic sense. Rather, Parmenides is a mystic who has found divine truth through ritual and spiritual experiences. His poem recounts these experiences in the Proem, and what follows is designed to open the reader’s mind to similar experiences, via losing oneself in elenchus, and facing death metaphorically, if not literally.
  • Kurfess, Christopher John. “Restoring Parmenides’ Poem: Essays Toward a New Arrangement of the Fragments Based upon a Reassessment of the Original Sources.” Diss. U. of Pittsburgh, 2012.
  • Kurfess, Christopher. “Verity’s Intrepid Heart: The Variants in Parmenides, DK B. 1.29 (and 8.4).” Apeiron 47.1 (2014): 81-93.
  • Lewis, Frank A. “Parmenides’ Modal Fallacy,” Phronesis 54 (2009): 1-8.
  • Lombardo, Stanley. Parmenides and Empedocles. Eugene: Wipf & Stock, 2010.
    • For those interested in a translation that attempts to capture Parmenides’ poetical style.
  • Miller, Mitchell. “Ambiguity and Transport: Reflections on the Proem to Parmenides’ Poem.” Oxford Studies in Ancient Philosophy 30 (2006): 1-47.
  • McKirahan, Richard D. Philosophy Before Socrates: An Introduction with Texts and Commentaries. Indianapolis: Hackett, 1994.
    • An exceptionally solid and detailed introduction to the Presocratics overall. Suitable for both beginning and more advanced readers. Provides an overview of the interpretative problems for Parmenides and various perspectives, in addition to McKirahan’s own perspective on them (which differs markedly from Coxon’s, whose work he recently edited and revised).
  • Mourelatos, Alexander P. D. “Parmenides on Mortal Belief.” Journal of the History of Philosophy 15.3 (1979): 253-65.
  • Mourelatos, Alexander P. D. The Route of Parmenides: Revised and Expanded Edition. Las Vegas: Parmenides, 2008.
    • Perhaps the most influential work on Parmenides in the twentieth century, this monograph offers invaluable and extensive contributions and insights. Mourelatos is perhaps best known for demonstrating the extensive and inventive ways in which Parmenides invokes and plays off of Homeric/Hesiodic meter and language, as well as for being the first to posit an Essentialist interpretation of Reality.
  • Nehamas, Alexander. “On Parmenides’ Three Ways of Inquiry.” Deucalion 33/34 (1981): 97-111.
  • Owen, G. E. L. “Eleatic Questions.” The Classical Quarterly 10.1 (1960): 84-102.
    • In addition to focusing on the problematic lines 1.31-32, Owen provides one of the most influential interpretations of Parmenides. He claims that Parmenides was led to adopt a strict monism on logical and Russellian grounds, and explains how Opinion can be viewed negatively without contradiction as a mere dialectical exercise.
  • Reeve, C.D.C, and Patrick Lee Miller, eds. Introductory Readings in Ancient Greek and Roman Philosophy. Indianapolis: Hackett, 2006.
    • An example of an introductory text with a full translation of the poem, which straightforwardly casts Parmenides as advocating strict monism.
  • Stumpf, Samuel Enoch. Socrates to Sartre and Beyond: A History of Philosophy. 7th ed. Boston: McGraw-Hill, 2003.
    • An introductory text with no translation of Parmenides’ poem. Provides several pages of interpretative summary, casting Parmenides as a strict monist, and in opposition to Heraclitus.
  • Thanassas, Panagiotis. Parmenides, Cosmos, and Being: A Philosophical Interpretation. Milwaukee: Marquette, 2007.
    • Thanassas emphasizes epistemological reliability as grounding the distinction between Reality and Opinion, concluding that Parmenides viewed true knowledge (and the philosophical methodology that leads to it) as divine. He also offers a novel view concerning the content of Opinion, which he believes should be divided into several distinct sections with differing polemical ends.


Author Information

Jeremy C. DeLong
Email: jeremydelong@sbcglobal.net
University of Kansas
U. S. A.

Religious Pluralism

Religious pluralism, broadly construed, is a response to the diversity of religious beliefs, practices, and traditions that exist both in the contemporary world and throughout history. The terms “pluralism” and “pluralist” can, depending on context or intended use, signify anything from the mere fact of religious diversity to a particular kind of philosophical or theological approach to such diversity, one usually characterized by humility regarding the level of truth and effectiveness of one’s own religion, as well as the goals of respectful dialogue and mutual understanding with other traditions. The term “diversity” refers here to the phenomenal fact of the variety of religious beliefs, practices, and traditions. The terms “pluralism” and “pluralist” refer to one form of response to such diversity.

Philosophical and theological treatments of religious diversity have generally adopted different attitudes and different methods insofar as their respective disciplinary commitments differ. Since theological accounts tend explicitly to be grounded in the faith commitments that characterize particular religious traditions (or at least larger sets of traditions, such as Christianity or the “Abrahamic” religions), they often explore how members of a given faith ought to regard the beliefs and practices of other traditions. Philosophical accounts, by contrast, tend to adopt a more or less disinterested attitude and instead evaluate, for example, the epistemological or ethical issues raised by religious diversity; this is especially true within the analytic tradition, where questions about the justification of conflicting religious beliefs have received much attention. Although this article largely focuses on philosophical positions, theological and philosophical accounts, despite their methodological and perspectival differences, inform and influence each other.

As with many other philosophical topics, there are also significant differences between the ways religious diversity is treated by analytic and continental philosophers. In general, it has been taken up more directly and explicitly in analytic philosophy since the 1980s, though religious diversity has also featured as an important, if secondary, theme in much continental philosophy of religion (as well as continental social, political, and ethical philosophy). Therefore, after introducing some basic terminology and exploring treatments of religious diversity in the history of modern philosophy, this article explores analytic and continental approaches in separate sections. Significant feminist discussions of religious diversity have emerged in both analytic and continental philosophy; these will be treated in their own section. In addition, sections are devoted to contributions from both process philosophy and liberation theology.

The phenomenon of religious diversity occurs not only between particular religious traditions but also within them. Approaches to inter-religious and intra-religious difference are therefore explicitly treated as distinct by some philosophers of religion, while others argue that they should not be treated differently. The present article opts for the latter position, except in cases where there is an obvious reason to do otherwise.

Table of Contents

  1. Categories of Responses to Diversity
  2. Historical Influences
    1. Kant
    2. Schleiermacher
    3. Hegel
    4. James
  3. Analytic Approaches
    1. Epistemic Conflict
    2. Religious Language and Apologetics
    3. John Hick: the Pluralistic Hypothesis
    4. Criticisms of Hick
  4. Continental Approaches
    1. Hermeneutics and Religious Truth
    2. Hospitality
  5. Contributions from Feminism
  6. Process Philosophy
  7. Liberation Perspectives
  8. Conclusion
  9. References and Further Reading

1. Categories of Responses to Diversity

There are a number of different ways that philosophers and theologians have grouped various accounts of religious diversity. One of the most commonly adopted strategies – and the one that will be used in the following discussion – is the threefold division first introduced by Alan Race (1983): exclusivist, inclusivist, and pluralist. Exclusivist positions maintain that only one set of belief claims or practices can ultimately be true or correct (in most cases, those of the one holding the position). A Christian exclusivist would therefore hold that the beliefs of non-Christians (and perhaps even Christians of other denominations) are in some way flawed, if not wholly false; or that non-Christian religious practices are not ultimately efficacious – at least, to the extent that non-Christian beliefs and practices depart from or conflict with those defended by the Christian exclusivist.

Pluralist positions, in contrast, argue that more than one set of beliefs or practices can be, at least partially and perhaps wholly, true or correct simultaneously – or, that all beliefs intended to be understood in a realist fashion are false. Inclusivist positions occupy a middle ground between exclusivism and pluralism, insofar as they recognize the possibility that more than one religious tradition can contain elements that are true or efficacious, while at the same time holding that only one tradition expresses ultimate religious truth most completely. As McKim (2012) expresses it, inclusivists grant that many (perhaps all) religious traditions do well with regard to truth or salvation, but that one tradition does better than others by more accurately describing objects of belief or mechanisms of salvation. A Christian inclusivist might claim that those who live good lives but remain non-Christians may still achieve salvation, but that such salvation is nevertheless still achieved through Jesus Christ. Inclusivism thus may be understood as a more charitable variety of exclusivism, though exclusivists can also treat it as pluralism by another name. In addition, it is worth noting that exclusivist, inclusivist, or pluralist arguments about beliefs are sometimes presented separately from those about salvific practice and that consequently one approach to diversity of belief does not necessarily imply the same approach to diversity of practice, or vice versa.

There remain a few other possible positions outside of exclusivism, inclusivism, and pluralism that are worth mentioning briefly, though two of these are not commonly treated as sophisticated options in philosophical discussions of religious diversity. The first of these two is relativism: the view that the truth of beliefs or the efficacy of practices is wholly dependent on the perspective of the religious individual and her cultural environment. In contrast to the pluralist position, that of the relativist seems necessarily to imply an anti-realist theory of religious truth, which would deflate the significance of religious pluralism as a philosophical and theological issue since religious truth claims could only be upheld or defeated within the context of their own traditions. The second, which would have similar consequences, is the position that no positive religious beliefs are true in any sense, even a relativist one, and no religious practices are efficacious (at least not according to their own terms). In this case, which may be termed strong anti-realism, religious diversity remains at most a sociological, psychological, or historical topic. It is perhaps not surprising, then, that serious philosophical approaches to religious diversity tend not to adopt either of these positions but rather to treat diverse religious traditions as at least possibly having some positive relationship to an ultimate reality. Another position, that of skepticism, seems to entertain this possibility insofar as it concedes that some set of religious claims may be true. However, since it contends that, given the extent of religious diversity and disagreement, no one is ever justified in making such claims, this position will not be considered in detail here (for a contemporary defense of the skeptical position, see Feldman 2007).

2. Historical Influences

What is today called philosophy of religion appears early in the history of Western philosophy – arguably as far back as Plato’s Euthyphro, wherein Socrates questions the title character as to whether the nature of piety follows on the will of the gods or vice versa. However, most early accounts of religion either ignore religious diversity or do not treat it as an issue worthy of genuine consideration. Pre-modern sources that do treat religious diversity seriously tend to adopt exclusivist positions, though applying this label to such accounts is somewhat anachronistic. Even the use of “religion” as a term that signifies one particular tradition of beliefs and practices among others was not common before modernity.

Once religions began to be considered as such alongside each other, though, positions approaching pluralism soon appeared. In the seventeenth and eighteenth centuries, works by figures such as Spinoza, Locke, Hume, and Voltaire presented rational and naturalistic interpretations of religion and argued for religious toleration. Though these accounts focused largely on Christianity (and to a lesser extent Judaism), they laid the foundation for future pluralist approaches to religion. Though not the only important precursors of pluralism in the post-Enlightenment age, four influential approaches worth examining here are those of Immanuel Kant, Friedrich Schleiermacher, G.W.F. Hegel, and William James. In each, the diversity of traditions is seen not simply as contingent but as in some sense unavoidable, either because of humanity’s inherent limitations or because of the nature of history.

a. Kant

Immanuel Kant’s philosophy of religion serves as an important touchstone in the development of religious pluralism if for no other reason than for its strong influence on John Hick, whose pluralist position is one of the most significant of contemporary responses to religious diversity (see below). Kant does not offer an argument for religious pluralism especially, but such a position emerges as a consequence of his account of rational religion and the distinction he makes between it and particular religious traditions. In his Religion within the Bounds of Bare Reason (1793), he argues that authentic religion is purely rational; specifically, it is grounded in a human being’s moral capacity. Humans ultimately cannot know God’s nature, or even whether or not God exists (as he argues in the Critique of Pure Reason), but the idea of God can and should nevertheless serve as a morally regulative ideal. Similarly, Kant argues that while the existing religions and their respective sets of particular beliefs and practices can often be beneficial for the moral progress of their communities, the rational ideal is an “invisible church” made up of persons able to live autonomously moral lives. Thus, Kant makes the following claim: “There is only one (true) religion; but there can be many kinds of faith.—One may say, further, that in the various churches, set apart from each other because of the difference in their kinds of faith, one and the same true religion may nonetheless be found” (2009: 118). According to this view, multiple religious traditions (or “churches”) may exhibit, to greater or lesser degrees, the purported true essence of religion as long as they promote a morality that agrees with the dictates of practical reason. However, Kant also argues that in the progress of humanity toward greater enlightenment, these various traditions will be discarded as the purely rational religion Kant advocates becomes more fully realized. It is worth noting that he privileges Christianity (specifically, Protestant Christianity) as the paradigmatic historical faith, the tradition which, according to his argument, most fully manifests the moral core of religion.

b. Schleiermacher

Schleiermacher agrees with Kant that the essential core of religion lies deeper than the diverse forms of belief and practice that make up the various religious traditions. Neither Kant nor Schleiermacher accepts the idea that religion is fundamentally a matter of adherence to a particular set of dogmatic claims, or that religion, properly so-called, even puts forward any metaphysical claims at all. Schleiermacher goes further, though, asserting that morality is no more a part of the essence of religion than metaphysics. In his Speeches on Religion, Schleiermacher explains that religion is primarily a matter of intuition or feeling, with the entire universe as its object. Later, in his systematic theological work, he famously identifies the feeling that characterizes religion as one of “absolute dependence” – that is, the feeling that one’s very existence depends on something that, in some way or other, transcends oneself. For Schleiermacher, then, religion is first and foremost a matter for the individual, and he does not discount the possibility that one could live a highly religious life without belonging to a particular religious tradition. However, he also concedes that consciousness of religious feeling is, in most cases, best developed within a community, and that within such communities the religious intuition is always accompanied by metaphysical claims, moral prescriptions, ritual practices, and so forth. These outward forms of religion, in turn, serve to shape the way that individuals understand and articulate their own religious feelings. Given that the historical forms of religion emerge from individuals’ subjective experiences – and given that Schleiermacher holds the object of these experiences to be both all-encompassing and infinite – a wide variety of religious traditions is inevitable. Indeed, he goes so far as to argue that the diversity of traditions is necessary for the task of representing the infinite within the limitations of forms comprehensible to humans. Thus, while he also accords a privileged place to Christianity among other religions, Schleiermacher provides the framework for a type of pluralism grounded in the idea of a universal form of intuition that discounts the importance of metaphysical and moral claims in the foundation of religion.

c. Hegel

Hegel goes much further than Kant or Schleiermacher toward constructing a comprehensive and systematic philosophy of religion, particularly in offering both a general concept of religion and a detailed account of how diverse historical traditions manifest the basic religious essence. Hegel criticizes Kant for relegating God to the status of a regulative ideal, and he criticizes Schleiermacher for resting his concept of religion on subjective feeling. The main significance of Hegel’s approach, though, is its contention that the rational concept of religion and the historical phenomena of religious traditions cannot be understood separately; the latter give being to the former and make it intelligible. Indeed, he argues that the diverse concrete forms of religion are neither unavoidable deviations from an ideal form nor contingent responses to the infinite. The different forms religion has taken throughout history are determinate steps along the path of the self-manifestation of both the concept of religion and its object (that is, God). On the one hand, this means that the details of diverse forms of religion are of not only historical but also conceptual significance, considering that each religion gives humans insight into absolute spirit in its own way. On the other hand, Hegel’s insistence on the unfolding of spirit throughout the history of religions as a progressive movement leads him to a hierarchical model in which older forms are superseded by newer ones, with Christianity taking the privileged place of the “consummate” religion. In addition, Hegel maintains that religion as such can only offer a representation of the truth of absolute spirit, whereas it is philosophy’s task to proceed from this representation to complete knowledge. Much like Schleiermacher’s, Hegel’s attention to religious phenomena in their historical particularity is an important precursor to pluralist approaches, but the privileged status he reserves for Christianity as the most fully developed religion in terms of both belief content and practice evinces a degree of exclusivism in his approach.

d. James

James’s most direct and significant contributions to philosophy of religion come from his “The Will to Believe” and The Varieties of Religious Experience – the second of which had the greater influence on the development of religious pluralism during the twentieth century. In the series of lectures that make up this book, James largely brackets out any considerations as to the truth content of religious claims regarding the ultimate nature of reality or as to the various rituals, practices, and institutions that constitute the diverse religious traditions. Instead, as the title suggests, he focuses on the religious experiences of individual humans. James examines these experiences according to only a few basic categories, the most important distinction being that between “the healthy-minded” and “the sick soul.” Throughout his examination, James maintains that the individual’s personal experience lies at the core of religion and that rituals, doctrines, and traditions are thus secondary or inessential. His approach bears some similarity to Schleiermacher’s to the extent that it focuses on personal feeling, but it departs from the earlier approach in that it rejects the idea that there is a single “religious sentiment.” James also explicitly states that the concept of religion that he puts to use is one he has pragmatically adopted to suit the needs of his present study, and that it does not necessarily represent the actuality of religious phenomena in a comprehensive way. The few speculations James offers about the nature of ultimate reality toward which religious experiences point suggest a relatively pluralist attitude toward the diversity of religions, according to which none could claim a monopoly on genuine experience of the divine nor be excluded from it. Together with his later writings on pragmatism and pluralistic metaphysics, James serves as an important touchstone for later theories of religious pluralism – especially those emerging from process philosophy (see below).

3. Analytic Approaches

Accounts of religious pluralism within the broadly construed analytic philosophy of religion tend to focus on the diversity of conflicting belief claims as the primary issue at stake. This emphasis has led some philosophers to concern themselves with religious diversity solely in terms of conflicting claims (for example, Basinger [2002] explicitly addresses diversity as “epistemic peer conflict”), while others also include treatments of variety in practice or moral norms. The present examination will address first the perspective in which the epistemology of religious belief claims is central, and second that which focuses on the semantics of belief claims. In both cases, the literature is extensive and contains sometimes highly nuanced discussions of specific issues. What follows, therefore, are only brief, roughly sketched surveys of major trends in these two general areas. This section will then proceed to a more detailed examination of John Hick’s influential pluralist position, as well as some of the criticisms this position has received.

a. Epistemic Conflict

It is an undeniable fact of religious diversity that many traditions offer various accounts of the nature of both mundane and supramundane reality, of the ultimate ends of human beings, and of the ways to achieve those ends. The claims offered by different traditions, as well as the sometimes unarticulated assumptions about matters of ultimate concern, can (and do) seem to conflict with each other, and one possible approach to religious diversity is to treat this conflict as real, genuine, and significant. Treating epistemic conflict as real means not assuming that it only seems to arise because of lack of clarity or universality in the terminology of various traditions; treating it as genuine means admitting that equally sincere and knowledgeable adherents of different traditions may uphold conflicting claims, while treating it as significant means that such conflict constitutes a problem – perhaps the central problem – that any philosophical response to religious diversity must address in order not to ignore or misrepresent the challenge of religious diversity itself.

Possible responses to the epistemological issues raised by religious diversity include exclusivist and inclusivist as well as pluralist perspectives. Exclusivist arguments maintain that, in the case of conflicting claims about, for instance, the intrinsic nature of divine reality, no more than one non-contradictory set of claims can be correct. Thus, the exclusivist may hold that all the claims made by her own tradition are true and consequently that all conflicting claims made by other traditions are false. Inclusivist approaches argue that while only one set of claims is wholly true (or one set of claims is more true than others), claims made by other traditions may be true partially, selectively, or to a lesser extent. The distinction between this perspective and that of exclusivism often has to do only with the degree to which multiple religions’ claims can be conceded as at least partly true – hence the suggestion that inclusivism may be more properly named “soft exclusivism” (Basinger 2002: 5).

The exclusivist, in order to maintain her belief claims, may simply choose not to recognize or respond to the fact of religious diversity at all. However, many philosophers of religion agree that anyone who is sincerely interested in maximizing the truth and minimizing the error of her belief claims is prima facie obligated to address significant epistemic conflict, at least in part by assessing the strength or justification of her own beliefs. It remains a debated point whether or not the existence of multiple conflicting belief claims necessarily decreases such justification; positions on this question range from outright denial of the possibility of justified religious belief in cases of epistemic conflict (cf. Schellenberg 1994) to the argument that justification in such instances is at most only slightly diminished (cf. Alston 1988). In most cases of diverse religious beliefs, it seems that there is no objective evidence or set of criteria that would allow for straightforward adjudication between conflicting belief claims. If this is the case, then the exclusivist would need to provide other grounds for justifying her commitment to the correctness of her own beliefs or to the incorrectness of others’ beliefs (and some defenders of religious exclusivism argue that this is possible). Given that the exclusivist already holds the beliefs that she does, though, epistemic conflicts with others’ beliefs do not necessarily provide sufficient reason for her to give up or modify her beliefs. Alvin Plantinga argues that the assessment and consequent modification or abandonment of some or all of one’s beliefs would be required in cases where conflicting religious beliefs are on epistemic par with one’s own. However, he maintains that such epistemic parity need never obtain – that is, a Christian may reasonably believe that others’ beliefs with which her own beliefs conflict do not achieve significant epistemic parity (Plantinga 1995, pp. 203-5; Plantinga’s argument is discussed at greater length below).

It is also unclear that every belief one holds may reasonably be assessed in light of the conflicting claims of others. Some beliefs may serve a foundational role in the total belief system of an individual or community, in which case their epistemic status is somewhat different from that of other beliefs. Especially if it is granted that religious beliefs are often not subject to evidential justification, it may be the case that beliefs cannot be assessed except on the basis of other, more foundational beliefs. There would thus be a set of beliefs in any belief system that cannot meaningfully be subject to doubt or assessment in light of conflicting claims belonging to other systems. Jerome Gellman calls these “rock bottom beliefs,” arguing that their foundational status excuses those who hold them from the obligation to assess them even when faced with others’ conflicting beliefs (Gellman 1993). He goes on to argue that even if rock bottom beliefs are subject to assessment, a believer is obligated to assess them only when she loses confidence in them in the face of other, competing beliefs (Gellman 2000).

William Alston, like Plantinga, accepts that certain religious beliefs can be foundational, yet he does not agree that this renders them immune from assessment or revision. On the contrary, Alston takes religious diversity as one of the most serious challenges to the justification of a particular religious tradition’s beliefs. Nevertheless, he maintains that exclusivism with regard to one’s beliefs remains rational in practice precisely because of the lack of common ground between believers of different traditions. He argues that the absence of proof for the superiority of a Christian’s beliefs (or their grounds) over others’ need not diminish the justification of those beliefs, since one would have no way of knowing what such proof would entail even in principle (Alston 1988, p. 443). In fact, Alston goes on to claim, “In the absence of any external reason for supposing that one of the competing practices is more accurate than my own, the only rational course for me to take is to sit tight with the practice of which I am a master and which serves me so well in guiding my activity in the world” (1988, p. 444). Philip Quinn agrees with Alston that “sitting tight” is a rational option (and that practical self-support counts toward this option being rational), but contends that it is not the only rational option. Quinn holds that internal revision of one’s core beliefs in ways that would make them more reliable “if some refined pluralistic hypothesis were true” is also a rational option (Quinn 2006, p. 298).

Naturally, pluralists advocate a significantly different approach to the issue of epistemic assessment of beliefs. If there exists real conflict between beliefs held by different religious traditions, conflict which cannot be resolved by appeal to evidence or to a universal set of justificatory criteria, then the pluralist may conclude that no one set of beliefs can reasonably be held as preferable to others. A strong form of this position is simply the skeptical argument, which maintains that judgment must be suspended on belief claims that remain unresolved in this way, such that an exclusivist would be required to give up her commitment to them (cf. Feldman 2007; Kitcher 2011). A less extreme version of the position, however, may admit that the lack of criteria by which to resolve conflicting belief claims does put the beliefs of different traditions on equal epistemological footing, but that this does not necessarily require that believers suspend or give up their beliefs. What it requires is rather that beliefs that are in conflict with others’ (equally justified) beliefs be held with less confidence. Such a pluralist response to epistemic conflict emphasizes awareness of the fact of diversity, humility regarding the level of justification of one’s own beliefs, and openness to the possibility that beliefs other than one’s own may turn out to be correct. While an exclusivist may also acknowledge that epistemic conflict may lead her to reduce her confidence in her beliefs, the pluralist position goes further in asserting that no one tradition’s beliefs are more or less justified than another tradition’s beliefs. It acknowledges that most religious believers are likely to hold on to the set of beliefs (and to engage in the practices) with which they are most familiar given their historical and cultural backgrounds, and it posits that, given the absence of strong arguments to the contrary, they are justified in doing so. The difference between this position and that of the exclusivist (for example, Alston) is that this justification does not extend to the second-order belief that one’s own beliefs are rationally preferable to other conflicting beliefs. This is the direction in which Quinn (2006) gestures when he argues that there is more than one rational option for the believer in the face of a variety of conflicting claims.

b. Religious Language and Apologetics

Arguments for the assessment of religious belief claims in instances of epistemic conflict tend to presuppose that the meaning of such claims as well as the evaluation criteria applied to them are to some extent mutually understandable – even if it is conceded that there are no strictly universal criteria by which to resolve conflicts between belief claims. However, another position advanced by some philosophers of religion is that belief claims from different religious traditions cannot be assessed for their relative strength because particular claims cannot retain their meaning outside the context of the whole system of beliefs in which they belong. This position can be understood as a version of Wittgenstein’s approach to statements about religious belief, in which he suggests that individuals holding conflicting belief claims may be operating within different cultural-linguistic frameworks, exemplified by the idea of “language games”. In this case, they would have no shared frame of reference according to which the conflict could be resolved. It may not even be accurate to say that an epistemic conflict actually exists, rather than simply a difference in linguistic practice. What is called for, then, is not assessment and justification of belief claims in epistemic terms – where different claims’ truth, likelihood, or reasonableness is evaluated – but merely examination of the use and practical consequences of statements about belief in the lives of those who offer them.

If one accepts this approach to religious language, there are significant consequences. One is that it would be meaningless to speak of the truth or falsity of an entire belief system, because the truth of belief claims could only be assessed according to the internal grammar of this or that system. Another is that an individual’s commitment to one system of belief rather than another would be more or less arbitrary – in most cases, it would be wholly dependent on accidents of birth. It also seems that adopting this approach may commit one to adopting one of the positions that were set aside at the beginning of this article: that is to say, religious relativism, strong anti-realism, or religious skepticism. The relativist would conclude that, if belief claims are justified only based on the semantic criteria provided by their particular religious cultural-linguistic frameworks, then claims from different frameworks simply cannot be meaningfully compared. The strong anti-realist would go a step further, arguing that religious belief claims thus cannot refer to content beyond their cultural discursive context, while the skeptic would hold that the cultural-linguistic delimitation of the meaning of belief claims prevents us from knowing whether or not such claims accurately refer to extra-linguistic reality. Each of these positions exhibits the tendency of the cultural-linguistic approach to religious language, which is to minimize the importance of whatever cognitive content there may be in such language.

Critics of this approach argue that it does not accurately represent the attitudes of religious individuals toward the meaning and status of the particular claims they make or the ways in which they use religious language generally. Statements like “Jesus Christ is divine” or “Atman is Brahman” seem in most cases not only to fit into a larger system of religious linguistic use but also to describe a state of affairs the reality of which the utterer of the statement believes in. Even if one were to take such statements merely as expressive of certain subjective attitudes or existential commitments on the part of the person who utters them, one could argue that the adoption of such attitudes or commitments on the part of the speaker presupposes that she also believes the statements to be cognitively true. Otherwise, the attitude that she expresses in them would be arbitrary or unwarranted.

If one maintains, contrary to, or in addition to, the cultural-linguistic model, that at least some religious statements do express actual truth claims and that some of these claims conflict with each other, then the question returns as to whether such conflicts need to be resolved and, if so, how. One of the most significant responses to this issue made in terms of the logic of religious language is that of Paul J. Griffiths, who argues that the incompatibility of certain doctrinal statements belonging to diverse religious traditions creates the obligation that representatives of these traditions engage in inter-religious apologetics (1991, p. 3). Griffiths calls this the principle of the necessity of inter-religious apologetics (NOIA). It proceeds from the initial presuppositions that statements that seem to make claims about ultimate religious truth do so, that arguments for believing such claims can intelligibly be made to those who do not believe them, and that to admit that others’ beliefs conflict with mine and subsequently to try to persuade others to accept the truth of my claims is to show respect for the sincerity and rationality of others. Griffiths divides the enterprise of inter-religious apologetics into two categories: “negative apologetics” aims to defend the plausibility of belief claims against outside criticism, while the goal of “positive apologetics” is to argue for the superiority of one tradition’s set of belief claims over that of others. He maintains that the NOIA principle enjoins representatives of religious communities to engage in both forms of apologetics, insofar as the interest in maximizing truth that is evident in making statements of doctrine entails both internal self-evaluation and external correction. Griffiths argues that the imperative of the NOIA principle is both epistemic, insofar as religious communities have a responsibility to justify and defend the content of their claims against challenges posed by conflicting claims, and ethical, insofar as assent to certain belief claims is often understood by religious communities to be a necessary condition of salvation (1991, p. 15).

To a certain extent, Griffiths’ advocacy of apologetics commits him to exclusivism. Adherence to the NOIA principle implies not only that there is a real referent to which some belief claims point but also that, when claims conflict, one claim is likely to be more accurate in its description of reality than others. If this were not the case, it would be difficult to defend the necessity of positive apologetics, and perhaps negative apologetics as well. However, a commitment to the importance of apologetic dialogue as Griffiths proposes it is not necessarily equivalent to strict exclusivism, especially the variety that would allow the exclusivist to retain confidence in the superiority of her own position without assessing it in a religiously diverse context. That Griffiths advocates not only recognizing the fact of religious diversity but indeed engaging with a variety of conflicting claims via respectful argument – thereby entertaining the possibility that such claims might be partly or wholly true – suggests that his position may be better understood as a form of inclusivism.

c. John Hick: The Pluralistic Hypothesis

The pluralist position advanced by John Hick has been and continues to be one of the most significant and influential philosophical approaches to religious pluralism – arguably the single most significant. It has garnered both considerable praise and considerable criticism on a variety of fronts. After outlining the main features of Hick’s arguments, we will also examine a few of the most substantial criticisms they have received.

Hick begins from the position that the world as it appears is ambiguous with regard to religion – that is, there is no inherent epistemological obstacle to experiencing and interpreting the world from the point of view of one religion rather than another, or indeed from a non-religious perspective (2004, p. 124). From here, he proceeds to his central claim that diverse religious traditions have emerged as various finite, historical responses to a single transcendent, ultimate, divine reality. The diversity of traditions (and the belief claims they contain) is a product of the diversity of religious experiences among individuals and groups throughout history, and the various interpretations given to these experiences. Hick’s claim that diverse interpretations are responses to a single transcendent Real draws on the Kantian distinction between noumenon and phenomenon; that is, the Real is epistemologically unavailable to human beings in itself, but we can nevertheless experience it “as the range of gods and absolutes which the phenomenology of religion reports” (2004, p. 242). Thus, Hick argues that no single description or set of descriptions drawn from within the realm of human experience can apply literally to the Real (2004, p. 246). Nevertheless, the Real remains the final referent of the ontological claims made by the different religious traditions, even though such claims can at best only partially approximate the real truth of divine reality. Hick takes the fact that many diverse religious traditions themselves claim that the object of their belief is ultimately unsayable and incomprehensible as support for his argument. As for the content of particular belief claims, Hick understands the personal deities of those traditions that posit them (for example, Yahweh, Viṣṇu, Amida) as personae of the Real, explicitly invoking the connotation of a theatrical mask in the Latin word persona. Alongside these, he recognizes impersonae of the Real: concepts such as Brahman, nirvana, and Tao that represent ultimate reality in a non-personal way. Since the ideas in both of these categories arise from our experience of phenomenal reality, none of them can adequately describe the Real as it is in itself. The only way that humans can describe the Real directly is by using formal language such as “ultimate truth” or “ground of experience” (2004, p. 246).

In light of his epistemological arguments, Hick claims that all religious understandings of the Real are on equal footing insofar as they can only offer limited, phenomenal representations of transcendent truth. This position, which he calls the “pluralistic hypothesis,” brings together elements of several philosophical perspectives on religion into a complex whole. At the phenomenal, historical level in which humans live and religious traditions emerge, Hick advances the view that meaning is constituted largely by practice (linguistic and otherwise) within the contexts of particular cultures. Thus, the justification of belief claims rests on the relation of various practical commitments within a certain tradition or culture, and evaluation of or comparison between claims of distinct cultures appears problematic at best. This aspect of Hick’s position seems to rely heavily on the cultural-linguistic model of religion that places little importance on any cognitive content that belief claims may carry. However, Hick also maintains that the content of religious belief claims necessarily implies the believer’s sincerity in supposing that such claims actually refer to the transcendent reality that they purport to describe. That is, Hick maintains that the belief claims of the various historical religions have traditionally been articulated in ways that imply a realist perspective (2004, p. 176). Hick’s response to this issue is to posit that the referent of religious belief claims is ultimately real, but that any claims or knowledge about it must necessarily be historically and culturally mediated. If this is the case, then, as the pluralist hypothesis maintains, each religious tradition has some grounds for holding to its own beliefs and practices while no one tradition has grounds for claiming an exclusive or privileged status.

Hick claims that the pluralist hypothesis is more reasonable than either anti-realism or exclusivism given the broad extent and wide diversity of religious experiences and traditions. In response to religious anti-realism, Hick argues that it is at least no less plausible to postulate a real, transcendent referent for religious belief claims than it is to reject such a reality in favor of a purely naturalistic explanation. Furthermore, he argues (drawing on William James) that a religious individual’s basic trust in her own experience is rationally justified, given the fundamental ambiguity of the world as it appears (2004, p. 228). In response to exclusivism, Hick maintains that adopting an exclusivist stance toward the justifiability of beliefs is not rationally defensible (2004, p. 235). Even if it were the case that only one religious tradition correctly represented the Real, it would not be possible for humans to know this with any certainty. However, Hick is clear that this is not a case of merely epistemological uncertainty; because the Real positively transcends all human description, no one way of describing it could even possibly be true (2004, p. xx).

The moral and soteriological content of diverse religious traditions is also an important focus of Hick’s argument. He posits that, in various ways, all the major religious traditions that emerged from the “axial age” understand salvation from, or transformation of, the present world as a central aspect of the human relation to the Real (2004, p. 300). The ability to bring about such a transformation, and to promote a generally moral way of life, is perhaps the only common criterion by which one can evaluate diverse religious traditions. Thus, despite the various concrete paths to such an end proposed by the world’s major religious traditions, Hick affirms that the soteriological process at work in them is essentially the same. Furthermore, he points to what he sees as a broad consensus regarding basic moral claims among religious traditions to advocate the equal validity of diverse traditions with respect to their soteriological claims. As with his argument regarding the ontological claims of various traditions, Hick does not ignore the fact that the details of both moral and soteriological prescriptions of these traditions often not only differ but conflict with each other. Nevertheless, he maintains that their overall moral themes generally agree and that their soteriological visions depict in various ways a path of transformation from self-centeredness to “Reality-centeredness.” The logic of this aspect of his position is also similar to the ontological-epistemological aspect insofar as no phenomenal experience can provide humans with certainty about the true effectiveness of any one soteriological path (2004, p. 337). In principle, such knowledge can only be attained when salvation is achieved; according to Hick, such a point would be the proverbial mountain peak at which the various upward paths converge.

d. Criticisms of Hick

Hick’s version of pluralism has garnered a wide variety of critical responses since it was first proposed; nearly every aspect of his argument has encountered some kind of challenge. This discussion will cover some of the most significant criticisms leveled by others in the analytic tradition (broadly construed), while further criticisms will be mentioned briefly in subsequent sections where appropriate.

Some of Hick’s critics argue that his claim that the same process of transformation from self-centeredness to Reality-centeredness is at work in each of the world’s major religious traditions is not a valid interpretation of the different forms of diverse religions’ soteriological paths. While, from the point of view of philosophical or sociological studies, there may be structural similarities between the ultimate aims of different religious traditions, it is more difficult to take the further step (to which Hick seems committed) of positing that these aims are essentially the same (cf. Hick 2004, p. xxvi). Furthermore, it seems a crucial part of many religions’ self-understanding that their beliefs and practices are uniquely suited to achieving a true salvation not offered by other traditions. This is an issue, of course, with which any pluralist approach must contend, but Hick’s critics maintain that his argument does not address it more clearly than other possible perspectives (for example, Plantinga 2000, p. 56ff.). In order to maintain the identity of soteriological paths across religions, Hick seems forced to minimize or ignore differences while translating important concepts out of their native terms into ones that members of particular traditions may not accept.

Among the criticisms of the soteriological aspect of Hick’s perspective, one of the most radical is that offered by S. Mark Heim. Heim argues that Hick’s position rests on two basic assumptions: the unity of the Real and the convergence of soteriological aims toward a single religious end (2006, p. 23). The first, says Heim, rests largely on Hick’s interpretation and acceptance of a Kantian distinction between the noumenal and phenomenal – a distinction that, Heim points out, is not always easily reconcilable with particular religious claims. The second assumption, argues Heim, undermines Hick’s concession that religious diversity does truly exist, since the unity of the soteriological process would render non-soteriological (that is, purely dogmatic) differences ultimately insignificant unless they inhibit this process for the religious individual. However, Heim explains that Hick seems to deny that such impediment is even possible in principle, so that any salvific human effort (even of a non-religious variety) participates in the same soteriological process (2006, p. 28).

Heim’s alternative – what he calls a “more pluralistic hypothesis” – combines a modified inclusivism with the concession of the possibility of a real plurality of ultimate ends. The former part of his argument holds that the concrete claims of diverse traditions regarding avenues for salvation ought to be acknowledged in their particularity and on their own terms, and that the sincerity of their adherents’ commitments to such claims is significant. The latter part of Heim’s argument, however, rests on an epistemological humility regarding one’s possible knowledge of eschatological reality and acceptance of the notion that such reality may in itself be plural. This puts Heim’s position in close proximity to that of process philosophical approaches (see below).

Criticisms of the epistemological and ontological aspects of Hick’s argument proceed in a vein similar to those of the soteriological aspects. Some claim that Hick’s hypothesis does not provide adequate support for the claim that the world is religiously ambiguous, and that one may therefore be justified not only in treating one’s own religious experience or belief as valid but also, perhaps, in treating it as superior to conflicting beliefs. Plantinga, for instance, does not concede that awareness of religious diversity necessarily calls for alteration of one’s previously held beliefs, though it might invite new reflection on them. Against Hick (and others), he defends a version of exclusivism, which he defines minimally as holding that “the tenets or some of the tenets of one religion—Christianity, let’s say—are in fact true” and that “any proposition, including other religious beliefs, that are incompatible with those tenets are false” (Plantinga 1995, p. 194). However, this minimal definition of exclusivism is not necessarily the one that stands in need of defense, since it could include non-culpable ignorance of others’ actual conflicting beliefs, so Plantinga narrows his version of exclusivism to include sticking to one’s beliefs despite awareness of other religions, acknowledgment that they contain examples of genuine piety, and belief that one does not have a conclusive rational argument that proves the truth of one’s own beliefs (1995, p. 196). He recognizes that challenges to this version of exclusivism are made on moral and epistemological grounds, and he attempts to defend it against both by showing that, in one way or another, these challenges ultimately undermine themselves. If the moral argument is that exclusivism is arrogant, Plantinga claims that either this argument (that is, the claim that one ought to withhold beliefs that contradict those of others) is itself arrogant, or it lacks grounds (1995, p. 200). Plantinga breaks down the epistemological challenge to exclusivism into those centered on justification, on rationality, and on warrant. His defense in each case remains largely the same: either the pluralist position succumbs to the same criticism that it levels against exclusivism, or it cannot provide sufficient grounds for its own reasoning. Ultimately, he contends that religious diversity itself can prove neither the falsity nor the lack of warrant of particular beliefs – though it may tend to reduce confidence in belief (1995, p. 214). At the same time, though, Plantinga suggests that awareness of and reflection on diversity can also serve to strengthen the exclusivist’s conviction.

Plantinga also levels a challenge to the logical consistency of Hick’s position, contending that it would be nonsensical to suggest that the Real cannot have either of two strictly contradictory properties (similar to the criticism made by Rowe [1999, p. 146]). As Plantinga puts it, the Real “could hardly be neither a tricycle nor a non-tricycle, nor do I think that Hick would want to suggest that it could” (2000, p. 45). However, Hick replies that he does indeed mean to suggest that the Real is “neither a tricycle nor a non-tricycle,” neither green nor non-green, and so forth, both because these sorts of concepts cannot apply either positively or negatively to the Real and because they are not religiously relevant. Hick finds Plantinga’s argument unacceptable because, when translated from an irrelevant concept (tricycle or non-tricycle) to a religiously relevant one (for example, personal or non-personal), it forces an exclusivist choice between different conceptions of the Real – a choice that seems to have no stronger support than personal preference or cultural background (Hick 2004, p. xxii).

Another, more recent epistemological criticism of Hick’s position takes aim at precisely this point: namely, his contention that to a significant extent our religious beliefs are contingent products of factors such as where and when we were born, and that this contingency poses a problem for claims to religious knowledge and, particularly, for exclusivism (Hick 2004, p. 2). Tomas Bogardus argues that the inferences from contingency to pluralism or skepticism in this argument are invalid. He specifies, first, that the contingency of belief is only a significant problem if it deals with “only unreflective religious belief, belief formed genuinely, for example, on the basis of passive receipt of testimony during childhood” (Bogardus 2013, p. 378). Reflective belief must escape this problem, since otherwise pluralism (which is itself a reflective position) would be subject to the contingency objection. Bogardus then further specifies that Hick’s (and others’) contingency-based objections seem to target the “safety” of beliefs – that is, if a person had been born elsewhere than she was and used the same method to form her beliefs as she actually has, then (in light of her actual beliefs) she might have believed falsely. Bogardus reads the contingency objection as inferring from this statement that religious beliefs are, in fact, not formed safely, and therefore they do not constitute knowledge (2013, p. 380). Yet he maintains that these inferences are invalid because, in the first place, “[t]he fact that something might have happened which would have rendered my faculties unsafe does not entail that my faculties are actually unsafe,” and, in the second place, because safety is not necessary for knowledge (2013, pp. 381-2). After offering a similar criticism of the version that makes the contingency objection a question not of safety but of accidentality, Bogardus departs from his criticism of Hick to consider the issue of epistemic symmetry between conflicting beliefs held for contingent reasons (cf. Kitcher 2011). However, he maintains that this version of the contingency objection is either self-defeating or else excuses reflective belief. He concludes with the suggestion that even the unreflective believers specifically targeted by the narrow version of the contingency objection that he has considered may be excused due to non-culpable ignorance (2013, p. 391).

These epistemological challenges are distinct from, but related to, the issue of Hick’s ontology: the central claim that there is one ultimate, noumenal Real which the different religions address in a variety of ways. Critics charge that it is difficult to maintain that all religious representations of ultimate reality refer to the same Real, given not simply the wide variety of different actual representations but particularly those elements of them that seem to be incompatible or contradictory (cf. Rowe 1999). On this point, it is not obvious why Hick maintains the unity of the Real rather than positing a plurality of ultimate referents to match the plurality of ways it is signified; Mavrodes (1995) goes so far as to describe Hick’s position as actually polytheistic. In addition, if the Real is as epistemologically inaccessible as Hick maintains, then explaining how religious concepts can refer to it at all becomes problematic (cf. Plantinga 2000). Hick’s responses to these and other related criticisms are by and large pragmatic, appealing to a principle of global religious irenicism or claiming that his hypothesis is offered as a “‘best explanation,’ not an iron dogma” (Hick 2004, pp. xxii, xxvii).

4. Continental Approaches

Treatments of religious pluralism in continental philosophy of religion tend not to focus on epistemological or ontological issues, but rather on the ethical and political aspects of diversity. To the degree that the potential truth of religious claims is discussed in these contexts, it tends to be in hermeneutic perspectives that focus on the history of texts in various traditions – or on various traditions approached as texts – and that aim to bring these traditions into constructive dialogues with each other. In addition, the concept of hospitality has become a prominent theme in discussions of both textual interpretation and interreligious ethics. The present discussion will examine the hermeneutic model and the theme of hospitality separately, though it will become clear that there is significant overlap between these two.

a. Hermeneutics and Religious Truth

Hermeneutic approaches to religious diversity take seriously both the deep divergences between different religious traditions and the idea that diverse religions’ claims to truth deserve to be engaged as genuine truth claims. Drawing on the work of Heidegger, Ricoeur, and especially Gadamer, advocates of what may be termed “hermeneutical pluralism” attempt to understand interreligious encounters according to the models provided by textual interpretation and interpersonal conversation. Gadamer argues that a text to be interpreted makes a demand on the interpreter in that it presents itself as holding claims to truth that can only be adequately judged, affirmed, or denied after a careful reading in which the interpreter, to some extent, adopts the same questions as those that the text itself addresses (2002, p. 369). Similarly, one must acknowledge in another person the possibility of her knowing and speaking truth if one is to be able to engage in any meaningful conversation. In both cases, the relationship between the interpreter and the interpreted is asymmetrical: the truth claims of the person or text to be interpreted (and the questions to which they relate) should be understood as the measure against which one’s interpretations are judged. It is this focus on interpretation as a model for understanding that gives this approach the designation “hermeneutical.” What makes it a pluralist approach is the claim that an effective understanding of a text, person, or tradition depends on being open to the possibility of truth outside one’s own familiar tradition or worldview, and on being able to place one’s own position not only on a par with but even “below” that of the other – insofar as the other, which takes the place of the text to be interpreted, serves as the standard against which interpretations are measured.

Gadamer uses the term “horizon” to designate the position from which the interpreter approaches the text (or other object of interpretation). One’s horizon is constituted by one’s present phenomenal and conceptual environment as well as the history that has shaped this present and the particular ways in which one is open toward future possibilities. Thus, one’s horizon includes all presuppositions that one brings into new encounters, presuppositions that one can recognize but never completely escape (2002, p. 302). The work of interpretation involves engaging with an other whose horizon is different, sometimes drastically so, from one’s own. Openness to others does not rest on attaining an objective point of view free from prejudices; this sort of abandonment of one’s own horizon would be impossible. Instead, the openness that Gadamer advocates is a willingness to let our presuppositions and perspectives alter as a result of an interpretive or dialogical encounter, a process he calls the fusion of horizons.

In the application of hermeneutic philosophy to religious diversity, religious traditions or their respective exemplary texts are taken as representative of different horizons. The issue, then, is how to fuse these horizons effectively, and what the resulting new horizon will look like. To begin with, one can argue that if one has even rudimentary knowledge of a tradition besides one’s own, then the fusion of horizons has already begun. Any relation between two or more religious traditions means that at some point the respective horizons of the different traditions overlap. The task would then be to map out these overlaps by way of discourse with members of different traditions or interpretation of their religious texts. For example, insofar as Jewish, Christian, and Muslim traditions share a common set of basic theological concepts (for example, strict monotheism) or scriptural figures (for example, Adam, Noah, and Abraham), these commonalities could be and often are used as frameworks for interreligious dialogue in the present. In the process of interpretation and discourse, relations could be extended and new points of contact developed, so that a more extensive fusion of religious horizons can be achieved.

Since the term “horizon” signifies one’s present perspective and particularly since it includes the presuppositions on the basis of which one begins an interpretive encounter, to a certain extent one’s criteria for the judgment of truth claims (including one’s own) are shaped by one’s current horizon. Therefore, openness to an encounter with another tradition from which a new horizon might emerge entails openness not only to the revision of specific presuppositions but also to the possibility of reassessing one’s criteria for judgment. Interpretation or dialogue across different religious traditions would then include openness both to the possibility that the other’s traditions contain elements of truth not available within one’s own tradition and to the possibility that engagement with such truth might affect one’s understanding of truth in general. Depending on the hermeneutical approach taken and the arguments on which it is based, perspectives that advocate such dialogue are often correctly characterized as pluralist (though more inclusivist positions are not ruled out).

Advocacy of this openness to the revisability of basic epistemological criteria is based on the assumption, which many pluralist accounts of religion share with hermeneutics generally, that present standards of judgment are always subject to adjustment or augmentation in the future. This assumption goes along with the claim that there is in principle no end to the hermeneutic process; interpretation is continually producing new understandings of its objects and new horizons (that is, new contexts of understanding) that can subsequently become starting points for future attempts at dialogue and expansions of mutual understanding between different perspectives. Since hermeneutics denies the possibility of objective access to truth, or truth that is free from a particular perspective, every mutual understanding achieved by diverse religious traditions will necessarily be provisional and subject to later reinvestigation. This is, however, not to be considered an unfortunate limitation but rather an opportunity for further constructive engagement. Only in particular historical encounters can new horizons of understanding and experience be forged; the hermeneutical philosopher of religious pluralism seeks out new possibilities for dialogue through which to partly reappraise the merit of former interpretations.

Regarding the question of whether or not diverse claims and interpretations point beyond themselves to a higher universal truth, the hermeneutical approach to pluralism may lead to different conclusions. One possibility would affirm that there is such a universal truth, resulting in pluralism not unlike that of Hick. This seems to be the approach that follows most closely from Gadamer’s claims about the nature of language, insofar as he maintains that the diversity of languages nevertheless expresses a basic universal relationship to the world. One could then say that all of the various religious traditions, and the diversity of representations of reality found within them, express the same ultimate truth from their different horizons. On the other hand, it is also possible to argue on Gadamerian grounds that what is productive and valuable in interreligious discourse is precisely the encounter with the other as other, and that appeals to a higher unity undermine this. According to this view, when beliefs and practices belonging to different traditions seem to refer to different ultimate realities, it may be preferable either to hold that they really do so or to avoid the question of a possible transcendent unity of religious truth entirely. The former position makes the issue of ongoing interpretation and judgment of conflicting truth claims particularly significant, while the latter is sometimes more pragmatic but also potentially less productive. Merold Westphal, for example, argues against something like the latter view, claiming that it precludes meaningful dialogue with religious others by eliminating even the possibility of cognitive conflict (1999, p. 3). His defense of a modest form of theological exclusivism that has been uncoupled from political and cultural imperialism may serve as a helpful point of contact between analytic and continental approaches to religious diversity, especially insofar as he advocates the consideration of epistemological issues and of the cognitive content of belief claims in addressing the ethical concerns at the root of many forms of pluralism.

b. Hospitality

Hospitality to others has become a central theme in continental philosophy of religion, as well as in continental ethical and political philosophy more broadly. Understood as an ethical demand, the idea of hospitality prescribes that one should offer one’s home, food, and other resources to meet the needs of others – particularly those who do not share one’s nationality, culture, language, or religion. This idea carries with it a demand to respect the other in her difference, and it has thus been taken up as a productive concept in terms of which to articulate a pluralist approach to religious diversity. In addition to its conceptual resources, hospitality has also traditionally been recognized as an important virtue in many different religions and cultures, which makes it a helpful starting point for dialogue. Philosophers who discuss the ethics of hospitality also recognize certain inherent tensions in the concept that make it difficult to translate into a concrete practice.

As discussed in the work of Ricoeur (2010) and Derrida (2002), among others, hospitality appears to be something that in principle can never be fully offered. Being hospitable in the most complete sense means anticipating the appearance of the other, understanding her needs and wants, and ultimately being able to turn over anything and everything within one’s control to the other. This raises a few problems, one of which is that the ability to successfully anticipate the appearance and needs of the other seems to undermine her very otherness – what marks the other as other is surely, at least in part, one’s lack of knowledge about her. If one attempts to make up for this lack of knowledge by encountering the other not on her own terms but on one’s own, however, this also undermines the otherness of the other by not paying sufficient attention to her difference. Another problem is that the other to whom one is called to be hospitable can appear as a threat insofar as one’s hospitality to the other, when carried to its furthest point, displaces one from one’s own position of control and puts one at the mercy of the other. The host, in her obligation to the needs of the guest, becomes in a sense the guest of the guest – and the possibility is always present that the other will treat one inhospitably or even violently.

Pluralist accounts that think in terms of hospitality must thus deal with at least two related issues: how to address the seeming impossibility of adequately fulfilling the requirement that one be hospitable to the religious other as other, and how to be respectful and welcoming of the religious other even while recognizing the risks inherent in doing so. Specifically, religious communities must recognize both an obligation toward engagement with those from different religious traditions and the impossibility of achieving a complete understanding of religious others. In addition, hospitality demands that religious communities open themselves up to religious others even if their own communities or beliefs are threatened in the process.

Ricoeur connects hospitality to a hermeneutic theory of translation, arguing that the impossibility of being perfectly hospitable mirrors the impossibility of translating a text from one language to another without any loss or distortion of meaning. In each case, he argues that this impossibility suggests that we should not aim for the ideal of perfection but rather accept that every instance of hospitality, just like every translation, will be risky, limited, and contingent (2010, p. 38). Accepting such a fact may be difficult, especially insofar as it seems to suggest failure to achieve the goals toward which hospitality is oriented. However, Ricoeur maintains that fixation on perfect, complete hospitality is counterproductive and that achieving it, if it were possible, would actually result in the erasure of the difference between oneself and the other (cf. Kearney & Taylor 2011, p. 17). Instead, the limitations inherent in any attempt at hospitality produce a situation in which creative exchange between two individuals, communities, texts, or traditions happens without either subsuming one into another or synthesizing both into a third that erases the initial identities. Because there is no absolute perspective against which to measure the particular perspectives of the various religious traditions, their particularities must be respected as fundamental – that is, there is no prior “common ground” against which the relative value or truth of particular differences can be measured. Interreligious encounter offers the opportunity to engage in a process of translation in which the parties involved learn both about others and about themselves by constructing a context in which their differences and limitations are emphasized.

While his account of hospitality agrees with Ricoeur’s in a number of respects, Derrida does not see the ideal of perfect hospitality as something that should be abandoned because of its impossibility. Drawing on the ethical philosophy of Levinas, Derrida (2000) argues that the other makes an infinite ethical demand that one is never able to answer adequately, but that this inability is precisely what drives the continual concrete effort of hospitable practice. Making a conceptual distinction between the “conditional hospitality” that dictates practical action and the “absolute hospitality” that this action can never achieve, he claims that the inescapable gap between the two motivates us not only to continue to act in conditionally hospitable ways but also to strive to make our actions and dispositions ever more hospitable. In the context of religious pluralism, this means that one must recognize not only that every attempt to understand and welcome the religious other is necessarily inadequate but also that one has a responsibility to try to be more understanding, more welcoming, and more accepting of religious difference as such (cf. Derrida 2002). Importantly for Derrida, it also entails a recognition that one stands in a similar relation to the other as one does to the divine. Recalling again the thought of Levinas (1989), Derrida stresses that “every other is wholly other,” so that the religious individual or community bears the same obligation to the religious other as it does to God (Derrida 2008, p. 78).

5. Contributions from Feminism

Feminist theology and philosophy of religion have, perhaps surprisingly, turned their attention to religious pluralism only recently. Feminist scholars have long emphasized the need for greater diversity in both analytic and continental discussions, but this has often meant gender, ethnic, and socio-economic diversity rather than religious diversity. To the extent that feminist perspectives have been applied specifically to religious pluralism, though, arguments have emphasized the degree to which particular religions have been taken as homogeneous traditions without internal diversity (Gross 2002, p. 60). A similar criticism has been leveled against representations of femininity in certain pluralist discussions of religion – for instance, in Hick’s discussions of feminist theological critiques of traditional Christianity (Hick 2004, p. 52; cf. xxxviii). In either case, one can observe a skepticism concerning understandings of both religions and women as possessing relatively stable or universal identities, as such conceptions provide adequate pictures neither of historical realities nor of the full range of belief claims (and their implications). Indeed, they can often serve to conceal the roles, concerns, beliefs, and practices not only of women but also of other marginalized groups within traditions. Insofar as they share a critical stance toward male-dominated traditions of thinking about religion generally and diversity specifically, feminist approaches to religious diversity may serve as points of contact between analytic and continental discussions.

The arguments that have emerged recently for explicitly feminist varieties of religious pluralism – or, conversely, for explicitly pluralist approaches to feminist theology – offer a variety of reasons to support the claim that feminism and religious pluralism are natural allies. For instance, many feminist philosophers and theologians argue that special attention needs to be paid not only to the experiences of people with diverse gender, ethnic, national, and economic identities, but also particularly to those whose experiences have traditionally been ignored or underrepresented within Western philosophical and theological traditions (cf. Gross 2002; Kwok 2002). If emphasizing the importance of such diversity is already a feminist value, then it stands to reason that inclusion of experiences from members of diverse religious traditions should also be valued. Furthermore, concepts, practices, and experiences arising from non-Western traditions may deserve special attention since they have traditionally been given less consideration in Western philosophical and theological discourse.

In addition, many feminist theologians take it as a central aim to find alternative scriptural sources or minority practices that can be used to critique, augment, or replace traditions that have historically excluded or undervalued women. This same inquiry into alternative sources and interpretations leads feminist theology toward interreligious dialogue, out of both the spirit of openness to difference and that of solidarity (Ruether 1987, p. 147). Extending the scope of critical feminist investigations across diverse religious traditions may broaden the possibilities for finding constructive resources (Gross 2002, p. 63). For example, many non-Christian traditions include devotions to goddesses, or (as discussed above) non-personal representations of the divine. Such practices and concepts can be helpful in critiquing the traditional masculine bias in Christianity, provided that the Christian theologian adopts a pluralist attitude toward other religious ideas. Of course, this is not to say that non-Christian traditions do not also contain patriarchal elements, and it is likewise the task of the feminist pluralist to analyze these critically (cf. Ruether 1987). Nevertheless, the main goal of much feminist theology and philosophy of religion is criticism and reinterpretation of one’s own tradition. In the context of religious diversity, this criticism would include not only fostering greater appreciation of the concepts and practices of other traditions as such, but, particularly, adopting a charitable attitude when approaching elements of other traditions that may at first glance seem anti-feminist. For example, norms of female dress that may seem excessively modest, or the practice of arranged marriage, may appear to limit women’s individual freedom, but upon further study could be seen as protecting women from sexual objectification (Gross 2002, p. 69).

Another important feminist contribution to religious pluralism is the critique of conceptions of particular religious traditions as possessing single, uniform identities. Perhaps the most direct and substantial such critique is offered by Jeannine Hill Fletcher (2005), who argues that describing religious traditions according to their specific differences and then identifying their members according to such distinctions is misleading in several ways. For one, this approach ignores the internal diversity of religious traditions; members of the same religion differ from one another in gender, ethnicity, profession, nationality, wealth, and so forth, and these differences may lead to drastically different experiences of the “same” tradition. Also, the religious identity of a single individual or community always intersects with various other identities, all of which are informed by social, cultural, and geographical locations and by particular experiences and behaviors. Perhaps most importantly for her understanding of religious pluralism, Fletcher also contends that such intersectional religious identities are always hybrid, by which she means that all identities are formed in relation to multiple other identities. Because of this, no identity is absolutely distinct and no difference completely precludes communication and understanding. Dialogue and mutual understanding between members of diverse religious traditions is possible because, in the complex mesh of relationships out of which different religious identities emerge, possibilities already exist for building solidarity with one another. Furthermore, the particular points of contact and the character of the mutual understanding and solidarity that result from interreligious discourse cannot be determined beforehand, because one cannot know before actually engaging with the other what similarities and differences one will encounter. Thus, from Fletcher’s perspective, feminist approaches to religious pluralism must be characterized by attention to the details of concrete engagements between individuals or communities, rather than by abstract conceptions of religious identities.

6. Process Philosophy

One important approach to religious pluralism that is not covered in the discussions of analytic and continental perspectives above is that of process philosophy. Drawing primarily on the work of Alfred North Whitehead and Charles Hartshorne, process philosophies of religion highlight the potential for novelty and creativity in the world. Since it is constitutive of process philosophy to hold that becoming and change are ontologically fundamental, process philosophers of religion tend to reinterpret, downplay, or deny the idea of an immutable divine reality. Process approaches also emphasize complexity and difference as inherent aspects of all levels of reality, so a pluralist approach to religious diversity would seem to follow naturally from their metaphysical commitments.

Another commitment that plays a central role for many process philosophers is a commitment to the naturalistic worldview of modern science. This is not to be equated with a strict positivism in which hard scientific inquiry is the only route to knowledge, but rather with the attitude that supernatural explanations that run counter to the presuppositions and conclusions of scientific knowledge – that is, explanations that depend on the interruption of natural causal processes by a deity or other supernatural force – should be abandoned as untenable. David Ray Griffin argues that although such naturalism is often combined with atheism and ontological materialism, this combination is not a necessary one (2005, p. 13). Accepting such naturalism as part of the basis for a pluralist approach to religious diversity may create difficulties for process positions, though, insofar as one cannot presume it to be a shared assumption of various contemporary religious perspectives. On the other hand, it provides an ontologically minimalist and relatively objective framework from which members of different religious traditions can, at least in principle, begin dialogue. It can also serve as the basis for an argument for epistemic humility in a vein similar to Hick’s – though with crucially different ontological claims.

In both Hick’s pluralistic hypothesis and process pluralism, ontological claims, religious experience, and devotional practices are understood to have a real referent. The first distinction between these two positions, though, is that in the former the Real is posited as a single ultimate referent, while in the latter different claims or representations can be taken to have distinct ultimate referents. John Cobb, for instance, argues in particular that personal and impersonal representations of the divine are difficult to reconcile as simply different concepts of the same noumenal reality, and that attempting such a reconciliation would end up revealing very little about either the traditions from which the concepts emerge or the Real to which they point (cf. Griffin 2005, p. 47). Instead, Cobb posits that ultimate reality must in itself be unimaginably complex. The various perspectives on this reality found among diverse religious traditions are then not only different ways of representing the same ultimate truth, but indeed distinct ways of representing different aspects of the complex totality of this truth. Impersonal representations signify one basic aspect of the Real, while personal representations signify a fundamentally different basic aspect.

Cobb’s claim rests on a broader metaphysical hypothesis that involves a plurality of ultimate realities, or at least ultimate aspects of reality: a personal deity or supreme being to which personal representations of the ultimate refer, an impersonal creativity to which impersonal representations refer, and the universe itself understood as the totality of all finite beings, to which nature-centered religious traditions refer. Given this basic plurality, the different concepts and experiences found in different religious traditions can be taken to be equally valid while retaining even those differences that appear to be mutually contradictory. In order to reconcile such contradictions, conflicting claims can be understood essentially as answers to different questions about the nature of reality (Griffin 2005, p. 48). If this is the case, then claims that at first seem conflicting can be reinterpreted as complementary. The work of pluralistic dialogue would be synthetic, placing claims from different traditions alongside each other to attempt a deeper understanding of the multiform nature of the ultimate.

The aim of such a synthetic approach, though, would not be to construct a new perspective that would incorporate claims from various religious traditions into a larger system, but rather to provide a context in which members of one tradition can both learn about and appreciate the value of other traditions and meaningfully reflect on their own beliefs. Between religions that focus on the same ultimate reality (for instance, the Abrahamic traditions, theistic Indian traditions, and others that posit personal deities), pluralistic dialogue may primarily motivate reflection on and purification of one’s own concepts. Between religions that focus on different ultimates (as in Buddhist-Christian dialogue, the example to which much of Cobb’s work is devoted), dialogue may serve to broaden and enrich the perspective of each participant.

While process approaches tend toward the kind of deep pluralism examined so far, there are exceptions. The differences have to do in large part with the way that one’s ontological commitments are articulated. Cobb’s position is grounded in a Whiteheadian metaphysics that affirms plurality and complexity all the way down; Schubert Ogden, however, proposes a position that, ontologically speaking, more closely resembles Hick’s in that Ogden affirms that ultimate reality must be conceived as singular. Following Hartshorne more closely than Whitehead, Ogden maintains that ultimate reality must in itself have a single structure, and he identifies this primarily with the Christian concept of God (1992, p. 47). Proceeding from this basis, it is still possible to affirm the possibility of a plurality of valid religious perspectives, insofar as complete knowledge of the ultimate structure of reality lies beyond the grasp of human experience. However, it seems more difficult to proceed to an affirmation of the actuality of such pluralism, unless that affirmation takes the general form of Hick’s hypothesis. Alternatively – and this is the position that Ogden seems to support – a process philosophy that posits a singular ultimate reality may support an epistemological pluralism (that is, the denial that one can know that there is only one valid religious perspective) joined with a broader religious inclusivism in which only that within others’ traditions which is “substantially” consonant with one’s own religious commitments is recognized as true (Ogden 1992, p. 102).

7. Liberation Perspectives

Liberation theology, which advocates a religious duty to aid those who are poor or suffering other forms of inequality and oppression, has had a significant influence on recent discussions of pluralism. The struggle against oppression can be seen as providing a common enterprise in which members of diverse religious traditions can come together in solidarity. Paul F. Knitter, whose work serves as a prominent theological synthesis of liberation and pluralist perspectives, argues that engaging in interreligious dialogue is part and parcel of the ethical responsibility at the heart of liberation theology. He maintains not only that any liberation theology ought to be pluralistic, but also that any adequate theory of religious pluralism ought to include an ethical dimension oriented toward the goal of resisting injustice and oppression.

Knitter claims that, if members of diverse religions are interested (as they should be) in encountering each other in dialogue and resolving their conflicts, this can only be done on the basis of some common ground. Yet in contrast to a position such as Hick’s, which posits a common noumenal reality toward which different religious traditions are oriented but which none can definitively represent, Knitter thinks this common ground needs to be neither transcendent nor already existing. In fact, the most meaningful interreligious encounters can spring from constructing shared responses to particular situations. What is necessary is that such responses react to experiences or phenomena that are more or less universal, and suffering is just such a universal phenomenon. It provides a common cause with which diverse religious traditions are concerned and towards which they can come together to craft a common agenda. Particular instances of suffering will, of course, differ from each other in their causes and effects; likewise, the practical details of work to alleviate suffering will almost necessarily be fleshed out differently by different religions, at different times and in different places. Nevertheless, Knitter maintains that suffering itself is a cross-cultural and universal phenomenon and should thus serve as the reference point for a practical religious pluralism. Confronting suffering will naturally give rise to solidarity, and pluralist respect and understanding can emerge from there.

Knitter does not limit his argument only to confrontations with suffering that fall within the scope of human ethics or politics. He extends his claim to encompass the entire earth, insofar as this is the shared horizon within which not only humans but all creatures coexist. Earth not only serves as a common physical location for all religious traditions, but it also provides these traditions with what Knitter calls a “common cosmological story” (1995, p. 119). That is, the earth is the focal point for our modern scientific knowledge of the cosmos, and this knowledge has become so all-encompassing that it now unavoidably places the entirety of creation alongside humanity in its narrative. On this basis, Knitter makes a case that different religious traditions share an ecological responsibility and that awareness of this shared responsibility, as it continues to emerge, can also serve as a basis for mutual understanding.

A few possible criticisms of Knitter’s liberationist view are worth mentioning. First, while it may be difficult to dispute that suffering in some form or other is actually a universal phenomenon, it is not apparent that the particular forms of suffering that arise in particular circumstances bear enough commonality to ground the kind of deep, cross-cultural, and interreligious solidarity that Knitter envisions. At the very least, it seems to be going too far to claim that such solidarity will arise automatically. Similarly, while it may be reasonable to presuppose that religious communities will respond to suffering in solidarity with those who suffer (or to prescribe that they should do so), this certainly is not and has not been the case universally. One can all too easily point out the many instances of one religious community suffering at the hands of another. Even in cases where members of multiple religions agree that a particular instance of suffering ought to be alleviated, it is conceivable that different religions will respond to the same instance of suffering in different, conflicting ways. While some form of justice is a central value in most religious traditions, the ways in which such a value is understood and practiced can vary considerably. By basing a pluralist approach on solidarity in response to suffering, one runs the risk of not giving sufficient attention to the diversity of actual forms justice can take. As Heim points out, treating others as we would like to be treated does not necessarily equate to treating others as they would like to be treated, and it is not obvious that there is an objective standard by which one’s response to suffering can be evaluated (2006, p. 96). It would then perhaps benefit advocates of the liberationist version of pluralism to posit that even justice is a concept open to critical examination and reinterpretation in the context of interreligious dialogue (Suchocki 1987, p. 159).

With regard to Knitter’s appeal to a common cosmological story, it is unclear to what degree such a narrative is actually held in common, especially among diverse religions. While cosmological narratives grounded in modern scientific knowledge are certainly widely accepted, there just as certainly remain communities that reject such narratives – in many cases, precisely for reasons having to do with religious belief and practice. If it is still the case that some religious traditions do tell largely different stories about the world, it seems problematic to take it as given that “the earth” or “the universe” forms the basis for a common cosmological narrative. Knitter’s optimism regarding an emerging awareness of a shared ecological responsibility might thus be premature.

One possible way of addressing these concerns is to build a liberationist approach to pluralism solely on the basis of particular, concrete struggles rather than the idea of a universal struggle against suffering in general. In a particular situation, the needs of an oppressed group can be addressed on their own terms, and the responses offered by religious communities already involved in the situation can serve as a starting point for fighting injustice and working toward liberation. Of course, this is only a starting point, since the responses of communities already engaged may be found wanting either by those who are suffering or by other communities who enter the struggle later. A pluralist approach would maintain that such involvement by religious outsiders be held open as a possibility; this is the function of the liberationist appeal to solidarity in struggle. The religious outsider may be motivated to work on behalf of the oppressed by commitments that differ from those of the oppressed, though, and the pluralist would hold that these differences ought to be respected. At the same time, the situation of the particular struggle provides a concrete context within which members of different religious communities can achieve a better understanding of each other in their differences. Any solidarity and mutual respect achieved would, on this account, be contingent and perhaps not easily transposable to other particular contexts – though this remains a possibility.

8. Conclusion

Religious pluralism, understood as a broad category of philosophical and theological responses to religious diversity, aims to account for this diversity as a positive phenomenon and to articulate ways that religious differences can be celebrated and conflicts mitigated, explained, or at least reasonably discussed. Pluralist positions can vary according to one’s understanding of religion (for example, whether it is taken primarily to consist of epistemic content, culturally constructed discursive practices, or salvation-oriented behavior), as well as according to one’s ultimate goal in articulating a position (for example, clarifying philosophical concepts of religion, or effecting social and political change such as in liberation theology). While there are significant differences in pluralist approaches evident in analytic and continental philosophy, there is also significant overlap in the content of arguments belonging to these traditions.

Major nineteenth and early twentieth century accounts of religion provide important precursors to religious pluralism, though they are largely not pluralist according to the strict sense of the term but rather exclusivist or inclusivist. Religious pluralism as a distinct philosophical and theological position has emerged more recently, and in its various forms it both draws on and is critical of these earlier accounts. Pluralism, of course, continues to be debated. It faces external challenges from exclusivists and inclusivists as well as religious anti-realists and relativists, and its various arguments are contested internally by those who argue that it concedes too much or that it has not yet become pluralist enough.

9. References and Further Reading

  • Alston, William. “Religious Diversity and the Perceptual Knowledge of God.” Faith and Philosophy 5.4 (1988): 433-448.
    • Offers an exclusivist argument that it is not irrational to continue to form and hold socially established religious beliefs (that is, those belonging to this or that religious tradition), even though the fact of religious diversity (particularly epistemic conflict) may decrease confidence in one’s beliefs.
  • Alston, William. Perceiving God: the Epistemology of Religious Belief. Ithaca, NY: Cornell UP, 1993.
    • A detailed (and significant) examination of the perception of God and the way in which such perception can provide grounds for religious belief. Alston’s response to the problem of diversity as laid out in his earlier article is included here, in the context of a more thorough discussion of its epistemological presuppositions.
  • Basinger, David. Religious Diversity: A Philosophical Assessment. Burlington, VT: Ashgate, 2002.
    • An overview and critical analysis of a variety of issues and perspectives regarding religious diversity understood primarily as epistemic conflict.
  • Bogardus, Tomas. “The Problem of Contingency for Religious Belief.” Faith and Philosophy 30.4 (2013): 371-392.
    • A criticism of the argument (particularly as advanced by John Hick and Philip Kitcher) that many religious beliefs are held due to contingent factors irrelevant to the truth or warrant of the beliefs. Bogardus attempts to show that several common forms of this argument rely on invalid logical inferences.
  • Byrne, Peter. Prolegomena to Religious Pluralism: Reference and Realism in Religion. New York: St. Martin’s, 1995.
    • Covers ontological, epistemological, and semantic aspects of religious pluralism, arguing that pluralism is plausible given a realist account of religious language.
  • Dean, Thomas, ed. Religious Pluralism and Truth: Essays on Cross-Cultural Philosophy of Religion. Albany: SUNY, 1995.
    • Contains essays addressing religious diversity and pluralism from a variety of perspectives, discussing questions of truth criteria, dialogue, and interpretation. Several essays take up the hermeneutic model of pluralism.
  • Derrida, Jacques. “Hospitality.” Acts of Religion. New York: Routledge, 2002.
    • Provides a sustained deconstructive account of hospitality as it relates to religion, with a lengthy analysis of the inter-religious work of Louis Massignon.
  • Derrida, Jacques. The Gift of Death. 2nd ed. Chicago: U of Chicago P, 2008.
  • Derrida, Jacques and Anne Dufourmantelle. Of Hospitality. Stanford, CA: Stanford UP, 2000.
    • Contains a deconstructive account of hospitality broadly construed, as well as more specifically in political and legal contexts, with reference to Kant.
  • Esack, Farid. Qur’an, Liberation, and Pluralism. Rockport, MA: Oneworld Publications, 1997.
    • A liberation-theological account of pluralism from a Muslim point of view, set in the context of the South African struggle against apartheid.
  • Feldman, Richard. “Reasonable Religious Disagreements.” Philosophers without God. Ed. Louise M. Antony. New York: Oxford UP, 2007.
    • A defense of the skeptical view that, in light of the diversity of conflicting religious belief claims, no one is justified in holding any set of such beliefs as true.
  • Fletcher, Jeannine Hill. Monopoly on Salvation? A Feminist Approach to Religious Pluralism. New York: Continuum, 2005.
    • A feminist theological perspective on religious pluralism, emphasizing the intersectionality and hybridity of religious identities.
  • Gadamer, Hans-Georg. Truth and Method. 2nd revised ed. New York: Continuum, 2002.
    • The classic foundational text for twentieth-century hermeneutic philosophy.
  • Gellman, Jerome. “Religious Diversity and the Epistemic Justification of Religious Belief.” Faith and Philosophy 10.3 (1993): 345-364.
    • A defense of an exclusivist position concerning “evidence-free” belief in the face of religious diversity, on the basis of the foundational nature of at least some religious beliefs.
  • Gellman, Jerome. “In Defence of a Contented Religious Exclusivism.” Religious Studies 36.4 (2000): 401-417.
    • An extension and further defense of Gellman’s exclusivist position against more recent criticisms.
  • Griffin, David Ray, ed. Deep Religious Pluralism. Louisville, KY: Westminster John Knox, 2005.
    • A collection of essays articulating process approaches to religious pluralism, including contributions by and about John Cobb.
  • Griffiths, Paul J. An Apology for Apologetics. Maryknoll, NY: Orbis, 1991.
    • An argument that religious pluralism implies obligations for both “negative” and “positive” apologetic arguments concerning one’s own religious commitments.
  • Gross, Rita M. “Feminist Theology as Theology of Religions.” The Cambridge Companion to Feminist Theology. Ed. Susan Frank Parsons. New York: Cambridge UP, 2002.
    • An argument that feminist theology ought to incorporate more religious diversity and that pluralist philosophy and theology ought to engage more with feminist perspectives.
  • Hegel, G. W. F. Lectures on the Philosophy of Religion. Ed. Peter C. Hodgson. 3 vols. New York: Oxford UP, 2007.
    • Hegel’s monumental treatment of the concept and history of religion; this edition gathers three versions of his lectures, given in 1824, 1827, and 1831. Volume one focuses on the concept of religion, volume two on the various forms it has taken historically, and volume three on what Hegel calls the “Consummate Religion” (that is, Christianity).
  • Heim, S. Mark. Salvations: Truth and Difference in Religion. Maryknoll, NY: Orbis, 2006.
    • A criticism of the versions of pluralism offered by Hick, Knitter, and Wilfred Cantwell Smith, and a pluralist account that posits a diversity of ultimate religious ends.
  • Hick, John. An Interpretation of Religion. 2nd ed. New Haven, CT: Yale UP, 2004.
    • Hick’s classic account of the “pluralistic hypothesis;” his introduction to the second edition contains responses to some of the criticisms his arguments have received.
  • Hick, John and Paul F. Knitter, eds. The Myth of Christian Uniqueness. Maryknoll, NY: Orbis, 1987.
    • An important early collection of essays by Christian theologians advocating pluralistic approaches to religious diversity.
  • James, William. The Varieties of Religious Experience. New York: Routledge, 2002.
    • The influential account of religion in terms of diverse personal experiences and emotions.
  • Kant, Immanuel. Religion within the Bounds of Bare Reason. Trans. Werner S. Pluhar. Indianapolis: Hackett, 2009.
    • Kant’s classic formulation of religion as properly concerning human morality, positing religious diversity as historically inevitable but nevertheless inessential.
  • Kearney, Richard and James Taylor, eds. Hosting the Stranger: Between Religions. New York: Continuum, 2011.
    • A collection of essays approaching religious pluralism in terms of hospitality, from a variety of religious traditions.
  • Kitcher, Philip. “Challenges for Secularism.” The Joys of Secularism. Ed. George Levine. Princeton, NJ: Princeton UP, 2011.
    • An argument for the necessity of developing secularism as a positive doctrine and way of life, which posits that the primary challenge secularism makes against religious belief is the argument from the epistemic symmetry of conflicting beliefs.
  • Knitter, Paul F. One Earth, Many Religions. Maryknoll, NY: Orbis, 1995.
    • A significant liberation-theological approach to religious pluralism from a Christian perspective.
  • Kwok, Pui-Lan. “Feminist Theology as Intercultural Discourse.” The Cambridge Companion to Feminist Theology. Ed. Susan Frank Parsons. New York: Cambridge UP, 2002.
    • A review of the ways that feminist theology has been influenced both by the particular experiences of women in diverse cultures and by dialogue among these cultures. The chapter also offers a critique of Eurocentric tendencies in feminist theology and a plea for greater attention to injustices exacerbated by globalization.
  • Levinas, Emmanuel. “Ethics as First Philosophy.” The Levinas Reader. Ed. Seán Hand. Cambridge, MA: Blackwell, 1989.
    • A summary of Levinas’s central philosophical argument, which aims to reorient earlier phenomenological and hermeneutic positions (particularly those of Edmund Husserl and Martin Heidegger) towards greater concern for ethics.
  • Lindbeck, George A. The Nature of Doctrine. Louisville, KY: Westminster John Knox, 1984.
    • An influential account of religion and theology that articulates two distinct models: the experiential-expressive and the cultural-linguistic. Chapter three deals particularly with the issue of religious diversity.
  • Mavrodes, George I. “Polytheism.” The Rationality of Belief and the Plurality of Faith: Essays in Honor of William P. Alston. Ed. Thomas D. Senor. Ithaca, NY: Cornell UP, 1995.
    • A critical reading of Hick’s “pluralistic hypothesis” that argues that this hypothesis demonstrates a “descriptive polytheism,” despite Hick’s positing the unity of the Real.
  • McKim, Robert. On Religious Diversity. New York: Oxford UP, 2012.
    • An examination and critical appraisal of a variety of approaches to religious diversity, focusing on epistemological and soteriological concerns and ultimately advocating a version of inclusivism.
  • Ogden, Schubert M. Is There Only One True Religion or Are There Many? Dallas: SMU P, 1992.
    • A process-theological account of pluralism that challenges the idea, put forward by other process thinkers, that ultimate reality must be plural in order for religious pluralism to be a tenable position.
  • Plantinga, Alvin. “Pluralism: A Defense of Religious Exclusivism.” The Rationality of Belief and the Plurality of Faith: Essays in Honor of William P. Alston. Ed. Thomas D. Senor. Ithaca, NY: Cornell UP, 1995.
    • As the title suggests, a defense of a version of exclusivism against charges that it is either epistemologically or morally objectionable.
  • Plantinga, Alvin. Warranted Christian Belief. New York: Oxford UP, 2000.
    • An important and extensive epistemological examination of the idea of warrant: that is, that quality of a belief that is accorded to it due to the proper function of cognitive faculties. The book includes consideration of possible “defeaters” of theistic (and specifically Christian) belief, including religious diversity. Plantinga maintains that a version of exclusivism remains tenable even given such diversity.
  • Quinn, Philip L. “Toward Thinner Theologies: Hick and Alston on Religious Diversity.” Essays in the Philosophy of Religion. New York: Oxford UP, 2006.
    • Provides critical but mostly positive accounts of Hick’s and Alston’s respective positions, arguing that a revised version of Hick’s hypothesis ought to be considered as rational within the framework of Alston’s approach.
  • Quinn, Philip L. and Kevin Meeker, eds. The Philosophical Challenge of Religious Diversity. New York: Oxford UP, 2000.
    • A wide-ranging collection of (mostly analytic-philosophical) essays that presents a variety of exclusivist, inclusivist, and pluralist accounts of religious diversity.
  • Race, Alan. Christians and Religious Pluralism. London: SCM P, 1983.
    • The Christian theological account of pluralism that introduced the categories of exclusivism, inclusivism, and pluralism.
  • Ricoeur, Paul. “Religious Belief: the Difficult Path of the Religious.” Passion for the Possible. Eds. Brian Treanor and Henry Isaac Venema. New York: Fordham UP, 2010.
    • One version of Ricoeur’s account of interreligious dialogue as translation.
  • Rowe, William. “Religious Pluralism.” Religious Studies 35.2 (1999): 139-150.
    • A criticism of Hick’s “pluralistic hypothesis” focused on Hick’s claim that, given pairs of contradictory properties, the Real in itself does not possess either one.
  • Ruether, Rosemary Radford. “Feminism and Jewish-Christian Dialogue.” The Myth of Christian Uniqueness. Ed. John Hick and Paul F. Knitter. Maryknoll, NY: Orbis, 1987.
    • An examination of possibilities for feminist pluralism in the context of, on the one hand, Jewish-Christian dialogue and, on the other hand, particular feminist criticisms of each of these two traditions.
  • Runzo, Joseph. “God, Commitment, and Other Faiths: Pluralism vs. Relativism.” Faith and Philosophy 5.4 (1988): 343-364.
    • An articulation and defense of religious relativism as an alternative to pluralism.
  • Schellenberg, J. L. “Religious Experience and Religious Diversity: a Reply to Alston.” Religious Studies 30.2 (1994): 151-159.
    • An argument that Alston’s response to the problem of diversity is unsuccessful and that, in cases of significant epistemic conflict, justification for religious belief is removed.
  • Schleiermacher, Friedrich. On Religion: Speeches to its Cultured Despisers. Ed. Richard Crouter. New York: Cambridge UP, 1996.
    • The influential account of religion that distinguishes it from both metaphysics and morality, and grounds it in an individual intuition of the absolute. The fifth and last speech concerns religious diversity; Schleiermacher argues that such diversity is necessary.
  • Suchocki, Marjorie Hewitt. “In Search of Justice: Religious Pluralism from a Feminist Perspective.” The Myth of Christian Uniqueness. Ed. John Hick and Paul F. Knitter. Maryknoll, NY: Orbis, 1987.
    • An argument that feminist theology, out of its commitment to work against oppression, must affirm a pluralist perspective. The author draws on liberation theology to posit justice as the foundation on which interreligious dialogue can happen.
  • Tracy, David. Dialogue with the Other: the Inter-Religious Dialogue. Grand Rapids, MI: W. B. Eerdmans, 1991.
    • A hermeneutical approach to religious pluralism primarily from a Christian point of view, though it incorporates perspectives from other traditions.
  • Westphal, Merold. “The Politics of Religious Pluralism.” The Proceedings of the Twentieth World Congress of Philosophy 4 (1999): 1-8.
    • A brief critique of the pluralist approach to religious diversity, which argues that ethical and political norms found in religious traditions – particularly the commitment to non-violence – can help dissociate the cognitive content of religious belief claims from “religiously sanctioned” violence.
  • Wittgenstein, Ludwig. Lectures and Conversations on Aesthetics, Psychology, and Religious Belief. Berkeley, CA: U of California P, 1966.
    • Wittgenstein’s lectures on religious belief have been influential on positions that argue that religious diversity is not primarily an epistemic or cognitive matter, especially the cultural-linguistic model favored by Lindbeck and the aspect of Hick’s argument that focuses on practice rather than belief.


Author Information

Michael Barnes Norton
Email: mbnorton@ualr.edu
University of Arkansas at Little Rock
U. S. A.

Philosophy of Biology

Philosophy of biology is the branch of philosophy of science that deals with biological knowledge. It can be practiced not only by philosophers, but also by scientists who reflect on their own work. The distinctive mark of philosophy of biology is the effort to achieve generalizations about biology, at various levels of generality. For instance, philosophy of biology makes biology relevant to classic issues in philosophy of science such as causation and explanation, chance, progress, history, and reductionism. It also works to characterize how knowledge is acquired and modified in different areas of biology, and sometimes to clarify the criteria that demarcate science from non-science.

Philosophy also performs constructive criticism of biology. For example, it has an important role in analyzing cases of “naturalization”—when science becomes able to study issues that traditionally were the exclusive domain of philosophy. The life sciences and their objects are changing and growing exponentially. A challenge for philosophy of biology is thus to keep pace, not only with new knowledge modifying long-standing ideas (for example, the “Tree of Life”), but also with new scientific practices and unprecedented kinds of data. Accordingly, philosophy of biology is constantly prompted to shift its own methods and focus. In some cases, philosophy of biology can aid the life sciences to reach their goals, by means of conceptual analysis, linguistic analysis, and epistemological analysis.

Hybridizations and intersections between scientific fields are particularly conducive to philosophical considerations. Contemporary examples are ‘EvoDevo’ (the recent integration between development and evolution) and ‘cultural evolution’ (an approach to cultural change inspired by evolutionary biology). Theses and analyses in philosophy of biology are often entwined with the history of biology, and with the history of evolutionary thought in particular. Finally, philosophy of biology can elaborate messages and general views out of biology, and has a crucial role in attending to how science is publicly interrogated and communicated.

Table of Contents

  1. Introduction
  2. General Issues in Philosophy of Biology
    1. General Problems in Philosophy of Science, as Seen in Biology
    2. A General Picture of Biology
    3. A General Picture of Science
    4. Generalization as a Possible Distinctive Feature of Philosophy with Respect to Biology
  3. Philosophy Flanking Biology
    1. Clarifying Taxonomy, Classification, Systematics, Phylogeny, Homology
    2. Formulating Natural Selection
  4. Who Can Do Philosophy of Biology?
    1. Philosophical Biologists
      1. Mayr and Population Thinking
      2. Gould and Adaptationism
    2. Philosophical Issues Naturalized
      1. An Example: The Biology of Morality
      2. Philosophy Versus Naturalization?
  5. Philosophy Bringing the Life Sciences out of Their Research Context
    1. Philosophy of Biology at Intersections
    2. Biology’s Critical Friend
    3. Developing Messages from Biology
  6. Scientifically Up-to-Date Philosophy
    1. Questioning Influential Ideas
    2. Understanding New Scientific Practices
    3. Rethinking the Philosophical Approach from New Ways of Doing Science
  7. History and Philosophy of Biology
  8. Conclusion
  9. References and Further Reading
    1. Cited Examples
    2. Classics
      1. First Generation
      2. Second Generation
    3. Contemporary
      1. Reviews
      2. Some Monographs
    4. Anthologies and Textbooks
    5. Journals
      1. Dedicated
      2. Generalist
    6. Organizations
    7. Online Resources

1. Introduction

According to several reconstructions of the history of philosophy of biology, the field emerged gradually in the 1960s with a first generation of self-identified philosophers of biology, especially Morton Beckner, David Lee Hull, Marjorie Grene, Kenneth Schaffner, Michael Ruse, and William C. Wimsatt. As an explanation for this branching off from philosophy of science, some philosophers point to the decline of logical positivism in the 1960s and 1970s. For others, logical positivism did not actually decline, and in any case it had never suppressed philosophy of biology (Callebaut 1993). At times, the ‘official’ chronology is questioned. For Byron (2007), proper philosophy of biology was already present in early philosophy of science, since the 1930s, as shown by a bibliometric analysis. The most quoted philosopher in this article is David Lee Hull (1935-2010), an uncontroversially important figure in the founding generation of philosophers of biology. His meta-reflective papers “What philosophy of biology is not” (1969) and “Recent philosophy of biology” (2002) are particularly useful.

Philosophy of biology became a professional subdiscipline from the mid-1970s onward, with a ‘second generation’ of philosophers, the most cited being Ronald Amundson, John Beatty, Robert N. Brandon, Richard Burian, Lindley Darden, David J. Depew, John Dupré, James R. Griesemer, Philip Kitcher, Elisabeth A. Lloyd, Alexander Rosenberg, Elliott Sober, and Bruce H. Weber. Some of them were experienced philosophers who progressively shifted to biological issues. The first journal partially devoted to philosophy of biology – History and Philosophy of the Life Sciences – began publication in 1979, and by the mid-1980s the discipline was fully established. Specialized journals flourished. In the early 2000s, a growing number of scholars, institutions, and journals specialized in philosophy of biology, and the discipline gained more and more room in scientific books, journals, and conferences (see the resources at the end of the article).

As we shall see, philosophy of biology provides accounts of biological knowledge, asking: how are explanation, causation, evidence and other epistemological primitives elaborated in the explanations that are typical of biology, such as natural selection, genetic drift, and homology? Does biology differ from other sciences? How? And how do we understand the epistemological diversity across different branches of the biological sciences? Philosophy of biology also considers whether biology may contribute to redefining the classical demarcation of science from other forms of knowledge and human creation.

Philosophy of biology can be seen as a possible aid for scientific advancement in the life sciences. Contributions of philosophers have been widely appreciated by scientists, for example, in the areas of classification, taxonomy, and related activities, and in the abstract formulation of natural selection in the development of biology after Darwin. Scientists themselves may reflect philosophically on their own field of research, justifying and correcting their practices, or denouncing biases and transformations in their own community. Concepts such as ‘adaptation’ or ‘species’ are underlain by complex inferential structures that can be revealed and sometimes criticized by philosophical analysis. Multiple and conflicting meanings may be uncovered and systematized to help the progress of science and to develop more general messages.

The phenomena studied by biology make this science particularly sensitive and interesting for philosophy. Humans are organisms, and quite a few fields of biology have potential or direct implications for our self-understanding. Since the 1970s, for example, interesting philosophical debates have stemmed from the provocative proposal of a ‘sociobiological synthesis’; this synthesis claims to provide evolutionary explanations for human prosocial (and anti-social) behaviors that were traditionally covered by ethics. Philosophy overcame merely self-defensive attitudes, and its important role lay in epistemological analysis and in deep reflections on the limits and conditions of naturalization, which may be understood as the transition of a problem into the domain of empirical science. Neurobiology offers particularly fertile ground for reflections about how human phenomena can be related to, or even explained by, biology. And how should a philosophical field like moral philosophy take biology into account? (For more on the topic of the naturalization of morality, for example, see ethics.)

Philosophy of biology may study and support the interaction among different life sciences, as in the case of evolutionary developmental biology, where workers claim to be reuniting genetics and evolution with embryology, recomposing a historical divide in biology. How do different research traditions integrate or replace each other? This question casts new light on classic issues such as progress and scientific change. Philosophy of biology also monitors the natural hybridization of biology with extra-biological fields, such as cultural transmission, and enriches the debate among scientists, where extreme positions often pop up: does biology offer more rigorous methods to replace the failing methods of the social sciences? Are we facing, instead, a case of mutual inspiration? Or methodological integration? Which reciprocal prejudices are well-grounded? And how can they be overcome for fruitful scientific collaborations?

Philosophy of biology also has an indispensable critical role towards biology. For example, it can unveil the progressionist, anthropomorphic, and anthropocentric biases that affect scientists as human beings who live immersed in a society and in a cultural environment. Critical attention must be particularly high when scientific classifications of humans (for example, through measures such as IQ or ethnicity) may serve to justify and increase social discrimination, segregation or oppression.

Philosophy of biology may also develop ways of thinking that build up from biological research, providing an inspiring, readable, and encompassing view of the living world that will hardly be found in any standard scientific publication. Furthermore, philosophy of biology is called upon to work at the interface between science and society, addressing common misunderstandings and contributing to the best strategies for citizens to become conscious and informed, as they are called to decide what kind of research and intervention will be allowed or actively pursued by society.

It is hard for philosophy of biology to keep pace with the fast development of biological knowledge. But the effort of following the moving frontier of knowledge allows philosophy of biology to study the fall of influential ideas, such as the universal Tree of Life, and the rise of new scientific practices, such as intensive computer modeling. Philosophy also has the unsettling opportunity to constantly rethink its own approach, avoiding drifting so far from scientific practice as to become detached. In this dynamic, philosophy of biology is also well integrated with the history of science, so that it is often hard to distinguish between the two. An analysis of the relationship between molecular biology and Mendelian genetics, for example, is intertwined with the historical account of the birth and early development of molecular biology in the 1980s. In turn, the philosophical framing of genetics and developmental biology as either ontology-based disciplines or research styles radically transforms the way in which the history of the two fields is told.

Philosophy of biology belongs to philosophy; therefore, no fixed procedure or protocol constrains its research (what is philosophy?). Philosophy of biology consists in free and critical, although rigorous and informed, thought on biological knowledge as the latter develops through time. However, as a mature and recognized field with its own interconnected practicing community, philosophy of biology seems to feature some methodological principles:

  • Philosophy of biology is supposed to be scientifically informed and up-to-date, capturing how recent research modifies established knowledge and creates new scientific practices. In turn, these novelties transform philosophy’s problems and approaches, especially given the current explosive growth of biology.
  • Philosophy and biology are not always clearly distinct. Scientific work can routinely require, for example, conceptual or epistemological analysis. On the other hand, philosophy can turn out to be effective in setting up scientific research projects. However, philosophy can be characterized by its leaning towards generalities about biology, namely general philosophical problems, general characterizations of fields and approaches within biology, or conceptualizations of biology as a whole or even of science as a whole.
  • Philosophy of biology should try to be understandable and possibly useful to biologists. Its tools — conceptual analysis, epistemology, traditions of thinking and debates — should be put to use for improving scientific research.
  • Biologists can do philosophy of biology. This happens, for example, when they become interested in general features of biology and try to contribute with principles derived from their work or when they think about the inferential patterns employed by themselves and their colleagues. Also, scientists can do philosophy and speak to philosophy when particular objects of philosophical study, such as human morality, get naturalized (see below).
  • Philosophy of biology is concerned with working across disciplinary contexts. For example, it studies novel contacts between previously separated fields, develops general views of the living world from some aspect of the life sciences, or reveals complex connections between science and the socio-cultural context in which it is carried out. It also draws on this knowledge to monitor and assist how science is publicly communicated and interrogated.
  • Philosophy of biology is increasingly seen as of a piece with the history of biology, since philosophical and historical theses are mutually necessary, and their results reverberate reciprocally.

These six methodological principles are usually tacit, but sometimes they are made explicit by philosophers of biology, who may also disagree on some of them. The principles will be presented here by means of exemplar studies. Any set of examples is, in any case, partial and biased, since philosophy of biology is a huge field full of fascinating topics, growing exponentially along with biology. For a more complete picture, the interested reader will have to navigate the resources listed at the end of the article, such as philosophy of biology journals or the programs of conferences such as the biennial meetings of the International Society for the History, Philosophy, and Social Studies of Biology, the main reference society of the field. A number of textbooks in philosophy of biology are available, often in the form of anthologies. A list of all these resources is provided at the end for further reading.

Given the vastness of the philosophy of biology literature, this article can only indicate some of the main topics and the richness of discussions. The examples in this article are mainly focused on evolutionary biology. Evolutionists such as Ernst Mayr (1904-2005) and Stephen Jay Gould (1942-2002), two of the most influential authors, are extensively treated in this article though they are not universally representative. The predominance of evolution can be justified not only by the author’s specialization, but also by the fact that—as many philosophers of biology have critically stressed in recent times—evolutionary theory has long been the main target of philosophy of biology. Only in the last few decades has this situation changed radically (Müller-Wille 2007, Pradeu 2009). Philosophy of biology is already tackling an enormous range of topics in the most disparate fields, from biomedicine to community ecology, from neurobiology to microbiology and microbial ecology, and from chemistry and biochemistry to exobiology. To look for some specific area, the interested reader is, once again, encouraged to venture into the journals and online resources.

2. General Issues in Philosophy of Biology

Philosophers of science (though not always under this fairly recent name) have reflected for centuries on explanation, causation, correlation, chance, and many other general topics concerning science or knowledge in general. Important philosophers contributed to concepts like reduction vs. multiple realizability, and provided theories of explanation that describe what a scientific explanation is. During the second half of the 20th century, philosophy of science adopted a pluralistic strategy, considering the diversity of scientific disciplines and methods and striving to understand their differences along with their common aspects. With this pluralization, the complex task remained of finding a satisfying description of science as an endeavor which—unitary or not—is distinct from other forms of knowledge.

The study of living beings offers a universe of occasions for philosophy of science to advance its reflections. For instance, explanations by natural selection and by drift (see below, 2.a) can be seen as instances of causal explanations that nonetheless bring about new reflections and conceptual puzzles on the classic issues of causation, randomness, and ontology of processes. Homology explanations, another typical feature of biology, explain the properties or the variability of a biological character by citing the ancestral character or organ, and the causal factors that historically modify the descendants of that ancestral organ. In trying to account for the biological sciences, philosophy of biology may take concepts from philosophy of science, such as causal explanation or reduction, and find new putative cases of them in the life sciences or locate failures of reduction in biology. At other times, philosophy of biology may need to tailor new concepts to accommodate biology. In fact, some kinds of explanation seem peculiar to biology or to the historical sciences. While chasing the peculiarities of biology, philosophy of biology also pursues some general research goals:

  1. Often, philosophy of biology scours biology in search of new insights on general philosophical problems about science, such as the “problem of induction,” or the realism vs. instrumentalism dichotomy. Additionally, new general problems arise from the particular forms that explanation, causation, or reduction take in biology.
  2. Sometimes philosophy of biology seeks general characterizations of particular fields, practices, or ways of thinking within the life sciences. Other times, the goal seems to be a general picture of biology, especially by contrast to other sciences, such as classical mechanics.

Sometimes philosophy of biology suggests general views of science, descriptions and characterizations of science with all the complexity, differentiation, and plurality that it exhibits in the contemporary world. In fact, the classical task of a demarcation of true science from other forms of knowledge has lost importance under the effect of philosophy of ‘special’ sciences like biology (Fodor 1974).

a. General Problems in Philosophy of Science, as Seen in Biology

Natural selection is a major biological explanation for the features of organisms. Inherited traits, which originate through the cumulative retention of random variation, are there because of the positive contribution they have brought to their bearers in past generations, in terms of survival and reproduction. Yet the explanatory structure of natural selection is very complex, and it calls for reflection on concepts like causality and randomness. In a classic book, Sober (1984) pointed out that only a few of the traits that get selected are in fact explanatory, specifically those traits that are selected for. Other traits are free riders that are somehow correlated with traits that are selected for and thus are preserved in the population without actively contributing to fitness. Thus, there is selection of such free-rider traits even though they are not causally relevant to survival and reproduction. Hearts are positively selected; the sound of the heartbeat is also selected, but not selected for; efficiency in pumping blood is selected for; the existence of the heart’s sound is thus explained by natural selection, but it is not explanatory per se; it undergoes selection of, not selection for. The idea of free riders on selection was already considered by Darwin, but philosophers of biology spelled out its consequences for the explanatory structure of natural selection. Causal relationships are the core of some theories of explanation. Sober proposed rethinking the idea of causality in light of evolutionary biology, and this is an example of how classic philosophical categories can be modified in their application to biology: “We must show that by considering evolutionary theory, old problems can be transformed and new problems brought into being. It remains to be seen, I think, how radically the philosophy of science will be reinterpreted” (Sober 1984: 7; see Matthen and Ariew 2009, Ramsey 2013).

Randomness and chance are very important in biology. Natural selection is not random: “it requires randomness as its ‘input’…but the ‘output’ of natural selection is decidedly nonrandom, the differential survival and reproduction of the variants that are better adapted” (Rosenberg and McShea 2008: 21). Philosophers worked, for example, on the meaning of “random mutation,” a concept considered essential to Darwinism as opposed to Lamarckism (Merlin 2010). Random does not necessarily mean lacking deterministic causes. Rather, random mutation points to the fact that the usefulness of a trait in the environment where it appears is not among the causes of its appearance. The source of variation is thus more properly contingent with respect to fitness. An interesting line of reasoning and evidence points out that “evolutionary divergence is sometimes due to differences in the order of appearance of chance variations, and not to differences in the direction of selection” (Beatty 2010: 39).

Genetic drift is the change in the frequencies of traits that are not under selection. The absence of selection makes the dynamic depend only on the reproduction mechanism, and although the fate of individual traits is not predictable, the overall landscape of frequencies is. Genetic drift has long been known in formal models, studied in the field, and used for evolutionary reconstruction: it is not only necessary, but also causally relevant to evolution. Yet, a lively philosophical debate exists on the ontology of drift and selection (Millstein 2002; Walsh et al. 2002). A major disagreement concerns the epistemic status of mathematical models: granted that drift is a necessary feature of mathematical models, what can be legitimately inferred about the existence of a process in the world to be called drift? A statistical interpretation sees both drift and selection as mathematical features of aggregates of individuals (Walsh 2007, Matthen 2009, 2010). Another point of view considers them as causal physical processes (Millstein 2006, Millstein and Skipper 2009). The notion of evidence as a way of choosing between alternative explanatory hypotheses, selection versus drift, for example, is another object of philosophical study. Some philosophers think scientists are more qualified to evaluate and weigh evidence (Hull 1969: 169). Others point out that it is up to philosophy to probe the explanatory limits—current and constitutional—of biology (Rosenberg and McShea 2008: 2). One fruitful approach to such a task is the technical account of evidence based on Bayesianism (Sober 2008). Bayes’ theorem belongs to the mathematics of probability theory. It is based on prior probability, the probability of a particular statement before the observation, and posterior probability, the probability of the statement in the light of the observation. Bayes’ theorem is used by some schools of philosophers of biology in explicating various issues connected with evidence and confirmation.
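For reference, Bayes’ theorem in its standard, generic form (not a formulation specific to Sober’s treatment) states that P(H | E) = P(E | H) P(H) / P(E), where P(H) is the prior probability of a hypothesis H, P(E | H) is the probability of observing the evidence E if H were true, and P(H | E) is the posterior probability of H once E has been observed. In the selection-versus-drift case, for example, an observation counts as evidence favoring the selection hypothesis over the drift hypothesis to the extent that the observation is more probable under the former than under the latter.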

Homology explanations explain the properties or the variability of a biological character (the form of a wing, for example, or the range of different wing shapes across different groups of organisms) by citing the ancestral character or organ, and the modification factors that affect the descendants of that ancestral organ. Like many modes of explanation in biology, homology explanations are historical (see also 3.a). Philosopher Ereshefsky (2012) compares homology explanations with analogy explanations, which instead explain a character by citing the contribution of that character to a function. Ereshefsky points out that homology explanations are more detailed and offer a better account of observed differences. Homology explanations can also be turned into strong historical explanations, as opposed to weak ones that only cite the ancestral, initial condition. This happens, for example, when detailed molecular studies of the development of the target character illuminate precise events, such as gene duplications, that must have been crucial along the historical path. The study of genetic and developmental pathways also gives access to hierarchical disconnect, which, for Ereshefsky, relates homology explanations to classical topics of philosophy of science: multiple realizability and reductionism. Hierarchical disconnect happens when “a homologue at one level of biological organization is caused by non-homologous developmental factors at lower levels of organization” (p. 385). Along the historical path, for example, a morphological structure can remain stable while its underlying developmental modules change (Griffiths and Brigandt 2007). For Ereshefsky, this is a biological example of multiple realizability, that is, the fact that a kind at one level of organization cannot be reduced to a kind at a lower level. This, in turn, counters ideas by Alex Rosenberg about reductionism in biology (Rosenberg 2006). For Rosenberg, Ereshefsky says, the history of homologous characters should be reducible to the history of their physical substrata, but hierarchical disconnect shows decoupled histories and multiple realizability.

b. A General Picture of Biology

In characterizing ways of thinking and kinds of explanations and evidence, philosophy of biology formulates generalizations about biology. These generalizations become particularly encompassing when philosophy of biology tries to characterize biology explicitly and comprehensively as distinct from other sciences. Biology is generally considered a ‘special science’—a term inherited from logical positivist philosophy of science that does not preclude, in the long run, a reduction to physics (Fodor 1974, Rosenberg and McShea 2008). Among the most noticed and studied features of biology is the apparent absence of scientific laws of the kind exemplified by physics (but see Waters 1998). There have been many endeavors to describe biology as a science without relying on laws, yet without relegating it to a mere collection and description of singular events where exceptions are the rule.

For Ernst Mayr (1982, 2004), biology is based on concepts or principles, which are more flexible than laws: biology is a unique science by virtue of concepts that allow for biological explanation, including inheritance, program, population, variation, emergence, organism, individual, species, selection, fitness, and so on. Biology is also characterized by population thinking, introduced by Darwin, which differentiates biology from mechanics or chemistry whose thinking is, for Mayr, essentialist or typological (for more on this see 4.a.i).

For paleontologist Niles Eldredge biology is based on patterns. Patterns are law-like regularities, consisting of repeated schemes of events. This notion characterizes biology as a historical science, while reducing the gap that, in other views, separates biology from other natural sciences like physics. The pattern of inclusive hierarchical similarity in the biological world was seen by Linnaeus and captured in his binomial nomenclature. Darwin saw more patterns, for example in the geographical distribution of species and varieties. He then discovered a subset of the grand complex of repeated events, or regular processes, that give rise to biodiversity on Earth—the pattern of evolution. Mendel caught some patterns as well, in his observations of inheritance. Patterns have a double nature, ontological and epistemological: “Patterns in the natural world are extremely important.… They pose both the questions and the answers that scientists formulate as they seek to describe the world…. Science is a search for resonance between mind and natural pattern as we try to answer these questions” (Eldredge 1999: 4-5).

Other scholars think of biology as a science of mechanisms. A growing philosophical movement called “the New Mechanism” (Machamer et al. 2000) uses the concept to revise classic ideas like causation, discovery, and explanation. In this view, biologists aim to discover and represent mechanisms with their schemas, sketches, and theoretical models. The characterization of natural selection as a mechanism, for instance, has been proposed, but the question remains unresolved (Skipper and Millstein 2005).

Model-based accounts have acquired particular importance in philosophy of biology (Schaffner 1993, 1998). The semantic view of scientific theories in the 1980s (see 2.c) was a first ambitious endeavor to characterize biology as a model-based science. Downes (1992) pointed out the limits of the semantic view in its original formulation, but proposed that philosophers of biology keep some of its central claims, among which are the centrality of various kinds of models in biology and their promise of accounting for all scientific theorizing.

At a lower degree of generality, philosophy of biology offers a proliferation of ideas about how schools of scientists, fields, or approaches perform fundamental activities of science like explaining, describing, understanding, or predicting. Some examples are the aforementioned, conceptually challenging explanations like natural selection, drift, and historical homology explanations. Others include the practices of ecological modeling and model organisms (6.b), particular inferential patterns such as adaptationism (4.a.ii), and the difference between geneticists and developmental biologists (7). These are all typicalities, or modest generalizations, about sub-parts of biology. Population genetics is a thoroughly studied field from this point of view. By studying population genetics, philosophers have raised wide-ranging topics of reflection, such as the degree of idealization of models and experiments (Plutynski 2005, Morrison 2006). The concept of the possible has played a role in accounting for how biological idealizations can be explanatory: if biological models cannot predict or demonstrate necessity, they can at least restrict the field of what is possible, yielding so-called “how possibly” answers or explanations. For Rosenberg and McShea (2008) it is a “usual scientific” strategy that “scientists try to characterize a range of possible causes of evolution, and then to determine which of these possibilities actually obtained. The actual is first understood by first embedding it in the possible” (p. 13).
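To give a sense of what such idealization looks like, consider a standard textbook recursion for selection (a generic illustration, not a model drawn from the works cited here): with two competing types A and a at frequencies p and 1 − p, and with constant fitnesses w_A and w_a, the frequency of A in the next generation is p′ = p·w_A / (p·w_A + (1 − p)·w_a). Each assumption built into this recursion (constant fitnesses, an effectively infinite population, non-overlapping generations, no mutation or migration) is an idealization of the sort whose explanatory status philosophers of biology examine.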

Even the tightest case studies usually produce generalizations to a certain degree. Grote and O’Malley (2011), for example, claim that their historical reconstruction of research on microbial rhodopsins

offers a novel perspective on the history of the molecular life sciences…, sheds light on the dynamic connections between basic and applied science, and hypothesis-driven and data-driven approaches [and] provides a rich example of how science works over longer time periods, especially with regard to the transfer of materials, methods and concepts between different research fields. (Grote and O’Malley 2011, p. 1082)

Sometimes generalizations come in negative form. Case studies are counted as evidence that science may not have features that are commonly thought as necessary requirements. Grote and O’Malley (cit.) propose their case against those philosophical descriptions that interpret scientific advancement in terms of general, overarching, and unifying theories. They observe that “productive interactions between different fields of science occur not only through the adaptation of theories, concepts, or models, but very much at the level of materials or experimental systems” (p. 1094, emphasis added).

c. A General Picture of Science

What is science in and of itself? How does it differ from other forms of knowledge? These two sides of the coin have been a major motivating question for philosophy of science since its very beginning. Philosophers of biology sometimes take the methodological lessons they learn from biology and tentatively postulate their broader applicability to science (see Griesemer in section 7). However, a general view of science is often far from the explicit goals of philosophy of biology. There are historical reasons for this. In a paper on general philosophy of science, Psillos (2012) traces the problem from Aristotle through the Modern era, via landmark thinkers like Immanuel Kant and Pierre Duhem. The problem of demarcation was surely central in logical empiricism, but in the 1970s the idea spread that “individual sciences are not similar enough to be lumped together under the mould of a grand unified scheme of how science works” (Psillos, cit.: 100). After the 1950s, and certainly by the early 1980s, when philosophy of biology became an academic field, there was a consensus on the basic fact that science employs a variety of methods. Indeed, biology, together with other sciences like the human and the social sciences, had shaken the Modern idea of the scientific method.

Nevertheless, some important works have been motivated by the need to find new descriptions of science that would fit biology better than received ones, in order to legitimately include biology among the natural sciences. Political and intellectual movements, such as Intelligent Design, contribute to urging philosophy of biology to tackle the demarcation of science: their strategy includes casting doubt on the scientific status of the official biological sciences and presenting alternative, religiously inspired views as scientific theories (Sober 2007, Boudry et al. 2010). The creationist movement has been present in the philosophers’ mental map worldwide since at least 1981, when philosopher of biology Michael Ruse was a key witness for the plaintiff in the trial McLean v. Arkansas, a challenge to the state law permitting the teaching of creation science in the Arkansas school system. The federal judge ruled that the state law was unconstitutional. A significant part of the written opinion (see http://www.talkorigins.org/faqs/mclean-v-arkansas.html) was devoted to philosophical considerations on the epistemological characteristics of science (Claramonte Sanz 2011). Intelligent Design is a new form of scientific creationism, characterized by the wedge strategy, or public insistence on supposed flaws in the established biological sciences combined with supposed scientific demonstrations of an Intelligent Designer of the Universe. While there are different positions about whether philosophers and scientists should directly appear in public debates with creationists, Intelligent Design undoubtedly engages philosophers of biology in reflecting upon possible distinctions within science (between hypotheses, theories, and models, for example), between science and pseudo-science, and between science and religious beliefs. With their competence, philosophers can help scientists in what sociologist Thomas F. Gieryn termed “boundary work” (1999). The need to work on demarcation can also stimulate critical revisions of the patterns of scientific explanation in biology. In a book on evidence, Elliott Sober (2008) stated provocatively:

If the only thing that evolutionary biologists do is go around saying ‘that’s due to natural selection’ when they examine the complex and useful traits that organisms have, they are engaged in the same sterile game that creationists play when they declare ‘that’s due to intelligent design’. Assumptions about natural selection of course can be invented that allow the hypothesis of natural selection to fit what we observe. But that is not good enough: the question is whether there is independent evidence for those auxiliary propositions. (Sober 2008, p. 189)

The semantic view of scientific theories is an example of a general account of science that was put to work in justifying evolutionary biology as a model-based science, in the face of the inveterate poor fit of the received view based on syntactic theories and universal laws (Lloyd 1983, 1984, 1988). Under the semantic view, models constitute the core of scientific work; theories are to be seen as combinations of families of models plus hypotheses about the empirical scope of those models. This framework contrasts with the received view, which described theories as sets of sentences—including laws—about the world. In the semantic view, laws are conceived in a new way: from universal empirical claims about the world, they become specifications of models, embedded in them. An important argument for philosophers of biology in the 1980s was that the semantic view was not developed ad hoc for evolutionary theory: it had already been successful in describing Newtonian mechanics, equilibrium thermodynamics, and quantum mechanics. However, it was also said, these sciences fitted the received view as well. The semantic view was very demanding in terms of formalization, so it worked only for a very limited area of biology. Not all scientific work consists in model building, so the semantic view is hardly considered an exhaustive view of science, and things work so differently for different kinds of models (for example, mathematical vs. non-mathematical) that the existence of a unitary view was questioned (Downes 1992). But new versions of the semantic view continue to be the topic of papers and debates in philosophy of biology.

d. Generalization as a Possible Distinctive Feature of Philosophy with Respect to Biology

As we have seen, philosophy of biology formulates generalizations about biology at various degrees of generality. These generalizations may concern biology as a whole or some subfield of biology. Even in close-range case studies, where generalization seems far from the goals, it can be shown that generalizations are made (2.b). Philosophy is a generalizing activity. This feature of philosophy may be a way to approach a tricky issue, namely the distinction between biology and philosophy of biology.

Intuitively, philosophy is different from science. The methods of philosophy are based, for example, on logical differences and implications (Hull 1969: 162) or on thought experiments (Rosenberg and McShea 2008: 6), while the methods of science are based on empirical phenomena. But this distinction is not as clear-cut as it might seem. Scientists also use thought experiments. Einstein famously used them for developing both special relativity and general relativity, and Darwin worked intensely with thought experiments, to mention only two representative cases. Conceptual, foundational, and linguistic analyses are an integral part of scientific research too. This means that scientists autonomously do philosophy in their day-to-day work. Indeed, professional biologists, as thinkers, could and should shift from their own main activity to more philosophical ones whenever needed. Conversely, philosophers of biology usually want to contribute to the advancement of science. In a sense, they want to be part of science. Furthermore, the methods of philosophy are constantly evolving, stimulated by advancements in biology, so that it is not rare now to find philosophical studies that include mathematical analysis, computer simulations, or even empirical research. The science-philosophy distinction is thus very blurred. Given this situation, is there any possible demarcation between philosophy and biology?

David Hull (2002) characterized philosophy of biology as meta-science:

Knowing some science can be of great assistance to philosophers of science, but philosophers of science as philosophers of science do not do science…. For example, the equation F = ma is part of science, in particular physics. The claim that F = ma is a law of nature is part of the domain of philosophy of science. To the extent that science and philosophy can be distinguished in practice, scientists tell us what the world is like, and philosophers of science tell us what science is like. (Hull 2002, p. 117)

One interpretation of philosophy of science as meta-science, which seems faithful to Hull’s idea, is that philosophy of biology seeks generalizations about biology.

The idea of philosophy of biology as generalization about biology has descriptive advantages, but it would be contested by some philosophers. In fact, no idea in the literature about the specificity of philosophy remains uncontroversial. When it comes to ambiguous science-philosophy cases, philosophers debate the right way of doing philosophy. They are much clearer in saying what it means to be a scientist than they are in saying what it means to be a philosopher.

In a view of philosophy as meta-science, the mark of philosophy is the leaning towards generalizations about science. This is not a yes/no requirement; it comes in degrees. Sometimes, for instance, works in biology make generalizations about biology (see the examples of Mayr and Gould below). These works sometimes become philosophical, while at other times they remain scientific because they generalize only to a degree that is functional to the scientific activity. This continuity accounts for the demarcation uncertainties that are going to persist in the field.

3. Philosophy Flanking Biology

As is demonstrated in other parts of this article, philosophy can take a theme and develop it with great autonomy from science, or adopt a critical focus on science and on the complex relationship between science and society. According to many authors, however, there is also potential for philosophy to actively and directly contribute to the advancement of the life sciences. While it is commonly accepted that biologists can undertake philosophy of biology, it is more controversial whether philosophers can do biology in a proper sense. But biologists have generally been a receptive community. In 2002, David Hull wrote:

Philosophers are attempting to join with biologists to improve our understanding of…biological phenomena. As such, they run the risk of being considered by biologists to be ‘intruders’. In point of fact, biologists have been amazingly receptive to philosophers who have turned their hand to philosophy of biology with a significant emphasis on “biology.” (Hull 2002: 124)

When philosophers are close to scientific practice and current problems, they see how they can help science in advancing towards its aims, either by criticizing existing ideas and practices or by proposing revised ones. In Hull’s line of thought, philosophers are encouraged to devote time and effort to working with biologists, or to formulating problems in ways that are interesting and understandable to biologists. Philosophical treatments that are too extensive, stereotypical, or posed in a way that is not relevant to biologists, or that carry an excess of philosophical formalization barring scientists’ access, should be avoided: “Formalization may be an excellent way of working out problems in the philosophy of science. It is not a very good way of communicating the results” (Hull 1969: 178).

Philosophers try to help biologists to better frame their questions, for example by “uncovering presuppositions and making them explicit” (Sober 1984), by analyzing conceptual foundations (Pigliucci and Kaplan 2006), or by pursuing the detection, analysis, and sometimes solution of theoretical and methodological problems. A great deal of the philosophy of biology consists in working on scientific language to clarify the meaning of concepts such as life, purpose, progress, complexity, genetic program, adaptation, and so on (Rosenberg and McShea 2008: 4). Since these concepts frame the scientific questions, philosophy of biology can “clarify, broaden or narrow the domain of theories, uncovering ‘pseudo-questions’” (Rosenberg and McShea, cit.: 6). Critical analysis and conceptual clarification are particularly valued tasks, and conceptual clarity is seen as a necessary virtue of philosophy of biology. One mode of work consists in looking, together with biologists, at concepts or mathematical models and their interpretations. This line of work is exemplified hereafter (3.a, 3.b): two traditional areas where philosophy has contributed by casting some light are the connections among biological enterprises like taxonomy, classification, and systematics (3.a), and the abstract descriptions of natural selection (3.b).

a. Clarifying Taxonomy, Classification, Systematics, Phylogeny, Homology

In biology, taxonomy consists in the recognition of natural groups, that is, the taxa (singular: taxon). Classification deals with the categories and ranks to be assigned to taxa (for example, species, genus, family). Systematics is, by definition, a systematic study of the living world in search of order, or, in other words, the search for the relationships among taxa. And phylogeny is the reconstruction of the temporal scheme of common descent and relatedness among taxa. In fact, despite these rough characterizations, the four activities are intertwined and depend on each other. Their distinctions, definitions, and relationships are a traditional matter of reflection for philosophy of biology (Wilkins and Ebach 2011).

In the 1960s, philosophers, perhaps thanks to the heritage of the Western philosophical tradition, proved to be particularly ready and equipped for helping scientists understand their own various ways of finding order in the living world. According to Hull (1969), one of the important early contributions of philosophy of biology to logical clarity was the taxon-category distinction, the distinction between “individuals, classes, and classes of classes” (p. 171). Philosophers were able to contribute, for Hull, once they agreed to adapt their formalisms in order to communicate with biologists. Hull himself (1976) long argued that species have the ontological status of evolutionary individuals, unlike taxa in other categories. Hull relied on the Biological Species Concept (BSC), which defines species as reproductive communities. In accordance with the BSC, whereas a taxon is determined by a set of shared characteristics, the species-rank of the taxon—its membership in the species category—is determined by actual evolutionary relationships, in particular by interbreeding habits. With a similar line of reasoning, Hull defended the idea of species as historical individuals, as opposed to classes or types. Species, perhaps, are leaving the limelight of philosophical reflection as a consequence of the body of discoveries about the fluidity of their boundaries, the heterogeneity of their phenomenology, and the rarity of canonical biological species across the biological world. But the debate on individuality was complex and lively, and is still partly open today (Wilkins 2011).

In its early years, philosophy of biology demonstrated its value also by analyzing the entanglement among biological hypotheses—explicit and implicit—in different domains. For example, philosophers refuted the idea of taxonomy as a theory-free activity, prior to functional attributions and to evolutionary hypotheses. Character recognition does bring into play theoretical considerations: the definition of “kidney,” for instance, presupposes physiological knowledge of kidney function, and/or hypotheses on the evolutionary derivation of kidneys. The acceptance of evolutionary theory in all fields of biology, completed during the first half of the 20th century, triggered heated debates on its relevance to taxonomy, systematics, and classification (Hull 1970): does taxonomy have to reflect evolution? And in what sense? Could adaptation by natural selection be a criterion for systematics, or is pure common descent the better candidate? Philosophers took part in trying to clarify these issues.

The network of assumptions and hypotheses of biology is an enduring object of study for philosophy. There is a certain consensus on the fact that phylogeny should constrain systematics; that is, systematics should reflect phylogeny as much as possible. Many authors even equate the two tasks altogether: “Practitioners of systematics study the historical pattern of evolution among groups of living things, i.e., phylogeny” (Haber 2008). Yet, taxonomy, classification, systematics, and phylogeny are unequally performed by different professionals upon significantly uneven living and fossil taxa, and there are many conflicting needs and goals. Species concepts in paleontology are of necessity very different from those that fit neontology (Wilkins 2011), an issue that yields differently organized classifications (Wilkins and Ebach 2013). Even among biologists who study currently living organisms, the ideas on how to integrate taxonomy and classification with systematics and phylogeny vary greatly across specialists at all degrees of specialization—for example, entomologists, botanists, and mammalogists. Furthermore, a classification of domestic animals or plants is likely to incorporate much morphological, physiological, and ecological information beyond phylogenetic relationships if it is to be of any practical use to the knowledge communities that are involved in rearing. And when it comes to deciding what information is relevant to identify endangered species (or other units) that should be the object of conservation biology, different ideas—and ethical stances—are on the table (Casetta and Marques da Silva 2015).

Philosophy of biology can make or support concrete proposals about how to integrate and refine the various ways of ordering the living world. Philosopher Marc Ereshefsky (1997), for example, has argued for systems of nomenclature that are alternative to the Linnaean hierarchy (species, genera, orders, and so on). The reason for this position is that the scientific changes that have occurred since Linnaeus’ century—Darwinism, neo-Darwinism, cladistics, and new methods in systematics—have turned the hierarchy into an obstacle rather than an aid to taxonomical work.

A specific challenge for philosophy of biology is phylogenetic trees, which are ever-revisable hypotheses corroborated to varying degrees. Epistemological and methodological discussions concern the ways of building, interpreting, testing, and revising trees, as well as of relating them to other domains of knowledge such as taxonomy or adaptation. “So what can biologists meaningfully say about phylogeny?” philosopher Matt Haber asks (2008).

Broadly, two different issues have been at the center of recent systematics debates: given epistemic limitations, whether any inference of phylogeny may justifiably be drawn; and given an affirmative answer, what methods ought biologists use to justifiably infer phylogenies, and what are the limits of these inferences? (Haber 2008, p. 231)

Competing and sometimes conflicting methods have been developed to make phylogenetic inference more exact, manageable, or informative. In this domain, methodological differences have raised heated conflicts among scientists. The parsimony principle was questioned in its legitimacy and importance. The principle states that the most economical hypothesis, the tree requiring the fewest evolutionary changes, is the one to be preferred (Sober 1988). More generally, trees have been examined for their theory-ladenness, that is, their sensitivity to background theoretical assumptions. Other contentious matters were the acceptability of different kinds of data (genetic vs. morphological, for example), and the limits of the domain of phylogenetic inference. Some workers maintain that methods in phylogenetics should be able to detect homologies (see also 2.a) with certainty, discerning them from analogies. Many others argue, under different conceptions, that homology detection should rely on large amounts of data, mathematical models of evolution, and probability. Supporters of cladistics reject probability and likelihood as an unstable ground on which to draw evolutionary trees. They put more confidence in the cladist practitioner’s ability to recognize derivation between characters (Hull 1970, Haber 2008). Meanwhile, the great majority of phylogenetic trees are built by relying on huge sets of genetic sequences and a few morphological characters by means of increasingly cost-effective computer programs (this is an example of novel computer-based scientific techniques posing new philosophical problems; see more below). “Total evidence” methods (Sober 2008) address the challenge of building phylogenetic trees by integrating all the available evidence—morphological, geological, ecological, and fossil.
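To make the parsimony criterion concrete, here is a minimal sketch of Fitch’s small-parsimony algorithm, which counts the character-state changes a given tree requires; the taxa, trees, and character states are invented for illustration, and the code is written for this article rather than drawn from any phylogenetics package or from the works cited above.

    # Minimal sketch of Fitch parsimony scoring for one character on a tiny
    # rooted binary tree. Taxa, trees, and states are hypothetical examples.
    def fitch_score(tree, states):
        """Return the minimum number of state changes implied by `tree`.
        `tree` is a leaf name (str) or a (left, right) tuple of subtrees;
        `states` maps leaf names to observed character states."""
        changes = 0
        def postorder(node):
            nonlocal changes
            if isinstance(node, str):          # leaf: the state set is what we observed
                return {states[node]}
            left, right = postorder(node[0]), postorder(node[1])
            if left & right:                   # children agree: keep the intersection
                return left & right
            changes += 1                       # children disagree: one change is forced
            return left | right
        postorder(tree)
        return changes

    states = {"A": 0, "B": 0, "C": 1, "D": 1}
    tree1 = (("A", "B"), ("C", "D"))   # groups the like states together
    tree2 = (("A", "C"), ("B", "D"))   # splits them apart
    print(fitch_score(tree1, states))  # 1 change
    print(fitch_score(tree2, states))  # 2 changes: the less parsimonious tree

On these toy data the first tree requires one change and the second requires two, so the parsimony criterion would prefer the first; real analyses score thousands of characters over enormous numbers of candidate trees, which is where the philosophical questions about the criterion’s justification arise.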

The question of homology is a final example of philosophical enquiry into entangled theoretical backgrounds and hypotheses: “when are two instances of a character to be considered instances of the same character and in what sense?” (Hull 1969: 174). Griffiths and Brigandt (2007) recognize different concepts of homology. The taxic approach to homology—the best known in systematics and philosophy—uses points of resemblance between organisms (shared character states) to diagnose their evolutionary relationships. The transformational approach focuses on the different states in which the same character can exist and be transformed by evolution. A third approach to homology has emerged in conjunction with findings of evolutionary developmental biology (EvoDevo): homology at the phenotypic level is potentially decoupled from homology of developmental processes and homology at the level of the genome (a phenomenon called hierarchical disconnect, see also 2.a). The level of depth thus becomes a necessary specification for homology. Griffiths and Brigandt (cit.) see the different concepts of homology as not only compatible but strongly complementary, and this view triggers a series of reflections on their relationships.

b. Formulating Natural Selection

Charles Darwin (1859) conceived natural selection as the mechanism of change, splitting, and divergence of lineages. Natural selection is thus the most essential notion of evolutionary theory. Darwin’s formulation of natural selection, substantiated by many empirical studies, was essentially verbal. After Darwin, definitions of natural selection underwent diversification as well as progressive refinement. Mathematical models of natural selection, created in the 20th century, made the process more precise, as population genetics introduced technical terms such as fitness and selection pressures (Haldane 1924), and established different formulas to quantify natural selection and to measure its intensity and effects. But population genetics and mathematics did not exhaust the task of formulating natural selection in a theoretical and general way, and philosophers of biology eventually became very actively involved. A classical abstraction of natural selection was elaborated by Richard Lewontin (1970). It was based on variation, fitness, and heritability. Today philosophers acknowledge the value of that account, but they also criticize it as, at once, too simple and too demanding (Godfrey-Smith 2009), ascribing it to a category of recipe descriptions that list the supposedly few and simple ingredients of natural selection. David Hull and Richard Dawkins, for example, independently introduced the original distinction between replicator and interactor, or vehicle. A replicator is anything that passes its structure on largely intact, while an interactor is a cohesive unit whose action in the environment makes a difference in the replication of the replicators it carries. Richard Dawkins (1976), interpreting the work of William Hamilton and George C. Williams (1966), famously took the basic idea of kin selection and developed it into the gene’s eye view, a general view of evolution where selection in the long run is seen as operating basically on genes, and organisms are seen as their vehicles. Kin selection (Maynard Smith 1964) is based on shared inheritance among relatives. Social donor traits are expected to spread in the population if they increase the fitness of the donor’s close relatives, who are likely bearers of the same traits. Inclusive fitness (Hamilton 1963, 1964) is the fitness of a trait deriving from the bearer’s survival and reproduction plus the survival and reproduction of relatives (in proportion to the degree of genetic sharing). Dawkins’s “selfish gene” metaphor, based on these mathematical findings, was welcomed by many biologists as a clarifying device in their day-to-day work. Its effects on the public perception of science are a different story that will be addressed later in the article. Some philosophers of biology engaged with the metaphor, either to develop it (for example, Dennett 1995) or, more often, to criticize it (see Oyama 1998). Several philosophers tried to find other ways of characterizing natural selection. We have already mentioned Sober’s (1984) work on this problem (2.a). Sober (1983) also described evolutionary theory as a theory of forces and population genetics as a theory of “equilibrium models.” Abstract accounts of natural selection and its enabling conditions flourished again with the turn of the 21st century. For Okasha (2006), a replicator-interactor account is too demanding and narrow, imposing unnecessary requirements for natural selection to occur.
For Godfrey-Smith (2009), the concept of a population is the crucial one: some features of a population render it “paradigmatically Darwinian”, making natural selection happen. One motivating idea of such descriptions is their potential use for evaluating the operation of natural selection among units at different levels and in different domains (Rosenberg and McShea 2008). We will see below the example of the domain of culture: what kind of selection, if any, is plausible in the cultural domain?
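To make the “recipe” reading concrete, here is a minimal simulation sketch of Lewontin-style conditions (heritable variation in a trait plus fitness differences associated with that variation) under purely illustrative assumptions; the variants, fitness values, and mutation rate are invented for this example and are not drawn from any of the works cited above.

    # Minimal sketch: a population with heritable variation (offspring copy their
    # parent's variant, apart from rare mutation) and fitness differences.
    # Under these conditions alone, the fitter variant tends to spread.
    import random

    def next_generation(population, fitness, mutation_rate=0.01):
        """Fitness-weighted reproduction with rare mutation between two variants."""
        weights = [fitness[trait] for trait in population]
        offspring = random.choices(population, weights=weights, k=len(population))
        return [1 - t if random.random() < mutation_rate else t for t in offspring]

    fitness = {0: 1.0, 1: 1.1}        # variant 1 has a modest reproductive advantage
    population = [0] * 90 + [1] * 10  # variant 1 starts rare
    for _ in range(100):
        population = next_generation(population, fitness)
    print("frequency of variant 1:", sum(population) / len(population))

Running such a toy model repeatedly shows variant 1 usually, though not always, rising in frequency, which is one way of seeing both the appeal and the limits of recipe descriptions: the few listed ingredients suffice for evolutionary change in this idealized setting, while the philosophical debates concern what they leave out.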

Darwin thought natural selection happened chiefly among individuals: the individual organism was the unit of selection. But the later Darwin introduced “group selection” for explaining some traits of humans and social insects (Darwin 1871). Groups were foreshadowed as a larger unit of selection, and group selection seemed to be the explanation for traits that jeopardize individual interest, such as cooperation and abnegation. In the 1960s, evolutionary modeling showed that group selection required repeated isolation, mixture, and re-isolation, namely conditions too narrow to be found in nature with any significant frequency (Maynard Smith 1964). Group selection was disavowed; traits would not evolve simply because they are good for a group, but only if they are selectively advantageous in inter-individual competition from their inception. In the 1970s, some philosophers of biology participated in a movement, along with evolutionary biologists and social scientists, to try to develop group selection into a scientifically respectable concept (Wilson 1975). In the meantime, kin selection had emerged as a potential alternative explanation for group-beneficial, unselfish traits, leading many scientists to conclude that group selection may be merely apparent, re-describable as kin selection when group members are relatives. Philosophers got directly involved in the debate, sometimes working directly with biologists. In 1984, Elliott Sober, citing Williams (1966) among others, talked about the “mirage” of group fitness, seen as a mere statistical summation of individual fitnesses: “Selection works for the good of the organism; a consequence may be that some groups are better than others. However, it does not follow that selection works for the good of the group” (Sober 1984: 2). In many cases, the individual-based hypothesis is simpler than the group-level characterization, so the principle of parsimony would recommend not adding explanatory mechanisms. More generally, however, “Group, kin, and individual selection need to be disentangled, their difference made clear” (Sober 1984: 4). In fact, conceptual difficulties and fallacies in the units of selection framework were what drew philosophers to this debate. Sober changed his mind about group selection through a more careful conceptual analysis and by joining the work of biologist D.S. Wilson (Sober and Wilson 1998). Most philosophers now think that unselfish traits may be explained or made plausible by some combination of trait-level selection, organismal selection, and group selection in a weak sense, together with a multiplication of hierarchies of evolutionary entities and an “extended taxonomy of fitness” that contemplates co-opted by-products and functional shifts (Pievani 2011).
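The kin-selection logic invoked here is commonly summarized by Hamilton’s rule, added here as a brief formal gloss for illustration (standard notation, not a quotation from the sources cited above): an unselfish trait is expected to spread when r b > c, where c is the fitness cost to the actor, b is the fitness benefit to the recipient, and r is the coefficient of relatedness between them. Inclusive fitness, correspondingly, adds to the actor’s own reproductive success the relatedness-weighted effects of its behavior on relatives, which is why helping close kin can be selectively advantageous even when it is costly to the helper.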

Beyond gene, individual, and group selection, some authors have also attempted to recognize higher levels, such as family selection, species selection, and clade selection, although authoritative biologists such as Ernst Mayr contested this idea: “in no case are these entities as such the object of selection. Selection in these cases always takes place at the level of individuals” (Mayr 1997). Today, for many biologists, the question of what unit is the “true” fundamental unit of selection has been satisfactorily settled—there are several—but there are now new theoretical and empirical questions. Given that multiple levels of vehicles exist, how does selection at one level affect selection at lower or higher levels, and how are higher-level vehicles created by lower-level selection (Keller 1999)? The explanatory scope of multi-level selection, as philosophers have often emphasized, is challenged by the major evolutionary transitions in the history of life (Maynard Smith and Szathmáry 1995). Philosophers tend to describe a major transition in evolution as a phase of emergence of a new level—with new units—of selection. The new level counteracts or suppresses selection at the lower level, a process baptized “de-Darwinization” by Godfrey-Smith (2009).

4. Who Can Do Philosophy of Biology?

As David Hull observed in 2002, philosophers “are attempting to join with biologists to improve our understanding of…biological phenomena” (p. 124). As we have seen (2.d), a demarcation between the two profiles—the philosopher and the biologist—is possible but labile, and the distinction is perfectly compatible with any one scholar doing both, even simultaneously (cf. Pradeu 2009). Consider now how Hull’s quote goes on: “But sometimes the tables are turned. Biologists take up traditional philosophical topics and attempt to treat them even if they are not professional philosophers” (Ibidem). Biologists can turn to philosophy in two different ways: by reflecting philosophically on their own work or by naturalizing philosophical problems. In the first case, biologists get interested in issues of epistemology or methodology and, from the ground of their work, they extrapolate ways of thinking or modes of inference. Naturalization happens when science becomes capable of saying something significant and constraining about a traditional topic of philosophy, exemplified here by the origin of morality.

a. Philosophical Biologists

It is not infrequent that biologists undertake philosophical reflections on their own work. This is eased by the fact that some tasks of philosophy, such as conceptual analysis or linguistic clarification, are integral parts of scientific work (see also 2.d). Indeed, one might say that biologists, being experts about their own theories, are sometimes more qualified than philosophers to reflect on their own work. But what about the tendency to generality that characterizes philosophy of biology (section 2)? The generalizing route, too, can be followed by working scientists. A biologist’s philosophical effort may certainly vary in depth, richness, reach, and influence, depending on many factors, such as the biologist’s philosophical background and aims. With a good philosophical background, a biologist can engage more effectively with debates in philosophical areas. A biologist’s aims may go beyond strict functionality for his or her own research and express a genuine desire to capture, defend, or criticize something deep about his or her own science. Two examples—among many others—of very influential, philosophically-oriented biologists are ornithologist Ernst Mayr and paleontologist Stephen Jay Gould.

i. Mayr and Population Thinking

Ernst Mayr (1904-2005) was one of the greatest evolutionary biologists of the 20th century, but he also increasingly worked in the history and philosophy of biology. He was “a crucial link between professional philosophers of science and professional biologists” (O’Malley 2010b: 530-1). Some of his areas of reflection were the distinction between proximate and ultimate causes, the nature of the neo-Darwinian synthesis, and the centrality of speciation and of species defined by his Biological Species Concept (see 3.a). This article will focus on Mayr’s idea of population thinking as an example of how “one scientist uses ‘scientific concepts’ to forge a conceptual tool with a wider range of historical and philosophical applicability” (Chung 2003: 278).

The population notion was already central in Mayr’s work in 1942. His main concern was the methodology of systematics. Mayr had promoted a new systematics, looking for variation in large samples as opposed to a few type specimens, and considering geography, genetics, and other sources as opposed to morphology only. As Carl Chung reconstructs, Mayr first formulated the distinction between typological and population thinking in 1955. A few years later, Mayr (1959) started promoting population thinking as the major innovation introduced by Charles Darwin and developed by biology as a natural historical science, different and autonomous from other sciences. According to population thinking, “no two individuals or biological events are exactly the same and processes in biology can be understood only by a study of variation” (Mayr 1955, cit. in Chung 2003: 288). In opposition, Mayr described typological thinking as having its deep roots in the Western cultural tradition, particularly in Platonic philosophy: “Implicit in this concept is that variation as such is unimportant since it represents only the ‘shadows’ of the eidos” (Mayr 1955: 485), the essences or ideas that lie behind diversity. As Chung points out, the main battlefield for this opposition was the concept of species. For Mayr, the incarnation of typological thinking in biology is the morphological species concept, according to which individuals belong to the same species by virtue of sharing their morphological characteristics. By contrast, biological species concepts are population concepts. The latter take variation into account, in particular reproductive barriers (two populations are different species if they coexist with no fertile mating) and/or geographical variation with reproductive flow between populations.

What is interesting here is the philosophical reach of Mayr’s ideas: he consciously worked them up from his systematics field work, used them for a philosophical interpretation of biology, and tied them to broader philosophical themes. In Chung’s words, the enterprise was “an attempt to liberate certain key ideas of the ‘students of diversity’ from their disciplinary constraints, and to render them more generally applicable by repackaging them into a broader historical and philosophical distinction that pertains to all of biology” (Chung 2003: 294-5). Certainly, one goal was

to legitimize the natural historical sciences, including systematics, taxonomy, and evolutionary biology, against the criticisms of ‘the new biology’—molecular, reductionistic, and drawing explicit inspiration from the physical sciences…philosophically he could argue for the ‘in principle’ need for an evolutionary (population thinking) approach in order to offer adequate and complete explanations of biological phenomena. (Chung: 295)

Population thinking, once devised, shaped Mayr’s own interpretations of the history of biology, and provided him with solutions to philosophical problems such as the method and the autonomy of biology. It was widely taken up by philosophers and philosophically-minded scientists such as Michael T. Ghiselin—notice the philosophical title of his 1997 book Metaphysics and the Origin of Species. There would be many other examples from Mayr’s work in which he developed philosophical ideas out of science with precise polemic aims and great influence on all scholars of biology. Ideas like the distinction between ultimate and proximate causes became common currency for scientists. This does not mean that Mayr’s ideas weren’t criticized, as they increasingly are (Ariew 2003, Laland et al. 2011, 2013).

ii. Gould and Adaptationism

Scientists may end up doing philosophy if they get interested in the inferential structure of their own field. Inference, that is, reasoning and its rules, is a classic topic in philosophy of science. Induction, deduction, and inference to the best explanation (IBE) are basic, well-known types of inference, but, in scientific practice, they multiply and get combined and put to work in different ways, producing interesting conceptual and philosophical problems. Stephen Jay Gould (1942-2002) was a prolific writer (see more in section 5.d). Among his many favorite targets were a few inferential patterns employed by evolutionary biologists. The adaptationist inference was definitely one of the main ones, at least since the famous Spandrels paper with Richard Lewontin (1979). Other biologists, like G.C. Williams (1966), had advanced proposals for revising adaptationist inferences on different grounds. Gould’s campaign can be seen as the expression of a paleontologist with great interest in inferential patterns and with a view of evolution directly inspired by Darwin’s works.

Adaptationism consists in explaining biological phenomena by claiming that they are adaptations. In the Spandrels paper, Gould and Lewontin used the metaphor of the San Marco cathedral in Venice to argue that even structures that now support fundamental functions can nonetheless have originated as structural byproducts of a whole architecture. They wanted to criticize their colleagues’ habit of tolerating lazy tests of adaptive hypotheses that “consisted in little more than a decent qualitative fit between observed behaviour or form, and a set of posited adaptive pressures and constraints” (Lewens 2008: 180). As Gould elaborated after the Spandrels paper, between structures and functions there is no strict one-to-one correspondence, but rather redundancy. Functions are distributed over several parts of organisms, and conversely any part we may call a trait or structure is involved in several mechanisms, functions, and processes in the organism’s life. In the appreciation of trade-offs between internal structural constraints and selected functions, Gould saw a revival of Charles Darwin’s original attention to “contrivances” (1877). Adaptive explanations will rarely suffice. In any case, they will need to be made testable and tested (Pievani and Serrelli 2011). From 1982, together with Elisabeth Vrba, Gould proposed and promoted the neologism “exaptation” to address what he saw as two evolutionary mechanisms, distinct from adaptation but nonetheless involving natural selection and primary functions: the functional shift of a structure with previously different purposes, a process already identified by Darwin; and the functional cooptation of a trait whose origin is non-adaptive—for example, a side effect due to a developmental constraint or a random insertion.

Gould’s ideas about adaptationism and other aspects of biological inference triggered cascades of philosophical reflections, for example on the multiplicity of meanings of adaptation: by adaptation we may mean either something ensuring or increasing fitness, or something that seems designed for performance in a particular environment, in a range of environments, or for a particular function. We may mean something which is currently being positively selected, or something whose existence is due to natural selection in the past (Godfrey-Smith 2001, Lewens 2008). Biologists together with philosophers, or philosophically-minded biologists, reflected on adaptationism, often reacting, in one sense or the other, to Gould’s school of thought. Some scholars elaborated on the dubious ontology and instrumental nature of adaptations, while many others interpreted the challenge as the necessity of making adaptationist hypotheses testable (Pievani and Serrelli, cit.). Recently, plant biologist Mark E. Olson (2012) acknowledged the relevance of the “post-Spandrels consensus” among biologists on the importance of constraints in evolution. But he also pointed out a post-Spandrels proliferation of contradictory “selection vs. constraints” and “externalist vs. internalist” explanations for the same data. These contradictions were due, for Olson, to Gould’s vagueness in defining “constraint,” as well as to the lack of experimental techniques for exploring the accessibility of unobserved forms. But today’s embryological, manipulative, and comparative empirical strategies allow for experimental exploration of morphologies that are not observed in nature. By combining these techniques, biologists can turn “internalism” and “externalism” from a-priori positions into case-by-case, testable hypotheses. Also, selection and constraint are more properly seen as complementary and not in mutual contrast, as, for Olson, the Spandrels paper tended to suggest. For Olson, then, a developmental “renaissance of adaptationism” is under way.

As is customary, philosophical issues raised by biologists have been taken up and elaborated further by philosophers. Commenting on the philosophical literature on adaptationism, Godfrey-Smith (2001) distinguished three different issues on which adaptationist, anti-adaptationist, and moderate positions can be taken. The empirical issue is whether or not natural selection is a powerful and ubiquitous force in the natural world, with few constraints coming from biological variation and with no comparable, competing causal factor. The explanatory issue is whether the most important questions in biology concern the fit between organisms and environments, with natural selection as the answer to those big questions (other processes and explanations being good only for less important questions). The methodological issue is whether or not starting with adaptive hypotheses—and holding to them—is good scientific practice. In 2009, a special issue of Biology & Philosophy was published as both a celebration and a critical appraisal of the influence of the Spandrels paper. Therein, Lewens (2008) elaborated on Godfrey-Smith’s taxonomy, recognizing seven types of adaptationism and arguing for the importance of asking a prerequisite question: “what is a trait?”.

b. Philosophical Issues Naturalized

The relevance and implications of biology for humanity became a thought-provoking and heartfelt issue as soon as intellectuals and laypeople reacted to Darwin’s works (1859, 1871). More than a century later, E.O. Wilson (1975) proposed a “new synthesis” as a project of explaining the most diverse human behavioral and psychological traits by means of evolutionary hypotheses. Wilson’s proposal led to sociobiology and more recently to evolutionary psychology (Barkow et al. 1992). Human behavior, mind, morality, and systems of beliefs constitute the most interesting targets of possible, and controversial, naturalization. Naturalization is what happens when matters that are traditionally philosophical become empirically accessible by some scientific approach or method. David Hull wrote provocatively that “Philosophy lost physics, then biology, then psychology. Geometry, logic and mathematics became separate disciplines with no necessary ties to philosophy,” and that philosophers therefore hold very tenaciously to “epistemology, metaphysics, ethics and aesthetics” (Hull 2002: 124), because many of their other objects have been taken over by science. But Hull also optimistically observed that one of the strengths of philosophy of biology “is that philosophers and biologists have ignored this distinction, working with each other on both sides of the divide” (Hull 2002: 117). In fact, naturalization is critically analyzed by philosophy, and there are many different philosophical positions on naturalism. So, naturalization is a fruitful object of study for philosophy of biology rather than a thief of its topics, and philosophy has a warranted place that is not going to be eroded by naturalization.

Philosophers’ reactions to Wilson’s “new synthesis” were, for Hull, a virtuous example of interactivity. After an initial wholesale opposition, many of them accepted the challenge of naturalizing humans. Our species has to be seen as a proper part of the biological world, not as separated from it by any ontological divide. At the same time, philosophers highlighted inferential errors in biological explanations of human behaviors, epistemological limits in reconstructing the past, and ethical risks. They pursued theoretical refinements of the project, by improving multi-level selection models, for example. Many criticized the logical-deductive architecture of sociobiology and evolutionary psychology, frequently built on ad hoc hypotheses and adaptationist just-so stories. The debate is ongoing, and many arguments have been developed. For example, if many human psychological mechanisms are evolutionary novelties due to the interaction of ancestral genes and new environments, then many of these mechanisms are not adaptations, and adaptive thinking in evolutionary psychology will fail to identify or explain them. More cautious, problematized, and integrated endeavors of biological explanation applied to humans are emerging, partly under the stimulus of philosophy of biology (Sterelny 2003).

i. An Example: The Biology of Morality

It was for explaining human morality that Darwin hinted at ‘group selection’ (see 3.b), and the endeavor did not end with him. Evolutionary ethics, the association of morality with natural selection and evolution, provides good grounds for Rosenberg and McShea’s remark that biology “…is the only scientific discipline that anyone has ever supposed might be able to answer the questions of moral and political philosophy” (2008: 3). Fast-growing fields like neurobiology or the cognitive neurosciences seem to be making biology more and more capable of addressing topics such as the origins of morality. Philosopher Patricia Churchland (2011), for example, studied the scientific literature and hypothesized a particular pattern of neural activation that would constitute a “neurobiological platform” for morality. Churchland highlighted the role played in this platform by molecules such as oxytocin, an ancient and simple peptide found in all vertebrates. In mammals, oxytocin “is at the hub of the intricate network of mammalian adaptations for caring for others” (Churchland 2011: 14). In fact, morality would share its neurobiological platform with other familiar phenomena of human life—attachment and bonding. More generally, “the palette of neurochemicals affecting neurons and muscles is substantially the same across vertebrates and invertebrates” (Churchland 2011: 45). Among mammals, then, there is a wide “range of social patterns…, but underlying them are probably different arrangements of receptors for oxytocin and other hormones and neurochemicals” (p. 32). The striking thing, for Churchland, is that modest modifications in existing neural structures can lead to new outcomes. Morality and other phenomena would result from a not-so-exceptional modification of a pre-existing platform involved in mammalian parental care. Churchland’s picture of evolution is again a familiar Gouldian one (4.a.ii):

Biological evolution does not achieve adaptations by designing a whole new mechanism from scratch, but modifies what is already in place, little bit by little bit. Social emotions, values, and behavior are not the result of a wholly new engineering plan, but rather an adaptation of existing arrangements and mechanisms that are intimately linked with the self-preserving circuitry for fighting, freezing, and flight, on the one hand, and for rest and digest, on the other. (Churchland 2011: 46)

Does the evolutionary continuity from mammalian parental care to morality constrain ethics and traditional philosophical theories of morality? Churchland’s view, while only an example, is interesting in its intermediate position between strict biological determinism and cultural determinism. While biology can provide information and explanation about the platform for morality, the complexity of cultures provides scaffolding for moral development and definition, so that moral decision-making remains a practical, dialogic, and social problem. However, the more general question is whether and how moral philosophy—a large and highly technical field—should take into account what the sciences are discovering. The answers to this question are expected to come from philosophical studies of naturalization (Dupré 2001, De Caro and Macarthur 2004). In any case, most philosophers of biology recognize a naturalistic fallacy in the idea that knowing more about the natural world would suffice for making moral, political, and social decisions.

ii. Philosophy Versus Naturalization?

To some philosophers, the naturalization of philosophical problems is rather uncontroversial, to the point that “the difference between philosophy and theoretical science is not a matter of kind but of degree,” and the domain of philosophy is partly “the sum of all the questions to which science cannot (yet) answer” (Rosenberg and McShea 2008: 5). For many others, defining philosophy as some kind of underdeveloped science is an expression of scientism and a category mistake regarding fields of knowledge. Many philosophers point out that the biological explanation of social actions, behaviors, and culture may imply a Darwinian dimension without boiling down to it. Naturalism is related to, but different from, other very general issues, like determinism or reductionism. Some philosophers draw a distinction between a strong “scientistic” naturalism and a pluralistic, or “liberalized,” naturalism (De Caro and Macarthur 2004). Scientistic naturalism considers philosophy as a branch of the natural sciences. Liberalized naturalism includes different epistemic levels of analysis of human nature—from the natural sciences to the humanities—that share the exclusion of non-natural causes or principles.

Emphasizing the impact of biology on human capacities, social institutions, and ethical values is also a way to justify philosophy of biology as useful or even indispensable to philosophy and, more generally, to the humanities. Some presentations of philosophy of biology tend to justify the field by its particularity of “concerning human affairs” (Rosenberg and McShea 2008: 8) and its being ultimately oriented to them. Some authors (Pradeu 2009) dislike this anthropocentric strategy in philosophy of biology and think the field could be otherwise justified. Under these overarching debates, human organisms and the human species are understandably a hotspot of problems for philosophy of biology, and biologists and philosophers must confront the growing biological knowledge of humans.

The neurobiology of morality (4.b.i) does not automatically subtract morality from philosophy. Indeed, many reflections from a scientifically-informed philosophy are of primary importance for maintaining vigilance and scrutiny: what are the aims and uses of these biological studies of humans? Could they be used, for example, for a classification of people with consequences for the distribution of rights? Would this be justified? Would the consequent choices and decisions be acceptable? How are scientific results co-opted in clinical practice and health care? How are communication issues with patients handled? How influential are social and cultural biases in the construction of the object of research? Is this acceptable and justifiable? What ideas of morality underlie the studies? Scientists working on this difficult ground will constantly exercise philosophical thought (see also 5.b). For them, the rich tradition of moral philosophy might constitute a precious aid.

Meanwhile, philosophy of biology can inform philosophy about new and obsolete approaches in the scientific explanation of morality. Evolutionary ethics is not necessarily tied to adaptationism (4.a.ii) or genetic determinism. The evolution of morality seems to resemble bricolage more than engineering. Against stereotypical and simplified views of evolution, morality cannot be identified with a set of genes, even though genes do not necessarily lose their importance, and the question about heritable patterns remains crucial in order to define what morality is in evolution.

Philosophy of biology can fight easy deterministic conclusions, and identifications of naturalism with determinism, while promoting useful definitions of the concepts involved: what is meant by morality in different contexts, and why finding an underlying arrangement of receptors is not going to replace the need for ethical reasoning and moral philosophy. Science is a form of knowledge and is therefore subject to epistemology, conceptual analysis and change, and ethical reasoning. On the other hand, biology constantly brings new fuel to philosophical inquiry, even, and perhaps especially, when philosophical issues get naturalized.

5. Philosophy Bringing the Life Sciences out of Their Research Context

There are multiple senses in which philosophy of biology brings the life sciences out of their research contexts. First, philosophy of biology can study and sometimes aid interactions among the life sciences or between them and other sciences. Second, sometimes philosophy is seen as capable of developing messages, ways of thinking, and their consequences and implications to elaborate a “philosophy of nature” and an overall vision of the living world (Godfrey-Smith 2009, p. 15). Third, philosophy can reflect on the roles and meanings of science and on the interactions between science and society with an approach different from that of the social sciences. In this way, it can assume critical points of view towards biology and reflect on how scientific claims are, could be, and should be received and elaborated by the public.

a. Philosophy of Biology at Intersections

What happens when different scientific fields or points of view come into intimate contact? Philosophy of biology has always been happily and effectively involved in this matter. A classic example (seen above, 3.a) is the analysis of coexistence, interaction, and implicit reciprocal dependencies between “the morphological, the physiological, and the genetical” viewpoints in taxonomy (Hull 1969: 176-7). Philosophers of biology often notice attractions and tensions and call for integrations. They attend to emergent relationships among life sciences, as in the topical cases of micro- and macroevolution (Serrelli and Gontier 2015b) and evo-devo (see below), or between biology and other sciences, as in cultural evolution studies. Classic debates in philosophy of science, for example on reductionism, provide conceptual coordinates for thinking about these connections.

A major development in biology began in the 1980s, when technical and theoretical advancements enabled the molecular study of development, opening the possibility of relating development to evolution in different ways (Gilbert et al. 1996, Minelli 2010, Olson 2012). Evo-devo, evolutionary developmental biology, was born. The contact was all the more significant because embryology, a discipline with an ancient tradition, had long been seen as far removed from evolutionary biology. Many evo-devo protagonists and observers framed evo-devo as a response to the insufficiency of the mathematical study of the intergenerational transmission of genes. This hegemonic field, population genetics, considered development as a “black box” and assumed a linear relation between genotype and phenotype (Laubichler 2010). In doing so, it would be blind to fundamental kinds of evolutionary innovation, incapable of addressing macroevolutionary change (Minelli 2010), and, more deeply, non-mechanistic, being based on “conceptual abstractions” (Laubichler, cit.). Yet, many scholars are pursuing a pluralistic integration. They point out that

To deny the internal consistency and the explanatory power of [the research program developed from population genetics] would be obviously foolish…. The objection is that there can (and should) be more to evolutionary biology than a research program restricted to the concepts and tools of population genetics (Minelli, cit.: 216).

Some scientists and philosophers focus on a broader polemic target: the Modern Synthesis, that is, the foundation of evolutionary biology as it is practiced today, which took place between the 1910s and 1940s and was further canonized, mainly by Ernst Mayr, in the subsequent decades. Critics analyze how the Modern Synthesis excluded lines of research as non-legitimate or irrelevant to evolution, while successfully pulling together Darwin’s natural selection, Mendel’s theory of inheritance, mathematical models of population genetics, and the work of the most disparate fields in biology (Gilbert et al. 1996, Serrelli 2015). Some philosophers and scientists are therefore proposing the idea of an Extended Evolutionary Synthesis (Pigliucci and Müller 2010). These movements are very interesting to philosophers of biology. They provide new access to traditional philosophical issues about scientific change, including the unity and disunity of science (Fodor 1974, Callebaut 2010). They also invoke a necessary collaboration between history and philosophy of science, as shown by works that revised the received views on the nature of the divorce of embryology and evolution in the 1930s (Love 2003, Griesemer 2007; cf. section 7).

Philosophy of biology’s remarkable interest in cultural evolution studies is an example involving relationships between biology and other fields. The idea of similarities across biological and cultural evolution was already suggested by Darwin and his contemporaries, and several approaches were formalized in the second half of the 20th century (Cavalli-Sforza and Feldman 1981, Boyd and Richerson 1985). Cultural change and stability are understood in terms of variation, selection, and inheritance/transmission of cultural traits, requiring only some correlation in cultural transmission from cultural parents (models from whom a cultural trait is acquired) to offspring (individuals acquiring the cultural trait). These evolution-inspired approaches to culture allow for a variety of mechanisms unique to cultural transmission, and incorporate processes like drift and multi-level selection. Capitalizing on such approaches, some social scientists are proposing that methods, findings, and theories be systematically exchanged between the biological and cultural sciences, particularly between disciplines that lie at the same level on a micro-to-macro scale (Mesoudi et al. 2006). This trans-disciplinarity is sometimes presented as the beginning of a late and needed evolutionary synthesis in the social sciences, similar to the Modern Synthesis in biology, but some philosophers think this is an overstatement while others think the claim is based on a naïve epistemological conception (Serrelli 2016a).
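As a minimal illustration of the kind of transmission model these approaches rely on, here is a sketch, under purely hypothetical assumptions, in which each individual in a new generation copies a cultural variant from a small sample of “cultural parents,” with a slight content bias favoring one variant; the setup and parameter values are invented for this article, in the general spirit of the formal approaches cited above rather than taken from any of them.

    # Minimal sketch of biased cultural transmission: learners copy a trait from
    # a few cultural models, with a small content bias toward variant 1.
    # All parameter values are illustrative assumptions.
    import random

    def next_generation(pop, n_models=3, bias=0.05):
        """Each learner samples `n_models` cultural parents and adopts a variant."""
        new_pop = []
        for _ in range(len(pop)):
            models = random.sample(pop, n_models)
            p1 = models.count(1) / n_models          # unbiased copying probability
            if p1 > 0:
                p1 = min(1.0, p1 * (1 + bias))       # content bias toward variant 1
            new_pop.append(1 if random.random() < p1 else 0)
        return new_pop

    pop = [0] * 95 + [1] * 5
    for _ in range(200):
        pop = next_generation(pop)
    print("frequency of variant 1:", sum(pop) / len(pop))

Even this toy model displays the features discussed above: change depends only on a correlation between the variants carried by cultural parents and those adopted by learners, drift is visible across repeated runs, and the bias parameter stands in for the many distinctive transmission mechanisms (prestige, conformity, content attractiveness) that such models can incorporate.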

Proposals of this kind call for the involvement of philosophers of biology. The abstract formulations and philosophical issues of natural selection, multi-level selection, and drift (3.b) can help to evaluate the appropriateness of transfers of methods and theories (Godfrey-Smith 2009). General topics like population thinking (4.a.i) and adaptationism (4.a.ii), or reductionism, emerge in cultural evolution too. Progressionism is another very basic issue here. The misleading idea of progress was in fact crudely applied in anthropology by evolutionists in the past, with potentially discriminatory effects and reductionist justifications of supposedly essential diversities within the human species. Now, philosophy can assist cultural anthropology in updating old, stereotyped worries about progressionism, overcoming a derived prejudice against naturalism as such. A coherent and analytical criticism of any form of teleological and progressive evolutionism could thus be a way to reconstruct the broken bridge between cultural anthropology and evolutionary studies (Panebianco and Serrelli 2016a). Cultural evolution also raises deep epistemological problems, not only about definitions of terms like ‘culture’, but also about the knowledge processes that are ongoing in this intensification of contact between the biological and social sciences (Serrelli 2016b, Panebianco and Serrelli 2016b).

b. Biology’s Critical Friend

As we have seen, philosophy of biology is expected to help science and to work hard to keep up with scientific research. But many authors point out that philosophy must not forget its critical role towards science. One perspective on philosophy of science sees it as an enterprise that aims “to understand the life of science, that is, to understand the development of science as a Wittgensteinian family of ongoing human epistemic practices” (Reydon 2005: 252, cf. Grene and Depew 2004). Science can be seen as a part of culture and as a sub-system of society. This critical perspective on science, with its complex relationship with society, is always available to philosophers. Authors like Linda Van Speybroeck expressed the concern that philosophy of science might become “…a servant of science, leaving science undisturbed and calling only for a ‘justification’ of the philosophical practice against scientific standards” (Van Speybroeck 2007: 54).

Stephen Jay Gould’s critique of progressionism in paleontology and paleoanthropology, and his opposition to human racial classification and measurements of intelligence, are examples of constructive criticisms of the scientific enterprise. Current knowledge overwhelmingly shows that human evolution is a bushy tree of coexistent and sometimes interacting hominin species (compare Ruse 2012). But cultural and psychological biases have shaped science in some periods. Human evolution was expected to be, and represented as, a ladder of progress: a sequence of progressively more evolved hominid species, replacing one another and approaching Homo sapiens, the climax species, often represented as a European male (Eldredge and Tattersall 1982). Stephen Jay Gould, in books such as Wonderful Life (1989), with a peculiar method that could be called an archaeology of scientific ideas (Pievani 2012a), coined expressions like the “iconography of hope” and tied them to social history, to some deep teleological preferences, and to our habit of using the present as a key for understanding the past. For many years, Gould fought for human evolution to be separated from human hopes, satisfying tales, and “great narratives.” He concentrated on the jargon, distinguishing evolution from progress and trend from finality, and stressing the ambiguous appeal of fashionable terms like “missing link.” Gould’s critical metaphors put paleoanthropology in contact with its larger cultural context and deeper psychological roots, helping to reinforce the theoretical normalization of human evolution into a branching model of diversification of species, typical of the broader phylogenetic tree of primates.

Gould’s way of reasoning was particularly apt in showing that science is a human activity. Scientists work on the ground of cultural and social biases; they are not naïve collectors of neutral facts. Although Gould was not a sociological relativist, he showed in many cases the importance of history in shaping science, where ideas may be dismissed and then taken up again. This attitude is also found in Eldredge and Gould’s (1972) famous “punctuated equilibria” paper, where the two paleontologists revealed the deeply theory-laden nature of fossil interpretation. The ubiquitous pattern of evolutionary stasis over geological time periods had been neglected since Darwin (1859), who considered the fossil record a constitutionally incomplete documentation. But the pattern of stasis and punctuation was, as Eldredge and Gould pointed out, data, not lack of data. “Phyletic gradualism” was a consolidated assumption that had been blinding even paleontologists to their own data, while perpetuating the subordinate position of paleontology with respect to theoretically important fields. In fact, Eldredge and Gould formulated a theory that explained punctuations as speciation events, and stasis by other processes. In doing so, they posed several problems, among which was the legitimation of paleontology as a theoretically relevant field.

In The Mismeasure of Man (1981), Gould argued against the general scientific attitude of attaching “universal essences” to human disparities by measuring what is not measurable, like intelligence quotient, an artificial construct. He also exposed unconscious manipulations in the anthropometric measurements and ranking of skulls in Samuel G. Morton’s work (Crania Americana, 1839): Morton believed in “races” and in their polygenic origin, and he believed human intelligence was a unitary and inheritable object. Gould found Morton’s measurements to be biased by his self-confirming preconceptions. For Gould, any scientist is an unconscious victim of his or her preconceptions. Recent studies of Morton’s material (Lewis et al. 2011) rehabilitated Morton’s original measurements. The authors hypothesized that Gould’s severe analysis was itself biased by his own aprioristic egalitarian and liberal cultural beliefs. “In a paradoxical way Steve had proved his own point,” Ian Tattersall observed (2013). No scientist can be immune to this kind of influence, against which, for Gould, “vigilance and scrutiny” are the necessary, although insufficient, palliatives. Along these lines, philosophy of biology is invited to take on its critical and constructive role towards science.

c. Developing Messages from Biology

Many topics in philosophy of biology are evidently relevant to the relationship between biology and the public. Deciding the meaning of concepts like natural selection, fitness, or function requires an understanding of the theories and their domains (Rosenberg and McShea 2008). Philosophy of biology clarifies, for example, that “‘natural selection’ is not an entirely apt name for the process [it identifies], as it misleadingly suggests the notions of choice, desire, and belief built into the theological account of adaptations” (p. 17), and corrects popular descriptions of evolution that make it look tautological and unscientific. The notoriously slow-changing world of science education and public understanding is under pressure from the impressive growth rate of the life sciences. In popularizations of science, hominid evolution is still depicted as a linear sequence of species from simple to complex, from inferior to superior, from archaic to modern. This picture has intuitive power, as do other influential and enduring images, like the tree of life, even as they are being questioned and complicated. And any time a naturalistic study of humans is carried out, misunderstandings and dangerous “genes-for-morality” interpretations are just around the corner, both in the form of quick reductionistic and deterministic claims and in the form of a priori anti-naturalistic positions. Even important scientific advances, like evo-devo and other fields that are pushing for an Extended Evolutionary Synthesis, can be carelessly interpreted as ruptures that invalidate all previous knowledge. Opponents of science can use this to covertly reintroduce non-naturalist and non-scientific explanations. Instead, philosophy of biology could help citizens become more scientific, and more able to exercise the directional role they have towards science (cf. Haarsma et al. 2014-2015).

According to a principle called “scientific citizenship,” science should be properly understood by everyone in society. A fundamental pillar is a shared awareness of “the nature of science,” that is, the wealth of philosophical reflection about what science is, what biology is, how scientific knowledge is best acquired, and how much confidence we can place in it (Matthews 1994). What are the ways of thinking used in biology that can help people get a grasp of it? Are they similar to everyday ways of reasoning? What are the possible conceptual traps and pitfalls? What kind of knowledge can we expect from ecological simulations? What are model organisms, and why is it important to establish and fund them? What kinds of predictions can really be made, for example, in medicine or ecology? Understanding the probabilistic nature of predictions would produce not only a more educated picture of nature, but also, for example, an “awakening of public opinion to environmental problems, hydro-geological instability, and maintenance of territory” (Pievani 2012b: 352).

Philosophy of biology’s endeavor can sometimes go far from day-to-day biology and work out general pictures, messages, and worldviews. A classic and pervasive example is “Universal Darwinism,” developed by Richard Dawkins, Daniel Dennett and others (Dawkins 1983) along a line of reasoning from kin selection theory to a “gene’s eye view” to a “replicator view.” According to the “selfish gene” part of the argument (Dawkins 1976), reliably replicating molecules (precursors of genes) would be at the origin of life, and organisms, including humans, would be latecomer vehicles built up in all their minute details by genes that survived the competition. The selfish gene has been particularly appealing and controversial for its philosophical implications: we are unwitting machines for our genes, whose interests, furthermore, sometimes conflict with (and win over) ours. The selfish gene view even came to be considered the official version of Darwinism, being the one defended and advocated on many public stages against non-scientific views. Universal Darwinism is a philosophical view according to which natural selection, understood as the selective retention and accumulation of blind variations that prove to be stable and fit, is the fundamental mechanism in the Universe. Many philosophers of biology fought these views and all their philosophical implications, seeing them as a scarcely justified reification of idealizations that population genetics had operationalized for methodological purposes (Oyama et al. 2001). Other lines of attack were the remarkable polysemy and theoretical complexity of “gene” and the ongoing revision of genes’ causal power over the organism (Griffiths and Stotz 2013). Critics were all the more motivated by their discontent with the picture of evolutionary biology that the selfish gene view conveyed to the public and to philosophy. Many philosophers pointed out that natural selection is arbitrarily chosen as the process to be universalized into a general philosophical view (Godfrey-Smith 2009), and some outlined provocative alternative views like Universal Symbiogenesis (Gontier 2007).

Other philosophical views have been elaborated from the importance of chance, randomness, and, more comprehensively, contingency in evolution. Philosophers point to studies demonstrating that chance variation can influence evolutionary outcomes without being constrained or directed: “Evolutionary divergence is sometimes due to differences in the order of appearance of chance variations, and not to differences in the direction of selection” (Beatty 2010: 39). Since the 1980s, the neutral theory of molecular evolution promoted by Motoo Kimura (Ohta & Kimura 1971), since developed into the weak neutralism now current in the scientific community (Hartl & Clark 2007), has exposed the huge proportion of neutral variation in genomes. Population genetics models show that important events like speciation can well happen in conditions of fitness neutrality (Gavrilets 2004). Much earlier, population geneticists demonstrated the importance of drift (see above, 2.a), illustrated in the sketch below. But is evolution random or contingent at every spatio-temporal scale? Are there law-like tendencies in large-scale evolution? If so, do these concern adaptedness, complexity, or other features? Since environments change over time, “what is adaptive” changes constantly through evolutionary time, so there is no strict commitment in Darwinian theory to long-term adaptive progress or trends (Serrelli and Gontier 2015a).
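The importance of drift can be made concrete with a minimal, hypothetical sketch (not taken from any of the works cited here) of a Wright-Fisher model: in populations evolving with no selection at all, allele frequencies can still diverge simply because each generation is a random sample of the previous one.

```python
import random

def wright_fisher(pop_size, generations, initial_freq=0.5, seed=None):
    """Track the frequency of a selectively neutral allele under pure drift."""
    rng = random.Random(seed)
    freq = initial_freq
    trajectory = [freq]
    for _ in range(generations):
        # Each of the pop_size gene copies in the next generation is drawn
        # at random from the current generation, so it carries the allele
        # with probability equal to the allele's current frequency.
        count = sum(rng.random() < freq for _ in range(pop_size))
        freq = count / pop_size
        trajectory.append(freq)
    return trajectory

if __name__ == "__main__":
    # Two populations, identical parameters, no selection anywhere:
    # sampling error alone can carry them to very different frequencies,
    # sometimes all the way to fixation or loss of the allele.
    pop_a = wright_fisher(pop_size=100, generations=500, seed=1)
    pop_b = wright_fisher(pop_size=100, generations=500, seed=2)
    print("final frequency in population A:", pop_a[-1])
    print("final frequency in population B:", pop_b[-1])
```

Runs with different random seeds routinely end at different frequencies, sometimes at fixation or loss, which is the kind of selection-free divergence described in the quotation from Beatty above.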

Are trends towards greater complexity a better candidate? Some philosophers think so (McShea and Brandon 2010). Others think that nothing in the current understanding of evolution predicts a drive towards increased complexity. Another main disagreement concerns how to define complexity: through the integrated organization of interrelated parts in a whole, or just through the number and diversity of parts? Some philosophers see evolution as a texture of contingent histories, the most ordinary, but most relevant to us, being the story of our species and its relatives. The human tree is just like that of other mammals. This emerging view is very different from the one inspired by Universal Darwinism:

Evolution is a process that abounds in redundancies and imperfections, and adaptation could be a collateral effect rather than a direct optimisation. Biology is a field of potentialities, and not determinations…. Complex organisms exist thanks to imperfections, to multiplicity of use and redundancy. (Pievani 2012a: 142)

For authors like Stephen Jay Gould, the preeminence in natural history of ecological contingencies and macro-evolutionary patterns, like mass extinctions, seems to dispel any idea of progress in evolution:

We are the offspring of the material and contingent relationships between localised populations and ever-changing environments. The massive contingency of human evolution means that particular events, or apparently meaningless details, were able to shape irreversibly the course of natural history. (Pievani 2012a: 139)

Some theorists identify the source of contingency in the complex interplay between ecological systems and genealogical entities at multiple and very diverse scales (Eldredge et al. 2016).

From the disruption of the idea of a great progressive tendency in evolution, and in particular in human evolution, some philosophers draw general implications. Evolutionary humility results from a naturalistic way of seeing Homo sapiens as a part of a contingent process and not as its culmination. From another point of view, the discovery of the determining role of contingent ecological events like floods and earthquakes gives a new vision of nature. Nature is neither a harmonic Eden nor a wicked nemesis. “The expressions of violence and unpredictability of natural phenomena that shock our societies so much today are the normal ecological niches where we were born. We would not be here at this moment without them” (Pievani 2012b: 352).

6. Scientifically Up-to-Date Philosophy

In a famous paper titled “What philosophy of biology is not,” David Hull wrote that any work in philosophy of biology should not skip “all the intricacy of evolutionary relationships, the difficulties with various mechanisms, the recalcitrant data, the wealth of supporting evidence” (Hull 1969: 162). Along the same lines, philosophy of biology tends to be grounded in a deep knowledge and understanding of current biology, maintained through a sustained familiarity with many fields that lie outside philosophical specialization.

It is important to keep in mind that the life sciences and their objects change and grow. A basic example is the very definition of “life” (bios). Answers, as well as philosophical problems, for such a topic come, for example, from scientific research on the origin of life (Penny 2005) or from the search for extra-terrestrial life. The origin of life has been considered a backwards extension of the roots of the tree of common descent. It regresses from some universal “ancestral organisms” (LUCA, the Last Universal Common Ancestor) to simpler, minimally living entities that might be referred to as “protoliving systems,” and it further dissolves into non-living matter along several dimensions (Malaterre 2010). In general, philosophers work together with biologists on interpreting the history of life, for example, on trying to make sense of major evolutionary transitions (Maynard Smith and Szathmáry 1995). The symbiogenesis of eukaryotes and the advent of multicellularity are examples of major transitions. Those few moments in the history of life may be seen as points of emergence of new levels of organization. Today the very idea of a tree of life is being challenged by evidence of massive transfers and blurred boundaries among its branches (O’Malley 2010a). The history of biology is a history of changing views of life and of its history, and philosophy of biology participates in this process.

In the 50 years of philosophy of biology’s existence as an academic field, the expansion of the life sciences has been explosive. Huge global problems like climate change, biodiversity loss, new forms of disease, and the need for resource management in our societies have contributed to that growth. The life sciences, including biomedical, ecological, and microbiological fields, have faced challenges related to the modern world, not only by being called into question or by actively seizing opportunities to develop big research projects and programs, but also by contributing to the very discovery and perception of those challenges. In parallel, the life sciences have seen incredible technical advancements: cheaper and faster technologies for DNA sequencing and molecular analysis; computational methods allowing the analysis of billions of nucleotides in search of the most likely phylogeny, and simulations of the evolution of proteins or complex phenotypes; huge shared information databases, particularly the Human Genome Project and the various “-omics” projects (genomics, proteomics, metabolomics) monitoring whole complexes of functional molecules; and, finally, the applications of these technical advancements in creating new kinds of organisms or in disease detection and therapy. A parallel phenomenon of the decades leading into the 21st century has been the explosion of scientific literature and of access to it.

What are the consequences for philosophy of biology? The discipline is supposed to have a role not only in understanding, describing, and communicating science but also in aiding the development of scientific programs. Given the explosive historical dynamics outlined above, almost every subfield of biology demands much of the philosopher who delves into its peculiar concepts, methods, objects, and conceptual issues. Hence, several presentations of philosophy of biology follow a field-by-field criterion, enumerating items like “philosophy of ecology” or “philosophy of molecular biology” (Griffiths 2011). Such multi-field presentations reflect the dynamic and lively development of biology. Meanwhile, periodic shifts of focus and the emergence of new fields and techniques in the scientific literature attract the curiosity of philosophers and call for their contribution. But the identities of supposed sub-fields like “philosophy of ecology” or “philosophy of microbiology” are not crystallized, and rarely will a philosopher of biology limit herself to one sub-field. Therefore, the field-by-field approach is not followed here.

This section provides a few examples of how philosophers of biology can chase the developments of particular fields of the life sciences. At times, this pursuit leads philosophy of biology to extend itself to embrace not only new scientific knowledge but also newborn ways of doing science. Furthermore, deep revisions of philosophical approaches themselves may be necessary to address new aspects of science, namely facets of scientific practice. The examples concern molecular studies of gene exchange that started in microbial evolution, advances in ecological modeling, and the construction and management of model organisms.

a. Questioning Influential Ideas

In recent years, several philosophers have become interested in the growing evidence for a variety of gene exchange mechanisms widespread in fungi, plants, and animals, not only in prokaryotes (unicellular organisms that have long been known to wildly transform, conjugate, and acquire DNA by transduction). Following biologists such as W. Ford Doolittle, philosophers contrasted this evidence with the idea of a universal tree of life as a tree of sexually reproducing species that multiply by genetic isolation. For O’Malley (2010b), the idea of a universal tree of life has come to be an unacceptable overreach given the abundance of reticulation and lateral gene transfer (LGT) in all kinds of organisms, including animals. O’Malley proposed to give up

an animal-centric philosophy of evolution. Its key tenets are that the BSC (Biological Species Concept) is ‘universal’ to species-forming organisms, that bifurcating lines of descent are all that matter in ancestry reconstruction, and that the rest of life (non-animal) is simpler, less diverse, and less ‘true’ evolutionarily. The consequences of [such an] animal-centric philosophy of evolution are that it can include at best a severely truncated history of evolutionary events. (p. 544)

The animal-based, constraining idea of a universal tree of life was consolidated around the mid-20th century through an attempt at “excluding the messy,” such as bacteria and other prokaryotes. For O’Malley, such exclusion was mainly due to the influence of the ornithologist and philosopher Ernst Mayr.

As this example illustrates, philosophers, by following frontier developments of particular fields, can sense the need for a revision of the overarching theoretical choices and frames of biology. At the same time, they can wriggle out of canonical problems and concepts and expand their philosophical interests: “It is definitely the case – for O’Malley – that philosophy of biology has undergone a rapid radiation of topics in the last decade and this has meant going beyond Mayr’s focus” (p. 544). Resilient issues like the species concept, molded on animal breeding, are being pushed aside by new concepts amenable to philosophical analysis. Further, the very task and range of philosophy of biology come to be questioned: “Mayr’s vision of philosophy of biology as the clarification of evolutionary concepts has…been challenged” (O’Malley 2010b: 531).

b. Understanding New Scientific Practices

According to philosopher Thomas Pradeu (2009), ecology has made “a conspicuous and very welcome entry” into philosophy of biology, and several philosophers advocate aspects of ecology as targets of philosophical attention. Ecology has its own epistemological issues, for example, the weakness of ecological laws, the debate around the idea of a balance of nature, the complex problem of the predictive value of ecological models, and ecology’s involvement in environmental decision making (Cooper 2003, Mikkelson 2007, Plutynski 2008). Other problems concern the individuation of ecological units, scale-dependence, and the generalizability of ecological models. Some philosophers have become interested in the new modeling methodologies of ecological inquiry, the area of the following example.

Ecologist Steven Peck (2008) contributed to the debate from the standpoint of an author of complex computer simulations. He pointed out the many differences between simulations and “analytic models,” which can be written as mathematical equations. In simulations, many of the entities, “dispositions,” rules, and relationships are not captured by any equation; rather, they are written directly in computer code: “conditional if-then statements, looping structures, and calls to procedures” (Peck 2008: 390). Moreover, simulations are not at all complicated versions of analytic models because, as Peck explains,

The complex computer representation is an ecological system. One that you have complete control over, but which provides insights and allows complex behavior to bubble up from lower level processes and allows one to capture the emergent behavior often seen in ecological systems. (p. 387)

For Peck, models of this kind raise provocative philosophical questions: What are models? What are their aims? How do they work? A classic and influential framework by Richard Levins (1966) identifies three constraints among which a model has to trade off: generality, precision, and realism. But building a simulation, for Peck, does not consist in adding more variables and parameters in order to capture more parts and processes of some targeted biological system. It is a creative effort yielding something autonomous, with interesting but complex relationships with other models and with the world. Some notions in the philosophy of modeling can make a step in the direction of simulations, like the distinction between “indirect representation,” in which the model descriptions themselves are examined, and “direct representation,” in which representation is used to describe a real-world system (Godfrey-Smith 2006). But for Peck this distinction is not enough to capture the fact that simulations become experimental systems in themselves. Playing with the model means a “bracketing out of nature to explore the model itself [yielding] important insights to the model – separate from what it is supposed to represent” (Peck 2008: 395).
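To make Peck’s contrast concrete, here is a minimal, hypothetical sketch (not code from Peck’s own simulations): the first function is an analytic-style model, a single logistic difference equation; the second is a toy individual-based simulation whose assumptions live in conditional rules and loops, so that its population-level behavior has to be observed by running it rather than solved in closed form.

```python
import random

def logistic_analytic(n0, r, k, generations):
    """Analytic-style model: one difference equation for population size."""
    sizes = [n0]
    for _ in range(generations):
        n = sizes[-1]
        sizes.append(n + r * n * (1 - n / k))
    return sizes

def individual_based(n0, birth_prob, death_prob, k, generations, seed=None):
    """Simulation-style model: if-then rules applied to each individual.

    Crowding makes death more likely; nothing here is a closed-form equation,
    and the population-level pattern "bubbles up" from the individual rules.
    """
    rng = random.Random(seed)
    population = n0
    sizes = [population]
    for _ in range(generations):
        next_population = 0
        for _ in range(population):
            # Death is more probable when the population approaches capacity.
            crowding = population / k
            if rng.random() < death_prob * crowding:
                continue  # this individual dies
            next_population += 1  # survivor
            if rng.random() < birth_prob:
                next_population += 1  # and possibly reproduces
        population = next_population
        sizes.append(population)
    return sizes

if __name__ == "__main__":
    print(logistic_analytic(n0=10, r=0.3, k=500, generations=30)[-1])
    print(individual_based(n0=10, birth_prob=0.4, death_prob=0.5, k=500,
                           generations=30, seed=42)[-1])
```

In the spirit of Peck’s distinction, the second model is something one runs and explores, much like an experimental system, rather than a set of equations one analyzes.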

Complex computer simulations push philosophy of biology to reflect on new accounts of model building and interpretation, and also to deal with new ways in which scientific communities structure themselves.

c. Rethinking the Philosophical Approach from New Ways of Doing Science

Chasing the state of the art of the life sciences, even of one or a few fields at a time, is very demanding. By doing so, however, philosophers of biology can continually fuel their thought. We have seen, for example, that philosophers, relying on molecular discoveries about gene exchange, can revise their agenda, downplaying traditionally framed problems (for example, what is a species?) and even questioning the great background pictures against which biology is thought (the tree of life, for example). By striving to understand scientific methodologies, including completely new ones such as computer simulations in ecology, philosophers are brought to probe their accounts of science, to work out new ones, and to get a better hold on them and their connections. Related to all these efforts is another tendency in philosophy of science, which has been made explicit in the last few years: the philosophical orientation toward scientific practice (Boumans and Leonelli 2013). A recent trend, represented, for example, by the Society for Philosophy of Science in Practice (SPSP), points out the limits of an exclusive use of conceptual analysis, proposing instead “a philosophy of scientific practice based on an analytic framework that takes into consideration theory, practice and the world simultaneously” (Boumans et al. 2011). Practice is defined as an ensemble of “organized or regulated activities aimed at the achievement of certain goals” and is itself an object of philosophical reflection.

Steven Peck’s analysis of ecological simulation, mentioned earlier, can also be seen as an example of this call to scientific practice. For Peck, simulations are experimental systems more than representations. A simulation is useful because it helps in “thinking more deeply and creatively into the nature of the problem” (Peck 2008: 396), but it can be appreciated only by considering the community in which the authors of the simulation are present and maintain the simulation (otherwise, the simulation dies because it cannot be understood or replicated by others). Crucial aspects are the authors’ willingness to expose the multiple perspectives that are encoded in the simulation, to confront them with other modelers and with “those who study the ecology of natural systems directly” (p. 399), and the authors’ engagement in discussing, modifying, and exploring the simulation further. In this new kind of engagement, Peck sees a “hermeneutic circle,” a “back and forth conversation among modelers, their models, and those who study the ecology of natural systems directly,” characterized by each actor’s attempt to “understand her own perspective in light of others’ perspectives.” The hermeneutic circle is, for Peck, what “opens the door to deeper understanding of what the simulation model is showing us about the world” (p. 399). Scientific community practices are crucial for understanding what is going on. Logical analysis, whether of the model alone or of its relationships with some natural ecological system, does not seem to take philosophy very far.

Another example of a rising scientific practice is constituted by model organisms, a term introduced in the late 1990s and increasingly used since. Official lists of model organisms include species such as the mouse, the zebrafish, the fruit fly, the nematode worm, and thale cress. Mice and other animals are extremely important in biomedical research due to the extrapolations to Homo sapiens that are considered possible under certain conditions (Piotrowska 2013). In philosophy of biology, there has been interest in understanding what “model organisms” are and in demarcating them from the larger set of experimental organisms. Ankeny and Leonelli (2011) define model organisms as:

Non-human species that are extensively studied in order to understand a range of biological phenomena, with the hope that data and theories generated through use of the model will be applicable to other organisms, particularly those that are in some way more complex than the original model. (p. 313)

The two philosophers work out several concepts embedded in this definition in order to capture the specificity of model organisms. For the issue at hand, the most important aspect is the new kind of structured scientific community that keeps a model organism stable in space and time. Examples of model organisms are the fly Drosophila melanogaster, studied since the dawn of genetics; the mouse Mus musculus, including its knockout strains; and the plant Arabidopsis thaliana. The research community of a model organism performs intensive research with “a strong ethos of sharing resource materials, techniques, and data” (p. 317). While initially the organism can be chosen for experimental advantages (being easy to breed, for example), the cumulative establishment of techniques, practices, and results—for example, through databases and stock centers—leads to self-reinforcing standardization, comparability, and stability: “…the more the model system is studied, and the greater the number of perspectives from which it is understood, the more it becomes established as a model system” (Creager et al. 2007: 6).

To summarize, a recent stream in philosophy of biology associates the need for first-hand, recent, and deep scientific information with the need to consider newly emerged scientific practices, practices that often involve innovative scientific communities and that are capable of generating new questions. It is no surprise that the philosopher Sabina Leonelli, in a paper on the different modes of abstraction performed by different communities on the model organism Arabidopsis thaliana, writes: “Focusing primarily on modelling practices, rather than on models thus produced, might prove a useful way to gain insight on some long-standing debates within the philosophy of scientific modelling and representation” (Leonelli 2007: 510). By following biology, philosophy rethinks its own methods and foundations.

7. History and Philosophy of Biology

Good philosophy of biology may be done non-historically, by working in a purely conceptual way. On the other hand, philosophy is sometimes specifically interested in grand historical processes. The inception of empirical and theoretical novelties and the birth of whole new fields provide opportunities to probe classic accounts of scientific change such as Popper’s falsificationism (Popper 1935, 1963), Kuhn’s paradigm change (Kuhn 1962), Lakatos’s methodology of scientific research programmes (Lakatos 1970), or Kitcher’s theoretical unification (Kitcher 1981). Philosophical categories such as reductionism can be tested for their capacity to account for the historical relationships between biological fields. At the birth of molecular genetics, for example, a philosophical question was whether the older Mendelian genetics was being reduced to it. As Griffiths (2010) remarks, philosophers of biology achieved more adequate models of theory by debating whether or not the molecular revolution in biology was a case of successful scientific reduction.

At all scales, philosophy of biology has a constant need to refer to history, and some authors complain that philosophy of biology pays too little attention “to the question how and why things in the field have become the way they are today” (Reydon 2005: 249-250). The rediscovered interdependence between history and philosophy of biology may be seen in light of the more general project of “history and philosophy of science” (HPS) studies, which has increasingly been pursued in an integrated way. The recursive and expansive dynamic between history and philosophy of biology proves to be a generator of dense and complex elaborations in all the fields involved.

James Griesemer (2007) provides an example of how philosophical views serve historical analysis. The topic is the molecular study of development at the heart of evo-devo (see also 5.a). Griesemer suggests a revision of conventional narratives that describe evo-devo as a union between genetics and development, the study of which had supposedly been abandoned since the 1930s (see Gilbert et al. 1996). For Griesemer this separation between fields is artificial: embryology and genetics have always been “like the segments of a centipede: moving together with limited autonomy” (p. 376). Griesemer’s long, reframing argument goes through the philosophical characterization of scientists as process followers: “there is no doubt that scientists do follow processes, that this is an important and central activity in their work, and that they achieve causal understanding as a result of doing it” (p. 377). Research styles are, for Griesemer, “commitments to follow processes in a certain way.” Griesemer constructs the idea of genetics and embryology as nothing but research styles that “package commitments to follow processes according to particular sorts of marking interactions and tracking conventions together with commitments to represent processes in particular ways” (p. 381). In this view, the separation between genes and development ceases to be considered an ontological divide: “It does not follow from the divergence of research styles and representational practices in genetics and embryology that nature is divided into separate processes of heredity and development” (p. 414).

An important ingredient of Griesemer’s view is the dual life of representations. Before being pressed into service as tools for communicating results and interpretations, representations are “working objects, developed as bench or field tools for tracking phenomena and following processes” (p. 388). Scientists, to be able to follow processes, produce representations that, in turn, “provide reinforcing feedback that organizes attention into foreground and background concerns” (p. 380). This dynamic becomes important when representations survive the experimental work in which they are produced and get used by other scientists. In scientific communities, process-marking and process-following strongly reinforce each other in an attention-guiding feedback, operated by representations.

Griesemer analyzes Gregor Mendel’s experiments and describes Mendel, universally considered the founding father of genetics, as a process follower and as a developmentalist. The different and successive notations appearing in Mendel’s writings (for example, A + 2Aa + a; then A + A + a + a) are the representations sequentially devised by Mendel in order to follow his enduring interest, that is, the developmental process of hybrids. Therefore, in this account, Mendel was a developmentalist who offered lasting representations that, in turn, helped some of his followers to focus on patterns of intergenerational transmission, backgrounding development. A particular power is given, in this account, to representations like Mendel’s notation: “In Mendel’s demonstration of notational equivalence, the possibility of reinterpreting his work on the development of hybrid characters in terms of a factor theory of hereditary transmission was so strong that it took very careful analysis by historians to show that Mendel probably did not hold the factor theory with which he is credited” (Griesemer 2007: 404).
The consolidation of research styles directly concealed the awareness of their unity: “theories of heredity entail methodologies of development, and conversely” (p. 414); “genetics entails an idealized and abstracted account of development and development entails an idealized and abstract account of heredity” (p. 417); and “the gene theory not only had an embryological origin, it never really left embryology at all” (p. 414). An implication for the present and future of evo-devo is that putting heredity and development “back together again” can be thought of, in part, as a problem of conceptual reorientation, change in theoretical perspective, and rehabilitation of research styles that were out of the limelight rather than extinct due to failure. The commonly suggested linear progress from embryology to genetics to evo-devo is a historical artifact: there is no “historical progression of fields or lines of work that take the scientific limelight in turn” (p. 417).

Whether the implications just described are historical or philosophical is difficult to discern, particularly in cases like evo-devo that are in current philosophical focus. It is more productive to say that Griesemer’s analysis shows the interplay between philosophy and history of biology. The view of scientists as “process-followers” can indeed be seen as both a methodological and a philosophical proposal.

Philosophically, the view of scientists as process-followers is elaborated in considerable detail through the related notions of marking (mental and manipulative), foregrounding and backgrounding, research styles, and so on. Griesemer explicitly confronts his proposed view with more widespread approaches. For example:

The notion of following a process unifies [commonly separated] descriptions of science as theoretical representation, as systematic observation, and as technological intervention, [and] cuts across many analytical distinctions commonly used to describe science (e.g., theory vs. observation, theory vs. experiment, hypothesis-testing vs. measurement, active manipulation vs. passive observation, scientific methods vs. scientific goals). (p. 375)

The marking activity, in particular, “is not easily assigned to either of the traditional categories of passive observation and active experiment. It is nonmanipulative yet active work” (p. 379), and since such work allows causality to be identified, theory and methodology are intimately related. This local metatheorizing can be taken up by other philosophers and elaborated further.

Historians, too, are invited to test the view by describing other cases in the history of biology. For historians, however, there is also a reflexive message: just as scientists mark and follow processes in the systems they study, so do historians with the social and historical processes they follow. History and philosophy of science, in a word, would consist in a “following of scientists while they follow nature” (p. 376). This means that history and philosophy of science itself may be subject to research styles, and may need theoretical reorientations to gain new understanding. In general, for example, since narratives tend to follow historical developments within scientific styles, history often gives the false impression that scientific disciplines (embryology and genetics, for example) are separate because their theories describe different processes, like development and heredity. So historians and philosophers are invited to abandon “limelight narratives” that, while making for “instructive drama,” confound our understanding of social processes. In the specific case under study:

Embryology is not well tracked in narratives that foreground the success of genetics after the split…. We must instead follow the ‘bushy’ divergences and reticulations of several sciences as they spawn new lines of work if we are to understand, and follow, science as a process. (p. 376, emphasis added)

8. Conclusion

We have seen that philosophy of biology can be actively involved in some scientific activities. When this is not the case, philosophy of biology is largely an interpretive description or re-description of instances of biology. Generalities about science, whether general descriptions of science, biology, or biology’s sub-fields, or general philosophical problems about science, are usually the goals, sources, and backgrounds of philosophy of biology. At the same time, tentative generalities about science, such as the view of “scientists as followers of processes” discussed above, can serve historical reconstructions and translate into methodological proposals for historians, too. The service is reciprocal, however, since present or past historical cases substantiate, when they do not outright suggest, new philosophical views of science, which is a tendency specific to philosophy. The conclusion is that not only philosophy and science, but also history, form an entangled, integrated whole in current research, thought, and practice.

9. References and Further Reading

a. Cited Examples

  • Ankeny, Rachel, Sabina Leonelli. 2011. “What’s so special about model organisms?” Studies In History and Philosophy of Science Part A 42(2): 313-323.
  • Ariew, André. 2003. “Ernst Mayr’s ‘ultimate/proximate’ distinction reconsidered and reconstructed.” Biology and Philosophy 18: 553–565.
  • Eds. Barkow, Jerome H., Leda Cosmides, and John Tooby. 1992. The Adapted Mind: Evolutionary Psychology and the Generation of Culture. New York, NY: Oxford University Press.
  • Beatty, John. 2010. “Reconsidering the importance of chance variation.” Evolution – The Extended Synthesis. Eds. Pigliucci, Massimo, and Gerd B. Müller. Cambridge-London: MIT Press, pp. 21-44.
  • Boudry, Maarten, Stefaan Blancke, and Johan Braeckman. 2010. “How not to attack Intelligent Design creationism: philosophical misconceptions about methodological naturalism.” Foundations of Science 15(3): 227-244.
  • Eds. Boumans, Marcel, Hasok Chang, and Rachel Ankeny. 2011. “Philosophy of Science in Practice.” European Journal for Philosophy of Science 1(3).
  • Boumans, Marcel, and Sabina Leonelli. 2013. “Introduction: on the Philosophy of Science in Practice.” Journal for General Philosophy of Science 44(2): 259-261.
  • Boyd, Robert, and Peter J. Richerson. 1985. Culture and the Evolutionary Process. Chicago: University of Chicago Press.
  • Callebaut, Werner. 2010. “The dialectics of dis/unity in the evolutionary synthesis and its extensions.” Evolution – The Extended Synthesis. Eds. Pigliucci, Massimo, and Gerd B. Müller. Cambridge-London: MIT Press, pp. 443-481.
  • Casetta, Elena, and Jorge Marques da Silva. 2015. “Facing the Big Sixth: from prioritizing species to conserving biodiversity.” Macroevolution – Explanation, Interpretation, Evidence. Eds. Serrelli, Emanuele, and Nathalie Gontier. Berlin: Springer, pp. 377-403.
  • Cavalli-Sforza, Luigi L., and Marcus W. Feldman. 1981. Cultural Transmission and Evolution: A Quantitative Approach. Princeton, NJ: Princeton University Press.
  • Chung, Carl. 2003. “On the origin of the typological/population distinction in Ernst Mayr’s changing views of species, 1942–1959.” Studies in History and Philosophy of Biological and Biomedical Sciences 34(2): 277-296.
  • Churchland, Patricia S. 2011. Braintrust. What Neuroscience Tells Us about Morality. Princeton, NJ: Princeton University Press.
  • Claramonte Sanz, Vincente. 2011. Diseño Inteligente: La Pseudociencia del Siglo XXI: Aspectos Filosóficos, Sociológicos y Políticos de la Nueva Ideología Antievolucionista, Editorial Académica Española.
  • Cooper, Gregory. 2003. The Science of the Struggle for Existence: On the Foundations of Ecology. Cambridge University Press.
  • Eds. Creager, Angela N. H., Elizabeth Lunbeck, and M. Norton Wise. 2007. Science without Laws: Model Systems, Cases, Exemplary Narratives. Durham and London: Duke University Press.
  • Darwin, Charles R. 1859. On The Origin of Species. John Murray, London, sixth ed. quoted, 1872 edition.
  • Darwin, Charles R. 1871. The Descent of Man, and Selection in Relation to Sex. John Murray, London.
  • Darwin, Charles R. 1877. The Various Contrivances by Which Orchids are Fertilized by Insects. John Murray, London.
  • Dawkins, Richard. 1976. The Selfish Gene. Oxford: Oxford University Press.
  • Dawkins, Richard. 1983. “Universal Darwinism.” Evolution from Molecules to Men. Ed. Bendall, D. S. Cambridge: Cambridge University Press.
  • Eds. De Caro, Maria, and David Macarthur. 2004. Naturalism in Question. Cambridge (MA): Harvard University Press.
  • Dennett, Daniel C. 1995. Darwin’s Dangerous Idea: Evolution and the Meanings of Life. Simon & Schuster, New York.
  • Downes, Stephen M. 1992. “The importance of models in theorizing: a deflationary semantic view.” Proc. of the Phil. of Science Ass. (PSA), volume 1. The University of Chicago Press, pp. 142-153.
  • Dupré, John. 2001. Human Nature and the Limits of Science. Oxford: Oxford University Press.
  • Eldredge, Niles. 1999. The Pattern of Evolution. New York: Freeman and Co.
  • Eldredge, Niles, and Stephen J. Gould. 1972. “Punctuated equilibria: an alternative to phyletic gradualism.” Models in Paleobiology. Ed. Schopf, Thomas J. M. San Francisco, CA: Freeman.
  • Eldredge, Niles, Telmo Pievani, Emanuele Serrelli, and Ilya Temkin. 2016. Evolutionary Theory: A Hierarchical Perspective. University of Chicago Press.
  • Eldredge, Niles, and Ian Tattersall. 1982. The Myths of Human Evolution. New York: Columbia University Press.
  • Ereshefsky, Marc. 1997. “The evolution of the linnaean hierarchy.” Biology and Philosophy 12(4): 493–519.
  • Ereshefsky, Marc. 2012. “Homology thinking.” Biology and Philosophy 27(3): 381-400.
  • Fodor, Jerry. 1974. “Special sciences and the disunity of science as a working hypothesis.” Synthese 28: 97-115.
  • Gavrilets, Sergey. 2004. Fitness Landscapes and the Origin of Species. Princeton: Princeton University Press.
  • Ghiselin, Michael T. 1997. Metaphysics and the Origin of Species. Albany, NY: State University Press of New York.
  • Gilbert, Scott F., John M. Opitz, and Rudolf A. Raff. 1996. “Resynthesizing evolutionary and developmental biology.” Developmental Biology 173: 357-372.
  • Godfrey-Smith, Peter. 2001. “Three kinds of adaptationism.” Adaptationism and Optimality. Eds. Orzack, Steve H., and Elliott Sober. Cambridge University Press, pp. 335-357.
  • Godfrey-Smith, Peter. 2006. “The strategy of model-based science.” Biology and Philosophy 21(5): 725-740.
  • Godfrey-Smith, Peter. 2009. Darwinian Populations and Natural Selection. Oxford: Oxford University Press.
  • Gontier, Nathalie. 2007. “Universal symbiogenesis: an alternative to universal selectionist accounts of evolution.” Symbiosis 44: 167-181.
  • Gould, Stephen J. 1981. The Mismeasure of Man. New York: W.W. Norton.
  • Gould, Stephen J. 1989. Wonderful Life. The Burgess Shale and the Nature of History. New York: W. W. Norton.
  • Gould, Stephen J., and Richard C. Lewontin. 1979. “The spandrels of San Marco and the Panglossian paradigm: a critique of the adaptationist programme.” Proceedings of the Royal Society of London Part B: Biological Sciences 205: 581-598.
  • Gould, Stephen J., and Elisabeth S. Vrba. 1982. “Exaptation – a missing term in the science of form.” Paleobiology 8: 4-15.
  • Griesemer, James R. 2007. “Tracking organic processes: representations and research styles in classical embryology and genetics.” From Embryology to Evo-Devo: A History of Developmental Evolution. Eds. Laubichler, Manfred D., and Jane Maienschein. Cambridge, MA: MIT Press, pp. 375-433.
  • Eds. Griffiths, Paul E., and Ingo Brigandt. 2007. “The Importance of Homology for Biology and Philosophy.” Biology and Philosophy 22(5).
  • Griffiths, Paul E., and Karola Stotz. 2013. Genetics and Philosophy: An Introduction. New York: Cambridge University Press.
  • Grote, Mathias, and Maureen A. O’Malley. 2011. “Enlightening the life sciences: the history of halobacterial and microbial rhodopsin research.” FEMS Microbiology Reviews 35(6): 1082-99.
  • Haarsma, Deborah, et al. 2014-2015. “The President’s notebook: reviewing Darwin’s Doubt.” BioLogos. Web. 20 July 2016.
  • Haber, Matt H. 2008. “Phylogenetic Inference.” A Companion to the Philosophy of History and Historiography. Ed. Tucker, Aviezer. Blackwell Publishing, pp. 231–242.
  • Haldane, John B. S. 1924. “A mathematical theory of natural and artificial selection. Part I.” Transactions of the Cambridge Philosophical Society 23: 19-41.
  • Hamilton, William D. 1963. “The evolution of altruistic behavior.” American Naturalist 97:354-356.
  • Hamilton, William D. 1964. “The genetical evolution of social behavior.” Journal of Theoretical Biology 7(1): 1-52.
  • Hartl, Daniel L., and Andrew G. Clark. 2007. Principles of Population Genetics, Fourth ed. Sunderland, Mass.: Sinauer Associates.
  • Ed. Keller, Laurent. 1999. Levels of Selection in Evolution. Princeton, NJ: Princeton University Press.
  • Kuhn, Thomas. 1962. The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
  • Lakatos, Imre. 1970. “Falsification and the Methodology of Scientific Research Programmes.”  Criticism and the Growth of Knowledge. Eds. Lakatos, Imre, and Alan Musgraves. Cambridge: Cambridge University Press.
  • Laland, Kevin N., Kim Sterelny, John Odling-Smee, William Hoppitt, and Tobias Uller. 2011. “Cause and effect in biology revisited: is Mayr’s proximate-ultimate dichotomy still useful?” Science 334(6062): 1512-1516.
  • Laland, Kevin N., John Odling-Smee, William Hoppitt, and Tobias Uller. 2013. “More on how and why: a response to commentaries.” Biology and Philosophy 28(5): 793–810.
  • Laubichler, Manfred D. 2009. “Evolutionary developmental biology does offer a significant challenge to the neo-Darwinian paradigm.” Contemporary Debates in Philosophy of Biology. Eds. Ayala, Francisco J., and Robert Arp. Wiley-Blackwell, chapter 11.
  • Leonelli, Sabina. 2007. “Performing abstraction: two ways of modelling Arabidopsis thaliana.” Biology and Philosophy 23(4): 509-528.
  • Levins, Richard. 1966. “The strategy of model building in population biology.” American Scientist 54(4): 421-431.
  • Lewens, Tim. 2008. “Seven types of adaptationism.” Biology and Philosophy 24(2): 161-182.
  • Lewis, Jason E., David DeGusta, Marc R. Meyer, Janet M. Monge, Alan E. Mann, and Ralph L. Holloway. 2011. “The mismeasure of science: Stephen J. Gould versus Samuel G. Morton on skulls and bias.” PLOS Biology 9: e1001071.
  • Lewontin, Richard C. 1970. “The units of selection.” Annual Review of Ecology and Systematics 1: 1-18
  • Love, Alan C. 2003. “Evolutionary morphology, innovation, and the synthesis of evolutionary and developmental biology.” Biology and Philosophy 18(2): 309–345.
  • Machamer, Peter, Carl Craver, and Lindley Darden. 2000. “Thinking about mechanisms.” Philosophy of Science 67: 1-25.
  • Malaterre, Christophe. 2010. “Lifeness signatures and the roots of the tree of life.” Biology and Philosophy 25(4): 643-658.
  • Matthen, Mohan. 2009. “Drift and ‘statistically abstractive explanation’.” Philosophy of Science 76(4): 464-487.
  • Matthen, Mohan. 2010. “What is drift? a response to Millstein, Skipper, and Dietrich.” Philosophy and Theory in Biology 2: e102.
  • Matthen, Mohan, and André Ariew. 2009. “Selection and causation.” Philosophy of Science 76(2): 201-224.
  • Maynard Smith, John. 1964. “Group selection and kin selection.” Nature 201(4924): 1145-1147.
  • Maynard Smith, John, and Eörs Szathmáry. 1995. The Major Transitions in Evolution. New York: Oxford University Press.
  • Mayr, Ernst. 1955. “Karl Jordan’s contribution to current concepts in systematics and evolution.” Transactions of the Royal Entomological Society of London 107(1-14): 25-66.
  • Mayr, Ernst. 1959. “Where are we?” Cold Spring Harbor Symposium on Quantitative Biology 24: 409–440.
  • Mayr, Ernst. 1997. “The objects of selection.” Proceedings of the National Academy of Sciences USA 94(6): 2091-2094.
  • Mayr, Ernst. 1982. The Growth of Biological Thought: Diversity, Evolution, and Inheritance. Belknap Press.
  • Mayr, Ernst. 2004. What Makes Biology Unique? Cambridge: Cambridge University Press.
  • McShea, Daniel W., and Robert N. Brandon. 2010. Biology’s First Law: The Tendency for Diversity and Complexity to Increase in Evolutionary Systems. University of Chicago Press.
  • Merlin, Francesca. 2010. “Evolutionary chance mutation: a defense of the Modern Synthesis’ consensus view.” Philosophy and Theory in Biology 2: e103.
  • Mesoudi, Alex, Andrew Whiten, and Kevin N. Laland. 2006. “Towards a unified science of cultural evolution.” The Behavioral and Brain Sciences 29(4): 329-347; discussion: 347-383.
  • Mikkelson, Gregory M. 2007. “Ecology.” The Cambridge Companion to the Philosophy of Biology. Eds. Hull, David, and Michael Ruse. Cambridge: Cambridge University Press, pp. 372-387.
  • Millstein, Roberta L. 2002. “Are random drift and natural selection conceptually distinct?” Biology and Philosophy 17: 33-53.
  • Millstein, Roberta L. 2006. “Natural selection as a population-level causal process.” British Journal for the Philosophy of Science 57: 627-653.
  • Millstein, Roberta L., Robert A. Skipper, and Michael R. Dietrich. 2009. “(Mis)interpreting mathematical models: drift as a physical process.” Philosophy and Theory in Biology 1: e002.
  • Minelli, Alessandro. 2009. “Evolutionary developmental biology does not offer a significant challenge to the neo-Darwinian paradigm.” Contemporary Debates in Philosophy of Biology. Eds. Ayala, Francisco J., and Robert Arp. Wiley-Blackwell, chapter 12.
  • Morrison, Margaret. 2006. “Unification, explanation and explaining unity: The Fisher-Wright controversy.” The British Journal for the Philosophy of Science 57(1): 233-245.
  • Matthews, Michael R. 1994. Science Teaching. The Role of History and Philosophy of Science. New York-London: Routledge.
  • Ohta, Tomoko, and Motoo Kimura. 1971. “On the constancy of the evolutionary rate of cistrons.” Journal of Molecular Evolution 1(1):18-25.
  • Okasha, Samir. 2006. Evolution and the Levels of Selection. Oxford: Oxford University Press.
  • Olson, Mark E. 2012. “The developmental renaissance in adaptationism.” Trends in Ecology and Evolution 27(5): 278-87.
  • O’Malley, Maureen A., ed. 2010a. The Tree of Life. Spec. issue of Biology and Philosophy 25(4).
  • O’Malley, Maureen A. 2010b. “Ernst Mayr, the tree of life, and philosophy of biology.” The Tree of Life. Spec. issue of Biology and Philosophy 25(4): 529-552.
  • Oyama, Susan. 1998. Evolution’s Eye. Durham (NC): Duke University Press.
  • Oyama, Susan, Paul E. Griffiths, and Russell D. Gray eds. 2001. Cycles of Contingency. Cambridge (MA): The MIT Press.
  • Panebianco, Fabrizio, and Emanuele Serrelli, eds. 2016. Understanding Cultural Traits: A Multidisciplinary Perspective on Cultural Diversity. Springer, Switzerland. DOI 10.1007/978-3-319-24349-8
  • Panebianco, Fabrizio, and Emanuele Serrelli. 2016. “Cultural traits and multidisciplinary dialogue.” Understanding Cultural Traits: A Multidisciplinary Perspective on Cultural Diversity. Springer, Switzerland, Chapter 1.
  • Peck, Steven L. 2008. “The hermeneutics of ecological simulation.” Biology and Philosophy 23(3): 383-402.
  • Penny, David. 2005. “An interpretive review of the origin of life research.” Biology and Philosophy 20(4): 633-671.
  • Pievani, Telmo. 2011. “Born to cooperate? Altruism as exaptation, and the evolution of human sociality.” Origins of Cooperation and Altruism. Eds. Sussman, Robert W. and C. Robert Cloninger. New York: Springer.
  • Pievani, Telmo. 2012a. “Many ways of being human, the Stephen J. Gould’s legacy to Palaeo-Anthropology (2002-2012).” Journal of Anthropological Sciences 90: 33-49.
  • Pievani, Telmo. 2012b. “Geoethics and philosophy of earth sciences: the role of geophysical factors in human evolution.” Annals of Geophysics 55(3): 349-353.
  • Pievani, Telmo, and Emanuele Serrelli. 2011. “Exaptation in human evolution: how to test adaptive vs exaptive evolutionary hypotheses.” Journal of Anthropological Sciences 89:9-23. [DOI 10.4436/jass.89015]
  • Pigliucci, Massimo, and Gerd B. Müller, eds. 2010. Evolution – The Extended Synthesis. Cambridge-London: MIT Press.
  • Piotrowska, Monika. 2013. “From humanized mice to human disease: Guiding extrapolation from model to target.” Biology and Philosophy 28(3): 439-455.
  • Plutynski, Anya. 2005. “Explanatory unification and the early synthesis.” The British Journal for the Philosophy of Science 56(3): 595-609.
  • Plutynski, Anya. 2008. “Ecology and the environment.”  The Oxford Handbook of Philosophy of Biology. Ed. Ruse, Michael. New York: Oxford University Press, pp. 504-524.
  • Popper, Karl R. 1935. Logik der Forschung, Vienna: Julius Springer Verlag. Eng. Tr. The Logic of Scientific Discovery. London: Hutchinson, 1959.
  • Popper, Karl R. 1963. Conjectures and Refutations: The Growth of Scientific Knowledge. London: Routledge.
  • Psillos, Stathis. 2012. “What is General Philosophy of Science?” Journal for General Philosophy of Science 43(1): 93-103.
  • Ramsey, Grant. 2013. “Can fitness differences be a cause of evolution?” Philosophy and Theory in Biology 5: e401.
  • Reydon, Thomas A. C. E. 2005. “Bridging the gap between history and philosophy of biology.” Metascience 14(2): 249-253.
  • Rosenberg, Alexander. 2006. Darwinian Reductionism. Or, How to Stop Worrying and Love Molecular Biology. Chicago: The University of Chicago Press.
  • Ruse, Michael. 2012. The Philosophy of Human Evolution. Cambridge & New York: Cambridge University Press.
  • Schaffner, Kenneth F. 1993. Discovery and Explanation in Biology and Medicine. Chicago: University of Chicago Press.
  • Schaffner, Kenneth F. 1998. “Model organisms and behavioral genetics: a rejoinder.” Philosophy of Science 65:276-288.
  • Serrelli, Emanuele. 2015. “Visualizing macroevolution: from adaptive landscapes to compositions of multiple spaces.” Macroevolution: Explanation, Interpretation and Evidence. Eds. Serrelli, Emanuele, and Nathalie Gontier. Interdisciplinary Evolution Research series, Springer, pp. 113-162. DOI 10.1007/978-3-319-15045-1_4
  • Serrelli, Emanuele. 2016. “Evolutionary genetics and cultural traits in a ‘body of theory’ perspective.” Understanding Cultural Traits. A Multidisciplinary Perspective on Cultural Diversity. Eds. Panebianco, Fabrizio, and Emanuele Serrelli. Springer, Switzerland.
  • Serrelli, Emanuele. 2016. “Removing barriers in scientific research: concepts, synthesis and catalysis.” Understanding Cultural Traits. A Multidisciplinary Perspective on Cultural Diversity. Eds. Panebianco, Fabrizio, and Emanuele Serrelli. Springer, Switzerland.
  • Serrelli, Emanuele, and Nathalie Gontier, eds. 2015. Macroevolution: Explanation, Interpretation and Evidence. Springer. [DOI 10.1007/978-3-319-15045-1]
  • Serrelli, Emanuele, and Nathalie Gontier. 2015. “Macroevolutionary issues and approaches in evolutionary biology.” Macroevolution: Explanation, Interpretation and Evidence. Eds. Serrelli, Emanuele, and Nathalie Gontier. Interdisciplinary Evolution Research series, Springer, pp. 1-25. [DOI 10.1007/978-3-319-15045-1_1]
  • Skipper, Robert A., and Roberta L. Millstein. 2005. “Thinking about evolutionary mechanisms: natural selection.” Studies in the History and Philosophy of Biological and Biomedical Sciences 36: 327-347.
  • Sober, Elliott. 2007. “What is wrong with Intelligent Design?” The Quarterly Review of Biology 82(1): 3-8.
  • Sober, Elliott, and David S. Wilson. 1998. Unto Others: The Evolution and Psychology of Unselfish Behavior. Cambridge, MA: Harvard University Press.
  • Sterelny, Kim. 2003. Thought in a Hostile World: The Evolution of Human Cognition. Oxford: Blackwell.
  • Tattersall, Ian. 2013. “Stephen J. Gould’s intellectual legacy to anthropology.” Stephen J. Gould’s Legacy: Nature, History, Society. Eds. Daniele, Antonio G., Alessandro Minelli, and Telmo Pievani. New York: Springer-Verlag.
  • Van Speybroeck, Linda. 2007. “Philosophy of biology: about the fossilization of disciplines and other embryonic thought.” Acta Biotheoretica 55(1): 47-71.
  • Walsh, Denis M. 2007. “The pomp of superfluous causes: the interpretation of evolutionary theory.” Philosophy of Science 74(3): 281-303.
  • Walsh, Denis M., Tim Lewens, and André Ariew. 2002. “The trials of life: natural selection and random drift.” Philosophy of Science 69(3): 452-473.
  • Waters, C. Kenneth. 1998. “Causal regularities in the biological world of contingent distributions.” Biology and Philosophy 13(1): 5-36.
  • Wilkins, John S. 2011. Species: A History of the Idea. Berkeley, CA: University of California Press.
  • Wilkins, John S., and Malte C. Ebach. 2013. The Nature of Classification: Relationships and Kinds in the Natural Sciences. Palgrave Macmillan.
  • Williams, George C. 1966. Adaptation and Natural Selection: A Critique of Some Current Evolutionary Thought. Princeton University Press.
  • Wilson, David S. 1975. “A theory of group selection.” Proceedings of the National Academy of Sciences USA 72(1): 143-146.
  • Wilson, Edward O. 1975. Sociobiology: The New Synthesis. Cambridge (MA): Harvard University Press.

b. Classics

i. First Generation

  • Beckner, Morton. 1959. The Biological Way of Thought. New York: Columbia University Press.
  • Hull, David L. 1964. “The metaphysics of evolution.” British Journal for the History of Science 3: 309-337.
  • Hull, David L. 1969. “What philosophy of biology is not.” Synthese 20: 157-84.
  • Hull, David L. 1970. “Contemporary systematic philosophies.” Annual Review of Ecology and Systematics 1(1): 19-54.
  • Hull, David L. 1974. Philosophy of Biological Science. Englewood Cliffs, NJ: Prentice-Hall.
  • Hull, David L. 1976. “Are species really individuals?” Systematic Biology 25(2): 174-191.
  • Grene, Marjorie G. 1959. “Two evolutionary theories, I–II.” British Journal for the Philosophy of Science 9: 110–27; 185–93.
  • Grene, Marjorie G., and Everett Mendelsohn, eds. 1976. Topics in the Philosophy of Biology. Dordrecht: D. Reidel.
  • Schaffner, Kenneth F. 1967. “Approaches to reduction.” Philosophy of Science 34: 137-147.
  • Schaffner, Kenneth F. 1967. “Antireductionism and molecular biology.” Science 157: 644-647.
  • Ruse, Michael. 1973. The Philosophy of Biology. London: Hutchinson.
  • Wimsatt, William C. 1972. “Teleology and the logical structure of function statements.” Studies in History and Philosophy of Science 3: 1-80.

ii. Second Generation

  • Amundson, Ron. 1988. “Logical adaptationism.” Behavioral and Brain Sciences 11: 505-506.
  • Amundson, Ron. 1989. “The trials and tribulations of selectionist explanations.” Issues in Evolutionary Epistemology. Eds. Hahlweg, Kai, and Clifford A. Hooker. State University of New York Press.
  • Beatty, John. 1980. “Optimal-design models and the strategy of model building in evolutionary biology.” Philosophy of Science 47: 532-561.
  • Beatty, John. 1982. “Classes and cladists.” Systematic Zoology 31: 25-34.
  • Brandon, Robert. 1990. Adaptation and Environment. Cambridge: MIT Press.
  • Burian, Richard M. 1988. “Challenges to the evolutionary synthesis.” Evolutionary Biology 23: 247-259.
  • Darden, Lindley. 1977. “William Bateson and the promise of Mendelism.” Journal of the History of Biology 10: 87-106.
  • Darden, Lindley. 1976. “Reasoning in scientific change: Charles Darwin, Hugo de Vries, and the discovery of segregation.” Studies in History and Philosophy of Science 7: 127-169.
  • Depew, David, and Bruce H. Weber. 1985. Evolution at a Crossroads: The New Biology and the New Philosophy of Science. Cambridge, MA: Bradford Books/MIT Press.
  • Depew, David, and Bruce H. Weber. 1995. Darwinism Evolving: Systems Dynamics and the Genealogy of Natural Selection. Cambridge, MA: Bradford Books/MIT Press.
  • Dupré, John A. 1987. “Human kinds.” The Latest on the Best. Ed. Dupré, John A. Bradford Books/MIT Press, pp. 327-348.
  • Gieryn, Thomas F. 1999. Cultural Boundaries of Science: Credibility on the Line. Chicago: University of Chicago Press.
  • Griesemer, James R. 1988. “Genes, memes and demes.” Biology and Philosophy 3:179-184.
  • Griesemer, James R., and Michael J. Wade. 1988. “Laboratory models, causal explanation and group selection.” Biology and Philosophy 3: 67-96.
  • Kitcher, Philip. 1981. “Explanatory unification.” Philosophy of Science 48: 507-531.
  • Kitcher, Philip. 1982. “Genes.” British Journal for the Philosophy of Science 33: 337-359.
  • Kitcher, Philip. 1984. “Species.” Philosophy of Science 51: 308-333.
  • Lloyd, Elisabeth A. 1983. “The nature of Darwin’s support for the theory of natural selection.” Philosophy of Science 50: 112-129.
  • Lloyd, Elisabeth A. 1984. “A Semantic approach to the structure of population genetics.” Philosophy of Science 51: 242-264.
  • Lloyd, Elisabeth A. 1988. The Structure and Confirmation of Evolutionary Theory. Greenwood Press.
  • Mills, Susan K., and John Beatty. 1979. “The propensity interpretation of fitness.” Philosophy of Science 46: 263-286.
  • Rosenberg, Alexander. 1978. “The supervenience of biological concepts.” Philosophy of Science 45: 368-386.
  • Rosenberg, Alexander. 1982. “On the propensity definition of fitness.” Philosophy of Science 49: 268-273.
  • Sober, Elliott. 1983. “Parsimony in systematics: philosophical issues.” Annual Review of Ecology and Systematics 14(1): 335-357.
  • Sober, Elliott. 1983. “Equilibrium explanation.” Philosophical Studies 43: 201- 210.
  • Sober, Elliott. 1984. The Nature of Selection. Evolutionary Theory in Philosophical Focus. Chicago: University of Chicago Press.
  • Sober, Elliott. 1988. Reconstructing the Past: Parsimony, Evolution, and Inference. Cambridge: Cambridge University Press.
  • Weber, Bruce H., James Smith, and David Depew. 1988. Entropy, Information, and Evolution: New Perspectives on Physical and Biological Evolution. Cambridge, MA: Bradford Books/MIT Press.

c. Contemporary

i. Reviews

  • Byron, Jason M. 2007. “Whence philosophy of biology?” The British Journal for the Philosophy of Science 58(3): 409-422.
  • Callebaut, Werner. 2005. “Again, what the philosophy of biology is not.” Acta Biotheoretica 23: 93-122.
  • Griffiths, Paul E. 2011. “Philosophy of biology.” The Stanford Encyclopedia of Philosophy (Summer 2011 Edition). Ed. Zalta, Edward N. URL = <http://plato.stanford.edu/archives/sum2011/entries/biology-philosophy/>.
  • Hull, David L. 2002. “Recent philosophy of biology: a review.” Acta Biotheoretica 50: 117-128.
  • Gers, Matt. 2009. “The long reach of philosophy of biology.” Biology and Philosophy 26(3): 439-447.
  • Müller-Wille, Staffan. 2007. “Philosophy of biology beyond evolution.” Biological Theory 2(1): 111-112.
  • O’Malley, Maureen A., and John A. Dupré. 2007. “Size doesn’t matter: towards a more inclusive philosophy of biology.” Biology and Philosophy 22(2): 155-191.
  • Pradeu, Thomas. 2009. “What philosophy of biology should be.” Biology and Philosophy 26(1): 119-127.

ii. Some Monographs

  • Pigliucci, Massimo, and Jonathan Kaplan. 2006. Making Sense of Evolution. Chicago: University of Chicago Press.
  • Sober, Elliott. 2008. Evidence and Evolution: The Logic Behind the Science. Cambridge University Press.
  • Sober, Elliott, and David S. Wilson. 1998. Unto Others: the Evolution and Psychology of Unselfish Behavior. Cambridge (MA): Harvard University Press.

d. Anthologies and Textbooks

  • Ayala, Francisco, and Robert Arp, eds. 2009. Contemporary Debates in Philosophy of Biology. Wiley-Blackwell.
  • Callebaut, Werner, ed. 1993. Taking the Naturalistic Turn: Or, How Real Philosophy of Science is Done. Chicago: University of Chicago Press.
  • Grene, Marjorie G., and David Depew. 2004. The Philosophy of Biology: An Episodic History. Cambridge: Cambridge University Press.
  • Hull, David L., and Michael Ruse, eds. 1998. The Philosophy of Biology. Oxford: Oxford University Press.
  • Hull, David L., and Michael Ruse, eds. 2007. The Cambridge Companion to the Philosophy of Biology. Cambridge: Cambridge University Press.
  • Kampourakis, Kostas, ed. 2013. The Philosophy of Biology: A Companion for Educators. Springer Science & Business.
  • Linquist, Stefan, ed. 2010. Philosophy of Evolutionary Biology. Vol. 1. Burlington, VT: Ashgate.
  • Rosenberg, Alexander, and Robert Arp, eds. 2009. Philosophy of Biology: An Anthology. Wiley-Blackwell.
  • Rosenberg, Alexander, and Daniel W. McShea. 2008. Philosophy of Biology: A Contemporary Introduction. Routledge.
  • Ruse, Michael, ed. 1989. What the Philosophy of Biology is. Dordrecht, The Netherlands: Kluwer.
  • Ruse, Michael, ed. 2008. The Oxford Handbook of Philosophy of Biology. New York: Oxford University Press.
  • Sarkar, Sahotra, and Anya Plutynski. 2008. A Companion to the Philosophy of Biology. Blackwell Publishing.
  • Sober, Elliott. 1993. Philosophy of Biology. Westview Press/Oxford University Press.
  • Sober, Elliott. 2006. Conceptual Issues in Evolutionary Biology. Cambridge, MA: MIT Press.
  • Sterelny, Kim, and Paul E. Griffiths. 1999. Sex and Death: An Introduction to the Philosophy of Biology. Chicago: The University of Chicago Press.

e. Journals

i. Dedicated

  • Acta Biotheoretica (since 1935)
  • History and Philosophy of the Life Sciences (since 1979)
  • Biology and Philosophy
  • Journal of the History of Biology
  • Studies in History and Philosophy of Biological and Biomedical Sciences
  • Biological Theory
  • Studies in History and Philosophy of Science Part A
  • Philosophical Transactions of the Royal Society B
  • Philosophy and Theory in Biology (since 2009)

ii. Generalist

  • Philosophy of Science
  • The British Journal for the Philosophy of Science
  • Foundations of Science
  • Metascience
  • Studies in History and Philosophy of Science

f. Organizations

  • International Society for the History, Philosophy, and Social Studies of Biology (http://www.ishpssb.org/)
  • Philosophy of Science Association (PSA) (http://www.philsci.org/)
  • Society for Philosophy of Science in Practice (SPSP) (http://www.philosophy-science-practice.org/)
  • Inter-Divisional Teaching Commission (IDTC) of the International Union for the History and Philosophy of Science (IUHPS): http://www.idtc-iuhps.com/

g. Online Resources

  • T. Lewens, “The philosophy of biology: a selection of readings and resources” page at the University of Cambridge: http://www.hps.cam.ac.uk/research/philofbio.html

Author Information

Emanuele Serrelli
Email: emanuele.serrelli@epistemologia.eu
University of Milano-Bicocca
Italy

Ancient Greek Philosophy

From Thales, who is often considered the first Western philosopher, to the Stoics and Skeptics, ancient Greek philosophy opened the doors to a particular way of thinking that provided the roots for the Western intellectual tradition. Here, there is often an explicit preference for the life of reason and rational thought. We find proto-scientific explanations of the natural world in the Milesian thinkers, and we hear Democritus posit atoms—indivisible and invisible units—as the basic stuff of all matter. With Socrates comes a sustained inquiry into ethical matters—an orientation towards human living and the best life for human beings. With Plato comes one of the most creative and flexible ways of doing philosophy, which some have since attempted to imitate by writing philosophical dialogues covering topics still of interest today in ethics, political thought, metaphysics, and epistemology. Plato’s student, Aristotle, was one of the most prolific of ancient authors. He wrote treatises on each of these topics, as well as on the investigation of the natural world, including the composition of animals. The Hellenists—Epicurus, the Cynics, the Stoics, and the Skeptics—developed schools or movements devoted to distinct philosophical lifestyles, each with reason at its foundation.

With this preference for reason came a critique of traditional ways of living, believing, and thinking, which sometimes caused political trouble for the philosophers themselves. Xenophanes directly challenged the traditional anthropomorphic depiction of the gods, and Socrates was put to death for allegedly inventing new gods and not believing in the gods mandated by the city of Athens. After the death of Alexander the Great, and because of Aristotle’s ties with Alexander and his court, Aristotle escaped the same fate as Socrates by fleeing Athens. Epicurus, like Xenophanes, claimed that the mass of people is impious, since the people conceive of the gods as little more than superhumans, even though human characteristics cannot appropriately be ascribed to the gods. In short, not only did ancient Greek philosophy pave the way for the Western intellectual tradition, including modern science, but it also shook cultural foundations in its own time.

Table of Contents

  1. Presocratic Thought
    1. The Milesians
    2. Xenophanes of Colophon
    3. Pythagoras and Pythagoreanism
    4. Heraclitus
    5. Parmenides and Zeno
    6. Anaxagoras
    7. Democritus and Atomism
    8. The Sophists
  2. Socrates
  3. Plato
    1. Background of Plato’s Work
    2. Metaphysics
    3. Epistemology
    4. Psychology
    5. Ethics and Politics
  4. Aristotle
    1. Terminology
    2. Psychology
    3. Ethics
    4. Politics
    5. Physics
    6. Metaphysics
  5. Hellenistic Thought
    1. Epicureanism
      1. Physics
      2. Ethics
    2. The Cynics
    3. The Stoics
      1. Physics
      2. Epistemology
      3. Ethics
    4. Skeptics
      1. Academic Skepticism
      2. Pyrrhonian Skepticism
  6. Post-Hellenistic Thought
    1. Plotinus
      1. Intellect, Soul, and Matter
      2. The True Self and the Good Life
    2. Later Neoplatonists
    3. Cicero and Roman Philosophy
  7. Conclusion
  8. References and Further Reading
    1. Presocratics
      1. Primary Sources
      2. Secondary Sources
    2. Socrates and Plato
      1. Primary Sources
      2. Secondary Sources
    3. Aristotle
      1. Primary Sources
      2. Secondary Sources
    4. Hellenistic Philosophy
      1. Primary Sources
      2. Secondary Sources

1. Presocratic Thought

An analysis of Presocratic thought presents some difficulties. First, the texts we are left with are primarily fragmentary, and sometimes, as in the case of Anaxagoras, we have no more than a sentence’s worth of verbatim words. Even these purportedly verbatim words often come to us in quotation from other sources, so it is difficult, if not impossible, to attribute with certainty a definite position to any one thinker. Moreover, “Presocratic” has been criticized as a misnomer since some of the Presocratic thinkers were contemporary with Socrates and because the name might imply philosophical primacy to Socrates. The term “Presocratic philosophy” is also difficult since we have no record of Presocratic thinkers ever using the word “philosophy.” Therefore, we must approach cautiously any study of Presocratic thought.

Presocratic thought marks a decisive turn away from mythological accounts towards rational explanations of the cosmos. Indeed, some Presocratics openly criticize and ridicule traditional Greek mythology, while others simply explain the world and its causes in material terms. This is not to say that the Presocratics abandoned belief in gods or things sacred, but there is a definite turn away from attributing causes of material events to gods, and at times a refiguring of theology altogether. The foundation of Presocratic thought is the preference and esteem given to rational thought over mythologizing. This movement towards rationality and argumentation would pave the way for the course of Western thought.

a. The Milesians

Thales (c.624-c.545 B.C.E.), traditionally considered to be the “first philosopher,” proposed a first principle (arche) of the cosmos: water. Aristotle offers some conjectures as to why Thales might have believed this (Graham 29). First, all things seem to derive nourishment from moisture. Next, heat seems to come from or carry with it some sort of moisture. Finally, the seeds of all things have a moist nature, and water is the source of growth for many moist and living things. Some assert that Thales held water to be a component of all things, but there is no evidence in the testimony for this interpretation. It is much more likely, rather, that Thales held water to be a primal source for all things—perhaps the sine qua non of the world.

Like Thales, Anaximander (c.610-c.545 B.C.E.) also posited a source for the cosmos, which he called the boundless (apeiron). That he did not, like Thales, choose a typical element (earth, air, water, or fire) shows that his thinking had moved beyond sources of being that are more readily available to the senses. He might have thought that, since the other elements seem more or less to change into one another, there must be some source beyond all these—a kind of background upon or source from which all these changes happen. Indeed, this everlasting principle gave rise to the cosmos by generating hot and cold, each of which “separated off” from the boundless. How it is that this separation took place is unclear, but we might presume that it happened via the natural force of the boundless. The universe, though, is a continual play of elements separating and combining. In poetic fashion, Anaximander says that the boundless is the source of beings, and that into which they perish, “according to what must be: for they give recompense and pay restitution to each other for their injustice according to the ordering of time” (F1).

If our dates are approximately correct, Anaximenes (c.546-c.528/5 B.C.E.) could have had no direct philosophical contact with Anaximander. However, the conceptual link between them is undeniable. Like Anaximander, Anaximenes thought that there was something boundless that underlies all other things. Unlike Anaximander, Anaximenes made this boundless thing something definite—air. For Anaximander, hot and cold separated off from the boundless, and these generated other natural phenomena (Graham 79). For Anaximenes, air itself becomes other natural phenomena through condensation and rarefaction. Rarefied air becomes fire. When it is condensed, it becomes water, and when it is condensed further, it becomes earth and other earthy things, like stones (Graham 79). This then gives rise to all other life forms. Furthermore, air itself is divine. Both Cicero and Aetius report that, for Anaximenes, air is God (Graham 87). Air, then, changes into the basic elements, and from these we get all other natural phenomena.

b. Xenophanes of Colophon

Xenophanes (c.570-c.478 B.C.E.) directly and explicitly challenged Homeric and Hesiodic mythology. “It is good,” says Xenophanes, “to hold the gods in high esteem,” rather than portraying them in “raging battles, which are worthless” (F2). More explicitly, “Homer and Hesiod have attributed to the gods all things that are blameworthy and disgraceful for human beings: stealing, committing adultery, deceiving each other” (F17). At the root of this poor depiction of the gods is the human tendency towards anthropomorphizing the gods. “But mortals think gods are begotten, and have the clothing, voice and body of mortals” (F19), despite the fact that God is unlike mortals in body and thought. Indeed, Xenophanes famously proclaims that if other animals (cattle, lions, and so forth) were able to draw the gods, they would depict the gods with bodies like their own (F20). Beyond this, all things come to be from earth (F27), not the gods, although it is unclear whence came the earth. The reasoning seems to be that God transcends all of our efforts to make him like us. If everyone paints different pictures of divinity, and many people do, then it is unlikely that God fits into any of those frames. So, holding “the gods in high esteem” at least entails something negative, that is, that we take care not to portray them as superhumans.

c. Pythagoras and Pythagoreanism

Pythagorean influence left a strong presence and legacy in ancient thought, and yet little is known with certainty about Pythagoras of Samos (c.570-c.490 B.C.E.). Many know Pythagoras for his eponymous theorem—the square of the hypotenuse of a right triangle is equal to the sum of the squares of the other two sides. Whether Pythagoras himself invented the theorem, or whether he or someone else brought it back from Egypt, is unknown. He developed a following that continued long past his death, on down to Philolaus of Croton (c.470-c.399 B.C.E.), a Pythagorean from whom we may gain some insight into Pythagoreanism. Whether or not the Pythagoreans followed a particular doctrine is up for debate, but it is clear that, with Pythagoras and the Pythagoreans, a new way of thinking was born in ancient philosophy that had a significant impact on Platonic thought.
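For readers who want the theorem in symbols, it can be put as follows in modern notation (not, of course, Pythagoras’ own), where a and b are the legs of a right triangle and c is its hypotenuse; the familiar 3-4-5 triangle serves as an illustration:

\[
a^{2} + b^{2} = c^{2}, \qquad \text{for example } 3^{2} + 4^{2} = 9 + 16 = 25 = 5^{2}.
\]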

The Pythagoreans believed in the transmigration of souls. The soul, for Pythagoras, finds its immortality by cycling through all living beings in a 3,000-year cycle, until it returns to a human being (Graham 915). Indeed, Xenophanes tells the story of Pythagoras walking by a puppy who was being beaten. Pythagoras cried out that the beating should cease, because he recognized the soul of a friend in the puppy’s howl (Graham 919). What exactly the Pythagorean psychology entails for a Pythagorean lifestyle is unclear, but we pause to consider some of the typical characteristics reported of and by Pythagoreans.

Plato and Aristotle tended to associate the holiness and wisdom of number—and along with this, harmony and music—with the Pythagoreans (Graham 499). Perhaps more basic than number, at least for Philolaus, are the concepts of the limited and unlimited. Nothing in the cosmos can be without limit (F1), including knowledge (F4). Imagine if nothing were limited, but matter were just an enormous heap or morass. Next, suppose that you are somehow able to gain a perspective of this morass (to do so, there must be some limit that gives you that perspective!). Presumably, nothing at all could be known, at least not with any degree of precision, the most careful observation notwithstanding. Additionally, all known things have number, which functions as a limit of things insofar as each thing is a unity, or composed of a plurality of parts.

d. Heraclitus

Heraclitus of Ephesus (c.540-c.480 B.C.E.) stands out in ancient Greek philosophy not only with respect to his ideas, but also with respect to how those ideas were expressed. His aphoristic style is rife with wordplay and conceptual ambiguities. Heraclitus saw reality as composed of contraries—a reality whose continual process of change is precisely what keeps it at rest.

Fire plays a significant role in his picture of the cosmos. No God or man created the cosmos, but it always was, is, and will be fire. At times it seems as though fire, for Heraclitus, is a primary element from which all things come and to which they return. At others, his comments on fire could easily be seen metaphorically. What is fire? It is at once “need and satiety.” This back and forth, or better yet, this tension and distension is characteristic of life and reality—a reality that cannot function without contraries, such as war and peace. “A road up and down is one and the same” (F38). Whether one travels up the road or down it, the road is the same road. “On those stepping into rivers staying the same other and other waters flow” (F39). In his Cratylus, Plato quotes Heraclitus, via the mouthpiece of Cratylus, as saying that “you could not step twice into the same river,” comparing this to the way everything in life is in constant flux (Graham 158). This, according to Aristotle, supposedly drove Cratylus to the extreme of never saying anything for fear that the words would attempt to freeze a reality that is always fluid, and so, Cratylus merely pointed (Graham 183). So, the cosmos and all things that make it up are what they are through the tension and distention of time and becoming. The river is what it is by being what it is not. Fire, or the ever-burning cosmos, is at war with itself, and yet at peace—it is constantly wanting fuel to keep burning, and yet it burns and is satisfied.

e. Parmenides and Zeno

If it is true that for Heraclitus life thrives and even finds stillness in its continuous movement and change, then for Parmenides of Elea (c.515-c.450 B.C.E.) life is at a standstill. Parmenides was a pivotal figure in Presocratic thought, and one of the most influential of the Presocratics in determining the course of Western philosophy. According to McKirahan, Parmenides is the inventor of metaphysics (157)—the inquiry into the nature of being or reality. While the tenets of his thought have their home in poetry, they are expressed with the force of logic. The Parmenidean logic of being thus sparked a long lineage of inquiry into the nature of being and thinking.

Parmenides recorded his thought in the form of a poem. In it, there are two paths that mortals can take—the path of truth and the path of error. The first path is the path of being or what-is. The right way of thinking is to think of what-is, and the wrong way is to think both what-is and what-is-not. The latter is wrong, simply because non-being is not. In other words, there is no non-being, so properly speaking, it cannot be thought—there is nothing there to think. We can think only what is and, presumably, since thinking is a type of being, “thinking and being are the same” (F3). It is only our long entrenched habits of sensation that mislead us into thinking down the wrong path of non-being. The world, and its appearance of change, thrusts itself upon our senses, and we erroneously believe that what we see, hear, touch, taste, and smell is the truth. But, if non-being is not, then change is impossible, for when anything changes, it moves from non-being to being. For example, for a being to grow tall, it must have at some point not been tall. Since non-being is not and cannot therefore be thought, we are deluded into believing that this sort of change actually happens. Similarly, what-is is one. If there were a plurality, there would be non-being, that is, this would not be that. Parmenides thus argues that we must trust in reason alone.

In the Parmenidean tradition, we have Zeno (c.490-c.430 B.C.E.). As Daniel Graham says, while “Parmenides argues for monism, Zeno argues against pluralism” (Graham 245). Zeno seems to have composed a text wherein he claims to show the absurdity in accepting that there is a plurality of beings, and he also shows that motion is impossible. Zeno shows that if we attempt to count a plurality, we end up with an absurdity. If there were a plurality, then it would be neither more nor less than the number that it would have to be. Thus, there would be a finite number of things. On the other hand, if there were a plurality, then the number would be infinite because there is always something else between existing things, and something else between those, and something else between those, ad infinitum. Thus, if there were a plurality of things, then that plurality would be both infinite and finite in number, which is absurd (F4).

The most enduring paradoxes are those concerned with motion. It is impossible for a body in motion to traverse, say, a distance of twenty feet. In order to do so, the body must first arrive at the halfway point, or ten feet. But in order to arrive there, the body in motion must travel five feet. But in order to arrive there, the body must travel two and a half feet, ad infinitum. Since, then, space is infinitely divisible, but we have only a finite time to traverse it, it cannot be done. Presumably, one could not even begin a journey at all. The “Achilles Paradox” similarly attacks motion saying that swift-footed Achilles will never be able to catch up with the slowest runner, assuming the runner started at some point ahead of Achilles. Achilles must first reach the place where the slow runner began. This means that the slow runner will already be a bit beyond where he began. Once Achilles progresses to the next place, the slow runner is already beyond that point, too. Thus, motion seems absurd.
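As a modern gloss (the notation is ours, not Zeno’s), the first paradox can be written out for the twenty-foot distance: letting d = 20 feet, the runner must first reach the point d/2, before that the point d/4, before that d/8, and so on without end. The whole distance thereby decomposes into infinitely many segments,

\[
\frac{d}{2} + \frac{d}{4} + \frac{d}{8} + \cdots = \sum_{n=1}^{\infty} \frac{d}{2^{n}} = d,
\]

so that infinitely many sub-traversals must be completed before the finite distance is covered, which is precisely the step Zeno takes to be impossible in a finite time.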

f. Anaxagoras

Anaxagoras of Clazomenae (c.500-c.428 B.C.E.) held a perspective on the nature of matter and the causes of its generation and corruption that was, up until that time, unlike any other. Closely predating Plato (Anaxagoras died around the time that Plato was born), Anaxagoras left his impression upon Plato and Aristotle, although they were both ultimately dissatisfied with his cosmology (Graham 309-313). He seems to have been almost exclusively concerned with cosmology and the true nature of all that is around us.

Before the cosmos was as it is now, it was nothing but a great mixture—everything was in everything. The mixture was so thoroughgoing that no part of it was recognizable due to the smallness of each thing, and not even colors were perceptible. He considered matter to be infinitely divisible. That is, because it is impossible for being not to be, there is never a smallest part, but there is always a smaller part. If the parts of the great mixture were not infinitely divisible, then we would be left with a smallest part. Since the smallest part could not become smaller, any attempt at dividing it again would presumably obliterate it.

The most important player in this continuous play of being is mind (nous). Although mind can be in some things, nothing else can be in it—mind is unmixed. We recall that, for Anaxagoras, everything is mixed with everything. There is some portion of everything in anything that we identify. Thus, if anything at all were mixed with mind, then everything would be mixed with mind. This mixture would obstruct mind’s ability to rule all else. Mind is in control, and it is responsible for the great mixture of being. Everlasting mind—the most pure of all things—is responsible for ordering the world.

Anaxagoras left his mark on the thought of both Plato and Aristotle, whose critiques of Anaxagoras are similar. In Plato’s Phaedo, Socrates recounts in brief his intellectual history, citing his excitement over his discovery of Anaxagoras’ thought. He was most excited about mind as an ultimate cause of all. Yet, Socrates complains, Anaxagoras made very little use of mind to explain what was best for each of the heavenly bodies in their motions, or the good of anything else. That is, Socrates seems to have wanted some explanation as to why it is good for all things to be as they are (Graham 309-311). Aristotle, too, complains that Anaxagoras makes only minimal use of his principle of mind. It becomes, as it were, a deus ex machina, that is, whenever Anaxagoras was unable to give any other explanation for the cause of a given event, he fell back upon mind (Graham 311-313). It is possible, as always, that both Plato and Aristotle resort here to a straw man of sorts in order to advance their own positions. Indeed, we have seen that Mind set the great mixture into motion, and then ordered the cosmos as we know it. This is no insignificant feat.

g. Democritus and Atomism

Ancient atomism began a legacy in philosophical and scientific thought, and this legacy was revived and significantly developed in modern philosophy. In contemporary physics, the atom is no longer the smallest particle. Etymologically, however, atomos is that which is uncut or indivisible. The ancient atomists, Leucippus and Democritus (c. fifth century B.C.E.), were concerned with the smallest particles in nature that make up reality—particles that are both indivisible and invisible. They were to some degree responding to Parmenides and Zeno by indicating atoms as indivisible sources of motion.

Atoms—the most compact and the only indivisible bodies in nature—are infinite in number, and they constantly move through an infinite void. In fact, motion would be impossible, says Democritus, without the void. If there were no void, the atoms would have nothing through which to move. Atoms take on a variety, perhaps an infinite variety, of shapes. Some are round, others are hooked, and yet others are jagged. They often collide with one another, and often bounce off of one another. Sometimes, though, the shapes of the colliding atoms are amenable to one another, and they come together to form the matter that we identify as the sensible world (F5). This combination, too, would be impossible without the void. Atoms need a background (emptiness) out of which they are able to combine (Graham 531). Atoms then stay together until some larger environmental force breaks them apart, at which point they resume their constant motion (F5). Why certain atoms come together to form a world seems up to chance, and yet many worlds have been, are, and will be formed by atomic collision and coalescence (Graham 551). Once a world is formed, however, all things happen by necessity—the causal laws of nature dictate the course of the natural world (Graham 551-553).

h. The Sophists

Much of what is transmitted to us about the Sophists comes from Plato. In fact, two of Plato’s dialogues are named after Sophists, Protagoras and Gorgias, and one is called simply The Sophist. Beyond this, typical themes of sophistic thought often make their way into Plato’s work, not the least of which are the similarities between Socrates and the Sophists (an issue explicitly addressed in the Apology and elsewhere). Thus, the Sophists had no small influence on fifth-century Greece and Greek thought.

Broadly, the Sophists were a group of itinerant teachers who charged fees to teach on a variety of subjects, with rhetoric as the preeminent subject in their curriculum. A common characteristic among many, but perhaps not all, Sophists seems to have been an emphasis upon arguing for each of the opposing sides of a case. Thus, these argumentative and rhetorical skills could be useful in law courts and political contexts. However, these sorts of skills also tended to earn many Sophists their reputation as moral and epistemological relativists, which for some was tantamount to intellectual fraud.

One of the earliest and most famous Sophists was Protagoras (c.490-c.420 B.C.E.). Only a handful of fragments of his thought exist, and the bulk of the remaining information about him found in Plato’s dialogues should be read cautiously. He is most famous for the apparently relativistic statement that human beings are “the measure of all things, of things that are that they are, of things that are not that they are not” (F1b). Plato, at least for the purposes of the Theaetetus, reads individual relativism out of this statement. For example, if the pool of water feels cold to Henry, then it is in fact cold for Henry, while it might appear warm, and therefore be warm for Jennifer. This example portrays perceptual relativism, but the same could go for ethics as well, that is, if X seems good to Henry, then X is good for him, but it might be bad in Jennifer’s judgment. The problem with this view, however, is that if all things are relative to the observer/judge, then the idea that all things are relative is itself relative to the person who asserts it. The idea of communication is then rendered incoherent since each person has his or her own private meaning.

On the other hand, Protagoras’ statement could be interpreted as species-relative. That is, the question of whether and how things are, and whether and how things are not, is a question that has meaning (ostensibly) only for human beings. Thus, all knowledge is relative to us as human beings, and therefore limited by our being and our capabilities. This reading seems to square with the other of Protagoras’ most famous statements: “Concerning the gods, I cannot ascertain whether they exist or whether they do not, or what form they have; for there are many obstacles to knowing, including the obscurity of the question and the brevity of human life” (F3). It is implied here that knowledge is possible, but that it is difficult to attain, and that it is impossible to attain when the question is whether or not the gods exist. We can also see here that human finitude is a limit not only upon human life but also upon knowledge. Thus, if there is knowledge, it is for human beings, but it is obscure and fragile.

Along with Protagoras was Gorgias (c.485-c.380 B.C.E.), another Sophist whose name became the title of a Platonic dialogue. Perhaps flashier than Protagoras when it came to rhetoric and speech making, Gorgias is known for his sophisticated and poetic style. He is known also for extemporaneous speeches, taking audience suggestions for possible topics upon which he would speak at length. His most well-known work is On Nature, Or On What-Is-Not wherein he, contrary to Eleatic philosophy, sets out to show that neither being nor non-being is, and that even if there were anything, it could be neither known nor spoken. It is unclear whether this work was in jest or in earnest. If it was in jest, then it was likely an exercise in argumentation as much as it was a gibe at the Eleatics. If it was in earnest, then Gorgias could be seen as an advocate for extreme skepticism, relativism, or perhaps even nihilism (Graham 725).

2. Socrates

Socrates (469-399 B.C.E.) wrote nothing, so what stories and information we have about him come to us primarily from Xenophon (430-354 B.C.E.) and Plato. Both Xenophon and Plato knew Socrates, and wrote dialogues in which Socrates usually figures as the main character, but their versions of certain historical events in Socrates’ life are sometimes incompatible. We cannot be sure if or when Xenophon or Plato is reporting about Socrates with historical accuracy. In some cases, we can be sure that they are intentionally not doing so, but merely using Socrates as a mouthpiece to advance philosophical dialogue (Döring 25). Xenophon, in his Memorabilia, wrote some biographical information about Socrates, but we cannot know how much is fabricated or embellished. When we refer to Socrates, we are typically referring to the Socrates of one of these sources and, more often than not, Plato’s version.

Socrates was the son of a sculptor, Sophroniscus, and grew up an Athenian citizen. He was reported to be gifted with words and was sometimes accused of doing what Plato later accused the Sophists of doing, that is, using rhetorical devices to “make the weaker argument the stronger.” Indeed, Xenophon reports that the Thirty Tyrants forbade Socrates to speak publicly except on matters of practical business because his clever use of words seemed to lead young people astray (Book I, II.33-37). Similarly, Aristophanes presents Socrates as an impoverished sophist whose head was in the clouds to the detriment of his daily, practical life. Moreover, his similarities with the Sophists are even highlighted in Plato’s work. Indeed, Socrates’ courtroom speech in Plato’s Apology includes a defense against accusations of sophistry (18c).

While Xenophon and Plato both recognize this rhetorical Socrates, they both present him as a virtuous man who used his skills in argumentation for truth, or at least to help remove himself and his interlocutors from error. The so-called Socratic method, or elenchos, refers to the way in which Socrates often carried out his philosophical practice, a method to which he seems to refer in Plato’s Apology (Benson 180-181). Socrates aimed to expose errors or inconsistencies in his interlocutors’ positions. He did so by asking them questions, often demanding yes-or-no answers, and then reduced their positions to absurdity. He was, in short, aiming for his interlocutor to admit his own ignorance, especially where the interlocutor thought that he knew what he did not in fact know. Thus, many Platonic dialogues end in aporia, an impasse in thought—a place of perplexity about the topic originally under discussion (Brickhouse and Smith 3-4). This is presumably the place from which a thoughtful person can then make a fresh start on the way to seeking truth.

Socrates practiced philosophy openly, charged no fees for doing so, and allowed anyone who wished to engage with him to do so. Xenophon says:

Socrates lived ever in the open; for early in the morning he went to the public promenades and training-grounds; in the forenoon he was seen in the market; and the rest of the day he passed just where most people were to be met: he was generally talking, and anyone might listen. (Memorabilia, Book I, i.10)

The “talking” that Socrates did was presumably philosophical in nature, and this talk was focused primarily on morality. Indeed, as John Cooper claims in his introduction to Plato: Complete Works, Socrates “denied that he had discovered some new wisdom, indeed that he possessed any wisdom at all,” contrary to his predecessors, such as Anaxagoras and Parmenides. Often his discussions had to do with topics of virtue—justice, courage, temperance, and wisdom (Memorabilia, Book I, i.16). This sort of open practice made Socrates well known but also unpopular, which eventually led to his execution.

Socrates’ elenchos, as he recognizes in Plato’s Apology (from apologia, “defense”), made him unpopular. Lycon (about whom little is known), Anytus (an influential politician in Athens), and Meletus (a poet) accused Socrates of not worshipping the gods mandated by Athens (impiety) and of corrupting the youth through his persuasive power of speech. In his Meno, Plato hints that Anytus was already personally angry with Socrates. Anytus has just warned Socrates to “be careful” in the way he speaks about famous people (94e). Socrates then tells Meno, “I think, Meno, that Anytus is angry, and I am not at all surprised. He thinks…that I am slandering those men, and then he believes himself to be one of them” (95a). This is not surprising, if indeed Socrates practiced philosophy in the way that both Xenophon and Plato report that he did, by exposing the ignorance of his interlocutors.

Socrates claims to have ventured down the path of philosophy because of a proclamation from the Oracle at Delphi. Socrates’ enthusiastic follower, Chaerephon, reportedly visited the Oracle at Delphi to ask the god whether anyone among the Athenians was wiser than Socrates. The god replied that no one was wiser than Socrates. Socrates, who claims never to have been wise, wondered what this meant. So, in order to understand better the god’s claim, Socrates questioned Athenians from all social strata about their wisdom. In Plato’s Apology, Socrates claims that most people he questioned claimed to know what they did not in fact know (21-22). As a result of showing so many people their own ignorance, or at least trying to, Socrates became unpopular (23a). This unpopularity is eventually what killed him. To add to his unpopularity, Socrates claimed that the Oracle was right, but only in the respect that he had “human wisdom,” that is, the wisdom to recognize what one does not know, and to know that such wisdom is relatively worthless (23b).

Xenophon, too, wrote his own account of Socrates’ defense. Xenophon attributes the accusation of impiety to Socrates’ daimon, or personal god, much like a voice of conscience, who forbade Socrates from doing anything that would not be truly beneficial for him. Both Xenophon (4-7) and Plato (40b) claim that it was this daimon who prevented Socrates from making such a defense as would exonerate him. That is, the daimon did not dissuade Socrates from his sentence of death. In Xenophon’s account, the Oracle claimed that no one was “more free than [Socrates], or more just, or more prudent” (Apology 14). Xenophon’s version might differ from Plato’s since Xenophon, a military leader, wanted to emphasize characteristics Socrates exuded that might also make for good characteristics in a statesman (O’Connor 66). At any rate, Xenophon has Socrates recognize his own unpopularity. Also, like Plato, Xenophon recognizes that Socrates held knowledge of oneself and the recognition of one’s own ignorance in high esteem (Memorabilia, Book III, ix. 6-7).

Socrates practiced philosophy daily, in an effort to know himself, even in the face of his own death. In Plato’s Crito, in which Crito comes to Socrates’ prison cell to persuade Socrates to escape, Socrates wants to know whether escaping would be just, and imminent death does not deter him from seeking an answer to that question. He and Crito first establish that doing wrong willingly is always bad, and this includes returning wrong for wrong (49b-c). Then, personifying Athenian law, Socrates establishes that escaping prison would be wrong. While he acknowledges that he was wrongly found to be guilty of impiety and corrupting the youth, the legal process itself ran according to law, and to escape would be to “wrong” the laws under which he was raised and to which, by virtue of being a life-long Athenian, he agreed to assent.

Plato’s Phaedo presents us with the story of Socrates’ last day on earth. In it, he famously claims that philosophy is practice for dying and death (64a). Indeed, he spends his final hours with his friends discussing a very relevant and pressing philosophical issue, that is, the immortality of the soul. Socrates is presented to us as a man who, even in his final hours, wanted nothing more than to pursue wisdom. In Plato’s Euthyphro, Socrates aims to dissuade Euthyphro from indicting his own father for murder. Euthyphro, a priest, claims that what he is doing—prosecuting a wrongdoer—is pious. Socrates then uses his elenchos to show that Euthyphro does not actually know what piety is. Once he is thoroughly confused and frustrated, Euthyphro says, “it is a considerable task to acquire any precise knowledge of these things [that is, piety]” (14b). Nevertheless, Euthyphro offers yet another definition of “piety.” Socrates’ response is the key to understanding the dialogue: “You could tell me in far fewer words, if you were willing, the sum of what I asked…You were on the verge of doing so, but you turned away. If you had given that answer, I should now have acquired from you sufficient knowledge of the nature of piety” (14c1-c4). It is, in other words, the very act of philosophizing—the recognizing of one’s own ignorance and the search for wisdom—that is piety. Socrates, we are told, continued this practice even in the final hours of his life.

3. Plato

Plato (427-347 B.C.E.) was the son of Athenian aristocrats. He grew up in a time of upheaval in Athens, especially at the conclusion of the Peloponnesian War, when Athens was conquered by Sparta. Debra Nails says, “Plato would have been 12 when Athens lost her empire with the revolt of the subject allies; 13 when democracy fell briefly to the oligarchy of Four Hundred…; [and] 14 when democracy was restored” (2). We cannot be sure when he met Socrates. Although ancient sources report that he became Socrates’ follower at age 18, he might have met Socrates much earlier through the relationship between Socrates and Plato’s uncle, Charmides, in 431 B.C.E. (Taylor 3). He might have known Socrates, too, through his “musical” education, which would have consisted of anything under the purview of the muses, that is, everything from dancing to reading, writing, and arithmetic (Nails 2). He also seems to have spent time with Cratylus, the Heraclitean, which probably had an impact primarily on his metaphysics and epistemology.

Plato had aspirations for the political life, but several untoward events pushed him away from the life of political leadership, not the least of which was Socrates’ trial and conviction. While the authenticity of Plato’s Seventh Letter is debated among scholars, it might give us some insight into Plato’s biography:

At last I came to the conclusion that all existing states are badly governed and the condition of their laws practically incurable, without some miraculous remedy and the assistance of fortune; and I was forced to say, in praise of true philosophy, that from her height alone was it possible to discern what the nature of justice is, either in the state or in the individual, and that the ills of the human race would never end until either those who are sincerely and truly lovers of wisdom [that is, philosophers] come into political power, or the rulers of our cities, by the grace of God, learn true philosophy. (Letter VII)

Plato saw any political regime without the aid of philosophy or fortune as fundamentally corrupt. This attitude, however, did not turn Plato entirely from politics. He visited Sicily three times; two of these trips were failed attempts to turn the tyrant Dionysius II to the life of philosophy. He thus returned to Athens and focused his efforts on the philosophical education he had begun at his Academy (Nails 5).

a. Background of Plato’s Work

Since Plato wrote dialogues, there is a fundamental difficulty with any effort to identify just what Plato himself thought. Plato never appears in the dialogues as an interlocutor. If he was voicing any of his own thoughts, he did it through the mouthpiece of particular characters in the dialogues, each of which has a particular historical context. Thus, any pronouncement about Plato’s “theory” of this or that must be tentative at best. As John Cooper says,

Although everything any speaker says is Plato’s creation, he also stands before it all as the reader does: he puts before us, the readers, and before himself as well, ideas, arguments, theories, claims, etc. for all of us to examine carefully, reflect on, follow out the implications of—in sum, to use as a springboard for our own further philosophical thought. (Cooper xxii)

Thus, while we can indubitably highlight recurring themes and theoretical insights throughout Plato’s work, we must be wary of committing Plato in any wholesale fashion to a particular view.

b. Metaphysics

Perhaps the most famous of Plato’s metaphysical concepts is his notion of the so-called “forms” or “ideas.” The Greek words that we translate as “form” or “idea” are eidos and idea. Both of these words are rooted in verbs of seeing. Thus, the eidos of something is its look, shape, or form. But, as many philosophers do, Plato manipulates this word and has it refer to immaterial entities. Why is it that one can recognize that a maple is a tree, an oak is a tree, and a Japanese fir is a tree? What is it that unites all of our concepts of various trees under a unitary category of Tree? It is the form of “tree” that allows us to understand anything about each and every tree, but Plato does not stop there.

The forms can be interpreted not only as purely theoretical entities, but also as immaterial entities that give being to material entities. Each tree, for example, is what it is insofar as it participates in the form of Tree. Each human being is different from the next, but each is human to the extent that he/she participates in the form of Human Being. This material-immaterial emphasis seems directed ultimately towards Plato’s epistemology. That is, if anything can be known, it is the forms. Since things in the world are changing and temporal, we cannot know them; therefore, if knowledge is to be certain and clear, the forms must be unchanging and eternal beings that give being to all the changing and temporal beings in the world. In other words, we cannot know something that is different from one moment to the next. The forms are therefore pure ideas that unify and stabilize the multiplicity of changing beings in the material world.

The forms are the ultimate reality, and this is shown to us in the Allegory of the Cave. In discussing the importance of education for a city, Socrates produces the Allegory of the Cave in Plato’s Republic (514a-518b). We are to imagine a cave wherein lifelong prisoners dwell. These prisoners do not know that they are prisoners since they have been held captive their entire lives. They are shackled such that they are incapable of turning their heads. Behind them is a fire, and small puppets or trinkets of various things—horses, stones, people, and so forth—are being moved in front of the fire. Shadows of these trinkets are cast onto a wall in front of the prisoners. The prisoners take this world of shadows to be reality since it is the only thing they ever see.

If, however, we suppose that one prisoner is unshackled and is forced to make his way out of the cave, we can see the process of education. At first, the prisoner sees the fire, which casts the shadows he formerly took to be reality. He is then led out of the cave. After his eyes painfully adjust to the sunlight, he first sees only the shadows of things, and then the things themselves. After this, he realizes that it is the sun by which he sees the things, and which gives life to the things he sees. The sun is here analogous to the form of the Good, which is what gives life to all beings and enables us most truly to know all beings.

The concept of the forms is criticized in Plato’s Parmenides. This dialogue shows us a young Socrates, whose understanding of the forms is being challenged by Parmenides. Parmenides first challenges the young Socrates about the scope of the forms. It seems absurd, thinks Parmenides, to suppose that stones, hair, or bits of dirt have forms of their own (130c-d). He then presents the famous “third man” argument. The forms are supposed to be unitary. The multiplicity of large material things, for example, participate in the one form of Largeness, which itself does not participate in anything else. Parmenides argues against this unity: “So another form of largeness will make its appearance, which has emerged alongside largeness itself and the things that partake of it, and in turn another over all these, by which all of them will be large. Each of your forms will no longer be one, but unlimited in multitude” (132a-b). In other words, is the form of Largeness itself large? If so, it would need to participate in another form of Largeness, which would itself need to participate in another form, and so forth.

In short, we can see that Plato is tentative about what is now considered his most important theory. Indeed, in his Seventh Letter, Plato says that talking about the forms at all is a difficult matter. “These things…because of the weakness of language, are just as much concerned with making clear the particular property of each object as the being of it. On this account no sensible man will venture to express his deepest thoughts in words, especially in a form which is unchangeable, as is true of written outlines” (343). The forms are beyond words or, at best, words can only approximately reveal the truth of the forms. Yet, Plato seems to take it on faith that, if there is knowledge to be had, there must be these unchanging, eternal beings.

c. Epistemology

We can say that, for Plato, if there is to be knowledge, it must be of eternal, unchanging things. The world is constantly in flux. It is therefore strange to say that one has knowledge of it, when one can also claim to have knowledge of, say, arithmetic or geometry, which are stable, unchanging things, according to Plato. That is, it seems absurd that one’s ideas about changing things are on a par with one’s ideas about unchanging things. Moreover, like Cratylus, we might wonder whether our ideas about the changing world are ever accurate at all. Our ideas, after all, tend to be much like a photograph of a world, but unlike the photograph, the world continues to change. Thus, Plato reserves the forms as those things about which we can have true knowledge.

How we get knowledge is difficult. The problem of acquiring knowledge gave rise to “Meno’s Paradox” in Plato’s Meno. In their search for the nature of virtue, Meno asks Socrates, “How will you look for [virtue], Socrates, when you do not know at all what it is? How will you aim to search for something you do not know at all? If you should meet with it, how will you know that this is the thing that you did not know?” (Meno, 80d-e). If one wants to know X, this implies that he/she does not know X now. If so, then it seems that one cannot even begin to ask about X. In other words, it seems that one must already know X in order to ask about it in the first place, but if one already knows X, then there is nothing to ask. Even if one could ask, one would not know when he/she has the answer since one did not know what he/she was looking for in the first place.

Socrates answers this “debater’s argument” with the theory of recollection, claiming that he has heard others talk about this “divine matter” (81a). The theory of recollection rests upon the assumption that the human soul is immortal. The soul’s immortality entails, says Socrates, that the soul has seen and known all things since it has always been. Somehow, the soul “forgets” these things upon its incarnation, and the task of knowledge is to recollect them (81b-e). This, of course, is a poor argument, but Plato knows this, given his preface that it is a “divine matter,” and Socrates’ insistence that we should believe it (not know it or be certain of it) rather than accept the paradox Meno mentions. Thus, Socrates famously goes on to show recollection in action through a series of questions posed to Meno’s slave. Through these leading questions, Meno’s slave provides the answer to a geometrical problem that he did not previously know—or more precisely, he recollects knowledge that he had previously forgotten. We might imagine that this is akin to the “light bulb” moment when something we did not previously understand suddenly becomes clear. At any rate, Socrates shows Meno how the human mind mysteriously, when led in the proper fashion, can arrive at knowledge on its own. This is recollection.
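The geometrical problem in question (Meno 82b-85b) is that of doubling the square: given a square, find the side of a square with twice its area. Put in modern notation, which is of course not Plato’s, the answer the slave is led to recollect is that the doubled square is the one built on the diagonal of the original:

\[
\text{a square of side } a \text{ has diagonal } a\sqrt{2}, \text{ and the square on that diagonal has area } \left(a\sqrt{2}\right)^{2} = 2a^{2}.
\]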

Again, the forms are the most knowable beings and, so, presumably are those beings that we recollect in knowledge. Plato offers another image of knowing in his Republic. True understanding (noesis) is of the forms. Below this, there is thought (dianoia), through which we think about things like mathematics and geometry. Below this is belief (pistis), where we can reason about things that we sense in our world. The lowest rung of the ladder is imagination (eikasia), where our mind is occupied with mere shadows of the physical world (509d-511e). The image of the Divided Line is parallel to the process of the prisoner emerging from the cave in the Allegory of the Cave, and to the Sun/Good analogy. In any case, real knowledge is knowledge of the forms, and is that for which the true philosopher strives, and the philosopher does this by living the life of the best part of the soul—reason.

d. Psychology

Plato is famous for his theory of the tripartite soul (psyche), the most thorough formulation of which is in the Republic. The soul is at least logically, if not also ontologically, divided into three parts: reason (logos), spirit (thumos), and appetite or desire (epithumia). Reason is responsible for rational thought and will be in control of the most ordered soul. Spirit is responsible for spirited emotions, like anger. Appetites are responsible not only for natural appetites such as hunger, thirst, and sex, but also for the desire of excess in each of these and other appetites. Why are the three separate, according to Plato? The argument for the distinction between three parts of the soul rests upon the Principle of Contradiction.

Socrates says, “It is obvious that the same thing will not be willing to do or undergo opposites in the same part of itself, in relation to the same thing, at the same time. So, if we ever find this happening in the soul, we’ll know that we aren’t dealing with one thing but many” (Republic, 436b6-c1). Thus, for example, the appetitive part of the soul is responsible for someone’s thirst. However, just because that person desires a drink, it does not follow that she will drink at that time. In fact, it is conceivable that, for whatever reason, she will restrain herself from drinking at that time. Since the Principle of Contradiction entails that the same part of the soul cannot, at the same time and in the same respect, desire and not desire to drink, it must be some other part of the soul that helps rein in the desire (439b). The rational part of the soul is responsible for keeping desires in check or, as in the case just mentioned, denying the fulfillment of desires when it is appropriate to do so.
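
One way to schematize the principle (the notation is ours, not Plato’s) is:

\[
\neg \big( D(x, o, r, t) \wedge \neg D(x, o, r, t) \big),
\]

where D(x, o, r, t) says that part x of the soul desires object o, in respect r, at time t. If a person both desires to drink and refuses to drink, with the same object, in the same respect, and at the same time, then the desiring and the refusing cannot be attributed to one and the same part x; the soul must therefore contain more than one part.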

Why is the spirited part different from the appetitive part? To answer this question, Socrates relays a story he once heard about a man named Leontius. Leontius “was going up from the Piraeus along the outside of the North Wall when he saw some corpses lying at the executioner’s feet. He had an appetite to look at them but at the same time he was disgusted and turned away” (Republic, 439e6-440a3). Despite his disgust (issuing from the spirited part of the soul) with his desire, Leontius reluctantly looked at the corpses. Socrates also cites examples when someone has done something, on account of appetite, for which he later reproaches himself. The reproach is rooted in an alliance between reason and spirit. Reason knows that indulging in the appetite is bad, and spirit, on reason’s behalf, becomes angry (440a6-440b4). Reason, with the help of spirit, will rule in the best souls. Appetite, and perhaps to some degree spirit, will rule in a disordered soul. The life of philosophy is a cultivation of reason and its rule.

The soul is also immortal, and one of the more famous arguments for the immortality of the soul comes from the Phaedo. This argument rests upon a theory of the relationship of opposites. Hot and cold, for example, are opposites, and there are processes of becoming between the two. Hot comes to be what it is from cold. Cold must also come to be what it is from the hot, otherwise all things would move only in one direction, so to speak, and everything would therefore be hot. Life and death are also opposites. Living things come to be dead and death comes from life. But, since the processes between opposites cannot be a one-way affair, life must also come from death (Phaedo 71c-e2). Presumably Plato means by “death” here the realm of non-earthly existence. Souls, then, must always exist, persisting through both life and death, if they are to be immortal. We can see here the influence of Pythagorean thought upon Plato since this also leaves room for the transmigration of souls. The disordered souls in which desire rules will return from death to life embodied as animals such as donkeys, while unjust and ambitious souls will return as hawks (81e-82a3). The philosopher’s soul is closest to divinity and a life with the gods.

e. Ethics and Politics

It is relatively easy to see, then, where Plato’s psychology intersects with his ethics. The best life is the life of philosophy, that is, the life of loving and pursuing wisdom—a life spent engaging logos. The philosophical life is also the most excellent life since it is the touchstone of true virtue. Without wisdom, there is only a shadow or imitation of virtue, and such lives are still dominated by passion, desire, and emotions. On the other hand,

The soul of the philosopher achieves a calm from such emotions; it follows reason and ever stays with it contemplating the true, the divine, which is not the object of opinion. Nurtured by this, it believes that one should live in this manner as long as one is alive and, after death, arrive at what is akin and of the same kind, and escape from human evils. (Phaedo 84a-b)

It is the philosopher, too, who must rule the ideal city, as we saw in Plato’s seventh letter. Just as the philosopher’s soul is ruled by reason, the ideal city must be ruled by philosophers.

The Republic begins with the question of what true justice is. Socrates proposes that he and his interlocutors, Glaucon and Adeimantus, might see justice more clearly in the individual if they take a look at justice writ large in a city, assuming that an individual is in some way analogous to a city (368c-369a). So, Socrates and his interlocutors theoretically create an ideal city, which has three social strata: guardians, auxiliaries, and craftspeople/farmers. The guardians will rule, the auxiliaries will defend the city, and the craftspeople and farmers will produce goods and food for the city. The guardians, as we learn in Book VI, will also be philosophers since only the wisest should rule.

This tripartite city mirrors the tripartite soul. When the guardians/philosophers rule properly, and when the other two classes do their proper work—and do not do or attempt to do work that is not properly their own—the city will be just, much as a soul is just when reason rules (433a-b). How is it that auxiliaries and craftspeople can be kept in their own proper position and be prevented from an ambitious quest for upward movement? Maintaining social order depends not only upon wise ruling, but also upon the Noble Lie. The Noble Lie is a myth according to which the gods mixed various metals into the members of the various social strata. The guardians were mixed with gold, the auxiliaries with silver, and the farmers and craftspeople with iron and bronze (415a-c). Since the gods intended for each person to belong to the social class that he/she currently does, it would be an offense to the gods for a member of a social class to attempt to become a member of a different social class.

The most salient concern here is that Plato’s ideal city quickly begins to sound like a fascist state. Plato even seems to recognize this at times. For example, the guardians must not only go through a rigorous training and education regimen, but they must also live a strictly communal life with one another, having no private property. Adeimantus objects to this, saying that the guardians will be unhappy. Socrates’ reply is that they mean to secure happiness for the whole city, not for each individual (419a-420b). Individuality seems lost in Plato’s city.

Anticipating that such a city is doomed to failure, Plato has it dissolve, but he merely cites discord among the rulers (545d) and natural processes of becoming as the reasons for its devolution. Socrates says, “It is hard for a city composed in this way to change, but everything that comes into being must decay. Not even a constitution such as this will last forever. It, too, must face dissolution” (546a1-4). We may notice here that Plato cites human fragility and finitude as sources of the ideal city’s devolution, not the city’s possible fascistic tendencies. Yet, it is possible that the lust for power is the cause of strife and discord among the leaders. In other words, perhaps not even the best sort of education and training can keep even the wisest of human rulers free from desire.

It is difficult to overlook the sometimes moralistic and fascistic tendencies in Plato’s ethical and political thought. Yet, just as he challenges his own metaphysical ideas, he also at times loosens up on his ethical and political ideals. In the Phaedo, for example, Plato has Phaedo recount the story of Socrates’ final day. Phaedo says that he and other friends of Socrates arrived at the prison early, and when they were granted access to Socrates, Xanthippe, Socrates’ wife, was already there with their infant son (60a), which means that Xanthippe had been there all night. Socrates, to his own pleasure, rubs his legs after the shackles have been removed (60b), which implies that even philosophers enjoy bodily pleasures. Again, Phaedo says that Socrates had a way of easing the distress of those around him—in this case, the distress of Socrates’ imminent death. Phaedo recounts how Socrates eased his pain on that particular day:

I happened to be sitting on his right by the couch on a low stool, so that he was sitting well above me. He stroked my head and pressed the hair on the back of my neck, for he was in the habit of playing with my hair at times. (89a9-b3)

Plato, with these dramatic details, is reminding us that even the philosopher is embodied and, at least to some extent, enjoys that embodiment, even though reason is to rule above all else.

4. Aristotle

Aristotle (384-322 B.C.E.) was born in Stagirus, a Thracian coastal city. He was the son of Nicomachus, the Macedonian court physician, which allowed for a lifelong connection with the court of Macedonia. When he was 17, Aristotle was sent to Athens to study at Plato’s Academy, which he did for 20 years. After serving as tutor for the young Alexander (later Alexander the Great), Aristotle returned to Athens and started his own school, the Lyceum. Aristotle walked as he lectured, and his followers therefore later became known as the peripatetics, those who walked around as they learned. When Alexander died in 323, and the pro-Macedonian government fell in Athens, a strong anti-Macedonian reaction occurred, and Aristotle was accused of impiety. He fled Athens to Chalcis, where he died a year later.

Unlike Plato, Aristotle wrote treatises, and he was a prolific writer indeed. He wrote several treatises on ethics, he wrote on politics, he first codified the rules of logic, he investigated nature and even the parts of animals, and his Metaphysics is in a significant way a theology. His thought, and particularly his physics, reigned supreme in the Western world for centuries after his death.

a. Terminology

Aristotle used, and sometimes invented, technical vocabulary in nearly all facets of his philosophy. It is important to have an understanding of this vocabulary in order to understand his thought in general. Like Plato, Aristotle talked about forms, but not in the same way as his master. For Aristotle, forms without matter do not exist. I can contemplate the form of human being (that is, what it means to be human), but this would be impossible if actual (embodied) human beings were non-existent. A particular human being, what Aristotle might call “a this,” is hylomorphic, or matter (hyle) joined with form (morphe). Similarly, we cannot sense or make sense of unformed matter. There is no matter in itself. Matter is the potential to take shape through form. Thus, Aristotle is often characterized as the philosopher of earth, while Plato’s gaze is towards the heavens, as depicted in Raphael’s famous School of Athens painting.

Form is thus not only the physical shape but also the idea by which we best know particular beings. Form is the actuality of matter, which is pure potentiality. “Actuality” and “potentiality” are two important terms for Aristotle. A thing is in potentiality when it is not yet what it can inherently or naturally become. An acorn is potentially an oak tree, but insofar as it is an acorn, it is not yet actually an oak tree. When it is an oak tree, it will have reached its actuality—its continuing activity of being a tree. The form of oak tree, in this case, en-forms the wood, and gives it shape—makes it actually a tree, and not just a heap of matter.

When a being is in actuality, it has fulfilled its end, its telos. All beings by nature are telic beings. The end or telos of an acorn is to become an oak tree. The acorn’s potentiality is an inner striving towards its fulfillment as an oak tree. If it reaches this fulfillment it is in actuality, or entelecheia, which is a word that Aristotle coined, and is etymologically related to telos. It is the activity of being-its-own-end that is actuality. This is also the ergon, or function or work, of the oak tree. The best sort of oak tree—the healthiest, for example—best fulfills its work or function. It does this in its activity, its energeia, of being. This activity or energeia is the en-working or being-at-work of the being.

One more important set of technical terms is Aristotle’s four causes: material, formal, efficient (moving), and final cause. To know a thing thoroughly is to know its cause (aitia), or what is responsible for making a being who or what it is. For instance, we might think of the causes of a house. The material cause is the bricks, mortar, wood, and any other material that goes to make up the house. Yet, these materials could not come together as a house without the formal cause that gives shape to it. The formal cause is the idea of the house in the architect’s soul. The efficient cause would be the builders of the house. The final cause is that for which the house exists in the first place, namely shelter, comfort, warmth, and so forth. We will see that the concept of causes, especially the final cause, is very important for Aristotle, not least in his argument for the unmoved mover in the Physics.

b. Psychology

Aristotle’s On The Soul (Peri Psyche, often referred to by its Latin title, De Anima) gives us insight into Aristotle’s conception of the composition of the soul. The soul is the actuality of a body. Alternatively, since matter is in potentiality, and form is actuality, the soul as form is the actuality of the body (412a20-23). Form and matter are never found separately from one another, although we can make a logical distinction between them. For Aristotle, all living things are en-souled beings. Soul is the animating principle (arche) of any living being (a self-nourishing, growing and decaying being). Thus, even plants are en-souled (413a26). Without soul, a body would not be alive, and a plant, for instance, would be a plant in name only.

There are three types of soul: nutritive, sensitive, and intellectual. Some beings have only one of these, or some mixture of them. If, however, a soul has the capacity for sensation, as the souls of animals do, then it also has a nutritive faculty (414b1-2). Likewise, beings who have minds must also have the sensitive and nutritive faculties of soul. A plant has only the nutritive faculty of soul, which is responsible for nourishment and reproduction. Animals have sense perception in varying degrees, and must also have the nutritive faculty, which allows them to survive. Human beings have intellect or mind (nous) in addition to the other faculties of the soul.

The soul is the source and cause of the body in three ways: the source of motion, the telos, and the being or essence of the body (415b9-11). The soul is that from which and ultimately for which the body does what it does, and this includes sensation. Sensation is the ability to receive the form of an object without receiving its matter, much as the wax receives the form of the signet ring without receiving the metal out of which the ring is made. There are three types of sensible things: particular sensibles, or those qualities that can be sensed by one sense only; common sensibles, which can be sensed by some combination of various senses; and incidental sensibles, as when, seeing my friend Tom, whose father is Joe, I say that I see “the son of Joe,” though I see Joe’s son only incidentally.

Mind (nous), as it was for Anaxagoras, is unmixed (429a19). Just as senses receive, via the sense organ, the form of things, but not the matter, mind receives the intelligible forms of things, without receiving the things themselves. More precisely, mind, which is nothing before it thinks and is therefore itself when active, is isomorphic with what it thinks (429a24). To know something is most properly to know its form, and mind in some way becomes the form of what it thinks. Just how this happens is unclear. Since the form is what is known, the mind “receives” or becomes that form when it best understands it. So, mind is not a thing, but is only the activity of thinking, and is particularly whatever it thinks at any given time.

c. Ethics

The most famous and thorough of Aristotle’s ethical works is his Nicomachean Ethics. This work is an inquiry into the best life for human beings to live. The life of human flourishing or happiness (eudaimonia) is the best life. It is important to note that what we translate as “happiness” is quite different for Aristotle than it is for us. We often consider happiness to be a mood or an emotion, but Aristotle considers it to be an activity—a way of living one’s life. Thus, it is possible for one to have an overall happy life, even if that life has its moments of sadness and pain.

Happiness is the practice of virtue or excellence (arete), and so it is important to know the two types of virtue: character virtue, the discussion of which makes up the bulk of the Ethics, and intellectual virtue. Character excellence comes about through habit—one habituates oneself to character excellence by knowingly practicing virtues. To be clear, it is possible to perform an excellent action accidentally or without knowledge, but doing so would not make for an excellent person, just as accidentally writing in a grammatically correct way does not make for a grammarian (1105a18-26). One must be aware that one is practicing the life of virtue.

Aristotle arrives at the idea that “the activity of the soul in accordance with virtue” is the best life for human beings through the “human function” argument. If, says Aristotle, human beings have a function or work (ergon) to perform, then we can know that performing that function well will result in the best sort of life (1097b23-30). The work or function of an eye is to see and to see well. Just as each part of the body has a function, says Aristotle, so too must the human being as a whole have a function (1097b30). This is an argument by analogy. The function of the human being is logos or reason, and the more thoroughly one lives the life of reason, the happier one’s life will be (1098a3).

So, the happiest life is a practice of virtue, and this is practiced under the guidance of reason. Examples of character virtues would be courage, temperance, liberality, and magnanimity. One must habitually practice these virtues in order to be courageous, temperate, and so forth. For example, the courageous person knows when to be courageous, and acts on that knowledge whenever it is appropriate to do so (1115a16-34). Each activity of any particular character virtue has a related excessive or deficient action (1105a24-33). The excess related to courage, for example, is rashness, and the deficiency is cowardice. Since excellence is rare, most people will tend more towards an excess or deficiency than towards the excellent action. Aristotle’s advice here is to aim for the opposite of one’s typical tendency, since this will eventually lead one closer to the excellent action (1109a29-1109b6). For example, if one tends towards the excess of self-indulgence, it might be best to aim for insensibility, which will eventually lead the agent closer to temperance.

Friendship is also a necessary part of the happy life. There are three types of friendship, none of which is exclusive of the other: a friendship of excellence, a friendship of pleasure, and a friendship of utility (1155b18). A friendship of excellence is based upon virtue, and each friend enjoys and contemplates the excellence of his/her friend. Since the friend is like another self (1166a31), contemplating a friend’s virtue will help us in the practice of virtue for ourselves (1177b10). A mark of good friendship is that friends “live together,” that is, that friends spend a substantial amount of time together, since a substantial time apart will likely weaken the bond of friendship (1157b5-11). Also, since the excellent person has been habituated to a life of excellence, his/her character is generally firm and lasting. Likewise, the friendship of excellence is the least changeable and most lasting form of friendship (1156b18).

The friendships of pleasure and use are the most changeable forms of friendship since the things we find pleasurable or useful tend to change over a lifetime (1156a19-20). For example, if a friendship forms out of a mutual love for beer, but the interest of one of the friends later turns towards wine, the friendship would likely dissolve. Again, if a friend is merely one of utility, then that friendship will likely dissolve when it is no longer useful.

Since the best life is a life of virtue or excellence, and since we are closer to excellence the more thoroughly we fulfill our function, the best life is the life of theoria or contemplation (1177a14-18). This is the most divine life, since one comes closest to the pure activity of thought (1177b30). It is the most self-sufficient life since one can think even when one is alone. What does one contemplate or theorize about? One contemplates one’s knowledge of unchanging things (1177a23-27). Some have criticized Aristotle, saying that this sort of life seems uninteresting, since we seem to enjoy the pursuit of knowledge more than just having knowledge. For Aristotle, however, the contemplation of unchanging things is an activity full of wonder. Seeking knowledge might be good, but it is done for the sake of a greater end, namely having knowledge and contemplating what one knows. For example, Aristotle considered the cosmos to be eternal and unchanging. So, one might have knowledge of astronomy, but it is the contemplation of what this knowledge is about that is most wonderful. The Greek word theoria is rooted in a verb for seeing, hence our word “theatre.” So, in contemplation or theorizing, one comes face to face with what one knows.

d. Politics

The end for any individual human being is happiness, but human beings are naturally political animals, and thus belong in the polis, or city-state. Indeed, the inquiry into the good life (ethics) belongs in the province of politics. Since a nation or polis determines what ought to be studied, any practical science, which deals with everyday, practical human affairs, falls under the purview of politics (1094a26-1094b11). The last chapter of the Nicomachean Ethics is dedicated to politics. Aristotle emphasizes that the goal of learning about the good life is not knowledge, but to become good (1095a5), and he reiterates this in the final chapter (1179b3-4). Since the practice of virtue is the goal for the individual, then ultimately we must turn our eyes to the arena in which this practice plays out—the polis.

A good individual makes for a good citizen, and a good polis helps to engender good individuals: “Legislators make the citizens good by forming habits in them, and this is the wish of every legislator; and those who do not effect it miss their mark, and it is in this that a good constitution differs from a bad one” (1103b3-6). Laws must be instituted in such a way as to make the city’s citizens good, but the lawmakers must themselves be good in order to do this. Human beings are so naturally political that the relationship between the state and the individual is to some degree reciprocal, but without the state, the individual cannot be good. In the Politics, Aristotle says that a man who is so self-sufficient as to live away from a polis is like a beast or a god (1253a29). That is, such a being is not a human being at all. Again, a man who is separated from law and justice is the “worst of all” (1253a32).

In Book III.7 of the Politics, Aristotle categorizes six different political constitutions, naming three as good and three as bad. The three good constitutions are monarchy (rule by one), aristocracy (rule by the best, aristos), and polity (rule by the many). These are good because each has the common good as its goal. The worst constitutions, which parallel the best, are tyranny, oligarchy, and democracy, with democracy being the best of the three evils. These constitutions are bad because they have private interests in mind rather than the common good or the best interest of everyone. The tyrant has only his own good in mind; the oligarchs, who happen to be rich, have their own interest in mind; and the people (demos), who happen not to be rich, have only their own interest in mind.

Yet, Aristotle grants that there is a difference between an ideal and a practically plausible constitution, which depends upon how people actually are (1288b36-37). The perfect state will be a monarchy or aristocracy since these will be ruled by the truly excellent. Since, however, such a situation is unlikely when we face the reality of our current world, we must look at the next best, and the next best after that, and so on. Aristotle seems to favor democracy, and after that oligarchy, but he spends the bulk of his time explaining that each of these constitutions actually takes many shapes. For example, there are farmer-based democracies, democracies based upon birth status, democracies wherein all free men can participate in government, and so forth (1292b22-1293a12).

The most unfortunate aspect of Aristotle’s politics is his treatment of slavery and women, and we might wonder how it affects his overall inquiry into politics:

The male is by nature superior, and the female inferior; and the one rules, and the other is ruled; this principle, of necessity, extends to all mankind. Where then there is such a difference as that between soul and body, or between men and animals (as in the case of those whose business is to use their body, and who can do nothing better), the lower sort are by nature slaves, and it is better for them as for all inferiors that they should be under the rule of a master. For he who can be, and therefore is, another’s, and he who participates in reason enough to apprehend, but not to have, is a slave by nature. Whereas the lower animals cannot even apprehend reason; they obey their passions. (Politics 1254b13-23)

For Aristotle, women are naturally inferior to men, and there are those who are natural slaves. In both cases, it is a deficiency in reason that is the culprit. Women have reason but “lack authority” (1260a14), and slaves have reason enough to take orders and have some understanding of their world, but cannot use reason as the best human being does. It is difficult, if not impossible, to interpret Aristotle charitably here. For slaves, one might suggest that Aristotle has in mind people who can do only menial tasks, and nothing more. Yet, there is a great danger even here. We cannot always trust the judgment of the master who says that this or that person is capable only of menial tasks, nor can we always know another person well enough to say what the scope of his or her capabilities for thought might be. So even a charitable interpretation of his views of slavery and women is elusive.

e. Physics

Aristotle’s physics, which stood as the most influential account of nature until the advent of Newtonian physics, could be seen largely as a study of motion. Motion is defined in the Physics as the “actuality of the potentiality in the very way in which [the thing in motion] is in potentiality” (201b5). Motion is not merely a change of place. It can also include processes of change in quality and quantity (201a4-9). For example, the growth of a plant from rhizome to flower (quantity) is a process of motion, even though the flower does not have any obvious lateral change of place. The change of a light skin-tone to bronze via sun tanning is a qualitative motion. In any case, the thing in motion is not yet what it is becoming, but it is becoming, and is thus actually a potentiality qua potentiality. The light skin is not yet sun tanned, but is becoming sun tanned. This process of becoming is actual; that is, the body is potentially tanned and is actually in the process of realizing that potentiality. So, motion is the actuality of the potentiality of a being, in the very way that it is a potentiality.

In Book 8.1 of the Physics, Aristotle argues that the cosmos and its heavenly bodies are in perpetual motion and always have been. There could not have been a time with no motion; whatever is moved is moved by itself or by another. Rest is simply a privation of motion. Thus, if there were a time without motion, then whatever existed—which had the power to cause motion in other beings—would have been at rest. If so, then it at some point had to have been in motion since rest is the privation of motion (251a8-25). Motion, then, is eternal. What moves the cosmos? This must be the unmoved mover, or God, who moves the cosmos not as an efficient cause but as a final cause. That is, since all natural beings are telic, they must move toward perfection. What is the perfection of the cosmos? It must be eternal, perfectly circular motion. It moves towards divinity. Thus, the unmoved mover causes the cosmos to move toward its own perfection.

f. Metaphysics

Aristotle’s Metaphysics, known as such because it was traditionally placed after (meta) his Physics, was known to him as “first philosophy”—first in status, but last in the order in which we should study his corpus. It is also arguably his most difficult work, which is due to its subject matter. This work explores the question of what being as being is, and seeks knowledge of first causes (aitiai) and principles (archai). First causes and principles are indemonstrable, but all demonstrations proceed from them. They are something like the foundation of a building. The foundation rests upon nothing else, but everything else rests upon it. We can dig to the foundation, but (let’s pretend there’s no further earth under it) we can go no further. Likewise, we can reason our way up (or down) to the first principles and causes, but our reasoning and ability to know ends there. Thus, we are dealing with an inherently difficult and murky subject, but once knowledge of this subject is gained, there is wisdom (Metaphysics 982a5). So, whereas philosophy is for Plato a constant pursuit of wisdom, Aristotle believed that the attainment of wisdom is possible.

Aristotle says that there are many ways in which something is said to be (Meta. 1003b5), and this refers to the categories of being. We can talk about the substance or being (ousia) of a thing (what that thing essentially is), quality (the shirt is red), quantity (there are many people here), action (he is walking), passion or being affected (he is being cut), relation (A is to B as B is to C), place (she is in the room), time (it is noon), and so on. We notice in each of these categories that being is at play. Thus, being considered qua being cannot be restricted to any one of the categories but cuts across all of them.

So what is being or substance? The form of a thing makes it intelligible, rather than its matter, since things with relatively the same form can have different matter (metal baseball bats and wooden baseball bats are both baseball bats). Here, we are really getting at the essence of something. Aristotle’s phrase for essence is “to ti en einai,” which could be translated as “what it is (was) to be” this or that thing. Since nothing is what it is outside of matter—there is no form by itself, just as there is no pure matter by itself—the essence of anything, its very being, is its being as a whole. No particular being is identical with its quality, quantity, position in space, or any other incidental features. It is the singular being as a whole, the “this” to which we can apply no further name, that shows us the being in its being.

The Metaphysics then arrives at a similar end as does the Physics, with the first mover. But, in the Metaphysics, we are not primarily concerned with the motion of physical beings but with the being of all beings. This being, God, is pure actuality, with no mixture of any potentiality at all. In short, it is pure being, and is always being itself in completion. Thinking is the purest of activities, according to Aristotle. God is always thinking. In fact, God cannot do otherwise than think. The object of God’s thought is thinking itself. God is literally thought thinking thought (1072b20). We recall from Aristotle’s psychology that mind becomes what it thinks, and Aristotle reiterates this in the Metaphysics (1072b20-22). Since God is thinking, and thinking is identical with its object, which is thought, God is the eternal activity of thinking.

5. Hellenistic Thought

The Hellenistic period in philosophy is generally considered to have commenced with Alexander’s death in 323 B.C.E., and ended approximately with the Battle of Actium in 31 B.C.E. Although the Academy and the Lyceum could be considered in a thorough investigation into Hellenistic philosophy, scholars usually focus upon the Epicureans, Cynics, Stoics, and Skeptics.

Hellenistic philosophy is traditionally divided into three fields of study: physics, logic, and ethics. Physics involved a study of nature while logic was broadly enough construed to include not only the rules of what we today consider to be logic but also epistemology and even linguistics.

a. Epicureanism

Epicurus (341-271 B.C.E.) and his school are often mistakenly considered to be purely hedonistic, such that nowadays an “epicure” designates one who delights in fine foods and drinks. It is accurate to call Epicurus and his followers “hedonists” in the etymological sense, that is, insofar as pleasure (hedone) is their concern, but without restricting that pleasure to bodily pleasures. Epicurus’ school, the Garden (an actual garden near Athens), was friendly and non-hierarchical in character (Dorandi 57). Although Epicurus was a prolific author, we have only three of his letters preserved in Diogenes Laertius’ Lives. Otherwise, we depend in large part upon the Epicurean Lucretius and his work On the Nature of Things, especially in order to understand Epicurean physics, which was essentially materialistic. The goal of all true understanding for Epicurus, which must involve an understanding of physics, was tranquility.

i. Physics

Epicurus and his followers were thoroughgoing materialists. Everything except the void, even the human soul, is composed of material bodies. Epicureans were atomists and accordingly thought that there is nothing but atoms and void. Atoms “vary indefinitely in their shapes; for so many varieties of things as we see could never have arisen out of a recurrence of a definite number of the same shapes” (DL X.42). Moreover, these atoms are always in motion, and will remain in motion in the void until something offers enough resistance to stop them.

Epicurus’ view of atomic motion provides an important point of departure from Democritean atomism. For Democritus, atoms move according to the laws of necessity, but for Epicurus, atoms sometimes swerve, or venture away from their typical course, and this is due to chance. Chance allows room for free will (Lucretius 2.251-262). Epicureans seem to take for granted that there is freedom of the will, and then apply that assumption to their physics. That is, there seems to be free will, so Epicureans then posit a physical explanation for it.

ii. Ethics

Much of what we know about Epicurean ethics comes from Epicurus’ Letter to Menoeceus, which is preserved in Diogenes Laertius’ Lives. The goal of the good life is tranquility (ataraxia). One achieves tranquility by seeking pleasure (hedone), but not just any pleasure will suffice. The primary sort of pleasure is the simplicity of being free from pain and fear, but even here, we should not seek to be free from every sort of pain. We should pursue some painful things if we know that doing so will render greater pleasure in the end (DL X.129-130). So, Epicurus’ hedonism shapes up to be a nuanced hedonism. Indeed, he recommends a plain life, saying that the most enjoyment of luxury comes to those who need luxury least (DL X.130). Once we habituate ourselves to eating plain foods, for example, we gradually eliminate the pain of missing fancy foods, and we can enjoy the simplicity of bread and water (DL X.130-131). Epicurus explicitly denies that sensual pleasures constitute the best life and argues that the life of reason—which includes the removal of erroneous beliefs that cause us pain—will bring us peace and tranquility (DL X.132).

The sorts of beliefs that produce pain and anxiety for us are primarily two: a mistaken conception of the gods, and a misconception of death. Most people, according to Epicurus, have mistaken conceptions about the gods, and are therefore impious (DL X.124). Like Xenophanes, Epicurus would encourage us not to anthropomorphize the gods and to think only what is fitting for the most blessed and eternal beings. We are not thinking clearly when we think that the gods get angry with us or care at all about our personal affairs. It is not befitting of an eternal and blessed being to become angry over or involved in the affairs of mortals. Yet, perhaps Epicurus is anthropomorphizing here. The argument seems to rely upon his claim that tranquility is our greatest pleasure and upon the assumption that the gods must experience that pleasure. On the other hand, one could read Epicurus as a sort of proto-negative theologian who merely suggests that it is unreasonable to believe that gods, the best of beings, feel pain at all. One might wonder whether anthropomorphizing is avoidable at all.

We should not fear death because death is “nothing to us, for good and evil imply sentience, and death is the privation of all sentience” (DL X.124). The key here is the first premise that good and evil apply only to sentient beings. We recall that, for Epicurus, we are thoroughly material beings. Both mind and soul are part of the human body, and the human body is nothing if not sentient. Therefore, when the body dies, so too does the mind and soul, and so too does sentience. This means that death is literally nothing to us. The terror that we feel about death now will vanish once we die. Thus, it is better to be free from the fear of death now. When we rid ourselves of the fear of death, and the hope of immortality that accompanies that fear, we can enjoy the preciousness of our mortality (DL X.124-125).

b. The Cynics

The Cynics, unlike the Epicureans, were not properly a philosophical school. While there are identifiable characteristics of Cynic thought, they had no central doctrine or tenets. It was a disparate movement, with varying interpretations of what constituted a Cynic. This interpretative freedom accords well with one of the characteristics that typified ancient Cynicism—a radical freedom from societal and cultural standards. The Cynics favored instead a life lived according to nature.

“Cynic,” from the Greek kunikos, meant “dog-like.” We cannot be sure whether the Dogs thought of themselves as doglike, or whether they were termed as such by non-Cynics, or both. The first of the Dogs, Antisthenes (c.445-366 B.C.E.), was supposedly close with Socrates, and was present at his death, according to Plato’s Phaedo. Yet, it was Diogenes of Sinope (c.404-323 B.C.E.), often called simply “Diogenes the Cynic,” who was and is the most famous of the Dogs. Most of the information we have comes from Diogenes Laertius’ Lives, which was written centuries after Diogenes the Cynic’s life, and is therefore historically problematic. It nevertheless provides us with an imaginative description of Diogenes the Cynic’s life, which was apparently unusual and outstanding.

Diogenes the Cynic was purportedly exiled from Sinope for defacing the city’s coins, and this later became his metaphorical modus operandi for philosophy—“driving out the counterfeit coin of conventional wisdom to make room for the authentic Cynic life” (Branham and Goulet-Cazé 8). The Cynic life referenced here consisted of a life lived in accordance with nature, a rebellion against and freedom from dominant Greek culture that lives contrary to nature, and happiness through askesis, or asceticism (Branham and Goulet-Cazé 9). Thus, Diogenes wore but a thin, rough cloak all year round, accustomed himself to withstand both heat and cold, ate but a meager diet, and most sensationally, openly mocked everyday Greek life.

He was reportedly at a dinner party where the other attendees were throwing bones at him as though he were a dog. So, Diogenes “urinated upon them as a dog would” (DL VI.46). He reportedly masturbated in public, and when reprimanded for it, he replied that he “wished it were as easy to relieve hunger by rubbing an empty stomach” (DL VI.46). Again, “He lit a lamp in broad daylight and said, as he went about, ‘I am looking for a human being’” (DL VI.41), implying that none of the Greeks could appropriately be called “human.” These shenanigans were intended to wake up the Athenians to the life of simplicity and philosophy. One needs very little to be happy. In fact, one should severely limit one’s desires, and live as most animals do, without anxiety, and securing only what one needs to continue living. This all seems a response to the cold fact that much of human life and circumstance is out of our control. So, Diogenes claimed that philosophy was a practice that prepared him for any kind of luck (DL VI.63).

The Cynics seem to have taken certain aspects of Socrates’ life and thought and pushed them to the extreme. One might wonder what drives this ascetic preparation for any sort of luck. Is it that we see that moving from one superficial pleasure to the next is ultimately unfulfilling? Or, is the practice itself driven by a sort of fear, an emotion that the Cynic means to quell? That is, one might read the asceticism of the Cynic as a futile attempt to deny the truth of human fragility; for example, at any moment the things I enjoy can vanish, so I should avoid enjoying those things. On the other hand, perhaps the asceticism of the Cynic is an affirmation of this fragility. By living the ascetic life of poverty, the Cynic is constantly recognizing and affirming his/her finitude and fragility by choosing never to ignore it.

c. The Stoics

Stoicism evolved from Cynicism, but was more doctrinally focused and organized. While the Cynics largely ignored typical fields of study, the Stoics embraced physics, logic, and ethics, making strides especially in logic. Zeno of Citium (c.334-262 B.C.E.) was the founder of the Stoic school, which was named after the Stoa Poikile, a “painted portico” where the Stoics regularly met. This was the beginning of a long and powerful tradition, which lasted into the imperial era. Indeed, one of the most famous Stoic ethicists was the Roman emperor Marcus Aurelius (121-180 C.E.). Epictetus (55-135 C.E.) is another famous Stoic ethicist who also carried on the tradition of Stoicism beyond the Hellenistic period. Although the Stoics made some strides in logic after Aristotle, this article’s focus is on Stoic physics, epistemology, and ethics.

i. Physics

As Pierre Hadot has shown, the Stoics studied physics in order better to understand their own lives, and to live better lives: “Stoic physics was indispensable for ethics because it showed people that there are some things which are not in their power but depend on causes external to them—causes which are linked in a necessary, rational manner” (Hadot 128). Like the Cynics, the Stoics strove to live in accordance with nature, and so a rigorous study of nature allowed them to do so all the more effectively.

The Stoics were materialists, though not thoroughgoing materialists as the Epicureans were. Also, chance can play no role in the Stoics’ ordered and thoroughly rational and causally determined universe. Since we are part of this universe, our lives, too, are causally determined, and everything in the universe is teleologically oriented towards its rational fulfillment. Diogenes Laertius reports that the Stoics saw matter as passive and logos (god) as active, and that god runs through all of the matter as its organizing principle (DL VII.134). This divinity is most apparent in us via our ability to reason. At any rate, the universe is, as the name implies, a unity, and it is divine.

ii. Epistemology

The knowledge we have of the world comes to us directly through our senses and impresses itself upon the blank slate of our minds. The naked information that comes to us via the senses allows us to know objects, but our judgments of those objects can lead us into error. As Hadot says about these so-called objective presentations, “They do not depend on our will; rather, our inner discourse enunciates and describes their content, and we either give or withhold our consent from this enunciation” (Hadot 131). There might be a problem lurking here regarding the standard of truth, which, for the Stoics, is simply the correspondence of one’s idea of the object with the object itself. If it is true that the correspondence of our descriptions of the object with the actual object can bring us knowledge, how can we ever be sure that our descriptions really match the object? After all, if it is not the bare sense impression that brings knowledge, but my correct description of the object, it seems that there is no standard by which I can ever be sure that my description is correct.

iii. Ethics

Stoic ethics urges us to be rid of our desires and aversions, especially where these desires and aversions are not in accord with nature. For instance, death is natural. To be averse to death will bring misery. Stoic ethics can perhaps be best summed up in the first paragraph of Epictetus’ Handbook:

Some things are up to us and some are not up to us. Our opinions are up to us, and our impulses, desires, aversions—in short, whatever is our own doing. Our bodies are not up to us, nor are our possessions, our reputations, or our public offices, that is, whatever is not our own doing. The things that are up to us are by nature free, unhindered, and unimpeded; the things that are not up to us are weak, enslaved, hindered, not our own…If you think that only what is yours is yours, and that what is not your own is…not your own, then no one will ever coerce you, no one will hinder you, you will blame no one, you will have no enemies, and no one will harm you, because you will not be harmed at all.

This passage might be shocking to us today when, especially in the United States, many of the things that Epictetus says are not up to us are precisely what we are told to pursue. We therefore might wonder why our bodies, possessions, reputations, wealth, or jobs are not in our control. For Epictetus, it is simple. Possessions come and go—they can be destroyed, lost, stolen, and so forth. Reputations are determined by others, and it is reasonable to believe that even the best people will be hated by some, and even the worst people will be loved by some. Try as we might, we might never gain wealth, and even if we do, it can be lost, destroyed, or stolen. Again, public office, like reputation, is up to others to determine. So, the adage that “you can be anything you want in life” is not only false under Stoic ethics, but dangerously misleading since it will almost inevitably lead to misery.

Suppose, however, that I live as Epictetus recommends; how can I be sure that I will never be harmed? Even if I fully grant that someone who, for instance, pushes me down a flight of stairs has committed his own wrong, and that his wrong actions are not in my control, will I not still feel pain? Physical pain, for a Stoic, is not harm. The only real harm is when one harms oneself by doing evil, just as the only real good is living excellently and in accordance with reason. In this example, I would harm myself with the judgment that what happened to me was bad. One might object here, as one might object to Cynicism, that Stoic ethics ultimately demands a repression of what is most human about us. Indeed, Epictetus says, “If you kiss your child or wife, say that you are kissing a human being; for when it dies you will not be upset” (Handbook 12). For the Stoic, being moved by others brings us away from tranquility. However, kissing a “human being” is not the same as kissing this human being, this individual who would be deeply hurt by knowing that I treat them merely as a human being, and whom I relate to only through a sense of duty, rather than a real sense of love. Stoic ethics risks removing our humanity from us in favor of its own notion of divinity.

d. Skeptics

The two strands of Skepticism in the Hellenistic era were Academic Skepticism and Pyrrhonian Skepticism. Somewhat like the Cynics, each major Skeptic had his own take on Skepticism, and so it is difficult to lump them all under a tidy label. Also like the Cynics, however, there are certain characteristics that can be highlighted, despite differences between particular thinkers. Skepsis means “inquiry,” but the Skeptics did not seek solid or absolute answers as the goal of their inquiry. Rather, the goal of their skepsis was tranquility and freedom from judgments, opinions, or absolute claims to knowledge. Skepticism, broadly speaking, constituted a challenge to the possibility and nature of knowledge.

i. Academic Skepticism

The sixth scholarch (leader) of Plato’s Academy was Arcesilaus (318-243 B.C.E.), who initiated a substantial tradition of Skepticism in the Academy that lasted into the first century B.C.E. Arcesilaus found the inspiration for his skepticism in the figure of Socrates. Arcesilaus would argue both for and against any given position, ultimately showing that neither side of the argument can be trusted. He directed his skepticism primarily toward the Stoics and the empirical basis of their claims to knowledge. We recall that, for the Stoics, a grasping of sense impressions in the proper way is the true foundation for knowledge. Arcesilaus’ argument against Stoic empiricism is not clear (the argument is recounted in Cicero’s Academica 2.40-42), but it seems ultimately to reach the conclusion suggested above, namely that we can never be sure that the way we have perceived (judged) an object via the senses is true or false. The argument runs roughly as follows. For any given presentation of an object to the senses, we can imagine that something else could be presented to the senses in just the same way, such that the perceiver cannot distinguish between the two objects being presented, which Arcesilaus thought the Stoics would grant. The perceiver can present these objects to him/herself, via the senses, in a true or false way, which the Stoics would also grant. It is possible, then, that the perceiver thinks one presentation is true and the other is false, but he has no way of distinguishing between them. Arcesilaus’ conclusion is that we should always suspend our judgment.

Carneades (213-129 B.C.E.), the tenth scholarch of Plato’s Academy, seems to have cleverly answered a typical objection raised against Skepticism. It is inconsistent, goes the objection, to insist that it is impossible for anything to be known (“grasped”), since that statement, “nothing can be known” is itself a claim to knowledge. Carneades recognized that even the claim “nothing can be known” should be called into doubt. Again, like Arcesilaus, Carneades relied upon the typical skeptic tactic of presenting arguments both for and against the same thing and claiming that we cannot therefore claim that either side is correct.

ii. Pyrrhonian Skepticism

We know almost nothing for sure about Pyrrho of Elis (360-270 B.C.E.). He wrote nothing, which is perhaps a sign of his extreme skepticism: if we cannot know anything, or cannot be sure whether knowledge is possible, then nothing can definitively be said, especially in writing. Perhaps what most differentiates Pyrrhonian Skepticism from Academic Skepticism is the profound indifference that Pyrrhonian Skepticism is meant to generate. Diogenes Laertius relays the story that, when his master Anaxarchus had fallen into a swamp, Pyrrho simply passed him by, and was later praised by Anaxarchus for his supreme indifference (DL IX.63). Pyrrhonian Skepticism refutes all dogmas and opinions, even the claim that “nothing can be known,” and vehemently clings to indeterminacy.

Aenesidemus, the Pyrrhonian Skeptic, advanced the “Ten Modes,” arguments that address typical difficulties in appearances and judgment—each aimed toward the conclusion that we ought to suspend judgment if we are to be at peace. The first mode argues that other animals sense things differently from human beings, and that we cannot therefore pretend to place any absolute value on the things sensed. Since the qualities of sensation vary from species to species, for example, “the quail thrives on hemlock, which is fatal to man” (DL IX.80), we ought to suspend value judgments upon those things. In the quoted example, then, the hemlock is clearly not in itself evil, but neither is it in itself good; rather, it is a matter of indifference. The remaining modes follow a similar pattern, highlighting relativity—whether cultural, personal, sensory, qualitative or quantitative—as evidence that we ought to suspend judgment.

The Skeptics, as Pierre Hadot says, use “philosophical discourse…to eliminate philosophical discourse” (143). That is, they do not adhere to any philosophical position, but use the tools of philosophy to gain a sense of simplicity and tranquility in life, thereby ridding themselves of the need for philosophy. By using dialectic, and opposing one argument to another, the Skeptic suspends judgment, and is not committed to any particular position. The Skeptic,

In everything he did…was to limit himself to describing what he experienced, without adding anything about what things are or what they are worth. He was to be content to describe the sensory representations he had, and to enunciate the state of his sensory apparatus, without adding to it his opinion. (Hadot 145)

We might wonder just how practical such an approach to life would be. Can we flourish or thrive, effectively communicate, or find cures for diseases by merely describing our experience of the world? For example, antibiotics can help, more often than not, to cure diseases caused by certain bacteria. Could we not say, for practical purposes, that we know this to be the case? We are not, after all, ignorant of the fact that bacteria are becoming resistant to certain antibiotics, but this does not mean that they do not work, or that we cannot someday find alternative cures for bacterial infections.

The Skeptic could reply in several ways, but the most effective reply to the example provided might go something like this: Medicine does not bring us knowledge, if knowledge is certainty. Medicine, and what it claims to know, has, after all, changed significantly. The practice of medicine is just another way of describing the way certain bodies interact with other bodies in a given time and place. But the Skeptic would go further. The curing of a disease, he would say, is neither good nor bad. Perhaps my disease is cured, and the next day, I am killed in some other way. If death is a matter of indifference, then the cure for illnesses must be, too. Again, we might wonder in this case how one is ever spurred to action.

6. Post-Hellenistic Thought

Platonic thought was the dominant philosophical force in the time period following Hellenistic thought proper. This article focuses on the reception and reinterpretations of Plato’s thought in Neoplatonism and particularly in its founder, Plotinus.

a. Plotinus

Plotinus (204-270 C.E.), in his Enneads—a collection of six books broken into sections of nine—builds upon Plato’s metaphysical thought, and primarily upon his concept of the Good. Plotinus is also informed by Aristotle’s work, his conception of the Unmoved Mover (thought thinking thought) in particular, and is familiar with the bulk of the ancient philosophical tradition. As Kevin Corrigan says, “Plotinus transforms everything he inherits by the very activity of thinking through that inheritance critically and creatively” (23). In other words, Plotinus inherits concepts of unity, the forms, divine intellect, and soul, but makes these concepts his own. The result is a philosophy that comes close to a religious spiritual practice.

There are three aspects to Plotinus’ metaphysics: the One, Intellect, and Soul. The One is the ineffable center of all reality and the wellspring of all that is—more precisely, it is the condition of the possibility for all being, but is itself beyond all being. The One cannot be accurately accounted for in discourse. We can only contemplate it, and at most relay our own experience of this contemplation (Corrigan 26). We can speak negatively about the One (VI, 9.3). Thus, for example, we say that it is impassive. It does not create Intellect or Soul or anything else; rather, by its supreme nature, it merely emanates Intellect and Soul.

i. Intellect, Soul, and Matter

The Intellect emanates from the One because of the One’s fullness. The One, by being the One, simply gives off the Intellect, so to speak (Enneads V, 2.7-18). Since being moves out from its source and returns to its source (Corrigan 28), the Intellect turns towards the One and contemplates it. The Intellect is other than the One, but united with it in contemplation. As other, it gives rise to multiplicity, namely the forms that it is and that it thinks (it thus thinks itself). The Intellect generates Soul, which shares in intellect, but also animates the material world. Thus, the material world is generated by Soul, and this includes every individual being. A particular human being, then, has its share of soul, and the highest part of its soul is intellect, where true selfhood resides.

ii. The True Self and the Good Life

The best life for human beings necessitates that each human become his or her own true self, which is the intellect. That is, we must turn away, as much as is possible, from matter and the sensible world, which are distractions, and be intellect (Enneads I.4). To become one’s true self is to live the best life. Being oneself in this sense, however, is quite different from the individuality promoted in the Western world. Hadot says, “To become a determinate individual is to separate oneself from the All by adding a difference which, as Plotinus says, is a negation. By cutting off all individual differences, and therefore our own individuality, we can become the All once again” (166). The best life depends upon becoming one’s true self via the intellect, which means to step away from the part of the soul by which we typically identify ourselves, the passionate and desiring part of the soul. If we are now accustomed to identifying ourselves by our likes, dislikes, opinions, and so forth, then a true Plotinian self would not be a self at all. For Plotinus, however, this is true selfhood since it is closest to the center of all life, the One.

b. Later Neoplatonists

Plotinus set off a tradition of thought that had great influence in medieval philosophy. This tradition has been known since the 19th century as “Neoplatonism,” but Plotinus and other Neoplatonists saw themselves merely as followers and interpreters of Plato (Dillon and Gerson xiii). Plotinus’ student, Porphyry, without whom we would know little to nothing about Plotinus or his work, carried on the tradition of his master, although we do not possess a full representation of his work. With Iamblichus came a focus upon Aristotle’s work, since he took Aristotle as an informative source on Platonism. Neoplatonism also saw the rise of Christianity, and therefore saw itself to some degree in a confrontation with it (Dillon and Gerson xix). Perhaps in part because of this confrontation with Christianity, later Neoplatonists aimed to develop the religious aspects of Neoplatonic thought. Thus, the later Neoplatonists introduced theurgy, claiming that thought alone cannot unite us with gods, but that symbols and rites are needed for such a union (Hadot 170-171).

c. Cicero and Roman Philosophy

Greek philosophy remained the dominant philosophy for centuries, including in the Roman Republic and in the imperial era. Cicero (106-43 B.C.E.) considered himself to be an Academic Skeptic, although he did not take his skepticism as far as a renunciation of politics and ethics. He is a very useful source for the preservation of, and commentary upon, not only Academic Skepticism, but also the Peripatetics, the Stoics, and the Skeptics. He was also an accomplished orator and politician, and authored many works of his own, which often employed skeptic principles or commented upon other philosophies. He took pains, as a true Skeptic, to present both sides of an argument. Cicero was murdered in 43 B.C.E., during the rise of the Roman empire.

Stoicism played an important role in the imperial period, especially with the Roman emperor Marcus Aurelius. Marcus is most famous for his so-called Meditations, which is a translation of the Greek ta eis heauton, “[things] to himself.” As the Greek title clearly shows, these meditations were meant for Marcus himself. These were reminders on how to live, especially as an emperor who saw turbulent times. This work, in its usually short, pithy statements, reveals some principles of stoic physics, but only in service of its larger ethical orientation. It advocates a life of simplicity and tranquility lived according to nature.

7. Conclusion

From the Presocratics to the Hellenists, there is a preference for reason, whether it is used to find truth or tranquility. The Presocratics prefer reason or reasoned accounts to mythology, sometimes in order to find physical explanations for the phenomena all around us, to think more clearly about the gods, or sometimes to find out truths about our own psychology. For Socrates, the exercise of reason and argumentation was important to recognize one’s own limitations as a human being. For Plato, the life of reason is the best life, even if it cannot ultimately answer every question. Aristotle used reason to investigate the world around him, in some sense resuscitating the Presocratic preference for physical explanations, and returning lofty discussions to earth. The Hellenists emphasized philosophical practice, always in accordance with reason. We have also seen the profoundly influential tradition set in motion by Plato with the development of his thought into the so-called Neoplatonic era. That scholars and the intellectually curious alike still read these works, and not merely for historical purposes, is a testament to the depth of thought contained therein.

8. References and Further Reading

a. Presocratics

i. Primary Sources

  • Diels, Hermann and Walther Kranz. Die Fragmente der Vorsokratiker: Griechisch und Deutsch. Berlin: Weidmannsche Buchhandlung, 1910. Print.
    • This is the first and most traditionally used collection of Presocratic fragments and testimonies. This edition has the fragments in Greek with German translations. The book is no longer in print, and while it is still widely cited in scholarship, it is not the work cited in this article.
  • Graham, Daniel W. The Texts of Early Greek Philosophy: The Complete Fragments and Selected Testimonies of the Major Presocratics. 2 vols. Cambridge: Cambridge University Press, 2010.
    • This is the first collection of the Presocratic fragments and testimonies published with the original Greek and English translations. It is the work cited in this text. Graham offers a short commentary on the fragments, as well as references for further reading for each thinker. He has organized by topic the fragments for each thinker, and labels the fragments with an F, followed by the number of the fragment. That is how the fragments have been cited in this article. Testimonies are cited merely by their designated numbers.

ii. Secondary Sources

  • Barnes, Jonathan. The Presocratic Philosophers. London and New York: Routledge, 1982.
    • A classic work with interpretations of the Presocratics.
  • Burnet, John. Early Greek Philosophy. London: A&C. Black Ltd., 1930.
    • Another classic work with interpretations of the Presocratics.
  • Long, A.A. ed. The Cambridge Companion to Early Greek Philosophy. Cambridge: Cambridge University Press, 1999.
    • A collection of sixteen essays by some of the foremost scholars on Presocratic thought. The essays are generally accessible, but some are more appropriate for specialists in the field.
  • McKirahan, Richard D. Philosophy Before Socrates: An Introduction with Texts and Commentaries. Indianapolis: Hackett, 1994.
    • This is a book for non-specialists and specialists. It contains most fragments for most thinkers and reasonable explanations and interpretations of each. There is also a helpful chapter at the end of the book on the nomos-phusis debate. The text includes a fairly extensive section of suggestions for further reading.
  • Vlastos, Gregory. “Ethics and Physics in Democritus.” Philosophical Review, vol. 2, 578-592, 1994.
    • This article is technical but offers insight into the connection between Democritean physics and ethics, and it was cited in the current overview.

b. Socrates and Plato

i. Primary Sources

  • Cooper, John, ed. Plato: Complete Works. Indianapolis: Hackett, 1997.
    •  The work is the most comprehensive and is also used throughout this article. This collection includes all of Plato’s authentic work as well as every work considered to be spurious or likely spurious. There is no other such collection in English. Any citations of John Cooper in this article come from Cooper’s introduction to this work.
  • Xenophon, IV: Memorabilia, Oeconomicus, Symposium, and Apology. Jeffrey Henderson ed. E.C. Marchant and O.J. Todd trans. Cambridge: Harvard University Press, 2002.
    • This is from the Loeb Classical Library, and accordingly has the original Greek with English on the facing page.

ii. Secondary Sources

  • Benson, Hugh H., ed. A Companion to Plato. Malden: Blackwell Publishing, 2006.
    • This is a collection of scholarly articles on Plato’s work, and on Plato’s version of Socrates.
  • Brickhouse, Thomas C. and Nicholas D. Smith, Plato’s Socrates. New York: Oxford University Press, 1994.
    • This is a scholarly yet approachable book on just what the title suggests. It covers a range of problems that thoughtful readers will encounter when reading Plato.
  • Kraut, Richard ed. The Cambridge Companion to Plato. Cambridge: Cambridge University Press, 1992.
    • This is a collection of articles from premier Plato scholars on a variety of topics.
  • Morrison, Donald R. ed. The Cambridge Companion to Socrates. Cambridge: Cambridge University Press, 2011.
    • This is a collection of scholarship on historical, fictional, and philosophical perspectives of Socrates from Aristophanes to Plato.
  • Nails, Debra, “The Life of Plato of Athens,” in A Companion to Plato. Hugh H. Benson ed. Malden: Blackwell Publishing, 2006.
  • Nails, Debra, The People of Plato: a prosopography of Plato and other Socratics. Indianapolis: Hackett, 2002.
    • While this book can be laden with details, it is an indispensable resource for Plato scholars, as well as for anyone curious enough to know more about the various interlocutors and character references in Plato’s dialogues.
  • Taylor, A.E., Plato: The Man and His Work. New York: Meridian Books, 1960.
    • Although dated, this book offers a survey and assessment of the bulk of Plato’s dialogues.
  • Tigerstedt, E.N., Interpreting Plato. Stockholm: Almqvist & Wiksell International, 1977.
    • This book is heavy on detail, but it provides a valuable survey of problems in and interpretations of Plato.
  • Vlastos, G., Socrates: Ironist and Moral Philosopher. Ithaca: Cornell University Press, 1991.
    • This book was especially influential for its chronological categorization of Plato’s dialogues, although the chronological reading has since lost its influence.

c. Aristotle

i. Primary Sources

  • Barnes, Jonathan ed. The Complete Works of Aristotle. Princeton: Princeton University Press, 1984.
    • This book is the most comprehensive, and it includes spurious works or works thought to be spurious. It is also the edition cited in this article.

ii. Secondary Sources

  • Barnes, Jonathan, ed. The Cambridge Companion to Aristotle. Cambridge; New York: Cambridge University Press, 1995.
    • This book contains scholarly articles on a variety of subjects in Aristotle’s thought.
  • Broadie, Sarah, Ethics With Aristotle. New York; Oxford: Oxford University Press, 1991.
    • This book is a good overview of and commentary upon Aristotelian ethics.
  • Burnyeat, Myles, A Map of Metaphysics Zeta. Pittsburgh: Mathesis Publications, 2001.
    • This book is meant to help readers navigate one of the most difficult books of Aristotle’s most difficult work.
  • Irwin, Terence, Aristotle’s First Principles. Oxford: Oxford University Press, 1988.
    • Although somewhat dense, this work provides insight into Aristotle’s metaphysical first principles, which underlie much of his work.

d. Hellenistic Philosophy

i. Primary Sources

  • Empiricus, Sextus, Outlines of Scepticism. Julia Annas and Jonathan Barnes eds. Cambridge: Cambridge University Press, 2000.
    • This book gives a good overview of Hellenistic Skepticism, and contains helpful notes from Annas and Barnes.
  • Epictetus, The Handbook, Nicholas P. White trans. Indianapolis: Hackett, 1983.
    • Although Epictetus was not a Hellenist, his formulation of stoic ethics is concise and highly influential. This work was also cited in this article.
  • Corrigan, Kevin, Reading Plotinus: A Practical Introduction to Neoplatonism. West Lafayette: Purdue University Press, 2005.
    • Corrigan presents key readings representative of Plotinus’ philosophy, and after each section of primary readings, provides his own lucid and helpful commentary.
  • Dillon, John and Lloyd P. Gerson, Neoplatonic Philosophy. Indianapolis: Hackett, 2004.
    • This is a helpful introduction to Neoplatonic thought. The bulk of the text consists of selections from Plotinus’ work, but it also contains selections from Porphyry, Iamblichus, and Proclus.
  • Inwood, Brad, and L.P. Gerson trans. and ed. Hellenistic Philosophy: Introductory Readings. Indianapolis: Hackett, 1988.
    • Since, as with the Presocratics, original works are lacking in Hellenistic thought, this book is a good place to begin. It collects central texts, including ancient commentaries, covering the central themes of physics, logic, and ethics from epicurean, stoic, and skeptic perspectives.
  • Inwood, Brad, and L.P. Gerson trans. and ed. The Stoics Reader: Selected Writings and Testimonia. Indianapolis: Hackett, 2008.
    • Again, few original works survive from Hellenistic Stoicism proper, but this book provides central readings in Hellenistic Stoicism.
  • Laertius, Diogenes, Lives of Eminent Philosophers II. Jeffrey Henderson ed. R.D. Hicks trans. Cambridge: Harvard University Press, 1931.
    • This volume of Diogenes’ famous work contains the three letters purported to be Epicurus’ letters on physics and ethics.
  • Plotinus, Enneads. 7 vols. ed. Jeffrey Henderson and trans. A.H. Armstrong. Cambridge: Harvard University Press, 1966.
    • This is the Loeb edition of Plotinus’ complete Enneads, along with Porphyry’s “Life of Plotinus.” This edition has the Greek facing the English translation.

ii. Secondary Sources

  • Algra, Keimpe, Jonathan Barnes, Jaap Mansfeld, and Malcolm Schofield, eds. The Cambridge History of Hellenistic Philosophy. Cambridge: Cambridge University Press, 1999.
    • Although this work is intended for specialists and non-specialists alike, it is dense and sometimes overburdened with detail for the non-specialist’s tastes. It does, however, provide valuable historical information and commentary.
  • Branham, R. Bracht and Marie-Odile Goulet-Cazé, eds. The Cynics: The Cynic Movement in Antiquity and Its Legacy. Berkeley: University of California Press, 1996.
    • This is an informative collection of scholarly articles on a variety of topics in Cynicism. It also has a very helpful historically oriented introduction, which was cited in this article.
  • Hadot, Pierre, What is Ancient Philosophy? Michael Chase trans. Cambridge: Harvard University Press, 2002.
    • This book contains informative and innovative readings of ancient philosophy in general, and was cited in this article for its treatment of Hellenistic philosophy.

 

Author Information

Jacob N. Graham
Email: jgraham@bridgewater.edu
Bridgewater College
U. S. A.

Thomas Reid: Theory of Action

Thomas Reid (1710-1796) made important contributions to the fields of epistemology and philosophy of mind, and is often regarded as the founder of the common sense school of philosophy. However, he also offered key arguments and observations concerning human agency and morality.

Reid carefully criticized the views of his contemporaries, and defended an account of human freedom in which he argues that only beings endowed with will and understanding, who also have power over their will and actions, and who are directed by motives and reasons, are agents, beings capable of acting freely. Agents are, in Reid’s terms, efficient causes of some of their actions, causing them by exerting their power to act. To say that an agent possesses active power is, for Reid, to say that the agent is capable of exerting a productive capacity and that in such cases it is really up to the agent to produce or to refrain from producing the actions which the agent has the power to produce. Reid presents this theory of action and human liberty primarily in his last published work, the Essays on the Active Powers of Man (1788).

Reid argues that human beings have power not only over their actions, but also over their choices and intentions. Reid therefore opposes what he calls the system of necessity, and openly challenges the views of philosophers such as John Locke, Joseph Priestley, Anthony Collins, and David Hume, who tend to share the view that human beings have freedom over their actions but not over their wills.

In the Essays on the Active Powers of Man, Reid develops his theory of action around an examination of active power, of the will, of human motives and beliefs, and around a defense of his account of human freedom.

Table of Contents

  1. On Active Power
    1. On the Notion of Power
    2. Active Power, Will, and Understanding
      1. General Observations Concerning Active Power
      2. Agents with Active Power are Endowed with Will and Understanding
  2. Of the Will
    1. General Observations
    2. Voluntary Operations of the Mind
  3. The Principles of Action or Motives
    1. Mechanical Principles
    2. Animal Principles
    3. Rational Principles
    4. Influence of Motives on the Will
  4. On Moral Liberty
    1. The Notion of Moral Liberty
    2. Of Causes
      1. Efficient Causes and First Principles
      2. Physical Causes
      3. Principles of Action and Causation
    3. The Problem of Infinite Regress
    4. First Argument for Moral Liberty
    5. Second Argument for Moral Liberty
    6. Third Argument for Moral Liberty
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. On Active Power

a. On the Notion of Power

Reid writes in the first essay of the Essays on the Active Powers that active power is the power of acting—the power of producing a work of art or of labor (EAP, 12). Reid starts his book by arguing that, contrary to what David Hume had defended (T.1.3.14), human beings have an idea of power. In order to show that we do have such a notion, Reid observes that the ideas of acting and being acted upon are found very early and universally in the minds of children. If one is able to form the thought of hitting or of being hit, and if all languages have active and passive voices, Reid argues, then one must be able to form the ideas of activity and of passivity (EAP, 13). Moreover, Reid contends that all humans have the notion of active power because many ordinary operations of mind imply a belief in active power, both in ourselves and in others. Actions such as deliberating, resolving, promising, counselling, encouraging and commanding imply, Reid writes, the belief that we have some degree of power over these operations and over their effects.

When Reid writes that humans have an idea or a notion of power, he means that humans have at least a conception of power. Conception, for Reid, is a term of art that should not be understood in the Kantian sense of subsuming some object under a particular concept, or as predicating some quality to the object. To the contrary, conception, for Reid, is one of the most basic cognitive capacities, whereby beings (both humans and some non-human animals) hold something in mind without predicating anything to it. The being grasps the object by holding it in mind without thinking anything about or of the object. Reid thinks that human beings all have, at the very least, a conception of power—even though they often also form more complex beliefs or judgments about power.

Hume, Reid points out, argues that since the notion of power is produced neither by sensation nor by reflection, and since we cannot properly define the notion, we have no such idea (EAP, 24). Reid, in reply, points out that although we do not directly perceive an agent’s power, we are nonetheless conscious of our exertions of power. Moreover, even though ‘power’ cannot be defined by appealing to more general or more simple categories, one may still form a conception of power from its effects, and one may still offer observations about its attributes and qualities. Reid therefore writes that since we do have a notion of power we may now study its characteristics. The first attribute Reid observes is that active power only exists in beings with will and understanding.

b. Active Power, Will, and Understanding

Reid turns to the question of whether beings who have active power could at the same time lack understanding and will (EAP 27). We cannot answer this question, Reid points out, by observing changes in nature, since we perceive neither the agent nor the power behind these changes (more on changes in nature in section 4.b.ii). It is better, Reid writes, to turn our attention to human agents since, in agreement with Locke, he holds that our first ideas of power are taken from the power we find ourselves experiencing when we produce changes in our bodies and in the world around us (EAP, 20).

i. General Observations Concerning Active Power

By turning our view to our own exertions of power we notice, according to Reid, that power is brought into action by volition. A person’s capacity to act is actualized by exerting that power. There are many times when a person may not use his or her capacity to act. However, Reid argues, when a person does exert that power, it is because the agent willed, at some point, to perform the action either immediately or later in time. An agent’s power, Reid writes, “is measured by what he can do if he will” (Of Power 10). For Reid, we notice that our willings or choices are exertions of power when we turn our attention to them. Opinions vary about Reid’s use of various terms related to volition, but one way to understand Reid is to think of volitions as willings, choices, or decisions. And, for Reid, we are conscious that our choices are something over which we have some control. When we choose to do something immediately, the choice (or volition) “is accompanied with an effort to execute that which we willed” (EAP, 50). We may not always pay attention to this effort, but, Reid continues, “this effort we are conscious of, if we will but give attention to it; and there is nothing in which we are in a stricter sense active” (EAP, 51). Reid’s view, therefore, is that choices or willings are exertions of power: they are mental events through which the agent puts his or her power into action, and agents are conscious of these exertions of power.

Reid’s account is hence different from an account such as the one Thomas Hobbes defended, where an agent is free or has active power (freedom) to perform an action if he willed to perform the action. If the agent wills to perform the action, but is prohibited from performing it, then the agent is not free, according to the Hobbesian picture (Hobbes, 1648, 240). Reid, however, contends that an agent is free not only to act if he willed to act, but that the agent is also free to will or to refrain from willing. The anti-Hobbesian view Reid defends is that an agent is not free unless he has power over his willing or refraining from willing. First, Reid argues, we have an idea of power that implies having control or power over our choices, and this belief is presupposed in many other beliefs, and in our everyday actions (see previous section and arguments for moral liberty). Second, Reid observes that we are conscious of the effort we exert when we choose. The exertion of power is the object of consciousness, which we directly observe if we carefully attend to the objects of our consciousness.

ii. Agents with Active Power are Endowed with Will and Understanding

That beings with active power possess a will follows immediately, Reid holds, from his claim that the power to act implies the power to will. Reid develops his argument for this claim in several steps. First, we are conscious of having power over many of our actions: over the movements of our bodies, and the movements of our minds. Second, for Reid, having power over an end such as an external action that depends on our will requires having power over the actions of the agents which bring about the end. Reid writes that the action or effect an agent produces “cannot be in his power unless all the means necessary to its production be in his power” (EAP, 203). By ‘necessary means’ Reid does not refer to all the involuntary physical or biological events that are necessary conditions for performing the action. After all, he writes that the man who intends to shoot his neighbor dead is the cause of the death, but “he neither gave to the ball its velocity, nor to the powder its expansive force, nor to the flint and steel the power to strike fire…” (EAP, 41). The velocity of the ball is not something that is in the agent’s power, even though it is a necessary condition for the killing of the man. The choice and intention of the agent, however, are events over which the agent does have power (1.b.i).

Reid’s third step, therefore, is the premise that choices and intentions are actions; they are mental events over which the agent has power—the evidence for this claim is that agents are conscious of the effort of exerting their power when they choose. Furthermore, choices and intentions are necessary means to performing effects. Hume and his predecessors agreed that we have power over our actions or effects. Reid, therefore, points out that “to say that what depends upon the will is in a man’s power, but the will is not in his power, is to say that the end is in his power, but the means necessary to that end are not in his power, which is a contradiction” (EAP, 201). For Reid, this statement follows from the claim that willings (choices) are actions. If an agent has the power to perform an external action A, then the agent has the power to perform another external action B required to perform A. But if an internal action C is required to perform A, then the agent who has the power to perform A also has the power to perform C. Reid’s conclusion, therefore, is that agents who must carry out a chain of actions in order to perform an end must have the power over each specific action (including volitions) in the chain of actions in order to perform the end (see Yaffe 2004, chapter 1, for a more complete development of this argument).

Now, if the performance of an external action requires having power over the internal action of choosing, then the power to act requires having a will. Only agents with a will are capable of willing. This does not mean that all beings with a will have power over their wills. After all, some animals might be endowed with a will and act voluntarily but not have power to control their will. But having power over one’s will implies having a will. For Reid, therefore, human beings who are able to have some control over their actions and choices, who have active power, must be endowed with a will.

According to Reid, active power or human freedom requires having, not only a will, but also understanding to direct that will. To reach this conclusion Reid argues it is important to observe that the power to will implies the power to refrain from willing. To have control over our choices and to observe that our choices are exertions of active power leads to the observation that in many cases we are free to will or to refrain from willing. This is what Reid means when he argues that we have power over our actions. Actions (external actions or internal willings), so long as they are truly the agent’s actions, presuppose a capacity to act or to refrain from acting. The inability to refrain from acting is the consequence of forces with respect to which an agent is passive. Reid therefore writes that

If, in any action [an agent] had the power to will what he did, or not to will it, in that action he is free. But if, in every voluntary action, the determination of his will be the necessary consequence of something involuntary in the state of mind, or of something in his external circumstances, he is not free; he has not what I call the liberty of a moral agent, but is subject to necessity. (EAP, 197)

To be active, for Reid, requires the ability to will and to refrain from willing to perform an action. The two-way power of willing and refraining from willing defines a person’s capacity to truly have control over his choices and actions.

If active power is a power to will and to refrain from willing, Reid thinks only beings with understanding could have such a two-way power. This two-way power implies the capacity to weigh reasons, or at least to be able to act in light of a reason, or against some good reason, and this would require possessing intellectual capacities. Hence liberty, Reid writes, “implies, not only a conception of what he wills, but some degree of practical judgment or reason” (EAP, 196). The power to act, therefore, implies the power to will. The power to will together with the capacity to weigh reasons and act in light of them or not (the ability to will and to refrain from willing for certain reasons), require that the being with active power be endowed with both a will and some degree of understanding.

2. Of the Will

a. General Observations

Reid introduces his essay on the will (Essay II of EAP) by pointing out that the will is the power to determine to act or to refrain from acting. If active power is the power to act freely, to have power or control of the direction of (some of) our thoughts, of our bodies, and to initiate changes outside of us, the will seems to be simply the capacity to determine, that is, to choose, to decide, and to intend a certain course of action. Since we have no direct knowledge of the will other than by its effects, Reid focuses his discussion on what he calls ‘volitions,’ a term that refers to the acts of the will.

Reid continues by describing five essential qualities of every act of will. First, acts of will are about something—they have an object. The person who wills must will something, Reid writes, and this implies being able to have a conception of what is willed, and an intention to carry out what is willed. Acts of will produce voluntary actions, and what distinguishes them from things done by instinct or from habit is precisely that voluntary actions involve a conception and intention, whereas things done instinctively do not require any thought or intention.

Second, the object of will must be some action of the person who wills. For Reid, this is what distinguishes willings (acts of will) from commands and desire. We may desire things that are not within our power, and we may command others to do things we do not desire them to do (think of the judge who commands that one be punished even though he or she does not desire that the criminal be punished). Acts of will, for Reid, are therefore to be distinguished from desires. Third, the object of the act of will must be something we believe to be in our power and to depend upon our will (EAP, 50). If one loses the power to speak, for instance, and believes one has no such power, one does not will to speak, but only to try to speak if, say, recovery is possible.

Fourth, Reid points out that the volition to act immediately is accompanied with an effort to do what is willed. We might not always be conscious of this effort, especially when the action is easy, but Reid thinks we might still notice and be conscious of the effort involved if we are attentive to what is happening when we determine to act. Finally, Reid observes that in decisions and intentions that are important to an agent, there is always some motive or reason that influences and inclines the agent in willing one way or another. Unimportant actions might not be performed for particular reasons, but actions that are wise, virtuous, and meaningful are performed for reasons (they are motivated by rational principles of action; see section 3.c).

b. Voluntary Operations of the Mind

The intellect and the will are, Reid writes, always conjoined in the operations of the human mind (as far as we know). Even acts usually attributed to the understanding only, like perceiving and remembering, for instance, involve some degree of activity. Conversely, every act of will involves at least some conception, intention, belief, and often also some belief about the value or worth of the action. These operations clearly involve the understanding, Reid points out (EAP 60).

Reid writes that there are three operations of the mind that have mistakenly been thought to be intellectual operations only. Reid, however, holds that these operations are also active capacities. The voluntary operations he has in mind are attention, deliberation, and fixed purposes. It is important for Reid to show the voluntary aspect of these operations because they are involved in a person’s character traits and personality. Reid’s ‘necessitarian’ opponents might think persons have no control over their personalities, but, for Reid, since the operations involved in them are active, persons have some degree of control over them and hence some responsibility for them.

The first voluntary operation Reid discusses is attention. The attention we might give to a subject or to an action for the most part depends upon our will. For Reid this does not mean, however, that attention is completely within the control of agents and that attention is always the result of an act of will. To the contrary, attention is often the involuntary result of some impulse or habit. Our passions and affections direct our attention to the objects that move them. Still, Reid holds, the attention one finds oneself to have may be changed or focused. Even though the wonderful smell of the garden draws my attention to it as I walk past, the act of stopping to consider the garden and to really pay attention to it is an act of the will, Reid argues. Since attention is a voluntary act, Reid will later be in a position to argue that “we ought to use the best means we can to be well informed of our duty, by serious attention to moral instruction…” (EAP, 271).

The next voluntary operation of the mind Reid discusses is deliberation. Genuine human actions are not always the result of deliberation. One may act freely out of a fixed resolution or habit to act virtuously, for instance. Moreover, one may not always have time to deliberate. And one does not deliberate in cases that are perfectly clear: “no man deliberates,” Reid writes, “whether he ought to choose happiness or misery” (EAP, 63). But when time permits, and when the situation is unclear, then a person might deliberate. Deliberation, for Reid, is the exercise of the agent’s capacity to consider various reasons, various desires, or various feelings and affections that move him to perform a particular action or set of actions. Deliberation is an active capacity to consider outcomes and motives and then to determine to act according to, or against, some of these reasons. Since deliberation is an active operation, Reid will later be able to show that we have a duty to deliberate coolly and impartially about our actions.

Finally, fixed purposes and resolutions are also, and essentially, active operations of the mind according to Reid. He distinguishes them from volitions or determinations to act immediately. They are, rather, intentions to act at a distance. One may decide to perform a single action in the future, or one may resolve to follow a course of actions, or to pursue some general end. In fact, Reid’s view is that the general purposes and fixed resolutions of agents are the basis of their character traits. Character traits, as opposed to natural tempers, are for Reid general and regular tendencies to act in a certain manner, and these tendencies result from the person’s resolution or purpose to act according to some plan or rule. A person of virtue, for instance, is a person who formed the resolution to be a person of virtue (EAP, 69), Reid writes. This resolution will express itself in the person’s general and regular tendency (the character trait) to act virtuously. Ultimately, therefore, the character traits of agents are based in voluntary determinations of the agent’s will. And a person’s constancy or steadiness depends on his or her commitment to those general purposes and resolutions. When an agent resolves to act according to his principles, this resolution is clearly an act of will, Reid argues. Reid concludes by pointing out that since resolution is a voluntary act, over which we have some control, it may therefore properly be a virtue, whereas willfulness, inflexibility and obstinacy may properly be recognized as vices.

3. The Principles of Action or Motives

It is vital, Reid maintains, to have a correct understanding of the various motives or incitements to act, because without them, active power would be utterly useless and fickle. Whatever incites us to act is, according to Reid, a principle of action. By ‘principle of action’ Reid means the last answer to the ‘why did you act?’ question. For Reid, motives or principles of action are not instrumental means to reaching some further end (EAP, 110). Principles of action are motives for actions pursued for no further reason, or which are desired for no further end or purpose.

For Reid it is important to pay close attention to those principles on which human beings do or could act—recognizing that humans often act from a variety of principles concurring in one direction. A correct account of the various motives or principles of action will help us understand the character of the action performed: whether it is instinctive, the outcome of some natural affection, intelligible, meaningful, wise, virtuous, etc. (EAP, 75).

Reid categorizes these principles into three classes: the mechanical, the animal, and the rational principles of action.

a. Mechanical Principles

Mechanical principles of action are motives that influence beings to act and which do not require any thought or intention. Reid writes that they require no attention, no deliberation, no will, no conception and no intention. They are completely blind impulses. Among the collection of mechanical motives, Reid points out, we find instincts and habits. Instincts are blind tendencies that are natural, found early in animals and humans, and which do not require repeated use in order to motivate. That infants cry when they are hurt, that they are afraid when left alone, and that they are terrified by an angry tone of voice but soothed by soft and gentle voices are the results, according to Reid, of such mechanical instincts (EAP, 79). Brute animals are also moved to produce amazing works (like honeycombs or spider webs) from mere instinct, without any understanding of the mechanics or mathematics displayed by their constructions. Another instinct Reid observes in both humans and animals is the instinct of imitation. And even belief, in the early part of our lives, is guided by instinct and imitation, Reid points out.

Human beings can also learn to perform an action easily, without any thought, by having performed the action frequently, as well as by the natural instinct to imitate others. The habit or facility to perform an action is therefore a mechanical principle, according to Reid. Speaking, writing, and riding a bicycle are, Reid points out, often performed blindly by those who have developed the habit of such activities (EAP, 88-90). These actions are carried out involuntarily and without any particular intention. The mechanical principle of habit will therefore be of great help, according to Reid, in developing the habit of performing those types of action a person has resolved to perform. Certain character traits, Reid holds, are the effect of a person’s fixed resolutions or purposes. These fixed resolutions and the actions one performs in line with them are, for Reid, voluntary operations (see 2.b.). But when these actions are performed regularly, the mechanical principle of habit helps agents acquire a facility to perform them.

b. Animal Principles

The second category of principles of action comprises those that Reid calls ‘animal’ because, he holds, we have them in common with many brute animals (EAP, 92). Among the animal motives, Reid notes the appetites (hunger, thirst, lust), constant desires (such as a desire for knowledge, power and esteem), benevolent affections (gratitude, love, parental affection, pity, esteem, friendship, and public spirit), and malevolent affections (emulation and resentment).

Reid argues that animal motives move all human beings who are capable of having thoughts, even though an animal motive is of a more instinctive sort when it is at work in animals and small children. When animal motives move and influence a being, the mental state which is, at the very least, required is a conception of the object or action toward which the being is moved. Conception is the bare minimum mental capacity necessary in order to be moved by animal motives, but animal motives may in many cases require forming beliefs (propositional attitudes), or even forming beliefs about value. Esteem for the wise, for instance, may require believing that the person esteemed is wise, virtuous, or acted in an estimable manner. An animal may have an instinctive kind of admiration or affection for its master, but this does not involve thinking of the master as wise, virtuous, or kind. According to Reid, the mental capacities of animals are more limited than those of humans, and animals consider actions only, and not the intentions of others. A motive such as gratitude, for instance, is observable in the dog who is “kindly affected to him who feeds it” (EAP 115), even though the person is about to kill it. Humans who are moved by gratitude, however, have in mind the intention of the benefactor, and the gratitude of humans supposes beliefs such as the belief that the action of the benefactor goes beyond what justice or morality requires (Ibid.).

Reid’s understanding of the nature of beliefs is not always clearly expressed. He often writes of beliefs in the same way he writes of judgments, where “the objects of judgment” are “expressed by a proposition”, but at other times he writes of belief as an attitude (belief, disbelief, or doubt) toward that judgment or proposition, which accompanies the judgment (EAP, 347). Reid, furthermore, thinks that animals of the most sagacious kind are capable of having conceptions of states of affairs, but he is unsure whether they are capable of forming beliefs. He writes in one place that he thinks brutes do not have opinions (EAP, 147), but he writes in another passage that brutes have instinctive beliefs, and, he points out: “Whether brutes have anything that can properly be called belief, I cannot say; but their actions shew something that looks very like it” (EAP, 86). In any case, whether or not they are capable of having beliefs, Reid points out, “it will be granted, that opinion in men has a much wider field than in brutes” (EAP, 147). Only the beliefs of human beings, Reid points out, involve evaluations, values, and reasoning about laws and systems. Animal motives (appetites, desires, affections) are present both in animals and in human beings. However, when they move animals and small children they are simpler mental operations, more similar to the mechanical motives, whereas when they are at work in adult human motivation they usually involve higher mental capacities.

Reid understands animal motives as desire/object pairs. He writes that animal motives always involve a desire for some object and sometimes—but not always—a sensation or feeling which is typical of each desire. Reid calls all animal motives animal or natural desires, since they all aim to achieve or attain some object or end.

Furthermore, according to Reid, each natural animal motive or natural desire has its own natural object (EAP, 113). Desires for the good of certain persons, desires for knowledge, desires for power, etc. imply a state or mental act in the agent who desires and some object desired. Reid writes, for example, that when we come to realize that food will satisfy our hunger, the desire and its object “remain through life inseparable. And we give the name of hunger to the principle that is made up of both” (EAP, 93). It is therefore natural, according to Reid, to describe each animal motive either in terms of the desire or mental state involved, or in terms of the end desired.

c. Rational Principles

Reid observes that human beings are not only moved by passions, affections or desires, but also by a third important class of motives, which he calls ‘the rational principles of action.’ Reid further distinguishes between two different rational principles of action: our overall good, and our duty. Reid calls these ‘rational’ because “they can have no existence in beings not endowed with reason, and, in all their exertions, require, not only intention and will, but judgment or reason” (EAP, 154).

The first rational motive, a regard for our good as a whole, which Reid also characterizes as regard for one’s overall interest or happiness, is a rational motive because only beings with reason are able to consider and be moved by what is their overall good. Reid observes that being motivated by our overall happiness requires the ability to compare actions and to determine the possible outcomes and consequences of future courses of action. Directing our actions in order to aim at our overall happiness therefore clearly requires having rational capacities.

The second rational motive, a regard for duty, also involves the rational capacity to form judgments or evaluations of what one ought, morally, to do. Agents must have the rational capacities to consider existing actions and agents, to see that this action is morally good and that another is wrong, and hence to form specific and particular estimations of the moral quality of actions and of persons. Agents must also have the capacity to form general principles or axioms such as “what is in no degree voluntary, can neither deserve moral approbation nor blame” (EAP, 271), or “we ought to act that part towards another, which we would judge to be right in him to act towards us, if we were in his circumstances and he in ours” (EAP, 274). Forming these beliefs requires a minimum level of rational capacities, for Reid, but also, and more importantly, a moral faculty or sense, which he also calls ‘conscience.’ These capacities are required to be able to judge that a future action is one we should perform.

If animal principles are best understood as desire/object pairs, rational motives are best understood in terms of the end, object, or state of affairs at which they aim, rather than in terms of the mental state required in order to be moved toward that end (Yaffe 2004, 108). It is true that Reid often speaks in a language that implies that rational principles are some kind of mental state. After all, Reid often writes that rational principles of action imply a belief or judgment about what is good: prudentially or morally. And it is easy to conclude that the motive is the belief, thought, judgment or conviction itself rather than the end that is pursued. However, as Yaffe points out, when Reid seems to hold such a view it is because he is emphasizing the point that in order to move us, ends must bear a special relation to us. An end is a reason for me only if I think of that end and if I think of my future actions as bearing the relevant kind of relation to me and to the end (Yaffe 2004, 108).

When an agent is moved by the motive of duty, the agent is moved by a future ideal or by a future state of affairs that will instantiate that ideal. The agent, for instance, recognizes that an action that does not yet exist is one he ought to bring about because the action is conducive to duty. Ultimately, for Reid, the motive does not depend on the agent’s desires since an agent might not desire to do what he thinks he ought to do.

For Reid animal motives, or passions in general, serve an important purpose and are good and useful parts of human and animal nature. These motives, when they are not deformed by bad education and bad habits, are often conducive to virtue (Kroeker 2011). However, if there is a conflict between animal and rational motives, the rational ones have the final word, according to Reid. Rational principles are simply better guides to what is wise and virtuous since they require knowing not only what we naturally desire but also what, in the long term, will be better, would fulfill moral laws and requirements, or would bring about that which is valuable. By their moral sense, in fact, human beings recognize the value and worth of their natural animal motives, and they might also consider which course of action, whether with or without the influence of animal motives, would be morally valuable. By adopting such a position, Reid clearly opposes the view of David Hume who holds that all motives are passions, and that ultimate ends are determined by human desires and not by the intrinsic value or worth of future courses of action (T 3.1 and EPM 161).

d. Influence of Motives on the Will

Reid further distances himself from the Humean position and from the position of philosophers such as Anthony Collins and Joseph Priestley (as well as Gottfried Wilhelm Leibniz, Baruch Spinoza and perhaps David Hartley) by claiming that motives do not causally necessitate the agent’s choices, intentions and actions. Rather, Reid argues that motives influence the agent in his or her volitions.

Animal motives and rational motives, Reid observes, influence agents in different ways. Animal motives, or what has commonly been called ‘passion,’ draw us “toward a certain object, without any farther view, by a kind of violence, a violence which indeed, may be resisted if the man is master of himself, but cannot be resisted without a struggle” (EAP, 55-56). These passions move us easily, and counterbalance the defects of our wisdom and virtue, Reid points out. For example, we often eat out of mere hunger, without any thought about what is wise or good for us, or without any thought about the quality of the object. In fact, Reid points out that it is often difficult to answer questions about what we should eat, how much, and how often. If we had to reason about such questions, we would often fail to be moved to eat. These passions, therefore, serve to preserve the human species, and to help us perform tasks (EAP, 52). Passions may be stronger or weaker, but an effort is always required to resist them. Two passions may move a being in contrary ways. In animals and in humans who do not have time to deliberate or think about what they are doing, the strongest passion may prevail (EAP, 53). But human beings are usually able to form some judgment about what they are doing, and to weigh goods and evils. Human beings are therefore usually passive in part, and active in part. When the passion is irresistible or when there is no time to determine, human beings are mostly passive. But when the person is able to deliberate calmly and impartially, and to determine according to distant goods and values (and not only in terms of present gratification), the active power of the person is increased.

Rational principles of action never causally determine an agent’s choices and intentions. They function like advice or like the testimony of various parties in front of a judge. They leave a person completely at liberty to choose and to determine (EAP, 59). By the rational principles of action, Reid writes, we judge “what ends are most worthy to be pursued, how far every appetite and passion may be indulged, and when it ought to be resisted” (EAP, 56). Therefore, both kinds of principles of action move us in different ways. The phenomenology involved is different in both cases. Passions are felt like forces that push us, whereas the rational principles are felt like arguments or advice, which may, at the most, produce a cool conviction of what we ought to do.

4. On Moral Liberty

a. The Notion of Moral Liberty

According to Reid, moral liberty is the agent’s power to act. Moral liberty and active power are therefore identical for Reid (see section 1 for more on active power). Moreover, moral liberty is more than the power an agent has over his or her voluntary actions. It is also power over “the determinations of his own will” (EAP, 196). Here Reid distinguishes between merely voluntary actions and free voluntary actions. Small children and animals, Reid observes, may act voluntarily if their actions are the result of their choices and intentions. However, their voluntary actions are determined by the strongest passion, appetite, affection or habit. Brute animals, Reid writes, lack moral government, which is the control one has to choose according to what one thinks is best or required. These beings, Reid continues, have no conception of a law or of a guide to action by which they could direct or fail to direct their choices and actions (EAP, 197). Instead of being guided by the moral law they are guided, blindly, by the physical laws that govern their constitution, in the same manner as the inanimate creation is governed by physical laws. Hence, since they cannot form the conception of what is best or required, they cannot govern their choices accordingly. Since their voluntary actions are the result of choices that are causally determined by physical laws, Reid concludes that they lack moral liberty.

Moral liberty, Reid contends, is a two-way power to will or not to will something (see 1.b). Reid holds that agents who possess moral liberty must not only be capable of following rules, guidelines, advice and arguments, but must also be capable of disobeying these directives (EAP, 200). One may imagine, Reid writes, some puppet that is endowed with understanding and will, but who has no degree of active power. This puppet, Reid points out, would be an intelligent machine, but it would still be subject to the same laws of motion as inanimate matter (EAP, 222). The puppet, that is, would be incapable of disobeying these laws. But being incapable of disobedience implies, Reid holds, that the puppet is not active in its obedience – it does not even obey in the proper sense of the term. To be free, or to possess active power in Reid’s sense, would mean that the puppet’s obedience is obedience in the proper sense; “it must therefore be their own act and deed, and consequently they must have power to obey or disobey” (EAP, 222). When agents act freely, their actions are truly up to them, and hence they must be capable of willing and of refraining from willing the action they perform. Free actions, for Reid, are the effects of the agent’s exertion (putting into effect) of his active power. He writes that it is “the exertion of active power we call action” (EAP, 13). For Reid, therefore, only actions that are produced by an agent’s exertion of this kind of active power are genuine, free, actions.

b. Of Causes

i. Efficient Causes and First Principles

According to Reid, we all have an idea of productive causes. These productive or efficient causes, for Reid, are causes that have the power to bring about certain events. Efficient causes have the power to produce changes. Efficient causes are therefore very different in nature from events that constantly precede or follow other events in nature. Events in nature, humans come to realize, are passive rather than active. “Instead of moving voluntarily, we find them to be moved necessarily; instead of acting, we find them to be acted upon…” Reid writes (EAP, 207). Moreover, Reid points out that constant conjunction does not link true causes with effects. Priestley and Hume, Reid writes, define a cause as a circumstance that is constantly followed by a certain event (EAP, 205). Reid disagrees—he holds that genuine causes cannot be defined this way. A preceding event does not have the power to produce the effect with which it is constantly conjoined, Reid points out. Smoke is constantly conjoined with fire, but, according to Reid, fire is not the active producer of smoke. And yet we all have a notion of causes that are productive—of causes which actually produce and are responsible for change.

An efficient or productive cause, Reid holds, “is that which has power to produce the effect” (Of Power, 6) and “which produces a change by the exertion of its power” (EAP, 13). To produce an effect, there must be in the cause a power to produce the effect, and the exertion of that power. For Reid, a power that cannot be exerted is no power at all (EAP, 203). The only things which possess the power to produce changes and which can exert this power are beings with will and understanding (see section 1.b.ii). Beings with active power, the will to exert it, and understanding to direct it, are agents. Efficient causes—causes in the proper sense—are therefore agents (EAP, 211).

Furthermore, Reid observes that a principle or belief that appears very early in the mind of human beings is that everything that begins to exist must have a cause (in the proper or real sense of the term) of its existence, “which had power to give it existence” (EAP, 15 and 202). For Reid “that things cannot begin to exist, nor undergo any change, without a cause that hath power to produce that change…is so popular, that there is not a man of common prudence who does not act from this opinion, and rely upon it every day of his life” (EAP, 25).

In examining Reid’s account, three questions must be answered. First, ‘what are first principles?’ Second, ‘is this principle of causality—that whatever begins to exist must have a cause—truly a first principle?’ And third, ‘if it is a first principle, does it require holding a strong notion of causes (causes as efficient causes or as agents)?’

First principles, according to Reid, are propositions “which are no sooner understood than they are believed” (EIP 452). First principles are things human beings believe naturally: “there is no searching for evidence, no weighing of arguments; the proposition is not deduced or inferred from another…” (Ibid.). What is believed is, for Reid, a first principle when it functions as an axiom for other propositions. Other propositions are discovered through the power of reasoning – either inductive or deductive. Reasoning, Reid writes, is the capacity of drawing a conclusion from a chain of premises. But “first principles, principles of common sense, common notions, self-evident truths” (EIP 452) are all terms for propositions that do not require reasoning from a chain of premises.

For instance, the proposition that the objects we perceive really do exist is a first principle (EIP 476). It is not something we usually hold by deducing it from other premises. Nor is it inferred from repeated experiences, since the proposition is presupposed in the experience itself. Another example is the proposition that the future will resemble the past (EIP 489). “Antecedently to all reasoning,” Reid writes, “we have, by our constitution, an anticipation, that there is a fixed and steady course of nature” (Inquiry 199). The child who has once been burnt by fire, Reid observes, will continue to shun fire. Repeated experiences and reasoning may help children confirm that fire always burns, but they will believe, by their very nature, that the fire which burned them once will burn them again, before reasoning and experience have offered any confirmation of such a fact. One may offer support for first principles, and false first principles may be defeated (EIP 463-467), but first principles are natural and found to be true as soon as they are understood, regardless of any proof in their favor.

The principle of causality, according to Reid, is one of these natural principles. That whatever begins to exist must have a cause which produced it is, according to Reid, a first principle of our human constitution (EIP 497). Reid writes that Hume has convincingly shown that the arguments offered in defense of this principle all take for granted what must be proved (EIP 498). Nor, according to Reid, can this principle be proved by induction from experience. Indeed, Reid writes, “in the far greatest part of the changes in nature that fall within our observation, the causes are unknown, and therefore, from experience, we cannot know whether they have causes or not” (EIP 499). Yet it would be absurd, Reid argues, to conclude that these events do not have causes. The proposition that every change must have a cause that produced it is therefore a first principle, an axiom that humans hold, not as a result of any kind of reasoning or of repeated observations, but as a result of their natural constitution.

Finally, even if Reid is correct to think of the principle of causality as a first principle, why should we think that every change or any beginning of existence must have an efficient cause? One of Reid’s strategies is to show that we hold this belief in cases of human actions. Those actions which are truly an agent’s actions, Reid argues, are actions that are up to the agent, that are within the agent’s control, and that are really produced by the agent. The cause of a human action is an agent with the power to act, with the will to act, with understanding to direct his power, and who exerts his power (sections 1.b and 4.a).

Another strategy to which Reid appeals, in order to show that when we think of causes of some change we have in mind efficient causes, is to focus on what causes the beginning of existence. One natural belief, according to Reid, is the belief that nothing can come into existence without an efficient cause (EAP, 202). The first cause, that which brought things into existence, is commonly believed to be a cause which had power to bring about existence and understanding to bring about the order of existence. Order and purposiveness (teleology) are effects of something with power and intelligence. Existence and order do not just pop into being, according to Reid. We all believe, in practice if not in words, that existence has a cause. And since existence is ordered and purposeful, we think of the cause as an efficient cause (a substance endowed with active power, will, and understanding). Reid then seems to think that we all believe that the first coming into existence is caused by an efficient cause, and this would explain why we believe that all change is caused, perhaps in the beginning or by intermediate actions, by an efficient cause. Hence, we hold the belief that we, as human beings, are sometimes efficient causes, and we hold the belief that existence itself, and the course of nature, require an efficient cause.

ii. Physical Causes

Although the proper sense of the term ‘cause’ is efficiency or productive causality, Reid recognizes that there is also a lax or popular use of the term. We often use the term ‘cause’ to describe events in nature that are constantly conjoined with other events, which we call ‘effects.’ Causes in this sense—in the Humean sense—are not agents but events constantly conjoined with others according to the laws of nature. Only efficient causes are causes per se, for Reid, but events that ‘cause’ others according to laws of nature may be called ‘causes’ or ‘necessary causes’ as long as we bear in mind that this is an improper use of the term.

Reid notices, however, that human beings were originally prone to think of inanimate beings as agents. In agreement with Hume, Reid writes that human beings tended to think that a soul or agent is the cause of any motion that is not otherwise accounted for. Anticipating contemporary cognitive science, Reid recognizes that humans have what is today called a ‘hyperactive agency detection device.’ We are prone to attribute powers to beings, qualities and relations that are inanimate and passive (Of Power, 6 and EAP, 207). Then, by reflection and education, human beings can come to observe that objects in nature are merely passive, and that they have no productive force or power. As philosophy advances, Reid writes, we find that objects which appeared to be intelligent and active are dead, inactive, passive, and moved necessarily (EAP, 207).

Nonetheless, many human beings still tend to think of laws of nature as powers or as causes in the true sense of having power to initiate change (EAP, 211). They observe constant conjunction, events constantly following others in similar circumstances, such as heat and ice melting. However, they do not perceive the real connection between these events. “Antecedent to experience,” Reid writes, “we should see no ground to think that heat will turn ice into water any more than that it will turn water into ice” (Of Power, 7). And yet the tendency is to attribute to laws of nature behind these events the same productive powers we attribute to agents. But, Reid argues, Newton himself was correct to write that a little reflection will clearly show that laws of nature cannot produce any phenomenon “unless there be some agent that puts the law in execution” (Of Power, 7). It is as absurd to attribute agency to laws of nature as it is to attribute agency to beings who lack will and understanding. However, one may continue to use the word ‘cause’ to refer to events constantly preceding others according to the laws of nature, or to refer to the laws themselves, but one must recognize, Reid insists, that such a use is a popular and improper use of the term.

We do not observe active power in inanimate beings, and we do not perceive such a productive power in the laws of nature. Still, human beings all believe that the beginning of existence, order and purpose require an efficient cause (see 4.b.i). Our causal beliefs, Reid argues, are teleological—we assume there is purpose and activity behind events in nature. Our context, education, bad science, or certain kinds of religion or philosophy might then lead us to give up such teleological beliefs. But, for Reid, careful attention to those beliefs that are presupposed behind our arguments and practices will reveal that human beings all have the (legitimate) tendency to think of nature as purposeful.

For Reid, laws of nature express purpose because they are expressions of divine character traits. Every agent who forms resolutions and fixed purposes will tend to act in a regular manner (section 2.b). Since natural events do not have any productive capacity, since the laws of nature are not productive powers, since all change and beginning of existence requires an efficient cause, and since human agents are not these efficient causes, there must be another agent behind the existence and motion of nature: God. Events in nature follow regular courses of action, which, like all regular courses of action, express the character traits or fixed resolutions of an agent: the Author of nature. We do not know exactly how this efficient cause acts in nature, Reid points out (EAP, 210). He might have acted once, or might act in all changes, or act through intermediary agents. But “it is sufficient for us to know, that, whatever the agents may be, whatever the manner of their operation, or the extent of their power, they depend upon the first cause, and are under his control; and this indeed is all that we know; beyond this we are left in the darkness” (EAP, 30). In conclusion, therefore, efficient causes are agents, beings with active power, will and understanding. Events in nature and inanimate objects are not agents, and hence are called ‘causes’ in a popular sense only, but they require an efficient cause, an agent who acts orderly and purposefully.

iii. Principles of Action and Causation

In opposition to his ‘necessitarian’ opponents, Reid argues that the rational principles of action that motivate human agents in their choices and actions are not causes (in any sense of the term). The opponents Reid confronts directly are David Hume and Joseph Priestley, but he also has in mind philosophers such as Locke, Collins, Leibniz, and Spinoza. All of these philosophers held, according to Reid, that the will is not free, but that human choices and decisions are determined, necessarily, by the strongest, or, under some views, the best, motive.

In Reid’s account, the two great categories of motives or principles of action are the animal and the rational ones. Reid points out that animal motives may function as physical causes do, but only in cases where the agent has no power. In cases of torture, or of madness, human beings may lose all power of self-government and of action. In such instances we do not hold them responsible for their actions (although we might hold them responsible for previous actions which were in their power) because we recognize that the animal motive (pain, fear, etc.) was irresistible. For this reason Reid concludes that when a person discloses an important secret under the agony of the rack, for example, we pity him more than we blame him (EAP, 57-58). If the person, through strength of mind, succeeds in resisting the passion (the fear of pain, for instance), we impute the action to the person. If the person fails after trying, we impute the action more to the passion than to the person, and we blame the person in proportion to his or her capacity to resist the passion (EAP, 59). The extent to which we have power to act is inversely proportional to the influence of animal motives. Animal motives, therefore, may act as physical causes, but never as efficient causes, since events or mental states are not beings with power and understanding.

On the other hand, rational principles of action—which are valuable ends that agents recognize they may instantiate by their actions—never act as physical causes. It is impossible for such motives to function as physical causes, Reid argues. These motives are non-existing ends that the agent considers, and since they are not existing states of affairs there is no way they could function as physical causes (EAP, 214; more on this in Yaffe 2004). They could not function as efficient causes either, since efficient causes are substances with power and understanding. Hence (rational) motives are neither causes nor agents, “they suppose an efficient cause, and can do nothing without it” (EAP, 214). Instead of thinking of rational motives as causes, Reid writes, we should rather think of them as advice or exhortation, which leave agents perfectly free to determine their choices and actions.

Defenders of necessity (Collins, Hume, Kames, Hartley, Priestley, Leibniz) have argued, Reid writes, that the strongest of contrary motives must prevail. And if there is but one motive, they hold, then this motive determines the agent. There is a necessary relation, according to these philosophers, between a person’s motives and his actions. Motives, according to them, function as necessary causes. The strongest or best motive is the motive that causally determines the action.

In response to such views, Reid examines what test these philosophers use to determine the strength of a motive. Reid asks us to consider how we judge which of two motives is the strongest. One way to establish which motive is the strongest is to consider which motive prevails. The strongest motive, according to such a position, is simply the motive that prevails (that wins). Or, one could claim that “by the strength of motive is not meant its prevalence, but the cause of its prevalence…” (EAP, 217). But, Reid points out, this answer simply asserts that the strongest motive is the strongest motive. To prevail or win is identical to being the strongest (Ibid.). This solution, therefore, fails to offer a test or way of determining which motive is the strongest. Moreover, it assumes what must be proved: that the strongest motive, or the cause of the strongest motive, is the necessary cause of action. These accounts all beg the question and assume, Reid argues, that “motives are the causes, and the sole causes of actions” (Ibid.). Reid points out that the test of the strength of motives we use to determine whether motives are necessitating causes of action must not assume that motives function as necessitating (physical) causes.

In reaction to his opponents, Reid writes that the strength of animal motives is determined differently from the strength of rational motives. The strongest animal motive, Reid observes, is the one that is most difficult to resist. The strength of animal motives is determined by the effort required to resist them. In a limited number of situations they are irresistible (in cases of torture, for instance), but in the majority of situations agents are able to resist strong animal motives, Reid notices. Brutes cannot resist the strongest animal motive because they are not capable of forming the notion of ‘ought’ and ‘ought not’ (EAP, 219). Agents, on the other hand, sometimes act according to rational motives, because they are able to recognize that an action is worth performing even if it is contrary to a strong animal motive. They can consider whether the action will be either prudentially or morally valuable, and hence act according to rational motives. The strongest rational motive, Reid observes, is the one that offers the best reason to act. It is not the one which is hardest to resist, but “that which it is most our duty and our real happiness to follow” (EAP, 219). The best reason, furthermore, cannot be a necessitating cause of action. Indeed, ends such as moral value or overall well-being are non-existing states of affairs, and they are not substances. Hence they can be neither necessary (physical) causes nor efficient causes.

Finally, Reid points out that it is true that we often reason from men’s actions to their motives. But we do not have enough evidence to conclude, inductively, that human beings are always determined by motives (EAP, 220). Indeed, we do not have empirical data to justify such an inference. The relation between a strong motive and action is a probable one, but there are too many instances in which humans resist strong motives, or act according to weak motives, or act in light of future values, or act against what they consider to be best. The relation is at best probable, and far from certain, Reid concludes. Moreover, in many unimportant cases it is possible for agents to act without any motive at all (EAP, 215; see also Rowe 1991, 171-175).

c. The Problem of Infinite Regress

Several well-known objections could be offered to theories of liberty such as Reid’s. One kind of objection is that theories of action which hold that choices and actions are not causally determined by motives, or that hold that agents have power over their will (their volitions), lead to problems of infinite regress. Some argue that accounts of liberty such as Reid’s presuppose a regress of acts of will. Others argue that they lead to a regress of motives or reasons for action. And some point out that they lead to a regress of exertions of power. All such kinds of infinite regresses are impossible, the objections go, and hence the doctrine of liberty which Reid defends must therefore be false.

Reid offers a response to the first kind of regress. He notices that an objection first advanced by Thomas Hobbes is that the claim that we have power over our will (to will this way or that) amounts to saying that we may will it, if we will. Hobbes therefore argues, Reid notes, that in order to act, the will must be determined by a prior will, which must in turn be determined by a prior will, and so on ad infinitum (EAP, 199). Hobbes, Reid writes, therefore holds that liberty consists only in the power to act as we will, and does not extend to the will itself.

In response to Hobbes, Reid points out that his account does not imply an infinite regress of acts of will. The determinations of the will are indeed effects, which must have a cause, Reid writes, but the cause of the determination (the choice or intention) is not some previous choice or intention. If the will is free, the agent is the cause of the agent’s determinations, he writes, and the action is thus imputable to the agent. If another agent is the cause (either immediately or by the interposition of other events), then the determination is imputable to that other agent. There is no regress of acts of will, according to Reid, since the cause is the agent and not another act of will (EAP, 201). Moreover, as explained in section 1.b.ii, Reid argues that power over an end action requires having power over the intermediary actions leading up to the end. Volitions (choices or willings) are internal actions leading up to the end, and hence power over the end implies having power over the volitions (Ibid.).

Another possible regress implied by Reid’s account of liberty (which is often called today a ‘libertarian’ account of freedom, to be distinguished from the political theory) is a regress of reasons or explanations. Under an account like Reid’s, the agent is influenced by different motives, desires and reasons, and the agent then decides to act in line with one motive rather than another. How does the agent make this choice? Leibniz, for instance, writes that if the best motive does not causally necessitate the choice, then the agent must have a further reason or motive to choose one motive rather than another. Leibniz imagines that the agent must listen to different voices, different motives, and that in order to choose between these voices or considerations, the agent must listen to yet another, higher-order voice; and choosing this higher-order motive requires considering still higher-order motives. The agent, that is, needs a higher-order motive in order to choose between lower-order motives. But then, Leibniz argues, the agent would need a reason to choose the higher-order motive, and so on ad infinitum (see Leibniz 1985, and Rowe 1991b, 181). Since agents cannot go through an infinite number of reasons, and in order to avoid the regress, a more appealing account, according to Reid’s opponents, is one that holds that agents are determined, necessarily, by the strongest or the best motive.

According to Reid, however, the process of weighing motives and of deliberation does not lead to an infinite regress of motives. Rational motives may be higher-order motives about which animal motive should be acted upon or resisted, or about the overall prudential or moral value of the courses of action to which various motives lead (EAP, 57). But at the moment of decision, deliberation has stopped. Deliberation ends with a final judgment about what one ought to do, and with an intention either to act accordingly, or to refrain from acting according to what is best. “The natural consequence of deliberation on any part of our conduct,” Reid writes, “is a determination how we shall act; and if it is not brought to this issue it is lost labour” (EAP, 64). Reid would therefore contend that Leibniz is wrong to think that, at the moment of decision, the libertarian must revert to an infinite series of reasons. The agent knows which motive he should follow. Deliberation ends with a consideration of which course of action is most valuable. And whether the agent follows the one that is best or not will not depend on yet another reason or motive. Indeed, what other reason could he offer? Evaluations of what one ought to do, about prudence or moral duty, are evaluations about ends, according to Reid. And an end is something that requires no further justification or reason. Therefore, once a person recognizes the end, and which course of action will instantiate the end, no further reason needs to be offered. Whether or not the agent in fact acts according to such an end is a question of weakness of will, or of lack of self-government; it is not an issue of providing yet further motives for choosing between higher-order motives (Kroeker 2007).

In the late 20th and early 21st centuries, several Reid scholars have argued that yet another kind of regress looms for Reid: an infinite regress of exertions of active power. Reid writes that “in order to the production of any effect, there must be in the cause, not only power, but the exertion of power; for power that is not exerted produces no effect” (EAP, 203). The regress arises when we consider the nature of the exertion of active power. Is the exertion an event? If so, then it must have a cause, since Reid writes that every event must have an efficient cause. Efficient causes, for Reid, are agents, and when an agent acts freely, that agent is the cause of his or her own action. But if exertions are effects, does this mean that the agent causes the exertion of power? If so, then the agent would have to exert his active power in order to exert his active power. But if this prior exertion is also an event caused by the agent, then it, too, requires a prior exertion of active power, and so on.

Reid does not explicitly address the problem of regress of exertions of active power, and scholars disagree about which response he would adopt. Since this problem is well known, it is worth mentioning some suggested responses. One possible answer is simply to accept the regress, and to defend its coherence (Chisholm 1979). Another possible strategy is to argue that some acts of will are acts of mind that are uncaused. William Rowe, for instance, argues that the exertion of active power is an event but it is logically or conceptually impossible for such an event to be the effect of the exertion of active power of any agent (Rowe 1991b). For Rowe exertions are acts by which agents agent-cause their choices and actions. But agents do not cause the exertion itself, Rowe argues.

Another Reidian response to the problem of the regress of exertions is that there can be events that an agent causes without bringing about any other event as a means of producing them. Paul Hoffman, for instance, suggests that exertions of active power are events, but events of a kind that do not require a previous exertion of active power (a solution first considered, but not ultimately defended, by Rowe (1987)). On Hoffman’s view, exertions of active power are events that are exempt from being caused by an exertion of active power; such events “do not require a prior activating of a power” (Hoffman 2006, 445).

A further suggested solution is to reject the view that exertions of active power are events. Timothy O’Connor, for instance, argues that exertions are better understood as relations that hold between agents and their effects. Agents bring about their volitions, but not by some further action. They bring about their effects directly. And this irreducible relation is the exertion of active power. Exertions of active power are relations between causes (agents) and effects (the agents’ volitions and actions), and as relations they are not caused, and hence are not events (O’Connor 1994, 621).

A solution to the regress of exertions problem, Gideon Yaffe alternatively suggests, may be found if we consider what is involved in trying to act. For Yaffe, when an agent tries to do something and succeeds, there is just one action. If someone tries to commit a murder and succeeds in committing the murder, for example, we do not think the person performed two different things: trying and killing. The only action performed here is killing. The trying is not a separate action, but is somehow part of the action of killing. Hence, Yaffe writes that “if the trying is anywhere in cases of success it is ‘in’ the successful action” (Yaffe 2004, 157). Now, Yaffe continues, if exerting power is just like trying, then in cases of success the exertion is in the action. In cases of failure, on the other hand, the only thing the agent does is try. Trying, in cases of failure, is itself an action or event; we might say that in such cases the agent exerts his power to try. Moreover, since the trying itself succeeds, there is no further action of trying to try which would require a further exertion of active power. Hence, when we understand the exertion of power as trying (regardless of whether the trying succeeds and the agent acts, or the trying fails and the agent only tries to act), it is false to say that such exertions lead to a regress of exertions. Yaffe concludes, therefore, that since the agent-action relation in cases of success is direct and non-reducible, Reid’s account is best characterized as an agent-causal view of action.

Finally, James Van Cleve suggests, in response to the infinite regress of exertions objection, that in causing an action, the agent causes his own causing of the action. An agent’s causing of some action has a reflexive component: in causing an action, the agent thereby causes the causing of the action. And we need not worry about the further complexity that, in causing the causing of the action, the agent also causes the causing of the causing of the action, because this further complexity “is matched by no corresponding complexity in the fact described” (Van Cleve, 2015: 432). In causing an action, Van Cleve argues, the agent causes his own causing of the action; what causes the causing is not some further causing but the agent himself.

d. First Argument for Moral Liberty

In the fourth essay of the Essays on the Active Powers, Reid focuses on three arguments for the conclusion that adult human beings have active power over their will. According to Reid’s account, moral liberty is the freedom an agent has to choose and to act (see sections 1.b and 4.a). What Reid calls the ‘first argument’ for moral liberty is based on the fact that the belief that we are free and the belief that every event must have a cause which had power to produce it are natural beliefs. For Reid some beliefs are natural beliefs, and natural beliefs are legitimate. They are legitimate in that it is reasonable to hold these beliefs unless proof is brought against them. Reid will show that there is, in fact, no good proof brought against such beliefs, and hence that it is legitimate to hold that humans possess moral liberty.

Natural beliefs are held by all adult human beings who are not under the influence of madness, of drugs, or of circumstances which might diminish their rational and active capacities. They are generated by our innate natural faculties, and, for Reid, we have no good reason to distrust our natural faculties. They all stand on the same footing, and hence we have no reason to trust one faculty and not another, Reid argues. Moreover, he shows that we cannot distrust all of them since we would be using our faculties to offer reasons for distrusting all of them—a position that is self-defeating and untenable (EAP, 229).

Another characteristic of natural beliefs is that they appear very early in the mind of humans even though they are the result neither of induction nor of deduction. Children, for instance, believe very early that they have a degree of active power over their actions, and that every event must have an efficient cause. Human beings do not first form these beliefs by observing external objects. In nature we only perceive events constantly conjoined with others, Reid points out, and not the connection between them (EAP, 229). The notion of power (and of efficient causation), therefore, cannot be formed by observation of external objects (see section 1.a on our ideas of active power, and 4.b.i for more on natural principles). Moreover, as adults we are not directly conscious of our own power (only of the exertions of our power), yet we are convinced that we do have power when we choose and act. For Reid we only will to perform actions which we believe to be in our power. Hence, in exerting our power (in acting), we also believe that the action is in our power, and that we have power over our willings and external actions.

Furthermore, Reid continues, the conviction that we have power over our choices and actions is implied in all our deliberations. When we deliberate about which action we should perform, according to Reid, we are convinced that the action we are considering performing is in our power. And once we finish deliberating, our resolutions and purposes imply a conviction that we have power to execute what we have resolved to do (EAP, 230). Reid adds that our acts of promising also imply the belief that we have the power to carry out what we have promised, and the human activity of blaming also presupposes a belief that the person acted wrongly. Blame, Reid writes, presupposes a wrongful use of power, as it is absurd to blame a person for yielding to necessity. Finally, the belief that humans often do have power over their will is assumed in the regulations of tribunals.

Some of these examples, however, show only that humans believe that they have power over their actions; they do not by themselves imply the conclusion Reid is seeking: that we have power over our volitions. Reid’s view, in response, is that unless reflection, good philosophy, good science or compelling evidence can be given to distrust the propositions that sane adult human beings believe naturally, we are justified in holding them. Natural beliefs are legitimate; they express what is the case. To accept a proposition as a natural principle (as a first principle), it is not enough simply to assert that we all believe it. Reid argues, rather, that there are “ways of reasoning about them, by which those that are just and solid may be confirmed, and those that are false may be detected” (EIP 463). One must show, for instance, that the belief is acquired without the need of inductive or deductive reasoning, that it is common to the learned and the unlearned, that giving up the conviction would lead to absurdities, that it enjoys the same kind and level of evidence as other similar beliefs which we admit as first principles, and that it appears early in the mind of children (EIP, VI.4). Reid, in his first argument for moral liberty, contends that we have power over our will because this is a proposition that is believed naturally, and because it meets the requirements for being a first principle. The burden of proof, therefore, falls on the side of those who defend the doctrine of necessity (EAP, 235). This epistemological point, coupled with Reid’s belief that there are no convincing arguments to show that the belief must be rejected, leads Reid to conclude that we have moral liberty.

One might object that many people profess that they do not, in fact, believe that they have power over their will. Reid, in response, argues that what people profess or state is not always in line with what they actually believe. To know what people believe, Reid points out, it is not always best to ask them; it is better to look at their actions and practices. Hence a man might claim that he is not afraid of the dark, and yet refuse to sleep alone or to turn the light off. His actions betray his fear, Reid points out (EAP, 232). The same holds for our belief that we have power over our will and actions. Everyday practices are evidence that we believe we have power not only over our actions but also over our will.

e. Second Argument for Moral Liberty

Reid also argues for moral liberty from the fact that humans are moral and accountable beings. For Reid there can be no “moral obligation and accountableness, praise and blame, merit and demerit, justice and injustice, reward and punishment, wisdom and folly, virtue and vice” if human beings do not have active power (EAP, 240). In civil government, in religion, and in all moral discourse, the obligation that human beings have to do what is right is taken as given. Humans, it is universally recognized, are moral and accountable beings.

In order to be a moral being, Reid argues, a being must know the law and must have the power to obey it. Brute animals, Reid observes, are not moral and accountable beings, because they do not have the capacity to understand the moral law. Human beings, however, if they are not insane or incapacitated, are, according to Reid, capable of recognizing a moral law (what they ought to do). Human beings also have the power to do what they are responsible for (EAP, 237). Reid points out that it is self-evident to all human beings that one cannot “be under an obligation to do what it is impossible for him to do” (Ibid.). A person may be responsible for a previous action, like cutting off his finger, but he cannot then be held responsible for not being able to use his fingers for a particular action that requires dexterity. The fact that humans are held responsible for some of their actions, therefore, implies that they are capable of knowing the moral law, and that they can have the power to act accordingly.

The doctrine of necessity, Reid argues, is not consistent with the moral accountableness of human agents. According to the doctrine of necessity, Reid writes, a man is not free to will or to choose. But if this is the case, then it is impossible for a person to will otherwise than how he in fact wills, and hence impossible for a person who is determined to will to refrain from willing. According to Reid, the doctrine of necessity therefore implies that persons should not be held responsible, because it is absurd to hold someone responsible for what he could not refrain from choosing. But we do continue to hold humans accountable for their choices and actions. Again, therefore, the burden of proof falls on those who hold that our everyday practices and beliefs are wrong, according to Reid.

f. Third Argument for Moral Liberty

In his third argument for moral liberty, Reid argues that human beings have power over their actions and their volitions because they are capable of conceiving plans and of carrying them out, and this capacity implies that they possess moral liberty. A regular plan of conduct, Reid contends, cannot be contrived without understanding, and cannot be carried into execution without power (EAP, 240).

Reid assumes, from the start, that some persons lay down a plan of conduct, and resolve to pursue that plan. Reid also assumes we will all agree that some have in fact pursued such plans or ends by adopting and performing the proper means.

For Reid, these assumptions imply that the person who carries out the plan has wisdom and understanding. But, he argues, they also demonstrate that the person has a degree of power over his voluntary determinations. For Reid, “understanding without power may project, but it can execute nothing” (EAP, 240). A regular plan of action cannot be contrived without understanding, and it cannot be executed without power.

The question now is: where do this understanding and power reside? In other words, who is the true cause of the plan? If the wisdom and understanding are in fact in the person who carried out the plan, Reid writes, then we have the very same evidence that the power which executed it is also in that person.

For Reid’s opponent, the true cause of the plan lies in the motives of the agent, and not in the agent who exerts his active power. But, Reid points out, motives do not have understanding and power; these qualities are characteristics of agents, and only agents have understanding and power over their wills and actions. So, is there an agent other than the person who carried out the plan who in fact arranged the motives (and was the true cause of the plan)? If the cause is an agent other than the person who carried out the plan, Reid writes, then we cannot say the person had a hand in the execution of the plan; it is not even his plan, and, further, we have no evidence that the person is a thinking being. After all, many animals carry out plans such as building beehives or spinning webs, but we do not think of these plans as the effects of intelligence in the animal. The plans here are effects of another agent, of another efficient cause. In the case of human beings, however, the plans they carry out really are their plans, and we think of humans as exerting the intelligence required to lay out and pursue their plans of action. Hence, Reid concludes, if the intelligence is really in the person who carried out the plan and whose plan it is, then the power must be in that person as well.

What evidence do we have that our fellow human beings think and reason? For Reid, the evidence that persons who carry out intelligent plans have wisdom and understanding is provided by the persons’ speech and actions. The behavior of human beings, their language, and their bodily movements and expressions are all signs of intelligence (for more on natural signs of mental states, see EAP, 116 and 141). The understanding therefore truly lies in the person who carried out the plan. However, to carry out a plan, one must have power to execute it. Hence, “if the actions and speeches of other men give us sufficient evidence that they are reasonable beings, they give us the same evidence, and the same degree of evidence, that they are free agents” (EAP, 242). Pursuing plans, step by step, therefore, is a sign of both wisdom and active power.

In conclusion, Reid recognizes that although many questions remain to be answered, one thing is clear: in the everyday activities and concerns of the present life of human beings, no person is very much affected by the doctrine of necessity (EAP, 269). Even though some vehemently defend necessity and others zealously guard liberty, we see no great difference between them in their everyday conduct. The fatalist, despite his claims, “deliberates, resolves, and plights his faith. He lays down a plan of conduct, and prosecutes it with vigour and industry…” (EAP, 269). He exhorts, commands, blames and praises. In all cases “he sees that it would be absurd not to act and to judge as those ought to do who believe themselves and other men to be free agents” (Ibid.). In everyday life, Reid concludes, all men act in a way that is consistent with his theory of liberty, and inconsistent with the doctrine of necessity. All humans believe, in the everyday, that they possess active power over some of their actions. And what all humans believe is a sign of what is, according to Reid, unless one can offer good arguments to show that the belief is false. In the absence of such arguments, Reid thinks that the most reasonable position is to claim that humans have moral liberty. And what is most important, according to Reid, is for us “to manage these powers, by proposing to ourselves the best ends, planning the most proper system of conduct that is in our power, and executing it with industry and zeal. This is true wisdom; this is the very intention of our being” (EAP, 5). Powers to act, to direct our actions, to plan, to decide, are all useful powers for Reid, and using them well, for good ends, is the noblest human mission.

5. References and Further Reading

a. Primary Sources

  • Collins, Anthony (1976) ‘A Philosophical Inquiry Concerning Human Liberty’ in Determinism and Free Will, ed. J. O’Higgins SJ. The Hague: Martinus Nijhoff.
  • Hobbes, Thomas (1648) ‘Of Liberty and Necessity’ in The English Works of Thomas Hobbes, iv, accessed November 24, 2015, https://archive.org/details/englishworkstho28hobbgoog.
  • Hume, David (1998) An Enquiry Concerning the Principles of Morals (EPM), Tom L. Beauchamp (ed.). Oxford: Oxford University Press.
  • Hume, David (2007a) A Treatise of Human Nature: A Critical Edition (T), David Fate Norton and Mary J. Norton (eds.). Oxford: Clarendon Press.
  • Leibniz, Gottfried (1985) Theodicy. LaSalle, Il: Open Court.
  • Locke, John (1979) An Essay Concerning Human Understanding. Oxford: Clarendon Press.
  • Priestley, Joseph (1782) Doctrine of Philosophical Necessity Illustrated; being an Appendix to the Disquisitions relating to Matter and Spirit. London.
  • Reid, Thomas (1997) An Inquiry into the Human Mind on the Principles of Common Sense (I), Derek R. Brookes (ed.). Edinburgh, UK: Edinburgh University Press. (Original work published in 1764.)
  • Reid, Thomas (2001) ‘Of Power’, The Philosophical Quarterly 51.202: 3-12 (Original work unpublished manuscript, 1772).
  • Reid, Thomas (2002) Essays on the Intellectual Powers of Man: A Critical Edition (EIP), Derek R. Brookes (ed.). Edinburgh, UK: Edinburgh University Press. (Original work published in 1785.)
  • Reid, Thomas (2010) Essays on the Active Powers of Man: A Critical Edition (EAP), Knud Haakonssen and James A. Harris (eds.). Edinburgh, UK: Edinburgh University Press. (Original work published in 1788.)
  • Spinoza, Baruch (1985) Ethics, in The Collected Works of Spinoza, vol. 1, transl. E. Curley. Princeton: Princeton University Press.

b. Secondary Sources

  • Alvarez, Maria (2010) ‘Thomas Reid’, in A companion to the Philosophy of Action, Timothy O’Connor & Constantine Sandis (eds.). UK: Wiley-Blackwell: 505-512.
    • Argues that Reid’s endorsement of the thesis that we cause changes by causing volitions gives rise to serious problems.
  • Chisholm, Roderick M. (1979) ‘Objects and Persons: Revisions and Replies’, in Essays on the Philosophy of Roderick M. Chisholm, Ernest Sosa (ed.). Amsterdam, Rodopi: 317-388.
    • Recognizes that his account implies an infinite regress but denies that the regress is vicious.
  • Hatcher, Michael (2013) ‘Reid’s Third Argument for Moral Liberty’, British Journal for the History of Philosophy, 21.4: 688-710.
    • Argues that the argument from design is one of the premises in Reid’s third argument for moral liberty.
  • Harris, James (2003) ‘On Reid’s “Inconsistency Triad”: A Reply to McDermid’, British Journal of the History of Philosophy 11.1: 121-127.
    • Defends Reid against the charge that Reid’s attempt to offer arguments for moral liberty is inconsistent with his view that the proposition that we have power over our wills is a first principle, and hence cannot be proved.
  • Hoffman, Paul (2006) ‘Thomas Reid’s Notion of Exertion’, Journal of the History of Philosophy 44.3: 431-447.
    • Offers a Reidian answer to the infinite regress of exertions problem by showing that exertions of power are events which are exempt from being caused by prior exertions.
  • Jaffro, Laurent (2011) ‘Reid on powers of the mind and the person behind the curtain’, Canadian Journal of Philosophy, 41.sup1: 197-213.
    • Examines Reid’s claim that there is no inert activity; the operations of the understanding involve some degree of activity.
  • Kroeker, Esther (2007) ‘Explaining our Choices: Reid on Motives, Character and Effort’, The Journal of Scottish Philosophy 5.2: 187-212.
    • Offers a Reidian answer to the objection that Reid’s account of moral liberty leads to an infinite regress of reasons and motives, and examines Reid’s criticism of the Principle of Sufficient Reason.
  • Kroeker, Esther (2011) ‘Reid’s Moral Psychology: animal motives as guides to virtue’, Canadian Journal of Philosophy, 41.sup1: 122-141.
    • Examines Reid’s understanding of the animal motives and argues that uncorrupted animal motives are, for Reid, guides to virtue.
  • Lindsay, Chris (2005) ‘Reid on Scepticism about Agency and the Self’, Journal of Scottish Philosophy 3.1: 19-33.
    • Defends Reid’s possible response to Alvarez’s objection that Reid’s theory gives rise to a skeptical worry concerning awareness of one’s actions, and suggests there are still tensions originating from Reid’s dualist metaphysics.
  • Madden, Edward (1983) ‘Commonsense and Agency Theory’, Review of Metaphysics 36.2: 319-341.
    • Presents an overview of Reid’s views on agency and causality, and discusses problems related to Reid’s claim that reasons for action are not causes.
  • McDermid, Douglas (1999) ‘Thomas Reid on Moral Liberty and Common Sense’, British Journal for the History of Philosophy 7.2: 275-303.
    • Presents the essential features of Reid’s theory of liberty, shows how his arguments depend on common sense epistemology, and discusses limitations of Reid’s account.
  • McDermid, Douglas (2010) ‘A Second Look at Reid’s First Argument for Moral Liberty’, in Reid on Ethics, Sabine Roeser (ed.). Great Britain: Palgrave Macmillan: 143-163.
    • Discusses the Reidian claim that the principle that we have moral liberty (that we have some degree of power over our actions and determinations of our will) is a principle of common sense.
  • O’Connor, Timothy (1994) ‘Thomas Reid on Free Agency’, Journal of the History of Philosophy 32.4: 605-622.
    • Argues that the Reidian response to the problem of infinite regress of exertions of power is that it is best to understand exertions as relations that hold between agents and their actions, and that such relations, as opposed to events, are uncaused.
  • Rowe, William (1987) ‘Two Concepts of Freedom’, Proceedings and Addresses of the American Philosophical Association 61: 43-64.
    • Presents a Reidian response to the objection that Reid’s moral theory leads to an infinite regress of exertions of power by arguing that exertions are the kind of events that do not require a previous exertion of power.
  • Rowe, William (1991a) ‘Responsibility, Agent-Causation, and Freedom’, Ethics 101: 237-257.
    • Argues that agent-causation, for Reid, is the concept that plays a central role in the logical connection between moral responsibility and freedom.
  • Rowe, William (1991b) Thomas Reid on Freedom and Morality. Ithaca: Cornell University Press.
    • Presents and evaluates Reid’s theory of freedom.
  • Stalley, R.F. (1998) ‘Hume and Reid on the Nature of Action’, Reid Studies, 12: 33-48.
    • Discusses Reid’s response and criticism of Hume’s claim that we have no clear idea of power or cause.
  • Stecker, Robert (1992) ‘Thomas Reid’s Philosophy of Action’, Philosophical Studies 66: 197-208.
    • Argues that Reid’s theory of causality leads to a regress problem, and that even if it does not, it remains uninformative and mysterious.
  • Van Cleve, James (2015) Problems from Reid. Oxford: Oxford University Press.
    • Examines Reid’s arguments related to a wide range of topics such as perception, sensation, skepticism, conception, epistemology and first principles, as well as action theory.
  • Van Woudenberg, René (2010) ‘Thomas Reid on Determinism,’ in Reid on Ethics, Sabine Roeser (ed.). Great Britain: Palgrave Macmillan: 123-142.
    • Argues that Reid understood ‘determinism’ differently from how it is currently understood, and discusses Reid’s criticism of determinism as he understood it.
  • Weinstock, Jerome (1975) ‘Reid’s Definition of Freedom’, Journal of the History of Philosophy 13.3: 335-345.
    • Examines Reid’s attempt to include power over one’s will in the definition of freedom, and argues that power over the will has to do with possession of reason and a faculty of judgment that enables us to govern ourselves.
  • Yaffe, Gideon (2007), ‘Promises, Social Acts, and Reid’s First Argument for Moral Liberty.’ Journal of the History of Philosophy 45. 2: 267-289.
    • Presents Reid’s philosophical views about the act of promising, and shows that, for Reid, it is because promising is a social act that it presupposes active power.
  • Yaffe, Gideon (2004), Manifest Activity: Thomas Reid’s Theory of Action. Oxford: Oxford University Press.
    • Offers a critical examination and reconstruction of Reid’s arguments in favor of his doctrine of moral liberty.

 

Author Information

Esther Engels Kroeker
Email: Esther.kroeker@uantwerpen.be
University of Antwerp
Belgium

Avicenna (Ibn Sina): Logic        

Avicenna (Ibn Sina) (c. 980-1037 C.E., or 375-428 of Hegira) is one of the most important philosophers and logicians in the Arabic world. His logical works are presented in several treatises. Some of them are commentaries on Aristotle’s Organon, and are presented in al-Shifa al-Mantiq, the logical part of the Encyclopedic Book al-Shifa (The Cure), which also contains a treatise on Metaphysics (al-Ilahiyat) and a treatise on Natural Science (al-tabi‘iyyat); others are edited on their own, such as the books entitled al-Isharat wa-t-tanbihat, al-Najat, and Mantiq al-mashriqiyyin. Among his writings, there is a book written in Persian called Danishnama-yi ʻAla᾿i (The Book of Knowledge for ʻAla al-Dawla).

As a logician, he was mainly influenced by Aristotle’s commentators, such as Alexander of Aphrodisias, and by al-Farabi in the Arabic world, and he himself had many followers, including al-Ghazali, Nasir-eddine al-Tusi, Afdal al-Din al-Khunaji and Fakhr al-Din al-Razi. He added new concepts and distinctions that are not found in the writings of the ancient authors, or in those of the earlier logicians of the Arabic tradition. He improved the Aristotelian categorical and modal syllogistics, and constructed a whole system of hypothetical logic, different from the Stoic system and far more developed than al-Farabi’s reflections on the same topic. His modal syllogistic in particular is very different from the Aristotelian one, for the conversions involving the modal propositions are different. His hypothetical logic involves conditional as well as disjunctive propositions, and establishes connections both between these two kinds of propositions and between them and the predicative ones.

In his analysis of the syllogisms, Avicenna introduces a new distinction between the iqtirani (translated as “conjunctive”) syllogisms and the istithna᾿i (usually translated as “exceptive”) ones [that is, the usual Stoic kind of syllogisms], the former being a large class including the categorical syllogisms together with one kind of the hypothetical syllogisms. In the former, the conclusion does not occur in the premises; it conveys new knowledge deduced from the two premises, while in the latter, which uses the istithna, that is, detachment, the premises explicitly include either the conclusion or its contradictory.
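
To make the contrast concrete, here is a minimal schematic illustration (using modern placeholder letters rather than Avicenna’s own examples or terminology):

Conjunctive (iqtirani): Every B is A; every C is B; therefore, every C is A.
Exceptive (istithna᾿i): If P then Q; but P; therefore, Q.

In the first pattern, the conclusion “every C is A” occurs in neither premise; in the second, the conclusion Q is explicitly contained in the conditional premise, while the corresponding tollens form (“If P then Q; but not Q; therefore, not P”) contains the contradictory of its conclusion.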

His analysis of the absolute (or non-modal) propositions is new and original: he introduces temporal considerations into this type of proposition and reworks the oppositional relations between these propositions by distinguishing perpetual propositions, general absolute propositions and special absolute propositions. This analysis is influenced by semantic and linguistic considerations.

Table of Contents

  1. Life and Works
  2. The Definition of Logic
  3. The Different Kinds of Propositions and their Relations
  4. The Analysis of the Absolute Propositions in al-Shifa, al-Qiyas
  5. The Absolute Propositions in Mantiq al-Mashriqiyyin
  6. The Categorical Syllogistic
  7. The Modal Propositions and Oppositions
  8. The Modal Syllogistic
  9. The Hypothetical Logic
  10. Combining Hypothetical and Categorical Propositions
  11. References and Further Reading
    1. Avicenna’s Logical Treatises and Other Primary Sources
    2. Secondary Sources

1. Life and Works

Abu Ali al-Hussein Abdullah ibn al-Hassan ibn Ali Ibn Sina (called Avicenna in the West) was born in a village near Bukhara in 980 C.E. (375 of Hegira). At a very young age, he was taught the Koran together with much literature. He also learned philosophy, geometry and Indian arithmetic during his childhood and youth. His teacher, al-Natili, taught him logic, starting with Porphyry’s Isagoge. He also studied the other treatises of the Aristotelian Organon and Euclidean geometry, which he mastered readily. Turning to natural science, he became interested in medicine and read many books related to that topic. At the age of 16, he was so knowledgeable in that domain that he was able to treat and cure people. As he became famous as a physician, he was asked to take care of the Sultan Nuh ibn Mansur and succeeded in curing him of his disease. To thank him, the sultan gave him access to the royal library, where he could read many original medical books, together with books on poetry, Arabic grammar and theology. At the age of 18, he was already very knowledgeable in all these disciplines and had read many little-known ancient treatises in various sciences. However, he had some difficulty understanding Aristotle’s Metaphysics, which he read several times without getting the point of it. Only after he had the opportunity to read al-Farabi’s commentary on the Metaphysics, entitled Aghradu kitab ma ba‘d at-tabi‘a, could he understand the aim and interest of metaphysics.

After the death of his father, he left Bukhara because of some troubles at that time and started to travel. He went to several places including Khurasan, Jurjan, and other Persian towns searching for subsistence. He started writing, in particular the Kitab al-Qanun (on medicine) and other books in various domains. He then became the minister for Shams ad-Dawlah (the sultan of Hamadan in Persia) after having cured him of his disease. He worked for the sultan during the day and wrote his books, in particular al-Shifa, in the evenings. He first wrote al-Tabi‘iyyat (Natural Science) of the Shifa, after finishing the first volume of the Qanun. His secretary and his brother assisted him by reading and copying these books.

After the death of the sultan and because of some political complications, he was jailed for four months, during which he wrote Hayy ibn Yaqzan, among other works; however, he was able to escape and travel with his brother and his secretary to Ispahan, where he stayed for the rest of his life. In Ispahan, he became the minister and doctor of the sultan ‘Ala’ ud-Dawla; he also wrote Kitab al-Najat, plus the rest of Kitab al-Shifa, in particular its logical part, and other books on arithmetic, geometry, music, and biology (anatomy and botany), together with a book on astronomy, at the request of the sultan. He also wrote three books on language and recorded his medical observations and experiences in his famous Kitab al-Qanun.

Although Avicenna worked very hard, he also enjoyed a good living. However, during an expedition with the sultan he got sick; he tried to cure himself, but despite his efforts, he became very weak, and the intestinal disease did not disappear completely, recurring several times. He died at the age of 53 (428 of Hegira) from the disease, realizing that no medicine could cure him (al-Isharat wa-t-tanbihat 85-97; Mantiq al-Mashriqiyin; “Avicenna”).

Avicenna’s corpus is rich and varied. He wrote on all the sciences, from astronomy, botany, and medicine to metaphysics, logic, physics, chemistry, linguistics and many others. In the field of logic, which is our concern here, he wrote several treatises, such as: al-Shifa al-Mantiq (which contains the correspondents of all the Aristotelian logical treatises), al-Najat (translated in 2011 by Asad Q. Ahmed under the title Avicenna’s Deliverance: Logic), al-Isharat wa-t-tanbihat, al-Qasida al-muzdawija, and Mantiq al-Mashriqiyyin. The latter, however, is what remains of a much longer book which, apparently, was destroyed in Ispahan in 1034 C.E. and is said to have contained Avicenna’s project of an Eastern philosophy called hikma mashriqiyya (Arnaldez 192).

2. The Definition of Logic

Avicenna defines the field of logic in many of his logical treatises. Some of these definitions include the following:

What is meant by logic, for men, is that it is a regulative (qanuniya) tool whose use prevents his mind from making errors (ʼan yadalla fi fikrihi) (al-Isharat 117).

…for [logic] is the tool that prevents the mind from errors in what men conceive and assent to, and it is what leads to the true convictions by providing their reasons and by following its methods (al-Najat 3).

Logic is thus said to be what “prevents the mind from errors.” This means, first, that it is a tool (an Organon, in classical terms) and, second, that its aim is to help reach the truth and avoid fallacies by studying, in a specific way, the methods used for that purpose. What distinguishes logic from the other sciences is precisely this focus: it studies all the relevant methods that can be used to reach the truth in the best and safest way. The second definition is more precise than the first, however, as it explicitly evokes the notions of concepts and assents. As defined by Tusi, logic is “a science by itself (ʻilmun bi nafsihi) and a tool with regard to other sciences…” (al-Isharat 117, note 1). It is, in other words, both a science in its own right and the study of the best tools for reaching the truth. This opinion can be found in al-Qiyas, where Avicenna says that these two features of logic are not contrary but compatible (al-Qiyas 11.7).

However, what exactly is the subject matter of logic? In al-Shifa al-Madkhal, Avicenna distinguishes between concepts on the one hand and propositions on the other. Propositions are either true or false, whereas concepts are components of propositions and are neither true nor false. They are expressed by general terms, which can be verbs, nouns, or adjectives, and these terms may serve as subjects or predicates in propositions. Logic thus studies conceptions and assents, as the Arabic authors generally held. Although Avicenna states this view of logic explicitly, many commentators take him to endorse another view, namely that logic is concerned with the study of the secondary intentions (or intelligibles, as they are sometimes called). The view that logic studies secondary intelligibles is expressed in al-Ilahiyyat (the metaphysics of the Shifa) rather than in the logical treatises. Avicenna says in that treatise:

The subject matter of logic, as you know, is given by the secondary intelligible meanings, based on the first intelligible meanings, with regard to how it is possible to pass by means of them from the known to the unknown [emphasis added], not in so far as they are intelligible and possess intellectual existence ([an existence] which does not depend on matter at all, or depends on an incorporated matter). (Metaphysics 7, qtd. by T. Street; the expression “as you know” is also rendered “as you have learned” or “as you have known”).

However, in this view, as the emphasized passage shows, the aim of the study is still to show how one can go from the known to the unknown. The real focus and specificity of logic is therefore to attend to this transition. What are these secondary intelligible meanings? According to Street, they are second-level concepts, as distinguished from first-level concepts such as the concepts of animal or human being, which refer to individuals in the real world and which can have the property of being subjects, predicates, or a genus. Street thus focuses on the ontological aspect of this subject since, according to him, the secondary intelligibles are a distinct stretch of being (Street 2008, section 2.1.2). This account of Avicenna’s position is very close to A. I. Sabra’s analysis of the subject of logic in Avicenna’s theory, which relates Avicenna’s distinction to “the Porphyrian distinction between terms in first position and terms in second position” (A. I. Sabra 1980, p. 753), that is, between the terms that refer to external objects and those referring to first-position meanings.

However, Sabra’s interpretation of the first and secondary intelligibles has been criticized by W. Hodges, who finds Sabra’s comparison between the Avicennan and the Porphyrian distinctions “unfortunate, because there is an obvious candidate in Madkhal i.2–4 for the notions that Ibn Sina is referring back to, namely things in first and second mode of existence”. Thus, his own interpretation is different from that of both T. Street and A. I. Sabra in that it stresses the formal character of logic in Avicenna’s frame.

According to Hodges, the definition provided in Mantiq al-Mashriqiyin best clarifies the distinction between words “in first and second mode of existence”. This definition is the following: “And the subject [of logic] is the meanings in the context of their being subject to composition through which they reach a point where an idea is made available in our minds which was not in our minds [before]… (wa mawduʻuhu – al-maʻani min haythu hiya mawduʻatun lil-ta᾿lifi alladhi tasiru bihi muwassilatan ᾿ila tahqiqi shay᾿in fi adhhanina laysa fi adhhanina, …)”. Accordingly, “being in second mode of existence is the same thing as being ‘subject to composition’ – or at least being subject to being made a component of proposition”. In this respect, the secondary intelligible meanings can have such features as being subjects or predicates, universal or particular, and so on. These meanings are parts of propositions, which are in turn the components of a syllogistic mood, serving either as its premises or as the conclusion that follows from them. The conclusion deduced from the premises contains the newly acquired knowledge, which results from the already known premises. Here we find the passage from the known to the unknown stressed by Avicenna in his various definitions of logic.

It is in this sense that logic may be said to be formal, that is, to study the inferences and arguments with respect to their logical structures in order to find out what inferences are conclusive, that is, valid. In Avicenna’s text, the notion of validity might be defined as truth “in all matters” (al-Qiyas 64.10-11), as he shows in that passage that the validity of a syllogistic mood does not and should not depend on the matter of the propositions it contains.

This shows that Avicenna privileges deductive logic over other kinds of logical systems. The logic he defends is demonstrative as he explicitly says in the very beginning of al-Qiyas, where he identifies the aim of logic as follows: “Our aim in the art (sinaʻa) of logic is first and foremost (al-awwal wa bi-dhdhat) to identify the syllogisms and, among them, to study the demonstrative syllogisms. The usefulness of this study for us is to be able, using this instrument, to acquire the demonstrative sciences” (al-Qiyas 3.2-4). Thus, in his frame, logic is deductive, and it is this feature that makes it useful to the other sciences, as it is by applying its inferences and rules that the other sciences can reach the truth. As an instrument or a tool in other sciences, it plays a most important and specific role (al-Qiyas 10.15). Its truths and rules are, because it is deductive, comparable to the truths of mathematics, which are exact, “far from error,” and admitted unanimously and are “well-regulated” (munassaq) and not subject to plurality (al-Qiyas 16.9-10). Likewise, the part of logic that studies proofs and syllogistic inferences, which is the heart of logic, “does not admit plurality if it is rightly understood, for it is of the well-regulated type (min al-qismi al-munassaq)” (al-Qiyas 17.2).
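This definition of validity as truth “in all matters” can be illustrated extensionally: reading the “matters” of a mood as the possible extensions its terms may receive, a mood is valid when no assignment of extensions makes its premises true and its conclusion false. The following minimal sketch (a modern illustration only; nothing of the sort appears in Avicenna’s text) checks two patterns by brute force over a small domain.

```python
from itertools import combinations, product

# A minimal sketch (not in Avicenna's text): "valid = true in all matters" is read
# here as truth under every assignment of extensions to the terms, checked by
# brute force over a small domain.

DOMAIN = range(4)
EXTENSIONS = [frozenset(c) for r in range(len(DOMAIN) + 1)
              for c in combinations(DOMAIN, r)]

def every(S, P):
    """'Every S is P' on a simple extensional reading."""
    return all(x in P for x in S)

def valid_in_all_matters(mood):
    """A mood is valid iff it never has true premises and a false conclusion,
    whatever extensions ('matters') its terms receive."""
    return all(mood(A, B, C) for A, B, C in product(EXTENSIONS, repeat=3))

# Barbara: Every C is B, Every B is A; therefore Every C is A.
barbara = lambda A, B, C: not (every(C, B) and every(B, A)) or every(C, A)
# An invalid pattern: Every C is B, Every A is B; therefore Every C is A.
invalid = lambda A, B, C: not (every(C, B) and every(A, B)) or every(C, A)

print(valid_in_all_matters(barbara))  # True
print(valid_in_all_matters(invalid))  # False
```

On this reading, Barbara comes out true whatever extensions the terms receive, while the second pattern fails in some “matter,” which is the kind of failure the criterion cited above is meant to exclude.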

3. The Different Kinds of Propositions and their Relations

Avicenna distinguishes between several kinds of propositions in the different sections of his system. The propositions may be categorical, modal, or hypothetical.

The categorical propositions are predicative: they contain a subject and a predicate, plus a copula and a quantifier in some cases. This class includes singular propositions, indefinite propositions, and quantified propositions, which are either particular or universal. All kinds are either affirmative or negative.

The modal propositions contain a modal operator in addition to all the components of the categorical propositions. The modalities are expressed using the words “necessarily” and “possibly,” to which a negation may be added. They are also singular, indefinite, or quantified and may be affirmative or negative.

The hypothetical propositions are either conditional or disjunctive. The conditional ones contain the expression “if…then,” while the disjunctive ones contain either the words “either…or” or the expression “not…both.” Unlike the categorical propositions, their elements are not terms, but whole propositions, as in the examples “If the sun rises, it is daytime” and “Either this number is odd or it is even.”

Avicenna says that these hypothetical propositions, like the predicative ones, may also be singular, indefinite, or quantified. When they are quantified, they contain words such as “whenever” (kullama), “always” (da’iman), “never” (laysa al battata), “maybe” (qad yakun), “not (whenever)” (laysa kullama), “not always” (laysa da’iman). However, they differ from the predicative ones in that their elements are propositions related by a logical operator instead of two terms related by a copula. An example of a universal affirmative conditional is the following: AC: “Whenever H is Z, A is B.” The universal affirmative disjunctive, that is, AD, is expressed as follows: “Always either H is Z or A is B.”

Avicenna acknowledges the four relations of the traditional square of opposition between the quantified categorical propositions in al-‘Ibarah (De Interpretatione). However, he does not explicitly talk about a square and does not use geometrical figures at all, although he does precisely define the four relations by means of their truth conditions. These definitions are the following (the commonly used vowels A, E, I, O do not appear in Avicenna’s text; they are used here for convenience):

  • Contradiction (tanaqud): valid when its components are true-false in all matters (A/O and E/I).
  • Contrariety (tadadd): valid when its components are false-false in the possible matter or true-false in the necessary and impossible matters (A/E).
  • Subcontrariety (ma taht at-tadadd): valid when its components are true-true in the possible matter and true-false in the impossible and necessary matters (I/O).
  • Subalternation (called tadakhul, a word close to inclusion): valid when its components are false-true in the possible matter, true-true or false-false in the two other matters (A/I and E/O).

As to the quantified hypothetical propositions, he says that the couples AC / OC and EC / IC, as well as AD / OD and ED / ID, are contradictory. However, the conditional ones do not validate the other relations of the square, given the operators they contain, as their corresponding formulas are as follows (where the quantification ranges over situations):

 AC: (∀s)(Ps ⊃ Qs); IC: (∃s)(Ps ∧ Qs); EC: ~(∃s)(Ps ∧ Qs); OC: ~(∀s)(Ps ⊃ Qs).

The disjunctives may be formalized as follows:

AD: (∀s)(Ps ⊻ Qs), ID: (∃s)(Ps ∨ Qs), ED: ~(∃s)(Ps ∨ Qs), OD: ~(∀s)(Ps ⊻ Qs) (S. Chatti 2014a).

These formalizations of the disjunctive propositions validate not only the contradictions AD / OD and ED / ID, but also the contrarieties AD / ED, the subcontrarieties ID / OD, and the subalternations AD / ID and ED / OD.
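These claims can be checked mechanically on the situation-based reading, provided there is at least one situation (the contrariety and the subalternations would fail vacuously if the set of situations were empty). The following small sketch (a modern illustration, not Avicenna’s own procedure) runs through every assignment of situations to P and Q and confirms the relations just listed.

```python
from itertools import combinations

# Model check of the oppositions claimed for the quantified disjunctive propositions,
# read over a fixed, non-empty set of situations (an assumption made here).

SITUATIONS = range(3)
EXTENSIONS = [frozenset(c) for r in range(len(SITUATIONS) + 1)
              for c in combinations(SITUATIONS, r)]

def AD(P, Q): return all((s in P) != (s in Q) for s in SITUATIONS)   # always, either...or (exclusive)
def ID(P, Q): return any((s in P) or (s in Q) for s in SITUATIONS)   # maybe, either...or (inclusive)
def ED(P, Q): return not ID(P, Q)                                    # never, either...or
def OD(P, Q): return not AD(P, Q)                                    # not always, either...or

for P in EXTENSIONS:
    for Q in EXTENSIONS:
        assert AD(P, Q) != OD(P, Q) and ED(P, Q) != ID(P, Q)   # contradictions
        assert not (AD(P, Q) and ED(P, Q))                     # contrariety AD / ED
        assert ID(P, Q) or OD(P, Q)                            # subcontrariety ID / OD
        assert (not AD(P, Q)) or ID(P, Q)                      # subalternation AD -> ID
        assert (not ED(P, Q)) or OD(P, Q)                      # subalternation ED -> OD
print("all oppositions hold on this reading")
```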

4. The Analysis of the Absolute Propositions in al-Shifa, al-Qiyas

In al-Shifa, al-Qiyas (the correspondent of the Prior Analytics; literally, The Syllogism), however, Avicenna provides a more detailed analysis of the categorical propositions, one that introduces temporal connotations. In this new analysis, the perpetual propositions (containing the words “always” or “never”) are the real opposites of the general absolute propositions (containing “at some times”). For instance, the real contradiction of “Every S is P (at some times)” is not simply “Some Ss are not P,” but rather “Some Ss are never P,” which is a perpetual proposition. In the same way, each general absolute proposition, whether universal or particular, affirmative or negative, is contradicted by a perpetual one. With these new propositions, one can draw several squares with valid relations but, once again, Avicenna does not himself draw any figure (Chatti 2014b 3).
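To make the point explicit in modern notation over times (a rendering that is not Avicenna’s own, offered here only as a gloss), the general absolute “Every S is P (at some times)” may be written (∀x)(Sx ⊃ (∃t)Pxt), where Pxt means that x is P at time t; its contradictory is then (∃x)(Sx ∧ (∀t)~Pxt), that is, “Some Ss are never P,” a perpetual proposition, and not the simple (∃x)(Sx ∧ ~Px) that contradicts the unqualified A proposition.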

Furthermore, some propositions are called special absolute propositions because they contain the expression “at some times, but not always.” These are contradicted by complex propositions containing disjunctions and are comparable to the bilateral possible propositions in this respect (Street 2004, Appendix).

One may ask why Avicenna introduces these temporal connotations. This is because he believes that most categorical propositions are not true if one does not add some further condition. For instance, “Every man is laughing” is true only when one adds “at some times” (al-Qiyas 82). Thus, the temporal connotation helps determine the truth value of the proposition when it is ordinary, that is, part of everyday discourse.

The temporal conditions are specifically Avicennan, together with the condition “as long as he exists” added in some kinds of propositions. However, Avicenna also uses other conditions such as “as long as it is S” and “as long as it is P,” which are already present in Theophrastus’ text, as noted by Wilfrid Hodges in several writings. The former conditions make Avicenna’s logic different from Aristotle’s as well as from al-Farabi’s and Averroes’ systems, which do not contain such temporal qualifications.

5. The Absolute Propositions in Mantiq al-Mashriqiyyin

In Mantiq al-Mashriqiyyin, which supposedly contains Avicenna’s original views but which is, as it stands, very incomplete, since only some fragments of it remain, Avicenna classifies the predicative propositions into the following kinds:

  1. Necessary (daruriya): S is P (as long as S exists) (Mantiq 65, 68)
  2. Implicative (lazima) (personal translation): S is P (as long as it is S) (Mantiq 65, this is also called lazima mashruta: implicative with a condition)
  3. Factual (tari’a) (Mantiq 65) (muwafiqa 68-69): S is P (not perpetually)
  4. Determined (mafruda) (Mantiq 65): S is P (at some determined time), e.g. “Every moon eclipses” (Mantiq 68)
  5. Spread (muntashira) (Mantiq 65): S is P (at some undetermined but regular times), e.g. “Every human being breathes” (Mantiq 68)
  6. Temporal (waqtiya) (Mantiq 65) (hadira = present (Mantiq 68)): S is P (at present), e.g. “Every animal is a man,” which could be true “if there were a time where this would be the case” (Mantiq 68)

Some sentences can even be interpreted in different ways, as witnessed by the following example: “Every sick person is weakened” (Mantiq 72), which could be considered kind (2) (for instance, if the illness is chronic), kind (4) if the weakness occurs at one determined time, or kind (5) if the weakness occurs at undetermined but regular times (Mantiq 72).

These kinds are not really new in Avicenna’s theory, as he addresses kinds (4) and (5) in al-Qiyas and gives an example in that same treatise that is very close to kind (6) by saying: “because it is possible at some time that ‘every B is C’ […], at that time ‘every animal is a person’ will be true” (al-Qiyas 141). As to (1) and (2), they represent two kinds of propositions that were sharply separated in al-Qiyas (1964 21-22) and that are also still separated in al-Isharat, because (1) was considered substantial, while (2) is descriptional (Street 2004 551).

Avicenna also uses the expression lazima mashruta when he talks about the implicative, leading to the question of why he adds this adjective and whether there is any difference between lazima and lazima mashruta. Taking into account the explanations provided in the text (al-Qiyas 65.5-11), the difference might be related to the truth values of both propositions, as Avicenna analyzes some sentences that could be true only when one adds “as long as it is S” and distinguishes between them and those containing “as long as it exists.” In some cases, the condition “as long as it is S” is absolutely indispensable, which sharply distinguishes these sentences from those, called necessary, which could be true with the condition “as long as it exists.” In the first kind, “S is P only when it is S,” otherwise it would be false; this is why Avicenna says “with a condition,” as this condition is crucial to determine the truth-value of the sentence. The example given in the text clarifies this idea. The sentence “A moving thing changes as long as it exists” is not true because something that can move (such as animals or people), or even that does actually move, could not be said to change “as long as it exists,” as the movement and, consequently, the change do not last the whole time the thing exists. By contrast, if one says:

  • “Every man is an animal (as long as it is a man),” this sentence does not differ in its truth-value from the following one:
  • “Every man is an animal (as long as he exists)”; both are true, even if the conditions are not the same in each case.

In this example and similar ones, as Avicenna emphasizes in al-Qiyas, “the situation is not [that] different (la yaftariqu al-halu) from saying ‘as long as it exists’ and saying ‘as long as it is white’ [that is, ‘as long as it is S’]” (al-Qiyas 22.6-7), while in other sentences, such as the first example above (“A moving …”), there is a real difference. Therefore, maybe Avicenna adds “with a condition” (mashruta) to emphasize the case where the condition “as long as it is S” makes a real difference with regard to the truth-value of the sentence. The difference is that the implicative is true only in so far as the thing is described as S, whereas the necessary is true during the whole time that the thing described as S exists.
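The contrast can be made explicit in a similar modern gloss over times (again, not Avicenna’s own notation): the implicative (descriptional) reading of “Every S is P” may be written (∀x)(∀t)(Sxt ⊃ Pxt), so that P is required to hold only while the thing is actually described as S, whereas the necessary reading may be written (∀x)((∃t)Sxt ⊃ (∀t)(Ext ⊃ Pxt)), where Ext means that x exists at t, so that P is required to hold throughout the existence of whatever is at some time S.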

Nevertheless, these two kinds of propositions share at least one thing, which is the continuous link between S and P, expressed in both propositions by “as long as,” which is not obvious in the other kinds.

Kind (3) seems to correspond to the so-called general absolute propositions (containing “at some times”). However, the special absolutes (containing “at some times but not always”), which were analyzed in al-Qiyas and al-Isharat, are not part of this classification.

All these propositions can be modalized by adding the modal words “necessarily” (bi-d-darurati) and “possibly” (bi-l-’imkani), to which one can add negations as follows: “not necessarily” (laysa bi-d-darurati) or “not possibly” (laysa bi-l-’imkani) (al-Qiyas 71). These modal propositions are four-fold (ruba‘iyah) (al-Qiyas 70), as they contain four elements.

6. The Categorical Syllogistic

Avicenna admits three figures and the same valid moods as Aristotle in the Prior Analytics. He explicitly rejects the fourth figure because of its unnaturalness (al-Qiyas 107). However, given the qualifications added in the analysis of the absolute propositions, some rules such as conversion become invalid for some kinds of propositions, which in turn invalidates the demonstrations that use this rule and the moods obtained by means of it. Only some kinds of propositions validate the conversions (of E, I and A). Thus, the general absolute propositions, containing “at some times,” do not validate E-conversion, given that “No man is laughing (at some times)” does not lead to “No laughing thing is a man (at some times),” as “it is impossible to negate the predicate ‘man’ from what is laughing in effect” (al-Qiyas 82). Consequently, since conversion is often used in the reduction of syllogisms of the second and third figures, these syllogisms are not valid when they contain the non-convertible propositions. As a matter of fact, the valid moods admitted by Avicenna contain quantified propositions with the condition “as long as it is S.” These admit E-conversion, as well as the other syllogistic rules, as can be deduced, for instance, from the sentence “Nothing that sleeps wakes while sleeping” that “Nothing that wakes, sleeps while awake” (Street 2004 551). Thus Cesare [that is, the 1st mood of the 2nd figure in Avicenna’s wording] from the second figure is stated as follows: “Every C is B (as long as it is C); No A is B (as long as it is A); therefore, no C is A (as long as it is C)” (al-Qiyas 114-115). Note also that, like all other Arabic logicians, Avicenna starts the syllogism with the minor premise, the major premise coming second. The places of the terms remain correct, however, because, for all moods, the minor term is the subject of the conclusion, while the major term is its predicate.
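The failure of E-conversion for the general absolute, and its validity for the descriptional reading, can be illustrated with a toy, time-indexed model (a modern sketch; neither the reading nor the notation is Avicenna’s). The general absolute E is read here, for simplicity, as “everything that is ever S fails to be P at some time,” and the descriptional E as “nothing is S and P at the same time.”

```python
# Toy, time-indexed illustration (one simple reading, chosen for this sketch) of why
# the general absolute E does not convert while the descriptional E does.
# An "extension" maps each time to the set of objects falling under the term then.

TIMES = (0, 1)
man      = {0: {"zayd"}, 1: {"zayd"}}        # Zayd is a man at every time
laughing = {0: {"zayd"}, 1: set()}           # but laughs only at time 0
sleeping = {0: {"zayd"}, 1: set()}
waking   = {0: set(),    1: {"zayd"}}

def ever(term):
    """Objects falling under the term at some time."""
    return set().union(*term.values())

def e_absolute(S, P):
    """'No S is P (at some times)': every thing that is ever S fails to be P at some time."""
    return all(any(x not in P[t] for t in TIMES) for x in ever(S))

def e_descriptional(S, P):
    """'No S is P (as long as it is S)': nothing is S and P at the same time.
    This condition is symmetric in S and P, so it always converts."""
    return all(x not in P[t] for t in TIMES for x in S[t])

print(e_absolute(man, laughing), e_absolute(laughing, man))                   # True False: no conversion
print(e_descriptional(sleeping, waking), e_descriptional(waking, sleeping))   # True True: converts
```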

In addition, he explicitly states some rules that govern the different figures and the valid moods including, for instance, the following general rule: “the conclusion (natija) follows the least (akhass) premise with regard to quantity and quality, but not with regard to modality” (al-Qiyas 108; al-Najat 33). As a consequence, in the third figure, only particulars are deducible (al-Qiyas 108) and in the second figure, only negatives are deducible, while in the first figure, all kinds of propositions are deducible.

Other rules are stated, such as the following:

  •  In the second figure, no syllogism is possible with two affirmative premises, as the middle can be predicated of two opposite subjects, e.g. ‘body’ can be predicated of ‘men’ and ‘stones’ (al-Qiyas 111).
  •  In all syllogisms of the first and second figures, the major premise must be universal.
  •  In the syllogisms of the third figure, the minor premise must be affirmative.
  •  No syllogism is conclusive when the minor is negative and the major is particular.

In three of the first figure moods (Darii, Barbara, and Celarent), the conclusion may be converted, which gives rise to other (imperfect) moods. Therefore, these moods admit two conclusions: the first one, obtained as usual from the two premises, and a second one, obtained by conversion from the first conclusion (al-Qiyas 110.4-6).

The singulars are treated as universals (al-Qiyas 109.12-13). For instance, the following Barbara syllogism contains only singular propositions: “Zayd is the father of Abdullah; the father of Abdullah is the brother of ‘Amr; therefore, Zayd is the brother of ʻAmr” (al Qiyas 109.13-14).

Avicenna uses all kinds of proofs in demonstrating the moods of the second and third figures. He systematically provides two (or more) proofs for each mood, a direct one by conversion or by ekthesis and an indirect one, by reductio ad absurdum (bi-l-khalf). For instance, the mood Cesare above, that is, “Every C is B (as long as it is C); no A is B (as long as it is A); therefore, no C is A (as long as it is C),” in addition to its proof by conversion (al-Qiyas 114.5-8), is also proven by reductio ad absurdum as follows: “Suppose the conclusion is false, then ‘Some Cs are A (as long as they are C)’ is true; but we have ‘No A is B (as long as it is A)’; so by Ferio, we deduce ‘Not every C is B (as long as it is C)’; but this contradicts the first premise, that is, ‘Every C is B (as long as it is C),’ which is not acceptable” (al-Qiyas 114-115). The negation of the conclusion is thus not compatible with the first premise, which indirectly proves the validity of the whole syllogism.

The other moods are also proven in two or more ways. For instance, in the second figure, Camestres is proven by the conversion of the minor and the conclusion and by reductio ad absurdum, Festino is proven by the conversion of the major and by reductio ad absurdum, and Baroco is proven by reductio ad absurdum and by ekthesis. In the third figure, Darapti is proven by ekthesis, by the conversion of the minor and by reductio ad absurdum; Felapton is proven by ekthesis, by the conversion of the minor, and by reductio ad absurdum; Datisi is proven by the conversion of the minor; Disamis is proven by ekthesis, by the conversion of the major and of the conclusion, and by reductio ad absurdum; Bocardo is proven by ekthesis and by reductio ad absurdum; and Ferison is proven by the conversion of the minor and by reductio ad absurdum (al-Qiyas 114-119).

These proofs are inspired by Aristotle’s, but some of them cannot be found in Aristotle’s texts. These are, for instance, the proofs by ekthesis of the second figure mood Baroco (al-Qiyas 118.10-12) and of the third figure mood Bocardo (al-Qiyas 119.1-2). However, Avicenna is not the first philosopher in the Arabic tradition to provide proofs of these two moods by ekthesis. Before him, Al-Farabi also proved them both by ekthesis in his Kitab al-Qiyas (al-Farabi 1986a 25.15-26.4, 28.16-29.1). Bocardo’s proof is exactly the same as al-Farabi’s, but Avicenna’s proof of Baroco is different from al-Farabi’s. To see the difference, let us state them both.

Baroco itself is the following:

Every A is B [“B belongs to every A” in one of al-Farabi’s phrasings]

Some C’s are not B [“B is not in some C”]

Therefore Some C’s are not A [“A is not in some C”]

The proof by ekthesis relies on the assumption that since “B is not in some C,” then B is negated by “all this part”; therefore, “suppose that this part is designated on its own and let us call it D” (al-Farabi, al-Qiyas 1988 131; 1986a 25.17-18), then we have the following steps in  Al-Farabi’s proof and Avicenna’s proof:

Al-Farabi’s proof:

  1. B is not in some C
  2. B belongs to no D (assumption)
  3. B belongs to every A (major premise)
  4. D belongs to no B (from 2 by conversion)
  5. D belongs to no A (from 3, 4 by Celarent)
  6. A belongs to no D (from 5 by conversion)
  7. D is some C (assumption)
  Therefore A is not in some C (from 6, 7 by Ferio)

(Al-Farabi, al-Qiyas 1988 131.9-17; Kitab al-Qiyas 1986a 25.15-26.4)

Avicenna’s proof:

  1. Some C’s are not B
  2. No D is B (assumption)
  3. Every A is B (major premise)
  4. No D is A (from 2, 3 by Camestres)
  5. Some C is D (assumption)
  6. Therefore Not every C is A (from 4, 5 by Ferio)

(Avicenna, al-Qiyas 116.10-12)

Avicenna’s proof is shorter, as it applies Camestres directly to the assumption and the major premise to obtain the crucial premise “No D is A,” while al-Farabi, although he notes that steps 2 and 3 are the premises of Camestres, does not apply Camestres directly; rather, he converts 2 to obtain 4, applies Celarent to obtain “D belongs to no A,” and converts again to arrive at “A belongs to no D,” which is needed to deduce the conclusion.

However, both proofs share the same difficulty, which is that the assumption “Some C is D” is not warranted by the premises of Baroco: since O does not have existential import, C could be empty, in which case the assumption would be false. This difficulty is raised by Wilfrid Hodges, who holds that Avicenna could not have missed it but probably did not take it to make the ekthetic proof of Baroco illegitimate.
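The gap can be made concrete with a small set-theoretic sketch (a modern illustration, under one way of spelling out the import assumptions just mentioned: affirmatives carry existential import, negatives do not). When C is empty, both premises of Baroco can still hold, yet no term D whatsoever verifies the auxiliary assumption “Some C is D.”

```python
# Illustration of the ekthesis gap in Baroco when the subject of the O premise is empty.

def a(S, P):            # "Every S is P", with existential import
    return bool(S) and all(x in P for x in S)

def o(S, P):            # "Some S is not P", read as the contradictory of A (no import)
    return not a(S, P)

def i(S, P):            # "Some S is P", with existential import
    return any(x in P for x in S)

A, B, C = {"horse"}, {"horse", "cow"}, set()

print(a(A, B), o(C, B))                    # True True: both premises of Baroco hold
print(any(i(C, {x}) for x in A | B | C))   # False: no designated part D makes "Some C is D" true
```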

As to al-Farabi, he preceded Avicenna in the use of ekthesis in the proof of Baroco, and he was also the first logician in the Arabic world to defend the idea that quantified negative propositions do not have existential import whereas the affirmatives do, since affirmatives such as “every man is white” “are false when the subject does not exist” (Kitab al-Maqulat 1986b 124.14), whereas their negative contradictories are true in that case (124.15). Yet he did not seem to find the proof of Baroco problematic and did not provide any other proof for it. His justification of the use of ekthesis is that conversion is not applicable in that case. Al-Farabi did not mention the difficulty above, and the concrete example he provides involves non-empty terms, namely: A: Horse, B: whinnying, C: animal, D: man. Thus he says:

If we consider that the animals from which we have denied the whinnying are men, for example, we then have ‘every horse is whinnying’ and ‘No man is whinnying.’ It follows ‘No man is a horse’ as we showed above. And ‘men are some animals,’ therefore ‘Some animals are not horses’. (27.10-12).

Since in this example C is not empty, we might consider that al-Farabi did not pay immediate and sufficient attention to the fact that C could be empty in some cases, as the premise O in Baroco does not rule out that case.

7. The Modal Propositions and Oppositions

As we said above, Avicenna expresses the modal propositions by adding the words “necessarily” and “possibly” to all kinds of propositions. He negates the modality to obtain the contradictory of the affirmative modal proposition, whether singular, indefinite, or quantified. This syntactic device works when the modality is external (that is, at the beginning of each proposition); this seems to be the strategy used by Avicenna in his analysis of the modal oppositions (al-Qiyas 49-50), as he just added, in that part of the text, the word laysa (‘no’ or ‘not’) in front of the modal propositions, even the quantified ones, to obtain their negations. This differs from al-Farabi’s style, as the latter always puts the modal word in front of the predicate. However, Avicenna also uses internal formulations by putting the modal word at the end of the proposition, in particular in his modal syllogistic.

Avicenna provides three definitions of possibility, which are the following: 1. the unilateral possible, 2. the bilateral possible, and 3. what is neither actual nor necessary, nor impossible. The latter is related more specifically to the future. However, he privileges the second meaning, that is, the bilateral possible. In addition, he also provides the negation of the bilateral possible for all kinds of propositions, including the quantified ones, whether the modality is internal or external. He presents the entailments and equivalences between the modal propositions in his al-Shifa al-‘Ibarah (the correspondent of De Interpretatione) and shows in particular that the possible in Tables I and III rejected by Aristotle (De Interpretatione 22a14–22a32) should be interpreted as bilateral because, in that case, all the entailments become valid (S. Chatti 2014b 9-13).

As to necessity, it is the dual of possibility, because “□ ≡ ~◊~” and “◊ ≡ ~□~.” The bilateral possible is expressed as “◊α ∧ ◊~α,” while its negation can be formalized as “□α ∨ □~α.” When the propositions are quantified, the following couples of contradictories result: □A / ◊O, □E / ◊I, □I / ◊E, □O / ◊A. For the bilateral possible, the contradictories are as follows when the possibility is external: ◊A ∧ ◊O / □O ∨ □A and ◊E ∧ ◊I / □I ∨ □E, and as follows when it is internal: A◊ ∧ E◊ / I□ ∨ O□ and “Some Ss are ◊~P and ◊P” / “Every S is □~P or □P.”
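These dualities are easy to verify on a simple possible-worlds reading of ◊ and □ (“true in some world” / “true in all worlds”); the following small check (a modern illustration only, not Avicenna’s apparatus) confirms, for every distribution of truth values across a handful of worlds, that the negation of the external bilateral possible ◊A ∧ ◊O coincides with □O ∨ □A.

```python
from itertools import product

WORLDS = range(3)

def check(a_vals):
    """a_vals[w]: truth value of the A proposition at world w; O is its contradictory."""
    poss_a = any(a_vals)                    # ◊A
    poss_o = any(not v for v in a_vals)     # ◊O
    nec_a  = all(a_vals)                    # □A
    nec_o  = all(not v for v in a_vals)     # □O
    return (not (poss_a and poss_o)) == (nec_o or nec_a)

print(all(check(vals) for vals in product((True, False), repeat=len(WORLDS))))   # True
```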

The necessity operator may be added to all the assertoric propositions above, which contain various conditions. For instance, one may say: “Necessarily Zayd writes (as long as he writes),” or “Necessarily the moon eclipses (at some determined time).” When the proposition is necessary but does not contain any condition, the necessity is said to be absolute (Lagerlund 233). This absolute necessity is very rarely used, as most of the necessary propositions contain some condition, in particular the existential condition (that is, “as long as S exists”).

The oppositional relations between the modal quantified propositions may be represented by means of a dodecagon (a figure with 12 vertices), where the other oppositions of the square can be added to the contradictions already mentioned. Avicenna has provided all the contradictions and some of the subalternations, contrarieties, and subcontrarieties; however, the remaining ones can easily be demonstrated in his system by means of the very relations he himself admits. The figure representing the modal singular (and indefinite) propositions is a hexagon, where all the relations are given by Avicenna, except two subcontrarieties, which are missing in his text but could easily be added (Chatti 2014b 10).

8. The Modal Syllogistic

Avicenna’s modal syllogistic differs on some points from Aristotle’s and from Averroes’ syllogistic, which is very Aristotelian. As we saw above, Avicenna primarily uses two kinds of possibility, the third one being mentioned but not given much importance. This influences the modal syllogistic rules and, consequently, the validity of the syllogisms. According to Avicenna, conversion holds for necessary E and leads to necessary E as well. However, necessary A does not lead to necessary I; rather, it leads to possible I. For instance, from the necessary proposition “Every laughing [thing] is necessarily a man” (or “is a man necessarily”), one cannot deduce “Some men are necessarily laughing”; rather, the conversion leads to “Some men are possibly laughing” (al-Isharat 336), as the sentence “Some men are necessarily laughing” is not true, while the initial A proposition is necessarily true. The same can be said about necessary I (al-Isharat 335-336; Street 2008). It can be noted here that the example taken by Avicenna is precisely the one that illustrates the invalidity of conversion when it is applied to the general absolute propositions (see sections 4 and 6 above). As to possible E and possible O, they do not convert, because if “Possibly no man is a writer” is true, “Possibly no writer is a man” is false and “Possibly some writers are not men” is also false (al-Isharat 338). This is so because the predicate “man” cannot be denied of the subject “writer,” even in a possible proposition, given that a writer is necessarily a man and cannot be anything else.

By contrast, possible A and possible I do convert. However, if the possible is narrow (=bilateral), the conversion leads to a general possible proposition (=unilateral), that is, the general possible I (al-Isharat 339). According to Avicenna, “if every C is possibly B” (where the possibility is bilateral and internal), or “if some C is possibly B” (by the bilateral kind of possibility), then “Some B is possibly C” (where the possibility is unilateral and internal), “Otherwise, it [would] not [be] possible for a thing that is B to be also C” (al-Isharat 339). An example can show this: if we say “Every human being is possibly a writer” (and, the possibility being bilateral, “possibly not a writer” too), we can deduce that “Some writers are possibly human beings”; otherwise, it would be impossible for any writer to be a man, and this of course is not plausible. Naturally, in the second proposition, the possibility is unilateral, as it would be false to say “Some writers are possibly not human beings.” These conversions also hold with internal modalities.

As a matter of fact, in stating the modal syllogisms, Avicenna uses internal modalities as noted by Tony Street (Street 2008, section 2.3.1) who explains that, according to Avicenna, necessity has to do, above all, with being: “It depends on how things are and not on how things are described” (Street 2008, section 2.3.1). Consequently, the necessary propositions used in his syllogistic should contain “as long as it exists,” and the necessity operator is internal, as it occurs most of the time at the end of the proposition.

As to the moods held, there are several analyses of the modal syllogistic in the literature. One of them is the analysis provided by Tony Street in his article “An Outline of Avicenna’s Syllogistic” (2002). According to this author, the moods held in the first figure are the following: “[AXA], [ALA], XXX, XLX, LXL, LLL, MMM, MXM, MLM” to which he adds “two imperfect mixes: LML, XMM.” In the second figure, he says that the following are held valid: “LLL, XLL, LXL, MLL, LML” and in the third figure, the following are admitted: “XXX, LLL, LXL, XLX, MMM, XMM, MXM, LML, MLM” (Street 2002 160), where A: perpetual (containing the word “always”), X: absolute, M: possible, L: necessary.

Another analysis is provided by Paul Thom, who also uses the same kinds of propositions, that is, Perpetual (=P), General Absolute (=X), Possible (=M) and Necessary (=L) propositions. He states the universal affirmatives of each kind as follows: “1. X, the universal affirmative general absolute “every j is b”: jM ⊂ bm, [= every possible j is sometimes b] 2. P, the universal affirmative perpetual “every j is always b”: jM ⊂ bm [= every possible j is always b], 3. L, the universal affirmative necessity-proposition “every j is necessarily b”: jM ⊂ bL [every possible j is necessarily b], 4. M, the universal affirmative possibility-proposition “Every j is possibly b”: jM ⊂ bM [= every possible j is possibly b]” (Thom 2008 363, explanations inside brackets added following Thom’s interpretations of the subscripts, 363.9-10). In this interpretation, which Thom calls the “simple de re reading” (Thom 2008 363.12), the moods held are the following: “(i) the LLL, PLP, XLX, and MLM syllogisms of Fig 1, along with (ii) the LPL, PPP, XPX and MPM syllogisms, and also (iii) the LXL, PXP, XXX, and MXM syllogisms of the same Figure” (Thom 2008 364.5-7), plus the MMM and XMM moods (Thom 2008 364). These moods are validated by the semantics that Thom presents in his article. In the second figure, Thom says that the following moods are validated by the same reading and semantics: “LML-2, MLL-2” (Thom 365), plus the “XPL and PXL syllogisms” which are said to be “equivalent to XMX-1 and PMP-1” (Thom 2008 365). In the third figure, the following moods are validated: “XMX, LML, MMM, PMP and XMX” (Thom 2008 365).

Note that in both accounts the perpetual is added to the usual modalities, which could be justified by the fact that Avicenna uses the word da’iman (=always or perpetually) when he talks about the modal syllogistic and that he sometimes uses both words together by saying necessarily and always [bi-al-darurati da᾿iman] (al-Qiyas 128.5, 7).

However, this addition of the perpetual as a “separate class” of propositions has been criticized by Wilfrid Hodges who considers that the perpetual sentences are nothing more than those called “necessary” by Avicenna himself, as they have the same logical behavior. Therefore, although he agrees with Street’s list of the modal moods, he contests those containing the perpetual propositions. His analysis of the Avicennan modal syllogistic uses a two-dimensional framework that quantifies the times added to the usual quantification of objects. An example of this two-dimensional framework can be found, for instance, in the article “The move from one to two quantifiers” (Hodges 2015).

In these accounts, the absolute proposition seems to be interpreted as a general absolute one, that is, as a proposition containing “at some times.” However, Avicenna explicitly says in several places that this kind of proposition does not convert and that the absolute propositions used in the syllogistic moods should be convertible. The convertible absolutes contain the condition “as long as it is S” as we saw above (section 6). Therefore, the absolutes used in the different moods should contain this condition, even if it is a first figure mood, as the conversion should lead to a proposition of the same kind. For instance, E-conversion leads from “No C is B (as long as it is C)” to “No B is C (as long as it is B)”; it does not lead to “No B is C (at some times).” The discussion of Barbara XLL where the absolute is interpreted as “Everything described as B is A at some times and this time is the one where it is described as B (kull ma yusafu bi [B] yakunu lahu [A] waqtan ma, wa dhalika al-waqtu huwa kawnuhu mawsufan bi [B])” (al-Qiyas 128.5-6) confirms this opinion, given that Avicenna clearly explains exactly what he means by this absolute. This mood is illustrated by the following concrete example: “All snow is white by necessity, and every white thing dissociates the eye as long as it is white; therefore, all snow always dissociates the eye” (al-Qiyas 129. 1-2). This example is translated as follows by Wilfrid Hodges:

(a-d) All snow is coloured white throughout its existence.

(a-ℓ) Everything coloured white dissociates the eye so long as it is coloured white.

(a-d) Therefore all snow dissociates the eye throughout its existence.

This translation shows that Wilfrid Hodges, following Avicenna’s explanations, uses an ℓ sentence (that is, a sentence containing “as long as it is S”) in this mood, which means that the major proposition is not a general absolute (containing “at some times”), but rather what Avicenna calls an implicative (lazima) in Mantiq al-Mashriqiyin, which, unlike the former, is convertible. The mood itself is a Barbara XLL since Avicenna uses the word da᾿iman (always) in the conclusion. It therefore seems that Barbara XLL is admitted when the major is descriptional. This is confirmed by the list of first figure moods admitted by Avicenna, which contains, among other possibilities, the ℓdd and ℓℓℓ moods, as presented by W. Hodges in “The move from one to two quantifiers” (Hodges 2015, section 5).

The moods above are different from Aristotle’s, as noted by these authors. For instance, Barbara LML is not valid in Aristotle’s modal logic, as Aristotle has Barbara LMM instead (Pr.A. 35b37-36a2). Avicenna differs also from Aristotle with regard to the conversion of necessary A, which leads to necessary I in Aristotle’s theory (Pr. A. 25a 31-33). However, as we saw above, it leads to possible I in Avicenna’s theory. This makes his theory different also from Averroes’, who tries to validate all the moods held by Aristotle.

9. The Hypothetical Logic

Hypothetical logic is the part of the system that deals with the conditional and disjunctive propositions as they are stated, for instance, in Stoic logic. The hypothetical syllogism in general is called qiyas sharti. The logicians of the Arabic world, such as al-Farabi, include these propositional syllogisms in their treatises corresponding to the Prior Analytics and sometimes to the Categories. However, unlike al-Farabi and Averroes, who just present the Stoic indemonstrables and some of their variants, Avicenna presents these same indemonstrables at the end of al-Qiyas (389-407), but also develops a whole hypothetical syllogistic whose valid moods contain conditional as well as disjunctive propositions, and he even combines such propositions with the categorical ones. The former syllogisms are called istithna᾿i (translated as “exceptive” in Street 2004, for instance), whereas the latter are called iqtirani (usually translated as “conjunctive”). The iqtirani / istithna᾿i distinction involves all kinds of syllogisms, whether categorical or hypothetical, as the class of iqtirani syllogisms includes the categorical syllogisms and one kind of hypothetical one, whereas the istithna᾿i syllogisms are the usual Stoic ones. The difference between the two kinds is that the premises of an istithna᾿i syllogism include either the conclusion or its contradictory, while in the iqtirani ones, the conclusion is not included in the premises. The hypothetical syllogistic with the conditional propositions is almost the exact duplicate of the categorical syllogistic, as it contains three figures and moods corresponding to the usual categorical ones. When the disjunctive propositions are introduced, many moods are added that do not necessarily correspond to the categorical ones.

The sharti (hypothetical) propositions are of two kinds: those containing “if…then” and those containing “either…or.” The former are the conditional propositions, and they express either what Avicenna calls the luzum, that is, the strong implication (the relation of “following from”), or what he calls the ittifaq. The latter are the disjunctive propositions, which express either a stronger or a weaker separation. Here too, he sometimes speaks of ittifaq. In the luzum, or real implication, the consequent necessarily follows the antecedent, so that if the antecedent is true, the consequent must be true because there is a semantic or causal link between them. For instance, one may say: “If the sun is up, then it is daytime”; here, the antecedent is the cause of the consequent, and the consequent “follows the antecedent in the reality (fi al-wujudi) and rationally (fi al ʻaqli)” (al-Qiyas 233.16). The causality relation may be involved in several ways: either the antecedent is the cause of the consequent, as when someone says “If the sun is up, then it is daytime,” or the antecedent is itself “caused [by] and not separated (ghair mufariq)” from the consequent, or both the antecedent and the consequent are caused by the same thing, as with lightning and thunder, which are caused by “the movement of the wind in the clouds” (al-Qiyas 234.4). In all these examples, there is a natural link between the antecedent and the consequent, which are related in all situations. This natural or semantic link may be present, and make the conditional true, even when both propositions are false, as when one says: “If men are stones, then they are inert” (al-Qiyas 261.1-2), or when the antecedent is false while the consequent is true, as when one says: “If five is even, then five has a half” (al-Qiyas 260.14). In both examples, the entailment is due to the semantic link between the antecedent and the consequent.

However, in the ittifaq, which is translated as “chance connection” by N. Shehaby and evokes either the notion of accident or that of agreement, there is no such natural, semantic, or causal link, as shown by the following example: “If men exist, then horses exist too” (al-Qiyas 234.14). Here, the existence of men is not the cause of the existence of horses, nor is it caused by it, nor are the two propositions related to each other in any way, whether semantically or causally. Each is true on its own, and neither needs the other in order to be true. Therefore, we could talk of some kind of concomitance, because both are true. However, the truth of the consequent is not due to the truth of the antecedent; they just happen to be true together (ittafaqa ittifaqan) (al-Qiyas 234.15).

Elsewhere, Avicenna talks about al-muwafaqa fi al-sidqi (al-Qiyas 265.11), which means the “agreement” in the truth, that is, the fact that both propositions are true. Therefore, muwafaqa or ittifaq (which amount to the same, as both words come from the same root) mean in that case the agreement in the truth, which could be rendered simply by the word “concomitance.” In addition, he offers the sentence “if men are talking, then donkeys are braying” as an example and says that “here, it suffices that the consequent is true, for this reason the truth of this proposition is clear” (al-Qiyas 265.13-14). This idea that the truth of the consequent alone makes the whole conditional proposition true is also evoked by Wilfrid Hodges, who says “[i]n this passage it seems that Ibn Sina understands an ittifaqi sentence (a, mt)(p, q) to be one which is taken to be true on the basis that ‘Q’ is always true” (Hodges 2014 237). Thus, perhaps, as Wilfrid Hodges clearly suggests against Shehaby’s interpretation based on the notion of chance, ittifaq is best rendered by the notion of agreement (or accordance) of the consequent (or of both propositions) with reality. This may be shown by another example provided by Avicenna, which is: “If every donkey is talking, then every man is talking,” which is true “by means of concordance or agreement (ʻala maʻna al-muwafaqa)” (al-Qiyas 270.10). In this example, the antecedent is false and it is the truth of the consequent that makes the whole proposition true. This is confirmed by the following text, which sounds like a definition: “Agreement (muwafaqa) is nothing but (laysa illa) the configuration in which the consequent is true (wa al-muwafaqa laysa illa nafsu tarkib al-tali ʻala annahu haqqun)” (al-Qiyas 279.15).

In any case, in all these situations, whether both propositions are true or the consequent alone is true, the common feature is that the truth of the sentence is not due to the link between the antecedent and the consequent, as there is no strong (semantic or causal) link that could make us deduce the consequent from the antecedent. If the sentence is true, it is because all its elements, or its consequent alone, are in accordance with reality. This accordance is perhaps what Avicenna means by ittifaq or muwafaqa, which he sometimes associates with the word mutabaqa, that is, correspondence (al-Qiyas 265.10).

The next issue to address is how Avicenna analyzes the hypothetical inferences and what syllogisms and moods are admitted.

In his study of the exceptive (istithna᾿i) syllogisms, Avicenna does not evoke the Stoics at all, nor does he cite his sources. However, he very often says that these syllogisms are common or commonly known (mashhur) (al-Qiyas 390, 391, 395). He once evokes a “man who has advanced knowledge in the science of medicine” (al-Qiyas 398.12) and “people who strongly defend the first teacher (that is, Aristotle)” (al-Qiyas 398.14). These remarks suggest that the physician he refers to might be Galen, since Galen presented the Stoic indemonstrables in his Institutio Logica, as Lukasiewicz notes in his 1934 article on Stoic propositional logic (Lukasiewicz 1934, qtd. in Largeault 1972 16), while the defenders of Aristotle might be the Peripatetics in general. Al-Farabi is also evoked, and even criticized, in this part of the system.

The exceptive syllogisms presented are the Modus Ponens, Modus Tollens, Modus Tollendo Ponens, Modus Ponendo Tollens, and the syllogism where the first premise is a negated conjunction, which Avicenna expresses by using a disjunction of two negative propositions.

Note that when the ittifaq is concerned, the deductions by means of these indemonstrables cannot be made, as from the two premises:

If every man is speaking, then every donkey is braying

But not every donkey is braying

one cannot deduce “Not every man is speaking” (al-Qiyas 267.1-11) by Modus Tollens, because “Every donkey is braying” does not follow from “Every man is speaking.”

However, Avicenna criticizes al-Farabi, who distinguishes between a complete implication (which is convertible and is thus an equivalence) and an incomplete one (which is not convertible): such a distinction, Avicenna objects, relies on the content of the propositions, which should not be taken into account in these syllogisms, since only the forms of the propositions should be considered. According to al-Farabi, with the complete implication, from “If p then q” and “q,” one may deduce “p” (al-Maqulat 1986b 79, symbols added). According to Avicenna, by contrast, “When we say: ‘If A is B, then J is D’, and if this is a premise of our syllogism, we must consider what this means by considering its form, and decide what follows from this form. Saying that its consequent may be converted with its antecedent depends on something that is not the form of the premise; rather it has to do with the matter of the premise. This is like asking if the predicate of the universal affirmative is identical with its subject or not” (al-Qiyas 391.16-392.3). This means that the notion of form plays a fundamental role in Avicenna’s logic, whether with regard to the propositions or with regard to the inferences. The form of the conditional premise is shown by the order of the elements within the proposition.

As to the study of the iqtirani syllogisms, which are part of the hypothetical syllogistic, it is made possible by the fact, already mentioned in section 3, that the hypothetical propositions, whether conditional or disjunctive, are also quantified by Avicenna. These quantifications have been interpreted in two ways in the literature: some, like Nicholas Rescher (1963), say that Avicenna quantifies using times, while others, like Zia Movahed (2012), privilege the quantification of situations. As a matter of fact, the latter solution is preferable in that it is more general and closer to Avicenna’s examples.

As stated above (section 3), Avicenna uses the words “whenever” and “never” to express the two universal conditional propositions and the words “maybe” (qad yakun) and “not whenever” (laysa da’iman) to express the two particular conditionals. Consequently, the conditional quantified propositions can be formalized as follows:

 AC: whenever P, then Q = In all situations, if…then…: (∀s)(Ps ⊃ Qs)

 EC: never (if P then Q) = In all situations, if… then ~… : (∀s)(Ps ⊃ ~Qs) = ~(∃s)(Ps ∧ Qs)

 IC: maybe (if P then Q) = Not (never if…then): ~(∀s)(Ps ⊃ ~Qs) = (∃s)(Ps ∧ Qs)

 OC: not (whenever P, then Q) (laysa da’iman) (al-Najat 45) = ~(∀s)(Ps ⊃ Qs) = (∃s)(Ps ∧ ~Qs).
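The pairs of formulas given above for EC and IC are indeed equivalent, as a brute-force check over all assignments of situations to P and Q confirms (a modern verification, not part of Avicenna’s text):

```python
from itertools import combinations

# Checks that the two renderings given for EC, and the two given for IC, coincide
# on every assignment of a small set of situations to P and Q.

SITUATIONS = range(3)
EXTENSIONS = [frozenset(c) for r in range(len(SITUATIONS) + 1)
              for c in combinations(SITUATIONS, r)]

ec1 = lambda P, Q: all((s not in P) or (s not in Q) for s in SITUATIONS)   # (∀s)(Ps ⊃ ~Qs)
ec2 = lambda P, Q: not any((s in P) and (s in Q) for s in SITUATIONS)      # ~(∃s)(Ps ∧ Qs)
ic1 = lambda P, Q: not ec1(P, Q)                                           # ~(∀s)(Ps ⊃ ~Qs)
ic2 = lambda P, Q: any((s in P) and (s in Q) for s in SITUATIONS)          # (∃s)(Ps ∧ Qs)

print(all(ec1(P, Q) == ec2(P, Q) and ic1(P, Q) == ic2(P, Q)
          for P in EXTENSIONS for Q in EXTENSIONS))   # True
```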

As to the disjunctive propositions, they are expressed using the words “always, either … or…,” “never, either … or…,” “maybe, either…or…,” and “not always, either… or….”

Consequently, they can be formalized as follows:

 AD: Permanently (always) either P or Q: (∀s)(Ps ⊻ Qs)

 ED: Never either P or Q (al-Qiyas 283): ~(∃s)(Ps ∨ Qs)

 ID: Maybe either P or Q … (al-Qiyas 288, 290): (∃s)(Ps ∨ Qs)

 OD: Maybe not either P or Q (maybe not = not always): ~(∀s)(Ps ⊻ Qs) (S. Chatti 2014a 190).

These formalizations differ from those presented by Nicholas Rescher in his Studies in the History of Arabic Logic (1963), which are the following:

The universal affirmative: (∀t)(Pt ∨ Qt)

The universal negative: (∀t)~(Pt ∨ Qt)

The particular affirmative: (∃t)(Pt ∨ Qt)

The particular negative: (∃t)~(Pt ∨ Qt) (Rescher 233).

Apart from the quantification of times, some of the above formulas might not be satisfying because Rescher uses a unique symbol for the disjunction in all his formulas. If this symbol represents the inclusive disjunction, then the proposition AD can be true when its elements are both true, which does not conform to Avicenna’s examples, since these examples involve incompatible propositions and the disjunction is clearly intended to be exclusive. If it represents the exclusive disjunction, then AD is rendered well but not ID, which would not be different from AD if one considers only one situation. As to ED, its formalization is not obvious.

The literature provides another formalization of the disjunctive quantified sentences, offered by Wilfrid Hodges, which seems promising. This interpretation is the following (where “mn” stands for munfasil, that is, disjunctive):

(a, mn) At all times t, at least one of p and q is true at t

(e, mn) At all times t, if p is true at t, then q is true at t

(i, mn) There is a time at which p is true and q is not true

(o, mn) There is a time at which neither p nor q is true

These formulas differ from Rescher’s with regard to ED and ID, but AD and OD are rendered in the same way and do not account for the exclusive character of AD. In any case, all the above formalizations still need to be tested against the full set of moods that Avicenna admits in this part of his logic, a task that remains to be done and on which several scholars are currently working.

Avicenna combines the disjunctive propositions with the conditional ones to state many new syllogisms that do not all correspond to the known categorical syllogisms. He states a clear correspondence between the conditional and disjunctive propositions, as in his theory, “p → q” is equivalent to “~p ∨ q” (al-Qiyas 251. 16-17) while “p ⊻ q” is equivalent to “p ≡ ~q,” “~(p ≡ q)” and “~p ≡ q” (al-Qiyas 248). These equivalencies make it possible to express any conditional proposition with a disjunctive one. However, some syllogisms such as Darapti and Felapton of the third figure require the A proposition(s) they contain to have a true antecedent; otherwise, they are not valid. Consequently, to validate these syllogisms, the A premise has to be expressed in this way: “(∃s)Ps ∧ (∀s)(Ps ⊃ Qs),” which becomes: “p ∧ (p ⊃ q)” when we consider only one situation. However, then the conditional proposition is no longer equivalent to a disjunctive one, as “p ∧ (p ⊃ q)” is not equivalent to “~p ∨ q”. This introduces some confusion in the definition of the conditional, which does not seem to be formally satisfying. One has to note, however, that the conditional in Avicenna’s theory should not be interpreted as a material conditional, as he does not give it the truth conditions of the material conditional. It seems thus to be an intensional implication, which deserves a more detailed examination.
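On a purely truth-functional reading, the non-equivalence just noted can be verified mechanically. The following short Python sketch (an added illustration using material readings of the connectives, which, as said above, is not exactly Avicenna’s conditional) compares “p ∧ (p ⊃ q)” with “~p ∨ q”:

    # Comparing "p and (if p then q)" with "not-p or q" under material readings
    # (illustrative sketch only; Avicenna's conditional is arguably not material).
    def implies(a, b):
        return (not a) or b

    for p in (True, False):
        for q in (True, False):
            strengthened_a = p and implies(p, q)   # the A premise with a true antecedent
            disjunctive_form = (not p) or q        # ~p v q, the disjunctive counterpart of p -> q
            print(p, q, strengthened_a, disjunctive_form)
    # Whenever p is false, the two columns disagree (False versus True), so
    # "p & (p -> q)" is not equivalent to "~p v q", which is the confusion noted above.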

10. Combining Hypothetical and Categorical Propositions

First of all, what is worth noting in Avicenna’s hypothetical logic is that, in his system, the hypothetical propositions are expressed as couples of propositions, each containing a subject and a predicate, related by a logical operator, that is, either a conditional or a disjunction. For instance, “If H is Z, then A is B,” or “Either H is J or A is B,” or “Whenever H is Z, then A is B” (al-Qiyas 305). The elements of the hypothetical propositions are therefore themselves categorical propositions of the SP (subject/predicate) kind. It is thus not surprising to find some mixed syllogisms containing both hypothetical and categorical propositions, in which the subjects and the predicates are themselves related in some way across the premises and the conclusion. An example of such syllogisms is the following:

 A is either B or C or D

 Every B and [every] C and [every] D is H

 Therefore A is H (al Isharat 438)

In this syllogism, the first premise contains two disjunctions (which must be inclusive in order for the syllogism to be valid), while the second premise contains three universal categorical propositions related by conjunctions. This combination leads to a categorical proposition containing a subject and a predicate. The validity of the syllogism is therefore due to the logical relations between the terms A, B, C, D and H, and not only to the relations between the different propositions taken as wholes. The syllogism could be part of the usual [categorical] syllogistic if the disjunction were used in that theory. The whole syllogism shows that Avicenna uses the inclusive disjunction in his system, even if he does not say it explicitly.
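One natural way of spelling out this validity, treating the capital letters as terms and reading the disjunctions inclusively (a reconstruction in the notation used earlier in this article, not Avicenna’s own formalization), is the following:

  (∀x)(Ax ⊃ (Bx ∨ Cx ∨ Dx))

  (∀x)(Bx ⊃ Hx) ∧ (∀x)(Cx ⊃ Hx) ∧ (∀x)(Dx ⊃ Hx)

  Therefore (∀x)(Ax ⊃ Hx)

Since the second premise is equivalent to (∀x)((Bx ∨ Cx ∨ Dx) ⊃ Hx), the conclusion follows by the transitivity of the implication, which is exactly why the validity turns on the terms A, B, C, D and H rather than on the premises taken as unanalysed wholes.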

Another syllogism looks more like a hypothetical one, as it has the following structure:

  If A is B, then every C is D

  Every D is H

  Therefore if A is B, then every C is H (al Isharat 441)

Here, the first premise and the conclusion contain conditionals, but the second premise is clearly a universal categorical proposition. The validity of the syllogism is due to the logical links between the terms and also between the propositions as wholes. The terms are involved because “every C is H” follows from “every C is D” together with “every D is H” by the transitivity of the implication (that is, by a Barbara syllogism). The whole hypothetical syllogism says that if “A is B” implies “every C is D,” it also implies what follows from it, that is, “every C is H.” This can be formalized as follows (where “A is B”: p): “{[p → (∀x)(Cx ⊃ Dx)] ∧ (∀x)(Dx ⊃ Hx)} → [p → (∀x)(Cx ⊃ Hx)].”

This formalization shows the combination between the hypothetical logic and the categorical one.

Therefore, Avicenna’s logic combines the usual syllogistic and his own hypothetical syllogistic. However, the latter remains very close to the usual syllogistic, because even in this kind of logic he relies heavily on term variables and does not express the elements of the conditional or disjunctive propositions by single variables. Instead, he represents them with expressions such as “H is Z” or “A is B” and the like. These expressions represent the propositions related by the propositional operators, but they contain only term variables; thus, the hypothetical propositions are not represented by propositional variables as they are in modern propositional logic. Nevertheless, when talking about the conditional or disjunctive propositions at a meta-level, he qualifies their elements by using the words “the antecedent” and “the consequent.” This shows that he treats them as wholes at that level, even though he does not use single variables within the conditional or disjunctive propositions themselves.

Avicenna’s hypothetical system is thus very closely related to his syllogistic system, and it would be hard to separate them sharply by considering the former as some kind of propositional logic and the latter as a predicate logic in the modern sense. In addition, the system is only partially formalized, which makes it difficult to determine with enough clarity and accuracy the validity of the hypothetical syllogisms and even the definitions of the logical constants used. However, the system cannot be fairly judged without entering into all the details of the syllogisms held valid, and this deserves a separate study whose aim would be to determine precisely the improvements provided by this Avicennian hypothetical logic.

11. References and Further Reading

a. Avicenna’s Logical Treatises and Other Primary Sources

  • Al-Farabi, Abu Nasr. “Kitab al-Qiyas.” al-Mantiq ‘inda al-Farabi. Ed. Rafik Al Ajam. Vol. 2. Beirut: Dar el Machrik, 1986a, 11-64.
  • Al-Farabi, Abu Nasr. “Kitab al-Maqulat.” al-Mantiq ‘inda al-Farabi. Ed. Rafik Al Ajam. Vol. 1. Beirut: Dar el Machrik, 1986b, 89–132.
  • Al Farabi, Abu Nasr. Al-Mantiqiyat li-al-Farabi. Vol. 1. Texts published by Mohamed Teki Dench Proh. Edition Qom. 1988.
  • Aristotle. Prior Analytics. The Complete Works of Aristotle. Revised Oxford Edition. Ed. Jonathan Barnes. Vol. 1. 1991.
  • Avicenna. Mantiq al Mashriqiyyin. Ed. M. al Khatib and A. al Qatlane. Cairo: al maktaba al-salafiyya, 1910, 2-83.
  • Avicenna. An-Najat. Ed. Sabr el Kordi, Mohieddine. 2nd ed. Cairo: Library Mustapha al Babi al Hilbi, 1938.
  • Avicenna. Al- Shifa’, al-Mantiq 2: al-Maqulat. Ed. G. Anawati, M. El Khodeiri, A.F. El-Ehwani, S. Zayed. Cairo: Wizarat al thaqafa wa-l-Irsad al-Qawmi, 1959, 3-273.
  • Avicenna. Al-Shifa’, al-Mantiq 4: al-Qiyas. Ed. S. Zayed. Cairo: Wizarat al thaqafa wa-l-Irsad al-Qawmi, 1964, 3-580.
  • Avicenna. Al-Shifa’, al-Mantiq 3: al-‘Ibarah. Ed. M. El Khodeiri. Cairo: Dar al Kitab al arabi lil tab’ wa-Nashr, 1970, 1-131.
  • Avicenna. Al-Isharat wa l-tanbihat, with the commentary of N. Tusi, intr by Dr Seliman Donya, Part 1, 3rd ed. Dar el Maʻarif: Cairo, 1971.

b. Secondary Sources

  • Arnaldez, R. “Avicenne.” Dictionnaire des philosophes. PUF. 2nd ed. Vol 1(A-J). 1993, 191-199.
  • Ahmed, Asad Q. Avicenna’s Deliverance: Logic. Oxford University Press: Oxford, 2011.
  • Bäck, A. “Avicenna’s Conception of the Modalities.” Vivarium XXX, 2. 1992.
  • Bobzien, S. “Ancient Logic.” Stanford Encyclopedia of Philosophy. Ed. Edward N. Zalta. 2006.
  • Chatti, S. “Syncategoremata in Arabic Logic, Avicenna and Averroes.” History and Philosophy of Logic, 35, 2, 2014a, 167-197.
  • Chatti, S. “Avicenna on Possibility and Necessity.” History and Philosophy of Logic. 2014b, 1-22.
  • Garson, J. “Modal Logic.” Stanford Encyclopedia of Philosophy. Ed. Edward N. Zalta. 2009. http://plato.stanford.edu/entries/logic-modal/
  • Hodges, W. “The Move from One to Two Quantifiers.” The Road to Universal Logic: Festschrift for 50th Birthday of Jean-Yves Beziau. Vol. 1. Ed. Arnold Koslow and Arthur Buchsbaum. Birkhäuser: Basel, 2015, 221-240.
  • Lagerlund, H. “Avicenna and Tusi on Modal Logic.” History and Philosophy of Logic, 30:3, 2009, 227-239.
  • Lukasiewicz, J. “Contribution à l’histoire de la logique des propositions,” tr. J. Largeault. Logique Mathématique. Armand Colin: Paris, 1972.
  • Movahed, Z. “A Critical Examination of Ibn Sina’s Theory of the Conditional Syllogism.” Sophia Perennis, Vol 1.1. Available at: www.ensani.ir/storage/Files/20120507101758-9055-5.pdf
  • Rescher, N. Studies in the History of Arabic Logic, tr. M. Mahrane. Cairo: 1963.
  • Sabra, A. I. “Avicenna on the Subject Matter of Logic,” The Journal of Philosophy, 77, 1980, 746-764.
  • Shehaby, N. The Propositional Logic of Avicenna, tr. Kluwer, D. Reidel, Dordrecht: 1973.
  • Street, T. “An Outline of Avicenna’s Syllogistic.” Archiv für Geschichte der Philosophie, 84 (2), 2002, 129-160.
  • Street, T. “Arabic Logic.” Handbook of the History of Logic. Ed. Dov Gabbay, John Woods. Vol. 1. Elsevier BV: 2004.
  • Street, T. “Arabic and Islamic Philosophy of Language and Logic.” Stanford Encyclopedia of Philosophy (Fall 2008 Edition). Ed. Edward N. Zalta, 2008. Available at: http://plato.stanford.edu/entries/arabic-islamic-language/.
  • Thom, P. “Logic and Metaphysics in Avicenna’s Modal Syllogistic.” The Unity of Science in the Arabic Tradition: Science, Logic, Epistemology and their Interactions. Ed. S. Rahman, T. Street, H. Tahiri. Dordrecht: 2008.

 

Author Information

Saloua Chatti
Email: salouachatti@yahoo.fr
Tunis University
Tunisia

Richard M. Gale (1932—2015)

Richard Gale was an American philosopher known for defending the A-theory of time against the B-theory. The A-theory implies, for example, that tensed predicates are not reducible to tenseless predicates. Gale also argued against the claim that negative truths are reducible to positive ones. He created a new modal version of the cosmological argument for God’s existence, which he later refined with Alexander Pruss. The argument generated considerable interest in the philosophical community. He produced some interesting and sometimes controversial interpretations of both William James and John Dewey. In The Divided Self of William James, he argued that James is a Promethean pragmatist who attempts to ground all truth in ethics. In John Dewey’s Quest for Unity, he represented Dewey as a Promethean pragmatist who holds that human beings are creators of meaning. Gale argued that Dewey attempts to combine this with his own type of mysticism, while never achieving a successful synthesis.

Gale worked at New York University, Hunter College, and Vassar before joining the University of Pittsburgh in 1964 where he remained until retiring in 2003. He was the editor of three books, the author of eight, and he published over one hundred philosophical articles, critical studies, and book reviews.

Table of Contents

  1. Biography
  2. The Philosophy of Time
  3. Negation and Non-Being
  4. The Philosophy of Religion
    1. The Nature and Existence of God
    2. The Gale-Pruss Cosmological Argument
  5. Pragmatism
  6. References and Further Reading
    1. Primary Sources
      1. Books Edited
      2. Books Authored
      3. Articles
      4. Book Reviews
      5. Critical Studies
    2. Secondary Sources

1. Biography

Richard Gale was born on July 13, 1932 and raised on the upper West Side of Manhattan where he attended public schools. Gale describes an uneventful childhood in which he was an average student. However, his older sister Zelda was a brilliant student—a fact of which his teachers never tired of reminding him, and which, he opines, may have initiated the distrust of school teachings that lasted throughout his life. He recalls a strange sense of detachment during that period, feeling as if he were a visiting cultural anthropologist from Arcturus observing the habits of strange earthlings. This, Gale remarks, prepared him for the detached viewpoint of the philosopher, “the spectator of all time and eternity, bent on trying to understand a world that was not worth taking seriously.” Gale remarks that outward observers would not have felt he was so “out of it” because he dutifully followed all the rituals of “the tribe”, reciting the Pledge of Allegiance, singing the national anthem, and attending Sunday school at the Park Avenue Synagogue which, he wryly notes, “wasn’t on Park avenue”. Although Gale did not have any personal interest in religious questions after his teenage years, one positive experience in his youth was conversing with Rabbi Milton Steinberg, “a very great man”. His main interest at this time was athletics. Gale describes himself as a great player in practice who “stank” in the games because he could not relax. He liked philosophy because it did not require one to perform well in the crucial moment. One can think and rethink it in private until one gets it right. It is fortunate, Gale remarks, that he did not become a brain surgeon.

Gale’s other interest in his youth was music, especially jazz. His father owned the Savoy Ballroom and managed Chick Webb, Ella Fitzgerald, Cab Calloway, and the Ink Spots. He booked many leading black jazz musicians of the day, including Dizzy Gillespie, Charlie Parker, Lester Young, Art Tatum, Erroll Garner, and Sarah Vaughan, many of whom Gale knew as a teenager. He studied a version of the Schillinger system of musical composition and arranging with Edgar Sampson and organized a six-piece combo in high school in which he played piano and did the arrangements. Gale remarks that “after almost every gig we had to change our name so we could get another job”. He began with “Richard Gale and His Manhattan Serenaders,” followed by “Richard Gale and His Rhumba Orchestra,” and, eventually, “Richard Gale and His Society Orchestra.” His “typical gig” was a bar mitzvah in Brooklyn where they were not paid but could eat as much as they wanted. “We were,” Gale admits, “a very oral bunch of guys.”

Gale did not want to go to college but gave in to his parents. Given his expectation that he would go into his father’s music business, he pursued a Bachelor of Music degree at Ohio Wesleyan University, where he soon discovered that he had “everything that it took to be a great composer except talent”. His strategy was to write twelve-tone musical compositions because he thought no one would expect to understand them and so would not criticize them. When one of his compositions was performed at the college, the best his friends could say was “Very interesting,” which, Gale inferred, meant they hated it.

Since most of Gale’s time was put into his music classes and the Air Force Reserve Officer Training Corps, he had little time for other subjects. Recognizing that his career as a great composer was not on the cards, and, having no idea what philosophy was, he took a philosophy course taught by William Quillian, Jr. Here, he finally found what he was hungering for, something one does for its own sake, but he “did not know what it was”. Gale became an outstanding student in philosophy classes though he still laboured to get B’s in his other courses. Several of his junior year essays received prizes, enabling him to “step out of my sister’s shadow and … believe in myself”. He credits Lloyd Easton, who specialized in American Hegelianism, as his best teacher. A guest lecture on Dewey by George Raymond Geiger, whose “brilliance and humour blew me away,” inspired him to write his senior thesis on Kant’s and Dewey’s aesthetic theories under Easton and visiting professor G.P. Conger. He found Conger, best known for his Theories of Macrocosms and Microcosms in the History of Philosophy, a “delightful and fascinating man.” Years later Gale described his senior thesis as “the work of a mad dog Deweyite that made Kant look like he should have had his tenure revoked.”

Upon graduation in 1954 Richard served two years as an intelligence officer in Japan, which he “loved”. After leaving the Air Force, he joined his father’s music business at BMI as a “song plugger”. His job was to “romance” disc jockeys to “get them to play our songs.” The company had many big hits at that time including Elvis Presley’s “Don’t Be Cruel” and “All Shook Up.” Richard soon discovered that he “hated” the business because to be successful one had to manipulate people into doing one favours and he feared that he was “turning into the sort of person who is successful in this business”.

During this time he took a two-semester night course at NYU on existentialism with William Barrett, where reading Kierkegaard’s Fear and Trembling became “the most important event in my life.” Fear and Trembling was “the moral goose I needed to give me the strength and courage to leave the music business” and “do what I really loved.” He was fascinated by Barrett’s portrait of Kierkegaard’s “knight of infinite faith who can achieve authenticity only through making” a decision that appears “absurd to the world.” The music business “was my Regina!” His parents were not amused. Gale left the music business and began graduate studies at NYU in 1957. During this period he met his wife Maya Mori, who had just graduated with a degree in fine arts from Nihon University in Tokyo and was studying interior design in the United States. Two and a half years later they married.

Although perhaps not the usual experience, Gale found graduate school to be “paradise”: “Every day I had to pinch myself to believe that one could really make a living doing philosophy …” Gale published several of his term papers in respected professional journals but admits that he had a big advantage over his peers. Since normal graduate students had not suffered the horror of the music business, they were at a motivational disadvantage. Gale later described his early publications as proof of the danger of rushing into premature publication, remarking that Walter Stace, for whom he had written one of them, told him that his courage exceeded his ability.

Gale chose NYU because he was “an ardent Deweyite” and wanted to study with Sidney Hook, “the high priest of pragmatism.” His master’s thesis is titled Dewey and the Paradox of the Alleged Futurity of Yesterday. However, he wrote a paper for visiting professor Anthony Flew on McTaggart’s views on time and made a “decisive turn to ordinary language philosophy.” This began his passion for the problem of time that never left him.

He wrote term papers on Aristotle’s theory of time, Kant’s theory of time, Husserl’s theory of time, and so forth, one after another in every subsequent course. He wrote his Ph.D. thesis, The Concept of Time in 20th Century Analytic Philosophy, under Milton Munitz, “a saintly man whose egoless love of philosophy was inspiring.”

Gale entered the job market in January of 1961 and discovered that he had a peculiar knack for “blowing” job interviews because his aggressive quality “scared people.” In a panic, he changed strategy and took his wife Maya along to an interview at Vassar. He was offered the job. According to Gale, the better part of his success in life can be credited to Maya. His major accomplishment was “marrying a woman who was a much better person than I.” After he met her at a party in New York City in 1957, while he was still in the music business, she became “my moral and spiritual vitamin pills, the centre of gravity for my life.” In the acknowledgments in The Divided Self of William James he profusely thanks Maya for her strength, courage, loyalty, spirituality, and sweetness, and adds “in the most heartfelt words I ever wrote [that it] is no accident that animals and birds are attracted to her. They know something.” With his relationship with his wife on a firm footing, it was no surprise when children began appearing in quick succession. Andrew was born in 1961, Laurence in 1963, and Julia in 1965.

Soon after becoming an instructor at Vassar, Gale wanted to relocate. Although the students were good and his colleagues “pleasant enough” he needed to get to a top department where his colleagues and graduate students “would beat on me and from whom I could learn.” “I had to publish my way out of Vassar” because Vassar was filled with great teachers “who thought that if you published it proved you didn’t love your students.”

Eventually Gale received offers from Santa Barbara and Pittsburgh. His reaction to Pittsburgh was: “love at first sight.” Pittsburgh was “the space-time capital of the world.” In addition to praising the distinguished Pitt faculty he describes how the “hot shots” among the graduate students “put the fear of God in me”—high praise indeed since God himself had failed to do so. Pitt’s philosophy “hot shots” engendered “panic attacks” on his way to his graduate classes. He audited Sellars’ courses and describes Sellars as “the master tease”. A typical Sellars undergraduate lecture would consist of about thirty minutes of brilliant discussions of the history of ideas, followed by about twenty minutes of criticisms of someone’s views, but when it came time to deliver the goods the “bell would ring” and Sellars “would issue a promissory note that never got cashed.” Gale remarks that Sellars, “like the Shadow”, had “the ability to cloud men’s minds and make them see what he wanted them to see”—reminiscent of the criticisms Nietzsche levelled at Socrates. Gale later audited classes by Hempel, Salmon, Glymour, Earman, and numerous international visitors at Pitt’s world-famous Centre for the Philosophy of Science. Gale opines that he received “the equivalent of a second Ph.D. in philosophy” at Pitt.

Although Gale had not had any personal interest in religion since his youth, he became interested in the philosophy of religion through his teaching assignments. For he loved to argue, and “no area is more loaded with foxy arguments than the philosophy of religion.” Gale made it his vocation to be the “fly in the ointment” of the analytically-oriented contemporary defenders of theism, Plantinga, Alston, Adams, and Swinburne. It therefore came as a surprise to him when he came up with a new version of the cosmological argument for God’s existence, which he believed works, “sort of”.

Gale called his friend Adolf, “an atheist’s atheist,” and laid out his new argument. “What’s the catch?” Adolf asked, and Gale answered “There isn’t any: It works.” After a long silence Adolf asked, “Why did you do it, Richard?” Even though Gale passionately defended his new cosmological argument, he remained a non-believer: “Philosophy has never entered into his personal life.” When asked if he is an atheist, agnostic, or theist, he replied, “None of the above.” “If one needs to ask what is the meaning and purpose of life … one has lost one’s way”. “The examined life isn’t worth living”. “The value of doing philosophy professionally is that it enables one to live unphilosophically.”

Gale’s reservations about the value of philosophy were not simple anti-intellectualism. In response to those who feel that being a good person requires that one engage in philosophical reflections on what sorts of reasons should guide one’s life, Gale replied that he is “incapable of that kind of self-reflection.” The one thing that “always shall be and always should be a mystery to us is ourselves”—once again, reminiscent of remarks by Nietzsche. Gale admitted one exception to keeping philosophy out of his personal life. He reported telling his friend Alexander Pruss that his defence of his cosmological argument is the first time he became personally involved in a philosophical argument, to which Pruss replied that this is because it is his only argument with a true conclusion.

Richard Gale was a great teacher and a warm and caring person, helpful to students and colleagues, always ready to talk philosophy, and possessed of a wicked sense of humour that brought much levity into the world. After a remarkably varied and satisfying life, Gale passed away in his sleep on July 19, 2015 in Knoxville, Tennessee.

2. The Philosophy of Time

Richard Gale published his widely used anthology, The Philosophy of Time, with Anchor in 1967. He recalls requesting that a picture of “the river of time be put on the cover” with an “observer on the bridge shining [a] spotlight of presentness onto the water, illuminating one of the events in [their] history floating on the river’s surface” and was “floored” by the “brilliantly astute” question from Anchor’s art department: “What age should the observer be?” This, Gale remarks, discloses the way in which the observer on the bridge has to be “a transcendent spook, similar to Vonnegut’s [timeless] Tralfamadorians.” Gale asked whether Anchor could put a Picasso-like multi-dimensional depiction of the observer on the cover to capture the observer at every age in their life. Anchor declined but put a picture of a pocket watch reflected in several mirrors on the cover, illustrating the difficulty in obtaining a non-paradoxical portrait of the observer’s perspective on their own temporal history.

The distinction between the A and B theories of time was first made by McTaggart in 1908. In 1968 Gale published The Language of Time, in which he gives an “impassioned defence” of the A-theory of time. The competing B-theory is the view that time is nothing but a temporal series of events running from earlier to later, where the distinctions between past, present and future are reducible to temporal relations between the events that are in time. Gale’s A-theory holds that tense-distinctions are irreducible. B-theorists typically reply that if tensed distinctions are irreducible it is because they are subjective. It is, however, worth noting that later B-theorists, such as Mellor and Smart, admit that the translation of A-sentences into B-sentences often does not succeed, but hold that one can explain the truth conditions of any tensed declarative sentence without appeal to tensed facts, and then use Occam’s razor to get rid of the offending tensed sentences. See “Are There Essentially Tensed Facts?”.

B-Theorists often argue for the subject-dependency of the “now” or the present, and thus of the past and future, by an analogy between the “now” and the “here”. B-theorists argue that since everyone admits that there is not an objective “here” (everywhere can be “here” to someone), and since the logic of “now” is analogous to the logic of “here”, the “now” is as subjective as the “here”. Gale replies that there are deep disanalogies between the “now” and the “here” that suggest that the “now” is objective in a way that the “here” is not because our temporal perspectives are “imposed on us” in a way that our spatial perspectives are not. S, currently in Pittsburgh, calls Pittsburgh “here,” but could easily hop a plane to Singapore, whereupon Singapore would be “here,” and then, hop another plane back to Pittsburgh whereupon Pittsburgh would again be “here.” By contrast, S, in 2016, cannot hop into a device that transports S to 100 B.C. and then return via the device, to 2016 again.

Gale (1964c) argues that agential-based asymmetries between the past and the future are rooted in our concept of causation for which there are no spatial analogies. We can bring about events in the future but not the past. S, currently in Pittsburgh, can, theoretically, causally impact events at any point in space, for example, in Singapore, but cannot causally impact events that happened in 100 B.C. Space and time, from the perspective of the causal-agent, are just different.

Gale stresses that he is not arguing for the objective reality of a “queer” entity, the present or the “now”—which is disclosed by that transcendent spotlight of presentness on the River of Time. That “queer” view of temporal becoming leads to a contradiction. Rather, “The present” and the “now” are rigid designators: The proposition expressed by “Now might not be now” is necessarily false. However, if some time passes, and the present shifts to a later time, “then now will not be now – this very moment of time – at some later time, in violation of the necessity of identity between individuals designated by rigid designators.” As Gale puts it, “entiative theories about the present”, theories that see time as some queer sort of entity, confuse “the what with the how of reference.” There is nothing mysterious about what is denoted by a use of “now”. If Sue tells her partner “now’s the time,” no one need consult a philosopher about what is meant. What is mysterious is the how in such references because “we are prisoners of time [with] a unique temporal perspective … imposed upon us.”

Finally, Gale discusses the disagreement between the A- and B-theorists regarding whether there is “a bifurcation between man and nature.” Smart and Grünbaum argue that science abstracts from personal indexical expressions because they are subjective and not needed for a complete objective description of the world. It is not important to science that someone is now rolling balls down inclined planes with certain results. It is only important that at some time t1 certain balls were rolled down an inclined plane with certain results. Gale calls this the “error theory of tensed temporal perspectives”. Presupposing his views about the objective disanalogies between spatial and temporal indexical expressions, Gale rejects this as a “scientistic” bias that needs only to be stated clearly to be refuted.

3. Negation and Non-Being

Gale believes that his critique of B-theories of time prepared him for his next major interest in negation and non-being. Just as the early B-Theorists of time attempt to translate all A-propositions into B-propositions, some philosophers aspire to translate all negative statements into positive ones (to eliminate irreducibly negative truths about the world). Gale attempts to escape this reduction by devising adequate criteria for the legitimate use of negative propositions. Gale (1970a) argues that otherness and incompatibility analyses fail to reduce negative to positive propositions. One cannot analyse “This is not green” into the conjunction of “Every positive property of this is other than greenness” and “There is some positive property of this that is incompatible with greenness.”
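A schematic way of displaying the target analysis (a reconstruction added for illustration; the symbolism is not Gale’s own), where “a” names the thing in question, “G” stands for greenness, and “P” ranges over a’s positive properties, is:

 ~Ga analysed as: (∀P)(Pa ⊃ P ≠ G) ∧ (∃P)(Pa ∧ P is incompatible with G)

Gale’s contention is that no such conjunction of purely positive attributions captures the negative proposition “This is not green.”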

Although Gale believes that the attempt to reduce negative propositions to positive incompatibility propositions fails, he does credit it with helping to demystify negative facts and events. The problem with negative facts is that we do not possess adequate identity-conditions for them. There is no non-arbitrary criterion for answering questions like “How many forest fires did not occur yesterday?” Even though Gale rejects the attempt to reduce negative to positive propositions, he admits (1972) that incompatibility and otherness analyses do give useful extensional criteria of identity for negative events which satisfy Parmenides’ injunction against referring to that which is not (because these sorts of analyses only quantify over positive properties and existent individuals). If asked how many forest fires did not occur yesterday one can reply, “One did not occur in the Allegheny National Forest, another did not occur in the Mendocino National Forest, another did not occur …, and so on” (where the “and so on” is not problematic because it only alludes to existing forests).

4. The Philosophy of Religion

a. The Nature and Existence of God

Although it may seem that analytical philosophy is inherently antithetical to religious belief (1991a, 2), a new breed of analytically-oriented philosophers, notably Plantinga, Alston, Adams, and Swinburne, has emerged to defend the rationality of theism—the view that an omnipotent, omniscient, and omnibenevolent creator of the world exists. Gale’s On the Nature and Existence of God is the fruit of his aim to be the “fly in the ointment” of these analytical theists. Gale’s project is positive in that it aims to improve the theistic concept of God. It is negative in that it argues that the grounds for theism are often shaky (1991a, 2-3). The spirit of Hume’s Philo imbues the book (1991a, 2). Gale does not pretend to answer the question of God’s existence because he does not consider arguments from beauty or design (1991a, 1). The first part of the book, “Atheological Arguments,” considers arguments that attempt to show that there is a logical inconsistency in the theist’s concept of God (1991a, 15). “Blasphemy aside,” this part of the book helps one “redesign” the concept of God (1991a, 3-4). The second part, “Theological Arguments,” critically examines the main traditional arguments for theism. Gale divides these into the epistemological arguments (the ontological argument, the cosmological argument, and the argument from religious experience) and “prudential arguments,” which purport to establish the prudential or moral benefits of believing in the theist’s God. Gale is primarily concerned with the former and only deals with prudential arguments in Chap. 9 (1991a, 3).

Since the sort of “historical-cum-indexical” account often given for fixing the reference of ordinary proper names does not work for supernatural beings, and since different theists have widely divergent views about their God, Gale, in the introduction, attempts to show how “we” and Abraham can refer to the same God (1991a, 4-10). Gale’s positive account, which does not imply that God exists, distinguishes between “hard” and “soft core” features of God (1991a, 7). A hard core feature, which is important in determining reference, is one without which God could not be God (1991a, 10). Soft core features can be abandoned without change of reference. Absolute simplicity is given as an example of a soft core feature and being supremely great as a hard core feature (1991a, 8). The hard core features are high level “emergent” properties, for example, that God is eminently worthy of worship is an “emergent” property that supervenes on God’s lower-level properties of omnipotence, omniscience, and so forth. Gale does not provide an account of this kind of emergence, claiming that the emergence-connection is “loose” (1991a, 8). Gale’s positive account of how “we” fix the reference of “God” involves both an historical-causal theory and the requirement that the co-referring speakers share the same form of religious life (1991a, 9-10). What counts as the same religious community over time is “a deep issue” that Gale does not pretend to solve (1991a, 10-11).

Chap. 3 deals with “The Omniscience-Immutability Argument.” That is, if God is omniscient He must know everything. That means that He must know temporal facts in the A-series (involving irreducible indexical expressions like “It is now 3 PM”). However, if God is not in time, He cannot know A-series facts because He has no “now” to know (1991a, 58). But if He is in time then He is not immutable. For if He knows what is now true then what is true of Him differs from moment to moment. Gale considers whether one might escape this dilemma by restricting God’s omniscience to B-series facts (A-series facts being beneath His timeless knowledge) (1991a, 60). However, it is religiously important for God to know A-series facts (1991a, 72-73), for example, for God to know that Abraham is now preparing to sacrifice Isaac. Gale concludes that one must eliminate timelessness and immutability from the concept of God (1991a, 95-97).

Chap. 4, the longest chapter in the book, discusses the deductive problem of evil. Gale argues that Mackie’s arguments either fail or depend on disputable premises. He reformulates this as the problem of what “morally exonerating excuses” God might have to allow evil in the world (1991a, 32, 105). If no morally exonerating reasons are found then one concludes on inductive grounds that theism is incompatible with the existence of evil. If, however, theists can show that God has morally exonerating reasons for allowing evil then theism is consistent with the existence of evil. Plantinga’s version of the “free will defence,” the view that the existence of free will outweighs the evil produced by free will, “is a thing of beauty.” (1991a, 113) Plantinga’s key claim is that it is logically possible that a maximally perfect God is not able to create a world in which there are persons that are free to do good or evil but that always do the good. Since a world with free persons who do more good than evil is better than any world without free persons it seems reasonable to infer that it is logically possible that God has a morally exonerating excuse for permitting evil by creating such a world. Gale rejects most of the common objections to Plantinga’s free-will defence. However, Gale argues that Plantinga’s free-will defence eliminates human freedom and makes God not only responsible for people’s evil deeds but morally blameworthy for them.

First, Gale argues that Plantinga’s free will defence makes God the cause (in the sense of moral responsibility and blame) of the actions of those created free persons. Since God knows that S will freely do evil then, in creating S, God is both responsible and blameworthy for the evil that S does (1991a, 153-156). Second, Gale argues that since Plantinga’s God has “middle knowledge” of His created human beings, that is, knowledge of what possible things would do in different circumstances (1991a, 131), they are not really acting freely (1991a, 133). God does not merely know that possible person J is constructed in a certain way. God knows that if J is put in a certain circumstance C then J will behave in a certain way W. Gale then urges an analogy between God’s creating J while possessing “middle knowledge” of what J will do in C and cases where someone programs J’s psychological makeup so that J will act in a certain way in C. If one induces J, who has an amorous nature, to call Alice for a date by telling J that Alice is interested in him, one does not cancel J’s freedom—one does not drug, hypnotize, or brainwash J (1991a, 157). However, the nature of God’s control over his created beings is freedom-cancelling (1991a, 122, 153ff). Gale does not claim this analogical argument is conclusive but only that it casts reasonable doubt on Plantinga’s free-will defence (1991a, 158-160).

In the second part of the book Gale considers several of the traditional arguments for the rationality of belief. Chap. 6 considers the ontological argument, which comes in two forms, Anselm’s original form and the updated modal version. Since Gale agrees with Plantinga’s critique of Anselm’s argument (2003a), most of his discussion is concerned with Plantinga’s modal argument. Plantinga’s argument requires the premise that there is some possible world that contains a being that necessarily exists and is maximally perfect. Following modal theorem S5 (if it is possible that it is necessary that p, then it is necessary that p), Plantinga infers that God necessarily exists. Gale replies that many atheists would not agree that there is some possible world in which a necessarily existing maximally perfect being exists (1991a, 226). Indeed, since the possibility-premise, as it is understood in S5, where possibly necessary means necessary, already amounts to the conclusion, asking the atheist to accept it begs the question.
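Schematically, writing “◇” for “possibly”, “□” for “necessarily”, and “M” for “a maximally perfect being exists” (a simplified propositional rendering added for illustration; the symbols are not Plantinga’s or Gale’s own), the modal argument runs:

 (1) ◇□M (the possibility premise)

 (2) ◇□M ⊃ □M (an instance of S5)

 (3) Therefore □M, and hence M is true in the actual world.

Gale’s complaint targets step (1): under the S5 reading that licenses (2), granting the possibility premise is already tantamount to granting the conclusion.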

Chap. 8 considers the argument from religious experience. Gale questions whether powerful religious experiences are cognitive and whether they provide evidence for the existence of God (1991a, 286). The argument that religious experiences are cognitive is usually premised on an alleged analogy to sensory experiences (1991a, 316). Gale argues that arguments by Alston, Gutting, Swinburne, and Wainwright fail to show that religious experiences are relevantly analogous to sensory experiences (1991a, 326). Since God does not occupy space and time, there can be no veridical (non-sensory) experience of God, and if an experience is non-veridical it cannot constitute evidence for the existence of its “object.”

The most famous of the “prudential” arguments is Pascal’s wager. First, “Pascal’s wager” is not really a wager (1991a, 353). If God exists and one bets wrong, one does not lose one’s earthly life. Gale also raises the “many Gods” objection against Pascal. That is, formulations of the “wager” presuppose that either there is a God who rewards believers with infinite bliss or there is no God at all. However, many kinds of God are possible. Imagine a God who rewards people who step on every third sidewalk crack and condemns those who do not to infinite punishment (1991a, 350). The logic of the “wager” gives one as much reason to believe in this bizarre God as it does to believe in the theist’s God.

Gale argues against James’ “Will to Believe” argument, claiming that James fails to show that one can have sufficient moral reasons for “self-inducing an epistemically unsupported belief” (1991a, 283). Rather, to believe propositions unsupported by evidence violates one’s duty as a rational person and undermines one’s own personhood (1991a, 372, 376, 382-383).

b. The Gale-Pruss Cosmological Argument

Gale (1999c) published a new modal version of the cosmological argument. Gale and his former student, Alexander Pruss, jointly published a simplified and strengthened modal version of the argument (hereafter GP) titled “A New Cosmological Argument.” GP does not purport to prove the existence of Anselm’s God but does purport to prove that “there [necessarily] exists in the actual world a very powerful and intelligent supernatural designer-creator of this world’s universe.” Call this being, similar but not identical to Anselm’s God, “G”. GP also holds that G is a self-explaining being in the sense that there is a successful ontological argument for its existence, even if one is not able to provide it.

GP presupposes the notion of a possible world. GP also employs the notions of contingent and necessary propositions. Contingent propositions are both possibly true and possibly false (true in some possible worlds and false in others). Necessarily true propositions are true in all possible worlds. GP holds that a possible world is “a maximal, compossible conjunction of abstract propositions”. It is maximal in the sense that for every proposition q, either q or not-q is a conjunct in this conjunction. Every possible world is compossible in the sense that it is conceptually or logically possible that all of the conjuncts in the conjunction that specifies that world can be true together. There are no impossible (contradictory) worlds. There are also possible worlds that have different laws of nature than those in the actual world. Further, GP treats necessary truths as facts. It is a “fact” in the actual world that it is not both raining and not raining (at the same time in the same place).

GP admits that it would be unfair to assume the strong Principle of Sufficient Reason (PSRs) that there is a sufficient explanation for any fact in the world because that is close to assuming that there is a sufficient first cause for these facts, which most opponents of the cosmological argument would reject. GP only asserts a weak version of PSR (PSRw), that it is possible that such facts have an explanation. Opponents of the cosmological argument cannot reasonably deny that a sufficient explanation for such facts is possible. Finally, GP assumes the modal axiom S5 that if it is possible that it is necessary that p then it is necessary that p. For example, since it is possible that 2 + 2 = 4 is necessarily true then, by S5, it is necessarily true that 2 + 2 = 4.

The simplified core of GP is this: (1) If it is possible that a necessary supernatural creator of the actual world, G, exists, then it is necessary that a supernatural creator of the world, G, exists. (2) It is possible that a supernatural creator of the world, G, exists. Therefore, (3) it is necessary that a supernatural creator of the world, G, exists.

The first premise is an instantiation of S5. If one accepts S5, the second premise is the key, and most of GP is concerned with establishing it. GP defines the Big Conjunctive Fact (BCF) of a possible world as the maximal compossible conjunction of all propositions that would be true of that world if it were actualized. The BCF of the actual world consists in all the propositions (both necessary and contingent) that are actually true.

Since all possible worlds share the same set of necessary propositions, the distinction between the different possible worlds resides in the different contingent propositions that would be true in those worlds if they were actualized. A contingent proposition (or being) is one that possibly, in a broadly conceptual or logical sense, is true (or existent) and possibly is false (or nonexistent). The Big Conjunctive Contingent Fact (BCCF) of a possible world is the conjunction of the contingent propositions that would be true in that world, were it actualized. Since all possible worlds share the same set of necessary propositions, the different possible worlds are individuated by their different BCCFs. Since every possible world is maximal, for every contingent proposition p, either p or not-p is a conjunct in its BCCF. Thus, no two possible worlds can have the same BCCF.
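A toy illustration (added here; GP itself of course operates with complete worlds, not toy languages) may make the individuation point vivid. Suppose a language contains only two contingent atomic propositions, r and s. Then the BCCF of any possible world describable in that language is exactly one of the four maximal conjunctions:

 r ∧ s, r ∧ ~s, ~r ∧ s, ~r ∧ ~s

Maximality guarantees that each world settles r and s one way or the other, and since two worlds that agreed on all such conjuncts (plus the shared necessary truths) would be the same world, no two possible worlds can have the same BCCF.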

GP calls the BCCF of the actual world “p”. Thus, p includes the existence and non-existence of all contingent beings and the occurrence and non-occurrence of all contingent events in the actual world, but it also includes all the contingent acts of any necessary beings that may exist in the actual world. GP argues that p has a sufficient explanation in the actual world. Although many philosophers will deny that p does or must have a sufficient explanation, they cannot deny that it is possible that there is some possible world w1 that contains p but also contains q and the proposition that q explains p. But this hypothetical possible world w1 turns out to be identical with w, the actual world. Suppose there were some difference between w and w1. Then, since possible worlds are maximal, there must be some proposition r true in w1 but not true in w. But if r is true in w1 and not in w, then, by the law of bivalence, ~r is true in w. Since, however, by hypothesis, everything actually true in w is also true in w1 (because w1 contains p), ~r is true in w1 as well. But this means that r and ~r are both true in w1, which is a contradiction, and that cannot be, because w1 is supposed to be a possible world. Since the assumption that w1 is different from w implies a contradiction, it follows that w1 is the same as w. Thus, the actual world w contains q and the proposition that q explains p. Thus, the actual world w contains a sufficient explanation of its BCCF, p.

What is the nature of this q? GP claims it to be a conceptual truth that the only sorts of explanation we can conceive of are scientific and personal explanations. Call this claim SPE. A scientific explanation explains why some proposition is true by reference to some conjunction of law-like propositions and at least one contingent proposition that reports a state of affairs at some time. A personal explanation elucidates why some proposition is true by reference to the intentional action of an agent. GP admits that there might be types of explanation that are beyond our ken, but, “in philosophy we ultimately must go with what we can make intelligible to ourselves ….” One is left with the choice between scientific and personal explanations. But q cannot be a scientific explanation. Since scientific propositions are contingent, this would mean that q is part of w’s BCCF and that would mean that the explanation of w’s BCCF is a part of that BCCF. Since “law-like propositions cannot explain themselves,” q can only, by disjunctive syllogism, be a personal explanation that reports the intentional actions of some being.

However, q cannot report the actions of a contingent being. If it were contingent, the proposition that states its existence would be part of the BCCF of w and, therefore, “q itself is not able to explain why the contingent being it refers to exists, since a contingent being’s intentional action” presupposes “and hence cannot explain, that being’s existence.” Thus, q refers to the intentional actions of a necessary being.

One might assume that q is a necessary proposition because it reports the action of a necessary being. But appearances are deceiving. GP argues that “q is a contingent proposition that reports the free intentional action of a necessary being.” GP’s most favored argument for the claim that q is a contingent proposition is a reductio based on the assumption that q is necessarily true. If q is necessarily true, then q is a conjunct in the BCF of all possible worlds. Since, however, q entails p (the BCCF of w), and since a possible world is individuated by its BCCF, it follows that every possible world is identical with w, which means that there is only one possible world. Since, despite protests from Spinoza and Leibniz, this is absurd, q is a contingent proposition.

Since G is a necessary being, it satisfies one key component of the traditional notion of a theistic God. But some traditional theists, for example Leibniz, found it hard to account for the apparent contingency in the world. That is, even though Leibniz condescends to call certain propositions “contingent”, he is committed to holding that from God’s point of view (the only truly adequate one) all propositions are necessary (Russell, 1967, 60-61). Thus, Leibniz cannot use GP because there is, for Leibniz, really no such thing as a BCCF. The ingenuity of GP is that it finds genuine roles, which Leibniz could not accept, both for the contingent and for the necessary. The genuine contingency in the world rests on the free choices of a necessary G.

Oppy (2000) counters that GP is committed to PSRs. Since accepting PSRw commits the opponents of theism to accepting PSRs, GP is committed to the PSRs that it hoped to avoid. Gale and Pruss (2002) accept that PSRw entails PSRs but hold that though PSRs is entailed by their argument, it is not a premise in their argument. Thus, the only reason opponents of GP have for rejecting PSRw is that it leads to the theistic view that they reject.

Despite the fact that Gale has fiercely defended GP, he remains an “unbeliever”. So “why did you do it, Richard?” The answer is that Gale is still the “detached … spectator of all time and eternity”, following the logic wherever it leads, even to a proof of God’s existence, in which he is personally uninterested, but which, he believes, works—“well sort of”.

5. Pragmatism

In the early 1990s Gale found himself drawn back to his “old flames”, James and Dewey, and “fell madly in love with James again”. Gale’s book, The Divided Self of William James, attempts to formulate James’ ethics based on his “Promethean pragmatism”. James’ Promethean pragmatism consists in three propositions: 1.) We are always morally obligated to maximize desire satisfaction over desire dissatisfaction, 2.) Belief is an action, therefore, 3.) We are always morally obligated to believe in such a manner that maximizes desire satisfaction over desire dissatisfaction (1999a, 11, 25). James’ aim is to ground, not just beliefs, but also meaning, reference, and truth, in ethics. Working from his basic empiricism, James attempts to determine the ontological status of ethical terms by analysing one’s experiential reasons for predicating them, which leads him to reject the Platonic view that the good is determined by the Form of the Good prior to the existence of sentient beings. Gale approvingly quotes James’ remark that “neither moral relations nor the moral law can swing in vacuo” (1999a, 27). Gale argues that one of the reasons for the failure to appreciate James is that Dewey co-opted him for his own ends by naturalizing James’ views, thereby eliminating the mystical and spiritual dimensions that had motivated James (1999a, 335).

Gale’s book, John Dewey’s Quest for Unity, is an attempt to come to terms with his other early love. Gale argues that Dewey attempts to combine Prometheanism with his own unique type of mysticism, but never achieves a successful synthesis. Gale’s Dewey sees human beings as Promethean creators of meaning via action in nature, where artistic creation is the paradigm of creative synthesis (but something similar holds for the creation of knowledge and moral action). The problem, according to Gale, is that the remnants of the absolute Idealism from Dewey’s early Hegelianism extend into his mature period (2010a, 163). Though Dewey constantly appeals to experience, he has two notions of experience, one being the ordinary common sense notion of experience that results from the interaction between an organism and its environment, the other being a Plotinian-Hegelian “Absolute, the only true individual, with everything emanating out of it” (2010a, 163).

Before beginning the Dewey book, Gale feared that he would not be able to write a very positive book on Dewey, and many critics argue that Gale greatly misunderstands Dewey. Dewey’s problem, Gale holds, is that he tries to ground his grand normative vision in a “misbegotten metaphysics” (2010a, 16). Gale admits that his vision of a Hegelian Absolute experience “goes well beyond the letter of [Dewey’s] text” (2010a, 162). Gale replies that he is trying to free Dewey’s view, with which he obviously has some sympathy, from the remnants of an obscure Hegelian metaphysics that distorts Dewey’s potentially very valuable philosophy. Indeed, Gale states that if the greatness of a philosopher consists in the beneficial effects their writings have upon the reader, then “Dewey must be reckoned the greatest philosopher of all time” (2010a, 9).

6. References and Further Reading

a. Primary Sources

i. Books Edited

  • Gale, Richard, (ed.). The Philosophy of Time. New York: Anchor Doubleday Books, 1967
    • Gale contributed five ten-page introductions to the five sections of the book.
  • Gale, Richard, (ed.). The Blackwell Companion to Metaphysics. Oxford: Blackwell, 2002
  • Gale, Richard, with Pruss, Alexander, eds. The Existence of God. International Research Library of Philosophy. Dartmouth Publishing, 2003
    • This anthology of articles by numerous authors provides a useful companion to Gale’s On the Nature and Existence of God and contains a “mammoth” introduction by Gale and Pruss.

ii. Books Authored

  • Gale, Richard. The Language of Time. London: Routledge & Kegan Paul, 1968
    • Gale’s primary articulation and defence of the A-theory of time against the B-theory
  • Gale, Richard. Negation and Non-being, American Philosophical Quarterly Monograph No. 10. Oxford: Blackwell, 1976
    • Gale claimed that this book fell “stillborn from the presses” because it went un-reviewed but holds that it was at the time the only systematic historical and philosophical discussion of these issues.
  • Gale, Richard. On the Nature and Existence of God. London: Cambridge University Press, 1991a
    • Gale’s critical response to a variety of different analytical arguments, both for and against theism
  • Gale, Richard. The Divided Self of William James. London: Cambridge University Press, 1999a
    • Considered by many to be Gale’s best book, Gale describes James as a synthesis of ‘Promethean’ Pragmatism, which holds that language and concepts are a means for controlling nature, and a mystic who believes that ultimate reality is inaccessible to conceptualization.
  • Gale, Richard. The Philosophy of William James: an Introduction. London: Cambridge University Press, 2004
    • An accessible introduction to Gale’s view of James
  • Gale, Richard. On the Philosophy of Religion. Boston: Wadsworth, 2007.
    • A useful clearly written textbook on central issues in the philosophy of religion
  • Gale, Richard. John Dewey’s Quest for Unity: The Journey of a Promethean Mystic. Amherst: Prometheus Press, 2010
    • Argues that Dewey’s philosophy is valuable because it attempts, unsuccessfully, to synthesize the Promethean view that human beings are creators of meaning with Dewey’s own brand of mysticism
  • Gale, Richard. God and Metaphysics Amherst: Prometheus Press, 2010
    • A collection of Gale’s seminal articles in the various areas of philosophy, including God, time, non-being, and pragmatism

iii. Articles

  • Gale, Richard. “Russell’s Drill Sergeant, Bricklayer and Dewey’s Logic,” Journal of Philosophy 56 (1959): 401-406
  • Gale, Richard. “Natural Law and Human Rights,” Philosophy and Phenomenological Research 20 (1960a): 521-531
  • Gale, Richard. “Mysticism and Philosophy,” Journal of Philosophy 57 (1960b): 471-481
  • Gale, Richard. “Endorsing Predictions,” Philosophical Review 70 (1961a): 376-385
  • Gale, Richard. “Professor Ducasse on Determinism,” Philosophy and Phenomenological Research 22 (1961b): 92-96
  • Gale, Richard. “Tensed Statements,” Philosophical Quarterly 12 (1962a): 53-59
  • Gale, Richard. “Can a Prediction ‘Become True’?” Philosophical Studies 13 (1962b): 43-46
  • Gale, Richard. “Dewey and the Problem of the Alleged Futurity of Yesterday,” Philosophy and Phenomenological Research 22 (1962c): 501-511
  • Gale, Richard. “A Reply to Smart, Mayo, and Thalberg on ‘Tensed Statements’,” Philosophical Quarterly 13 (1963a): 351-356.
  • Gale, Richard. “Some Metaphysical Statements about Time,” Journal of Philosophy 60 (1963b): 225-237
    • Argues that the paradox in various metaphysical statements about the unreality of time is revelatory
  • Gale, Richard. “Is It Now Now?” Mind 73 (1964a): 97-105.
    • Attempts to exhibit the three necessary conditions for tensed communication. Argues that the Present and the ‘Now’ are rigid designators and therefore are not subjective
  • Gale, Richard. “A Reply on the ‘Alleged Futurity of Yesterday’,” Philosophy and Phenomenological Research 24 (1964b): 421-422
    • Discusses the question whether there is a paradox in the pragmatist view that the meaning of any statement is the sum-total of experiential consequences that can be found in our experiential future
  • Gale, Richard. “The Egocentric Particular and Token-Reflexive Analyses of Tense,” Philosophical Review 73 (1964c): 213-228
    • Argues against the B-theorist’s view that the seemingly irreducible features of tensed propositions are subjective
  • Gale, Richard. “On Believing What Isn’t the Case,” Proceedings of the XIIIth International Congress of Philosophy (1964d)
  • Gale, Richard, with Douglas McGee and Frank Tillman. 1964. “Ryle on Use, Usage, and Utility,” Philosophical Studies 15 (1964): 57-60
    • Argues that Ryle’s distinction between linguistic use and usage, in order to distinguish the practice of philosophers from that of grammarians and philologists, fails
  • Gale, Richard. “Falsifying Retrodictions,” Analysis 26 (1964e):6-9
    • Disposes of several counterexamples to the view that there is no way for us to act now to falsify retrodictions
  • Gale, Richard, and Irving Thalberg. “The Generality of Predictions,” Journal of Philosophy 62 (1965a): 195-210
    • Argues that there are certain crucial logical asymmetries between past and future that are deeply entrenched in the way we speak about the past and the future
  • Gale, Richard. “Why a Cause Cannot Be Later than Its Effect,” Review of Metaphysics 19 (1965b): 209-234.
  • Gale, Richard. “Existence, Tense, and Presupposition,” The Monist 50 (1966a): 98-108
    • Argues that “exists (existed, will exist)” is not a predicate of things and that “is present (past, future)” is not a predicate of events or states of affairs
  • Gale, Richard. “McTaggart’s Analysis of Time,” American Philosophical Quarterly (1966b): 145–152
    • Argues for Gale’s A-theory of time
  • Gale, Richard. “Pure and Impure Descriptions,” Australasian Journal of Philosophy (1967a) 45:32–43
  • Gale, Richard. “Propositions, Judgments, Sentences, and Statements,” Encyclopedia of Philosophy, vol. 6, Paul Edwards, (ed.) New York, Macmillan (1967b): 494-505
  • Gale, Richard. “Indexical Signs, Egocentric Particulars and Token-Reflexive Words,” Paul Edwards (ed.). Encyclopedia of Philosophy, vol. 4. New York: Macmillan, 1967c
  • Gale, Richard. “Hook’s Views on Metaphysics,” Sidney Hook and the Contemporary World, Paul Kurtz, (ed.). J. Day Co., 1968
  • Gale, Richard. “’Here’ and ‘Now’,” The Monist 53 (1969a): 396-409
    • Argues that there are certain crucial asymmetries between ‘here’ and ‘now’
  • Gale, Richard. “A Note on Personal Identity and Bodily Continuity,” Analysis 30 (1969b):193-195
  • Gale, Richard. “Do Performative Utterances Have any Constative Function?” Journal of Philosophy 67 (1969c): 117-121
  • Gale, Richard. “Negative Statements,” American Philosophical Quarterly 7 (1970a): 206-217
    • Discusses how to construct a reasonably clear criterion between positive and negative statements
  • Gale, Richard. “Strawson’s Restricted Theory of Referring,” Philosophical Quarterly 20 (1970b): 162-165.
    • Argues that Strawson’s restricted (relative to Russell’s) theory of referring leads to absurdities
  • Gale, Richard. “Has the Present Any Duration?” Noûs 5 (1970c): 39-47
    • Argues that the durational present, which can be interpreted as the punctual present, has, except for the irreducible reference to now, the same characteristics as the corresponding instant in the B-series
  • Gale, Richard. “The Fictive Use of Language,” Philosophy 46 (1971): 324-340
    • Employs Austin’s trilogy of locutionary, illocutionary, and perlocutionary acts to argue that the difference between fictive and non-fictive uses of language are fundamentally pragmatic
  • Gale, Richard. “On What There Isn’t,” Review of Metaphysics 25 (1972): 459-488
    • Argues that negative facts cannot be reduced to positive ones
  • Gale, Richard. “O’Connor on the Identity of Indiscernibles,” Philosophical Studies 24 (1973a): 412-415
  • Gale, Richard. “Bergson’s Analysis of the Concept of Nothing,” Modern Schoolman 51 (1973b): 269-300
    • Applies Gale’s analyses of negative propositions to the question whether absolute nothingness is possible
  • Gale, Richard. “Could Logical Space Be Empty?” Essays on Wittgenstein in Honor of G.H. von Wright, Jaakko Hintikka and G.H. von Wright, (ed’s). Acta Philosophica Fennica, vol. 28, Nos. 1-2, 1976
    • Explores the question whether in Wittgenstein’s Tractatus all positive atomic propositions could be false
  • Gale, Richard. “A Reply to Oaklander,” Philosophy and Phenomenological Research 38 (1977): 234-238.
    • Defends his claim (that there can be a B-series of events only if these events also form an A-series) against Oaklander’s objection to that claim from The Language of Time
  • Gale, Richard. “Wiggins’ Thesis D (x),” Philosophical Studies 45 (1984): 239-245
    • Argues that Wiggins’ principle of Sortal Dependency easily leads to counterexamples
  • Gale, Richard. “William James and the Ethics of Belief,” American Philosophical Quarterly 17 (1985): 1-14
    • Gale describes this paper as an attempt to capture the “spirit and thrust” of James’ The Will to Believe.
  • Gale, Richard. “Omniscience-Immutability Arguments,” American Philosophical Quarterly 23 (1986a): 319-335.
    • Argues for Gale’s A-theory of time
  • Gale, Richard. “A Priori Arguments from God’s Abstractness,” Noûs 20 (1986b): 531-543.
    • Argues that the a priori arguments based on God’s abstractness that God necessarily exists are uncompelling
  • Gale, Richard. “Parfit’s Arguments Against Partially Relativized Theories of Rationality,” Analysis 47 (1986c): 230-236.
  • Gale, Richard. “Freedom vs. Unsurpassable Greatness,” International Journal for Philosophy of Religion 23 (1988): 65-75.
    • Argues against Plantinga’s new version of the ontological argument
  • Gale, Richard. “Lewis’ Indexical Argument for World-Relative Actuality,” Dialogue 28 (1989): 289-304
  • Gale, Richard. “Freedom and the Free Will Defence,” Social Theory and Practice 16 (1990): 397-423
  • Gale, Richard. “Becoming,” Handbook. Philosophia Verlag, 1991b
  • Gale, Richard. “On Some Pernicious Thought-Experiments,” Thought Experiments in Science and Philosophy, G. Massey and T. Horowitz, (ed’s). Rowman & Littlefield Publishers, 1991c
  • Gale, Richard. “Pragmatism versus Mysticism: The Divided Self of William James,” Philosophical Perspectives, vol. 5, James Tomberlin, (ed.) Atascadero: Ridgeview, 1991d
  • Gale, Richard; Pruss, Alexander. “Cosmological and Design Arguments,” Oxford Handbook of the Philosophy of Religion Oxford: Oxford University Press, 1991e
  • Gale, Richard. “Reply to Paul Helm,” Religious Studies 29 (1993): 257-263.
    • Argues that religious experiences are conceptualizable in a way that we do not adequately understand but that they do not provide evidence for the existence of the accusative
  • Gale, Richard. “William James on Self-Identity Over Time,” Modern Schoolman 71 (1994a): 165-189.
  • Gale, Richard. “The Overall Argument of Alston’s Perceiving God,” Religious Studies 30 (1994b): 135-149
    • Argues against Alston’s pragmatic and epistemic arguments that it is rational to believe in God’s existence
  • Gale, Richard. “Swinburne’s Argument from Religious Experience,” Alan G. Padgett (ed.), Reason and the Christian Religion Oxford: Clarendon, 1994c
    • Argues against Swinburne’s argument from religious experience
  • Gale, Richard. “McTaggart, John McTaggart Ellis,” Encyclopedia of Time, Samuel Macey, (ed.) New York: Garland, 1994d
  • Gale, Richard. “Russell, Bertrand Arthur William,” Encyclopedia of Time. Samuel Macey, (ed). New York: Garland, 1994e
  • Gale, Richard. “Analytic Philosophy,” Encyclopedia of Time. Samuel Macey, (ed). New York: Garland, 1994f
  • Gale, Richard. “Why Alston’s Mystical Doxastic Practice is Subjective,” Philosophy and Phenomenological Research 54 (1994g): 869 – 875.
    • Argues against Alston that alleged direct perceptions of God are subjective and unreliable
  • Gale, Richard. “Non-Being and Nothing,” Oxford Companion to Philosophy, Ted Honderich, (ed.). Oxford: Oxford University Press, 1995
  • Gale, Richard, and Earman, John. “Time,” Cambridge Dictionary of Philosophy, Cambridge: Cambridge University Press, 1995
  • Gale, Richard. “John McTaggart Ellis McTaggart,” A Companion to Metaphysics, J. Kim and E. Sosa, (ed’s.). Blackwell-Wiley, 1996a
  • Gale, Richard. “Negation,” Blackwell’s Companion to Metaphysics, Jaegwon Kim and Ernest Sosa, (ed’s). Hoboken: Blackwell-Wiley, 1996b
  • Gale, Richard. “Nothingness,” Blackwell’s Companion to Metaphysics, Jaegwon Kim and Ernest Sosa, (ed’s). Hoboken: Blackwell-Wiley, 1996c
  • Gale, Richard. “Some Difficulties in Theistic Treatments of Evil,” The Evidential Argument from Evil, Daniel Howard-Snyder, (ed.), (pp. 206-218). Bloomington: Indiana University Press, 1996d
  • Gale, Richard. “William James’s Quest to Have It All,” Transactions of the Charles S. Peirce Society 32 (1996e): 568-596.
  • Gale, Richard. “Disanalogies Between Space and Time,” Process Studies 25 (1996f): 72-89.
    • Argues that there are major disanalogies between space and time and ‘here’ and the ‘now’
  • Gale, Richard. “John Dewey’s Naturalization of William James,” The Cambridge Companion to James, Ruth Anna Putnam, (ed.). Cambridge: Cambridge University Press, 1997a
  • Gale, Richard. “William James’s theory of Freedom,” Modern Schoolman 74 (1997b): 227-247.
  • Gale, Richard. “From the Specious to the Suspicious Present: The Jack Horner Phenomenology of William James,” Journal of Speculative Philosophy 11 (1997c): 163-189.
    • Argues that James’ account of how we experience time is based on a faked phenomenology that distorts the way we experience time
  • Gale, Richard. “William James’s Semantics of ‘Truth’,” Transactions of the Charles S. Peirce Society 33 (1997d): 863-898
    • Argues that James’ moralization of epistemology requires him to reject Tarski’s principle that a sentence “s” is true if and only if s, and that this means that James’ pragmatic theory of meaning can only supply the conditions under which a belief is epistemically warranted
  • Gale, Richard. “William James’ Ethics of Prometheanism,” History of Philosophy Quarterly 15 (1998a): 245-269
  • Gale, Richard. “Ich bin Ein Realist: James’s Attempt to Placate Realism,” International Studies in Philosophy 30 (1998b): 1-17
    • Critically analyses James’ view that we are all morally obligated to maximize desire satisfaction over all other available options
  • Gale, Richard. “Robert M. Adams’s Theodicy of Grace,” Philo 1 (1998c): 36-44.
  • Gale, Richard. “William James and the Willfulness of Belief,” Philosophy and Phenomenological Research 59 (1999b): 71-91.
    • Argues that there is a salvageable core to James’ view that we can believe at will
  • Gale, Richard. “Santayana’s Bifurcationist Theory of Time,” Bulletin of the Santayana Society 17 (1999c): 1-13
  • Gale, Richard. “A New Argument for the Existence of God: One That Works, Well Sort Of,” The Rationality of Theism: Essays in the Philosophy of Religion. G. Bruntrup & R.K. Tacelli, (eds.). Dordrecht: Kluwer, 1999d
    • Gale’s first statement of his new cosmological argument
  • Gale, Richard, Pruss, Alexander. “A New Cosmological Argument,” Religious Studies 35 (1999): 461-476.
    • Gale’s and Pruss’s improved version of Gale’s new modal cosmological argument
  • Gale, Richard. Introduction to the re-issue of William James and Henri Bergson by Horace Kallen London: Thoemmes Press, 2001
  • Gale, Richard. “Time, Temporality, and Paradox,” Blackwell Guide to Metaphysics, Richard Gale, (ed.). Oxford: Blackwell, 2002a
  • Gale, Richard. “The Metaphysics of John Dewey,” Part I and Part II, Transactions of the Charles S. Peirce Society 38 (2002b): 477-519.
    • Explains why Dewey was not the best philosopher of all time but is the best person ever to have been a philosopher.
  • Gale, Richard. “Divine Omniscience, Freedom, and Backward Causation,” Faith and Philosophy 19 (2002c): 85-88.
    • Attempts to deduce a contradiction from the proposition that God is omniscient and immutable and that there are true temporal indexical propositions
  • Gale, Richard. “A Challenge for Interpreters of Varieties,” Streams of William James. The William James Society 4 (2002d): 32-33
  • Gale, Richard. “A Thomist Metaphysics,” Blackwell Guide to Metaphysics. Richard Gale, (ed.) Oxford: Blackwell, 2002e
  • Gale, Richard, Pruss, Alexander. “A Response to Oppy, and to Davey and Clifton,” Religious Studies 38 (2002): 89-99
    • Argues that the fact that the weak principle of sufficient reason entails the strong principle of sufficient reason does not damage their argument and gives one a further reason to accept the weak principle
  • Gale, Richard. “Why Traditional Cosmological Arguments Don’t Work and a Sketch of a New One that Does,” Contemporary Debates in Philosophy of Religion, Michael Peterson, (ed.) Oxford: Blackwell, 2003a
  • Gale, Richard. “God Eternal and Paul Helm,” Reason, Faith and History: Essays in Honour of Paul Helm, Martin Stone, (ed.). London: Ashgate, 2003b
  • Gale, Richard, and Pruss, Alexander. “A Response to Almeida and Judisch,” International Journal for Philosophy of Religion 53 (2003): 65-72.
    • Denies that their argument leads to a contradiction but acknowledges the need to clarify the nature of their conclusion
  • Gale, Richard. “The Ecumenicalism of William James,” William James and The Varieties of Religious Experience: Centenary Celebration. Jeremy Carrette, (ed.). London: Routledge, 2004a
  • Gale, Richard. “William James and John Dewey: the Odd Couple,” Midwestern Studies in Philosophy 28 (2004b): 149–167.
  • Gale, Richard. “The Still Divided Self of William James: A Response to Pawelski and Cooper,” Transactions of the Charles S. Peirce Society 40 (2004c): 153-170.
  • Gale, Richard, and Pruss, Alexander. “Cosmological and Teleological Arguments,” Oxford Companion to the Philosophy of Religion, William Wainwright, (ed.), 2005
  • Gale, Richard. “Response to My Critics,” Philo 6 (2003): 132-165
  • Gale, Richard. “John Dewey’s ‘Time and Individuality’,” Modern Schoolman, 82 (2005a): 175-192
  • Gale, Richard. “On the Cognitivity of Mystical Experiences,” Faith and Philosophy 22 (2005b): 426-441
  • Gale, Richard. “Comments on the Will to Believe,” Social Epistemology 20 (2006a): 35-39
  • Gale, Richard. “The Problem of Ineffability in Dewey’s Theory of Inquiry,” Southern Journal of Philosophy 44 (2006b): 75-90.
  • Gale, Richard. “The Failure of Traditional Theistic Arguments,” Cambridge Companion to Atheism, M. Martin, (ed.). Cambridge: Cambridge University Press, 2007a
  • Gale, Richard. “Evil and Alvin Plantinga,” Alvin Plantinga, Deane-Peter Baker, (ed.). Cambridge: Cambridge University Press, 2007b
  • Gale, Richard. “Relations,” Encyclopedia of American Philosophy, (ed’s). J. Lachs and R. Talisse. Nashville: Vanderbilt University Press, 2007c
  • Gale, Richard. “Healthy-Mindedness,” Encyclopedia of American Philosophy, (ed’s). J. Lachs and R. Talisse. Nashville: Vanderbilt University Press, 2007d
  • Gale, Richard. “The Influence of William James,” Encyclopedia of American Philosophy, (ed’s). J. Lachs and R. Talisse Nashville: Vanderbilt University Press, 2007e
  • Gale, Richard. “God,” Encyclopedia of American Philosophy. Nashville: Vanderbilt University Press, 2007f
  • Gale, Richard. “Time,” Encyclopedia of American Philosophy. Nashville: Vanderbilt University Press, 2007g
  • Gale, Richard. “Timothy Sprigge: The Grinch that Stole Time”, Consciousness, Reality and Value: Essays in Honour of T.L.S. Sprigge. Pierfrancesco Basile & Leemon B. McHenry (eds.). Frankfurt: Ontos Verlag, 2007h
  • Gale, Richard. “The Deconstruction of Traditional Philosophy by William James,” Looking Toward Last Things: Pragmatism 100 Years After James’s, John Stuhr, (ed.). Bloomington: Indiana University Press, 2009
  • Gale, Richard. “The Naturalism of John Dewey,” Cambridge Companion to John Dewey, Molly Cochran, (ed.). Cambridge: Cambridge University Press, 2010
  • Gale, Richard. “The Problem of Evil,” Routledge Companion to the Philosophy of Religion, Chad Meister and Paul Copan, (ed’s). London: Routledge, 2012
  • Gale, Richard. “James,” Twentieth-Century Philosophy of Religion: The History of Western Philosophy of Religion. New York: Routledge, 2013
  • Gale, Richard. “Autobiography” (unpublished)

iv. Book Reviews

  • Gale, Richard. What Is Political Philosophy? by Leo Strauss Philosophy and Phenomenological Research, 1961
  • Gale, Richard. The Natural Philosophy of Time by G. J. Whitrow Philosophy and Phenomenological Research, 1961
  • Gale, Richard. Time and the Physical World by Richard Schlegel Philosophy and Phenomenological Research, 1963
  • Gale, Richard. Analytical Philosophy of History by Arthur Danto Foundations of Language, 1968
  • Gale, Richard. Essays on Wittgenstein’s Tractatus by Irving Copi and Robert Beard, (ed’s). Philosophy and Phenomenological Research, 1968
  • Gale, Richard. Joint review of “The Problem of Future Contingencies” by Richard Taylor and “Present Truth and Future Contingency” by Rogers Albritton Journal of Symbolic Logic, 1969
  • Gale, Richard. Kant’s Theory of Time by Al-Azm. Studies in History and Philosophy of Science, 1971
  • Gale, Richard. Remarks on Colour by Ludwig Wittgenstein. Review of Metaphysics, 1981
  • Gale, Richard. Time: A Philosophical Analysis by T. Chapman. Dialogue 23 (1984): 153
  • Gale, Richard. Eternity by Brian Leftow. International Studies in Philosophy, 1995
  • Gale, Richard. The Correspondence of William James, v. 4, Ignas Skrupskelis, (ed.). Transactions of the Charles Sanders Peirce Society, 1996
  • Gale, Richard. Judaism and the Doctrine of Creation by Norbert Samuelson CCAR Journal: A Reform Jewish Quarterly, 1999
  • Gale, Richard. Heaven’s Champion by Ellen Suckiel Journal of Value Inquiry, 1999
  • Gale, Richard. William James and the Metaphysics of Experience by David Lamberth Philosophy and Phenomenological Research, 2000
  • Gale, Richard. Questions of Time and Tense by Robin Le Poidevin, (ed.). Philosophical Books, 2000
  • Gale, Richard. Dewey’s Empirical Theory of Knowledge and Reality by John Shook Transactions of the Charles Sanders Peirce Society, 2001
  • Gale, Richard. Truth, Rationality, and Pragmatism: Themes from Peirce by Christopher Hookway. Philosophical Quarterly, 2001
  • Gale, Richard. A. J. Ayer: A Life by Ben Rogers. Free Inquiry, 2003
  • Gale, Richard. The Unknown God by Antony Kenny Religious Studies, 2004
  • Gale, Richard. God and Philosophy by Antony Flew. Transactions of the Charles Sanders Peirce Society, 2005
  • Gale, Richard. God, the Best, and Evil by Richard Langtry Notre Dame Philosophical Reviews: An Electronic Journal, 2009
  • Gale, Richard. William James at the Boundaries: Philosophy, Science, and the Geography of Knowledge by Francesca Bordogna. Journal of the History of Philosophy 48 (2010): 252-253

v. Critical Studies

  • Gale, Richard. Studies in Metaphilosophy by Morris Lazerowitz. Philosophical Quarterly, 1965
  • Gale, Richard. Referring by Leonard Linsky Journal of Philosophy, 1969
  • Gale, Richard. The Philosophy of Space and Time by Richard Swinburne Journal of Philosophy, 1969
  • Gale, Richard. The Cement of the Universe by J.L. Mackie Modern Schoolman, 1976
  • Gale, Richard. The Concept of Identity by Eli Hirsch Journal of Philosophy, 1983
  • Gale, Richard. The Epistemology of Religious Experience by Keith Yandell Faith and Philosophy, 1995
  • Gale, Richard. Ontological Arguments by Graham Oppy Cambridge: Cambridge University Press. Philosophy and Phenomenological Research 58 (1999): 715-19
  • Gale, Richard, and Pruss, Alexander. Atheism and Theism by J. J. C. Smart and John Haldane. Faith and Philosophy 16 (1999): 106-113
  • Gale, Richard. Providence and the Problem of Evil by Richard Swinburne. Religious Studies 36 (2000): 209-219

b. Secondary Sources

  • Scott Aikin. Review of John Dewey’s Quest for Unity: The Journey of a Promethean Mystic by Richard Gale. Transactions of the Charles S. Peirce Society 46 (2010): 656-659
  • Alston, William. Review of On the Nature and Existence of God by Richard Gale. The Philosophical Review 102 (1993): 433-435
  • Alston, William. Perceiving God: The Epistemology of Religious Experience. Ithaca: Cornell, 2014
  • Beilby, James “Plantinga’s Model of Warranted Christian Belief,” Alvin Plantinga. Deane-Peter Baker, (ed.) Cambridge: Cambridge University Press, 2007
  • Bird, Graham. Review of The Divided Self of William James by Richard Gale. MIND 111 (2002): 100-103
  • Boyle, Deborah. “William James’s Ethical Symphony,” Transactions of the Charles S. Peirce Society, 34 (1998): 977-1003
  • Butler, Clark. “Motion and Objective Contradictions,” American Philosophical Quarterly 19 (1981): 131-139
  • Cocchiarella, Nino. Review of The Language of Time by Richard Gale. Journal of Symbolic Logic 37 (1972): 170-172
  • Craig, W.L. “Divine Timelessness and Personhood,” International Journal for Philosophy of Religion, 43 (1998): 109-124
  • Craig, W.L. Time and Eternity. Wheaton, Illinois: Crossway Books, 2001
  • Craig, W.L. The Tensed Theory of Time: A Critical Examination. New York: Springer, 2000
  • Craig, W.L. The Tenseless Theory of Time: A Critical Examination. New York: Springer, 2013
  • Davey, Kevin, and Clifton, Rob. “Insufficient Reason in the ‘New Cosmological Argument’,” Religious Studies 37 (2001): 485-490
  • Dowden, Bradley. “Time” Internet Encyclopedia of Philosophy URL: https://iep.utm.edu/time/#SH9e
  • Dyke, Heather. Review of Blackwell Guide to Metaphysics by Richard Gale, (ed.) Australasian Journal of Philosophy 81 (2003): 620-621
  • Evans, Charles. “Timeless Truth,” The Philosophical Review 71 (1962): 241-242
  • Ford, Lewis. “The Duration of the Present,” Philosophy and Phenomenological Research 35 (1974): 100-106
  • Franks, Christopher. “Passion and the Will to Believe,” The Journal of Religion 84 (2000): 431-449
  • Gellman, Jerome, 2000. “Prospects for a sound stage 3 of cosmological arguments,” The Existence of God, Richard Gale and Alexander Pruss, (ed’s). Farnham, UK: Ashgate
  • Goldman, Loren. Review of John Dewey’s Quest for Unity: The Journey of a Promethean Mystic by Richard Gale. Education and Culture 29 (2013): 135-139
  • Goodman, Russell B. Review of The Divided Self of William James by Richard Gale. Religious Studies 36 (2000): 227-245
  • Helm, Paul. “Gale on God,” Religious Studies 29 (1993): 245-255
  • Hobbs, Charles. Review of John Dewey’s Quest for Unity: The Journey of a Promethean Mystic by Richard Gale. The Journal of Speculative Philosophy 25 (2011): 28-430
  • Howard-Snyder, Daniel, (ed.). The Evidential Argument from Evil Bloomington: Indiana University Press, 1996
  • Humber, James. “Response to Gale,” Social Theory and Practice: Proceedings from the Georgia State University Conference on Human Freedom 16 (1990): 425-433
  • Iannone, Pablo. Review of The Philosophy of William James: An Introduction by Richard Gale. Review of Metaphysics 59 (2005): 173-174
  • King-Farlow, John. “The Positive McTaggart on Time,” Philosophy 49 (1974): 169-178
  • Khatchadourian, Haig. “Do Ordinary Spatial and Temporal Expressions Designate Relations?” Philosophy and Phenomenological Research 34 (1973): 82-94
  • Lamberth, David C. Review of The Divided Self of William James by Richard Gale. Journal of the American Academy of Religion 68 (2000): 890-893
  • Leftow, Brian. “Time, Actuality and Omniscience,” Religious Studies 26 (1990): 303-321
  • McDonough, Richard. “The Gale–Pruss cosmological argument: Tractarian and advaita Hindu objections,” Religious Studies, 2016. http://journals.cambridge.org/abstract_S0034412516000123.
  • McHugh, Christopher. “A Refutation of Gale’s Creation-Immutability Arguments,” Philo 6 (2003): 5-9
  • McTaggart, J.M.E. “The Unreality of Time,” Mind 17 (1908): 457-474
  • Mellor, D. H., Matters of Metaphysics, Cambridge: Cambridge University Press, 1991
  • Meyers, Gerald. Review of The Divided Self of William James by Richard Gale. Philosophy and Phenomenological Research 64 (2002): 491-494
  • Meyers, William, and Pappas, Gregory. “Dewey’s Metaphysics: A Response to Richard Gale,” Transactions of the Charles S. Peirce Society 40 (2004): 679-700
    • Argues that Gale’s analytical outlook leads him to fundamentally misunderstand Dewey’s philosophy
  • “Moe Gale Dies.” New York Times, Sept. 3, 1964
  • Newton-Smith, W. Review of The Language of Time by Richard Gale. The British Journal for the Philosophy of Science 20 (1969): 281-283
  • Oaklander, L. Nathan. 1977. “The Timelessness of Time,” Philosophy and Phenomenological Research 38 (1977): 228-233
    • Argues that Gale’s purported reduction of B-relations to A-determinations fails because it cannot account for the timelessness of time
  • Oaklander, L. Nathan, and Smith, Quentin. The New Theory of Time. New Haven: Yale University Press, 1994
    • An excellent anthology on the philosophy of time that includes interesting discussion of both the older and the newer versions of the A and B theories of time.
  • Oppy, Graham. Ontological Arguments and Belief in God Cambridge: Cambridge University Press, 1995
  • Oppy, Graham. “On ‘A New Cosmological Argument,’” Religious Studies 36 (2000): 345–353
    • Argues that Gale’s and Pruss’s weak principle of sufficient reason entails the strong version of the principle of sufficient reason
  • Oppy, Graham. Arguing about Gods Cambridge: Cambridge University Press, 2006
  • Oppy, Graham. Describing Gods: An Investigation of Divine Attributes Cambridge: Cambridge University Press, 2014
  • O’Connor, David. God and Inscrutable Evil: In Defence of Theism and Atheism. Lanham, MD: Rowman and Littlefield, 1996
  • Padgett, Alan. “God and Time: Toward a New Doctrine of Divine Timeless Eternity,” Religious Studies 25 (1989): 209-215
  • Pawelski, James. “William James’ Divided Self and the Process of its Unification: A Reply to Richard Gale.” Transactions of the Charles Sanders Peirce Society 39 (2003): 645-656
    • Argues that Gale’s view that James is a mystic and that this produces a self that is divided from its pragmatism is wrong
  • Plantinga, Alvin. Warranted Christian Belief New York: Oxford University Press, 2000
  • Plumer, Gilbert. “Detecting Temporalities,” Philosophy and Phenomenological Research 47 (1987): 451-460
  • Post, John F. Review of On the Nature and Existence of God by Richard Gale. Philosophy and Phenomenological Research 53 (1993): 950-954
  • Plecha, James. “Tenselessness and the Absolute Present,” Philosophy 59 (1984): 529-534
    • Argues contra Gale that one can acknowledge that language can be detensed while accepting the absolute present and rejecting the block universe
  • Prior, A.N. Review of The Language of Time by Richard Gale. Mind 78 (1969): 453
  • Pruss, Alexander. “The Hume-Edwards Principle and the Cosmological Argument,” International Journal for Philosophy of Religion 43 (1998): 149-165
  • Pruss, Alexander. “A Restricted Principle of Sufficient Reason and the Cosmological Argument,” Religious Studies 40 (2004): 165-179
  • Pruss, Alexander. The Principle of Sufficient Reason: A Reassessment Cambridge: Cambridge University Press, 2010
  • Pruss, Alexander. Actuality, Possibility and Worlds London: Bloomsbury Academic, 2011
  • Rankin, K.W. Review of The Language of Time by Richard Gale. The Philosophical Quarterly 19 (1969): 176-177
  • Raposa, Michael. Review of John Dewey’s Quest for Unity: The Journey of a Promethean Mystic by Richard Gale. American Journal of Theology & Philosophy 31 (2010): 275-278
  • Reichenbach, Bruce R. The Cosmological Argument: A Reassessment Springfield, Illinois: Charles C. Thomas, 1972
  • Robison, John. “A Note on ‘Falsifying Retrodictions’,” Analysis 26 (1965): 9-11
  • Rowe, William. Review of On The Nature and Existence of God by Richard Gale. Journal of the American Academy of Religion 63 (1995): 592-595
  • Rutten, Emmanuel. A Critical Assessment of Contemporary Cosmological Arguments: Towards a Renewed Case for Theism. Doctoral Thesis, Vrije Universiteit Amsterdam, 2012
  • Saka, Paul. “Pascal’s Wager and the Many Gods Objection,” Religious Studies 37 (2001): 321-341
  • Sanford, David H. “McTaggart on Time,” Philosophy 43 (1968): 371-378
  • Schellenberg, J. L. Review of On the Nature and Existence of God by Richard Gale Review of Metaphysics 46 (1992): 402-404
  • Schlesinger, George. “The Stillness of Time and Philosophical Equanimity,” Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition 30 (1976): 145-159
  • Schlesinger, George. “The Reduction of B-Statements,” Philosophical Quarterly 28 (1978): 162-165
  • Schlesinger, George. Aspects of Time Indianapolis: Hackett, 1980
  • Sherry, Patrick. Review of On the Nature and Existence of God by Richard Gale. Philosophy 67 (1992): 563
  • Shook, John. Review of The Philosophy of William James: An Introduction by Richard Gale. Philosophy in Review 25 (2005): 179-181
  • Slattery, Michael. “More on what there Isn’t,” Review of Metaphysics 26 (1972): 344-348
  • Smart, J.J.C. “Tensed Statements: A Comment,” Philosophical Quarterly 12 (1962): 264-265
  • Smith, Quentin. “The Co-Reporting Theory of Tensed and Tenseless Sentences,” Philosophical Quarterly 40 (1990): 213-222
  • Smith, Quentin. “Sentences about Time,” Philosophical Quarterly 37 (1987): 37-53
  • Stein, Howard. Review of The Language of Time by Richard Gale. Journal of Philosophy 66 (1969): 350-355
  • Stephens, Matthew. Review of The Divided Self of William James by Richard Gale. Philosophy in Review 21 (2001): 113-115
  • Suckiel, Ellen Kappy. Review of The Divided Self of William James by Richard Gale. Transactions of the Charles S. Peirce Society 36 (2000): 161-168
  • Suckiel, Ellen Kappy. “The Authoritativeness of Mystical Experience: An Innovative Proposal from William James,” International Journal for Philosophy of Religion 52 (2002): 175-189
  • Swinburne, Richard. “Reply to Richard Gale,” Religious Studies 36 (2000): 221-225
    • Swinburne responds to Gale’s criticism that his thesis is ambiguous and makes clear that he intends a strong version of the thesis that God is justified in allowing evil to occur.
  • Swinburne, Richard. Review of On the Nature and Existence of God by Richard Gale. The Journal of Theological Studies, New Series 43 (1992): 784-788
  • Talisse, Robert. Review of John Dewey’s Quest for Unity: The Journey of a Promethean Mystic by Richard Gale. Philosophical Quarterly 61 (2011): 863-864
  • Tiles, J.E. Review of John Dewey’s Quest for Unity: The Journey of a Promethean Mystic by Richard Gale. Notre Dame Philosophical Reviews 5 (2010)
  • Tooley, Michael. Time, Tense and Causation Oxford: Clarendon Press, 1997
  • Van Inwagen, Peter. “Reflections on the Chapters by Draper, Russell, and Gale,” The Evidential Argument from Evil, Daniel Howard-Snyder, (ed.). Bloomington: Indiana University Press, 1996
  • Watson, Justin. Review of The Divided Self of William James by Richard Gale. Religion & Literature 31(1999): 124-126
  • White, David A. “Can Alston Withstand the Gale?” International Journal for Philosophy of Religion 39 (1996): 141-149
    • Argues that Gale’s argument that God is not the sort of being that can be pinned down and baptised in an act of naming is wrong.
  • Williams, Clifford. “The Metaphysics of A- and B- Time,” Philosophical Quarterly 46 (1996): 371-381
    • Argues that the alleged difference between A- and B- time remains undescribed
  • Williams, Clifford. “Bergsonian Approach to A- and B- Time,” Philosophy 73 (1998): 379-393
  • Zimmerman, Dean. “Richard Gale and the Free Will Defence,” Philo 6 (2003): 78-113

 

Author Information

Richard McDonough
Email: rmm249@cornell.edu
Arium School of Arts & Sciences
Singapore

Wittgenstein: Epistemology

Although Ludwig Wittgenstein is generally better known for his work on logic and on the nature of language, throughout his philosophical journey he also reflected extensively on epistemic notions such as knowledge, belief, doubt, and certainty. This interest is most evident in his final notebook, published posthumously as On Certainty (1969, henceforth OC), where he offers a sustained, though at least apparently fragmentary, treatment of epistemological issues. Given the ambiguity and obscurity of this work, written under the direct influence of G. E. Moore’s A Defence of Common Sense (1925, henceforth DCS) and Proof of an External World (1939, henceforth PEW), the recent literature on the subject contains a number of competing interpretations of OC. This article first presents the uncontentious aspects of Wittgenstein’s views on skepticism, that is, his criticisms of Moore’s use of the expression “to know” and his reflections on the artificial nature of the skeptical challenge. It then introduces the elusive concept of “hinges,” central to Wittgenstein’s epistemology and his views on skepticism, and offers an overview of the dominant “Wittgenstein-inspired” anti-skeptical strategies along with the main objections raised against these proposals. Finally, it briefly sketches the recent applications of Wittgenstein’s epistemology in the contemporary debate on skepticism.

Table of Contents

  1. Wittgenstein on Radical Skepticism: A Minimal Reading
  2. The Therapeutic Reading
  3. The Epistemic Reading
  4. The Contextualist Reading
  5. The Non-epistemic Reading
  6. The Non-propositional Reading
  7. The Framework Reading
  8. Concluding Remarks
  9. References and Further Reading

1. Wittgenstein on Radical Skepticism: A Minimal Reading

The characteristic claim of Cartesian-style skeptical arguments is that we cannot know certain empirical propositions (such as “Human beings have bodies,” or “There are external objects”), since we may be dreaming, hallucinating, deceived by a demon, or be “brains in a vat” (BIV), that is, disembodied brains floating in a vat, connected to supercomputers that stimulate them in just the same way that normal brains are stimulated when they perceive things in a normal way. Therefore, as we are unable to refute these skeptical hypotheses, we are also unable to know propositions that we would otherwise accept as true if we could rule out these scenarios.

Cartesian arguments are extremely powerful because they rest on the Closure principle for knowledge. According to this principle, knowledge is “closed” under known entailment. Roughly speaking, the principle states that if an agent knows a proposition (for example, that she has two hands), and competently deduces a second proposition from it (for example, knowing that having hands entails not being a BIV, she deduces that she is not a BIV), then she also knows the second proposition (that she is not a BIV). More formally:

The “Closure” Principle:

If a subject S knows that p, and S knows that p entails q, then S knows that q.

Let’s take a skeptical hypothesis, SH, such as the BIV hypothesis mentioned above, and M, an empirical proposition such as “Human beings have bodies” that would entail the falsity of a skeptical hypothesis. We can then state the structure of Cartesian skeptical arguments as follows:

(S1) I do not know not-SH

(S2) If I do not know not-SH, then I do not know M

(SC) I do not know M

Considering that we can repeat this argument for each and every one of our empirical knowledge claims, the radical skeptical consequence we can draw from this and similar arguments is that empirical knowledge is impossible.
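To make the role of the Closure principle explicit, the argument can be set out schematically. The following epistemic-logic sketch, with “Kp” read as “S knows that p,” is offered purely as an illustration (the notation is not Moore’s, Wittgenstein’s, or the skeptic’s own): premise (S2) is simply the contrapositive of Closure, instantiated with M and not-SH.

% Illustrative rendering only; K abbreviates "S knows that" (requires amsmath).
\begin{align*}
\text{Closure:}\quad & \bigl(Kp \wedge K(p \rightarrow q)\bigr) \rightarrow Kq\\
\text{Instance } (p := M,\; q := \neg\mathit{SH}):\quad & \bigl(KM \wedge K(M \rightarrow \neg\mathit{SH})\bigr) \rightarrow K\neg\mathit{SH}\\
\text{Contraposing, granting } K(M \rightarrow \neg\mathit{SH}):\quad & \neg K\neg\mathit{SH} \rightarrow \neg KM && \text{(premise S2)}\\
\text{With (S1), } \neg K\neg\mathit{SH}:\quad & \neg KM && \text{(conclusion SC)}
\end{align*}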

A way of dealing with “Cartesian-style” skepticism is to affirm, contra the skeptic, that we can know the falsity of the relevant skeptical hypothesis; for instance, in DCS and PEW, G. E. Moore famously argued that we can have knowledge of the “commonsense view of the world,” that is, of statements such as “Human beings have bodies,” “There are external objects,” or “The earth existed long before my birth,” and that this knowledge offers a direct response to skeptical worries.

Moore himself (1942) was not fully convinced of the anti-skeptical strength of DCS and PEW, which have engendered a huge debate that it would be impossible to summarize here (see Malcolm, 1949; Clarke, 1973; Stroud, 1984). Nonetheless, it is important to notice that Moore’s affirmation that he knows for certain the “obvious truisms of the commonsense” is pivotal in his anti-skeptical strategy; his knowledge-claims would allow him to refute the skeptic.

But, argues Wittgenstein, to say that we know “obvious truisms” such as “There are external objects” is misleading for a number of reasons. First, in order to claim “I know,” one should be able, at least in principle, to produce evidence or offer compelling grounds for one’s beliefs. That is to say, the “language-game” of knowledge involves and presupposes the ability to give reasons, justifications, and evidence; but crucially (OC 245), Moore’s grounds are not stronger than what they are supposed to justify. In other words, as per Wittgenstein, if a body of evidence is to count as compelling grounds for our belief in a certain proposition, then that evidence must be more certain than the belief itself; and this cannot happen in the case of Moore’s “obvious truisms” because, at least in normal circumstances, nothing is more certain than the fact that we have two hands or a body (OC 125).

Just imagine, for instance, that one attempted to legitimate one’s claim to know that p by using the evidence that one has for p (that is, what one sees, what one has been told about p and so on). Now, if the evidence we adduce to support p is less certain than p itself, then this same evidence would be unable to support p.

However, Wittgenstein argues that even if it would be somewhat odd to claim that we know Moore’s “obvious truisms,” they cannot be an object of doubt either. For instance, if someone were seriously pondering whether she has a body or not, we would not investigate the truth-value of her affirmations; rather, we would question her ability to understand the language she is using, or her sanity, for such a false belief would probably be the result of a sensorial or mental disturbance (OC 526).

Moreover, for Wittgenstein, the kind of never-ending doubt put forward by a proponent of radical skepticism, far from being a legitimate intellectual enterprise, would prevent its proponents from engaging in any intellectual activity at all; to support this point, Wittgenstein gives the example (OC 310) of a pupil who constantly interrupts a lesson, questioning the existence of things or the meanings of words. His doubts lack any sense, and at most they will lead him to a sort of epistemic paralysis; he will simply be unable to learn the skill or subject we are trying to teach him (OC 315).

More generally, for Wittgenstein, any proper epistemic inquiry presupposes that we take something for granted; if we start doubting everything, there will be no knowledge at all. As he remarks at one point:

If you are not certain of any fact, you cannot be certain of the meaning of your words either […] If you tried to doubt everything you would not get as far as doubting anything. The game of doubting itself presupposes certainty (OC 114–115).

That is to say, the questions that we raise and our doubts depend on the fact that some propositions are exempt from doubt, are as it were like hinges on which those turn […] But it isn’t that the situation is like this: We just can’t investigate everything, and for that reason we are forced to rest content with assumption. If I want the door to turn, the hinges must stay put (OC 341–343).

Neither knowable nor doubtable, for Wittgenstein, Moore’s “obvious truisms of the commonsense” are “hinges” (OC 341–343): apparently empirical contingent beliefs which perform a different, basic role in our epistemic practices.

2. The Therapeutic Reading

The “therapeutic reading” of OC (Conant, 1998) stems from the remarks in which Wittgenstein talks about Moore’s “misuse” of the expression “to know”:

Now, can one enumerate what one knows (like Moore)? Straight off like that, I believe not. For otherwise the expression “I know” gets misused. And through this misuse a queer and extremely important mental state seems to be revealed (OC 6).

I know that a sick man is lying here? Nonsense! I am sitting at his bedside, I am looking attentively into his face. So I don’t know, then, that there is a sick man lying here? Neither the question nor the assertion makes sense. Any more than the assertion “I am here”, which I might yet use at any moment, if suitable occasion presented itself. […] And “I know that there’s a sick man lying here”, used in an unsuitable situation, seems not to be nonsense but rather seems matter-of-course, only because one can fairly easily imagine a situation to fit it, and one thinks that the words “I know that…” are always in place where there is no doubt, and hence even where the expression of doubt would be unintelligible (OC 10).

According to the proponents of the “therapeutic reading,” we should read these passages in light of the theory, pivotal in later Wittgensteinian thought, that the meaning of a word consists in its use in ordinary situations. As he writes in his Philosophical Investigations (1997, henceforth PI):

For a large class of cases—though not for all—in which we employ the word “meaning” it can be defined thus: the meaning of a word is its use in the language (PI 43).

This would allow us to reconstruct Wittgenstein’s treatment of skepticism as follows. By stating his “obvious truisms” outside of a language-game, that is, outside any of our everyday epistemic practices, Moore fails to mean anything quite particular by them; thus, in the circumstances in which they are actually uttered, it is not clear what has been said, if anything.

At the same time, a radical skeptic fails to recognize the role—or, using Wittgenstein’s expression, the use—that expressions such as “knowledge” and “doubt” play in our ordinary epistemic practices; as in our everyday life there is nothing similar to the kind of general investigation pursued by the radical skeptic (OC 209 “A doubt without an end is not even a doubt.”), the skeptical challenge would be, strictly speaking, senseless (1998, 241–248).

Following the therapeutic reading of OC, then, both Cartesian skepticism and Moore’s anti-skeptical strategy would be based on a misunderstanding of how language works; it is not clear what Moore and the skeptic are doing with their words and therefore, even if those words apparently have a meaning, their claims lack any sense.

This rendering of Wittgenstein’s anti-skeptical position has appeared to many commentators (see, for instance, Salvatore, 2013) to be simply too crude.

Consider the following entries:

The statement “I know that here is a hand” may then be continued: “for it’s my hand that I am looking at”. Then a reasonable man will not doubt that I know. Nor will the idealist; rather he will say that he was not dealing with the practical doubt which is being dismissed, but there is a further doubt behind that one. That this is an illusion has to be shewn in a different way (OC 19).

But is it an adequate answer to the skepticism of the idealist, or the assurances of the realist, to say that “There are physical objects” is nonsense? For them after all it is not nonsense. It would, however, be an answer to say: this assertion, or its opposite, is a misfiring attempt to express what can’t be expressed like that. And that it does misfire can be shewn; but that isn’t the end of the matter. We need to realize that what presents itself to us as the first expression of a difficulty, or of its solution, may as yet not be correctly expressed at all. Just as one who has a just censure of a picture to make will often at first offer the censure where it does not belong, and an investigation is needed in order to find the right point of attack for the critic (OC 37).

These remarks alone seem to suggest that Wittgenstein was well aware that simply showing that the skeptic was using terms such as “knowledge” and “doubt” outside of any ordinary practice was not enough to dismiss skeptical worries. On the contrary, he seems to concede that nothing prevents us from thinking that the skeptic and his opponent are engaged in a “language-game,” that is philosophical inquiry, in which the expressions “to know” and “to doubt” are used in a way that is at odds with their everyday usage but is still, at least apparently, meaningful and legitimate.

Also, these passages show that Wittgenstein would not consider a rebuttal of skepticism on the basis of pragmatic considerations alone to be satisfactory. Recall that on Conant’s reading, Wittgenstein would dismiss skeptical doubts as being at odds with our ordinary practices and thus unintelligible on closer inspection; in these passages, on the contrary, he stresses the need for a philosophical analysis of the hidden assumptions that make the skeptical “doubt” appear so compelling.

3. The Epistemic Reading

A very influential reading of OC is Crispin Wright’s notion of “rational entitlement” (2004a/2004b), which stems from his famous diagnosis of Moore’s Proof (1985). While in DCS, as we have already seen, Moore argued that we can have knowledge of the “commonsense view of the world,” that is, of very general “obvious truisms” such as “I am a human being,” “Human beings have bodies,” “The earth existed long before my birth,” and so on, in PEW he famously maintained that even an instance of everyday knowledge such as “This is a hand” can offer a direct response to skeptical worries. Moore’s Proof is standardly rendered as follows:

(MP 1) Here is a hand

(MP 2) If there is a hand here, then there are external objects

(MP C) There are external objects

As per Wright, we can reconstruct PEW as follows:

  • I. It perceptually appears to me that there are two hands
  • II. There are two hands
  • III. Therefore, there are material objects

In other words, to state I) amounts to saying that there is a proposition that correctly describes the relevant aspects of Moore’s experience in the circumstances in which the Proof was given; in the case of the Proof, for instance, I) will sound like “I am perceiving (what I take to be) my hand.” Then II) follows from I), and III) follows from II), since “a hand” is a physical object; and given that the premises are known, so is the conclusion.

But, argues Wright, the passage from I) to II) is highly problematic: if Moore were the victim of a skeptical scenario such as the “Dream hypothesis,” and were thus merely dreaming of his hand, II) would no longer follow from I). More generally, I) can ground II) only if we already take for granted that our experience is caused by our interaction with material objects; thus, sensory experience can warrant a belief about empirical objects only if we already assume that there are material objects.

Hence, we need to already have a warrant for III) in order to justifiably go from I) to II); and this is why Moore’s Proof would be question begging or epistemically circular: in order to consider the premises of Moore’s Proof true, we are implicitly assuming the truth of its conclusion.

Thus, Moore’s Proof would lead to another, more subtle form of skepticism that Wright calls Humean; while Cartesian-style skepticism appeals to uncongenial skeptical scenarios to show that we cannot know any of our empirical beliefs, Humean skepticism argues that anytime we make an empirical knowledge claim, we are already assuming that, so to speak, things outside of us are the way we take them to be and, more generally, that there are material objects.

Again, in order to go from I) to II) to III), we need to have an independent warrant to believe that III) is true; and as we do not have this independent warrant, the argument fails to provide warrant for its conclusion. This is a phenomenon that Wright calls “failure of transmission of warrant” (or transmission failure for short). However, Wright argues, in many cases our inquiries are based on commitments or presuppositions that cannot be justified, but that we nonetheless take for granted whenever we are involved in an epistemic practice; and this is what happens with Moore’s “obvious truisms of the commonsense.” On Wright’s account, “hinges” are beliefs whose rejection would rationally necessitate an extensive reorganization, or the complete destruction, of what should be considered as empirical evidence or, more generally, of our epistemic practices.

This reading of OC leads Wright to propose the following “Wittgenstein-inspired” anti-skeptical account. Each and every one of our ordinary inquiries would then rest on ungrounded presuppositions, “hinges”; but still, argues Wright, since the warrant to hold Moore’s “obvious truisms” is acquired in an epistemically responsible way, we cannot dismiss them simply because they are groundless, as this would lead to a complete cognitive paralysis (2004a, 191).

As per Wright, then, Cartesian skepticism would only show that every process of knowledge-acquisition rests on ungrounded presuppositions. However, a system of thought free of any reliance on hinges would not be that of a rational agent; and because rational agency is a basic way for us to act, we have a default rational basis, an entitlement, to believe in “hinges” and thus to know them, even if in an unwarranted way (hence the “epistemic reading”).

Wright’s rational entitlement has engendered a huge debate that would be impossible to summarize here (see Pryor 2000, 2004, 2012; Davies 2003, 2004; Coliva, 2009a, 2009b, 2015). This article presents only the main objections raised against the plausibility of the “epistemic reading” and its anti-skeptical strength.

A first issue (Salvatore, 2013) is that, on this account, the Cartesian skeptical challenge, even if ultimately illegitimate, would nonetheless have the merit of highlighting a constitutive limit of our epistemic practices, namely that they rest on ungrounded presuppositions. On the contrary, far from revealing the structure of our epistemic practices, for Wittgenstein, Cartesian-style skepticism undermines the very notion of what an epistemic practice is. For once we doubt a hinge such as “There are external objects,” expressions like “evidence,” “justification,” and “doubt” will radically alter, if not completely lose, their meaning. Wittgenstein stresses this point in many entries of OC, as in the following remark, where he writes:

If, therefore, I doubt or am uncertain about this being my hand (in whatever sense), why not in that case about the meaning of these words as well? (OC 456).

That is to say, once we assume ex hypothesi that we could be victims of a skeptical scenario, it would be hard to understand what could count as evidence for what, as each and every one of our perceptions would be the result of constant deception. Thus, to doubt a hinge would put in question the very meaning of the words with which we are expressing our doubts.

Another objection against Wright’s proposal goes as follows. Recall that, for Wright, we are rationally entitled to believe in the truth of hinges such as “Human beings have bodies” or “There are external objects.” A consequence of this thought (Pritchard, 2005) is that following this strategy, it would be possible to know the denials of skeptical hypotheses, even if in an unwarranted way: an anti-skeptical move which is excluded by Wittgenstein in many remarks of OC, as in the following entries:

Moore has every right to say he knows there’s a tree there in front of him. Naturally he may be wrong. (For it is not the same as with the utterance “I believe there is a tree there”.) But whether he is right or wrong in this case is of no philosophical importance. If Moore is attacking those who say that one cannot really know such a thing, he can’t do it by assuring them that he knows this and that […] (OC 520).

Moore’s mistake lies in this—countering the assertion that one cannot know that, by saying “I do know it” (OC 521).

That is to say, to claim against a Cartesian skeptic that we know Moore’s “obvious truisms of the commonsense” would be at the same time misleading and unconvincing.

First, as we have already seen, “hinges” cannot be evidentially grounded, for any evidence we could adduce to support a proposition p such as “I have a hand” would be less secure than p itself. As Wittgenstein writes at some point:

One says “I know” when one is ready to give compelling grounds. “I know” relates to a possibility of demonstrating the truth. Whether someone knows something can come to light, assuming that he is convinced of it. But if what he believes is of such a kind that the grounds he can give are no surer than his assertion, then he cannot say that he knows what he believes (OC 243).

Also, and more importantly, to claim that we “know” Moore’s “obvious truisms of the commonsense” on the basis of pragmatic considerations would simply miss the point of the skeptical challenge.

Moreover (Pritchard, 2005; Jenkins, 2007), following Wright, it is entirely rational to set aside skeptical concerns whenever we want to pursue a given epistemic practice; but here Wright seems to conflate practical and epistemic rationality. That is to say, a Cartesian skeptic can well agree that we have to dismiss skeptical concerns whenever we are involved in a given epistemic inquiry, as not doing so would lead to a cognitive paralysis. But even conceding that it would be practically rational to set aside skeptical worries in order to achieve cognitive results, what is at issue in the skeptical challenge is the epistemic rationality of trusting our senses when skeptical hypotheses are in play. In other words, a skeptic can grant that we have to set aside skeptical worries when we need to form true beliefs about the world in our everyday life; still, she can argue that the fact that we need true beliefs about the world does not make our acceptance of “hinges” epistemically rational as long as we cannot rule out skeptical scenarios. Wittgenstein himself was well aware of this point; consider OC 19:

The statement “I know that here is a hand” may then be continued: “for it’s my hand that I am looking at”. Then a reasonable man will not doubt that I know. Nor will the idealist; rather he will say that he was not dealing with the practical doubt which is being dismissed, but there is a further doubt behind that one. That this is an illusion has to be shewn in a different way.

4. The Contextualist Reading

Another influential reading of OC is Michael Williams’ “Wittgensteinian Contextualism,” which he has proposed in his book Unnatural Doubts and in a number of other more recent works (1991, 2001, 2004a, 2004b, 2005). A first formulation of a contextualist interpretation of OC can be found in Morawetz (1978). (For a general introduction to epistemic contextualism, along with an overview of other, non-Wittgensteinian contextualist anti-skeptical proposals, see the separate entry on epistemic contextualism.)

Recall that in some passages of OC (OC 114, 115, 315, 322), Wittgenstein argues that any proper inquiry presupposes certainty, that is, some unquestionable prior commitment. In these remarks, Wittgenstein also alludes to the importance of the context of inquiry, stressing that without a precise context there is no possibility of raising a sensible question or doubt. Williams generalizes this part of Wittgenstein’s argument as follows: in each context of inquiry, there is necessarily a set of “hinge” beliefs (which he calls methodological necessities), which hold fast and which are therefore immune to epistemic evaluation in that context.

A motivation for this reading is the extreme heterogeneity of the “hinges” mentioned by Wittgenstein. Along with Moore’s “obvious truisms,” in fact, throughout OC he treats as “hinges” propositions whose certainty is indexed to a historical period (“No man has ever been on the moon.”) together with basic mathematical truths (“12 × 12 = 144”) and contingent empirical claims (“This is a hand.”). According to Williams, this would be a way of stressing a basic feature of our inquiries: namely, they all rest on unsupported presuppositions that can nevertheless be dismissed or questioned when new questions arise or when we switch from one context of inquiry to another. For instance, a historical inquiry about whether, say, Napoleon won at Austerlitz presupposes “hinge commitments” such as “The world existed long before my birth”; all our everyday epistemic practices presuppose hinges such as “There are external objects” or “Human beings have bodies,” and so on.

Crucially, for Williams, to take for granted the “methodological necessities” of a given epistemic practice is not only a matter of practical rationality, as in Wright’s “entitlement strategy”; rather, it is a condition of possibility of any sensible enquiry. That is to say, while for Wright we have to rest on “hinges” mostly because it is the only practical alternative, for Williams our confidence in Moore’s “obvious truisms” highlights the constitutively “local” and “context-dependent” nature of all our epistemic practices.

Thus, in ordinary contexts, it would be illegitimate to doubt hinges such as “Human beings have bodies” or “There are external objects”; but once these are brought into focus, for instance by running skeptical arguments, we are simply switching from one context of inquiry to another, that is, from the everyday context into the philosophical one.

Therefore, by doubting the “hinges” of our most common epistemic practices, the skeptic is simply leading us from a context in which it is legitimate to hold these hinges fast toward a philosophical one in which everything can be doubted.

Nonetheless, the skeptical move cannot affect our everyday knowledge claims, which rest on taken-for-granted “methodological necessities” such as “The earth existed long before my birth” or “Human beings have bodies.” At most, what the Cartesian skeptic is able to show us is that, in the more demanding context of philosophical reflection, we do not know, strictly speaking, anything at all.

A consequence of this thought is that, even if legitimate and constitutively unsolvable at the philosophical level, the Cartesian skeptical challenge does not affect our everyday epistemic practices, since the two belong to different contexts, with completely different methodological necessities or “hinges.” Moreover, the same propositions that we cannot claim to know at the philosophical level are known to be true, albeit tacitly, in other contexts, even though they lack evidential support. Evidential support is something they cannot constitutively possess, insofar as any hinge has to be taken for granted whenever we are involved in an epistemic practice.

There are many problems that Williams’ “Wittgensteinian contextualism” has to face in order to be considered a plausible interpretation of Wittgenstein’s thought. First, on his account, the skeptical enterprise is both completely legitimate and constitutively unsolvable. That is to say, in the context of our ordinary epistemic practices, it would be illegitimate to doubt “obvious truisms” such as “Human beings have bodies” or “This is a hand”; nonetheless, “hinges” would still be doubtable and dismissible in the more demanding context of philosophical inquiry.

Even if in some passages of OC, as we have seen while discussing the “therapeutic reading,” Wittgenstein seems to concede that a skeptic might be using the expressions “to know” and “to doubt” in a specialized and, so to say, “philosophical” way, this does not lead him to admit that the skeptic would be somehow “right,” even if only in the philosophical context. Rather, throughout OC, Wittgenstein stresses that there is no context in which we can rationally hold a doubt about Moore’s “obvious truisms”; as we have seen supra, to seriously doubt a “hinge” would look more like a sign of mental illness than a legitimate philosophical inquiry:

In certain circumstances a man cannot make a mistake. (“Can” is here used logically, and the proposition does not mean that a man cannot say anything false in those circumstances.) If Moore were to pronounce the opposite of those propositions which he declares certain, we should not just not share his opinion: we should regard him as demented (OC 155).

If I now say “I know that the water in the kettle on the gas-flame will not freeze but boil”, I seem to be as justified in this “I know” as I am in any. “If I know anything I know this”. Or do I know with still greater certainty that the person opposite me is my old friend so-and-so? And how does that compare with the proposition that I am seeing with two eyes and shall see them if I look in the glass? I don’t know confidently what I am to answer here. But still there is a difference between cases. If the water over the gas freezes, of course I shall be as astonished as can be, but I shall assume some factor I don’t know of, and perhaps leave the matter to physicists to judge. But what could make me doubt whether this person here is N.N., whom I have known for years? Here a doubt would seem to drag everything with it and plunge it into chaos (OC 613, my italics).

Thus, even if Wittgenstein seems somewhat to concede a prima facie plausibility to skeptical hypotheses, he nonetheless denies that Moore’s “obvious truisms of the commonsense” can be sensibly doubted or denied, even if only in the context of philosophical inquiry (cf. OC 231, 234).

Also, for Williams (2004a), Wittgenstein’s treatment of Moore’s “obvious truisms” differs considerably throughout OC; while the “hinges” listed by Moore would be “methodological necessities,” a statement such as “There are external objects,” namely the conclusion of PEW, would be plain nonsense (2004a, 86–87).

OC would then present two different anti-skeptical strategies, influenced respectively by Moore’s PEW and DCS. The first 60 entries of OC would be concerned with Moore’s Proof. On Williams’ reading, in these remarks Wittgenstein treats both the skeptical challenge and Moore’s anti-skeptical strategy as constitutively senseless; both Moore and the skeptic, in fact, treat “There are external objects” as a hypothesis that can be either confirmed by evidence or dismissed. But “There are external objects” is not an empirical hypothesis that can be tested or doubted; the very fact that we think, talk, and make judgments about the world shows that “There are external objects,” and so any attempt to prove or to doubt this “proposition” would be misguided.

Williams motivates this division partly to make sense of Wittgenstein’s saying that “There are external objects” is nonsense; but, as Moyal-Sharrock (2004, 91–92) notes, Wittgenstein does not necessarily use the term nonsense in a derogatory way. Just consider this passage of the Philosophical Grammar (1974, henceforth PG):

[…] when we hear the two propositions, “This rod has a length” and its negation “This rod has no length”, we take sides and favor the first sentence, instead of declaring them both nonsense. But this partiality is based on a confusion; we regard the first proposition as verified (and the second as falsified) by the fact “that rod has a length of 4 meters” (PG 129).

For Wittgenstein, then, nonsense is not only what violates sense, but also what defines or elucidates it. Thus, Wittgenstein calls “There are physical objects” nonsense; still, this does not amount to saying that it is unintelligible or senseless. Also, while it is undeniable that, for Wittgenstein, “There are external objects” cannot be treated as a hypothesis, there is no clear suggestion that he would consider other “hinges” as open to doubt or verification. Consider the following entry:

It is clear that our empirical propositions do not all have the same status, since one can lay down such a proposition and turn it from an empirical proposition into a norm of description. Think of chemical investigations. Lavoisier makes experiments with substances in his laboratory and now he concludes that this and that takes place when there is burning. He does not say that it might happen otherwise another time. He has got hold of a definite world-picture—not of course one that he invented: he learned it as a child. I say world-picture and not hypothesis, because it is the matter-of-course foundation for his research and as such also goes unmentioned (OC 167, my italics).

5. The Non-epistemic Reading

While for Wright and Williams it is possible to know hinges such as “There are external objects” or “Human beings have bodies,” whether on the basis of practical considerations or in the context of our ordinary epistemic practices, according to the proponents of the “non-epistemic reading” of OC, “hinges” are, strictly speaking, unknowable; still, this does not lead to skeptical conclusions.

This reading of OC was first proposed by Strawson (1985; for similar proposals, see also Wolgast 1987, Conway 1989). According to Strawson, for Wittgenstein skeptical doubts are neither meaningless nor irrational but simply unnatural. This is because radical skeptical doubts are raised with respect to propositions that we find it natural to take for granted (such as “There are external objects” or “Human beings have bodies”): given our upbringing within a community that collectively holds them fast, we cannot help accepting Moore’s “obvious truisms of the commonsense,” even while lacking reasons and grounds in their favor. Thus, following this interpretation of OC, skeptical hypotheses should not be rebutted by argument, but simply recognized as idle and unreal, as they call into doubt what we cannot help believing, given our shared “form of life.”

A similar account has been proposed by Stroll (1994). According to Stroll, “hinges” lie at the foundation of our language-games with ordinary empirical propositions; as such, they cannot be said to be propositions in the ordinary sense, as they are not subject to truth and falsity or to verification and testing.

Thus, the certainty of “hinges” such as “There are external objects” or “Human beings have bodies” has a non-propositional, pragmatic, or even animal nature, and is not subject to any epistemic evaluation.

This reflection on the “animal,” pre-rational certainty of “hinges” is the starting point of another “non-epistemic reading” of OC, proposed by Moyal-Sharrock (2004, 2005). According to Moyal-Sharrock, our confidence in Moore’s “obvious truisms of the commonsense,” such as “There are material objects” or “Human beings have bodies,” is not a theoretical or presuppositional certainty but a practical certainty that can express itself only as a way of acting (OC 7, 395); for instance, a “hinge” such as “Human beings have bodies” is the disposition of a living creature, which manifests itself in her acting in the certainty of having a body (Moyal-Sharrock, 2004, 67): walking, eating, not attempting to walk through walls, and so forth.

Accordingly, Cartesian skeptical arguments, even if prima facie compelling, rest on a misleading assumption: the skeptic is simply treating “hinges” as empirical, propositional knowledge-claims, while on the contrary, they express a pre-theoretical animal certainty, which is not subject to epistemic evaluation of any sort.

Due to this category mistake, a proponent of Cartesian skepticism conflates physical and logical possibility (2004, 170). That is to say, skeptical scenarios such as the BIV one are logically possible, but just in the sense that they are conceivable; in other words, we can imagine skeptical scenarios, then run our skeptical arguments, and thus conclude that our knowledge is impossible. Still, the mere hypothesis that we might be disembodied BIVs has no strength against the objective, animal certainty of “hinges” such as “There are material objects” or “Human beings have bodies,” just as merely thinking that “human beings can fly unaided” has no strength against the fact that human beings cannot fly without help.

Therefore, skeptical beliefs such as “I might be a disembodied BIV” or “I might be the victim of an Evil Deceiver” are nothing but belief-behavior (2004, 176), as the skeptic is doubting objectively certain “hinges”; thus, we should simply dismiss skeptical worries, for a skeptical scenario such as the BIV one does not and cannot have any consequences whatsoever for our epistemic practices or, more generally, for our “human form of life.”

This reading of OC has attracted several criticisms (see, for instance, Salvatore 2013, 2016). If, on the one hand, Moyal-Sharrock stresses the conceptual, logical indubitability of Moore’s “truisms,” she nonetheless seems to grant that the certainty of “hinges” stems from their function in a given context, to the extent that they can be sensibly questioned and doubted in fictional scenarios where they “play the role” of empirical propositions. But crucially, if “hinges” are objective certainties because of their role in our ordinary life, a skeptic can still argue that, in the context of philosophical inquiry, Moore’s “commonsense certainties” play a role which, like the role they play in fictional scenarios, is both at odds with our “human form of life” and still meaningful and legitimate.

Moreover, despite Moyal-Sharrock’s insistence on the conceptual, logical indubitability of Moore’s “truisms of the commonsense,” her rendering of Wittgenstein’s strategy resembles Williams’ proposal, and thus incurs the objections we have already encountered against that reading. As has been argued throughout this article, simply to state that Cartesian skepticism has no consequence for our “human form of life” sounds like too much of a pragmatist response to the skeptical challenge. This is so because a skeptic can well agree that skeptical hypotheses have no consequence for our everyday practices, or that they are just fictional scenarios; also, she can surely grant that Cartesian-style arguments cannot undermine the pre-rational confidence with which we ordinarily take for granted Moore’s “obvious truisms of the commonsense.” But crucially, and as Wittgenstein was well aware, a skeptic can always argue that she is not concerned with practical doubt (OC 19) but with a, so to speak, purely philosophical one.

Also, and more importantly, even if we agree with Moyal-Sharrock on the “nonsensical” nature of skeptical doubts, this nonetheless has no strength against Cartesian-style skepticism. Recall the structure of Cartesian skeptical arguments: take a skeptical hypothesis SH, such as the BIV one, and M, a mundane proposition such as “This is a hand.” Now, given the Closure principle, the argument goes as follows:

(S1) I do not know not-SH

(S2) If I do not know not-SH, then I do not know M

Therefore

(SC) I do not know M

In this argument, whether an agent is seriously doubting if she has a body or not is completely irrelevant to the skeptical conclusion, “I do not know M.” Also, a proponent of Cartesian-style skepticism can surely grant that we are not BIVs, or that we are not constantly deceived by an Evil Genius, and so on. Still, the main issue is that we cannot know whether we are victims of a skeptical scenario or not; thus, given Closure, we are still unable to know anything at all.
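One way to make the role of the Closure principle fully explicit is the following minimal formal rendering; the schematization is offered here only for illustration and is not drawn from OC or from the authors discussed (“K p” abbreviates “the agent knows that p”):

\begin{align*}
&\text{Closure:} \quad \bigl(K M \wedge K(M \rightarrow \neg SH)\bigr) \rightarrow K \neg SH \\
&\text{(S1)} \quad \neg K \neg SH \\
&\text{(S2)} \quad \neg K \neg SH \rightarrow \neg K M \quad \text{(contrapositive of Closure, given that the agent knows that } M \text{ entails } \neg SH\text{)} \\
&\text{(SC)} \quad \neg K M \quad \text{(from S1 and S2 by modus ponens)}
\end{align*}

Read this way, the argument needs only the unknowability of not-SH together with the agent’s grasp of the entailment from M to not-SH; no actual episode of doubting on the agent’s part figures among the premises.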

6. The Non-propositional Reading

Wittgenstein’s reflections on the structure of reason have influenced a more recent “Wittgenstein-inspired” anti-skeptical position, namely Pritchard’s “hinge-commitment” strategy (2016b), for which “hinges” are not beliefs but rather arational, non-propositional commitments, not subject to epistemic evaluation.

To understand his proposal, recall the following remarks we have already quoted supra:

If you are not certain of any fact, you cannot be certain of the meaning of your words either […] If you tried to doubt everything you would not get as far as doubting anything. The game of doubting itself presupposes certainty (OC 114–115).

The questions that we raise and our doubts depend on the fact that some propositions are exempt from doubt, are as it were like hinges on which those turn […] that is to say, it belongs to the logic of our scientific investigations that certain things are indeed not doubted […] If I want the door to turn, the hinges must stay put (OC 341–343).

According to Pritchard, here Wittgenstein claims that the very logic of our ways of inquiry presupposes that some propositions are excluded from doubt; and this is not irrational or based on a sort of blind faith but, rather, belongs to the way rational inquiries are put forward (see OC 342). Just as a door needs hinges in order to turn, so any rational evaluation requires a prior commitment to an unquestionable proposition or set of “hinges” in order to be possible at all.

A consequence of this thought (2016b, 3) is that any form of universal doubt, such as the Cartesian skeptical one, is constitutively impossible; there is simply no way to pursue an inquiry in which nothing is taken for granted. In other words, the very generality of the Cartesian skeptical challenge rests on a misleading way of representing the essentially local nature of our enquiries.

This maneuver helps Pritchard to overcome one of the main problems facing Williams’ “Wittgensteinian Contextualism.” Recall that, following Williams, the Cartesian skeptical challenge is both legitimate and unsolvable, even if only in the more demanding philosophical context. On the contrary, argues Pritchard, for Wittgenstein there is simply nothing like the kind of universal doubt employed by the Cartesian skeptic, whether in the philosophical context or in the, so to speak, non-philosophical context of our everyday epistemic practices. A proponent of Cartesian skepticism looks for a universal, general evaluation of our beliefs; but crucially, there is no such thing as a general evaluation of our beliefs, whether positive (anti-skeptical) or negative (skeptical), for all rational evaluation can take place only in the context of “hinges” which are themselves immune to rational evaluation.

Each and every one of our epistemic practices rests on “hinges” that we accept with certainty, a certainty which is the expression of what Pritchard calls the “über-hinge” commitment. This is an arational commitment toward our most basic certainties (such as “There are external objects” or “Human beings have bodies”) that, as we mentioned above, is not itself open to rational evaluation but, importantly, is not a belief.

To understand this point, just recall Pritchard’s criticism of Wright’s rational entitlement. As we have seen, Wright argues that it would be entirely rational to claim that we know Moore’s “obvious truisms of the commonsense” whenever we are involved in an epistemic practice which is valuable to us; but, Pritchard argues, in order to know a proposition, we need reasons to believe that proposition to be true. And since, following Wright, we have no reason to consider “hinges” true other than the fact that we need to take them for granted, we cannot have knowledge of them either.

With these considerations in mind, we can come back to Pritchard’s “über-hinge” commitment. As we have seen, this commitment expresses a fundamental arational relationship toward our most basic certainties, a commitment without which no knowledge is possible. Crucially, our basic certainties are not subject to rational evaluation; for instance, they cannot be confirmed or disconfirmed by evidence, and thus they are non-propositional in character (that is to say, they can be neither true nor false). Accordingly, they are not beliefs at all; rather, they are the expression of arational, non-propositional commitments. Thus, the skeptic is somewhat right in saying that we do not know Moore’s “obvious truisms of the commonsense”; but this does not lead to skeptical conclusions, for our “hinge commitments” are not beliefs and so cannot be objects of knowledge. Therefore, the skeptical challenge is misguided in the first place.

Pritchard’s account is concerned first and foremost with the psychology of our inquiries, and not with the epistemic status of the “hinges”; thus, his reflections on the structure of reason are just meant to stress the local nature of our epistemic practices, given which we have to rule out general doubts such as the skeptical one. But (Salvatore, 2016), even if, following his strategy, we are able to retain our knowledge of “mundane” propositions, the skeptic will still be able to undermine our confidence in the rationality of our ways of inquiry; under skeptical scrutiny, we will be forced to admit that all our practices rest on unsupported, arational presuppositions that are not, and crucially cannot be, rationally grounded.

7. The Framework Reading

Another reading of OC is the “framework reading” (McGinn, 1989; see also Coliva, 2010) according to which “hinges” are “judgments of the frame,” that is, conditions of possibility of any meaningful epistemic practice.

This reading stems from the passages in which Wittgenstein highlights the analogy between Moore’s “obvious truisms of the commonsense” and basic mathematical truths:

But why am I so certain that this is my hand? Doesn’t the whole language-game rest on this kind of certainty? Or: isn’t this “certainty” (already) presupposed in the language-game? […] Compare with this 12×12=144. Here too we don’t say “perhaps”. For, in so far as this proposition rests on our not miscounting or miscalculating and on our senses not deceiving us as we calculate, both propositions, the arithmetical one and the physical one, are on the same level. I want to say: The physical game is just as certain as the arithmetical. But this can be misunderstood. My remark is a logical and not a psychological one (OC 446–447).

I want to say: If one doesn’t marvel at the fact that the propositions of arithmetic (e.g. the multiplication tables) are “absolutely certain”, then why should one be astonished that the proposition “This is my hand” is so equally? (OC 448).

According to McGinn, we should read Wittgenstein’s remarks on “hinges” in light of his views about mathematical and logical truths. In the Tractatus Logico-Philosophicus (henceforth TLP), Wittgenstein held what we might call an “objectivist” account of logical and mathematical truths, according to which they describe the a priori, necessary structure of reality. In the later phase of his thinking, Wittgenstein completely dismissed this view, suggesting instead that we should think of logical and mathematical truths as constituting a system of techniques originating and developed in the course of the practical life of human beings. What is important in these practices is not their truth or falsity but their technique-constituting role; so, the question about their truth or falsity simply cannot arise. Quoting Wittgenstein:

The steps which are not brought into question are logical inferences. But the reason is not that they “certainly correspond to the truth” – or something of the sort – no, it is just this that is called “thinking”, “speaking”, “inferring”, “arguing”. There is not any question at all here of some correspondence between what is said and reality; rather is logic antecedent to any such correspondence; in the same sense, that is, as that in which the establishment of a method of measurement is antecedent to the correctness or incorrectness of a statement of length (RFM, I, 156).

That is to say, logical and mathematical truths define what “inferring” and “calculating” are; accordingly, given their “technique-constituting” role, these propositions cannot be tested or doubted, for to accept and apply them is a constitutive part of our techniques of inferring and calculating.

If logical and mathematical propositions cannot be doubted, the same holds for Moore’s “obvious truisms of the commonsense.” Even if they resemble empirical, contingent knowledge claims, all these “commonsense certainties” play a peculiar role in our system of beliefs; namely, they are what McGinn calls “judgments of the frame” (1989, 139).

As mathematical and logical propositions define and constitute our techniques of inferring and calculating, “hinges” such as “This is a hand,” “The world existed long before my birth,” and “I am a human being” would then define and constitute our techniques of empirical description. That is to say, Moore’s “obvious truisms of the commonsense” would show us how to use words: what “a hand” is, what “the world” is, what “a human being” is and so on (1989, 142).

Both Moore and the skeptic misleadingly treat “hinges” such as “Human beings have bodies” or “There are external objects” as empirical propositions, which can be known or believed on the basis of evidence. But Moore’s “obvious truisms” are certain, their certainty being a criterion of linguistic mastery; in order to be considered a full participant in our epistemic practices, an agent must take Moore’s “obvious truisms” for granted.

Even though the “framework reading” has generally been considered a more viable interpretation of Wittgenstein’s thought (see Coliva, 2010), its anti-skeptical strength has been subject to serious challenge. First, following this account, to take “hinges” for granted is a condition of possibility of our epistemic practices (1989, 116–120); still, a skeptic can argue (Minar, 2005, 258) that the indubitability of Moore’s “obvious truisms of the commonsense” is nothing but a fact about what we do. That is to say, “hinges” such as “Human beings have bodies” or “There are external objects” are presupposed by our ordinary linguistic exchanges and constitute what McGinn calls our “framework judgments”; but once these “obvious truisms” are brought into focus, the skeptic will find that their not being up for questioning is simply what happens in normal circumstances. As we have already seen while presenting Conant’s and Wright’s readings of OC, the very fact that in our ordinary life we have to rule out skeptical hypotheses has no strength against Cartesian skepticism; again, what is at issue in the skeptical challenge is not the practical, but the epistemic rationality of setting aside skeptical concerns when we cannot rule out Cartesian-style scenarios.

8. Concluding Remarks

Given the elusive nature of Wittgenstein’s remarks on skepticism, there is still little to no consensus on how they should be interpreted or, more generally, on whether Wittgenstein’s remarks alone can represent a valid response to radical skepticism. Pritchard’s (2016a, 2016b) reflections on “hinges,” for example, are just one part of a more complex anti-skeptical framework that he calls epistemological disjunctivism (Pritchard, 2016b) and that would be impossible to summarize here. Coliva (2010, 2015) has recently proposed a version of the “framework reading” in which “hinges,” even if propositional, have a normative role, and their acceptance is a “condition of possibility” of any rational enquiry (on Coliva’s reading of OC and its anti-skeptical implications, see Moyal-Sharrock, 2013, and Pritchard & Boult, 2013). By contrast, Salvatore (2015, 2016) has used the analogy drawn by Wittgenstein between “hinges” and “rules of grammar” in order to argue for the nonsensicality of skeptical hypotheses, which would be nonsensical combinations of signs excluded by our epistemic practices (defined and constituted by “hinges” such as “Human beings have bodies” or “There are external objects”).

9. References and Further Reading

  • Clarke, T. (1972), “The Legacy of Skepticism,” The Journal of Philosophy, Vol. 69, No. 20, Sixty-Ninth Annual Meeting of the American Philosophical Association Eastern Division, 754–769.
  • Conway, G. D. (1989), Wittgenstein on Foundations, Atlantic Highlands, N.J., Humanities Press.
  • Coliva, A. (2015), Extended Rationality: A Hinge Epistemology, Palgrave Macmillan.
  • Coliva, A. (2010), Moore and Wittgenstein: Scepticism, Certainty and Common Sense, Palgrave Macmillan.
  • Coliva, A. (2009a), “Moore’s Proof and Martin Davies’ epistemic projects,” Australasian Journal of Philosophy.
  • Coliva, A. (2009b), “Moore’s Proof, Liberals and Conservatives. Is There a Third Way?” in A. Coliva (ed.) Mind, Meaning and Knowledge. Themes from the Philosophy of Crispin Wright, OUP.
  • Conant, J. (1998), “Wittgenstein on Meaning and Use,” Philosophical Investigations.
  • Malcolm, N. (1949), “Defending Common Sense,” Philosophical Review.
  • Minar, E. (2005), “On Wittgenstein’s Response to Scepticism: The Opening of On Certainty,” in D. Moyal-Sharrock and W.H. Brenner (eds.), Readings of Wittgenstein’s On Certainty, London, Palgrave, 253–274.
  • Moore, G. E. (1925), “A Defense of Common Sense,” in Contemporary British Philosophy (Second Series), 1925, reprinted in G. E. Moore, Philosophical Papers, London, Collier Books, 1962.
  • Moore, G. E. (1939), “Proof of an External World,” Proceedings of the British Academy, reprinted in Philosophical Papers.
  • Moore, G. E. (1942), “A Reply to My Critics,” in Paul Arthur Schilpp (ed.), The Philosophy of G. E. Moore. Open Court.
  • Moyal-Sharrock, D. (2013), “On Coliva’s Judgmental Hinges,” Philosophia, Vol. 41, No. 1, 13–25.
  • Moyal-Sharrock, D. (2004), Understanding Wittgenstein’s On Certainty, London, Palgrave Macmillan.
  • Moyal-Sharrock, D. and Brenner, W. H. (2005), Readings of Wittgenstein’s On Certainty, London, Palgrave.
  • McGinn, M. (1989), Sense and Certainty: A Dissolution of Scepticism, Oxford, Blackwell.
  • Morawetz, T. (1978), Wittgenstein & Knowledge: The Importance of “On Certainty,” Cambridge, MA, Harvester Press.
  • Pritchard, D. H. (2016a), Epistemic Angst. Radical Scepticism and the Groundlessness of Our Believing, Princeton University Press.
  • Pritchard, D. H. (2016b), “Wittgenstein on Hinges and Radical Scepticism in On Certainty,” Blackwell Companion to Wittgenstein, H.-J. Glock & J. Hyman (eds.), Blackwell.
  • Pritchard, D. H. and Boult, C. (2013), “Wittgensteinian anti-scepticism and epistemic vertigo,” Philosophia, Vol. 41, No. 1, 27–35.
  • Pritchard, D. H. (2005), “Wittgenstein’s On Certainty and Contemporary Anti-skepticism,” in Readings of Wittgenstein’s On Certainty, D. Moyal-Sharrock and W.H. Brenner (eds.), London, Palgrave, 189–224.
  • Salvatore, N. C. (2016), “Skepticism and Nonsense,” Southwest Philosophical Studies.
  • Salvatore, N. C. (2015), “Wittgensteinian Epistemology and Cartesian Skepticism,” Kriterion-Journal of Philosophy, Vol. 29, No. 2, 53–80.
  • Salvatore, N. C. (2013), “Skepticism, Rules and Grammar,” Polish Journal of Philosophy, Vol. 7, No. 1, 31–53.
  • Strawson, P. F. (1985), Skepticism and Naturalism: Some Varieties, London, Methuen.
  • Stroll, A. (1994), Moore and Wittgenstein On Certainty, Oxford, Oxford University Press.
  • Stroud, B. (1984), The Significance of Philosophical Scepticism, Oxford, Oxford University Press.
  • Williamson, T. (2000), Knowledge and Its Limits, Oxford, Oxford University Press.
  • Williams, M. (2004a), “Wittgenstein’s Refutation of Idealism,” in Wittgenstein and Skepticism, D. McManus (ed.), London, New York, Routledge, 76–96.
  • Williams, M. (2004b), “Wittgenstein, Truth and Certainty,” in Wittgenstein’s Lasting Significance, M. Kölbel and B. Weiss (eds.), Routledge, London.
  • Williams, M. (2005), “Why Wittgenstein isn’t a Foundationalist,” in Readings of Wittgenstein’s On Certainty, D. Moyal-Sharrock and W. H. Brenner (eds.), 47–58.
  • Wright, C. (1985), “Facts and Certainty,” Proceedings of the British Academy, Vol. 71, 429–472.
  • Wittgenstein, L. (2009), Philosophical Investigations, revised 4th edn. edited by P. M. S. Hacker and Joachim Schulte, tr. G. E. M. Anscombe, P. M. S. Hacker and Joachim Schulte, Wiley-Blackwell, Oxford.
  • Wittgenstein, L. (1979), Wittgenstein’s Lectures, Cambridge 1932–35, from the Notes of Alice Ambrose and Margaret MacDonald, ed. Alice Ambrose, Blackwell, Oxford.
  • Wittgenstein, L. (1974) Philosophical Grammar, ed. R. Rhees, tr. A. J. P. Kenny, Blackwell, Oxford.
  • Wittgenstein, L. (1969), On Certainty, ed. G. E. M. Anscombe and G. H. von Wright, tr. D. Paul and G. E. M. Anscombe, Blackwell, Oxford.
  • Wolgast, E. (1987), “Whether Certainty is a Form of Life,” The Philosophical Quarterly, Vol. 37, 161–165.
  • Wright, C. (2004a), “Warrant for Nothing (and Foundations for Free)?” Aristotelian Society Supplementary Volume, Vol. 78, No. 1, 167–212.
  • Wright, C. (2004b), “Wittgensteinian Certainties,” in Wittgenstein and Skepticism, D. McManus (ed.), 22–55.

 

Author Information

Nicola Claudio Salvatore
Email: n162970@dac.unicamp.br
University of Campinas – UNICAMP
Brazil

Socialism

Socialism is both an economic system and an ideology (in the non-pejorative sense of that term). A socialist economy features social rather than private ownership of the means of production. It also typically organizes economic activity through planning rather than market forces, and gears production towards needs satisfaction rather than profit accumulation. Socialist ideology asserts the moral and economic superiority of an economy with these features, especially as compared with capitalism. More specifically, socialists typically argue that capitalism undermines democracy, facilitates exploitation, distributes opportunities and resources unfairly, and vitiates community, stunting self-realization and human development. Socialism, by democratizing, humanizing, and rationalizing economic relations, largely eliminates these problems.

Socialist ideology thus has both critical and constructive aspects. Critically, it provides an account of what’s wrong with capitalism; constructively, it provides a theory of how to transcend capitalism’s flaws, namely, by transcending capitalism itself, replacing capitalism’s central features (private property, markets, profits) with socialist alternatives (at a minimum social property, but typically planning and production for use as well).

How, precisely, socialist concepts like social ownership and planning should be realized in practice is a matter of dispute among socialists. One major split concerns the proper role of markets in a socialist economy. Some socialists argue that extensive reliance on markets is perfectly compatible with core socialist values. Others disagree, arguing that to be a socialist is (among other things) to reject the ‘anarchy of the market’ in favor of a planned economy. But what form of planning should socialists advocate? This is a second major area of dispute, with some socialists endorsing central planning and others proposing a radically decentralized, participatory alternative.

This article explores all of these themes. It starts with definitions, then presents normative arguments for preferring socialism to capitalism, and concludes by discussing three broad socialist institutional proposals: central planning, participatory planning, and market socialism.

Two limitations should be noted at the outset. First, the article focuses on moral and political-philosophical issues rather than purely economic ones, discussing the latter only briefly. Second, little is said here about socialism’s rich and complicated history. The article emphasizes the philosophical content of socialist ideas rather than their historical development or political instantiation.

Table of Contents

  1. Socialism and Capitalism: Basic Institutional Contrasts
    1. Ownership: Some Preliminaries
    2. Private, State, and Social Ownership
    3. Economic Systems as Hybrids
  2. Socialism vs. Communism in Marxist Thought
  3. Why Socialism? Economic Considerations
  4. Why Socialism? Democracy
    1. Scope
    2. Influence
  5. Why Socialism? Exploitation
    1. Exploitation as Forced, Unpaid Labor
    2. Eliminating Exploitation
  6. Why Socialism? Freedom and Human Development
    1. Formal Freedom
    2. Effective Freedom
  7. Why Socialism? Community and Equality
    1. Why Produce? Communal vs. Market Reciprocity
    2. Justice, Inequality, Community
  8. Institutional Models of Socialism for the 21st Century
    1. Central Planning
    2. Participatory Planning
      1. Parecon: Basic Features
      2. Allocation in Parecon: Economic Coordination Through Councils
      3. Evaluating Parecon
    3. Market Socialism
      1. Schweickart’s “Economic Democracy”
      2. Evaluating Economic Democracy
  9. References and Further Reading

1. Socialism and Capitalism: Basic Institutional Contrasts

Considered as an economic system, socialism is best understood in contrast with capitalism.

Capitalism designates an economic system with all of the following features:

  1. The means of production are, for the most part, privately owned;
  2. People own their labor power, and are legally free to sell it to (or withhold it from) others;
  3. Production is generally oriented towards profit rather than use: firms produce not in the first instance to satisfy human needs, but rather to make money; and
  4. Markets play a major role in allocating inputs to commodity production and determining the amount and direction of investment.

An economic system is socialist only if it rejects feature 1, private ownership of the means of production, in favor of public or social ownership. But must an economic system reject any of features 2–4 to count as socialist, or is rejection of private property sufficient as well as necessary? Here, socialists disagree. Some, often called “market socialists”, hold that socialism is compatible, in principle, with wage labor, profit-seeking firms, and extensive use of markets to organize and coordinate production and investment. Others, sometimes called “orthodox” or “classical” socialists, contend that an economic system with these features is scarcely distinguishable from capitalism; true socialism, on this view, requires not merely social ownership of the means of production but also planned production for use, as opposed to “anarchic”, market-driven production for profit.

This section explores the core socialist commitment to social ownership of the means of production. Other important aspects of socialism—for instance, its stance towards markets and planning—are discussed in later sections (especially section 8).

a. Ownership: Some Preliminaries

Consider a society’s instruments of production, its land, buildings, factories, tools, and machinery; consider also its raw materials, its oil and timber and minerals and so on. Together, these instruments and these materials comprise society’s means of production. To whom should these means of production belong: to society as a whole, or to private individuals or groups of individuals? This is the central question dividing capitalists and socialists, with capitalists advocating extensive rights of private ownership of the means of production and socialists advocating extensive social or public ownership of these means.

Notice that the capitalist/socialist dispute does not concern the desirability of private property in items unrelated to production. The issue between socialists and capitalists is not whether individuals should be able to own “personal property” (for example, toothbrushes, houses, clothing, and other articles of everyday use) but whether they should be able to own “productive property” (for example, stores, factories, raw materials, and other productive assets).

But what does it mean to own something? Standardly, to own something is to enjoy a bundle of legally enforceable rights and powers over that thing. These rights and powers typically include the right to use, to control, to transfer, to alter (at the limit, even to destroy), and to generate income from the thing owned, as well as the right to exclude non-owners from interacting with the owned thing in these ways. Because these rights admit of gradations, so too does ownership, which is scalar—a matter of degree—rather than dichotomous. In general, the wider one’s rights of use, control, and so on over an object, the fewer restrictions one faces in exercising these various rights, and the wider one’s ownership rights over that object. Ownership, notice, may be narrowed and restricted without ceasing to be ownership. Limited ownership is not an oxymoron.

Another important distinction here is that between legal and effective ownership. These can go together, as when a person owns her car both in law and in fact: she not only has the title, but also possesses actual powers of use, control, and so on over the vehicle. But so too can they come apart. “The means of production belong to all the people,” proclaimed the Soviet Union’s constitution, but these were just words, for in reality democratically unaccountable bureaucrats and party officials grasped all the important economic levers. Something similar could be said of the relationship between shareholders in large capitalist corporations, on the one hand, and management and executives on the other: the former have “paper” ownership, but it is the latter that really exercise control. In general, it is effective rather than merely legal or formal ownership that is of interest in the present context. Capitalists and socialists alike want to realize their preferred patterns of ownership not just on paper, but also in reality.

b. Private, State, and Social Ownership

To understand socialism, one must distinguish between three forms of ownership. Under private ownership, individuals or groups of individuals (for example, corporations) are the primary agents of ownership; it is they who enjoy the various rights of use, control, transfer, income generation, and so on discussed above. Under state ownership, the state retains for itself these rights, and is thus the primary agent of ownership. Both of these forms of ownership should be familiar to anyone who has frequented a business or driven on an interstate highway.

Much less familiar is the key socialist idea of social ownership. Social ownership of an asset means that “the people have control over the disposition of that asset and its product” (Roemer, A Future for Socialism 18). Social ownership of the means of production, then, obtains to the degree that the people themselves have control over these means: over their use and over the products that eventuate from that use. This is a conceptually simple idea, but it can be difficult to grasp its practical implications. How, in concrete terms, could social control over the means of production be realized?

Historically, socialists have struggled to answer this question, and for good reason: it is not at all obvious how meaningful control over something as massive and complex as a modern economy might be shared across tens or even hundreds of millions of people. Broadly speaking, socialists have identified two main strategies of socialization. The first seeks to socialize the economy by nationalizing it. The second seeks the same end by radically decentralizing and democratizing economic power. These strategies will be investigated in greater detail below (see section 8), but for now a few orienting remarks are in order.

First, regarding nationalization: state ownership functions as a vehicle for socialization only to the extent that the people are themselves in control of the state. Otherwise nationalization amounts to little more than statism, not socialism; it constitutes economic rule by state officials rather than by society as a whole. Any genuinely socialist program of nationalization, then, must adhere to a two-part recipe: nationalize the economy, but also democratize the state, thereby putting the people in control of the economy at one remove.

This second step has proven rather elusive in practice. It was not accomplished—indeed, it was not even really attempted—by the so-called “socialist” authoritarianisms of the 20th century such as the Soviet Union and China. And certainly considerable barriers to genuine democratization exist even in countries with longstanding liberal democratic traditions, such as the United States. These barriers include the awesome influence of special interests and concentrated wealth on the political process, corporate domination of political media, voter ignorance and apathy, and so on. Democracy—popular control over the state—is, in short, an ideal easier praised than implemented, even under favorable conditions. However, these considerable practical problems aside, there seems to be nothing incoherent in principle with the idea of a genuinely socialist—because genuinely democratic—program of nationalization.

Or is there? Many socialists argue that state ownership can never fully realize socialism’s promise, no matter how democratic the relationship between the people and the state. This is because real social ownership involves not only control-at-a-remove, so to speak, but also active involvement and participation. On this conception, it is not enough for democratically accountable politicians and bureaucrats to steer the economy in your name; rather, you must do (or at least have the real opportunity to do) some of the steering yourself. The core idea here is well expressed by Michael Harrington:

Socialization means the democratization of decision making in the everyday economy, of micro as well as macro choices. It looks primarily but not exclusively to the decentralized, face-to-face participation of the direct producers and their communities in determining the matters that shape their social lives (197).

In a socialist society, average, everyday people must be active rather than passive, empowered rather than subordinated, involved rather than excluded. But if this is what genuine socialization requires, then socialism is

not a formula or a specific legal mode of ownership, but a principle of empowering people at the base, which can animate a whole range of measures, some of which we do not yet even imagine (Harrington, 197).

The point is not that nationalization can never play a role in making socialism real, but that it cannot play the outsized role often assigned to it.

But if socialists should not rely exclusively on nationalization, to what else should they appeal instead? Different socialists will answer this question in different ways, as we will see in section 8. But most would recommend leavening democratically controlled state ownership with sizable helpings of workplace democracy (as found, for instance, in the Mondragon and La Lega cooperatives in Spain and Italy, respectively), social control over investment, and various other measures to economically empower local communities and individuals (for instance, the “participatory budgeting” process found in Porto Alegre, Brazil, through which citizens meet in popular assemblies to decide how the city’s resources should be spent). By knitting together nationalization of major industries with these and other programs and initiatives, socialists hope to bring to fruition the “truly audacious project of empowering people to take command of their everyday lives” (Harrington, 197).

c. Economic Systems as Hybrids

In principle, an economy could be wholly capitalist, statist, or socialist. An economy would be wholly capitalist just in case all its productive assets were privately controlled; wholly statist, provided all such assets were state-controlled; and wholly socialist, provided all such assets were socially-controlled. While these are coherent theoretical possibilities, they have not been implemented in practice. In reality, all economies are hybrids that blend together private, social, and state ownership. It is better, then, to think of capitalism, statism, and socialism “not simply as all-or-nothing ideal types of economic structures, but also as variables” (Wright, 124). According to this analysis, an economy can be more or less capitalist, socialist, or statist, depending on the particular balance it strikes between the three forms of ownership.

For example, even in the United States—widely seen as a bastion of capitalism—the state plays a considerable role in controlling economic activity and in distributing the proceeds thereof. Does this mean it is a statist or perhaps even a socialist economy? No. Economies should be individuated with reference to their dominant mode of ownership. Since capitalist ownership dominates the United States’ economy—most of its productive assets being privately owned—it should be thought of as capitalist, albeit with some non-capitalist aspects. Similarly, an economy within which most productive assets are socially controlled should count as socialist, even if (as would almost certainly be the case) it also included statist or capitalist elements.

2. Socialism vs. Communism in Marxist Thought

Although this article focuses on socialism rather than Marxism per se, there is an important distinction within Marxist thought that warrants mention here. This is the distinction between socialism and communism.

Both socialism and communism are forms of post-capitalism. Both feature social rather than private ownership of the means of production. Both, within Marxist orthodoxy, reject market production for profit in favor of planned production for use. But beyond these important similarities lie significant differences. In the Critique of the Gotha Program, Marx’s fullest discussion of these matters, he divides post-capitalism into two parts, a “lower phase” (later called “socialism” by followers of Marx) and a “higher phase” (communism). The lower phase follows immediately on the heels of capitalism, and so resembles it in certain ways. As Marx memorably puts this point, socialism is “in every respect, economically, morally and intellectually still stamped with the birth marks of the old society from whose womb it emerges” (Critique of the Gotha Program 614). These capitalist “birth marks” include:

  • Material scarcity. Like capitalism, socialism does not overcome scarcity. Under socialism, the social surplus increases, but it is not yet sufficiently large to cover all competing claims.
  • The state. Socialism transforms the state but does not do away with it. What was a “dictatorship of the bourgeoisie” under capitalism becomes a “dictatorship of the proletariat” under socialism: a state controlled by and for the working class. (Since workers make up the vast majority, this is less authoritarian than it sounds.) Workers must seize the state and use it to implement, deepen, and secure the socialist transformation of society. Only after this transformation is complete can the state “wither away”, and the “government of people” be replaced by the “administration of things”.
  • The division of labor. Socialism, like capitalism, will feature occupational specialization. Having developed under capitalist educational and cultural institutions, most people will have been socialized to fit narrow, undemanding productive roles. They will not, therefore, be “all around individuals” capable of performing a wide variety of complex productive tasks. Accordingly, socialism must feature a broadly inegalitarian occupational structure. As under capitalism, there will be janitors and engineers, nurses’ aides and surgeons, factory workers and planners.
  • Finally, under socialism (many) people will retain certain capitalist attitudes about production and distribution. For example, they will expect compensation to vary with contribution. Since contributions will differ, so too will rewards, leading to unequal standards of living. Turning from distribution to production, many socialist producers will, like their capitalist predecessors, regard work as merely a “means to life” rather than “life’s prime want”. They will work, in short, to get paid, rather than to develop and apply their capacities or to benefit their comrades.

So in all of these ways, the “lower phase” of post-capitalism resembles its capitalist predecessor. Over time, however, these capitalist “birth marks” fade, all traces of bourgeois attitudes and institutions vanish, and humanity finally achieves the “higher phase” of post-capitalist society, full communism.

What would “full communism” be like? Marx never answered this question in detail—and indeed, he disparaged as utopian those socialists who focused excessively on “drawing up recipes for the kitchens of the future”—but from his brief remarks about communist society certain broad outlines can be discerned. Perhaps his most famous description of communism comes in the following passage from the Critique of the Gotha Program:

In a higher phase of communist society, after the enslaving subordination of the individual to the division of labor, and therewith also the antithesis between mental and physical labor, has vanished; after labor has become not only a means of life but also life’s prime want; after the productive forces have also increased with the all-round development of the individual, and all the springs of cooperative wealth flow more abundantly—only then can the narrow horizon of bourgeois right be crossed in its entirety and society inscribe on its banner: from each according to his ability, to each according to his needs (615).

Unpacking this passage, we see that Marx makes all of the following claims about communism:

  • It has done away with the division of labor, especially that between mental and physical labor;
  • Attitudes towards work have changed (perhaps in part because work itself has changed). Communist producers regard work as both instrumentally and intrinsically valuable: they see work not merely as a means to life, but also as “life’s prime want”;
  • Human beings themselves have changed under communism, becoming “all-around”, highly developed individuals (rather than the stunted, one-sided creatures that so many of them were under capitalism and perhaps even under socialism);
  • Material scarcity has been eliminated or at least greatly attenuated, as “the productive forces have increased” and thus “all the springs of cooperative wealth flow more abundantly”;
  • As a result of all these changes, communist society is able to conform to the famous principle: from each according to his ability, to each according to his needs—thus severing the link (found in communism’s “lower phase”) between contribution and reward.

Not only will communism (unlike socialism) do away with class, material scarcity, and occupational specialization, it will also do away with the state. As noted above, the state begins to wither away under socialism. But this process is not completed until the “higher phase” of full communism, for it is only in that phase that lingering class antagonisms are finally eradicated. With these antagonisms cleared away, the state has nothing to do—no class conflict to manage, no further function to perform—and so, like a vestigial limb, it gradually atrophies from disuse. Or, as Engels famously puts this point in Socialism: Utopian and Scientific,

State interference in social relations becomes, in one domain after another, superfluous, and then dies out of itself; the government of persons is replaced by the administration of things, and by the conduct of processes of production. The state is not “abolished”. It withers away (91).

In sum, within Marxist theory socialism and communism are very different indeed. Although both eradicate private property and profits, only the latter also eliminates the division of labor, the state, material scarcity, and perhaps even conflict itself. It is only under communism that mankind completes its ascent from the “kingdom of necessity” into the “kingdom of freedom” (Engels 95).

3. Why Socialism? Economic Considerations

Is socialism worthy of allegiance, and if so, why?

The standard normative argument for socialism is comparative. Socialists typically single out certain moral and political values, argue that these values are poorly served under capitalism, and then support socialism by contending that these values would fare better—not necessarily perfectly, but better—under socialism. Values drawn upon by socialists vary, but usually include democracy, non-exploitation, freedom (both formal and effective), community, and equality. Sections 4–7 discuss these values and their alleged connections with socialism.

But before turning to these explicitly normative arguments, a word should be said about the purely economic case for socialism. (Since this article’s focus is normative rather than economic, this section will be brief.) Capitalism, many socialists hold, is wild and wasteful, prone to great booms and tremendously destructive busts. The argument goes like this: capitalist competition greatly augments society’s forces of production. Each firm, merely to stay in business, must innovate. As a result, productivity soars. Ever more output can be produced for ever fewer inputs, labor included. Abundance looms.

But this very abundance, paradoxically, is an economic problem. Gluts drive down prices as supply overwhelms demand. Profits decline. Firms, forced to cut costs, sack workers and slash wages. As unemployment and economic insecurity mount, demand plummets still further: people simply don’t have much money to spend. With reduced demand comes reduced opportunities for profits, hence, reduced production. What was a boom has turned into a bust, and society faces the absurd spectacle of idle farms next to hungry people; empty shoe factories beside shoeless workers; foreclosed houses alongside the homeless.

Capitalism, then, makes possible universal abundance. But its central features—market competition, the pursuit of profits, and private property—ensure that this possibility will never be realized. In Marxist language, there is a deep “contradiction” between capitalism’s “forces of production” and its “relations of production”, a contradiction that nothing short of socialist revolution can solve. Society must overthrow capitalist productive relations, replacing anarchic market production for profit with planned production for use. Only then will humanity eliminate the ridiculous concatenation of vast productive potential alongside vast unmet needs. Or so the socialist argument goes.

Socialists find further economic faults with capitalism. Capitalism misallocates resources towards producing what is profitable rather than what is needed. True, what is needed can sometimes be profitable. But often the two categories come apart. Think, the socialist will say, of the vast resources spent producing luxuries for the rich, while the needy go without. Or consider the underproduction of critical, but unprofitable, antibiotics, even as “lifestyle drugs” (like Propecia, for baldness) roll off the production line.

Capitalism is also inefficient in its use of human labor power. Capitalism functions best when there exists a “reserve army of the unemployed,” in Marx’s phrase. The credible threat of unemployment reduces workers’ salary demands and increases their work effort. But unemployment means idle workers: able-bodied people, willing to work, who cannot find an outlet for their productivity. This is a waste, and it would not exist under socialism (or so it is claimed).

Further, capitalism allows an entire segment of the (able-bodied) population to live without working: namely, the independently wealthy, who can simply live off investment income. This, again, is wasteful; were these people recruited into the labor process, labor time for the rest could decline. Finally, capitalism misdirects the labor of many of those it does employ. Just think, the socialist will say, of the legions of lawyers, advertisers, marketers, and financial workers. Such workers (and others besides) perform no real productive function. Their jobs are necessary only within the framework of capitalism itself. In a socialist economy, there is no need for marketing, financial speculation, or lawyers specializing in mergers and acquisitions. Socialism would free people currently doing these tasks to apply their talents in a more useful way. Marketers could become teachers; financiers, farmers. And we would all be the better for it.

In sum, socialists seek to upend the common sense view of capitalism. Most people take it for granted that whatever its normative flaws, at the very least capitalism ‘delivers the goods,’ so to speak. Not so, replies the socialist. Because it is prone to economic crises, and is wasteful and inefficient in its use of the means of production (including human labor), capitalism’s economic bona fides must be questioned.

4. Why Socialism? Democracy

The article turns now to the normative case against capitalism and in favor of socialism, starting with democracy.

Democracy means rule by the people, as opposed to rule by the rich, or rule by the excellent, or, more generally, rule by any part of the people over the rest. Systems plausibly claiming to be democratic can vary along at least three dimensions. They can bring a broader or a narrower range of issues under democratic jurisdiction; their members can be more or less directly involved in the exercise of political power; and they can insist upon greater or lesser equality of influence (or perhaps opportunity for influence) over political processes. Call these the scope, involvement, and influence dimensions, respectively.

Other things being equal, as scope, involvement, and equality of influence increase, so too does democracy. Thus it can make sense to say that one democratic system is “more democratic” than another. So too, it is possible to compare different democratic ideals in terms of their “democratic-ness”. A principle or ideal that insists upon maximal equality of influence, for instance, is (other things equal) more democratic than a principle or ideal that does not.

Socialists are radical democrats. They do not merely profess rule by the people; they also interpret that ideal in a highly democratic way, opting for maximalist or near-maximalist positions along all three of the just-mentioned dimensions. They want democracy to have very broad scope; they want citizens to be highly involved in democratic processes; and they want citizens to have roughly equal opportunities to influence these processes. And they typically argue, further, that the democratic ideal, understood in this rich and demanding way, militates against capitalism and in favor of socialism. This article will focus on the scope and influence dimensions.

a. Scope

To see this argument, consider first the scope dimension of democracy, which concerns the question: where should the boundary between public and private, between politics and civil society, be drawn? Which issues should be subject to democratic choice? Many socialists endorse something like the following principle:

All Affected Principle: People affected by a decision should enjoy a say over that decision, proportional to the degree to which they are affected.

However, it is a rather short step—or so say socialists—from this intuitively plausible principle to the radical conclusion that economics should be subordinated to democracy, that large swathes of economic life should be politicized and brought under popular control. All that is required to make that leap is the seemingly incontrovertible premise that many economic issues affect the public. When a local business fires 20% of its workers, this affects the public. When financiers withdraw support for a new shopping center, this affects the public. When society’s productive assets are deployed to make yachts for millionaires rather than affordable housing, this affects the public. When corporations pull up roots and relocate production to greener pastures, this affects the public.

In all of these cases (and many others besides), people’s lives are affected—indeed, often profoundly affected—by economic decisions. Do they get a say in these decisions, as required by the All Affected Principle? Not under capitalism, which grants extensive control over such matters to holders of private property rights. Where private property reigns, owners rather than affected parties decide, for example, whether to hire or fire, to invest, to relocate, and so on. From the socialist point of view, this is a serious offense against democracy. Capitalism, socialists claim, depoliticizes what should remain political; it cedes far too much control over common affairs to private parties. It is, in this way, insufficiently democratic.

But if the root cause of this democratic deficit is private control over productive assets, then the solution, or so socialists argue, must be social control over the same. Social property brings into the democratic domain what private property improperly removes. What touches all must be decided by all; economic matters touch all; therefore economic matters must be decided by all. This is the simple but powerful democratic syllogism at the heart of one major argument for socialism, for social rather than private control of the economy. What might social control over the economy look like in practice? Section 8 explores competing answers to this question.

b. Influence

Socialists find further grounds for rejecting capitalism in democracy’s influence dimension. Standardly, democracy is held to require not merely that all citizens have a say, but that they have an equal say. But what does this really mean? To clarify, suppose that A and B have equal voting rights, but A, being rich, educated, and leisured, has a greater chance to influence the political process than B, who is poor, uneducated, and short on free time (he must work long hours to make ends meet). Do A and B have an “equal say”, in the sense required by democracy?

Nearly all socialists, and indeed, many non-socialists, would say “no”; they would detect a democratic deficit in this scenario, for they typically see democracy as requiring not merely formal equality of opportunity for political influence but also substantive or fair equality of opportunity for political influence. On this view, it is not enough for A and B to enjoy identical legal protections to vote, to run for office, to engage in political speech, and so on. Instead, genuine democracy requires (over and above this merely legal equality) that equally talented and motivated citizens have roughly equal prospects for winning office and/or influencing policy, regardless of their economic and social circumstances—or something along these lines.

Now, capitalism clearly can implement formal political equality. Many capitalist societies grant their citizens equal rights to vote, to run for office, and so on. But can capitalism implement substantive political equality?

Many socialists think not. Capitalism, they point out, generates steep economic inequalities, dividing society into rich and poor. But in a variety of ways, the rich can translate their economic advantages into political ones. This translation can occur relatively directly, as when the rich buy political influence through campaign contributions, or when they hire lobbyists to steer legislative priorities (sometimes going so far as to draft laws themselves). Or it can occur relatively indirectly, as when the wealthy use their ownership of media to shape public opinion (and thus the political process), or when capitalists threaten to take their money out of the country in response to disliked (usually leftist) policies, thereby limiting what government can do.

But whether moneyed interests affect politics directly or indirectly, the net result is the same: capitalism amplifies the voices of the rich, enabling their concerns to dominate the political process. Indeed, some socialists, pressing this objection to its logical conclusion, contend that “democracy” under capitalism is really little more than oligarchy—rule by the rich—covered by a democratic fig leaf. Or, as Vladimir Lenin put this point: “Democracy for an insignificant minority, democracy for the rich—that is the democracy of capitalist society” (79).

Sophisticated defenders of capitalism respond by arguing that capitalism’s democratic deficits can be repaired within a fundamentally capitalist framework. Campaign finance reform, regulation of lobbying, restrictions on corporate domination of media, even limitations on the movement of capital across borders would, together, do much to restore or preserve political equality amidst capitalist economic inequality, and yet none of them are incompatible with capitalism per se. It follows (capitalists argue) that there is no need to throw out the baby of capitalism with the bathwater of political inequality. Sufficiently reformed, capitalism can indeed realize not just formal political equality but also substantive political equality.

The question, socialists would reply, is whether these reforms would ever be chosen by political elites under capitalism. Will capitalist oligarchs willingly undercut the very basis of their rule by socializing control over mass media, installing real campaign finance reform, limiting capital flows, and so on?

Would socialism perform any better than capitalism on this “influence” dimension of democracy? Would it enable equally talented and motivated citizens to have roughly equal prospects for influencing politics? Socialists argue that it would. Because it eliminates class, socialism eliminates the major threat to substantive political equality. (Of course, other forms of exclusion, such as racism and sexism, must also be overcome.) Wealthy property owners will not dominate the political process at the expense of the poor and unpropertied because the latter will be an empty set. Everyone will be a wealthy property owner, in the sense that everyone will share control over the means of production and will have access to a dignified standard of living. Everyone will therefore have roughly equal economic resources to bring to bear on the political process.

Put differently, whereas capitalism attempts to secure political equality despite massive economic inequalities, socialism attempts to secure political equality in large part by eliminating these inequalities.

5. Why Socialism? Exploitation

According to many socialists, one of capitalism’s central moral failings is that it is exploitative. Socialism, by contrast, would not be exploitative—or so these socialists allege—and this is one of the main reasons for preferring it to capitalism.

But what is exploitation? Is capitalism truly exploitative? And would socialism really eliminate exploitation? This section explores socialist answers to these questions.

a. Exploitation as Forced, Unpaid Labor

Although there is no universally accepted account of exploitation, Jeffrey Reiman’s Marx-inspired suggestion that exploitation is “a kind of coercive prying loose of unpaid labor” provides a good framework for discussion (3). On this account, a person is exploited if and only if she is forced to work for free. Feudal serfs, for example, were exploited because they were legally and physically compelled, at sword-point if necessary, to spend part of their working time toiling in the lord’s fields for nothing in return. This was forced, unpaid labor of the most obvious sort, and it constituted a serious form of exploitation.

But are capitalist employees exploited? At first glance, it would appear not. Workers get paid wages, so it doesn’t seem as if they are working for free. Nor does it appear that workers are forced to work. Capitalism, being a system of “free labor”, grants workers ownership over their labor power and entitles them to sell it—or not—as they please. So where is the force supposedly inherent in the capital/worker relationship?

Take the issue of force first. In general, a person is forced to do X whenever she has no reasonable alternative to doing X. Workers, then, are forced to sell their labor power to capitalists just in case they have no reasonable alternative to doing so. But of course they don’t have a reasonable alternative, or so some socialists contend. Their argument is simple. Everyone must make a living. There are, under capitalist property relations, only two main ways to do this: one can live off of investment or property income, or one can live off of wages. By definition, workers cannot pick the first option; they don’t own means of production, so they can’t live off of income generated by such ownership. This leaves wage labor as the only remaining option. True, workers are formally free to decline capitalist employment, but this does not represent a reasonable option, since its consequences are so dire: starvation or, in more enlightened circumstances, life on the dole. Workers therefore have no minimally reasonable choice but to sell their labor power to owners of means of production.

It follows that workers are forced to work for capitalists, even if they are not so forced by capitalists (or indeed, by anyone else). The forcing in question is structural rather than agential; as Reiman explains, it is “an indirect force built into the very fact that capitalists own the means of production and laborers do not.” Or, as Marx puts this point, it is “the dull compulsion of economic relations” rather than “direct force” that “completes the subjection of the laborer to the capitalist” (Capital Vol. I, 737).

Not all socialists accept this argument. G.A. Cohen, for example, suggests that individual workers do have a reasonable alternative to selling their labor power: they can become capitalists themselves (“The Structure of Proletarian Unfreedom”). Not overnight, perhaps, but with enough scrimping and saving, is it not possible for an individual worker to start a business of her own? Cohen concludes that individual workers are not forced to sell their labor power. (He also argues that workers are “collectively unfree”—unfree as a class—since not all, or even many, workers can escape their class at the same time; the economy can absorb only so many small business owners at any given moment. But this alleged collective unfreedom of workers, though interesting and important, is peripheral to our present topic and so must be set aside.)

In response, some socialists question whether opening a small business really represents a reasonable option for most workers. For one thing, many workers simply can’t save enough to open such a business: their wages are just too small relative to the cost of living. For another, even if a worker manages, through years of thrift, to open her own business, most such businesses fail, often leaving the owner much worse off financially than she would have been had she simply remained a wage laborer. Pulling together these ideas, one critic of Cohen concludes that “escaping into the petty bourgeoisie…is a reasonable alternative only for a tiny minority of workers. Thus the vast majority of working-class individuals are forced to sell their labor power to earn a living” (Peffer, 152).

But even if Cohen is wrong, and individual workers are forced to sell their labor power, notice that it does not yet follow that workers are exploited. For forced labor alone does not exploitation make. Exploitation, as described above, involves forced, unpaid labor. Let us turn, then, to the issue of compensation, and in particular, to the question of whether workers toil (at least in part) for free.

Again, surface appearances cut against the socialist position. Wage laborers standardly receive an hourly wage. If they work, say, eight hours, they get eight hours’ pay. It certainly seems, then, that workers receive full compensation for their toil. Perhaps this compensation is unfairly low, but that is a different issue: the exploitation charge, standardly construed, is that workers are forced to work for no pay, not that they are forced to work for low pay.

But probe more deeply, some socialists contend, and the unpaid nature of much work under capitalism becomes clear. To see their argument, it helps to start with an easier case: feudal production. Under feudalism, serfs spent part of their working time working in their own fields and the rest working in their lord’s fields. They kept whatever they could grow on their own plots, and surrendered whatever they grew on the lord’s. Put differently, serfs received compensation for part of their working time, but no compensation at all for the rest of it. A great deal of their work, then, was wholly unpaid: a fact that was very obvious to all involved, given the physical separation between paid work (on the serf’s fields) and unpaid work (on the lord’s).

Marxists argue that precisely the same division between paid and unpaid work exists under capitalism. Workers spend the first part of their working day working, in effect, for themselves. This is the part of the day during which they produce the equivalent of their wages. Marx calls this “necessary labor time”. But the working day does not stop there. Indeed, it cannot stop there, for if it did, there would be no “surplus product” for the capitalist to appropriate, and thus no reason for the capitalist to hire the worker in the first place. So the capitalist requires the worker to perform “surplus labor”, which is just labor beyond “necessary labor”: labor beyond what is required to produce value equivalent to the worker’s wage. The value produced during surplus labor time, Marx calls “surplus value”. Crucially, this surplus value belongs to the capitalist rather than the worker, and is the source of all profits.

To illustrate, consider a worker who produces 1 widget per hour over the course of an eight-hour shift, thus yielding eight widgets in total. Her boss takes these widgets, sells them, and then returns part of the proceeds to the worker in the form of a wage. But this wage must be less than what the capitalist reaped by selling the widgets. Otherwise the capitalist would have nothing left over as profit. To fix ideas, suppose that the worker’s daily wage is equivalent to the value of 2 widgets. To produce this value, she had to toil for 2 hours (at 1 widget per hour). Yet her shift lasts 8 hours. It follows that she spent 2 hours working for herself, and 6 hours working for her boss: which is to say, 6 hours working for free.
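
To make the bookkeeping of this illustration explicit, the split can be written out as a short calculation; the figures are simply those of the example above, and the final ratio uses Marx’s standard “rate of surplus value” (surplus labor over necessary labor), a term the passage itself does not introduce:

\[
\underbrace{2\ \text{hours}}_{\text{necessary labor}} \;+\; \underbrace{6\ \text{hours}}_{\text{surplus labor}} \;=\; 8\ \text{hours},
\qquad
\text{rate of surplus value} \;=\; \frac{6\ \text{hours}}{2\ \text{hours}} \;=\; 300\%.
\]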

We can now appreciate Marx’s remark that “the secret of the self-expansion of capital [that is, the secret of profit] resolves itself into having the disposal of a definite quantity of other people’s unpaid labor” (Capital Vol. I, 534). Profits, on Marxist analysis, are possible only through the extraction of unpaid surplus labor from workers. Wage workers toil gratis no less than serfs. That the division between paid and unpaid labor under capitalism is temporal rather than physical or spatial (as under serfdom) makes this division harder to see, but it does not in any way diminish its reality—or so the socialist argument goes.

b. Eliminating Exploitation

How exactly is socialism supposed to eliminate exploitation? Notice that it would not eliminate work itself. As Marx writes, “Just as the savage must wrestle with Nature to satisfy his wants, to maintain and reproduce life, so must civilized man, and he must do so under all social formations and under all modes of production” (Capital Vol. III, Ch. 48). So even under socialism, work must be done.

However, it does not follow that people must be forced to do it. Society could eliminate the compulsion to labor by partly decoupling income (or access to basic resources more broadly) from work. Philippe van Parijs’s “unconditional basic income” represents one way to achieve this decoupling. On his proposal, which has attracted significant support from socialist quarters, each citizen, no matter how rich or how poor, would be paid a monthly income, set as high as possible, and in any case sufficient to live with dignity. This income would come without any strings attached. In particular, it would not be conditional on working, seeking work, or training for future work. It would go to all members of the political community: leisured surfers off of Malibu no less than industrious steelworkers in Pittsburgh.

The economic feasibility of such a proposal may be questioned. But for present purposes, the important thing to appreciate is the way in which a UBI (as it is known) gives each person the “real freedom” to drop out of the paid labor force, thereby eliminating both the compulsion to work and (therefore) exploitation.

From a socialist perspective, there are at least two potential problems with this way of eliminating exploitation.

First, a UBI enables people to live off the hard work of others—no reciprocation required. Again, surfers get the check no less than people with paid employment. But socialists complain when capitalists live off the work of others; shouldn’t they complain when surfers (and so forth) behave similarly?

Second, there is nothing uniquely socialist about a UBI. Capitalist no less than socialist societies can implement a UBI, thereby enabling everyone to live decently without working. A defender of capitalism might therefore insist that when it comes to exploitation, capitalism and socialism are on all fours: both are equally susceptible to exploitation and equally able to enact the policies needed to eliminate it.

In response, socialists might point to the second necessary feature of exploitation, non-compensation. Notice that compensation takes many forms. Acquiring exclusive control over a sum of money, or over a bundle of resources, is one of them. But so too is acquiring a share of control over resources. Say that you and I work to build a tree house which we then jointly control. Neither of us has exclusive say over the tree house. And yet it would be wrong to conclude that our labors have gone uncompensated. We have been compensated; it’s just that our compensation comes in the form of common rather than private property.

This is precisely the sense in which all labor is compensated under socialism. Workers own the means of production together; they (therefore) own the surplus generated by these means. True, they do not own this surplus privately. They share control over its disposition and use. But shared control can be a form of compensation no less than private control.

Under capitalism, workers have private ownership over their wages (and the things these wages buy) but no ownership at all over most of what they produce. This is the sense in which most of their laboring activity goes uncompensated. Workers produce a surplus, hand it over to capitalists, and are then cut out of the picture; their bosses are free to do with the surplus whatever they like: consume it, invest it, burn it, and so forth. Under socialism, by contrast, workers have private ownership over their wages (or, in a money-less economy, over resources for personal use) and collective ownership over the social surplus they produce. They both make the surplus and share control over how to use this surplus. At no point, then, are socialist producers toiling ‘for free’, since their labors go towards building an economy that is shared and controlled by all. It’s as if everyone made a gigantic tree house that everyone is then free to use and to help govern.

So, contrary to the capitalist objection raised 4 paragraphs back, it seems that socialism is uniquely well positioned to eliminate exploitation. Both socialism and capitalism could, in principle, eliminate forced labor by attenuating the link between income and work. But only socialism can ensure that all work is compensated through common ownership of the social surplus. Thus socialism expunges exploitation from economic life even absent something like a UBI, whereas the same cannot be said of capitalism.

Against this argument, critics might reply that the kind of ‘compensation’ for surplus labor promised by socialism is wholly inadequate. Under capitalism, the worker’s surplus is appropriated by the capitalist; under socialism, the worker’s surplus is appropriated by society. From the worker’s point of view, this may seem a distinction without a difference. Both appropriations rob the worker of effective control over the fruits of her labor. True, under socialism the worker is a member of the group doing the appropriating, but, as merely one of millions of such members, her individual influence over that group is infinitesimal. Is it plausible to regard her tiny sliver of decision-making power over the surplus as ‘compensation’ for her surplus labor? Arguably not, in which case socialism does not actually eliminate exploitation.

6. Why Socialism? Freedom and Human Development

Many socialists point to considerations of freedom, broadly understood, to support socialism over capitalism.

Freedom comes in many varieties. This article will discuss two. Formal freedom involves the absence of interference. Effective freedom involves the presence of capability. A person who is unable to walk has the formal freedom to ascend a steep flight of steps—assuming that no one will interfere with her attempt—but lacks the effective freedom to do so.

a. Formal Freedom

It is sometimes suggested that socialism fares poorly with respect to formal freedom. There are two main grounds for this contention, one historical, the other conceptual.

Historically, many countries claiming to be socialist trampled basic liberties such as freedom of expression and religion. They imprisoned and killed political dissidents and other ‘enemies of the people’. Far from being free societies, they were deeply oppressive ones.

Some critics of socialism suggest that this historical correlation between socialism and oppression was no accident. Rather, it reflects a deep flaw in socialism’s design. Socialism concentrates economic and political power in the hands of the state. Abuse is inevitable under such conditions. Milton Friedman, building on this insight, famously posited a necessary connection between capitalism (which, unlike socialism, disperses economic power rather than concentrating it) and freedom: not all capitalist societies are free, but all durably free societies must be capitalist.

Socialists concede the heart of Friedman’s point, but argue that it does not undermine their position. Friedman, they say, was right to warn against excessive centralization of power. But he was wrong to suggest that socialism necessarily requires such centralization. The contemporary socialist ideal is profoundly democratic and decentralized; it seeks to disperse economic power, not concentrate it. It aspires to an economy and a society controlled from the broad bottom, not the narrow top. So the kind of socialism that contemporary socialists embrace is simply different from the kind of ‘socialism’ targeted by Friedman’s critique. Put differently, Friedman’s worry attacks a view held by very few socialists today—or so it might be argued.

Turning to a different objection, it is sometimes suggested that on purely conceptual grounds socialism is a more restrictive society than capitalism. The argument for this claim is simple. Capitalism permits private ownership of productive assets; socialism does not. Socialism therefore provides less formal freedom than capitalism. It interferes with various economic activities that capitalism allows. Thus, if what you value is formal freedom, then you should prefer capitalism to socialism.

The trouble with this argument, as pointed out by G.A. Cohen, is that it “see[s] the freedom which is intrinsic to capitalism [but not] the unfreedom which necessarily accompanies capitalist freedom” (“Capitalism, Freedom, and the Proletariat” 150). Capitalism does indeed allow some things that socialism forbids: for example, opening a business. But the converse is also true. To use Cohen’s example: I am free to pitch a tent on common land. I am not free to pitch a tent on land that you own privately. Should I try, the state will interfere, thereby reducing my formal freedom. Private property’s effects on formal freedom, then, are not uniformly positive, but mixed. Private property extends formal freedom to owners even as it withdraws it from non-owners. As Cohen writes, “To think of capitalism as a realm of freedom is to overlook half its nature” (“Capitalism, Freedom, and the Proletariat” 152).

Of course, precisely the same can and indeed must be said of socialism. All systems of property, whether capitalist or socialist, exert complex effects on formal freedom; all such systems necessarily distribute both freedom and unfreedom. But in light of this complexity, our guiding question here—which system, capitalism or socialism, provides more formal freedom?—is probably unanswerable. All we can say with confidence is that these systems provide differently shaped zones of formal freedom; each extends formal freedom in some ways while restricting it in others. However, it seems extremely difficult, perhaps impossible, to determine which of these zones is ‘larger’ overall. At the very least, defenders of capitalism must say a great deal more to establish that capitalism is, a priori, a freer society than socialism.

Socialists score this particular fight a draw.

b. Effective Freedom

Whereas socialists tend to play defense regarding formal freedom, they go on offense when discussing effective freedom.

Effective freedom, again, involves the capacity to accomplish one’s ends. This implies but goes beyond formal freedom. Say that my goal is to complete a marathon. One way I can fail to accomplish this goal is by meeting with agential interference. If you physically restrain me from participating in the race, you undermine my effective freedom by undermining my formal freedom. However, effective freedom usually requires much more than the mere absence of interference. I can actually complete a marathon, for example, only if a host of further conditions are in place. Some are broadly social: I must live in a society in which marathons occur. Others are broadly economic: I must be able to afford all the costs associated with training for the race, traveling to the race, entering the race, and so forth. And there are physical or “internal” factors as well. I can’t finish the marathon unless I have sufficient mobility and endurance. All of which is to say that effective freedom depends upon a wide range of factors, many of which have nothing to do with human interference per se.

Now, in a good and just society, which effective freedoms—which “capabilities,” as they are sometimes called—would people have? The typical socialist response runs as follows. At a minimum, everyone must have the effective freedom to meet their basic needs for food, shelter, health care, and so on. With these capabilities in place, people are able to survive. This is a crucial accomplishment, and one demanded by minimal standards of justice and decency. However, a truly good society must set its sights higher; it must enable people not merely to survive, but also to flourish.

And what is human flourishing? Socialists standardly accept a broadly Marxist/Aristotelian account according to which the good life centrally involves not just the passive pleasures of consumption (watching TV, eating tasty food, and so on) but also the more active and enduring satisfactions of “self-realization”, which can be defined as the “development and application of a person’s talents in a way that gives meaning to his or her life” (Roemer, A Future for Socialism 3). Mastering an instrument, playing a sport, solving a physics problem, writing an article, building a shed: these are all examples of potentially self-realizing activities.

Such activities typically have a steep “learning curve” that makes them frustrating at first, but deeply gratifying over the long haul. In this, they contrast sharply with consumption activities, which have the opposite hedonic profile: watching TV is immediately gratifying, but its charms wane with repetition. This contrast is one reason why self-realization is (according to many socialists) more important to human flourishing than consumption. A life replete with consumption but lacking in self-realization becomes stale, cramped, unsatisfying. Indeed, at the limit, it veers towards meaninglessness. It is only through the development and exercise of one’s higher powers and talents that one leads a truly human existence—or so the socialist view has it.

A genuinely good and just society, then, is one in which “the free development of each is the condition of the free development of all,” as Marx and Engels declare in the Communist Manifesto: it is one in which each person has real access to the material, social, and political preconditions for human development and self-realization. But how, precisely, does any of this amount to an argument for socialism? The answer is that socialists typically see capitalism as a serious barrier to self-realization, a barrier that nothing short of socialism can remove. As Jon Elster puts it, capitalism “offers [the opportunity for self-realization] to a few but denies it to the vast majority” (Introduction to Karl Marx 43). Socialism, by contrast, would democratize self-realization, putting it within reach of average, everyday people for the first time in human history—or so it is claimed.

To fill in these claims, consider the material and social preconditions for self-realization. To develop and apply one’s talents in a way that gives meaning to life, one must, at a minimum, have one’s basic needs met. People who are sick, hungry, or homeless are simply not in a good position to develop and exercise their higher talents. However, since capitalism reliably leads to poverty via frequent busts, structural unemployment, downward pressure on wages, and so on—or so socialists will claim—it reliably depresses access to self-realization for a significant portion of the population. Socialism, by contrast, would eliminate poverty and thus would eliminate this potent material barrier to self-realization.

Suppose, however, that basic needs are met: what else is required for self-realization? Time. Now, under capitalism, most people are forced, through lack of private property, to perform wage-labor for a living (see section 5.a). Their days are thus divided into two parts: working time and leisure time. But time spent in a capitalist workplace is, for the vast majority of people, hardly time for self-realization. Capitalist jobs are oriented around the demands of profit, not self-realization. And quite often, the most profitable way to organize work is to “deskill” it: to make it simple, routine, even stultifying (Braverman).

Granted, there are exceptions. Some workers, such as doctors, engineers, college professors, and carpenters, have challenging, complex, autonomous, engaging jobs that help bring self-realization and meaning to their lives. But these are the lucky few. More typical is the experience of, say, assemblers, fast food workers, cashiers, poultry-plant operators, secretaries, human resource clerks, and so on and so forth. Saddled with “alienating” jobs like these, workers work merely to live; as Marx writes, they “feel themselves at home only when they are not working, and when they are working they do not feel at home” (Economic and Philosophic Manuscripts). This is not to demean the people occupying these roles, nor is it necessarily to deny the social importance of these jobs. Rather, it is only to point out that these jobs offer little opportunity to develop and exercise complex talents in a way that brings meaning to life. If capitalism’s armies of cashiers, clerks, and so on are to experience self-realization, then, it will have to be off the clock, during their free time.

Yet here we hit upon a further capitalist barrier to self-realization: its unwillingness to expand what Marx called the “realm of freedom,” leisure, by shrinking the “realm of necessity,” work. Despite ever-rising productivity—more output per unit of labor input—working time rarely declines under capitalism. This is, on its face, rather puzzling. After all, there are, in principle, two ways an enterprise could respond to an increase in productivity. It could keep working hours constant while increasing output, or it could keep output constant while cutting working time. Yet capitalist firms consistently choose the first option over the second; they choose to produce more stuff rather than reduce the working day.
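
A toy calculation, using illustrative numbers not drawn from the article, makes the two options concrete. Suppose productivity doubles in a workplace that had been producing 8 units per worker over an 8-hour day:

\[
\text{keep hours, raise output: } 8\ \text{hours} \times 2\ \tfrac{\text{units}}{\text{hour}} = 16\ \text{units};
\qquad
\text{keep output, cut hours: } \frac{8\ \text{units}}{2\ \text{units/hour}} = 4\ \text{hours}.
\]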

What explains this “output bias”? Profits (Cohen, Karl Marx’s Theory of History Ch. XI). Firms do not make more money by reducing working time; they make more money by increasing output. And so we get, under capitalism, a society chronically short on leisure but drowning in consumer goods; we get the familiar harried rat race, albeit with iPhones.

Now, this mountain of stuff must be sold. This requires spending enormous resources on the “sales effort”—marketing, advertising, and so on—so as to stimulate demand. The result is a highly consumerist society in which many people identify the good life with the life of consumption rather than self-realization. This widespread consumerism may be further promoted by a “sour grapes effect”. Denied self-realization by the alienating nature of work and insufficient free time, the capitalist worker decides self-realization isn’t worth having to begin with. Like the fox in Aesop’s fable, he rejects the tasty fruit he cannot have (self-realization) for the blander fruit within reach (consumption).

In sum, for a variety of interconnected reasons, having to do with its tendency to produce poverty, deskill work, provide inadequate free time, and promote a consumerist orientation, capitalism undermines self-realization and therefore human flourishing. Not, admittedly, for all. But for the vast majority, capitalism renders a rich and meaningful life difficult if not impossible to achieve. Or so the socialist argument goes.

How would things differ in a socialist economy? We have already seen that socialism, by (allegedly) eliminating poverty, would eliminate that particular material barrier to self-realization. Regarding work and leisure, socialists argue that because their system places human beings rather than anarchic market forces in control of the economy, it empowers us to prioritize self-realization and expanded leisure in the design and organization of work. Since we control production, we can tailor it to suit our preferences. If we want better, non-alienating work and more free time, we can get it. Admittedly, this would probably result in lower output. With reduced hours and more engaging labor processes, less stuff would be produced. But from the socialist point of view, this is no great tragedy. Past a certain point, more stuff contributes very little to human flourishing. Once a decent standard of living has been secured, self-realization hinges mainly on access to meaningful work and adequate free time. If the price of securing these things is less stuff, so be it. Fewer iPhones in exchange for more meaningful jobs and no rat race: this is a tradeoff that socialists heartily recommend.

7. Why Socialism? Community and Equality

Capitalism is competitive and cut-throat; socialism is cooperative and harmonious. Capitalism divides; socialism unites, or so many socialists have argued. The crucial value in play in these arguments is “community”.

The concept of “community” admits of at least two different interpretations. The first concerns producers’ motivations: what drives people to wake up each day and go to work? Is it fear and greed, or a desire to serve one’s fellows? The latter is the motivation consistent with community, yet it is relentlessly undermined by capitalism (or so socialists claim). The second sense of community concerns limitations on material inequality. When inequalities in living conditions grow too steep, mutual incomprehension results. People dwell in different worlds. This undermines community (in this second sense), or so it may be argued. These two senses of community, and their fates under capitalism and socialism, will be explored more deeply in what follows.

a. Why Produce? Communal vs. Market Reciprocity

As Adam Smith observed, under capitalism “it is not from the benevolence of the butcher, the brewer, or the baker that we expect our dinner, but from their regard to their own interest” (Book 1, Ch 2). The baker hands over a loaf only because you pay him. Remove the payment, and he removes the bread. So it goes in a market society, for as G.A. Cohen argues, in such a society productive activity is motivated “not on the basis of commitment to one’s fellow human beings and a desire to serve them while being served by them, but on the basis of cash reward” (Why Not Socialism? 39). Market participants operate on the maxim “serve-to-be-served”; they serve only in order to receive service in return. They strive to give as little as possible while getting as much as they can—“buy low and sell high,” as the saying goes. Indeed, the best-case scenario (by market logic’s lights) is to give nothing and get everything.

Market logic thus locks us into deeply anti-social relations. The marketeer, looking at humanity, sees not comrades or brothers and sisters, but customers and competitors. Predator-like, he sees mere “sources of enrichment” and “threats to success”. The former are to be fleeced, the latter crushed. Yet these are horrible ways to relate to other people. Market society may deliver the goods, but it does so only by bringing out some of the worst aspects of human nature. Or so some socialists argue.

But is there an alternative? Cohen asks us to consider how people behave on a camping trip. If A needs help setting up her tent, does B use her need strategically as a means to self-enrichment? Does he ‘drive a hard bargain’, making his support contingent on receiving something in return? No; that’s not how decent people behave in such a context. Rather, in the standard case, B helps A simply because A needs help. Service in response to need: this is what motivates productive activity on a camping trip.

Now, this is not to say that B’s assistance comes entirely without strings attached. B needn’t be a sucker, so to speak; he needn’t continue to help A if A consistently fails to return the favor. On a camping trip, one reasonably expects some degree of reciprocity. Campers thus occupy a sweet spot between anti-social market predation on the one hand and self-denying altruism on the other. Cohen labels this sweet spot “communal reciprocity,” and describes its content as “serve-and-be-served”. Campers acting on this motivation value both sides of the conjunction. They regard it as intrinsically desirable to serve each other, yet they also do expect some degree of reciprocation. As Cohen explains, “the relationship between us under communal reciprocity is not the market-instrumental one in which I give because I get, but the non-instrumental one in which I give because you need, or want, and in which I expect a comparable generosity from you” (Why Not Socialism? 43). Cohen recognizes an important caveat here: the responsibility to reciprocate is conditional upon ability. Thus, there’s no problem with disabled people receiving support without making a contribution in return.

Such behavior is entirely normal and functional on a camping trip. Communal reciprocity clearly “works” in such a context. But can it work on a massive, society-wide scale? Can millions or billions of strangers serve each other, with tolerable economic results, out of fraternity and benevolence rather than greed and fear?

Skeptics cite two main grounds for doubt. The first is human nature: surely people are simply too selfish, greedy and tribal for communal reciprocity to work on a massive scale. Treating your actual brothers as brothers is one thing; treating total strangers as brothers is quite another.

Socialists reply that human nature is complex. We are indeed greedy and competitive, but so too are we generous and cooperative. Economic context powerfully influences which of these traits predominate. Edward Bellamy, an influential 19th century American socialist novelist and thinker, compares human nature to a rosebush (Ch. 26). Put a rosebush in a swamp, and it will appear sickly and ugly. One might conclude that rosebushes are, by ‘nature’, noxious little shrubs. But this would be a mistake. We know that rosebushes are capable of great beauty, given the right developmental conditions. Yet surely, argues Bellamy, the same goes for human beings. Shaped by capitalism, people appear greedy, cramped, and fearful. But this is only because we’re mired in a metaphorical swamp. Put us in the more hospitable soil of socialism and we, like the rosebush, would blossom; we would display all the fellow-feeling, generosity, and cooperative instincts socialism requires.

Human nature, in short, poses no serious obstacle to socialism. ‘Socialist man’ dwells within all of us, waiting only for the right social conditions to emerge.

But these social conditions simply cannot emerge, for they are infeasible: this is the second skeptical objection. Without markets, economies simply do not function tolerably well—witness the failure of Soviet-style planning. In response, Cohen argues that this is just one data point. It would be overly hasty to write off all non-market alternatives simply on the basis of one failed experiment. He admits that socialists face a “design” problem. They do not now know how to power an economy on generosity and fraternity rather than greed or fear. But design problems often turn out to be solvable with enough ingenuity and attention. Non-market socialists do not currently have the answers. But in the fullness of time, they might—or so Cohen argues.

Before turning to a different community-based critique of capitalism, a powerful challenge to Cohen’s argument should be noted. Jason Brennan points out that socialism cannot lay claim to communal reciprocity by definitional fiat. Socialism is just communal ownership of the means of production. Whether this particular way of structuring property fosters communal reciprocity, leading to a generosity and a “serve-and-be-served” mentality on a wide scale, is an entirely empirical question that cannot be answered from the ‘philosopher’s armchair,’ as it were. Yet when we turn to the empirical record, we find little support for Cohen’s position.

If Cohen were right, then we should expect to see an inverse relationship between markets and various pro-social attitudes and behaviors. We should expect to see greater levels of greed, mistrust, and so on as markets expand and deepen. The most marketized societies should also be the most anti-social. But this is not at all what we find. In fact, we find precisely the opposite. Studies cited by Brennan suggest that market exchange promotes various pro-social attitudes such as trust, fairness, and reciprocity. Brennan concludes that Cohen has it backwards: if we wish to spread camping trip values across society, we should embrace markets, not reject them.

Notice that Brennan’s critique (if sound) damages only non-market varieties of socialism. It does not undermine (and indeed actually provides some support for) market versions of the same. (Market socialism is discussed further in 8.c.)

b. Justice, Inequality, Community

This article has not said very much about equality as a socialist ideal. This may surprise some readers. Isn’t equality of condition one of socialism’s central aims? Indeed, socialism’s allegedly uncompromising egalitarianism is sometimes used as the basis for a reductio ad absurdum of the socialist position. The reductio runs like this: according to socialism everyone must be equal; one way to do this is to ‘level down’ the better off; but this is morally repugnant; so socialism must be rejected. One thinks here of Kurt Vonnegut’s famous anti-egalitarian tale “Harrison Bergeron”, in which an equality-obsessed government knocks the noggins of the more intelligent, bringing them into alignment with their IQ-disadvantaged peers.

The reductio fails because socialists do not advocate equality of condition, at least not in any straightforward sense. Much light has been shed on this issue by the now-voluminous philosophical literature on egalitarianism. Of particular import is the work by philosophers like Richard Arneson and G.A. Cohen on the question: “equality of what”? Insofar as leftists seek equality, what is it that they wish to equalize? Standard options include 1) resources, 2) welfare, 3) opportunities for resources, and 4) opportunities for welfare.

Most philosophers agree that the first two options are non-starters. Equalizing outcomes (as 1 and 2 would do) improperly ignores personal choice and responsibility. The point is nicely illustrated by Aesop’s fable of the grasshopper and the ant. Stipulate that both bugs know that winter is coming, and that both have the capability, that is, the effective freedom, to build a house and to gather adequate supplies. That is to say, both have equal opportunity to provision themselves. Yet only industrious ant chooses to use this opportunity; carefree grasshopper decides to dance and play instead. Fast forward to winter: there sits ant in his house, warm and well-fed, while grasshopper shivers hungrily outside. Now, no matter which metric we use—resources or welfare—ant is clearly much better off than grasshopper. Between the two bugs, a very significant inequality of condition obtains. But does this inequality constitute an injustice?

Interestingly, many socialists would answer ‘no’ to this question. These socialists hold a “responsibility-sensitive” form of egalitarianism sometimes called “luck egalitarianism”. On this view, outcome inequalities (whether measured in resources, welfare, or some other metric) are just if and only if they “reflect the genuine choices of parties who are initially equally placed and who may therefore reasonably be held responsible for the consequences of those choices” (Cohen, Why Not Socialism? 26). By luck egalitarian lights, then, the grasshopper/ant inequality is perfectly just since it reflects divergent choices rather than differences in unchosen circumstances. (Circumstantially, the bugs were identically placed. Both could have prepared for winter. But only ant chose to do so.)

Notice that the luck egalitarian would reach a different verdict if grasshopper and ant were unequal because (say) grasshopper was born without legs, and thus couldn’t provision himself. Then it would be unjust for him to go without food or shelter. For that outcome would reflect factors beyond his control, namely, his unchosen disability, in violation of the luck egalitarian standard.

In sum, contemporary socialist “luck egalitarians” have a nuanced view on equality and justice. Opportunities (for resources, welfare, or whatever) must be equal. But outcomes may be unequal provided that these inequalities are due to choices rather than circumstances. In a socialist society, then, grasshopper’s shivering does not necessarily signal injustice.

It might, however, signal a different moral defect: namely, a breach of community or compassion. Socialists aspire to a social world within which people care about and (when necessary) care for one another. Dramatically different living conditions put this regime of mutual comprehension, concern, and caring in jeopardy. Condemned to the wintry cold, grasshopper would face trials beyond ant’s understanding. The two bugs would come to dwell in different worlds. Whatever fellow-feeling or mutual concern previously marked their relations would vanish, leaving only a gulf of indifference and estrangement. This is no way for socialist comrades to live: not because it would be unjust (by hypothesis, it would not) but because it would be insufficiently fraternal and compassionate. Cohen concludes that “certain inequalities that cannot be forbidden in the name of [justice] should nevertheless be forbidden, in the name of community” (Why Not Socialism? 37). On this line, it would be just, but not justified, for ant to bar his door.

Are we back to the Harrison Bergeron reductio, then? Does socialism implausibly require absolute equality of condition after all? No, for two reasons.

First, not all inequalities undermine community. Perhaps ant must, in the name of community, provide grasshopper with some of his food and shelter. But does community require him to split his possessions down the middle? Surely not. The point is that while extreme inequalities may place community under strain, more modest ones might not.

Second, Cohen declares, without much argument, that the demands of community trump those of justice. But this ranking may be contested. Why shouldn’t justice trump community, at least occasionally? Perhaps just inequalities should sometimes be allowed to stand even if they undermine community.

8. Institutional Models of Socialism for the 21st Century

What, in practice, would a socialist society actually look like? What concrete institutions and policies—political, economic, and social—would it use to organize, motivate, and direct economic activity? It is difficult to assess the desirability of socialism without answering these questions. The normative case for socialism depends, at least in part, on the attractiveness and feasibility of its institutional vision. More prosaically: even if one is convinced of the abstract philosophical arguments canvassed in section 4, one still has to know what socialism would really be like in order to tell whether one wants it.

This section discusses three broad institutional models of socialism for the 21st century: central planning, participatory/democratic planning, and market socialism. All three models, being socialist, reject private ownership of the means of production in favor of social ownership. But beyond this important point of commonality, many significant differences emerge, especially concerning a) whether planning should be centralized or decentralized, and b) the appropriate role of markets in a socialist economy.

a. Central Planning

Throughout the 20th century, the standard socialist answer to the question “if not capitalism, then what?” was centrally-planned socialism.

Under central planning, “production is organized and coordinated within an administrative hierarchy, with decisions being made at the center and passed down through intermediate levels of the hierarchy to the production units” (Devine 55). Political authorities at the top of this hierarchy decide on broad economic objectives—build up heavy industry, satisfy consumer preferences, develop a backward region, and so on. Central planners then generate a concrete plan to achieve these objectives. To this end, they first gather a massive amount of information. Tens of thousands of enterprises inform planners of their productive capabilities and input requirements; millions of consumers communicate their consumption preferences. With this information in hand, planners, through a complex, multi-stage, “iterative” process, arrive at an overall plan for the economy that sets specific production targets for each enterprise. (Factory A, produce X shoes; factory B, produce Y amount of steel, and so on.) The center sends these orders to enterprise managers, who then devise more specific labor processes through which their workers produce the ordered goods in the right way at the right time. To the extent that the overall plan is fulfilled, sufficient resources are produced to meet whatever broad objectives political authorities have chosen, and the economy ‘works’.

What is to be said in favor of central planning? In theory, quite a bit. Central planning replaces capitalism’s anarchic market production for profit with planned production for use. It therefore promises to eliminate all those problems that socialists associate with private property, markets, and the pursuit of profits—problems like economic instability and poverty, class conflict and exploitation, various barriers to “real freedom” and self-realization, such as alienating labor and inadequate free time, lack of community and solidarity, and unjust economic inequalities. Freed of these capitalist pathologies, a centrally planned society would be classless, prosperous, and harmonious; it would be a society in which “the free development of each is the condition of the free development of all” (Marx and Engels Ch. 2).

Or so the story goes. However, critics allege that in practice central planning performs poorly. There are two problems worth pulling apart here.

The first is economic. Although centrally planned economies eliminate the worst forms of poverty, they do not produce generalized affluence. Under central planning, innovation is sluggish. Product quality is low. Shortages and hoarding are common. Work effort is lacking. These defects stem from underlying information, calculation, and incentive problems. Central planners, critics argue, cannot know what people want, or what producers are able to produce, with sufficient accuracy; nor, even if they could, would they be able to use this massive quantity of information to calculate a coherent overall plan; nor, even if they could calculate such a plan, would they be able to incentivize managers and workers to follow it faithfully.

The second problem with central planning is normative. Central planning, critics say, does not lead to an egalitarian, classless utopia, but to an authoritarian, undemocratic society dominated by a “coordinator class” of political elites, planners, and enterprise managers. Indeed, the basic logic of the system guarantees that central planning is a “road to serfdom” (in Hayek’s famous phrase) rather than a route to democratic empowerment. As one critic explains, “Central planners gather information, calculate a plan, and issue ‘marching orders’ to production units. The relationship between the central agency and the production units is authoritative rather than democratic, and exclusive rather than participatory” (Albert 52). Information flows up the hierarchy; orders flow down. Central planners decide; everyone else obeys. This seems rather far from the “radical empowerment” envisioned by many socialists.

Indeed, central planning’s economic and normative failings are related; the latter compound the former. It is partly because central planning alienates and disempowers workers that it performs so poorly qua economic system. Workers, so treated, expend little effort at work, ignore orders, under-report their productive capabilities, over-report their output, and so on.

Persuaded by these objections, most socialists today reject central planning, holding that it simply doesn’t work sufficiently well and that it comes at too steep a cost to democratic empowerment and freedom. But if they reject central planning, what do they propose to put in its place? There would seem to be only two options: either socialists rehabilitate planning by decentralizing and democratizing it, or they make peace with the market. The first route leads to some form of “participatory planning”; the second, to “market socialism”. The next two subsections explore these models in greater detail.

b. Participatory Planning

Perhaps the problem with central planning has to do with centralization rather than planning: this is the core thought behind “participatory” or “democratic” planning. Advocates of this approach include Pat Devine, Michael Albert, and Robin Hahnel. Because Albert and Hahnel’s model, called “participatory economics,” or “parecon” for short, is especially well developed, this article shall take it as representative of the broader participatory planning approach.

i. Parecon: Basic Features

Parecon rests on five main institutional proposals:

  1. Social ownership of the means of production
  2. Democratic workplaces
  3. Balanced “job complexes”
  4. Remuneration according to effort, sacrifice, and need
  5. Economic coordination based on comprehensive participatory planning, using a complex system of nested “worker” and “consumer” councils

We may move quickly through the first and fourth of these proposals. Social ownership means simply that nobody in particular owns the means of production; rather, “we all own [them] equally, so that ownership has no bearing on the distribution of income, wealth or power” (Albert 9). What does bear on the distribution of income in a parecon is effort and sacrifice (112-117). The underlying rationale here is luck egalitarian (see 7.b). People, Albert and Hahnel argue, should be rewarded or penalized only for those things under their control. But the only thing that people control is their level of effort and sacrifice. Therefore, they should be rewarded and penalized only for their level of effort and sacrifice. Those who work harder or longer should enjoy greater consumption opportunities than those who work less hard and/or less long. There is an exception here: people who are unable to work will be provided with an average income.

Proposals 2, 3, and 5 require more extensive discussion.

Democratic workplaces. Parecon takes as one of its core values the idea of “democratic self-management,” which implies that “each actor in the economy should influence economic outcomes in proportion to how those outcomes affect him or her”. In other words, “Our say in decisions should reflect how much they affect us” (Albert 40). This norm implies that decisions affecting only a given individual should be left entirely to that individual. But other decisions have broader consequences, and are therefore appropriate objects of democratic choice. Many workplace decisions fall into this “other-affecting” category. Albert and Hahnel propose a complex system of nested “councils” for handling such decisions. Some workplace decisions will be entirely up to individual workers; others, assigned to work-teams; still others, to the enterprise as a whole. Indeed, since some workplace decisions affect people beyond the workplace’s four walls, such as consumers, some method for granting an appropriate degree of influence to these affected external actors must be found. Albert and Hahnel propose democratic “consumer councils” and “industry councils”. More will be said about this system of democratic council coordination below.

Balanced job complexes. Parecon proposes to radically remake the division of labor by creating “balanced job complexes” in which “the combination of tasks and responsibilities each worker has would accord them the same empowerment and quality of life benefits as the combination every other worker has” (Albert 10). This is, of course, very far from how occupations are structured currently. Under capitalism, considerations of profit and class power largely determine the way in which different productive tasks are bundled into jobs. The result is a division of labor that assigns routine, boring, disempowering (but profitable) work to the many, while reserving varied, complex, empowering work for the privileged few.

Parecon rejects this inequitable division. It does so on grounds of fairness: why should interesting and enjoyable work be hoarded by some rather than shared by all? It also objects on democratic grounds: unequal division of empowering work “inexorably destroys participatory potentials and creates class differences” (Albert 104). Any workplace with, say, janitors and managers will be a de facto hierarchy, even if it is, on paper, democratically organized. In the name of fairness and democracy, then, we must transform the division of labor so that “every individual [will] be regularly involved in both conception and execution tasks, with comparable empowerment and quality of life circumstances for all” (Albert 111).

ii. Allocation in Parecon: Economic Coordination Through Councils

This brings us to feature 5: economic coordination through councils. Every economy must decide what gets produced and consumed, and in what quantities. This is the problem of allocation. Market systems solve this problem through decentralized, voluntary, self-interested competition and exchange between buyers and sellers. Recall Adam Smith’s baker, who makes bread not because someone tells him to, but because by making bread, he can make money through trade. Centrally-planned economies solve the allocative problem by handing it over to a small group of economic elites, who craft a comprehensive plan for the entire economy and issue binding instructions for realizing it. Moscow decides that X amount of shoes will be produced, Y amount of steel, and so on, and enforces these demands on lower levels in the economic hierarchy.

Parecon rejects both approaches. In place of markets and central planning, it proposes a system of nested worker and consumer councils, through which individuals cooperatively generate an overall plan for production and consumption. Albert and Hahnel call this system “decentralized participatory planning.”

Simplifying greatly, its basic gist is this. To figure out what people want to consume, we ask them. To figure out what they are willing to produce, we ask them. We then aggregate all these responses and compare proposed supply with proposed demand. If they don’t match, we close the gap through democratic negotiation conducted on a footing of equality. Through such negotiation, we eventually reach, say, five feasible plans. We put them to a vote and implement the winner. Decentralized participatory planning thus promises to solve the allocative problem without hierarchy or markets.

The system features several key participants: first, worker councils (and federations thereof—for example, a “software industry council,” a “farming council,” and so forth); second, consumer councils (and federations thereof—for example, neighborhood councils, city-level councils, state-level councils, and so on); and third, the “iteration facilitation board,” or IFB. The IFB initiates the planning process by announcing provisional prices for all inputs and outputs. Importantly, these prices should reflect the “full social costs and benefits” associated with these inputs and outputs, including opportunity costs and externalities, whether positive or negative. Albert and Hahnel see this as a key difference from market systems. In a parecon, prices will accurately track the true social costs of production. Prices rarely do this in market societies. Think, for instance, of the absurdly low price of gasoline in the United States, despite the ecological and economic costs of its widespread production and use.

With these provisional prices in hand, each economic actor—individuals and councils and federations of councils—proposes both a) a consumption plan and b) a production plan. The former specifies what the actor would like to consume during the period being planned (the upcoming year, say); the latter specifies which outputs the actor proposes to produce, and the inputs they will require to do this. Plans will go to appropriate councils for approval. Thus, a family might submit their consumption plan to the neighborhood consumption council, while a worker might submit her plan to her work-team or to the larger workplace council.

On what basis are proposals approved or rejected? Individual consumption proposals should be approved if the person’s income is equal to or greater than the total cost of the goods requested. Income, remember, is a function of one’s effort and sacrifice at work. Higher-level consumption proposals (for a neighborhood, say) should be approved if the group’s income (minus the costs of members’ personal consumption) suffices to cover the costs of the requested items. The underlying thought here is that a community’s overall consumption should correspond to the amount of effort its members expend producing goods and services for society: the harder the community works, the more it is entitled to consume. Turning to production proposals, these are evaluated by comparing the estimated social benefits of the goods and services produced with the estimated social costs of producing them. If the estimated benefits exceed the estimated costs, then the production proposal should be accepted; if the costs exceed the benefits, then the proposal does not represent a responsible use of societal resources, and it is sent back for revision.
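The approval rules just described reduce to two simple comparisons. The following Python sketch is purely illustrative; the function names and figures are hypothetical, and nothing in Albert and Hahnel’s proposal specifies them.

```python
# Illustrative sketch of the two approval rules described above.
# All names and figures are hypothetical.

def approve_consumption(income: float, requested_cost: float) -> bool:
    # An individual consumption proposal is approved when the person's
    # effort- and sacrifice-based income covers the cost of the request.
    return income >= requested_cost

def approve_production(social_benefits: float, social_costs: float) -> bool:
    # A production proposal is approved when its estimated social benefits
    # at least match its estimated social costs; otherwise it is revised.
    return social_benefits >= social_costs

print(approve_consumption(income=30_000, requested_cost=28_500))     # True
print(approve_production(social_benefits=95.0, social_costs=120.0))  # False: back for revision
```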

The IFB aggregates all approved proposals and determines whether projected supply matches projected demand. Barring a miracle, it won’t, not at this stage. So the IFB recalculates prices in light of the mismatch between supply and demand, raising prices for goods with excess demand, lowering prices for those with excess supply, and sends the plans back to their originators for revision. Using the new prices, consumers and producers tweak their proposals, perhaps shifting demand to lower-priced goods and increasing supply of goods with high prices. They then send these revised proposals to the relevant councils, which evaluate them as before. Eventually, all approved proposals make their way to the IFB, which recalculates overall supply and demand to see if they match. If they do, then the process is over; if they don’t, then another round of revisions is required. If the process ends with multiple feasible plans, then society votes to determine the winner.
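The back-and-forth between the IFB and the councils can be thought of as an iterative price-adjustment loop. The toy sketch below is not Albert and Hahnel’s actual procedure (in which councils revise whole proposals, not merely react to prices); it only illustrates the basic logic of raising provisional prices for goods in excess demand and lowering them for goods in excess supply, with hypothetical linear supply and demand responses standing in for the revised proposals.

```python
# Toy illustration of the iterative adjustment logic described above.
# The supply/demand functions and all numbers are invented for illustration.

def plan_iteratively(goods, rounds=500, step=0.005, tolerance=1.0):
    """goods maps a name to a (supply_fn, demand_fn) pair; prices start at 1.0."""
    prices = {g: 1.0 for g in goods}
    for _ in range(rounds):
        balanced = True
        for g, (supply_fn, demand_fn) in goods.items():
            excess_demand = demand_fn(prices[g]) - supply_fn(prices[g])
            if abs(excess_demand) > tolerance:
                balanced = False
                # Raise prices of over-demanded goods, lower over-supplied ones.
                prices[g] += step * excess_demand
        if balanced:
            break  # proposed supply and demand roughly match: a feasible plan
    return prices

goods = {
    "shoes": (lambda p: 100 * p, lambda p: 150 - 20 * p),
    "steel": (lambda p: 80 * p,  lambda p: 200 - 60 * p),
}
print(plan_iteratively(goods))
```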

iii. Evaluating Parecon

This is, to be sure, an incredibly complex procedure; indeed, it is much more complex than this brief sketch indicates. Even Albert and Hahnel admit that it will take multiple rounds of negotiation, and no small amount of paperwork, to arrive at a feasible plan. But the hope is that “as every individual or collective worker or consumer participant negotiates through successive rounds of back and forth exchange of their proposals with all other participants, they alter their proposals to accord with the messages they receive, and the process converges” (Albert 128). And this, notice, without markets or central planning:

There is no center or top. There is no competition. Each actor fulfills responsibilities that bring them into greater rather than reduced solidarity with other producers and consumers. Everyone is remunerated appropriately for effort and sacrifice. And everyone has proportionate influence on their personal choices as well as those of larger collectives and the whole society (Albert 128-129).

The absence of hierarchy is worth emphasizing. Although there is an IFB, and although one’s consumption and production proposals must be approved by others, the overall distribution of power is web-like rather than hierarchical. No one occupies any special position of authority. Economic decisions are not dominated by the wealthy (as under capitalism) or the politically connected (as under central planning). Instead, all economic decision-making is radically democratic and open to negotiation: each person has a say over decisions that affect him or her, proportional to the degree to which he or she is affected. Parecon may have important flaws, but inadequate respect for democratic values would not seem to be among them. Indeed, it is hard to imagine a system more faithful to the core socialist commitment to bottom-up, democratic control over the economy. This, surely, is parecon’s chief virtue from the socialist perspective.

But would it work? Some commentators are skeptical. Erik Olin Wright writes:

The information complexity of the iterated planning process described in Parecon might in the end simply overwhelm the planning process. Albert is confident that with appropriate computers…this would not be a problem…Perhaps he is right. But he may also be horribly wrong. As described…the information process seems hugely burdensome (264).

Defenders of parecon reply to such worries in several ways. First, they argue that critics overestimate the amount of planning that parecon requires. Setting up one’s initial consumption proposal may well take lots of time and energy. But with that initial investment made, planning in subsequent years should be much quicker. One can simply base future plans off of the original one, tweaking here and there as necessary.

Nor need one specify in great detail what one wants to consume. “For planning purposes,” writes Albert, “we need only request types of goods, even though later everyone will pick an exact size, style, and color to actually consume” (130). One’s consumption proposal must “express preferences for socks, but not for colors or type of socks; for soda, books, and bicycles, but not for flavors, titles, or styles of each” (217). (Of course, critics may find new grounds for concern here. Because consumption preferences tend to be rather fine-grained—a person wants to read a dystopian science fiction novel, not some generic book; she wants wild-caught salmon, not “food” or even “fish”—parecon seemingly faces a dilemma. If people do not request specific items, then many desired items will be in short supply, and consumer satisfaction will be low. If, on the other hand, people do request specific items, we’re thrown back on the original worry: isn’t this planning process unworkably cumbersome?)

A second reply to the infeasibility worry is that other economic systems require paperwork and planning, too. Under market socialism (and capitalism), consumers must make budgets, do their taxes, pay bills, go shopping, and so on. Enterprises must decide what they will make and in what quantities. They must also make various personnel decisions, deciding who will work with whom, for how long, on which projects, and so on. Added up over the course of the year, the amount of time spent on such activities is far from trivial. Indeed, one might argue that total planning time will be roughly constant across market- and participatory-planning-based systems.

Finally, suppose that parecon does require a substantial amount of time and energy, or perhaps, more time and energy than alternative systems. Still, these costs must be judged against the potential benefits. Parecon promises a more equal, fraternal, just, democratic society. Is it even remotely reasonable to reject such a society on the grounds that it requires too much paperwork?

In sum, defenders of parecon argue 1) that their proposal won’t prove nearly as burdensome in practice as critics fear; indeed, 2) one may reasonably doubt that it is any more burdensome than other systems; and 3) even if parecon does prove burdensome both absolutely and comparatively, surely the sacrifice is worth the result.

c. Market Socialism

Suppose one rejects central planning, but also doubts the feasibility (or perhaps even the desirability) of parecon-style participatory planning. Must one therefore reject socialism? No, not according to “market socialists” such as John Roemer, David Schweickart, David Miller, Erik Olin Wright, Tom Malleson, and others.

On the traditional view, socialists must, by definition, be opposed not simply to private property, but also to markets. Market socialists disagree. On their view, socialism requires only a certain form of ownership, namely, social rather than private ownership. About markets, socialists should be open-minded. Markets are just tools for communicating information and motivating economic activity. Like any tool, they should be evaluated instrumentally. Do they work better than alternatives? If so, then socialists should embrace them.

And indeed, market socialists characteristically argue that markets do work better than the alternatives: just look at the economic record. This is not to say that markets are perfect, nor is it to say that they should be allowed to operate ‘freely,’ without any constraints. Market regulations are integral to the market socialist vision. Market socialists are no kind of market fundamentalists. Rather, they view themselves as pragmatists. They see the evils of capitalism, but they also see the problems with planning-based socialist alternatives. The way forward, they argue, is to take the good parts of capitalism and combine them with the good parts of socialism. This will displease fundamentalists on both sides, but what alternative is there? Capitalism is a moral disaster. Central planning was worse. Participatory planning is a pipe dream. 21st century leftists must fuse socialism with markets. There is no other way. Or so market socialists argue.

i. Schweickart’s “Economic Democracy”

To further explain the market socialist position, this article will present David Schweickart’s market socialist model, “Economic Democracy” (ED for short). (For a recent proposal very similar in spirit to Schweickart’s see Malleson 2015. For other important developments of market socialism, see Roemer 1994, Miller 1989, and Carens 1981). Boiled down to essentials, ED has three main features: worker self-management, the market, and social control of investment.

Worker self-management: “Each productive enterprise is controlled by those who work there” (After Capitalism 49). Workers together decide all aspects of production: what to make, how to make it, workplace policies, compensation, and so on. This does not preclude the use of managers or experts. In large firms especially, some delegation of authority will almost certainly prove necessary. Schweickart suggests that workers might elect a workers’ council which will then appoint executives, managers, and so on.

The market. In stark contrast with planning-based forms of socialism, ED solves the problem of allocation using market competition between profit-seeking enterprises. ED’s enterprises start with a sum of money (M). They use this money to buy productive inputs on the market, which they transform into commodities. They then compete with other enterprises to sell these commodities to consumers or other enterprises, thereby ending up with a new amount of money (M’). (Prices are determined mainly by market forces of supply and demand, although price regulations may sometimes be appropriate: again, Schweickart is no market fundamentalist.) In the normal case M’ > M, indicating that the enterprise has turned a profit. Indeed, turning a profit is the immediate aim of production in ED: enterprises produce to make money, not (primarily) to satisfy human needs. As Schweickart says, “profit is not a dirty word in this form of socialism” (After Capitalism 51).

This may sound rather close to capitalism, but in fact there is an important difference here. Under capitalism, profits go to owners, while workers receive wages. Under ED, by contrast, there are no wages; rather, “workers get all that remains once nonlabor costs…have been paid” (After Capitalism 51). Precisely how workers divvy up the enterprise’s surplus is up to them. In theory they could split it equally. But given the need to outcompete other enterprises—hence, to attract and retain skilled labor—some degree of inequality is likely to be chosen. More productive workers, or workers with skills in higher demand, will almost certainly earn more than their fellows. Notice the difference here from parecon, which links income to effort rather than contribution or other “morally arbitrary” factors beyond the agent’s control. Empirical evidence suggests that self-managed firms (like those in the Mondragon cooperative in Spain) opt for a 4 or 5:1 ratio between the incomes of the highest- and lowest-paid employees: quite a dramatic difference from the 300:1 spread typical in large capitalist corporations.

Social control over investment. This is the most clearly “socialist” piece of Schweickart’s model. In an ED, the means of production belong to all members of society, not to the enterprises that happen to deploy them. To reflect this social rather than sectional or private ownership, all enterprises must pay a capital assets tax. “This tax,” writes Schweickart, “may be regarded as a leasing fee paid by the workers of the enterprise for use of social property that belongs to all” (After Capitalism 52). Revenues from this tax constitute the national investment fund, which is the sole source of investment money in ED. By tweaking the tax rate, society can determine the size of the national investment fund—hence, the amount of money available for investment, and thus the overall level of economic growth and development.

Note the contrast with capitalism: under capitalism, most investment comes from private rather than public sources. Both the amount and direction of economic development therefore depend on the whims and abilities of private investors. This leads to the boom and bust cycle discussed in section 3, as well as other pathologies such as excessive growth, ecological devastation, underdeveloped regions alongside overdeveloped ones, unemployment, poverty, and all the rest. Under ED, by contrast, investment is democratically controlled by all members of society. In theory, this should enable “more rational, equitable, and democratic development than can be expected under capitalism” (After Capitalism 52)—a point that will be explained further below.

How, specifically, should social control over investment be institutionalized? There are many options. At one extreme, society might opt for a planning-heavy system in which a democratically accountable planning board draws up a plan for all new investment, which Schweickart estimates would constitute about 15% of GDP, and allocates funds accordingly. For example, the planning board might decide to prioritize renewable energy, or consumer goods, or whatever. At the other extreme, society might prefer a laissez-faire model in which funds are channeled through public banks to enterprises using essentially the same criteria that capitalist banks use: namely, profitability. In this version of ED, market forces would largely determine the pattern of investment.

Schweickart himself proposes something in the middle of these two options. Funds should go to regions (for example, Texas) and communities (for example, Fort Worth) on a per capita basis. If the Fort Worth region has the same population as Silicon Valley, then it will receive the same amount of investment. In ED, then, there will be no economic backwaters, no regions or communities left behind. Nor will regions or communities be locked into a destructive neoliberal “race to the bottom” to attract investment dollars. Each community receives its “fair share” no matter how business-friendly (or unfriendly) its policies.
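The arithmetic behind this allocation is straightforward. Here is a back-of-the-envelope sketch with invented figures: a flat capital assets tax (the rate is hypothetical) fills the national investment fund, which is then divided among regions strictly by population.

```python
# Illustrative arithmetic only; the tax rate, firms, and populations are invented.

def regional_investment(capital_assets_by_firm, tax_rate, population_by_region):
    fund = tax_rate * sum(capital_assets_by_firm.values())  # national investment fund
    total_pop = sum(population_by_region.values())
    # Per capita allocation: equal populations receive equal shares of the fund.
    return {region: fund * pop / total_pop
            for region, pop in population_by_region.items()}

print(regional_investment(
    capital_assets_by_firm={"firm_a": 4_000_000, "firm_b": 6_000_000},
    tax_rate=0.05,
    population_by_region={"Fort Worth": 900_000, "Silicon Valley": 900_000},
))
```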

Once distributed to regions and communities on a per capita basis, investment funds are channeled to regional and community enterprises by public banks. Enterprises in need of investment (say, to expand production) apply to area banks for funds. Banks assess applications on the basis of a) profitability, b) job creation, and c) any other democratically chosen criteria, such as ecological impact. This mixed standard implies that while profitability matters, it is not all that matters. Projects that further socially chosen goals may be selected over more profitable but less socially desirable alternatives.

Summing up, Schweickart’s model strategically transplants certain core elements of capitalism into a broadly socialist framework. We get markets and profit-seeking enterprises, but also workplace democracy and social control over investment. The result, Schweickart argues, is an economy that outperforms all rivals—whether socialist or capitalist—in terms of values dear to socialist hearts, such as equality, economic stability, human development, democracy, and environmentalism. ED thus promises to deliver “socialism that would really work,” to quote the title of one of Schweickart’s early articles on the topic.

ii. Evaluating Economic Democracy

Perhaps it would really work, but would it be socialism? This, in essence, is the main criticism of Schweickart’s model (and of market socialism more generally).

That his proposal would work—that it is feasible—seems relatively uncontroversial. Markets work. Self-managed enterprises work, as illustrated by decades of empirical evidence. Social control over investment is the only truly novel piece of Schweickart’s model, but its basic mechanisms—the capital assets tax, the national investment fund, the system of public banks allocating funds on the basis of profitability as well as other socially chosen considerations—raise no obvious feasibility worries. Granted, neoclassical economists will complain that because ED regulates and interferes with markets in various ways, it sacrifices efficiency. But “less than perfectly efficient” does not mean “infeasible”. And efficiency isn’t the only thing we want from an economy anyway. Better to sacrifice some efficiency, Schweickart would argue, for gains in employment, more equitable development across regions, greater democratic empowerment at work, and so on. So all things considered, market socialism seems eminently feasible. This is perhaps its greatest selling point.

But is it desirable? Critics right and left will argue that it is not. Those on the right will complain that ED limits basic economic freedoms, such as the formal freedom to own the means of production, to hire wage labor, and to run a business in an undemocratic fashion. Market socialists will reply that not all formal freedoms are worth protecting. They will further suggest that ED will enhance effective economic freedom for the vast majority, even if this means diminishing economic freedom, both formal and effective, for those elites who would, absent ED, enjoy greater workplace control and authority. Under capitalism, most workers control no productive property and enjoy no real say over their work. Economic power is monopolized by a tiny class of owners. Under ED, by contrast, economic hierarchies are flattened. Economic power within the enterprise is distributed equally to all workers on a one worker, one vote basis. Consequently, everyone has the effective freedom to shape workplace decisions. Seen from this angle, ED enhances rather than reduces economic freedom.

Market socialism attracts critical fire from the left as well as the right. It is a strange form of socialism indeed, leftist critics will argue, that features anarchic market production for profit rather than planned production for use. With markets and profits come competition, greed, fear, and the diminution of community; with markets and profits come consumerism, ever-expanding hours of work, and the ecologically insane desire for never-ending economic growth.

Schweickart replies that these worries are overblown. Yes, ED features competition; yes, there will be advertising and some degree of consumerism; yes, enterprises may, under certain circumstances, seek to grow. But the details make a difference. Competition, consumerism, and economic growth are all held in check in ED by countervailing forces. Social control over investment means that we can democratically determine the overall rate and direction of economic growth. We can prioritize environmental aims, for instance, over the rapacious quest, so characteristic of capitalism, for additional output at whatever cost. Workplace democracy means that we can choose shorter working hours in exchange for reduced consumption opportunities. Moreover, because democratic firms seek to maximize profit per worker (rather than total profit, as do capitalist firms), they will not expand as aggressively as their capitalist counterparts. But reduced expansion means less output that needs to be sold, which, in turn, reduces demand for advertising and marketing. In short, for all of these reasons, ED is absolutely compatible with the socialist vision of a less-consumerist, more leisurely, ecologically sane world, or so defenders of market socialism would argue.

Indeed, market socialists would draw a more general lesson here. From the fact that markets in a capitalist context lead to undesirable effects X, Y, or Z, we cannot automatically infer that they would lead to X, Y, or Z in the dramatically different political-economic framework of market socialism. Maybe they would, but maybe they wouldn’t. The only way to tell, insist market socialists, is to work carefully through the details.

9. References and Further Reading

  • Albert, Michael. Parecon: Life After Capitalism. London: Verso, 2003.
    • Presents Albert (and Hahnel’s) participatory planning model of socialism.
  • Albert, Michael, and Robin Hahnel. Looking Forward: Participatory Economics for the Twenty First Century. South End Press, 1991.
    • An early statement of Albert and Hahnel’s participatory planning model of socialism.
  • Arneson, Richard. “Equality and Equal Opportunity for Welfare.” Philosophical Studies 56 (1), 77-93, 1989.
    • A canonical statement of the “luck egalitarian” position.
  • Bellamy, Edward. Looking Backward. Dover, 1996 [1888].
    • A utopian novel, widely acclaimed in its day, depicting political, economic and social arrangements in socialist Boston, some 100 years after a successful revolution.
  • Braverman, Harry. Labor and Monopoly Capital: The Degradation of Work in the Twentieth Century. 25th Anniversary Edition. New York: Monthly Review Press, 1998 [1974].
    • Important Marxist analysis of work, according to which the imperatives of profit-maximization force capitalists to simplify and routinize labor processes, thereby degrading work.
  • Brennan, Jason. Why Not Capitalism? New York: Routledge, 2014.
    • A sharp parody of and rejoinder to G.A. Cohen’s Why Not Socialism? that defends capitalism on moral (rather than pragmatic) grounds.
  • Carens, Joseph. Equality, Incentives, and the Market: An Essay in Utopian Politico-Economic Theory. Chicago: University of Chicago Press, 1981.
    • Describes a market socialist economic system that—unlike capitalist and non-market socialist alternatives—fully realizes the values of equality, freedom, and economic efficiency.
  • Cohen, G.A. “The Structure of Proletarian Unfreedom.” Philosophy and Public Affairs, Vol. 12, No. 1, 3–33, 1983.
    • Argues that workers are individually free (since they are not forced to work for capitalists) but not collectively free (since few workers can escape proletarian status at any given time).
  • Cohen, G.A. History, Labour, and Freedom: Themes From Marx. Oxford: Clarendon Press, 1988.
    • Collection of Cohen’s essays on Marxist themes.
  • Cohen, G.A. “On the Currency of Egalitarian Justice.” Ethics 99 (4), 906-944, 1989.
    • Important statement of luck egalitarianism.
  • Cohen, G.A. Karl Marx’s Theory of History: A Defence. Expanded edition. Princeton, NJ: Princeton University Press, 2000.
    • Cohen’s classic reconstruction and qualified defense of Marx’s theory of history, “historical materialism”. Widely regarded as a founding text of the so-called “Analytical Marxism” movement.
  • Cohen, G.A. Why Not Socialism? Princeton: Princeton University Press, 2009.
    • Argues that—bracketing issues of feasibility—socialism is morally desirable, but concedes that socialists do not know whether socialism is feasible.
  • Cohen, G.A. “Capitalism, Freedom, and the Proletariat.” In G.A. Cohen, On The Currency of Egalitarian Justice and Other Essays. Princeton: Princeton University Press, 2011.
    • Analyzes freedom under capitalism, arguing that private property restricts formal freedom in underappreciated ways.
  • Devine, Pat. Democracy and Economic Planning. Cambridge: Polity Press, 1988.
    • Rich, detailed, economically sophisticated statement of a democratic alternative to central planning, with especially interesting ideas about the division of labor.
  • Elster, Jon. An Introduction to Karl Marx. Cambridge: Cambridge University Press. 1986.
    • An often-critical reconstruction of central Marxist themes by one of the central figures in the Analytical Marxism movement.
  • Elster, Jon. “Self-Realization in Work and Politics: The Marxist Conception of the Good Life.” Social Philosophy and Policy, Vol. 3, No. 2, 1986.
    • Analytically crisp discussion of self-realization and the prospects for achieving it under capitalism and socialism.
  • Engels, Frederick. Socialism: Utopian and Scientific. Pathfinder Press, 2008 [1880].
    • Important overview of historical materialism and the socialist critique of capitalism by Marx’s intellectual partner; arguably more accessible to beginners than anything by Marx himself.
  • Friedman, Milton. Capitalism and Freedom. 40th Anniversary Edition. Chicago: University of Chicago Press. 2002 [1962].
    • Friedman’s classic defense of libertarian capitalism on moral grounds.
  • Gilabert, Pablo. “The Socialist Principle ‘From Each According to Their Abilities, To Each According to Their Needs’.” Journal of Social Philosophy, Vol. 46, No. 2, 197-225, 2015.
    • Interesting recent paper that brings the needs/abilities principle into dialogue with other positions in distributive justice.
  • Harrington, Michael. Socialism: Past and Future. New York: Little, Brown & Co, 1989.
    • Historically learned, empirically informed overview of socialism’s development and future trajectory by an important figure in American socialist politics.
  • Hayek, Friedrich. The Road to Serfdom: Text and Documents—The Definitive Edition. Chicago: University of Chicago Press, 2007.
    • Hayek’s celebrated broadside against socialist planning and the creeping threat to freedom that it represents.
  • Holmstrom, Nancy. “Exploitation.” Canadian Journal of Philosophy, Vol. 7, No. 2, 353-369, 1977.
    • Early, analytically sharp defense of the view that exploitation is forced, uncompensated labor, the products of which producers do not control.
  • Lenin, Vladimir. The State and Revolution. New York: Penguin, 2009 [1918].
    • Argues, to give one example, that genuine democracy is impossible under capitalism.
  • Levine, Andrew. Arguing for Socialism. London: Verso. 1988.
    • Rigorous, subtle work that mounts a qualified case for socialism using tools of contemporary moral and political philosophy.
  • Malleson, Tom. After Occupy: Economic Democracy for the 21st Century. New York: Oxford University Press, 2015.
    • Empirically and philosophically rich development of a broadly market-socialist position with an especially interesting defense of workplace democracy.
  • Marx, Karl. Capital: A Critique of Political Economy, Vol. 1. New York: Vintage Books, 1977 [1867].
    • Marx’s masterwork lays bare capitalism’s “laws of motion”, but says little about alternatives.
  • Marx, Karl. Critique of the Gotha Program. In David McLellan (Ed.), Karl Marx: Selected Writings, second edition. Oxford: Oxford University Press, 2000 [1875].
  • Marx, Karl, and Frederick Engels. The Communist Manifesto. London: Verso, 1998 [1848].
    • An enormously influential political pamphlet outlining core elements of the Marxist theory of history, critique of capitalism, and program for a socialist future.
  • Miller, David. Market, State, and Community: Theoretical Foundations of Market Socialism. Oxford: Oxford University Press, 1989.
    • Important, philosophically sophisticated statement of market socialist ideas.
  • Ollman, Bertell, ed. Market Socialism: The Debate among Socialists. New York: Routledge, 1998.
    • Brings together leftist critiques and defenses of market socialism.
  • Peffer, Rodney. Marxism, Morality, and Social Justice. Princeton: Princeton University Press, 1991.
    • Accessible reconstruction of Marxist themes, using techniques of analytic philosophy, that brings Marxism into dialogue with liberal egalitarians like John Rawls.
  • Reiman, Jeffrey. “Exploitation, force, and the moral assessment of capitalism: Thoughts on Roemer and Cohen.” Philosophy and Public Affairs, Vol. 16, No. 1, 3-41, 1987.
    • Argues that exploitation is forced, unpaid labor, and further contends—contrary to Cohen—that individual workers are indeed forced to work for capitalists.
  • Roemer, John. “Should Marxists Be Interested in Exploitation?” Philosophy and Public Affairs Vol. 14, No. 1, 30-65, 1985.
    • His answer is no: Marxists should focus on distributive justice rather than exploitation.
  • Roemer, John. A Future for Socialism. Cambridge, MA: Harvard University Press, 1994.
    • Important statement of market socialism by a leading figure in the Analytical Marxist movement.
  • Schweickart, David. “Economic Democracy: A Worthy Socialism That Would Really Work.” Science & Society, Vol. 56, No. 1 (Spring), 9-38, 1992.
    • A capsule presentation of Schweickart’s market socialist model, “economic democracy”.
  • Schweickart, David. “Nonsense on Stilts: Michael Albert’s Parecon.” Schweickart’s website. Posted January, 2006.
    • Argues that Albert and Hahnel’s “participatory economics” can’t work, and wouldn’t be desirable even if it did.
  • Schweickart, David. After Capitalism. Second edition. Lanham, MD: Rowman & Littlefield, 2011.
    • Argues for a heterodox form of socialism that blends profits and markets with workplace democracy and social control over investment.
  • Smith, Adam. The Wealth of Nations: Books 1-3. New York: Penguin, 1982 [1776].
    • Smith’s classic discussion of early capitalism.
  • Van Parijs, Philippe. Real Freedom For All. Oxford: Oxford University Press, 1997.
    • Defends a “basic income” on “real libertarian” grounds.
  • Wright, Erik Olin. Envisioning Real Utopias. London: Verso, 2010.
    • Drawing on a vast fund of research from social science and philosophy, reimagines socialism for the 21st century.

 

Author Information

Samuel Arnold
Email: s.arnold@tcu.edu
Texas Christian University
U. S. A.

Egalitarianism

Are all persons of equal moral worth? Is variation in income and wealth just? Does it matter that the allocation of income and wealth is shaped by undeserved luck? No one deserves the family into which they are born, their innate abilities, or their starting place in society, yet these have a dramatic impact on life outcomes.

Keeping in mind the extreme inequality in many countries, is there some obligation to pursue greater equality of income and wealth? Is inequality inherently unjust? Is equality a baseline from which we judge other distributions of goods? Do inequalities have to be justified by people somehow deserving what they have, or by inequality somehow improving society?

As a view within political philosophy, egalitarianism has to do both with how people are treated and with distributive justice. Civil rights movements reject certain types of social and political discrimination and demand that people be treated equally. Distributive justice is another form of egalitarianism that addresses life outcomes and the allocation of valuable things such as income, wealth, and other goods.

The proper metric of equality is a contentious issue. Is egalitarianism about subjective feelings of well-being, about wealth and income, about a broader conception of resources, or some other alternative? This leads us to the question of whether an equal distribution of the preferred metric deals with the starting gate of each person’s life (giving everyone a fair and equal opportunity to compete and succeed) or with equality of life outcomes. Egalitarianism also raises a question of scope. If there is an obligation to pursue distributive equality, does it apply only within particular states or globally?

See also Moral Egalitarianism.

Table of Contents

  1. What is Egalitarianism?
  2. Equality of What?
    1. Welfare
    2. Resources
    3. Capabilities
    4. Democratic/Social Equality
    5. Primary Goods
    6. Luck Egalitarianism
  3. Equality of Opportunity
  4. Anti-Egalitarianism
    1. Sufficiency vs. Equality
  5. Domestic or Global?
  6. References and Further Reading

1. What is Egalitarianism?

Consider three different claims about equality:

  1. All persons have equal moral and legal standing.
  2. In some contexts, it is unjust for people to be treated unequally on the basis of irrelevant traits.
  3. When persons’ opportunities or life outcomes are unequal in some important respect, we have a reason to lessen that inequality. (This reason is not necessarily decisive.)

All of these claims express a commitment to equality, and each is progressively more egalitarian than the last. Understanding the difference between these claims, their normative implications, and the various ways the content of the third claim can be further specified is crucial to understanding the disparate collection of philosophical views that compose egalitarianism.

Claim (1) entails claim (2), and therefore captures part of contemporary egalitarianism. If all persons are equal, then there are political constraints on how they can be treated unequally. Disenfranchisement and differential rights violate the equality affirmed in (1). (3) is even stronger than (2), because it is committed not only to treating people equally but also to ensuring that people have equal amounts of some important good. There is controversy whether (1) entails, is merely compatible with, or is incompatible with (3).

The descriptive thesis found in claim (1) affirms the equality of all persons. This must not be the plainly false assertion that for any given trait, all persons are equal. We differ in our abilities, resources, opportunities, preferences, and temperaments. The claim must be about something more specific. All persons have equal moral worth or equal standing. The United States Declaration of Independence famously states that “all men are created equal.” Jeremy Bentham’s dictum “each to count for one, none to count for more than one,” is another expression of the descriptive thesis. While the conditions in which people live, their wealth and income, their abilities, their satisfaction, and their life prospects may radically differ, they are all morally equal. In moral and political deliberation, each person deserves equal concern. All should have equal moral and legal standing.

If all persons are equal in this way, then some forms of unequal treatment must be unjust. The descriptive thesis, applied within a particular state, at least entails equal rights and equal standing. Therefore (1) constrains how a just political society can be structured because it entails some degree of support for claim (2). The degree is debatable in terms of which contexts require equal treatment, what types of institutions must treat people equally, and so on. At least in terms of basic political rights, discrimination on the basis of gender, ethnicity, and caste is prohibited. Many would also extend these to commerce and the wider public sphere: businesses should not be able to refuse service on the basis of race, gender, or sexual orientation. The descriptive thesis must entail some commitment to equal treatment, but the scope of that commitment is disputed.

Claim (3) (let us call it the egalitarian thesis) is closely related to the descriptive thesis. (1) is taken by some as ground for affirming (3). Denying (1) is grounds for rejecting the imperative in (3). Yet the two theses are distinct. A commitment to (1) does not obviously entail a commitment to (3), because (3) is more robust and has wider scope. (1) may entail (3), but establishing this requires a substantive argument. The descriptive thesis’ extension into the social standing, well-being, wealth, income, and life outcomes of citizens is controversial. Unlike (3), (1) is not on its face opposed to radical inequalities in income, wealth, capabilities, welfare, life prospects, or social standing. If those inequalities arise within legitimate political institutions that respect the equal standing of all persons, they may be just.

The egalitarian thesis addresses more than the moral worth of persons. It expresses an obligation to pursue distributive equality. Deviations from equality are prima facie unjust. But along which dimension ought we pursue greater equality? Candidate metrics include resources, income, wealth, welfare, and capabilities to perform certain functions. The obligation to pursue equality along some such dimension makes (3) fully egalitarian in the contemporary sense of the term. (1) does not necessarily prohibit dramatic inequalities, whether they are deserved or undeserved, due to hard work or luck, recent or hereditary. Absent further argument, the content of (1) is only concerned with such inequalities conditionally, when they violate the equal moral status of persons. Of course, if social exclusion, caste discrimination, and unequal rights are prohibited in light of the fact that (1) entails some level of commitment to (2), this will influence the distribution of those metrics. This is not the same as a direct obligation to pursue distributive equality of one of those metrics.

(1) is descriptive in content but has normative implications. Egalitarianism is essentially prescriptive and normative. (3) directly states what ought to be done with regard to the inequalities among persons. It is an imperative to reduce distributive inequality along some dimension. The normative commitments that follow from (1) set minimal standards: states must not violate the equal standing of persons. The normative commitments of (3) are stronger and more aspirational: we must continually pursue equality by reducing inequality. This is a pursuit of substantive distributive justice—equality of some sort of condition or opportunities. It is not mere formal equality of rights, or of economic notions such as considering everyone equal as long as their income is determined by their marginal product.

Egalitarians are thus committed to distributive justice in a way that (1) need not be. (1) may entail a certain conception of distributive justice having to do with equality of opportunity and individual rights, especially property rights. For example, John Locke argued that all persons are equal and have the same rights. The equal standing and equal rights of all persons, even in the pre-civilized state of nature, is a crucial component of his theory of just government. This is a commitment to equality, but it is not egalitarian in the contemporary sense. It does deal with distributive justice, but only in terms of respecting property rights and the right to free exchange of property. A commitment to equality is not yet a commitment to substantive distributive justice (a commitment to have a fair and equitable distribution of goods), and is compatible with merely formal or historical distributive justice (defining a just distribution as one that respects standing property rights and the right of people to trade without theft or coercion).

What is an egalitarian commitment to substantive distributive justice? In the most literal sense, it requires equalizing the distribution of some quantifiable thing among persons, such as income or wealth. An egalitarian may see distributive justice as an end in itself. This would mean it is constitutive of a just society. It can also mean that we choose a metric of equality that is intrinsically good, such as welfare or well-being. Those things are desirable in themselves, not because they are instrumental in acquiring other goods. Alternatively, egalitarianism can be seen as merely instrumental. For example, distributive justice can be seen as a means to achieving some other social end, such as creating social relationships among citizens that are equal and non-oppressive, and allowing them to flourish and function as citizens. An example of an instrumental metric of equality is resources, because resources can be used to generate welfare.

Strictly speaking, all non-equalizing views of substantive distributive justice are alternatives to egalitarianism. This would exclude Rawls’ difference principle, which allows for inequalities when they are required to raise the absolute condition of the worst off. It would also exclude views that prioritize aid to the worst off or argue in favor of redistribution to guarantee a sufficient minimum for all. The contemporary usage of the term is not restricted to equalizing views. While there are contemporary debates between egalitarianism narrowly defined and non-equalizing views such as Rawls’, the most illuminating contemporary definition of the term is that it is a commitment to substantive distributive justice as opposed to merely formal or historical distributive justice.

Egalitarianism therefore comprises divergent views about equality that go beyond the merely descriptive thesis and affirm at least one of the following theses: first, some important type of thing should be distributed equally among persons; second, distributive inequality (along some relevant dimension) is prima facie unjust and should be reduced.

Both principles further specify the normativity contained in (3), yet still give little concrete guidance. Consider a different normative principle with similar form: the current level of infant mortality is unjust and should be reduced. While this thesis does not tell us how to achieve our end, it clearly specifies the end. We know what counts as success because we know what infant mortality is and how to measure it. These two distributive principles, while clearly egalitarian, do not articulate any specific end. They give no guidance on what quantifiable thing matters to distributive justice. What form must a just distribution take? Is it about wealth? Income? Well-being? Preference satisfaction? Something else?

The remainder of this article focuses on the following topics:

  1. What is the proper egalitarian metric? Well-being? Resources? Income? Capabilities?
  2. Once we settle on a metric, are we then concerned with ex ante or outcome equality? In other words, is egalitarianism concerned with a fair allocation of holdings among persons at the starting gate of each life, so that the ensuing competition is fair, or is it concerned with equal life outcomes? Do choice and responsibility matter to this question? What if a given inequality is due to informed and avoidable choices made by the relevant persons? Can such inequalities be just? Should our shares be determined by our choices and actions? If so, is genuine equality a pattern of distribution in which each person is maximally responsible for their holdings, with the role of luck minimized?
  3. Anti-egalitarianism. Many deny the fundamental equality of persons. Some think men are superior to women, certain races are superior to others, and certain castes should dominate others. If so, there is no general moral imperative to lessen inequality among persons. Anti-egalitarianism of this sort rejects both (1) and (2). This article will not address such views. The more philosophically compelling anti-egalitarianism stems not from a rejection of (1) but rather from one of the following readings of it: first, (3) does not follow from (1); second, pursuit of (3) is counterproductive or has bad consequences, a reading that includes political objections about incentives and productivity, the objection that if equality is desirable then it is desirable to lower the condition of those who have more even when this does not objectively aid those who have less, and the objection that egalitarianism is motivated by envy; third, engaging in redistribution to pursue the aim of (3) is incompatible with (1), for example because pursuing (3) violates rights that follow from (1).
  4. The relationship between egalitarianism and global justice. Does egalitarianism apply to the global community of humanity, or only within particular states? If it does not apply globally, is this a justified deference to the moral value of specific political attachments, a temporary compromise on the way to a more defensible form of egalitarianism, or is it simply unjustifiable favoritism?

2. Equality of What?

Egalitarianism requires a commitment to equalizing our holdings or at least to reducing distributive inequality. Neither of these aims can be solely about equal standing or equal moral worth, since equal moral worth can be respected in a society that exhibits inequality along any of the specified dimensions. Respect for (1) puts some constraints on either inequality or the acceptable material minimum (say, because respect for equal rights entails the minimum holdings needed to make those rights effective). That has to do with distributive justice, but in an attenuated sense that falls short of egalitarianism. Similarly, a society with radical inequality may calculate that some minimal redistribution is required for social stability, but this is prudential and conditional, not genuinely egalitarian.

What, other than equal standing or moral worth, is egalitarianism about? We examine five of the most influential candidates: welfare, resources, capabilities, democratic/social equality, and primary goods. We then turn to luck egalitarianism, which combines a chosen metric with a sensitivity to choice and responsibility.

a. Welfare

Welfare is well-being or one’s quality of life. There are two main variants of welfare. The first is hedonic: welfare is pleasure or happiness. Your welfare increases as you experience more pleasures and fewer pains. The second is desire or preference satisfaction. Your welfare increases the more your desires, goals, and preferences are satisfied.

According to hedonic welfare egalitarianism, pleasure is what fundamentally matters in life. Welfare is the purpose of our actions. This view is common in ethics generally and is not restricted to political egalitarianism. Jeremy Bentham argued that humans seek pleasure and avoid pain, and that this is both a descriptive truth about human psychology and a normative truth about what we morally ought to do. Welfare is an intrinsic good. Other goods are useful in an instrumental sense. They can be used to obtain welfare.

If the use of material resources generates welfare, then equalizing welfare will attain substantive outcome equality even among people who exhibit different levels of efficiency in welfare generation. An able-bodied person may require fewer resources than a disabled person to achieve a given level of well-being. Suppose a disabled person needs a wheelchair. If she holds the same amount of resources as a non-disabled person, then the able-bodied person is better off than the disabled person. The disabled person must exchange resources for a wheelchair. So either she is not mobile or she is mobile but has fewer remaining resources than the able-bodied person, and in either case she is worse off. Welfare equality accounts for variation in talents, abilities, and opportunities. Equality of welfare attempts to neutralize the impact of these variations on the distribution of welfare.

From a welfare egalitarian perspective, a just distribution of material resources is merely instrumental to achieving what really matters. We cannot redistribute welfare directly; we can only redistribute the resources that persons can use to generate welfare. Since equality of welfare accounts for variations in how efficiently a person can convert resources into welfare, it is markedly different from equality of resources. An egalitarian welfare distribution will not distribute resources equally.

A problem facing this approach is that preferences adapt to one’s living conditions. Therefore, if preferences help determine one’s level of welfare, unjust inequalities in living conditions might not be rectified by welfare egalitarianism. Nussbaum gives examples of women deprived of resources and opportunities adapting their preferences. This leads to them reporting similar satisfaction levels to women who are objectively less deprived. The adaptive preferences worry is that when there are unjust inequalities, those at the bottom will adapt their preferences to this injustice. A preference can adapt such that you no longer desire that which you are denied. Someone for whom college is an impossible goal may adapt their preferences so that they do not desire to attend college. “Sour grapes” is an even stronger negative preference or aversion to the thing denied. Empirical studies support the thesis that preferences adapt to environmental factors and expectations. Thus someone with fewer opportunities than another may eventually report equivalent welfare levels to those with more opportunities, merely because their preferences, expectations, and standards have lowered. Welfare egalitarianism might therefore convert inequality to equality via subjugated persons internalizing and accepting their inferior status, thereby increasing their satisfaction and reported welfare. (For more on preferences see Harsanyi 1982; and Nussbaum 1999 Ch.5, 2001a.)

However, adaptive preferences are not wholly a liability for welfare egalitarianism. If persons did not adapt their preferences and ends in response to what they can reasonably expect to attain, aggregate life outcomes would be worse. If goals and preferences were completely non-adaptive, our collective welfare levels would suffer. Adapting one’s ends and preferences is part of forming a rational plan of life. Consider someone who pursues a goal of being a professional athlete at the expense of other professional and personal options. If that person lacks the relevant physical ability, this goal is harmful to their welfare.

Another question facing welfare egalitarianism is whether we should adopt an objective or subjective conception of welfare. Thus far, the description of welfare has been subjective. But what if someone derives high levels of welfare from objects or activities that have low or negative social worth? What if the person experiences a higher level of welfare in pursuit of an idiosyncratic end rather than securing the objective necessities for survival? What if there are higher and lower forms of welfare?

Scanlon (1975) gives an example of someone who prefers to have resources to build a temple rather than to provide for his own health and physical well-being. If he would experience greater subjective welfare building the temple, is that the ideal outcome? Or should we take an objective view, specify welfare in terms of the most objectively urgent needs, and guarantee that those are met? Suppose a person will have a below average level of subjective welfare if they have their basic necessities but not the temple, and a very high level of subjective welfare if they have the temple but not the basic necessities. What would welfare egalitarianism have us do? This is a dispute over whether any objective welfare standards are sovereign over individual preferences.

Two other problems for welfare egalitarianism deal with psychological variations among persons. Consider variation in disposition. The cheerful and the gloomy will differ in welfare levels even when their shares of resources are held constant. Do the gloomy deserve compensation? If resources are the raw material for generating welfare, this would lead to subsidizing the gloomy merely for being gloomy. The opposing view is that the gloomy should adapt rather than be subsidized, and if they do not adapt this is a personal matter, not an unjust inequality.

Expensive and inexpensive tastes are further problems for equality of welfare. Someone might have tastes and preferences that require a large share of resources, or particularly scarce resources, to satisfy. Those with expensive tastes require more resources to achieve a given level of welfare than those with less expensive tastes. While both disabilities and expensive tastes are inefficiencies in the conversion of resources to welfare, it seems a mistake to lump them together. Tastes can change over time. They are subject to their bearer’s agency in ways disabilities are not. People can cultivate, modify, and abandon their tastes and preferences. Also, being deprived of the goods made possible by, say, being ambulatory is not clearly equivalent to the deprivation suffered by someone with an unsatisfied preference for exotic food and wine. There seems to be a difference between using society’s resources to subsidize those with disabilities and subsidizing those with expensive tastes. Proponents of resource egalitarianism find welfare egalitarianism inadequately sensitive to this difference.

Some of the objections to welfare egalitarianism just outlined can be answered by moving to equality of opportunity for welfare. Equality of opportunity for welfare incorporates the luck egalitarian principle that what is bad is for someone to be worse off than others through no fault of their own. Equality of opportunity for welfare does not commit itself to subsidizing the imprudent or those who cultivate expensive tastes. For an example of equality of opportunity for welfare, see Arneson (1989, 1990). For an equality of opportunity view with a wider metric that includes aspects of both welfare and resources, see Cohen (1989). Equality of opportunity is addressed in greater detail in Section 3.

b. Resources

Resources are things one can possess or use. Think of the various things you can use to generate welfare: wealth, income, land, food, consumer goods. Wider conceptions of resources include one’s own talents and abilities. Resources can also be social: social capital, respect, and opportunities.

Welfare is an intrinsic good; resources are instrumental goods. Resources are good because they can be used to generate welfare, or to guarantee that people are fully capable of functioning and thriving, or able to pursue some specific conception of the good life. Why focus on an instrumental good rather than the intrinsic good? Recall that different persons may require different amounts of resources to achieve equivalent levels of welfare. For example, we can understand disability as inefficiency in welfare generation. Equality of welfare counteracts disabilities, variations in talent and ability, and so on. From the welfare egalitarian perspective, focusing on resources misses the point.

On the other hand, equality of resources gives an attractive answer to other forms of resource-to-welfare inefficiencies that are not obviously matters of justice. What if my tastes are simply more expensive than yours? If you can achieve a specific welfare level with low-grade hamburger, but I need wagyu beef to reach the same level, then equality of welfare, at least in principle, requires subsidizing my share of resources above yours. You get fewer resources than I do only because your tastes are less expensive. Is this just? Many find it to be implausible in principle and inapplicable in practice. Consider the problems of implementing a scheme of distributive justice that would subsidize expensive tastes. This would generate resentment and reduce the commitment to distributive justice in society. There is also a problem of knowledge and trust—how do I know you have expensive tastes? Everyone has an incentive to report having expensive tastes when they are subsidized.

The bad sort of adaptive preferences amplifies this problem. Suppose your tastes are less expensive than mine because you were raised in a less privileged environment with fewer resources and opportunities. Subsidizing my expensive tastes would then institutionalize prior inequalities and further subsidize those who were already better off. If that seems unjust, then it is attractive to shift focus from the intrinsic good to the instrumental good. If we equalize resources, we can give everyone a fair opportunity to generate welfare and leave variations in tastes as a private concern.

One welfare-egalitarian response to these problems is to distinguish between tastes that are under the control of the person and those for which the person is not responsible. If my taste is out of my control, then its impact on my welfare levels is a matter of justice. If I intentionally cultivated the taste, or refuse to expend effort attempting to revise it, then it is a private concern. But this distinction raises perplexing empirical questions. How could we ascertain whether or not a taste is under one’s control? Answering that question requires settling either a counterfactual claim about what would happen if the person tried to change the taste, or a historical question about what happened when in fact they tried to change it.

Equality of resources provides a compelling answer to these problems. If we all have an equivalent bundle of resources, and have control over how we expend them, then an individual’s tastes are a private concern, not a matter of justice. But the advantage gained in terms of expensive tastes generates a cost: we may no longer have a sufficiently egalitarian response to unjust inefficiencies such as disabilities. Even if expensive tastes and gloominess should not be concerns of distributive justice, inefficiencies involving disability should be. If you and I have the same bundle of resources, but you need a wheelchair to be mobile and I do not, then you are disadvantaged. Our positions are not equal. If equal shares of resources define distributive justice, the disabled are at a disadvantage.

Dworkin (1981b, 2002) takes this as one reason to treat some features of the self as resources. This allows resource egalitarianism to differentiate between expensive tastes and disabilities. Dworkin sees both as inefficiencies in welfare generation, but only disability is also a resource deprivation. Someone who can walk has more bodily resources than someone who cannot. This wide conception of resource egalitarianism sees disability as a resource deprivation and therefore a matter of distributive justice. Equal shares of resources now account for disabilities. In the example from the previous paragraph, you will receive the same bundle as I do, plus a wheelchair. Our total bundles are comparable, because mine includes an ambulatory body while yours includes a non-ambulatory body plus a wheelchair. This approach also applies to innate talents. Someone with abilities or talents that are in high demand already has more resources than someone without such innate talents.

Dworkin’s strategy immediately raises the question of how to determine the value of specific traits and abilities. If we want to implement such a scheme of redistributive justice, how would we specify the value of all these resources? It is a trivial matter to specify equality of wealth or income, but not to quantify the resource variation among persons with various abilities, disabilities, and talents. Dworkin attempts to solve such problems by abstracting away from particular cases and looking at decisions that rational people would make in a hypothetical insurance market. Rational agents, unaware of their own actual talents, abilities, and disabilities, purchase coverage against having disabilities or a lack of valued skills. For example, one considers what sort of policy would be attractive to insure against blindness, lack of in-demand talents, and so on. Then the actual redistributive scheme in society should redistribute resources to actual persons in accord with the insurance coverage that it would have been rational to purchase. Think of it along the lines of medical insurance or unemployment insurance. The hypothetical insurance market provides a rough guide for determining the value of specific resources, giving a baseline of compensation for those who lack such resources.

Resource egalitarianism aims to secure for everyone an equal set of resources and an equal opportunity to convert those resources into welfare. How well people do this, and resulting inequalities stemming from their choices, are not core concerns of this conception of distributive justice.

c. Capabilities

Capabilities are potential functionings, such as walking to work, reading a book, travelling, or being safe and secure in one’s home. If you have the capability to do a specific thing then you have both the abilities and resources required to do it, whether or not you actually choose to do it. A person has the capability to participate in a town hall discussion when they have the physical ability to move into that space (neither their body, nor a lack of assistive devices, nor the infrastructure prevents this movement), the safety to do so without being assaulted, the ability to become informed about the issues (literacy, access to information), and so on. Whatever material and social conditions are required for a specific functioning are possessed by whoever has the relevant capability.

Capabilities-based approaches to distributive justice are sufficientarian rather than equalizing. What is unjust is not the number of capabilities possessed by those on the top compared to others, but the objective inadequacy of the capabilities of those on the bottom. While not equalizing, this is egalitarian. It is concerned with substantive distributive justice. These theories are meant to provide a minimal component of justice that can be combined with further normative principles. When coupled with equalizing principles, the combined view is no longer merely sufficientarian. In terms of its minimal core, though, just as with resource egalitarianism, its commitment to distributive justice is instrumental: a more egalitarian distribution of resources can bring more persons up to the threshold capability level.

The capabilities-based approach’s distinction between capability and functioning accounts for responsibility and autonomy. What the theory attempts to secure is a sufficient level of capabilities for all. Whether an individual functions is up to his or her own choice. The capabilities approach is therefore not subject to the adaptive preferences objection. No matter how much one adapts their tastes, preferences, and expectations downward, it is unjust whenever they lack the essential capabilities. They may, through free choice or conditioning, choose not to function in certain ways, but they must have the relevant capabilities. In this case, the agent is not making a judgment that something is not worth doing when they currently cannot do it; they are making a judgment that they do not want to do something that they are capable of doing. They have the abilities and resources required to do so. Thus, adaptive preferences can still lead to inequalities in functioning, but this does not impact distributive justice. A sufficient level of capabilities for all requires a certain pattern of the distribution of resources. That pattern is not impacted by the choice of some persons not to function in certain ways.

This approach raises an obvious and crucial question: which capabilities matter to distributive justice? Not every capability should matter; the capability to pollute the environment, for example, has no claim on justice. It also seems that capabilities must be specified in a coarse rather than fine-grained way. The theory would be intractable if every discrete form of functioning were correlated with a discrete capability. For the theory to be illuminating and useful the list must be manageable.

Some capabilities theorists, such as Sen, avoid enumerating an official list. Nussbaum argues that the following capabilities enable one to live a full life with dignity. She does not treat the list as timeless or as the final word:

  1. Life—capable of living a normal lifespan.
  2. Bodily Health—health, nutrition, shelter.
  3. Bodily Integrity—movement, security against violence, choice in reproduction, sexual satisfaction.
  4. Senses, imagination, thought—the exercise of these capacities in a fully human sense, facilitated by education and protected by rights (of expression, religion, and so forth).
  5. Emotions—emotional development allowing one to form attachments.
  6. Practical reason—development, critical reflection upon, and pursuit of a conception of a good human life.
  7. Affiliation—social interaction, the social bases of self-respect.
  8. Other species—living with and showing concern for the natural world.
  9. Play—recreation.
  10. Control over one’s environment—political activity, political guarantees of security and noninterference, property holdings, full participation in the economic and civic spheres.

Nussbaum’s capabilities list gives a general picture of human flourishing. It reaches every domain of human life. (For more on the capabilities approach, see “Sen’s Capability Approach.”)

d. Democratic/Social Equality

Democratic or social equality is a narrower-scope form of the capabilities approach. Elizabeth Anderson (1999, 2010) developed the most prominent version. Her theory stems from a critique of the individualistic nature of both resource and welfare egalitarianism. Those theories of distributive justice address equality among the holdings of different individuals. Anderson objects to the focus on individual holdings of resources or welfare levels. For Anderson, the point of egalitarianism is social (it concerns relations among persons), not atomistic (it does not concern individual allocations of some metric). Anderson rejects the individual compensation model entirely. We cannot do away with unjust inequality by allocating more resources or welfare to those at the bottom. Anderson focuses on the capabilities of citizens and the social relationships between them. Unjust inequalities are caused by oppression, which is social.

Let us again consider disability. Anderson argues that disability is as much a social as a biological fact. The impact on one’s life of having a particular disability varies according to the way social space and infrastructure are constituted and on the social practices of fellow citizens. For example, someone in a wheelchair has less of a handicap when social spaces are physically accessible to them. Equality and inequality are essentially social; the impact of many disabilities depends on social attitudes and political policies. What accommodations do the majority enact through democratic policy? Do the non-disabled treat the disabled as equal and fully capable? The proper response to disability cannot be individual compensation. The resources and redistribution that should be used to counteract such handicaps must deal with social practices and infrastructure. Individualistic models could account for why a disabled person requires extra medical resources, but they do not reach the level of infrastructure and social practice. Wheelchair accessible social spaces are not part of any individual’s holdings of resources. They are not her property. Yet they are fundamental to understanding disability and inequality.

Unjust inequalities are not mere individual deprivations of welfare or resources compared to others, but socially imposed oppression and exploitation. The paradigm of unjust distribution is not one in which some have much more than others, but one in which some oppress and exploit others. Inequality is constituted by certain sorts of social relations. The ideal distribution is not one in which everyone is equalized in terms of resources or welfare, but one in which everyone can fully function as a citizen. This is a narrow-scope capabilities approach in two ways: first, the capabilities list is not all-encompassing; second, it applies only within a particular political state. Indeed, Anderson’s conception is specifically democratic equality.

This approach is committed to substantive distributive justice as instrumental in guaranteeing that all citizens have a sufficient set of capabilities. Whether a citizen possesses a given capability is jointly determined by the individual, their resources, their environment (natural and built), and the social practices and attitudes of their fellow citizens. Hence the focus is more on institutional changes that make the infrastructure navigable for those with disabilities, and on changes to social norms and behavior, than on treating disability as an inefficiency for which the individual has a claim to a greater resource share.

The list of capabilities is narrowed to those required to function as a citizen, but nonetheless must be rather coarse and general. The capabilities list must include what is needed to fully function as a citizen and to avoid oppressive social relationships. However, fully functioning as a citizen includes more than political life. It also includes the ability to function in the civil and economic spheres. The point of egalitarianism is not to impose a pattern of distribution but to eradicate oppression, which is socially imposed.

Not only is this theory narrower than the theories of Sen and Nussbaum, it is more constrained than any other option we have considered. Welfare, resources, preferences, primary goods, Nussbaum’s capabilities: each of these reaches into every domain of human life. This conception of equality only touches our lives as citizens. Now, to be sure, since capabilities must be specified in a rather coarse-grained way, the capabilities relevant to citizenship can be put to use in other domains of life. Nonetheless, the scope is relatively narrow.

One objection facing this approach is that it may be possible to guarantee that everyone can fully function as citizens and avoid oppression while at the same time having radical inequality of resources or welfare. If so, perhaps this view is unacceptably narrow because guaranteeing the threshold capability level is compatible with unjust inequalities in life outcomes. Another worry is that this view might be less able to address global justice than other alternatives. That is a disadvantage if one thinks that a unified theory should cover both domestic and global justice.

e. Primary Goods

We now turn to an influential variation on resource egalitarianism. It is not strictly equalizing, and it employs a wide and diverse conception of resources. John Rawls argued that primary goods are what citizens have reason to care about, regardless of whatever else they care about. Primary goods include health, physical and mental abilities, income, wealth, rights, liberties, opportunities, and the social bases of self-respect. Whatever particular conception of the good a citizen may have, and whatever their life plans, goals, and deepest commitments, they have reason to want more rather than fewer primary goods. Primary goods are what must be expended or employed in pursuit of your conception of the good. (This could mean recreation, education, artistic output, religious missionary work, and so on.) Non-material goods such as liberties and opportunities are what make one’s freedom effective. The social bases of self-respect make for a rewarding life.

All of these primary goods are valuable to you regardless of your religion, values, and life goals. No matter what comprehensive conception of the good you affirm, it is rational to want more rather than fewer primary goods. However, given our differing conceptions of the good, we will not all agree on the best way to use the additional goods created by our social cooperation. Principles of justice are required to fairly allocate resources. For Rawls, the right is prior to the good. Just principles for allocating primary goods trump the pursuit of our various individual conceptions of the good. This is one thing meant by Rawls’ phrase “justice as fairness.”

Rawls’ theory is egalitarian but not necessarily equalizing. It focuses on substantive distributive justice but does not always aim for an equal distribution of all primary goods. Basic rights and liberties must be distributed equally. Fair equality of opportunity requires that opportunities are distributed equally across persons of equal talent and motivation. However, considering all the various primary goods including wealth and income, equality is merely the baseline from which other distributions are judged. Other distributions can be preferable to equality. Inequalities can be justified instrumentally when they are necessary to raise the absolute condition of the worst off. This is accomplished when inequality is a necessary causal mechanism for increasing total productivity. Greater incentives may be required to motivate the talented to be more productive.  The worst off would prefer to live in a society in which they get a larger slice of a larger economic pie than to live in a purely equal society in which they get a smaller slice of a smaller pie. Rawls’ strategy is to answer the problem of distributive justice via a social contract. We consider an idealized choice scenario in which free and equal persons come to an agreement about the nature of the society they wish to enter. If our society matches principles that those persons would have chosen, our society is just. A society meeting this standard is as close as we can get to a voluntary agreement to be bound by a particular state.

Rawls argued that the distribution of benefits and burdens in society should not be fundamentally determined by that which is arbitrary from the moral point of view. This rejection of the morally arbitrary explains Rawls’ choice of the veil of ignorance as part of the preferred choice scenario for picking principles of justice. Rawls argued that we should choose principles of justice by imagining persons behind a veil of ignorance that prevents them from basing their choice on what is morally arbitrary. It is not possible to choose principles tailored to serve one’s own peculiar self-interest. The choice of principles is still made out of self-interest, but it is the interest of an abstract model of the person, not of a specific person who is aware of their particular, contingent situation in the actual world. The veil occludes knowledge of much that is due to chance, but also much that is due to choice, including the choosers’ various conceptions of the good. This scenario attempts to value choice by creating the conditions under which people can all pursue their own conceptions of the good. They reason about how to secure primary goods, which can be expended in pursuit of any conception of the good. The original position creates a model of the Kantian notion of the self, and the veil of ignorance forces the choosers to make decisions that are categorical. They lack the knowledge required to make hypothetical choices based in their own particular conception of the good and their peculiar desires.

Rawls argues that under these conditions, rational actors would choose a maximin strategy. Each individual’s goal is to make the worst possible outcome for themselves as good as it can be. They would not take an avoidable gamble on entering into a society with persons suffering at the bottom of the socioeconomic ladder because they would not want to risk living their entire lives under such conditions. Nor would they object to inequality when it raises the absolute level of the worst off, since they are more concerned with the objective quality of their own lives than with envy of those with more primary goods. Rational, self-interested persons situated in a fair procedure for making decisions about their society will affirm the difference principle. According to the difference principle, if incentives that generate inequality are required to increase productivity, then the resulting inequality can be just. If such incentives are required to motivate higher productivity, then they should be allowed as long as they can be harnessed to assist the worst off. When inequality is used to motivate productivity, the economic pie grows, and redistribution can improve the lives of the worst off. Note, however, that the difference principle cannot justify violations of the descriptive thesis affirming the equal worth of all persons. The liberty principle takes priority over the difference principle. We may not create a system of unequal rights and liberties even if doing so would allow us to raise the absolute condition of the worst off.
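
The maximin rule can be stated compactly: choose the option whose worst-off position is best off, that is, the distribution D that maximizes min_i D_i. As a toy illustration (the numbers are hypothetical, not Rawls' own), suppose the parties must rank three feasible distributions of primary goods across three social positions:

\[
A = (5,\,5,\,5), \qquad B = (4,\,10,\,12), \qquad C = (6,\,10,\,12).
\]

Maximin compares only the worst-off positions (5, 4, and 6 respectively). The parties reject B but prefer the unequal C to the strictly equal A, which mirrors the difference principle: inequality is acceptable exactly when, as in C, it raises the absolute level of the worst off.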

Gerald Cohen objects to the demand for greater incentives that the difference principle allows. On Cohen’s view, people who require greater incentives to work productively are blameworthy. Why, knowing that if they work to their full ability this will benefit the worst off, do they not do so without demanding a greater share of primary goods? Cohen argues that this demand for incentives is exploitative. If the talented changed their outlook, we would have greater equality and improvement of the lives of the worst off. Rawls’ theory deals with principles governing political institutions and the basic structure of society, not with private actions and motivations. Cohen thinks egalitarianism should be internalized. In Rawls’ theory, persons in the original position are conceived of as self-interested, and a fair procedure for choosing principles of justice ensures a commitment to distributive justice. But that is a product of the fairness of the choice scenario and the self-interest of the participants. Cohen thinks that egalitarianism as a moral and political imperative should motivate individual choices and actions, not only shape the basic structure of society and its institutions. Still, as a matter of public policy, Cohen deems Rawls’ view a radical improvement on contemporary society. His objection is that the difference principle defers to unjust motivations and attitudes. Justice requires that we have egalitarian motivations, and therefore the talented should never demand the incentives allowed by the difference principle. Egalitarianism is a normative ideal, and talented persons ought to work productively and support redistributive policies to pursue equality without demanding a greater share of primary goods. Rawls thinks that in actual societies people will have a variety of motivations. The problem for Cohen is that the original position models persons as self-interested rather than egalitarian. He concludes that the difference principle is not just.

f. Luck Egalitarianism

We now turn to a view that combines egalitarianism, Rawls’ rejection of the influence of morally arbitrary factors, and an emphasis on the values of choice and responsibility. Rawls’ social contract view holds that the morally arbitrary should not fundamentally determine the distribution of primary goods or people’s life prospects. So one’s family, one’s innate talents, and one’s starting place in society should not shape one’s life prospects or distributive share unless this benefits the worst off. These factors are undeserved and should not alone determine the distribution of benefits and burdens in society. Luck egalitarianism distills this thought into a complete theory of distributive justice. The ideal distribution is sensitive to people’s choices and informed gambles, but not to brute luck in the distribution of talents and opportunities. For the luck egalitarians, our capacities for free deliberation, choice, and action are pre-institutional. Therefore, they should inform and determine the principles of distributive justice, and the institutional expectations for entitlement and deservingness. (Hurley argues that this is a crucial feature of luck egalitarianism.) These features of the self are not ignored in Rawls’ view, but they do not fundamentally shape the institutions.

Luck egalitarianism is a responsibility-sensitive conception of equality and a system for distributing goods and aid under conditions of scarcity. It prioritizes aid to those who suffer through no fault of their own. It is a non-equalizing commitment to substantive distributive justice. Equality provides a baseline, though in a quite different way from Rawls. The role of equality here is what we can call ex ante equality. At the starting gate of life, we should be equal in some sense. Depending on the favored metric, we should begin with an equal amount of resources or opportunity for welfare. The luck egalitarian ideal is that we start on an equal footing, and then the outcomes of our life choices and freely taken gambles should determine our future holdings. Inequality therefore can be just. It is not just because it brings about some further social good, as the difference principle allowed for inequalities that improve the objective condition of the worst off. Rather, inequalities are justified by being brought about in the right way, by having the right sort of causal origin.

Indeed, luck egalitarianism is an alternative way to develop the emphasis on choice, responsibility, and individual sovereignty that leads some to reject egalitarianism entirely. Cohen argues that the view co-opts these values from the anti-egalitarians. Luck egalitarianism is not opposed to inequality per se; it is opposed to inequalities that have the wrong sort of origins. Inequalities based in brute luck, that is, in the morally arbitrary factors cited by Rawls (innate talents, parentage, starting place in society), are unjust. But inequalities based in option luck, that is, in the outcomes of freely taken risks or gambles, are just. As with the capabilities approach, luck egalitarianism may be combined with other principles of justice. (See Cohen on community.)

One objection to luck egalitarianism is based in skepticism about free will and moral responsibility. The theory hinges on the moral importance of choice and responsibility. If there is no robust conception of free will and moral responsibility, why think that inequalities caused by our choices are just?

Another worry about the theory is abandonment. Does luck egalitarianism offer no aid to those who suffer because of choices with poor outcomes? If inequalities are just whenever they are caused by choice, then is there no minimum level of well-being guaranteed for all? One sort of response to this worry is combining luck egalitarianism with other political values. Cohen argues that a commitment to community prohibits inequalities that would be allowed in a purely luck egalitarian system. Kymlicka argues that luck egalitarianism can be combined with social egalitarian views that likewise prohibit some inequalities that might be allowed by luck egalitarianism.

Anderson develops a social egalitarian view and is a strong critic of luck egalitarianism. Her conception of democratic equality is not only a development of the capabilities theory but also an explicit rejection of luck egalitarianism. She thinks that the luck egalitarian focus on brute luck means the theory completely misses the social nature of inequality. She objects that luck egalitarianism ends up trying to correct the “cosmic injustice” of brute luck in an attempt to ensure that people get what they deserve, and that this blinds luck egalitarians to the social oppression and exploitation that constitute inequality. Unjust inequality has to do with social relationships.

Another question facing those who support luck egalitarianism is how to define equal starting places. This leads us into the larger issue of what constitutes equality of opportunity.

3. Equality of Opportunity

What if there are dramatic inequalities in the opportunities for choice, education, and careers? This is a problem for luck egalitarians, because they need to specify a starting gate conception of equality. It is also a pressing issue for the other conceptions of equality.

Dworkin argues that inequalities can be historically justified when persons made their choices from an equivalent set of options. This commits luck egalitarianism to robust equality of opportunity. However, his standard is difficult to interpret, since citizens can never have a strictly equivalent set of options, unless that set is so restricted that the society is dystopian. There must be some standard to define when their options are fungible or equivalent enough. Yet this is a massive problem for egalitarian theory, and it seems luck egalitarianism’s values of choice and responsibility alone cannot solve it. Answering that problem requires some other standard of value. When do persons have equal opportunities?

Equality of opportunity is a natural extension of the descriptive thesis that affirmed the equality of all persons. The descriptive thesis is incompatible with forms of oppression that rule out classes of people from competing for certain positions within society. A denial of the descriptive thesis entails a denial of a commitment to equality of opportunity. But what exactly does equality of opportunity require? It can be understood as ranging from merely formal equality of opportunity to substantive equality of opportunity. The more one approaches the latter, the more one becomes committed to substantive distributive justice.

Formal equality of opportunity requires that desirable positions and resources in society be allocated by open and meritocratic competition. Firms, government agencies, and universities are appropriate candidates for such equality of opportunity. This requires little or no substantive distributive justice. It does require that all citizens can participate in the competition, and that the winners are chosen on the basis of purely meritocratic concerns. Meritocracy requires that the traits that determine who wins the competition actually predict success in the position. Formal equality of opportunity prohibits allocating positions on the basis of gender, ethnicity, and so on. This deals only with opportunities, not outcomes. It does not address systemic inequalities in who wins the meritocratic competitions.

Substantive equality of opportunity addresses both the procedures for allocating positions and the preparation of the candidates that determine their chances of success. It deals with both fair procedures and the actual outcomes of those procedures. For example, if positions are open on the basis of purely meritocratic competition, but the advantages conferred by wealthy parentage are so overwhelming that only the children of the wealthy win the desirable positions, this is merely formal equality of opportunity. Those who support substantive equality of opportunity argue that the merely formal is morally inadequate.

Consider Bernard Williams’ example of a hypothetical warrior society. In the past, this was a caste society in which warriors had high prestige and the majority of wealth. Under the old order, only the sons of wealthy families were eligible to be chosen as warriors; all others were consigned to poverty and subjugation. The society then transitions to formal equality of opportunity: the desirable warrior positions are now distributed according to the results of an open and fair tryout. Rich and poor alike may enter the competition, there is no bias in judging the winners and losers, and we may stipulate that women may now obtain these positions. Success in the tryout is predictive of success as a warrior, so the system is meritocratic.

However, this is all compatible with only the offspring of warriors having adequate nutrition and training to succeed in the competition. Although careers are open to talents, the poor have no chance to cultivate the relevant talents. Even those with the luck to be born with innate ability have their prospects defined by their parentage. Those who were not born to a warrior family cannot succeed. Therefore, the old social hierarchy will persist, even though a strict caste system has been replaced by open, meritocratic procedures that satisfy formal equality of opportunity.

A defender of formal equality of opportunity might point out that the long-term outlook for this social hierarchy is made much more tenuous by the implementation of formal equality of opportunity. Other changes to the society could impact the levels of inequality. The dominant positions in society are subject to change over time in a way that they were not under the original caste system.

Still, from the egalitarian perspective, this meritocratic society is unjust. That destabilizing forces can change things under formal equality of opportunity does not redeem the status quo. The current situation is unjust, and destabilizing change would not entail that the next distribution will be just, only that the individuals occupying the dominant and subordinate positions will change. The transition might simply be to a hierarchy in which different non-meritocratic attributes correlate with having any chance of success: dominance might pass from warrior families to merchant families, or the offspring of a small set of occupations might become the only ones with a genuine opportunity to succeed.

A perfectionist, someone who thinks that society should maximize the pursuit of some particular conception of the good, could argue that formal equality of opportunity is adequate because the concentration of wealth, which in turn prepares people to flourish as warriors, creates the best set of warriors overall. One can object to this on perfectionist terms (that generating the best warriors is not the proper overriding good, or that this system does not generate the best set of warriors) or on Rawlsian terms of liberal justice (no one conception of the good should be made sovereign in a free society, and no one would agree to this arrangement in the original position).

Suppose the example is shifted slightly. Rather than only the sons of wealthy high caste families having any opportunity to succeed, there is a small amount of social mobility. Some not born into a privileged position win the meritocratic competition. There is not substantive equality of opportunity, but there is both formal equality of opportunity and actual mobility. A supporter of substantive equality of opportunity will still object that the strength of the correlation between family background, the resources provided by that background, and obtaining a warrior position is itself adequate evidence of the inadequacy of formal equality of opportunity. These concerns push one to rely on another metric, such as resources, to attain a substantive, material form of equality of opportunity.

Of course, examples need not be as rigid as Williams’ caste society. A collection of informal social attitudes and practices may also violate equality of opportunity. If women are not seen as capable of being good pilots, then hiring and promotion procedures will lack genuine formal equality of opportunity, even if this is not inscribed in company policy, in law, or in a caste system. These impediments to equality of opportunity are endemic in contemporary society. There are more strategies for answering these problems than can possibly be described in this brief article, so we will mention only two that expand upon views already covered. Rawls (2009) developed a conception of fair equality of opportunity that undermines the role of class, race, gender, or caste in determining life prospects. Fair equality of opportunity requires that persons of equivalent talent who expend equivalent effort have equivalent prospects of success. Roemer (2009) provides a sophisticated luck egalitarian account of equality of opportunity that separates people into different types. The competitions that allocate desirable resources and positions should be designed so that effort is rewarded. The details of this scheme are beyond the scope of this article, but these two views are good starting places for readers who want to research the issue in greater depth.

4. Anti-Egalitarianism

An obvious form of anti-egalitarianism rejects the descriptive thesis. If persons are not equal, then there is no moral imperative to pursue substantive distributive justice. Sexism, racism, caste discrimination, and so on are obviously not views that lead to egalitarianism. These objections are beyond the scope of this article.

A common political objection to egalitarianism is that it is based in envy. None of the theories canvassed in this article are explicitly based in envy, so this objection has more to do with the alleged psychological motivations for becoming an egalitarian than with criticism of egalitarian arguments themselves. Of course, Rawls’ theory explicitly rejects envy. Persons in the original position want to secure the greatest number of primary goods for themselves. Their choice is not impacted by envy of those who may end up with an even greater share of primary goods.

A second political objection is that egalitarianism undermines productivity. If the state redistributes income or other resources, then there is less incentive to be productive. Egalitarians can deny this on empirical grounds, object that total productivity is not the most important criterion, or attempt to harness the way that incentives motivate productivity (as with Rawls’ difference principle).

A practical objection is that a commitment to distributive equality would lead us to “level down” the allocations of those who have more for no real benefit. Suppose all the members of a population have x units of your preferred metric of distributive justice, except for one person who has 2x. Now consider whether it is desirable to transition from that distribution to one in which everyone holds x units. This makes one person worse off and no person better off. The distribution is now equal, but is it preferable? Is it more just? A strict egalitarian can respond that if equality is intrinsically valuable then the distribution is improved in that respect. They are not strictly committed to concluding that this makes the new distribution preferable overall. That only follows if equality is the overriding or sole value. If equality must be balanced against other values, then egalitarians have an answer to the leveling down objection. A strict egalitarian who thinks equality is instrumental already accepts other values, so they can argue that in these cases equality is not instrumental in bringing about the desired consequences.
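
Schematically (using the article's own x and 2x, with n persons), the leveling-down move is from

\[
D_1 = (x, \dots, x, 2x) \quad\text{to}\quad D_2 = (x, \dots, x).
\]

Any standard measure of inequality improves (the range, for instance, falls from x to 0), yet no one is better off in D_2 and one person is worse off. The objection has force only against views that must count this gain in equality as an overall improvement.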

The leveling down objection is a threat to views that pursue strict equality. Non-equalizing conceptions of substantive distributive justice avoid the problem. What most theories aim to do is improve the condition of the worst off and thereby lessen inequality, not pursue strict equality unconditionally. Views that prioritize aid to the worst off or support a sufficient minimum floor are not obviously subject to this objection. Even if one thinks it is morally obligatory to redistribute resources to improve the condition of those who are worse off than others, it does not follow that it is obligatory to destroy resources when that is the only way to achieve distributive equality.

Perhaps the most philosophically interesting objections to egalitarianism are themselves based in the descriptive thesis that all persons are in fact equal. One objection is that egalitarian distributive justice is insufficiently sensitive to both deservingness and human agency. A second is that there is no just way to implement a redistributive scheme that aims towards equality, because doing so violates freedoms and rights that follow from our equality.

Welfare egalitarianism, resource egalitarianism, the capabilities approach, and Rawls’ difference principle are patterned conceptions of distributive justice as opposed to historical conceptions. Strict egalitarianism defines a pattern of equal shares, the various capabilities approaches define patterns involving a sufficient minimum below which persons cannot fall, and the difference principle states that the level of permissible deviation from the baseline of equality is defined by what is necessary to raise the absolute condition of the worst off.

Nozick (1974) argues against all patterned conceptions of distributive justice. He claims that according to patterned conceptions of justice, if a given pattern is just, it makes no difference which persons occupy which places in the distribution. Justice is defined in terms of structural features of the pattern, not the identity of those occupying specific places in the pattern. Yet that seems counterintuitive. Those at the top might deserve their place on the basis of working hard. Inequalities might be generated by the voluntary transfer of goods that took place in a distribution that was already just. Nozick concludes that, rather than favoring a patterned conception of distributive justice, we ought to understand distributive justice in terms of historical entitlements and voluntary transactions. He agrees with Rawls that the distribution of natural talents is not a basis for deservingness, but denies that this means the distribution of those talents (and the varying wealth and income derivable from them) is arbitrary from the moral point of view. It is not arbitrary because natural talents are implicated in the normative relationship of self-ownership. Persons own themselves. That includes their native abilities. This means that, by extension, they hold strong entitlements to the property they can obtain by exercising those (undeserved) talents.

Since Nozick was primarily responding to Rawls’ A Theory of Justice, it is worth looking at this objection and the extent to which it threatens patterned conceptions in general and Rawls’ conception in particular. Rawls’ view can be defended against Nozick’s objection that according to patterned conceptions of distributive justice, it should not matter which individual occupies which place in the pattern. Consider the role given to institutional expectations and institutional desert. Rawls’ theory allows for people to deserve property so long as the state’s institutions have created the reasonable expectation of such property rights. In other words, entitlement to property is generated by the basic structure of the state. Institutional expectations ground such entitlements. Therefore, his view is compatible with a conception of private property that is not indifferent to which persons occupy which positions in the distribution. Of course, Nozick, following Locke, thinks individuals can have preinstitutional entitlements, so his view of property rights is much stronger.

Still, in Rawls’ patterned view of distributive justice, it must matter which particular individuals occupy which places in the patterned distribution, because the point of the difference principle is that scarce talents are harnessed for the benefit of all. There is a causal relationship between which persons occupy which positions and the pattern of the total distribution. The size of the economic pie is defined by which people occupy which places. Switching places would change total productivity and harm the absolute condition of the worst off. Rawls argues that a given society’s distribution of goods is just if it matches the difference principle. The specific pattern depends on myriad factors, and those factors cannot be held constant while you switch the persons occupying the different positions in the pattern. For example, if in a given state greater incentives are required to motivate some of the highly talented to be more productive, you cannot switch their place in the pattern without changing the productivity level. In such cases, Nozick’s discussion of switching persons within the pattern would necessarily modify the pattern itself. The hypothetical place switching across identical patterns cannot be implemented. So what Nozick means by “patterned” does not capture everything that matters in substantive distributive justice. This response also applies to luck egalitarian accounts of distributive justice. Luck egalitarianism is committed to having shares allocated in accordance with the individual’s choices and option luck. (For a much stronger desert-based alternative, see Kagan 2012.)

Nozick’s second objection has to do with individual liberty to make voluntary transactions. Suppose an actual distribution meets your definition of a just pattern, whatever that may be. So long as persons can make voluntary transactions (purchases, gifts, trades, bequests), the original pattern will be lost. This all happens without exploitation or coercion. The only way to regain the pattern is for the state to interfere with these voluntary transactions and coercively redistribute the resources. But that is objectionable for two reasons. First, since the deviation from the initial pattern was entirely voluntary, nobody has a valid objection to the second pattern. It wrongs no one, since every transaction that changed the pattern was consensual. Second, coercive redistribution to retain the original pattern must violate property rights. In the initial distribution, which we stipulate was just, each had a right to their holdings. Through voluntary transfers, the new pattern was generated. But if the transactions were voluntary, the new owners of these resources are as entitled to them as the original owners were. Since the new pattern arose through just steps from a just starting point, it is as just as the original, and it is neither required nor permissible for the state to redistribute anything. Egalitarian redistribution enforced by the state must violate property rights. No program can pursue substantive distributive justice through redistribution, because such redistribution is unjust.

This anti-egalitarianism is crucial for Nozick’s understanding of the descriptive thesis: individual rights, including the right to own and transfer property, constitute our equality. Those rights preclude systems of imposing, retaining, or regaining a specific distributive pattern. His understanding of equality is incompatible with egalitarianism. Nozick concludes that we should understand distributive justice in formal and historical terms, not in terms of patterning. He then argues for a set of historical principles governing the original acquisition and subsequent transfer of property. Nozick affirms that persons are equal, but this means that each person has equally strong property rights. The descriptive thesis on this view entails a denial of egalitarianism. Egalitarianism can only be pursued by violating the property rights that follow from our equality.

a. Sufficiency vs. Equality

There is also a sufficiency objection to strictly equalizing views. Frankfurt objects that

The mistaken belief that economic equality is important in itself leads people to detach the problem of formulating their economic ambitions from the problem of understanding what is most fundamentally significant to them. It influences them to take too seriously, as though it were a matter of great moral concern, a question that is inherently rather insignificant and not directly to the point, namely, how their economic status compares with the economic status of others. In this way the doctrine of equality contributes to the moral disorientation and shallowness of our time. (Frankfurt 1987)

A person focused on strict egalitarianism evaluates their own life and holdings based on something impersonal, independent of the particular features of their own life and their own personal needs. In this way, Frankfurt argues, egalitarianism is harmful.

However, the egalitarian impulse is really based in something that is of moral importance—the principle that all persons should have a sufficient level of well-being. On Frankfurt’s view, people become egalitarians on the basis of compelling reasons, but those reasons have to do solely with sufficiency, not equality.

It seems clear that egalitarianism and the doctrine of sufficiency are logically independent: considerations that support the one cannot be presumed to provide support also for the other. Yet proponents of egalitarianism frequently suppose that they have offered grounds for their position when in fact what they have offered is pertinent as support only for the doctrine of sufficiency. Thus they often, in attempting to gain acceptance for egalitarianism, call attention to disparities between the conditions of life characteristic of the rich and those characteristic of the poor. (Frankfurt 1987)

The case for egalitarianism is usually only a case against poverty.

The fundamental error of egalitarianism lies in supposing that it is morally important whether one person has less than another regardless of how much either of them has. […] The economic comparison implies nothing concerning whether either of the people compared has any morally important unsatisfied needs at all nor concerning whether either is content with what he has. (Frankfurt 1987)

Defenders of equality must show that substantive distributive justice is not captured by concerns over sufficiency alone. We will use Scanlon as a representative example. (See also Parfit and O’Neill for discussions of equality as opposed to sufficiency.) Scanlon offers five sorts of reasons to be concerned with equality and not merely sufficiency.

  1. Some inequalities create humiliating differences in status. One could object that sufficiency is whatever level is required to avoid humiliation and shame. However, that level is sensitive to differences between the better off and worse off rather than being determined by objective or unchanging standards. This means that, contra Frankfurt, we are intrinsically concerned with differences between people, not just with everyone meeting some sufficient benchmark.
  2. Inequalities can give those who have more an unjust amount of power over others.
  3. Social institutions are only fair if there is equality of starting places in society. Inequality can undermine procedural fairness. We can see this in economic competition, inequality of opportunity, and political influence.
  4. Inequalities can be objectionable when they involve a failure to treat equally those who have a claim to equal benefit. Just because everyone has a sufficient level of some service or resource provided by the state does not mean that an unequal allocation is just.
  5. Inequality can violate the claims of citizens to benefit from the fruits of social cooperation. This is the sense in which Scanlon reads Rawls as an egalitarian. The participants in the original position are equal participants, and the presumption is that they have an equal claim to the benefits of social cooperation. This is why equality is the benchmark from which inequalities are judged, and only those inequalities that benefit everyone are permissible. The primary goods are produced by social cooperation and, contra Nozick, the baseline is that every equal citizen has an equal claim to those benefits.

(For more on the debate between equality, sufficiency, and giving priority to the worst off, see the references for Nagel, Parfit, and Scanlon. For elucidating commentary on Scanlon, see Wolff 2013.)

5. Domestic or Global?

Many egalitarians hold a stronger view about domestic equality than about global equality. Redistributive priority is given to fellow citizens over persons in other nations. This, on its face, seems inconsistent or unwarranted. If one is committed to equality, what difference could national borders make? Is it just for a state to prioritize domestic distributive justice over global distributive justice? As a pure matter of luck egalitarianism, the state into which one is born is a paradigm example of brute luck. Having one’s life prospects be determined by nation of origin seems as morally arbitrary as having one’s life prospects determined by parentage. The arbitrariness of nationality combined with the universality of the descriptive thesis (all persons are equal) creates tension with domestic prioritization. On the other hand, if redistributive justice deals with the allocation of goods produced by the cooperation of citizens, then perhaps there is a justification for prioritizing domestic over international redistribution. The amount of redistribution required to address global inequality may depend on the nature of the goods to be allocated as well as the degree of entanglement among the world’s various states.

Consider an efficiency argument against global egalitarianism. One may be an egalitarian yet argue for domestic priority based on increased costs of sending aid to distant locations, difficulty with managing the efficient distribution on the other end, or epistemic advantages of dealing with local rather than remote issues. Peter Singer argues against the efficiency rationale. Changes in modern transportation, financial systems, and information technology have lessened most of the inefficiencies in aiding faraway persons. Singer’s argument is not about egalitarianism per se, but about preventing what all reasonable people can agree are objectively bad states of affairs: famine, starvation, epidemics, and so forth. His position is thus not egalitarian in the sense of requiring an equal distribution of some metric, but rather egalitarian in the sense of doing away with suffering at the bottom rungs of global society.

Singer’s view is a useful example of moral obligations being global. If moral obligations can be global, then perhaps so too can egalitarianism. Proximity is arbitrary in his analysis: someone suffering nearby is no more morally relevant than someone suffering far away. Given the magnitude of global suffering, there is an egalitarian element to his utilitarian calculus. So long as these objectively bad states of affairs are occurring, first world people are obligated to work to prevent them. This will flatten global inequality. His view can be taken in two ways: a strict reading that requires sacrifice to the point of marginal utility with the globe’s worst off, or a weaker (though still radical) reading that requires significant sacrifice. However, Singer constrains both readings with a utilitarian productivity argument: the first world may need some excess consumer culture (that in the short term contradicts our obligations to the worst off) to keep the economy at a level where it can make the maximum contribution to the plight of the globe’s worst off.

Singer therefore takes the descriptive thesis to require radical, obligatory sacrifice on the part of citizens in first world countries. Given the extent of objectively bad states of affairs in the world, those who are comparatively well off are obligated to reallocate resources to the worst off.

Onora O’Neill gives another example of global moral obligation. She argues for a right not to be killed unjustly. Global resource inequalities amount to de facto killings. They are unjust, since they can be avoided at reasonable cost. There is no obligation to equalize anything globally, but there is an obligation to avoid violating the global poor’s right to not be killed unjustly. She gives an argument by analogy that highlights the tension between property entitlements and distributive justice. In a lifeboat scenario, one who has excess water and food but withholds it from others, who will die without it, violates their right not to be killed unjustly. Property entitlements vary in strength in different contexts. She then argues that the planet is no different from a lifeboat, so that those dying from poverty and famine have their right not to be killed unjustly violated. This argument hinges on the contextual variability of property rights and the relative strength of the right not to be killed over property rights. The right not to be killed trumps property rights, so the redistribution required to avoid these killings is obligatory. Unlike Singer’s view, this does not generalize to an obligation to prevent all objectively bad happenings globally. First world citizens are only obligated to do what is required to secure everyone’s right not to be killed unjustly. Yet this is radical, too—her conception of agency means that those complicit in first world economies are killing the globe’s worst off. Redistribution is the means to avoid these killings.

Given these types of arguments for global moral obligations, what can be said in favor of domestic priority in egalitarian redistribution? If distributive equality is a matter of justice, should redistribution be global? As in the discussion of anti-egalitarianism, one obvious objection is to deny that the descriptive thesis holds globally. Denying the equality of all the globe’s people is not philosophically interesting. A stronger argument is that the demands of egalitarian justice are tied up with institutions and practices that are not global. If matters of distributive justice have to do with coercive redistribution, then perhaps only persons living within the same state fall under egalitarian requirements.  If so, global distributive justice would only apply if there were genuinely powerful and coercive global institutions. Egalitarian obligations only arise within a coercive political structure. That the state holds coercive power over the citizens means that they should each be treated equally and, perhaps, that the state should engage in redistribution to pursue equality of holdings. Various forms of this view appeal to different features of the state. A similar argument is that redistributive justice has to do with allocating the resources made possible through social cooperation. If so, then the bonds of citizenship matter to distributive justice, and we should treat domestic and international inequality differently.

Another domestic-priority view is that egalitarian norms arise among people who share political bonds and obligations, and those attachments are local rather than global. These sorts of objections are not unconditionally opposed to global egalitarianism; they rather object that egalitarianism is tied to certain relationships and institutions that currently are not global. Some egalitarians counter that the amount of global engagement, cooperation, and institutional entanglement does generate global egalitarian obligations. (For example, see Pogge 1989.)

Richard Miller gives a consequentialist argument for domestic prioritization. Too much redistribution directed outside of a particular state can have a destabilizing impact. Even if that state is well off compared to others, as long as it has inequality in its own economy, then those on the internal bottom rungs may become alienated if resources are taken out of their economic system and sent to another country whose most deprived citizens are even worse off. The worst-off citizens within the relatively wealthier state are participating in a scheme of social cooperation that benefits the well off; their state engages in egalitarian redistribution, but the redistributive scheme prioritizes the needs of the worst off in other countries. It seems as though this scheme provides benefits to all but the domestic worst off. This can undermine their commitment both to productive labor and to respect for the rule of law. This in turn harms the state, makes it less stable and productive, and therefore makes it less able to generate external aid.

Miller also attempts to transcend the patriotism-cosmopolitanism dispute by universalizing patriotic priority. For the vast majority of people, certain universal human goods are only satisfied in local political communities. (The exceptions are a minuscule number of global elites.) Our need for social interaction and political community is satisfied locally, as we do not share rich attachments with persons across the globe. This transforms the seemingly arbitrary fact of the state into which one was born into something morally relevant. This is not a rejection of all global redistribution, but an attempt to break from the view that patriotic priority and helping the globe’s worst off are polar opposites. Combined with the previous consequentialist argument, this means that in order to secure these universal human goods, we need individual states, and within each state we need patriotic priority in redistributive justice. Each person needs these goods categorically; they can only be provided locally; and they are threatened when the redistributive scheme within a given state does not exhibit patriotic priority. This is all compatible with the descriptive thesis applying globally. On this view the descriptive thesis only requires that we are not insensitive to the suffering of others. We do have global obligations to assist others, but this does not mean that all the demands of distributive justice are global.

A commitment to global equality requires radical, perhaps unrealistic sacrifice. That can be taken as reason to reject global egalitarianism: persons cannot reasonably be expected to bring about global equality. However, normative principles specify what we ought to do, not what we are comfortable doing. What we ought to do might require a complete change to our way of life.

6. References and Further Reading

  • Anderson, Elizabeth S. 1999. “What Is the Point of Equality?” Ethics 109 (2): 287–337.
    • (An attack on contemporary egalitarian theory in general and luck egalitarianism in particular. Provides a defense of democratic equality.)
  • Anderson, Elizabeth S. 2010. “The Fundamental Disagreement between Luck Egalitarians and Relational Egalitarians.” Canadian Journal of Philosophy 40 (sup1): 1–23.
  • Arneson, Richard J. 1989. “Equality and Equal Opportunity for Welfare.” Philosophical Studies 56 (1): 77–93.
  • Arneson, Richard J. 1990. “Liberalism, Distributive Subjectivism, and Equal Opportunity for Welfare.” Philosophy & Public Affairs 19: 158–94.
  • Arneson, Richard J. 2000. “Luck Egalitarianism and Prioritarianism.” Ethics 110 (2): 339–49.
  • Arneson, Richard J. 2004. “Luck Egalitarianism Interpreted and Defended.” Philosophical Topics 32: 1–20.
    • (Important defense of luck egalitarianism.)
  • Barry, Nicholas. 2006. “Defending Luck Egalitarianism.” Journal of Applied Philosophy 23 (1): 89–107.
  • Blake, Michael. 2001. “Distributive Justice, State Coercion, and Autonomy.” Philosophy & Public Affairs 30 (3): 257–96.
    • (Egalitarian obligations hold within particular states, not globally.)
  • Cavanagh, Matt. 2002. Against Equality of Opportunity. Oxford: Clarendon Press.
  • Cohen, Gerald A. 1989. “On the Currency of Egalitarian Justice.” Ethics 99 (4): 906–44.
  • Cohen, Gerald A. 2009. Rescuing Justice and Equality. Cambridge: Harvard University Press.
  • Cohen, Joshua. 1989. “Democratic Equality.” Ethics 99 (4): 727–51.
  • Dworkin, Ronald. 1981a. “What is Equality? Part 1: Equality of Welfare.” Philosophy & Public Affairs 10 (3): 185–246.
  • Dworkin, Ronald. 1981b. “What is Equality? Part 2: Equality of Resources.” Philosophy & Public Affairs 10 (4): 283–345.
    • (Useful discussion of different metrics of equality.)
  • Dworkin, Ronald. 2002. Sovereign Virtue: The Theory and Practice of Equality. Cambridge: Harvard University Press.
  • Elster, Jon. 1985. Sour Grapes: Studies in the Subversion of Rationality. Cambridge: Cambridge University Press.
  • Feinberg, Joel. 1974. “Non-Comparative Justice.” Philosophical Review 83 (3): 297–358.
  • Fleurbaey, Marc. 1995. “Equal Opportunity or Equal Social Outcome?” Economics and Philosophy 11 (1): 25–55.
  • Frankfurt, Harry. 1987. “Equality As a Moral Ideal.” Ethics 98 (1): 21–43.
  • Freeman, Samuel. 2006. “Distributive Justice and the Law of Peoples.” In Rawls’s Law of Peoples: A Realistic Utopia?, edited by Rex Martin and David Reidy, 243–60. Oxford: Blackwell Publishing.
  • Harsanyi, John C. 1982. “Morality and the Theory of Rational Behavior.” In Utilitarianism and Beyond, edited by Amartya Sen and Bernard Williams, 39–62. Cambridge: Cambridge University Press.
  • Hurley, Susan. 2003. Justice, Luck, and Knowledge. Oxford: Oxford University Press.
  • Kagan, Shelly. 1999. “Equality and Desert.” In What Do We Deserve? A Reader on Justice and Desert, edited by Louis P. Pojman and Owen McLeod, 298–314. New York: Oxford University Press.
  • Kagan, Shelly. 2012. The Geometry of Desert. New York: Oxford University Press.
    • (Desert-based conception of distributive justice.)
  • Knight, Carl. 2009. Luck Egalitarianism: Equality, Responsibility, and Justice. Edinburgh: Edinburgh University Press.
  • Knight, Carl, and Zofia Stemplowska, eds. 2011. Responsibility and Distributive Justice. New York: Oxford University Press.
  • Knight, Carl. 2013. “Luck Egalitarianism.” Philosophy Compass 8 (10): 924–34.
  • Kymlicka, Will. 1990. Contemporary Political Philosophy. Oxford: Clarendon Press.
  • Lake, Christopher. 2001. Equality and Responsibility. New York: Oxford University Press.
  • Miller, Richard W. 1998. “Cosmopolitan Respect and Patriotic Concern.” Philosophy & Public Affairs 27 (3): 202–24.
    • (A defense of domestic prioritization in redistribution.)
  • Nagel, Thomas. 1991. Equality and Partiality. Oxford: Oxford University Press.
  • Nagel, Thomas. 2005. “The Problem of Global Justice.” Philosophy & Public Affairs 33 (2): 113–47.
    • (Nagel argues that obligations of egalitarian justice only extend as far as a scheme of enforcement, which typically extends only throughout a particular state.)
  • Nagel, Thomas. 2012. “Equality.” In Mortal Questions, 106–27. New York: Cambridge University Press.
  • Nozick, Robert. 1974. Anarchy, State, and Utopia. New York: Basic Books.
    • (A libertarian affirmation of the equality of all persons and rejection of redistribution aiming at greater equality.)
  • Nussbaum, Martha C. 1999. Sex and Social Justice. New York: Oxford University Press.
  • Nussbaum, Martha C. 2001a. “Symposium on Amartya Sen’s Philosophy: 5 Adaptive Preferences and Women’s Options.” Economics and Philosophy 17 (1): 67–88.
  • Nussbaum, Martha C. 2001b. Women and Human Development: The Capabilities Approach. New York: Cambridge University Press.
    • (An influential development of the capabilities approach.)
  • O’Neill, Onora. 1975. “Lifeboat Earth.” Philosophy & Public Affairs 4 (3): 273–92.
  • Otsuka, Michael. 1998. “Self‐Ownership and Equality: A Lockean Reconciliation.” Philosophy & Public Affairs 27 (1): 65–92.
  • Otsuka, Michael. 2003. Libertarianism Without Inequality. Oxford: Oxford University Press.
    • (A reconciliation of libertarianism and substantive distributive justice.)
  • Parfit, Derek. 1995. Equality or Priority? The Lindley Lecture, first presented November 21, 1991. Lawrence: University of Kansas.
  • Parfit, Derek. 1997. “Equality and Priority.” Ratio 10 (3): 202–21.
  • Piketty, Thomas. 2014. Capital in the Twenty-First Century. Translated by Arthur Goldhammer. Cambridge: Belknap Press of Harvard University Press.
  • Pogge, Thomas W. 1989. Realizing Rawls. Ithaca, New York: Cornell University Press.
  • Pogge, Thomas W. 1994. “An Egalitarian Law of Peoples.” Philosophy & Public Affairs 23 (3): 195–224.
  • Rakowski, Eric. 1991. Equal Justice. New York: Oxford University Press.
  • Rawls, John. 2009. A Theory of Justice. Cambridge: Harvard University Press.
  • Raz, Joseph. 1986. The Morality of Freedom. Oxford: Oxford University Press.
  • Roemer, John E. 2009. Equality of Opportunity. Cambridge: Harvard University Press.
    • (Sophisticated account of luck egalitarian equality of opportunity that focuses on effort.)
  • Sangiovanni, Andrea. 2007. “Global Justice, Reciprocity, and the State.” Philosophy & Public Affairs 35 (1): 3–39.
    • (Argues that egalitarian obligations only arise within particular political communities.)
  • Scanlon, Thomas M. 1975. “Preference and Urgency.” The Journal of Philosophy 72 (19): 655–69.
  • Scanlon, Thomas. 1996. “The Diversity of Objections to Inequality.” In The Difficulty of Tolerance: Essays in Political Philosophy, 202–18. Cambridge: Cambridge University Press.
  • Scanlon, Thomas. 1998. What We Owe to Each Other. Cambridge: Harvard University Press.
  • Scheffler, Samuel. 2003. “What is Egalitarianism?” Philosophy & Public Affairs 31 (1): 5–39.
  • Scheffler, Samuel. 2005. “Choice, Circumstance, and the Value of Equality.” Politics, Philosophy & Economics 4 (1): 5–28.
  • Segall, Shlomi. 2007. “In Solidarity with the Imprudent: A Defense of Luck Egalitarianism.” Social Theory and Practice 33 (2): 177–98.
  • Sen, Amartya, and Bernard Williams, eds. 1982. Utilitarianism and Beyond. Cambridge: Cambridge University Press.
  • Sen, Amartya. 1980. “Equality of What?” In The Tanner Lectures on Human Values, vol. 1, edited by S. McMurrin, 195–220. Salt Lake City: University of Utah Press.
  • Sen, Amartya. 1992. Inequality Reexamined. Oxford: Oxford University Press.
  • Sher, George. 1979. “Effort, Ability, and Personal Desert.” Philosophy & Public Affairs 8 (4): 361–76.
  • Tan, Kok-Chor. 2008. “A Defense of Luck Egalitarianism.” The Journal of Philosophy 105 (11): 665–90.
  • Temkin, Larry S. 1993. Inequality. New York: Oxford University Press.
  • Van Parijs, Philippe. 1995. Real Freedom for All: What (If Anything) Can Justify Capitalism? Oxford: Oxford University Press.
  • Walzer, Michael. 1983. Spheres of Justice: A Defense of Pluralism and Equality. New York: Basic Books.
  • Williams, Bernard. 1962. “The Idea of Equality.” In Philosophy, Politics and Society: Second Series, edited by P. Laslett and W.G. Runciman. Oxford: Blackwell.
  • Wolff, Jonathan. 1998. “Fairness, Respect, and the Egalitarian Ethos.” Philosophy & Public Affairs 27 (2): 97–122.
  • Wolff, Jonathan. 2013. “Scanlon on Social and Material Inequality.” Journal of Moral Philosophy 10 (4): 406–25.

 

Author Information

Ryan Long
Email: longr@philau.edu
Philadelphia University
U. S. A.

Jürgen Habermas (1929—)

[Photograph of Jürgen Habermas; photo by Ziel]

Jürgen Habermas produced a large body of work over more than five decades. His early work was devoted to the public sphere, to modernization, and to critiques of trends in philosophy and politics. He then slowly began to articulate theories of rationality, meaning, and truth. His two-volume Theory of Communicative Action in 1981 revised and systematized many of these ideas, and inaugurated his mature thought. Afterward, he turned his attention to ethics and democratic theory. He linked theory and practice by engaging work in other disciplines and speaking as a public intellectual. Given the wide scope of his work, it is useful to identify a few enduring themes.

Habermas represents the second generation of Frankfurt School Critical Theory. His mature work started a “communicative turn” in Critical Theory. This turn contrasted with the approaches of his mentors, Max Horkheimer and Theodor W. Adorno, who were among the founders of Critical Theory. Habermas sees this turn as a paradigm shift away from many assumptions within traditional ontological approaches of ancient philosophy as well as what he calls the “philosophy of the subject” that characterized the early modern period. He has instead tried to build a “post-metaphysical” and linguistically oriented approach to philosophical research.

Another contrast with early Critical Theory is that Habermas defends the “unfinished” emancipatory project of the Enlightenment against various critiques. One such critique arose when the moral catastrophe of WWII shattered hopes that modernity’s increasing rationalization and technological innovation would yield human emancipation. Habermas argued that a picture of Enlightenment rationality wedded to domination only arises if we conflate instrumental rationality with rationality as such—if technical control is mistaken for the whole of rationality. He subsequently developed an account of “communicative rationality” oriented around achieving mutual understandings rather than simply success or authenticity.

Another enduring theme in Habermas’ work is his defense of “post-national” structures of political self-determination and transnational governance against more traditional models of the nation-state. He sees traditional notions of national identity as declining in importance, and the world as faced with problems stemming from interdependency that can no longer be addressed at the national level. Instead of national identity centered on shared historical traditions, ethnic belonging, or national culture, he advocates a “constitutional patriotism” where political commitment, collective identity, and allegiance coalesce around the shared principles and procedures of a liberal democratic constitutionalism facilitating public discourse and self-determination. Habermas also claims that emerging structures of international law and transnational governance represent generally positive achievements moving the global political order in a cosmopolitan direction that better protects human rights and fosters the spread of democratic norms. He sees the emergence of the European Union as paradigmatic in this regard. However, his cosmopolitanism should not be overstated. He does not advocate global democracy in any strong sense, and he is committed to the idea that democratic self-determination requires a measure of localized mutual identification in the form of civic solidarity—a legally mediated solidarity around shared history and institutions, rooted in some shared “ethical” pattern of life (see the Sittlichkeit discussion below) and fostering mutual understandings.

Table of Contents

  1. Biography: Early Life to Structural Transformation
  2. Enduring Themes in Formative and Transitional Work
    1. Public Deliberation Over Positivist Decisionism and Technocracy
    2. From Philosophical Anthropology to a Theory of Social Evolution
  3. The Linguistic Turn into the Theory of Communicative Action
  4. Discourse Ethics
  5. Political and Legal Theory
  6. References and Further Reading
    1. General Introductions to Habermas
    2. Introductory Books and Articles on Specific Themes
      1. Biography
      2. Linguistic Turn
      3. Discourse Ethics
      4. Political Theory
    3. Works Cited
    4. Secondary Scholarship Beyond the Subject-Specific Recommendations Cited Above

1. Biography: Early Life to Structural Transformation

Habermas was born in 1929 in Düsseldorf, Germany. He has noted that early corrective surgeries for a cleft palate sensitized him to human vulnerability and interdependence, and that subsequent childhood struggles with fluid verbal communication may partly explain his theoretical interest in communication and mutual recognition. He has also cited the end of WWII and frustrations over postwar Germany’s uneven willingness to fully break with its past as key personal experiences that inform his political theory.

Habermas belongs to what historians call the “Flakhelfer generation” or the “forty-fivers.” Flakhelfer means antiaircraft-assistant. At the end of the war, people born between 1926 and 1929 were drafted and sent to help man antiaircraft artillery defenses. Over a million youth served as such personnel. The second “forty-fiver” label captures how this generation came of age with the 1945 Nazi defeat. These experiences fostered a political skepticism and vigilance born out of having been exploited, and an affinity for the nascent liberal democratic principles of postwar Germany. Both labels capture formative features of Habermas’ biography (Specter 2010, Matustik 2001).

Reflecting on his upbringing during the war, Habermas describes his family as having passively adapted to the Nazi regime—neither identifying with nor opposing it. He was recruited into the Hitler Youth in 1944 and sent to man defenses on the western front shortly before the war ended. Soon thereafter he learned of the Nazi atrocities through radio broadcasts of the Nuremberg trials and concentration camp documentaries at local theaters. Such experiences left a deep impact: “all at once we saw that we had been living in a politically criminal system” (AS 77, 43, 231).

After the war, he studied philosophy at the universities of Göttingen (1949-50), Zurich (1950-51) and Bonn (1951-54). He wrote his thesis on Schelling under the direction of Erich Rothacker and Oskar Becker. He was increasingly frustrated with the unwillingness of German politicians and academics to own up to their role in the war. He was disappointed in the postwar government’s failure to make a fresh political start and distressed by continuities with the past. In interviews, he has recalled leaving a campaign rally in 1949 after being disgusted by the far-right connotations of the flags and songs used. He was similarly disappointed by German academics. At university he studied the work of Arnold Gehlen and Martin Heidegger extensively, but their prior Nazi ties were not discussed openly. In 1953 Heidegger published his 1935 lecture course Introduction to Metaphysics in a largely unedited form that included a reference to the “inner truth and greatness” of the National Socialist movement. Habermas published an op-ed challenging Heidegger, and the lack of response seemed to confirm his suspicions (NC, 140-172). He wrote a piece critiquing Gehlen a few years later (1956). Around the same time he was distressed to learn Rothacker and Becker had also been active Nazi party members.

Near the end of his studies Habermas worked as a freelance journalist and published essays in the intellectual journal Merkur. He took an interest in the interdisciplinary Institute for Social Research affiliated with the University of Frankfurt. The Institute had returned from wartime exile in 1950, and Adorno became co-director in 1955. Adorno was familiar with Habermas’ essays and took him on as a research assistant. While at the Institute Habermas studied philosophy and sociology, worked on research projects, and continued to publish op-ed pieces. One such piece, Marx and Marxism, struck Horkheimer as too radical. Horkheimer wrote to Adorno suggesting he dismiss Habermas from the Institute. The following year Horkheimer rejected Habermas’ Habilitationsschrift proposal on the public sphere. Habermas did not want to alter his project, so he completed his Habilitation at the University of Marburg under the Marxist political scientist Wolfgang Abendroth.

His Habilitationsschrift, The Structural Transformation of the Public Sphere (German 1962, English 1989), was well received in Germany. It chronicled the rise of the bourgeois public sphere in 18th and 19th century Europe, as well as its decline amidst the mass consumer capitalism of the 20th century. Habermas gave an account of the way in which newspapers, coffee shops, literary journals, pubs, public meetings, parliament and other public forums facilitated the emergence of powerful new social norms of discourse and debate that mediated between private interests and the public good. These forums functioned as mechanisms to disseminate information and help freely form the public political will needed for collective self-determination. These norms also partly embodied important principles like equality, solidarity, and liberty. By the late 19th century, however, capitalism was increasingly monopolistic. Large corporations easily influenced the state and society. Economic elites could use ownership of the media and other (previously public) forums to manipulate or manufacture public opinion and buy off politicians. Citizens deliberating about the common good were transformed into atomized consumers pursuing private interests. Habermas describes this as the “re-feudalization” of the public sphere. While his narrative was pessimistic, the end of Structural Transformation seems to hold out hope that the truncated normative potential of the public sphere may yet be revived. The work solidified Habermas’ place in the German academy. After a short stint in Heidelberg, he returned to the University of Frankfurt in 1964 as a professor of philosophy and sociology, taking over the chair vacated by Horkheimer’s retirement.

In the spirit of his early call for renewed public sphere debate, Habermas has consistently engaged political movements as a public intellectual and taken part in various scholarly debates. This has not always been easy. After returning to Frankfurt he had been a mentor for the German student movement, but had a falling out with student radicals in 1967. In June of that year a variety of simmering protests—over the restructuring of German universities, proposed “emergency laws,” the Vietnam War, and other issues—boiled over. The breaking point came when a student at a protest against the Shah of Iran was beaten and fatally shot by plainclothes police, who then tried to cover up the incident. This stoked the flames of student protest. Sit-ins and demonstrations crippled everyday life. Under the leadership of Rudi Dutschke students occupied the Free University of Berlin.

Habermas worried that protest leaders seemed to be advocating an unsophisticated and extra-legal opposition to any and all authority that could easily lead to violence. At a conference in Hannover shortly after the shooting he publicly reproached Dutschke by calling his model of extra-legal direct-action “left fascism.” That charge alienated Habermas from the leftist student movement and inspired the essay collection Die Linke antwortet Jürgen Habermas (The Left Answers Jürgen Habermas—German 1969). Rapprochement would only come a decade later when, in the aftermath of a series of killings by the radical left-wing Red Army Faction, politicians on the right tried to garner political capital by suggesting that such terrorism was rooted in the ideas of Frankfurt School Critical Theory. Habermas and Dutschke published pieces repudiating the accusation. A decade later, the editor of the essay collection apologized for the way the book made it seem as though Habermas’ falling out with the student movement marked a conservative turn that meant he was no longer part of the left.

As a public intellectual, Habermas has engaged a variety of topics: the anti-nuclear movement of the late fifties, the “Euromissile” debate of the early eighties and, in the early two-thousands, both the terrorism of 9/11 and the second Iraq War. In the second half of the eighties he was also a key voice in the Historikerstreit, a debate among historians, philosophers, and other academics about the proper way for Germany to situate and remember the Holocaust amidst the history of other atrocities. In 1989 he made important contributions to public debate about the reunification of Germany. While Habermas was not against reunification, he was critical of the speed and manner in which reunification was carried out. More recently, he has approached public debate on the European Union along the broadly similar lines of a cautious optimism that is also on guard against a forced, rushed, or hollow unity that would lack legitimacy and stability over the long term.

In a more academic vein, he has had numerous exchanges with thinkers like Jacques Derrida, Richard Rorty, Hans-Georg Gadamer, Niklas Luhmann, John Rawls, Robert Brandom, Hilary Putnam, and Cardinal Joseph Ratzinger (before he was Pope Benedict XVI). His ongoing engagement with postmodernism is arguably the most enduring line of debate. Broadly speaking, thinkers like Michel Foucault, Jacques Derrida, and Richard Rorty have leveled criticisms to the effect that reason is little more than a historically and culturally contingent social form, that notions of universally valid morality and truth are ethnocentric projections of power, that interests shaped by radically different ways of life are irreconcilable, and that our belief in the emancipatory moral progress of humankind is a myth. Habermas has tried to meet such challenges in much the same way as he responded to Horkheimer and Adorno’s Dialectic of Enlightenment: by relying on his account of communicative rationality in Theory of Communicative Action. However, before turning to that more mature theory, we must survey a few major phases of his formative and transitional work.

2. Enduring Themes in Formative and Transitional Work

a. Public Deliberation Over Positivist Decisionism and Technocracy

The essays in Towards a Rational Society (German 1968 and 1969, English 1970) and Theory and Practice (German 1971, English 1973b) were written on the heels of Structural Transformation, amidst the “positivism dispute” in Germany about the relation between the natural and social sciences. The (somewhat inaccurately labeled) “positivist” side of this debate took scientific inquiry as the sole paradigm of knowledge and generally thought of the social sciences as analogous to the natural sciences. Following Adorno, Habermas argued against a positivistic understanding of the social sciences.

For Habermas, positivism is comprised of three claims: (1) knowledge consists of causal explanations cast in terms of basic laws or principles (for example, laws of nature), (2) knowledge passively reflects or mirrors independently existing natural facts, (3) knowledge is about what is, not what ought to be. He calls these claims scientism, objectivism, and value-neutrality. He said each can be pernicious, especially in the social scientific realm. Scientism fosters the view that only causal and empirically verifiable hypotheses can count as true knowledge. Objectivism seems to falsely naturalize the world by ignoring how lived experiences, human subjectivity, and interests can structure the object domain that gets identified as relevant or worthy of study. Lastly, value-neutrality misleads us into thinking that the role of knowledge is purely descriptive and technical. Values or preferences are seen as separate from knowledge and, as such, wholly subjective “givens” lying beyond rational justification. In turn, knowledge is seen as a tool for efficiently controlling the environment so as to realize whatever values an agent happens to hold. Ironically, this fails to see the tacit value commitments already inscribed in this general paradigm of knowledge.

Habermas’ critique makes sense given his place in Frankfurt Critical Theory. Despite differences with the first generation, he shares the decidedly non-neutral commitments to human emancipation, interdisciplinarity, and self-reflexive theory. Like Horkheimer and Adorno, Habermas worried the prior ascendancy of positivism had left influences on our conceptualizations of knowledge and social inquiry that were hard for even reflective positivists to leave behind. Indeed, he critiques Karl R. Popper’s account of inquiry and knowledge even though it rejects what Habermas calls objectivism. In opposition to a positivist picture of knowledge merely mirroring the world, Habermas holds the Frankfurt School’s Hegelian-Marxist-inspired conception of a dialectical relation between knowledge and world. Finally, like his Frankfurt School contemporaries, Habermas was concerned that positivism had left subtle yet pernicious impacts on politics.

In early writings Habermas is especially critical of two related trends, decisionism and technocracy, that stem from a positivistic understanding of political science and practice. Decisionism starts from the assumption that there is no such thing as the public interest, but rather a clash of inherently subjective values that do not (even in principle) admit of rational persuasion or agreement. It follows that political elites must either simply decide between competing values or base policy on their aggregation. Either way, political value preferences are taken as brute or static facts; there is no sense in which reasoned argumentation and persuasion could genuinely transform such preferences or lead people to a new understanding of their values. Technocracy builds from this point by emphasizing the “objective necessities” (Sachzwänge) supposedly involved in a political system—economic growth, social stability, national security—and highlighting the increasing ability of policy experts to advise political leaders about strategies for optimally realizing these goals. The worry with this approach is that questions about what specific type of growth, stability, and security we seek (and why) are removed from debate by definitional fiat. In decisionism, political legitimacy flows from periodic expressions of acclamation or disapproval at the way leaders have manifested predefined values. In technocracy, legitimacy supposedly flows from the ability of politicians to find and follow expert advice so as to attain fixed outcomes pre-defined by “objective necessities.” Both models render the potentially transformative effects of public deliberation superfluous. Legitimacy is seen as flowing from either certain outcomes or periodic expressions of aggregate preference.

Habermas thinks both models are extremely problematic accounts of democratic political practice and legitimacy. While Structural Transformation only gestured at how the normative potential of the public sphere could be reinvigorated in contemporary circumstances, this theme received increasing attention in works such as Legitimation Crisis (German 1973, English 1975), Theory of Communicative Action, and Between Facts and Norms (German 1992, English 1996). An account of democratic legitimacy that combats decisionism and technocracy is an enduring concern. Indeed, despite championing the European Union, he has continued to critique technocracy, criticizing the technocratic manner in which the EU has arisen and is currently structured (2008, 2009, 2012, 2014).

b. From Philosophical Anthropology to a Theory of Social Evolution

Knowledge and Human Interests (German 1968, English 1971) and Communication and the Evolution of Society (German 1976, English 1979) are two early attempts at a new systematic framework for Critical Theory. The approaches he uses are akin to the tradition of “philosophical anthropology” in the German social theory of the early 1900s that grew out of phenomenology—a tradition that is quite different from contemporary anthropology. Knowledge and Human Interests sought to overcome a positivist epistemology that saw knowledge as simply discerning static facts, and to give a plausible account of the dialectical relation between knowledge (theory) and world (practice). Habermas’ main claim was that the growth of scientific and social knowledge is tacitly guided by three types of “knowledge-constitutive interests”—technical, practical, and emancipatory—that are “anthropologically deep-seated” in the human species.

Knowledge and Human Interests tries to recover and develop alternative models of the relation between theory and practice. The approach is historical and reconstructive in that it interprets the attempts of prior theorists as part of a trajectory that Habermas wants to extend. He reviews prior reformulations of Kant’s “transcendental synthesis” (the form-legislating activity making objective experience possible) and his “transcendental unity of apperception” (the unity of the subject having such experience). He also tries to articulate the way in which Hegel relocated such synthesis in the historical development of human subjectivity (absolute spirit) and how Marx relocated it in the material use of tools and techniques (embodied labor). Habermas wants to add to such a trajectory by rehabilitating their shared insight that the constitution of experience is not generated by transcendental operations but by the worldly natural activities of the human species. Yet he wants to do this in a way that avoids the mistakes of Marx and Hegel as well. He tries to do this by building on his interpretation of Hegel, which was already concisely captured by his essay Science and Technology as Ideology (German 1968, included in English 1970).

In that essay he responded to Herbert Marcuse’s claim that the technical reason of science inherently embodies domination. According to Marcuse, under late capitalism the technical reason of science functions ideologically to collapse intersubjective practical questions about how we want to live together into technical questions about how to control the world to get what we want. Habermas shares Marcuse’s concerns, as his criticism of technocracy makes clear. Yet he thinks this dynamic is contingent because, taken as an emergent collective project, humankind constitutes how the world shows up in experience through its worldly activity. More specifically, Habermas identifies two irreducibly distinct and dialectically related modes of human self-formation, “labor” and “interaction.” Whereas labor is an action type that aims at technical control to achieve success, interaction is an action type that aims at mutual understandings embodied in consensual norms. Marcuse’s claim (and his remedy of a “new science”) would only stand if the “interaction” of intersubjective collective political choice—including the question of how we use technology—was somehow subsumed or rendered superfluous by the “labor” of technological progress in controlling the external world. But, given Habermas’ views in this period, this is impossible. Interaction and labor seem to be pitched as irreducible and invariant categories of human experience. Neither can be dropped nor can one be subsumed in the other—even if their relation becomes unbalanced.

In Knowledge and Human Interests, this division between labor and interaction is recast as the technical and practical interests of humankind. The technical interest is in the material reproduction of the species through labor on nature. Humans use tools and technologies to manage nature for material accommodation. The practical interest is in the social reproduction of human communities through intersubjective norms of culture and communication. Human social life requires members who can understand each other, share expectations, and achieve cooperation. In a sense, these interests are the “most fundamental.” Moreover, the knowledge that flows from them is supposed to slowly accrue over time in the enduring institutions of society: theoretical knowledge driven by the technical interest in controlling nature accrues in the “empirical-analytic” sciences, and normative knowledge driven by the practical interest in mutual understandings accrues in the interpretive “historical-hermeneutic” sciences.

But, going beyond Science and Technology as Ideology, in Knowledge and Human Interests Habermas adds a third “emancipatory” human interest in freedom and autonomy. The labor of material reproduction and the interaction norms of social reproduction require, in a weak sense, psychosocial mechanisms to repress or deny basic drives and impulses that would destroy material and social reproduction. For instance, labor requires delayed gratification and social interaction requires internalized notions of obligation, reciprocity, shame, guilt, and so forth. Unfortunately, psychosocial mechanisms of control are often used far more than they need to be to secure material and social reproduction. Indeed, perverse incentives to rely on such mechanisms may even arise: if the burdens and benefits of material and social reproduction processes become unfairly distributed across groups and solidified over time, then those in power may find psychosocial mechanisms useful. If women are falsely taught there are natural laws of gender relations such that the dominant patterns of marriage and domestic work that consistently disadvantage them are the best they can hope for, this is an ideological mechanism of social control. It is the limitation of freedom and autonomy for no purpose other than domination, and it “functions” through systematically distorted communication.

Habermas posits a human interest in using self-reflection and insight to combat ideologically veiled, superfluous social domination so as to realize freedom and autonomy. While there is no clearly institutionalized set of sciences where the knowledge spurred on by such an interest would accrue, Habermas points to Marx’s critique of ideology and Freud’s psychoanalytic dissolution of repression as demonstrating a cognitive viewpoint that focuses on neither (efficient) work nor (legitimate) interaction but (free) identity formation liberated from internalized systematically distorted communication. Here Habermas takes his lead from Kant’s idea that reason aims to emancipate itself from “self-incurred tutelage,” and tries to forge a link between theory (reason) and practice (in the sense of self-realization) through using critical reflection on self and society to unveil and dissolve internalized oppressive power structures that betray one’s own true interests.

Knowledge and Human Interests was envisioned as a preface for two other books that would jointly challenge the separation of theory and practice. However, the project was never finished. On the one hand, Habermas felt that vibrant critiques of positivism in the philosophy of science made the rest of the project superfluous. On the other, the work encountered heavy criticism. For starters, Habermas seems to pitch work and interaction as real action types. But, if we account for how work is communicatively structured, how interaction is teleologically ordered, and how historical notions of work and interaction structure one’s sense of freedom, then it is clear these can be at best idealizations. Moreover, as even sympathetic interpreters noted, his account of an emancipatory interest seemed to blur together reflection on “general presuppositions and conditions of valid knowledge and action” with “reflection on the specific formative history of a particular individual or group” (Giddens, McCarthy, 95). Lastly, his stipulation of knowledge-constitutive interests seemed to reproduce the sort of foundationalism he wished to avoid.

Given such criticism, it may seem surprising that Communication and the Evolution of Society reconstructs Marx’s historical materialism as a theory of social evolution. This sounds foundationalist and deterministically teleological. These impressions are misleading. Around this time Habermas began presenting his work as a “research program” with tentative and fallible claims evaluable by theoretical discourses. Moreover, while he speaks of evolution, he uses the term differently than 19th century philosophies of history (Hegel, Marx, Spencer) or later Darwinian accounts. His “social evolution” is neither a merely path-dependent accumulative directionality nor a progressive, strongly teleological realization of an ideal goal. Instead, he envisions a society’s latent potentials as tending to unfold according to an immanent developmental logic similar to the developmental logic cognitive-developmental psychologists claim maturing people normally follow. Lastly, Habermas’ theory of social evolution avoids worries about determinism by distinguishing between the logic and the mechanisms of development such that evolution is neither inevitable, linear, irreversible, nor continuous. A brief sketch of his theory follows.

Habermas characterizes human society as a system that integrates material production (work) and normative socialization (interaction) processes through linguistically coordinated action. This is qualitatively different from the static and transitive status hierarchy systems of even other “social” animals. In various human epochs the linguistic coordination of these processes crystallizes around different “organizational principles” that are the “institutional nucleus” of social integration. In the most basic societies kinship structures play this role by (to take just one possible configuration) dividing labor and specifying socialization responsibilities through sex-based roles and norms. Habermas claims this organizational principle was replaced by political order in traditional societies and the economy in liberal capitalist societies. Social evolution in general and the particular movements from one “nucleus” to the next stem from learning in material and social reproduction.

Understood as ideal types, work and interaction mark out different ways of relating to the world. On the one hand, in material production one mainly adopts an instrumental perspective that tries to control an object in conformity to one’s will. In this orientation, learning is gauged by success in controlling the world and the resultant knowledge is cognitive-technical. On the other, in social reproduction one mainly adopts a communicative perspective that tries to coordinate actions and expectations through consensually agreed upon normative standards. In this orientation learning is gauged by mutual understanding and the resultant knowledge is moral-practical. Each learning process follows its own logic. But, since the processes are integrated in the same social system, advances in either type of knowledge can yield internal tensions or incongruities. These cannot be suppressed by force or ideology for long, and eventually need to be solved by more learning or innovation. If these internal tensions are too great, they induce a crisis requiring an entirely new “institutional nucleus.”

For Habermas, the slow social learning in history is the sedimentation of iterated processes of individual learning that accumulates in social institutions. While there is no unified macro-subject that learns, social evolution is also not mere happenstance plus inertia. It is the indirect outcome of individual learning processes, and such processes unfold with a developmental logic or deep structure of learning: “the fundamental mechanism for social evolution in general is to be found in an automatic inability not to learn. Not learning but not-learning is the phenomenon that calls for explanation” (LC, 15; also see Rapic 2014, 68). Habermas posits a universal developmental logic that tends to guide individual learning and maturation in technical-instrumental and moral-practical knowledge. He discerns this logic in the complementary research of Jean Piaget in cognitive development and Lawrence Kohlberg in the development of moral judgment. As social and individual learning are linked, such underlying logic has slowly created homologies—similarities in sequence and form—between: (i.) individual ego-development and group identity, (ii.) individual ego-development and world-perspectives, and (iii.) the individual ego-development of moral judgment and the structures of law and morality (Owen 2002, 132). Habermas pays more attention to the last homology and later writings focus on Kohlberg, so it is instructive to focus there (1990b).

Kohlberg’s research on how children typically develop moral judgment yielded a schema of three levels (pre-conventional, conventional, and post-conventional) and six stages (punishment-obedience, instrumental-hedonism/relativism, “good-boy-nice-girl”, law-and-order, legalistic social-contract, universal ethical principles). Two stages correspond to each level. Habermas follows Kohlberg’s three levels in claiming we can retrospectively discern pre-conventional, conventional, and post-conventional phases through which societies have historically developed. Just as normal individuals who progress from child to adult pass through levels where different types of reasons are taken to be acceptable for action and judgment, so too we can retrospectively look at the development of social integration mechanisms in societies as having been achieved in progressive phases where legal and moral institutions were structured by underlying organizational principles.

Habermas slightly diverges from the six stages of Kohlberg’s schema by proposing a schema of neolithic societies, archaic civilizations, developed civilizations, and early modern societies. Neolithic societies organized interaction via kinship and mythical worldviews. They also resolved conflicts in a pre-conventional way, through feuds or by appealing to an authority to mediate disputes and restore the status quo. Archaic civilizations organized interaction via hierarchies beyond kinship and tailored mythical worldviews backing such hierarchies. Conflicts started to be resolved via mediation appealing to an authority relying on more abstract ideas of justice—punishment instead of retaliation, assessment of intentions, and so forth. Developed civilizations still organized interaction conventionally, but adopted a rationalized worldview with post-conventional moral elements. This allowed conflicts to be mediated by a type of law that, while rooted in a community’s (conventional) moral framework, was separable from the authority administering it. Finally, with early modern societies, we find certain domains of interaction are post-conventionally structured. Moreover, a sharper divide between morality and legality emerges such that conflicts can be legally regulated without presupposing shared morality or needing to rely on the cohering force of mythical worldviews backing hierarchies (McCarthy 1978, 252).

Obviously, this sketch is rather vague and needs further elaboration. This is especially true in light of the ways a superficial reading (that takes social evolution as strictly parallel rather than homologous to individual development) lends itself to unsavory developmentalist narratives. Yet, apart from a few later writings, Habermas has not returned to his theory of social evolution in a systematic way. Several secondary authors have tried to fill in the details (Rockmore 1989, Owen 2002, Brunkhorst 2014, Rapic 2014). Nevertheless, Habermas still endorses the contours of his theory of social evolution: these ideas show up in Theory of Communicative Action and his later writings on the nature and development of legality and democratic legitimacy bear a loose connection to this early work (especially the final homology above) insofar as they are tailored for specifically post-conventional societies. Yet, before turning to his democratic theory, we must tackle the hugely important intervening body of work concerning his communicative turn and its articulation in his Theory of Communicative Action.

3. The Linguistic Turn into the Theory of Communicative Action

Habermas’ engagement with speech act theory and hermeneutics in the late 1960s and 70s started a linguistic turn that came to full fruition in Theory of Communicative Action. This turn makes sense after both Knowledge and Human Interests and Communication and the Evolution of Society. He came to see the knowledge-constitutive interests of the former as illicitly relying on assumptions in the philosophy of consciousness and Kantian transcendentalism, while the reconstructed phases of social learning and evolution in the latter can seem far too naturalistic or foundationalist. In contrast, a focus on communicative structures let him form his own pragmatic theory of meaning, rationality, and social integration based in reconstructions of the competencies and normative presuppositions underlying communication. This approach is transcendental and naturalistic but only weakly so. Far from an account of ultimate foundations, his approach takes itself to be a post-metaphysical methodology for philosophical and social scientific research into practical reason. From the start of his linguistic turn until well after Theory of Communicative Action this approach underwent revisions. In what follows, only a broad outline of this trajectory is given.

Habermas has cited his 1971 Gauss lectures at Princeton (German publication 1984b, English publication 2001) as the first clear expression of the linguistic turn, but it was also evident in On the Logic of the Social Sciences (German 1967, English 1988a). His first truly systematic foray into Anglo-American philosophy of language came with What is Universal Pragmatics? (German 1976b, included in English 1979). His ideas were then revised further in Theory of Communicative Action. While the development of his ideas throughout this period is an important exegetical task, for present purposes what matters is the broad way he takes up speech act theory: he accepts the division in linguistics between syntax, semantics, and pragmatics. He considers each division to be reconstructing the tacit system of rules used by competent speakers to recognize the well-formedness (syntax), meaningfulness (semantics), and success (pragmatics) of speech. His main interpretive twist is that the theories of truth-conditional propositional meaning often associated with philosophical projects regarding language only locate part of the meaning of speech. Thus, he moves away from meaning based on the correspondence theory of truth and gives an account of the unique pragmatic validity behind the meaning of speech.

While his linguistic turn is sometimes cast as a break with prior theory, his interpretive approach actually coheres quite well with his early critique of positivism. He has always rejected the idea that language simply states things about the world. Instead of merely analyzing propositions that either do (true) or do not (false) obtain in the world, he is interested in the full range of ways people use language. He claims that, instead of focusing on sentences, a complete theory of language would focus on contextual utterances as the most basic unit of meaning. Thus, he developed a formal pragmatics (called “universal pragmatics” in early work). Building on the work of Karl Bühler, he conceives of the pragmatic use of language in context as embedding sentences in relations between speaker, hearer, and the world. This embedding helps to intersubjectively stabilize such relations. Habermas claims that, in uttering a speech act, speakers mean something (express subjective intentions), do something (interact with or appeal to a hearer), and say something (cognitively represent the world). While truth-conditional theories of meaning focus on cognitive representations of the world, Habermas prioritizes the pragmatics of speech acts over the semantic or syntactical analysis of sentences. What is done through speech is taken to be what is most basic for meaning.

During his linguistic turn Habermas appropriated several ideas from John Searle. Even though Searle has not always fully agreed with such appropriations, two of them are useful points of orientation (Searle 2010, 62). Habermas adopts Searle’s idea of the constitutive rules underlying language: just as the rules of a game define what counts as a legitimate move or status, so too there is an implicit rule-governed structure to the use of language by competent speakers. He also adopts Searle’s view, built on J. L. Austin’s work, that speech has a double structure of both propositional content and illocutionary force. For instance, the propositional content of “it is snowy in Chicago” is a representation of the world. But the same content can be used in different illocutionary modes: as a warning to drive carefully, as a plea to delay travel, as a question or answer in a larger conversation, and so on. Moreover, beyond such illocutionary force, all speech acts also have derivative perlocutionary effects that, unlike illocution, are not internally connected to the meaning of what is said. A warning about snow may elicit annoyance or gratitude, but such responses are contextually inferred and not necessarily connected to either the propositional content or the warning itself.

These ideas about the structure of speech highlight a few key points. First, Habermas takes perlocutionary success (for example, eliciting gratitude) to be parasitic on illocutionary force (for example, the speech is perceived as a warning, not a plea). Attaining success with others by realizing one’s intention in the world is secondary to achieving an understanding with them. For example, even a lie only works if the hearer first comes to the false understanding that what is being said is true. Second, he identifies three modes of communication—cognitive, interactive, and expressive—that depend on whether a speaker’s main illocutionary intention is to raise a truth claim of propositional content, a claim of rightness for an act, or a claim of sincerity about psychological states. Third, he identifies corresponding speech act types—constatives, regulatives, and expressives—that, seen from the perspective of a competent language user, contain immanent obligations to redeem the aforementioned claims by respectively providing grounds, articulating justifications, or proving sincerity and trustworthiness.

In short, Habermas thinks there are general presuppositions of communicative competence and possible understanding that underlie speech and which require speakers to take responsibility for the “fit” between an utterance and inner, outer, and social worlds. For any speech act oriented towards mutual understanding, there is a presumed fit of sincerity to the speaker’s inner world, truth to the outer world, and rightness to what is intersubjectively done in the social world. Naturally, these presumptions are defeasible. Yet, the point is that speakers who want to reach an agreement have to presuppose sincerity, truth, and rightness so as to be able to mutually accept something as a fact, valid norm, or subjectively held experience.

For Habermas these elements form the “validity basis of speech.” He claims that, by uttering a speech act, a speaker is seen as also potentially raising three “validity claims”: sincerity for what is expressed, rightness for what is done, and truth for what is said or presupposed. Depending on the speech act type, one claim often predominates (for example, constatives raise a validity claim of truth) and, more often than not, speech rests on undisturbed background agreements about facts, norms, and experiences. Moreover, minor disagreement can be quickly resolved through clarifying meaning, reminding others of facts, asking about preexisting commitments, highlighting situational features, and so on. Habermas sometimes refers to such minor communicative repairs as “everyday speech.” But when disagreement persists we may need to transition to what Habermas calls “discourse”: a particular mode of communication in which a hearer asks for reasons that would back up a speaker’s validity claim. In discourse the validity claims that are always immanent within speech become explicit.

Clearly, Habermas uses “validity” in an odd way. The notion of validity is most often used in formal logic where it refers to the preservation of truth when inferentially moving from one proposition to another in an argument. This is not how Habermas uses the term. What then does he mean by validity? It is instructive to look at the assumptions behind his theory of meaning. When his model of meaning emphasizes what language does over what it merely says or means, the operative assumption is that the primary function of speech is to arrive at mutual understandings enabling conflict-free interaction. Moreover, at least with respect to claims of truth and rightness, he assumes genuine and stable understandings arise out of the give and take of reasons. Claims of truth and rightness are paradigmatically cognitive in that they admit of justification through reasons offered in discourse. What Habermas means by validity then is a close structural relationship between the give and take of reasons and either achieving an understanding or (more strongly) a consensus that allows for conflict-free interaction. This yields an “acceptability theory” of meaning where the acceptance of norms is always open to further debate and refinement through better reasons.

As we cannot know in advance what reasons will bear on a given issue, only robust and open discourses license us to take the (provisional) consensuses we do achieve as valid. Habermas therefore formulates formal and counterfactual conditions—the “pragmatic presuppositions” of speech and the “ideal speech situation”—that describe and set standards for the type of reason-giving that mutual understandings must pass through before we can regard them as valid (on these formal conditions and how understanding and consensus may differ see below and section 4). At the same time, we never start this give-and-take of reasons from scratch. People are born into cultures operating on background understandings that are embodied in inherited norms of action. Borrowing from Husserl and others, Habermas calls this stock of understandings the “lifeworld.”

The lifeworld is an important if somewhat slippery idea in Habermas’ work. One way to understand his particular interpretation of it is through the lens of his debate with Gadamer. Broadly speaking, Habermas agrees with the view of language held by Gadamer and hermeneutics generally: language is not simply a tool to convey information, its most basic form is dialogic use in context, and it has an inbuilt aim of understanding. On such a view, objectivity is not just correspondence to an independent world but instead something that is ascribed to mutual understandings (about the world, relations to others, and oneself) intersubjectively achieved in communication. Moreover, communication has an underlying structure that makes understandings possible in the first place. Meaning is therefore in some sense parasitic on this background structure.

On this much Gadamer and Habermas agree. But Gadamer takes all this to mean that explicit understanding and misunderstanding are only possible due to a taken-for-granted understanding of cultural belonging and socialization into a natural language. Habermas agrees that culture and socialization are important, but is worried that Gadamer’s take on the background structures that form the “conditions of possibility” for meaning yields a relativistic “absolutization of tradition.” On Habermas’ interpretation the lifeworld encompasses the sort of belonging and socialization referred to by Gadamer, but it works with and is underpinned by certain deep structures of communication itself. For Habermas, the complementarity between the lifeworld and a particular manifestation of these deep structures in discourse and “communicative action” (below) is what lets one interrogate and progressively revise parts of the background stock of inherited understandings and validity claims, thereby avoiding either relativism or the dogmatic veneration of tradition.

For Habermas the lifeworld is a reservoir of taken-for-granted practices, roles, social meanings, and norms that constitutes a shared horizon of understanding and possible interactions. The lifeworld is a largely implicit “know-how” that is holistically structured and unavailable (in its entirety) to conscious reflective control. We pick it up by being socialized into the shared meaning patterns and personality structures made available by the social institutions of our culture: kinship, education, religion, civil society, and so on. The lifeworld sets out norms that structure our daily interactions. We don’t usually talk about the norms we use to regulate our behavior. We simply assume they stand on good reasons and deploy them intuitively.

But what if someone willfully breaks or explicitly rejects a norm? This calls for discourse to explain and repair the breach or alter the norm. As a micro-level example: if someone breaks a promise then they will be asked to justify their behavior with good reasons or apologize. Such communication is also called for when norms suffer more serious breakdowns: one may question the reasons behind norms and whether they remain valid, or run into a new and complex situation where it is unclear which norms apply, how, and to what extent. Regardless of how serious the norm breach or breakdown is, we need to engage in discourse to repair, refine, and replenish shared norms that let us avoid conflict, stabilize expectations, and harmonize interests. Discourse is the legitimate modern mechanism to repair the lifeworld; it embodies what Habermas calls “communicative action.”

Communicative action can be seen as a practical attitude or way of engaging others that is highly consensual and that fully embodies the inbuilt aim of speech: reaching a mutual understanding. In later writings Habermas distinguishes weak and strong communicative action. The weak form is an exchange of reasons aimed at mutual understanding. The strong form is a practical attitude of engagement seeking fairly robust cooperation based in consensus about the substantive content of a shared enterprise. This allows solidarity to flourish. In either form, communicative action is distinct from “strategic action,” wherein socially interacting people aim to realize their own individual goals by using others like tools or instruments (indeed, he calls this type of action “instrumental” when it is solitary or non-social). A key difference between strategic and communicative action is that strategic actors have a fixed, non-negotiable objective in mind when entering dialogue. The point of their engagement is to appeal to, induce, cajole, or compel others to comply with what they think it takes to bring their objective about. In contrast, communicatively acting parties seek a mutual understanding that can serve as the basis for cooperation. In principle, this involves openness to an altered understanding of one’s interests and aims in the face of better reasons and arguments.

The contrast between communicative and strategic action is tightly linked to the distinction between communicative and purposive rationality. Purposive rationality is when an actor adopts an orientation to the world focused on cognitive knowledge about it, and uses that knowledge to realize goals in the world. As noted, it has social (strategic) and non-social (instrumental) variants. Communicative rationality is when actors also account for their relation to one another within the norm-guided social world they inhabit, and try to coordinate action in a conflict-free manner. On this model of rationality, actors care not only about their own goals or about following the norms others follow, but also about challenging and revising those norms on the basis of new and better reasons.

Approaching rationality by way of action orientations, rather than the reverse, is not merely a stylistic choice. Habermas notes that while many theorists start with rationality and then analyze action, the view of action that such an order of analysis primes us to accept can tacitly smuggle in quasi-ontological connotations about the possible relations actors can have amongst themselves and to the world. Indeed, this mistake figures into Habermas’ critique of Weber’s account of the progressive social rationalization ushered in by modernity. Weber framed Western rationalism in terms of “mastery of the world” and then naturally assumed the rationalization of society simply meant increased purposive rationality. As is apparent from Habermas’ account of social learning, this is not the only way to understand the “evolution” of societies or the species as a whole throughout history. By expanding rationality beyond purposive rationality Habermas is able to resist the Weberian conclusion that had been attractive to Horkheimer and Adorno: that modernity’s increasing “rationalization” yielded a world devoid of meaning, people focused on control for their own individual ends, and that the spread of enlightenment rationality went conceptually hand in glove with domination. Habermas feels the notion of rationality in his Theory of Communicative Action resists such critiques.

The contrast between communicative and strategic action mainly concerns how an action is pursued. Indeed, while these action orientations are mutually exclusive when seen from an actor’s perspective, the same goal can often be approached in either communicative or strategic ways. For instance, in my rural town I may have a discussion with neighbors whereby we determine we share an interest in having snow cleared from our road, and that the best way to do this is by taking turns clearing it. This could count as an instance of communicative action. But, imagine a wealthy and powerful recluse who is indifferent to his neighbors. He could just pay a snowplow to clear the road up to his driveway. He could also use his power to manipulate or threaten others to clear the snow for him (for example, he could call the mayor and hint he may withhold a campaign donation if the snow is not cleared). Strategic action is about eliciting, inducing, or compelling behavior by others to realize one’s individual goals. This differs from communicative action, which is rooted in the give-and-take of reasons and the “unforced force” of the best argument justifying an action norm.

Strategic action and purposive rationality are not always undesirable. There are many social domains where they are useful and expected. Indeed, they are often needed because communicative action is very demanding and modern societies are so complex that meeting these demands all the time is impossible. Speakers engaged in communicative action must offer justifications to achieve a sincerely held agreement that their goals and the cooperation to achieve them are seen as good, right, and true (see section 4). But, in complex and pluralistic modern societies, such demands are often unrealistic. Modern social contexts often lack opportunities for highly consensual discussion. This is why Habermas thinks weak communicative action is likely sufficient for low stakes domains where not all three types of validity claims predominate, and why strategic interaction is well-suited for other domains. For Habermas, modern societies require systematically structured social domains that relax communicative demands yet still achieve a modicum of societal integration.

Habermas takes the institutional apparatus of the administrative state and the capitalist market to be paradigmatic examples of social integration via “systems” rather than through the lifeworld. For example, if a state bureaucracy administers a benefit or service, it takes itself to be enacting prior decisions of the political realm. As such, open-ended dialogue with a claimant makes no sense: someone either does or does not qualify; a law either does or does not apply. Similarly, in a clearly defined and regulated market actors know where market boundaries lie and that everyone within the market is strategically engaged. Each market actor seeks individual benefit. It makes little sense to attempt an open-ended dialogue in a context where one supposes all others are acting strategically for profit. Both domains coordinate action, but not through robustly cooperative and consensual communication that yields solidarity. Certainly, not all large-scale and institutionalized interaction is strategic. Some social domains like scientific collaboration or democratic politics institutionalize reflexive processes of communicative action (see section 5 on democratic theory). In such fora cooperation may yield solidarity across the enterprise. Even so, systems integration of the kind found in bureaucracies or markets sharply differs from integration through communicative action.

It should be stressed that these are simply paradigmatic examples, and that the same social domain can be institutionalized differently across societies. It is therefore more useful to look at the coordinative media that are typically used to interact with and steer any given institutionalized system rather than positing a fictive typology of clear social domains wherein it is assumed that either strategic or communicative action takes place. Habermas identifies three such media: speech, money, and power. Speech is the medium by which understanding is achieved in communicative action, while money and power are non-communicative media that coordinate action in realms like state bureaucracies or markets. A medium may largely be used in one social domain but that doesn’t mean it has no role in others. While speech is certainly the main medium of healthy democratic politics, this doesn’t mean money and power never play a role.

This all might seem to imply that there is no single correct way for system and lifeworld to jointly achieve social integration. Indeed, the complementarity between system and lifeworld laid out in Theory of Communicative Action is broad enough to accommodate a wide range of institutional pluralism with respect to the structure of markets, bureaucracies, politics, scientific collaboration, and so on. But, the claim that there is no “one-size-fits-all template” for social integration should not be taken as the claim that system and lifeworld have no proper relationship. Socialization into a lifeworld precedes social integration via systems. This is true historically and at the individual level.

Moreover, Habermas claims the lifeworld has conceptual priority with respect to systems integration. His thinking runs as follows: the lifeworld is the codified (yet revisable) stock of mutual normative understandings available to any person for consensually regulating social interaction; it is the reservoir of communicative action. Systems integration represents carefully circumscribed realms of instrumental and strategic action wherein we are released from the full demands of communicative action. Yet the very definition and limitation of these realms always depends on communicative action regarding, for example, the types of markets or state administration a community wants to have and why. Without being rooted in the mutual understandings of the lifeworld, we would get untrammeled systems of money and power disconnected from the intersubjectively vouchsafed practical reason that Habermas thinks underpins all meaning. The organizing principles of systems themselves would stop being coherent. For instance, market competition makes sense against a backdrop of normative principles like fairness, equal opportunity to compete, rules against capitalizing on secret information, and so on. But if markets were so “no-holds-barred” that these principles no longer applied, then engaging in market activity would cease to make sense. Similarly, if markets were so regulated that there was no genuine risk or opportunity, they would also start to lose coherence as an enterprise. In both these skewed hypothetical scenarios the system is rigged and thus, if there are functional alternatives, it is not worth participating in. This is a variant of his early anti-technocracy argument. Positing “objective necessities” like economic growth, social stability, or national security, and then circumventing communicative action, veils disagreement on what type of growth, stability, and security is important for a given community and why. As such, systems designed to achieve these ends are primed to lose coherence and legitimacy based in widely accepted structuring principles.

Habermas thinks the lifeworld self-replenishes through communicative action: if we come to reject inherited mutual understandings embedded in our normative practices, we can use communicative action to revise those norms or make new ones. Mechanisms of systems integration depend on this lifeworld backdrop for their coherence as enterprises achieving a modicum of social integration. The trouble is that systems have their own self-perpetuating logic that, if unchecked, will “colonize” and destroy the lifeworld. This is a main thesis in Theory of Communicative Action: strategic action embodied in domains of systems integration must be balanced by communicative action embodied in reflexive institutions such as democratic politics. If a society fails to strike this balance, then systems integration will slowly encroach on the lifeworld, absorb its functions, and paint itself as necessary, immutable, and beyond human control. Current market and state structures will take on a veneer of being natural or inevitable, and those they govern will no longer have the shared normative resources with which they could arrive at mutual understandings about what they collectively want their institutions to look like. According to Habermas, this will lead to a variety of “social pathologies” at the micro level: anomie, alienation, lack of social bonds, an inability to take responsibility, and social instability.

In Theory of Communicative Action Habermas pins his hopes for resisting the colonization of the lifeworld on appeals to invigorate and support new social movements at the grassroots level, as they can directly draw upon the normative resources of the lifeworld. This model of democratic politics essentially urges groups of engaged democratic citizens to shore up the boundaries of the public sphere and civil society against encroaching domains of systems integration such as the market and administrative state. This is why his early political theory is often called a “siege model” of democratic politics. As section 5 will show, this model was heavily revised in Between Facts and Norms. Before turning to that work, we must flesh out discourse ethics—an idea that figured into Theory of Communicative Action but which was only fully developed later.

4. Discourse Ethics

Habermas’ moral theory is called discourse ethics. It is designed for contemporary societies where moral agents encounter pluralistic notions of the good and try to act on the basis of publicly justifiable principles. This theory first received explicit and independent articulation in Moral Consciousness and Communicative Action (German 1983, English 1990a) and Justification and Application (German 1991a, English 1993), but it was anticipated by and depends on ideas in Theory of Communicative Action. The overview that follows draws upon these works. Much like the prior section, it only traces the broad outline of discourse ethics.

Discourse ethics applies the framework of a pragmatic theory of meaning and communicative rationality to the moral realm in order to show how moral norms are justified in contemporary societies. It could be seen as a theory that uncovers what we pragmatically do when we make and defend the moral validity claims underlying and manifested in our norms. Yet, we need to be careful with this characterization. Because of its cognitive commitments to moral learning and knowledge, discourse ethics cannot simply be a reconstructive description of how it is we practically avoid conflicts and stabilize expectations in post-conventional social contexts. It is also an attempt to provide a formal procedure for determining which norms are in fact morally right, wrong, and permissible. Discourse ethics is squarely situated in the tradition of Neo-Kantian deontology in that it takes the rightness and wrongness of obligations and actions to be universal and absolute. On such a view, the same moral norms apply to all agents equally. They strictly bind one to performing certain actions, prohibit others, and define the boundaries of permissibility. There is no “relative” validity of genuinely moral norms even though, as we shall see, they can be embedded in social contexts that have consequences for their application. As long as these caveats are kept in mind we can understand discourse ethics by analyzing the practice of making and defending validity claims and how there are certain conditions of possibility tacitly underpinning and enabling this practice.

What are the conditions that enable this practice? As touched on above, Habermas posits certain unavoidable pragmatic presuppositions of speech which, when realized in discourse, can approximate a counterfactual ideal speech situation to greater or lesser degrees (1971; MCCA, 86). Discourse participants need to presuppose these conditions in order for the practice of discursive justification to make sense and for arguments to be truly persuasive. Four of these presuppositions are identified as the most important: (i.) no one who could make a relevant contribution is excluded, (ii.) participants have equal chances to make a contribution, (iii.) participants sincerely mean what they say, and (iv.) assent or dissent is motivated by the strength of reasons and their ability to persuade through discursive argumentation rather than through coercion, inducement, and so on (BNR, 82; TIO, 44). The point is not that actual discourses ever realize these conditions—this is why the ideal speech situation is best understood as a counterfactual regulative ideal. Rather, the point is that the outcomes of any discourses are only reasonably taken to be “valid” (empirically true, morally right, and so forth) under the presumption that these conditions have been sufficiently met. As soon as a violation is discovered this casts doubt upon the validity of the discursive outcome.

In addition to these pragmatic presuppositions Habermas proposes his discourse principle (D). This principle is supposed to capture the type of impartial, discursive justification of practical norms required in post-conventional societies: “only those action norms are valid to which all possibly affected persons could agree as participants in rational discourse” (BFN 107; TIO 41). While (D) was initially framed as a principle for moral discourses it was soon revised to the more general form above, as there are many practical norms concerning interpersonal interaction that are not directly moral even if they must be compatible with morality. Yet even in its broadened form it is crucial to note that (D) only applies to discourses concerning practical norms about interpersonal behavioral expectations, not all discourses about theoretical, aesthetic, or therapeutic concerns (which may or may not involve interpersonal social interaction). The guiding thought is that if discourses about an action norm are carried out in a sufficiently ideal manner and they yield consensus then this is a good indication the norm is valid. The principle does not hold that consensus reached through discourse constitutes validity, nor that whatever norm people coalesce around after a sufficiently ideal-looking discourse is assured to be valid. Rather, (D) simply holds that consensus about a norm can be a good test of validity if it has been achieved in the right type of discursive way. It is important to note that, because of its very broad scope, (D) mainly functions by pointing out invalid norms. By itself the discourse principle cannot tell us which norms are valid. It can only help us identify norms that are good candidates for validity.

Moreover, before the validity of an action norm can be assessed, we need more details on the types of discourse and validity claims at issue (TIO 42). Within his project of discourse ethics Habermas identifies moral, ethical, and pragmatic discourses (JA 1-17; MCCA 98). Each type deploys practical reason differently, framing and analyzing questions under the rubrics of the purposive (pragmatic), the good (ethical), or the just (moral). The language of differing discourse “types” should not be taken to mean that norms come prepackaged in distinct kinds. Instead, any norm can be discursively thematized in any of these ways and should not be arbitrarily limited to a given type. With that caution in mind, we can begin to understand discourse types and the norms they produce.

Ethical discourses are a good place to start. For, while they are constrained by the outcomes of moral discourses and therefore not foundational, our prior discussion of the lifeworld provides an apt segue. Ethical discourses are paradigmatically about clarifying, consciously appropriating, and realizing the identity, history, and self-understanding of a group or individual. They make validity claims to authenticity rather than truth or rightness. They also involve value judgments about a particular social form or practice concerning the good life in a community. This is one reason why the outcomes of ethical discourses will have relative validity: they are meant to redeem validity claims for actors in some community or another. Another reason is that values differ from the types of generalizable or universalizable interests embodied in moral norms. While moral norms are supposed to strictly oblige agents to either do or not do some action, values admit of degree. While moral norms express principles backed by reasons, values are affective components of meaning acquired in virtue of living in a given social context. They are connected to reasons but not reducible to them. Values can orient us to goals, aid motivation, and help successfully navigate the lifeworld but cannot ground moral obligations by themselves. Values attract or repel but do not persuade; they can provide motivation to “do the right thing”—to have the will to follow a moral insight—but they do not constitute or even always help us discern what “the right thing” is (BFN 255).

Ethical discourses are rooted in ethicality (Sittlichkeit), which is distinct from morality (Moralität). Like many philosophers, Habermas separates the realm of the right from the realm of the good. Following a loosely Hegelian terminology, he parses this as the difference between morality and ethicality. Ethicality is a way of life composed of both cognitive and affective elements as well as more structural elements that reproduce this way of life: laws, institutions, conventions, social roles, and so forth. It is particularistic in that it defines goals in terms of what is good for a group as a whole and its members. As Habermas accepts George Herbert Mead’s model of “individuation through socialization,” ethicality is deeply engrained and connected to the lifeworld. No one can simply drop their internalized ethical perspective just as no one can simply step out of the lifeworld they have inherited. Individuals are always in some sense bound up with the identity, practices, and values of their upbringing and traditions even if they come to largely reject them. But, as was clear from Habermas’ critique of Gadamer, ethical perspectives do not determine us. Ethical discourses explain how this is possible by mediating between inheritance and transcendence. While we inherit and internalize an ethical perspective as individuals, we can always question parts of it that we wish to challenge, refashion, or reject for lack of sufficient reasons underwriting certain norms.

This dialectic between the ethicality we internalize through socialization and the way in which we wish to consciously reappropriate and (dis)own portions of such ethicality helps to explain why, in contrast to other discourse types, Habermas pays a great deal of attention to ethical discourses at both the individual and group levels. Ethical discourses at the individual level are called ethical-existential, while ethical discourses at the group level are referred to as ethical-political. For example, an individual considering a certain profession would engage in an ethical-existential discourse (for example, is this profession right for me given my character and goals?), while a polity considering whether certain policies express their collective interest, identity, and values would engage in an ethical-political discourse (for example, does this policy align with the collective identity and commitments we have had and how we want to appropriate them moving forward?).

There are two key points about these levels. First, the outcomes of such discourses are constrained by morality irrespective of what would be authentic at individual or group levels: an individual cannot simply decide to become a serial killer just as a country cannot simply enact a policy that has patently immoral consequences (for example, for those outside it). While Habermas thinks it is important to account for the way in which morality is embedded in social contexts through ethical discourses, he is staunchly opposed to postmodern or communitarian takes on morality and justice. Second, there will often be a reflexive interplay between these two levels of ethical discourse. Discourses about what it means to genuinely inhabit a collective identity can impact the ordering and strength of the values held by individuals, and discourses about who one fundamentally is and wishes to be can, through resistance to dominant interpretations of traditions and highlighting unacknowledged injustices, impact how others in a collectivity appropriate their identity and normative practices moving forward. This interplay is bookended by broader moral discourses at both levels, thereby helping the outcomes of such discourses stay in the realm of permissibility.

Pragmatic discourses are similar to ethical discourses in that they start from the teleological perspective of an agent who already has a goal. But in contrast to the reflexive, clarifying, and potentially transformative self-realization and collective self-determination of ethical discourses, pragmatic discourses simply start with a goal of presumed value and set about realizing it. This goal may involve identity and values but it could also refer to more pedestrian concerns and interests. Because the goal is presumed to be worthwhile the values, interests, or goals at issue show up as relatively static. Pragmatic discourses simply focus on the most efficient way to realize or bring about a goal, and their claim to validity concerns whether or not certain strategies or interventions in the world are likely to produce a desired result. As Habermas puts it, pragmatic discourses correlate “causes to effects in accordance with value preferences and prior goal determinations” so as to generate a “relative ought” that expresses “what one “ought” or “must” do when faced with a particular problem if one wants to realize certain values or goals” (JA 3). The “ought” is relative because it is something akin to a rule of prudence that depends on whether an agent happens to have a certain interest or find a goal worth pursuing.

Finally, we turn to what might be seen as the most important type of discourse: moral discourses. Moral discourses are broader in scope and establish stronger validity claims than either ethical or pragmatic discourses. They seek to discern and justify norms that bind universally rather than simply in the confines of a specific community or because an agent happens to find a goal valuable. These norms have binarily coded, unconditional validity instead of the gradated, relative validity of the outcomes produced by pragmatic and ethical discourses.

In order to discursively discern this non-relative sense of moral validity Habermas proposes a separate principle, his principle of universalization (U), for discourses about moral norms: “A norm is valid when the foreseeable consequences and side effects of its general observance for the interests and value orientations of each individual could be jointly accepted by all concerned without coercion” (TIO 42). While (U) has gone through several different formulations, the basic idea is that for whatever valid moral norms there are, such norms can be accepted by all affected persons in a sufficiently ideal discourse wherein they assert their own interests and values. (U) checks whether the norms we take to be moral actually are, by testing whether or not they are universalizable. If they are not universalizable, they cannot be moral norms. Beyond this basic characterization there are some interpretive issues with (U). Three are worth brief focus: its apparent reference to consequences, where (U) comes from, and the role of interests.

First, in the version of (U) above, it is easy to mistake the “foreseeable consequences and side effects” clause for the addition of a mild consequentialist constraint. Given Habermas’ deontological commitments this would be odd. Instead, the clause builds in a “time and knowledge index” so that it does not make impossible demands on moral agents. Fully satisfying (U) would require discourse participants who had unlimited time, complete knowledge, and no illusions about their own interests and values; it would require participants who transcended their human condition. As (U) must be usable in the real world it can only ask that moral discourse participants attempt to account for the “anticipated typical situations” to which a norm would apply when they attempt to justify any moral norm (JA 37). The circumscribed task of (U) is key: it is only supposed to justify moral norms in the abstract. While this justification may point towards “typical” cases of application, it does not predetermine all applications. What about novel, atypical, or completely unforeseen situations to which the norm might unexpectedly apply?

Following Klaus Günther, Habermas claims that moral (and legal) decisions in specific cases require a logic of appropriateness found in discourses of application (Günther 1993; JA 35-37). Discourses of application look at a concrete case and survey all potentially applicable norms, relevant facts, and circumstances. They try to offer exhaustive or “complete” descriptions of a situation so as to decide among multiple, sometimes competing or only partly applicable norms that might regulate the case. There is a division of labor between the two types of recursively related discourse: whereas discourses of justification lay out the reasons why we should endorse a norm as a general rule with reference to typical situations, discourses of application seek to apply norms to concrete cases which may be wholly new or defy expectations. As fallible agents we can make a variety of different errors in our discursive justification of a norm or fail to anticipate new situations or altered understandings of facts, values, and interests—a failure that would be revealed in application. Habermas calls this the “dual fallibilist proviso,” and it instills an awareness that moral justification is an ongoing project (TJ 259). The recursive interplay of justification and application is supposed to progressively address prior errors and oversights. New insights gleaned from application discourses or novel situations can lead us to revisit norms whose justification was taken for granted, and this refinement of our understanding regarding how and why norms are justified will help us apply them better. If we had providential foreknowledge we would not need application discourses. But since we are fallible the “foreseeable consequences and side effects” should be seen as referring to an in-built “time and knowledge” index for the outcomes of justificatory discourses, which are then supplemented by application discourses that may impact the formulation of the initial norm.

The second interpretive issue is where (U) comes from. Habermas initially claimed that (U) could be formally deduced from a combination of the pragmatic presuppositions of discourse and (D), but weakened this claim shortly thereafter (JA 32 n17). Instead of deriving (U) from a formal deduction or informal inferences he now claims—using a term coined by Peirce—we arrive at (U) “abductively” (TIO 42). To arrive at something abductively is to suggest that we first observe a phenomenon (moral norms) and adopt a “best guess” hypothesis to explain it (the moral principle), which can then be subjected to further inductive testing (Ingram 2010, 47; Finlayson 2000a, 19). In short, (U) is now proposed as the best candidate principle for helping to explain moral normativity. To buttress the plausibility of this claim Habermas has also fallen back on his theory of social evolution and the “weak…notion of normative justification” in post-conventional contexts (TIO 45). Indeed, he now often speaks about (U) as following from the type of impartial justificatory procedure appropriate to a post-conventional condition that seeks to discern norms that are “equally in everyone’s interest,” “generalizable,” or “universalizable” (RPT 367; BFN 108, 460; TJ 265). The reference to interests leads us to the third interpretive issue with (U).

Early formulations of (U) only refer to interests (MCCA 65, 120). The inclusion of value orientations is potentially confusing. As noted above, values are not necessarily cognitively grounded. As Habermas has always presented his moral theory as cognitivist it would be odd to give values such a central role. It seemed to make sense that initial formulations of (U) only included interests, as Habermas has defined interests in a cognitive fashion (on interests as “reasons to want” see Finlayson 2000b). Bolstering an interpretation of (U) that puts priority on (cognitive) interests he has stated that “(U) works like a rule that eliminates as non-generalizable content all those concrete value orientations with which particular biographies or forms of life are permeated” (MCCA 121), and that the specific part of (U) referring to “uncoerced joint acceptance” means that any reasons put forth in moral discourse must “cast off their agent-relative meaning and take on an epistemic meaning from the standpoint of symmetrical considerations” (TIO 43). Moreover, the interpretive secondary literature has often emphasized the centrality of interests over values and focused on how Habermas often talks about “generalizable” or “universalizable” interests as the distinctive feature that moral norms secure (Heath 2003; Finlayson 2000b; Lafont 1999). How then should the inclusion of value orientations be understood?

Habermas has said he included value orientations in (U) so as to “prevent the marginalization of the self-understanding and worldviews of particular individuals and groups” (TIO 42). This does not mean that values are on a par with interests. Instead, his point is that interests and values are always bound together. Value orientations exert at least some indirect influence on moral discourses insofar as they subtly influence the very interpretation of our own interests (JA 90). Proceeding as if value orientations can be expunged from moral discourses may in fact introduce discursive blind spots. Indeed, candor about one’s own value-orientations may be crucial since the impartiality of (U) involves “generalized reciprocal perspective-taking” that cuts both ways: it orients participants towards “empathy for the self-understandings” of others as well as towards “interpretive interventions into the self-understanding of participants who must be willing to revise their descriptions of themselves and others” (TIO 43). The essential point is that even though “some of our needs are deeply rooted in our anthropology” and can be seen as basic generalizable interests shared by all, we must nevertheless avoid “ontologizing generalizable interests” into “some kind of given” because even “the interpretation of needs and wants must take place in terms of a public language” wherein our own self-understandings are open to revision (TJ 268; JA 90).

A final interpretive issue that merits attention is the precise status of moral rightness. Habermas has always held that morality and truth are analogous in that both are cognitive, binarily coded, and subject to learning processes. Moreover, he has always been sharply critical of approaches that would reduce morality to a purely subjective or relativized affair. Yet, given that rightness is not reducible to truth and that Habermas has repeatedly disclaimed a moral realist reading of his theory, it is unclear precisely how far this analogy is supposed to extend. This is not only because there are a variety of differences between empirical and moral knowledge but also because Habermas has changed his theory of truth over the years—moving from a consensus theory that identified truth with ideal warranted assertability to a “pragmatic epistemological realism that follows in the path of linguistic Kantianism” (TJ 7). Early articulations of discourse ethics seemed to admit of interpretations wherein rightness was a justification-transcendent concept that couldn’t be captured by ideal warranted assertability. This led some interpreters to read Habermas’ moral theory as at least tacitly committed to some variant of internal moral realism (Davis 1994, Kitchen 1997, Lafont 1999 and 2012, Smith 2006, Peterson 2010 ms.). But, in the course of resisting this reading, Habermas has explicitly claimed that “ideally warranted assertability is what we mean by moral validity” (TJ 258, 248). He now wishes to articulate a notion of moral rightness that can be cashed out in terms of a pragmatist constructivism that also avoids the perils of relativism and skepticism—that is, which maintains an anti-realist account of moral rightness that still resists collapsing into a form of moral consensus theory. Whether he succeeds in this endeavor is a hotly debated topic.

5. Political and Legal Theory

In post-conventional, pluralistic societies ever fewer norms can be underwritten by a shared ethos embodied in a community’s ethicality or collective identity. Moral norms cannot pick up the slack to achieve social integration and cohesion by themselves. Because moral discourse is demanding and aims at what is equally in everyone’s interest, few moral norms will be seen as justified across the world or even in a given society (JA 91, TJ 265). And, as Habermas noted in Theory of Communicative Action, while systems like the bureaucratic state and economy can achieve stability and coordinate expectations through money and power, this can erode mutual understandings and social solidarity; markets and bureaucracies tend to displace and colonize the lifeworld. Indeed, his political essays from this period cast democratically created law as holding the line against system encroachments in a siege mentality (BFN 486-89, Habermas 1992b 444). This may leave us asking: What other resources exist for legitimate social integration?

In Habermas’ clearest statement of political theory, Between Facts and Norms, modern law shows up as precisely the resource we are looking for. If law is linked to democratic political structures in the right way it confers legitimacy on legal norms, thereby fostering social integration and stability. Broadly speaking, the relation between legal legitimacy, procedural-democratic popular sovereignty, and public discourse is nested and reflexive: legitimate law must be rooted in democracy, which itself depends upon a robust public sphere. A vibrant democratic public sphere is what allows for the revision and questioning of prior law. Conceived of in this way modern law is a “transformer” that preserves the normative achievements and mutual understandings that issue from the collective self-determination of the public sphere by translating them into legitimate, binding decisions that can “counter-steer” against the logics of the state and market. As long as legal decisions are arrived at in the right type of procedural, discursive fashion there is a presumption in favor of their rationality and legitimacy. And, as long as the public sphere continues to be a robust and open forum of contestation, any prior decisions are revisable such that there is a circulation between the informal public sphere and more formal institutions of the state. This focus on the transformative, mediating nature of law revises the prior “siege” model of democratic law into a procedural “sluice” model (Habermas 2002, 243). While the prior model saw democratically generated law as a defensive dam or shield against the demands of systems, the new model sees a certain type of lawmaking as mediating the circulation between lifeworld and system in a way that produces legitimate and binding legal norms. Modern law works with systems and alongside post-conventional morality to stabilize social expectations and resolve conflicts.

We can start to understand the relation between law, democracy, and the public sphere by focusing on legal legitimacy and democracy. Between Facts and Norms posits a tension within law itself, as well as an internal relation between modern law and democracy. To function, all law must demand compliance, threaten coercion, and (however tacitly) appeal to an underlying normative justification. Law is therefore characterized by a tension between “facticity” and “validity” insofar as it must be recognized as factually efficacious and normatively justified. This tension helps explain the relation between law and democracy in contemporary contexts. Pre-modern law appealed to God, nature, human reason, or shared culture for its justificatory backing. In post-conventional societies the fact that law is coercible and changeable yet merely rooted in fallible humans is laid bare. For Habermas, the underlying normative justification can now only be understood as “a mode of lawmaking that engenders legitimacy” (IO 254). The thought is that democracy is the only mode of lawmaking that is up to this legitimacy-engendering task. In light of these connections it is fruitful for present purposes to focus on “deliberations that end in legislative decision making” rather than treating political and legal legitimacy separately (BFN 171; Bohman and Rehg 1999, 36).

The democracy Habermas has in mind differs from overly populist varieties. He is clear that the legitimacy underwriting lawmaking must be twofold: law must not only express the democratic will of the community but must also be non-subordinately “harmonized” with morality (BFN 99, 106). This non-subordinate concordance of legality and discourse theoretic morality is the hardest sense of legitimacy to explain and the easiest to overlook, so it is fruitful to start there. For Habermas, “legal and moral rules…appear side by side as two different but mutually complementary kinds of action norms” in post-conventional societies. In order to also account for “the idea of self-legislation by citizens” we must avoid a “subordination of law to morality” along the lines of classical natural law theory (BFN 105-6, 120; IO 257). Yet it seems puzzling to hold that democratically determined law should be compatible with but not subordinate to discourse-theoretic morality. What about cases where law and morality seem to conflict? There are a few answers that highlight unique features in Habermas’ theory. At a general level these answers take the same shape: while there are many ways that legal systems can square with moral permissibility, there are nevertheless structural and conceptual features endogenous to processes of modern procedural-democratic popular sovereignty that, at least at an abstract level, tend to harmonize legal norms with moral permissibility. This avoids concerns with morality trumping legality in an exogenous manner.

One reason to expect that democratically legitimate law and moral permissibility will be at least in principle commensurable is that they are both rooted in (D). We saw above how the moral principle (U) expresses the way (D) is specified for moral discourses. Habermas also proposes a principle of democratic legitimacy (L) that expresses the way (D) is specified for political discourses producing law. This principle is rooted in (D) in virtue of what Habermas calls the “legal form.” When (D) is deployed in discourses aimed at producing legal norms for regulating common life together it is understood these norms will be cloaked in the legal form: the set of formal and functional features characterizing modern positive law. Modern positive law is enacted and conventional, enforceable and coercive, rooted in institutions with some reflexivity, tailored to protect individuals through rights, and limited in scope (BFN 111-118, IO 256). If law is to function as a tool for the consensual regulation of social conflicts and the integration of society, then it needs to take on this form.

The principle of democratic legitimacy (L) is part of the normative backing that is supposed to emerge, albeit in nuce and very abstractly, from the historical interpenetration of (D) and the legal form that has culminated in the structures of the modern democratic state. It claims, “only those statutes may claim legitimacy that can meet with the assent of all citizens in a discursive process of legislation that in turn has been legally constituted” (BFN 110; “constituted” is sometimes translated as “organized”). This principle captures how (D) is specified for political discourses so that democratic procedures underwrite the legitimacy of legal norms. Legitimacy does not arise out of formal legality alone; it needs the added normative backing of democracy. The idea of (L) is that compliance with the law must be rational and rooted in the law’s perceived legitimacy. To achieve this, political discourses must be structured in a way where formal legislative institutions accurately represent and address deliberations going on in the informal public sphere, and where there are institutionalized procedural mechanisms organized to help screen out weak arguments (BFN 340). The details of this structuring will be clarified below, particularly in relation to the process model and the relationship between democracy and the public sphere.

However, the mere fact that (U) and (L) are rooted in (D) does little to ensure the commensurability of law and discourse-theoretic morality. Fortunately, there are additional reasons why we might expect such a harmonization. Habermas thinks the combination of (D) and the legal form in (L) also supplies us with the resources to discern the conceptual kernels of an abstract “system of rights” that will be inscribed in the core structures of any legitimate self-determining political community. The basic argument is that in order for (L) to be realized it must make reference to a concrete community engaged in self-determination through modern law. In such communities equal legal personhood takes on the role of a “protective mask,” a formal identity mainly defined by rights instead of duties, that crystalizes around individual moral persons (BFN 531, 112). This legal identity is constituted by a core of rights that secure the status and private autonomy of individuals such that they can not only live their individual lives but also genuinely deliberate (on equal footing, free from coercion, and so forth) about the terms of shared life together. Yet, these individual rights cannot be effective unless they presuppose other rights to participation and basic material provision—rights that secure public autonomy. The claim is that the legal manifestations of private and public autonomy, often expressed in the idioms of human rights and popular sovereignty, mutually presuppose one another. What results is an abstract system of rights made up of five core types. What are these right types?

First, in order to discursively engage one another people need to be reasonably secure. Therefore, rights that guarantee the status of individual persons are required. Three types of rights jointly achieve such protection: (i.) the right to equal liberties compatible with those of others, (ii.) rights of membership that determine the extent of the community, and (iii.) rights of due process that assure each person is treated the same and equally protected under the law (BFN, 133-134). These rights secure the individual private autonomy prioritized by classical liberalism. But any community engaged in specifically democratic self-determination must also safeguard the ability to actively use the freedom afforded by this secure status to deliberate, disagree, and come to mutual understandings in concert with others. If individual rights are to be effectively used (iv.) rights of communication and political participation that formally secure equal opportunity and access to the political process are required. These rights secure the collective public autonomy prioritized by classical republicanism. They enable discourses in the public sphere as well as equal access to channels of political say and influence; they enable democratic popular sovereignty by making sure everyone can participate on fair and equal terms, and that information, innovative ideas and arguments about how to structure common life are kept freely circulating and scrutinized. Lastly, these four right types are insufficient if basic needs are threatened or go unmet. Formal guarantees of freedom and participation mean little if they amount to the freedom to starve. So, as a final step, Habermas proposes some measure of (v.) social, technological, and ecological rights securing the basic conditions of a minimally decent life. Democratic states have often done a poor job fully realizing these rights, but the claim is simply that these general right types are conceptually required if self-determination through law is to achieve the dual sense of legitimacy noted above. In this same spirit of clarification, it is also important to note that the abstract system only identifies certain right types, not some list of concrete rights. Communities have incredibly wide interpretive latitude when it comes to how these rights show up. Habermas often refers to rights as “unsaturated placeholders”; it is largely up to communities to “fill in” their content.

The expectation of a non-hierarchical harmonization of morality and legality may now seem less puzzling. Ideally, lawmaking discourses approximate (L) against the backdrop of an abstract system of rights inscribed in the political structures of a democratic community. This places some broad constraints on how deliberations unfold and the type of norms they can produce. Moreover, apart from these structural background constraints, political discourses are also themselves unique. In contrast to moral discourses focused on “the interest of all” or ethical discourses focused on authentic self-realization, political discourses aimed at self-determination through law reference a plethora of different concerns, and do so in an internally structured way aimed at carving out a space (defined by rights) where moral personhood and ethical authenticity can flourish (BFN 531).

While deliberations about “political questions are normally so complex that they require the simultaneous treatment of pragmatic, ethical, and moral aspects” of issues, they ideally unfold along a “process model” where there is a structured interplay between pragmatic, ethical, and moral concerns as well as procedurally regulated bargaining (BFN 565, 168). The basic idea is that for any provisional policy conclusion there is an obligation to respond to objections stemming from more abstract aspects of an issue or levels of discourse; discursive processes cannot be arbitrarily limited. For instance, participants in a discourse on immigration policy cannot simply consult ethical concerns regarding their community’s authentic identity while refusing to listen to moral discourses that bear on such policies. Any moral aspects need to be explicitly discussed, and they filter or check more particularistic issue-aspects and discourses (Cf. BFN 169 and the emendation at 565 on whether to refer to the structured interplay as between discourses or aspects of a case). The abstract system of rights and the process model mean that, within political deliberations about how to structure common life together, it will in principle always be possible for more abstract moral discourses to weakly check pragmatic and ethical-political discourses. And this checking will be endogenous to structures of democratic self-determination.

So far the focus has been on the relation between law and democracy without much reference to the public sphere. However, it is hard to overstate the importance Habermas places on democratic deliberation rooted in the public sphere. None of the formal or structural mechanisms mentioned so far guarantee that public political discourses or laws will be specified in a given way. There is assurance neither that the abstract system of rights or (L) will be meaningfully realized, nor that the interplay of various types of concerns in political discourses will unfold along the process model. Everything hangs on the quality and institutional structuring of deliberation in the public sphere. Indeed, the primary reason why democracy confers legitimacy upon legislative outcomes is that it is rooted in a model of distinctly procedural popular sovereignty that simultaneously expresses the will of the community and that leads to more rational outcomes. An analysis of the specific way in which democracy and the public sphere are related on Habermas’ model is the best way to understand how the democratic mode of lawmaking underwrites the legitimacy of legal norms.

In Between Facts and Norms Habermas proposes a “two-track” model of democratic politics outlining a circulation of political power engendering legitimacy. He divides the political public sphere into informal and formal parts. The informal public sphere includes all the various voluntary associations of civil society: religious and charitable organizations, political associations, the media, and public interest advocacy groups of all varieties (BFN 355). In this sphere public political deliberation is free and unorganized. Through this open clash of views and arguments individuals and collectivities can both persuade and be persuaded, thereby contributing to the emergence of considered public opinions. In contrast, the formal public sphere includes institutionalized forums of discourse and deliberation like congress, parliament, and the judiciary as well as more peripheral administrative and bureaucratic agencies associated with state structures. This sphere is supposed to be organized in such a way that it renders decisions reflecting the considered public opinions of the informal public sphere. Formal institutionalized decision making bodies must be porous to results of the informal public sphere.

The informal public sphere is the key forum for generating a type of normative power that can integrate society through mutual understandings and solidarity rather than through money or administrative-bureaucratic power. When discourse participants in the informal public sphere freely reach mutual understandings about how to regulate the terms of shared life together “communicative power” emerges (Flynn 2004 discusses communicative power’s precise locus). Communicative power arises from jointly authored norm expectations that are cognitively grounded in the force of better reasons and motivationally grounded (albeit weakly) in mutual recognition and collective ethical discourses. Cognitively speaking, free communication in the public sphere can foster “rational opinion and will-formation” because “the free processing of information and reasons, of relevant topics and contributions is meant to ground the presumption that results reached in accordance with correct [discursive] procedure are rational” (BFN 147). This acceptance also provides weak motivation: in accepting a norm’s validity claim one accepts the background understandings and reasons underlying it, which can motivate action in the relevant circumstances. Moreover, because this mutual understanding was presumably reached through persuasive discourse where reasoned dissent was (and remains) a real possibility, norm acceptance can also motivate in a spirit of anti-paternalistic empowerment: parties recognize each other as accountable and responsible for their actions in accord with a norm until new counter-reasons are discovered. While they may be aware of counter-inclinations and motives that are not backed by good reasons, they take one another to be competent, responsible agents who can choose to act on rationally backed norms (Günther 1998). Yet, because the motivation accompanying cognitive insight is fragile and weak, communicative power must also be rooted in a community with a shared ethical-political identity and legitimate law so that motivational deficits can be met with supplemental resources of a shared life and law.

Communicative power can only arise if the informal public sphere has certain characteristics. First and foremost, it must be relatively free of distortions, coercion, and silencing social pressures so that communication can work as a filter for fostering more rational individual and collective will formation (BFN 360). The public sphere also needs to accurately function as a “context of discovery” wherein problems that affect large segments of the public are identified and taken up for discussion and resolution in discourse. Moreover, civil society must be animated by a political culture so that members actively participate in voluntary associations and public discourse about the terms of common life together (BFN 371). Normative power potentials cannot be generated if members largely retreat into private concerns or a society is internally segmented and riven with special interests (Flynn (2004) 439-444; Bohman and Rehg (1999) 41-42). Clearly, if the public sphere is to remain healthy then the media’s role in fostering accurate information and timely mass communication will also be crucial (EFP, 138-183).

The political institutions of the formal public sphere are arranged so as to be porous to the inputs of the informal public sphere, to further refine and focus public opinion, and to make decisions. Building on the work of Bernhard Peters, Habermas maintains that modern constitutional democracies are set up so communication and decision-making flow from the “periphery” of the informal public sphere into the “center” constituted by those formal political institutions that create, enforce, clarify, or implement the law (BFN 354). In a well-functioning democratic regime there will be structural “sluices” or “floodgates” embedded in the institutions of the administrative state (legislature, judiciary, and so forth) so that the circulatory flow of power proceeds in the right direction, from the periphery to the center.

The thought is that the political community should “program” and direct the institutions of the administrative complex, not the other way around (BFN 356). If the state or other powerful actors reverse this flow by simply positing new laws or rules and either demanding compliance or inducing it in some other way, then this exercise of non-communicative administrative-bureaucratic power would be neither legitimate nor stable. Habermas claims the “integrative capacity of democratic citizenship” erodes to the extent that the circulation of political power is interrupted or reversed. Only communicative power has the legitimating force needed so that a community can both author and rationally abide by the law. Democratic lawmaking is the key institution that “represents…the medium for transforming communicative power into administrative power” while preserving its normative potential (BFN 169, 81, 299). Democratically generated law ensures normative power potentials flow in the right direction and that they are maintained when implemented by institutions of the administrative state.

This account of procedural-democratic collective self-determination should not be confused with traditional national self-determination. Habermas rejects models of sovereign collective self-determination that presuppose a nation or people with a homogeneous identity and interests, as well as models where “a network of associations” stands in for this (imaginary) collective self (BFN 185, 486). Instead, in modern constitutional democracies the “idea of popular sovereignty is… desubstantialized [and]…not even embodied in the heads of the associated members.” Popular sovereignty “is found in those subjectless forms of communication that regulate the flow of discursive opinion- and will-formation in such a way that their fallible outcomes have the presumption of practical reason on their side” (BFN 486). Insofar as we can speak about the will of a community, it is an anonymous and subjectless public opinion emerging out of the discursive structures of communication themselves (BFN 136, 171, 184-186, 299, 301). This unique interpretation of popular sovereignty helps explain some final aspects of Habermas’ political theory: his views on religion and the public sphere, his constitutional patriotism, and his vision of politics beyond the nation-state.

In his early writings, Habermas claimed that, as the rationality and pluralism of Enlightenment ideals slowly took hold in modern societies, the mythic explanations of religion would become less important. He gradually came to revise this view of religion in modern societies. What matters for present purposes is how he sees religion fitting into the public sphere of a liberal democracy. In liberal democracies, untrammeled populism is held in check not only by individual rights but also by the very nature of public debate: citizens collectively self-determine through persuasion and rational argumentation. To do this amidst the pluralism of modernity, the laws they make must be grounded in public reasons accessible to all. The question is what this means for religious citizens.

There have been a variety of answers. For instance, in Political Liberalism John Rawls held that liberal democratic citizens should ultimately only endorse policies that they can support on the basis of secular reasons. While these citizens may have religious reasons that favor a law or policy, when engaged in political debate they must eventually “translate” these reasons into terms that non-believers could accept. Habermas is sympathetic to the vision of liberal democracy animating this view of how religious citizens should act. Indeed, he criticizes thinkers like Wolterstorff who insist that religious citizens ought to be allowed to try to base coercive law on their own particularistic values and conception of the good. Nevertheless, he feels that placing the burden of “translation” onto religious citizens alone is somewhat misguided. Such an approach underestimates the ethical-existential importance of religion in some people’s lives—especially if it is bound up with the structure of their lifeworld and identity. As an alternative, Habermas proposes both religious and non-religious citizens be allowed to invoke any reasons for or against policies at the level of the informal public sphere, provided they take one another’s claims seriously and do not dismiss them from the outset. But when it comes to the institutions of the formal public sphere concerning coercive lawmaking, justifications should only be based in reasons that all can accept.

This view is somewhat unsatisfying for several reasons: it simply moves the asymmetrical burden of translation “up a level,” it may invite the objection that religious citizens are being asked to split their identities, and it could even saddle non-religious citizens with undue burdens (Yates 2007, Lafont 2009). For present purposes, the most charitable reading is that Habermas assumes all democratic citizens have an obligation to adopt a thoroughly self-reflective attitude. Religious citizens must “self-modernize” insofar as they are expected to be open to things like the authority of science, the need for non-religious reasons backing coercive law, and the possible validity of claims made by other religions. But this also means non-religious citizens must move beyond a dogmatic secularist understanding wherein it is impossible for religious claims to have any cognitive value whatsoever. Indeed, given that some fundamental moral notions, such as equal human dignity, have been inextricably tied to the history of world religions, he claims it is not always clear where the boundaries of the religious and secular are. Determining these boundaries (and what can count as publicly acceptable) may at times be a cooperative task wherein each side takes the claims of the other with some degree of seriousness (2006b, 45 and 2003b, 109).

Habermas’ reinterpretation of popular sovereignty also explains why he has adopted the theory of constitutional patriotism pioneered by Dolf Sternberger. Constitutional patriotism maintains that, in contrast to national identities of the past, modern political communities can base their collective identities around the unique ways they appropriate and embed the abstract, universalistic principles of democratic self-determination within their unique histories and traditions. On such a model, political allegiance can coalesce around “a particularist anchoring of…the universalist meaning of [principles such as] popular sovereignty and human rights” (BFN 500; L’i 308; BNR 106). This particularist anchoring would presumably include the way in which a community takes up the abstract system of rights, the process model, and (L). The claim is that the specific way a political community instantiates the “abstract procedures and principles” of the modern democratic state fosters the development of a “liberal political culture” that “crystalizes” around that country’s constitutional traditions, structures, and discursive fora (IO 118; DW 78). The integrative force that emerges against this backdrop is called civic solidarity, which Habermas characterizes as “an abstract, legally mediated solidarity among citizens… a political form of solidarity among strangers” (DW 79; BNR 22). This is essentially the integrative potential of democratic citizenship when it is actively used.

One assumption here is that “culture and national politics have become…differentiated” from one another; citizens can see themselves as part of a shared political culture precisely because they no longer see the state as a vehicle for realizing a homogenous, pre-political nation. While this is a far cry from empirical realities in many parts of the world, Habermas sees the European Union as illustrative in this regard. Even in a context that was once characterized by strong national identities (where the chances for such an identity might seem slimmer than in more multicultural contexts) we can start to see how “a common political culture could differentiate itself off from the various national cultures” and how “identifications with one’s own forms of life and traditions [could be] overlaid with a patriotism that has become more abstract, that now relates… to abstract procedures and principles” (NC 261; BFN 507, 465; IO 118; BNR 327; DW 78).

Finally, Habermas sees constitutional patriotism as a normative resource that could help to expand civic solidarity across political borders and uncouple legal structures from the nation-state so they could be scaled up into new institutions of international law. Such developments would allow new forms of democratic self-governance above the nation-state at regional and global levels (DW 79). These post-national implications follow naturally from Habermas’ core theoretical commitments. Deliberative democracy is committed to institutionalized discourse that in some way makes it possible for law to be justified to the persons who are affected by or subjected to it. Given increasing global interdependence, this obviously pushes in cosmopolitan directions. However, at the same time, it is important to remember that communicative power must be rooted in a community with a shared ethical-political identity, and that constitutional patriotism is parasitic upon a particular political culture. This rootedness means that civic solidarity and new forms of self-governance can stretch, but only so far.

This anchored cosmopolitanism yields a multi-level constitutionalization of international law that aims at some measure of global governance without government. While Habermas’ account of such a multi-level system is only a sketch and many details need filling in, the broad outline is clear. He proposes a system composed of supranational (global), transnational (regional), and national-level political institutions, each with a different role. A supranational organization akin to a reformed United Nations is envisioned as securing international peace, security, and core human rights. At the mid-level, transnational authorities like the EU would tackle technical issues through coordinative efforts and political issues through negotiated bargaining among sufficiently representative regional regimes of commensurate stature. Finally, nation-states would retain their status as the locus of democratic legitimation. This would require the spread of democratic structures to each nation-state so that laws can reflect the will of the community and remain reliably in line with the basic human rights secured by a supranational organization.

This vision of a multi-level political system for the constitutionalization of international law can be criticized as demanding both too much and too little. Habermas’ version of cosmopolitan deliberative democracy locates the touchstone of legitimacy in the fact that “citizens are subject only to those laws which they have given themselves in accordance with democratic procedure” (CEU 14). From this perspective of democratically legitimate law, the proposed system may demand too little. Despite Habermas’ insistence that negotiation between regional regimes could take place in a way that would not “impair deliberation and inclusion,” it is hard to see how such bargaining could really constitute a process where citizens give themselves the law through democratic procedures (CEU 19). From the perspective of rootedness in political culture, the multi-level system may also demand too much with the extension of civic solidarity to transnational regimes. Habermas clearly thinks there are limits to such an extension, as “the transnational extension of civic solidarity…comes to nothing…when it is supposed to assume a global format.” However, apart from the fact that neighboring countries might be supposed to have some minimal level of shared history and culture born out of territorial proximity and an interdependency of interests, it is unclear why this extension of solidarity would reach the levels needed to underwrite the democratic legitimacy of laws within transnational units of regional governance (CEU 62).

While Habermas is certainly aware of these criticisms, he is largely focused on defending his political theory in broad, systematic terms. If the broad normative outlines are correct, then the overall theory will stand regardless of how the empirical details are filled in. Indeed, Habermas is unusual among contemporary philosophers both in his systematic approach to large areas of theory and in his willingness to allow others to fill in the details of how particular claims might work. He has always insisted that philosophers do not speak from a privileged place of knowledge. The best they can hope for is to articulate a theory that can be convincingly and rigorously tested and debated in the public sphere. We can perhaps understand not only his political theory but several other theoretical projects in this spirit: a public intellectual putting forth a theory for testing and debate that requires further articulation by those who come after.

6. References and Further Reading

a. General Introductions to Habermas

The article presented a general and reasonably complete introduction to Habermas. However, given the breadth of his work and space constraints, the following should also be consulted:

  • 1978. McCarthy, Thomas. The Critical Theory of Jürgen Habermas.  The MIT Press.
  • 1988. White, Stephen K. The Recent Work of Jürgen Habermas. Cambridge University Press.
  • 2005. Finlayson, James Gordon. Habermas: A Very Short Introduction. Oxford University Press.
  • 2011. Fultner, Barbara (ed.) Jürgen Habermas: Key Concepts. Acumen Press.
  • 2014. Bohman, James and Rehg, William. Jürgen Habermas. Stanford Encyclopedia of Philosophy.
  • Thomas Gregersen maintains an online bibliography at the Habermas Forum.

The following printed bibliographies are also useful:

  • 2013. Corchia, Luca. Jürgen Habermas. A Bibliography: Works and Studies (1952-2013). Pisa (IT): Arnus University Books.
  • 2014. Müller-Doohm, Stephan. Jürgen Habermas – Eine Biographie. Berlin: Suhrkamp.

b. Introductory Books and Articles on Specific Themes

i. Biography

  • 2001. Matustik, Martin Beck. Jürgen Habermas: A Philosophical-Political Profile. Rowman and Littlefield.
  • 2010. Specter, Matthew G. Habermas: An Intellectual Biography. Cambridge University Press.
  • 2004. Wiggershaus, Rolf. Jürgen Habermas. Reinbek Bei Hamburg: Rowohlt.

ii. Linguistic Turn

  • 1994. Cooke, Maeve. Language and Reason. The MIT Press.
  • 1999. Lafont, Cristina. The Linguistic Turn in Hermeneutic Philosophy. Jose Medina (trans.). Cambridge University Press.
  • 2016. Lafont, Cristina. Jürgen Habermas in The Blackwell Companion to Hermeneutics, 440-445.

iii. Discourse Ethics

  • 1994. Davis, Felmon John. Discourse Ethics and Ethical Realism: A Realist Realignment of Discourse Ethics. European Journal of Philosophy, 125-142.
  • 1997. Rehg, William. Insight and Solidarity: The Discourse Ethics of Jürgen Habermas. University of California Press.
  • 2000a. Finlayson, James Gordon. Modernity and Morality in Habermas’s Discourse Ethics. Inquiry. 43, 319-40.

iv. Political Theory

  • 1994. Bohman, James. Review: Complexity, Pluralism, and the Constitutional State: On Habermas’s Faktizitat und Geltung. Law and Society Review, 897-930.
  • 2002. Discourse and Democracy: Essays on Habermas’s Between Facts and Norms, ed. Rene von Schomberg and Kenneth Baynes. SUNY Press.
  • 2010. Hedrick, Todd. Rawls and Habermas: Reason, Pluralism, and the Claims of Political Philosophy. Stanford University Press

c. Works Cited

Most of Habermas’s work can be found in German and English. After the original year of publication and title, see the square brackets for the English translation. For some texts a translation does not exist, only exists in part, or is divided between texts. This is denoted by an asterisk (*).

  • 1953. Mit Heidegger gegen Heidegger denken. Zur Veröffentlichung von Vorlesungen aus dem Jahre 1935. Frankfurter Allgemeine Zeitung. July 25, 1953. [English: 1977]
  • 1956. Der Zerfall der Institutionen (Arnold Gehlen). Frankfurter Allgemeine Zeitung. July 4, 1956.*
  • 1958. Philosopische Anthropologie: Ein Lexikonartikel. In: A. Diemer, I. Frenzel (eds.) Fischer-Lexikon Philosophie. Frankfurt am Main: Fischer. Pp. 18-35.*
  • 1962. Strukturwandel der Öffentlichkeit. Darmstadt: Luchterhand. [English: 1989]
  • 1967. Probleme einer philosophischen Anthropologie. Lecture transcript from the 1966/7 Winter semester at the University of Frankfurt (unauthorized edition).*
  • 1967. Zur Logik der Sozialwissenschaften. Tübingen: J.C.B. Mohr. [English: 1988a]
  • 1968a. Technik und Wissenschaft als Ideologie. Frankfurt am Main: Suhrkamp. [English: 1970, 1973b*]
  • 1968b. Erkenntnis und Interesse. Frankfurt am Main: Suhrkamp. [English: 1971b]
  • 1969. Protestbewegung und Hochschulreform. Frankfurt am Main: Suhrkamp. [English, 1970]
  • 1970 Toward a Rational Society, J.J. Shapiro (trans.) Boston: Beacon.
  • 1971a. Theorie und Praxis. Frankfurt am Main: Suhrkamp. [English: 1973b]
  • 1971b. Knowledge and Human Interests. J. J. Shapiro (trans.). Boston: Beacon.
  • 1971c. The Christian Gauss Lectures: Reflections on the linguistic foundations of sociology [For published versions, see chapter 1 of German 1984b and pp.1-103 of English 2001* original translation for lecture purposes by Jeremy Shapiro; re-translated for publication by Barbara Fultner]
  • 1973a. Wahrheitstheorien. In H. Fahrenbach (ed.), Wirklichkeit und Reflexion. Pfullingen: Neske. 211-265. Reprinted as chapter 2 in 1984b.*
  • 1973b. Theory and Practice, J. Viertel (trans.). Boston: Beacon.
  • 1973c. Nachwort / Postscript to Knowledge and Human Interests: Philosophy of the Social Sciences [included in all subsequent printings of Knowledge and Human Interests].
  • 1973d. Legitimationsprobleme im Spätkapitalismus. Frankfurt am Main: Suhrkamp. [English: 1975]
  • 1975. Legitimation Crisis, T. McCarthy (trans.). Boston: Beacon.
  • 1976a. Zur Rekonstruktion des Historischen Materialismus. Frankfurt am Main: Suhrkamp. [English: 1979*]
  • 1976b. Was heißt Universalpragmatik? In K.-O. Apel (ed.), Sprachpragmatik und Philosophie. Frankfurt am Main: Suhrkamp. 174-272. [English: 1979, chap. 1]
  • 1977. Martin Heidegger, on the publication of lectures from the year 1935. Graduate Faculty Philosophy Journal 6, no. 2: 155-180.
  • 1979. Communication and the Evolution of Society, T. McCarthy (trans.) Boston: Beacon.
  • 1981. Theorie des kommunikativen Handelns. Band I: Handlungsrationalität und gesellschaftliche Rationalisierung. Band II: Zur Kritik der funktionalistischen Vernunft. Frankfurt am Main: Suhrkamp. [English: 1984a and 1987]
  • 1983. Moralbewußtsein und kommunikatives Handeln. Frankfurt am Main: Suhrkamp. [English: 1990a]
  • 1984a. The Theory of Communicative Action, Volume I: Reason and the Rationalization of Society, T. McCarthy (trans.). Boston: Beacon.
  • 1984b. Vorstudien und Ergänzungen zur Theorie des kommunikativen Handelns. Frankfurt am Main: Suhrkamp. [English: 2001* does not include the Wahrheitstheorien essay]
  • 1985. Die Neue Unübersichtlichkeit: Kleine Politische Schriften V. Frankfurt am Main: Suhrkamp. [English: 1991]
  • 1986a. Gerechtigkeit und Solidarität: Eine Stellungnahme zur Diskussion über Stufe 6. In W. Edelstein and G. Nunner-Winkler (eds), Zur Bestimmung der Moral. Frankfurt am Main: Suhrkamp. 291-318. [English: 1990b]
  • 1986b. Entgegnung. In A. Honneth and H. Joas (eds), Kommunikatives Handeln. Frankfurt am Main: Suhrkamp. 327-405. [English: 1991b]
  • 1987. The Theory of Communicative Action. Vol. II: Lifeworld and System, T. McCarthy (trans.). Boston: Beacon.
  • 1988a. On the Logic of the Social Sciences, S. W. Nicholsen and J. A. Stark (trans.). Cambridge, MA: MIT Press.
  • 1988b. Nachmetaphysisches Denken. Frankfurt am Main: Suhrkamp. [English: 1992a]
  • 1989. The Structural Transformation of the Public Sphere, T. Burger and F. Lawrence (trans). Cambridge, MA: MIT Press.
  • 1990a. Moral Consciousness and Communicative Action, C. Lenhardt and S. W. Nicholsen (trans). Cambridge, MA: MIT Press.
  • 1990b. Justice and solidarity: On the discussion concerning stage 6. In: T. E. Wren (ed.), The Moral Domain. Cambridge, MA: MIT Press. 224-251, S. W. Nicholsen (trans.).
  • 1991a. Erläuterungen zur Diskursethik. Frankfurt am Main: Suhrkamp. [English: 1993]
  • 1991b. The New Conservatism: Cultural Criticism and the Historians’ Debate. S. W. Nicholsen (trans.).
  • 1992a. Postmetaphysical Thinking, W. M. Hohengarten (trans.). Cambridge, MA: MIT Press.
  • 1992b. Faktizität und Geltung. Beiträge zur Diskurstheorie des Rechts und des demokratischen Rechtsstaats. Frankfurt am Main: Suhrkamp. [English: 1996b]
  • 1993. Justification and Application, C. P. Cronin (trans.). Cambridge, MA: MIT Press.
  • 1994. The Past as Future. M. Pensky and P. Hohendahl (trans.). University of Nebraska Press.
  • 1996a. Die Einbeziehung des Anderen. Studien zur politischen Theorie. Frankfurt am Main: Suhrkamp. [English: 1998a]
  • 1996b. Between Facts and Norms: Contributions to a Discourse Theory of Law and Democracy, W. Rehg (trans.). Cambridge, MA: MIT Press. [German, 1992]
  • 1998a. Inclusion of the Other: Studies in Political Theory, C. Cronin and P. DeGreiff (eds). Cambridge, MA: MIT Press.
  • 1998b. Die postnationale Konstellation. Frankfurt am Main: Suhrkamp. [English: 2001a]
  • 1999a. Wahrheit und Rechtfertigung. Frankfurt am Main: Suhrkamp. [English: 2003a]
  • 2001a. The Postnational Constellation, M. Pensky (trans., ed.). Cambridge, MA: MIT Press.
  • 2001b. Die Zukunft der menschlichen Natur. Auf dem Weg zu einer liberalen Eugenik? Frankfurt am Main: Suhrkamp. [English: 2003b]
  • 2001c. Zeit der Übergänge. Kleine politische Schriften IX. Frankfurt am Main: Suhrkamp. [English: 2004b]
  • 2003a. Truth and Justification, B. Fultner (trans.). Cambridge, MA: MIT Press.
  • 2003b. The Future of Human Nature, W. Rehg, M. Pensky, and H. Beister (trans.). Cambridge: Polity.
  • 2004a. Der gespaltene Westen. Kleine politische Schriften X. Frankfurt am Main: Suhrkamp. [English: 2006a]
  • 2004b. Time of Transitions. C. Cronin (trans.). Cambridge: Polity Press.
  • 2006a. The Divided West. C. Cronin (trans.). Cambridge: Polity Press.
  • 2006b. Pre-political foundations of the democratic constitutional state. In J. Habermas and J. Ratzinger The Dialectics of Secularization: On Reasons and Religion. B. McNeil (trans.). San Francisco: Ignatius. 19-52.
  • 2007. Kommunikative Vernunft und grenzüberschreitende Politik. Eine Replik. In: Anarchie der kommunikativen Freiheit. Jürgen Habermas und die Theorie der internationalen Politik. Peter Niesen and Benjamin Herborth (eds.). Frankfurt am Main: Suhrkamp Press.
  • 2008. Between Naturalism and Religion. C. Cronin (trans.). Cambridge: Polity Press.
  • 2008. Ach Europa. Kleine politische Schriften XI. Frankfurt am Main: Suhrkamp. [English: 2009]
  • 2009. Europe: the Faltering Project. C. Cronin (trans.). Cambridge: Polity Press.
  • 2012: The Crisis of the European Union: A Response. C. Cronin (trans.). Cambridge: Polity Press.
  • 2014. The Lure of Technocracy. C. Cronin (trans.). Cambridge: Polity Press.
  • 2014. Entgegnung (x13) and Schlusswort. In: Habermas und der Historische Materialismus. Smail Rapic (ed.). München: Karl Alber Press.

d. Secondary Scholarship Beyond the Subject-Specific Recommendations Cited Above

  • 1990. Apel, Karl-Otto. Diskurs und Verantwortung. Frankfurt am Main: Suhrkamp.
  • 2002. Apel, Karl-Otto. Regarding the relationship of morality, law and democracy: on Habermas’s Philosophy of Law (1992) from a transcendental-pragmatic point of view. In: Habermas and Pragmatism. Aboulafia, Mitchell, Myra Bookman and Catherine Kemp (eds.). 17-30.
  • 2014. Brunkhorst, Hauke. Critical Theory of Legal Revolutions: Evolutionary Perspectives. Bloomsbury.
  • 2009. Brunkhorst, Hauke, Regina Kreide and Cristina Lafont (eds.) Habermas-Handbuch. Stuttgart: JB Metzler.
  • 2000b. Finlayson, James Gordon. What Are ‘Universalizable Interests’? Journal of Political Philosophy. vol. 8, no. 4, 456-469.
  • 2004. Flynn, Jeffrey. Communicative Power in Habermas’s Theory of Democracy. European Journal of Political Theory, 433-454.
  • 1993. Günther, Klaus. The Sense of Appropriateness. J. Farrell (trans.). Albany: SUNY Press.
  • 1991. Honneth, Axel and Hans Joas. Communicative Action: Essays on Jürgen Habermas’s Theory of Communicative Action. Gaines, Jeremy and Doris L. Jones (trans.). Cambridge, MA: MIT Press.
  • 2003. Horkheimer, Max and Theodor Adorno. Dialectic of Enlightenment. G. Schmid Noerr (ed.), E. Jephcott (trans.). Stanford: Stanford University Press.
  • 2001. Joas, Hans. Values versus Norms: a pragmatic account of moral objectivity. In: The Hedgehog Review 3, 42-56.
  • 2012. Lafont, Cristina. Agreement and Consent in Kant and Habermas: Can Kantian Constructivism be fruitful for democratic theory? In: The Philosophical Forum. 43/3, 277-95.
  • 2009. Lafont, Cristina. Religion and the Public Sphere: What are the deliberative obligations of democratic citizenship? Philosophy and Social Criticism 35, 127-150.
  • 1991. McCarthy, Thomas. Ideals and Illusions, Cambridge, MA: MIT Press.
  • 2007. Müller, Jan-Werner. Constitutional Patriotism. Princeton University Press.
  • 2000. Müller-Doohm, Stefan (ed.). Das Interesse der Vernunft: Rückblicke auf das Werk von Jürgen Habermas seit “Erkenntnis und Interesse”. Frankfurt am Main: Suhrkamp.
  • 2007. Niesen, Peter and Benjamin Herborth (eds.). Anarchie der kommunikativen Freiheit-Jürgen Habermas und die Theorie der internationalen Politik. Frankfurt am Main: Suhrkamp Press.
  • 2002. Owen, David. Between Reason and History: Habermas and the Idea of Progress. Albany: SUNY Press.
  • 2014. Rapic, Smail (ed.). Habermas und der Historische Materialismus. München: Karl Alber Press.
  • 1989. Rockmore, Tom. Habermas and Historical Materialism. Bloomington and Indianapolis: Indiana University Press.
  • 1998. Rosenfeld, Michel, and Andrew Arato (eds). Habermas on Law and Democracy, Berkeley: University of California Press.
  • 1982. Thompson, John B. and David Held. Habermas: Critical Debates. Cambridge, MA: MIT Press.
  • 1991. Wellmer, Albrecht. Ethics and dialogue: Elements of moral judgment in Kant and discourse ethics. In: A. Wellmer, The Persistence of Modernity, D. Midgley (trans.). Cambridge, MA: MIT Press. 113-231.
  • 1995. White, Stephen K. (ed.). The Cambridge Companion to Habermas. Cambridge: Cambridge University Press.
  • 2007. Yates, Melissa. Rawls and Habermas on religion in the public sphere. Philosophy and Social Criticism. 33, 880-891.

 

Author Information

Max Cherem
Email: Max.Cherem@kzoo.edu
Kalamazoo College
U. S. A.

Ancient Aesthetics

It could be argued that ‘ancient aesthetics’ is an anachronistic term, since aesthetics as a discipline originated in 18th-century Germany. Nevertheless, there is considerable evidence that ancient Greek and Roman philosophers discussed and theorised about the nature and value of aesthetic properties. They also undoubtedly contributed to the development of the later tradition, since many classical theories were inspired by ancient thought; ancient philosophers’ contributions to discussions of art and beauty are therefore part of the tradition of aesthetics.

The ancient Greek philosophical tradition starts with the pre-Socratic philosophers. In most cases, there is little evidence of their engagement with art and beauty, with the one notable exception of the Pythagoreans. In the Classical period, two prominent philosophers, Plato and Aristotle, emerged. They represent an important stage in the history of aesthetics. The problems they raised and the concepts they introduced are well known and discussed even today.

The three major philosophical schools in the Hellenistic period (the Epicureans, the Stoics and the Sceptics) inherited a certain philosophical agenda from Plato and Aristotle while at the same time presenting counterarguments and developing distinct stances. Their contributions to aesthetics are not as famous and, in some cases, are significantly smaller than those of their predecessors, yet in certain respects, they are just as important. In late antiquity, the emergence of Neoplatonism marks another prominent point in the aesthetic tradition. Neoplatonists were self-proclaimed followers of Plato, yet starting with the founder of the school, Plotinus, Neoplatonists advocated many distinctly original views, some of them in aesthetics, that proved to be enduringly influential.

The history of ancient Greek aesthetics covers centuries, and during this time numerous nuanced arguments and positions were developed. In terms of theories of beauty, however, it is possible to classify the theories into three distinct groups: those that attribute the origin of beauty to proportion, those that attribute it to functionality, and those that identify the Form as the cause of beauty. This classification ought not to be understood as a hard-and-fast distinction among philosophical schools, but as a way of pinpointing some major theoretical trends. Oftentimes, philosophers use a combination of these positions, and many original innovations are due to the convergence and interaction among them.

Ancient philosophers were also the authors of some of the more notable concepts in the philosophy of art. The notions of catharsis, sublimity and mimesis originated in antiquity and have played a role in aesthetics ever since then.

Table of Contents

  1. Ancient Aesthetics: Methodological Issues
    1. Aesthetics in Antiquity
    2. To Kalon
  2. Three Types of Theories about the Origin of Beauty
    1. Proportion
      1. Pythagoreans
      2. Plato and Aristotle
      3. The Stoics
    2. Functionality
      1. Xenophon
      2. Hippias Major
      3. Aristotle
      4. The Stoics
    3. Form
      1. Plato
      2. Plotinus
  3. Philosophy of Art
    1. Mimesis
      1. Plato
      2. Aristotle
    2. Criticism of Arts
      1. Plato
      2. Epicureans
    3. Catharsis
    4. Sublime
  4. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Ancient Aesthetics: Methodological Issues

a. Aesthetics in Antiquity

One of the most important foundational issues about ancient aesthetics is the question of whether the very concept of ‘ancient aesthetics’ is possible. It is generally considered that aesthetics as a discipline emerged in the 18th century. To speak of ancient Greek and Roman aesthetics, therefore, would be an anachronism. Furthermore, there are certain differences between ancient and modern approaches to the philosophical study of beauty and art that make them distinct projects. These differences were outlined and discussed by Paul Oskar Kristeller, an influential critic of ancient aesthetics. He suggested that, because the ancients were interested in the moral, religious, and practical aspects of works of art, and because they neither grouped the fine arts into a single category nor presented philosophical interpretations on that basis, aesthetics was not a philosophical discipline in antiquity (Kristeller 1951: 506).

Kristeller’s critique is still often quoted and discussed in works that deal with the ancients’ ideas on arts and beauty. The question of how compatible ancient and modern methodologies are remains a relevant issue. At the same time, Kristeller’s view has been challenged by a number of compelling arguments in 20th and early-21st century scholarship.

A number of arguments against Kristeller’s interpretation of the aesthetic tradition have been raised. These arguments also pinpoint some of the central concepts that ancient philosophers used. Stephen Halliwell criticised Kristeller’s argument by pointing out that, first, the notion of mimesis was a much more unified concept of art than Kristeller allows (see below for a more detailed explanation of mimesis). Second, the 18th-century category of fine art, established in such works as Batteux’s Les beaux arts réduits à un même principe (1746), relied on the mimetic tradition, although later the focus shifted towards different conceptions of art (Halliwell 2002: 7–8). Peponi later refuted Kristeller’s claims by pointing out that ancient Greek thinkers grouped activities we call fine arts and, moreover, were interested in the effects produced by the beautiful properties of, for instance, poetry (Peponi 2012: 2–6).

James Porter has also criticised Kristeller’s premises and conclusions on three different grounds: Kristeller’s historical account is not the only one possible; “the modern system of arts” is not as clear-cut a category as Kristeller makes it out to be; and it does not follow that the existence of the concept of fine arts indicates the emergence of aesthetic theory (Porter 2009). In addition to this, it has been argued that the ideas of Plato and Aristotle are not only relevant to the preoccupations of modern philosophers but also address the foundational questions of aesthetics and philosophy of art (Halliwell 1991).

b. To Kalon

Another methodological issue concerning ancient aesthetics is a linguistic one, namely the translation and conceptualisation of the term to kalon (honestum in Latin) whose meaning contains some ambiguity. The issue at stake is the question of when this term can and cannot be read and translated as an aesthetic one. The Greek language has a rich vocabulary of terms that are uncontroversially aesthetic, but to kalon, a fairly popular term in philosophical texts, has a range of meanings from ‘beauty’ to ‘being appropriate.’ The problem arises especially in ethical discussions, when the context does not make it clear whether the usage of the term to kalon ought to be understood as aesthetic or not.

It has been customary to translate to kalon in ethical contexts as ‘fine’ or something similar. Early 21st-century thinkers have argued, however, that to kalon and similar Greek and Latin terms (to prepon in Greek; honestum and decorum in Latin) ought to be read as aesthetic concepts. The translations that ignore the aesthetic aspect of these terms may not capture their meaning accurately (Bychkov 2010: 176). Or, more specifically, the use of to kalon in Aristotle’s works often has aesthetic meaning and, therefore, can be translated as ‘beautiful’ (Kraut 2013). At the same time, some studies of Aristotle’s use of to kalon have argued that the conceptualisation and translation of the term depend on the context in which it is found. In the context of ethical discussions, more neutral or ethical translations ought to be preferred over aesthetic ones (Irwin 2010: 389–396).

2. Three Types of Theories about the Origin of Beauty

a. Proportion

The idea that beauty in any given object originates from the proportion of the parts of that object is one of the most straightforward ways of accounting for beauty. The most standard term for denoting this theory is summetria, meaning not bilateral symmetry, but good, appropriate or fitting proportionality.

The idea that beauty derives from summetria is usually attributed to the sculptor Polycleitus (5th century B.C.E.), who wrote a treatise entitled Canon containing a discussion of the exact proportions that generate beauty and then made a statue, also entitled Canon, exemplifying his theory. Little is known of Polycleitus’ work and ideas, but when the famous Roman architect Vitruvius used this notion in his De Architectura, he explained it in terms of specific numerical ratios. For instance, the distance from the chin to the crown of the head is an eighth part of a person’s whole height; the length of the foot is a sixth part of the height of the body, while the forearm is a fourth part. Vitruvius then adds that ancient painters and sculptors achieved their renown by following these principles (Book 3.1.2). It is likely that Polycleitus’ treatise had similar contents, such as a discussion of specific ratios that produce beauty in a human body, and was therefore useful for making sculptures of idealised human forms.
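As a rough arithmetical illustration of how such ratios are meant to work, the proportions just mentioned can be written out for a sample figure; the height of 180 cm below is an arbitrary value chosen for the example, not a figure given by Vitruvius:

\[
h = 180\ \text{cm}: \qquad \frac{h}{8} = 22.5\ \text{cm (chin to crown)}, \qquad \frac{h}{6} = 30\ \text{cm (foot)}, \qquad \frac{h}{4} = 45\ \text{cm (forearm)}.
\]

On this kind of scheme, a body is well-proportioned when each of its parts stands in the prescribed ratio to the whole, which is why such a canon could guide the making of idealised statues.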

i. Pythagoreans

Equally, if not more, significant for the philosophical tradition are Pythagorean ideas about the fundamentality of numbers. Of course, Pythagoreanism was far from a unified school of thought; diverse philosophers were given that name during antiquity. The Pythagoreans referred to here are the philosophers active during the 5th and 4th centuries B.C.E., such as Philolaus and Archytas.

Numbers, according to this strand of Pythagoreanism, underlie the basic ontological and epistemological structure of the world and, as a result, everything in the world can be explained in terms of numbers and the relationship between them, namely, proportion. Beauty is one of the properties that the Pythagorean philosophers use to support their doctrine, because they claimed its presence can be fully explained in terms of numbers or, to be more precise, the proportion and harmony that is expressed in numerical relationships.

Sextus Empiricus recorded the Pythagorean argument that sculpture and painting achieve their ends by means of numbers, and thus art cannot exist without proportion and number. Art, the argument continues, is a system of perceptions and the system is reducible to a number (Sextus Empiricus Against the Logicians Book 1.108–9).

The Pythagoreans had a well-known interest in music. The evidence on this topic is wide-ranging: from the reputation of Pythagoras as the first to pinpoint the mathematics underlying the Greek musical scale, to Socrates’ remark in the Republic attributing to the Pythagoreans the claim that music and astronomy were sister sciences (Rep. 530D). Music is also said to have a positive influence on a person’s soul. According to testimony from Aristoxenus, music had an effect on a person’s soul comparable to the effect that medicine has on a person’s body (Diels, II. 283, 44). Arguably, this role was attributed to music because it expresses the harmonizing influence of numbers.
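By way of illustration, the concordant intervals that later testimony standardly credits the Pythagorean tradition with expressing in whole-number ratios can be stated as follows (these specific ratios come from that standard later testimony on Pythagorean harmonics, not from the passages cited above):

\[
\text{octave} = 2:1, \qquad \text{perfect fifth} = 3:2, \qquad \text{perfect fourth} = 4:3.
\]

Understood as ratios of string lengths, halving the length of a vibrating string, for example, raises its pitch by an octave, which is the sort of numerical regularity the Pythagoreans took to underlie musical beauty.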

ii. Plato and Aristotle

Although generally speaking, Plato is best classified as a Form Theorist, a small number of passages in the Platonic corpus suggest a viewpoint derived from summetria, that is, a good proportion or ratio of parts.

In the Timaeus, lacking summetria is associated with lacking beauty (87D). Similarly, both in the Republic and the Sophist, beauty is said to derive from arrangements (R. 529D-530B and Sph. 235D–236A respectively). Plato’s use of summetria raises the question of how this theory was supposed to function alongside the idea that beauty derives from the form of beauty. Most likely, however, there was no contradiction for Plato. Summetria is one of the properties that beautiful things have, rather than the cause of beauty, which is its form. Summetria, as well as such properties as colour and shape, is one of the aspects that an object gains by partaking in the form.

The case is similar in the Aristotelian corpus. Aristotle named summetria one of the chief forms of beauty, alongside order and definiteness (Metaphysics 13.1078a30–b6). The context for this definition is the refutation of the view, put forth by the sophist Aristippus, that mathematics has nothing to say about the good and the beautiful (Metaphysics 3.996a). Since the causes of aesthetic properties are describable in mathematical terms, mathematics does, in fact, have something to say about these things. Similarly, in the Physics, bodily beauty (kallos) is named as one of the excellences that depend on particular relations (Ph. 246b3–246b19), and in the Topics, it is said to be a kind of summetria of limbs (Topics 116b21). The beautiful (to kalon) is also identified with being well arranged in On the Universe (397a6).

At the same time, Aristotle did not think that summetria was a sufficient condition for beauty. He claimed that size was also necessary for beauty. In Nicomachean Ethics 4.3, beauty is said to imply a good-sized body, so that small people may be well-proportioned but not beautiful. A city, likewise, is required to be of a certain size before it can be called beautiful (Politics 7.4).

iii. The Stoics

Summetria assumed a much more significant role in Stoicism. The Stoics defined beauty as originating from the summetria of parts with each other and with the whole. Galen (On the Doctrines of Hippocrates and Plato 5.3.17) attributes this definition to Chrysippus, the third head of the school, but all other testimonia describe it simply as the Stoic definition. The definition is meant to apply both to the beauty of the body and to the beauty of the soul (Arius Didymus Epitome of Stoic Ethics 5b4–5b5 (Pomeroy); Stobaeus Ecl. 2.62, 15). Some sources suggest that there are additional conditions: for the former, colour, and for the latter, the stability or consistency of beliefs (Plotinus Ennead 1.6.1; Cicero Tusculan Disputations 4.13.30). In many respects, the Stoics inherited this understanding of beauty from their predecessors, but it is worth noting that they also often invoked the notion of functional beauty. Stoic aesthetics, therefore, was likely a combination of functional and proportion theories.

b. Functionality

The theory of functional beauty is the idea that beauty arises in an object when that object performs its function, achieves its end or fits its purpose, especially when it does so particularly well. In an ancient philosophical context, this idea is often associated with the notion of dependent beauty: an object is beautiful if it excels at functioning as the kind of object it is. It is also noteworthy that the Greek term to kalon, often—but not always—used as an aesthetic term, can denote being fitting or well executed. The functionalist theory of beauty may therefore have been more linguistically intuitive to ancient Greeks than it is possible to convey in English.

i. Xenophon

It is hard to attribute this theory to one particular philosopher, since functionalist arguments are fairly common in ancient philosophical texts. An example of the functional theory can be found in Xenophon’s Memorabilia. Socrates first makes a point about dependent beauty by saying “a beautiful wrestler is unlike a beautiful runner, a shield beautiful for defence is utterly unlike a javelin beautiful for swift and powerful hurling” (3.8.4). Then he further develops this point by adding that “it is in relation to the same things that men’s bodies look beautiful and good and that all other things men use are thought beautiful and good, namely, in relation to those things for which they are useful” (3.8.5).

It is not obvious that the term to kalon employed here is used in an aesthetic sense, but a few lines down, it is said that “the house in which the owner can find a pleasant retreat at all seasons and can store his belongings safely is presumably at once the pleasantest and the most beautiful. As for paintings and decorations, they rob one of more delights than they give” (3.8.10). This remark highlights that the issue at stake is aesthetic phenomena, and that a much greater pleasure is to be gained from perceiving functionality than from perceiving pleasing, yet artificial, colours (paintings) and structures.

ii. Hippias Major

A functional definition of beauty is also found in Plato’s dialogue Hippias Major. In this dialogue, Socrates engages in a discussion with Hippias, a sophist, in order to discover the definition of beauty. They each offer a number of possible definitions, and one of them, proposed by Socrates, is a functional definition.

It is argued that stone, rather than ivory, is the more beautiful material for the eye pupils of Phidias’ statue, and that a fig-wood ladle is better suited, and therefore more beautiful, than a gold one for making soup. Socrates proposes these two cases as objections to Hippias’ proposal that beauty is gold. By presenting two cases in which the beauty-making property is not some inherent property of an object but that object’s functionality, Socrates rejects Hippias’ suggestion. This move also leads to examining the possibility that all beauty is to be defined as deriving from functionality, but this option is ultimately rejected as well, on the grounds that it appears to rely on a kind of deception, because it prioritizes how things appear over how things truly are (290D–294E).

iii. Aristotle

In Aristotle’s work, there are many instances of excellence in functionality described by the term to kalon. In fact, Aristotle states outright that fitting a function and to kalon are the same (Top. 135a12–14). Since this term can be used both aesthetically and non-aesthetically, it is a matter of contention whether in some specific cases the reference for this term is meant to be an aesthetic phenomenon or not.

If to kalon is read aesthetically, some of the most pertinent passages for the functionalist understanding of aesthetic properties would come from Aristotle’s descriptions of natural phenomena. For instance, according to Generation of Animals, the generation of bees reveals a kalon arrangement of nature; the generations succeed one another even though drones do not reproduce (760a30–b3). In Nicomachean Ethics, Aristotle states that dogs do not enjoy the scent of rabbits as such, but the prospect of eating them; similarly, the lion appears to delight in the lowing of an ox, but only because it perceives a sign of potential food (1118a18–23).

iv. The Stoics

A certain kind of functional and aesthetic language also appears in some Stoic arguments, most notably in the works of Panaetius, who used the term to prepon (‘fitting’, ‘becoming’, ‘appropriate’ in English) in certain ethical arguments. Probably the most elaborate discussion of to prepon (or decorum in Latin) is recorded in Cicero’s On Duties, which represents Panaetius’ views.

Here, an analogy between poetry and human behaviour is drawn as follows. The poets “observe propriety, when every word or action is in accord with each individual character.” The poets depict each character in a way which is appropriate regardless of the moral value of the character’s actions, so that a poet would be applauded even when he skilfully depicts an immoral person saying immoral things. To human beings, meanwhile, nature has also assigned a kind of role, namely that of manifesting virtues like steadfastness, temperance, self-control, and so forth. This claim reflects one of the essential tenets of Stoic ethics: that eudaimonia consists in living in accordance with nature and pursuing virtue (Diogenes Laertius 7.87–9; Long and Sedley 63C). Human beings, therefore, are functional entities as well, in the sense that they have a certain function and end. The idea that achieving that end produces beauty is made clear when it is said that, just as physical beauty consisting of the harmonious proportion of limbs delights the eye, so too does to prepon in behaviour earn the approval of fellow humans through the order, consistency and self-control imposed on speech and acts (1.97–98).

c. Form

i. Plato

Plato’s best-known doctrine, the theory of forms, bears on his aesthetics in a number of ways. The theory posits that incorporeal, unchanging, ideal paradigms—forms—are universals and play an important causal role in the generation of the world. Arguably the most important way in which the theory of forms bears on aesthetics is the account of the origin of aesthetic properties. Beauty, just like many other properties, is generated by its respective form. An object becomes beautiful by partaking in the form of Beauty. The form of Beauty is mentioned as the cause of beauty throughout the Platonic corpus; see, for instance, Cratylus 439C–440B; Phaedrus 254B; Phaedo 65D–66A and 100B–E; Parmenides 130B; Republic 476B–C, 493E, 507B. In this respect, the form of Beauty is just like all the other forms. Plato does, however, say that the form of Beauty has a special connection with the form of the Good, even if they are not, ultimately, identical (Hippias Major 296D–297D).

The form of Beauty is shown as having a pedagogical aspect in the Symposium. In Diotima’s speech, the acquisition of knowledge (that is, the knowledge of the forms) is represented as the so-called Ladder of Love. A lover is said first to fall in love with an individual body, then to notice that there are commonalities among all beautiful bodies and thus to become an admirer of human form in general. The lover then starts appreciating the beauty of the mind, followed by the beauty of institutions and laws. The love of the sciences is the next step on the ladder, until the lover perceives the form of Beauty. The form is said to be everlasting, not increasing or diminishing, not beautiful at one point and ugly at another, not beautiful only in relation to some specific condition, and not in the shape of any specific thing, such as a limb, a piece of knowledge or an animal. Instead, it is absolute, everlasting, unchanging beauty itself (210A–211D).

ii. Plotinus

Plotinus, a self-proclaimed follower of Plato, was also committed to the view that beauty originates from the form of Beauty, though he added some further elaborations of his own. Plotinus presents this account as a rival to the summetria theory. His treatise On Beauty (Ennead 1.6) starts with an elaborate critique of the rival theory. Plotinus claims that accounting for beauty by means of summetria has a number of drawbacks. For instance, it cannot explain the beauty of unified objects that do not have parts, such as a piece of gold.

According to Plotinus’ own theory, an object becomes beautiful by virtue of participating in the form. He also adds that the Intellect (nous) is the cause of beauty. To be precise, it is the Intellect that imposes the forms onto passive matter, thus producing beauty. Those entities that do not participate in form, and thus in reason, are ugly (1.6.2). The form is therefore capable of producing beauty by virtue of being an instrument of the Intellect, which creates order and structure out of chaotic matter in the universe, and beauty is an expression of its designing powers.

Apart from being expressions of the Intellect, forms have another aspect that makes them the cause of beauty; namely, they unify disarrayed and chaotic elements into harmony. When form approaches formless matter, it introduces a certain intrinsic agreement, so that many parts are brought into unity and harmony with each other. The form has intrinsic unity and is one, and therefore, it turns the matter it shapes into one as well, as far as it is possible. The unity produces beauty which ‘communicates itself’ to both the parts and the whole (1.6.2).

Plotinian metaphysics and aesthetics converge in the analogy between the Intellect shaping the universe and a sculptor shaping a piece of stone into a statue. At the beginning of Ennead 5.8 (On the Intelligible Beauty), Plotinus asks his readers to envision two pieces of stone placed next to each other, one plain and the other sculpted into the shape of an especially beautiful human or some god. He then argues that the latter will appear immediately beautiful, not because of the material it is made out of but because it possesses the form. The beauty is caused by the intellect of the sculptor, which transmits the form onto the stone. The visible form that the sculptor imposes onto the stone is an inferior version of the actual form, which can only be contemplated. The actual forms are purely intellectual, ‘seen’ with the mind’s eye. The intellectual beauty of reason, argues Plotinus, is a much greater and also truer beauty (En. 5.8.1).

Plotinus follows Plato in arguing that visible beauty is inferior as it is only a copy of the true beauty of forms. There is, however, a significant difference between them in terms of their attitude towards the value of artistic beauty. Plotinus warns against devaluing artistic activities and, in an argument very much unlike those found in Plato, states that (i) nature itself imitates some things. (ii) Arts do not simply imitate what is seen by the eye but refer back to the principles of nature. (iii) Arts produce many things not by means of copying, but from themselves. In order to create a perfect whole, they add what is lacking, because arts contain beauty themselves. (iv) Phidias (one of the most famous Greek sculptors) designed a statue of Jupiter not by imitation, but by conceiving a form that a god would take if he were willing to show himself to humans (5.8.1).

3. Philosophy of Art

a. Mimesis

In older scholarship, it is common to find the claim that the Greek term for art was techne, and that since this term covers crafts and skills generally and thus does not map onto the contemporary concept of fine art, the ancient Greeks did not have a concept of fine art. This interpretation, however, has been challenged. It has been argued that, if there were a concept of fine art in Greek thought, it would be mimesis. In the most literal meaning of the term, mimesis refers to imitation in a very broad sense, including such acts as following the example of someone’s behaviour or adopting a certain custom. The word is widely used in discussions of art and artistic activities, and it can be roughly defined as imitative representation, where ‘representation’ is understood as involving not just copy-making but also creative interpretation. Aristotle grouped poetry with “the other mimetic arts” (8.1451a30) in the Poetics, in a remark that suggests the conceptualisation of a distinct group of artistic activities resembling the notion of fine arts. A similar grouping of “imitators” (mimetai), including poets, rhapsodists, actors, and chorus-dancers, can be found in Plato’s Republic as well (2.373B).

i. Plato

Books 2 and 3 of Plato’s Republic contain an extensive analysis of mimesis in the context of the education of the guardian class in the ideal city-state. In Book 2, Socrates starts developing his account of the ideal city-state. The class of guardians plays an especially important role in its maintenance, and therefore the question of how the guardians ought to be educated is raised. Education based on storytelling is especially important, as it starts early in childhood and precedes physical education (2.376E).

First, Socrates and the interlocutors agree to ban certain stories, on account of their content, from the guardians’ education and from the ideal city-state more generally, particularly stories depicting the gods committing evil deeds (2.377D–E). At the start of Book 3, there is a longer list of the kinds of stories that are undesirable in the ideal city, including ones with negative portrayals of the afterlife, lamentations, gods committing unseemly acts and portrayals of bad people as happy (386A–392C).

Then there follows a discussion of the style (Gr. lexis) of narration. Socrates distinguishes direct speech, in which a poet speaks in his own voice, from imitative speech, in which a poet imitates the speech of the characters in the story, and suggests that if a poem is written in the former style, it contains no mimesis (3.393D). Poetry can be of three kinds: dithyrambs (in the poet’s own voice, no mimesis), tragedy and comedy (pure mimesis), and epic poetry (a combination of the two) (3.394C).

The discussion turns towards the question of whether mimetic poets ought to be allowed into the city-state and whether guardians themselves could be mimetai. The answer to this question turns out to be negative. The main argument against mimesis in the ideal city goes as follows. The guardians preserve the well-being of the city, and thus the only things they ought to imitate are the properties of virtue, not shameful or slavish acts. The reason for this is that enjoying the imitation of such things might lead them actually to pursue them, as imitation is habit-forming (3.395D–E). It is ultimately concluded that only a pure imitator of a good person ought to be allowed into the city-state (3.397D–398B).

ii. Aristotle

Aristotle argues that poetry originates from two causes. Both of these causes are grounded in human nature, particularly the natural proneness of human beings to mimesis. Mimesis is said to be (i) the natural method of learning from childhood and (ii) a source of delight for human beings.

In order to support the latter point, Aristotle notes that although such objects as dead bodies and low animals might be painful to see in real life, we delight in artistic depictions of them, and the reason for this is the pleasure humans derive from learning. People delight in seeing a picture, either because they recognise the person depicted and ‘gather the meaning of things’ or—if they do not recognise the subject—they admire the execution, colour, and so on. (The distinction between mimesis and colour/composition is reiterated in Politics, where colour and figures are said to be not imitations but signs with little connection to morality, and therefore, young men ought to be taught to look at those paintings which depict character (1340a32–39).) This principle applies not only to visual arts. The natural inclination to mimesis combined with the sense of harmony and rhythm is the reason why humans are drawn to poetry as well (Poetics 1448b5–1448b24).

Aristotle’s conceptual analysis of poetry contains a revealing discussion of the differences between poetry and history. Poets differ from historians by virtue of describing not what happened, but what might happen, either because it is probable or necessary. They do not, however, differ because one is set in prose and another one in verse, as the works of Herodotus could be set in verse and remain history. The fundamental difference between history and poetry lies in the fact that the former is concerned with statements about particulars, whereas the latter is concerned with universal statements. Some tragedies do use historical characters, but this, according to Aristotle, is because “what is possible is credible,” which presumably means that plots involving historical characters are more moving because they might have actually happened. Another notable conclusion is that the poet is a poet because of the plot rather than the verse, as the defining characteristic of such activity is the imitation of action (1451a37–b31).

b. Criticism of Arts

i. Plato

Unlike Aristotle, Plato saw potential dangers associated with mimetic activities. In Republic 5, “lovers of beautiful sights and sounds,” people addicted to music, drama and so on, are contrasted with true philosophers. The lovers of sights and sounds pursue only opinions, whereas philosophers are the pursuers of knowledge and, ultimately, beauty in itself (5.475D–480A).

But perhaps the best-known argument criticising art comes from Book 10 of the Republic. Here, the products of artistic activities are criticised for being twice removed from what is actually the case. Socrates uses the example of a symposium couch to argue that the painting of a couch is just a copy of reality, the actual couch. Yet the actual couch made by the craftsman is also just a copy of the true reality, the forms. The painters, according to this argument, portray only a small portion of what is actually the case. For the most part, they are concerned with appearances. There are, thus, three kinds of couches: one produced by god, another produced by a carpenter and the third by a painter. God and the carpenter are called makers or producers of their kinds of couches, but the painter is only an imitator, a producer of a product that is, in Plato’s phrase, thrice removed from nature (that is, at two removes from the form). This category is also said to include tragedians and all the other imitators (596D–597E).

These and other passages have earned Plato a reputation of being hostile to art. Plato’s theory of art, however, is much more complex, and criticism is only one aspect of his treatment of artistic mimesis. An example of a more constructive understanding of artistic imitation can be found in the same work where he famously criticises it, the Republic.

For instance, Socrates suggests that there is an analogy between the ideal political state they are discussing and an idealised portrait, arguing that no one would think the latter is flawed because the painter cannot produce an ideal person in reality and, therefore, there is no need to worry that their ideal state does not actually exist (5.472D–E). Socrates’ remark indicates that there is much more to painting than the copying of appearances. Ideas like these can be found throughout the Republic (see also 6.500E–501C; 3.400E–401A).

In fact, after banishing poetry from the ideal city earlier, Socrates praises Homer, who is said to be the best of the tragedians, and a concession is made for hymns to the gods and eulogies to good people. Socrates also adds that even imitative poetry could be welcomed in the city, provided there is an argument showing that it ought to belong in such well-governed places (10.606E–607C). The ancient quarrel between poets and philosophers, as Plato called it, was thus neither unambiguous nor a settled matter.

ii. Epicureans

The Epicureans, members of the Hellenistic philosophical school notorious for its atomist physics and hedonist ethics, were also critics of poetry. Epicurean ethical views, especially the claim that death is not an evil, played a major role in shaping their perspective on poetry. The extant works of the founder of the school, Epicurus, show him criticising muthos, the stories told by poets. Epicurus was concerned with the dangerous influence that these stories could have on those who hear them. The stories of poets are based on beliefs that produce the feeling of anxiety in listeners (for instance, the belief that life is full of pain and that it is best not to be born at all). The opposite of these are beliefs gained by studying nature and engaging in philosophical investigations. Such studies lead to the discovery that the greatest pleasure in life is ataraxia (the state of tranquillity) and the abolition of the fear of pain and death (Letter to Menoeceus 126–7; Principal Doctrines 12). Epicurus also notoriously argued against receiving the traditional education (paideia) that includes an education in poetry (Letter to Pythocles 10.6; Plutarch 1087A).

It is noteworthy, however, that Epicurus was not unequivocally opposed to poetry and arts. Some evidence suggests that he maintained that only an Epicurean would discuss music and poetry in the right way, although the Epicureans would not take up writing poetry themselves (Diogenes Laertius, Lives 10.120; Plutarch 1095C). It appears that for Epicurus, like Plato, arts were problematic because of their power to impart incorrect beliefs and emotions that pose risks to one’s ataraxia.

Lucretius, the author of the Epicurean epic poem De Rerum Natura, espouses a somewhat different attitude toward poetry. Written in the 1st century B.C.E. in Latin, the poem is an exposition of Epicurean views including atomism, hedonistic ethics and epistemic dogmatism (especially against attacks from the Sceptics). As a whole, the poem engages very little with aesthetic issues, with the exception of the often-quoted passage from Book 1, in which Lucretius talks about the effects of poetry. He compares himself to a physician who, administering unpleasant-tasting wormwood, covers the brim of the glass with honey, not to deceive his patients, but to help them take the medicine and become better. In the same way, Lucretius himself sweetens doctrines that otherwise might seem woeful to those who are new to Epicureanism (1.931–50).

c. Catharsis

Catharsis is a psychological phenomenon, often associated with the effects of art on humans, famously described by Aristotle. There is, however, no explicit definition of catharsis in the extant Aristotelian corpus. Instead, we have a number of references to such a phenomenon. The one most pertinent to aesthetics is found in Poetics, where one of the defining features of tragedy is a catharsis of such emotions as fear and pity (1449b22–28). Another reference to catharsis can be found in Politics. Here Aristotle writes that music ought to be used for education, catharsis and other benefits (1341b37–1342a1). The lack of Aristotle’s own definition combined with the long and rich history of later interpretations of catharsis (see Halliwell 1998: app. 5) makes it hard to reconstruct a precise Aristotelian account of this term. It is arguably related to the influence that arts have on a person’s emotions and judgements that derive from those emotions (Politics 1340a1–1340b18).

It has been argued that the concept of catharsis has both religious and medical connotations, although more recent interpretations favour the view that it is primarily a psychological phenomenon that has certain ethical aspects (though it is not a means to learn ethics per se).

d. Sublime

Another aesthetic term that originated in antiquity, but was made famous by subsequent adaptations, especially by Kant and Burke, is the sublime. The main source for the theory of the sublime is the handbook on oratory titled Peri Hupsous (De Sublimitate in Latin), although it is also noteworthy that a notion of the sublime was known and used much more widely in antiquity (Porter 2016). The authorship of Peri Hupsous is disputed. The work has been attributed both to Cassius Longinus, a Greek rhetorician of the 3rd century C.E., and to an anonymous author of the 1st century C.E. referred to as pseudo-Longinus.

Fundamentally, the sublime as described by Longinus is a property of style, a “certain loftiness and excellence of language.” It does, however, have some more striking aspects. For instance, Longinus states that:

A lofty passage does not convince the reason of the reader, but takes him out of himself . . . Skill in invention, lucid arrangement and disposition of facts, are appreciated not by one passage, or by two, but gradually manifest themselves in the general structure of a work; but a sublime thought, if happily timed, illumines an entire subject with the vividness of a lightning-flash, and exhibits the whole power of the orator in a moment of time (1).

Longinus suggests that sublimity originates from five different sources: (i) greatness of thought; (ii) a vigorous treatment of the passions; (iii) skill in employing figures of thought and figures of speech; (iv) dignified expression, including the appropriate choice of words and metaphors; and (v) majesty and elevation of structure. The last cause of sublimity is said to embrace all the preceding ones as well (8.1).

4. References and Further Reading

a. Primary Sources

  • Armstrong, A. 1966–88. Plotinus: Enneads. 7 vols. Cambridge, MA: Harvard University Press.
  • Arnim, H. F. A. von. 1903–1924. Stoicorum Veterum Fragmenta. 3 vols. Leipzig: Teubner.
  • Bychkov, O. and A. Sheppard, eds. 2010. Greek and Roman Aesthetics. Cambridge: Cambridge University Press.
  • Cooper, J. and D. Hutchinson, eds. 1997. Plato: Complete Works. Indianapolis; Cambridge: Hackett.
  • Diels, H. and W. Kranz, eds. 1951–1952. Die Fragmente der Vorsokratiker, griechisch und deutsch. 3 Vols. Berlin: Weidmannsche buchhandlung.
  • Dyck, A. R. 1996. A Commentary on Cicero De Officiis. Ann Arbor: University of Michigan Press.
  • Goodwin, W. 1874. Plutarch’s Morals. Cambridge: John Wilson and son.
  • Hicks, R. D. 1925. Diogenes Laertius: Lives of Eminent Philosophers. London: W. Heinemann; New York: G.P. Putnam’s Sons.
  • King, J. 1945. Cicero: Tusculan Disputations. Cambridge, MA: Harvard University Press.
  • Long, A. and D. Sedley, eds. 1987. The Hellenistic Philosophers. 2 vols. Cambridge: Cambridge University Press.
  • Leonard, W. E. 1916. Lucretius: De Rerum Natura. London: Dent; New York: Dutton.
  • O’Connor, E. M. 1993. The essential Epicurus: letters, principal doctrines, Vatican sayings, and fragments. Buffalo, N.Y.: Prometheus Books.
  • Roberts, W. R. 2011. Longinus on the Sublime: The Greek Text Edited after the Paris Manuscript. 2nd edn. Cambridge: Cambridge University Press.

b. Secondary Sources

  • Asmis, E. 1991. “Epicurean Poetics.” Proceedings of the Boston Area Colloquium in Ancient Philosophy 7, pp. 63–93. Reprinted in Philodemus and Poetry: Poetic Theory and Practice in Lucretius, Philodemus and Horace, ed. by D. Obbink, Oxford University Press 1995, pp. 15–34; and in Ancient Literary Criticism, ed. Andrew Laird, Oxford University Press 2006, pp. 238–66.
    • (A discussion of the evidence concerning the views on poetry found in the works of Epicurus, Lucretius and Philodemus.)
  • Barney, R. 2010. “Notes on Plato on The Kalon and The Good.” Classical Philology 105(4): 363–377.
    • (A discussion of functionality and its relationship to beauty in Plato’s works.)
  • Beardsley, Monroe C. 1966. Aesthetics from Classical Greece to the Present. New York: Macmillan.
    • (Relevant sections of this book contain a classic interpretation of ancient aesthetics.)
  • Bernays, J. 1979. “Aristotle on the Effect of Tragedy.” In Articles on Aristotle, edited by J. Barnes, M. Schofield, and R. Sorabji. Vol. 4: Psychology and Aesthetics, 154–165. London. (Originally in Abhandlungen der historisch‐philosophischen Gesellschaft in Breslau, vol. 1, 1857: 135–202; and Sonderausgabe, Breslau 1857.)
    • (A seminal paper for the study of Aristotle’s concept of catharsis; it argues that catharsis is the ‘purgation’ of emotions.)
  • Bett, R. 2010. “Beauty and its Relation to Goodness in Stoicism.” In Ancient Models of Mind, ed. A. Nightingale and D. Sedley, 130–152. Cambridge: Cambridge University Press.
    • (In this paper, the evidence for the Stoic definition of beauty as summetria is collected and interpreted.)
  • Boudouris, K. ed. 2000. Greek Philosophy and the Fine Arts, Volume 2. Athens: International Centre for Greek Philosophy and Culture.
    • (A large collection of papers on various aspects of ancient Greek aesthetics.)
  • Bychkov, O. 2010. Aesthetic Revelation: Reading Ancient and Medieval Texts after Hans Urs von Balthasar. Washington, D.C.: Catholic University of America Press.
    • (A wide-scope monograph; the central argument concerns the notion of the revelatory aesthetics and its presence in ancient (and later) philosophical texts.)
  • Close, A. J. 1971. “Philosophical Theories of Art and Nature in Classical Antiquity.” Journal of the History of Ideas 32(2): 163–184.
    • (A study of the notion of creator/designer in antiquity.)
  • Demand, N. 1975. “Plato and the Painters.” Phoenix 29(1): 1–20.
    • (An article discussing Plato’s attitude to painting and the relationship between his views and contemporary painting traditions.)
  • Denham, A. ed. 2012. Plato on Art and Beauty. New York: Palgrave Macmillan.
    • (A collection of papers on Plato’s philosophy of art.)
  • Destrée, P. and P. Murray, eds. 2015. A Companion to Ancient Aesthetics. Hoboken, NJ: Wiley-Blackwell.
    • (A wide-ranging collection of extended entries, including such topics as mimesis, beauty, sublime, art and morality, tragic emotions and others.)
  • Ford, A. 1995. “Katharsis: The Ancient Problem.” In Performativity and Performance, edited by A. Parker and E. K. Sedgwick, 109–32. New York and London.
    • (An interpretation of Aristotle’s concept of catharsis with an argument that the relevant passages from Politics help to shed light on the sparse description in Poetics.)
  • Gál, O. 2011. “Unitas Multiplex as the Basis of Plotinus’ Conception of Beauty: An Interpretation of Ennead V.8.” Estetika: The Central European Journal of Aesthetics 48(2): 172–198.
    • (A paper arguing that, for Plotinus, beauty derives from Intellect and unity in diversity.)
  • Golden, L. 1973. “The Purgation Theory of Catharsis.” The Journal of Aesthetics and Art Criticism 31(4): 473–479.
    • (An in-depth argument against Bernays’ interpretation of catharsis as purgation; it contains a suggestion that catharsis is better understood as intellectual clarification.)
  • Halliwell, S. 1991. “The Importance of Plato and Aristotle for Aesthetics.” Proceedings of the Boston Area Colloquium in Ancient Philosophy, vol. 7, pp. 321–48. New York: Routledge.
    • (A paper arguing that Plato and Aristotle address issues that are pertinent to contemporary aesthetics.)
  • Halliwell, Stephen. 1998. Aristotle’s Poetics. 2nd edn. London: Duckworth.
    • (An extensive study of Poetics, including a number of concepts central to Aristotle’s aesthetics; also includes appendices on the history of interpreting catharsis after Aristotle, dating of Poetics and others.)
  • Halliwell, Stephen. 2002. The Aesthetics of Mimesis: Ancient Texts and Modern Problems. Princeton: Princeton University Press.
    • (A seminal study of the concept of mimesis in Greek philosophy and literature.)
  • Horn, H. -J. 1989. “Stoische Symmetrie und Theorie des Schönen in der Kaiserzeit.” Aufstieg und Niedergang der römischen Welt 36.3: 454–472.
    • (A study of the Stoic definition of beauty as summetria.)
  • Hyland, D. 2008. Plato and the Question of Beauty. Bloomington & Indianapolis: Indiana University Press.
    • (An interpretation of Plato’s notion of beauty in Symposium, Hippias Major and Phaedrus influenced by continental philosophy.)
  • Irwin, T. 2010. “The Sense and Reference of Kalon in Aristotle.” Classical Philology 105(4): 381–396.
    • (An argument for avoiding an aesthetic translation of the term to kalon in Aristotle’s works on ethics.)
  • Kraut, R. 2013. “An aesthetic reading of Aristotle’s Ethics.” In Politeia in Greek and Roman Philosophy, ed. M. Lane and V. Harte, pp. 231–250. Cambridge: Cambridge University Press.
    • (An argument for translating to kalon in Aristotle’s work as an aesthetic term.)
  • Kristeller, P. O. 1951. “The Modern System of the Arts: A Study in the History of Aesthetics Part I.” Journal of the History of Ideas 12(4): 496–527.
    • (An article containing arguably the most significant critique of the notion of ancient aesthetics.)
  • Konstan, D. 2015. Beauty: The Fortunes of an Ancient Greek Idea. Oxford: Oxford University Press.
    • (A wide-ranging study of the ancient Greek conception of beauty; includes a discussion of translating problematic aesthetic terms.)
  • Laird, A. ed. 2006. Ancient Literary Criticism. Oxford: Oxford University Press.
    • (A collection of papers covering a wide range of topics including Aristotle’s catharsis, the views of the Hellenistic schools on poetry and Plato’s treatment of tragedy.)
  • Lear, J. 1988. “Katharsis.” Phronesis 33: 297–326.
    • (An argument against the interpretation of catharsis as ‘purgation’ of emotions; and the suggestion that it is, instead, a psychological one with certain ethical connotations.)
  • Lear, G. R. 2006. “Aristotle on Moral Virtue and the Fine.” In The Blackwell Guide to Aristotle’s Nicomachean Ethics, ed. R. Kraut, pp. 116–136. Malden, MA; Oxford: Blackwell.
    • (A study of Aristotle’s use of to kalon with the argument that Aristotle used this term (with its aesthetic undertones) to put an emphasis on certain properties of goodness, namely, intelligibility and pleasantness to contemplate.)
  • Lobsien V. and C. Olk, eds. 2007. Neuplatonismus und Ästhetik: zur Transformationsgeschichte des Schönen. Berlin/New York: De Gruyter.
    • (A collection of papers on Neoplatonist aesthetics.)
  • Lombardo, G. 2002. L’Estetica Antica. Bologna: Il Mulino.
    • (A short monograph in Italian containing a discussion of views on aesthetics espoused by both major and lesser-known philosophical figures in antiquity.)
  • Nehamas, A. 2007. “‘Only in the Contemplation of Beauty is Human Life Worth Living’ Plato, Symposium 211d.” European Journal of Philosophy 15 (1): 1–18.
    • (A discussion of the role that beauty plays in Plato’s Symposium.)
  • Nussbaum, M. 1990. Love’s Knowledge: Essays on Philosophy and Literature. Oxford: Oxford University Press.
    • (The relevant sections of this book analyze the complex relationship between philosophy and literature in Plato’s works.)
  • Pappas, N. 2012. “Plato on Poetry: Imitation or Inspiration?” Philosophy Compass 7 (10): 669–678.
    • (An argument that in Republic and Sophist, poetry is treated as imitation, whereas in Ion and Phaedrus, it is treated as inspiration. The relationship between the two views is explained by employing Plato’s concept of drama in Laws.)
  • Peponi, A. -E. 2012. Frontiers of Pleasure: Models of Aesthetic Response in Archaic and Classical Greek Thought. Oxford: Oxford University Press.
    • (A study of the representations of aesthetic properties of artworks and other objects in ancient Greek texts, including philosophical ones.)
  • Pollitt, J. J. 1974. The Ancient View of Greek Art: Criticism, History, and Terminology. New Haven, CT: Yale University Press.
    • (A seminal work on ancient Greek philosophy of art, it deals with not only philosophical but also literary, rhetorical and other kinds of texts.)
  • Porter J. 2009. “Is Art Modern? Kristeller’s ‘Modern System of the Arts’ Reconsidered.” British Journal of Aesthetics 49: 1–24.
    • (An article containing a critique of Kristeller’s dismissal of the possibility of ancient aesthetics.)
  • Porter, J. 2010. The Origins of Aesthetic Thought in Ancient Greece: Matter, Sensation and Experience. Cambridge: Cambridge University Press.
    • (The central argument claims that Plato and Aristotle established formalist aesthetics, which dominated the tradition and silenced alternative, materialist aesthetics.)
  • Porter, J. 2016. The Sublime in Antiquity. Cambridge: Cambridge University Press.
    • (A study of the notion of sublime outside Longinus’ treatise.)
  • Rogers, K. 1993. “Aristotle’s Conception of τὸ καλόν.” Ancient Philosophy 13:355–71. Reprinted in L. P. Gerson (ed.) 1999. Aristotle: Critical Assessments, iv. London: Routledge: 337–55.
    • (An analysis and interpretation of Aristotle’s use of the term to kalon, especially his claim that virtues are undertaken for the sake of to kalon.)
  • Sheffield, F. 2006. Plato’s ‘Symposium’: The Ethics of Desire. Oxford: Oxford University Press.
    • (A monograph on Plato’s Symposium; the central argument interprets the dialogue as concerned with moral education, but in a distinct way, that is, by means of the analysis of desire.)
  • Tatarkiewicz, W. 1974. The History of Aesthetics. Vol. 1. The Hague: Mouton.
    • (A collection of ancient Greek philosophical texts on various topics in aesthetics accompanied by a commentary.)
  • Zagdoun, M. -A. 2000. La Philosophie Stoïcienne de l’art. Paris: CNRS Editions.
    • (An extensive study of the notions of beauty and art in Stoic philosophy.)

Author Information

Aiste Celkyte
Email: aiste.celkyte@googlemail.com
Yonsei University
South Korea

Metaepistemology

Metaepistemology is, roughly, the branch of epistemology that asks questions about first-order epistemological questions. It inquires into fundamental aspects of epistemic theorizing like metaphysics, epistemology, semantics, agency, psychology, responsibility, reasons for belief, and beyond. So, if epistemology, as traditionally conceived, is the theory of knowledge, metaepistemology is the theory of the theory of knowledge. It is an emerging and quickly developing branch of epistemology, partly because of the success of the more advanced ‘twin’ metanormative subject of metaethics. The success of metaethics and the structural similarities between metaethics and metaepistemology have inspired parallel conceptual forays in metaepistemology with far-reaching implications for both subjects.

The current article offers a concise survey of basic themes and problems in metaepistemology. The survey, of course, aims neither at being exhaustive nor at presenting these basic themes and problems in their full sophistication and complexity. Rather, given the very broad span of themes and problems that fall under the label of metaepistemology, the aim is to introduce basic themes and problems and to overview some of the cutting-edge research currently being undertaken in metaepistemology debates.

In what follows, “(meta)epistemology,” written with brackets, is used to indicate the epistemology of epistemology. This is to be distinguished from the non-bracketed “metaepistemology,” which refers to the whole domain of metaepistemological theorizing (metaphysics, epistemology, semantics, agency, and so forth).

Table of Contents

  1. Situating Metaepistemology within Epistemology and Metanormativity
  2. Normativity
  3. Metaphysics
  4. Semantics
  5. (Meta)Epistemology
  6. Reasons for Belief and Epistemic Psychology
  7. Agency and Responsibility
  8. New Directions in Metaepistemology
  9. References and Further Reading

1. Situating Metaepistemology within Epistemology and Metanormativity

Following the example of ethics (for example, Fisher 2011; see also Fumerton 1995), we can distinguish three basic branches of epistemology: normative epistemology, applied epistemology, and metaepistemology. Normative epistemology mostly deals with first-order theorizing about how we should form justified beliefs; gain understanding, truth, and knowledge; offer accounts of the basic sources of knowledge (like memory, perception, and testimony); and so forth, but it does not pursue higher-order questions about these matters or pressing applied epistemic matters. To the extent that it does, it embroils itself, respectively, in metaepistemology and applied epistemology. Applied epistemology draws from normative epistemological theorizing in order to respond to pressing epistemic matters of practical value, like climate change skepticism, jury decision-making, gender or race issues in epistemology, and so forth.

The following is an example to illustrate how the trichotomy of the epistemic domain is meant to divide epistemological labor. As is well-known, epistemologists are intrigued by the perennial question “What is knowledge?” and, accordingly, try to come up with plausible reductive analyses. This much is first-order normative epistemological theorizing at its best. If we conceptually dig deeper, however, move a level down and ask whether there is any “real” (or robust) knowledge, or whether the project of a reductive analysis of knowledge is at all plausible, then we ask second-order, metaepistemological questions. That is, we ask questions about first-order epistemological questions, like the question “What is knowledge?”. Moreover, if we ask epistemic questions of pressing practical value, like whether gender, race, and ethnic origin factors affect ordinary knowledge attributions, then we are posing pressing applied questions (for example, Fricker 2010) and have swiftly moved into the field of applied epistemology.

Opinions diverge about the exact interrelation of the three branches of epistemology and the exact interrelation of metaepistemology and its twin metanormative subject of metaethics. In regard to the former issue, there are two broad possible positions about the relation among the three branches. The first position is one we may call the autonomy thesis. According to the autonomy thesis, also sometimes propounded in ethical theory (compare Enoch 2013 for discussion), metaepistemology is an independent branch of epistemological inquiry that does not depend on the results of the other two branches of epistemology. Conversely, neither applied nor normative epistemology depends on the results of metaepistemology. The autonomy thesis bears some prima facie plausibility because it seems intuitive that one may be, let us say, a coherentist, foundationalist, or reliabilist about normative epistemology but an expressivist, error theorist, or relativist about metaepistemology.

The other position on the matter is what we may call the interdependency thesis. It suggests that there are important theoretical interdependencies between the three branches (pace some prima facie appearances of autonomy). If, for example, we could reductively analyze epistemic justification into informative necessary and sufficient conditions, it seems that we would have a theory to invoke in normative justificatory matters and to apply to pressing questions of epistemic justification like, say, climate change skepticism. However, the fact that such analyses do not seem readily available indicates that nothing is very obvious in metaepistemological matters.

In regard to the latter issue, namely, how to situate metaepistemology not merely within epistemology but within the broadly metanormative domain, there are again two broad, divergent positions. First, many metanormativists hold “the parity thesis” (sometimes called “the unity thesis”), according to which the epistemic and the moral/practical are intertwined normative subjects, theoretically on a par, and should therefore share the same metanormative fate, whatever that may be (realist, antirealist, Kantian constructivist, or otherwise) (compare Kim 1988; Cuneo 2007). Other metanormativists deny this and argue that there are important discontinuities between metaepistemology and metaethics and, hence, that we should instead hold “the disparity thesis” (compare Lenman 2008; Heathwood 2009).

For example, Cuneo (2007) has argued that the moral and the epistemic domain share core structural similarities (reasons, supervenience, motivation, and so forth) and that this bolsters the parity thesis. In response to Cuneo’s (2007) arguments for the parity thesis, it has been suggested by Lenman (2008) and Heathwood (2009) that while moral facts and truths may be irreducible, epistemic facts and truths may be reducible to facts and truths about evidence and probability (where these are ultimately to be understood in descriptive terms) and, therefore, there is a fundamental disparity between the two metanormative subjects. Again, Cuneo and Kyriacou (2017) have come up with a rejoinder to the Heathwood/Lenman case for the moral/epistemic disparity and argued that the parity seems to go through in the end. Of course, the dialectic is currently developing and the jury is still out.

So far, we have said a few basic things about the possible positions in situating metaepistemology within epistemology proper and within metanormativity. We now turn to the basic question of what it is that makes epistemology a distinctively normative subject and how from epistemic normativity we arrive at perplexing metaepistemological questions. The next section unpacks the various aspects of the metaepistemological domain that will be presented as we proceed.

2. Normativity

One of the most remarkable characteristics of human primates is their evolved, often linguistically mediated, capacity for cognizing and, moreover, the intrinsic normativity of this cognizing; intrinsic normativity because our wide array of cognitive endeavors seems to be inherently “fraught with ought” and evaluable in terms of (in)correctness. Intuitively, to the extent that we are rational and responsible agents, there are propositions we ought to believe and propositions we ought not to believe, and there are cognitive practices, methods, processes, habits, and so forth that are epistemically correct to employ and others that are epistemically incorrect to employ. That is, (in)correct from the epistemic point of view.

Indeed, generations of epistemologists from the early moderns like Descartes (1641), Locke (1690) and Hume (1739), to Clifford (1877), Chisholm (1966), Alston (1988), Fumerton (1995), Feldman (2002) and beyond have attested to the normativity of cognizing and have talked about corresponding epistemic duties, oughts, obligations, requirements, and so forth—terms that for current purposes are used interchangeably—that rational agents have.

For example, intuitively, we ought to believe on the basis of the relevant evidence or the relevant reliable cognitive process, and we ought not to believe what is merely bequeathed by tradition, dictated by fiat of authority or simply feels good. It is also epistemically correct to collect evidence meticulously and open-mindedly, and it is epistemically incorrect to cook up your lab research to fit the conclusions that a generous research sponsor would favor (for example, that extensive consumption of red meat has no adverse effects on health or the environment).

It is precisely this intrinsic normativity of our cognitive endeavors (practices, methods, processes, habits, beliefs, theories) that gives rise to metaepistemological questions, because as rational, responsible agents we seem bound by epistemic duties and obligations that are rationally non-optional and inescapable. To the extent that we are rational agents, we seem constrained by epistemic oughts and duties regardless of whether we like it or not, or whether we submit to these or not. This fact is reflected in ordinary locutions like “p is the right thing to believe,” “You should trust what Paul says because he is an expert on the matter,” or “They should have known this much; there is no excuse,” and so forth. Call this fundamental appearance of ordinary epistemic discourse the deontic appearance.

Of course, the deontic appearance is the prima facie appearance of ordinary epistemic discourse, and appearances, even deeply entrenched ones, may, as we know very well, be deceptive. Secunda facie, we may have no epistemic duties or obligations, and epistemic normativity may not be explainable in deontological terms. But at least prima facie we often talk and think in terms of propositions that one should or should not believe and in terms of practices, processes, methods, habits, and so forth that one should or should not employ. This much of the epistemic appearance seems unequivocal, and whether we should debunk the deontic appearance or not is a further question down the road.

It should also be underscored that the deontic appearance of ordinary epistemic discourse seems to have a distinctively categorical flavor; that is, the phenomenology of our everyday talk and thought about duties, obligations, and oughts seems to imply the existence of categorical duties and obligations, that is, duties that are in some sense unconditional, independent of our psychology (desires, dispositions, beliefs), and that constrain what we ought to believe insofar as we are rational. For example, if a speaker utters “You should believe that p” in an ordinary conversational context, her statement would typically conversationally implicate that it is an (epistemic) fact of sorts that “You should believe that p.” A fortiori, the conversational implication is that anyone epistemically rational would be obliged to believe that p because it constitutes a categorical epistemic obligation (derivative of a corresponding epistemic fact).

In line with the deontic appearance, the broadly internalist view which takes it that we are bound by reflectively accessible epistemic duties is called epistemic deontologism (compare Clifford 1877; Alston 1988; Feldman 2002). It asserts that we have reflectively accessible epistemic duties and that they should regulate rational doxastic behavior, namely, endorsing, maintaining, and revising a belief. Epistemic deontologism can be construed in a number of ways depending on how we understand the epistemic goals of inquiry. Accordingly, we can have different proposals about how to construe epistemic duties.

However, the standard way to understand epistemic deontologism has been in terms of epistemic justification (for discussion see Feldman 2002). Roughly, an epistemic duty for S to believe p exists iff S has sufficient justification for p. Sufficient justification may in turn be understood in various ways, perhaps along broadly evidentialist lines, that is, in terms of a relatively high degree of evidential probability (for example, Heathwood 2009), or along reliabilist lines, that is, in terms of a high truth-ratio of the outputs of a process (or ability) in an externalist framework (for example, Goldman 1979).
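
As a rough schematic illustration (the symbols O, J, t, and E are placeholders introduced here for exposition, not notation drawn from the literature cited), the deontological biconditional and its two construals of sufficient justification might be written as:

\[
O_{S}(p) \iff J_{S}(p) \geq t,
\]

where, on the evidentialist reading, \(J_{S}(p) = \Pr(p \mid E_{S})\), the probability of \(p\) on S’s total evidence \(E_{S}\), and, on the reliabilist reading, \(J_{S}(p)\) is the truth-ratio of the belief-forming process (or ability) that produced S’s belief that \(p\).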

This, of course, is not the only way epistemic deontologism may be understood, since it can also be framed in terms of alternative epistemic goals/values like truth, knowledge, and even understanding or wisdom (compare, for the latter two goals, Kyriacou 2016). That would mean that, roughly, an epistemic duty for S to believe p exists iff p is true, an instance of knowledge, or conducive to understanding or wisdom. However, the best construal of epistemic deontologism is a question we need not dwell on further here. The important thing for current explicating purposes is that, no matter how epistemic duties are construed, the deontic appearance stirs a whole host of perplexing and far-reaching metaepistemological questions, like the following:

Metaphysical: Are there epistemic properties/goals, norms, and facts in virtue of which categorical epistemic oughts, duties, and obligations for rational agents follow? If yes, what is their exact nature? If no, where does this leave us in terms of the intrinsic normativity of our cognitive practices? May the nonexistence of epistemic properties/goals, norms, and facts cripple the normative dimension of our epistemic lives? How is the constraint of epistemic supervenience to be understood and explained?

Semantic: What is the meaning of epistemic statements? Is it descriptive, expressivist, or even other? Are epistemic statements truth-apt? If yes, can truth-aptness be rescued in an expressivist metasemantic framework? Can deflationism do the trick? If there are robust epistemic facts, how do they ground truth, if at all? If there are no robust epistemic facts, then what does ground epistemic truth, if at all? Is the meaning of epistemic statements invariant or context-sensitive? Do practical interests, stakes, and so forth have a semantic contribution to the meaning of epistemic statements?

(Meta)epistemological: If there are categorical epistemic oughts and duties, how do we get to know them, if at all? Do we merely construct such duties and obligations or do we somehow discover them? If we discover them, how can this happen with minimal reliability, given that such properties and duties do not seem at first instance natural? How is this cognitive reliability to be accounted for, given our evolutionary history and the fact that the evolutionary process has been a blind, nonintentional process largely pushing towards adaptation, survival and reproduction by means of natural selection? Are intuitions credible evidence, especially in view of the evolutionary-cultural origins of cognition? Is talk of epistemic duties and oughts misconceived in light of epistemic externalism? Does epistemic externalism comport with the intrinsic normativity of cognitive endeavors?

Reasons for Belief and Epistemic Psychology: Are there categorical reasons for belief or are all reasons for belief hypothetical and dependent on our contingent, subjective desires and dispositions? Is epistemic rationality merely instrumental or categorical? Is epistemic judgment motivating? If yes, motivating in what way?

Agency and Responsibility: Can we directly choose what to believe and, if not, what about the fundamental and deontic notion of epistemic responsibility? Is there such a thing as character or is it merely fictional? If there is, can it play an integral part in our epistemic lives? If there is not, where does this leave our epistemic lives?

The following sections concisely introduce and discuss, in some depth, at least many of these metaepistemological questions.

3. Metaphysics

A core component of epistemic metaphysics concerns ontology. Epistemic ontology explores questions about the existence and nature of epistemic properties (like epistemic justification, warrant, rationality, entitlement, understanding, truth, wisdom, and knowledge), epistemic duties and norms (like “You ought to trust your senses, unless you have reason to doubt their overall reliability”), and particular epistemic facts (like “The theory of evolution is well-justified, given the abundant empirical evidence”).

Here, focus is restricted to the epistemic properties of justification and knowledge and the respective justificatory and knowledge duties/norms/facts, for at least three basic reasons: first, because of the more prominent position they have traditionally held in the history of epistemology; second, because research on them is in a relatively more advanced state; and third, because considerations of simplicity and economy inevitably constrain the thematic boundaries of the article.

Justification and knowledge are treated in turn rather than jointly, in spite of the fact that some positions about the two properties are strictly analogous, for two reasons: first, because this analogy can easily come apart; for example, in principle someone could be an antirealist about knowledge but not about justification; second, because the debates about justification and knowledge often develop independently of one another, and it would perhaps oversimplify the state of the debates to agglomerate the two.

Now, beginning with epistemic justification, a traditional distinction that helps map the theoretical landscape of justification debates is that between epistemic realism and antirealism (though to see how hard it can be to distinguish the two, see Dreier 2004). On the one hand, realists take epistemic justification to be a real, mind-independent property whose existence does not depend in any way on human cognizing. Thus, if a belief is justified, then this should be an object of discovery and not of invention (or construction). Accordingly, we should be able to understand that a justified belief instantiates the property of justification and that it is in virtue of this property that it is justified.

On the other hand, epistemic antirealists deny that epistemic justification is anything like a real and mind-independent property. Epistemic justification is considered a property (if it is one at all) that is constructed out of the workings of evolved human cognizing and nothing over and above this. This is not to imply the rather naïve view that justification is made up “out of thin air” by the antirealist. Justification is still constrained by certain epistemic norms, facts, or frameworks, although these are mind-dependent and of merely local validity. For the antirealist, if a belief is justified, then it is justified in virtue of certain epistemic norms, facts, or frameworks, but this should not be overstated. Epistemic norms, facts, or frameworks are invented by cognizers, and therefore epistemic justification is also invented. It is not the case that if a belief is justified, then we can somehow understand that it is justified because it instantiates a real property of justification.

As in metaethics, realists can be divided into reductionists and antireductionists. Reductionists can be further divided into analytic and synthetic reductionists. Analytic reductionists believe in the capacity of traditional a priori conceptual analysis to deliver illuminating, descriptive analyses of philosophically interesting concepts. Accordingly, they would take epistemic justification to be, in principle, reductively analyzable into a more basic property like coherence, reliability, foundations, virtues, responsibility, evidence, probability, and so forth (compare Bonjour 1985, 1998; Goldman 1979; Zagzebski 1996; Conee and Feldman 2004; Vahid 2005; Sosa 2007; Heathwood 2009).

Synthetic reductionists are not so sanguine about traditional a priori conceptual analysis and its purported capacity to deliver illuminating, descriptive analyses of philosophically interesting concepts. Accordingly, they would deny that epistemic justification is, in principle, reductively analyzable into a more basic and informative property, but they would still cling to realism about epistemic justification because they would take it to be a natural kind property, somehow discoverable only by the a posteriori means of empirical science and not by the a priori conceptual analysis of traditional philosophical methodology (compare Jenkins 2007). As, for instance, we can discover by empirical means that “water is H2O molecules” or that “gold is the element with atomic number 79,” we can presumably discover the natural kind property that constitutes the essence of epistemic justification, or so the thought goes.

So far, we have seen analytic and synthetic reductionism. Both typically adhere to methodological naturalism, roughly, the view that naturalistic scruples constrain the right kind of philosophical methodology (compare Pollock and Cruz 1999). In other words, philosophical methodology should be empirically informed and cohere with our best naturalistic picture about ontology, epistemology, and so forth. Be that as it may, analytic and synthetic reductionists disagree about the exact content of methodological naturalism. Analytic reductionists are still optimistic about the method of a priori conceptual analysis while synthetic reductionists counter that conceptual analysis is rendered obsolete by the progress of the a posteriori methods of empirical science and that philosophy should be sensitive to this progress.

However, there are also antireductionist realists, who are usually reluctant to embrace methodological naturalism, or at least any form of chauvinistic methodological naturalism that would exclude the possibility of antireductionism from the outset. Antireductionists usually disagree with their fellow reductionist realists about the capacity of methodological naturalism to deliver illuminating philosophical results and, in particular, results about justification and, in principle, other normative properties (compare Moore 1903; Boghossian 2007; Cuneo 2007). They take epistemic justification to be a property that is real and mind-independent but not reducible to any more basic, natural property, that is, to any property that could be the object of study of the natural sciences and empirical psychology, neuroscience, anthropology, sociology, and so forth.

Behind their suspicion of methodological naturalism may hover the intuition that normative properties do not seem natural. This suspicious attitude also helps explain their pessimism about the employment of methodological naturalism in metanormativity puzzles, for if normative properties do not seem in any profound way to be natural, then, perhaps, we should not insist on the employment of the restrictive philosophical methodology of methodological naturalism. The thought is that if normative properties are, indeed, non-natural in any profound sense then by insisting on a naturalistic methodology we will not be making any progress. We will only engage in a subtle begging of the question, or so the thought goes.

On the opposite theoretical side stand epistemic antirealists about justification. Antirealists deny that the property of epistemic justification is anything over and above what human cognizers construct and thereby invent. This is not to deny that there are, in some sense, justified beliefs in virtue of certain epistemic norms and facts, or agents that justifiably believe that p, or even corresponding epistemic oughts and duties. It is only to reject the distinctively realist idea that epistemic justification is a robust property somehow “out there” and that our beliefs are justified to the extent that they instantiate that “out there” property. Rather, justification is something that emerges out of our evolved cognitive attitudes (mental states) and out of our culturally evolved epistemic practices and interactions (social activities). The same goes for corresponding epistemic oughts and duties. They may exist, in some sense, but definitely not in the Archimedean “out there” sense that realists like to envisage.

Like realists, epistemic antirealists are a heterogeneous lot. Some may be subjectivists, others expressivists, error theorists, or relativists. Let us very briefly preview the rudiments of these families of theories. Subjectivists typically hold that judgments of justification report the speaker’s noncognitive attitudes, valuations, and pro-attitudes. For subjectivism, justification assertions like “p is justified, given my evidence” or attributions like “S justifiably believes that p” report the speaker’s attitudes of approval, endorsement, recommendation, trust, and so forth, for the belief that p or for S’s believing that p. Worthy of notice is that the speaker’s attitudes are reported and not expressed. The speaker is supposed, so to speak, to step back from his own attitudes, introspect, and simply report these attitudes but not directly express them. So, if I say, “S justifiably believes that p,” according to a simple subjectivist theory I may be reporting my approval of the belief that p, but not directly expressing that approval.

The fine-grained distinction between reporting/expressing the speaker’s attitudes might seem like an insignificant detail but it is a distinctive feature of the theory that helps distinguish it from expressivist theories (compare Schroeder 2008a). Besides, subjectivism is usually understood to be a cognitivist theory while expressivism is usually understood to be a noncognitivist theory. To stipulate, a cognitivist theory is a theory that takes normative judgment to express descriptive mental states like beliefs while a noncognitivist theory is a theory that takes normative judgment (or at least some species of it) to express nondescriptive mental states like desires, proattitudes and sentiments. Obviously, although both subjectivism and expressivism may be labeled as broadly sentimentalist theories because they involve noncognitive attitudes, subjectivism is a cognitivist theory while expressivism is a noncognitivist theory.

Subjectivism is usually considered an implausible theory (see Schroeder 2008a) and, in fact, although it has some metaethical proponents (compare Wiggins 1987), to the best of my knowledge there are no obvious subjectivists in metaepistemology. But things are very different with regard to expressivism, which has had quite a few proponents recently. Expressivists take justification judgments to express (and not report) the speaker’s noncognitive attitudes like approval, endorsement, recommendation, assurance, reliance, plans, trust, desires, and intentions. Justification assertions like “p is justified” or attributions like “S justifiably believes that p” express the speaker’s attitudes of approval, endorsement, recommendation, trust, and so forth, for p or for S’s believing that p (compare Kyriacou 2012). They directly express (or “voice”) the speaker’s states of mind.

The third antirealist theory of epistemic justification is that of error theory (or, sometimes, fictionalism). Unlike subjectivism and expressivism, error theory does not invoke, one way or another, noncognitive attitudes. Error theorists take justification judgments to purport to describe respective justification facts but deny that such facts really exist (compare Olson 2011a, 2011b). Given the absence of such facts (which would be the needed truthmakers), we end up with an error theory, namely, a theory that suggests that justification judgments are uniformly false, or at least that all first-order justification judgments are false (compare Olson 2011a, 2011b).

The fourth antirealist theory of justification is that of relativism. Relativism denies the existence of “real” justificatory epistemic norms and facts and stipulates that justificatory norms and facts are only relative to some indexical factor of merely local validity—usually the agent, or his society, culture, and so forth (compare MacFarlane 2005). Often, relativists are cultural relativists who think that there are no mind-independent norms and facts and that the only norms and facts that really exist are certain culturally constructed and embedded norms and facts (compare Stich 1990; for a recent defense of cultural moral relativism see Velleman 2013; and for criticism see Kyriacou 2015). These culturally constructed norms and facts allow for justified beliefs, but the justification is of only local validity.

There are many subdivisions under the banner of each of these families of theories that we cannot really dwell on here. There are, for example, many different expressivist theories, and it is really doubtful whether any two of these theories are identical. For instance, Allan Gibbard, one of the most prominent and influential expressivists and one of the first to extend expressivism from metaethics to metaepistemology, has held at least two expressivist theories, the early norm-expressivism (1990) and the later plan-expressivism (2003). There are other versions of expressivism in the literature too—habits-expressivism (compare Kyriacou 2012) and hybrid versions of expressivism such as Ridge’s (2007) ecumenical expressivism (for some discussion of epistemic expressivism see Chrisman 2012).

We have now introduced the rudiments of the major realist and antirealist theories of justification, but there are also other theoretical options, somewhat harder to classify as realist or antirealist, that deserve at least a short mention. For example, as in metaethics (for example, Korsgaard 1996), a Kantian constructivist might claim that norms and facts of justification are constructed out of a priori constitutive norms of rationality (for example, the universalizability or autonomy formulas) and that this obviates the distinction between realism and antirealism. She could claim that her theory cannot properly be classified as realist or antirealist but only as deontological. Categorical epistemic duties follow from the application of these constitutive norms of rationality, but these duties are, ontologically speaking, neither realist nor antirealist.

Be that as it may, so far we have discussed the question of the ontology of justification and the various sorts of approaches to it, but there are also two important metaphysical challenges for a plausible metaepistemological theory that deserve some attention: the evolutionary challenge and the supervenience challenge.

The evolutionary challenge is more of a challenge to normative realism and realist understandings of justification/knowledge (moral and epistemic). As Sharon Street (2006, 2009) has argued, our evolutionary history prima facie conflicts with normative realism because it is very implausible to think that we evolved and our moral and epistemic attitudes were somehow mysteriously and finely attuned to track corresponding moral and epistemic facts. Such a realist “tracking account” seems implausible on a number of counts (ontological parsimony, clarity, mysterious causal connections, and so forth), especially if we think of a competing metaphysically lighter, mere Darwinian account that explains our normative attitudes and their content as largely shaped by the main mechanism of evolutionary change, namely, natural selection. There is no need to postulate moral and epistemic facts, Street argued, in order to have the best Darwinian explanation of how we came to have the normative attitudes we tend to have.

Thus, the evolutionary challenge for realists is to explain, in the best theoretical way, how our normative attitudes have largely been shaped by natural selection in consonance with robust normative facts. The problem for the realist now is that it seems that Ockham’s razor should apply and redundant robust normative facts (and realism) should drop off the picture of that theoretical explanation because a mere Darwinian, antirealist account can do all the explaining we need.

Street’s evolutionary challenge has stirred some fascinating discussion and some interesting realist rejoinders that we unfortunately have to skip here—for example, Setiya (2012), Enoch (2013), FitzPatrick (2014), Vavova (2014). Nevertheless, it is widely acknowledged to be an important challenge for metanormative realism. If realism is to be plausible, it has to explain in a plausible way how it can comport with our evolutionary history.

The supervenience challenge asks us to explain how epistemic properties (if any) relate to the natural world. This challenge is usually interpreted in terms of the widely accepted metaphysical constraint of the epistemic supervenience thesis (compare Kim 1988; Conee and Feldman 2004; Vahid 2005; Cuneo 2007). It is a metaphysical constraint because it suggests that any theory needs to explain how epistemic properties like justification supervene on more basic, natural properties in such a way that if two situations are naturalistically identical (and thereby indistinguishable) and the first realizes, say, epistemic justification then the second situation must also realize epistemic justification.

Metaphysically speaking, it cannot be the case that two naturalistically identical situations (at least in the epistemically relevant aspects) realize inconsistent epistemic properties. There must be some naturalistic difference at the base level that grounds some difference at the normative-supervening level, otherwise there is no good reason why the supervenient-normative properties should be inconsistent. To illustrate, suppose that there are two naturalistically indistinguishable cases where a dead body has been found. It would be unreasonable to think that the one case justifies the belief in a homicide while the second justifies the belief in a suicide—unless of course there is at least some relevant naturalistic difference in the two cases.
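
This constraint can be put schematically. The following is only a rough sketch of the standard way such supervenience claims are stated, not a formula drawn from the works cited above; here x and y range over situations (or cases), N over the relevant natural base properties, and E over epistemic properties like justification:

\[
\forall x \,\forall y \,\big[\, \forall N \,(Nx \leftrightarrow Ny) \;\rightarrow\; \forall E \,(Ex \leftrightarrow Ey) \,\big]
\]

In words: if two situations are indiscernible with respect to all the relevant natural properties, then they are indiscernible with respect to their epistemic properties as well; there is no epistemic difference without a naturalistic difference.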

The epistemic supervenience thesis is a rather technical way to formalize the strong mundane intuition that ‘no double standards’ should be allowed in normative matters (epistemic, practical/moral, aesthetic, or other). Some theories can deal with the constraint rather easily while others seem to have difficulties with it. For example, reductionist realists have an easy explanation of the constraint. Intuitively, if two naturalistic situations are identical (at least in the epistemically relevant respects) and the first situation realizes justification as, say, coherence, then there should be no surprise that the other situation realizes coherence and thereby justificatory status.

On the other hand, antireductionists cannot offer the same account of supervenience as the reductionist because, crucially, they deny that epistemic justification is a reducible property. To see this, conceive for a moment of epistemic justification as a non-natural property, an irreducible property such that, regardless of what natural facts obtain, it is not thereby entailed that a certain belief p is justified for S. Conceive now of two naturalistically identical situations, one of which does realize justification. It seems that it remains at least an open question whether the second situation also realizes epistemic justification, precisely because the property is not reducible to a more basic property. No doubt, this is not to deny that antireductionists can somehow explain epistemic supervenience. It is only to note a distinctive challenge they face.

Expressivists usually attempt to explain epistemic supervenience as a merely conceptual and not as a metaphysical constraint (for discussion compare Hare 1952; Blackburn 1993; and Ridge 2014). For antirealists such as expressivists, epistemic supervenience is only an a priori norm of rationality that constrains the appropriate application of the concept of epistemic justification. If we have two naturalistically identical situations (at least in the epistemically relevant respects), and we judge that the one justifies a target belief p, then on pain of irrationality, we should also judge that the next situation justifies the belief that p.

One worry for antirealist explications of supervenience is that they reduce the constraint from a metaphysical principle to a conceptual one, and this seems to change the subject; in reply, expressivists might object that to insist that the constraint be addressed in its metaphysical guise is to beg the question in favor of realism. At any rate, supervenience is a tricky but valuable philosophical concept, and there are questions about how best to interpret it (locally or globally, for instance) and how best to account for its intuitive character, but the discussion must end here. Enough has been said to show why epistemic supervenience seems to be a challenge for realists and antirealists alike and a desideratum for a plausible metaepistemic theory of justification.

Let us now turn to the ontology of knowledge. The metaepistemology of knowledge seems even less well-defined than the metaepistemology of epistemic justification. This is reflected in the fact that knowledge theorists do not usually speak in the metaphysically-loaded terms of realism/antirealism about knowledge. They often set up their discussion in terms of challenges and problems for a theory of knowledge like radical skepticism, the Gettier problem, the lottery problem, the dogmatism paradox, linguistic evidence, and so forth. To keep the distinctively metaepistemological tone of the article, let us follow Michael Williams (2001) and speak of realism/antirealism about knowledge. According to Williams (2001), and this is the understanding of knowledge realism/antirealism that we endorse in the ensuing discussion, realists think of knowledge as something real, invariant, and mind-independent. Antirealists simply deny that there is such knowledge.

So, for realists, whether “S knows that p” is a question to be decided by the corresponding, independent knowledge facts, whatever these may be (evidential, reliabilist, virtue-theoretic, and so forth). Knowledge status is not a construct or invention of human cognition of any sort. In contrast, antirealists deny that there is such a thing as robust knowledge and corresponding robust knowledge norms and facts. There is knowledge, of course, but not in the metaphysically inflated sense that realists tend to envisage. Knowledge is true belief in accordance with certain knowledge norms and facts that are not mind-independent and not of universal validity. How such antirealist knowledge norms and facts are to be spelled out depends on the contours of the particular theory one favors (relativist, expressivist, error-theoretic, and so forth). At any rate, there is nothing over and above this sort of knowledge.

Realists divide again into reductionists and antireductionists, and reductionists further divide into analytic and synthetic reductionists. On the one hand, analytic reductionists think that the analytical project about knowledge is still viable in spite of the failures and pessimism that traditional conceptual analysis may occasionally inspire. The Gettier problem, for example, has inspired pessimism in many epistemologists about the prospects for an analysis of knowledge (compare Kirkham 1984; Fogelin 1994; Williamson 2000; Floridi 2004), but some others remain unmoved and argue for sophisticated analyses of knowledge, like Pritchard’s (2012) anti-luck virtue-theoretic account. On the other hand, synthetic reductionists again tend to treat knowledge as a natural kind property that in principle should be discoverable by means of empirical inquiry. In this spirit, Kornblith (2002) and Neta (2008) have argued that knowledge is just another natural kind.

In their turn, antireductionists deny that knowledge is reducible, analytically or synthetically. They tend to think that knowledge is irreducible to anything more basic and should be taken to be a primitive and sui generis concept, “the unexplained explainer” that is the most fundamental building block for an epistemological theory. Other epistemic concepts/phenomena like evidence, justification, probability, assertion, and skepticism should be explained in the light of knowledge and not the other way round. The most notable proponent of this “knowledge first” approach to epistemological theorizing is, of course, Williamson (2000).

Worthy of note is that Williamson (2000) puts forth his antireductionist theory not as a metaphysically-loaded theory, so it would be a mistake to hasten to infer that he is a non-naturalist about knowledge just because he is an antireductionist about knowledge. He is more interested in showing that the analytical reductionist project about knowledge is a “degenerate research programme.” In consequence, the safe thing to say is that although he is an avowed antireductionist, he shows no particular interest in the metaphysics of knowledge as such.

Antirealists about knowledge include expressivists and relativists as well as what we could call skeptics-as-error-theorists. Expressivists like Gibbard (2003) and Chrisman (2007) claim that there are no “real” knowledge facts in virtue of which knowledge truths obtain. Building on Gibbard’s (1990) norm-expressivism about rationality, Chrisman (2007) suggests that knowledge is a normative concept we use to evaluate epistemic positions and, accordingly, to express approval of the norms in virtue of which true belief is formed. There are also relativists about knowledge, like Stich (1990) and MacFarlane (2005), who suggest that there are no independent or absolute knowledge facts but only constructed knowledge facts of merely local validity.

Skeptics about knowledge are typically also antirealists, and one natural way to integrate their position into the metaepistemological classification would be as error theorists (compare Unger 1971; Fogelin 1994; Kyriacou 2017). Skeptics deny that there are any real knowledge facts (at least empirical knowledge facts) and therefore imply that at least most of our knowledge discourse is implicitly in a state of constant error (for discussion compare Hawthorne 2004). Knowledge assertions and attributions are almost uniformly false. Of course, we speak of knowing such-and-such, but we do not really know, because there is no such thing as knowledge. As a result, ordinary speakers are afflicted by semantic blindness about the concept of knowledge. They unwittingly speak as if they know, but philosophical reflection can eventually indicate that there is not much real knowledge to be had.

Thus far we have introduced realist and antirealist approaches to knowledge. Interestingly, however, some epistemologists argue that there is “real” knowledge but not of the demanding, invariant sort that traditional realism (as stipulated above) presupposes. There is proper, “real” knowledge that fully merits the name but is context-sensitive. That is, it is knowledge where the demandingness of the standards of justification (in virtue of which we arrive at true belief) varies with context because of factors like the intentions, needs, stakes, goals, and so forth of the attributor. In essence, the standards of knowledge may shift from context to context.

Contextualists, though, suggest that at least some important portion of our knowledge talk comes out true because the standards of knowledge need not be so high that we could never satisfy them. Of note is that contextualism is a semantic view and not a metaphysical view, and as such it could easily be wedded to an antirealist theory. For example, some cultural relativists might be understood as a kind of contextualist about knowledge: the concept of knowledge picks out incommensurable, culturally constructed knowledge facts from cultural context to cultural context (for critical discussion of such views see Boghossian 2007). At any rate, discussion of contextualism about knowledge takes us into the field of semantics, which is discussed in section four.

A final note on how the supervenience constraint applies to knowledge. As the epistemic supervenience thesis applies to epistemic justification, it seems to apply to knowledge as well. That is, if two epistemic situations are naturalistically indistinguishable (at least in the epistemically relevant respects) and the first situation realizes knowledge, then in the absence of at least some relevant difference between the two situations, it would be irrational to deny that knowledge is realized in the second situation. No double standards are allowed in epistemology. Again, the supervenience constraint on knowledge has stirred some fascinating discussion but space restrictions oblige us to leave the topic here (for critical discussion of the mentalist supervenience thesis of Conee and Feldman 2004, see Greco 2010).

We have now talked a bit about the metaphysics of epistemic justification and knowledge. As has probably become obvious from the discussion, metaphysics is inextricably linked with semantics (and there is also a good methodological question about explanatory priority) and, therefore, semantics makes for a natural follow-up section.

4. Semantics

Epistemic semantics typically deals with the meaning of epistemic declarative statements and often focuses on the more traditional concepts of justification and knowledge. Let us first present a famous semantic challenge for the meaning of normative predicates that has lately been applied to the subset of epistemic predicates as well (Jenkins 2007; Heathwood 2009; Greco 2015; Cuneo and Kyriacou 2017). This is Moore’s (1903) famous open question argument, which he applied to “goodness” and which led him to the conclusion that goodness is an indefinable predicate that picks out a simple and sui generis non-natural property. It can be formulated in this style of question: “Is (super)natural property N (for example, pleasure, desire, divine will, and so forth) goodness?” Or, in terms of more colloquial discursive contexts, “I can see that this is N (pleasurable, desirable, socially accepted, and so forth), but I can’t see why this is good.”

Moore thought that competent speakers of English (and competent users of “goodness”) would find this style of question wide open and that this intuition of semantic openness is evidence that goodness is irreducible. He thought so because he assumed that property identities can be discovered solely by a priori conceptual analysis and that they should be directly transparent to competent speakers of the target language. As applied to epistemic predicates like justification, the open question argument would go like this: “Is (super)natural property N epistemic justification?”; or “I can see that this belief (or system of beliefs) is coherent, intuitive, socially approved, reliably produced, and so forth, but is it really justified? I don’t see that!”; or “I see that this belief is coherent, self-presenting, and so forth, but so what? That does not make it justified.”
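
For clarity, the argument can be given a schematic reconstruction. The following is a standard textbook-style sketch rather than Moore’s own wording, with N standing for any candidate (super)natural property:

(1) If goodness were identical to N, then the question “This is N, but is it good?” would not seem open to competent speakers.
(2) For any candidate N, the question “This is N, but is it good?” does seem open to competent speakers.
(3) Therefore, goodness is not identical to any candidate N.

Premise (2) records the intuition of semantic openness that Moore took to be evidence of irreducibility.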

The open question argument has been transposed and applied to epistemic justification and epistemic rationality with the same antireductionist verdict as in the case of goodness (and other moral predicates). Interestingly, there are also echoes of the Moorean semantic openness idea in Williamson’s (2000) famous antireductionist argument about knowledge. Williamson (2000: 31) says, for example, that “even if some sufficiently complex analysis never succumbed to counterexamples, that would not entail the identity of the analyzing concept with the concept knows. Indeed, the equation of the concepts might well lead to more puzzlement rather than less.” The Moorean semantic intuition that lies behind Williamson’s words is that any purported reduction of knowledge, even one that avoids counterexamples, seems to remain semantically open, and this indicates that the predicate is irreducible to a more basic property. It is a conceptual primitive that is not derivative from anything conceptually more basic.

As quickly became obvious, however, the open question argument is dialectically vulnerable because it relies on unwarranted assumptions about meaning and analysis. First, intuitions of semantic openness are inconclusive evidence for property non-identity. We might simply not have discovered the correct analysis yet and, besides, intuitions are often a bad counselor (compare Frankena 1939).

Second, not all property identities should be immediately transparent to a competent speaker (compare Smith 1994). Some may of course be almost trivial, like “a vixen is a female fox,” but others may require reflection and practice, even the relatively simple “a circle is a figure whose circumference is everywhere equidistant from its center.” Arguably, it might take some reflection (and drawing) for a competent speaker to grasp the property identity. Third, some property identities may not be discoverable by a priori conceptual analysis no matter how hard or ingeniously we try (compare Brink 1989; Kornblith 2002; Jenkins 2007; and Neta 2008). Some seem discoverable by the a posteriori means of empirical science, like the natural kinds of water, gold, silver, salt, and so forth.

But in spite of its dialectical weaknesses, the open question argument has exerted significant influence on metaethics (compare Darwall and others 1992) and now seems to be extending this influence to metaepistemology. Many think that there is something strongly intuitive to this argument that is hard to shake off. Perhaps what the argument captures is what Enoch (2013) calls the “just-too-different intuition,” namely, the intuition that normative properties are just too different to be reduced to anything more basic, like natural properties. Others, no doubt, think that the argument can be blunted and that we can still argue for reductionist accounts of goodness, justification, rationality, knowledge, and so forth. To be sure, all parties to the discussion seem to consider it a serious semantic challenge that should be addressed, one way or another.

Those who accept the antireductionist result of the open question argument may interpret it (and actually have interpreted it) in ways amenable to their overall views, realist or antirealist. Synthetic reductionists have seen it as evidence that an a priori conceptual reduction of epistemic properties is not possible, but this constitutes no reason for thinking that an a posteriori reduction could not apply. So they proposed that epistemic properties are perhaps irreducible by conceptual analysis, and that this is why we have “open feel” semantic intuitions in Moore-style open question arguments.

This optimism, though, seems to be premature, as there are good reasons to question whether sophisticated synthetic reductionism is all that promising. In the case of moral properties, Timmons and Horgan (1991) have devised a sophisticated Moore-style open question argument, the so-called “moral twin earth argument,” with the intent to thwart the optimism of synthetic reductionists. Inspired by Putnam’s (1975) seminal “twin earth argument” for semantic externalism, Timmons and Horgan devised a thought experiment that allows us to test our semantic intuitions about the prospects of such a synthetic reductionism.

Timmons and Horgan contrasted their moral version of the twin earth argument with Putnam’s natural kinds version. They argued that, while in Putnam’s original thought experiment our intuitions suggest that the meaning of natural kind terms is not merely fixed internally (by other “meanings in the head,” so to speak) but externally by the natural world, in the moral version of the story our intuitions differ significantly. We tend to think that differences in the extension of, say, “right” merely reflect differences in internal normative theory, not external natural facts about rightness. This contrasts with Putnam’s natural kinds version, where the intuitive differences in extension between the two earths do seem to reflect differences in what external natural facts we take there to be. We tend to think that there are no moral natural facts or kinds.

The exact details need not detain us here, but what is important is that, as in the case of Moore’s classic open question argument, in the moral twin earth argument our semantic intuitions seem to suggest that such a synthetic reduction is not in the offing. Even if there were a synthetic reduction of normative properties, we would tend to find such a reduction semantically open. To illustrate this, suppose for the sake of argument that epistemic justification is somehow reduced to some externalist property X (a reliabilist or subjunctive tracking property, and so forth). Would this close the question “Is justification the externalist property X?” It seems that the question would still strike us as wide open.

Some others have seen the open question argument as evidence for an antireductionist but realist theory, while others, more sympathetic to both antireductionism and methodological naturalism, have seen it as evidence for antireductionist antirealist theories like error theory or expressivism (compare Ayer 1936). They suggested that we have entrenched “open feel” semantic intuitions because no reduction (analytic or synthetic) of normative properties (moral or epistemic) is forthcoming. Since there are no such properties, the quest for reduction is quixotic.

In particular, expressivism is a very interesting approach to moral and epistemic discourse because it breaks completely with the traditional and mainstream truth-conditional metasemantic framework. Unlike error theorists, relativists, and so forth, expressivists question and reject both factualism and cognitivism. Antirealists by definition reject factualism, namely, the idea that propositions are rendered true by corresponding robust facts regardless of the kind of discourse (descriptive or normative); but, with the sole exception of audacious expressivists, they respect cognitivism, that is, the thesis that normative propositions express descriptive mental states like beliefs that purport to pick out corresponding normative properties and facts.

The result of the joint rejection of factualism and cognitivism is a novel metasemantic framework that does not understand meaning on the basis of truth and reference (and truth-conditions) but on the basis of the notion of the expression of states of mind (though, for a recent non-content-centric exposition of expressivism, see Charlow 2014). Truth is not built into the rudiments of the semantic theory in the first instance. Sophisticated, non-classical expressivists (compare Blackburn 1993, 1998) like to appeal to a deflationary account of normative truth, but even then truth is not a primitive component of their metasemantic theory, only a derivative of the mental states expressed.

This allows expressivists to reap some important explanatory fruit but at the same time also incurs important theoretical costs. Expressivists can explain the operation of the open question argument; they are in accord with a naturalistic picture of ontology and epistemology consonant with our evolutionary history (compare Gibbard 1990, 2003); they can explain epistemic motivation (compare Kappel and Moeller 2014); they can explain normative disagreement as a mere conflict in noncognitive attitudes (compare Stevenson 1963); they are in line with linguistic evidence for the expression of noncognitive attitudes in normative discourse; and more.

However, the theoretical price they are called on to pay also seems pretty high. For a start, at first sight truth and objectivity fall out of the picture, and the appeal to deflationism about truth seems only to transpose the problem a step back (compare Cuneo 2007). Normative disagreement also becomes a psychological matter of conflicting noncognitive attitudes and not a logical matter of truth and falsity. Consonance with empirical linguistics may also quickly become a problem rather than an attraction, because sometimes we may express normative thoughts without expressing noncognitive states like approval, as the theory suggests we should; at least this much of empirical linguistics cannot and should not be denied a priori by expressivism (for similar points compare Huemer 2008 and Yalcin 2012).

There are even more problems for expressivism, but a very serious problem that deserves at least a brief note is what is widely known as “the Frege-Geach problem” (compare Schroeder 2008a, 2008b; Charlow 2014). Drawing on Frege’s (1918/1997) discussion of negation, Geach (1960, 1965) first broached the problem for the early expressivist theory of emotivism, and ever since the problem has been at the forefront of expressivism debates. The problem consists in the fact that while expressivism works reasonably well in the context of asserted atomic normative sentences, in unasserted, logically complex contexts the noncognitive content of the sentence seems absent. For example, in the unasserted antecedent of “If stealing is wrong, then getting your little brother to steal is wrong,” the speaker does not express disapproval of stealing, and yet the sentence is perfectly meaningful.

This seems to have serious repercussions because, in deductive inference contexts, it implies a fallacy of equivocation with regard to the normative predicate. The meaning of the predicate seems to differ from premise to premise, and since sameness of meaning is a prerequisite for truth-preservation and validity, an obviously valid argument is rendered invalid. This means that expressivism fails to account for a key semantic fact, namely, logical validity (for a thorough discussion of the problem see Schroeder 2008b).

The Frege-Geach problem has provoked heated discussions about the plausibility of expressivism, and expressivists keep trying to address the challenge. Some have tried to build a so-called “logic of attitudes” while others have tried to build structure into the involved noncognitive attitudes that would help explain the syntactical features of logic and address the problem (compare Blackburn 1993, 1998; Gibbard 1990, 2003; and Schroeder 2008a). More recently, some have developed novel non-content-centric understandings of expressivism (compare Charlow 2014). Whether expressivism can be developed into a plausible, full-blown metasemantic framework is currently an ongoing research project for many philosophers (some pessimists, some optimists).

Perhaps unsurprisingly, expressivists have an additional problem with truth. For an expressivist, nondescriptive theory of meaning, the notion of truth need not come up in the explication of the meaning of sentences. What matters are the states of mind expressed and not truth-conditions; but truth is so valuable a notion that we cannot just give it up so lightly. For instance, we want to hold onto such truths as that “the theory of general relativity is well-justified, given the evidence” or that “murder is wrong.” But of course, expressivists are baffled by such truth-talk. Early emotivists like Ayer (1936) were happy to concede that moral statements are not truth-apt at all, but misgivings about giving up truth never subsided.

More sanguine and sophisticated non-classical expressivists observed that expressivism is a metasemantic framework that need not directly involve truth-conditions but that need not imply that we cannot wed such a framework with a derivative notion of truth. Perhaps it is incoherent to assume that expressivism can be wed to certain traditional theories of truth like a correspondence theory, given the antirealism of expressivism, but we could still appeal to metaphysically light theories of truth like deflationism/minimalism. This is what Blackburn (1993, 1998) has proposed for example. Blackburn suggested that we could be expressivists and antirealists but appeal to a deflationary theory of quasi-truth and quasi-facts in order to rescue normative quasi-truth. He has dubbed this project quasi-realism for obvious reasons: you can be an antirealist but mimic all the realist appearances, like the truth appearance (though, for the so-called “problem of creeping minimalism” about how to distinguish realism from quasi-realism see Dreier 2004).

Deflationism about truth understands truth-talk in a merely disquotational, ontologically deflated way. No commitment to a truth property is involved, and typically truth-talk is understood as a linguistic device that serves certain conversational and social functions. For instance, to say that “p is true” is semantically equivalent to saying that “p” (and vice versa). The predicate “…is true” (and its cognates) may recommend p, show approval of p or confidence about p, and so forth, and facilitate other conversational and social functions, but it does not thereby commit us to a robust truth property. Similarly, saying that “It is true that p is justified” merely shows approval, recommendation, and so forth, of p being justified, but without any ontological commitment to a truth property (see Dowden & Swartz’s Truth for more discussion of deflationism).
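
Put schematically, the disquotational idea is often captured by the generic equivalence schema below; this is the standard, textbook formulation of the schema rather than one quoted from the deflationists cited in this article:

\[
\text{``}p\text{'' is true} \;\leftrightarrow\; p
\]

An epistemic instance would be: “The belief that p is justified” is true iff the belief that p is justified. On the deflationary reading, asserting the left-hand side commits one to nothing over and above what asserting the right-hand side already commits one to.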

Of course, deflationism as a theory of truth faces a number of important independent objections that carry over to the expressivist project. Here are a couple of them. First, it is very unclear what grounds normative truth in the absence of normative facts (compare Cuneo 2007). Truth seems to presuppose a grounding relation that confers truthmaking, but it is not clear what grounding and truthmaking antirealist expressivism can offer (for discussion of grounding see Schaffer 2013). So the question remains: if a normative sentence is true, in virtue of what is it true? Second, we often ascribe truth in second-order sentences like “It is true for expressivism that realism is false and antirealism true,” and there is a question about how the expressivist can explain such truth-talk if there is no corresponding robust truth about the matter.

To conclude our discussion of expressivism and return to mainstream truth-conditional semantics, there are various truth-conditional theories of justification/knowledge with different takes on semantics. There is of course traditional invariantism and contextualism about justification and knowledge. Traditional invariantism takes justification/knowledge to be absolute, univocal concepts whose meaning remains invariant from context to context (compare Unger 1971; Fogelin 1994; and Kyriacou 2017). Invariantism can then be glossed in either more moderate or more demanding terms. Moderate invariantism about knowledge can, for instance, be explicated in terms of a safety principle, that is, a principle that roughly suggests that the true belief involved in knowledge could not easily have been false. Such safety-based, Neo-Moorean approaches can be found in Sosa (1999), Williamson (2000), and Pritchard (2007, 2012).

More demanding invariantism about knowledge can be explicated in terms of a sensitivity principle, that is, a principle that roughly suggests that the true belief involved in knowledge is sensitive to falsity: if the belief were false, it would not be believed. Interestingly, one strand of demanding invariantism may lead to skeptical invariantism because it seems to indicate that all logical (relevant or irrelevant) possibilities of error should be taken into consideration and ruled out. Given that we can almost never rule out all logical possibilities of error, we inevitably embrace skepticism about knowledge. Of course, sensitivity theorists need not be, and some have not been, skeptics. Nozick (1981), for example, famously accepted a sensitivity condition on knowledge but rejected the intuitive condition of closure under known entailment and so escaped the embrace of skepticism; at least so he thought (for critical discussion compare Fogelin 1994 and Hawthorne 2004).
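
The two principles can be contrasted schematically. The following is only a rough sketch of the standard counterfactual formulations found in this literature; here “B_S p” abbreviates “S believes that p” and “□→” the subjunctive conditional:

\[
\textit{Safety:}\quad B_S\, p \;\Box\!\rightarrow\; p \qquad\qquad \textit{Sensitivity:}\quad \neg p \;\Box\!\rightarrow\; \neg B_S\, p
\]

Roughly, safety requires that in the nearby possible worlds where S believes that p, p is true (the belief could not easily have been false), whereas sensitivity requires that in the nearest possible worlds where p is false, S does not believe that p.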

Some other philosophers simply repudiate the semantic invariance assumption in favor of semantic contextualism. Attributor contextualism takes justification/knowledge to be context-sensitive concepts, that is, concepts whose meaning varies with contextual factors (the stakes, interests, needs, goals, and so forth of the attributor). Such contextual factors induce a conversational shift of epistemic standards between high and low standards, depending on the discursive context. Philosophers like Annis (1978), DeRose (1995), Lewis (1996), Cohen (1998), Williams (2001), and Wedgwood (2008) have proposed contextualist accounts of justification/knowledge that analyze meaning as context-sensitive.

More recently, novel understandings of invariantism and contextualism have been proposed. These novel understandings are at least partly motivated by empirical linguistic evidence about how we occasionally use the concepts of justification/knowledge. Subject-sensitive invariantism proposes that the meaning of justification/knowledge is sensitive to the subject’s (and not the attributor’s) practical interests, stakes, goals, and so forth, and makes much of the importance of knowledge for assertion and practical reasoning (compare Hawthorne 2004). Contrastivism takes into consideration the contrastive and comparative element in justification/knowledge discourse. For example, if I say “S knows that p,” this might conversationally imply that “S knows that p rather than q” (compare Schaffer 2004).

Questions of epistemic meaning go far beyond what we have presented here, but we have to pause at this point. In the next section we turn to (meta)epistemology, namely, the epistemology of epistemology and, in particular, the (meta)epistemology of epistemic justification and knowledge.

5. (Meta)Epistemology

Let us pause for a moment to take stock. We have explained how metaepistemological questions and puzzles arise out of the deontic appearance of our cognizing and its concomitant intrinsic normativity. Accordingly, we explained that human cognizing seems to implicate categorical epistemic duties and obligations; that is, there are propositions we ought to (dis)believe and practices that are (in)correct to employ from the epistemic point of view. Finally, we explained that justification and/or knowledge and corresponding epistemic oughts may be understood realistically or anti-realistically, or even otherwise (for example, in Kantian constructivist style) and delved a bit into the semantic aspect of things.

The obvious (meta)epistemological question now is how we get to know, or at least come to have justified beliefs about, such epistemic oughts and obligations (realist, antirealist, or other). That is, how do we get to know, or at least have justified beliefs about, what we ought to believe or what doxastic practices, habits, and methods we should employ? To pursue this metaepistemic question we first need to stipulate a fundamental and very contentious distinction, namely, the distinction between epistemic internalism and externalism. The distinction is fundamental, so fundamental that we have not managed to avoid it entirely so far, because the epistemological theory we end up with depends on how we construe it and which side of it we take. Given that so much hinges on the distinction, it should come as no surprise that it is so contentious that it is even debatable how best to construe it, and there are a number of ways of doing so (see Poston’s Internalism and Externalism in Epistemology for discussion).

One standard way is to construe it in terms of cognitive accessibility, namely, in terms of the presence or absence of cognitive access to the facts/evidence/reasons that support the target belief p. According to the accessibility reading, epistemic internalism suggests that S justifiably believes that p iff S has cognitive access to epistemic reasons in support of p (compare Chisholm 1966). Epistemic externalism simply consists in the denial of epistemic internalism: S can justifiably believe that p even if S has no cognitive access to epistemic reasons in support of p. If, for example, the belief is produced by a reliable belief-forming process, then S can justifiably believe that p (even without access to supporting reasons). There may of course be cases of justified believing that enjoy cognitive access to reasons in support of p, but for externalists this is not a prerequisite for justification (compare Goldman 1979).
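
The accessibility reading can be rendered schematically as follows. This is only an illustrative sketch; the notation is introduced here for convenience and is not drawn from the authors cited, with J_S(p) for “S justifiably believes that p,” A_S(r) for “S has cognitive access to r,” and R(r, p) for “r is an epistemic reason supporting p”:

\[
\textit{Access internalism:}\quad J_S(p) \;\leftrightarrow\; \exists r \,\big( A_S(r) \wedge R(r, p) \big)
\]

Access externalism, on this reading, denies the left-to-right direction: a belief can be justified even if the believer has no cognitive access to reasons supporting it, as when it is the output of a reliable belief-forming process.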

A second reading of the distinction is laid out in terms of mental states rather than accessibility (compare Conee and Feldman 2004). According to the mentalist reading, epistemic internalism suggests that S justifiably believes that p iff S has mental states that count as evidence in support of p. Epistemic externalism again denies this. S can justifiably believe that p even if S has no mental states that count as evidence in support of p (compare Greco 2010). The mere satisfaction of some external conditions (reliability, counterfactual tracking, and so forth) suffices for justification or knowledge.

Of note is that mentalism does not require accessibility to reasons for justification. All that is required for justification are evidential mental states, but the agent need not have access to, or conscious awareness of, such mental states or reasons. In this sense, mentalist internalism concedes to access externalism that cognitive access is not required for justification, but it still upholds that justification supervenes on and is fixed by mental states. At any rate, for the purposes of this article we can stick to the more standard accessibility reading of the distinction between epistemic internalism and externalism.

With this partial clarification of the epistemic internalism/externalism distinction in hand, we can now revert to the question of how we know, or at least have justified belief, about epistemic duties and obligations. Recall that ordinary epistemic discourse may give some prima facie support to epistemic deontologism and that the traditional (and more natural) interpretation of this has been in internalist contours. We have epistemic duties and obligations, like the general and a priori “You ought to abide by the relevant evidence and pursue the truth” or the particular and a posteriori “You should believe what Mary says because of her relevant expertise,” and these are reflectively accessible. Careful reflection can, in principle, indicate the epistemic duties and obligations an agent may have.

Of importance for the epistemology of epistemic oughts and duties is also the ontological distinction between epistemic realism and antirealism. Realists would claim that there are real, mind-independent epistemic duties and obligations that vindicate the deontic appearance and would offer different epistemological stories of how we get to know them, or at least have justified belief about them (foundationalist, coherentist, foundherentist, reliabilist, virtue-theoretic, and so forth).

To use a toy theory as an example, a foundationalist analytic reductionist would suggest that categorical epistemic duties are either non-inferentially justified or inferentially justified. Non-inferentially justified beliefs are beliefs that are justified but not in virtue of being based on other beliefs. There are various ways in which beliefs could be justified without recourse to other beliefs: by means of non-doxastic states (for example, Fumerton 1995; Bonjour 2003); or self-presenting doxastic states (compare Chisholm 1966; McDowell 1994); or by means of belief-independent belief-forming processes (compare Goldman 1979); or, in the case of a priori basic beliefs, in virtue of conceptual content (compare Bonjour 1998). To take the latter case, for instance, it could be suggested that “the duty to proportion belief to the relevant evidence” seems a priori prima facie justified, in virtue of conceptual content, for anyone rational with a proper understanding of the meaning of the proposition.

In contrast, inferentially justified beliefs are beliefs that are justified in virtue of other beliefs/reasons. If the beliefs/reasons for p are sufficiently strong, then we have an obligation to believe that p. If, for example, I say that “the thing to believe is what Mr Poirot has concluded,” this should be further supported by epistemic reasons, that is, reasons that involve relevant evidence, like, say, Poirot’s character traits of integrity, attentiveness, and investigative dexterity. Such a traditional foundationalist model of epistemic duties and obligations would be broadly internalist and intuitionist: internalist because we need access to epistemic reasons for justification, and intuitionist because intuitions should play a distinctive role in identifying at least the non-inferentially justified duties to believe.

Unfortunately, both features of such a theory are very controversial. To begin with, externalists would deny that internalism is a plausible constraint on epistemic justification. First, they would suggest that it over-intellectualizes our everyday cognitive practices and processes and thereby distorts them beyond repair: our cognitive practices and processes need not be conceived as so reflective and intellectual (compare Goldman 1979; Plantinga 1993; Sosa 2003, 2007; and Greco 2010). Second, what really seems to matter in our cognitive endeavors is cognitive success (of sorts) out of cognitive ability, be the goal truth, knowledge, or something else. So accessibility to facts/reasons does not really matter for justification; what matters is reliability, through cognitive ability, for truth and knowledge.

The intuitionist component does not fare any better. First, intuitions are often quite unreliable, and this psychological fact is unsurprising if we consider the evolutionary and cultural origins of our intuitions (compare Haidt 2012, and Kahneman 2011). So there is a real question of why we should trust our intuitions, especially when we have conflicting intuitions with other epistemic peers we esteem and trust as cognizers (for a similar point compare Setiya 2012). Second, it is not clear what sort of epistemic facts about duties we intuit when we invoke our intuitions. Such facts do not seem natural, in the sense that they do not seem part of the natural world and the corresponding object of study of the empirical sciences. Equally, non-natural (epistemic, moral or other) facts sound mysterious, or queer if we are to recall Mackie (1977), not to mention the question of how we can reliably (and causally) track such facts if they exist (compare Street 2006, 2009; Olson 2011b).

Third, suppose our intuitions are at least somewhat reliable and sometimes, somehow, do track corresponding epistemic facts about duties. This much we can suppose with some justification, because if our intuitions were completely and universally unreliable then it would seem that we have no rational ground for the starting point of any inquiry. So it seems that too much skepticism about intuitions might be self-defeating for any epistemic endeavor and lead to global skepticism, something that almost all epistemologists find unpalatable (pace Unger (1971); compare Plantinga 1993; De Cruz and others 2011; Vavova 2014). In addition, if we also assume so-called factualism, namely, the ontological idea that it is robust facts (of sorts) that stand as truthmakers for propositions, then there must be epistemic facts that render propositions about epistemic duties true.

Of course, internalist deontologism is far from being the only game in town. Externalists typically reject both epistemic deontologism and its internalist underpinnings as implausible. Externalists contend that the idea of internally construed epistemic duties and obligations is rendered obsolete by the advent of the more sophisticated scheme of externalism and, therefore, we should not take the deontic appearance of ordinary epistemic discourse at face value; at least not in the traditional internalist mode of its interpretation (for example, Greco 2010).

Instead, externalists focus on the reliability of cognitive processes (perception, memory, induction, deduction, introspection, and so forth), where reliability is typically understood in terms of a relatively high ratio of true belief output. The basic insight is that if a cognitive process systematically and reliably delivers truth, then the belief-output of such a reliable process can be considered justified. No Cartesian-style reflective access to duties and supporting reasons is required for justification (and potentially knowledge). We can have justification and knowledge without further reflective justification or epistemic duties. It is no accident that externalist theories like reliabilism are often understood as a form of epistemic consequentialism (see Dunn’s Epistemic Consequentialism), that is, as theories that identify epistemic value with the maximization and promotion of the goal of truth and the minimization and avoidance of error.
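
To make the “high ratio” idea slightly more precise (this is a schematic gloss rather than a formulation drawn from any particular author), one might say that the reliability of a process \(P\) is

\[ \mathrm{Rel}(P) = \frac{\text{number of true beliefs produced by } P}{\text{total number of beliefs produced by } P}, \]

and that a belief is prima facie justified only if it is produced by a process whose reliability exceeds some suitably high threshold.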

There are a variety of externalist positions in the marketplace of epistemology: Goldman’s (1979, 1986) process reliabilism; Plantinga’s (1993) proper functionalism; Bergmann’s (2008) Reidian externalism; Sosa’s (1991, 2007) and Greco’s (2010) virtue-theoretic externalisms and more. It is interesting to note that some externalist positions make important concessions to internalism and in essence come up with hybrid theories of justification and knowledge that aspire to a reconciliation of sorts between the two camps.

A good example of this is Sosa (1991, 2007). Sosa distinguishes between animal knowledge and reflective knowledge. Animal knowledge is simply “apt” knowledge produced by reliable, virtuous faculties, whereas reflective knowledge is the distinctively human sort of knowledge in which reasons in support of the belief are accessed. These reasons are supposed to spring from a coherence test of the belief. That is, for reflective knowledge, production by the relevant virtuous faculty is not sufficient: reasons in support of the belief should be accessed, and these should come via a coherence test.

Hybrid accounts like Sosa’s seem better placed to fend off the often-heard charge against externalism that it leaves epistemic normativity out of the picture because it overlooks the question of what one ought to believe (compare Fumerton 1995; Brandom 2000). More precisely, the charge is that it allows only for mechanical, reliable belief-forming processes that mostly deliver true beliefs and therefore ignores the further requirement for reflection about epistemic oughts. Sosa (1991, 2007) can alleviate this worry because he allows for a reflective element that can monitor the operation of the processes and even ponder which process one ought to employ and rely on. This seems to be an advantage of Sosa’s (1991, 2007) theory over purely externalist theories.

We have now seen some of the basic issues surrounding the epistemology of epistemology. Let us now turn to reasons for belief and epistemic psychology.

6. Reasons for Belief and Epistemic Psychology

Mainstream externalists dislike the whole idea of talking about (at least reflectively accessible) epistemic duties and reasons. For one thing, they tend to think that it is overly intellectualistic and, therefore, distorts ordinary cognitive practice. We neither usually appeal nor need to appeal to reasons for belief, and even when we do, we cannot really choose what to believe. Belief is involuntary and usually comes swiftly and effortlessly, plausibly for evolutionary psychological reasons. So externalists are often also skeptical about the dialectical import of the deontic appearance and sometimes even of its concomitant normativity. The concepts of deontology, internalism and, sometimes, normativity do not sit well with them.

Admittedly, there is some insight in externalist misgivings about reasons-talk and the deontic appearance, but it is also hard to expel talk of reasons and duties from epistemological theorizing. Think for example of Locke’s (1690/1975: 687) pithy aphorism that “those who believe without reason are in love with their own fancies.” As we have seen, this conflict of externalist and deontological/internalist intuitions seems to leave us in a bind because it is not clear which intuition we should discard (or even how we should attempt to reconcile them). Hence, the resolution of the conflict of internalist/externalist intuitions constitutes one of the most serious challenges for any epistemological theory.

Be that as it may, in this section we set aside externalist misgivings about reasons and introduce basic issues surrounding reasons for belief and epistemic psychology. Besides, even externalists give some reasons in support of externalism. First, let us distinguish epistemic from non-epistemic reasons, such as moral, pragmatic, prudential, aesthetic and political reasons. One way to delineate epistemic reasons from other kinds of reasons is to emphasize that they stand as reasons for belief from the epistemic point of view. That is, from the point of view of commitment to some epistemic goals/values, whatever these may be: justification, truth, knowledge, understanding, wisdom (or even some other). Suffice it to say that, very roughly, epistemic goals are goals that promote cognitive contact with reality.

Non-epistemic kinds of reasons are not oriented towards distinctively epistemic goals. Reasons that are oriented towards practical utility are often—somewhat loosely—called pragmatic, and they are akin to a consequentialist/instrumental understanding of practical reasons. So, if for instance you are cynical enough to believe in divine existence simply because it is consoling or because you find Pascal’s wager cogent, then you have pragmatic reasons for such belief but not epistemic reasons. Even if your relevant evidence rationally prescribes belief in God, if you believe because it is consoling or because you find Pascal’s wager cogent, then you believe for the wrong kind of reasons.

Epistemic reasons are also closely connected to moral reasons. Indeed, it is sometimes said that epistemic reasons have a moral flavor because they are reasons that should regulate behavior and such reasons cannot fail to be in some sense moral/practical (compare Clifford 1877; Cuneo 2007). Even if this is right, which is a moot thesis, epistemic reasons would make for a very distinctive species of moral reasons, perhaps so distinctive that they merit a distinctive appellation too: epistemic instead of moral/practical. They are reasons that have to do with the proper regulation of doxastic behavior and not just any kind of practical behavior. Typical doxastic behavior seems different from typical practical behavior at least in terms of conscious and direct control of the behavior.

Much more needs to be said about the classification of reasons, but I have said enough for an intuitive grasp of what might make epistemic reasons different from other kinds of reasons. Let me close this topic with two concluding observations. First, epistemic reasons may be sensitive to moral/practical reasons (and other reasons as well). To see this, think of the case of a cosmologist who by trade has a duty to inquire about the origins of the universe but also has to attend to time-consuming filial duties towards her ageing parents. It is hard to combine these conflicting duties because the time dedicated to filial duties is taken away from work on, and reflection about, her epistemic duties. So there must be a tradeoff between epistemic and moral/practical reasons.

Moral/practical reasons may even affect epistemic reasons in a more dramatic way. You may have good epistemic reasons to believe that your business partner and childhood friend is cheating on you, but you might have moral reasons to trust him. So it seems that epistemic and moral reasons may pull in opposite directions, and the interesting question is what sort of reasons may have the upper hand in such situations. In such cases, we have a moral-epistemic obligation dilemma.

Second, a short note on the so-called “wrong kind of reasons problem” is due (compare Rabinowicz and Ronnow-Rasmussen 2004; Olson 2004; Schroeder 2010; Heuer 2010; Kyriacou 2013). It is a problem that concerns all propositional attitudes (not just belief), and it first surfaced as a problem for so-called buck-passing accounts of value. Buck-passing accounts of value contend that something is valuable iff it has a property that elicits proattitudes like approval, admiration, desire and so forth (compare Scanlon 1998). The problem now is that something may have a property that elicits proattitudes for the wrong kind of reasons. That something is in fact valuable is not enough; indeed, it is beside the point.

The point is that what we judge as valuable should be so judged for the right kind of reasons. If, for example, I admire a truly remarkable painting just because this will please the painter, I have the right attitude concerning its aesthetic value for the wrong reasons. In the epistemic case, if you endorse a truly justified belief as justified because it is consoling for you, and not because it is evidentially grounded, reliably produced, or what have you as the correct justificatory story, then you have the right kind of doxastic attitude but for the wrong kind of reason. The wrong kind of reasons problem poses a challenge for any metanormative theory and it is also a challenge for a theory of epistemic justification. We need a theory of epistemic justification that would get round the problem and have agents believe justifiably for the right kind of reasons.

A further important distinction about epistemic reasons is the distinction between hypothetical/instrumental and categorical reasons (or rationality). One position holds that epistemic rationality is merely instrumental. We have certain (or opt for some) epistemic goals like truth and knowledge and then we try to satisfy them in the best possible way (to the extent that we are rational). According to instrumental epistemic rationality, there are no reasons for belief going over and above these conditional/hypothetical epistemic goals. For example, if we have the hypothetical goal of truth about p, then epistemic rationality is merely instrumental in the sense that it should find the best means to satisfy the goal. In other words, epistemic rationality is instrumental, and, as such, it should be understood in terms of hypothetical requirements like “If you aim for the truth of p, then do such-and-such, say, employ reliable belief-forming processes” (compare Kornblith 2002).

Opponents of this view reject instrumental epistemic rationality in favor of categorical rationality. They contend that we may have reasons for belief even if we lack the relevant conditional epistemic goal (compare Kelly 2003). For example, I may have no desire to find the truth about who killed my boss, but I may still have a binding (external/categorical) reason to believe that it was John because there is abundant and accessible evidence at my disposal that confirms this. In a sense, whether I care about the truth of the matter or not, I have sufficient epistemic reason to believe that John is the murderer. Epistemic rationality constrains what I should believe unconditionally, that is, independently of my desires, interests, or adopted goals. In this sense, epistemic rationality should be understood in terms of non-optional, categorical requirements like “Given the evidence, you ought to believe that such-and-such (here, that John is the murderer).”

A related question about reasons for belief, inspired by a parallel discussion in metaethics, concerns the nature of reasons for belief. Just as in metaethics there has been considerable discussion about the nature of reasons for action and the corresponding internal/external reasons for action, the same fruitful discussion could be opened in metaepistemology about reasons for belief. Internal reasons are reasons that depend on the agent’s subjective motivational set (desires, intentions, dispositions, plans, goals), while external reasons are reasons that are independent of the agent’s subjective motivational set (see Williams 1979; also Turri 2009).

For example, I may have a reason to believe or inquire about p only if I have a desire or care about the truth about p. According to this internal reading of having a reason, my having a reason exclusively depends on my caring and desiring to know about p. Were I not to care, I would not have a reason to believe or inquire into p at all; this restates the instrumental conception of epistemic rationality. But there is also an external reading of the notion of having a reason. For example, I might say “Sophie has a reason to believe that John is from New York” and imply that she has a reason to believe this insofar as she is rational, independently of whether she cares about the truth of the matter. Even if she couldn’t care less, it remains an epistemic fact that she has a reason to believe that John is from New York; this restates the categorical conception of epistemic rationality. There are epistemic facts about what (categorical) reasons for belief we have (compare Boghossian 2007; Cuneo 2007).

A different issue that has recently drawn attention, and that also parallels an analogous discussion in metaethics, concerns the relation between epistemic judgment and motivation (compare Kappel and Moeller 2014; Grajner 2015). That is, the so-called judgment internalism/externalism controversy. Like moral judgments, epistemic judgments such as “p is justified,” “S’s belief that p is true,” or “I know that p” seem to implicate certain kinds of motivation. They seem to have an internal, conceptual connection with certain motivations. Judgment internalists hold onto the intuition of an internal connection between judgment and motivation, while judgment externalists, though they accept the systematic character of the connection, deny the internalist intuition and suggest that the connection may be severed in certain circumstances. Hence, the connection may only be external and defeasible.

For example, for judgment internalists, sincerely saying that “p is justified” implicates that I have at least some reason or doxastic motivation to believe that p, rely on p, recommend to others that p, or approve of the norms, habits and processes in virtue of which p was formed; saying that “S’s belief that p is true” implicates some reason to also believe that p, approve of the norms in virtue of which p was formed, give some credit to S, and so forth; and asserting that “I know that p” implicates, all other things being equal, termination of inquiry. The inquiry about p is, at least for the time being, settled (compare Kappel and Moeller 2014).

In contrast, externalists would question the conceptual connection and its modal strength (compare Kvanvig 2003; Grajner 2015). They might highlight the conceivability of Spock-like agents who make sincere normative judgments but remain entirely cold and unmotivated by those judgments. In the moral case, such characters are called amoralists, and the typical figures externalists have in mind are psychopaths, sociopaths, social delinquents, and so forth, who seem capable of sincere moral judgment without any corresponding motivation for action. (Perhaps we can call the epistemic counterparts of amoralists “acognitivists.”)

As in the parallel case of moral motivation in metaethics, expressivists seem able to easily explain the systematic—if not strictly necessary—connection between epistemic judgment and motivation for belief, reliance, termination of inquiry, and so forth. This is because, as a noncognitivist theory about the nature of the mental state expressed in epistemic judgment, expressivism can appeal to the motivating nature of noncognitive states to explain this systematic connection with motivation. Desires, intentions, and even epistemic sentiments may be called upon to do the explanatory psychological work.

For example, if I say that “p is justified,” the expressivist will suggest that I express some sort of approval for p, or for the norms that license it, or for the doxastic habits in virtue of which it was reliably formed (for example, Kyriacou 2012). The bottom line of such a noncognitivist account is that the relevant desire or intention for belief might be expressible in such judgments of justification. Similar stories would be told for other sorts of epistemic judgments. Thus, epistemic motivation is easily explained as a psychological phenomenon in an expressivist framework.

Cognitivists about epistemic judgment, though, cannot directly appeal to the same style of psychological explanation as the expressivist, because they suggest that epistemic judgments express beliefs, and beliefs do not seem to be intrinsically motivating states. They are representational states and as such do not seem to engage the will at all. Saying that “Paris is a city” or that “It’s a hot day today” may be representational, but it is also “motivationally inert.” As is often said, beliefs and desires have opposite directions of fit with the world (compare Anscombe 1957; Smith 1994). Beliefs are representational states in the sense that they aspire to represent aspects of the world and get at the respective truth. They are mind-to-world directed states. Desires are nonrepresentational states in the sense that they urge for changing the world in order to be satisfied (compare Smith 1994). They are world-to-mind directed states.

This leaves cognitivists with an obvious puzzle about epistemic motivation. They could insist that epistemic beliefs can motivate, but this seems entirely ad hoc because beliefs seem to be representational and motivationally inert states. Alternatively, they could suggest that beliefs may induce corresponding desires, but they will then have to explain how this systematic inducing works, given that beliefs and desires are so-called Humean “distinct existences” (compare Korsgaard 1986; Smith 1994). That is, there is no necessary (logical or psychological) connection between belief and desire, as the two states can pull apart. There can be belief without any corresponding desire and vice versa.

They could even appeal to a sophisticated hybrid (or so-called “ecumenical”) picture of normative judgment and suggest that such judgments express both cognitive and noncognitive states (compare Copp 2001; Ridge 2007; Grajner 2015). In any event, cognitivists have some extra explaining to do that noncognitivists do not. This seems to give noncognitivists the edge in understanding the psychological relation between normative judgment and motivation, and it is considered one of the best arguments in favor of noncognitivism. Nevertheless, one should not underestimate the resourcefulness of the cognitivist tradition and its ability to account for moral and epistemic motivation (for ecumenical cognitivism see Grajner 2015).

With this much said about reasons for belief and epistemic psychology, let us briefly turn, in the next section, to the notions of agency and responsibility.

7. Agency and Responsibility

Epistemic deontology has seemed especially implausible to many externalists because it does not comport with a plausible picture of epistemic agency. That is, epistemic deontologism assumes that as rational and responsible agents we have epistemic duties about what to believe and to the extent that we do not conform our doxastic behavior to these duties we are responsible and culpable. The problem now is that a plausible account of epistemic agency does not allow for the responsibility that deontologism requires and, since this is a prerequisite for deontologism, deontologism seems implausible.

This is arguably the case because doxastic behavior typically seems beyond the reach of our direct and conscious control. To illustrate this, let someone try to believe something that she considers an obvious falsehood, or at least something she considers implausible, like “1+1=3” or “Paris is not the French capital.” Let us even offer a powerful motive for the success of this cognitive exercise, say, 10,000 pounds for sincerely acquiring the belief. But alas, it seems psychologically impossible, because direct belief-fixation is not up to us in any profound manner. Unlike ordinary decisions about what to do (for example, to dance or to stand up), we cannot directly decide what to believe.

In light of the fact that an intuitive understanding of responsibility is explicated in terms of minimal control of action, and control in terms of relative freedom to choose from open alternatives (for example, Pink 2004), one may consider epistemic deontologism a lost cause. That is, a lost cause because it relies on the existence of epistemic responsibility but does not allow for the sufficient control and freedom of choice that are a necessary prerequisite for such responsibility—this is “the doxastic involuntarism problem” for (internalist) epistemic deontologism (for the classic statement compare Alston 1988).

As one might expect, there have been a number of responses on behalf of deontologism. Some have flatly denied the involuntarism intuition and insisted that we can at least sometimes directly decide what to believe (for example, Ginet 2001). Others have denied that responsibility is exhausted by the absence of direct control (for example, Feldman 2001). We might have no direct control over the working of our heart, but we might still be to some extent responsible for its proper working because we can take steps to indirectly enhance its functioning. After all, we can have a healthy lifestyle that includes a pro-heart diet, exercise, regular medical exams, and so forth.

Similarly, we can take steps to indirectly enhance our cognitive functioning. We could, for example, cultivate reliance on methods, habits, and processes that are reliable belief-producers—that is, that produce a high ratio of true beliefs. If I am myopic, for example, I could cultivate the habit of relying on my glasses for visual perception because I know they make for reliable perception, or I could rely on sources of information that are generally reliable, like a trustworthy newspaper columnist or a trustworthy friend. I could also cultivate epistemic virtues, namely, character traits that are generally truth-conducive, like conscientiousness, inquisitiveness, open-mindedness, perseverance, respect, courage, tolerance, fairness, love of truth and humility.

In this vein Chrisman (2008), for example, has argued that doxastic involuntarism and epistemic deontology can be reconciled. He appeals to Wilfrid Sellars’ distinction between “rules of criticism” and “rules of action” and explains how doxastic oughts might be subject to rules of criticism that require no direct voluntary control but, nevertheless, do not compromise the categoricity of doxastic oughts (and responsibility). The doxastic involuntarism problem remains a live question for epistemic deontologism and agency (for more discussion see Vitz’s Doxastic Voluntarism).

A second topic of some recent debate that relates to (epistemic) agency is the so-called “situationist challenge about character.” Roughly, situationism about character is the thesis that character is a figment of folk psychology (and virtue theory) and does not really exist. Although it was first applied to moral character, and thereby to virtue ethics, and then transposed to responsibilist virtue-theoretic epistemology (compare Alfano 2012), which appeals to character, it is arguably not of merely local responsibilist virtue-theoretic concern. This is because if, as Baehr (2012) has cogently argued, even theories that are stipulated to have nothing to do with character and virtuous/vicious traits (like evidentialism and reliabilism) inevitably involve such traits, then the debunking of character as a mere figment will have serious repercussions for these theories as well.

Harman (1999) is a classic statement of so-called situationism about moral character traits, namely, the idea that what drives our behavior is the current of the social situation and not fixed character traits, as virtue theories and folk psychology tend to think. In fact, according to Harman (1999), there is no positive empirical evidence in favor of character traits, and there is much negative empirical evidence against them and, therefore, in so thinking we commit the so-called “fundamental attribution error.” We excessively focus on the agent and downplay the importance of the social situation that the agent is found in.

The situationist challenge questions the psychological reality of the theoretical cornerstone of responsibilist virtue theories, namely, character. Drawing from various empirical psychological studies, like Milgram’s (1974) famous obedience experiments, it is suggested that the folk psychological and virtue-theoretical concept of character is a mere fiction. These studies are taken to show that in reality the notion of character does not exist because in these experiments the agents’ actions are not guided by their character, relevant virtues, or traits but by the situational context they are found in.

True enough, we speak about character and character traits (virtuous and vicious), but empirical psychological studies confirm that in a given situation our supposed character traits may fail to manifest themselves in our conduct, at least for a statistically significant portion of us. This indicates that character traits are not the reliable source of dispositions for action that they are widely assumed to be. If they were, most of us could not so easily behave out of character. This corollary would suffice to shake the foundations of folk psychology (which relies extensively on the notion of character) and responsibilist virtue theory, as well as other theories that surreptitiously involve the notion of character (for some discussion of situationism, see Timpe’s Moral Character; for responses to moral situationism see Miller 2003; Kamtekar 2004; and Snow 2010).

More recently, Alfano (2011) transposed the situationist challenge from virtue ethics to responsibilist virtue epistemology. He appeals to abundant empirical work from cognitive and social psychology to support his case. He indicates that intellectual virtues like curiosity, flexibility, creativity, and courage are susceptible to the vagaries of non-intellectual, situational factors like mood elevators, mood depressors, ambient sounds, ambient smells, and even the weather.

This suggests, according to Alfano (2011), a puzzle that can be framed as an inconsistent triad: (non-skepticism) Most people know quite a bit; (responsibilism) Knowledge is true belief acquired and retained through the exercise of intellectual virtue; and (epistemic situationism) Most people do not possess the epistemic virtues countenanced by responsibilism. Alfano suggests that a plausible way to resolve the puzzle is to give up responsibilism and with it the whole enterprise of responsibilist virtue epistemology. The safe prediction is that situationism about moral and intellectual character poses a sharp challenge to folk psychology, virtue theory (and beyond) and is here to stay as a topic for discussion.

With this much about agency and responsibility, let us now wrap up the overall discussion.

8. New Directions in Metaepistemology

We have covered a lot of ground in large strides and introduced various key aspects of metaepistemological theorizing, ranging from semantics to metaphysics and agency. There is, of course, much more going on in metaepistemological debates (both in terms of depth and breadth), especially as new and fascinating directions of inquiry are constantly emerging. New semantic theories have been suggested, such as hybrid semantic theories of epistemic concepts and inferentialist theories. More light is being shed on neglected epistemic properties and their value, such as understanding, entitlement, and wisdom, and new and interesting parallels with metaethics, such as reasons and motivation, are constantly emerging. Finally, exciting experimental work on intuitions is emerging, and formal work is being carried out that links metaepistemological themes with probability calculus and decision theory (for example, Pettigrew 2011, 2013). All this indicates that metaepistemology is an emerging and promising field of epistemological inquiry that is anticipated to blossom in the years to come.

9. References and Further Reading

  • Alfano, Mark. (2011). ‘Expanding the Situationist Challenge to Responsibilist Virtue Epistemology’. Philosophical Quarterly 62(247): 223-249.
  • Alston, William. (1988). ‘The Deontological Conception of Epistemic Justification’, Philosophical Perspectives, 2, Epistemology, pp.115-152.
  • Alston, William. (2005). Beyond Justification. Ithaca, NY: Cornell University Press.
  • Annis, David. (1978). ‘A Contextualist Theory of Epistemic Justification’. American Philosophical Quarterly, Vol.15, No.3, pp.213-219.
  • Anscombe, G. E. M. (1957). Intention. Oxford, Blackwell.
  • Ayer, A. J. (1936). Language, Truth and Logic. London, Penguin Books.
  • Baehr, Jason. (2012). The Inquiring Mind. Oxford, Oxford University Press.
  • Bergmann, Michael. (2008). ‘Reidian Externalism’ in New Waves in Epistemology, (eds.) Vincent Hendricks and Duncan Pritchard. London, Palgrave MacMillan.
  • Blackburn, Simon. (1993). Essays in Quasi-Realism. Oxford, Oxford University Press.
  • Blackburn, Simon. (1998). Ruling Passions. Oxford, Oxford University Press.
  • Blackburn, Simon. (2006). Truth. London, Penguin Books.
  • Bonjour, Lawrence. (1985). The Structure of Empirical Knowledge. Cambridge MA: Harvard University Press.
  • Bonjour, Lawrence. (1998). In Defense of Pure Reason. London, Cambridge University Press.
  • Bonjour, Lawrence and Sosa, Ernest. (2003). Epistemic Justification. Oxford, Blackwell.
  • Boghossian, Paul. (2007). Fear of Knowledge. Oxford, Oxford University Press.
  • Brandom, Robert. (2000). Articulating Reasons. Cambridge MA, Harvard University Press.
  • Brink, David. (1989). Moral Realism and the Foundations of Ethics. New York, Cambridge University Press.
  • Charlow, Nate. (2014). ‘The Problem with the Frege-Geach Problem’. Philosophical Studies 167(3): 635-665.
  • Chisholm, Roderick. (1966). Theory of Knowledge. Englewood Cliffs, NJ, Prentice Hall.
  • Chrisman, Matthew. (2007). ‘From Epistemic Contextualism to Epistemic Expressivism’. Philosophical Studies 135(2):225-254.
  • Chrisman, Matthew. (2008). ‘Ought to Believe’. Journal of Philosophy 105 (7): 346-370.
  • Chrisman, Matthew. (2012). ‘Epistemic Expressivism’. Philosophy Compass 7(2):118-126.
  • Clifford, William (1877/2008). ‘The Ethics of Belief’ in Reason and Responsibility, (eds.) Joel Feinberg and Russ Shafer-Landau. Belmont, CA: Thomson, pp.101-5.
  • Cohen, Stewart. (1998). ‘Contextualist Solutions to Epistemological Problems: Scepticism, Gettier and the Lottery’. Australasian Journal of Philosophy 74, 4, pp.549-567.
  • Conee, Earl and Feldman, Richard. (2004). Evidentialism. Oxford, Oxford University Press.
  • Copp, David. (2001). Realist-Expressivism: A Neglected Option for Moral Realism. Social Philosophy and Policy 18(02):1-43.
  • Cuneo, Terence. (2007). The Normative Web. Oxford, Oxford University Press.
  • Cuneo, Terence and Kyriacou, Christos. (2017) ‘Defending the Moral/Epistemic Parity’ in Metaepistemology. (eds.) C. McHugh, J. Way and D. Whiting.
  • Darwall, Stephen, Gibbard, Allan and Railton, Peter. (1992). ‘Toward Fin de Siecle Ethics: Some Trends’. Philosophical Review 101(1):115-189.
  • De Cruz, Helen, Boudry, Maarten, De Smedt, Johan, and Blancke, Stefaan. (2011). ‘Evolutionary Approaches to Epistemic Justification’. Dialectica 65(4):517-535.
  • DeRose, Keith. (1995). ‘Solving the Skeptical Problem’. The Philosophical Review 104, 1, pp. 1-52.
  • Descartes, Rene. (1641\2008). Meditations on First Philosophy. Oxford, Oxford University Press. Translated by Michael Moriarty.
  • Dowden, Bradley and Swartz, Norman. ‘Truth’ in the Internet Encyclopedia of Philosophy.
  • Dreier, James. (2004). ‘Meta-ethics and the Problem of Creeping Minimalism’. Philosophical Perspectives 18(1):23-44.
  • Dunn, Jeffrey. ‘Epistemic Consequentialism’. Internet Encyclopedia of Philosophy.
  • Elgin, Catherine. (2004). ‘True Enough’. Philosophical Issues 14, Epistemology, pp. 113-131.
  • Enoch, David. (2013). Taking Morality Seriously. Oxford, Oxford University Press.
  • Feldman, Richard. (2001). ‘Voluntary Belief and Epistemic Evaluation’ in Knowledge, Truth and Duty, (ed.) Matthias Steup. Oxford, Oxford University Press, pp.77-92.
  • Feldman, Richard. (2002). ‘Epistemological Duties’ in The Oxford Handbook of Epistemology, ed. Paul Moser. Oxford, Oxford University Press. pp.362-384.
  • Fisher, Andrew. (2011). Metaethics: An Introduction. Acumen.
  • Floridi, Luciano. (2004). ‘On the Logical Unsolvability of the Gettier Problem’. Synthese 142(1): 61-79.
  • Fogelin, Robert. (1994). Pyrrhonian Reflections on Knowledge and Justification. Oxford, Oxford University Press.
  • Frankena, William. (1939). ‘The Naturalistic Fallacy’. Mind XLVIII(192): 464-477.
  • Frege, Gottlob. (1997). ‘On Negation’ in The Frege Reader, (ed.) Michael Beaney. Oxford, Blackwell.
  • Fricker, Miranda. (2010). Epistemic Injustice. Oxford, Oxford University Press.
  • Fumerton, Richard. (1995). Metaepistemology and Skepticism. London, Rowman and Littlefield.
  • Geach, Peter. (1960). ‘Ascriptivism’. Philosophical Review 2:221-225.
  • Geach, Peter. (1965). ‘Assertion’. Philosophical Review 74(4):449-465.
  • Gibbard, Allan. (1990). Wise Choices, Apt Feelings. Oxford, Oxford University Press.
  • Ginet, Carl. (2001). ‘Deciding to Believe’ in Knowledge, Truth and Duty, ed. Matthias Steup. Oxford, Oxford University Press. pp.63-76.
  • Goldman, Alvin. (1979). ‘What is Justified Belief?’ in Epistemology: An Anthology, (eds.) Ernest Sosa and Jaegwon Kim. Oxford, Blackwell. pp. 340-353.
  • Grajner, Martin. (2015). ‘Hybrid Expressivism About Epistemic Justification’. Philosophical Studies 172(9):2349-2369
  • Greco, Daniel. (2015).  ‘Epistemological Open Question Arguments’. Australasian Journal of Philosophy 93(3):509-523.
  • Greco, John. (2010). Achieving Knowledge. Oxford, Oxford University Press.
  • Grimm, Stephen. (2011). ‘Understanding’ in The Routledge Companion to Epistemology, (eds.) Sven Bernecker and Duncan Pritchard. New York, Routledge.
  • Haidt, Jonathan. (2012). The Righteous Mind. New York, Vintage Books.
  • Harman, Gilbert. (1986). Change in View. Cambridge, MA: MIT Press.
  • Harman, Gilbert. (1999). ‘Moral Philosophy Meets Social Psychology: Virtue Ethics and the Fundamental Attribution Error’. Proceedings of the Aristotelian Society 99, pp. 315-331.
  • Hare, Richard. (1952). The Language of Morals. Oxford, Clarendon Press.
  • Hawthorne, John. (2004). Knowledge and Lotteries. Oxford, Oxford University Press.
  • Heathwood, Chris. (2009). ‘Moral and Epistemic Open Question Arguments’, Philosophical Books 50: 83-98.
  • Heuer, Ulrike. (2010). ‘Beyond Wrong Reasons: The Buck-Passing Account of Value’ in New Waves in Metaethics, (ed.) Michael Brady. Palgrave Macmillan. pp. 166-184.
  • Hume, David (1739\1985). A Treatise of Human Nature. London, Penguin.
  • Huemer, Michael. (2008). Ethical Intuitionism. London, Palgrave Macmillan.
  • James, William. (1896\2008). ‘The Will to Believe’ in Reason and Responsibility, (eds.) Joel Feinberg and Russ Shafer-Landau. Belmont, CA: Thomson, pp. 106-113.
  • Jenkins, Catherine. (2007). ‘Epistemic Norms and Natural Facts’. American Philosophical Quarterly 44 (3): 259-272.
  • Kahneman, Daniel. (2011). Thinking, Fast and Slow. New York, Farrar, Straus and Giroux.
  • Kamtekar, Rachana. (2004). ‘Situationism and Virtue Ethics on the Content of Our Character’. Ethics 114(3):458-491.
  • Kappel, Klemens and Moeller, Eric. (2014). ‘Epistemic Expressivism and the Argument from Motivation’. Synthese, pp.1-19.
  • Kelly, Thomas. (2003). ‘Epistemic Rationality as Instrumental Rationality: A Critique’. Philosophy and Phenomenological Research 66, 3, pp.612-40.
  • Kim, Jaegwon. (1988). ‘What is Naturalized Epistemology?’. Philosophical Perspectives 2: 381-405.
  • Kirkham, Richard. (1984). ‘Does the Gettier Problem Rest on A Mistake?’. Mind 93(372): 501-513.
  • Kyriacou, Christos. (2012).‘Habits-Expressivism About Epistemic Justification’. Philosophical Papers 41(2):209-237.
  • Kyriacou, Christos. (2013). ‘How Not To Solve the Wrong Kind of Reasons Problem’. Journal of Value Inquiry 47(1-2):101-110.
  • Kyriacou, Christos. (2015). ‘Critical Discussion of David Velleman’s Foundations for Moral Relativism (UK: Open Book Publishers, 2013)’. Ethical Theory and Moral Practice 18(1), pp. 209-214.
  • Kyriacou, Christos. (2016a). ‘Ought to Believe, Evidential Understanding and the Pursuit of Wisdom’ in Epistemic Reasons, Epistemic Norms, Epistemic Goals, eds. Martin Grajner and Pedro Schmechtig. Berlin, DeGruyter. Pre-print version.
  • Kyriacou, Christos. (2016b). ‘Metaepistemology’ in Oxford Bibliographies Online, ed. Duncan Pritchard. New York, Oxford University Press. Pre-print version.
  • Kyriacou, Christos. (2017). ‘Bifurcated Sceptical Invariantism’ in Journal of Philosophical Research. Pre-print version.
  • Kornblith, Hilary. (2002). Knowledge and Its Place in Nature. Oxford, Oxford University Press.
  • Korsgaard, Christine. (1986). ‘Skepticism About Practical Reason’, Journal of Philosophy 83 (1):5-25.
  • Kvanvig, Jonathan. (2003). The Value of Knowledge and the Pursuit of Understanding. Cambridge, Cambridge University Press.
  • Lenman, James. (2008). ‘Review of Terence Cuneo. The Normative Web.’ Notre Dame Philosophical Reviews 2008(6).
  • Lewis, David. (1996). ‘Elusive Knowledge’. Australasian Journal of Philosophy 74(4): 549-567.
  • Locke, John. (1690\1975). An Essay Concerning Human Understanding. Oxford, Oxford University Press. Edited with an Introduction by P.H. Nidditch.
  • Mackie, John. (1977). Ethics: Inventing Right and Wrong. London, Penguin.
  • MacFarlane, John. (2005). ‘The Assessment Sensitivity of Knowledge Attributions’. In T.S.Gendler and J.Hawthorne, eds. Oxford Studies in Epistemology, Oxford University Press. pp. 197-234.
  • Miller, Christian. (2003). ‘Social Psychology and Virtue Ethics’. Journal of Ethics 7(4): 365-392.
  • Milgram, Stanley. (1974). Obedience to Authority: An Experimental View. New York, Harper and Row.
  • Moore, G. E. (1903\2000). Principia Ethica. Cambridge, Cambridge University Press. Edited with an introduction by T. Baldwin.
  • Neta, Ram. (2008). ‘How to Naturalize Epistemology’ in New Waves in Epistemology, eds. Vincent Hendricks and Duncan Pritchard. New York, Palgrave Macmillan. pp. 324-353.
  • Nozick, Robert. (1981). Philosophical Explanations. Cambridge MA, Harvard University Press.
  • Olson, Erik. (2007). ‘The Place of Coherence in Epistemology’ in New Waves in Epistemology, (eds.) Vincent Hendricks and Duncan Pritchard. London, Palgrave Macmillan.
  • Olson, Jonas. (2004). ‘Buck-Passing and the Wrong Kind of Reasons’. Philosophical Quarterly 54(215):295-300.
  • Olson, Jonas. (2011a). ‘In Defense of Moral Error Theory’ in New Waves in Metaethics. New York, Palgrave Macmillan. pp. 62-84.
  • Olson, Jonas. (2011b). ‘Error Theory and Reasons for Belief’ in Reasons for Belief, (eds.) Andrew Reisner and Asbjorn Steglich-Petersen. Cambridge, Cambridge University Press. pp. 75-93.
  • Papineau, David. (2003). ‘The Evolution of Knowledge’ in The Roots of Reason. Oxford, Oxford University Press, pp. 39-82.
  • Pettigrew, Richard. (2011). ‘Epistemic Utility Arguments for Probabilism’. Stanford Encyclopedia of Philosophy.
  • Pettigrew, Richard. (2013). ‘Epistemic Utility and Norms for Credences’. Philosophy Compass 8/10: 897-908.
  • Pink, Thomas. (2004). Free Will: A Very Short Introduction. Oxford, Oxford University Press.
  • Plantinga, Alvin. (1993). Warrant: The Current Debate. Oxford, Oxford University Press.
  • Plantinga, Alvin. (1993). Warrant and Proper Function. Oxford, Oxford University Press.
  • Plato. (2005). Euthyphro, Apology, Crito, Phaedo, Phaedrus. Cambridge MA : Harvard University Press.
  • Pollock, John and Cruz, Joseph. (1999). Contemporary Theories of Knowledge. Lanham, Rowman and Littlefield.
  • Poston, Ted. ‘Internalism and Externalism in Epistemology’ in the Internet Encyclopedia of Philosophy.
  • Pritchard, Duncan. (2010). The Nature and Value of Knowledge. Oxford, Oxford University Press. Coauthored with Allan Millar and Adrian Haddock.
  • Putnam, Hilary. (1975/1997). ‘The Meaning of ‘Meaning’’ in Philosophical Papers Vol.2 : Mind, Language and Reality. Cambridge, Cambridge University Press. pp. 215-271.
  • Quine, W. V. O. (1953). ‘Two Dogmas of Empiricism’ in From A Logical Point of View. Cambridge, MA: Harvard University Press. pp. 20-46.
  • Quine W. V. O. (1992). Pursuit of Truth. Cambridge, MA: Harvard University Press
  • Rabinowicz, Wlodek and Ronnow-Rasmussen, Toni. (2004). ‘The Strike of the Demon: On Fitting Pro-Attitudes and Value’. Ethics 114(3):391-423.
  • Ridge, Michael. (2007). ‘Ecumenical Expressivism: The Best of Both Worlds?’ Oxford Studies in Metaethics 2:51-76.
  • Ridge, Michael. (2014). ‘Moral Non-naturalism’ in Stanford Encyclopedia of Philosophy, (ed.) Edward Zalta.
  • Scanlon, T. M. (1998). What We Owe To Each Other. Cambridge MA, Harvard University Press.
  • Schaffer, Jonathan. (2004). ‘From Contextualism to Contrastivism’. Philosophical Studies 119(1-2): 73-104.
  • Schaffer, Jonathan. (2013). ‘On What Grounds What’ in Metametaphysics, eds. D.Chalmers, D.Manley and R.Wasserman. Oxford, Oxford University Press. pp.347-383.
  • Schroeder, Mark. (2008a). Being For. Oxford, Oxford University Press.
  • Schroeder, Mark. (2008b). ‘What is the Frege-Geach Problem?’. Philosophy Compass 3(4):703-720.
  • Schroeder, Mark. (2010). ‘Value and the Right Kind of Reason’. Oxford Studies in Metaethics 5:25-55.
  • Setiya, Kieran. (2012). Knowing Right from Wrong. Oxford, Oxford University Press.
  • Smith, Michael. (1994). The Moral Problem. Oxford, Oxford University Press.
  • Snow, Nancy. (2010). Virtue as Social Intelligence. New York, Routledge.
  • Sosa, Ernest. (1991). Knowledge in Perspective. Cambridge, Cambridge University Press.
  • Sosa, Ernest. (2003). ‘The Place of Truth in Epistemology’ in Intellectual Virtue, (eds.) M.DePaul and L.Zagzebski. Oxford, Oxford University Press, pp.155-79.
  • Sosa, Ernest. (2007). A Virtue Epistemology. Oxford, Oxford University Press.
  • Street, Sharon. (2006). ‘A Darwinian Dilemma for Realist Theories of Value’. Philosophical Studies 127 (1), pp. 109-166.
  • Street, Sharon. (2009). ‘Evolution and the Normativity of Epistemic Reasons’. Canadian Journal of Philosophy 39 (supplement 1): 213-248.
  • Stich, Stephen. (1990). The Fragmentation of Reason. Cambridge MA, MIT Press.
  • Stevenson, C. L. (1963). ‘The Nature of Ethical Disagreement’ in Facts and Values. New Haven: Yale University Press. pp.1-9.
  • Timmons, Mark and Horgan, Terence. (1991). ‘New Wave Moral Realism Meets Moral Twin Earth’. Journal of Philosophical Research 16:447-465.
  • Timpe, Kevin. ‘Moral Character’ in the Internet Encyclopedia of Philosophy.
  • Turri, John. (2009). ‘The Ontology of Epistemic Reasons’. Nous 43(3): 490-512.
  • Unger, Peter. (1971). ‘A Defense of Skepticism’. Philosophical Review, LXXX :198-219.
  • Vahid, Hamid. (2005). Epistemic Justification and the Skeptical Challenge. London, Palgrave Macmillan.
  • Vavova, Katia. (2014). ‘Debunking Evolutionary Debunking’. Oxford Studies in Metaethics 9:76-101.
  • Velleman, David. (2013). Foundations for Moral Relativism. UK: Open Book Publishers.
  • Vitz, Rico. ‘Doxastic Voluntarism’ in the Internet Encyclopedia of Philosophy.
  • Yalcin, Seth. (2012). ‘Bayesian Expressivism’, Proceedings of the Aristotelian Society 112, Vol.2:123-160.
  • Zagzebski, Linda. (1996). Virtues of the Mind. Cambridge, Cambridge University Press.
  • Zagzebski, Linda. (2003). ‘The Search for the Source of the Epistemic Good’. Metaphilosophy 34, pp.12-28.
  • Zagzebski, Linda. (2009). On Epistemology. Belmont, CA : Wadsworth.
  • Wedgwood, Ralph. (2008). ‘Contextualism About Justified Belief’. Philosopher’s Imprint 8, No.9, pp.1-20.
  • Wiggins, David. (1987). ‘A Sensible Subjectivism?’ in Needs, Values and Truth. Oxford, Oxford University Press. pp. 185-214.
  • Williams, Bernard. (1979). ‘Internal and External Reasons’ in Rational Action, ed. Ross Harrison. Cambridge, Cambridge University Press. pp. 101-113.
  • Williams, Michael. (2001). Problems of Knowledge. Oxford, Oxford University Press.
  • Williamson, Timothy. (2000). Knowledge and its Limits. Oxford, Oxford University Press.

Author Information

Christos Kyriacou
Email: ckiriakou@gmail.com
University of Cyprus
Cyprus

Interpretations of Quantum Mechanics

Quantum mechanics is a physical theory developed in the 1920s to account for the behavior of matter on the atomic scale. It has subsequently been developed into arguably the most empirically successful theory in the history of physics. However, it is hard to understand quantum mechanics as a description of the physical world, or to understand it as a physical explanation of the experimental outcomes we observe. Attempts to understand quantum mechanics as descriptive and explanatory, to modify it such that it can be so understood, or to argue that no such understanding is necessary, can all be taken as versions of the project of interpreting quantum mechanics.

The problematic nature of quantum mechanics stems from the fact that the theory often represents the state of a system using a sum of several terms, where each term apparently represents a distinct physical state of the system. What’s more, these terms interact with each other, and this interaction is crucial to the theory’s predictions. If one takes this representation literally, it looks as if the system exists in several incompatible physical states at once. And yet when the physicist makes a measurement on the system, only one of these incompatible states is manifest in the result of the measurement. What makes this especially puzzling is that there is nothing in the physical nature of a measurement that could privilege one of the terms over the others.
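
Schematically, and using standard quantum-mechanical notation rather than anything introduced in this article, such a state for a simple two-outcome case can be written

\[ |\psi\rangle = c_{1}|A\rangle + c_{2}|B\rangle, \qquad |c_{1}|^{2} + |c_{2}|^{2} = 1, \]

where \(|A\rangle\) and \(|B\rangle\) each apparently represent a distinct physical state of the system, yet any single measurement reveals only one of the two corresponding outcomes.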

According to the Copenhagen interpretation of quantum mechanics, the solution to this puzzle is that the quantum state should not be taken as a description of the physical system. Rather, the role of the quantum state is to summarize what we can expect if we make measurements on the system. According to the many-worlds interpretation, the quantum state is to be taken as a description of the system, and the solution to the puzzle is that each term in that description produces a corresponding measurement outcome. That is, for any quantum measurement there are generally multiple measurement results occurring on distinct “branches” of reality. According to hidden variable theories, the quantum state is a partial description of the system, where the rest of the description is given by the values of one or more “hidden” variables. The solution to the puzzle in this case is that the hidden variables pick out one of the physical states described by the quantum state as the actual one. According to spontaneous collapse theories, the quantum state is a complete description of the system, but the dynamical laws of quantum mechanics are incomplete, and need to be supplemented with a “collapse” process that eliminates all but one of the terms in the state during the measurement process.

These interpretations and others present us with very different pictures of the nature of the physical world (or in the Copenhagen case, no picture at all), and they have different strengths and weaknesses. The question of how to decide between them is an open one.

Table of Contents

  1. The Development of Quantum Mechanics
  2. The Copenhagen Interpretation
  3. The Many-Worlds Interpretation
  4. Hidden Variable Theories
  5. Spontaneous Collapse Theories
  6. Other Interpretations
  7. Choosing an Interpretation
  8. References and Further Reading

1. The Development of Quantum Mechanics

Quantum mechanics was developed in the early twentieth century in response to several puzzles concerning the predictions of classical (pre-20th century) physics. Classical electrodynamics, while successful at describing a large number of phenomena, yields the absurd conclusion that the electromagnetic energy in a hollow cavity is infinite. It also predicts that the energy of electrons emitted from a metal via the photoelectric effect should be proportional to the intensity of the incident light, whereas in fact the energy of the electrons depends only on the frequency of the incident light. Taken together with the prevailing account of atoms as clouds of positive charge containing tiny negatively charged particles (electrons), classical mechanics entails that alpha particles fired at a thin gold foil should all pass straight through, whereas in fact a small proportion of them are reflected back towards the source.

In response to the first puzzle, Max Planck suggested in 1900 that light can only be emitted or absorbed in integral units of hν, where ν is the frequency of the light and h is a constant. This is the hypothesis that energy is quantized—that it is a discrete rather than continuous quantity—from which quantum mechanics takes its name. This hypothesis can be used to explain the finite quantity of electromagnetic energy in a hollow cavity. In 1905 Albert Einstein proposed that the quantization of energy can solve the second puzzle too; the minimum amount of energy that can be transferred to an electron from the incident light is hν, and hence the energy of the emitted electrons depends on the frequency of the light.
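
In symbols (the standard textbook relations, with W denoting the work function of the metal, a quantity not named above):

\[ E = h\nu, \qquad E_{\mathrm{kinetic}} = h\nu - W, \]

so the maximum kinetic energy of an emitted electron depends only on the frequency \(\nu\) of the incident light, not on its intensity.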

Ernest Rutherford’s solution to the third puzzle in 1911 was to posit that the positive charge in the atom is concentrated in a small nucleus with enough mass to reflect an alpha particle that collides with it. According to Niels Bohr’s 1913 elaboration of this model, the electrons orbit this nucleus, but only certain energies for these orbital electrons are allowed. Again, energy is quantized. The model has the additional benefit of explaining the spectrum of light emitted from excited atoms; since only certain energies are allowed, only certain wavelengths of light are possible when electrons jump between these levels, and this explains why the spectrum of the light consists of discrete wavelengths rather than a continuum of possible wavelengths.
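
The discreteness of the spectrum can be expressed compactly (again, the standard textbook relation): a photon emitted when an electron drops from a level of energy \(E_{m}\) to a lower level \(E_{n}\) has frequency

\[ h\nu = E_{m} - E_{n}, \]

so only the wavelengths corresponding to allowed pairs of energy levels appear in the emission spectrum.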

But the quantization of energy raises as many questions as it answers. Among them: Why are only certain energies allowed? What prevents the electrons in an atom from losing energy continuously and spiraling in towards the nucleus, as classical physics predicts? In 1924 Louis de Broglie suggested that electrons are wave-like rather than particle-like, and that the reason only certain electron energies are allowed is that energy is a function of wavelength, and only wavelengths that fit a whole number of times around the electron’s orbit are allowed. By 1926 Erwin Schrödinger had developed an equation governing the dynamical behavior of these matter waves, and quantum mechanics was born.
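
De Broglie’s proposal and Schrödinger’s equation can both be stated briefly (standard textbook forms, not quotations from the historical papers): a particle of momentum \(p\) is associated with a wavelength

\[ \lambda = \frac{h}{p}, \]

and requiring a whole number \(n\) of wavelengths to fit around a circular orbit of radius \(r\), so that \(n\lambda = 2\pi r\), yields the quantization of angular momentum, \(mvr = n\hbar\), and with it the discrete electron energies of the Bohr model. The matter waves themselves evolve according to the Schrödinger equation

\[ i\hbar \frac{\partial \psi}{\partial t} = \hat{H}\psi, \]

where \(\hat{H}\) is the Hamiltonian (energy) operator for the system.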

This theory has been astonishingly successful. Within a year of Schrödinger’s formulation, Clinton Davisson and Lester Germer demonstrated that electrons exhibit interference effects just like light waves—that when electrons are bounced off the regularly-arranged atoms of a crystal, their waves reinforce each other in some directions and cancel out in others, leading to more electrons being detected in some directions than others. This success has continued. Quantum mechanics (in the form of quantum electrodynamics) correctly predicts the magnetic moment of the electron to an accuracy of about one part in a trillion, making it the most accurate theory in the history of science. And so far its predictive track record is perfect: no data contradicts it.

But on a descriptive and explanatory level, the theory of quantum mechanics is less than satisfactory. Typically when a new theory is introduced, its proponents are clear about the physical ontology presupposed—the kind of objects governed by the theory. Superficially, quantum mechanics is no different, since it governs the evolution of waves through space. But there are at least two reasons why taking these waves as genuine physical entities is problematic.

First, although in the case of electron interference the number of electrons arriving at a particular location can be explained in terms of the propagation of waves through the apparatus, each electron is detected as a particle with a precise location, not as a spread-out wave. As Max Born noticed in 1926, the intensity (squared amplitude) of the quantum wave at a location gives the probability that the particle is located there; this is the Born rule for assigning probabilities to measurement outcomes. The second reason to doubt the reality of quantum waves is that the quantum waves do not propagate through ordinary three-dimensional space, but through a space of 3n dimensions, where n is the number of particles in the system concerned. Hence it is not at all clear that the underlying ontology is genuinely one of waves propagating through space. Indeed, the standard terminology is to call the quantum mechanical representation of the state of a system a wavefunction rather than a wave, perhaps indicating a lack of metaphysical commitment: the mathematical function that represents a system has the form of a wave, even if it does not actually represent a wave.
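
Both points can be put in symbols (standard formulations, with the notation chosen here for illustration): the Born rule says that the probability (density) of finding the particle at location \(x\) is

\[ P(x) = |\psi(x)|^{2}, \]

and for a system of n particles the wavefunction is a function \(\psi(x_{1}, x_{2}, \ldots, x_{n})\) of all n particle positions, that is, a function on a 3n-dimensional configuration space rather than on ordinary three-dimensional space.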

So quantum mechanics is a phenomenally successful theory, but it is not at all clear what, if anything, it tells us about the underlying nature of the physical world. Quantum mechanics, perhaps uniquely among physical theories, stands in need of an interpretation to tell us what it means. Four kinds of interpretation are described in detail below (and some others more briefly). The first two—the Copenhagen interpretation and the many-worlds interpretation—take standard quantum mechanics as their starting point. The third and fourth—hidden variable theories and spontaneous collapse theories—start by modifying the theory of quantum mechanics, and hence are perhaps better described as proposals for replacing quantum mechanics with a closely related theory.

2. The Copenhagen Interpretation

The earliest consensus concerning the meaning of quantum mechanics formed around the work of Niels Bohr and Werner Heisenberg in Copenhagen during the 1920s, and hence became known as the Copenhagen interpretation. Bohr’s position is that our conception of the world is necessarily classical; we think of the world in terms of objects (for example, waves or particles) moving through three-dimensional space, and this is the only way we can think of it. Quantum mechanics doesn’t permit such a conceptualization, either in terms of waves or particles, and so the quantum world is in principle unknowable by us. Quantum mechanics shouldn’t be taken as a description of the quantum world, and neither should the evolution of the quantum state over time be taken as a causal explanation of the phenomena we observe. Rather, quantum mechanics is an extremely effective tool for predicting measurement results that takes the configuration of the measuring apparatus (described classically) as input, and produces probabilities for the possible measurement outcomes (described classically) as output.

It is sometimes claimed that the Copenhagen interpretation is a product of the logical positivism that flourished in Europe during the same period. The logical positivists held that the meaningful content of a scientific theory is exhausted by its empirical predictions; any further speculation into the nature of the world that produces these measurement outcomes is quite literally meaningless. This certainly has some resonances with the Copenhagen interpretation, particularly as described by Heisenberg. But Bohr’s views are importantly different from Heisenberg’s, and are more Kantian than positivist. Bohr is happy to say that the micro-world exists, and that it can’t be conceived of in causal terms, both of which would be meaningless claims according to positivist scruples. However, Bohr thinks we can say little else about the micro-world. Bohr, like Kant, thinks that we can only conceive of things in certain ways, and that the world as it is in itself is not amenable to such conceptualization. If this is correct, it is inevitable that our fundamental physical theories are unable to describe the world as it is, and the fact that we can make no sense of quantum mechanics as a description of the world should not concern us.

Unless one is convinced of Kant’s position concerning our conceptual access to the world, one may not find Bohr’s pronouncements concerning what we can conceive compelling. However, the motivation for adopting a Copenhagen-style interpretation can be made independent of any overarching philosophical position. Since the intensity of the wavefunction at a location gives the probability of the particle occupying that location, it is natural to regard the wavefunction as a reflection of our knowledge of the system rather than a description of the system itself. This view, held by Einstein, suggests that quantum mechanics is incomplete, since it gives us only an instrumental recipe for calculating the probabilities of outcomes, rather than a description of the underlying state of the system that gives rise to those probabilities. But it was later proved (as we shall see) that given certain plausible assumptions, it is impossible to construct such a description of the underlying state. Bohr did not know at the time that Einstein’s task was impossible, but its evident difficulty provides some motivation for regarding the quantum world as inscrutable.

However, the Copenhagen interpretation has at least two major drawbacks. First, a good deal of the early evidence for quantum mechanics comes from its ability to explain the results of interference experiments involving particles like electrons. Bohr’s insistence that quantum mechanics is not descriptive takes away this explanation (although, of course, viewing the wavefunction as descriptive only of our knowledge does no better). Second, Bohr’s position requires a “cut” between the macroscopic world described by classical concepts and the microscopic world subsumed under (but not described by) quantum mechanics. Since macroscopic objects are made out of microscopic components, it looks like macroscopic objects must obey the laws of quantum mechanics too; there can be no such “cut”, either sharp or vague, delimiting the realm of applicability of quantum mechanics.

3. The Many-Worlds Interpretation

In 1957 Hugh Everett proposed a radically new way of interpreting the quantum state. His proposal was to take quantum mechanics as descriptive and universal; the quantum state is a genuine description of the physical system concerned, and macroscopic systems are just as well described in this way as microscopic ones. This immediately solves both the above problems; there is no “cut” between the micro and macro worlds, and the explanation of particle interference in terms of waves is retained.

An immediate problem facing such a realist interpretation of the quantum state is the provenance of the outcomes of quantum measurements. Recall that in the case of electron interference, what is detected is not a spread-out wave, but a particle with a well-defined location, where the wavefunction intensity at a location gives the probability that the particle is located there.

How does Everett account for these facts? What he suggests is that we model the measurement process itself quantum mechanically. It is by no means uncontroversial that measuring devices and human observers admit of a quantum mechanical description, but given the assumption that quantum mechanics applies to all material objects, such a description ought to be available at least in principle. So consider for simplicity the situation in which the wavefunction intensity for the electron at the end of the experiment is non-zero in only two regions of space, A and B. The detectors at these locations can be modeled using a wavefunction too, with the result that the electron wavefunction component at A triggers a corresponding change in the wavefunction of the A-detector, and similarly at B. In the same way, we can model the experimenter who observes the detectors using a wavefunction, with the result that the change in the wavefunction of the A-detector causes a change in the wavefunction of the observer corresponding to seeing that the A-detector has fired, and the change in the wavefunction of the B-detector causes a change in the wavefunction of the observer corresponding to seeing that the B-detector has fired. The observer’s final state, then, is modeled by two distinct wave structures superposed, much in the way two images are superposed in a double-exposure photograph.
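Written schematically in Dirac notation, the chain of correlations just described produces a final state of the following form; the labels and the amplitudes α and β are illustrative placeholders rather than Everett’s own notation.

```latex
% Before measurement: the electron is in a superposition of regions A and B;
% detector and observer are in their "ready" states.
\[ |\psi_{\mathrm{initial}}\rangle =
   \bigl(\alpha\,|A\rangle_{e} + \beta\,|B\rangle_{e}\bigr) \otimes
   |\mathrm{ready}\rangle_{\mathrm{det}} \otimes |\mathrm{ready}\rangle_{\mathrm{obs}} \]
% Linear (Schrödinger) evolution of the whole system then yields a
% superposition of two complete measurement records:
\[ |\psi_{\mathrm{final}}\rangle =
   \alpha\,|A\rangle_{e}\,|\mathrm{A\ fired}\rangle_{\mathrm{det}}\,|\mathrm{sees\ A}\rangle_{\mathrm{obs}}
   + \beta\,|B\rangle_{e}\,|\mathrm{B\ fired}\rangle_{\mathrm{det}}\,|\mathrm{sees\ B}\rangle_{\mathrm{obs}} \]
```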

In sum, the wave structure of the electron-detector-observer system consists of two distinct branches, the A-outcome branch and the B-outcome branch. Since these two branches are relatively causally isolated from each other, we can describe them as two distinct worlds, in one of which the electron hits the detector at A and the observer sees the A-detector fire, and in the other of which the electron hits the detector at B and the observer sees the B-detector fire. This talk of worlds needs to be treated carefully, though; there is just one physical world, described by the quantum state, but because observers (along with all other physical objects) exhibit this branching structure, it is as if the world is constantly splitting into multiple copies. It is not clear whether Everett himself endorsed this talk of worlds, but this is the understanding of his work that has become canonical; call it the many-worlds interpretation.

According to the many-worlds interpretation, then, every physically possible outcome of a measurement actually occurs in some branch of the quantum state, but as an inhabitant of a particular branch of the state, a particular observer only sees one outcome. This explains why, in the electron interference experiment, the outcome looks like a discrete particle even though the object that passes through the interference device is a wave; each point in the wave generates its own branch of reality when it hits the detectors, so from within each of the resulting branches it looks like the incoming object was a particle.

The main advantage of the many-worlds interpretation is that it is a realist interpretation that takes the physics of standard quantum mechanics literally. It is often met with incredulity, since it entails that people (along with other objects) are constantly branching into innumerable copies, but this by itself is no argument against it. Still, the branching of people leads to philosophical difficulties concerning identity and probability, and these (particularly the latter) constitute genuine difficulties facing the approach.

The problem of identity is a philosophically familiar one: if a person splits into two copies, then the copies can’t be identical to (that is, the same person as) the original person, or else they would be identical to (the same person as) each other. Various solutions have been developed in the literature. One might follow Derek Parfit and bite the bullet here: what fission cases like this show is that strict identity is not a useful concept for describing the relationship between people and their successors. Or one might follow David Lewis and rescue strict identity by stipulating that a person is a four-dimensional history rather than a three dimensional object. According to this picture, there are two people (two complete histories) present both before and after the fission event; they initially overlap but later diverge. Identity over time is preserved, since each of the pre-split people is identical with exactly one of the post-split people. Both of these positions have been proposed as potential solutions to the problem of personal identity in a many-worlds universe. A third solution that is sometimes mentioned is to stipulate that a person is the whole of the branching entity, so that the pre-split person is identical to both her successors, and (despite our initial intuition otherwise) the successors are identical to each other.

So the problem of identity admits of a number of possible solutions, and the only question is how one should try to decide between them. Indeed, one might argue that there is no need to decide between them, since the choice is a pragmatic one about the most useful language to use to describe branching persons.

The problem of probability, though, is potentially more serious. As noted above, quantum mechanics makes its predictions in the form of probabilities: the square of the wavefunction amplitude in a region tells us the probability of the particle being located there. The striking agreement of the observed distribution of outcomes with these probabilities is what underwrites our confidence in quantum mechanics. But according to the many-worlds interpretation, every outcome of a measurement actually occurs in some branch of reality, and the well-informed observer knows this. It is hard to see how to square this with the concept of probability; at first glance, it looks like every outcome has probability 1, both objectively and epistemically. In particular, if a measurement results in two branches, one with a large squared amplitude and one with a small squared amplitude, it is hard to see why we should regard the former as more probable than the latter. But unless we can do so, the empirical success of quantum mechanics evaporates.

It is worth noting, however, that the foundations of probability are poorly understood. When we roll two dice, the chance of rolling 7 is higher than the chance of rolling 12. But there is no consensus concerning the meaning of chance claims, or concerning why the higher chance of 7 should constrain our expectations or behavior. So perhaps a quantum branching world is in no worse shape than a classical linear world when it comes to understanding probability. We may not understand how squared wavefunction amplitude could function as chance in guiding our expectations, but perhaps that is no barrier to postulating that it does so function.

A more positive approach has been developed by David Deutsch and David Wallace, who argue that, given some plausible constraints on rational behavior, rational individuals should behave as if squared wavefunction amplitudes are chances. If one combines this with a functionalist attitude towards chance—that whatever functions as chance in guiding behavior is chance—then this program promises to underwrite the contention that squared wave amplitudes are chances. However, the assumptions on which the Deutsch-Wallace argument is based can be challenged. In particular, they assume that it is irrational to care about branching per se: having two successors experiencing a given outcome is neither better nor worse than having one successor experiencing that outcome. But it is not clear that this is a matter of rationality any more than the question of whether having several happy children is better than having one happy child.

A further worry about the many-worlds theory that has been largely put to rest concerns the ontological status of the worlds. It has been argued that the postulation of many worlds is ontologically profligate. However, the current consensus is that worlds are emergent entities just like tables and chairs, and talk of worlds is just a convenient way of talking about the features of the quantum state. On this view, the many-worlds interpretation involves no entities over and above those represented by the quantum state, and as such is ontologically parsimonious. There remains the residual worry that the number of branches depends sensitively on mathematical choices about how to represent the quantum state. Wallace, however, embraces this indeterminacy, arguing that even though the many-worlds universe is a branching one, there is no well-defined number of branches that it has. If tenable, this goes some way towards resolving the above concern about the rationality of caring about branching per se: if there is no number of branches, then it is irrational to care about it.

4. Hidden Variable Theories

The many-worlds interpretation would have us believe that we are mistaken when we think that a quantum measurement results in a unique outcome; in fact such a measurement results in multiple outcomes occurring on multiple branches of reality. But perhaps that is too much to swallow, or perhaps the problems concerning identity and probability mentioned above are insuperable. In that case, one is led to the conclusion that quantum mechanics is incomplete, since there is nothing in the quantum state that picks out one of the many possible measurement results as the single actual measurement result. As mentioned above, this was Einstein’s view. If this view is correct, then quantum mechanics stands in need of completion via the addition of extra variables describing the actual state of the world. These additional variables are commonly known as hidden variables.

However, a theorem proved by John Bell in 1964 shows that, subject to certain plausible assumptions, no such hidden-variable completion of quantum mechanics is possible. One version of the proof concerns the properties of a pair of particles. Each particle has a property called spin: when the spin of the particle is measured in some direction, one either gets the result up or down. Suppose that the spin of each particle can be measured along one of three directions 120° apart. What quantum mechanics predicts is that if the spins of the particles are measured along the same direction, they always agree (both up or both down), but if they are measured along different directions they agree 25% of the time and disagree 75% of the time. According to the hidden variable approach, the particles have determinate spin values for each of the three measurement directions prior to measurement. The question is how to ascribe spin values to particles to reproduce the predictions of quantum mechanics. And what Bell proved is that there is no way to do this; the task is impossible.
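The impossibility can be checked by brute force. The sketch below is a toy illustration rather than Bell’s proof: it assumes the simplest local hidden-variable model, in which both particles carry the same pre-assigned answer table for the three directions (as perfect same-direction agreement requires), and shows that no table yields a different-direction agreement rate as low as the quantum-mechanical 25%.

```python
from itertools import product

directions = ["a", "b", "c"]  # three measurement directions, 120 degrees apart

# Perfect same-direction agreement forces both particles to share one answer
# table: a fixed up/down value for each direction. Enumerate all such tables.
best_agreement = None
for table in product(["up", "down"], repeat=3):
    answers = dict(zip(directions, table))
    # Agreement rate when the two sides measure *different* directions,
    # averaged over the six ordered pairs of distinct directions.
    pairs = [(x, y) for x in directions for y in directions if x != y]
    agree = sum(answers[x] == answers[y] for x, y in pairs) / len(pairs)
    best_agreement = agree if best_agreement is None else min(best_agreement, agree)
    print(table, "different-direction agreement:", round(agree, 3))

print("minimum achievable agreement:", round(best_agreement, 3))  # 1/3
print("quantum-mechanical prediction:", 0.25)
```

Since 1/3 is the lowest agreement rate any such assignment can achieve, no pre-assignment of this simple kind reproduces the quantum prediction of 1/4.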

Many physicists concluded on the basis of Bell’s theorem that no hidden-variable completion of quantum mechanics is possible. However, this was not Bell’s conclusion. Bell concluded instead that one of the assumptions he relied on in his proof must be false. First, Bell assumed locality—that the result of a measurement performed on one particle cannot influence the properties of the other particle. This seems secure because the measurements on the two particles can be widely separated, so that a signal carrying such an influence would have to travel faster than light. Second, Bell assumed independence—that the properties of the particles are independent of which measurements will be performed on them. This assumption too seems secure, because the choice of measurement can be made using a randomizing device or the free will of the experimenter.

Despite the apparent security of his assumptions, Bell knew when he proved his theorem that a hidden-variable completion of quantum mechanics had been explicitly constructed by David Bohm in 1952. Bohm assumed that in addition to the wave described by the quantum state, there is also a set of particles whose positions are given by the hidden variables. The wave pushes the particles around according to a new dynamical law formulated by Bohm, and the law is such that if the particle positions are initially statistically distributed according to the squared amplitude of the wave, then they are always distributed in this way. In an electron interference experiment, then, the existence of the wave explains the interference effect, the existence of the particles explains why each electron is observed at a precise location, and the new Bohmian law explains why the probability of observing an electron at a given location is given by the squared amplitude of the wave. As Bell often pointed out, to call Bohm’s theory a hidden variable theory is something of a misnomer, since it is the values of the hidden variables—the positions of the particles—that are directly observed on measurement. Nevertheless, the name has stuck.
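For a single particle, Bohm’s dynamical law can be written in the following standard form (the so-called guidance equation); in the many-particle case each particle’s velocity depends on the instantaneous positions of all the others, which is the source of the non-locality discussed next.

```latex
% Guidance equation for a single particle of mass m: the position Q(t) is
% pushed around by the wave \psi(x,t), which itself obeys the Schrödinger equation.
\[ \frac{dQ}{dt} = \frac{\hbar}{m}\,
   \operatorname{Im}\!\left[\frac{\nabla\psi(x,t)}{\psi(x,t)}\right]_{x=Q(t)} \]
% Equivariance: if the particle positions are initially distributed with
% density |\psi|^2, the guidance equation keeps them so distributed.
```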

Bohm’s theory, then, provides a concrete example of a hidden variable theory of quantum mechanics. However, it is not a counterexample to Bell’s theorem, because it violates Bell’s locality assumption. The new law introduced by Bohm is explicitly non-local: the motion of each particle is determined in part by the positions of all the other particles at that instant. In the case of Bell’s spin experiment, a measurement on one particle instantaneously affects the motion of the other particle, even if the particles are widely separated. This is a prima facie violation of special relativity, since according to special relativity simultaneity is dependent on one’s choice of coordinates, making it impossible to define “instantaneous” in any objective way. However, this does not mean that Bohm’s theory is immediately refuted by special relativity, since one can instead take Bohm’s theory to show the need to add a universal standard of simultaneity to special relativity. Bell recognized this possibility. It is worth noting that even though Bohm’s theory requires instantaneous action at a distance, it also prevents these influences from being controlled so as to send a signal; there is no “Bell telephone”.

Bohm chooses positions as the properties described by the hidden variables of his theory. His reason for this is that it is plausible that it is the positions of things that we directly observe, and hence completing quantum mechanics via positions suffices to ensure that measurements have unique outcomes. But it is possible to construct measurements in which the outcome is recorded in some property other than position. As a response to this possibility, one might suggest adding hidden variables describing every property of the particles simultaneously, rather than just their positions. However, a theorem proved by Kochen and Specker in 1967 shows that no such theory can reproduce the predictions of quantum mechanics. A second response is to stick with Bohm’s theory as it is, and argue that while such measurements may initially lack a unique outcome, they will rapidly acquire a unique outcome as the recording device becomes correlated with the positions of the surrounding objects in the environment.

A final way to accommodate such measurements within a hidden variable theory is to make it a contingent matter which properties of a system are ascribed determinate values at a particular time. That is, rather than supplementing the wavefunction with variables describing a fixed property (the positions of things), one can let the wavefunction state itself determine which properties of the system are described by the hidden variables at that time. The idea is that the algorithm for ascribing hidden variables to a system is such that whenever a measurement is performed, the algorithm ascribes a determinate value to the property recording the outcome of the measurement. Such theories are known as modal theories. But while Bohm’s theory provides an explicit dynamical law describing the motion of the particles over time, modal theories generally do not provide a dynamical law governing their hidden variables, and this is regarded as a weakness of the approach.

Modal theories, like Bohm’s theory, evade Bell’s theorem by violating Bell’s locality assumption. In the modal case, the rule for deciding which properties of the system are made determinate depends on the complete wavefunction state at a particular instant, and this allows a measurement on one particle to affect the properties ascribed to another particle, however distant. As mentioned above, one can solve this problem by supplementing special relativity with a preferred standard of simultaneity. But this is widely regarded as an ad hoc and unwarranted addition to an otherwise elegant and well-confirmed physical theory. Indeed, the same charge is often levelled at the hidden variables themselves; they are an ad hoc and unwarranted addition to quantum mechanics. If hidden variable theories turn out to be the only viable interpretations of quantum mechanics, though, the force of this charge is reduced considerably.

Nevertheless, it may be possible to construct a hidden variable theory that does not violate locality. In order to evade Bell’s theorem, then, it will have to violate the independence assumption—the assumption that the properties of the particles are independent of which measurements will be performed on them. Since one can choose the measurements however one likes, it is initially hard to see how this assumption could be violated. But there are a couple of ways it might be done. First, one could simply accept that there are brute, uncaused correlations in the world. There is no causal link (in either direction) between my choice of which measurement to perform on a (currently distant) particle and its properties, but nevertheless there is a correlation between them. This approach requires giving up on the common cause principle—the principle that a correlation between two events indicates either that one causes the other or that they share a cause. However, there is little consensus concerning this principle anyway.

A second approach is to postulate a common cause for the correlation—a past event that causally influences both the choice of measurement and the properties of the particle. But absent some massive unseen conspiracy on the part of the universe, one can frequently ensure that there is no common cause in the past by isolating the measuring device from external influences. However, the measuring device and the particle to be measured will certainly interact in the future, namely when the measurement occurs. It has been proposed that this future event can constitute the causal link explaining the correlation between the particle properties and the measurements to be performed on them. This requires that later events can cause earlier events—that causation can operate backwards in time as well as forwards in time. For this reason, the approach is known as the retrocausal approach.

The retrocausal approach allows correlations between distant events to be explained without instantaneous action at a distance, since a combination of ordinary causal links and retrocausal links can amount to a causal chain that carries an influence between simultaneous distant events. No absolute standard of simultaneity is required by such explanations, and hence retrocausal hidden variable theories are more easily reconciled with special relativity than non-local hidden variable theories.

Bohm’s theory operates with a two-element ontology—a wave steering a set of particles. Retrocausal theories vary in their ontological presuppositions. Some—retrocausal Bohmian theories—incorporate two waves steering a set of particles; one wave carries the “forward-causal” influences on the particles from the initial state of the system, and the other carries the “backward-causal” influences on the particles from the final state of the system. But it may be possible to make do with the particles alone, with the wavefunction representing our knowledge of the particle positions rather than the state of a real object. The idea is that the interaction between the causal influences on the particles from the past and from the future can explain all the quantum phenomena we observe, including interference. However, at present this is just a promising research program; no explicit dynamical laws for such a theory have been formulated.

5. Spontaneous Collapse Theories

Hidden variable theories attempt to complete quantum mechanics by positing extra ontology in addition to (or perhaps instead of) the wavefunction. Spontaneous collapse theories, on the other hand, (at least initially) take the wavefunction to be a complete representation of the state of a system, and posit instead that the dynamical law of standard quantum mechanics—the Schrödinger equation—is not exactly right. The Schrödinger equation is linear; this means that if initial state A leads to final state A’ and initial state B leads to final state B’, then initial state A + B leads to final state A’ + B’. For example, if a measuring device fed a spin-up particle leads to a spin-up reading, and a measuring device fed a spin-down particle leads to a spin-down reading, then a measuring device fed a particle whose state is a sum of spin-up and spin-down states will end up in a state which is a sum of reading spin-up and reading spin-down. This is the multiplicity of measurement outcomes embraced by the many-worlds interpretation.
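Linearity here is just the mathematical fact that the Schrödinger equation is linear in the wavefunction, so that the evolution it generates distributes over sums of states; the notation below is standard and carries no interpretive commitments.

```latex
% The Schrödinger equation is linear in \psi:
\[ i\hbar\,\frac{\partial\psi}{\partial t} = \hat{H}\psi \]
% so the time-evolution map U(t) it generates distributes over sums of states:
\[ U(t)\bigl(\psi_{A} + \psi_{B}\bigr) = U(t)\,\psi_{A} + U(t)\,\psi_{B} \]
% Applied to measurement: a device fed a sum of spin-up and spin-down inputs
% ends up in a sum of "reads spin-up" and "reads spin-down" states.
```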

To avoid sums of distinct measurement outcomes, one needs to modify the basic dynamical equation of quantum mechanics so that it is non-linear. The first proposal along these lines was made by Gian Carlo Ghirardi, Alberto Rimini, and Tullio Weber in 1986; it has become known as the GRW theory. The GRW theory adds an irreducibly probabilistic “collapse” term to the otherwise deterministic Schrödinger dynamics. In particular, for each particle in a system there is a small chance per unit time of the wavefunction undergoing a process in which it is instantly and discontinuously localized in the coordinates of that particle. The localization process multiplies the wave state by a narrow Gaussian (bell curve), so that if the wave was initially spread out in the coordinates of the particle in question, it ends up concentrated around a particular point. The point on which this collapse process is centered is random, with a probability distribution given by the square of the pre-collapse wave amplitude (averaged over the Gaussian collapse curve).
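A toy one-dimensional sketch of a single GRW “hit” is given below. The grid, the collapse width, and the initial wave packet are arbitrary illustrative choices, and the smearing convention is simplified; only the qualitative features matter: the collapse centre is drawn at random with squared-amplitude-weighted probability, the wave is multiplied by a narrow Gaussian centred there, and the result is renormalized.

```python
import numpy as np

rng = np.random.default_rng(0)

# A crude one-dimensional grid and a broad (spread-out) initial wavefunction.
x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]
psi = np.exp(-x**2 / 50.0).astype(complex)
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)   # normalize

sigma = 0.5  # width of the collapse Gaussian (illustrative value only)

# Probability density for the collapse centre: squared amplitude of the
# pre-collapse wave, smeared over the collapse Gaussian.
smeared = np.array([
    np.sum(np.abs(psi)**2 * np.exp(-(x - c)**2 / (2 * sigma**2))) * dx
    for c in x
])
centre = rng.choice(x, p=smeared / smeared.sum())

# The hit: multiply the wave by a narrow Gaussian centred at the chosen
# point, then renormalize.
psi_hit = psi * np.exp(-(x - centre)**2 / (4 * sigma**2))
psi_hit /= np.sqrt(np.sum(np.abs(psi_hit)**2) * dx)

spread_before = np.sqrt(np.sum(x**2 * np.abs(psi)**2) * dx)
spread_after = np.sqrt(np.sum((x - centre)**2 * np.abs(psi_hit)**2) * dx)
print(f"collapse centred at {centre:.2f}")
print(f"spread before hit: {spread_before:.2f}, after hit: {spread_after:.2f}")
```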

The way this works is as follows. The collapse rate for a single particle is very low—about one collapse per hundred million years. So for individual particles (and systems consisting of small numbers of individual particles), we should expect that they obey the Schrödinger equation. And this is exactly what we observe; there are no known exceptions to the Schrödinger equation at the microscopic level. But macroscopic objects contain on the order of a trillion trillion particles, so we should expect about ten million collapses per second for such an object. Furthermore, in solid objects the positions of those particles are strongly correlated with each other, so a collapse in the coordinates of any particle in the object has the effect of localizing the wavefunction in the coordinates of every particle in the object. This means that if the wavefunction of a macroscopic object is spread over a number of distinct locations, it very quickly collapses to a state in which its wavefunction is highly localized around one location.

In the case of electron interference, then, each electron passes through the apparatus in the form of a spread-out wave. The collapse process is vanishingly unlikely to affect this wave, which is important, as its spread-out nature is essential to the explanation of interference: wave components traveling distinct paths must be able to come together and either reinforce each other or cancel each other out. But when the electron is detected, its position is indicated by something we can directly observe, for example, by the location of a macroscopic pointer. To measure the location of the electron, then, the position of the pointer must become correlated with the position of the electron. Since the wave representing the electron is spread out, the wave representing the pointer will initially be spread out too. But within a fraction of a second, the spontaneous collapse process will localize the pointer (and the electron) to a well-defined position, producing the unique measurement outcome we observe.

The spontaneous collapse approach is related to earlier proposals (for example, by John von Neumann) that the measurement process itself causes the collapse that reduces the multitude of pre-measurement wave branches to the single observed outcome. However, unlike previous proposals, it provides a physical mechanism for the collapse process in the form of a deviation from the standard Schrödinger dynamics. This mechanism is crucial; without it, as we have seen, there is no way for the measurement process to generate a unique outcome.

Note that, unlike in Bohm’s theory, there are no particles at the fundamental level in the GRW theory. In the electron interference case, particle behavior emerges during measurement; the measured system exhibits only wave-like behavior prior to measurement. Strictly speaking, to say that a system contains n particles is just to say that its wave representation has 3n dimensions, and to single out one of those particles is really just to focus attention on the form of the wave in three of those dimensions.

An immediate difficulty that faces the GRW theory is that the localization of the wave induced by collapse is not perfect. The collapse process multiplies the wave by a Gaussian, a function which is strongly peaked around its center but which is non-zero everywhere. No part of the pre-collapse wavefunction is driven to zero by this process; if the wavefunction represents a set of possible measurement results, the wave component corresponding to one result becomes large and the wave components corresponding to the others become small, but they do not disappear. Since one motivation for adopting a spontaneous collapse theory is the perceived failure of the many-worlds interpretation to recover probability claims, it cannot be argued that the small terms are intrinsically improbable. Instead, it looks like the GRW spontaneous collapse process fails to ensure that measurements have unique outcomes.

A second difficulty with the GRW theory is that the wavefunction is not an object in a three-dimensional space, but an object occupying a high-dimensional space with three dimensions for each “particle” in the system concerned. David Albert has argued that this makes the three-dimensional world of experience illusory.

A third difficulty with the GRW theory is that the collapse process acts instantaneously on spatially separated parts of the system; it instantly multiplies the wavefunction everywhere by a Gaussian. Like Bohm’s theory, the GRW theory violates Bell’s locality assumption, since a measurement performed on one particle can instantaneously affect the state of a distant particle (although in the case of the GRW theory talk of “particles” has to be cashed out in terms of the coordinates of the wavefunction). As discussed in relation to Bohm’s theory, this requires an objective conception of simultaneity that is absent from special relativity, and hence it is hard to see how to reconcile the GRW theory with relativity.

One way of responding to these difficulties, advocated by Ghirardi, is to postulate a three-dimensional mass distribution in addition to and determined by the wavefunction, such that our experience is determined directly by the mass distribution rather than the wavefunction. This responds to the second difficulty, since the mass distribution that we directly experience is three-dimensional, and hence our experience of a three-dimensional world is veridical. It may also go some way towards resolving the first difficulty, since the mass density corresponding to non-actual measurement outcomes is likely to be negligible relative to the background mass density surrounding the actual measurement outcome (the mass density of air, for example). Ghirardi’s mass density is not intended to address the third difficulty; this requires modifying the collapse process itself, and several proposals for constructing a relativistic collapse process based on the GRW theory have been developed.

An alternative approach to the difficulties facing the GRW theory is to adapt a suggestion made by John Bell that the center of each collapse event should be regarded as a “flash of determinacy” out of which everyday objects and everyday experience are built. Roderich Tumulka has developed this suggestion into a “flashy” spontaneous collapse theory, in which the wavefunction is regarded instrumentally as that which connects the distribution of flashes at one time with the probability distribution of flashes at a later time. On this proposal, the small wave terms corresponding to non-actual measurement outcomes can be understood in a straightforwardly probabilistic way: there is only a small chance that a flash will be associated with such a term, and so only a small chance that the non-actual measurement outcome will be realized. The flashes are located in three-dimensional space, so there is no worry that three-dimensionality is an illusion. And since the flashes, unlike the wavefunction, are located at space-time points, it is easier to envision a reconciliation between the flashy theory and special relativity.

6. Other Interpretations

There are several other interpretations of quantum mechanics available that don’t fit neatly into one of the categories discussed above. Here are some prominent ones.

The consistent histories (or decoherent histories) interpretation developed by Robert Griffiths, Murray Gell-Mann and James Hartle, and defended by Roland Omnès, is mathematically something of a hybrid between collapse theories and hidden variable theories. Like spontaneous collapse theories, the consistent histories approach incorporates successive localizations of the wavefunction. But unlike spontaneous collapse theories, these localizations are not regarded as physical events, but just as a means of picking out a particular history of the system in question as actual, much as hidden variables pick out a particular history as actual. If the localizations all constrain the position of a particle, then the history picked out resembles a Bohmian trajectory. But the consistent histories approach also allows localizations to constrain properties other than position, resulting in a more general class of possible histories.

However, not all such sets of histories can be ascribed consistent probabilities: notably, interference effects often prevent the assignment of probabilities obeying the standard axioms to histories. But for systems that interact strongly with their environment, interference effects are rapidly suppressed; this phenomenon is called decoherence. Decoherent histories can be ascribed consistent probabilities—hence the two alternative names of this approach. It is assumed that only consistent sets of histories can describe the world, but other than this consistency requirement, there is no restriction on the kinds of histories that are allowed. Indeed, Griffiths maintains that there is no unique set of possible histories: there are many ways of constructing sets of possible histories, where one among each set is actual, even if the alternative actualities so produced describe the world in mutually incompatible ways. Absent a many-worlds ontology, however, some have worried about how such a plurality of true descriptions of the world could be coherent. Gell-Mann and Hartle respond to such concerns by arguing that organisms evolve to exploit the relative predictability of one among the competing sets of histories.
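In a common formulation (stated schematically here; notational conventions vary between Griffiths, Omnès, and Gell-Mann and Hartle), a history is a time-ordered chain of projections, and consistency is the requirement that interference between distinct histories vanish:

```latex
% A history \alpha is represented by a time-ordered chain of projections:
\[ C_{\alpha} = P^{(n)}_{\alpha_{n}}(t_{n}) \cdots P^{(1)}_{\alpha_{1}}(t_{1}) \]
% Decoherence functional for initial state \rho:
\[ D(\alpha,\alpha') = \operatorname{Tr}\bigl[ C_{\alpha}\,\rho\,C_{\alpha'}^{\dagger} \bigr] \]
% Consistency: interference between distinct histories vanishes, and the
% diagonal terms then obey the standard probability axioms.
\[ \operatorname{Re}\,D(\alpha,\alpha') = 0 \quad (\alpha \neq \alpha'),
   \qquad p(\alpha) = D(\alpha,\alpha) \]
```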

The transactional interpretation, initially developed by John Cramer, also incorporates elements of both collapse and hidden variable approaches. It starts from the observation that some versions of the dynamical equation of quantum mechanics admit wave-like solutions traveling backward in time as well as forward in time. Typically the former solutions are ignored, but the transactional interpretation retains them. Just as in retrocausal hidden variable theories, the backward-travelling waves can transmit information about the measurements to be performed on a system, and hence allow the transactional interpretation to evade the conclusion of Bell’s theorem.

The transactional interpretation posits rules according to which the backward and forward waves generate “transactions” between preparation events and measurement events, and one of these transactions is taken to represent the actual history of the system in question, where probabilities are assigned to transactions via a version of the Born rule. The formation of a transaction is somewhat reminiscent of the spontaneous collapse of the wavefunction, but due to the retrocausal nature of the theory, one might conclude that the wavefunction never exists in a pre-collapse form, since the completed transaction exists as a timeless element in the history of the universe. Hence some have questioned the extent to which the story involving forwards and backwards waves constitutes a genuine explanation of transaction formation, raising questions about the tenability of the transactional interpretation as a description of the quantum world. Ruth Kastner responds to these challenges by developing a possibilist transactional interpretation, embedding the transactional interpretation in a dynamic picture of time in which multiple future possibilities evolve to a single present actuality.

Relational interpretations, such as those developed by David Mermin and by Carlo Rovelli, take quantum mechanics to be about the relations between systems rather than the properties of the individual systems themselves. According to such an interpretation, there is no need to assign properties to individual particles to explain the correlations exhibited by Bell’s experiment, and hence one can evade Bell’s theorem without violating either locality or independence. Superficially, this approach resembles Everett’s, according to which systems have properties only relative to a given branch of the wavefunction. But whereas Everettians typically say that a relation such as an observer seeing a particular measurement result holds on the basis of the properties of the observer and of the measured system within a branch, Mermin denies that there are such relata; rather, the relation itself is fundamental. Hence this is not a many-worlds interpretation, since world-relative properties provide the relata that relational interpretations deny. Without such relata, though, it is hard to understand relational quantum mechanics as a description of a single world either. However, citing analogies with spatiotemporal properties in relativistic theories, Rovelli insists that it is enough that quantum mechanics ascribe properties to a system relative to the state of a second system (for example, an observer).

Informational interpretations, such as those developed by Jeffrey Bub and by Carlton Caves, Christopher Fuchs and Rüdiger Schack, interpret quantum mechanics as describing constraints on our degrees of belief. They develop rules of quantum credence by analogy with the rules of classical information theory, expressing the difference between quantum systems and classical systems in informational terms, for example in terms of an unavoidable loss of information associated with a quantum measurement. Some proponents of an informational interpretation take an explicitly instrumentalist stance: quantum mechanics is just about the beliefs of observers, treated as external to the quantum systems under consideration. Others take their informational interpretation to be a realist one, in the sense that it can in principle be applied to the whole universe, with “information” serving as a new physical primitive. However, the adequacy of the informational approach as realist can be challenged, for example, on the basis that it does not provide a dynamics for the evolution of the actual state of the world over time. Bub responds that an account of the information-theoretic properties of our measurement results may be the deepest explanation we can hope for.

7. Choosing an Interpretation

Setting aside interpretations such as Copenhagen that eschew describing the quantum world, the interpretations discussed above present us with a number of very different ontological pictures. The many-worlds interpretation tells us that the underlying nature of physical objects is wave-like and branching. Bohm’s theory adds particles to this wave, and some hidden variable theories attempt to do away with the wave as a physical entity. The GRW theory, like the many-worlds interpretation, takes waves as fundamental, but rejects the many-worlds picture of a branching universe. Other spontaneous collapse theories add a mass density distribution to the wave, or replace the wave with point-like flashes. The GRW theory is indeterministic, casting quantum mechanical probabilities as genuine objective chances appearing in the fundamental physical laws. Bohm’s theory is deterministic, since the physical laws involve no chances, making quantum probabilities merely epistemic. The many-worlds interpretation involves no objective chances in the laws, but nevertheless (if successful) casts quantum mechanical probabilities as objective chances grounded in the branching process.

It seems, then, that we have a classic case of underdetermination: while the experimental data strongly confirm quantum mechanics, it is unclear whether those data confirm the metaphysical picture of many-worlds, Bohm, GRW or some other alternative. Since it has been doubted that underdetermination is ever actually manifested in the history of science, this is a striking example.

Nevertheless, the nature and even the existence of this underdetermination can be contested. It is worth noting that spontaneous collapse theories differ in their empirical predictions from standard quantum mechanics; the collapse process destroys interference effects, and the larger the object, the more quickly interference is destroyed and hence the more readily one expects the deviation from standard quantum mechanics to be detectable. At present, the differences between spontaneous collapse theories and standard quantum mechanics are beyond the reach of feasible experiments, since small objects cannot be kept isolated for long enough, and large objects cannot be kept isolated at all. Even so, the empirical underdetermination between spontaneous collapse theories and the other interpretations is not a matter of principle, and may be resolved in favor of one side or the other at some point.

The underdetermination between hidden variable theories and the many-worlds interpretation is of a different character. These two interpretations are empirically equivalent, and hence no experimental evidence could decide between them. It seems that here we have a case of underdetermination in principle. One could try to decide between them on the basis of non-empirical theoretical virtues like simplicity and elegance. On measures like this, the many-worlds interpretation would surely win, since hidden variable theories begin with the mathematical formalism of the many-worlds interpretation and add complicated and arguably ad hoc extra theoretical structure. But judging theories on the basis of extra-theoretical virtues is a controversial endeavor, particularly if we take the winner to be a guide to the metaphysical nature of the world.

Alternatively, it is not unreasonable to think that either the many-worlds interpretation or hidden variable theories could prove to be untenable. As noted above, it is unclear whether the many-worlds interpretation can account for the truth of probability claims, and if it cannot, then it fails to make contact with the empirical evidence. On the other hand, it is unclear whether any hidden variable theory can be made consistent with special relativity (and generalized to cover quantum field theory), and if not, then the hidden variable approach is arguably inadequate.

Some have argued that there is no underdetermination in the interpretation of quantum mechanics, since the many-worlds interpretation alone follows directly from a literal reading of the standard theory of quantum mechanics. It is true that both hidden variable theories and spontaneous collapse theories supplement or modify standard quantum mechanics, so perhaps only the many-worlds interpretation qualifies as an interpretation of standard quantum mechanics rather than a closely related theory. The many-worlds interpretation may be the only reasonable interpretation of quantum mechanics as it stands, and there may be good methodological reasons against modifying successful scientific theories. However, given the possibility that quantum mechanics according to the many-worlds interpretation is not in fact a successful scientific theory (because of the probability problem), it seems reasonable to consider modifications to the standard theory.

Nevertheless, it is certainly true that there may be no underdetermination in quantum mechanics, since it is possible that only one of the interpretations described here will prove to be tenable. Indeed, it is possible that none of these interpretations will prove to be tenable, since all of them face unresolved difficulties. Hence the interpretation of quantum mechanics is still very much an open question.

8. References and Further Reading

  • Albert, David Z. Quantum mechanics and experience. Harvard University Press, 1992.
    • Non-technical overview of the various interpretations of quantum mechanics and their problems.
  • Bell, John Stewart. Speakable and unspeakable in quantum mechanics: Collected papers on quantum philosophy. Cambridge University Press, 2004.
    • A mix of technical and non-technical papers, including the original 1964 proof of Bell’s theorem and discussions of various interpretations of quantum mechanics, especially hidden variable theories.
  • Bohm, David. Quantum theory. Prentice-Hall, 1951.
    • Classic quantum mechanics textbook, with early chapters covering the historical development of the theory.
  • Bohm, David, and Basil J. Hiley. The undivided universe: An ontological interpretation of quantum theory. Routledge, 1993.
    • A guide to Bohm’s theory and its implications by its originator. Technical in parts.
  • Bub, Jeffrey. Bananaworld: Quantum mechanics for primates. Oxford University Press, 2016.
    • Accessible introduction to the phenomena of entanglement, and an extended argument for an informational interpretation of quantum mechanics.
  • Cushing, James T. Quantum mechanics: Historical contingency and the Copenhagen hegemony. University of Chicago Press, 1994.
    • A comparison of the Copenhagen interpretation and Bohm’s theory, and a defense of the view that the former became canonical largely for social reasons.
  • Greaves, Hilary. “Probability in the Everett interpretation.” Philosophy Compass 2.1 (2007): 109-128.
    • Non-technical overview of the attempts to find a place for probability within Everett’s branching universe.
  • Kastner, Ruth. The transactional interpretation of quantum mechanics: The reality of possibility. Cambridge University Press, 2013.
    • Non-technical introduction to the transactional interpretation, and development of a “possibilist” version as a response to objections.
  • Maudlin, Tim. Quantum non-locality and relativity. Blackwell, 1994.
    • Non-technical guide to the problems of reconciling quantum mechanics with relativity.
  • Mermin, N. David. “Quantum mysteries for anyone.” The Journal of Philosophy 78 (1981): 397-408.
    • Non-technical exposition of Bell’s theorem and discussion of its implications.
  • Ney, Alyssa, and David Z. Albert, eds. The wavefunction: Essays on the metaphysics of quantum mechanics. Oxford University Press, 2013.
    • Essays on the ontological status of the wavefunction, including the issue of whether realism about the wavefunction makes the three-dimensional world of experience illusory.
  • Omnès, Roland. Understanding quantum mechanics. Princeton University Press, 1999.
    • Accessible (but in parts moderately technical) defense of the consistent histories approach.
  • Price, Huw. Time’s arrow & Archimedes’ point: New directions for the physics of time. Oxford University Press, 1997.
    • An extended, non-technical defense of the retrocausal hidden variable interpretation of quantum mechanics.
  • Rovelli, Carlo. “Relational quantum mechanics.” International Journal of Theoretical Physics 35 (1996): 1637-1678.
    • Exposition and defense of relational quantum mechanics. Moderately technical in parts.
  • Saunders, Simon, Jonathan Barrett, Adrian Kent, and David Wallace, eds. Many Worlds?: Everett, Quantum Theory, & Reality. Oxford University Press, 2010.
    • A collection of essays on the many-worlds interpretation, for and against, technical and non-technical. Includes an essay by Peter Byrne on the history of Everett’s interpretation.
  • Wallace, David. The emergent multiverse: Quantum theory according to the Everett interpretation. Oxford University Press, 2012.
    • An exposition and defense of the many-worlds interpretation, focusing especially on the issue of probability. Technical in parts.

 

Author Information

Peter J. Lewis
Email: plewis@miami.edu
University of Miami
U. S. A.

Human Dignity

The mercurial concept of human dignity features in ethical, legal, and political discourse as a foundational commitment to human value or human status.  The source of that value, or the nature of that status, is contested.  The normative implications of the concept are also contested, and there are two partially, or even wholly, different deontic conceptions of human dignity implying virtue-based obligations on the one hand, and justice-based rights and principles on the other.  Added to this, the different practical and philosophical presuppositions of law, ethics, and politics mean that definitive adjudication between different meanings is frustrated by disciplinary incommensurabilities.

What follows is an analysis of human dignity’s uses in law, ethics, and politics, and a critical description of the functions and tensions generated by human dignity within these fields. Crucial conceptual and methodological questions arise from the outset regarding whether human dignity can be reconstructed as one concept or must be treated as several concepts. It is argued here that a focal concept of human dignity can be reconstructed and that this concept provides the most illuminating perspective from which to view human dignity’s range of conceptions and uses.

Table of Contents

  1. Introduction
  2. Conceptual Background
  3. Themes
    1. Law
    2. Ethics
      1. General
      2. Philosophical Anthropology
      3. Tensions
    3. Politics
  4. Conceptual Analysis
    1. The Conceptual Features of Human Dignity
    2. The Credibility of an Interstitial Concept
    3. The Implications of an Interstitial Concept
  5. Conclusion
  6. References and Further Reading

1. Introduction

There are a number of competing conceptions of human dignity taking their meaning from the cosmological, anthropological, or political context in which human dignity is used. Human dignity can denote the special elevation of the human species, the special potentiality associated with rational humanity, or the basic entitlements of each individual.  There are, by extension, dramatically different normative uses to which the concept can be put. It is connected, variously, to ideas of sanctity, autonomy, personhood, flourishing, and self-respect, and human dignity produces, at different times, strict prohibitions and empowerment of the individual. It can also, potentially, be used to express the core commitments of liberal political philosophy as well as precisely those duty-based obligations to self and others that communitarian philosophers consider to be systematically neglected by liberal political philosophy.

As a consequence of these antagonistic currents of thought, philosophical analysis of human dignity cannot be separated from wider debates in moral, political, and legal philosophy. Nor can a certain level of selective reconstruction be avoided. The genealogy of the concept has been traced, tendentiously, through the whole history of Western, and sometimes non-Western, philosophical thought; such genealogies are not always illuminating at a conceptual level. More specifically, it is a desideratum of philosophical analysis of human dignity that the concept can be shown to have sufficient clarity to make a useful contribution to modern philosophical debate. This article therefore locates human dignity within a range of debates and suggests—using one important reconstruction of the concept—that human dignity represents a claim about human status that is intended to have a unifying effect on our ethical, legal and political practices.

We begin with an extended methodological and conceptual exploration, asking what should be taken as primary in examining human dignity. Noting a particularly close relationship between contemporary uses of human dignity, international law, and human rights, this connection is treated as focal without assuming that it is definitive of the concept (for related but alternative starting points see Debes 2009; Waldron 2013; Donnelly 2015).

2. Conceptual Background

The use of human dignity in public international law is a marker for understanding the moral, legal and political discourse of human dignity. A characteristic expression is found in the Preamble of the International Covenant on Civil and Political Rights (1966) whose rights “derive from the inherent dignity of the human person” and whose animating principle is “recognition of the inherent dignity and of the equal and inalienable rights of all members of the human family [as] the foundation of freedom, justice and peace in the world.” This assertion and others like it form a common reference point in contemporary literature on human dignity. Importantly, this ‘inherent dignity’ represents a potential bridge between a number of different ideas and ideals, namely freedom, justice and peace.

In fact, it is this potential to bridge different fields of regulation—human rights, bioethics, humanitarian law, equality law and others—that we might take to be the most important function of human dignity in international law. We will refer to an interstitial concept of human dignity (IHD). This concept, arising from discourses and practices of international law, has a strong relationship with equality, liberty, and the basic status of the individual. And, crucially, it implies an interstitial or conjunctive function across our normative systems. It is where law, ethics, and politics meet and are practically and critically interrelated. It is where domestic, regional, and international regulation find a common principle. It is where positive law and morality become difficult to distinguish. And it is where specific norms and general principles are linked. By extension, this concept of human dignity is the concept we should treat as the foundation of human rights because any reconstruction of the complex menu of human rights in international law has to take account of their wide-ranging implications for legal, moral and political governance. Put another way, one necessary condition for a defensible, foundational account of human rights is that their foundational principle must have an interstitial function straddling these fields of normative practice.

Note that this does not capture, and is potentially in tension with, many existing linguistic and normative practices related to human dignity. For instance, discussion of ‘dignitarian harms’ relevant to healthcare law, or local prohibitions on degrading work, might well invoke the language of human dignity without intending any implications for other normative systems. They imply nothing about politics or about law more generally. These linguistic and normative manifestations of human dignity should be considered in their own terms and are returned to in what follows. But the question of why there are tensions between these uses and the IHD is a revealing line of enquiry in itself. It concerns genealogical changes in the concept but also, and more importantly, the ways in which norms and principles are shaped and conditioned within the different practices of law, ethics and politics. To be sure, an interstitial concept is treated here as the best vantage point for all the competing claims. But this is not to insist it is the only intelligible concept. What follows is a description of an IHD’s form, content, and normative uses and an initial comparison with competing characterizations.

First, the idea of form allows us to distinguish the IHD from other uses of ‘dignity.’ Human dignity in international law is associated with a cluster of closely related, but distinguishable, formal characteristics. Human dignity connotes universality (ascription to every human person), inalienability (it is a non-contingent implication of one’s status as human), unconditionality (a property requiring no performance or maintenance), and overridingness (having priority in normative disputes). These immediately assist in distinguishing an IHD concept from a behavioral description of dignity which would not be inalienable, a virtue ethical reading which would either not include ascription to every human person or would be contingent, or a healthcare ethics reading which might not insist on the overridingness of human dignity. Note that these formal criteria are not treated as necessary conditions for human dignity but are, rather, claims commonly associated with human dignity in international law. They assist, amongst other things, in distinguishing human dignity from dignity simpliciter with its associations with behavior and comportment. They also situate the IHD close to certain currents of Kantianism and deontology without assuming that Kant’s work is definitive of the concept.

Second, content encompasses the ‘what’ and the ‘who’ of human dignity. Invocation of human dignity invites us to ask what underlying conception of humanity is at work. The discourse of the ‘human person,’ often associated with human dignity in international law, captures the mixture of formal personhood and embodiment or vulnerability. The conjunction of human and person also produces potentially competing conceptual and ontological commitments, and we can draw a distinction between normative and taxonomical humanity in our discourse of human dignity (Donnelly 2015). Further complexity arises from strong species-based claims or discussions of transhumanism that are focused on potential changes in the ontology of humanity. Undoubtedly human dignity is associated with species claims but it is also intelligible to rely upon more formal claims about the characteristics of agents or persons in analysis of human dignity. Related to these questions of ascription, the ontological and normative commitments involved in a human dignity claim (the question of what) are varied. Human dignity could concern capacities, could include the direct requirement to exercise capacities, and might also concern a teleology for humanity (that is, the ontology of human dignity). Human dignity will—at least in the use of concern here—be closely linked to notions of autonomy, personhood and free will (that is, the correlates of human dignity). Related to this is a contrast (concerning what we might call the metaphysics of human dignity) between human dignity considered broadly as a property or as something arising relationally through recognition or respect.

Third, normative use concerns characteristic normative implications and normative functions. This has been usefully expressed as a distinction between empowerment and constraint (Beyleveld and Brownsword 2001). The IHD is commonly associated with empowerment through human rights. This is distinguishable from the constraint function commonly found in bioethics and healthcare ethics, often a peremptory ban on certain kinds of uses of human beings. It is less clear how the IHD functions regarding another common distinction, that between horizontal application (between individuals) and vertical application (between the state and individual). International human rights law predominantly concerns vertical application, but the IHD, particularly given its linking of law, morality and politics, does not preclude (and may imply) horizontal application. We may also note at this point a common distinction between human dignity as status and value. This turns, in part, on what response is required in the light of human dignity: status demands respect but also rights, duties and privileges; the existence of a value potentially requires fostering or enhancement. Only the former (rights, duties and privileges) are likely to be treated as having systemic application (being justiciable or enforceable), at least within liberal political systems that refuse to enforce moral conduct. As a consequence, the normative use of any IHD concept is undoubtedly conditioned by liberal assumptions concerning the proper scope of legislation. Nonetheless there are many instances of enforcement of more perfectionist or self-regarding conceptions of human dignity (for instance in the prohibition of ‘dwarf tossing’).

The last point reveals the most important tension in the general philosophical study of human dignity, namely the seeming co-existence of the interstitial concept characteristic of international law on the one hand and a perfectionist, virtue or purely self-regarding concept on the other. The assumption made here, that the latter perfectionist claims are non-focal or non-standard, is contentious (for the opposing view see Hennette-Vauchez 2011). Nevertheless this would appear to make the best sense of the majority of post-World War Two literature and thinking. Indeed the important post-war legal instruments themselves represent an interstitial process or moment, and the reconfiguration of the international legal order was the seedbed in which a certain idea of human dignity was given international expression. Far from being an accident of drafting or the contingencies of finding consensus, the (re)assertion of a notion of human dignity can be seen as the intention to transcend the boundaries of the legal, moral and political. Accordingly, while the following analysis does point to some historically contingent aspects of the use of human dignity, this is less important than the fact that the drafting of the Universal Declaration of Human Rights (1948) [UDHR] took place when the foundations of the international legal and political order were undergoing massive upheaval and when the need for a unifying moral principle was acute. We begin with law as the normative system within which the putative interstitial concept arose.

3. Themes

a. Law

There is no doubt that an IHD concept finds its most important expression in post-World War Two international law and constitutional instruments (the Universal Declaration of Human Rights, the Twin Covenants, and others). As such, the nature and function of human dignity in law could be assumed to be clear and well documented. This is the case at the level of doctrinal analysis of human dignity, and there is important jurisprudence arising in particular from the European Court of Human Rights and from constitutions including those of Germany, South Africa and Hungary. The sum of this jurisprudential thought is a mixture of general thinking about the foundation of constitutional rights alongside specific focus on the prohibition of degradation and objectification. This however points to two areas of deeper complexity, one hermeneutical and one concerning the conditioning effects of legal systems. First, different jurisdictions and institutions have given such radically different functions to human dignity that it is not always clear that one concept, the IHD, is at work. Indeed more substantive and perfectionist notions are often in evidence in national legal settings. Second, the IHD seems an ideal candidate for a kind of Grundnorm or secondary rule in law: a norm giving validity to legal systems as a whole or a principle governing the application of all norms within a system. However, this is difficult to defend as anything other than a loose generalization. In principled terms, legal systems treat justice as their foundational norm and this means that consistency, rather than moral defensibility, guides adjudication. And, in practice, it is not at all clear how human dignity can or should function as a ‘higher’ norm. There is, in other words, something of a mismatch between the putative function of the concept and its actual potential.

The nature and content of international law can partially explain such tensions. The prominent place of human dignity in international human rights instruments, as the foundation of those rights, has given human dignity enormous symbolic and heuristic significance. The foundational significance of human dignity is frequently assumed to extend beyond international human rights law to the international legal system as a whole. Where there are tensions between different fields of international law, or emerging practices in international law, human dignity is an important tool for focusing on the normative forces at work, in particular the significance of the individual as transcending the boundaries of state authority and as justifying state authority. It is fair to say that at this level human dignity is of enormous symbolic importance though human dignity is not, in itself, an enforceable norm of international law (the exception to this is in international humanitarian law’s Common Article 3, a prohibition on “outrages upon personal dignity”).

At the regional and domestic levels the normative implications of human dignity become more precise. While the European Court of Human Rights takes from international law the assumption that human dignity is foundational, it has operationalized it within its jurisprudence as an interpretive tool generally, and with particular reference to the idea of “torture, inhuman or degrading treatment.” This association between human dignity and the worst forms of degradation and objectification is shared with international humanitarian law and with German constitutional thinking. It is also the focus of the US constitutional deployment of human dignity as an interpretive tool in Eighth Amendment jurisprudence (concerning “cruel and unusual punishment”). The merit of this association with degradation is to give human dignity a clearer normative implication: the absolute impermissibility of certain kinds of gross mistreatment of the individual. Conversely, it is difficult to reconcile this restrictive, prohibitive reading with the assumption that human dignity is broad and foundational.

This relates, in turn, to a tension between human dignity operationalized as a specific norm (or in some instances a right) and a more general principle in law. Consider, for instance, Article 23 of the Universal Declaration of Human Rights (1948) (“everyone who works has the right to just and favourable remuneration ensuring for himself and his family an existence worthy of human dignity”). Here human dignity is neither a principle nor clearly foundational of the right it is associated with (or any other right); instead, it is a telos or standard. That standard is, potentially, related to material sufficiency or to flourishing and could be seen, to that extent, to have an aspiration to being interstitial. Nevertheless it is (in fact) rare for human dignity to be enforced as a standard, and it is (in principle) unclear how this would amount to normative or conceptual unification of law, ethics and politics. It is possible that some instances of human dignity as a right or as a telos appear to have clear interstitial implications but nonetheless represent a different concept from the IHD because both their content and their normative implications differ (see Waldron 2013).

The complexities and possibilities that arise from human dignity being, in law, a right, standard or telos as well as a principle, value or status give rise to an underlying uncertainty as to whether law contains a single concept, a number of conceptions, or simply a confusion of several ideas. There are a number of proposed normative and conceptual solutions to this tension, though it is not obvious how we might adjudicate between them. First, we can assume that human dignity necessarily has a dual status as norm (a more or less prohibitive norm) and as principle (predominantly symbolic and heuristic) (Alexy 2009). Second, we can assume that law has a number of different conceptions at work, conceptions that are either incommensurable (McCrudden 2008) or loosely linked by family resemblance (Neal 2012). Third, we can assume that law now has two very different concepts at work, one ancient and honor-based and the second closer to the IHD. We give this last option closer attention.

While many domestic or constitutional uses of human dignity are closely related to autonomy, privacy and the protection of agency, there is no doubt that (human) dignity has also been used to impose limitations on acts that can be seen as voluntarily diminishing an individual’s own human dignity or violating duties to themselves. In the broadest terms, then, there is a tension between a permissive reading of human dignity that protects autonomous individual agency from state intrusion, and a conservative reading that allows law to protect individuals from themselves. (This partially resembles Beyleveld and Brownsword’s contrast between the empowerment and constraint conceptions of human dignity.) These kinds of tensions are explored by Stephanie Hennette-Vauchez (2011), who insists on the coexistence of a human dignity principle, which is in essence a principle of equality, and an older (ancient) notion which is closer to a hierarchical notion of honor and permits the enforcement of certain norms related to self-respect. The form, content, and normative implications of these two ideas are clearly very different. While the idea of respect is morally important, it is difficult to reconcile the enforcement of respect with the assumptions we would treat as definitive of liberal legal systems, namely formal equality and division between public and private obligations. As such the honorific manifestations of human dignity are distinct from the liberal concept of human dignity; they are only rarely treated as enforceable (through personality law or public morality provisions) and lack the universal or inalienable characteristics of the IHD. They are nevertheless an irreducible part of contemporary law.

In sum, international law is a source of much of our thinking about human dignity, and in particular it gives credence to the idea of an IHD concept that can link different fields of legislation and different jurisdictions. At the same time, international and domestic legal institutions exercise a conditioning force on the discourse of human dignity. The implications of this are two-fold. First, as argued by James Griffin, human dignity acts as the foundation of human rights and gives rise to a large range of rights related to personhood and agency; nevertheless, the menu of human rights potentially generated by human dignity must be reduced or rationalized given the equal importance of legal institutions in national legal systems as a source of settled norms and practices (Griffin 2008). Second, legal systems require normative precision, and positive law invoking human dignity often appears to fall short of that precision; this has meant that jurists have favored conceptualizing and operationalizing human dignity through an association with degradation (Kaufmann et al, 2011). As Beitz insists, these implications raise related questions:

human dignity seem to apply (differently) at two distinct levels of thought about human rights—as a feature of a public system of norms and as a more specific value that explains why certain ways of treating people are (almost?) always impermissible. If there could be a theory of human dignity, one of its desiderata would be to show what (if anything) these senses of human dignity have in common and how they hang together (if they do). (2013, 283)

Beitz’s own analysis retains a certain kind of bifurcation between prohibitive and empowering conceptions of human dignity (2013, 289–290), suggesting resilient problems in making sense of human dignity’s place in law. Does the overridingness of human dignity have, in legal systems, to be conditioned by the normal institutional limits on legal norms and principles or does it retain its (extra-legal) moral force? And what role does philosophical anthropology play in our ethical and legal thinking, and should this inform what we take to be enforceable in law? This is a question of what we hold to be distinctively human and how, if at all, this should inform our thinking about law. A philosophical anthropology, along with related moral commitments, may demand or prevent perfectionist readings of human dignity which, in turn, has implications for any putative interstitial concept.

b. Ethics

i. General

Those concerns with philosophical anthropology form a point of departure for reflection on ethics. For example, animal ethics concerns, sometimes explicitly but always at least implicitly, questions about the value of human beings in contrast to nonhuman animals. Answers to such questions will typically concern whether human beings have standing over animals, or whether human beings have an inner significance that nonhuman animals lack. These two questions are ambiguous and the relation between them is far from clear. Supported by a tradition that has overshadowed much of our understanding of human dignity, the first question can be variously understood in terms of the elevation of the human species, human dominion over nature, humanity as imago dei, or the special worth of humanity relative to all other natural phenomena. In other words, human dignity as elevation rather than human dignity as inner significance (compare Sensen, 2011). The second question, by contrast, leaves open the possibility that human beings and nonhuman animals have potentially incommensurable significances (Korsgaard, 2013; Nussbaum, 2006; Balzer, Rippe and Schaber, 2000; Kaldewaij, 2013). Each of these presumptions has a questionable relationship with an IHD.

Starting from the idea that human beings have a distinctive significance, at least two possibilities flow: the existence of duties of dignity addressed to its bearer, and duties of dignity addressed to others. Some philosophical theories deny a distinctive significance for human (and nonhuman) beings as such, but emphasize the contractual basis of our norms or argue that what matters morally is sentience (compare Gauthier, 1987; Singer, 2001). By contrast, philosophical views on human dignity emphasize that there is a distinctive significance to human beings and that this entails certain stringent ethical norms. Note that claiming a distinctive significance for human beings does not necessarily amount to prioritizing human beings over animals. (Claiming that human beings should be prioritized over animals would of course entail that human beings have a distinctive significance.) Indeed, claims that both human nature and animal nature have their own distinctive significance can be interpreted both in terms of elevation and in terms of inner significance. When animal and human interests clash, one could try to compromise the interests of one to satisfy the same or even a different interest for the other, in line with, or even as a matter of respect for, their different dignities.

That being said, the claim of human significance has often found expression in philosophies that elevate human beings over animals. It should be noted that the very idea of a relative standing of human beings over nonhuman animals and nature does not entail that human beings should be protected on account of that dignity (Sensen, 2011). Rather, the relative elevation of a human being is conceived in terms of his distinctive human capacities that, given some teleological or religious background assumptions, entail for him a duty to exercise these. These capacities are, in turn, typically understood to be exercised by acting morally, that is, by acting in line with a morality that concerns what one does to oneself, to other humans, or to God. It is these teleological or religious assumptions that generally benefit humans over animals. It has been argued that this view of humanity was central to Western traditional views of dignity, including those of the ancients, medieval Christians, Renaissance and early Modern thinkers.

Within these moral schemes the question of what we should do to a human being is not (fully) decided by recognizing their dignity (as elevation); rather, the individual’s own duty to comply with that scheme is the main normative implication of the set of capacities that ground his dignity. He has initial dignity as subject to such a moral scheme, in particular by virtue of his capacity, and correlated duty, to live up to it. As such, his dignity need not entail any or all of the duties that others have to him, such as to respect or even support him. What we are to do to him depends on the content of the moral duties that we have as a result of our own dignity-grounding capacities, duties which are conceptualized in terms of cosmic principles or divine commands. That is to say, we are to respect each other not for our relative standing, our initial dignity, but because, and insofar as, non-interference with or support for beings that happen to have this standing is required by cosmic or divine principle. This principle specifies what we should value in the individual. As such, it specifies a type of dignity that comes closer to the inner significance view, which in turn may be, but need not be, expressed in terms of schemas that advance ideas of human elevation.

It is the inner significance view, not the human elevation view, that fits more easily within the formal features of the IHD. This view of humanity’s inner significance has found expression in at least three ways: as a status (Habermas, 2010; Waldron and Dan-Cohen, 2012), as a value (Rosen, 2012; Sulmasy, 2007) or as a principle (Düwell, 2014). As a status, human dignity gives human beings a set of duties and rights. As a value, by contrast, human dignity is something to be sustained or promoted. As a principle, human dignity sets a fundamental standard for action. These three types of specification are embedded in broader philosophical anthropologies that explain who has dignity and what should be protected in those who have it, and they carry implications for policy and law. In other words, whether we treat human dignity as a value, status or principle will depend in large measure on the anthropological and/or cosmological assumptions that we take to lie behind a claim about human dignity.

ii. Philosophical Anthropology

All three claims—status, value and principle—can be interpreted in terms of the formal features of the IHD (universal, unconditional, inalienable and overriding). At the same time, some views on the significance of humanity may deny one of these features, and this will considerably affect the content and normative use of such a view. In these respects, attempts to reconstruct non-Western traditional views on dignity should be especially sensitive not only to distinctions between status, value and principle, but particularly to the formal as well as substantive specifications of the significance of humanity in these traditions (Donnelly, 2009). It has been argued, for example, that the normatively relevant notion of humanity in the Confucian tradition should be understood in terms of dignity’s achievement through virtuous conduct, rather than in terms that make it independent of one’s character and conduct (Luo, 2014). This touches on the issues of universality, unconditionality, inalienability and overridingness. In the Confucian tradition, dignity (qua ‘worth’) can be seen as a universal human potential that we may fail to cultivate: it is therefore universal but not unconditional; it can also be self-alienated and overridden.

It has been argued also that in certain Islamic traditions, Man has a God-given status as vicegerent on earth (Mozaffari, no date; Kamali, 2002; Maroth, 2014). This status may demand some respect, but how he is to be treated depends largely on what God has specified by law. If God demands—as some traditions seem to imply—respect for human individuals as a matter of their good deeds, piety or their living by the Book, then this would raise questions about consistency with the unconditionality and inalienability of an IHD. A further significantly different tradition, Hinduism, is sometimes interpreted to operate with a concept of dignity that a human individual shares because, and insofar as, his soul cannot be distinguished from the universe (Braarvig, 2014). On the one hand, this implies the significance of human individuals. On the other hand, given differentiations in the world of appearances we can distinguish degrees of dignity not only between individuals, but also between classes—which one can enter only through birth—specified by the presence of the universal whole in them. The possibility of rebirth in a higher caste—conditional on loyalty to the caste system or on pure chance—renders this universal notion of dignity consistent with the social one.

On top of these possible alternatives to an IHD at the formal level, it is also crucial to note the possibility of different accounts of the IHD in which these formal features may have different and incompatible contents, if not opposing implications for normative use. The differences concern not only questions about the nature of the subject of human dignity—a species, humanity or the human person—but also what is significant in that subject. Further differences emerge from answers to other questions: are we to grant him rights and impose duties on him; are we to value him, to refrain from interfering with him, and to support him in perfecting himself; or are we simply to respect him?

iii. Tensions

This mixture of concerns and foci—different background assumptions in terms of cosmology and anthropology, different assumptions in terms of the normative functioning of human dignity as status, principle, and value—gives rise to an expansive field of enquiry. Even if we were to consider only how the IHD may or may not be present in ethical accounts of human dignity, this would have to encompass the two substantial fields of normative ethics and applied ethics and would require careful analysis of how and why further links between politics, ethics and law are at issue. For present purposes we narrow our concerns to applied ethics.

Applied ethics can be understood by reference to ethical problems that arise from concrete practices. These practices emerge or have their existence in society and as such require attention from politics and law, not only from philosophical ethics. What we typically see is that the ethical issue is addressed in terms of norms or principles accepted within the practice, and that politics and law allow this to happen and regulate only in their own terms—quite independently of any explicit assessment in terms of an IHD, let alone of a coherent integration of philosophical ethics, politics, law, empirical knowledge and practical constraints (compare Düwell, 2012).

‘Dignity’ has different usages in different applied ethical practices, and in some it has none (Beyleveld and Brownsword, 2001; Nordenfelt, 2004; Sulmasy, 2013). For example, in the life sciences dignity is used to legitimize a patient’s right to informed consent, but also to set constraints on her choices; it is used, for instance, to limit her options in deciding when to die. It is also used to characterize the way a patient deals with and adapts to his condition, the way a patient is treated, and to emphasize the effects of his condition or of the actions of others on his identity. It is used to emphasize the value a person attaches to himself, the extent to which he respects himself (Dillon, 2013). Dignity is the central term in assessing technological developments for their application to human life (Human dignity and bioethics: essays commissioned by the President’s Council on Bioethics, 2008). Dignity is also used to argue against abortion and against pre-natal experimentation on early human life. It has been argued by some that all human life should be protected as a matter of dignity, whereas others emphasize protection of human life only if it will develop a personality. In this context, it is especially interesting to note that in debates on pre-natal enhancement the notion of dignity is appealed to in defense of respecting the human species as such (Bostrom, 2005; Habermas, 2005). Here human dignity is said to be threatened by attempts to bring to life human beings enhanced in certain ways, for instance enhanced to be more competent in abilities that are valued by parents or society. The worry concerns not only the dignity of the enhanced individual, whether it is violated or enhanced, but also the dignity of humanity as such: whether humanity is compromised by these interferences. It also concerns the dignity of non-enhanced human beings, whether it is threatened by the increased capacity of enhanced beings. Not all of these usages express the same concept, let alone an IHD. Those that do may give only partial expression to competing versions of an IHD. Often, however, we see that problems are addressed without explicit recourse to an IHD, let alone via an integral assessment in terms of the philosophical commitments that come with such an IHD. It would make a significant difference if these discourses were orientated towards coherence with an IHD.

c. Politics

Already in the discussion of applied ethics certain constraining and conservative uses of human dignity are in evidence. A ‘dignitarian alliance’ of conservative thinkers and activists has deployed a notion of dignity close to that of sanctity in order to oppose or constrain reproductive and biotechnological innovations (Brownsword 2003). Twentieth-century political discourse, by contrast, also witnessed radical and liberation-focused discourses of human dignity. While the division between human dignity as empowerment and as constraint helps to partially map this contrast, this section draws a more general divide between power-focused and principle-focused conceptions of politics. Principled accounts can in turn be divided between those that make ethics (and potentially human dignity) central to politics, and those that might accommodate other interstitial principles like justice or the rule of law.

In those accounts that make ethics clearly foundational to politics, human dignity could be conceived as a regulative idea, providing the trajectory of politics but not necessarily central to its practice. Slightly differently, human dignity could be treated as providing a conception of good politics and implying practical side-constraints within political systems. More directly, human dignity might be identified with the good, which would give human dignity a more clearly normative and perhaps perfectionist role (Boylan 2004). Efforts to synthesize aspects of pluralism with such accounts of the good have informed a capabilities approach intended to encompass both a substantial conception of the individual and the protections of agency and individuality characteristic of liberal thought. This itself is often expressed in the language of human dignity (Nussbaum 2006; Claassen 2014). This interpretation of human dignity in terms of capability-based flourishing has been reviewed and critically reinterpreted by reference to a different idea of dignity, that of dignity as a basic principle that demands recognition of the generic features of human agency as a matter of basic rights (Gewirth 1992). Far from being unrelated to the perfectionist notion of dignity, this latter notion of dignity functions as an underlying principle that may help us distinguish relevant from irrelevant human capabilities and rank them so as to prevent or settle clashes between them (Düwell 2009; Claassen and Düwell 2012). Such a take on capabilities would imply that possibilities for certain forms of flourishing should be protected as a matter of dignity, indeed the same kind of dignity that demands respect for freedom and well-being as basic features of agency. One further upshot of this approach would be that those things to be secured or provided might, in view of this principle, differ between persons as well as between contexts. That is to say, to protect a capability for one agent may require different or more resources than protecting it in someone else (Boylan 2004). Also, when possibilities of securing agency are scarce in a community, priority should be given to capabilities at the core of agency. It might be that this represents a manifestation of the IHD concept, in that the idea is intended to have application across different normative systems and to be extended to new forms of moral and political challenge.

In contrast, those positions that give the right priority over the good place rights and a plurality of reasonable conceptions of the good at the center of just institutional design. Such a ‘community of rights’ is quite directly committed to an interstitial notion of human dignity cashed out as both basic human rights and systems for preserving freedom and welfare across all normative systems (Gewirth 1998). Rawls’s position (2009), in contrast, faces the challenge of reconciling commitment to human dignity with treating justice as a primary institutional virtue. Rawls’s two principles of justice—while expressed in the language of basic rights and institutional virtues—could intelligibly be taken as an expression of a politics based on human dignity. However, this should give rise to important hermeneutical and conceptual hesitations. First, little is added to our understanding of Rawls’s work by associating it with human dignity, and conversely the distinctive conceptual characteristics of human dignity are immediately lost in more general debates about liberal political theory. Second, in Rawls’s later work, where “decent non-liberal” societies are insulated from criticism and intervention from liberal states, we might say that Rawls concedes that non-liberal states—states that would clearly not accept an IHD principle as foundational—are nonetheless morally and politically justified (2001). By extension, the links between liberal political theory and human dignity are enormously complex, and can be conditioned by the demands of realism or non-ideal theory. With that in mind we turn to more practice-based and power-focused links.

The concept of human dignity as it appeared in post-war international law was undoubtedly intended to mark a decisive political, not just legal, turning-point. The concept is closely associated with the commitment “never again”—that never again should there be atrocities of the kind in the Second World War—and we could see human dignity as a predominantly political idea focused on the impermissibility of widespread and systematic attacks on civilian populations and by extension fundamental limitations on states’ sovereignty. In this sense there is credibility to an interstitial reading of human dignity that links international law, politics and morality in supporting a more individual-focused, less state-focused account of international relations. This, in turn, strengthens a link between human dignity and (moral and institutional) cosmopolitanism given that the value of individuals transcends state boundaries.

Conversely, this—interstitial and cosmopolitan—reading of human dignity has important limitations. First, the interstitial understanding of human dignity could be assumed to be, at heart, an ideological reading of human rights discourse: it is the rhetoric of human rights that links international law and politics rather than any systemic or philosophically defensible normative framework related to dignity. Second, the cosmopolitan understanding of human dignity faces the general vulnerability of all cosmopolitan philosophies (the priority of local and natural attachments in our moral thinking) and a specific attack via the problem of statelessness. That is, unless human dignity rests on or implies a ‘right to have rights,’ any political and legal discourse of human dignity will be inadequate in comparison to the systematic and concrete protections offered to citizens by constitutions and constitutional rights. We return to the right to have rights later by way of a more general analysis of social theory.

Certain historical and sociological trends are important for understanding human dignity and its role in politics. The first and most obvious is a shift from hierarchical societies to more democratic societies and, with this, an emphasis on the equal status and rights of individuals. A clash between the notions of dignity as aristocratic bearing and dignity as fundamental status is a characteristic of debates concerning the French Revolution. The ‘dignity of Man’ as emblematic of political emancipation finds its first major expression during this revolutionary period, allowing the articulation of new emancipatory projects, as in Wollstonecraft’s appeal to the equal dignity of men and women ([1792] 1982). The post-World War Two invocation of human dignity undoubtedly shares basic humanistic, enlightenment, and liberal assumptions with these currents of eighteenth- and nineteenth-century thought, though by the twentieth century the idea of the ‘dignity of Man’ was being opposed not directly by defenders of the Ancien Régime but by Marxist and communitarian critics of liberalism. What unites these latter positions is concern about the insensitivity of human dignity relative to pressing political problems including colonialism and minority rights, along with more fundamental concerns about the emptiness of the concept relative to collective interests that cannot be disaggregated into individual interests.

Sociological shifts are also crucial in understanding the competing functions of human dignity in political discourse. The characteristics of modernity, as charted by both Weber and Durkheim, involve changes in the conception of the individual (including for Durkheim the creation of an ‘ethic’ or ‘religion’ of humanity), changes in the concept of politics, and changes in the political significance of human dignity. On the one hand, the more technocratic and bureaucratic nature of politics was held to have yielded a demystifying, but also dangerously dehumanizing, relationship between the individual and political power. In the light of that and related concerns, Margalit (2009) and others use human dignity to stress the importance of retaining dignity qua self-respect within political and social practices. By the same token, Honneth’s work on the political conditions of recognition (1996) entwines respect with the basic conditions of individual and group identity. On the other hand, liberal institutions that intended to preserve the basic status of the individual have been held to be inadequate to maintain the conditions of the possibility of ethical life. This has meant direct attacks on ‘liberal’ practices, including human rights, by communitarian theorists.

It is against this background that a different style of political theorizing about human dignity can be found in the second half of the twentieth century. Hannah Arendt’s Aristotle-inspired political theory emphasizes the importance of recognition in a political community and of strong constitutional rights, with an equation between human dignity and the right to have rights (Arendt 1958). Arendt offers an influential internal critique of politico-legal understandings of human dignity. Broadly, Arendt is unsympathetic to any potential interstitial concept (given her views on the basic conditions of politics) and to generalizations about the rights of Man (given her writings on the emptiness of this notion, particularly with regard to the status of refugees). In contrast, she stresses the basic importance of citizenship as a condition of protecting the basic status of the individual. There are nevertheless resources in Arendt’s work that are clearly sympathetic to human dignity and human rights as more expansive commitments, and human dignity could be seen as the best expression of her opposition to atrocity and her defense of human status and human plurality (Menke 2014).

In the light of these competing currents of thought, and the complexities of the concept itself, human dignity does not map neatly onto the division between empowerment and constraint or between the priority of the good and the priority of the right. The IHD, to the extent that it is a recognizable component of political thinking, might be assumed to be closer to conceptions of politics focused on the rule of law rather than a substantive conception of the good. Understood as interstitial concepts, human dignity and the rule of law are intended precisely to express the importance of links between politics and law and the co-regulation of the two. The rule of law is important not only as an expression of self-restraint in politics but also as a necessary condition of a permissive politics of human agency, choice and self-creation. This might be otherwise expressed in terms of a defense of the public-private divide. It could be expressed in more sociological terms as a defense of functional differentiation, the coexistence of different social systems that an individual can move between. Or this might be linked to a libertarian defense of minimalism in the power of the state. The unifying idea here is that human dignity is a principle with significance for political, legal and moral systems and which preserves, one way or another, the freedom and self-creation of the individual. It has been the recurrent theme of communitarian critics of liberalism and human rights that such permissiveness undermines the self-constitution of the individual within a polity. Middle ground could, potentially, be found in the capabilities approach or in an Arendtean stress on the right to have rights.

4. Conceptual Analysis

a. The Conceptual Features of Human Dignity

It is desirable, but no simple task, to begin to draw more general conclusions about human dignity as a concept and as a component of normative debate. It is worth briefly contrasting how we might approach the analysis of human dignity with that of human rights. Discussion of human rights features settled debates concerning their moral or political justification, an appropriate theory of rights, and human rights’ tailoring to practice. Analysis of human dignity, in contrast, lacks such clearly defined parameters because it is plausible that there are competing concepts of human dignity and not just competing conceptions. That is, it is not simply that in academic debate different aspects of a single concept can be given special emphasis or that there are competing justificatory strategies for the same, shared, idea. Rather, ‘human dignity’ might encompass historically different, and antagonistic, ideas. For this reason, meta-studies of the uses of human dignity have difficulty yielding definitive analysis of the concept’s presuppositions and functions, or have mapped a number of functions that do not readily cohere (Nordenfelt 2004; Sulmasy 2013). Unifying the many functions of human dignity may be possible, at best, only through performative analysis (O’Malley 2011) or family resemblance analysis (Neal 2012), but these involve abandoning a single idea of human dignity in favor of describing various local uses.

b. The Credibility of an Interstitial Concept

In contrast, we would argue that the three normative fields of law, morality and politics together offer at least the possibility of a distinctive, focal concept. The idea of the absolute status of every individual can intelligibly be held to frame our normative practices. Indeed, the magnitude of this commitment is such that it would have to be manifest in all of our social practices. Clearly, however, this is not without problems. Any conceivable defense of an IHD concept—one that, by definition, sits between and links different normative practices—faces the immediate problem of the conditioning assumptions of those disciplines and practices (including the local practices and settled dispositions and attitudes of those working within the fields). This can be treated as a three-fold problem. The validity of any legal norm is conditional on political will (the problem of the primacy of the political); the moral justification of the idea still requires further explanation and justification (the problem of the foundations of morality); and the legal notion itself will be conditioned by a legal system so that it can be consistently operationalized within the system (the problem of the demands of justice or the normative closure of law). These three problems are pressing problems for any IHD claim precisely because the concept must claim to transcend these conditioning aspects of our normative practices.

However, it can be argued that the possibility of an interstitial concept nevertheless has support within the fields. For example, the idea of a rule of law is intended to unify different fields of legal and political regulation (through demanding their consonance with good law consistent with human agency), and for that reason a number of theorists closely associate human dignity and the rule of law (Waldron 2008; Fuller 1964). Beyond this, human dignity might well inspire more productive and precise regulatory practices, be they related to global, social or procedural justice. If the rule of law is the minimal demand that there be a good match between regulation and agency, wider ‘projects’ conjoining law, ethics, and politics can be meaningfully expressed in the language of human dignity given its unifying function. Put more modestly, the idea of politics as an anomic practice is difficult to defend—after all, law and politics stand in a relation of productive co-constitution with politics making law and legal systems revising the content of that law and regulating political practices themselves—and our best reconstructions of the foundations of political practices and institutions are likely to involve commitment to the kinds of formal assumptions associated with human dignity (Rawls 2009; Habermas 2010). And moral theories can enforce duties which in turn generate institutional designs and procedural mechanisms intended to protect human dignity and render it immanent in social systems (Gewirth 1998). In sum the three problems associated with an IHD claim are not uniformly accepted and should not be treated as a refutation of interstitial claims in general or an IHD concept specifically.

Above all, a connection between human rights and human dignity gives critical force to human dignity and indicates precisely why the predominant concept of human dignity should be assumed to be an interstitial one. Conceptualizing human dignity as foundational is sometimes construed as bonding the existing body of human rights law with a moral claim that guarantees their force as moral, not just positive, rights. The most plausible explanation of such a guarantee is through deontological theory granting supreme moral importance to the individual and immunizing them from consequentialist determinations of the common good that would potentially sacrifice their rights and their status. Beyond this, the precise account of justification, rights, and practice is open to debate, but human dignity is the foremost expression of the deontological commitments sketched here. Even in this sketch it is clear that the normative fields of law, ethics, and politics are not intended to be absolutely divided but rather guided and judged by their consistency with the protection of human rights. It is this claim that lies at the heart of an interstitial concept of human dignity (and much else besides in international law). It remains to draw out the implications of this.

c. The Implications of an Interstitial Concept

Assuming that an IHD concept—sitting between normative fields, linking these fields, and conditioning them—is intelligible, then its implications are considerable. Let us assume that the commitments contained in such a concept are as follows. Human dignity is treated as having the formal features identified (universality, overridingness, and so forth); it has the characteristic content of human dignity claims (a species claim or a claim about human dignity being relational or a property); and it encompasses commitment to a distinctive normative use (for example, empowerment of the individual, expressed in terms of claim rights, that holds at least between the individual and all political institutions). The sum of this commitment would be as follows. In all interactions between state and individual, claim rights (expressible as human rights) can and should be exercised by all human persons, and the exercise of those rights would not be conditioned by any jurisdictional boundaries. This amounts to having significance in all possible interactions between the collective and the individual. It will imply that there is no interaction between individuals that is not at least potentially normatively governed by human dignity. And it implies that any special demands about normative priorities made by law, ethics or politics would be justified only to the extent that they were consistent with, or directly conditioned by, the overarching commitment to human dignity. This concept is, then, enormously demanding insofar as its fulfillment would not be discharged on the basis of respecting a single norm (be it a Grundnorm or an anti-atrocity norm) but would, rather, demand an ongoing commitment to subject every executive and administrative decision to scrutiny on the basis of its consonance with the content and implications of human dignity particularly as this is expressed through human rights.

What conceptual and practical problems does this imply? The actual enforceability of human dignity itself as a norm or right is potentially unclear here, and the idea of human dignity’s overridingness sits uneasily with many common legal, political and moral assumptions. For related reasons it is not clear if human dignity should be a named, explicit norm within a constitution. It would be impracticable (indeed perhaps senseless) to have a norm that trumped all other norms; human dignity cannot be assumed to function in a normative vacuum. And the function of an interstitial concept is to link and justify different normative fields, not to directly govern them through one explicit Grundnorm. In fact, having concrete implications for these fields demands a more complete explication of the concept in terms of human rights which themselves require clear institutional arrangements. What human dignity amounts to is an expression of the foundations of any and all of our normative practices and the demand that human rights and human dignity have a constitutive and not just regulative role in our social institutions and practices. Nevertheless, this is a demand for a far more substantial explication of human rights, institutions, and good—that is, human dignity preserving—interaction between law, morality and politics in practice.

If, despite such challenges, we accept this IHD reading, we should reject a number of other readings of human dignity as peripheral or incoherent. Common uses of human dignity in healthcare and medical ethics that treat human dignity as one amongst many ‘middle-level principles,’ or bioethical readings that treat human dignity as synonymous with sanctity, would be non-standard readings on these assumptions and intelligible only as idiosyncratic local uses. Common criticisms of human dignity as vacuous or empty (because human dignity apparently collapses into notions of autonomy) would be rejected as incoherent because they fail to distinguish an IHD from either idiosyncratic local uses or from irrelevant non-interstitial uses. There would remain, however, an important but complex line of enquiry concerning how human dignity and self-regarding duties should be thought to interact. On the one hand, the IHD concept has been detached from the perfectionist Stoic tradition invoking species norms which determine whether individuals are ‘fully human.’ On the other hand the typical form, content, and normative implications of the IHD need not exclude the possibility of self-regarding duties arising from respecting one’s own status as human person.

5. Conclusion

The foregoing analysis stressed the problems of using human dignity in philosophical and ethical thought. The concept itself is opaque, and one important modern usage faces the problem of aspiring to be interstitial within and between normative fields that are themselves resistant to the very idea of such interstitial concepts. Nevertheless, there are good reasons why such a far-reaching concept should be primary in our thinking, and for this reason human dignity is likely to remain a component of normative discourse despite its problematic characteristics.

6. References and Further Reading

  • Alexy, R. (2009) A theory of constitutional rights. Oxford University Press.
  • Arendt, H. (1958) The Origins of Totalitarianism. Meridian Books.
  • Balzer, P., Rippe, K. P. and Schaber, P. (2000) ‘Two Concepts of Dignity for Humans and Non-Human Organisms in the Context of Genetic Engineering’, Journal of Agricultural and Environmental Ethics, 13(1), pp. 7–27. doi: 10.1023/A:1009536230634.
  • Beitz, C. (2013) ‘Human Dignity in the Theory of Human Rights: Nothing But a Phrase?’, Philosophy and Public Affairs, 41(3), pp. 259–290.
  • Beyleveld, D. and Brownsword, R. (2001) Human dignity in bioethics and biolaw. Oxford: Oxford University Press.
  • Bostrom, N. (2005) ‘In Defense of Posthuman Dignity’, Bioethics, 19(3), pp. 202–214. doi: 10.1111/j.1467-8519.2005.00437.x.
  • Boylan, M. (2004) A Just Society. Rowman & Littlefield Publishers.
  • Braarvig, J. (2014) ‘Hinduism: the universal self in a class society’, in The Cambridge Handbook of Human Dignity. Cambridge University Press.
  • Brownsword, R. (2003) ‘Bioethics today, bioethics tomorrow: stem cell research and the dignitarian alliance’, Notre Dame JL Ethics & Pub. Policy, 17, pp. 15–51.
  • Claassen, R. and Düwell, M. (2013) ‘The foundations of capability theory: comparing Nussbaum and Gewirth’, Ethical Theory and Moral Practice, 16(3), pp. 493–510.
  • Claassen, R. (2014) ‘Human Dignity in the Capability Approach’, in The Cambridge Handbook of Human Dignity. Cambridge University Press.
  • Debes, R. (2009) ‘Dignity’s gauntlet’, Philosophical Perspectives, 23(1), pp. 45–78.
  • Dillon, R. S. (2013) Dignity, Character and Self-Respect. Routledge.
  • Donnelly, J. (2009) ‘Human Dignity and Human Rights’, Commissioned by and Prepared for the Geneva Academy of International Humanitarian Law and Human Rights in the framework of the Swiss Initiative to Commemorate the 60th Anniversary of the Universal Declaration of Human Rights. Available at: http://www.udhr60.ch/report/donnelly-HumanDignity_0609.pdf.
  • Düwell, M. (2009) ‘On the Possibility of a Hierarchy of Moral Goods’, in Morality and Justice: Reading Boylan’s A Just Society, John-Steward Gordon (ed.), Rowman & Littlefield Publishers, Inc: Lanham, MD.
  • Düwell, M. (2012) Bioethics: Methods, Theories, Domains. Routledge.
  • Düwell, M. (2014) ‘Human dignity: concepts, discussions, philosophical perspectives’, in The Cambridge Handbook of Human Dignity. Cambridge University Press. Available at: http://dx.doi.org/10.1017/CBO9780511979033.004.
  • Fuller, L.L. (1964) The Morality of Law. Yale University Press.
  • Gauthier, D. (1987) Morals By Agreement. Oxford University Press, USA.
  • Gewirth, A. R. (1998) The community of rights. Springer Netherlands.
  • Habermas, J. (2005) Die Zukunft der menschlichen Natur: auf dem Weg zu einer liberalen Eugenik?. Frankfurt am Main: Suhrkamp.
  • Habermas, J. (2010) ‘The Concept of Human Dignity and the Realistic Utopia of Human Rights’, Metaphilosophy, 41(4), pp. 464–480. doi: 10.1111/j.1467-9973.2010.01648.x.
  • Hennette-Vauchez, S. (2011) ‘A human dignitas? Remnants of the ancient legal concept in contemporary dignity jurisprudence’, International journal of constitutional law, 9(1), pp. 32–57.
  • Honneth, A. (1996) The struggle for recognition: The moral grammar of social conflicts. MIT Press.
  • Human dignity and bioethics: essays commissioned by the President’s Council on Bioethics. (2008). Washington: [s.n.].
  • Kaldewaij, F. E. (2013) The animal in morality. Justifying duties to animals in Kantian moral philosophy. Department of Philosophy, Utrecht University. Available at: http://dspace.library.uu.nl/handle/1874/275543.
  • Kamali, P. M. H. (2002) The Dignity of Man: An Islamic Perspective. 2nd edition. Islamic Texts Society.
  • Kaufmann, P., Kuch, H., Neuhäuser, C. and Webster, E. (2011) ‘Human dignity violated: a negative approach–introduction’, in Kaufmann, P., Kuch, H., Neuhäuser, C., & Webster, E. (eds) Humiliation, Degradation, Dehumanization. Netherlands: Springer, pp. 1–5.
  • Korsgaard, C. M. (2013) ‘Kantian Ethics, Animals, and the Law’, Oxford Journal of Legal Studies, 33(4), pp. 629–648. doi: 10.1093/ojls/gqt028.
  • Luo, A. (2014) ‘Human dignity in traditional Chinese Confucianism’, in The Cambridge Handbook of Human Dignity. Cambridge University Press. Available at: http://dx.doi.org/10.1017/CBO9780511979033.021.
  • Margalit, A. (2009) The decent society. Cambridge, Mass.: Harvard University Press.
  • Maroth, M. (2014) ‘Human dignity in the Islamic world’, in The Cambridge Handbook of Human Dignity. Cambridge University Press.
  • McCrudden, C. (2008) ‘Human Dignity and Judicial Interpretation of Human Rights’, European Journal of International Law, 19(4), pp. 655–724.
  • Menke, C. (2014) ‘Human Dignity as the Right to Have Rights: Human Dignity in Hannah Arendt’, in The Cambridge Handbook of Human Dignity. Cambridge University Press. Available at: http://dx.doi.org/10.1017/CBO9780511979033.004.
  • Mozaffari, M. H. (no date) ‘The concept of Human Dignity in the Islamic Thought’, Hekmat: International Journal of Academic Research, (4), pp. 11–28.
  • Neal, M. (2012) ‘Dignity, law and language-games’, International Journal for the Semiotics of Law-Revue internationale de Sémiotique juridique, 25(1), pp. 107–122.
  • Nordenfelt, L. (2004) ‘The varieties of dignity’, Health care analysis: HCA: journal of health philosophy and policy, 12(2), pp. 69–81; discussion 83–89. doi: 10.1023/B:HCAN.0000041183.78435.4b.
  • Nussbaum, M. C. (2006) Frontiers of justice: disability, nationality, species membership. Cambridge, Mass.: The Belknap Press of Harvard University Press.
  • O’Malley, M. J. (2011) ‘A Performative Definition of Human Dignity’, Facetten der Menschenwürde, pp. 75–101.
  • Rawls, J. (2001) The law of peoples: with, the idea of public reason revisited. Cambridge, Mass.: Harvard University Press.
  • Rawls, J. (2009) A theory of justice. Cambridge, Mass.: Harvard University Press.
  • Rosen, M. (2012) Dignity: its history and meaning. Cambridge, Mass.: Harvard University Press.
  • Sensen, O. (2011) ‘Human dignity in historical perspective: The contemporary and traditional paradigms’, European Journal of Political Theory, 10(1), pp. 71–91. doi: 10.1177/1474885110386006.
  • Singer, P. (2001) Animal Liberation. Ecco Press.
  • Sulmasy, D. P. (2007) ‘Human dignity and human worth’, in Perspectives on human dignity: A conversation. Springer, pp. 9–18. Available at: http://link.springer.com/content/pdf/10.1007/978-1-4020-6281-0_2.pdf.
  • Sulmasy, D. P. (2013) ‘The varieties of human dignity: a logical and conceptual analysis’, Medicine, health care, and philosophy, 16(4), pp. 937–944. doi: 10.1007/s11019-012-9400-1.
  • Waldron, J. (2008) ‘The Concept and the Rule of Law’, Georgia Law Review, 43(1), pp. 1–62.
  • Waldron, J. and Dan-Cohen, M. (2012) Dignity, rank, and rights. Oxford; New York: Oxford University Press.
  • Waldron, J. (2013) ‘Is dignity the foundation of human rights?’ NYU School of Law, Public Law Research Paper 12–73. doi: http://dx.doi.org/10.2139/ssrn.2196074.
  • Wollstonecraft, M. (1982) A Vindication of the Rights of Woman. Ontario: Broadview Press.

 

Author Information

Stephen Riley
Email: stephenriley12@gmail.com
Utrecht University
Netherlands

and

Gerhard Bos
Email: bos.gerhard@gmail.com
Utrecht University
Netherlands

Luck

Winning a lottery, being hit by a stray bullet, or surviving a plane crash are all instances of a mundane phenomenon: luck. Mundane as it is, the concept of luck nonetheless plays a pivotal role in central areas of philosophy, either because it is the key element of widespread philosophical theses or because it gives rise to challenging puzzles. For example, a common claim in philosophy of action is that acting because of luck prevents free action. A platitude in epistemology is that coming to believe the truth by sheer luck is incompatible with knowing. If two people act in the same way but the consequences of one of their actions are worse due to luck, should we morally assess them in the same way? Is inequality between persons unjust when it is caused by bad luck? These two complex issues are a matter of controversy in ethics and political philosophy, respectively.

A legitimate question is whether the concept of luck itself is worthy of philosophical investigation. One might think that it is not, given (i) how acquainted we are with the phenomenon of luck in everyday life and (ii) the fact that progress has been made in the aforementioned debates on the assumption of a pre-theoretical understanding of the notion.

However, the idea that a rigorous analysis of the general concept of luck might serve to make further progress in areas of philosophy where the notion plays a fundamental role has motivated a recent and growing philosophical literature on the nature of luck itself. Although some might be skeptical that investigating the nature of luck in general can help shed light on long-standing philosophical debates such as the nature of knowledge—see Ballantyne 2014—it is hard to maintain that no general account of luck could ground any substantive claim in areas of philosophy where the notion is constantly invoked but left undefined. This article gives an overview of current philosophical theorizing about the concept of luck itself.

Table of Contents

  1. Preliminary Remarks
    1. The Bearers of Luck
    2. The Target of the Analysis
    3. General Features of Luck
  2. Luck and Significance
  3. Probabilistic Accounts
    1. Objective Accounts
    2. Subjective Accounts
  4. Modal Accounts
  5. Lack of Control Accounts
  6. Hybrid Accounts
  7. Luck and Related Concepts
    1. Accidents
    2. Coincidences
    3. Fortune
    4. Risk
    5. Indeterminacy
  8. References and Further Reading

1. Preliminary Remarks

The following preliminary remarks will address three questions: (1) What are the bearers of luck? (2) What is the target of the analysis of current accounts of luck? (3) What general features of luck should an adequate analysis of luck be able to explain?

a. The Bearers of Luck

The best way to find out what the bearers of luck are consists in considering the kind of entities of which we predicate luck-involving terms and expressions such as “lucky,” “a matter of luck,” or “by luck.”

1. Agents. On the one hand, the term “lucky” can be predicated of agents—for example, “Chloe is lucky to win the lottery.” In general, the kind of beings to which we attribute luck are beings with objective or subjective interests such as self-preservation or desires—see Ballantyne (2012) for further discussion. In this sense, a human or a dog is lucky to survive a fortuitous rockfall, but a stick of wood or a car is not. Still, at least in some contexts, it seems correct to attribute luck to an object without interests, as when one says that one’s beloved car is lucky not to have been damaged by a fortuitous rockfall. However, such assertions are felicitous only insofar as they are parasitic on our interests. No one would say that a stick of wood is lucky not to have been destroyed by a rockfall if its existence bore absolutely no significance to anyone’s interests, and anyone who did would only be speaking figuratively.

A related question is whether the kind of agents to which we attribute luck are only individuals or whether luck can also be ascribed to collectives. There is certainly a sense in which a group of individuals can be said to be lucky, as when we say that a group of climbers is lucky to have survived an avalanche. Coffman (2007) suggests that there seems to be no reason why group luck cannot be reduced to or explained in terms of individual luck. But if one holds—with many theorists working on collective intentionality—that groups can be the bearers of intentional states, it might turn out that group luck cannot be so easily reduced to individual luck. For example, if it is by bad luck that a manufacturing company fails to achieve its yearly revenue goal—so it is bad luck for the company—it does not necessarily follow that each and every one of its workers—for example, people working on the assembly line—is also unlucky, if, say, they cannot be fired by law and the company is not otherwise compromised.

2. Events. On the other hand, the term “lucky” and expressions such as “a matter of luck” or “by luck” can be predicated of events—for example, “Chloe’s lottery win was lucky”—and states of affairs—for example, “It is a matter of luck that Chloe won” or “Chloe’s winning the lottery was by luck”; see Coffman (2014) for further discussion. Plausibly, luck-involving expressions can also be predicated of items belonging to related metaphysical categories such as accomplishments, achievements, actions, activities, developments, eventualities, facts, occurrences, performances, processes, and states. For presentation purposes, luck will here be described as a phenomenon that applies to agents and events, where by “agent” is meant any being with interests and by “event” any member of the previous categories.

b. The Target of the Analysis

1. Relational versus non-relational luck. We say things such as (1) an event E is lucky for an agent S and (2) S is lucky that E. We also say things such as (3) it is a matter of luck that E and (4) E is by luck. Milburn (2014) argues that (1) and (2) are plausibly equivalent: E is lucky for S if and only if S is lucky that E. (3) and (4) also seem equivalent: it is a matter of luck that E if and only if E is by luck. However, (1) and (2) are not equivalent to (3) and (4). Milburn is right in pointing out that this marks an important distinction that anyone in the business of analyzing luck should keep in mind.

The difference between (1) and (2), on the one hand, and (3) and (4), on the other, is that (1) and (2) denote a relation between an agent and an event, whereas (3) and (4) are not indicative of any relation and only apply to events. Call the kind of luck denoted by (1) and (2) relational luck and the kind of luck denoted by (3) and (4) non-relational luck—Milburn uses different terminology: he employs the expression “subjective-relative luck” to refer to relational luck and “subjective-involving luck” to refer to non-relational luck when the relevant event concerns an agent’s action.

Relational luck can be distinguished from non-relational luck regardless of whether the target event is an agent’s state or action. For instance, when the relevant event is an action by the agent—for example, that S scores a goal—the luck-involving expressions in (3) and (4) apply to the agent—for example, it is a matter of luck that S scores a goal—but fail to establish a relationship between the agent—S—and the event—S’s scoring of a goal. In contrast, if the target event is the agent’s action, (1) and (2) do establish a relationship between the agent and her action—for example, that S scores a goal is lucky for S.

In the literature, most accounts of luck try to explain what it takes for an event to be lucky for an agent. In other words, they focus on relational luck. But it might well be that in order to shed light on the special varieties of luck—for example, epistemic, moral, distributive luck—one might need to shift the focus of the analysis to non-relational luck—see Milburn (2014) for further discussion.

2. Synchronic versus diachronic luck. Most accounts of (relational and non-relational) luck focus on when an event is lucky—for an agent or simpliciter—at one point in time. However, Hales (2014) argues that luck may be predicated not only synchronically—that is, of an event’s occurrence at a certain time—but also diachronically—that is, of a series or streak of events occurring at different times. For example, synchronically, we say things such as “Joe was lucky to hit the baseball at the end of the game.” Diachronically, we say things such as “Joe was lucky to safely hit in 56 consecutive baseball games.” Hales’s point is that we can be lucky diachronically but not synchronically, and the other way around. By contrast, McKinnon (2013; 2014) argues that while we can determine the presence and degree of diachronic luck—for example, luck in a streak of successful performances—we do not have the ability to determine the presence of synchronic luck—that is, whether a concrete performance is by luck.

3. Strokes of luck. An important departure from standard analyses of relational and non-relational luck is Coffman (2014; 2015), who thinks that the notion of an event E being a stroke of luck for an agent S is more fundamental than the notion of E being lucky for S—or more simply, than the notion of a lucky event—and that, therefore, the former should be the target of the analysis of any adequate account of luck. Nonetheless, Coffman’s account of strokes of luck features the same kind of conditions that other authors give in their analyses of the notion of lucky event. In view of this, Hales (2015) objects that Coffman’s approach unnecessarily adds an extra layer of complexity to the already complex analysis of luck and casts doubt on whether an analysis of the notion of a stroke of luck can shed any more light than an analysis of the notion of a lucky event in those areas of philosophy where the concept of luck plays a significant role.

c. General Features of Luck

Before entering into further details, it is convenient to highlight three general features of luck that any adequate analysis of the concept should be able to explain.

Goodness and badness. Luck can be good or bad. This is clearly true of relational luck. For instance, we say things such as “Dylan was lucky to survive the car accident” or “Dylan was unlucky to die in the car accident” to mean, respectively, that it is good luck that he survived and bad luck that he died. Moreover, one and the same event can be both good and bad luck for an agent, which plausibly has to do with the fact that two or more interests of the agent are at stake—Ballantyne (2012). For example, losing one’s keys and having to spend the night outdoors is bad luck if one gets a cold as a consequence, but it is also good luck if one thereby avoids an explosion in one’s apartment.

By contrast, attributions of non-relational luck do not so clearly convey good or bad luck—for example, “The discovery of Pluto was a matter of luck.” This is plausibly due to the fact that such attributions do not denote any relationship between a lucky event and an agent or group of agents. To put it differently, if we interpret attributions of that sort as conveying good or bad luck, it is probably because we read them as denoting such a relationship. At any rate, accounting for why luck is good or bad is a desideratum at least for analyses of relational luck.

Finally, although the term “lucky” is ordinarily associated with good luck, in the philosophical literature, it is used to denote events that instantiate good luck as well as events that instantiate bad luck. This is done mainly for the sake of simplicity.

Vagueness. Luck is to some extent a vague notion. Not all instances of luck are as clear-cut as a lottery win. For example, goals scored directly from corner kicks in professional soccer matches are considered neither clearly lucky nor clearly produced by skill. Pritchard (2005: 143) gives another example: if someone drops her wallet, keeps walking, realizes after five minutes that she has just lost it, returns to the place where she dropped it and finds it, is that person lucky to have found her wallet? The answer is not clear. Accordingly, we should not expect an analysis of luck to remove this vagueness. On the contrary, an adequate account should predict borderline cases, that is, cases that are neither clearly lucky nor clearly non-lucky. This is a desideratum for accounts not only of relational luck but also of non-relational luck.

Gradualness. Luck is a gradual notion. In ordinary parlance, it is common to attribute different degrees of luck to different events. For example, winning one million dollars playing roulette is luckier than winning one dollar, even if the odds are the same. Interestingly, winning the prize of an ordinary lottery is luckier than winning the same amount of money by tossing a coin, that is, when the odds are lower. An adequate analysis of luck should also be able to account for these differences in degree. Again, this concerns accounts of relational luck as well as of non-relational luck.

2. Luck and Significance

Several atomic nuclei joining and triggering an explosion is an event that is neither lucky nor unlucky for anyone if it happens at the other end of the galaxy. But it is bad luck if the explosion takes place nearby. One way to account for the difference in luckiness is that while the former event is not significant to anyone, the latter is significant to whoever is nearby. Cases like this motivate philosophers who theorize about the concept of luck to endorse a significance condition, that is, a requirement to the effect that an event is lucky for an agent only if the event is significant to the agent.

Since the significance condition establishes a relationship between an agent and an event, whether one thinks that such a condition is needed depends on what the target of one’s account is. For instance, if one is in the business of analyzing relational luck, one will be willing to include a significance condition in one’s analysis. But if one’s aim is to account for non-relational luck instead—that is, for when an event is lucky simpliciter—one will be reluctant to include such a condition in one’s analysis—see Pritchard (2014) for further discussion.

Although there is wide agreement that an adequate analysis of relational luck must include a significance condition, there is significant disagreement about its specific formulation. Pritchard (2005: 132–3) formulates the significance condition as follows:

S1: An event E is lucky for an agent S only if S would ascribe significance to E, were S to be availed of the relevant facts.

S1 requires that lucky agents have the capacity to ascribe significance. But that is problematic insofar as the condition prevents sentient nonhuman beings (Coffman 2007) and human beings with diminished capacities like newborns or comatose adults (Ballantyne 2012) from being lucky.

Coffman (2007) proposes an alternative significance condition in terms of the positive or negative effect of lucky events on the agent:

S2: An event E is lucky for an agent S only if (i) S is sentient and (ii) E has some objective evaluative status for S—that is, E has some objectively good or bad, positive or negative, effect on S.

Ballantyne (2012) gives a counterexample to S2 by arguing, first, that (ii) should be read as follows:

(ii)* E has some objectively positive or negative effect on S’s mental states.

The reason given by Ballantyne is that if the event’s effect is not on the agent’s mental states, it is not obvious why clause (i) is required. With that in place, the counterexample to S2 goes as follows: an unlucky man has no inkling that scientists have randomly selected him to put his brain in a vat to feed his neural connections with real-world experiences. The case is allegedly troublesome for S2 because the event, which is bad luck for the man, has no impact on the man’s mental states and, in particular, on his interior life, which is not altered.

A reply might be that, although the fact that the man’s brain is put in a vat does not affect the man’s interior life and namely his phenomenal mental states, it certainly affects his representational mental states. In particular, most of them turn out false, which seems to be objectively negative for the man, just as S2 requires.

Ballantyne (2012) proposes an alternative formulation of the significance condition in terms of the positive or negative effect of lucky events on the agent’s interests:

S3: An event E is lucky for an agent S only if (i) S has a subjective or objective interest N and (ii) E has some objectively positive or negative effect on N—in the sense that E is good or bad for S.

S3 is more specific than S2 in the kind of attributes that are supposed to be positively or negatively affected by lucky events. While S2 does not say whether these need to be the qualitative states of sentient beings, or their representational states, or their physical condition, S3 is explicit that what lucky events affect are the subjective and objective interests of individuals.

Leaving aside the question of what the correct formulation of the significance condition is, it is interesting to see how a significance condition can help explain the three general features of luck outlined above, that is, the goodness, badness, vagueness, and gradualness of luck. Concerning goodness and badness, the explanation is straightforward: luck is good or bad because the significance that lucky events have for people is positive or negative. Concerning vagueness, significance is a vague concept, so including a significance condition in an analysis of luck at least does not remove its inherent vagueness. Concerning gradualness, it can be argued that the degree of luck of an event proportionally varies with its significance or value—Latus (2003), Levy (2011: 36), Rescher (1995: 211–12; 2014). Consider the previous example of winning one million dollars playing roulette versus winning one dollar when the probability of winning is the same: it can be simply argued that the former event is luckier than the latter because it is more significant.

3. Probabilistic Accounts

Paradigmatic lucky events—for example, winning a fair lottery—typically occur by chance. Probabilistic accounts of luck explicitly appeal to the probability of an event’s occurrence to explain why it is by luck. In addition, they typically include a significance condition to explain why events are lucky for agents. For discussion purposes, the analyses of luck below will be presented as analyses of significant events, so the relevant significance condition can be omitted.

a. Objective Accounts

Some accounts make use of objective probabilities to define luck, that is, the kind of probabilities that are not determined by an agent’s evidence or degree of belief, but by features of the world:

OP1: A significant event E is lucky for an agent S at time t if and only if, prior to the occurrence of E at t, there was low probability that E would occur at t.

OP1 says that lucky events are events whose occurrence was not objectively likely. A related way to formulate a probabilistic view—suggested by Baumann 2012—is by means of conditional objective probabilities:

OP2: A significant event E is lucky for an agent S at time t if and only if, prior to the occurrence of E at t, there was low objective probability conditional on C that E would occur at t.

C is whatever condition one uses to determine the probability that the event will occur. For example, the unconditional probability that Lionel Messi will score a goal in the soccer match is high, but given C—the fact that he is injured—the probability that he will score is low. Suppose that Messi ends up scoring by luck. The condition helps explain why: he was injured, and therefore it was not very likely that he would score.
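
As a rough illustration of OP2 with hypothetical numbers, suppose the unconditional probability that Messi scores is Prob[E] = 0.7, while the probability conditional on his being injured is Prob[E given C] = 0.1. Relative to C, his actual goal satisfies OP2’s low-probability clause and so counts as lucky, whereas relative to the unconditional probability it would not.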

According to Hales (2014), probabilistic views of luck such as OP1 or OP2 are the most widespread among scientists and mathematicians. But they face at least two problems. First, a dominant—although not undisputed—idea is that necessary truths have probability 1. In view of this, Hales (2014) argues that probabilistic analyses cannot account for lucky necessities, which are maximally probable. For example, he contends that organisms—humans included—are lucky to be alive because the gravitational constant, G, has the value it actually has, and yet the probability that G made life possible is 1.

Second, another problem for probabilistic accounts is that, although rare, there are highly probable lucky events, that is, lucky events whose occurrence is highly probable—see Broncano-Berrocal (2015). Suppose that someone is the most wanted person in the galaxy and that billions of mercenaries are trying to kill her, but also that her combat skills drastically reduce the probability that each independent assassination attempt will succeed. Suppose that one such attempt succeeds for completely fortuitous reasons that have nothing to do with the exercise of her skills. That she is killed is obviously bad luck, but it was also very probable given how many mercenaries were trying to kill her: even if each killing attempt had a low probability of succeeding, the probability that at least one would succeed was high given the number of independent attempts—that is, the probability of the disjunction of all attempts was high. This shows, contrary to what OP1 and OP2 say, that luck does not entail low probability of occurrence.
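
To see the point with hypothetical numbers, suppose that each of n independent assassination attempts has the same small probability p of succeeding. The probability that at least one attempt succeeds is then:

Prob[at least one attempt succeeds] = 1 – (1 – p)^n.

With, say, p = 0.001 and n = 5,000, this comes to 1 – 0.999^5000, which is roughly 0.99: the victim’s death is highly probable even though each individual attempt was very unlikely to succeed.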

OP1 and OP2 are analyses of synchronic luck. McKinnon (2013; 2014) proposes a probabilistic account of diachronic luck instead. The view, called the expected outcome view, starts with the observation that we can determine the expected objective ratio of many events, including people’s performances. By way of illustration, the expected ratio of flipping a coin is 50 percent tails and 50 percent heads. On the other hand, the expected ratio of a certain basketball player’s successful free-throw shots might be 90 percent. However, in real-life series of tosses or free-throw shots the outcomes typically deviate from those values. In the light of these considerations, McKinnon proposes the following view:

OP3: For any series A of events (E1, E2, …, En) that are significant to an agent S and for any objective expected ratio N of outcomes for events of type E, S is lucky in proportion to how much the actual ratio of outcomes in A deviates from N.

In a nutshell, McKinnon’s view is that we attribute any deviation from the expected ratio of outcomes to luck: to good luck if the deviation is positive, and to bad luck if the deviation is negative. If the actual ratio is as expected, the ratio is fully attributable to skill. One key element of McKinnon’s view—and the reason why she rejects any attempt to give an account of synchronic luck—is that she thinks that, while we can know that the set of outcomes that deviates from the expected ratio is due to luck, we cannot know which one of the outcomes in that set is by luck. In other words, we can know whether we are diachronically lucky, but not whether we are synchronically lucky.
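
A hypothetical illustration of OP3: suppose a basketball player’s expected free-throw ratio is N = 90 percent and that, in a series of 100 significant attempts, she makes 96. The actual ratio deviates from N by +6 percentage points, and on McKinnon’s view that positive deviation is attributed to good luck; had she made only 84, the –6 point deviation would be attributed to bad luck. What remains undeterminable, on her view, is which particular makes or misses within the series were the lucky or unlucky ones.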

Before turning to a different type of probabilistic account, let us see how accounts modeling luck in terms of objective probability explain the three general features of luck outlined above. On the one hand, they can explain why luck is a gradual notion in a natural way. For instance, Rescher (1995: 211–12; 2014) thinks that luck varies not only with significance but also with chance. If S is the value or significance of an event E, how lucky E is can be determined, according to Rescher, as follows:

Luck = S × (1 – Prob[E]).

In other words, Rescher thinks that luck varies proportionally with the value or significance that the event has for the agent and inversely proportionally with the probability of its occurrence.
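
A hypothetical calculation illustrates Rescher’s proposal. Take the earlier roulette example and suppose, purely for illustration, that the significance of winning one million dollars is S = 1,000,000 and the significance of winning one dollar is S = 1 (measured, say, in dollars), with the same probability of winning in both cases, Prob[E] = 1/37 (a single-number bet in European roulette). Then:

Luck of the million-dollar win = 1,000,000 × (1 – 1/37) ≈ 972,973
Luck of the one-dollar win = 1 × (1 – 1/37) ≈ 0.97

On this way of measuring, the first event is far luckier than the second even though the odds are identical, just as the earlier comparison suggested.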

On the other hand, defenders of objective probabilistic views might in principle explain why luck is a vague notion in epistemic terms. They might argue that knowing exactly how lucky someone is with respect to an event entails that the exact probability of the event’s occurrence is known. However, the relevant probabilities are typically unknown or, at best, approximately known, which might in principle help explain why, say, a goal from a corner kick is neither clearly lucky nor clearly produced by skill: prior to its occurrence, the probability that it would occur was unclear.

Finally, as we have seen, McKinnon thinks that her view also helps explain why luck is good or bad: luck is good or bad depending on whether the actual deviation from the expected ratio is positive or negative.

b. Subjective Accounts

A different way to model luck in probabilistic terms is by means of subjective probabilities, that is, the kind of probabilities that are determined by an agent’s evidence or degree of belief. One way to state this kind of view is that whether or not an event counts as lucky for an agent depends on the agent’s degree of belief in the occurrence of the event, that is, on how confident she is or how strongly she believes that the event will occur—see Latus (2003), Rescher (1995: 78–80), and Steglich-Petersen (2010) for relevant discussion. More precisely:

SP1: A significant event E is lucky for an agent S at time t if and only if, just before the occurrence of E at t, S had a low degree of belief that E would occur at t.

A subjective probabilistic account might also be formulated in terms of the agent’s evidence for the occurrence of the event—see Steglich-Petersen (2010):

SP2: A significant event E is lucky for an agent S at time t if and only if, given S’s evidence just before the occurrence of E at t, there was low probability that E would occur at t.

SP1 and SP2 characterize luck as a perspectival notion: if for A but not for B it is subjectively improbable that an event E will occur, then, if E occurs, E is lucky for A but not for B—Latus (2003) endorses this thesis. For example, suppose that someone receives a big check from a secret benefactor. From that person’s perspective, it is good luck that she has received the check, but from the perspective of the benefactor, it is not—the example is from Rescher (1995: 35). In addition, those who firmly believe in fate or whose evidence strongly points to its existence are never lucky according to these views, because everything that happens to them is highly probable from their perspective.

Stoutenburg (2015) gives a similar evidential account of degrees of luck. The idea is that an agent is lucky with respect to an event to the extent that her evidence does not guarantee its occurrence, in the sense that if the conditional probability of the occurrence of the event given the agent’s evidence is not maximal, she is lucky to some degree with respect to that event:

SP3: A significant event E is lucky to some degree for an agent S at time t if and only if, given S’s evidence just before the occurrence of E at t, the probability that E would occur at t is not 1.

A problem for views such as SP1, SP2, and SP3 is that events are no less lucky if we have no evidence or have not thought about them—see Steglich-Petersen (2010). For example, someone would be clearly lucky if, unbeknownst to her, a bullet just missed her head by centimeters. Steglich-Petersen (2010) thinks that one way to fix this problem is to formulate a subjective view in terms of the agent’s total knowledge instead of her degree of belief or evidence for the occurrence of the event:

SP4: A significant event E is lucky for an agent S at time t if and only if, for all S knew just before the occurrence of E at t, there was low probability that E would occur at t.

SP4 is compatible with an event being lucky for the agent when she has no prior evidence or doxastic state about its occurrence. But SP4 might still not yield the right results. Consider a macabre lottery in which all the participants have been poisoned and the only way to survive is to win the prize, which is the antidote. The lottery draw is a fair one, so surviving is a pure matter of chance. Suppose that the only difference in knowledge between two participants, A and B, is that only A knows that she has been poisoned and is a participant in the lottery. For all A knows, there is low probability that she will survive. In contrast, for all B knows, her survival is very likely—she is a healthy person and has no reason to think that she has been poisoned. According to SP4, B would not be lucky if she won the lottery and survived as a result. Intuitively, however, A and B would be equally lucky if they won the lottery.

In general, this and other cases might be taken to illustrate that what is apparently lucky does not always coincide with what is actually lucky—see Rescher (2014) for the distinction between apparent and actual luck. A potential problem for subjective views is then that they might be only capturing intuitions about the former.

Steglich-Petersen (2010) advances a different account, which is not probabilistic in nature, but which is worth considering in this section, not only because it is a natural development of SP4, but also because, like SP2, SP3, and SP4, it characterizes luck as an epistemic notion. In particular, it analyzes luck in terms of the agent’s epistemic position with respect to the future occurrence of the lucky event:

SP5: A significant event E is lucky for an agent S at time t if and only if, just before the occurrence of E at t, S was not in a position to know that E would occur at t.

Steglich-Petersen explains that we are in a position to know that an event will occur if, by taking up the belief that the event will occur, we thereby know that it will occur. SP5 yields the correct result in the macabre lottery case, which was troublesome for SP4. None of the participants is in a position to know that she will win the lottery and survive as a result. For that reason, the winner is lucky.

However, SP5 might not capture the intuitions of other cases correctly. Suppose that someone is the holder of a ticket in a fair lottery. During the lottery draw, a Laplacian demon predicts and tells that person that she will be the winner, so she comes to know in advance—and therefore is in a position to know—that she will be the winner. However, that person is not less lucky to win the lottery because of that knowledge or because of being in that position. After all, it is still a coincidence that she has purchased the ticket that corresponds to the accurate prediction of the demon. In sum, knowing that one will be lucky—and therefore being in a position to know it—does not necessarily prevent one from being lucky.

Before considering an alternative approach to luck, let us see how subjective probabilistic accounts explain the three general features of luck presented at the beginning of the article. On the one hand, they can account for degrees of luck in terms of degrees of subjective probability. As we have seen, SP3 says that an agent is increasingly lucky with respect to an event the less likely the occurrence of the event—conditional on her evidence—is. On the other hand, advocates of the subjective approach might explain borderline cases of luck by appealing to the fact that the relevant subjective probabilities are not always transparent, so if we cannot determine whether an event is lucky or non-lucky, it is plausibly because the relevant subjective probabilities cannot be determined either. Finally, to explain why luck is good or bad, defenders of subjective accounts can simply include a significance condition on luck in their analyses.

4. Modal Accounts

A different approach to luck emphasizes the fact that paradigmatic instances of luck such as lottery wins could have easily failed to occur. Modal accounts accordingly explain luck in terms of the notion of easy possibility. As usual in areas of philosophy where the notion of possibility is invoked, advocates of the modal approach use possible worlds terminology to explain that notion and in turn the concept of luck. In this sense, that a lucky event could have easily not occurred means that, although it occurs in the actual world, it would fail to occur in close possible worlds.

Closeness is simply assumed to be a function of how intuitively similar possible worlds are to the actual world. For example, if an event E occurs at time t in the actual world, close possible worlds can be obtained by making a small change to the actual world at t and by seeing what happens to E at t or at times close to t—see Coffman (2007; 2014) for relevant discussion. One should keep in mind that although current modal views are closeness views, it is in principle possible to give a modal account of luck that ranges over distant possible worlds.

Several formulations of modal conditions on luck can be found in the literature, where the main point of disagreement concerns the proportion of close possible worlds in which an event must fail to occur in order for its actual occurrence to be by luck. For discussion purposes, however, those conditions will be presented here as if they constituted full-fledged analyses of luck, but it is important to keep in mind that modal conditions are typically considered necessary but not sufficient for a significant event to be by luck. A prominent exception is Pritchard (2005), who is the only author in the literature advocating a pure modal account of luck—in more recent work (2014), he drops the significance condition from his analysis, plausibly because he is mainly interested in giving an account of non-relational luck. Also for discussion purposes, the analyses of luck below will be presented, as before, as analyses of significant events. Without further ado, let us consider the following modal account by Pritchard (2005: 128):

M1: A significant event E is lucky for an agent S at time t if and only if E occurs in the actual world at t but does not occur at t or at times close to t in a wide proportion of close possible worlds in which the relevant initial conditions for E are the same as in the actual world.

According to M1, one is lucky to win a fair lottery because in a wide class of close possible worlds one would lose. M1 has two important features. The first is that it does not treat every close possible world as relevant to determining whether an event is lucky: only those in which the relevant initial conditions are the same as in the actual world count. According to Pritchard (2014), the relevant initial conditions for an event are specific enough to allow a correct assessment of the luckiness of the target event, but not so specific as to guarantee its occurrence. Nonetheless, Pritchard leaves it as a contextual matter what features of the actual world need to be fixed in our evaluation of close possible worlds. For instance, when we assess the modal profile of lottery results, we typically keep fixed features such as the fairness and the odds of the lottery or the fact that one has decided to purchase a specific lottery number.

Riggs (2007) argues that M1 is defective precisely because there is no non-arbitrary way to fix the relevant initial conditions. In reply, Pritchard (2014) argues that an analysis of a concept should not be more precise than the concept that the analysis intends to account for. Given that luck is a vague notion, the somewhat vague clause on initial conditions might be after all doing some explanatory work.

The second important feature of M1 is that it requires that the lucky event fails to occur in a wide proportion of close possible worlds. Pritchard (2005: 130) explains that by “wide” he means at least approaching half the close possible worlds, where events that are clearly lucky would not obtain in most close possible worlds.

However, there are clearly lucky events, such as obtaining heads by flipping a coin, that would fail to occur in only about half of the close possible worlds rather than in most of them—since the probability of heads is 0.5, we can suppose that in half the close possible worlds the outcome would still be heads. Perhaps, then, the following slightly different formulation is to be preferred—see Coffman (2007):

M2: A significant event E is lucky for an agent S at time t if and only if E occurs in the actual world at t but does not occur at t or at times close to t in at least half the close possible worlds in which the relevant initial conditions for E are the same as in the actual world.

However, Levy (2011: 17–18) argues that if we accept that an event that does not occur in half the close possible worlds is lucky, we can also accept that an event that does not occur in a little less than half the close possible worlds—for example, in 49 percent of them—is lucky as well. In view of this, Levy thinks that it is better not to commit one’s modal account to a precise view of the issue. Instead, Levy argues that there is no fixed proportion of close possible worlds in which an event must fail to occur to be considered lucky in the actual world. His point is that there might be different “large enough” proportions of close possible worlds in which events must fail to occur to be considered lucky. According to Levy, what makes the threshold vary from case to case is the significance that the event has for the agent. A modal account in the spirit of Levy’s considerations would then be the following:

M3: A significant event E is lucky for an agent S at time t if and only if E occurs in the actual world at t but does not occur at t or at times close to t in a large enough proportion of close possible worlds in which the relevant initial conditions for E are the same as in the actual world, where the relevant proportion of close possible worlds is determined by the significance that E has for S.

Lackey (2008) raises two important objections to the modal approach. The first one challenges the idea that the easy possibility of an event not occurring is necessary for luck. She proposes a counterexample involving a modally robust lucky event. Suppose that (i) A buries a treasure at location L and that (ii) B independently places a plant in the ground of L. When digging, B discovers A’s treasure. Lackey’s point is that if we stipulate that A’s and B’s independent actions are sufficiently modally robust, in the sense that there is no chance that they would fail to occur in close possible worlds, B’s discovery, which is undeniably lucky, would occur in most close possible worlds.

Pritchard (2014) and Levy (2009) try to circumvent the objection in two steps. First, they distinguish between the notions of luck and fortune. Then, they propose an error theory according to which most people would be mistaken to say that B’s discovery is by luck: B’s discovery is in reality fortunate, not lucky—see section 7 for the specific way in which Pritchard and Levy distinguish luck from fortune.

Lackey’s second objection targets the idea that the easy possibility of an event not occurring is sufficient for luck. Lackey thinks that whimsical events—that is, events that result from actions that are done on a whim—show exactly this. For instance, suppose that someone decides to catch the next flight to Paris on a whim. That person’s going to Paris is not by luck—since it is the result of her self-conscious decision—but it would nevertheless fail to occur in most close possible worlds—since she has made the decision on a whim.

In reply, Broncano-Berrocal (2015) argues that Lackey’s objection overlooks the clause on initial conditions of modal accounts: if someone decides to go—and goes—to Paris on a whim, close possible worlds in which the relevant initial conditions for that trip are the same as in the actual world—that is, the only possible worlds that according to modal views are relevant to assessing whether the trip is by luck—are worlds in which that person makes the decision to go to Paris. But, consistently with what modal accounts say, that person goes to Paris in most of those worlds. In a similar way, when it comes to evaluating whether someone in possession of a specific ticket is lucky to have won the lottery, we only consider close possible worlds in which she has decided to buy that specific ticket. Again, in most of those worlds, that ticket is a loser, just as modal accounts predict.

On the other hand, Hales (2014) thinks that cases of lucky necessities are problematic not only for objective probabilistic accounts but also for modal views. For example, if Jack the Ripper is terrorizing the neighborhood and it is one’s dearest friend Bob knocking on one’s door, one might be lucky that Bob is not Jack the Ripper, but it is metaphysically impossible that Bob is Jack the Ripper because things are self-identical—Hales gives credit to John Hawthorne for the example.

Before turning to lack of control views, let us see how modal accounts explain the three general features of luck. Concerning goodness or badness, modal views can simply include a significance condition—although, as noted, Pritchard (2014), one of the main advocates of the modal approach, thinks that a significance condition is not necessary for luck. In addition, we have seen that the clause on the relevant initial conditions of the event is vague enough to preserve the characteristic vagueness of the concept of luck.

On the other hand, modal views have at least two interesting ways to account for degrees of luck—the terminology below is from Williamson (2009), who applies it to the safety condition for knowledge. M1, M2, and M3 adopt what can be called the proportion view of the gradualness of luck: they cash out the degree of luck of an event in terms of the proportion of close possible worlds in which it would fail to occur—the larger the proportion of such close possible worlds is, the luckier the event is. Church (2013) argues that the proportion view should not be restricted to close possible worlds only: degrees of luck should be modeled in terms of all relevant possible worlds, although he also argues that more weight should be given to close ones.

The idea that more weight should be given to some possible worlds when fixing the degree of luck of an event motivates a different view of the gradualness of luck. The view, which can be called the distance view, says that the degree of luck of an event varies as a function of the distance to the actual world of the possible worlds in which it would fail to occur. In this way, the closer those worlds are, the luckier the event is—Pritchard (2014) endorses the distance view.

On a related note, modal theorists can explore the relation between the significance of a lucky event and its modal profile. As we have seen, Levy (2011) thinks that the size of the proportion of close possible worlds in which an event must fail to occur to count as lucky is sensitive to the significance that the event has for the agent. Although Levy thinks that it is a mistake to seek much clarity about how the latter affects the former, he also believes that there is a relation of inverse proportionality between the two: the more significant an event is for an agent, the smaller the proportion of close possible worlds in which it would not occur needs to be for it to be considered lucky for the agent—Coffman (2014) calls this the inverse proportionality thesis; see Levy (2011: 36).

By way of illustration, compare surviving a round of Russian roulette with one bullet in the chamber of a six-shot revolver—approximately a 0.16 probability of being shot—with winning one dollar in a poker game after having called an all-in that one knew one had only a 0.16 probability of losing. In both cases, one would succeed—that is, one would survive or win—in most close possible worlds, but only the former case is considered clearly lucky. The inverse proportionality thesis accounts for the difference: surviving is such a significant event that the proportion of close possible worlds in which one dies need not be large for one’s actual survival to be considered lucky. However, Coffman (2015: 40) argues that the thesis is not sustainable precisely because it leads to the result that all extremely significant events count as lucky if there is at least a small non-zero chance that they will not happen—for example, the thesis seems to entail that we are lucky to survive every time we take a flight.

5. Lack of Control Accounts

One of the most widespread intuitions about luck is that lucky events are events beyond our control. For example, one way to explain why we are lucky to win the lottery is that the outcome of the lottery is beyond our control. In the literature, different lack of control views account for luck in those terms.

Some authors give pure lack of control accounts—for example, Broncano-Berrocal (2015), Riggs (2009). Other authors think that lack of control conditions are necessary but not sufficient for significant events to be by luck—for example, Coffman (2007; 2009), Latus (2003), Levy (2009; 2011). As in the case of modal conditions, and mainly for discussion purposes, the latter will be presented as if they constituted full-fledged analyses of luck—also as before, the analyses will be presented as analyses of significant events. That said, the simplest lack of control account has the following form:

LC1: A significant event E is lucky for an agent S at time t if and only if E is beyond S’s control at t.

Many lucky events are beyond our control, so LC1 seems to be on the right track. However, Lackey (2008) argues that the fact that a significant event is beyond our control is neither necessary nor sufficient for the event being lucky. Against the sufficiency claim, Lackey argues that many nomic necessities—for example, sunrises—are not under our control, but that does not mean that they are by luck—see also Latus (2003) for this objection. To prove that lack of control is not necessary for luck, Lackey proposes a case in which a demolition worker, A, succeeds in demolishing the warehouse she was planning to demolish by pressing the button of the demolition system she had designed to that effect, but only because the electrical current is accidentally restored after a mouse had damaged the connection wires by chewing on them. According to Lackey, the explosion is both under A’s control and by luck.

Coffman (2009) and Levy (2011), who think that lack of control is not sufficient for luck, argue that Lackey’s counterexample to the necessity claim rests on the false thesis—called by Coffman the luck infection thesis—that if luck affects the conditions that enable an exercise of control, then the exercise of control itself is by luck; more generally, if S is lucky to be in a position to ϕ and S ϕ-es, then S ϕ-es by luck. The thesis, according to Coffman, has blatant counterexamples. For example, a lifeguard who accidentally goes to work very early and sees a swimmer drowning is lucky to be in a position to save the swimmer, but if done competently, it is not by luck that she saves him.

To overcome this and other objections, lack of control theorists define the notion of control in different ways. For example, Coffman (2009) thinks that an event is under an agent’s control just in case she is free to do something that would help produce it and something that would help prevent it. Rescher (1969: 329) gives a similar account of control as the capacity to produce the occurrence of an event—what Rescher calls positive control—and the capacity to prevent it—what he calls negative control. While Rescher defends a probabilistic account of luck, Coffman thinks that lack of both negative and positive control—when understood in terms of freedom—is necessary for luck. The following is a lack of control view in the spirit of Coffman’s and Rescher’s respective conceptions of control:

LC2: A significant event E is lucky for an agent S at time t if and only if S is not both free to do something that would help produce E at t—or lacks the capacity to do it—and free to do something that would help prevent E at t—or lacks the capacity to do it.

An immediate problem for LC2 is that having control is not the same as exercising it. We might have control over something in the sense that we are free or have the capacity to control it, but that does not mean that we actually exercise that capacity or freedom. For example, a competent pilot who is free or has the capacity to produce and prevent a plane crash but who refuses to take control of the plane for some reason is objectively lucky that a passenger manages to land the plane safely and that, as a result, she survives.

Levy (2011: chap. 5) understands control in similar terms as Coffman and Rescher, but he introduces additional epistemic constraints. For Levy, an event is under an agent’s control just in case there is a basic action that she could perform which she knows would bring about the event and how it would do so. This way to understand control can be supplemented with Rescher’s point that agents can also control an event by inaction, omission or inactivity (Rescher 1969: 369). Taking the latter into account, the following is a pure lack of control view in the spirit of Levy’s conception of control:

LC3: A significant event E is lucky for an agent S at time t if and only if there is no basic action that S is able to perform—or to omit performing—whose occurrence—or non-occurrence—is such that S knows it would bring about—or prevent—E at t and how it would do so.

According to LC3, if we do not want to be exposed to the whims of luck, not only do we have to be able to perform—or omit performing—actions that causally influence the world, but we also need to know that, and how, the world is sensitive to them.

A potential problem for LC3 is that we might be properly described as being in control of something when we act in a way that brings it to a desired state even though we do not know exactly how this happens. For example, a driver might know that by turning the steering wheel to the left she will avoid an obstacle in the road, but she might be completely mistaken about how exactly this works—for example, she might erroneously believe that, whenever she turns the steering wheel to the left, it is a magical dwarf who moves the car to the left. So, she knows that her basic action will bring about the desired effect while failing to know how. The problem is that if that person competently avoids the obstacle, the maneuver seems to be under her control, no matter that she mistakenly thinks that it is under the dwarf’s.

A different lack of control account is due to Riggs (2009), who tries to defend the lack of control approach from Lackey’s objection that the fact that an event is beyond our control does not suffice for the event being lucky. Riggs admits that although it is true that many nomic necessities—for example, sunrises—are beyond our control, we can still exploit them to our advantage. The idea is that if we exploit them for some purpose, they are not lucky for us even if they are not under our control. The following analysis accounts for luck in those terms:

LC4: A significant event E is lucky for an agent S at time t if and only if (i) E is beyond S’s control at t and (ii) S did not successfully exploit E, prior to E’s occurrence at t, for some purpose.

To illustrate how LC4 can distinguish between lucky and non-lucky physical events beyond our control, Riggs proposes a case in which two people, A and B, are about to be executed, but only A knows two important facts: first, that their captors believe that solar eclipses are in reality a message from the gods telling them to stop sacrifices; second, that, unbeknownst to their captors, a solar eclipse will take place at the exact time the execution is planned. Riggs thinks that, while B is lucky to be released, A is not. By being in a position to exploit the eclipse in her favor, A is in control of the situation.

Coffman (2015: 10) argues via counterexample that LC4 does not distinguish correctly between lucky and non-lucky physical events beyond our control. He proposes a case in which someone lives in an underground facility that is, unbeknownst to her, solar-powered. According to Coffman, that person, who has become completely oblivious to sunrises, is not lucky that the sun rises every morning and keeps her facility running, even though the sunrise is both beyond her control and not successfully exploited by her for any purpose.

Broncano-Berrocal (2015) gives a lack of control account in the spirit of Riggs’s, but with significant differences. According to Broncano-Berrocal, there are two ways in which something might be under our control. On the one hand, we exercise effective control over something by competently bringing it to a desired state—for example, by causally influencing it in a certain way. On the other hand, something is under our tracking control when we actively check or monitor that it is currently in a certain desired state, so that we are thereby disposed or in a position either (i) to exercise effective control over it or (ii) to act in a way that would allow us to achieve goals related to the thing controlled—for example, exploiting it to our advantage. By way of illustration, when flying on autopilot, a pilot does not exercise effective control over the plane—for example, she does not exert any causal influence on it—but the plane is under her tracking control if she is sufficiently vigilant. A key point of Broncano-Berrocal’s account is that, depending on the practical context, attributions of control such as “Event E is under S’s control” might refer either to effective control, to tracking control, or to both. The corresponding account of luck is the following:

LC5: A significant event E is lucky for an agent S at time t if and only if E is beyond S’s control at t, where E is beyond S’s control at t if either (i) S lacks effective control over E, or (ii) E is not under S’s tracking control, or (iii) both.
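Rendered schematically—in our own shorthand rather than Broncano-Berrocal’s notation, with EC and TC abbreviating effective and tracking control—LC5 amounts to the following, where clause (iii) is absorbed by reading the disjunction inclusively:

\[
\mathrm{Lucky}(E, S, t) \iff \mathrm{Significant}(E, S) \land \big( \lnot \mathrm{EC}(S, E, t) \lor \lnot \mathrm{TC}(S, E, t) \big)
\]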

Lotteries are typically not under our tracking control—although they might be if a Laplacian demon tells us what the result will be. The reason why winning a fair lottery is a matter of luck is, according to LC5, that we are not able to causally influence the result in the desired way, that is, that we lack effective control. By the same token, LC5 also counts as lucky winning a lottery that, unbeknownst to one, has been rigged in one’s favor.

LC5 allows a different response to Lackey’s demolition case: Lackey’s intuition that the explosion is under A’s control can be explained by the fact that A exercises effective control over the explosion by pressing the button. But the intuition that A is lucky to demolish the warehouse is parasitic on the fact that the explosion is not under A’s tracking control. In particular, the practical context provided by Lackey is such that A is responsible for the design of the demolition system but fails to check whether the connection wires are damaged—sometimes, tracking control might be very difficult to achieve. In a similar way, LC5 explains that, while we lack effective control over many physical events—for example, sunrises—the reason why they are not lucky is that they are under our tracking control, that is, they are things that we regularly monitor and thereby can exploit to our advantage.

Coffman’s solar-powered facility case, the counterexample to LC4, is also a counterexample to LC5. Coffman’s point is that sunrises are not lucky for the person living in the solar-powered underground facility, even though they are not under her control—tracking or effective. In reply, defenders of lack of control views might argue that it is not unreasonable to say that such a person is lucky that the sun rises every morning and, unbeknownst to her, keeps her facility running. After all, there are similar attributions of luck in ordinary speech. For example, we say things such as “S is lucky to live in an earthquake-free region” even though S is unaware of this, and S is thereby lucky that an earthquake will not make her house collapse.

Finally, Hales (2014) thinks that there are cases of skillful achievements that lack of control accounts are compelled to consider lucky. For instance, he thinks that not even the best batter in history can plausibly be said to have control over whether he hits the ball, since there are many factors over which he cannot exercise any sort of control—for example, distractions, the pitches he receives, and the play of the opposing fielders. In reply, lack of control theorists might argue that Hales is illicitly raising the standards for control. After all, intuitions about whether the result of our actions is under our control go hand in hand with intuitions about whether that result is due to our skill.

As a final note, let us briefly consider how lack of control accounts explain the three general features of luck presented at the beginning of the article. Concerning goodness or badness, lack of control views can, like other views, simply include a significance condition. Concerning vagueness, the notion of control is not precise enough to remove all vagueness from the analysis of luck. Concerning gradualness, control, like luck, comes in degrees. In particular, lack of control theorists might endorse the view that the degree of luck of an event is inversely proportional to the degree of control that the agent has over it—see Latus (2003) for further discussion.
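The phrase “inversely proportional” is best read loosely here, as the claim that luck decreases as control increases. One illustrative way of modelling it—a toy rendering of our own, not Latus’s, which assumes that both degrees can be normalized to the interval [0, 1]—is:

\[
\mathrm{luck}(E, S) = 1 - \mathrm{control}(S, E)
\]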

6. Hybrid Accounts

Some authors opt for giving accounts of luck that mix modal or probabilistic conditions with lack of control conditions. The rationale behind this move is, as Latus (2003) puts it, that although lack of control over an event often goes hand in hand with the event’s having a low chance of happening—or with the event’s being modally fragile—there are non-lucky events that are either beyond our control—for example, sunrises—or have a low chance of occurring—for example, rare significant events brought about by ability. Latus’s hybrid view features a lack of control condition and a subjective probabilistic condition:

H1: A significant event E is lucky for an agent S at time t if and only if (i) just before the occurrence of E at t, S had a low degree of belief that E would occur at t, and (ii) E is beyond S’s control at t.

By contrast, Coffman (2007) and Levy (2011) opt for conjoining a lack of control condition with modal conditions. Coffman’s analysis is roughly the following—he includes several further refinements to handle specific cases of competing significant events:

H2: A significant event E is lucky for an agent S at time t if and only if (i) E does not occur around t in at least half the possible worlds obtainable by making no more than a small change to the actual world at t, and (ii) E is beyond S’s control at t.

Levy’s hybrid analysis (2011) features a different modal condition:

H3: A significant event E is lucky for an agent S at time t if and only if (i) E occurs in the actual world at t but does not occur at t or at times close to t in a large enough proportion of close possible worlds, where the relevant proportion of close possible worlds varies inversely with the significance of E for S, and (ii) E is beyond S’s control at t.
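To make the modal clauses of H2 and H3 easier to compare, the following is a schematic rendering in our own notation, not Coffman’s or Levy’s, which treats the relevant sets of possible worlds as if their proportions could be measured. Writing \(W_{\mathrm{small}}\) for the worlds obtainable by making no more than a small change to the actual world at t, and \(W_{\mathrm{close}}\) for the close possible worlds, conditions (i) of H2 and H3 come out roughly as:

\[
\textrm{H2(i):}\quad \frac{\left|\{ w \in W_{\mathrm{small}} : E \text{ does not occur around } t \text{ in } w \}\right|}{\left|W_{\mathrm{small}}\right|} \geq \tfrac{1}{2}
\]

\[
\textrm{H3(i):}\quad E \text{ occurs at } t \text{ in the actual world, and } \frac{\left|\{ w \in W_{\mathrm{close}} : E \text{ does not occur at or around } t \text{ in } w \}\right|}{\left|W_{\mathrm{close}}\right|} \geq \theta\big(\mathrm{sig}(E, S)\big)
\]

where the threshold \(\theta\) decreases as the significance of E for S increases, so that more significant events need a smaller proportion of failure-worlds in order to count as lucky.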

Levy calls this kind of luck chancy luck, but argues that there also exists a non-chancy variety of luck, which is the kind of luck that affects one’s psychological traits or dispositions relative to a reference group of individuals—for example, human beings.

Any of the counterexamples already discussed to the necessity for luck of (i) subjective probabilistic conditions—for example, cases of agents without beliefs about events that are lucky for them, (ii) objective probabilistic conditions—for example, cases of highly probable lucky events, (iii) modal conditions—for example, Lackey’s buried treasure case, and (iv) lack of control conditions—for example, Lackey’s demolition case—is troublesome for hybrid views.

7. Luck and Related Concepts

There are several concepts that are closely related to the concept of luck. Here we will focus on the concepts of accident, coincidence, fortune, risk, and indeterminacy.

a. Accidents

The concept of accident is closely related to the concept of luck. After all, most accidents—for example, car crashes—involve luck—mostly bad luck. But as Pritchard (2005: 126) argues, there are paradigmatic cases of luck that involve no accidents. For example, if one self-consciously chooses a specific lottery ticket and wins the lottery, one’s winning is by luck, but it is not an accident given that one was trying to win.

From Pritchard’s example, we might infer that if an agent acts with the intention of bringing about some result, then if it occurs, it is not an accident. However, if someone prays with the intention of bringing about some event and the event occurs by sheer coincidence—because that person’s prayers are causally irrelevant to its occurrence—the event is accidental. But the mere causal relevance of an agent’s actions to an event’s occurrence is not sufficient for excluding accidentality either. If a pilot dancing in the cockpit unintentionally presses the depressurization button and as a result the plane crashes, the crash is an accident despite being caused by the pilot.

This suggests that what prevents the outcomes of an action from being accidental—but not from being lucky—is both the fact that an agent acts with the intention to bring about a certain outcome and the fact that her action is causally relevant to that outcome. For example, if someone wins a lottery in which participants have to pick a ball directly from the lottery drum with a blindfold on, that person’s winning is lucky but not accidental, because it is brought about by her direct intentional action.

b. Coincidences

The concept of coincidence is also closely related to the concept of luck. Owens (1992) gives an account according to which a coincidence is an inexplicable event in the following sense: we cannot explain why its constituents come together because they are produced by independent causal factors—see also Riggs (2014) for a similar account. More specifically, coincidences are such that we cannot explain why they occur because there is no common nomological antecedent of their components or a nomological connection between them. For example, if someone prays for rain and it rains, that it rains is a coincidence because there is no nomological connection between that person’s prayers and the fact that it rains. On the other hand, how close or immediate an antecedent should be in order to prevent two events from constituting a coincidence is a matter that usually becomes clear in context. For example, we would regard as a coincidence the fact that someone wishes that her favorite team wins the final and that as a matter of fact it ends up winning the final, even though both events have some distant common nomological antecedent—for example, the Big Bang; see Riggs (2014) for further discussion.

Not all lucky events are coincidental events. For example, it is no coincidence that a coin lands heads when someone flips it, but that result might clearly be lucky for that person. In the same way that causally relevant intentional action prevents an event from being an accident, it seems to prevent a pair of events—someone’s flipping of the coin and the coin’s landing heads—from constituting a coincidence. By contrast, all coincidental events, if significant, are lucky. For example, if someone prays for rain because she is in need of water and it rains, the coincidental event that it rains is lucky for that person.

Probabilistic and modal views have difficulties when it comes to accounting for highly probable or modally robust lucky events arising out of coincidence. As Lackey’s buried treasure case illustrates, if the occurrence of the components of a coincidence—A’s burial of the treasure and B’s digging at the same location—is highly probable or modally robust, the occurrence of the resulting coincidental event—B’s discovery of A’s treasure—is also highly probable or modally robust. Yet, the event is lucky precisely because it arises out of a coincidence.

c. Fortune

In the literature, there is some disagreement concerning whether or not the concept of fortune is the same as the concept of luck. Most modal theorists think that luck and fortune are different and use the distinction to argue that Lackey’s buried treasure case is in reality a case of fortune, while their theories are theories of luck.

For example, Pritchard (2005: 144, n.15; 2014) thinks that fortunate events are events beyond our control that count in our favor, but unlike lucky events, they are not chancy or modally fragile. On his view, having good health or a good financial situation is an instance of fortune, not of luck, while winning a fair lottery is only an instance of luck. Rescher (1995: 28–9) similarly thinks that we can be fortunate if something good happens to or for us in the natural course of things, but we are lucky only if such an eventuality is chancy. In a similar vein, Coffman (2007; 2014) thinks that we are lucky to win a fair lottery—given how unlikely it was—but we are merely unfortunate to lose it—given how likely it was.

Finally, Levy (2009; 2011: 17) thinks that fortunate events are non-chancy events—hence non-lucky—but luck-involving, in the sense that they have luck in their causal history and, in particular, in their proximate causes. His reply to Lackey’s buried treasure case is that luck in the circumstances—the lucky coincidence that someone places a plant at the same location in which someone has buried a treasure—is not inherited by the actions performed in those circumstances or by the events resulting from them—for example, the discovery of the treasure. So while there is luck involved in the circumstances of the discovery, the discovery itself is merely fortunate.

Against the distinction between luck and fortune, Broncano-Berrocal (2015) and Stoutenburg (2015) argue that the terms “luck” and “fortune” can be interchanged in English sentences without any significant semantic difference. Moreover, since English speakers use the terms interchangeably, arguing that luck and fortune are two distinct concepts entails that speakers are systematically mistaken in their usage of the terms, which is hardly a tenable error theory. For example, on that distinction we would be wrong to say that someone is fortunate to win a raffle, or lucky to win a lottery that, completely unbeknownst to her, has been rigged in her favor.

d. Risk

There is a close connection between the concepts of luck and risk. In fact, some theorists think that the connection is so close that the former can be explained in terms of the latter—see Broncano-Berrocal (2015), Coffman (2007), Pritchard (2014; 2015), and Williamson (2009) for relevant discussion. Pritchard (2015) explains that a risk or a risk event is a potential, unwanted event that is realistically possible—that is, something that could credibly occur—whereas a risky event is a potential, unwanted event that has a higher-than-normal risk of occurring—for example, there is always a risk that one’s plane might crash, but flying by plane is not risky. With that distinction in place, Pritchard distinguishes two competing ways to understand the notion of risk or of a risk event.

The probabilistic account of risk says that an event is at risk of occurring just in case there is a non-zero objective probability that it will occur. How high its risk of occurrence is—that is, how risky it is—depends on how probable its occurrence is. The modal account of risk, by contrast, says that an event is at risk of occurring just in case it would occur in at least some close possible worlds—see also Coffman (2007) and Williamson (2009). How high its risk of occurrence is depends either on how large the proportion of close possible worlds in which it would occur is—call this the proportion view of degrees of risk—or on how distant the possible worlds in which it would occur are—call this the distance view of degrees of risk.
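Put schematically—again in our own notation—with \(\Pr\) for objective probability, \(W_{\mathrm{close}}\) for the set of close possible worlds, and \(d(w_{@}, w)\) for the distance of a world w from the actual world \(w_{@}\), the two accounts say roughly:

\[
\text{Probabilistic:}\quad \mathrm{atRisk}(E) \iff \Pr(E) > 0, \qquad \text{with degree of risk increasing in } \Pr(E)
\]

\[
\text{Modal:}\quad \mathrm{atRisk}(E) \iff \exists\, w \in W_{\mathrm{close}} \text{ in which } E \text{ occurs}, \qquad \text{with degree of risk increasing in } \frac{\left|\{ w \in W_{\mathrm{close}} : E \text{ occurs in } w \}\right|}{\left|W_{\mathrm{close}}\right|}
\]

on the proportion view, or decreasing in the distance \(d(w_{@}, w^{*})\) to the nearest world \(w^{*}\) in which E occurs, on the distance view.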

Pritchard contends that the probabilistic account fails to adequately account for degrees of risk. In particular, he argues that if two risk events E1 and E2 have the same probability of occurring, but only E1’s occurrence is easily possible, then E1 is riskier than E2, whereas the probabilistic account is committed to saying that they are equally risky.

Pritchard (2014; 2015) also argues that when risk is understood in modal terms, the notions of luck and risk are basically co-extensive, because both how lucky an event is and how risky it is depend on the modal profile of the event’s occurrence, that is, on the size of the proportion of close possible worlds in which it would not obtain, or the distance to the actual world of possible worlds in which it would not occur. According to Pritchard, the only two minor differences between the two notions are, on the one hand, that risk is typically associated with negative events, whereas luck can be predicated of both negative and positive events; on the other, that while we can talk of very low levels of risk, we cannot so clearly talk of low levels of luck.

Broncano-Berrocal (2015) makes a further distinction between two ways in which we think of risk: the risk that an event has of occurring—or event-relative risk—and the risk at which an agent is with respect to an event—or agent-relative risk. The distinction serves to delimit the scope of Pritchard’s account: his modal account of risk is an account of event-relative risk—the same applies to the probabilistic view. For Broncano-Berrocal, the modal and probabilistic accounts of event-relative risk are both correct: while the probabilistic conception is the one that is typically used or assumed in scientific and technical contexts, the modal conception better fits our everyday thinking about risky events. On the other hand, the best way to understand the agent-relative sense of risk is, according to Broncano-Berrocal, in terms of lack of control: an agent is at risk with respect to the possible occurrence of an event just in case its occurrence is beyond her control. He further argues that the agent-relative sense of risk is the one that really serves to account for luck: when risk is understood in terms of lack of control, the notions of luck and risk are basically co-extensive, because whether an event is lucky or risky for an agent depends on whether it is under the agent’s control.

e. Indeterminacy

In a causally deterministic world, events are necessitated as a matter of natural law by antecedent conditions. It might be thought that lucky events are events whose occurrence was not predetermined in that way. Against this idea, Pritchard (2005: 126–27) argues that at least some lucky events are not brought about by indeterminate factors. For example, given the position and momentum of the balls in a lottery drum at time t1, it might be fully determinate that a certain combination of balls will be the winning combination at t2. To make the point more vivid, Coffman (2007) proposes an example in which someone’s life depends on the fact that a ball remains perfectly balanced on the tip of a cone in a deterministic world. According to Coffman, that person can be properly described as being lucky if her stay in the deterministic world corresponds to the predetermined temporal interval in which the ball would remain balanced on the cone’s tip. Another example is the following: a Laplacian demon, who is able to predict the future given his knowledge of the complete state of a deterministic world at a prior time, might be unlucky to know in advance that he will die in a car accident. The moral of all these cases is that luck is—or at least seems—fully compatible with determinism.

8. References and Further Reading

  • Ballantyne, Nathan. 2014. Does luck have a place in epistemology? Synthese 191: 1391–1407.
    • Ballantyne argues that investigating the nature of luck does not help us better understand knowledge.
  • Ballantyne, Nathan. 2012. Luck and interests. Synthese 185: 319–334.
    • Ballantyne provides a detailed examination of the different ways to formulate the significance condition on luck.
  • Baumann, Peter. 2012. No luck with knowledge? On a dogma of epistemology. Philosophy and Phenomenological Research DOI: 10.1111/j.1933-1592.2012.00622.
    • Baumann defends an objective probabilistic condition.
  • Broncano-Berrocal, Fernando. 2015. Luck as risk and the lack of control account of luck. Metaphilosophy 46: 1–25.
    • Broncano-Berrocal proposes a lack of control account and argues that luck can be explained in terms of risk.
  • Coffman, E. J. 2015. Luck: Its nature and significance for human knowledge and agency. Palgrave Macmillan.
    • Coffman’s monograph includes extensive criticism of leading theories of luck and argues that luck can be explained in terms of the notion of stroke of luck; it also explores the applications in epistemology and philosophy of action of that idea.
  • Coffman, E. J. 2014. Strokes of luck. Metaphilosophy 45: 477–508.
    • Coffman proposes an account of strokes of luck.
  • Coffman, E. J. 2009. Does luck exclude control? Australasian Journal of Philosophy 87: 499–504.
    • Coffman defends a specific way to understand the lack of control condition on luck.
  • Coffman, E. J. 2007. Thinking about luck. Synthese 158: 385–398.
    • Coffman gives a hybrid account of luck in terms of easy possibility and lack of control.
  • Church, Ian M. 2013. Getting ‘Lucky’ with Gettier. European Journal of Philosophy 21: 37–49.
    • Church explores several ways to model degrees of luck in modal terms.
  • Hales, Steven D. 2015. Luck: Its Nature and Significance for Human Knowledge and Responsibility, by E.J. Coffman. The Philosophical Quarterly, DOI:10.1093/pq/pqv093.
    • Critical book review of Coffman’s monograph.
  • Hales, Steven D. 2014. Why every theory of luck is wrong. Noûs, DOI: 10.1111/nous.12076.
    • Hales gives three kinds of counterexamples to probabilistic, modal, and lack of control accounts of luck.
  • Hales, Steven D., & Johnson, Jennifer Adrienne. 2014. Luck attributions and cognitive bias. Metaphilosophy 45: 509–528.
    • Hales and Johnson conduct an empirical investigation on luck attributions and suggest that the results might indicate that luck is a cognitive illusion.
  • Lackey, Jennifer. 2008. What luck is not. Australasian Journal of Philosophy 86: 255–267.
    • Lackey argues that the conditions of modal and lack of control analyses are neither sufficient nor necessary for luck.
  • Latus, Andrew. 2003. Constitutive luck. Metaphilosophy 34: 460–475.
    • Latus gives a hybrid account of luck that features subjective probabilistic and lack of control conditions and uses the account to show that the concept of constitutive luck is not incoherent.
  • Levy, Neil. 2011. Hard luck: How luck undermines free will and moral responsibility. Oxford University Press.
    • Levy proposes a hybrid account that conjoins a modal condition with a lack of control condition and argues that the epistemic requirements on control are so demanding that they are rarely met; he also applies this account to the free will debate.
  • Levy, Neil. 2009. What, and where, luck is: A response to Jennifer Lackey. Australasian Journal of Philosophy 87: 489–497.
    • Levy argues that Lackey’s buried treasure case poses no problem for modal accounts, appealing to the distinction between luck and fortune.
  • McKinnon, Rachel. 2014. You make your own luck. Metaphilosophy 45: 558–577.
    • McKinnon answers the question of what it means to say that someone creates her own luck and uses her account of diachronic luck to explain how we evaluate performances.
  • McKinnon, Rachel. 2013. Getting luck properly under control. Metaphilosophy 44: 496–511.
    • McKinnon proposes an account of diachronic luck in terms of the notion of expected value.
  • Milburn, Joe. 2014. Subject-involving luck. Metaphilosophy 45: 578–593.
    • Milburn distinguishes between subject-relative and subject-involving luck and argues that one of the upshots of focusing on the latter is that lack of control accounts of luck become more attractive.
  • Owens, David. 1992. Causes and coincidences. Cambridge University Press.
    • Owens gives an account of coincidences according to which a coincidence is an event whose constituents are nomologically independent of each other.
  • Pritchard, Duncan. 2015. Risk. Metaphilosophy 46: 436–461.
    • Pritchard argues that the standard way of conceptualizing risk in probabilistic terms is flawed and proposes an alternative modal conception.
  • Pritchard, Duncan. 2014. The modal account of luck. Metaphilosophy 45: 594–619.
    • Pritchard defends the modal account of luck from several objections.
  • Pritchard, Duncan. 2005. Epistemic luck. Oxford University Press.
    • Pritchard introduces the modal account of luck and gives corresponding accounts of epistemic and moral luck.
  • Pritchard, Duncan, & Smith, Matthew. 2004. The psychology and philosophy of luck. New Ideas in Psychology 22: 1–28.
    • Pritchard and Smith survey psychological research on luck and argue that it supports the modal account of luck.
  • Pritchard, Duncan, & Whittington, Lee John (eds.). 2015. The philosophy of luck. Wiley-Blackwell.
    • A volume with many of the papers contained in this bibliography.
  • Rescher, Nicholas. 2014. The machinations of luck. Metaphilosophy 45: 620–626.
    • Rescher defends an objective probabilistic account of luck.
  • Rescher, Nicholas. 1995. Luck: The brilliant randomness of everyday life. Farrar, Straus and Giroux.
    • Rescher provides an extensive examination of the concept of luck as well as of many other issues surrounding it.
  • Rescher, Nicholas. 1969. The concept of control. In Essays in Philosophical Analysis. University of Pittsburgh Press: 327–354.
    • Rescher provides an extensive examination of the concept of control.
  • Riggs, Wayne D. 2014. Luck, knowledge, and “mere” coincidence. Metaphilosophy 45: 627–639.
    • Riggs advances an account of coincidence and applies it to the theory of knowledge.
  • Riggs, Wayne. 2009. Knowledge, luck, and control. In Haddock, A., Millar, A. & Pritchard, D. (eds.). Epistemic value. Oxford University Press.
    • Riggs proposes a lack of control account of luck and replies to some objections.
  • Riggs, Wayne. 2007. Why epistemologists are so down on their luck. Synthese 158: 329–344.
    • Riggs criticizes the modal account of luck and defends a lack of control condition.
  • Steglich-Petersen, Asbjørn. 2010. Luck as an epistemic notion. Synthese 176: 361–377.
    • Steglich-Petersen gives an epistemic analysis of luck in terms of the notion of being in a position to know.
  • Stoutenburg, Gregory. 2015. The epistemic analysis of luck. Episteme, DOI:10.1017/epi.2014.35.
    • Stoutenburg gives an evidential account of degrees of luck.
  • Williamson, Timothy. 2009. Probability and danger. The Amherst Lecture in Philosophy 4: 1–35.
    • Williamson compares probabilistic and modal conceptions of safety and risk and discusses how they bear on the theory of knowledge.

 

Author Information

Fernando Broncano-Berrocal
Email: fernando.broncanoberrocal@kuleuven.be
University of Leuven (KU Leuven)
Belgium

Epistemic Justification

We often believe what we are told by our parents, friends, doctors, and news reporters. We often believe what we see, taste, and smell. We hold beliefs about the past, the present, and the future. Do we have a right to hold any of these beliefs? Are any supported by evidence? Should we continue to hold them, or should we discard some? These questions are evaluative. They ask whether our beliefs meet a standard that renders them fitting, right, or reasonable for us to hold. One prominent standard is epistemic justification.

Very generally, justification is the right standing of an action, person, or attitude with respect to some standard of evaluation. For example, a person’s actions might be justified under the law, or a person might be justified before God.

Epistemic justification (from episteme, the Greek word for knowledge) is the right standing of a person’s beliefs with respect to knowledge, though there is some disagreement about what that means precisely. Some argue that right standing refers to whether the beliefs are more likely to be true. Others argue that it refers to whether they are more likely to be knowledge. Still others argue that it refers to whether those beliefs were formed or are held in a responsible or virtuous manner.

Because of its evaluative role, justification is often used synonymously with rationality. There are, however, many types of rationality, some of which are not about a belief’s epistemic status and some of which are not about beliefs at all. So, while it is intuitive to say a justified belief is a rational belief, it is also intuitive to say that a person is rational for holding a justified belief. This article focuses on theories of epistemic justification and sets aside their relationship to rationality.

In addition to being an evaluative concept, many philosophers hold that justification is normative. Having justified beliefs is better, in some sense, than having unjustified beliefs, and determining whether a belief is justified tells us whether we should, should not, or may believe a proposition. But this normative role is controversial, and some philosophers have rejected it for a more naturalistic, or science-based, role. Naturalistic theories focus less on belief-forming decisions—decisions from a subject’s own perspective—and more on describing, from an objective point of view, the relationship between belief-forming mechanisms and reality.

Regardless of whether justification refers to right belief or responsible belief, or whether it plays a normative or naturalistic role, it is still predominantly regarded as essential for knowledge. This article introduces some of the questions that motivate theories of epistemic justification, explains the goals that a successful theory must accomplish, and surveys the most widely discussed versions of these theories.

Table of Contents

  1. Starting Points
    1. The Dilemma of Inferential Justification
    2. Explaining How Beliefs are Justified
    3. Explaining the Role of Justification
    4. Explaining Why Justification is Valuable
    5. Justification and Knowledge
  2. Internalist Foundationalism
    1. Basic Beliefs
    2. Arguments For and Against Foundationalism
  3. Internalist Coherentism
    1. Varieties of Coherence
    2. Objections to Coherentism
  4. Infinitism
    1. Arguments for Infinitism
    2. Objections to Qualified Infinitism
  5. Types of Internalism and Objections
    1. Accessibilism and Mentalism
    2. Objections to Internalism
  6. The Gettier Era
    1. The History of the Gettier Problem
    2. Responses to the Gettier Problem
  7. Externalist Foundationalism
    1. Externalism, Foundationalism, and the DIJ
    2. Reliabilism
    3. Objections to Externalism
  8. Justification as Virtue
    1. Virtue Reliabilism
    2. Virtue Responsibilism
    3. Objections to Virtue Epistemology
  9. The Value of Justification
    1. The Truth Goal
    2. Alternatives to the Truth Goal
    3. Objections to the Polyvalent View
    4. Rejections of the Truth Goal
  10. Conclusion
  11. References and Further Reading

1. Starting Points

Consider your simplest, most obvious beliefs: the color of the sky, the date of your birth, what chocolate tastes like. Are these beliefs justified for you? What would explain the rightness or fittingness of these beliefs? One prominent account of justification is that a belief is justified for a person only if she has a good reason for holding it. If you were to ask me why I believe the sky is blue and I were to answer that I am just guessing or that my horoscope told me, you would likely not consider either a good reason. In either case, I am not justified in believing the sky is blue, even if it really is blue. However, if I were to say, instead, that I remember seeing the sky as blue or that I am currently seeing that it is blue, you would likely think better of my reason. So, having good reasons is a very natural explanation of how our beliefs are justified.

Further, the possibility that my belief that the sky is blue is not justified, even if it is true that the sky is blue, suggests that justification is more than simply having a true belief. All of my beliefs may be true, but if I obtained them accidentally or by faulty reasoning, then they are not justified for me; if I am seeking knowledge, I have no right to hold them. Further still, true belief may not even be necessary for justification. If I understand Newtonian physics, and if Newton’s arguments seem right to me, and if all contemporary physicists testify that Newtonian physics is true, it is plausible to think that my belief that it is true is justified, even if Einstein would eventually show that Newton and I are wrong. We can imagine this was the situation of many physicists in the late 1700s. If this is right, justification is fallible—it is possible to be justified in believing false propositions. Though some philosophers have, in the past, rejected fallibilism about justification, it is now widely accepted. Having good reasons, it turns out, does not guarantee having true beliefs.

But the idea that justification is a matter of having good reasons faces a serious obstacle. Normally, when we give reasons for a belief, we cite other beliefs. Take, for example, the proposition, “The cat is on the mat.” If you believe it and are asked why, you might offer the following beliefs to support it:

1. I see that the cat is on the mat.

2. Seeing that X implies that X.

Together, these seem to constitute a good reason for believing the proposition:

3. The cat is on the mat.

But does this mean that proposition 3 is epistemically justified for you? Even if the combination of propositions 1 and 2 counts as a good reason to believe 3, proposition 3 is not justified unless both 1 and 2 are also justified. Do we have good reasons for believing 1 and 2? If not, then according to the good reasons account of justification, propositions 1 and 2 are unjustified, which means that 3 is unjustified. If we do have good reasons for believing 1 and 2, do we have good reasons for believing those propositions? How long does our chain of good reasons have to be before even one belief is justified? These questions lead to a classic dilemma.

a. The Dilemma of Inferential Justification

For simplicity, let’s focus on proposition 1: I see that the cat is on the mat.

Horn A: If there are no good reasons to believe proposition 1, then proposition 1 is unjustified, which means 3 is unjustified.

Horn B: If there is a good reason to believe proposition 1, say proposition 1a, then either 1a is unjustified or we need another belief, proposition 1b, to justify 1a. If this process continues infinitely, then 1 is ultimately unjustified, and, therefore, 3 is unjustified.

Either way, proposition 3 is unjustified.

Horn A of the dilemma is the problem of skepticism about justification. If our most obvious beliefs are unjustified, then no belief derived from them is justified; and if no belief is justified, we are left with an extreme form of skepticism. Horn B of the dilemma is called the regress problem. If every reason we offer requires a reason that also requires a reason, and so on, infinitely, then no belief is ultimately justified.

Both of these problems assume that all justification involves inferring beliefs from one or more other beliefs, so let’s call these two problems the dilemma of inferential justification (DIJ). And let’s call the assumption that all justification involves inference from other beliefs the inferential assumption (also called the doxastic assumption, Pollock 1986: 19).

Responses to this dilemma typically take one of two forms. On one hand, we might embrace Horn A, which is, in effect, to adopt skepticism and eschew any further attempts to justify our beliefs. This is the classic route of the Pyrrhonian skeptics, such as Sextus Empiricus, and of some Academic skeptics, such as Arcesilaus. (For more on these views, see Ancient Greek Skepticism.)

On the other hand, we might offer an explanation of how beliefs can be justified in spite of the dilemma. In other words, we might offer an account of epistemic justification that resolves the dilemma, either by constructing a third, less problematic option or by showing that Horn B is not as troublesome as philosophers have traditionally supposed. This non-skeptical route is the majority position and the focus of the remainder of this article.

Philosophers tend to agree that any adequate account of epistemic justification—that is, an account that resolves the dilemma—must do at least three things: (1) explain how a belief comes to be justified for a person, (2) explain what role justification plays in our belief systems, and (3) explain what makes justification valuable in a way that is not merely practically or aesthetically valuable.

b. Explaining How Beliefs are Justified

One of the central aims of theories of epistemic justification is to explain how a person’s beliefs come to be justified in a way that resolves the DIJ. Those who accept the inferential assumption argue either that a belief is justified if it coheres with—that is, stands in mutual support with—the whole set of a person’s beliefs (coherentism) or that an infinite chain of sequentially supported beliefs is not as problematic as philosophers have claimed (infinitism).

Among those who reject the inferential assumption, some argue that justification is grounded in special beliefs, called basic beliefs, that are either obviously true or supported by non-belief states, such as perceptions (foundationalism). Others who reject the inferential assumption argue that justification is either a function of the quality of the mechanisms by which beliefs are formed (externalism) or at least partly a function of certain qualities or virtues of the believer (virtue epistemology).

In addition to resolving the DIJ, theories of justification must explain what it is about forming or holding a belief that justifies it in order to explain how a belief is justified. Some argue that justification is a matter of a person’s mental states: a belief is justified only if a person has conscious access to beliefs and evidence that support it (internalism). Others argue that justification is a matter of a belief’s origin or the mechanisms that produce it: a belief is justified only if it was formed in a way that makes the belief likely to be true (externalism), whether through an appropriate connection with the state of affairs the belief is about or through reliable processes. The former view is called internalism because the justifying reasons—whether beliefs, experiences, testimony, and so forth—are internal mental states, that is, states consciously available to a person. The latter view is called externalism because the justifying states are outside a person’s immediate mental access; they are relationships between a person’s belief states and the states of the world outside the believer’s mental states (see Internalism and Externalism in Epistemology).

c. Explaining the Role of Justification

A second central aim of epistemology is to identify and explain the role that justification plays in our belief-forming behavior. Some argue that justification is required for the practical work of having responsible beliefs. Having certain reasons makes it possible for us to choose well which beliefs to form and hold and which to reject. This is called the guidance model of justification. Some philosophers who accept the guidance model, like René Descartes and W. K. Clifford, pair it with a strongly normative role according to which justification is a matter of fulfilling epistemic obligations. This combination is sometimes called the guidance-deontological model of justification, where “deontology” refers to one’s duties with respect to believing. Other epistemologists reject the guidance and guidance-deontological models for more descriptive models. Justification, according to these philosophers, is simply a feature of our psychology, and though our minds form beliefs more effectively under some circumstances than others, the conditions necessary for forming justified beliefs are outside of our access and control. This objective, naturalistic model of justification has it that our understanding of justification should be informed, in large part, by psychology and cognitive science.

d. Explaining Why Justification is Valuable

A third central aim of theories of justification is to explain why justification is epistemically valuable. Some epistemologists argue that justification is crucial for avoiding error and increasing our store of knowledge. Others argue that knowledge is more complicated than attaining true beliefs in the right way and that part of the value of knowledge is that it makes the knower better off. These philosophers are less interested in the truth-goal in its unqualified sense; they are more interested in intellectual virtues that position a person to be a proficient knower, virtues such as intellectual courage and honesty, openness to new evidence, creativity, and humility. Though justification increases the likelihood of knowledge under some circumstances, we may rarely be in those circumstances or may be unable to recognize when we are; nevertheless, these philosophers suggest, there is a fitting way of believing regardless of whether we are in those circumstances.

A minority of epistemologists reject any connection between justification and knowledge or virtue. Instead, they focus either on whether a belief fits into an objective theory about the world or whether a belief is useful for attaining our many and diverse cognitive goals. An example of the former involves focusing solely on the causal relationship between a person’s beliefs and the world; if knowledge is produced directly by the world, the concept of justification drops out (for example, Alvin Goldman, 1967). Other philosophers, whom we might call relativists and pragmatists, argue that epistemic value is best explained in terms of what most concerns us in practice.

Debates surrounding these three primary aims inspire many others. There are questions about the sources of justification: Is all evidence experiential, or is some non-experiential? Are memory and testimony reliable sources of evidence? And there are additional questions about how justification is established and overturned: How strong does a reason have to be before a belief is justified? What sort of contrary, or defeating, reasons can overturn a belief’s justification? In what follows, we look at the strengths and weaknesses of prominent theories of justification in light of the three aims just outlined, leaving these secondary questions to more detailed studies.

e. Justification and Knowledge

The type of knowledge primarily at issue in discussions of justification is knowledge that a proposition is true, or propositional knowledge. Propositional knowledge stands in contrast with knowledge of how to do something, or practical knowledge. (For more on this distinction, see Knowledge.) Traditionally, three conditions must be met in order for a person to know a proposition—say, “The cat is on the mat.”

First, the proposition must be true; there must actually be a state of affairs expressed by the proposition in order for the proposition to be known. Second, that person must believe the proposition, that is, she must mentally assent to its truth. And third, her belief that the proposition is true must be justified for her. Knowledge, according to this traditional account, is justified true belief (JTB). And though philosophers still largely accept that justification is necessary for knowledge, it turns out to be difficult to explain precisely how justification contributes to knowing.

Historically, philosophers regarded the relationship between justification and knowledge as strong. In Plato’s Meno, Socrates suggests that justification “tethers” true belief “with chains of reasons why” (97A-98A, trans. Holbo and Waring, 2002). This idea of tethering came to mean that justification—when one is genuinely justified—guarantees or significantly increases the likelihood that a belief is true, and, therefore, we can tell directly when we know a proposition. But a series of articles in the 1960s and 1970s demonstrated that this strong view is mistaken; justification, even for true beliefs, can be a matter of luck. For example, imagine the following three things are true: (1) it is three o’clock, (2) the normally reliable clock on the wall reads three o’clock, and (3) you believe it is three o’clock because the clock on the wall says so. But if the clock is broken—stopped, say, at exactly three o’clock some time earlier—even though you are justified in believing it is three o’clock, you are not justified in a way that constitutes knowledge. You got lucky; you looked at the clock at precisely the time it corresponded with reality, but its correspondence was not due to the clock’s reliability. Therefore, your justified true belief seems not to be an instance of knowledge. This sort of example is characteristic of what I call the Gettier Era (§6). During the Gettier Era, philosophers were pressed to revise or reject the traditional relationship.

In response, some have maintained that the relationship between justification and knowledge is strong, but they modify the concept justification in attempt to avoid lucky true beliefs. Others argue that the relationship is weaker than traditionally supposed—something is needed to increase the likelihood that a belief is knowledge, and justification is part of that, but justification is primarily about responsible belief. Still others argue that whether we can tell we are justified is irrelevant; justification is a truth-conducive relationship between our beliefs and the world, and we need not be able to tell, at least not directly, whether we are justified. The Gettier Era (§6) precipitated a number of changes in the conversation about justification’s relationship to knowledge, and these remain important to contemporary discussions of justification. But before we consider these developments, we address the DIJ.

2. Internalist Foundationalism

One way of resolving the DIJ is to reject the inferential assumption, that is, to reject the claim that all justification involves inference from other beliefs. The most prominent way of doing this while avoiding skepticism is to show that all chains of good inference culminate at a unique kind of belief called a basic belief. Basic beliefs are beliefs that need not be inferred from any other beliefs in order to be justified. This approach to resolving the dilemma is called foundationalism because basic beliefs serve as a foundation on which all other justified beliefs are supported; a person’s beliefs are related to one another like the parts of a building: beliefs justified by inference are analogous to the roof and walls, which are in turn supported by foundational basic beliefs (see Figure 1).

Foundationalism comprises a family of views, all of which claim, at minimum, that all justified beliefs are either basic or inferred from other justified beliefs. Classically, foundationalists combine this view with two further claims: that we can know whether a belief is justified—that is, whether it stands in an evidential chain that starts with a basic belief—and that knowing whether we are justified helps us fulfill our epistemic duties—in other words, we do well when we form or keep beliefs that are well supported and discard or refuse beliefs that are not; we do poorly when we do not.

The view that justification is a matter of having certain internal mental states is called internalism, and the family of views that combines foundationalism with internalism is called internalist foundationalism. There is a further debate among internalists as to whether justification requires simply having certain mental states (propositional justification) or whether justified beliefs must be based on those mental states (doxastic justification). Philosophers who reject internalism are called externalists (see §7 of this article). Another debate among internalists is whether justification helps us to fulfill epistemic duties—that is, whether it tells us which beliefs are epistemically permissible, obligatory, or impermissible (the deontological conception of justification)—or whether it is simply a descriptive fact about our belief systems. (For an example of the latter, see Conee and Feldman 2004.)

Figure 1: Simple Foundationalist Justification. The dots represent beliefs; the arrows represent inferential relations.

a. Basic Beliefs

It is one thing to say basic beliefs resolve the DIJ and quite another thing to explain how they do. René Descartes famously argued that some beliefs are basic because they are indubitable. If a belief is genuinely indubitable, Descartes argued, it cannot be false. As it is commonly understood, dubitability is a psychological, not epistemic, matter. It might be indubitable for me that my mother loves me, even if it is not true and even if it is the sort of belief that could be doubted, even perhaps by me. But Descartes used “indubitable” to describe a belief that is clear and distinct, which is supposed to guarantee that the belief is true. (See Harry Frankfurt, 1973 for a fuller discussion of clarity and distinctness.) Other foundationalists have explained how some beliefs might stop the regress in virtue of self-evidence, or their privileged role in our belief-forming systems, or their incorrigibility.

Long before Descartes, simple mathematical propositions, such as 2 + 2 = 4, and logical propositions, such as “no one is taller than herself,” were thought to be so obvious that they could not be false. These propositions, many claimed, are self-evidently true, that is, they need no supporting evidence because any attempt to support them would be weaker than their intuitive truth. Some philosophers include perceptual experiences among self-evident beliefs, experiences such as seeing red and hearing a ringing sound. Even if you misperceive a color or a sound, or misperceive what seems to be colored or what seems to be ringing, you cannot doubt that you are having the experience of seeing redness or hearing ringing.

Another explanation for why some beliefs are basic is that they play a privileged role in our belief-forming systems. A common example of beliefs privileged in this way are those formed on the basis of sensory perception: seeing a red ball, touching what feels like a rough surface, hearing a bell. You could be hallucinating these experiences, so it is not self-evident that there is a ball, bell, or surface to experience. Nevertheless, the world impresses itself on you in this way, and it would be difficult to imagine functioning without any sense perceptions whatsoever; they play a highly privileged role in our belief systems and, therefore, can justify other beliefs (hence the emphasis that scientists have traditionally placed on observation).

Further candidates for basicality are beliefs that are true in virtue of being believed, that is, if you believe them, they are true. For example, propositions about intentional states (in other words, states about a mental state, such as hoping, doubting, thinking, believing, and so forth) logically imply the existence of the subject who is in the state. So, anyone who, while thinking, believes the proposition “I think” can logically infer “I exist.” Beliefs that, if held, are true are called incorrigible. Other examples may include beliefs about introspective states, such as what you believe or feel or remember. If incorrigible beliefs can be recognized as true without appeal to any other beliefs, they are good candidates for justifying other, non-basic beliefs.

Unfortunately, it is not easy to see how all of our many and various non-basic justified beliefs can be inferred from this relatively small set of basic beliefs, even if we accepted every type of basic belief just mentioned. For example, imagine you have been looking for your laptop computer. When you find it, you form the belief, “There’s my laptop.” Did seeing your computer elicit the basic belief, “I seem to be perceiving a laptop there,” from which you then inferred the belief “There’s my laptop”? Not obviously. Seeing the laptop allowed you directly—without any reasoning at all—to form the belief that you found your laptop.

Examples like these have motivated some foundationalists to expand their accounts of basic beliefs to include a wider variety of experiences. These weaker accounts allow that there are many types of non-inferentially justified beliefs, all of which are at least properly basic, where “properly basic” means a belief that is either basic in the classic sense or that meets some other condition that makes it non-inferentially justified for a person. As long as there are a sufficient number of properly basic beliefs, these philosophers argue, a certain sort of foundationalism remains plausible.

One example of how proper basicality might work is Alvin Plantinga’s (1983; 1993a) argument for the rationality of religious belief. Plantinga’s notion of proper basicality is supposed to be weak enough to avoid problems with classic basic beliefs but strong enough to avoid the DIJ. According to him, if a belief is properly basic for a person, it is rational for that person to accept it without appealing to other reasons. He uses rational instead of justified to distance himself from classical problems. (Sometimes Plantinga puts it even more weakly, such that, if a belief is properly basic for a person, that person is not irrational in holding it.) As an example, Plantinga argues that if a person is raised in a religious community where the central religious claims he hears are corroborated by the community and none of those claims is undermined by contrary experience or argument, he is not violating any epistemic duty in believing that, say, God exists. His experiences and circumstances can “call forth belief in God” in a way that does not require other beliefs and can serve as a reason to accept other beliefs (1983: 81). This is a controversial view, not least because it either changes the discussion from justification to rationality or conflates justification and rationality. Nevertheless, basic beliefs are controversial no matter how they are characterized, and Plantinga’s proper basicality is just one among several. For another attempt to defend classical foundationalism against objections, see Timothy McGrew (1995).

b. Arguments For and Against Foundationalism

Foundationalism has remained competitive in the history of justification largely because of its intuitive advantages over competing views. The most common argument for foundationalism is the positive argument that it explains how we actually form beliefs on the basis of evidence. I believe the sky is blue because I see that it is blue, not because I infer it from other beliefs about the sky. Roderick Chisholm offers a sophisticated version of this argument, concluding that “[t]hinking and believing provide us with paradigm cases of the directly evident” (1966: 28). In addition to this positive argument, foundationalists offer the negative argument that no alternative account—skepticism, coherentism, or infinitism—has the resources to satisfactorily resolve the DIJ, that is, to avoid both skepticism and an infinite regress (see BonJour and Sosa 2003). This is, perhaps, the more powerful of the arguments and merits some attention.

Skepticism motivated epistemologists to inquire into justification in the first place, so the skeptical option is generally considered a loss. As an alternative, coherentists (§3) maintain that a person’s beliefs are justified in virtue of their relationship to the person’s belief set (see Lehrer 1974). If a belief stands or can stand in a consistent, mutually supportive relationship with other beliefs—a “web of belief,” as W. V. O. Quine (1970) calls it—that belief is justified. However, there is reason to believe that, since all beliefs stand in mutually supportive relationships, at least some beliefs (perhaps all) will play an indispensable role in their own support, rendering any coherentist argument viciously circular. Since circular arguments are fallacious, if coherentism entails that justification is circular, coherentism cannot resolve the DIJ.

A more recent alternative to skepticism is infinitism (see §4), according to which all justified beliefs stand in infinite chains of inferential relations (see Klein 2005). Skepticism is avoided because every belief is justified by some other belief. Unfortunately, infinitism requires that we accept one of two questionable assumptions: either that there simply is an infinite number of justifying beliefs available (and to which our minds, in virtue of being finite, do not have access) or that there is some algorithm that, for any belief, B, can direct us to a non-circular justifying belief for B. The problem with the former assumption is that it seems to depend on faith that there is an infinite series of justifiers, which is not obviously better than having no justification at all. And the problem with the latter is that it comes dangerously close to foundationalism, where the algorithm functions as a basic belief. If the infinitist cannot refute these objections, it cannot resolve the DIJ.

These are simple concerns about coherentism and infinitism, and we consider more sophisticated objections in sections 3 and 4. But, if neither coherentism nor infinitism can provide an alternative means of resolving the original dilemma, foundationalism may be the most promising alternative to skepticism. Unfortunately for foundationalists, even if they are right that some account of basic belief would adequately resolve the dilemma of inferential justification, it is not clear that such an account is currently available. Further, there are at least two other serious objections to foundationalism.

First, there is some concern that foundationalism cannot be justified by its own account of justification, that is, foundationalism is self-defeating. Alvin Plantinga (1993b) offers a version of this objection. According to foundationalists, a belief is justified if and only if it is either basic or inferred from other justified beliefs. This criterion, though, is not itself basic on any classical conception of basic beliefs (indubitability, self-evidence, evident to the senses, or incorrigibility), and it is not clear how it could be supported by other justified beliefs.

One straightforward response to this objection is that the arguments above (the positive argument and the negative argument by elimination) do provide, contra Plantinga, inferential support for foundationalism. In fact, Plantinga (1983; 1993a) expands his own notion of proper basicality precisely to avoid the self-defeat objection. Further, if sophisticated reasoning strategies like induction could be justified on foundationalist grounds, then foundationalism itself may be justified on such grounds. For example, Laurence BonJour (1998) defends rational insight as a basic source of evidence and then argues that induction is justified by rational insight. If foundationalism is roughly correct and there are arguments grounded in rational insight that justify foundationalism, foundationalism might be vindicated. Of course, there remain concerns about the circularity of such arguments.

Other philosophers use an inference to the best explanation to defend a type of basic evidence, though these views may rightly be regarded as hybrids of foundationalism and coherentism. For example, Earl Conee and Richard Feldman (2008) argue that “[p]erceptual experiences can contribute toward the justification of propositions about the world when the propositions are part of the best explanation of those experiences that is available to the person.” On this view, the hypothesis that what have been called basic beliefs are connected with the world, and with how we are positioned in it, explains why we have the evidence we have better than traditional accounts of justification do. Catherine Z. Elgin (2005) offers a similar account, arguing that, while perceptions have “initial tenability” given their privileged role in our belief formation, they do not obtain this tenability in isolation from our whole evidential context; over time, certain perceptual beliefs have proved themselves to have the plausibility that allows us to privilege them.

A second objection to foundationalism is the meta-justification argument. The idea is that basic beliefs cannot resolve the DIJ because, even if their justification does not depend on other beliefs, it does depend on reasons which themselves require reasons. If I believe a proposition because it is indubitable, then I must have some reason for thinking that indubitable beliefs are likely to be true. If I do not, I am stuck with Horn A, and if I do, I am stuck with Horn B. To demonstrate this problem, Peter Klein (2005) asks us to imagine an argument between Fred and Doris, where Fred has come to what he regards as the basic belief on which his argument depends; call it b.

According to Fred, b has autonomous justification, that is, is a type of basic belief. Doris happens to agree that b is autonomously justified but asks whether beliefs with autonomous warrant are likely to be true. As a foundationalist, the most plausible option for Fred is the following: “He can hold that autonomously warranted propositions are somewhat likely to be true in virtue of the fact that they are autonomously warranted” (2005: 133).

If Fred is right, however, b only works as a justification for the rest of his argument precisely because he has added something to b. What has he added? The claim that he “has a very good reason for believing b, namely b has F and propositions with F are likely to be true.” These are propositions independent of b that serve to justify b. Klein continues: “Of course Fred, now, could be asked to produce his reasons for thinking that b has F and that basic propositions are somewhat likely to be true in virtue of possessing feature F” (2005: 134). If this is right, basic beliefs do not stop the regress of reasons (see also Smithies 2014).

One response to this criticism comes from Laurence BonJour, who argues that it is plausible to think that understanding b includes a sort of built-in awareness of the content of those additional premises Klein mentions, such that understanding b constitutes, in and of itself, a reason to hold b (BonJour and Sosa 2003: 60-68). If it is possible to have an evidential state that includes, non-inferentially, all the content necessary for having a reason to believe a proposition is true, foundationalists may be able to describe a basic belief that stops the regress and avoids skepticism. But explaining just what this state is remains a point of controversy.

Another response is to construct an inference to the best explanation, as mentioned above in response to the self-defeat objection (Elgin, 2005; Conee and Feldman, 2008). The result, again, is typically a hybrid view, which may be equivalent to giving up foundationalism. Conee and Feldman say their view is closer to a “non-traditional version of coherentism” (2008: 98). And Elgin calls her view “a very weak foundationalism or…a coherence theory” (2005: 166). This raises questions about the merits of coherentism, to which we now turn.

3. Internalist Coherentism

Like foundationalists, coherentists attempt to avoid skepticism while rejecting infinitism. But they find a further problem with foundationalism. Every sensory state (seeing red, smelling cinnamon, and so forth) must be understood in a mental context, that is, one must have a set of background experiences, beliefs, and vocabulary sufficiently large for forming and understanding beliefs. All sensory beliefs, such as “I see red” and “I smell cinnamon,” require an immensely complex set of assumptions about self-reference, seeing, colors, smelling, and scents. This means that individual beliefs are not isolated bits of information that act as bricks in a building; they are nodes of information that depend for their meaning and support on a web of relationships with other beliefs.

Many coherentists accept the inferential assumption and argue that the result is not an infinite regress of inferences, but a non-linear system of support from which justification emerges as a property of the combination of inferences. As Donald Davidson puts it, “[N]othing can count as a reason for holding a belief except another belief” (2000: 156). Other coherentists reject the inferential assumption and argue that the result is a non-linear system of support from which justification emerges as a property of the set as a whole. Keith Lehrer explains: “This does not make the belief self-justified, however, even though it might be non-inferential. The belief is not justified independently of relations to other beliefs. It is justified because of the way it coheres with other beliefs belonging to a system of beliefs” (1990: 89). As we see below (§3.b), some coherentists reject the belief requirement of the inferential assumption, arguing that perceptual experiences can play a justifying role in the set of mental states that includes a person’s beliefs.

Regardless of whether coherentists accept the inferential assumption, they can allow that some beliefs are non-inferentially generated—for example, by experiences, intuitions, hunches, and so forth. But they are committed to the idea that the justification for beliefs generated in these ways depends essentially on their relationship to the person’s complete set of beliefs. Construed in this way, coherentism is specifically a view about justification and should not be confused with coherentism about truth. Some philosophers have held both coherentism about truth and justification (Blanshard 1939 and Lewis 1946), but many who hold coherentism about justification reject coherentism about truth (see BonJour 1985, ch. 5, and Truth).

a. Varieties of Coherence

Broadly, coherentists argue that a belief is justified just in case it stands in a system of mutually supporting relationships with other beliefs in a person’s system of beliefs. For instance, my belief that the cat is on the mat involves a complicated set of beliefs: I am seeing a cat, I am seeing a mat, I am seeing a cat on a mat, a cat is a particular kind of mammal, a mat is a particular type of floor covering, my vision is generally reliable under normal circumstances, these are normal circumstances, and so forth. It is difficult to imagine arranging these in a linear, foundationalist fashion. In addition, it is not clear whether some of these beliefs are more basic than some others. Nevertheless, they all cohere, which means they are logically consistent with one another and with other beliefs in my belief set, and they mutually support one another. The challenge for coherentists is to explain just what “mutual support” amounts to.

Whereas foundationalists employ the metaphor of a building (or a pyramid, in some cases) to explain justificational relationships, coherentists employ the metaphor of a web (or, in some cases, a raft), according to which, each node (or plank) works alongside the others in a non-linear fashion to constitute a stable, interconnected whole (see Figure 2, as well as Neurath 1932, Quine 1970, and Sosa 1980). There are four candidates for how the web or raft holds together: logical consistency, logical entailment, inductive probability, and explanation.

Figure 2: Simple Coherentist Justification. P-S represent propositions; the arrows represent lines of inference.

The first candidate, logical consistency, is generally regarded as necessary for coherence but too weak to stand on its own. For example, the belief that P and the belief that probably not-P are logically consistent. But they are not coherent; if one of them is true, the other is not likely to be true (BonJour 1985, ch. 5). Therefore, some early coherentists added that the relationship must also include logical entailment. This view, which I will call entailment coherentism, has it that a belief is justified just in case it entails or is entailed by every other belief in a person’s belief set (Blanshard 1939). Most coherentists now reject this relationship as overly strict, primarily because it seems possible to have two very different beliefs, neither of which entails the other and yet which are both justified. For example, consider the beliefs “I am seeing a needle puncture my skin” and “I am feeling pain.” Neither belief entails the other; nevertheless, it is intuitively plausible that both belong to a coherent set of beliefs.

Because of the problems with mere consistency and consistency plus entailment, most coherentists allow that entailment is sufficient for coherence but not necessary. To capture weaker relationships, they expand the notion to include inductive probability. Inductive probability coherentism is the view that a belief is justified just in case it is a member of a set each of whose members is entailed by or made more probable by a subset of the rest. C. I. Lewis, calling this type of justification “congruence,” puts it eloquently: “A set of statements, or a set of supposed facts asserted, will be said to be congruent if and only if they are so related that the antecedent probability of any one of them will be increased if the remainder of the set can be assumed as given premises” (1962). With their emphasis on inferential relations among beliefs, entailment and inductive probability coherentism attempt to resolve the DIJ by capturing the intuitive plausibility of the inferential assumption while avoiding the difficulties with basic beliefs.
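
Lewis’s congruence condition can be given a rough probabilistic formulation (the notation is mine, not Lewis’s own): a set of statements S1, …, Sn is congruent just in case conditionalizing any member on the remaining members raises its probability:

\[ \Pr\Big(S_i \,\Big|\, \bigwedge_{j \neq i} S_j\Big) > \Pr(S_i) \quad \text{for each } i = 1, \dots, n. \]

On this reading, coherence is a matter of mutual probabilistic support rather than mutual entailment, which is why the needle and pain beliefs above can cohere even though neither entails the other.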

Unfortunately, inductive probability coherentism faces problems similar to those that face entailment coherentism. It seems plausible for a person to hold two justified beliefs without the antecedent probability of either increasing the epistemic probability of the other, even when conjoined with other beliefs in the set. Consider, for example, your beliefs that “the Red Sox will win the Pennant” and “John F. Kennedy was shot in 1963.” Both beliefs are reasonably part of a person’s belief system, and yet it is difficult to see how one might contribute to a set of beliefs that makes the other more probable. Second, even if a subset of beliefs in a set increases the probability of each other member, the set might not be sufficiently comprehensive or well-connected with one’s experiences to justify one’s beliefs. Imagine a set of 100 beliefs, any 99 of which render the 100th member more probable than its antecedent probability. This set passes the inductive probability test and is, therefore, coherent on this account, but it includes very few beliefs. This suggests that, in order to maintain coherence, we could arbitrarily expand or contract our set of beliefs at will to avoid loss of rationality. The only guideline is that we preserve strong inductive inferences. Unfortunately, such arbitrary sets ignore important differences in the sources of beliefs; we can imagine two inductively coherent sets, one that includes sensory beliefs and one that does not. Inductive probability coherentism, without further qualification, implies that neither set is more rational than the other. As Catherine Z. Elgin puts it, “A good nineteenth-century novel is highly coherent, but not credible on that account. Even though Middlemarch is far more coherent than our regrettably fragmentary and disjointed views…, the best explanation of its coherence lies in the novelist’s craft, not in the truth…of the story” (2005: 159-60).

A third prominent account of coherence aimed at avoiding this criticism allows that entailment and inductive probability can contribute to coherence but only insofar as they function in a plausible explanation of the set of beliefs. According to this view, known as explanatory coherentism, beliefs are justified just in case they explain or are explained by the other beliefs of the same type (Harman 1986 and Poston 2014). This view is not committed to the inferential assumption and argues that justification is an emergent property of the explanatory relations among beliefs. Catherine Z. Elgin says that “epistemic justification is primarily a property of a suitably comprehensive, coherent account, when the best explanation of coherence is that the account is at least roughly true” (2005: 158). Elgin adds that the beliefs comprising a coherent system “must be mutually consistent, cotenable, and supportive. That is, the components must be reasonable in light of one another” (2005: 158).

Explanatory coherentism takes its motivation from responses to a problem in philosophy of science that was similar to the problem that faces inductive probability coherentism (Neurath 1932 and Hempel 1935). Not every proposition in a scientific theory is derived inferentially from others, and so there is some question as to whether such propositions could be believed justifiably. It turns out, though, that those propositions play an important explanatory role in the theory that organizes evidence and concepts in plausible ways, even if those propositions have no antecedent probability outside of the system. Elgin explains, “For example, although there is no direct evidence of positrons, symmetry considerations show that a physical theory that eschewed them would be significantly less coherent than one that acknowledged them. So physicists’ commitment to positrons is epistemically appropriate” (2005: 164). This suggests that explanations can play a justifying role independently of inferential relations, thus lending plausibility to coherentism.

Explanatory coherence avoids criticisms of earlier accounts in that it (1) maintains that consistency is an important constraint on a belief set, and (2) maintains that inferential relations contribute to explanatory power, while (3) also accounting for the intuitive connection of certain beliefs with sensory evidence and non-inferential coherence relations. Nevertheless, some criticisms have led philosophers like BonJour (1985), Lehrer (1974; 1990), and Poston (2014) to add other interesting and influential conditions to coherence theories, though space prevents us from exploring them here.

b. Objections to Coherentism

There are three prominent objections to coherentism. The first, which we already encountered in §2.b, is called the circularity problem. Since coherentism depends on mutual support relations, every particular belief will likely play an essential role in its own justification, rendering coherentist justification a form of circular argument (see Figure 3).

Figure 3

The problem with circular justification is that it putatively undermines the goal of justification, which is to garner support for a claim. If a claim is inferred from itself (P → P), the concluding proposition has only as much support as the premise, but that is precisely what we do not know. Therefore, multiplying the inferences between a proposition and an inference to that proposition (for example, (P → Q); (Q → R); (R → S); (S → P)) cannot justify P.

In response, some coherentists argue that the circularity objection oversimplifies the view. While it is true that a belief will almost certainly play a role in its own justification, this is only problematic if we assume the justificational relationship is linear. Properly understood, justification is a property that emerges from non-linear relationships among beliefs, whether inferential or non-inferential. For example, Catherine Z. Elgin tells a story about Meg (adapted from a story by Lewis 1946), whose logic textbook was stolen. There were three witnesses to the theft, but all are unreliable witnesses (one is aloof, one has severe vision problems, and one is a known liar). Nevertheless, all three witnesses agree that the thief had spiked green hair. Despite the fact that none of the witnesses is reliable, their independent testimony to a single, unique proposition increases the likelihood that the proposition is true. As Elgin puts it, “This [agreement] makes a difference. … Their accord evidently enhances the epistemic standing of the individual reports” (2005: 157). If this is right, the antecedently low probability of the thief’s having spiked green hair can be added to the combined strength of the testimonies to create a justified belief without vicious circularity.
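
A simple Bayesian calculation (my illustration, not Elgin’s or Lewis’s own) shows how independent agreement can do this work. Suppose the prior probability that the thief had spiked green hair is Pr(G) = 0.05, and that each witness, however unreliable, is somewhat more likely to report green hair if the thief had it than if he did not (say, Pr(Ri | G) = 0.6 and Pr(Ri | ¬G) = 0.2). If the three reports are independent given the facts, the posterior odds are

\[ \frac{\Pr(G \mid R_1, R_2, R_3)}{\Pr(\neg G \mid R_1, R_2, R_3)} = \frac{0.05}{0.95} \times \left(\frac{0.6}{0.2}\right)^{3} \approx 1.42, \]

so the probability of G rises from 0.05 to roughly 0.59, whereas a single report would raise it only to about 0.14. No individual testimony justifies the belief, yet the accord of the three does, without any one report serving as a foundation for the others.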

A second objection to coherentism is called the isolation objection. Even if a collection of beliefs could explain, and thereby justify, its members, it is not obvious how this set of beliefs is connected with reality, that is, with the content the beliefs are about. In rejecting basic beliefs, coherentists reject privileging any particular cognitive state in the belief system, such as sensory experiences. All beliefs are treated equally and are evaluated according to whether they cohere with the belief set. But beliefs can cohere with one another regardless of whether their content expresses true propositions about reality. Coherence cannot guarantee that the set is not isolated from reality.

Some coherentists respond to this objection by making special provisions for beliefs that derive from coherence-increasing sources, such as sense experience. (BonJour (1985) calls such beliefs “cognitively spontaneous beliefs.”) This makes the degree of coherence partly a matter of how well the system of beliefs integrates sense perception. Others appeal to more abstract distinctions among types of justification. For example, Keith Lehrer (1986) distinguishes personal justification, which involves the traditional, internalist coherence requirement, from verific justification, which is an externalist requirement on coherence. While objective coherence may be outside a person’s ken, it nevertheless contributes, along with personal justification, to what Lehrer calls complete justification. This externalist requirement helps to ground a person’s system of beliefs in the world those beliefs are supposed to be about.

Another coherentist response to the isolation objection is to allow experience itself, not just beliefs about experience, to figure in the evaluation of coherence. Catherine Z. Elgin (2005) argues that we have good reasons to privilege some perceptual experiences over very coherent sets of beliefs. She argues that this is because perception does not—contra foundationalists—work in isolation from other sorts of evidence. She says, “Only observations we have reason to trust have the power to unseat theories. So it is not an observation in isolation, but an observation backed by reasons that actually discredits the theory” (162). This also explains how we are able to privilege some perceptual experiences over others (say, in unfavorable conditions), though she admits that her view includes “something other than coherence,” and allows that it is a very weak form of foundationalism. For a reply along these lines that maintains a more traditional version of coherentism, see Kvanvig and Riggs (1992).

A third objection is called the plurality objection. Because justification is determined solely by the internal coherence of a person’s beliefs, coherence theory cannot guarantee that there is “one uniquely justified system of beliefs” (BonJour 1985: 107). BonJour explains that this is because “on any plausible conception of coherence, there will always be many, probably infinitely many, different and incompatible systems of belief which are equally coherent” (ibid.). To show just how pernicious this problem is, Lehrer asks us to imagine one set of beliefs comprised of both necessary and contingent beliefs and then to imagine a second set created by negating all the contingent beliefs in the first set (1990: 90). This has the nasty implication that, if coherence is sufficient for justification, then “for any contingent statement a person is completely justified in accepting is such that he is also completely justified in accepting the denial of that statement” (ibid.).

One response to the plurality objection is to invoke a “total evidence” requirement on explanatory and probabilistic relations. While we can arbitrarily construct probabilistically and explanatorily coherent sets, there is a non-trivial sense in which non-belief states explain our beliefs: sensation, testimony, and so forth. A theory of explanation that includes the antecedent probabilities of the beliefs based on this evidence would be more coherent with our total evidence than an arbitrary set of beliefs that ignores them. Recent debates over the relationship between coherence and truth include sophisticated analyses of probabilistic assessments (Klein and Warfield 1994 and Fitelson 2003) and an interesting argument for the impossibility of coherence’s increasing the probability that a belief is true (Olsson 2009), but there is not space to develop these arguments here.

For more on coherentism, see Coherentism in Epistemology.

4. Infinitism

Infinitism is an internalist view that proposes to resolve the dilemma of inferential justification by showing that Horn B of the DIJ, properly construed, is an acceptable option. In fact, argue infinitists, there are no serious problems with an infinite chain of justifying beliefs.

Traditionally, epistemologists have rejected the idea that a belief’s linear chain of justifying beliefs can extend infinitely because it leaves all beliefs ultimately unjustified. Inferential justification is said to transmit justification, not create it; therefore, an infinite chain of justifying beliefs would have no source of support to transmit. Similarly, since one could not hold an infinite number of beliefs or mentally trace an infinitely long chain of beliefs, infinitism betrays a common internalist intuition that a person must be aware of good reasons for holding a belief.

Infinitists claim these criticisms are misguided. In practice, justification is not as tidy as epistemologists would have us believe. The traditional idea that the regress must stop or bottom out in basic beliefs is unrealistic and unnecessary. Few of us attempt to draw inferences long enough to arrive at basic beliefs. We often stop looking for reasons when we are content that we have fulfilled our epistemic responsibility, not because the chain has actually ended (Aikin 2011). Foundationalists and coherentists, then, are relatively unconcerned with ultimate justification in their own epistemic behavior; to hold epistemic justification to such high standards would therefore render very few of our beliefs justified. To accommodate this messiness, infinitists might reject the inferential assumption, at least as classically understood. Like coherentists, infinitists may hold that justification is an emergent property of a set of beliefs and that justification comes in degrees such that, the longer the inferential chain, the stronger the degree of justification (Klein 2005).

a. Arguments for Infinitism

There are two main lines of argument for infinitism. The first is that foundationalism and coherentism cannot stop the structure of justification from regressing infinitely. For example, Peter Klein (2005) constructs a version of the meta-justification argument against foundationalism and argues that the most plausible version of coherentism (emergent justification accounts), because of its appeal to a basic assumption about the reliability of coherent sets, is merely a disguised form of foundationalism. If these arguments hit their mark, and if externalism is ruled out, infinitism may be the only non-skeptical option available.

The second main line of argument for infinitism is that the classic objections to infinitism are aimed at overly simplistic versions of the view; they do not threaten suitably qualified versions. For example, Scott Aikin (2009) argues that concerns about the regress arise because of a conflict between two types of intuition: (1) proceduralism, which includes our standard intuitions about good reasons and responsible believing, and (2) egalitarianism, which includes our intuitions that people are generally justified in believing a lot of things (beliefs about how to set DVRs and beliefs about how to get from home to work). Aikin claims that infinitists take the demands of proceduralism more seriously than egalitarian intuitions, maintaining that justification and knowledge are very difficult to attain. The more committed we are to following our chains of evidence, the more likely we are to attain our epistemic goals. However, we often stop far from what even foundationalists would take to be the end of those chains. And at every proposed stopping point, there is an infinite number of justificational questions about the appropriateness of the terms we are using, the reliability of our perceptions and concept attributions, and so forth. If this is right, infinitism may be the most plausible implication of our epistemic intuitions.

Similarly, Peter Klein (2014) argues that infinitism is a minimal thesis about what makes justification valuable, namely, that it renders our beliefs “reason-enhanced.” He says, “Infinitism holds that a belief-state is reason-enhanced whenever S deploys a reason for believing that p. Importantly, S can make a belief-state reason-enhanced even if the basis is another belief-state that is not (yet) reason-enhanced” (2014: 105). If this is right, then the process of inferring can create or produce original epistemic support, and we need not appeal to anything like basic beliefs for ultimate support. Further, infinitists do not object to a chain of inference’s stopping, for instance, when some presuppositions are explicit. For example, reasoning about Euclidean geometry may appropriately stop at Euclid’s axioms when we agree that they are our standard of evaluation. But we can also admit that those axioms can be challenged, and our reasoning could continue indefinitely. Infinitists simply argue that this is a standard feature of all justification.

b. Objections to Qualified Infinitism

Carl Ginet (2005) argues that even qualified infinitism is motivated on spurious grounds. One argument against foundationalism is that, even for basic beliefs, one needs a reason to believe they are true, and this initiates an infinite regress of reasons. Ginet objects, however, that this argument threatens foundationalism only if all reasons are inferential reasons. Of course, this is precisely what foundationalists reject. If some non-belief reasons are justified independently of any additional reasons for thinking they are true, that is, if they are inherently reasonable, the infinitist argument against foundationalism is question-begging.

In response, the infinitist might contend that, even if its critique of foundationalism is flawed, infinitism may yet be the more plausible alternative. If infinitism captures our intuitions about justification as adequately as foundationalism, and if it requires fewer controversial concepts (basic beliefs), infinitism may be an attractive competitor.

Another objection to infinitism is that, given our finite minds, we lack complete access to the infinite set of justifying beliefs. If a person has no access to his reason for belief, then infinitism is no longer internalist and, thereby, loses its means of defusing the DIJ. Of course, the infinitist may concede this and fall back on a mentalist account of epistemic access (see §5.a below). As Ginet puts it: a belief (L) “is available to S as a reason for so believing only if S is disposed, upon entertaining and accepting (L), to believe that the fact that (L) was among his reasons for so believing” (2005: 146). If this is right, a person may have a disposition to recognize further evidence for his justifying beliefs when prompted to do so.

Nevertheless, even this mentalist-enhanced infinitism faces the concern that the process of justification is never complete. An assumption behind the DIJ is that, if for any belief, there is not a reason to believe it is true, that belief and any beliefs inferred from it are unjustified. If this is right, and the justification condition for infinitism is never actually met, then we are left with skepticism.

A variation on this criticism is the idea that inferential justification can only transmit justification and cannot originate it. The idea is that all inference is conditional ((P → Q); (Q → R); (R → S)). Given this set of propositions, is S justified for us? That depends on whether P is justified. Telling us that P is justified by N, (N → P), though, does not answer the question of whether S is justified. We still need to know whether N is justified (Dancy 1985: 55). If this is right, then no matter how long the chain of inference is—even if it is infinite—no belief is justified.
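
Dancy’s point can be put schematically (the rendering is mine): from a set of conditionals alone, nothing can be detached,

\[ \{P \to Q,\; Q \to R,\; R \to S\} \nvdash S, \]

because modus ponens requires, at some point, an unconditional premise such as P itself. Adding further links backwards (N → P, M → N, and so on) never supplies that premise; it only relocates the demand for it.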

Infinitists may respond to this objection by arguing that the justification condition is not a matter of getting to a final, infinitely large set, but of increasing one’s epistemic reasons for the proposition in question. Peter Klein, using the term “warrant” for “justification,” says that infinitism is like coherentism in this respect. He says, “Infinitism is like the warrant-emergent form of coherentism because it holds that warrant for a questioned proposition emerges as the proposition becomes embedded in a set of propositions” (2005: 135). Further, Klein explains that “warrant increases not because we are getting closer to a basic proposition but rather because we are getting further from the questioned proposition” (137). This amounts to a rejection of the claim that inferential justification can only transmit justification and, therefore, that a justificational chain must be complete in order to be adequate (recall Catherine Z. Elgin’s story about Meg in §3.b above).

A worry for this response is similar to a worry for coherentism. Any criterion that implies the infinite set of beliefs is justified is either part of the set or independent of it, in which case, it, too, needs a justification. If some sort of justification-conferring awareness is built into the increasingly large set, infinitism seems like foundationalism in disguise.

A further worry is that, if infinitists do not require that a person actually have an infinite number of justifying beliefs or perform an infinite number of inferences, then infinitism seems committed to the idea that inference itself can create justification. This, however, seems implausible. Carl Ginet writes, “…acceptable inference preserves justification … [but] there is nothing in the inferential relation itself that contributes to making any of those beliefs justified” (2005: 148-49). If inference cannot produce justification, it is unclear how a belief in an infinite chain of inferences comes to be justified.

For a more detailed treatment of infinitism, see Infinitism in Epistemology.

5. Types of Internalism and Objections

As noted above (§2), the view that justification is something we can determine by directly consulting our mental states is called internalism. This view does not entail that all epistemic concepts are internal. John Greco gives an example to demonstrate the difference: “[S]uppose that someone learns the history of his country from unreliable testimony. Although the person has every reason to believe the books that he reads and the people that teach him, his understanding of history is in fact the result of systematic lies and other sorts of deception” (2005: 259). Objectively speaking, this person’s beliefs are not reliably connected with reality. Subjectively, though, he is following his evidence to its rational conclusion. Should we say this person’s beliefs are justified? Since the reliability of his sources is beyond his ability to evaluate, the internalist says he has fulfilled his epistemic duty: yes, he is justified.

For centuries, there was no serious alternative to internalism. As we will see in §6, the advent of the Gettier case in the 20th century constitutes a serious challenge to internalism, and it contributed to alternative, externalist accounts of knowledge and justification. This move to externalism also led to closer scrutiny of internalism, and new concerns about its adequacy arose. I review three of these below. But before doing so, it is helpful to distinguish two types of internalism: accessibilism and mentalism.

a. Accessibilism and Mentalism

According to accessibilists, in order for a belief to be justified for a person, that person must have “reflective access” to good reasons for holding that belief. To have reflective access is to be directly mentally aware of reasons for holding a belief. Some accessibilists argue that a person’s access must be occurrent, that is, she must be currently aware of her reasons for holding a belief (Conee and Feldman 2004). Others hold the looser requirement that, as long as a person has had direct access to a relevant justifying reason, she is justified in holding the supported belief.

According to mentalists, reflective access may be sufficient for justification, but it is not necessary. All that is necessary for a belief to be justified is that a person has mental states that justify the belief, regardless of whether a person has reflective access to those states. Mentalists allow that some non-reflectively accessible mental states can justify beliefs.

Mentalism is supposed to have several advantages over accessibilism given the standard criticisms of internalism. For example, some have objected to internalism on the grounds that it cannot accommodate intuitive cases of stored or forgotten evidence. If, for example, you are driving and not thinking about whether Washington, D.C. is the capital of the United States, or you have forgotten any evidence for this belief, are you justified in believing that it is? If not, could we say that you know it is the capital? Accessibilists claim that a person must be able to access her evidence for a belief while she is currently thinking about it and presumably without prompting. Few of us, though, hold (or even could hold) a belief with all its attendant reasons in mind at once. Similarly, it seems reasonable to imagine that a person is justified in believing a proposition for which she has forgotten her evidence. Mentalists can handle these cases by claiming that the ability to access stored facts can constitute dispositional justification, and that, even in cases of forgotten evidence, the mental states that justify the belief may still be available, either occurrently or dispositionally (Conee and Feldman 2004).

The worry for mentalism is that, in allowing non-occurrent mental states to count as reasons, mentalism betrays its claim to be internalist. For example, there may be a lot of evidence I could have that P is true if I were in the right place at the right time. But the existence of that evidence does not obviously justify P for me since being in such a place might be a matter of luck. Being at the right place at the right time may mean that the evidence that, say, “Washington is the capital,” is in a book nearby that I never happen to read or that the evidence is one of my mental states that I am not currently thinking about, even if I could when prompted. Specifying just what it means for evidence to be available but not occurrent turns out to be quite difficult. Richard Feldman (1988) argues that in neither of these examples am I justified in believing that Washington is the capital and that a mental state counts as evidence if and only if one is currently thinking of P. Feldman embraces the counterintuitive implication that “one does not know things such as that Washington is the capital when one is not thinking of them” (237). Despite these difficulties, the distinction between accessibilism and mentalism plays an important role in the debate over internalism.

For more on accessibilism and mentalism, see §1.c of Internalism and Externalism in Epistemology.

b. Objections to Internalism

In addition to the Gettier problem (§6), there are many other lines of argument that challenge internalism. Here, I review only three. One of these lines is called the access problem. Traditional foundationalists have accepted some version of accessibilism. For example, Roderick Chisholm writes that justification is “internal and immediate in that one can find out directly, by reflection, what one is justified in believing at any time” (1989: 7). But what if the belief P that justifies my current belief Q is tucked far back in the recesses of my memory and would require more time than I currently have to access it? Am I still justified in believing Q? Or worse, imagine that I have forgotten P; there is no possibility that I can directly access it. However, Q seems true to me, I remember that I had good reasons for believing it, and I do not have any reasons to doubt Q now. Am I justified in believing Q in this case?

Without some modification, the internalist must say no in both cases—the relevant evidence is neither immediately nor reflectively available—though intuitively these are normal cases of justified belief. The standard response is two-fold. First, we must admit that justification comes in degrees: having more evidence can increase one’s justification, and some evidence is stronger than other evidence. And second, the states of seeming to be justified or of remembering that I was justified can, themselves, constitute reasons for belief. Therefore, in these cases, the internalist might respond that, while the justifications are not as strong as we would prefer, they are, nonetheless, based on accessible mental states.

A second, related objection to internalism is what, following John Greco, I will call the etiology problem. Internalism tends to make justification so easy that it is unclear how one is able to distinguish between good and bad reasons. Consider an example from Greco (2005):

Charlie is a wishful thinker and believes that he is about to arrive at his destination on time. He has good reasons for believing this, including his memory of train schedules, maps, the correct time at departure and at various stops, etc. However, none of these things is behind his belief—he does not believe what he does because he has these reasons. Rather, it is his wishful thinking that causes his belief. Accordingly, he would believe that he is about to arrive on time even if he were not. (261)

Why is the combination of his beliefs about schedules, maps, and time a better reason for thinking he is about to arrive than wishful thinking? Presumably, it is because those things are reliable indicators of truth, whereas wishful thinking is not. Being a reliable indicator of truth, though, is an external relationship between the belief and the world—something to which Charlie has no access. We can arrive at a similar result from imagining that Charlie does base his beliefs on his beliefs about train schedules, and so forth, but stipulating that he formed those beliefs carelessly and haphazardly, and only accidentally arrived at the correct conclusion. Nevertheless, based on these beliefs, it seems clear to Charlie that the conclusion follows.

An internalist might respond that this objection depends on the mistaken assumption that internal factors exclude empirical evidence. To see how this assumption slips in, consider how an externalist might determine that train schedules are more fitting sources of evidence than wishful thinking. Presumably, externalists would evaluate the past track record of each source of evidence to see which more reliably indicates truth. The act of “reviewing their past track records,” however, involves appealing to internal states about what seems to be their track records and, therefore, is not obviously different from what an internalist would do; one has internal access to evidence that train arrivals correspond more reliably with train schedules than with wishes. By demanding that justification depends only on external features of the belief-forming process, and then appealing to internal features to evaluate external reliability, Greco is not denying that one must have good, accessible reasons for her beliefs; he is simply disguising the internal features by including them in the external conditions (Feldman 2005: 281). Therefore, either objective etiology is essential to justification, in which case, since no one has access to it, we are left with skepticism; or subjective access to evidence of reliable etiology is sufficient for justification, and the externalist criticism misses its mark.

Both the access problem and the etiology problem challenge the idea that we can determine whether we are justified by appeal to internal states. But even if this challenge can be answered, internalism is sometimes thought to imply that we can voluntarily control or change what we believe, that is, that we are guided but not determined by our evidence. The view that we have voluntary control over what we believe is called doxastic voluntarism (from the Greek doxa, “opinion” or “belief”). The idea is that internalism is intuitive partially because it allows us to take responsibility for our epistemic behavior. In fact, “[n]onvoluntarism is generally taken to rule out responsibility, since one is not responsible for what one does not control” (Adler 2002: 64). Taking responsibility implies we can decide to respond to evidence well or poorly. This suggests a third objection to internalism called the guidance problem. (For presentations of the guidance problem, see John Heil 1983 and William Alston 1989.)

It turns out that it is difficult to control what we believe: try to make yourself believe you are not reading this page or that you are not real. It is unclear what it would take to convince you that such things are true. That kind of shift would seem to require a complete change in your evidence. But if that is right, then our beliefs are tied strongly to factors outside our control; we cannot simply decide what evidence we have or whether to believe on the basis of that evidence. According to this critique, the idea that internalism explains how we take responsibility for our beliefs is misguided.

In response, contemporary internalists tend to accept that our beliefs are largely determined by the evidence we perceive ourselves to have, but they reject the idea that complete or even partial voluntary control is necessary for responsibility. Carl Ginet (2001) argues that our control over our beliefs is limited but that we nonetheless may decide what to believe in those cases where the evidence is indecisive, cases “where the subject has it open to her also to not come to believe it” (74). Further, Earl Conee and Richard Feldman (2004) argue that a person’s beliefs may appropriately fit one’s evidence even if she cannot control whether she forms those beliefs. For instance:

Suppose that a person spontaneously and involuntarily believes that the lights are on in the room, as a result of the familiar sort of completely convincing perceptual evidence. This belief is clearly justified, whether or not the person cannot voluntarily acquire, lose, or modify the cognitive process that led to the belief. (85)

For a more comprehensive treatment of the debate between internalists and externalists, see Internalism and Externalism in Epistemology.

6. The Gettier Era

The idea that justification is the crucial link between true belief and knowledge seems to have been implicit in epistemology since Plato. In the Theaetetus, Socrates gives an example of a jury that has been persuaded by hearsay of a true judgment that can only be known by an eye-witness (201b-c). This example shows that “true judgment” is not the same thing as “knowledge,” and, therefore, that some other element is needed. Theaetetus suggests that knowledge is true judgment plus a logos—an account or argument. Socrates considers three ways of giving an account of a true judgment but concludes that none is plausible. Nevertheless, from then until now, philosophers have generally thought something like the Theaetetus’s suggestion must be right, and most of those accounts have been internalist. Socrates’s own suggestion, in Plato’s Meno, is that knowledge is a type of remembrance of what is true based on direct experience prior to being born. Descartes tries to close the gap between true belief and knowledge with the apprehension of clarity and distinctness. Kant attempts to bring them together with the transcendental apperception of the conditions for the possibility of veridical perception. In each case, the knower is assumed to have direct access to something that explains when true belief is knowledge.

Unfortunately, a thought experiment developed in the 20th century challenges the idea that any internal criteria can distinguish knowledge from accidentally true belief. This thought experiment was named the Gettier Problem after Edmund Gettier, who introduced the most influential examples in a famously brief 1963 paper. Examples from other philosophers proliferated after Gettier’s publication, but each new instance is standardly called a “Gettier Case.”

a. The History of the Gettier Problem

The idea is that there are cases where all three conditions on knowledge are met—a belief is justified and true—and yet that belief fails to be knowledge. Although some traditional internalists have allowed that a false belief can be justified, they have resisted the idea that a belief’s justification does not contribute to the likelihood of knowing. But if Gettier cases are successful, it is possible to be justified (in the classic internalist sense) in holding a true belief without that belief’s being knowledge.

The broken clock example in §1 is an early version of this problem, constructed by Bertrand Russell (1948). Here is another example Russell includes alongside his clock case:

There is the man who believes, truly, that the last name of the Prime Minister in 1906 began with a B, but believes this because he believes that Balfour was Prime Minister then, whereas in fact it was Campbell-Bannerman. … Such instances can be multiplied indefinitely, and show that you cannot claim to have known merely because you turned out to be right. (171)

The problem, though, contra Russell, is not merely that such a person turns out to be right; it is that the person’s belief is justified in cases where a belief turns out to be true by luck; justified true belief in these cases does not increase the likelihood that the belief is knowledge. The evidence that justifies the belief is not connected with the truth of the belief in the right way, and, recall from the introduction, believing in the right way is precisely the sort of thing justification is supposed to indicate.

Such cases trace at least as far back as Alexius Meinong (1906), but the most famous are Gettier’s. His cases are interesting because they show that such cases can occur even when our evidence includes logical entailment. In his first example, Gettier asks us to imagine that two men, Smith and Jones, have applied for the same job. Imagine also that Smith has very good reasons for believing: “Jones will get the job” and “Jones has 10 coins in his pocket.” From this, it follows logically that: “The man who will get the job has 10 coins in his pocket,” and Smith forms the belief that this is true. As it turns out, however, Smith has 10 coins in his pocket (though he does not know it) and he will get the job. So, Smith’s belief that the man who will get the job has 10 coins in his pocket is true, and he has good reasons for why this is so, but his reasons are unconnected with the real reasons it is true. Most philosophers have concluded that, since Smith’s true belief is just a matter of luck (and not a function of his reasons’ connection with the state of affairs that make it true), Smith does not know that the man who will get the job has 10 coins in his pocket.
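
Schematically (the formalization is mine, not Gettier’s), Smith’s justified belief is the conjunction J ∧ C (Jones will get the job, and Jones has 10 coins in his pocket), from which he validly infers an existential generalization:

\[ J \wedge C \;\vdash\; \exists x\,(x \text{ will get the job} \wedge x \text{ has 10 coins in his pocket}). \]

Justification transmits across the valid inference, and the conclusion is true; but it is made true by Smith himself rather than by Jones, and that mismatch between Smith’s grounds and the conclusion’s truth-maker is what blocks knowledge.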

Because of the many possible variations on cases like these, the idea that justification is based on evidence to which we have direct access faces a serious challenge. There is no clear sense in which that sort of evidence always or even regularly increases the likelihood that a belief is knowledge.

b. Responses to the Gettier Problem

Some philosophers have tried to save strong internalist justification from Gettier cases. For example, D. M. Armstrong—although he ultimately defends an externalist theory of justification—argues that Gettier cases can be avoided by adding a requirement that all evidence for a belief must be, not merely justified, but also knowledge. In the Gettier case above, since it is false that Jones will get the job, this belief cannot be knowledge for Smith and, therefore, undermines Smith’s ability to know the man who will get the job has 10 coins in his pocket. (See Feldman 1974 for a counterexample.)

Others weaken the requirements on justification by arguing that, while knowledge may have constraints outside our conscious access, justification is more plausibly about responsible or apt belief than truth. Call this weak internalist justification (see Zagzebski, 1996).

Still others argue that Gettier cases suggest either that justification is simply not an internal matter or that knowledge does not require justification. Those who argue that justification is external claim that whether a belief is justified depends on whether there is a law-like connection (conceptual or physical) between a belief and the state of affairs it is about (Bergmann 2006). This approach is externalist because it explains justification in terms of belief-forming processes outside the mental life of the believer. In adopting externalism, some treat internal mental states as irrelevant for justification, while others argue that internal states can play an indirect and partial role in justification. Ernest Sosa (1991), for example, argues that internal states can contribute to the state of affairs that grounds the reliability of certain belief-forming behaviors.

7. Externalist Foundationalism

Gettier cases, in addition to other challenges to internalism, have led some epistemologists to reject the idea that justification requires an internal condition. In its most minimal form, externalism is the view that internalism is false, that is, that some features external to the mental life of a person play a necessary role in justification (Greco 2005: 258). However, many versions of externalism also explicitly reject internal conditions for justification, at least for non-inferential knowledge. Some philosophers have developed externalist accounts of knowledge that lack any account of justification (compare Goldman 1967, though he has since given up this view). The debate between externalists and internalists, though, is primarily about justification. Externalist accounts of justification differ from internalist accounts by challenging the idea that justification is primarily or ultimately about good reasons when good reasons are construed as mental states.

To accommodate the external features that connect beliefs with states of the world, externalists modify what was traditionally meant by justification; rather than appealing to a person’s subjective perspective on her evidence, externalists appeal to the objective features of the belief-forming and -holding behavior. Epistemic standing is not about the reasons a person has; it is about the relationship between a belief and the world, how that belief is formed or how it is maintained, and where the relationship is not a guarantee of truth but a strong indicator of truth, typically because of a causal, lawful, conceptual, or counterfactual connection with the states of affairs the belief is about. The most prominent version of externalism is the view that a belief is justified just in case it is caused by a reliable process, where “reliable” means that the process produces more true beliefs than false.
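
Stated schematically (a rough gloss on the reliabilist idea, not any particular author’s official formulation), where π is the process that produced a belief B and t is a threshold greater than one half:

\[ B \text{ is justified} \iff \frac{\#\{\text{true beliefs produced by } \pi\}}{\#\{\text{beliefs produced by } \pi\}} > t. \]

How finely to individuate the process π and how high to set t are further questions for reliabilists; the formulations discussed below also differ over whether such reliability is sufficient, necessary, or both.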

a. Externalism, Foundationalism, and the DIJ

Externalists agree that, to resolve the DIJ, one needs to avoid infinite regress and skepticism. So, rather than grounding justification in other beliefs (as coherentists do) or in non-belief states (as classical foundationalists do), externalists ground justification and knowledge in the objective way the world contributes to belief formation or maintenance.

Some externalists, like Armstrong (1973) and Goldman (1979), make room for something like basic beliefs, from which something like non-basic beliefs are inferred. This means that contemporary externalists tend to accept the foundationalist structure—some beliefs are produced reliably by non-belief states, and some beliefs can be produced by other beliefs—though they reject the classical, internalist way of drawing the distinction between basic and non-basic beliefs. All belief-forming processes are external to the knower’s mental states, and whether a belief is justified (and, therefore, knowledge) depends on the reliability of those processes.

Unlike classical foundationalists, who appeal to internal seemings, indubitability, or self-evidence to justify basic beliefs, externalists like Goldman argue that such beliefs are knowledge simply because they stand in a reliable relationship with the world. A non-inferential belief is knowledge when and because it is lawfully (Armstrong) or reliably (Goldman) produced.

b. Reliabilism

The concept of reliability is crucial to externalist theories of justification (in contrast to externalist theories of knowledge, for example, Goldman 1967, 1976 and Armstrong 1973). There are two types of reliabilist theories of justification. According to reliable indicator theories, a belief is justified just in case its reason or ground is a reliable indicator of the belief’s truth (Swain 1981 and Alston 1988). According to process reliabilism, a belief is justified just in case it was causally produced by reliable processes (Goldman 1979 and Bach 1985). Although D. M. Armstrong focuses primarily on an externalist theory of knowledge, his “thermometer theory of knowledge” explains that certain mental states serve as reliable indicators or signs of the states of affairs they are about, and therefore make the belief reasonable, or “justifiable.” Comparing non-inferential belief to a thermometer reading, Armstrong writes:

In some cases, the thermometer-reading will fail to correspond to the temperature of the environment. Such a reading may be compared to non-inferential false belief. In other cases, the reading will correspond to the actual temperature. Such a reading is like non-inferential true belief. (166)

There are a number of important qualifications to Armstrong’s view, but the central point is that a belief is justified independently of whether the person has reasons to believe it: “The subject’s belief is not based on reasons, but it might be said to be reasonable (justifiable), because it is a sign, a completely reliable sign, that the situation believed to exist does in fact exist” (183).

The benefit of Armstrong’s law-like account is that it suggests a counterfactual account of causal relations along the following lines: if a person has a means of distinguishing a proposition, P, from a mutually exclusive but very similar proposition Q, then the person is justified in believing P. For example, if Judy and Trudy are twins, but when Sam sees someone who looks like Judy he would not mistake Trudy for Judy, then Sam is justified in believing that he sees Judy. “But if Sam frequently mistakes Judy for Trudy, and Trudy for Judy, he presumably does not have any way of distinguishing between them” (Goldman 1976: 778).

Unfortunately, reliable indicator theories tend to be overly strict in their analysis of cases. Goldman asks us to consider Oscar, who is standing in an open field and sees a dachshund, from which he forms the belief that he sees a dog. As it happens, wolves frequent the field, and Oscar often mistakes wolves for certain dog breeds. If he were to see a wolf, he might easily take it for a dog. Now, is his seeing a dachshund a reliable indicator of seeing a dog? Since Oscar would likely believe he is seeing a dog regardless of whether he is seeing a wolf or a dachshund, reliable indicator theories (at least Armstrong’s) would say his seeing a dachshund is not a reliable indicator of seeing a dog. Whether or not this criticism is ultimately successful, and whether or not it applies to all reliable indicator theories, reliable process theories quickly eclipsed interest in this type of reliabilism.

Process reliabilism is the view that a belief is justified just in case it is produced by a reliable cognitive process, where a cognitive process may include either conscious reasoning processes or unconscious mechanisms. As I formulated it earlier in this article, reliabilism offers a necessary and sufficient condition for justification (“just in case”), but some reliabilists formulate weaker versions. Goldman treats reliability as a sufficient condition (though he argues against the plausibility of alternative sufficient conditions): “If S’s believing p at t results from a reliable cognitive belief-forming process (or set of processes), then S’s belief in p at t is justified” (1979: 13). Kent Bach treats it as only a necessary condition: “The idea, roughly, is that to be justified a belief must be formed as the result of reliable processes…” (1985: 199). Despite these differences, externalists uniformly reject internalist conditions as sufficient for justification. This commitment, however, leaves them open to a number of interesting criticisms.
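The logical difference between these formulations can be set out schematically (a sketch only; the abbreviations are introduced here for convenience and are not the authors’ own notation). Let \( R(S, p, t) \) abbreviate “S’s belief that p at t is produced by a reliable cognitive process” and \( J(S, p, t) \) abbreviate “S’s belief that p at t is justified”:

\[ R(S, p, t) \rightarrow J(S, p, t) \quad \text{(Goldman: reliability as a sufficient condition)} \]
\[ J(S, p, t) \rightarrow R(S, p, t) \quad \text{(Bach: reliability as a necessary condition)} \]
\[ J(S, p, t) \leftrightarrow R(S, p, t) \quad \text{(the biconditional “just in case” reading)} \]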

c. Objections to Externalism

Externalism putatively has the advantage of avoiding the Gettier problem (though this is controversial) and several other skeptical concerns, as well as capturing some important intuitions about knowledge. Nevertheless, it faces several serious criticisms. On the basis of these criticisms, some internalists claim that externalists have simply changed the subject altogether and are not really talking about justification.

One famous criticism of externalism is called the generality problem. Earl Conee and Richard Feldman (1998) present an example to demonstrate the problem:

Suppose that Smith has good vision and is familiar with the visible differences among common species of trees. Smith looks out a house window one sunny afternoon and sees a plainly visible nearby maple tree. She forms the belief that there is a maple tree near the house. Assuming everything else in the example is normal, this belief is justified and Smith knows that there is a maple tree near the house. Process reliabilist theories reach the right verdict about this case only if it is true that the process that caused Smith’s belief is reliable. (372)

Is it reliable? That depends on which process formed the belief. Was it the unique causal set of events leading to that particular belief? If so, reliability cannot get a grip, since a token, or one-time, event has no track record against which to assess it. Reliabilists respond to this challenge by saying it is the type of process that must be reliable in order for a belief to be justified, not the token. If that is right, then we face the problem of determining which type of process formed the belief. Was it the “visually initiated belief-forming process,” the “process of a retinal image of such-and-such specific characteristics leading to a belief that there is a maple tree nearby,” the “process of relying on a leaf shape to form a tree-classifying judgment,” the “perceptual process of classifying by species a tree located behind a solid obstruction,” or any number of others (373)? There are innumerable options, and even if a combination of types were involved, each type would have to meet reliability conditions. Conee and Feldman conclude, “Without a specification of the relevant type, process reliabilism is radically incomplete” (373).

A second objection to externalism is called the New Evil Demon Problem (NEDP) (Cohen and Lehrer 1983). In Descartes’s original evil demon scenario, introduced to motivate the problem of skepticism, we are asked to consider the possibility that all our current perceptions are the fictitious construction of a being intent on deceiving us, such that all our perceptual and intuitive beliefs are false. The NEDP puts the thought experiment to a very different purpose. If the evil demon world is possible, we can imagine two worlds: (1) a non-deceptive world, where our perceptions are reliably produced by the world outside of our minds, and (2) an evil demon world, where there are people just like you and me, who have exactly the same mental states that we do but whose perceptions are systematically unreliable; they track nothing of the truth at that world. There are no trees, buildings, bodies, and so forth. Whatever actually exists at that world, those people have no perception of it. According to externalists (process reliabilists, in particular), the beliefs of people in the real world are justified and those of people in the demon world are unjustified, despite the fact that their mental lives are identical. Yet it is difficult to imagine that demon-world beliefs about looking both ways before crossing the street or getting a second opinion about a medical diagnosis are unjustified. People who believe such things are acting responsibly from their perspective on their evidence. This suggests that reliabilism is not really about justification at all.

A third objection to externalism is what Ernest Sosa (2001) calls the metaincoherence problem, which attempts to show that a person’s belief can be externally reliable while internally unjustified. In the literature, there are two versions of the metaincoherence problem. The first is what I call first-order metaincoherence, which attempts to show that externalism is insufficient for justification. The second is what I call second-order metaincoherence, which challenges the externalist’s reasons for holding externalism.

One famous example of first-order metaincoherence is a thought experiment given in various forms by Laurence BonJour (1985) and Keith Lehrer (1990). Consider Armstrong’s Thermometer Analogy from above. Imagine there was a human thermometer, that is, someone who “undergoes brain surgery by an experimental surgeon who invents a small device which is both a very accurate thermometer and a computational device capable of generating thoughts” (Lehrer 1990: 163). This person, whom Lehrer names Mr. Truetemp, is unaware of the device despite the fact that it regularly causes him to form reliable beliefs that he unreflectively accepts about the temperature. On a given day, he might reliably form and accept the belief that it is 104 degrees Fahrenheit outside. Is this belief knowledge? Lehrer concludes: “Surely not. He has no idea whether he or his thoughts about the temperature are reliable” (164). BonJour concludes similarly, “Part of one’s epistemic duty is to reflect critically upon one’s beliefs, and such critical reflection precludes believing things to which one has, to one’s knowledge, no reliable means of epistemic access” (1985: 42).

The second-order metaincoherence problem is stated by Barry Stroud (1989):

The scientific ‘externalist’ claims to have good reason to believe that his theory is true. It must be granted that if, in arriving at his theory, he did fulfill the conditions his theory says are sufficient for knowing things about the world, then if that theory is correct, he does in fact know that it is. But still, I want to say, he himself has no reason to think that he does have good reason to think that his theory is correct. (321)

The worry is this: externalists claim that features of the world outside the mental life of a believer ultimately determine whether a belief is justified. If externalism is true, then externalists have no reason, available from within their own perspective, to believe that it is true; indeed, they are committed to holding that whether their belief in externalism is justified is something they cannot determine from within their own perspective. Again, the belief may be externally reliable, but it is internally unjustified.

If these criticisms hit their mark, epistemologists must make some difficult decisions about which approach, internalism or externalism, has the fewest or least pernicious problems. In the 21st century, much work is underway to address these problems. For those who remain unconvinced by either approach, there are recent developments that attempt to salvage some of the insights of both internalism and externalism. A prominent example involves introducing character traits into the conditions for justification. We turn next to this view, called virtue epistemology.

8. Justification as Virtue

Classical theories of justification that imply a normative or belief-guiding dimension are modeled largely on normative ethical theories, whether teleological (outcome-based) accounts or deontological (duty-based) accounts. They ask whether people are rationally obligated to, permitted to, or obligated not to hold particular beliefs given their evidence. These are decision-based theories of rational normativity, as opposed to character-based theories. Just as virtue theory offers a non-decision-based alternative in ethics, it also suggests a non-decision-based alternative in epistemology. The attitudes and circumstances under which people form, maintain, and discard beliefs can be described as virtuous or vicious, and just as decision-based theories in epistemology are concerned with rational obligation (as opposed to moral obligation), character-based theories in epistemology are concerned either with intellectual character (as opposed to moral character) or with cognitive faculties understood as traits of a person (such as reason, perception, introspection, and memory). Of course, in matters of normativity, it is not a simple task to distinguish moral dimensions from rational or intellectual ones, but space prevents us from exploring that relationship here.

Virtue theories of justification hold that part of what justifies a belief is the set of intellectual traits with which a believer forms or holds the belief. Just as a person’s moral virtues (kindness, compassion, honesty) contribute to the goodness of an action, a person’s intellectual virtues contribute to the epistemic goodness of a belief. Virtue theorists, however, are sharply divided as to which intellectual virtues are relevant. One prominent view is that justification is a function of those virtues that enhance reliability, that is, those with a strong external component (Sosa 1980; 2007). This view is known as virtue reliabilism.

A second prominent view is that justification is a function of those intellectual virtues that contribute to more general epistemic goods, including intellectual well-being, social trust, and the righting of epistemic injustice. These virtue responsibilists regard the truth-goal in epistemology very differently than both traditional epistemologists and their virtue reliabilist counterparts (Code 1984; Montmarquet 1993; Zagzebski 2000).

a. Virtue Reliabilism

A prominent version of virtue reliabilism is offered by Ernest Sosa (1980) in an attempt to resolve the tension between foundationalists and coherentists. Sosa argues that if beliefs are grounded in truth-conducive intellectual virtues (where truth-conduciveness is conceived in process reliabilist terms), then foundationalists can appeal to empirically stable abilities or acquired habits that help explain the connection between sensory experience and non-inferential belief. Further, reliable virtues help explain how justification emerges from a coherent set of beliefs, since coherence is itself a type of intellectual virtue.

What do these intellectual virtues look like for Sosa? Borrowing an example from his (2007), consider an archer who is aiming at a target. In order to be successful, the archer must have a degree of competence, which Sosa calls “adroitness,” and the shot must be accurate. These features are analogous to the epistemic state of having a true belief (accuracy) that is formed on the basis of good evidence (adroitness). These two features alone, though, are insufficient for the person to believe in the right way. The person must also exercise his adroitness in circumstances that increase his likelihood of having accurate beliefs, that is, his shot must be accurate because it is adroit. Sosa calls this third feature “aptness,” “its being true because competent” (2007: 23). Some of these circumstances will be outside the believer’s control—wind gusts in the archer’s case; causal ties to the world in the epistemic case. But some—for example, the virtues—are within the believer’s control.

Sosa explains:

Aptness depends on just how the adroitness bears on the accuracy. The wind may help some, for example…. If the shot is difficult, however, from a great distance, the shot might still be accurate sufficiently through adroitness to count as apt, though with some help from the wind. (2007: 79)

Notice that the role of the wind is analogous to certain external features of a person’s belief-forming state. Nevertheless, intellectual virtues like those mentioned above can increase one’s adroitness and thereby increase the likelihood of accuracy.

Imagine a person who has good evidence that P but who either does not appeal to that evidence when forming the belief that P, appealing instead to, say, wishful thinking, or who appeals to that evidence carelessly, refusing to consider alternatives or just how strong the evidence is. Despite this person’s having good evidence, her belief is not apt because the belief’s truth was not due to the person’s competence with the evidence.

Because of this external dimension, this branch of virtue epistemology is regarded as a form of reliabilism. Unlike externalist foundationalism, however, the reliability condition is not restricted to belief-forming processes; it is also highly dependent on context. Sosa says:

An archer might manifest sublime skill in a shot that does hit the bull’s-eye. This shot is then both accurate and adroit. But it could still fail to be accurate because adroit. The arrow might be diverted by some wind, for example, so that, if conditions remained normal thereafter, it would miss the target altogether. However, shifting winds might then ease it back on track towards the bull’s-eye. (79)

In epistemic cases, the believer must be suitably virtuous such that, under normal conditions, her beliefs are accurate because they are adroit.

b. Virtue Responsibilism

Sosa’s account has been well-received, though there is disagreement as to whether it is sufficient for solving the problems at issue. One prominent criticism is that Sosa does not take his use of virtues far enough. Rather than serving a more basic truth-goal, some argue that virtues should be conceived as central to the epistemic project.

Lorraine Code (1984) coined the term virtue responsibilism in contrast to Sosa’s reliabilism; it is the view that justification, or rather, being an intellectually responsible agent, is a matter of acting virtuously in the practice of inquiry. Code argues that epistemic responsibility is the central intellectual virtue. Similarly, James Montmarquet argues that, “S is subjectively justified in believing p insofar as S is epistemically virtuous in believing p” (1993: 99). This means that virtue responsibilism is internalist through and through.

Not all virtue responsibilists, however, eschew the truth-goal. As Linda Zagzebski explains, “It would not do any good for a person to be attentive, thorough, and careful unless she was generally on the right track” (2009: 82). But unlike externalist foundationalism, “the right track,” according to virtue epistemologists, does not necessarily include producing more true beliefs than false. There is more than one virtuous outcome, for example, in cases of creativity or inventiveness. It may be that “only 5 per cent of a creative thinker’s original ideas turn out to be true,” Zagzebski explains. “Clearly, their truth conduciveness in the sense of producing a high proportion of true beliefs is much lower than that of the ordinary virtues of careful and sober inquiry, but they are truth conducive in that they are necessary for the advancement of knowledge” (2000: 465). This suggests that the conditions under which a subject is justified are highly contingent on changing context and the goal of our epistemic behaviors. And virtue epistemologists argue that this captures the typical contingency of our epistemic lives.

c. Objections to Virtue Epistemology

In addition to internal disputes between virtue reliabilists and responsibilists, there are more serious concerns about the adequacy of virtue epistemology. Virtue reliabilism faces many of the same criticisms that face traditional reliabilism, including the generality problem, the New Evil Demon Problem, and the metaincoherence problems. Further, although there is an intuitive sense in which a reliably functioning method of forming beliefs is virtuous (in the Aristotelian sense of “excellence”), it is not clear how virtue reliabilism is substantively different from classical reliabilism. To be sure, virtue reliabilists take special pains to explain the roles of context, luck, and the knower’s aptness in forming beliefs, but these resources do not seem unavailable to traditional reliabilists.

Similarly, virtue responsibilism faces many of the same problems as virtue ethics. There are questions about which intellectual states count as epistemic virtues (different responsibilists have different lists), about whether some virtues should be privileged over others (for example, James Montmarquet (1992) argues that epistemic conscientiousness is the preeminent intellectual virtue), and about the ontological status of virtues (whether they are real dispositions or simply heuristics for categorizing types of behavior). There are also serious concerns about some extreme versions of responsibilism that completely disconnect intellectual virtue from truth-seeking, as with Code’s account, rendering discussions of intellectual virtue the province of ethics rather than epistemology.

To alleviate some of these concerns, some virtue epistemologists defend a mixed theory, arguing that an adequate virtue epistemology requires both a reliability and a responsibility condition (Greco 2000).

A general concern for both types of virtue epistemology is that virtue theory associates justification too closely with the idea of credit or achievement, that is, with whether a person has formed beliefs well. Jennifer Lackey (2007, 2009), for example, argues that if knowledge is produced by the virtuous activity of others (like that of a reliable witness) or if knowledge is innate, then it is not obvious how a person’s belief-forming behavior can be virtuous or vicious, as there is no relevant behavior involved. In the case of the reliable witness, a hearer simply accepts a belief on the basis of the witness’s testimony. In the case of innate knowledge, the knower does nothing to increase the likelihood that her beliefs are reliable; they are reliable for reasons outside her epistemic behavior. If these criticisms are right, virtue epistemology may be unable to explain a range of important types of knowledge.

For a more detailed treatment of virtue epistemology, see Virtue Epistemology.

9. The Value of Justification

Each of the theories of justification reviewed in this article presumes something about the value of justification, that is, about why justification is good or desirable. Traditionally, as in the case of the Theaetetus noted above, justification is supposed to position us to understand reality, that is, to help us obtain true beliefs for the right reasons. Knowledge, we suppose, is valuable, and justification helps us attain it. However, skeptical arguments, the influence of external factors on our cognition, and the influence of various attitudes on the way we conduct our epistemic behavior suggest that attaining true beliefs for the right reasons is a forbidding goal, and it may not be one that we can access internally. Therefore, there is some disagreement as to whether justification should be understood as aimed at truth or at some other intellectual goal or set of goals.

a. The Truth Goal

All the theories we have considered presume that justification is a necessary condition for knowledge, though there is much disagreement about what precisely justification contributes to knowledge. Some argue that justification is fundamentally aimed at truth, that is, it increases the likelihood that a belief is true. Laurence BonJour writes, “If epistemic justification were not conducive to truth in this way…then epistemic justification would be irrelevant to our main cognitive goal and of dubious worth” (1985: 8). Others argue that there are a number of epistemic goals other than truth and that in some cases, truth need not be among the values of justification. Jonathan Kvanvig explains:

[I]t might be the case that truth is the primary good that defines the theoretical project of epistemology, yet it might also be the case that cognitive systems aim at a variety of values different from truth. Perhaps, for instance, they typically value well-being, or survival, or perhaps even reproductive success, with truth never really playing much of a role at all. (2005: 285)

Given this disagreement, we can distinguish between what I will call the monovalent view, which takes truth as the sole, or at least fundamental, aim of justification, and the polyvalent view (or, as Kvanvig calls it, the plurality view), which allows that there are a number of aims of justification, not all of which are even indirectly related to truth.

b. Alternatives to the Truth Goal

One motive for preferring the monovalent view is that, if truth is not the primary goal of justification (that is, if justification does not connect belief with reality in the right way), then one is left only with goals that are not epistemic, that is, goals that cannot contribute to knowledge. The primary worry is that, in rejecting the truth goal, one is left with pragmatism. In response, those who defend polyvalence argue that, in practice, there are other cognitive goals that are (1) not merely pragmatic and (2) meet the conditions for successful cognition. Kvanvig explains that “not everyone wants knowledge…and not everyone is motivated by a concern for understanding. … We characterize curiosity as the desire to know, but small children lacking the concept of knowledge display curiosity nonetheless” (2005: 293). Further, much of our epistemic activity, especially in the sciences, is directed toward “making sense of the course of experience and having found an empirically adequate theory” (ibid., 294). Such goals can be achieved without appealing to truth at all. If this is right, justification aims at a wider array of cognitive states than knowledge.

Another argument for polyvalence allows that knowledge is the primary aim of justification but that much more is involved in justification than truth. The idea is that, even if one were aware of belief-forming strategies that are conducive to truth (following the evidence where it leads; avoiding fallacies), one might still not be able to use those strategies without having other cognitive aims, namely, intellectual virtues. Following John Dewey, Linda Zagzebski says that “it is not enough to be aware that a process is reliable; a person will not reliably use such a process without certain virtues” (2000: 463). As noted above, virtue responsibilists allow that the goal of having a large number of true beliefs can be superseded by the desire to create something original or inventive. Further still, following strategies that are truth-conducive under some circumstances can lead to pathological epistemic behavior. Amélie Rorty, for example, argues that belief-forming habits become pathological when they continue to be applied in circumstances no longer relevant to their goals (Zagzebski, ibid., 464). If this argument is right, then truth is, at best, an indirect aim of justification, and intellectual virtues like openness, courage, and responsibility may be more important to the epistemic project.

c. Objections to the Polyvalent View

One response to the polyvalent view is to concede that there are apparently many cognitive goals that fall within the purview of epistemology but to argue that all of these are related to truth in a non-trivial way. The goal of having true beliefs is a broad and largely indeterminate goal. According to Marian David, we might fulfill it by believing a truth, by knowing a truth, by having justified beliefs, or by having intellectually virtuous beliefs. All of these goals, argues David, are plausibly truth-oriented in the sense that they derive from, or depend on, a truth goal (David 2005: 303). David supports this claim by asking us to consider which of the following pairs is more plausible:

A1. If you want to have TBs [true beliefs] you ought to have JBs [justified beliefs].

A2. We want to have JBs because we want to have TBs.

B1. If you want to have JBs you ought to have TBs.

B2. We want to have TBs because we want to have JBs. (2005: 303)

David says, “[I]t is obvious that the A’s [sic] are way more plausible than the B’s. Indeed, initially one may even think that the B’s have nothing going for them at all, that they are just false” (ibid.). This intuition, he concludes, tells us that the truth-goal is more fundamental to the epistemic project than anything else, even if one or more other goals depend on it.

Almost all theories of epistemic justification allow that we are fallible, that is, that our justified beliefs, even if formed by reliable processes, may sometimes be false. Nevertheless, this does not detract from the claim that the aim of justification is true belief, so long as it is qualified as true belief held in the right way.

d. Rejections of the Truth Goal

In spite of these arguments, some philosophers explicitly reject the truth goal as essential to justification and cognitive success. Michael Williams (1991), for example, rejects the idea that truth even could be an epistemic goal when conceived of as “knowledge of the world.” Williams argues that in order for us to have knowledge of the world, there must be a unified set of propositions that constitute knowledge of the world. Yet, given competing uses of terms, vague domains of discourse, the failure of theoretical explanations, and the existence of domains of reality we have yet to encode into a discipline, there is not a single, unified reality to study. Williams argues that because of this, we do not necessarily have knowledge of the world:

All we know for sure is that we have various practices of assessment, perhaps sharing certain formal features. It doesn’t follow that they add up to a surveyable whole, to a genuine totality rather than a more or less loose aggregate. Accordingly, it does not follow that a failure to understand knowledge of the world with proper generality points automatically to an intellectual lack. (543)

In other words, our knowledge is not knowledge of the world—that is, access to a unified system of true beliefs, as the classical theory would have it. It is knowledge of concepts in theories putatively about the world, constructed using semantic systems that are evaluated in terms of other semantic systems. If this is, in fact, all there is to knowing, then truth, at least as classically conceived, is not a meaningful goal.

Another philosopher who rejects the truth goal is Stephen Stich (1988; 1990). Stich argues that, given the vast amount of disagreement among novices and experts about what counts as justification, and given the many failures of theories of justification to adequately ground our beliefs in anything other than calibration among groups of putative experts, it is simply unreasonable to believe that our beliefs track anything like truth. Instead, Stich defends pragmatism about justification, that is, justification just is practically successful belief; thus, truth cannot play a meaningful role in the concept of justification.

A response to both views might be that, in each case, the truth goal has not been abandoned but simply redefined or relocated. Correspondence theories of truth take it that propositions are true just in case they express the world as it is. If the world is not expressible propositionally, as Williams seems to suggest, then this type of truth is implausible. Nevertheless, a proposition might be true in virtue of being an implication of a theory, and so we might adopt a more semantic than ontological theory of truth; it is not clear whether Williams would reject this sort of truth as the aim of epistemology.

Similarly, someone might object to Stich’s treating pragmatism as if it were not truth-conducive in any relevant sense. If something is useful, it is true that it is useful, even in the correspondence sense. Even if evidence does not operate in a classical representational manner, the success of beliefs in accomplishing our goals is, nevertheless, a truth goal. (See Kornblith 2001 for an argument along these lines.)

10. Conclusion

Epistemic justification is an evaluative concept about the conditions for right or fitting belief. A plausible theory of epistemic justification must explain how beliefs are justified, the role justification plays in knowledge, and the value of justification. A primary motive behind theories of justification is to solve the dilemma of inferential justification. To do this, one might accept the inferential assumption and argue that justification emerges from a set of coherent beliefs (internalist coherentism) or an infinite set of beliefs (infinitism). Alternatively, one might reject the inferential assumption and argue that justification derives from basic beliefs (internalist foundationalism) or through reliable belief-forming processes (externalist reliabilism). If none of these views is ultimately plausible, one might pursue alternative accounts. For example, virtue epistemology introduces character traits to help avoid problems with these classical theories. Other alternatives include hybrid views, such as Conee and Feldman’s (2008), mentioned above, and Susan Haack’s (1993) foundherentism.

11. References and Further Reading

  • Aikin, S. 2009. “Don’t Fear the Regress: Cognitive Values and Epistemic Infinitism.” Think, 23, 55-61.
  • Aikin, S. F. 2011. Epistemology and the Regress Problem. London: Routledge.
  • Alston, W. P. 1988. “An Internalist Externalism,” Synthese, 74, 265-283.
  • Alston, W. P. 1989. Epistemic Justification. Ithaca: Cornell University Press.
  • Armstrong, D. M. 1973. Belief, Truth, and Knowledge. Cambridge: Cambridge University Press.
  • Bach, K. 1985. “A Rationale for Reliabilism.” The Monist, 68, 246-63. Reprinted in S. Bernecker and F. Dretske, eds. 2000. Knowledge: Readings in Contemporary Epistemology. Oxford: Oxford University Press, 199-213. Cited pages are to this anthology.
  • Bergmann, M. 2006. Justification Without Awareness. New York: Oxford.
  • Blanshard, B. 1939. The Nature of Thought. London: Allen & Unwin.
  • BonJour, L. 1980. “Externalist Theories of Empirical Knowledge.” Midwest Studies in Philosophy 5: Studies in Epistemology. Minneapolis: University of Minnesota Press, 53-73.
  • BonJour, L. 1985. The Structure of Empirical Knowledge. Cambridge: Harvard University Press.
  • BonJour, L. and E. Sosa. 2003. Epistemic Justification: Internalism vs. Externalism, Foundations vs. Virtues. Malden: Wiley-Blackwell.
  • Chisholm, R. 1966. Theory of Knowledge. Englewood Cliffs: Prentice Hall.
  • Chisholm, R. 1982. “A Version of Foundationalism,” in The Foundations of Knowing, ed. R. Chisholm. Minneapolis: University of Minnesota Press.
  • Chisholm, R. 1989. Theory of Knowledge, 3rd ed. Englewood Cliffs: Prentice Hall.
  • Code, L. 1984. “Toward a ‘Responsibilist’ Epistemology.” Philosophy and Phenomenological Research, 45 (1), 29–50.
  • Cohen, S. and K. Lehrer. 1983. “Justification, Truth, and Coherence.” Synthese 55 (2), 191-207.
  • Conee, E. and R. Feldman. 2004. Evidentialism. New York: Oxford University Press.
  • Conee, E. and R. Feldman. 1998. “The Generality Problem for Reliabilism,” in E. Sosa and J. Kim, eds. Epistemology: An Anthology. Malden: Blackwell Publishers, 372-386. Page numbers are to this anthology.
  • Dancy, J. 1985. Introduction to Contemporary Epistemology. Oxford: Basil Blackwell.
  • David, M. 2005. “Truth as the Primary Epistemic Goal: A Working Hypothesis,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 296-312.
  • Davidson, D. 2000. “A Coherence Theory of Truth and Knowledge,” in E. Sosa and J. Kim, eds. Epistemology: An Anthology. Malden: Blackwell Publishers, 154-63.
  • Elgin, C. 2005. “Non-foundationalist Epistemology: Holism, Coherence, and Tenability,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 156-67.
  • Feldman, R. 1974. “An Alleged Defect in Gettier Counter-examples.” The Australasian Journal of Philosophy. 52, 68-69.
  • Feldman, R. 1988. “Having Evidence,” in Philosophical Analysis, ed. D. F. Austin. Kluwer Academic Publishers, 83-104.
  • Feldman, R. 2005. “Justification Is Internal,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 270-84.
  • Fitelson, B. 2003. “A Probabilistic Measure of Coherence.” Analysis, 63, 194–199.
  • Frankfurt, H. 1973/2008. Demons, Dreamers, and Madmen: The Defense of Reason in Descartes’s Meditations. Princeton: Princeton University Press.
  • Gettier, E. 1963. “Is Justified True Belief Knowledge?” Analysis. 23, 121-23.
  • Ginet, C. 2001. “Deciding to Believe,” in Knowledge, Truth and Duty, ed. Matthias Steup. Oxford: Oxford University Press, 63-76.
  • Ginet, C. 2005. “Infinitism Is not the Solution to the Regress Problem,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 140-149.
  • Goldman, A. 1967. “A Causal Theory of Knowing.” The Journal of Philosophy, 64, 357-72.
  • Goldman, A. 1976. “Discrimination and Perceptual Knowledge.” The Journal of Philosophy, 73, 771-91.
  • Goldman, A. 1979. “What Is Justified Belief?” in Knowledge and Justification, ed. George S. Pappas. Dordrecht, Holland: D. Reidel Publishing, 1-23.
  • Greco, J. 2005. “Justification Is Not Internal,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 257-70.
  • Haack, S. 1993. Evidence and Inquiry. Malden: Blackwell Publishing.
  • Harman, G. 1986. Change in View. Cambridge: MIT Press.
  • Heil, J. 1983. “Doxastic Agency.” Philosophical Studies, 43 (3), 355-364.
  • Hempel, C. 1935. “On the Logical Positivist’s Theory of Truth.” Analysis, 2 (4), 49-59.
  • Klein, P. 2005. “Infinitism Is the Solution to the Regress Problem,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 131-40.
  • Klein P. 2014. “No Final End in Sight,” in Current Controversies in Epistemology, ed. R. Neta. London: Routledge, 95-115.
  • Klein, P. and T. A. Warfield. 1994, “What Price Coherence?” Analysis, 54, 129–132.
  • Kornblith, H. 2001. Knowledge and Its Place in Nature. Oxford: Oxford University Press.
  • Kvanvig, J. L. and W. D. Riggs. 1992. “Can a Coherence Theory Appeal to Appearance States?” Philosophical Studies, 67, 197-217.
  • Kvanvig, J. 2005. “Truth Is not the Primary Epistemic Goal,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 285-96.
  • Lackey, J. 2007. “Why We Don’t Deserve Credit for Everything We Know.” Synthese, 158, 345–361.
  • Lackey, J. 2009. “Knowledge and Credit.” Philosophical Studies, 142, 27–42.
  • Lehrer, K. 1974. Knowledge. Oxford: Clarendon Press.
  • Lehrer, K. 1986. “The Coherence Theory of Knowledge.” Philosophical Topics, 14, 5-25.
  • Lehrer K. 1990. Theory of Knowledge. Boulder: Westview Press.
  • Lewis, C. I. 1946. An Analysis of Knowledge and Valuation. LaSalle: Open Court.
  • McGrew, T. 1995. The Foundations of Knowledge. Lanham: Rowman & Littlefield.
  • Meinong, A. 1906. “Über die Erfahrungsgrundlagen unseres Wissens” [“On the Experiential Foundations of Our Knowledge”], in Abhandlungen zur Didaktik und Philosophie der Naturwissenschaften, Band [Vol.] I, Heft [Issue] 6, Berlin: J. Springer. Reprinted in Meinong 1968–78, Vol. V: 367–481.
  • Montmarquet, J. 1987. “Epistemic Virtue.” Mind, 96, 482–497.
  • Neurath, O. 1983/1932. “Protocol Sentences,” in R. S. Cohen and M. Neurath, eds. Philosophical Papers 1913–1946. Dordrecht: Reidel.
  • Olsson, E. J. 2009. Against Coherence: Truth, Probability, and Justification. Oxford: Oxford University Press.
  • Plantinga, A. 1983. “Reason and Belief in God,” in A. Plantinga and N. Wolterstorff, eds. Faith and Rationality. Notre Dame: University of Notre Dame Press, 16-93.
  • Plantinga, A. 1993a. Warrant and Proper Function. New York: Oxford.
  • Plantinga, A. 1993b. Warrant: The Current Debate. New York: Oxford.
  • Pollock, J. 1986. Contemporary Theories of Knowledge. Lanham: Rowman & Littlefield Publishers.
  • Poston, T. 2014. Reason and Explanation: A Defense of Explanatory Coherentism. Hampshire: Palgrave Macmillan.
  • Quine, W.V.O. and J. S. Ullian. 1970. The Web of Belief. New York: Random House.
  • Russell, B. 1948. Human Knowledge: Its Scope and Limits. London: Routledge.
  • Smithies, D. 2014. “Can Foundationalism Solve the Regress Problem?” in Current Controversies in Epistemology, ed. R. Neta. London: Routledge, 73-94.
  • Sosa, E. 1980. “The Raft and the Pyramid: Coherence Versus Foundations in the Theory of Knowledge.” Midwest Studies in Philosophy, 5 (1), 3–26.
  • Sosa, E. 1991. “Reliabilism and Intellectual Virtue,” in Knowledge in Perspective: Selected Essays in Epistemology. New York: Cambridge University Press, 131-145.
  • Sosa, E. 2001. “Reliabilism and Intellectual Virtue,” in Epistemology: Internalism and Externalism, ed. Hilary Kornblith. Malden: Blackwell Publishers, 147-62.
  • Sosa, E. 2007. A Virtue Epistemology: Apt Belief and Reflective Knowledge, Volume 1. Oxford: Oxford University Press.
  • Stich, S. 1988. “Reflective Equilibrium, Analytic Epistemology, and the Problem of Cognitive Diversity.” Synthese, 74, 391-413.
  • Stich, S. 1990. The Fragmentation of Reason. Cambridge: The MIT Press.
  • Stroud, B. 1989. “Understanding Human Knowledge in General,” in Knowledge and Skepticism, Marjorie Clay and Keith Lehrer, eds. Boulder: Westview, 31-50. Reprinted in S. Bernecker and F. Dretske, eds. 2000. Knowledge: Readings in Contemporary Epistemology. Oxford: Oxford University Press, 307-323. Page numbers are to this anthology.
  • Swain, Marshall. 1981. Reasons and Knowledge. Ithaca: Cornell University Press.
  • Williams, M. 2000. “Epistemological Realism,” in Epistemology: An Anthology, eds. E. Sosa and J. Kim. Malden: Blackwell Publishers, 536-555.
  • Zagzebski, L. 1996. Virtues of the Mind: An Inquiry into the Nature of Virtue and the Ethical Foundation of Knowledge. Cambridge: Cambridge University Press.
  • Zagzebski, L. 2000. “Virtues of the Mind,” in Epistemology: An Anthology, eds. E. Sosa and J. Kim. Malden: Blackwell Publishers, 457-467.
  • Zagzebski, L. 2009. On Epistemology. Belmont: Wadsworth.

 

Author Information

Jamie Carlin Watson
Email: jamie.c.watson@gmail.com
Broward College
U. S. A.

Laozi (Lao-tzu, fl. 6th cn. B.C.E.)

Laozi is the name of a legendary Daoist philosopher, the alternate title of the early Chinese text better known in the West as the Daodejing, and the moniker of a deity in the pantheon of organized “religious Daoism” that arose during the later Han dynasty (25-220 C.E.). Laozi is the pinyin romanization for the Chinese characters which mean “Old Master.” Laozi is also known as Lao Dan (“Old Dan”) in early Chinese sources (see Romanization systems for Chinese terms). The Zhuangzi (late 4th century B.C.E.) is the first text to use Laozi as a personal name and to identify Laozi and Lao Dan. The earliest materials to mention Laozi are in the Zhuangzi’s Inner Chapters (Chs. 1-7), in the narration of Lao Dan’s funeral in Ch. 3. Two other passages provide support for the linkage of Laozi and Lao Dan (in Ch. 14 and Ch. 27). There are seventeen passages in which Laozi plays a role in the Zhuangzi. Three are in the Inner Chapters, eight occur in the Yellow Emperor sections of the text (chs. 11, 12, 13, 14), five are in chapters likely belonging to Zhuang Zhou’s disciples as the sources (chs. 21, 22, 23, 25, 27), and one is in the final concluding editorial chapter (ch. 33). In the Yellow Emperor sections in which Laozi is the main figure, four passages contain direct attacks on Confucius and the Confucian virtues of ren, yi, and li in the form of dialogues. The sentiments expressed by Laozi in these passages are reminiscent of remarks from the Daodejing and probably date from the period in which that collection was reaching some near final form. Some of these themes include the advocacy of wu-wei, rejection of discursive reasoning and mind meddling, condemnation of making discriminations, and valorization of forgetting and fasting of the mind. The earliest ascriptions of authorship of the Daodejing to Laozi are in the Han Feizi and the Huainanzi. Over time, Laozi became a principal figure in institutionalized forms of Daoism, and he was often associated with the many transformations and incarnations of the dao itself.

Table of Contents

  1. Laozi and Lao Dan in the Zhuangzi
  2. Laozi and the Daodejing
  3. The First Biography and the Establishment of Laozi as the Founder of Daoism
  4. The Ongoing Laozi Myth
  5. References and Further Reading

1. Laozi and Lao Dan in the Zhuangzi

The Zhuangzi gives the following, probably fictional, account of Confucius‘s impression of Laozi:

“Master, you’ve seen Lao Dan—what estimation would you make of him?” Confucius said, “At last I may say that I have seen a dragon—a dragon that coils to show his body at its best, that sprawls out to display his patterns at their best, riding on the breath of the clouds, feeding on the yin and yang. My mouth fell open and I couldn’t close it; my tongue flew up and I couldn’t even stammer. How could I possibly make any estimation of Lao Dan!” Zhuangzi, Ch. 14

Laozi’s relationship to Confucius is a major part of the Zhuangzi‘s picture of the philosopher. Of the seventeen passages mentioning Laozi, Confucius figures as a dialogical partner or subject in nine. While it is clear that Confucius is thought to have a long way to go to become a zhenren (the Zhuangzi‘s way of speaking about the perfected person), Lao Dan seems to feel sorry for Confucius in his reply to Wuzhi “No-Toes” in Ch. 5, The Sign of Virtue Complete. Laozi recommends to Wuzhi that he try to release Confucius from the fetters of his tendency to make rules and human discriminations (for example, right/wrong; beautiful/ugly) and set him free to wander with the dao.

Lao Dan addresses Confucius by his personal name “Qiu” in three passages. Since such a liberty is one that only a person with seniority and authority would take, this style invites us to believe that Confucius was a student of Lao Dan’s and thereby acknowledged Laozi as an authority. In one of these passages in which Lao Dan uses Confucius’s personal name Qiu, he cautions Confucius against clever arguments and making plans and strategies with which to solve life’s problems, telling him that such rhetoricians are simply like nimble monkeys and rat-catching dogs that are set aside when unable to perform (Ch. 12, Heaven and Earth). And on another occasion, Qiu claims that he knows the “six classics” thoroughly and that he has tried to persuade 72 kings of their truth, but they have been unmoved. Lao Dan’s reply is, “Good!” He tells Confucius not to occupy himself with such worn out ways, and to instead live the dao himself (Ch. 14, Turning of Heaven).

In Sima Qian’s later attempt to provide an actual biography of Laozi (see below), Laozi’s vocation as a librarian figures prominently. If the ultimate source of this tradition is the Zhuangzi, we should not forget that the context of this record is as a component in the theme that Laozi taught Confucius, who was confused and having no success with his own teachings. Accordingly, the point of the story that mentions Laozi’s occupation as librarian or archivist (ch. 13) is that Confucius’s writings, offered to Laozi by Confucius himself, are simply not worthy to be put into a library. We cannot be sure, then, that there is any real memory of Laozi’s occupation being preserved for us, as the story may be an entire fiction meant to make a point about the inadequacy of Confucius’s teachings.

Finally, in Ch. 14, Turning of Heaven, Lao Dan makes a direct attack not only on the rules and regulations of Confucius, but also on the teachings of the Mohists and on the veneration of the ancient emperors and legendary sages of the past, displaying his preference for experiential oneness with dao over any teaching or tradition of philosophers or great minds of the past.

2. Laozi and the Daodejing

The ways in which the expressions of Laozi in the seventeen passages in which he occurs in the Zhuangzi sound like sentiments in the Daodejing (hereafter, DDJ) collectively represent one basis for the traditional association of Laozi with the authorship of the text. For example, at Laozi’s funeral in Ch. 3, Qin Shi valorizes Laozi by saying that he accomplished much without appearing to do so, which is a reference both to the Old Master’s rejection of the pursuit of fame and power and to praise for his conduct as wu-wei (effortless action) in oneness with dao. Qin Shi’s praise of Laozi is also consistent with Laozi’s teaching to Yangzi Ju in Ch. 7 not to seek fame and power. Such conduct and attitudes are encouraged strongly in DDJ 2, 7, 22, 24, 51 and 77. When Laozi tells Wuzhi to return to Confucius and set him free from the disease of problematizing life and tying himself in knots by helping him to empty himself of making discriminations (Zhuangzi ch. 5), this same teaching shows up in the DDJ in many places (for example, chs. 5 and 18). Likewise, when Laozi criticizes Confucius for trying to spread the classics (12 in number in ch. 13 and 6 in ch. 14) instead of valuing the wordless teaching, the DDJ has a ready parallel in Ch. 56. While Confucius is teaching his disciples to put forth effort and cultivate benevolence (ren) and appropriate conduct (yi), Laozi tells him that he should be teaching effortless action (wu-wei) (Zhuangzi chs. 13, 14, and 21). This teaching also shows up in the DDJ (chs. 2, 3, 20, 47, 48, 57, 63, and 64). Finally, if we take Zhuangzi Ch. 33 as an original part of the work, then Lao Dan (Laozi) actually quotes DDJ 28.

In addition to the ways in which Laozi’s teachings in the Zhuangzi sound like those of the DDJ, we should also note that both of the very early classical works known as the Hanfeizi and the Huainanzi contain passages that are direct quotes of or unmistakable allusions to teachings in the DDJ and attribute them to Lao Dan or Laozi by name. Tae Hyun Kim has made a study of these passages in the Hanfeizi, and the recent English translation of the Huainanzi by John Major and others makes it easy to locate these citations (for example, see Huainanzi, 11.3). All of these connections culminate in Sima Qian’s biography of Laozi (see below), which not only says that Laozi was the author of the DDJ, but explains that it was a written text of Laozi’s teachings given when he departed China to go to the West. So, by the 1st century B.C.E., it was accepted by tradition and lore that Laozi was the author of the DDJ.

However, the attribution of authorship of the DDJ to Laozi is much more complicated than it first appears.  The DDJ has 81 chapters and about 5,000 Chinese characters, depending on which text is used. Its two major divisions are the dao jing (chs. 1-37) and the de jing (chs. 38-81). But actually, this division probably rests on nothing other than the fact that the principal concept opening Chapter 1 is dao (way) and that of Chapter 38 is de (virtue). Moreover, although the text has been studied by commentators in Chinese history for centuries, the general reverence shown to it, and the long standing tradition that it was the work of the great philosopher Laozi, were two factors militating against any critical literary analysis of its structure. What we know now is that in spite of the view that the text had a single author named Laozi, it is clear to textual critics that the work is a collection of smaller passages edited into sections and not the work of a single hand. Most of these probably circulated orally, perhaps as single teachings or in small collections. Later they were gathered and arranged by an editor.

The internal structure of the DDJ is only one ground for denying a single author for the text. The fact that we now know there were multiple versions of the DDJ, even as early as 300 B.C.E., also suggests that it is unlikely that a single author wrote just one book that we now know as the DDJ. Consider that for almost 2,000 years, the Chinese text used by commentators in China, and upon which all except the most recent Western-language translations were based, has been called the Wang Bi, after the commentator who made a complete edition of the DDJ sometime between 226 and 249 C.E. Although Wang Bi was not a Daoist, the commentary he wrote after collecting and editing the text became a standard interpretive guide, and generally speaking even today scholars depart from his arrangement of the actual text only when they can make a compelling argument for doing so. However, based on recent archaeological finds at Guodian in 1993 and Mawangdui in the 1970s, we have no doubt that there were several simultaneously circulating versions of the DDJ text that pre-dated Wang Bi’s compilation of what we now call the “received text.”

Mawangdui is the name for a site of tombs discovered near Changsha in Hunan province. The Mawangdui discoveries include two incomplete editions of the DDJ on silk scrolls (boshu), now simply called “A” and “B.” These versions have two principal differences from the Wang Bi. Some divergences in word choice are present. The order of the chapters is reversed, with chapters 38-81 in the Wang Bi coming before chapters 1-37 in the Mawangdui versions. More precisely, the order of the Mawangdui texts takes the traditional 81 chapters and sets them out like this: 38, 39, 40, 42-66, 80, 81, 67-79, 1-21, 24, 22, 23, 25-37. Robert Henricks has published a translation of these texts with extensive notes and comparisons with the Wang Bi under the title Lao-Tzu, Te-tao Ching.

The Guodian find consists of 730 inscribed bamboo slips found in a tomb near the village of Guodian in Hubei province in 1993. There are 71 slips with material that is also found in 31 of the 81 chapters of the DDJ and corresponding only to Chapters 1-66. Based on the probable date of the closing of the tomb, the version of the DDJ found within it may date as early as c. 300 B.C.E.

3. The First Biography and the Establishment of Laozi as the Founder of Daoism

We have now arrived at the stage where studies of Laozi’s biography usually begin.

The first known attempt to write a biography of Laozi is in the Shiji (Historical Records) by Sima Qian (145-89 B.C.E.). According to this text, Laozi was a native of Chu, a southern state of the Zhou dynasty. His surname was Li, his personal name was Er, and his style name was Dan. Sima Qian reports that Laozi was a historiographer in charge of the archives of Zhou. Moreover, Sima Qian tells us that Confucius had traveled to see Laozi to learn about the performance of rituals from him. According to The Book of Rites (Liji), a master known as Lao Dan was an expert on mourning rituals. On four occasions, Confucius (Kongzi, Master Kong) is reported to have responded to questions by appealing to answers given by Lao Dan. The records even say that Confucius once assisted him in a burial service. Just what date we can put on this record from The Book of Rites is uncertain, but it may have informed Sima Qian’s biography.

According to the biography, during the course of their conversations Laozi told Confucius to give up his prideful ways and seeking of power.  When Confucius returned to his disciples, he told them that he was overwhelmed by the commanding presence of Laozi, which was like that of a mighty dragon.  The biography goes on to say that Laozi cultivated the dao and its de. However, as the state of Zhou continued to decline, Laozi decided to leave China through the Western pass (toward India) and that upon his departure he gave to the keeper of the pass, one Yin Xi, a book divided into two parts, one on dao and one on de, and of 5,000 characters in length.  After that, no one knew what became of him.  This is perhaps the most familiar of the traditions narrated by Sima Qian and it contains the core of most every subsequent biography or hagiography of Laozi of significance.  However, the biography did not end here.  Sima Qian went on to record what other sources said about Laozi.

In the first biography, Sima Qian says that some report that Laolaizi came from Chu, was a contemporary of Confucius, and authored a work in fifteen sections which speaks of the practical uses of the Daoist teachings. But Sima Qian leaves it undecided whether he thinks Laolaizi should be identified with Laozi, even though he does include this reference in the section on Laozi.

Sima Qian adds another layer to the biography, without commenting on the degree of confidence he has in its truthfulness: some say that Laozi lived 160 years, or even 200 years, as a result of cultivating the dao and nurturing his longevity.

An additional tradition included in the first biography is that Dan, the historiographer of Zhou, predicted in 479 B.C.E. that Zhou and Qin would break apart and that a new king would arise from Qin. The point of this tradition is that Dan (Lao Dan?) had the power to predict the political future of the people, including the fragmentation of the Zhou dynasty and the rise of the Qin in about 221 B.C.E. under Qinshihuang, the first emperor of China. But Sima Qian likewise refuses to identify Laozi with this Dan.

Finally, the first biography concludes with a reference to Laozi’s son and his descendants. Another movement in the evolution of the Laozi story was completed by about 240 B.C.E. This was necessitated by Lao Dan’s association with the grand historiographer Dan during the Zhou, who predicted the rise of the Qin state. This information, along with that of Laozi’s journey to the West, and of the writing of the book for Yin Xi won a favorable position for Laozi during the Qin dynasty. The association of Laozi with a text (the DDJ) that was becoming increasingly significant was important. However, with the demise of the Qin state, some realignment of Laozi’s connection with them was needed. So, Qian’s final remarks about Laozi’s son helped to associate the philosopher’s lineage with the new Han ruling family. The journey to the West component now also had a new force. It explained why Laozi was not presently advising the Han rulers.

Overall, it seems that the earliest biography conforms closely to passages contained in Zhuangzi Chapters 11-14 and 26 in associating Laozi with the archivist or historiographer of Zhou, Laozi’s rebuke of Confucius’s prideful seeking of fame and pursuit of power, and the report that Confucius told his disciples that Laozi was like a great dragon. It is possible, then, that the Zhuangzi is the ultimate source of Sima Qian’s information.

Sima Qian also says, “Laozi cultivated the dao and its virtue (de).” We recognize of course that “dao and its virtue” is Dao and de and that this phrase is meant to solidify Laozi’s association with the Daode jing. What the Zhuangzi, Hanfeizi and Huainanzi only alluded to by putting near quotes from the DDJ in the mouth of Laozi, Sima Qian now makes into an explicit connection. He even tells us that when the Zhou kingdom began to decline, Laozi decided to leave China and head into the West. When he reached the mountain pass, the keeper of the pass (Yin Xi) insisted that he write down his teachings, so that the people would have them after he left. So, “Laozi wrote a book in two parts, discussing the ideas of the dao and of de in some 5,000 words, and departed. No one knows where he ended his life.” These remarks make an unmistakable connection between what Laozi is said to have delivered to Yin Xi and the two sectional divisions of the DDJ and a very close approximation to its exact number of characters.

Sima Qian classified the Six Schools as Yin-Yang, Confucian, Mohist, Legalist, School of Names, and Daoists. Since his biography located Laozi in a time period predating the Zhuangzi, and since the passages in the Zhuangzi seemed to be about a person who lived in the time of Confucius (and not to be simply a literary or traditional invention), the inference was easy to make that Laozi was the founder of the Daoist school.

4. The Ongoing Laozi Myth

In The Lives of the Immortals (Liexian zhuan) by Liu Xiang (79-8 B.C.E.) there are separate entries for Laozi and Yin Xi. According to the extension of the story of Laozi’s leaving China through the Western pass found in Liu Xiang’s work, Yin became a disciple of Laozi and begged him to allow him to go to the West as well. Laozi told him that he could come along, but only after he cultivated the dao. Laozi instructed Yin to study hard and await a summons which would be delivered to him in the marketplace in the city of Chengdu. There is now a shrine at the putative location of this site dedicated to the “ideal disciple.” Additionally, in Liu Xiang’s text it is clear that Laozi is valorized as the preeminent immortal and as a superior daoshi (fangshi) who had achieved not only immortality through wisdom and the practice of techniques for longevity, but also mastery of the arts associated with the abilities and skills of one who was united with dao (compare the “Spirit Man” living in the Gushe mountains in Zhuangzi ch. 1 and Wang Ni’s remarks on the perfected person or zhenren in ch. 2).

Another important stage in the development of Laozi’s place in Chinese philosophical history occurred when Emperor Huan (r. 147-167 C.E.) built a palace on the traditional site of Laozi’s birthplace and authorized veneration and sacrifice to Laozi. The “Inscription to Laozi” (Laozi ming), written by Pian Shao in c. 166 C.E. as a commemorative marker for the site, goes well beyond Sima Qian’s biography. It represents the first apotheosis of Laozi as a deity. The text makes reference to the many cosmic metamorphoses of Laozi, portraying him as having been counselor to the great sage kings of China’s misty pre-history. Accordingly, during this period of the 2nd and 3rd centuries, the elite at the imperial court divinized Laozi and regarded him as an embodiment or incarnation of the dao, a kind of cosmic emperor who knew how to bring things into perfect harmony and peace by acting in wu-wei.

The Daoist cosmological belief in the powers of beings who experienced unity with the dao to effect transformation of their bodies and powers (for example, Huzi in Zhuangzi, ch.7) was the philosophical underpinning of the work, Classic on the Transformations of Laozi (Laozi bianhua jing, late 100s C.E., available now in a Dunhuang manuscript dating 612 C.E.). This work reflects some of the ideas in Pian Shao’s inscription, but takes them even further. It tells how Laozi transformed into his own mother and gave birth to himself, taking quite literally comments in the DDJ where the dao is portrayed as the mother of all things (DDJ, ch. 1). The work associates Laozi with various manifestations or incarnations of the dao itself.  In this text there is a complete apotheosis of Laozi into a numinal divinity. “Laozi rests in the great beginning, wanders in the great origin, floats through dark, numinous emptiness…He joins serene darkness before its opening, is present in original chaos before the beginnings of time….Alone and without relation, he has existed since before heaven and earth. Living deeply hidden, he always returns to be. Gone, the primordial; Present, a man” (Quoted in Kohn, “Myth,” 47). The final passage in this work is an address given by Laozi predicting his reappearance and promising liberation from trouble and the overthrow of the Han dynasty, an allusion that helps us fix the probable date of origin for the work.   The millennial cults of the second century believed Laozi was a messianic figure who appeared to their leaders and gave them instructions and revelations (for example, the hagiography of Zhang Daoling, founder of the Celestial Master Zhengyi movement contained in the 5th century work, Taiping Guangji 8).

The period of the Celestial Masters (c. 142-260 C.E.) produced documents enhancing the myth of Laozi, who came then to be called Laojun (Lord Lao) or Taishang Laojun (Most High Lord Lao). Laojun could manifest himself in any time of unrest and bring Great Peace (taiping). Yet, the Celestial Masters never claimed that Laojun had done so in their day. Instead of such a direct manifestation, the Celestial Masters practitioners taught that Laojun transmitted to them talismans, registers, and new scriptures in the form of texts to guide the creation of communities of heavenly peace. One work, very likely from the late 3rd or early 4th century C.E., entitled The Hundred and Eighty Precepts Spoken by Lord Lao (Laojun shuo yibai bashi jie), became the earliest set of behavioral guides for Celestial Masters communities. According to the text, Laozi delivered these precepts after returning from India and finding the people in a state of corruption.

During the reign of Emperor Huidi of the Western Jin dynasty (290-306 C.E.), Wang Fu, a master within the Daoist sectarian group known as the Celestial Masters, often debated with the Buddhist monk Bo Yuan about philosophical beliefs. As a result of these exchanges, scholarly consensus holds that Wang Fu compiled a one-scroll work entitled Classic of the Conversion of the Barbarians (Huahu jing, c. 300 C.E.). The work is also known by the title The Supreme Numinous Treasure’s Sublime Classic on Laozi’s Conversion of the Barbarians (Taishang lingbao Laozi huahu miaojing). Perhaps the most inflammatory claim of this work was its teaching that when Laozi left China through the Western pass he went to India, where he transformed into the historical Buddha and converted the barbarians. The basic implication of the book was that Buddhism was actually only a form of Daoism. This work inflamed Buddhists for decades. In fact, both of the Tang Emperors Gaozong (649-683 C.E.) and Zhongzong (705-710 C.E.) gave imperial orders to prohibit its distribution. However, as bitter contention continued between Buddhism and Daoism, the Daoists actually expanded the Classic of the Conversion of the Barbarians, so that by 700 C.E. it was ten scrolls in length. Four of these were recovered in the Dunhuang cache of manuscripts. The much-extended work came to include the account that Laozi entered the mouth of a queen in India and the next year was born from her right arm-pit to become the Buddha. He walked immediately after his birth, and “from then on Buddhist teaching came to flourish.” To those familiar with the hagiographies of the Buddha, virtually all of this birth account is recognizable as associated with Buddha, not Laozi.

In the course of the production of polemical writings on the Buddhist side of the debate, attempts were made to turn the tables on the Daoists.  Laozi was portrayed as a bodhisattva or disciple of the Buddha sent to convert the Chinese.  This theory had other desirable extensions from a Buddhist viewpoint, because it was also applied to Confucius, enabling Buddhist rhetoricians to hold that Confucius was an avatar of Buddhism and that Confucianism was actually a form of distorted Buddhism.

Most later writings about Laozi continued to base their appeals to Laozi’s authority on his ongoing transformations, but they likewise provide evidence of the growing tension between Daoism and Buddhism. The first mythological account of Laozi’s birth is in the Classic of the Inner Explanation of the Three Heavens (Santian neijie jing), a Celestial Master work dated about 420 C.E. In this text, Laozi has three births: as the manifestation of the dao from pure energy to become a deity in heaven; in human form as the ancient philosopher author of the DDJ; and as the Buddha after his journey to the West. In the first birth, his mother is known as “The Jade Maiden of Mystery and Wonder.” In his second, he is born to a human woman known as Mother Li. This was an eighty-one year pregnancy, after which he was born from her left armpit (there is a tradition that Buddha had been born from his mother’s right arm pit). At birth he had white hair and so he was called laozi (here meaning something more like lao haizi or Old Child). This birth is set in the time of the Shang dynasty, several centuries before the date Sima Qian reports. But the purpose of such a move in the Laozi legend is to allow him time to travel to the West and then become the Buddha. The third birth takes place in India as the Buddha.

In the Yuan dynasty (1285 C.E.), Emperor Shizu ordered the burning of the Daoist canon of texts, and according to lore, the first writing destroyed was the greatly extended version of Classic of the Conversion of the Barbarians in ten or more scrolls. Once again, though, the text and its story of Laozi seemed quite resilient. It reappeared in the form of an illustrated work entitled Eighty-one Transformations of Lord Lao (Laojun bashiyi hua tushuo). The Buddhist thinker Xiangmai wrote a detailed, but polemical, history of this text and few scholars trust its reliability. Whether the Eighty-one Transformations of Lord Lao still survives is arguable, although a work entitled Eighty-one Transformations of the Most High Lord Lao of Mysterious Origin of the Golden Portal (Jinque xuanyuan Taishang Laojun bashiyi hua tushuo) with illustrations and dating to 1598 is held in the Museum für Völkerkunde in Berlin. The version in Berlin provides an illustration for each of Laozi’s transformations, each accompanied by a short text. The first few depict his existence in cosmic time. It is not until the 11th transformation that he enters historical time during the era of Fu Xi by the name Yuhuazi. In his 34th transformation, Laozi sends Yin Xi to explain the sutras to the Indian barbarians. The 58th transformation is Laozi’s appearance in the clouds to Zhang Daoling, the founder of the Celestial Master Zhengyi sect of Daoism that still exists today.

Ge Hong’s (283-343 C.E.) The Inner Chapters of the Master Who Embraces Simplicity (Baopuzi neipian) is arguably the most important Daoist philosophical work of the Jin dynasty. In this text, Ge Hong reports that in a state of visualization he saw Laozi, seven feet tall, with cloudlike garments of five colors, wearing a multi-tiered cap and carrying a sharp sword. According to Ge Hong, Laozi had a prominent nose, long eyebrows, and an elongated head. This physiological type was the template for portraying immortals in Daoist art. Liu Xiang’s Collected Biographies of the Immortals (Liexian zhuan, c. 18 B.C.E.) reports that Laozi was born during the Shang dynasty, served as an archivist under the Zhou, was a teacher of Confucius, and later made his way to the West, just as in Sima Qian’s standard biography. Ge Hong also collected and edited the Biographies of Immortals (Shenxian zhuan). In its entry on Laozi, Ge Hong praises Laozi’s practice of stillness and wu-wei, but he also represents Laozi as a master of the techniques of immortality and the efficacy of external alchemy, herbs and control of qi. He attributes to Laozi what is called the alchemy of the nine cinnabars and eight minerals, as well as a vast knowledge of herbology and dietetics. Ge Hong also tells a story about one Xu Jia who was a retainer of Laozi. In the story, Laozi keeps Xu Jia alive by means of a powerful talisman placed in Xu’s mouth. Its removal causes Xu’s death. When replaced, Xu Jia lives again. In all this, Laozi is portrayed as a master of life and death by means of talismanic power, a practice used by the Celestial Masters and continued by Daoist masters as late as the Ming dynasty, if not into the present era.

Other reported manifestations of Laozi gave authority to new Daoist lineages or modifications of practice. For example, the Daoist master Kou Qianzhi reported a revelation received from Laozi in 415 C.E., which was a “New Code” for Daoist practitioners and communities. He wrote down the revelation in a text that became known as Classic on Precepts of Lord Lao Recited to the Melody of the Clouds (Laojun yinsong jiejing). This text contains 36 moral precepts, each of which traces its authority to the introductory phrase, “Lord Lao said….” Textual traces are not the only sources for the traditions and views of Laozi in Chinese philosophical history. Yoshiko Kamitsuka has done a study of how views about Laozi changed and were reflected in material culture, especially sculpture and inscription.

Laozi was also often looked to for political validation.  Throughout most of the Tang dynasty (618-907 C.E.), Laozi was regarded as the protector of the state because of the tradition that both the Tang ruling family and Laozi shared the surname Li and because of many reports of auspicious appearances of Laozi at the inauguration of the Tang dynasty in which he pledged his support during the rise and solidification of the ruling bureaucracy.

The hagiography of Laozi has continued to develop, down to the present day. There are even traditions that various natural geographic landmarks and features are the enduring imprint of Lord Lao on China and his face can be seen in them. It is more likely, of course, that Laozi’s immortality is in the mark made by the philosophical movement he has come to represent and the culture it created.

5. References and Further Reading

  • Ames, Roger. (1998). Wandering at Ease in the Zhuangzi. Albany: State University of New York Press.
  • Bokenkamp, Stephen R. (1997). Early Daoist Scriptures. Berkeley: University of California Press.
  • Boltz, William. (2005). “The Composite Nature of Early Chinese Texts.” In Text and Ritual in Early China, ed. Martin Kern. 50-78. Seattle: University of Washington Press.
  • Csikszentmihalyi, Mark and Ivanhoe, Philip J., eds. (1999). Religious and Philosophical Aspects of the Laozi. Albany: State University of New York Press.
  • Giles, Lionel. (1948). A Gallery of Chinese Immortals. London: John Murray.
  • Graham, Angus. (1981). Chuang tzu: The Inner Chapters. London: Allen & Unwin.
  • Graham, Angus. (1989). Disputers of the Tao: Philosophical Argument in Ancient China. La Salle, IL: Open Court.
  • Graham, Angus. (1998 [1986]). “The Origins of the Legend of Lao Tan.” In Lao-tzu and the Tao-te-ching, ed. Livia Kohn and Michael LaFargue, 23-41. Albany: State University of New York Press.
  • Hansen, Chad. (1992). A Daoist Theory of Chinese Thought. New York: Oxford University Press.
  • Henricks, Robert. (1989). Lao-Tzu: Te-Tao Ching. New York: Ballantine.
  • Ivanhoe, Philip J. (2002). The Daodejing of Laozi. New York: Seven Bridges Press.
  • Kamitsuka, Yoshiko. (1998). “Lao-Tzu in Six Dynasties Taoist Sculpture.” In Lao-tzu and the Tao-te-ching, ed. Livia Kohn and Michael LaFargue, 63-89. Albany: State University of New York Press.
  • Kim, Tae Hyun. (2010). “Other Laozi Parallels in the Hanfeizi: An Alternative Approach to the Textual History of the Laozi and Early Chinese Thought.” Sino-Platonic Papers 199 (March 2010), ed. Victor H. Mair. Philadelphia: University of Pennsylvania Press.
  • Kohn, Livia (2008). “Laojun yinsong jiejing [Classic on Precepts of Lord Lao, Recited to the Melody in the Clouds].” In Encyclopedia of Taoism, ed. Fabrizio Pregadio. London: Routledge.
  • Kohn, Livia. (1998). “The Lao-Tzu Myth.” In Lao-tzu and the Tao-te-ching, ed. Livia Kohn and Michael LaFargue, 41-63. Albany: State University of New York Press.
  • Kohn, Livia. (1996). “Laozi: Ancient Philosopher, Master of Longevity, and Taoist God.” In Religions of China in Practice, ed. Donald S. Lopez, 52-63. Princeton: Princeton University Press.
  • Kohn, Livia and LaFargue, Michael. (1998). Lao-tzu and the Tao-te-ching. Albany: State University of New York Press.
  • Kohn, Livia and Roth, Harold. (2002). Daoist Identity: History, Lineage, and Ritual. Honolulu: University of Hawaii Press.
  • Nylan, Michael and Csikszentmihalyi, Mark. (2003). “Constructing Lineages and Inventing Traditions through Exemplary Figures in Early China.” T’oung Pao 89: 1-41.
  • Penny, Benjamin (2008). “Laojun bashiyi huatu [Eighty-one Transformations of Lord Lao].” In Encyclopedia of Taoism, ed. Fabrizio Pregadio. London: Routledge.
  • Penny, Benjamin (2008). “Laojun shuo yibai bashi jie [The 180 Precepts Spoken by Lord Lao].” In Encyclopedia of Taoism, ed. Fabrizio Pregadio. London: Routledge.
  • Smith, Kidder (2003). “Sima Tan and the Invention of Daoism, ‘Legalism,’ et cetera.” The Journal of Asian Studies 62.1: 129-156.
  • Watson, Burton. (1968). The Complete Works of Chuang Tzu. New York: Columbia University Press.
  • Welch, Holmes. (1966). Taoism: The Parting of the Way. Boston: Beacon Press.
  • Welch, Holmes and Seidel, Anna, eds. (1979). Facets of Taoism. New Haven: Yale University Press.


Author Information

Ronnie Littlejohn
Email: ronnie.littlejohn@belmont.edu
Belmont University
U. S. A.

Daoist Philosophy

Along with Confucianism, “Daoism” (sometimes called “Taoism”) is one of the two great indigenous philosophical traditions of China. As an English term, Daoism corresponds to both Daojia (“Dao family” or “school of the Dao”), an early Han dynasty (c. 100s B.C.E.) term which describes so-called “philosophical” texts and thinkers such as Laozi and Zhuangzi, and Daojiao (“teaching of the Dao”), which describes various so-called “religious” movements dating from the late Han dynasty (c. 100s C.E.) onward. Thus, “Daoism” encompasses thought and practice that sometimes are viewed as “philosophical,” as “religious,” or as a combination of both. While modern scholars, especially those in the West, have been preoccupied with classifying Daoist material as either “philosophical” or “religious,” historically Daoists themselves have been uninterested in such categories and dichotomies. Instead, they have preferred to focus on understanding the nature of reality, increasing their longevity, ordering life morally, practicing rulership, and regulating consciousness and diet. Fundamental Daoist ideas and concerns include wuwei (“effortless action”), ziran (“naturalness”), how to become a shengren (“sage”) or zhenren (“perfected person”), and the ineffable, mysterious Dao (“Way”) itself.

Table of Contents

  1. What is Daoism?
  2. Classical Sources for Our Understanding of Daoism
  3. Is Daoism a Philosophy or a Religion?
  4. The Daodejing
  5. Fundamental Concepts in the Daodejing
  6. The Zhuangzi
  7. Basic Concepts in the Zhuangzi
  8. Daoism and Confucianism
  9. Daoism in the Han
  10. Celestial Masters Daoism
  11. Neo-Daoism
  12. Shangqing and Lingbao Daoist Movements
  13. Tang Daoism
  14. The Three Teachings
  15. The “Destruction” of Daoism
  16. References and Further Reading

1. What is Daoism?

Strictly speaking there was no Daoism before the literati of the Han dynasty (c. 200 B.C.E.) tried to organize the writings and ideas that represented the major intellectual alternatives available. The name daojia, “Dao family” or “school of the dao,” was a creation of the historian Sima Tan (d. 110 B.C.E.) in his Shi ji (Records of the Historian) written in the 2nd century B.C.E. and later completed by his son, Sima Qian (145-86 B.C.E.). In Sima Qian’s classification, the Daoists are listed as one of the Six Schools: Yin-Yang, Confucian, Mohist, Legalist, School of Names, and Daoists. So, Daoism was a retroactive grouping of ideas and writings which were already at least one to two centuries old, and which may or may not have been ancestral to various post-classical religious movements, all self-identified as daojiao (“teaching of the dao”), beginning with the reception of revelations from the deified Laozi by the Celestial Masters (Tianshi) lineage founder, Zhang Daoling, in 142 C.E. This article privileges the formative influence of early texts, such as the Daodejing and the Zhuangzi, but accepts contemporary Daoists’ assertion of continuity between classical and post-classical, “philosophical” and “religious” movements and texts.

2. Classical Sources for Our Understanding of Daoism

Daoism does not name a tradition constituted by a founding thinker, even though the common belief is that a teacher named Laozi originated the school and wrote its major work, called the Daodejing, also sometimes known as the Laozi. The tradition is also called “Lao-Zhuang” philosophy, referring to what are commonly regarded as its two classical and most influential texts: the Daodejing or Laozi (3rd Cn. B.C.E.) and the Zhuangzi (4th-3rd Cn. B.C.E.). However, various streams of thought and practice were passed along by masters (daoshi) before these texts were finalized. There are two major source issues to be considered when forming a position on the origins of Daoism. 1) What evidence is there for beliefs and practices later associated with the kind of Daoism  recognized by Sima Qian prior to the formation of the two classical texts? 2) What is the best reconstruction of the classical textual tradition upon which later Daoism was based?

With regard to the first question, Isabelle Robinet thinks that the classical texts are only the most lasting evidence of a movement she associates with a set of writings and practices connected to the Songs of Chu (Chuci), and that she identifies as the Chuci movement. This movement reflects a culture in which male and female masters variously called fangshi, daoshi, zhenren, or daoren practiced techniques of longevity and used diet and meditative stillness to create a way of life that attracted disciples and resulted in wisdom teachings. While Robinet’s interpretation is controversial, there are undeniable connections between the Songs of Chu and later Daoist ideas. Some examples include a coincidence of names of immortals (sages), a commitment to the pursuit of physical immortality, a belief in the epistemic value of stillness and quietude, abstinence from grains, breathing and sexual practices used to regulate internal energy (qi), and the use of ritual dances that resemble those still done by Daoist masters (the step of Yu).

In addition to the controversial connection to the Songs of Chu, the Guanzi (350-250 B.C.E.) is a text older than both the Daodejing and probably all of the Zhuangzi, except the “inner chapters” (see below). The Guanzi  is a very important work of 76 “chapters.” Three of the chapters of the Guanzi are called the Neiye, a title which can mean “inner cultivation.” The self-cultivation practices and teachings put forward in this material may be fruitfully linked to several other important works: the Daodejing; the Zhuangzi; a Han dynasty Daoist work called the Huainanzi; and an early commentary on the Daodejing called the Xiang’er. Indeed, there is a strong meditative trend in the Daoism of late imperial China known as the “inner alchemy” tradition and the views of the Neiye seem to be in the background of this movement. Two other chapters of the Guanzi are called Xin shu (Heart-mind book). The Xin shu connects the ideas of quietude and stillness found in both the Daodejing and Zhuangzi to longevity practices. The idea of dao in these chapters is very much like that of the classical works. Its image of the sage resembles that of the Zhuangzi. It uses the same term (zheng) that Zhuangzi uses for the corrections a sage must make in his body, the pacification of the heart-mind, and the concentration and control of internal energy (qi). These practices are called “holding onto the One,” “keeping the One,” “obtaining the One,” all of which are phrases also associated with the Daodejing (chs. 10, 22, 39).

The Songs of Chu and Guanzi still represent texts which are themselves creations of actual practitioners of Daoist teachings and sentiments, just as do the Daodejing and Zhuangzi. Who these persons were we do not know with certainty. It is possible that we do have the names, remarks, and practices of some of these individuals (daoshi) embodied in the passages of the Zhuangzi. For example, in Chs. 1-7 alone we find: Xu You (Ch. 1); Lianshu (Ch. 1); Ziqi (Ch. 2); Wang Ni (Ch. 2); Changwuzi (Ch. 2); Qu Boyu (Ch. 4); Carpenter Shi (Ch. 4); Bohun Wuren (Ch. 5); Nü Yu (Ch. 6); Sizi, Yuzi, Lizi, and Laizi (Ch. 6); Zi Sanghu, Meng Zifan, and Zi Qinzan (Ch. 6); Yuzi and Sangzi (Ch. 6); Wang Ni and Putizi (Ch. 7); Jie Yu (Ch. 7); Lao Dan (Ch. 7); and Huzi (Ch. 7).

As for a reasonable reconstruction of the textual tradition upon which Daoism is based, we should not try to think of this task so simply as determining the relationship between the Daodejing and the Zhuangzi, such as which text was first and which came later. These texts are composite. The Zhuangzi, for example, repeats in very similar form sayings and ideas found in the Daodejing, especially in the essay composing Zhuangzi Chs. 8-10. However, we are not certain whether this means that whoever was the source of this material in the Zhuangzi knew the Daodejing and quoted it, or if they both drew from a common source, or even if the Daodejing in some way depended on the Zhuangzi. In fact, one theory about the legendary figure Laozi is that he was created first in the Zhuangzi and later became associated with the Daodejing. There are seventeen passages in which Laozi (a.k.a. Lao Dan) plays a role in the Zhuangzi, whereas he is not mentioned by name in the Daodejing.

Based on what we know now, we could offer the following summary of the sources of early Daoism. Stage One: Zhuang Zhou’s “inner chapters” (chs. 1-7) of the Zhuangzi (c. 350 B.C.E.) and some components of the Guanzi, including perhaps both the Neiye and the Xin shu. Stage Two: The essay in Chs. 8-10 of the Zhuangzi and some collections of material which represent versions of our final redaction of the Daodejing, as well as Chs. 17-28 of the Zhuangzi representing materials likely gathered by Zhuang Zhou’s disciples. Stage Three: the “Yellow Emperor” (Huang-Lao) manuscripts from Mawangdui, the “Yellow Emperor” chapters of the Zhuangzi (Chs. 11-19 and 22), and the text known as the Huainanzi (c. 139 B.C.E.).

3. Is Daoism a Philosophy or a Religion?

In the late 1970s, Western and comparative philosophers began to point out that an important dimension of the historical context of Daoism was being overlooked: the previous generation of scholars had ignored or even disparaged connections between the classical texts and the Daoist religious belief and practice then thought not to have developed until the 2nd century C.E. We have to lay some of the responsibility for this prejudice against Daoism as a religion, and for the privileging of its earliest forms as a pure philosophy, at the feet of the eminent translators and philosophers Wing-Tsit Chan and James Legge, who both spoke of Daoist religion as a degeneration, dating from the time of the Celestial Masters (see below) in the late Han period, of a pristine Daoist philosophy. Chan and Legge were instrumental architects in the West of the view that Daoist philosophy (daojia) and Daoist religion (daojiao) are entirely different traditions.

Actually, our interest in trying to separate philosophy and religion in Daoism is more revealing of the Western frame of reference we use than of Daoism itself. Daoist ideas fermented among master teachers who had a holistic view of life. These daoshi (Daoist masters) did not compartmentalize practices by which they sought to influence the forces of reality, increase their longevity, have interaction with realities not apparent to our normal way of seeing things, and order life morally and by rulership. They offered insights we might call philosophical aphorisms. But they also practiced meditative stillness and emptiness to gain knowledge, engaged in physical exercises to increase the flow of inner energy (qi), studied nature for diet and remedy to foster longevity, practiced rituals related to their view that reality had many layers and forms with whom/which humans could interact, wrote talismans and practiced divination, engaged in spellbinding of “ghosts,” led small communities, and advised rulers on all these subjects. The masters transmitted their teachings, some of them only to disciples and adepts, but gradually these teachings became more widely available as is evidenced in the very creation of the Daodejing and Zhuangzi themselves.

The anti-supernaturalist and anti-dualist agendas that provoked Westerners to separate philosophy and religion, dating at least to the classical Greek period of philosophy, were not part of the preoccupation of Daoists. Accordingly, the question whether Daoism is a philosophy or a religion is not one we can ask without imposing a set of understandings, presuppositions, and qualifications that do not apply to Daoism. But the hybrid nature of Daoism is not a reason to discount the importance of Daoist thought. Quite to the contrary, it may be one of the most significant ideas classical Daoism can contribute to the study of philosophy in the present age.

4. The Daodejing

The Daodejing (hereafter, DDJ) is divided into 81 “chapters” consisting of slightly over 5,000 Chinese characters, depending on which text is used. In its received form from Wang Bi (see below), the two major divisions of the text are the dao jing (chs. 1-37) and the de jing (chs. 38-81). Actually, this division probably rests on little else than the fact that the principal concept opening Chapter 1 is dao (way) and that of Chapter 38 is de (virtue). The text is a collection of short aphorisms that were not arranged to develop any systematic argument. The long-standing tradition about the authorship of the text is that the “founder” of Daoism, known as Laozi, gave it to Yin Xi, the guardian of the pass through the mountains that he used to go from China to the West (i.e., India), at some unknown date in the distant past. But the text is actually a composite of collected materials, most of which probably originally circulated orally, perhaps even in single aphorisms or small collections. These were then redacted as someone might string pearls into a necklace. Although D.C. Lau and Michael LaFargue have made preliminary literary and redaction-critical studies of the text, these are still insufficient to generate any consensus about whether the text was composed using smaller written collections or about who its probable editors were.

For almost 2,000 years, the Chinese text used by commentators in China and upon which all except the most recent Western language translations were based has been called the Wang Bi, after the commentator who used a complete edition of the DDJ sometime between 226 and 249 C.E. Although Wang Bi was not a Daoist, his commentary became a standard interpretive guide, and generally speaking even today scholars depart from it only when they can make a compelling argument for doing so. Based on recent archaeological finds at Guodian in 1993 and Mawangdui in the 1970s, we are certain that there were several simultaneously circulating versions of the Daodejing text as early as c. 300 B.C.E.

Mawangdui is the name for a site of tombs discovered near Changsha in Hunan province. The Mawangdui discoveries consist of two incomplete editions of the DDJ on silk scrolls (boshu) now simply called “A” and “B.” These versions have two principal differences from the Wang Bi. Some word choice divergencies are present. The order of the chapters is reversed, with 38-81 in the Wang Bi coming before chapters 1-37 in the Mawangdui versions. More precisely, the order of the Mawangdui texts takes the traditional 81 chapters and sets them out like this: 38, 39, 40, 42-66, 80, 81, 67-79, 1-21, 24, 22, 23, 25-37. Robert Henricks has published a translation of these texts with extensive notes and comparisons with the Wang Bi under the title Lao-Tzu, Te-tao Ching (1989). Contemporary scholarship associates the Mawangdui versions with a type of Daoism known as the Way of the Yellow Emperor and the Old Master (Huanglao Dao).

The Guodian find consists of 730 inscribed bamboo slips found near the village of Guodian in Hubei province in 1993. There are 71 slips with material that is also found in 31 of the 81 chapters of the DDJ and corresponding to Chapters 1-66. It may date as early as c. 300 B.C.E. If this is a correct date, then the Daodejing was already extant in a written form when the “inner chapters” (see below) of the Zhuangzi were composed. These slips contain more significant variants from the Wang Bi than do the Mawangdui versions. A complete translation and study of the Guodian cache has been published by Scott Cook (2013).

5. Fundamental Concepts in the Daodejing

The term Dao means a road, and is often translated as “the Way.” This is because sometimes dao is used as a nominative (that is, “the dao”) and other times as a verb (i.e. daoing). Dao is the process of reality itself, the way things come together, while still transforming. All this reflects the deep seated Chinese belief that change is the most basic character of things. In the Yi jing (Classic of Change) the patterns of this change are symbolized by figures standing for 64 relations of correlative forces and known as the hexagrams. Dao is the alteration of these forces, most often simply stated as yin and yang. The Xici is a commentary on the Yi jing formed in about the same period as the DDJ. It takes the taiji (Great Ultimate) as the source of correlative change and associates it with the dao. The contrast is not between what things are or that something is or is not, but between chaos (hundun) and the way reality is ordering (de). Yet, reality is not ordering into one unified whole. It is the 10,000 things (wanwu). There is the dao but not “the World” or “the cosmos” in a Western sense.

The Daodejing teaches that humans cannot fathom the Dao, because any name we give to it cannot capture it. It is beyond what we can express in language (ch. 1). Those who experience oneness with dao, known as “obtaining dao,” will be enabled to wu-wei. Wu-wei is a difficult notion to translate. Yet, it is generally agreed that the traditional rendering of it as “nonaction” or “no action” is incorrect. Those who wu wei do act. Daoism is not a philosophy of “doing nothing.” Wu-wei means something like “act naturally,” “effortless action,” or “nonwillful action.” The point is that there is no need for human tampering with the flow of reality. Wu-wei should be our way of life, because the dao always benefits, it does not harm (ch. 81). The way of heaven (dao of tian) is always on the side of good (ch. 79) and virtue (de) comes forth from the dao alone (ch. 21). What causes this natural embedding of good and benefit in the dao is vague and elusive (ch. 35), not even the sages understand it (ch. 76). But the world is a reality that is filled with spiritual force, just as a sacred image used in religious ritual might be inhabited by numinal power (ch. 29). The dao occupies the place in reality that is analogous to the part of a family’s house set aside for the altar for venerating the ancestors and gods (the ao of the house, ch. 62). When we think that life’s occurrences seem unfair (a human discrimination), we should remember that heaven’s (tian) net misses nothing; it leaves nothing undone (ch. 37).

A central theme of the Daodejing is that correlatives are the expressions of the movement of dao. Correlatives in Chinese philosophy are not opposites, mutually excluding each other. They represent the ebb and flow of the forces of reality: yin/yang, male/female; excess/defect; leading/following; active/passive. As one approaches the fullness of yin, yang begins to emerge on the horizon, and vice versa. Its teachings on correlation often suggest to interpreters that the DDJ is filled with paradoxes. For example, ch. 22 says, “Those who are crooked will be perfected. Those who are bent will be straight. Those who are empty will be full.” While these appear paradoxical, they are probably better understood as correlational in meaning. The DDJ says, “straightforward words seem paradoxical,” implying, however, that they are not (ch. 78).

What is the image of the ideal person, the sage (sheng ren), or the perfected person (zhen ren) in the DDJ? Well, sages wu-wei (chs. 2, 63). They act effortlessly and spontaneously as one with dao and in so doing, they “virtue” (de) without deliberation or volitional challenge. In this respect, they are like newborn infants, who move naturally, without planning and reliance on the structures given to them by culture and society (ch. 15). The DDJ tells us that sages empty themselves, becoming void of the discriminations used in conventional language and culture. Sages concentrate their internal energies (qi). They clean their vision (ch. 10). They manifest naturalness and plainness, becoming like uncarved wood (pu) (ch. 19). They live naturally and free from desires rooted in the discriminations that human society makes (ch. 37). They settle themselves and know how to be content (ch. 46). The DDJ makes use of some very famous analogies to drive home its point. Sages know the value of emptiness as illustrated by how emptiness is used in a bowl, door, window, valley or canyon (ch. 11). They preserve the female (yin), meaning that they know how to be receptive to dao and its power (de) and are not unbalanced favoring assertion and action (yang) (ch. 28). They shoulder yin and embrace yang, blend internal energies (qi) and thereby attain harmony (he) (ch. 42). Those following the dao do not strive, tamper, or seek to control their own lives (ch. 64). They do not endeavor to help life along (ch. 55), or use their heart-mind (xin) to “solve” or “figure out” life’s apparent knots and entanglements (ch. 55). Indeed, the DDJ cautions that those who would try to do something with the world will fail; they will actually ruin both themselves and the world (ch. 29). Sages do not engage in disputes and arguing, or try to prove their point (chs. 22, 81). They are pliable and supple, not rigid and resistive (chs. 76, 78). They are like water (ch. 8), finding their own place, overcoming the hard and strong by suppleness (ch. 36). Sages act with no expectation of reward (chs. 2, 51). They put themselves last and yet come first (ch. 7). They never make a display of themselves (chs. 72, 22). They do not brag or boast (chs. 22, 24), and they do not linger after their work is done (ch. 77). They leave no trace (ch. 27). Because they embody dao in practice, they have longevity (ch. 16). They create peace (ch. 32). Creatures do not harm them (chs. 50, 55). Soldiers do not kill them (ch. 50). Heaven (tian) protects the sage and the sage’s spirit becomes invincible (ch. 67).

Among the most controversial of the teachings in the DDJ are those directly associated with rulers. Recent scholarship is moving toward a consensus that the persons who developed and collected the teachings of the DDJ played some role in advising civil administration, but they may also have been practitioners of ritual arts and what we would call religious rites. Be that as it may, many of the aphorisms directed toward rulers in the DDJ seem puzzling at first sight. According to the DDJ, the proper ruler keeps the people without knowledge, (ch. 65), fills their bellies, opens their hearts and empties them of desires (ch. 3). A sagely ruler reduces the size of the state and keeps the population small. Even though the ruler possesses weapons, they are not used (ch. 80). The ruler does not seek prominence. The ruler is a shadowy presence, never standing out (chs. 17, 66). When the ruler’s work is done, the people say they are content (ch. 17). This picture of rulership in the DDJ is all the more interesting when we remember that the philosopher and legalist political theorist named Han Feizi used the DDJ as a guide for the unification of China. Han Feizi was the foremost counselor of the first emperor of China, Qin Shihuangdi (r. 221-206 B.C.E.). However, it is a pity that the emperor used the DDJ’s admonitions to “fill the bellies and empty the minds” of the people to justify his program of destroying all books not related to medicine, astronomy or agriculture. When the DDJ says that rulers keep the people without knowledge, it probably means that they do not encourage human knowledge as the highest form of knowing but rather they encourage the people to “obtain oneness with the dao.”

6. The Zhuangzi

The second of the two most important classical texts of Daoism is the Zhuangzi. This text is a collection of stories and remembered as well as imaginary conversations.  The text is well known for its creativity and skillful use of language. Within the text we find longer and shorter treatises, stories, poetry, and aphorisms. The Zhuangzi may date as early as the 4th century B.C.E. and according to imperial bibliographies of a later date, the Zhuangzi originally had 52 “chapters.” These were reduced to 33 by Guo Xiang in the 3rd century C.E., although he seems to have had the 52 chapter text available to him.  Ronnie Littlejohn has argued that the later work Liezi may contain some passages from the so-called “Lost Zhuangzi” 52 chapter version. Unlike the Daodejing which is ascribed to the mythological Laozi, the Zhuangzi may actually contain materials from a teacher known as Zhuang Zhou who lived between 370-300 B.C.E. Chapters 1-7 are those most often ascribed to Zhuangzi himself (which is a title meaning “Master Zhuang”) and these are known as the “inner chapters.” The remaining 26 chapters had other origins and they sometimes take different points of view from the Inner Chapters. Although there are several versions of how the remainder of the Zhuangzi may be divided, one that is gaining currency is Chs. 1-7 (Inner Chapters), Chs. 8-10 (the “Daode” essay), Chs. 11-16 and parts of 18, 19, and 22 (Yellow Emperor Chapters), and Chs. 17-28 (Zhuang Zhou’s Disciples’ material), with the remains of the text attributable to the final redactor.

7. Basic Concepts in the Zhuangzi

Zhuangzi taught that a set of practices, including meditative stillness, helped one achieve unity with the dao and become a “perfected person” (zhenren). The way to this state is not the result of a withdrawal from life. However, it does require disengaging or emptying oneself of conventional values and the demarcations made by society. In Chapter 23 of the Zhuangzi, Nanrong Chu, inquiring of the character Laozi about the solution to his life’s worries, was answered promptly: “Why did you come with all this crowd of people?” The man looked around and confirmed he was standing alone, but Laozi meant that his problems were the result of all the baggage of ideas and conventional opinions he lugged about with him. This baggage must be discarded before anyone can be zhenren, move in wu-wei and express profound virtue (de).

Like the DDJ, Zhuangzi also valorizes wu-wei, especially in the Inner Chapters, the Yellow Emperor sections on rulership, and the Zhuangzi disciples’ materials in Ch. 19. For its examples of such living the Zhuangzi turns to analogies of craftsmen, athletes (swimmers), ferrymen, cicada-catching men, woodcarvers, and even butchers. One of the most famous stories in the text is that of Ding the Butcher, who learned what it means to wu wei through the perfection of his craft. When asked about his great skill, Ding says, “What I care about is dao, which goes beyond skill. When I first began cutting up oxen, all I could see was the ox itself. After three years I no longer saw the whole ox. And now—now I go at it by spirit and don’t look with my eyes. Perception and understanding have come to a stop and spirit moves where it wants. I go along with the natural makeup, strike in the big hollows, guide the knife through the big openings, and follow things as they are. So I never touch the smallest ligament or tendon, much less a main joint. A good cook changes his knife once a year—because he cuts. A mediocre cook changes his knife once a month—because he hacks. I’ve had this knife of mine for nineteen years and I’ve cut up thousands of oxen with it, and yet the blade is as good as though it had just come from the grindstone. There are spaces between the joints, and the blade of the knife has really no thickness….[I] move the knife with the greatest subtlety, until—flop! The whole thing comes apart like a clod of earth crumbling to the ground.” (Ch. 3, The Secret of Caring for Life)  The recurring point of all of the stories in Zhuangzi about wu-wei is that such spontaneous and effortless conduct as displayed by these many examples has the same feel as acting in wu-wei.  The point is not that wu-wei results from skill development.  Wu-wei is not a cultivated skill. It is a gift of oneness with dao.  The Zhuangzi’s teachings on wu-wei are closely related to the text’s consistent rejection of the use of reason and argument as means to dao (chs. 2; 12, 17, 19).

Persons who exemplify such understanding are called sages, zhenren, and immortals. Zhuangzi describes the Daoist sage in such a way as to suggest that such a person possesses extraordinary powers. Just as the DDJ said that creatures do not harm the sages, the Zhuangzi also has a passage teaching that the zhenren exhibits wondrous powers, frees people from illness and is able to make the harvest plentiful (ch. 1). Zhenren are “spirit like” (shen yi), cannot be burned by fire, do not feel cold in the freezing forests, and life and death have no effect on them (ch. 2). Just how we should take such remarks is not without controversy. To be sure, many Daoists in history took them literally, and an entire tradition of the transcendents or immortals (xian) was collected in text and lore.

Zhuangzi is drawing on a set of beliefs about master teachers that were probably regarded as literal by many, although some think he meant these to be taken metaphorically. For example, when Zhuangzi says that the sage cannot be harmed or made to suffer by anything that life presents, does he mean this to be taken as saying that the zhenren is physically invincible? Or, does he mean that the sage has so freed himself from all conventional understandings that he refuses to recognize poverty as any more or less desirable than affluence, to recognize blindness as worse than sight, to recognize death as any less desirable than life? As the Zhuangzi says in Chapter One, Free and Easy Wandering, “There is nothing that can harm this man.” This is also the theme of Chapter Two, On Making All Things Equal. In this chapter people are urged to “make all things one,” meaning that they should recognize that reality is one. It is a human judgment that what happens is beautiful or ugly, right or wrong, fortunate or not. The sage knows all things are one (equal) and does not judge. Our lives are snarled and jumbled so long as we make conventional discriminations, but when we set them aside, we appear to others as extraordinary and enchanted.

An important theme in the Zhuangzi is the use of immortals to illustrate various points. Did Zhuangzi believe some persons physically lived forever? Well, many Daoists did believe this. Did Zhuangzi believe that our substance was eternal and only our form changed? Almost certainly Zhuangzi thought that we were in a constant state of process, changing from one form into another (see the exchange between Master Lai and Master Li in Ch. 6, The Great and Venerable Teacher). In Daoism, immortality is the result of what may be described as a wu xing transformation. Wu xing means “five phases” and it refers to the Chinese understanding of reality according to which all things are in some state of combined correlation of qi as wood, fire, water, metal, and earth. This was not exclusively a “Daoist” physics. It underlay all Chinese “science” of the classical period, although Daoists certainly made use of it. Zhuangzi wants to teach us how to engage in transformation through stillness, breathing, and experience of numinal power (see ch. 6). And yet, perhaps Zhuangzi’s teachings on immortality mean that the person who is free of discrimination makes no difference between life and death. In the words of Lady Li in Ch. 2, “How do I know that the dead do not wonder why they ever longed for life?”

Huangdi (the Yellow Emperor) is the most prominent immortal mentioned in the text of the Zhuangzi and he is a main character in the sections of the book called “the Yellow Emperor Chapters” noted above. He has long been venerated in Chinese history as a cultural exemplar and the inventor of civilized human life. Daoism is filled with other accounts designed to show that those who learn to live according to the dao have long lives. Pengzu, one of the characters in the Zhuangzi, is said to have lived eight hundred years. The most prominent female immortal is Xiwangmu (Queen Mother of the West), who was believed to reign over the sacred and mysterious Mount Kunlun.

The passages containing stories of the Yellow Emperor in Zhuangzi provide a window into the views of rulership in the text. On the one hand, the Inner Chapters (chs. 1-7) reject the role of ruler as a viable vocation for a zhenren and consistently criticize the futility of government and politics (ch. 7). On the other hand, the Yellow Emperor materials in Chs. 11-13 present rulership as valuable, so long as the ruler acts by wu-wei. This second position is also that taken in the work entitled the Huainanzi (see below).

The Daoists did not think of immortality as a gift from a god, or an achievement in the religious sense commonly thought of in the West. It was a result of finding harmony with the dao, expressed through wisdom, meditation, and wu-wei. Persons who had such knowledge were reputed to live in the mountains, thus the character for xian (immortal) is made up of two components, the one being shan “mountain” and the other being ren “person.” Undoubtedly, some removal to the mountains was a part of the journey to becoming a zhenren “true person.” Because Daoists believed that nature and our own bodies were correlations of each other, they even imagined their bodies as mountains inhabited by immortals. The struggle to wu-wei was an effort to become immortal, to be born anew, to grow the embryo of immortality inside. A part of the disciplines of Daoism included imitation of the animals of nature, because they were thought to act without the intention and willfulness that characterized human decision making. Physical exercises included animal dances (wu qin xi) and movements designed to enable the unrestricted flow of the cosmic life force from which all things are made (qi). These movements designed to channel the flow of qi became associated with what came to be called tai qi or qi gong. Daoists practiced breathing exercises, used herbs and other pharmacological substances, and they employed an instruction booklet for sexual positions and intercourse, all designed to enhance the flow of qi energy. They even practiced external alchemy, using burners to modify the composition of cinnabar into mercury and made potions to drink and pills to ingest for the purpose of adding longevity. Many Daoist practitioners died as a result of these alchemical substances, and even a few Emperors who followed their instructions lost their lives as well, Qinshihuang being the most famous.

The attitude and practices necessary to the pursuit of immortality made this life all the more significant. Butcher Ding is a master butcher because his qi is in harmony with the dao. Daoist practices were meant for everyone, regardless of their origin, gender, social position, or wealth. However, Daoism was a complete philosophy of life and not an easy way to learn.

When superior persons learn the Dao, they practice it with zest.

When average persons learn of the Dao, they are indifferent.

When petty persons learn of the Dao, they laugh loudly.

If they did not laugh, it would not be worthy of being the Dao. (DDJ, 41)

8. Daoism and Confucianism

Arguably, Daoism shared some emphases with classical Confucianism such as a this-worldly concern for the concrete details of life rather than speculation about abstractions and ideals. Nevertheless, it largely represented an alternative and critical tradition divergent from that of Confucius and his followers. While many of these criticisms are subtle, some seem very clear.

One of the most fundamental teachings of the DDJ is that human discriminations, such as those made in law, morality (good, bad), and aesthetics (beauty, ugly), actually create the troubles and problems humans experience; they do not solve them (ch. 3). The clear implication is that the person following the dao must cease ordering his life according to human-made distinctions (ch. 19). Indeed, it is only when the dao recedes in its influence that these demarcations emerge (chs. 18, 38), because they are a form of disease (ch. 74). In contrast, Daoists believe that the dao is untangling the knots of life, blunting the sharp edges of relationships and problems, and turning down the light on painful occurrences (ch. 4). So, it is best to practice wu-wei in all endeavors, to act naturally and not willfully try to oppose or tamper with how reality is moving or try to control it by human discriminations.

Confucius and his followers wanted to change the world and be proactive in setting things straight. They wanted to tamper, orchestrate, plan, educate, develop, and propose solutions. Daoists, on the other hand, take their hands off of life when Confucians want their fingerprints on everything. Imagine this comparison. If the Daoist goal is to become like a piece of unhewn and natural wood, the goal of the Confucians is to become a carved sculpture. The Daoists put the piece before us just as it is found in its naturalness, and the Confucians polish it, shape it, and decorate it. This line of criticism is made very explicitly in the essay which makes up Zhuangzi Chs. 8-10.

Confucians think they can engineer reality, understand it, name it, control it. But the Daoists think that such endeavors are the source of our frustration and fragmentation (DDJ, chs. 57, 72). They believe the Confucians create a gulf between humans and nature that weakens and destroys us. Indeed, as far as the Daoists are concerned, the Confucian project is like a cancer that saps our very life. This is a fundamental difference in how these two great philosophical traditions think persons should approach life, and as shown above it is a consistent difference found also between the Zhuangzi and Confucianism.

The Yellow Emperor sections of the Zhuangzi in Chs. 12, 13 and 14 contain five text blocks in which Laozi is portrayed in dialogue with Confucius and pictured as Confucius’ master and teacher. These materials provide direct access to the Daoist criticism of the Confucian project.

9. Daoism in the Han

The teachings that were later called Daoism were closely associated with a stream of thought called Huanglao Dao (Yellow Emperor-Laozi Dao) in the 3rd and 2nd centuries B.C.E. The thought world transmitted in this stream is what Sima Tan meant by Daojia. The Huanglao school is best understood as a lineage of Daoist practitioners mostly residing in the state of Qi (modern Shandong area). Huangdi was the name for the Yellow Emperor, from whom the rulers of Qi said they were descended. When Emperor Wu, the sixth sovereign of the Han dynasty (r. 140-87 B.C.E.), elevated Confucianism to the status of the official state ideology and training in it became mandatory for all bureaucratic officials, the tension with Daoism became more evident. And yet, at court, people still sought longevity and looked to Daoist masters for the secrets necessary for achieving it. Wu continued to engage in many Daoist practices, including the use of alchemy, climbing sacred Taishan (Mt. Tai), and presenting talismanic petitions to heaven. Liu An (c. 180-122 B.C.E.), the Prince of Huainan and an uncle of Wu, is associated with the production of the work called the Masters of Huainan (Huainanzi). This is a highly synthetic work formed at what is known as the Huainan academy and greatly influenced by Yellow Emperor Daoism. John Major and a team of translators published the first complete English version of this text (2010). The text was an attempt to merge cosmology, Confucian ideals, and a political theory using “quotes” attributed to the Yellow Emperor, although the statements actually closely parallel the Daodejing and the Zhuangzi. All this is of added significance because in the later Han work Laozi bianhua jing (Book of the Transformations of Laozi), the Chinese belief that persons and objects change forms was employed in order to identify Laozi with the Yellow Emperor.

10. Celestial Masters Daoism

Even though Emperor Wu forced Daoist practitioners from court, Daoist teachings found a fertile ground in which to grow in the environment of discontent with the policies of the Han rulers and bureaucrats. Popular uprisings sprouted. The Yellow Turban movement tried to overthrow Han imperial authority in the name of the Yellow Emperor and promised to establish the Way of Great Peace (Tai ping). Indeed, the basic moral and philosophical text that provided the intellectual justification of this movement was the Classic of Great Peace (Taiping jing), available in an English version by Barbara Hendrischke. The present version of this work in the Daoist canon is a later and altered iteration of the original text, which dates to about 166 CE and is attributed to transnormal revelations experienced by Zhang Jiao.

Easily the most important of the Daoist trends at the end of the Han period was the wudou mi dao (Way of Five Bushels of Rice) movement, best known as the Way of the Celestial Masters (tianshi dao). This movement is traceable to a Daoist hermit named Zhang Ling, also known as Zhang Daoling, who resided on a mountain near modern Chengdu in Sichuan. According to an account in Ge Hong’s Biographies of Spirit Immortals, Laozi appeared to Zhang (c. 142 CE) and gave him a commission to announce the imminent end of the world and the coming age of Great Peace (taiping). The revelation said that those who followed Zhang would become part of the Orthodox One Covenant with the Powers of the Universe (Zhengyi meng wei). Zhang began the movement that culminated in a Celestial Master state. The administrators of this state were called libationers (ji jiu), because they performed religious rites as well as political duties. They taught that personal illness and civil mishap were owing to the mismanagement of the forces of the body and nature. The libationers taught a strict form of morality and displayed registers of numinous powers they could access and control. Libationers were moral investigators, standing in for a greater celestial bureaucracy. The Celestial Master state developed against the background of the decline of the later Han dynasty. Indeed, when the empire finally decayed, the Celestial Master government was the only order in much of southern China.

When the Wei dynastic rulers became uncomfortable with the Celestial Masters’ power, they broke up the power centers of the movement. But this backfired because it actually served to disperse Celestial Masters followers throughout China. Many of the refugees settled near Xi’an in and around the site of Louguan tai. The movement remained strong because its leaders had assembled a canon of texts [Statutory Texts of the One and Orthodox (Zhengyi fawen)]. This group of writings included philosophical, political, and ritual texts. It became a fundamental part of the later authorized Daoist canon.

11. Neo-Daoism

The resurgence of Daoism after the Han dynasty is often known as Neo-Daoism. In response to this resurgence, Confucian scholars sought to annotate and reinterpret their own classical texts to move them toward greater compatibility with Daoism, and they even wrote commentaries on Daoist works. A new type of Confucianism known simply as the Way of Mysterious Learning (Xuanxue) emerged. It is represented by a set of scholars, including some of the most prominent thinkers of the period: Wang Bi (226-249), He Yan (d. 249), Xiang Xiu (223?-300), Guo Xiang (d. 312) and Pei Wei (267-300). In general, these scholars shared an effort to reinterpret the social and moral understanding of Confucianism in ways that made it more compatible with Daoist philosophy. Indeed, the extent to which Daoist influence is evident in the texts of these writers has led some scholars to call this movement ‘Neo-Daoism.’ Wang Bi and Guo Xiang, who wrote commentaries on the Daodejing and the Zhuangzi respectively, were the most important voices in this development. Traditionally, the famous “Seven Sages of the Bamboo Grove” (Zhulin qixian) have also been associated with the new Daoist way of life that expressed itself in culture and not merely in mountain retreats. These thinkers included landscape painters, calligraphers, poets, and musicians.

Among the philosophers of this period, the great representative of Daoism in southern China was Ge Hong (283-343 CE). He practiced not only philosophical reflection, but also external alchemy, manipulating mineral substances such as mercury and cinnabar in an effort to gain immortality. His work the Inner Chapters of the Master Who Embraces Simplicity (Baopuzi neipian) is the most important Daoist philosophical work of this period. For him, longevity and immortality are not the same; the former is only the first step to the latter.

12. Shangqing and Lingbao Daoist Movements

After the invasion of China by nomads from Central Asia, Daoists of the Celestial Master tradition who had been living in the north were forced to migrate into southern China, where Ge Hong’s version of Daoism was strong. The mixture of these two traditions is represented in the writings of the Xu family, an aristocratic group from what is today the city of Nanjing. Seeking Daoist philosophical wisdom and the long life it promised, many of them moved to Maoshan (Mount Mao), near the city. There they claimed to receive revelations from immortals, who dictated new wisdom and morality texts to them. Yang Xi was the most prominent medium recipient of the Maoshan revelations (360-370 CE). These revelations came from spirits, local heroes known as the Mao brothers who had been transformed into deities. Yang Xi’s writings formed the basis for Highest Purity (Shangqing) Daoism. The writings were extraordinarily well done, and even the calligraphy in which they were written was beautiful.

Philosophically speaking, the importance of these texts lies in their idealization of the quest for immortality and in their transfer of the material practices of Ge Hong’s alchemical science into a form of reflective meditation. In fact, the Shangqing school of Daoism is the beginning of the tradition known as “inner alchemy” (neidan), an individual mystical pursuit of wisdom.

Some thirty years after the Maoshan revelations, a descendant of Ge Hong named Ge Chaofu went into a mediumistic trance and authored a set of texts called the Numinous Treasure (Lingbao) teachings. These works were ritual recitation texts similar to Buddhist sutras, and indeed they borrowed heavily from Buddhism. At first, the Shangqing and Lingbao texts belonged to the general stream of the Celestial Masters and were not considered separate sects or movements within Daoism, although later lineages of masters emphasized the uniqueness of their teachings.

13. Tang Daoism

As the Lingbao texts illustrate, Daoism acted as a receiving structure for Buddhism. Many early translators of Buddhist texts used Daoist terms to render Indian ideas. Some Buddhists saw Laozi as an avatar of Shakyamuni (the Buddha), and some Daoists understood Shakyamuni as a manifestation of the dao, which also means he was a manifestation of Laozi. An often-made generalization is that Buddhism held north China in the 4th and 5th centuries, and Daoism the south. But gradually this balance of intellectual influence reversed, and Daoism grew in scope and impact throughout China.

By the time of the Tang dynasty (618-906 CE), Daoism was the intellectual philosophy that underwrote the national understanding. The imperial family claimed to descend from Li (by lore, the family of Laozi). Laozi was venerated by royal decree. Officials received Daoist initiation as Masters of its philosophy, rituals, and practices. A major center for Daoist studies was created at Dragon and Tiger Mountain (longhu shan), chosen both for its feng shui and for its strategic location at the intersection of numerous southern China trade routes. The Celestial Masters who held leadership at Dragon and Tiger Mountain were later called “Daoist popes” by Christian missionaries because they had considerable political power.

In aesthetics, two great Daoist intellectuals worked during the Tang. Wu Daozi developed the rules for Daoist painting and Li Bai became its most famous poet. Interestingly, Daoist alchemists invented gunpowder during the Tang. The earliest block-print book on a scientific subject is a Daoist work entitled Xuanjie lu (850 CE). As Buddhism gradually grew stronger during the Tang, Daoist and Confucian intellectuals sought to initiate a conversation with it. The Buddhism that resulted was a reformed version known as Chan (Zen in Japan).

14. The Three Teachings

During the Five Dynasties (907-960 CE) and Song periods (960-1279 CE) Confucianism enjoyed a resurgence, and Daoists found their place by teaching that the principal thinkers of their tradition were Confucian scholars as well. Most notable among these was Lu Dongbin, a legendary Daoist immortal whom many believed to have been originally a Confucian teacher.

Daoism became a complete philosophy of life, reaching into religion, social action, and individual health and physical well-being. A huge network of Daoist temples known by the name Dongyue Miao (also called tianqing guan) was created throughout the empire, with a miao in virtually every town of any size. The Daoist masters who served these temples were often appointed as government officials. They also gave medical, moral, and philosophical advice, and led religious rituals, dedicated especially to the Lord of the Sacred Mountain of the East named Taishan. Daoist masters had wide authority. All this was obvious in the temple iconography. Taishan was represented as the emperor, the City God (cheng huang) was a high official, and the Earth God was portrayed as a prosperous peasant. Daoism of this period integrated the Three Teachings (sanjiao) of China: Confucianism, Buddhism, and Daoism. This process of synthesis continued throughout the Song and into the period of the Ming Dynasty.

Such a wide dispersal of Daoist thought and practice, taken together with its interest in merging Confucianism and Buddhism, eventually created a fragmented ideology. Into this confusion came Wang Zhe (1113-1170 CE), the founder of Quanzhen (Complete Perfection) Daoism. It was Wang’s goal to bring the three teachings into a single great synthesis. For the first time, Daoist teachers adopted monastic forms of life, created monasteries, and organized themselves in ways they saw in Buddhism. This version of Daoist thought interpreted the classical texts of the DDJ and the Zhuangzi to call for a rejection of the body and material world. The Quanzhen order became powerful as the main partner of the Mongols (Yuan dynasty), who gave their patronage to its expansion. Less frequently, the Mongol emperors favored the Celestial Masters and their leader at Dragon and Tiger Mountain in an effort to undermine the power of the Quanzhen leaders. For example, the Zhengyi (Celestial Master) master of Beijing in the 1220s was Zhang Liusun. Under patronage he was allowed to build a Dongyue Miao in the city in 1223 and make it the unofficial town hall of the capital. But by the time of Khubilai Khan (r. 1260-1294) the Buddhists were used against all Daoists. The Khan ordered all Daoist books except the DDJ to be destroyed in 1281, and he closed the Quanzhen monastery in the city known as White Cloud Monastery (Baiyun Guan).

When the Ming (1368-1644) dynasty emerged, the Mongols were expelled, and Chinese rule was restored. The emperors sponsored the creation of the first complete Daoist Canon (Daozang), which was edited between 1408 and 1445. This was an eclectic collection, including many Buddhist- and Confucian-related texts. Daoist influence reached its zenith.

15. The “Destruction” of Daoism

The Manchurian tribes that became rulers of China in 1644 and founded the Qing dynasty were already under the influence of conservative Confucian exiles. They stripped the Celestial Master of Dragon Tiger Mountain of his power at court. Only Quanzhen was tolerated. White Cloud Monastery (Baiyun Guan) was reopened, and a new lineage of thinkers was organized. They called themselves the Dragon Gate lineage (Longmen pai). In the 1780s, Western traders arrived, and so did Christian missionaries. In 1849, the Hakka people of Guangxi province, among China’s poorest citizens, rose in revolt. They followed Hong Xiuquan, who claimed to be Jesus’ younger brother. This millennial movement, built on a strange version of Chinese Christianity, sought to establish the Heavenly Kingdom of Peace (taiping). As the Taiping swept throughout southern China, they destroyed Buddhist and Daoist temples and texts wherever they found them. The Taiping army completely razed the Daoist complexes on Dragon Tiger Mountain. During most of the 20th century the drive to eradicate Daoist influence continued. In the 1920s, the “New Life” movement drafted students to go out on Sundays to destroy Daoist statues and texts. Accordingly, by the year 1926 only two copies of the Daoist Canon (Daozang) existed, and the Daoist philosophical heritage was in great jeopardy. But permission was granted to copy the canon kept at the White Cloud Monastery, and so the texts were preserved for the world. There are 1,120 titles in this collection in 5,305 volumes. Much of this material has yet to receive scholarly attention and very little of it has been translated into any Western language.

The Cultural Revolution (1966-1976) attempted to complete the destruction of Daoism. Masters were killed or “re-educated.” Entire lineages were broken up and their texts were destroyed. The miaos were closed, burned, and turned into military barracks. At one time there were 300 Daoist sites in Beijing alone; now there are only a handful. However, Daoism is not dead. It survives as a vibrant philosophical system and way of life, as is evidenced by the revival of its practice and study in several new university institutes in the People’s Republic.

16. References and Further Reading

  • Ames, Roger and Hall, David. (2003). Daodejing: “Making This Life Significant” A Philosophical Translation. New York: Ballantine Books.
  • Ames, Roger. (1998). Wandering at Ease in the Zhuangzi. Albany: State University of New York Press.
  • Bokenkamp, Stephen R. (1997). Early Daoist Scriptures. Berkeley: University of California Press.
  • Boltz, Judith M. (1987). A Survey of Taoist Literature: Tenth to Seventeenth Centuries, China Research Monograph 32. Berkeley: University of California Press.
  • Chan, Alan. (1991). Two Visions of the Way: A Translation and Study of the Heshanggong and Wang Bi Commentaries on the Laozi. Albany: State University of New York Press.
  • Cook, Scott (2013). The Bamboo Texts of the Guodian: A Study & Complete Translation. New York: Cornell University East Asia Program.
  • Coutinho, Steve (2014). An Introduction to Daoist Philosophies.  New York: Columbia University Press.
  • Creel, Herrlee G. (1970). What is Taoism? Chicago: University of Chicago Press.
  • Csikszentmihalyi, Mark and Ivanhoe, Philip J., eds. (1999). Religious and Philosophical Aspects of the Laozi. Albany: State University of New York Press.
  • Girardot, Norman J. (1983). Myth and Meaning in Early Taoism: The Theme of Chaos (hun-tun). Berkeley: University of California Press.
  • Graham, Angus. (1981). Chuang tzu: The Inner Chapters. London: Allen & Unwin.
  • Graham, Angus. (1989). Disputers of the Tao: Philosophical Argument in Ancient China. La Salle, IL: Open Court.
  • Graham, Angus. (1979). “How much of the Chuang-tzu Did Chuang-tzu Write?” Journal of the American Academy of Religion, Vol. 47, No. 3.
  • Hansen, Chad (1992). A Daoist Theory of Chinese Thought. New York: Oxford University Press.
  • Hendrischke, Barbara (2015, reprint ed.). The Scripture on Great Peace: The Taiping jing and the Beginnings of Daoism. Berkeley: The University of California Press.
  • Henricks, Robert. (1989). Lao-Tzu: Te-Tao Ching. New York: Ballantine.
  • Hochsmann, Hyun and Yang Guorong, trans. (2007). Zhuangzi. New York: Pearson.
  • Ivanhoe, Philip J. (2002). The Daodejing of Laozi. New York: Seven Bridges Press.
  • Kjellberg, Paul and Ivanhoe, Philip J., eds. (1996). Essays on Skepticism, Relativism, and Ethics in the Zhuangzi. Albany: State University of New York Press.
  • Kleeman, Terry (1998). Great Perfection: Religion and Ethnicity in a Chinese Millenial Kingdom. Honolulu: University of Hawaii Press.
  • Kohn, Livia, ed. (2004). Daoism Handbook, 2 vols. Boston: Brill.
  • Kohn, Livia (2009). Introducing Daoism. London: Routledge.
  • Kohn, Livia (2014). Zhuangzi: Text and Context.  St. Petersburg: Three Pines Press.
  • Kohn, Livia and LaFargue, Michael., eds. (1998). Lao-tzu and the Tao-te-ching. Albany: State University of New York Press.
  • Kohn, Livia and Roth, Harold., eds. (2002). Daoist Identity: History, Lineage, and Ritual. Honolulu: University of Hawaii Press.
  • Komjathy, Louis (2014). Daoism: A Guide for the Perplexed. London: Bloomsbury.
  • LaFargue, Michael. (1992). The Tao of the Tao-te-ching. Albany: State University of New York Press.
  • Lin, Paul J. (1977). A Translation of Lao-tzu’s Tao-te-ching and Wang Pi’s Commentary. Ann Arbor: University of Michigan.
  • Lau, D.C. (1982). Chinese Classics: Tao Te Ching. Hong Kong: Hong Kong University Press.
  • Littlejohn, Ronnie (2010). Daoism: An Introduction. London: I.B. Tauris.
  • Littlejohn, Ronnie (2011). “The Liezi’s Use of the Lost Zhuangzi.” Riding the Wind with Liezi: New Perspectives on the Daoist Classic. Eds. Ronnie Littlejohn and Jeffrey Dippmann. Albany: State University of New York Press.
  • Lynn, Richard John. (1999). The Classic of the Way and Virtue: A New Translation of the Tao-Te Ching of Laozi as Interpreted by Wang Bi. New York: Columbia University Press.
  • Mair, Victor, ed. (2010). Experimental Essays on Zhuangzi. St. Petersburg: Three Pines Press. New edition of University of Hawai’i, 1983.
  • Mair, Victor. (1990). Tao Te Ching: The Classic Book of Integrity and the Way. New York: Bantam Press.
  • Mair, Victor (1994). Wandering on the Way: Early Taoist Tales and Parables of Chuang Tzu. Honolulu: University of Hawai’i Press.
  • Major, John, Queen, Sarah, Meyer, Andrew Seth, and Roth, Harold, trans. (2010). The Huainanzi: A Guide to the Theory and Practice of Government in Early Han China. New York: Columbia University Press.
  • Maspero, Henri. (1981). Taoism and Chinese Religion. Amherst: University of Massachusetts Press.
  • Miller, James (2003). Daoism: A Short Introduction.  Oxford: Oxford University Press.
  • Moeller, Hans-Georg (2004). Daoism Explained: From the Dream of the Butterfly to the Fishnet Allegory. Chicago: Open Court.
  • Robinet, Isabelle. (1997). Taoism: Growth of a Religion. Stanford: Stanford University Press.
  • Roth, Harold (1999). Original Tao: Inward Training (Nei-yeh) and the Foundations of Taoist Mysticism. New York: Columbia University Press.
  • Roth, Harold D. (1992). The Textual History of the Huai Nanzi. Ann Arbor: Association of Asian Studies.
  • Roth, Harold D. (1991). “Who Compiled the Chuang Tzu?” In Chinese Texts and Philosophical Contexts, ed. Henry Rosemont, 84-95. La Salle: Open Court.
  • Schipper, Kristofer. (1993). The Taoist Body. Berkeley: University of California Press.
  • Slingerland, Edward, (2003). Effortless Action: Wu-Wei As Conceptual Metaphor and Spiritual Ideal in Early China. New York: Oxford University Press.
  • Waley, Arthur (1934). The Way and Its Power: A Study of the Tao Te Ching and its Place in Chinese Thought. London: Allen & Unwin.
  • Watson, Burton. (1968). The Complete Works of Chuang Tzu. New York: Columbia University Press.
  • Welch, Holmes. (1966). Taoism: The Parting of the Way. Boston: Beacon Press.
  • Welch, Holmes and Seidel, Anna, eds. (1979). Facets of Taoism. New Haven: Yale University Press.

 

Author Information

Ronnie Littlejohn
Email: ronnie.littlejohn@belmont.edu
Belmont University
U. S. A.

Slavoj Žižek (1949—)

Slavoj Žižek is a Slovenian-born political philosopher and cultural critic. He was described by the British literary theorist Terry Eagleton as the “most formidably brilliant” recent theorist to have emerged from Continental Europe.

Žižek’s work is infamously idiosyncratic. It features striking dialectical reversals of received common sense; a ubiquitous sense of humor; a patented disrespect towards the modern distinction between high and low culture; and the examination of examples taken from the most diverse cultural and political fields. Yet Žižek’s work, as he warns us, has a very serious philosophical content and intention. He challenges many of the founding assumptions of today’s left-liberal academy, including the elevation of difference or otherness to ends in themselves, the reading of the Western Enlightenment as implicitly totalitarian, and the pervasive skepticism towards any context-transcendent notions of truth or the good.

One feature of Žižek’s work is its singular philosophical and political reconsideration of German idealism (Kant, Schelling and Hegel). Žižek has also reinvigorated the challenging psychoanalytic theory of Jacques Lacan, controversially reading him as a thinker who carries forward founding modernist commitments to the Cartesian subject and the liberating potential of self-reflective agency, if not self-transparency. Žižek’s works since 1997 have become more and more explicitly political, contesting the widespread consensus that we live in a post-ideological or post-political world, and defending the possibility of lasting changes to the new world order of globalization, the end of history, or the war on terror.

This article explains Žižek’s philosophy as a systematic, if unusually presented, whole; and it clarifies the technical language Žižek uses, which he takes from Lacanian psychoanalysis, Marxism, and German idealism. In line with how Žižek presents his own work, this article starts by examining Žižek’s descriptive political philosophy. It then examines the Lacanian-Hegelian ontology that underlies Žižek’s political philosophy. The final part addresses Žižek’s practical philosophy, and the ethical philosophy he draws from this ontology.

Table of Contents

  1. Biography
  2. Žižek’s Political Philosophy
    1. Criticism of Ideology as “False Consciousness”
    2. Ideological Cynicism and Belief
    3. Jouissance as Political Factor
    4. The Reflective Logic of Ideological Judgments (or How the King is King)
    5. Sublime Objects of Ideology
  3. Žižek’s Fundamental Ontology
    1. The Fundamental Fantasy & the Split Law
    2. Excursus: Žižek’s Typology of Ideological Regimes
    3. Kettle Logic, or Desire and Theodicy
    4. Fantasy as the Fantasy of Origins
    5. Exemplification: the Fall and Radical Evil (Žižek’s Critique of Kant)
  4. From Ontology to Ethics—Žižek’s Reclaiming of the Subject
    1. Žižek’s Subject, Fantasy, and the Objet Petit a
    2. The Objet Petit a & the Virtuality of Reality
    3. Forced Choice & Ideological Tautologies
    4. The Substance is Subject, the Other Does Not Exist
    5. The Ethical Act Traversing the Fantasy
  5. Conclusion
  6. References and Further Reading
    1. Primary Literature (Books by Žižek)
    2. Secondary Literature (Texts on Žižek)

1. Biography

Slavoj Žižek was born in 1949 in Ljubljana, Slovenia. He grew up in the comparative cultural freedom of the former Yugoslavia’s self-managing socialism. Here—significantly for his work—Žižek was exposed to the films, popular culture and theory of the noncommunist West. Žižek completed his PhD at Ljubljana in 1981 on German Idealism, and between 1981 and 1985 studied in Paris under Jacques-Alain Miller, Lacan’s son-in-law. In this period, Žižek wrote a second dissertation, a Lacanian reading of Hegel, Marx and Kripke. In the late 1980s, Žižek returned to Slovenia, where he wrote newspaper columns for the Slovenian weekly “Mladina,” and cofounded the Slovenian Liberal Democratic Party. In 1990, he ran for a seat on the four-member collective Slovenian presidency, narrowly missing office. Žižek’s first published book in English, The Sublime Object of Ideology, appeared in 1989. Since then, Žižek has published over a dozen books, edited several collections, published numerous philosophical and political articles, and maintained a tireless speaking schedule. His earlier works are of the type “Introductions to Lacan through popular culture / Hitchcock / Hollywood….” Since at least 1997, however, Žižek’s work has taken on an increasingly engaged political tenor, culminating in books on September 11 and the Iraq war. As well as being visiting professor at the Department of Psychoanalysis, Université Paris VIII, in 1982-3 and 1985-6, Žižek has lectured at the Cardozo Law School, Columbia, Princeton, the New School for Social Research, the University of Michigan, Ann Arbor, and Georgetown. He is currently a returning faculty member of the European Graduate School, and founder and president of the Society for Theoretical Psychoanalysis, Ljubljana.

2. Žižek’s Political Philosophy

a. Criticism of Ideology as “False Consciousness”

In a way that is oddly reminiscent of Nietzsche, Žižek generally presents his work in a polemical fashion, knowingly striking out against the grain of accepted opinion. One untimely feature of Žižek’s work is his continuing defense and use of the unfashionable term “ideology.” According to the classical Marxist definition, ideologies are discourses that promote false ideas (or “false consciousness”) in subjects about the political regimes they live in. Nevertheless, because these ideas are believed by the subjects to be true, they assist in the reproduction of the existing status quo, in an exact instance of what Umberto Eco dubs “the force of the fake.” To critique ideology, according to this position, it is sufficient to unearth the truth(s) the ideologies conceal from the subject’s knowledge. Then, so the theory runs, subjects will become aware of the political shortcomings of their current regimes, and be able and moved to better them. As Žižek takes up in his earlier works, this classical Marxian notion of ideology has come under theoretical attack in a number of ways. First, to criticize a discourse as ideological implies access to a Truth about political things: the Truth that the ideologies, as false, would conceal. But it has been widely disputed in the humanities that there could ever be any One such theoretically accessible Truth. Secondly, the notion of ideology is held to be irrelevant for describing contemporary sociopolitical life, because of the increased importance of what Jürgen Habermas calls “media-steered subsystems” (the market, public and private bureaucracies), and also because of the widespread cynicism of today’s subjects towards political authorities. For ideologies to have political importance, critics comment, subjects would have to have a level of faith in public institutions, ideals and politicians which today’s liberal-cosmopolitan subjects lack. The widespread notoriety of left-leaning authors like Michael Moore or Noam Chomsky, as one example, bears witness to how subjects today can know very well what Moore claims is the “awful truth,” and yet act as if they did not know.

Žižek agrees with critics about this “false consciousness” model of ideology. Yet he insists that we are not living in a post-ideological world, as figures as different as Tony Blair, Daniel Bell or Richard Rorty have claimed. Žižek proposes instead that in order to understand today’s politics we need a different notion of ideology. In a typically bold reversal, Žižek’s position is that today’s widespread consensus that our world is post-ideological gives voice to what he calls the “archideological” fantasy. Since “ideology” has carried a pejorative sense since Marx, no one who is taken in by such an ideology has ever believed that they were so duped, Žižek comments. If the term “ideology” has any meaning at all, ideological positions are always what people impute to Others (for today’s left, for example, the political right are the dupes of one or another noble lie about natural community; for the right, the left are the dupes of well-meaning but utopian egalitarianism bound to lead to economic and moral collapse, and so forth). For subjects to believe in an ideology, it must have been presented to them, and been accepted, as non-ideological: indeed, as True and Right, and what anyone sensible would believe. As we shall see in 2e, Žižek is alert to the realist insight that there is no more effective political gesture than to declare some contestable matter above political contestation. Just as the third way is said to be post-ideological or national security is claimed to be extra-political, so Žižek argues that ideologies are always presented by their proponents as being discourses about Things too sacred to profane by politics. Hence, Žižek’s bold opening in The Sublime Object of Ideology is to claim that today ideology has not so much disappeared from the political landscape as come into its own. It is exactly because of this success, Žižek argues, that ideology has also been able to be dismissed in accepted political and theoretical opinion.

b. Ideological Cynicism and Belief

Today’s typical first world subjects, according to Žižek, are the dupes of what he calls “ideological cynicism.” Drawing on the German political theorist Sloterdijk, Žižek contends that the formula describing the operation of ideology today is not “they do not know it, but they are doing it,” as it was for Marx. It is “they know it, but they are doing it anyway.” If this looks like nonsense from the classical Marxist perspective, Žižek’s position is that nevertheless this cynicism indicates the deeper efficacy of political ideology per se. Ideologies, as political discourses, are there to secure the voluntary consent, or what La Boétie called servitude volontaire, of people about contestable political policies or arrangements. Yet, Žižek argues, subjects will only voluntarily agree to follow one or other such arrangement if they believe that, in doing so, they are expressing their free subjectivity, and might have done otherwise.

However false such a sense of freedom is, Žižek insists that it is nevertheless a political instance of what Hegel called an essential appearance. Althusser’s understanding of ideological identification suggests that an individual is wholly “interpellated” into a place within a political system by the system’s dominant ideology and ideological state apparatuses. Contesting this notion by drawing on Lacanian psychoanalysis, however, Žižek argues that it is a mistake to think that, for a political position to win people’s support, it needs to effectively brainwash them into thoughtless automatons. Rather, Žižek maintains that any successful political ideology always allows subjects to have and to cherish a conscious distance towards its explicit ideals and prescriptions—or what he calls, in a further technical term, “ideological disidentification.”

Again bringing the psychoanalytic theory of Lacan to bear in political theory, Žižek argues that the attitude of subjects towards authority revealed by today’s ideological cynicism resembles the fetishist’s attitude towards his fetish. The fetishist’s attitude towards his fetish has the peculiar form of a disavowal: “I know well that (for example) the shoe is only a shoe, but nevertheless, I still need my partner to wear the shoe in order to enjoy.” According to Žižek, the attitude of political subjects towards political authority evinces the same logical form: “I know well that (for example) Bob Hawke / Bill Clinton / the Party / the market does not always act justly, but I still act as though I did not know that this is the case.” In his famous essay “Ideology and Ideological State Apparatuses,” Althusser staged a kind of primal scene of ideology, the moment when a policeman (as bearer of authority) says “hey you!” to an individual, and the individual recognizes himself as the addressee of this call. In the “180 degree turn” of the individual towards this Other who has addressed him, the individual becomes a political subject, Althusser says. Žižek’s central technical notion of the “big Other” [grand Autre] closely resembles (to the extent that it is not modelled on) Althusser’s notion of the Subject (capital “S”) in the name of which public authorities (like the police) can legitimately call subjects to account within a regime: for example, “God” in a theocracy, “the Party” under Stalinism, or “the People” in today’s China. As the central chapter of The Sublime Object of Ideology specifies, ideologies for Žižek work to identify individuals with such important or rallying political terms as these, which Žižek calls “master signifiers.” The strange but decisive thing about these pivotal political words, according to Žižek, is that no one knows exactly what they mean or refer to, or has ever seen with their own eyes the sacred objects which they seem to name (for example: God, the Nation, or the People). This is one reason why Žižek, in the technical language he inherits (via Lacan) from structuralism, says that the most important words in any political doctrine are “signifiers without a signified” (that is, words that do not refer to any clear and distinct concept or demonstrable object).

This claim of Žižek’s is connected to two other central ideas in his work:

  • First: Žižek adapts the psychoanalytic notion that individuals are always “split” subjects, divided between the levels of their conscious awareness and the unconscious. Žižek contends throughout his work that subjects are always divided between what they consciously know and can say about political things, and a set of more or less unconscious beliefs they hold concerning individuals in authority, and the regime in which they live (see 3a). Even if people cannot say clearly and distinctly why they support some political leader or policy, for Žižek no less than for Edmund Burke, this fact is not politically decisive, as we will see in 2e below.
  • Second: Žižek makes a crucial distinction between knowledge and belief. Exactly where and because subjects do not know, for example, what “the essence” of “their people” is, the scope and nature of their beliefs on such matters is politically decisive, according to Žižek (again, see 2e below).

Žižek’s understanding of political belief is modelled on Lacan’s understanding of transference in psychoanalysis. The belief or “supposition” of the analysand in psychoanalysis is that the Other (his analyst) knows the meaning of his symptoms. This is obviously a false belief, at the start of the analytic process. But it is only through holding this false belief about the analyst that the work of analysis can proceed, and the transferential belief can become true (when the analyst does become able to interpret the symptoms). Žižek argues that this strange intersubjective or dialectical logic of belief in clinical psychoanalysis is also what characterizes people’s political beliefs. Belief is always “belief through the Other,” Žižek argues. If subjects do not know the exact meaning of those “master signifiers” with which they politically identify, this is because their political belief is mediated through their identifications with others. Although they each themselves “do not know what they do” (which is the title of one of Žižek’s books [Žižek, 2002]), the deepest level of their belief is maintained through the belief that nevertheless there are Others who do know. A number of features of political life are cast into new relief given this psychoanalytic understanding, Žižek claims:

  • First, Žižek contends that the key political function of holders of public office is to occupy the place of what he calls, after Lacan, “the Other supposed to know.” Žižek cites the example of priests reciting mass in Latin before an uncomprehending laity, who believe that the priests know the meaning of the words, and for whom this is sufficient to keep the faith. Far from presenting an exception to the way political authority works, for Žižek this scenario reveals the universal rule of how political consensus is formed.
  • Second, and in connection with this, Žižek contends that political power is primarily “symbolic” in its nature. What he means by this further technical term is that the roles, masks, or mandates that public authorities bear are more important politically than the true “reality” of the individuals in question (whether they are unintelligent, unfaithful to their wives, good family women, and so forth). According to Žižek, for example, fashionable liberal criticisms of George W. Bush the man are irrelevant to understanding or evaluating his political power. It is the office or place an individual occupies in their political system (or “big Other”) that ensures the political force of their words, and the belief of subjects in their authority. This is why Žižek maintains that the resort of a political leader or regime to “the real of violence” (such as war or police action) amounts to a confession of its weakness as a political regime. Žižek sometimes puts this thought by saying that people believe through the big Other, or that the big Other believes for them, despite what they might inwardly think or cynically say.

c. Jouissance as Political Factor

A further key point that Žižek takes from Louis Althusser’s later work on ideology is Althusser’s emphasis on the “materiality” of ideology, its embodiment in institutions and peoples’ everyday practices and lives. Žižek’s realist position is that all the ideas in the world can have no lasting political effect unless they come to inform institutions and subjects’ day-to-day lives. In The Sublime Object of Ideology, Žižek cites Blaise Pascal’s advice that doubting subjects should get down on their knees and pray, and then they will believe. Pascal’s position is not any kind of simple proto-behaviorism, according to Žižek. The deeper message of Pascal’s directive, he asserts, is to suggest that once subjects have come to believe through praying, they will also retrospectively see that they got down on their knees because they always believed, without knowing it. In this way, in fact, Žižek can be read as a consistent critic not only of the importance of knowledge in the formation of political consensus, but also of the importance of “inwardness” in politics per se in the tradition of the younger Carl Schmitt.

Prior political philosophy has placed too little emphasis, Žižek asserts, on communities’ cultural practices that involve what he calls “inherent transgression.” These are practices sanctioned by a culture that nevertheless allow subjects some experience of what is usually exceptional to or prohibited in their everyday lives as civilized political subjects—things like sex, death, defecation, or violence. Such experiences involve what Žižek calls jouissance, another technical term he takes from Lacanian psychoanalysis. Jouissance is usually translated from the French as “enjoyment.” As opposed to what we talk of in English as “pleasure”, though, jouissance is an always sexualized, always transgressive enjoyment, at the limits of what subjects can experience or talk about in public. Žižek argues that subjects’ experiences of the events and practices wherein their political culture organizes its specific relations to jouissance (in first world nations, for example, specific sports, types of alcohol or drugs, music, festivals, films) are as close as they will get to knowing the deeper Truth intimated for them by their regime’s master signifiers: “nation”, “God”, “our way of life,” and so forth (see 2b above). Žižek, like Burke, argues that it is such ostensibly nonpolitical and culturally specific practices as these that irreplaceably single out any political community from its others and enemies. Or, as one of Žižek’s chapter titles in Tarrying With the Negative puts it, where and although subjects do not know their Nation, they “enjoy (jouis) their nation as themselves.”

d. The Reflective Logic of Ideological Judgments (or How the King is King)

According to Žižek, like and after Althusser, ideologies are thus political discourses whose primary function is not to make correct theoretical statements about political reality (as Marx’s “false consciousness” model implies), but to orient subjects’ lived relations to and within this reality. If a political ideology’s descriptive propositions turn out to be true (for example: “capitalism exploits the workers,” “Saddam was a dictator,” “the Spanish are the national enemy,” and so forth), this does not in any way reduce their ideological character, in Žižek’s estimation. This is because this character concerns the political issue of how subjects’ belief in these propositions, instead of those of opponents, positions subjects on the leading political issues of the day. For Žižek, political speech is primarily about securing a lived sense of unity or community between subjects, something like what Kant called sensus communis or Rousseau the general will. If political propositions seemingly do describe things in the world, Žižek’s position is that we nevertheless need always to understand them as Marx understood the exchange value of commodities—as “a relation between people being concealed behind a relation between things.” Or again: just as Kant thought that the proposition “this is beautiful” really expresses a subject’s reflective sense of commonality with all other subjects capable of being similarly affected by the object, so Žižek argues that propositions like “Go Spain!” or “the King will never stop working to secure our future” are what Kant called reflective judgments, which tell us as much or more about the subject’s lived relation to political reality as about this reality itself.

If ideological statements are thus performative utterances that produce political effects by their being stated, Žižek in fact holds that they are a strange species of performative utterance overlooked by speech act theory. Although, when subjects say “the Queen is the Queen!”, they are at one level reaffirming their allegiance to a political regime, Žižek holds that this does not mean that this regime could survive without appearing to rest on such deeper Truths about the way the world is. As we saw in 2b, Žižek maintains that political ideologies always present themselves as naming such deeper, extra-political Truths. Ideological judgments, according to Žižek, are thus performative utterances which, in order to perform their salutary political work, must yet appear to be objective descriptions of the way the world is (exactly as when a chairman says “this meeting is closed!” only thereby bringing this state of affairs into effect). In The Sublime Object of Ideology, Žižek cites Marx’s analysis of being a King in Das Kapital to illustrate his meaning. A King is only King because his subjects loyally think and act like he is King (think of the tragedy of Lear). Yet, at the same time, the people will only believe he is King if they believe that this is a deeper Truth about which they can do nothing.

e. Sublime Objects of Ideology

In line with Žižek’s ideas of “ideological disidentification” and “jouissance as a political factor” (see 2b and 2c above) and in a clear comparison with Derrida’s deconstruction, arguably the unifying thought in Žižek’s political philosophy is that regimes can only secure a sense of collective identity if their governing ideologies afford subjects an understanding of how their regime relates to what exceeds, supplements or challenges its identity. This is why Kant’s analytic of the sublime in The Critique of Judgment, as an analysis of an experience in which the subject’s identity is challenged, is of the highest theoretical interest for Žižek. Kant’s analytic of the sublime isolates two moments to its experience, as Žižek observes. In the first moment, the size or force of an object painfully impresses upon the subject the limitation of its perceptual capabilities. In a second moment, however, a “representation” arises where “we would least expect it,” which takes as its object the subject’s own failure to perceptually take the object in. This representation resignifies the subject’s perceptual failure as indirect testimony about the inadequacy of human perception as such to attain to what Kant calls Ideas of Reason (in Kant’s system, God, the Universe as a Whole, Freedom, the Good).

According to Žižek, all successful political ideologies necessarily refer to and turn around sublime objects which they themselves posit. These sublime objects are what political subjects take their regime’s ideologies’ central words to mean or name: extraordinary Things like God, the Führer, the King, in whose name they will (if necessary) transgress ordinary moral laws and lay down their lives. When a subject believes in a political ideology, as we saw in 2b above, Žižek argues that this does not mean that they know the Truth about the objects which its key terms seemingly name—indeed, Žižek will finally contest that such a Truth exists (see 3c, d). Nevertheless, by drawing on a parallel with Kant on the sublime, Žižek makes a further and more radical point. Just as in the experience of the sublime, Kant’s subject resignifies its failure to grasp the sublime object as indirect testimony to a wholly “supersensible” faculty within herself (Reason), so Žižek argues that the inability of subjects to explain the nature of what they believe in politically does not indicate any disloyalty or abnormality. What political ideologies do, precisely, is provide subjects with a way of seeing the world according to which such an inability can appear as testimony to how Transcendent or Great their Nation, God, Freedom, and so forth is—surely far above the ordinary or profane things of the world. In Žižek’s Lacanian terms, these things are Real (capital “R”) Things (capital “T”), precisely insofar as they in this way stand out from the reality of ordinary things and events.

In the struggle between competing political ideologies, Žižek hence agrees with Ernesto Laclau and Chantal Mouffe that the aim of each is to elevate its particular political perspective (about what is just, best, and so forth) to the point where it can lay claim to name, give voice to, or represent the political whole (for example: the nation). In order to achieve this political feat, Žižek argues, each group must succeed in identifying its perspective with the extra-political, sublime objects accepted within the culture as giving body to this whole (for example: “the national interest,” “the dictatorship of the proletariat”). Or else, it must supplant the previous ideologies’ sublime objects with new such objects. In the absolute monarchies, as Ernst Kantorowicz argued, the King’s so-called “second” or “symbolic” body exemplified paradigmatically such sublime political objects as the unquestionable font of political authority (the particular individual who was King was contestable, but not the sovereign’s role itself). Žižek’s critique of Stalinism, in a comparable way, turns upon the thought that “the Party” had this sublime political status in Stalinist ideology. Class struggle in this society did not end, Žižek contends, despite Stalinist propaganda. It was only displaced from a struggle between two classes (for example, bourgeois versus proletarian) to one between “the Party” as representative of the people or the whole, and all who disagreed with it, ideologically positioned as “traitors” or “enemies of the people.”

3. Žižek’s Fundamental Ontology

a. The Fundamental Fantasy & the Split Law

For Žižek, as we have seen, no political regime can sustain the political consensus upon which it depends, unless its predominant ideology affords subjects a sense both of individual distance or freedom with regard to its explicit prescriptions (2b), and that the regime is grounded in some larger or “sublime” Truth (2e). Žižek’s political philosophy identifies interconnected instances of these dialectical ideas: his notion of “ideological disidentification” (2b); his contention that ideologies must accommodate subjects’ transgressive experiences of jouissance (2c); and his conception of exceptional or sublime objects of ideology (2e). These ideas intersect in what is arguably the central notion in Žižek’s political philosophy: his notion of “ideological fantasy.” “Ideological fantasy” is Žižek’s technical name for the deepest framework of belief that structures how political subjects, and/or a political community, comes to terms with what exceeds its norms and boundaries, in the various registers we examined above.

Like many of Žižek’s key notions, Žižek’s notion of the ideological fantasy is a political adaptation of an idea from Lacanian psychoanalysis: specifically, Lacan’s structuralist rereading of Freud’s psychoanalytic understanding of unconscious fantasy. As for Lacan, so for Žižek, the civilizing of subjects necessitates their founding sacrifice (or “castration”) of jouissance, enacted in the name of sociopolitical Law. Subjects, to the extent that they are civilized, are “cut” from the primal object of their desire. Instead, they are forced by social Law to pursue this special, lost Thing (in Žižek’s technical term, the “objet petit a”; see 4a, 4b) by observing their societies’ linguistically mediated conventions, deferring satisfaction, and accepting sexual and generational difference. Subjects’ “fundamental fantasies,” according to Lacan, are unconscious structures which allow them to accept the traumatic loss involved in this founding sacrifice. They turn around a narrative about the lost object, and how it was lost (see 3d). In particular, the fundamental fantasy of a subject resignifies the founding repression of jouissance by Law—which, according to Lacan, is necessary if the individual is to become a speaking subject—as if it were a merely contingent, avoidable occurrence. In the fantasy, that is, what is for Žižek a constitutive event for the subject is renarrated as the historical action of some exceptional individual (in Enjoy Your Symptom!, the pre-Oedipal “anal father”). Equally, the jouissance the subject considers itself to have lost is posited by the fantasy as having been taken from it by this persecutory “Other supposed to enjoy” (see 3b).

In the notion of ideological fantasy, Žižek takes this psychoanalytic framework and applies it to the understanding of the constitution of political groups. If after Plato, political theory concerns the Laws of a regime, the Laws for Žižek are always split or double in kind. Each political regime has a body of more or less explicit, usually written Laws which demand that subjects forego jouissance in the name of the greater good, and according to the letter of its proscriptions (for example, the US or French constitutions). Žižek identifies this level of the Law with the Freudian ego ideal. But Žižek argues that, in order to be effective, a regime’s explicit Laws must also harbor and conceal a darker underside, a set of more or less unspoken rules which, far from simply repressing jouissance, implicate subjects in a guilty enjoyment in repression itself, which Žižek likens to the “pleasure-in-pain” associated with the experience of Kant’s sublime (see 2d). The Freudian superego, for Žižek, names the psychical agency of the Law, as it is misrepresented and sustained by subjects’ fantasmatic imaginings of a persecutory Other supposed to enjoy (like the archetypal villain in noir films). This darker underside of the Law, Žižek agrees with Lacan, is at its base a constant imperative to subjects to jouis!, by engaging in the “inherent transgressions” of their sociopolitical community (see 2b).

Žižek’s notion of the split in the Law in this way intersects directly with his notion of ideological disidentification examined in 2b. While political subjects maintain a conscious sense of freedom from the explicit norms of their culture, Žižek contends, this disidentification is grounded in their unconscious attachment to the Law as superego, itself an agency of enjoyment. If Althusser famously denied the importance of what people “have on their consciences” in the explanation of how political ideologies work, then for Žižek the role of guilt—as the way in which the subject enjoys his subjection to the laws—is vital to understanding subjects’ political commitments. Individuals will only turn around when the Law hails them, Žižek argues, insofar as they are finally subjects also of the unconscious belief that the “big Other” has access to the jouissance they have lost as subjects of the Law, and which they can accordingly reattain through their political allegiance (see 2b). It is this belief, what could be termed this “political economy of jouissance,” that the fundamental fantasies underlying political regimes’ worldviews are there to structure in subjects.

b. Excursus: Žižek’s Typology of Ideological Regimes

With these terms of Žižek’s Lacanian ontology in place, it becomes possible to lay out Žižek’s theoretical understanding of the differences between different types of ideological-political regimes. Žižek’s works maintain a lasting distinction between modern and premodern political regimes, which he contends are grounded in fundamentally different ways of organizing subjects’ relations to Law and jouissance (3a). In Žižek’s Lacanian terms, premodern ideological regimes exemplified what Lacan calls in Seminar XVII the discourse of the master. In these authoritarian regimes, the word and will of the King or master (in Žižek’s mathemes, S1) was sovereign—the source of political authority, with no questions asked. Her/His subjects, in turn, are supposed to know (S2) the edicts of the sovereign and the Law (as the classical legal notion has it, “ignorance is no excuse”). In this arrangement, while jouissance and fantasy are political factors, as Žižek argues, regimes’ quasi-transgressive practices remain exceptional to the political arena, glimpsed only in such carnivalesque events as festivals or the types of public punishment Michel Foucault (for example) describes in the introduction to Discipline and Punish.

Žižek agrees with both Foucault and Marx that modern political regimes exert a form of power that is both less visible and more far-reaching than that of the regimes they replaced. Modern regimes, both liberal capitalist and totalitarian, for Žižek, are no longer predominantly characterized by the Lacanian discourse of the master. Given that the Oedipal complex is associated by him with this older type of political authority, Žižek agrees with the Frankfurt School theorists that, contra Deleuze and Guattari, today's subjectivity as such is already post- or anti-Oedipal. Indeed, in The Plague of Fantasies and The Ticklish Subject, Žižek contends that the characteristic discontents of today's political world—from religious fundamentalism to the resurgence of racism in the first world—are not archaic remnants of, or protests against, traditional authoritarian structures, but the pathological effects of new forms of social organization. For Žižek, the defining agency in modern political regimes is knowledge (or, in his Lacanian mathemes, S2). The Enlightenment represented the unprecedented political venture to replace belief in authority as the basis of polity with human reason and knowledge. As Schmitt also complained, the legitimacy of modern authorities is grounded not in the self-grounding decision of the sovereign, but in the ability of authorities to muster coherent chains of reasons to subjects about why they are fit to govern. Modern regimes hence always claim to speak not out of ignorance of what subjects deeply enjoy ("I don't care what you want; just do what I say!") but in the very name of subjects' freedom and enjoyment.

Whether fascist or communist, Žižek argues in his early books, totalitarian (as opposed to authoritarian) regimes justified their rule by final reference to quasi-scientific metanarratives. These metanarratives—concerning racial struggle in Nazism, or the Laws of History in Stalinism—each claimed to know the deeper Truth about what subjects want, and accordingly could license the most striking transgressions of ordinary morality while justifying these transgressions by reference to subjects' jouissance. The most disturbing or perverse features of these regimes can only be explained by reference to the key place knowledge held in them. Žižek describes, for instance, the truly Catch-22 logic of the Soviet show trials, in which it was not enough for subjects to be condemned by the authorities as enemies; they were also made to avow their "objective" error in opposing the party as the agent of the laws of history.

Žižek’s statements on today’s liberal capitalism are complex, if not in mutual tension. At times, Žižek tries to formalize the economic generation of surplus value as a meaningfully “hysterical” social arrangement. Yet Žižek predominantly argues, that the market driven consumerism of later capitalist subjects is characterized by a marketing discourse which—like totalitarian ideologies—does not appeal to subjects in the name of any collective cause justifying individuals’ sacrifice of jouissance. Instead, as social conservatives criticize, it musters the quasi-scientific discourses of marketing and public relations, or (increasingly) Eastern religion, in order to recommend products to subjects as necessary means in the liberal pursuit of happiness and self-fulfillment. In line with this change, Žižek contends in The Ticklish Subject that the paradigmatic type of leader today is not some inaccessible boss but the uncannily familiar figure of Bill Gates—more like a little brother than the traditional father or master. Again: for Žižek it is deeply telling that at the same time as the nuclear family is being eroded in the first world, other institutions, from the so-called “nanny” welfare state to private corporations, are increasingly becoming “familiarized” (with self-help sessions for employees, company days, casual days, and so forth).

c. Kettle Logic, or Desire and Theodicy

We saw how Žižek claims that the truth of political ideologies concerns what they do, not what they say (2d). At the level of what political ideologies say, Žižek maintains, a Lacanian critical theory maintains that ideologies must be finally inconsistent. Freud famously talked of the example of a man who returns a borrowed kettle back to its owner broken. The man adduces mutually inconsistent excuses which are united only in terms of his ignoble desire to evade responsibility for breaking the kettle: he never borrowed the kettle, the kettle was already broken when he borrowed it, and when he gave the kettle back it was not really broken anyway. As Žižek reads political ideologies, they function in the same way in the political field—this is the sense of the subtitle of his 2004 Iraq: The Borrowed Kettle. As we saw in 2d, Žižek maintains that the end of political ideologies is to secure and defend the idea of the polity as a wholly unified community. When political strife, uncertainty or division occur, political ideologies and the fundamental fantasies upon which they lean (3a) operate to resignify this political discontent so that the political ideal of community can be sustained, and to deny the possibility that this discontent might signal a fundamental injustice or flaw within the regime. In what amounts to a kind of political theodicy, Žižek’s work points to a number of logically inconsistent ideological responses to political discontents, which are united only by the desire that informs them, like Freud’s “kettle logic”:

  1. Saying that these divisions are politically unimportant, transient or merely apparent.
    Or, if this explanation fails:
  2. Saying that the political divisions are in any case contingent to the ordinary run of events, so that if their cause is removed or destroyed, things will return to normal.
    Or, more perilously:
  3. Saying that the divisions or problems are deserved by the people for the sake of the greater good (in 1990s Australia, for example, "the recession we had to have"), or as punishment for their betrayal of the national Thing.

Žižek’s view of the political functioning of sublime objects of ideology can be charted exactly in terms of this political theodicy. (see 2e) We saw in 3a, how Žižek argues that subjects’ fantasy is what allows them to come to terms with the loss of jouissance fundamental to being social or political animals. Žižek centrally maintains that such narrative attempts at political self-understanding—whether of individuals or political regimes—are ultimately unable to achieve these ends, except at the price of telling inconsistencies.

As Žižek highlights in his analyses of the political discontents in former Yugoslavia following the fall of communism, each national or political community tends to claim that its sublime Thing is inalienable, and hence utterly incapable of being understood or destroyed by enemies. Nevertheless, the invariable correlative of this emphasis on the inalienable nature of one’s Thing, Žižek argues in Tarrying with the Negative (1993), is the notion that It is simultaneously deeply fragile if not under active threat. For Žižek, this mutual inconsistency is only theoretically resolvable if, despite first appearances, we posit a materialist teaching that says that the “substance” seemingly named by political regimes’ key rallying terms (see 2e) is only sustained in their lived communal practices (as we say when someone does not get a joke, “you had to be there”). Yet political ideologies, as such, cannot avow this possibility (see 2d). Instead, ideological fantasies posit various exemplars of a persecutory enemy or, as Žižek says, “the Other of the Other” to whom the explanation of political disunity or discontent can be traced. If only this other or enemy could be removed, the political fantasy contends, the regime would be fully equitable and just. Historical examples of such figures of the enemy include “the Jew” in Nazi ideology, or the “petty bourgeois” in Stalinism.

Again: a type of “kettle logic” applies to the way these enemies are represented in political ideologies, according to Žižek. “The Jew” in Nazi ideology, for example, was an inconsistent condensation of features of both the ruling capitalist class (money grabbing, exploitation of the poor) and of the proletariat (dirtiness, sexual promiscuity, communism). The only consistency this figure has, that is, is precisely as a condensation of everything that Nazi ideology’s Aryan Volksgemeinschaft (roughly, “national community”) was constructed in response and political opposition to.

d. Fantasy as the Fantasy of Origins

In a way that has led some critics (Bellamy, Sharpe) to question how finally political Žižek's political philosophy is, Žižek's critique of ideology ultimately turns on a set of fundamental ontological propositions about the necessary limitations of any linguistic or symbolic system. These propositions concern the widely known paradoxes that bedevil any attempt by a semantic system to explain its own limits, and/or how it came into being. If what preceded the system was radically different from what subsequently emerged, how could the system have emerged from it, and how can the system come to terms with it at all? If we name the limits of what the system can understand, do we not, in that very gesture, presuppose some knowledge of what is beyond these limits, if only enough to say what the system is not? The only manner in which we can explain the origin of language is within language, Žižek notes in For They Know Not What They Do. Yet we hence presuppose, again in the very act of the explanation, the very thing we were hoping to explain. Similarly, to take the example from political philosophy of Hobbes' explanation of the origin of sociopolitical order, the only way we can explain the origin of the social contract is by presupposing that Hobbes' wholly pre-social men nevertheless possessed in some way the very social abilities to communicate and make pacts that Hobbes' position is supposed to explain.

For Žižek, fantasy as such is always fundamentally the fantasy of (one's) origins. In Freud's "Wolf Man" case, to take the psychoanalytic example Žižek cites in For They Know Not What They Do, the primal scene of parental coitus is the Wolf Man's attempt to come to terms with his own origin—or to answer the infant's perennial question "where did I come from?" The problem here is this: who could the spectacle of this primal scene have been staged for or seen by, if it really transpired before the genesis of the subject that it would explain (see 3e, 4e)? The only answer is that the Wolf Man has imaginatively transposed himself, if only as an impassive object-gaze, back into the very scene whose historical occurrence he had hoped would explain his origin as an individual.

Žižek’s argument is that, in the same way, political or ideological systems cannot and do not avoid deep inconsistencies. No less than Machiavelli, Žižek is acutely aware that the act that founds a body of Law is never itself legal, according to the very order of Law it sets in place. He cites Bertolt Brecht: “what is the robbing of a bank, compared to the founding of a bank?” What fantasy does, in this register, is to try to historically renarrativize the founding political act as if it were or had been legal—an impossible application of the Law before the Law had itself come into being. No less than the Wolf Man’s false transposition of himself back into the primal scene that was to explain his origin, Žižek argues that the attempt of any political regime to explain its own origins in a political myth that denies the fundamental, extralegal violence of these origins is fundamentally false. (Žižek uses the example of the liberal myth of primitive accumulation to illustrate his position in For They Know Not What They Do, but we could cite here Plato’s myth of the reversed cosmos in the Laws and Statesman, or historical cases like the idea of terra nullius in colonial Australia).

e. Exemplification: the Fall and Radical Evil (Žižek’s Critique of Kant)

In a series of places, Žižek situates his ontological position in terms of a striking reading of Immanuel Kant's practical philosophy. Žižek argues that in "Religion within the Limits of Reason Alone" Kant showed that he was aware of these paradoxes that necessarily attend any attempt to narrate the origins of the Law. The Judeo-Christian myth of the fall succumbs to precisely these paradoxes, as Kant analyses: if Adam and Eve were purely innocent, how could they have been tempted? If their temptation was wholly the fault of the tempter, why then has God punished humans with the weight of original sin? And if Adam and Eve were not purely innocent when the snake lured them, in what sense was this a fall at all? According to Žižek, Kant's text also provides us with theoretical parameters which allow us to explain and avoid these paradoxes. The problems for the mythical narrative, Kant argues, hail from its nature as a narrative—or how it tries to render in a historical story what he argues is truly a logical or transcendental priority. For Kant, human beings are, as such, radically evil. They have always already chosen to assert their own self-conceit above the moral Law. This choice of radical evil, however, is not itself a historical choice either for individuals or for the species, for Kant. This choice is what underlies and opens up the space for all such historical choices. However, as Žižek argues, Kant withdraws from the strictly diabolical implications of this position. The key place in which this withdrawal is enacted is in the postulates of The Critique of Practical Reason, wherein Kant defends the immortality of the soul as a likely story on the basis of our moral experience. Because of radical evil, Kant argues, it is impossible for humans to ever act purely out of duty in this life—this is what Kant thinks our irremovable sense of moral guilt attests. But because people can never act purely out of duty in this life, Kant suggests, it is surely reasonable to hope and even to postulate that the soul lives on after death, striving ever closer towards the perfection of its will.

Žižek’s contention is that this argument does not prove the immortality of a disembodied soul. It proves the immortality of an embodied individual soul, always struggling guiltily against its selfish corporeal impulses (this, incidentally, is one reason why Žižek argues, after Lacan, that de Sade is the truth of Kant). In order to make his proof even plausible, Žižek notes, Kant has to tacitly smuggle the spatiotemporal parameters of embodied earthly existence into the postulated hereafter so that the guilty subject can continue endlessly to struggle against his radically evil nature towards good. In this way, though, Kant himself has to speak as if he knew what things are like on the other side of death—which is to say, from the impossible, because impossibly neutral, perspective of someone able to impassively see the spectacle of the immortal subject striving guiltily towards the good (see 4d). But in this way, also, Žižek argues that Kant enacts exactly the type of fantasmatic operation his reading of the fall (as a) narrative declaims, and which represents in nuce the basis operation also of all political ideologies.

4. From Ontology to Ethics—Žižek’s Reclaiming of the Subject

a. Žižek’s Subject, Fantasy, and the Objet Petit a

Perhaps Žižek’s most radical challenge to accepted theoretical opinion is his defense of the modern, Cartesian subject. Žižek knowingly and polemically positions his writings against virtually all other contemporary theorists, with the significant exception of Alain Badiou. Yet for Žižek, the Cartesian subject is not reducible to the fully self-assured “master and possessor of nature” of Descartes’ Discourses. It is what Žižek calls in “Kant With (Or Against) Kant,” an out of joint ontological excess or clinamen. Žižek takes his bearings here as elsewhere from a Lacanian reading of Kant, and the latter’s critique of Descartes’ cogito ergo sum. In the “Transcendental Dialectic” in The Critique of Pure Reason, Kant criticized Descartes’ argument that the self-guaranteeing “I think” of the cogito must be a thinking thing (res cogitans). For Kant (as for Žižek), while the “I think” must be capable of accompanying all of the subject’s perceptions, this does not mean that it is itself such a substantial object. The subject that sees objects in the world cannot see itself seeing, Žižek notes, any more than a person can jump over her own shadow. To the extent that a subject can reflectively see itself, it sees itself not as a subject but as one more represented object, what Kant calls the “empirical self” or what Žižek calls the “self” (versus the subject) in The Plague of Fantasies. The subject knows that it is something, Žižek argues. But it does not and can never know what Thing it is “in the Real”, as he puts it (see 2e). This is why it must seek clues to its identity in its social and political life, asking the question of others (and of the big Other (see 2b)) which Žižek argues defines the subject as such: che voui? (what do you want from me?). In Tarrying With the Negative, Žižek hence reads the Director’s Cut of Ridley Scott’s Bladerunner as revelatory of the Truth of the subject. Within this version of the film, as Žižek emphasizes, the main character Deckard literally does not know what he is—a robot that perceives itself to be human. According to Žižek, the subject is a “crack” in the universal field or substance of being, not a knowable thing (see 4d). This is why Žižek repeatedly cites in his books the disturbing passage from the young Hegel describing the modern subject not as the “light” of the modern enlightenment, but “this night, this empty nothing …”

It is crucial to Žižek's position, though, that Žižek denies the apparent implication that the subject is some kind of supersensible entity—an immaterial and immortal soul, for example. The subject is not a special type of Thing outside of the phenomenal reality we can experience, for Žižek. As we saw in 1e above, such an idea would in fact reproduce in philosophy the type of thinking which, he argues, characterizes political ideologies and the subject's fundamental fantasy (see 3a). The subject is more like a fold or crease in the surface of this reality, as Žižek puts it in Tarrying With the Negative, the point within the substance of reality wherein that substance is able to look at itself, and see itself as alien to itself. According to Žižek, Hegel and Lacan add a caveat to Kant's reading of the subject as the empty "I think" that accompanies any individual's experience: because objects thus appear to a subject, they always appear in an incomplete or biased way. Žižek's "formula" of the fundamental fantasy (see 2a, 2d) "$ <> a" tries to formalize exactly this thought. Its meaning is that the subject ($), in its fundamental fantasy, misrecognizes itself as a special object (the objet petit a or lost object (see 2a)) within the field of objects that it perceives. In terms which unite this psychoanalytic notion with Žižek's political philosophy, we can say that the objet petit a is exactly a sublime object (2e). It is an object that is elevated or, in Freudian terms, "sublimated" by the subject to the point where it stands as a metonymic representative of the jouissance the subject unconsciously fantasizes was taken from her/him at castration (3a). It hence functions as the object-cause of the subject's desire, that exceptional "little piece of the Real" that s/he seeks out in all of her/his love relationships. Its psychoanalytic paradigms are, to cite the title of a collection Žižek edited, "the voice and gaze as love objects". Examples of the voice as objet petit a include the persecutor's voice in paranoia, or the very silence that some TV advertisements now use, and which captures our attention by making us wonder whether we may not have missed something. The preeminent Lacanian illustration of the gaze as objet petit a is the anamorphic skull at the foot of Holbein's The Ambassadors, which can only be seen by a subject who looks at it awry, or from an angle. Importantly, then, neither the voice nor the gaze as objet petit a attests to the subject's sovereign ability to wholly objectify (and hence control) the world it surveys. In the auditory and visual fields (respectively), the voice and the gaze as objet petit a represent objects that, like Kant's sublime Things, the subject cannot wholly get its head around, as we say. The fact that they can only be seen or heard from particular perspectives indicates exactly how the subject's biased perspective—and so his/her desire, what s/he wants—has an effect on what s/he is able to see. They thereby bear witness to how s/he is not wholly outside of the reality s/he sees. The most mundane but telling example of this subjective objet petit a is someone in love, of whom we commonly say that they are able to see in their lover something special, an "X factor," to which others are utterly blind. In the political field, similarly—and as we saw in part 2c—subjects of a particular political community will claim that others cannot understand their regime's sublime objects.
Indeed, as Žižek comments about the resurgence of racism across the first world today, it is often precisely the strangeness of others’ particular ethnic or national Things that animates subjects’ hatred towards them.

b. The Objet Petit a & the Virtuality of Reality

In Žižek’s theory, the objet petit a stands as the exact opposite of the object of the modern sciences, that can only be seen clearly and distinctly if it is approached wholly impersonally. If the objet petit a is not looked at from a particular, subjective perspective—or, in the words of one of Žižek’s titles, by “looking awry” —it cannot be seen at all. This is why Žižek believes this psychoanalytic notion can be used to structure our understanding of the sublime objects postulated by ideologies in the political field, which as we saw in 3c show themselves to be finally inconsistent when they are looked at dispassionately. What Žižek’s Lacanian critique of ideology aims to do is to demonstrate such inconsistencies, and thereby to show us that the objects most central to our political beliefs are Things whose very sublime appearance conceals from us our active agency in constructing and sustaining them. (We will return to this thought in 4d and 4e below).

Žižek argues that the first place that the objet petit a appeared in the history of Western philosophy was with Kant’s notion of the transcendental object in The Critique of Pure Reason. Analyzing this Kantian notion allows us to elaborate more precisely the ontological status of the objet petit a. Kant defines the transcendental object as “the completely indeterminate thought of an object in general.” Like the objet petit a, then, Kant’s transcendental object is not a normal phenomenal object, although it has a very specific function in Kant’s epistemological conception of the subject. The avowedly anti-Humean function of this Kantian positing in the “Transcendental Deduction” is to ensure that the purely formal categories of the subject’s understanding can actually affect and indeed structure the manifold of the subject’s sensuous intuition. As Žižek stresses, that is, the transcendental object functions in Kant’s epistemology to guarantee that sense will continue to emerge for the subject, no matter what particular objects s/he might encounter.

We saw in 3c how Žižek argues that ideologies adduce ultimately inconsistent reasons to support the same goal of political unity. According to Žižek, as we can now elaborate, this is because the deepest political function of sublime objects of ideology is to ensure that the political world will make sense for subjects no matter what events transpire, in a way that he directly compares with Kant's transcendental object. No matter what evidence someone might produce that Jewish people are not acquisitive, capitalist, or cunning, for example, a true Nazi will be able to immediately resignify this evidence by reference to his ideological notion of "the Jew": "surely it is part of their cunning to appear as though they are not truly cunning," and so forth. Importantly, it follows for Žižek that political community is always, in its very structure, an anticipated community. Subjects' sense of political belonging is always mediated, according to him, by their shared belief in their regime's key words or master signifiers. But these are words whose only "meaning" lies finally in their function, which is to guarantee that there will (continue to) be meaning. There is, Žižek argues, ultimately no actual, Real Thing, superior to the other real things subjects encounter, that these words name (2e). It is only by acting as if there were such a Thing that community is maintained. This is why Žižek specifies in The Indivisible Remainder that political identification can only be, "at its most basic, identification with the very gesture of identification":

…the coordination [between subjects in a political community] concerns not the level of the signified [of some positive shared concern] but the level of the signifier. [In political ideologies], undecidability with regard to the signified (do others really intend the same as me?) converts into an exceptional signifier, the empty Master-Signifier, the signifier-without-signified. ‘Nation’, ‘Democracy’, ‘Socialism’ and other Causes stand for that ‘something’ about which we are never sure what, exactly, it is – the point is, rather, that identifying with the Nation we signal our acceptance of what others accept, with a Master-Signifier which serves as the rallying point for all the others. (Žižek, 1996: 142)

This is the sense also in which Žižek claims in The Plague of Fantasies that today's virtual reality is "not virtual enough." It is not virtual enough because the many transgressive or exotic possibilities it offers subjects to enjoy (jouir) are made directly available rather than left merely imagined: VR leaves nothing to the imagination or, in Žižek's Lacanian terms, to fantasy. Fantasy, as we saw in 2a, operates to structure subjects' beliefs about the jouissance which must remain only the stuff of imagination, purely "virtual," for subjects of the social law. For Žižek, then, it is identification with this law, as mediated via subjects' anticipatory identifications with what they suppose others believe, that involves true virtuality.

c. Forced Choice & Ideological Tautologies

As 4b confirms (and as we commented in 1c), Žižek's political philosophy turns around the idea that the central words of political ideologies are at base "signifiers without signified," words that only appear to refer to exceptional Things, and which thereby facilitate identification between subjects. As Žižek argues, these sublime objects of ideology have exactly the ontological status of what Kant called "transcendental illusions"—illusions whose semblance conceals that there is nothing behind them to conceal. Ideological subjects do not know what they do when they believe in them, Žižek contends. Yet, through the presupposition that the Other(s) know (2c), and their participation in the practices involving inherent transgression of their political community (2c), they "identify with the very gesture of identification" (4b). Hence, their belief, coupled with these practices, is politically efficient.

One of Žižek’s most difficult, but also deepest, claims is that the particular sublime objects of ideology with which subjects identify in different regimes (the Nation, the People, and so forth) each give particular form to a meta-law (law about all other laws) that binds any political community as such. This is the meta-law that says simply that subjects must obey all the other laws. In 2b above, we saw how Žižek holds that political ideologies must allow subjects the sense of subjective distance from their explicit directives. Žižek’s critical position is that this apparent freedom ideologies thereby allow subjects is finally a lure. Like the choice offered Yossarian by the “catch 22” of Joseph Heller’s novel, the only option truly available to political subjects is to continue to abide by the laws. No regime can survive if it waives this meta-law. The Sublime Object of Ideology hence cites with approval Kafka’s comment that it is not required that subjects think the law is just, only that it is necessary. Yet no regime, despite Kafka, can directly avow its own basis in such naked self-assertion without risking the loss of all legitimacy, Žižek agrees with Plato. This is why it must ground itself in ideological fantasies (3a) which at once sustain subjects’ sense of individual freedom (2c) and the sense that the regime itself is grounded extra-politically in the Real, and some transcendent, higher Good (2e).

This thought underlies the importance Žižek accords in For They Know Not What They Do to Hegel's difficult notion of tautology as the highest instance of contradiction in The Science of Logic. If you push a subject hard enough about why they abide by the laws of their regime, Žižek holds that their responses will inevitably devolve into some logical variant of Exodus 3:14's "I am that I am": statements of the form "because the Law (God / the People / the Nation) is … the Law (God / the People / the Nation)". In such tautological statements, our expectation that the predicates in the second half of the sentence will add something new to the (logical) subject given at its beginning is "contradicted," Hegel argues. There is indeed something sinister when someone utters such a sentence in response to our enquiries, Žižek notes—as if, when (for example) "the Law" is repeated dumbly as its own predicate ("because the law is the law"), it intimates the uncanny dimension of jouissance the law as ego ideal usually proscribes (3a). What this uncanny effect of sense attests to, Žižek argues in For They Know Not What They Do, is the usually "primordially repressed" force of the universal meta-law (that everyone must obey the laws) being expressed in the different, particular languages of political regimes: "because the People are the People," "because the Nation is the Nation", and so forth.

Žižek’s ideology critique hence contends that all political regimes’ ideologies always devolve finally around a set of such tautological propositions concerning their particular sublime objects. In The Sublime Object of Ideology, Žižek gives the example of a key Stalinist proposition: “the people always support the party.” On its surface, this proposition looks like a proposition that asserts something about the world, and which might be susceptible of disproof: perhaps there are some Soviet citizens who do not support the party, or who disagree with this or that of the party’s policies. What such an approach misses, however, is how in this ideology, what is referred to as “the people” in fact means “all those who support the party.” In Stalinism, that is, “the party” is the fetishized particular that stands for the people’s true interests (see 1e). Hence, the sentence “the people always support the party” is a concealed form of tautology. Any apparent people who in fact do not support the party by that fact alone are no longer “people” within Stalinist ideology.

d. The Substance is Subject, the Other Does Not Exist

In 4b, we saw how Žižek argues that political identification is identification with the gesture of identification. In 4c, we saw how the ultimate foundation of a regime's laws is a tautologous assertion of the bare political fact that there is law. What unites these two positions is the idea that the sublime objects of a political regime and the ideological fantasies that give narratives about their content conceal from subjects the absence of any final ground for Law beyond the fact of its own assertion, and the fact that subjects take it to be authoritative. Here as elsewhere, Žižek's work surprisingly approaches leading motifs in the political philosophy of Carl Schmitt.

Importantly, once this position is stated, we can also begin to see how Žižek's post-Marxist project of a critique of ideology intersects with his philosophical defense of the Cartesian subject. At several points in his oeuvre, Žižek cites Hegel's statement in the "Preface" to the Phenomenology of Spirit that "the substance is subject" as a rubric that describes the core of his own political philosophy. According to Žižek, critics have misread this statement by taking it to repeat the founding, triumphalist idea of modern subjectivity as such—namely, that the subject can master all of nature or "substance." Žižek contends, controversially, that Hegel's claim ought to be read in a directly opposing sense. For him, it indicates the truth that there can be no dominant political regime or, in Hegel's terms, no "social substance" that does not depend for its authority upon the active, indeed finally anticipatory (4c) investment of subjects in it. Like the malign computer machines in The Matrix that literally run off the human jouissance they drain from deluded subjects, for Žižek the big Other of any political regime does not exist as a self-sustaining substance. It must ceaselessly run on the belief and actions of its subjects, and their jouissance (2c)—or, to recur to the example we looked at in 2d, the King will not be the King, for Žižek, unless he has his subjects. It is certainly telling that the leading examples of ideological tautology For They Know Not What They Do discusses invoke precisely some subject's will or decision, as when a parent says to a child "do this … because I said so," or when people do something "… because the King said so," which means that no more questions can be asked.

In 4a, we saw how Žižek denies that the subject, because it is not itself a perceptible object, belongs to an order of being wholly outside of the order of experience. To elevate such a wholly Other order would, he argues, reproduce the elementary operation of the fundamental fantasy. We can now add to this thought the further position that the Cartesian subject, according to Žižek, is finally nothing other than the irreducible point of active agency responsible for the always minimally precipitous political gesture of laying down a regime's law. For Žižek, accordingly, the critical question to be asked of any theoretical or political position that posits some exceptional Beyond—as we saw in his reading of Kant (2e)—is: from which subject-position do you speak when you claim a knowledge of this Beyond? As we saw in 2e, Žižek's Lacanian answer is that the perspective that one always presupposes when one speaks in this manner is one that is always "superegoic" (see 2a)—tied to what he terms in Metastases of Enjoyment a "malevolently neutral" God's eye view from nowhere. It is deeply revealing, from Žižek's perspective, that the very perspective which allows the Kantian subject in the "dynamic sublime" to resignify its own finitude as itself a source of pleasure-in-pain (jouissance) is precisely one which identifies with the supersensible moral Law, before which the sensuous subject remains irredeemably guilty, infinitely striving to pay off its moral debt. As Žižek cites Hegel's Phenomenology of Spirit:

It is manifest that beyond the so-called curtain [of phenomena] which is supposed to conceal the inner world, there is nothing to be seen unless we go behind it ourselves as much in order that we may see, as that there may be something behind there which can be seen. (Žižek, 1989: 196, emphasis added)

In other words, Žižek's final position about the sublime objects of political regimes' ideologies is that these belief-inspiring objects are so many ways in which the subject misrecognizes its own active capacity to challenge existing laws, and to found new laws altogether. Žižek repeatedly argues that the most uncanny or abyssal Thing in the world is the subject's own active subjectivity—which is why he also repeatedly cites the Eastern saying that "Thou art that." It is finally the singularity of the subject's own active agency that subjects misperceive in fantasies concerning the sublime objects of their regimes' ideologies, in the face of which they can do nothing but reverentially abide by the rules. In this way, it is worth noting, Žižek's work can claim a heritage not only from Hegel, but also from the Left Hegelians and from Marx's and Feuerbach's critiques of religion.

e. The Ethical Act Traversing the Fantasy

Žižek’s technical term for the process whereby we can come to recognize how the sublime objects of our political regimes’ ideologies are, like Marx’s commodities, fetish objects that conceal from subjects their own political agency is “traversing of the fantasy.” Traversing the fantasy, for Žižek, is at once the political subject’s deepest form of self-recognition, and the basis for his own radical political position or defense of the possibility of such positions. Žižek’s entire theoretical work directs us towards this “traversing of the fantasy” in the many different fields on which he has written, and despite the widespread consensus at the beginning of the new century that fundamental political change is no longer possible or desirable.

Insofar as political ideologies, for Žižek as for Althusser (see 2c), remain viable only because of the ongoing practices and beliefs of political subjects, this traversal of fantasy must always involve an active, practical intervention in the political world, which changes a regime's political institutions. As for Kant, so for Žižek, the practical bearing of critical reason comes first, in his critique of ideology, and last, in his advocacy of the possibility of political change. Žižek hence also repeatedly speaks of traversing the fantasy in terms of an "Act" (capital "A"), which differs from normal human speech and action. Everyday speech and action typically do not challenge the framing sociopolitical parameters within which they take place, Žižek observes. By contrast, what he means by an Act is an action which "touches the Real" (as he says) of what a sociopolitical regime has politically repressed or washed its hands of, and which it cannot publicly avow without risking fundamental political damage (see 2c). In this way, the Žižekian Act extends and changes the very political and ideological parameters of what is permitted within a regime, in the hope of bringing into being new parameters in the light of which its own justice will be able to be retrospectively seen. This is the point of significant parallel with Alain Badiou's work, whose influence Žižek has increasingly avowed in his more recent books. Notably, as Žižek specifies in The Indivisible Remainder, the Act effectively repeats the very act that he claims founds all political regimes as such, namely, the excessive, law-founding gesture we examined in 4c. Just as the current political regime originated in a founding gesture excessive with regard to the laws it set in place, Žižek argues, so too can this political regime itself be superseded, and a new one replace it. In his reading of Walter Benjamin's "Theses on the Philosophy of History" in The Sublime Object of Ideology, Žižek indeed argues that such a new Act also effectively repeats all previous, failed attempts at changing an existing political regime, which otherwise would be consigned forever to historical oblivion.

5. Conclusion

Slavoj Žižek’s work represents a striking challenge within the contemporary philosophical scene. Žižek’s very style, and his prodigious ability to write and examine examples from widely divergent fields, is a remarkable thing. His work reintroduces and reinvigorates for a wider audience ideas from the works of German Idealism. Žižek’s work is framed in terms of a polemical critique of other leading theorists within today’s new left or liberal academy (Derrida, Habermas, Deleuze), which claims to unmask their apparent radicality as concealing a shared recoil from the possibility of a subjective, political Act which in fact sits comfortably with a passive resignation to today’s political status quo. Not the least interesting feature of his work, politically, is indeed how Žižek’s critique of the new left both significantly mirrors criticisms from conservative and neoconservative authors, yet hails from an avowedly opposed political perspective. In political philosophy, Žižek’s Lacanian theory of ideology presents a radically new descriptive perspective that affords us a unique purchase on many of the paradoxes of liberal consumerist subjectivity, which is at once politically cynical (as the political right laments) and politically conformist (as the political left struggles to come to terms with). Prescriptively, Žižek’s work challenges us to ask questions about the possibility of sociopolitical change that have otherwise rarely been asked after 1989, including: what forms such changes might take?; and what might justify them or make them possible?

Looked at in a longer perspective, it is of course too soon to judge what the lasting effects of Žižek's philosophy will be, especially given Žižek's own comparative youth as a thinker (Žižek was born in 1949). In terms of the history of ideas, in particular, while Žižek's thought certainly turns many of today's widely accepted theoretical notions on their heads, it remains an open question whether his work represents a lasting break with the parameters that Kant's critical philosophy set out in the three Critiques.

6. References and Further Reading

a. Primary Literature (Books by Žižek)

  • Iraq: The Borrowed Kettle, New York: Verso, 2004.
  • Organs Without Bodies: On Deleuze and Consequences, New York; London: Routledge, 2003.
  • The Puppet and the Dwarf, New York: Routledge, 2003.
  • Did Somebody Say Totalitarianism? Five Essays on the (Mis)Use of a Notion, London; New York: Verso, 2001.
  • The Fright of Real Tears: Kieslowski and the Future, Bloomington: Indiana University Press, 2001.
  • On Belief, London: Routledge, 2001.
  • The Fragile Absolute, or Why the Christian Legacy Is Worth Fighting For, London; New York: Verso, 2000.
  • The Art of the Ridiculous Sublime: On David Lynch's Lost Highway, Walter Chapin Center for the Humanities: University of Washington, 2000.
  • Contingency, Hegemony, Universality: Contemporary Dialogues on the Left, Judith Butler, Ernesto Laclau, and SZ, London; New York: Verso, 2000.
  • Enjoy Your Symptom! Jacques Lacan in Hollywood and Out, second expanded edition, New York: Routledge, 2000.
  • The Ticklish Subject: The Absent Centre of Political Ontology, London; New York: Verso, 1999.
  • The Abyss of Freedom / Ages of the World, with F.W.J. von Schelling, Ann Arbor: University of Michigan Press, 1997.
  • The Plague of Fantasies, London; New York: Verso, 1997.
  • Gaze and Voice as Love Objects, Renata Salecl and SZ, editors, Durham: Duke University Press, 1996.
  • The Indivisible Remainder: An Essay on Schelling and Related Matters, London; New York: Verso, 1996.
  • The Metastases of Enjoyment: Six Essays on Woman and Causality (Wo Es War), London; New York: Verso, 1994.
  • Mapping Ideology, SZ, editor, London; New York: Verso, 1994.
  • Tarrying with the Negative: Kant, Hegel, and the Critique of Ideology, Durham: Duke University Press, 1993.
  • Enjoy Your Symptom! Jacques Lacan in Hollywood and Out, London; New York: Routledge, 1992.
  • Everything You Always Wanted to Know about Lacan (But Were Afraid to Ask Hitchcock), SZ, editor, London; New York: Verso, 1992.
  • Looking Awry: An Introduction to Jacques Lacan through Popular Culture, Cambridge, Mass.: MIT Press, 1991.
  • For They Know Not What They Do: Enjoyment as a Political Factor, London; New York: Verso, 1991.
  • The Sublime Object of Ideology, London; New York: Verso, 1989.

b. Secondary Literature (Texts on Žižek)

  • Slavoj Žižek: A Little Piece of the Real, Matthew Sharpe, Hants: Ashgate, 2004.
  • Slavoj Žižek: A Critical Introduction, Ian Parker, London: Pluto Press, 2004.
  • Slavoj Žižek: Live Theory, Rex Butler, London: Continuum, 2004.
  • Žižek: A Critical Introduction, Sarah Kay, London: Polity, 2003.
  • Slavoj Žižek (Routledge Critical Thinkers), Tony Myers, London: Routledge, 2003.

 

Author Information

Matthew Sharpe
Email: matthew.sharpe@dewr.gov.au

Australia

Karl Popper: Philosophy of Science

Karl Popper (1902-1994) was one of the most influential philosophers of science of the 20th century. He made significant contributions to debates concerning general scientific methodology and theory choice, the demarcation of science from non-science, the nature of probability and quantum mechanics, and the methodology of the social sciences. His work is notable for its wide influence within the philosophy of science, within science itself, and within a broader social context.

Popper’s early work attempts to solve the problem of demarcation and offer a clear criterion that distinguishes scientific theories from metaphysical or mythological claims. Popper’s falsificationist methodology holds that scientific theories are characterized by entailing predictions that future observations might reveal to be false. When theories are falsified by such observations, scientists can respond by revising the theory, or by rejecting the theory in favor of a rival or by maintaining the theory as is and changing an auxiliary hypothesis. In either case, however, this process must aim at the production of new, falsifiable predictions, while Popper recognizes that scientists can and do hold onto theories in the face of failed predictions when there are no predictively superior rivals to turn to. He holds that scientific practice is characterized by its continual effort to test theories against experience and make revisions based on the outcomes of these tests. By contrast, theories that are permanently immunized from falsification by the introduction of untestable ad hoc hypotheses can no longer be classified as scientific. Among other things, Popper argues that his falsificationist proposal allows for a solution of the problem of induction, since inductive reasoning plays no role in his account of theory choice.

Along with his general proposals regarding falsification and scientific methodology, Popper is notable for his work on probability and quantum mechanics and on the methodology of the social sciences. Popper defends a propensity theory of probability, according to which probabilities are interpreted as objective, mind-independent properties of experimental setups. Popper then uses this theory to provide a realist interpretation of quantum mechanics, though its applicability goes beyond this specific case. With respect to the social sciences, Popper argued against the historicist attempt to formulate universal laws covering the whole of human history and instead argued in favor of methodological individualism and situational logic.

Table of Contents

  1. Background
  2. Falsification and the Criterion of Demarcation
    1. Popper on Physics and Psychoanalysis
    2. Auxiliary and Ad Hoc Hypotheses
    3. Basic Sentences and the Role of Convention
    4. Induction, Corroboration, and Verisimilitude
  3. Criticisms of Falsificationism
  4. Realism, Quantum Mechanics, and Probability
  5. Methodology in the Social Sciences
  6. Popper’s Legacy
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Background

Popper began his academic studies at the University of Vienna in 1918, and he focused on both mathematics and theoretical physics. In 1928, he received a PhD in Philosophy. His dissertation, On the Problem of Method in the Psychology of Thinking, dealt primarily with the psychology of thought and discovery. Popper later reported that it was while writing this dissertation that he came to recognize “the priority of the study of logic over the study of subjective thought processes” (1976, p. 86), a sentiment that would be a primary focus in his more mature work in the philosophy of science.

In 1935, Popper published Logik der Forschung (The Logic of Research), his first major work in the philosophy of science. Popper later translated the book into English and published it under the title The Logic of Scientific Discovery (1959). In the book, Popper offered his first detailed account of scientific methodology and of the importance of falsification. Many of the arguments in this book, as well as throughout his early work, are directed against members of the so-called "Vienna Circle," such as Moritz Schlick, Otto Neurath, Rudolf Carnap, Hans Reichenbach, Carl Hempel, and Herbert Feigl, among others. Popper shared these thinkers' concern with general issues of scientific methodology, and he sympathized with their distrust of traditional philosophical methodology. His proposed solutions to the problems arising from these concerns, however, were significantly different from those favored by the Vienna Circle.

Popper stayed in Vienna until 1937, when he took a teaching position at Canterbury University College in Christchurch, New Zealand, and he stayed there throughout World War II. His major works on the philosophy of science from this period include the articles that would eventually make up The Poverty of Historicism (1957). In these articles, he offered a highly critical analysis of the methodology of the social sciences, in particular, of attempts by social scientists to formulate predictive, explanatory laws.

In 1946, Popper took a teaching position at the London School of Economics, where he stayed until he retired in 1969. While there, he continued to work on a variety of issues relating to the philosophy of science, including quantum mechanics, entropy, evolution, and the realism vs. anti-realism debate, along with the issues already mentioned. His major works from this period include “The Propensity Interpretation of Probability” (1959) and Conjectures and Refutations (1963). He continued to publish until shortly before his death in 1994. In The Philosophy of Karl Popper (1974), Popper offers responses to many of his most important critics and provides clarifications of his mature views. His intellectual autobiography Unended Quest (1976) gives a detailed account of Popper’s evolving views, especially as they relate to the philosophy of science.

2. Falsification and the Criterion of Demarcation

Much of Popper’s early work in the philosophy of science focuses on what he calls the problem of demarcation, or the problem of distinguishing scientific (or empirical) theories from non-scientific theories. In particular, Popper aims to capture the logical or methodological differences between scientific disciplines, such as physics, and non-scientific disciplines, such as myth-making, philosophical metaphysics, Freudian psychoanalysis, and Marxist social criticism.

Popper’s proposals concerning demarcation can be usefully seen as a response to the verifiability criterion of demarcation proposed by logical empiricists, such as Carnap and Schlick. According to this criterion, a statement is cognitively meaningful if and only if it is, in principle, possible to verify. This criterion is intended to, among other things, capture the idea that the claims of empirical science are meaningful in a way that the claims of traditional philosophical metaphysics are not. For example, this criterion entails that claims about the locations of mid-sized objects are meaningful, since one can, in principle, verify them by going to the appropriate location. By contrast, claims about the fundamental nature of causation are not meaningful.

While Popper shares the belief that there is a qualitative difference between science and philosophical metaphysics, he rejects the verifiability criterion for several reasons. First, it counts existential statements (like "unicorns exist") as scientific, even though there is no way of definitively showing that they are false. After all, the mere fact that one has failed to see a unicorn in a particular place does not establish that unicorns could not be observed in some other place. Second, it inappropriately counts universal statements (like "all swans are white") as meaningless simply because they can never be conclusively verified. These sorts of universal claims, though, are common within science, and certain observations (like the observation of a black swan) can clearly show them to be false. Finally, the verifiability criterion is by its own lights not meaningful, since it cannot itself be verified.

Partially in response to worries such as these, the logical empiricists' later work abandons the verifiability criterion of meaning and instead emphasizes the importance of the empirical confirmation of scientific theories. Popper, however, argues that verification and confirmation can play no role in formulating a satisfactory criterion of demarcation. Instead, Popper proposes that scientific theories are characterized by being bold in two related ways. First, scientific theories regularly disagree with accepted views of the world based on common sense or previous theoretical commitments. To an uneducated observer, for example, it may seem obvious that Earth is stationary, while the sun moves rapidly around it. However, Copernicus posited that Earth in fact revolved around the sun. In a similar way, it does not seem as though a tree and a human share a common ancestor, but this is what Darwin's theory of evolution by natural selection claims. As Popper notes, however, this sort of boldness is not unique to scientific theories, since most mythological and metaphysical theories also make bold, counterintuitive claims about the nature of reality. For example, the accounts of world creation provided by various religions would count as bold in this sense, but this does not mean that they thereby count as scientific theories.

With this in mind, he goes on to argue that scientific theories are distinguished from non-scientific theories by a second sort of boldness: they make testable claims that future observations might reveal to be false. This boldness thus amounts to a willingness to take a risk of being wrong. On Popper's view, scientists investigating a theory make repeated, honest attempts to falsify the theory, whereas adherents of pseudoscientific or metaphysical theories routinely take measures to make the observed reality fit the predictions of the theory. Popper describes his proposal as follows:

Thus my proposal was, and is, that it is this second boldness, together with the readiness to look for tests and refutations, which distinguished “empirical” science from non-science, and especially from pre-scientific myths and metaphysics (1974, pp. 980-981)

In other places, Popper calls attention to the fact that scientific theories are characterized by possessing potential falsifiers—that is, that they make claims about the world that might be discovered to be false. If these claims are, in fact, found to be false, then the theory as a whole is said to be falsified. Non-scientific theories, by contrast, do not have any such potential falsifiers—there is literally no possible observation that could serve to falsify these theories.

Popper’s falsificationist proposal differs from the verifiability criterion in several important ways. First, Popper does not hold that non-scientific claims are meaningless. Instead, he argues that such unfalsifiable claims can often serve important roles in both scientific and philosophical contexts, even if we are incapable of ascertaining their truth or falsity. Second, while Popper is a realist who holds that scientific theories aim at the truth (see Section 4), he does not think that empirical evidence can ever provide us grounds for believing that a theory is either true or likely to be true. In this sense, Popper is a fallibilist who holds that while the particular unfalsified theory we have adopted might be true, we could never know this to be the case. For these same reasons, Popper holds that it is impossible to provide justification for one’s belief that a particular scientific theory is true. Finally, where others see science progressing by confirming the truth of various particular claims, Popper describes science as progressing on an evolutionary model, with observations selecting against unfit theories by falsifying them.

a. Popper on Physics and Psychoanalysis

In order to see how falsificationism works in practice, it will help to consider one of Popper’s most memorable examples: the contrast between Einstein’s theory of general relativity and the theories of psychoanalysis defended by Sigmund Freud and Alfred Adler. We might roughly summarize the theories as follows:

General relativity (GR): Einstein’s theory of special relativity posits that the observed speed of light in a vacuum will be the same for all observers, regardless of which direction or at what velocity these observers are themselves moving. GR allows this theory to be applied to cases where acceleration or gravity plays a role, specifically by treating gravity as a sort of distortion or bend in space-time created by massive objects.

Psychoanalysis: The theory of psychoanalysis holds that human behavior is driven at least in part by unconscious desires and motives. For example, Freud posited the existence of the id, an unconscious part of the human psyche that aims toward gratifying instinctive desires, regardless of whether this is rational. However, the desires of the id might be mediated or superseded in certain circumstances by its interaction with both the self-interested ego and the moral superego.

As we can see, both theories make bold, counter-intuitive claims about the fundamental nature of reality. Moreover, both theories can account for previously observed phenomena; for example, GR allows for an accurate description of the observed advance of the perihelion of Mercury, while psychoanalysis entails that it is possible for people to consistently act in ways that are against their own long-term best interest. Finally, both of these theories enjoyed significant academic support when Popper was first writing about these issues.

Popper argues, however, that GR is scientific while psychoanalysis is not. The reason for this has to do with the testability of Einstein’s theory. As a young man, Popper was especially impressed by Arthur Eddington’s 1919 test of GR, which involved observing during a solar eclipse the degree to which the light from distant stars was shifted when passing by the sun. Importantly, the predictions of GR regarding the magnitude of this shift disagreed with those of the then-dominant theory of Newtonian mechanics. Eddington’s observation thus served as a crucial experiment for deciding between the theories, since it was impossible for both theories to give accurate predictions. Of necessity, at least one theory would be falsified by the experiment, which would provide strong reason for scientists to accept its unfalsified rival. On Popper’s view, the continual effort by scientists to design and carry out these sorts of potentially falsifying experiments played a central role in theory choice and clearly distinguished scientific theorizing from other sorts of activities. Popper also takes care to note that insofar as GR was not a unified field theory, there was no question of GR’s being the complete truth, as Einstein himself repeatedly emphasized. The scientific status of GR, then, had nothing to do with either (1) the truth of GR as a general theory of physics (the theory was already known to be false) or (2) the confirmation of GR by evidence (one cannot confirm a false theory).

In contrast to such paradigmatically scientific theories as GR, Popper argues that non-scientific theories such as Freudian psychoanalysis do not make any predictions that might allow them to be falsified. The reason for this is that these theories are compatible with every possible observation. On Popper’s view, psychoanalysis simply does not provide us with adequate details to rule out any possible human behavior. Absent these sorts of precise predictions, the theory can be made to fit with, and to provide a purported explanation of, any observed behavior whatsoever.

To illustrate this point, Popper offers the example of two men, one who pushes a child into the water with the intent of drowning it, and another who dives into the water in order to save the child. Popper notes that psychoanalysis can explain both of these seemingly contradictory actions. In the first case, the psychoanalyst can claim that the action was driven by a repressed component of the (unconscious) id and in the second case, that the action resulted from a successful sublimation of this exact same sort of desire by the ego and superego. The point generalizes: regardless of how a person actually behaves, psychoanalysis can be used to explain the behavior. This, in turn, prevents us from formulating any crucial experiments that might serve to falsify psychoanalysis. Popper writes:

The point is very clear. Neither Freud nor Adler excludes any particular person’s acting in any particular way, whatever the outward circumstances. Whether a man sacrificed his life to rescue a drowning child (a case of sublimation) or whether he murdered the child by drowning (a case of repression) could not possibly be predicted or excluded by Freud’s theory (1974, p. 985).

Popper allows that there are often legitimate purposes for positing non-scientific theories, and he argues that theories which start out as non-scientific can later become scientific, as we determine methods for generating and testing specific predictions based on these theories. Popper offers the example of Copernicus’s theory of a sun-centered universe, which initially yielded no potentially falsifying predictions, and so would not have counted as scientific by Popper’s criteria. However, later astronomers determined ways of testing Copernicus’s hypothesis, thus rendering it scientific. For Popper, then, the demarcation between scientific and non-scientific theories is not grounded in the nature of the entities posited by theories, in the truth or usefulness of theories, or even in the degree to which we are justified in believing in such theories. Instead, falsification provides a methodological distinction based on the unique role that observation and evidence play in scientific practice.

b. Auxiliary and Ad Hoc Hypotheses

While Popper consistently defends a falsification-based solution to the problem of demarcation throughout his published work, his own explications of it include a number of qualifications to ensure a better fit with the realities of scientific practice. It is in this context that Popper introduces several of his more notable contributions to the philosophy of science, including auxiliary versus ad hoc hypotheses, basic sentences, and degrees of verisimilitude.

One immediate objection to the simple proposal regarding falsification sketched in the previous section is based on the Duhem-Quine thesis, according to which it is in many cases impossible to test scientific theories in isolation. For example, suppose that a group of investigators uses GR to deduce a prediction about the perihelion of Mercury, but then discovers that this prediction disagrees with their measurements. This failure might lead them to conclude that GR is false; however, the failure of the prediction might also plausibly be blamed on the falsity of some other proposition that the scientists relied on to deduce the apparently falsifying prediction. There are generally a large number of such propositions, concerning everything from the absence of human error to the accuracy of the scientific theories underlying the construction and application of the measuring equipment.
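The logical point here can be put schematically as follows (a minimal sketch in standard notation, not Popper’s or Duhem’s own formalism): a failed prediction refutes only the conjunction of the theory with its auxiliary assumptions, not the theory alone.

```latex
% Illustrative schema for the point above (the notation is ours, not Popper's).
% T             : the theory under test (for example, GR)
% A_1, ..., A_n : auxiliary assumptions (instrument theory, no observer error, ...)
% O             : the predicted observation (for example, a value for Mercury's orbit)
\[
(T \wedge A_1 \wedge \dots \wedge A_n) \rightarrow O,
\qquad
\neg O \;\vdash\; \neg (T \wedge A_1 \wedge \dots \wedge A_n)
\]
% The failed prediction refutes only the conjunction; logic alone does not
% determine whether the blame falls on T or on one of the auxiliaries A_i.
```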

Popper recognizes that scientists routinely attribute the failure of experiments to factors such as these, and further grants that there is in many cases nothing objectionable about their doing so. On Popper’s view, the distinctive mark of scientific inquiry concerns the investigators’ responses to failed predictions in cases where they do not abandon the falsified theory altogether. In particular, Popper argues that a scientific theory can be legitimately saved from falsification by the introduction of an auxiliary hypothesis that allows for the generation of new, falsifiable predictions. Popper offers an example taken from the early 19th century, when astronomers noticed that the orbit of Uranus deviated significantly from what Newtonian mechanics seemed to predict. In this case, the scientists did not treat Newton’s laws as being falsified by such an observation. Instead, they considered the auxiliary hypothesis that there existed an additional and so far unobserved planet that was influencing the orbit of Uranus. They then used this auxiliary hypothesis, together with the equations of Newtonian mechanics, to predict where this planet must be located. Their predictions turned out to be successful, and Neptune was discovered in 1846.

Popper contrasts this legitimate, scientific method of theory revision with the illegitimate, non-scientific use of ad hoc hypotheses to rescue theories from falsification. Here, an ad hoc hypothesis is one that does not allow for the generation of new, falsifiable predictions. Popper gives the example of Marxism, which he argues had originally made definite predictions about the evolution of society: the capitalist, free-market system would self-destruct and be replaced by joint ownership of the means of production, and this would happen first in the most highly developed economies. By the time Popper was writing in the mid-20th century, however, it seemed clear to him that these predictions were false: free market economies had not self-destructed, and the first communist revolutions had happened in relatively undeveloped economies. The proponents of Marxism, however, neither abandoned the theory as falsified nor introduced any new, falsifiable auxiliary hypotheses that might account for the failed predictions. Instead, they adopted ad hoc hypotheses that immunized Marxism against any potentially falsifying observations whatsoever. For example, the continued persistence of capitalism might be blamed on the actions of counter-revolutionaries, without any account of which specific actions these were or of what new predictions about society we should expect instead. Popper concludes that, while Marxism had originally been a scientific theory:

It broke the methodological rule that we must accept falsification, and it immunized itself against the most blatant refutations of its predictions. Ever since then, it can be described only as non-science—as a metaphysical dream, if you like, married to a cruel reality (1974, p. 985).

c. Basic Sentences and the Role of Convention

A second complication for the simple theory of falsification just described concerns the character of the observations that count as potential falsifiers of a theory. The problem here is that decisions about whether to accept an apparently falsifying observation are not always straightforward. For example, there is always the possibility that a given observation is not an accurate representation of the phenomenon but instead reflects theoretical bias or measurement error on the part of the observer(s). Examples of this sort of phenomenon are widespread and occur in a variety of contexts: students getting the “wrong” results on lab tests, a small group of researchers reporting results that disagree with those obtained by the larger research community, and so on.

In any specific case in which bias or error is suspected, Popper notes that researchers might introduce a falsifiable auxiliary hypothesis that allows them to test for it. And in many cases, this is just what they do: students redo the test until they get the expected results, or other research groups attempt to replicate the anomalous result obtained. Popper argues that this technique cannot solve the problem in general, however, since any auxiliary hypotheses researchers introduce and test will themselves be open to dispute in just the same way, and so on ad infinitum. If science is to proceed at all, then, there must be some point at which the process of attempted falsification stops.

In order to resolve this apparently vicious regress, Popper introduces the idea of a basic statement, which is an empirical claim that can be used both to determine whether a given theory is falsifiable (and thus scientific) and, where appropriate, to corroborate falsifying hypotheses. According to Popper, basic statements are “statements asserting that an observable event is occurring in a certain individual region of space and time” (1959, p. 85). More specifically, basic statements must be both singular and existential (the formal requirement) and be testable by intersubjective observation (the material requirement). On Popper’s view, “there is a raven in space-time region k” would count as a basic statement, since it makes a claim about an individual raven whose existence, or lack thereof, could be determined by appropriately located observers. By contrast, the negative existential claim “there are no ravens in space-time region k” does not do this, and thus fails to qualify as a basic statement.

In order to avoid the infinite regress alluded to earlier, where basic statements themselves must be tested in order to justify their status as potential falsifiers, Popper appeals to the role played by convention and what he calls the “relativity of basic statements.” He writes as follows:

Every test of a theory, whether resulting in its corroboration or falsification, must stop at some basic statement or other which we decide to accept. If we do not come to any decision, and do not accept some basic statement or other, then the test will have led nowhere… This procedure has no natural end. Thus if the test is to lead us anywhere, nothing remains but to stop at some point or other and say that we are satisfied, for the time being. (1959, p. 86)

From this, Popper concludes that a given statement’s counting as a basic statement requires the consensus of the relevant scientific community—if the community decides to accept it, it will count as a basic statement; if the community does not accept it as basic, then an effort must be made to test the statement by using it together with other statements to deduce a statement that the relevant community will accept as basic. Finally, if the scientific community cannot reach a consensus on what would count as a falsifier for the disputed statement, the statement itself, despite initial appearances, may not actually be empirical or scientific in the relevant sense.

d. Induction, Corroboration, and Verisimilitude

Falsification also plays a key role in Popper’s proposed solution to David Hume’s infamous problem of induction. On Popper’s interpretation, Hume’s problem involves the impossibility of justifying belief in general laws based on evidence that concerns only particular instances. Popper agrees with Hume that inductive reasoning in this sense could not be justified, and he thus rejects the idea that empirical evidence regarding particular individuals, such as successful predictions, is in any way relevant to confirming the truth of general scientific laws or theories. This places Popper’s view in explicit contrast to logical empiricists such as Carnap and Hempel, who had developed extensive, mathematical systems of inductive logic intended to explicate the degree of confirmation of scientific theories by empirical evidence.

Popper argues that there are in fact two closely related problems of induction: the logical problem of induction and the psychological problem of induction. The first problem concerns the possibility of justifying belief in the truth or falsity of general laws based on empirical evidence that concerns only specific individuals. Popper holds that Hume’s argument concerning this problem “establishes for good that all our universal laws or theories remain forever guesses, conjectures, [and] hypotheses” (1974, p. 1019). However, Popper claims that while a successful prediction is irrelevant to confirming a law, a failed prediction can immediately falsify it. On Popper’s view, then, observing 1,000 white swans does nothing to increase our confidence that the hypothesis “all swans are white” is true; however, the observation of a single black swan can, subject to the caveats mentioned in previous sections, falsify this same hypothesis.
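The asymmetry Popper has in mind can be displayed in elementary first-order notation (an illustration in standard logical symbolism; the formalization is not Popper’s own):

```latex
% The verification/falsification asymmetry in standard first-order notation
% (an illustration; this is not Popper's own formalism).
% Sx: "x is a swan"; Wx: "x is white"; a_1, ..., a_1000, b: observed individuals.
\[
H:\ \forall x\,(Sx \rightarrow Wx)
\]
\[
\{\, Sa_1 \wedge Wa_1,\ \dots,\ Sa_{1000} \wedge Wa_{1000} \,\} \;\not\vdash\; H
\qquad \text{(no finite set of positive instances entails } H\text{)}
\]
\[
Sb \wedge \neg Wb \;\vdash\; \neg H
\qquad \text{(a single counter-instance refutes } H\text{)}
\]
```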

In contrast to the logical problem of induction, the psychological problem of induction concerns the possibility of explaining why reasonable people nevertheless have the expectation that unobserved instances will obey the same general laws as did previously observed instances. Hume tries to resolve the psychological problem by appeal to habit or custom, but Popper rejects this solution as inadequate, since it suggests that there is a “clash between the logic and the psychology of knowledge” (1974, p. 1019) and hence that people’s beliefs in general laws are fundamentally irrational.

Popper proposes to solve these twin problems of induction by offering an account of theory preference that does not rely upon inductive inference and thus avoids Hume’s problems altogether. While the technical details of this account evolve throughout his writings, he consistently emphasizes two main points. First, he holds that a theory with greater informative content is to be preferred to one with less content. Here, informative content is a measure of how much a theory rules out; roughly speaking, a theory with more informative content makes a greater number of empirical claims, and thus has a higher degree of falsifiability. Second, Popper holds that a theory is corroborated by passing severe tests, or “by predictions which were highly improbable in the light of our previous knowledge (previous to the theory which was tested and corroborated)” (1963, p. 220).

It is important to distinguish Popper’s claim that a theory is corroborated by surviving a severe test from the logical empiricist view that a theory is inductively confirmed by successfully predicting events that, were the theory to have been false, would have been highly unlikely. According to the latter view, a successful prediction of this sort, subject to certain caveats, provides evidence that the theory in question is actually true. The question of theory choice is tightly tied to that of confirmation: scientists should adopt whichever theory is most probable in light of the available evidence. On Popper’s view, by contrast, corroboration provides no evidence whatsoever that the theory in question is true, or even that the theory is preferable to a so-far-untested but still unfalsified rival. Instead, a corroborated theory has shown merely that it is the sort of theory that could be falsified and thus can be legitimately classified as scientific. While a corroborated theory should obviously be preferred to an already falsified rival (see Section 2), the real work here is being done by the falsified theory, which has taken itself out of contention.

While Popper consistently rejects the idea that we are justified in believing that non-falsified, well-corroborated scientific theories with high levels of informative content are either true or likely to be true, his work on degrees of verisimilitude explores the idea that such theories are closer to the truth than were the falsified theories that they had replaced. The basic idea is as follows:

  1. For a given statement H, let the content of H be the class of all of the logical consequences of H. So, if H is true, then all of the members of this class would be true; if H were false, however, then only some members of this class would be true, since every false statement has at least some true consequences.
  2. The content of H can be broken into two parts: the truth content, consisting of all of the true consequences of H, and the falsity content, consisting of all of the false consequences of H.
  3. The verisimilitude of H is defined as the difference between the truth content of H and the falsity content of H. This is intended to capture the idea that a theory with greater verisimilitude will entail more truths and fewer falsehoods than does a theory with less verisimilitude (a compact formulation is sketched below).
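The definition just given can be restated compactly as follows (a sketch in standard set-theoretic notation; the symbolism, though not the content, differs from Popper’s own presentation):

```latex
% A compact restatement of the definition above (symbolism is illustrative,
% not Popper's own notation).
% Cn(H)     : the content of H, i.e., the class of logical consequences of H
% Ct_T(H)   : the truth content of H  (the true members of Cn(H))
% Ct_F(H)   : the falsity content of H (the false members of Cn(H))
% ct_T, ct_F: measures of the sizes of these two classes
\[
Cn(H) \;=\; Ct_T(H) \,\cup\, Ct_F(H)
\]
\[
\mathrm{Vs}(H) \;=\; ct_T(H) - ct_F(H)
\qquad \text{(verisimilitude as truth content minus falsity content)}
\]
% In the comparative (qualitative) form, H_2 is closer to the truth than H_1 when
% Ct_T(H_1) is included in Ct_T(H_2) and Ct_F(H_2) is included in Ct_F(H_1), with
% at least one inclusion strict; the Tichy-Miller arguments discussed below show
% that this comparison can never favor one false theory over another.
```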

With this definition in hand, it might now seem that Popper could incorporate truth into his account of theory preference: non-falsified theories with high levels of informative content were closer to the truth than either the falsified theories they replaced or their unfalsified but less informative competitors. Unfortunately, however, this definition does not work, as arguments from Tichý (1974), Miller (1974), Harris (1974), and others show. Tichý and Miller in particular demonstrate that Popper’s proposed definition cannot be used to compare the relative verisimilitude of false theories, which is Popper’s main purpose in introducing the notion of verisimilitude. While Popper (1976) explores ways of modifying his proposal to deal with these problems, he is never able to provide a satisfactory formal definition of verisimilitude. His work in this area is nevertheless invaluable in identifying a problem that has continued to interest many contemporary researchers.

3. Criticisms of Falsificationism

While Popper’s account of scientific methodology has continued to be influential, it has also faced a number of serious objections. These objections, together with the emergence of alternative accounts of scientific reasoning, have led many philosophers of science to reject Popper’s falsificationist methodology. While a comprehensive list of these criticisms and alternatives is beyond the scope of this entry, interested readers are encouraged to consult Kuhn (1962), Salmon (1967), Lakatos (1970, 1980), Putnam (1974), Jeffrey (1975), Feyerabend (1975), Hacking (1983), and Howson and Urbach (1989).

One criticism of falsificationism involves the relationship between theory and observation. Thomas Kuhn, among others, argues that observation is itself strongly theory-laden, in the sense that what one observes is often significantly affected by one’s previously held theoretical beliefs. Because of this, those holding different theories might report radically different observations, even when they both are observing the same phenomena. For example, Kuhn argues that those working within the paradigm provided by classical, Newtonian mechanics may genuinely have different observations than those working within the very different paradigm of relativistic mechanics.

Popper’s account of basic sentences suggests that he clearly recognizes both the existence of this sort of phenomenon and its potential to cause problems for attempts to falsify theories. His solution to it, however, crucially depends on the ability of the overall scientific community to reach a consensus as to which statements count as basic and thus can be used to formulate tests of the competing theories. This remedy, however, looks less attractive to the extent that advocates of different theories consistently find themselves unable to reach an agreement on what sentences count as basic. For example, it is important to Popper’s example of the Eddington experiment that both proponents of classical mechanics and those of relativistic mechanics could recognize Eddington’s reports of his observations as basic sentences in the relevant sense—that is, certain possible results would falsify the Newtonian laws of classical mechanics, while other possible results would falsify GR. If, by contrast, adherents of rival theories consistently disagreed on whether or not certain reports could be counted as basic sentences, this would prevent observations such as Eddington’s from serving any important role in theory choice. Instead, the results of any such potentially falsifying experiment would be interpreted by one part of the community as falsifying a particular theory, while a different section of the community would demand that these reports themselves be subjected to further testing.  In this way, disagreements over the status of basic sentences would effectively prevent theories from ever being falsified.

This purported failure to clearly distinguish the basic statements that formed the empirical base from other, more theoretical, statements would also have consequences for Popper’s proposed criterion of demarcation, which holds that scientific theories must allow for the deduction of basic sentences whose truth or falsity can be ascertained by appropriately located observers. If, contrary to Popper’s account, there is no distinct category of basic sentences within actual scientific practice, then his proposed method for distinguishing science from non-science fails.

A second, related criticism of falsificationism contends that falsification fails to provide an accurate picture of scientific practice. Specifically, many historians and philosophers of science have argued that scientists only rarely give up their theories in the face of failed predictions, even in cases where they are unable to identify testable auxiliary hypotheses; instead, they generally hold on to such theories unless and until a better alternative emerges. It has even been suggested that scientists routinely adopt and make use of theories that they know are already falsified.

For example, Lakatos (1970) describes a hypothetical case where pre-Einsteinian scientists discover a new planet whose behavior apparently violates classical mechanics. Lakatos argues that, in such a case, the scientists would surely attempt to account for these observed discrepancies in the way that Popper advocates—for example, by hypothesizing the existence of a hitherto unobserved planet or dust cloud. In contrast to what he takes Popper to be arguing, however, Lakatos contends that the failure of such auxiliary hypotheses would not lead them to abandon classical mechanics, since they had no alternative theory to turn to.

In a similar vein, Putnam (1974) argues that the initial widespread acceptance of Newtonian mechanics had little or nothing to do with falsifiable predictions, since the theory made very few of these. Instead, scientists were impressed by the theory’s success in explaining previously established phenomena, such as the orbits of the planets and the behavior of the tides. Putnam argues that, on Popper’s view, accepting such an uncorroborated theory would seem to be irrational. Finally, Hacking (1983) argues that many aspects of ordinary scientific practice, including a wide variety of observations and experiments, cannot plausibly be construed as attempts to falsify or corroborate any particular theory or hypothesis. Instead, scientists regularly perform experiments that have little or no bearing on their current theories and measure quantities about which these theories do not make any specific claims.

When considering the cogency of such criticisms, it is worth noting several things. First, it is worth recalling that Popper defends falsificationism as a normative, methodological proposal for how science ought to work in certain sorts of cases and not as an empirical description intended to accurately capture all aspects of historical scientific practice. Second, Popper does not commit himself to the implausible thesis that theories yielding false predictions about a particular phenomenon must immediately be abandoned, even if it is not apparent which auxiliary hypotheses must change. This is especially true in the absence of any rival theory yielding a correct prediction. For example, Newtonian mechanics had well-known problems with predicting certain sorts of phenomena, such as the orbit of Mercury, in the years preceding Einstein’s proposals regarding special and general relativity. Popper’s proposal does not entail that these failures of prediction should have led nineteenth century scientists to abandon this theory.

This being said, Popper himself argues that the methodology of falsificationism has played an important role in the history of science and that adopting his proposal would not require a wholesale revision of existing scientific methodology. If it turns out that scientists rarely, if ever, choose between theories on the basis of crucial experiments that falsify one theory or another, then Popper’s methodological proposal looks considerably less appealing.

A final criticism concerns Popper’s account of corroboration and the role it plays in theory choice. Popper’s deductive account of theory testing and adoption posits that it is rational to choose highly informative, well-corroborated theories, even though we have no inductive grounds for thinking that these theories are likely to be true. For example, Popper explicitly rejects the idea that corroboration is intended as an analogue to the subjective probability or logical probability that a theory is true, given the available evidence. This idea is central to both Popper’s proposed solution to the problem of induction and to his criticisms of competing inductivist or “Bayesian” programs.

Many philosophers of science, however, including Salmon (1967, 1981), Jeffrey (1975), Howson (1984a), and Howson and Urbach (1989), have objected to this aspect of Popper’s account. One line of criticism has focused on the extent to which Popper’s falsificationism offers a legitimate alternative to the inductivist proposals that Popper criticizes. For example, Jeffrey (1975) points out that it is just as difficult to conclusively falsify a hypothesis as it is to conclusively verify it, and he argues that Bayesianism, with its emphasis on the degree to which empirical evidence supports a hypothesis, is much more closely aligned with scientific practice than Popper’s program.

A related line of objection has focused on Popper’s contention that it is rational for scientists to rely on corroborated theories, a claim that plays a central role in his proposed solution to the problem of induction. Urbach (1984) argues that, insofar as Popper is committed to the claim that every universal hypothesis has zero probability of being true, he cannot explain the rationality of adopting a corroborated theory over an already falsified one, since both have the same probability (zero) of being true. Taking a different tack, Salmon (1981) questions whether, on Popper’s account, it would be rational to use corroborated hypotheses for the purposes of prediction. After all, corroboration is entirely a matter of hypotheses’ past performance—a corroborated hypothesis is one that has survived severe empirical tests. Popper’s account, however, does not provide us with any reason for thinking that this hypothesis will have more accurate predictions about the future than any one of the infinite number of competing uncorroborated hypotheses that are also logically compatible with all of the evidence observed up to this point.

If these objections concerning corroboration are correct, it looks as though Popper’s account of theory choice either (1) is vulnerable to the same sorts of problems and puzzles that plague accounts of theory choice based on induction or (2) does not work as an account of theory choice at all.

While the sorts of objections mentioned here have led many to abandon falsificationism, David Miller (1998) provides a recent, sustained attempt to defend a Popperian-style critical rationalism. For more details on debates concerning confirmation and induction, see the entries on Confirmation and Induction and Evidence.

4. Realism, Quantum Mechanics, and Probability

While Popper holds that it is impossible for us to justify claims that particular scientific theories are true, he also defends the realist view that “what we attempt in science is to describe and (so far as possible) explain reality” (1975, p. 40). While Popper grants that realism is, according to his own criteria, an irrefutable metaphysical view about the nature of reality, he nevertheless thinks we have good reasons for accepting realism and for rejecting anti-realist views such as idealism or instrumentalism. In particular, he argues that realism is both part of common sense and entailed by our best scientific theories. By contrast, he contends that the most prominent arguments for anti-realism are based on a “mistaken quest for certainty, or for secure foundations on which to build” (1975, p. 42). Once one accepts the impossibility of securing such certain knowledge, as Popper contends we ought to do, the appeal of these sorts of arguments is considerably diminished.

Popper consistently emphasizes that scientific theories should be interpreted as attempts to describe a mind-independent reality. Because of this, he rejects the Copenhagen interpretation of quantum mechanics, in which the act of human measurement is seen as playing a fundamental role in collapsing the wave-function and randomly causing a particle to assume a determinate position or momentum. In particular, Popper opposes the idea, which he associates with the Copenhagen interpretation, that the probabilistic equations describing the results of potential measurements of quantum phenomena are about the subjective states of the human observers, rather than concerning mind-independent existing physical properties such as the positions or momenta of particles.

It is in the context of this debate over quantum mechanics that Popper first introduces his propensity theory of probability. This theory’s applicability, however, extends well beyond the quantum world, and Popper argues that it can be used to interpret the sorts of claims about probability that arise both in other areas of science and in everyday life. Popper’s propensity theory holds that probabilities are objective claims about the mind-independent external world and that it is possible for there to be single-case probabilities for non-recurring events.

Popper proposes his propensity theory as a variant of the relative frequency theories of probability defended by logical positivists such as Richard von Mises and Hans Reichenbach. According to simple versions of frequency theory, the probability of an event of type e can be defined as the relative frequency of e in a large, or perhaps even infinite, reference class. For example, the claim that “the probability of getting a six on a fair die is 1/6” can be understood as the claim that, in a long sequence of rolls with a fair die (the reference class), six would come up 1/6 of the time. The main alternatives to frequency theory that concern Popper are logical and subjective theories of probability, according to which claims about probability should be understood as claims about the strength of evidence for or degree of belief in some proposition. On these views, the claim that “the probability of getting a six on a fair die is 1/6” can be understood as a claim about our lack of evidence—if all we know is that the die is fair, then we have no reason to think that any particular number, such as a six, is more likely to come up on the next roll than any of the other five possible numbers.
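As a rough illustration of the frequency reading of the die example (a minimal simulation sketch, not part of any of these authors’ own presentations), the relative frequency of sixes in a long run of simulated fair rolls settles near 1/6, while a question about one particular future roll picks out no comparable reference class:

```python
import random

# Frequency reading of "the probability of a six on a fair die is 1/6":
# in a long sequence of rolls (the reference class), the relative
# frequency of sixes tends toward 1/6.
def relative_frequency_of_six(num_rolls: int, seed: int = 0) -> float:
    rng = random.Random(seed)
    sixes = sum(1 for _ in range(num_rolls) if rng.randint(1, 6) == 6)
    return sixes / num_rolls

if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        freq = relative_frequency_of_six(n)
        print(f"{n:>9} rolls: relative frequency of six = {freq:.4f}")
    # A single-case question ("will THIS particular roll be a six?") picks out
    # no such reference class, which is the gap Popper's propensity theory
    # is meant to fill.
```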

Like other defenders of frequency theories, Popper argues that logical or subjective theories incorrectly interpret scientific claims about probability as being about the scientific investigators, and the evidence they have available to them, rather than about the external world they are investigating. However, Popper argues that traditional frequency theories cannot account for single-case probabilities. For example, a frequency theorist would have no problem answering questions about “the probability that it will rain on an arbitrarily chosen August day,” since August days form a reference class. By contrast, questions about the probability that it will rain on a particular, future August day raise problems, since each particular day only occurs once. At best, frequency theories allow us to say that the probability of it raining on that specific day is either 0 or 1, though we do not know which.

On Popper’s view, the failure to provide an adequate treatment of single-case probabilities is a serious one, especially given what he saw as the centrality of such probabilities in quantum mechanics. To resolve this issue, Popper proposes that probabilities should be treated as the propensities of experimental setups to produce certain results, rather than as being derived from the reference class of results that were produced by running these experiments. On the propensity view, the results of experiments are important because they allow us to test hypotheses concerning the values of certain probabilities; however, the results are not themselves part of the probability. Popper argues that this solves the problem of single-case probability, since propensities can exist even for experiments that only happen once. Importantly, Popper does not require that these experiments utilize human intervention—instead, nature can itself run experiments, the results of which we can observe. For example, the propensity theory should, in principle, be able to make sense of claims about the probability that it will rain on a particular day, even though the experimental setup in this case is constituted by naturally occurring, meteorological phenomena.

Popper argues that the propensity theory of probability helps provide the grounds for a realist solution to the measurement problem within quantum mechanics. As opposed to the Copenhagen interpretation, which posits that the probabilities discussed in quantum mechanics reflect the ignorance of the observers, Popper argues that these probabilities are in fact the propensities of the experimental setups to produce certain outcomes. Interpreted this way, he argues that they raise no interesting metaphysical dilemmas beyond those raised by classical mechanics and that they are equally amenable to a realist interpretation. Popper gives the example of tossing a penny, which he argues is strictly analogous to the experiments performed in quantum mechanics: if our experimental setup consists of simply tossing the penny, then the probability of getting heads is 1/2. If the experimental setup, however, is expanded to include the results of our looking at the penny, and thus includes the outcome of the experiment itself, then the probability will be either 0 or 1. This does not, though, involve positing any collapse of the wave-function caused merely by the act of human observation. Instead, what has occurred is simply a change in the experimental setup. Once we include the measurement result in our setup, the probability of a particular outcome will trivially become 0 or 1.

5. Methodology in the Social Sciences

Much of Popper’s early work on the methodology of science is concerned with physics and closely related fields, especially those where experimentation plays a central role. On Popper’s view, which was discussed in detail in previous sections, these sciences make progress by formulating a theory and then carefully designing experiments and observations aimed at falsifying the purported theory. The ever-present possibility that a theory might be falsified by these sorts of tests is, on Popper’s view, precisely what differentiates legitimate sciences, such as physics, from non-scientific activities, such as philosophical metaphysics, Freudian psychoanalysis, or myth-making.

This picture becomes somewhat more complicated, however, when we consider methodology in social sciences such as sociology and economics, where experimentation plays a much less central role. On Popper’s view, there are significant problems with many of the methods used in these disciplines. In particular, Popper argues against what he calls historicism, which he describes as “an approach to the social sciences which assumes that historical prediction is their principal aim, and which assumes that this aim is attainable by discovering the ‘rhythms’ or ‘patterns’, the ‘laws’ or ‘trends’ that underlie the evolution of history” (1957, p. 3).

Popper’s central argument against historicism contends that, insofar as the whole of human history is a singular process that occurs only once, it is impossible to formulate and test any general laws about history. This stands in stark contrast to disciplines such as physics, where the formulation and testing of laws plays a central role in making progress. For example, potential laws of gravitation can be tested by observations of planetary motions, by controlled experiments concerning the rates of falling objects near the earth’s surface, or in numerous other ways. If the relevant theories are falsified, scientists can easily respond, for instance, by changing one or more auxiliary hypotheses, and then conducting additional experiments on the new, slightly modified theory. By contrast, a law that purports to describe the future progress of history in its entirety cannot easily be tested in this way. Even if a particular prediction about the occurrence of some particular event is incorrect, there is no way of altering the theory to retest it—each historical event occurs only once, thus ruling out the possibility of carrying out further tests regarding this event. Popper also rejects the claim that it is possible to formulate and test laws of more limited scope, such as those that purport to describe an evolutionary process that occurs in multiple societies, or that attempt to capture a trend within a given society.

Popper’s opposition to historicism is also evident in his objections to what he calls utopian social engineering, which involves attempts by governments to fundamentally restructure the whole of society based on an overall plan or blueprint. On Popper’s view, the problem again concerns the impossibility of carrying out critical tests of the effectiveness of such plans. This impossibility stems from the holism of utopian plans, which involve changing everything at the same time. When the planners’ actions fail to achieve their predicted results—as Popper thinks is inevitably the case with human interventions in society—the planners have no method for determining what in particular went wrong with their plan. This lack of testability, in turn, means that there is no way for the utopian engineers to improve their plans. This argument, among others, plays a central role in Popper’s critique of Marxism and totalitarianism in The Open Society and Its Enemies (1945). More details on Popper’s political philosophy, including his critique of totalitarian societies, can be found in the separate article on Popper’s political philosophy.

In place of historicism and utopian holism, Popper argues that the social sciences should embrace both methodological individualism and situational analysis. On Popper’s definition, methodological individualism is the view that the behavior of social institutions should be analyzed in terms of the behaviors of the individual humans that make them up. This individualism is motivated, in part, by Popper’s contention that many important social institutions, such as the market, are not the result of any conscious design but instead arise out of the uncoordinated actions of individuals with widely disparate motives. Scientific hypotheses about the behavior of such unplanned institutions, then, must be formulated in terms of the constituent participants. Popper’s presentation and defense of methodological individualism is closely related to that provided by the Austrian economist Friedrich von Hayek (1942, 1943, 1944), with whom Popper maintained close personal and professional relationships throughout most of his life. For both Popper and Hayek, the defense of methodological individualism within the social sciences plays a key role in their broader argument in favor of liberal, market economies and against planned economies.

While Popper endorses methodological individualism, he rejects the doctrine of psychologism, according to which laws about social institutions must be reduced to psychological laws concerning the behavior of individuals. Popper objects to this view, which he associates with John Stuart Mill, on the grounds that it ends up collapsing into a form of historicism. The argument can be summarized as follows: once we begin trying to explain or predict the behavior of currently existing institutions in terms of individuals’ psychological motives, we quickly notice that these motives themselves cannot be understood without reference to the broader social environment within which these individuals find themselves. In order to eliminate the reference to the particular social institutions that make up this environment, we are then forced to demonstrate how these institutions were themselves a product of individual motives that had operated within some other previously existing social environment. This, though, quickly leads to an unsustainable regress, since humans always act within particular social environments, and their motives cannot be understood without reference to these environments. The only way out for the advocate of psychologism is to posit that both the origin and evolution of all human institutions can be explained purely in terms of human psychology. Popper argues that there is no historical support for the idea that there was ever such an origin of social institutions. He also argues that this is a form of historicism, insofar as it commits us to discovering laws governing the evolution of society as a whole. As such, it inherits all of the problems mentioned previously.

In place of psychologism, Popper endorses a version of methodological individualism based on situational analysis. On this method, we begin by creating abstract models of the social institutions that we wish to investigate, such as markets or political institutions. In keeping with methodological individualism, these models will contain, among other things, representations of individual agents. However, instead of stipulating that these agents will behave according to the laws governing individual human psychology, as psychologism does, we animate the model by assuming that the agents will respond appropriately according to the logic of the situation. Popper calls this constraint on model building within the social sciences the rationality principle.

Popper recognizes that both the rationality principle and the models built on the basis of it are empirically false—after all, real humans often respond to situations in ways that are irrational and inappropriate. Popper also rejects, however, the idea that the rationality principle should be thought of as a methodological principle that is a priori immune to testing, since part of what makes theories in the social sciences testable is the fact that they make definite claims about individual human behavior. Instead, Popper defends the use of the rationality principle in model building on the grounds that it is generally good policy to avoid blaming the falsification of a model on the inaccuracies introduced by the rationality principle and that we can learn more if we blame the other assumptions of our situational analysis (1994, p. 177). On Popper’s view, the errors introduced by the rationality principle are generally small ones, since humans are generally rational. More importantly, holding the rationality principle fixed makes it much easier for us to formulate crucial tests of rival theories and to make genuine progress in the social sciences. By contrast, if the rationality principle were relaxed, he argues, there would be almost no substantive constraints on model building.

6. Popper’s Legacy

While few of Popper’s individual claims have escaped criticism, his contributions to philosophy of science are immense. As mentioned earlier, Popper was one of the most important critics of the early logical empiricist program, and the criticisms he leveled against it helped shape the future work of both the logical empiricists and their critics. In addition, while his falsification-based approach to scientific methodology is no longer widely accepted within philosophy of science, it played a key role in laying the ground for later work in the field, including that of Kuhn, Lakatos, and Feyerabend, as well as contemporary Bayesianism. It is also plausible that the widespread popularity of falsificationism—both within and outside of the scientific community—has had an important role in reinforcing the image of science as an essentially empirical activity and in highlighting the ways in which genuine scientific work differs from so-called pseudoscience. Finally, Popper’s work on numerous specialized issues within the philosophy of science—including verisimilitude, quantum mechanics, the propensity theory of probability, and methodological individualism—has continued to influence contemporary researchers.

7. References and Further Reading

Popper Selections (1985) is an excellent introduction to Popper’s writings for the beginner, while The Philosophy of Karl Popper (Schilpp 1974) contains an extensive bibliography of Popper’s work published before that date, together with numerous critical essays and Popper’s responses to these. Finally, Unended Quest (1976) is an expanded version of the “Intellectual Autobiography” from Schilpp (1974), and it provides a helpful, non-technical overview of many of Popper’s main works in his own words.

a. Primary Sources

  • 1945. The Open Society and Its Enemies. 2 volumes. London: Routledge.
  • 1957. The Poverty of Historicism. London: Routledge. Originally published as a series of three articles in Economica 42, 43, and 46 (1944-1945).
  • 1959. The Logic of Scientific Discovery. London: Hutchinson. This is an English translation of Logik der Forschung, Vienna: Springer (1935).
  • 1959. “The Propensity Interpretation of Probability.” The British Journal for the Philosophy of Science 10 (37): 25–42.
  • 1963. Conjectures and Refutations: The Growth of Scientific Knowledge. London: Routledge. Fifth edition 1989.
  • 1970. “Normal Science and Its Dangers.” In Criticism and the Growth of Knowledge, edited by Imre Lakatos and Alan Musgrave, 51–58. Cambridge: Cambridge University Press.
  • 1972. Objective Knowledge: An Evolutionary Approach. Oxford: Clarendon Press. Revised edition 1979.
  • 1974. “Replies to My Critics” and “Intellectual Autobiography.” In The Philosophy of Karl Popper, edited by Paul Arthur Schilpp. 2 volumes. La Salle, Ill: Open Court.
  • 1976. Unended Quest. London: Fontana. Revised edition 1984.
  • 1976. “A Note on Verisimilitude.” The British Journal for the Philosophy of Science 27 (2): 147–59.
  • 1978. “Natural Selection and the Emergence of Mind.” Dialectica 32 (3-4): 339–55.
  • 1982. The Open Universe: An Argument for Indeterminism. Edited by W. W. Bartley III. London: Routledge.
  • 1982. Quantum Theory and the Schism in Physics. Edited by W. W. Bartley III. New York: Routledge.
  • 1983. Realism and the Aim of Science. Edited by W. W. Bartley III. New York: Routledge.
  • 1985. Popper Selections. Edited by David W Miller. Princeton: Princeton University Press.
  • 1994. The Myth of the Framework: In Defense of Science and Rationality. Edited by Mark Amadeus Notturno. London: Routledge.
  • 1999. All Life Is Problem Solving. London: Routledge.

b. Secondary Sources

  • Ackermann, Robert John. 1976. The Philosophy of Karl Popper. Amherst: University of Mass. Press.
  • Agassi, Joseph. 2014. Popper and His Popular Critics: Thomas Kuhn, Paul Feyerabend and Imre Lakatos. 2014 edition. New York: Springer.
  • Blaug, Mark. 1992. The Methodology of Economics: Or, How Economists Explain. 2nd edition. New York: Cambridge University Press.
  • Caldwell, Bruce J. 1991. “Clarifying Popper.” Journal of Economic Literature 29 (1): 1–33.
  • Carnap, Rudolf. 1936. “Testability and Meaning.” Philosophy of Science 3 (4): 419–71. Continued in Philosophy of Science 4 (1): 1-40.
  • Carnap, Rudolf. 1995. An Introduction to the Philosophy of Science. New York: Dover. Originally published as Philosophical Foundations of Physics (1966).
  • Carnap, Rudolf.  2003. The Logical Structure of the World and Pseudoproblems in Philosophy. Translated by Rolf A. George. Chicago and La Salle, Ill: Open Court. Originally published in 1928 as Der logische Aufbau der Welt and Scheinprobleme in der Philosophie.
  • Catton, Philip, and Graham MacDonald, eds. 2004. Karl Popper: Critical Appraisals. New York: Routledge.
  • Currie, Gregory, and Alan Musgrave, eds. 1985. Popper and the Human Sciences. Dordrecht: Martinus Nijhoff.
  • Edmonds, David, and John Eidinow. 2002. Wittgenstein’s Poker: The Story of a Ten-Minute Argument Between Two Great Philosophers. Reprint edition. New York: Harper Perennial.
  • Feyerabend, Paul. 1975. Against Method. London; New York: New Left Books. Fourth edition 2010.
  • Fuller, Steve. 2004. Kuhn vs. Popper: The Struggle for the Soul of Science. New York: Columbia University Press.
  • Gattei, Stefano. 2010. Karl Popper’s Philosophy of Science: Rationality without Foundations. London; New York: Routledge.
  • Grünbaum, Adolf. 1976. “Is Falsifiability the Touchstone of Scientific Rationality? Karl Popper Versus Inductivism.” In Essays in Memory of Imre Lakatos, edited by R. S. Cohen, P. K. Feyerabend, and M. W. Wartofsky, 213–52. Dordrecht: Springer Netherlands.
  • Hacking, Ian. 1983. Representing and Intervening: Introductory Topics in the Philosophy of Natural Science. Cambridge; New York: Cambridge University Press.
  • Hacohen, Malachi Haim. 2002. Karl Popper: The Formative Years, 1902-1945: Politics and Philosophy in Interwar Vienna. Cambridge: Cambridge University Press.
  • Hands, Douglas W. 1985. “Karl Popper and Economic Methodology: A New Look.” Economics and Philosophy 1 (1): 83–99.
  • Harris, John H. 1974. “Popper’s Definitions of ‘Verisimilitude.’” The British Journal for the Philosophy of Science 25 (2): 160–66.
  • Hausman, Daniel M. 1985. “Is Falsificationism Unpractised or Unpractisable?” Philosophy of the Social Sciences 15 (3): 313–19.
  • Hayek, Friedrich von. 1942. “Scientism and the Study of Society. Part I.” Economica, New Series, 9 (35): 267–91.
  • Hayek, Friedrich von. 1943. “Scientism and the Study of Society. Part II.” Economica, New Series, 10 (37): 34–63.
  • Hayek, Friedrich von. 1944. “Scientism and the Study of Society. Part III.” Economica, New Series, 11 (41): 27–39.
  • Hempel, Carl G. 1945a. “Studies in the Logic of Confirmation (I.).” Mind, New Series, 54 (213): 1–26.
  • Hempel, Carl G. 1945b. “Studies in the Logic of Confirmation (II.).” Mind, New Series, 54 (214): 97–121.
  • Howson, Colin. 1984a. “Popper’s Solution to the Problem of Induction.” The Philosophical Quarterly 34 (135): 143–47.
  • Howson, Colin. 1984b. “Probabilities, Propensities, and Chances.” Erkenntnis 21 (3): 279–93.
  • Howson, Colin, and Peter Urbach. 1989. Scientific Reasoning: The Bayesian Approach. Chicago: Open Court Publishing. Third edition 2006.
  • Hudelson, Richard. 1980. “Popper’s Critique of Marx.” Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition 37 (3): 259–70.
  • Hume, David. 1993. An Enquiry Concerning Human Understanding: With Hume’s Abstract of A Treatise of Human Nature and A Letter from a Gentleman to His Friend in Edinburgh. Edited by Eric Steinberg. 2nd ed. Indianapolis: Hackett Publishing Company, Inc.
  • Jeffrey, Richard C. 1975. “Probability and Falsification: Critique of the Popper Program.” Synthese 30 (1/2): 95–117.
  • Keuth, Herbert. 2004. The Philosophy of Karl Popper. New York: Cambridge University Press.
  • Kuhn, Thomas S. 1962. The Structure of Scientific Revolutions. Chicago: University of Chicago Press. Third edition 1996.
  • Lakatos, Imre. 1970. “Falsification and the Methodology of Scientific Research Programmes.” In Criticism and the Growth of Knowledge, edited by Imre Lakatos and Alan Musgrave, 91–196. Cambridge: Cambridge University Press.
  • Lakatos, Imre. 1980. The Methodology of Scientific Research Programmes: Volume 1: Philosophical Papers. Cambridge: Cambridge University Press.
  • Lakatos, Imre, and Alan Musgrave, eds. 1970. Criticism and the Growth of Knowledge. Cambridge: Cambridge University Press.
  • Levi, Isaac. 1963. “Corroboration and Rules of Acceptance.” The British Journal for the Philosophy of Science 13 (52): 307–13.
  • Maher, Patrick. 1990. “Why Scientists Gather Evidence.” The British Journal for the Philosophy of Science 41 (1): 103-119.
  • Magee, Bryan. 1985. Philosophy and the Real World: An Introduction to Karl Popper. La Salle, Ill: Open Court.
  • Miller, David. 1974. “Popper’s Qualitative Theory of Verisimilitude.” The British Journal for the Philosophy of Science 25 (2): 166–77.
  • Miller, David. 1998. Critical Rationalism: A Restatement and Defense. Chicago: Open Court.
  • Munz, Peter. 1985. Our Knowledge of the Growth of Knowledge: Popper or Wittgenstein? London; New York: Routledge.
  • O’Hear, Anthony. 1996. Karl Popper: Philosophy and Problems. Cambridge; New York: Cambridge University Press.
  • Putnam, Hilary. 1974. “The ‘Corroboration’ of Theories.” In The Philosophy of Karl Popper, edited by Paul Arthur Schilpp, 221–40. La Salle, Ill: Open Court.
  • Rowbottom, Darrell. 2010. Popper’s Critical Rationalism: A Philosophical Investigation. New York: Routledge.
  • Runde, Jochen. 1996. “On Popper, Probabilities, and Propensities.” Review of Social Economy 54 (4): 465–85.
  • Ruse, Michael. 1977. “Karl Popper’s Philosophy of Biology.” Philosophy of Science 44 (4): 638–61.
  • Salmon, Wesley. 1967. The Foundations of Scientific Inference. Pittsburgh: University of Pittsburgh Press.
  • Salmon, Wesley. 1981. “Rational Prediction.” The British Journal for the Philosophy of Science 32 (2): 115–25.
  • Schilpp, Paul Arthur, ed. 1974. The Philosophy of Karl Popper. 2 volumes. La Salle, Ill: Open Court.
  • Thornton, Stephen. 2014. “Karl Popper.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta.
  • Tichý, Pavel. 1974. “On Popper’s Definitions of Verisimilitude.” The British Journal for the Philosophy of Science 25 (2): 155–60.
  • Urbach, Peter. 1978. “Is Any of Popper’s Arguments against Historicism Valid?” The British Journal for the Philosophy of Science 29 (2): 117–30.

 

Author Information

Brendan Shea
Email: Brendan.Shea@rctc.edu
Rochester Community and Technical College, Minnesota Center for Philosophy of Science
U. S. A.

Albert Camus (1913—1960)

Albert Camus was a French-Algerian journalist, playwright, novelist, philosophical essayist, and Nobel laureate. Though he was neither by advanced training nor profession a philosopher, he nevertheless made important, forceful contributions to a wide range of issues in moral philosophy in his novels, reviews, articles, essays, and speeches—from terrorism and political violence to suicide and the death penalty. He is often described as an existentialist writer, though he himself disavowed the label. He began his literary career as a political journalist and as an actor, director, and playwright in his native Algeria. Later, while living in occupied France during WWII, he became active in the Resistance and from 1944-47 served as editor-in-chief of the newspaper Combat. By mid-century, based on the strength of his three novels (The Stranger, The Plague, and The Fall) and two book-length philosophical essays (The Myth of Sisyphus and The Rebel), he had achieved an international reputation and readership. It was in these works that he introduced and developed the twin philosophical ideas—the concept of the Absurd and the notion of Revolt—that made him famous. These are the ideas that people immediately think of when they hear the name Albert Camus spoken today. The Absurd can be defined as a metaphysical tension or opposition that results from the presence of human consciousness—with its ever-pressing demand for order and meaning in life—in an essentially meaningless and indifferent universe. Camus considered the Absurd to be a fundamental and even defining characteristic of the modern human condition. The notion of Revolt refers to both a path of resolved action and a state of mind. It can take extreme forms such as terrorism or a reckless and unrestrained egoism (both of which are rejected by Camus), but basically, and in simple terms, it consists of an attitude of heroic defiance or resistance to whatever oppresses human beings. In awarding Camus its prize for literature in 1957, the Nobel Prize committee cited his persistent efforts to “illuminate the problem of the human conscience in our time.” He was honored by his own generation, and is still admired today, for being a writer of conscience and a champion of imaginative literature as a vehicle of philosophical insight and moral truth. He was at the height of his career—at work on an autobiographical novel, planning new projects for theatre, film, and television, and still seeking a solution to the lacerating political turmoil in his homeland—when he died tragically in an automobile accident in January 1960.

Table of Contents

  1. Life
  2. Literary Career
  3. Camus, Philosophical Literature, and the Novel of Ideas
  4. Works
    1. Fiction
    2. Drama
    3. Essays, Letters, Prose Collections, Articles, and Reviews
  5. Philosophy
    1. Background and Influences
    2. Development
    3. Themes and Ideas
      1. The Absurd
      2. Revolt
      3. The Outsider
      4. Guilt and Innocence
      5. Christianity vs. “Paganism”
      6. Individual vs. History and Mass Culture
      7. Suicide
      8. The Death Penalty
  6. Existentialism
  7. Camus, Colonialism, and Algeria
  8. Significance and Legacy
  9. References and Further Reading
    1. Works by Albert Camus
    2. Critical and Biographical Studies

1. Life

Albert Camus was born on November 7, 1913, in Mondovi, a small village near the seaport city of Bône (present-day Annaba) in the northeast region of French Algeria. He was the second child of Lucien Auguste Camus, a military veteran and wine-shipping clerk, and of Catherine Helene (Sintes) Camus, a housekeeper and part-time factory worker. (Note: Although Camus believed that his father was Alsatian and a first-generation émigré, research by biographer Herbert Lottman indicates that the Camus family was originally from Bordeaux and that the first Camus to leave France for Algeria was actually the author’s great-grandfather, who in the early 19th century became part of the first wave of European colonial settlers in the new melting pot of North Africa.)

Shortly after the outbreak of WWI, when Camus was less than a year old, his father was recalled to military service and, on October 11, 1914, died of shrapnel wounds suffered at the first battle of the Marne. As a child, about the only thing Camus ever learned about his father was that he had once become violently ill after witnessing a public execution. This anecdote, which surfaces in fictional form in the author’s novel The Stranger and is also recounted in his philosophical essay “Reflections on the Guillotine,” strongly affected Camus and influenced his lifelong opposition to the death penalty.

After his father’s death, Camus, his mother, and his older brother moved to Algiers where they lived with his maternal uncle and grandmother in her cramped second-floor apartment in the working-class district of Belcourt. Camus’s mother Catherine, who was illiterate, partially deaf, and afflicted with a speech pathology, worked in an ammunition factory and cleaned homes to help support the family. In his posthumously published autobiographical novel The First Man, Camus recalls this period of his life with a mixture of pain and affection as he describes conditions of harsh poverty (the three-room apartment had no bathroom, no electricity, and no running water) relieved by hunting trips, family outings, childhood games, and scenic flashes of sun, seashore, mountain, and desert.

Camus attended elementary school at the local Ecole Communale, and it was there that he encountered the first in a series of teacher-mentors who recognized and nurtured the young boy’s lively intelligence. These father figures introduced him to a new world of history and imagination and to literary landscapes far beyond the dusty streets of Belcourt and working-class poverty. Though stigmatized as a pupille de la nation (that is, a war veteran’s child dependent on public welfare) and hampered by recurrent health issues, Camus distinguished himself as a student and was eventually awarded a scholarship to attend high school at the Grand Lycee. Located near the famous Kasbah district, the school brought him into close proximity with the native Muslim community and thus gave him an early recognition of the idea of the “outsider” that would dominate his later writings.

It was in secondary school that Camus became an avid reader (absorbing Gide, Proust, Verlaine, and Bergson, among others), learned Latin and English, and developed a lifelong interest in literature, art, theatre, and film. He also enjoyed sports, especially soccer, of which he once wrote (recalling his early experience as a goal-keeper): “I learned . . . that a ball never arrives from the direction you expected it. That helped me in later life, especially in mainland France, where nobody plays straight.” It was also during this period that Camus suffered his first serious attack of tuberculosis, a disease that was to afflict him, on and off, throughout his career.

By the time he finished his Baccalauréat degree in June 1932, Camus was already contributing articles to Sud, a literary monthly, and looking forward to a career in journalism, the arts, or higher education. The next four years (1933-37) were an especially busy period in his life during which he attended college, worked at odd jobs, married his first wife (Simone Hié), divorced, briefly joined the Communist party, and effectively began his professional theatrical and writing career. Among his various employments during the time were stints of routine office work where one job consisted of a Bartleby-like recording and sifting of meteorological data and another involved paper shuffling in an auto license bureau. One can well imagine that it was as a result of this experience that his famous conception of Sisyphean struggle, heroic defiance in the face of the Absurd, first began to take shape within his imagination.

In 1933, Camus enrolled at the University of Algiers to pursue his diplome d’etudes superieures, specializing in philosophy and gaining certificates in sociology and psychology along the way. In 1936, he became a co-founder, along with a group of young fellow intellectuals, of the Théâtre du Travail, a professional acting company specializing in drama with left-wing political themes. Camus served the company as both an actor and director and also contributed scripts, including his first published play Revolt in Asturias, a drama based on the ill-fated 1934 workers’ revolt in Asturias, Spain. That same year Camus also earned his degree and completed his dissertation, a study of the influence of Plotinus and neo-Platonism on the thought and writings of St. Augustine.

Over the next three years Camus further established himself as an emerging author, journalist, and theatre professional. After his disillusionment with and eventual expulsion from the Communist Party, he reorganized his dramatic company and renamed it the Théâtre de l’Equipe (literally the Theater of the Team). The name change signaled a new emphasis on classic drama and avant-garde aesthetics and a shift away from labor politics and agitprop. In 1938 he joined the staff of a new daily newspaper, the Alger Républicain, where his assignments as a reporter and reviewer covered everything from contemporary European literature to local political trials. It was during this period that he also published his first two literary works—Betwixt and Between, a collection of five short semi-autobiographical and philosophical pieces (1937) and Nuptials, a series of lyrical celebrations interspersed with political and philosophical reflections on North Africa and the Mediterranean.

The 1940s witnessed Camus’s gradual ascendance to the rank of world-class literary intellectual. He started the decade as a locally acclaimed author and playwright, a figure virtually unknown outside the city of Algiers; he ended it as an internationally recognized novelist, dramatist, journalist, philosophical essayist, and champion of freedom. This period of his life began inauspiciously—war in Europe, the occupation of France, official censorship, and a widening crackdown on left-wing journals. Camus was still without stable employment or steady income when, after marrying his second wife, Francine Faure, in December of 1940, he departed Lyons, where he had been working as a journalist, and returned to Algeria. To help make ends meet, he taught part-time (French history and geography) at a private school in Oran. All the while he was putting the finishing touches to his first novel, The Stranger, which was finally published in 1942 to favorable critical response, including a lengthy and penetrating review by Jean-Paul Sartre. The novel propelled him into immediate literary renown.

Camus returned to France in 1942 and a year later began working for the clandestine newspaper Combat, the journalistic arm and voice of the French Resistance movement. During this period, while contending with recurrent bouts of tuberculosis, he also published The Myth of Sisyphus, his philosophical anatomy of suicide and the absurd, and joined Gallimard Publishing as an editor, a position he held until his death.

After the Liberation, Camus continued as editor of Combat, oversaw the production and publication of two plays, The Misunderstanding and Caligula, and assumed a leading role in Parisian intellectual society in the company of Sartre and Simone de Beauvoir, among others. In the late 1940s his growing reputation as a writer and thinker was enlarged by the publication of The Plague, an allegorical novel and fictional parable of the Nazi Occupation and the duty of revolt, and by lecture tours to the United States and South America. In 1951 he published The Rebel, a reflection on the nature of freedom and rebellion and a philosophical critique of revolutionary violence. This powerful and controversial work, with its explicit condemnation of Marxism-Leninism and its emphatic denunciation of unrestrained violence as a means of human liberation, led to an eventual falling out with Sartre and, along with his opposition to the Algerian National Liberation Front, to his being branded a reactionary in the view of many European Communists. Yet his position also established him as an outspoken champion of individual freedom and as an impassioned critic of tyranny and terrorism, whether practiced by the Left or by the Right.

In 1956, Camus published the short, confessional novel The Fall, which unfortunately would be the last of his completed major works and which in the opinion of some critics is the most elegant and most underrated of all his books. During this period he was still afflicted by tuberculosis and was perhaps even more sorely beset by the deteriorating political situation in his native Algeria—which had by now escalated from demonstrations and occasional terrorist and guerilla attacks into open violence and insurrection. Camus still hoped to champion some kind of rapprochement that would allow the native Muslim population and the French pied-noir minority to live together peaceably in a new de-colonized and largely integrated, if not fully independent, nation. Alas, by this point, as he painfully realized, the chances of such an outcome were becoming increasingly remote.

In the fall of 1957, following publication of Exile and the Kingdom, a collection of short fiction, Camus was shocked by news that he had been awarded the Nobel Prize for literature. He absorbed the announcement with mixed feelings of gratitude, humility, and amazement. On the one hand, the award was obviously a tremendous honor. On the other, not only did he feel that his friend and esteemed fellow novelist Andre Malraux was more deserving, he was also aware that the Nobel itself was widely regarded as the kind of accolade usually given to artists at the end of a long career. Yet, as he indicated in his acceptance speech at Stockholm, he considered his own career as still in mid-flight, with much yet to accomplish and even greater writing challenges ahead:

Every person, and assuredly every artist, wants to be recognized. So do I. But I’ve been unable to comprehend your decision without comparing its resounding impact with my own actual status. A man almost young, rich only in his doubts, and with his work still in progress…how could such a man not feel a kind of panic at hearing a decree that transports him all of a sudden…to the center of a glaring spotlight? And with what feelings could he accept this honor at a time when other writers in Europe, among them the very greatest, are condemned to silence, and even at a time when the country of his birth is going through unending misery?

Of course Camus could not have known as he spoke these words that most of his writing career was in fact behind him. Over the next two years, he published articles and continued to write, produce, and direct plays, including his own adaptation of Dostoyevsky’s The Possessed. He also formulated new concepts for film and television, assumed a leadership role in a new experimental national theater, and continued to campaign for peace and a political solution in Algeria. Unfortunately, none of these latter projects would be brought to fulfillment. On January 4, 1960, Camus died tragically in a car accident while he was a passenger in a vehicle driven by his friend and publisher Michel Gallimard, who also suffered fatal injuries. The author was buried in the local cemetery at Lourmarin, a village in Provence where he and his wife and daughters had lived for nearly a decade.

Upon hearing of Camus’s death, Sartre wrote a moving eulogy in the France-Observateur, saluting his former friend and political adversary not only for his distinguished contributions to French literature but especially for the heroic moral courage and “stubborn humanism” which he brought to bear against the “massive and deformed events of the day.”

2. Literary Career

According to Sartre’s perceptive appraisal, Camus was less a novelist and more a writer of philosophical tales and parables in the tradition of Voltaire. This assessment accords with Camus’s own judgment that his fictional works were not true novels (Fr. romans), a form he associated with the densely populated and richly detailed social panoramas of writers like Balzac, Tolstoy, and Proust, but rather contes (“tales”) and recits (“narratives”) combining philosophical and psychological insights.

In this respect, it is also worth noting that at no time in his career did Camus ever describe himself as a deep thinker or lay claim to the title of philosopher. Instead, he nearly always referred to himself simply, yet proudly, as un ecrivain—a writer. This is an important fact to keep in mind when assessing his place in intellectual history and in twentieth-century philosophy, for by no means does he qualify as a system-builder or theorist or even as a disciplined thinker. He was instead (and here again Sartre’s assessment is astute) a sort of all-purpose critic and modern-day philosophe: a debunker of mythologies, a critic of fraud and superstition, an enemy of terror, a voice of reason and compassion, and an outspoken defender of freedom—all in all a figure very much in the Enlightenment tradition of Voltaire and Diderot. For this reason, in assessing Camus’s career and work, it may be best simply to take him at his own word and characterize him first and foremost as a writer—advisedly attaching the epithet “philosophical” for sharper accuracy and definition.

3. Camus, Philosophical Literature, and the Novel of Ideas

To pin down exactly why and in what distinctive sense Camus may be termed a philosophical writer, we can begin by comparing him with other authors who have merited the designation. Right away, we can eliminate any comparison with the efforts of Lucretius and Dante, who undertook to unfold entire cosmologies and philosophical systems in epic verse. Camus obviously attempted nothing of the sort. On the other hand, we can draw at least a limited comparison between Camus and writers like Pascal, Kierkegaard, and Nietzsche—that is, with writers who were first of all philosophers or religious writers, but whose stylistic achievements and literary flair gained them a special place in the pantheon of world literature as well. Here we may note that Camus himself was very conscious of his debt to Kierkegaard and Nietzsche (especially in the style and structure of The Myth of Sisyphus and The Rebel) and that he might very well have followed in their literary-philosophical footsteps if his tuberculosis had not side-tracked him into fiction and journalism and prevented him from pursuing an academic career.

Perhaps Camus himself best defined his own particular status as a philosophical writer when he wrote (with authors like Melville, Stendhal, Dostoyevsky, and Kafka especially in mind): “The great novelists are philosophical novelists”; that is, writers who eschew systematic explanation and create their discourse using “images instead of arguments” (The Myth of Sisyphus 74).

By his own definition then Camus is a philosophical writer in the sense that he has (a) conceived his own distinctive and original world-view and (b) sought to convey that view mainly through images, fictional characters and events, and via dramatic presentation rather than through critical analysis and direct discourse. He is also both a novelist of ideas and a psychological novelist, and in this respect, he certainly compares most closely to Dostoyevsky and Sartre, two other writers who combine a unique and distinctly philosophical outlook, acute psychological insight, and a dramatic style of presentation. (Like Camus, Sartre was a productive playwright, and Dostoyevsky remains perhaps the most dramatic of all novelists, as Camus clearly understood, having adapted both The Brothers Karamazov and The Possessed for the stage.)

4. Works

Camus’s reputation rests largely on the three novels published during his lifetime—The Stranger, The Plague, and The Fall—and on his two major philosophical essays—The Myth of Sisyphus and The Rebel. However, his body of work also includes a collection of short fiction, Exile and the Kingdom; an autobiographical novel, The First Man; a number of dramatic works, most notably Caligula, The Misunderstanding, The State of Siege, and The Just Assassins; several translations and adaptations, including new versions of works by Calderon, Lope de Vega, Dostoyevsky, and Faulkner; and a lengthy assortment of essays, prose pieces, critical reviews, transcribed speeches and interviews, articles, and works of journalism. A brief summary and description of the most important of Camus’s writings is presented below as preparation for a larger discussion of his philosophy and world-view, including his main ideas and recurrent philosophical themes.

a. Fiction

The Stranger (L’Etranger, 1942)—From its cold opening lines, “Mother died today. Or maybe yesterday; I can’t be sure,” to its bleak concluding image of a public execution set to take place beneath the “benign indifference of the universe,” Camus’s first and most famous novel takes the form of a terse, flat, first-person narrative by its main character Meursault, a very ordinary young man of unremarkable habits and unemotional affect who, inexplicably and in an almost absent-minded way, kills an Arab and then is arrested, tried, convicted, and sentenced to death. The neutral style of the novel—typical of what the critic Roland Barthes called “writing degree zero”—serves as a perfect vehicle for the descriptions and commentary of its anti-hero narrator, the ultimate “outsider” and a person who seems to observe everything, including his own life, with almost pathological detachment.

The Plague (La Peste, 1947)—Set in the coastal town of Oran, Camus’s second novel is the story of an outbreak of plague, traced from its subtle, insidious, unheeded beginnings and horrible, seemingly irresistible dominion to its eventual climax and decline, all told from the viewpoint of one of the survivors. Camus made no effort to conceal the fact that his novel was partly based on and could be interpreted as an allegory or parable of the rise of Nazism and the nightmare of the Occupation. However, the plague metaphor is both more complicated and more flexible than that, extending to signify the Absurd in general as well as any calamity or disaster that tests the mettle of human beings, their endurance, their solidarity, their sense of responsibility, their compassion, and their will. At the end of the novel, the plague finally retreats, and the narrator reflects that a time of pestilence teaches “that there is more to admire in men than to despise,” but he also knows “that the plague bacillus never dies or disappears for good,” that “the day would come when, for the bane and the enlightening of men, it would rouse up its rats again” and send them forth yet once more to spread death and contagion into a happy and unsuspecting city.

The Fall (La Chute, 1956)—Camus’s third novel, and the last to be published during his lifetime, is in effect an extended dramatic monologue spoken by M. Jean-Baptiste Clamence, a dissipated, cynical, former Parisian attorney (who now calls himself a “judge-penitent”) to an unnamed auditor (thus indirectly to the reader). Set in a seedy bar in the red-light district of Amsterdam, the work is a small masterpiece of compression and style: a confessional (and semi-autobiographical) novel, an arresting character study and psychological portrait, and at the same time a wide-ranging philosophical discourse on guilt and innocence, expiation and punishment, good and evil.

b. Drama

Camus began his literary career as a playwright and theatre director and was planning new dramatic works for film, stage, and television at the time of his death. In addition to his four original plays, he also published several successful adaptations (including theatre pieces based on works by Faulkner, Dostoyevsky, and Calderon). He took particular pride in his work as a dramatist and man of the theatre. However, his plays never achieved the same popularity, critical success, or level of incandescence as his more famous novels and major essays.

Caligula (1938, first produced 1945)—“Men die and are not happy.” Such is the complaint against the universe pronounced by the young emperor Caligula, who in Camus’s play is less the murderous lunatic, slave to incest, narcissist, and megalomaniac of Roman history than a theatrical martyr-hero of the Absurd: a man who carries his philosophical quarrel with the meaninglessness of human existence to a kind of fanatical but logical extreme. Camus described his hero as a man “obsessed with the impossible” willing to pervert all values, and if necessary destroy himself and all those around him in the pursuit of absolute liberty. Caligula was Camus’s first attempt at portraying a figure in absolute defiance of the Absurd, and through three revisions of the play over a period of several years he eventually achieved a remarkable composite by adding to Caligula’s original portrait touches of Sade, of revolutionary nihilism, of the Nietzschean Superman, of his own version of Sisyphus, and even of Mussolini and Hitler.

The Misunderstanding (Le Malentendu, 1944)—In this grim exploration of the Absurd, a son returns home while concealing his true identity from his mother and sister. The two women operate a boarding house where, in order to make ends meet, they quietly murder and rob their patrons. Through a tangle of misunderstanding and mistaken identity they wind up murdering their unrecognized visitor. Camus has explained the drama as an attempt to capture the atmosphere of malaise, corruption, demoralization, and anonymity that he experienced while living in France during the German occupation. Despite the play’s dark themes and bleak style, he described its philosophy as ultimately optimistic: “It amounts to saying that in an unjust or indifferent world man can save himself, and save others, by practicing the most basic sincerity and pronouncing the most appropriate word.”

State of Siege (L’Etat de Siege, 1948)—This odd allegorical drama combines features of the medieval morality play with elements of Calderon and the Spanish baroque; it also has apocalyptic themes, bits of music hall comedy, and a collection of avant-garde theatrics thrown in for good measure. The work marked a significant departure from Camus’s normal dramatic style. It also resulted in virtually universal disapproval and negative reviews from Paris theatre-goers and critics, many of whom came expecting a play based on Camus’s recent novel The Plague. The play is set in the Spanish seaport city of Cadiz, famous for its beaches, carnivals, and street musicians. By the end of the first act, the normally laid-back and carefree citizens fall under the dominion of a gaudily beribboned and uniformed dictator named Plague (based on Generalissimo Franco) and his officious, clipboard-wielding Secretary (who turns out to be a modern, bureaucratic incarnation of the medieval figure Death). One of the prominent concerns of the play is the Orwellian theme of the degradation of language via totalitarian politics and bureaucracy (symbolized onstage by calls for silence, scenes in pantomime, and a gagged chorus). As one character observes, “we are steadily nearing that perfect moment when nothing anybody says will rouse the least echo in another’s mind.”

The Just Assassins (Les Justes, 1950)—First performed in Paris to largely favorable reviews, this play is based on real-life characters and an actual historical event: the 1905 assassination of the Russian Grand Duke Sergei Alexandrovich by Ivan Kalyayev and fellow members of the Combat Organization of the Socialist Revolutionary Party. The play effectively dramatizes the issues that Camus would later explore in detail in The Rebel, especially the question of whether acts of terrorism and political violence can ever be morally justified (and if so, with what limitations and in what specific circumstances). The historical Kalyayev passed up his original opportunity to bomb the Grand Duke’s carriage because the Duke was accompanied by his wife and two young nephews. However, this was no act of conscience on Kalyayev’s part but a purely practical decision based on his calculation that the murder of children would prove a setback to the revolution. After the successful completion of his bombing mission and subsequent arrest, Kalyayev welcomed his execution on similarly practical and purely political grounds, believing that his death would further the cause of revolution and social justice. Camus’s Kalyayev, on the other hand, is a far more agonized and conscientious figure, neither so cold-blooded nor so calculating as his real-life counterpart. Upon seeing the two children in the carriage, he refuses to toss his bomb not because doing so would be politically inexpedient but because he is overcome emotionally, temporarily unnerved by the sad expression in their eyes.  Similarly, at the end of the play he embraces his death not so much because it will aid the revolution, but almost as a form of karmic penance, as if it were indeed some kind of sacred duty or metaphysical requirement that must be performed in order for true justice to be achieved.

c. Essays, Letters, Prose Collections, Articles, and Reviews

Betwixt and Between (L’Envers et l’endroit, 1937)—This short collection of semi-autobiographical, semi-fictional, philosophical pieces might be dismissed as juvenilia and largely ignored if it were not for the fact that it represents Camus’s first attempt to formulate a coherent life-outlook and world-view. The collection, which in a way serves as a germ or starting point for the author’s later philosophy, consists of five lyrical essays. In “Irony” (“L’Ironie”), a reflection on youth and age, Camus asserts, in the manner of a young disciple of Pascal, our essential solitariness in life and death. In “Between yes and no” (“Entre Oui et Non”) he suggests that to hope is as empty and as pointless as to despair, yet he goes beyond nihilism by positing a fundamental value to existence-in-the-world. In “Death in the soul” (“La Mort dans l’ame”) he supplies a sort of existential travel review, contrasting his impressions of central and Eastern Europe (which he views as purgatorial and morgue-like) with the more spontaneous life of Italy and Mediterranean culture. The piece thus affirms the author’s lifelong preference for the color and vitality of the Mediterranean world, and especially North Africa, as opposed to what he perceives as the soulless cold-heartedness of modern Europe. In “Love of life” (“Amour de vivre”) he claims there can be no love of life without despair of life and thus largely re-asserts the essentially tragic, ancient Greek view that the very beauty of human existence is largely contingent upon its brevity and fragility. The concluding essay, “Betwixt and between” (“L’Envers et l’endroit”), summarizes and re-emphasizes the Romantic themes of the collection as a whole: our fundamental “aloneness,” the importance of imagination and openness to experience, the imperative to “live as if….”

Nuptials (Noces, 1938)—This collection of four rhapsodic narratives supplements and amplifies the youthful philosophy expressed in Betwixt and Between. That joy is necessarily intertwined with despair, that the shortness of life confers a premium on intense experience, and that the world is both beautiful and violent—these are, once again, Camus’s principal themes. “Summer in Algiers,” which is probably the best (and best-known) of the essays in the collection, is a lyrical, at times almost ecstatic, celebration of sea, sun, and the North African landscape. Affirming a defiantly atheistic creed, Camus concludes with one of the core ideas of his philosophy: “If there is a sin against life, it consists not so much in despairing as in hoping for another life and in eluding the implacable grandeur of this one.”

The Myth of Sisyphus (Le Mythe de Sisyphe, 1943)—If there is a single non-fiction work that can be considered an essential or fundamental statement of Camus’s philosophy, it is this extended essay on the ethics of suicide (eventually translated and repackaged for American publication in 1955). It is here that Camus formally introduces and fully articulates his most famous idea, the concept of the Absurd, and his equally famous image of life as a Sisyphean struggle. From its provocative opening sentence—“There is but one truly serious philosophical problem, and that is suicide”—to its stirring, paradoxical conclusion—“The struggle itself toward the heights is enough to fill a man’s heart. One must imagine Sisyphus happy”—the book has something interesting and challenging on nearly every page and is shot through with brilliant aphorisms and insights. In the end, Camus rejects suicide: the Absurd must not be evaded either by religion (“philosophical suicide”) or by annihilation (“physical suicide”); the task of living should not merely be accepted, it must be embraced.

The Rebel (L’Homme Revolte, 1951)—Camus considered this work a continuation of the critical and philosophical investigation of the Absurd that he began with The Myth of Sisyphus. Only this time his primary concern is not suicide but murder. He takes up the question of whether acts of terrorism and political violence can be morally justified, which is basically the same question he had addressed earlier in his play The Just Assassins. After arguing that an authentic life inevitably involves some form of conscientious moral revolt, Camus winds up concluding that only in rare and very narrowly defined instances is political violence justified. Camus’s critique of revolutionary violence and terror in this work, and particularly his caustic assessment of Marxism-Leninism (which he accused of sacrificing innocent lives on the altar of History), touched nerves throughout Europe and led in part to his celebrated feud with Sartre and other French leftists.

Resistance, Rebellion, and Death (1960)—This posthumous collection is of interest to students of Camus mainly because it brings together an unusual assortment of his non-fiction writings on a wide range of topics, from art and politics to the advantages of pessimism and the virtues (from a non-believer’s standpoint) of Christianity. Of special interest are two pieces that helped secure Camus’s worldwide reputation as a voice of liberty: “Letters to a German Friend,” a set of four letters originally written during the Nazi Occupation, and “Reflections on the Guillotine,” a denunciation of the death penalty cited for special mention by the Nobel committee and eventually revised and re-published as a companion essay to go with fellow death-penalty opponent Arthur Koestler’s “Reflections on Hanging.”

5. Philosophy

To re-emphasize a point made earlier, Camus considered himself first and foremost a writer (un ecrivain). Indeed, Camus’s dissertation advisor penciled onto his dissertation the assessment “More a writer than a philosopher.” And at various times in his career he also accepted the labels journalist, humanist, novelist, and even moralist. However, he apparently never felt comfortable identifying himself as a philosopher—a term he seems to have associated with rigorous academic training, systematic thinking, logical consistency, and a coherent, carefully defined doctrine or body of ideas.

This is not to suggest that Camus lacked ideas or to say that his thought cannot be considered a personal philosophy. It is simply to point out that he was not a systematic, or even notably disciplined, thinker and that, unlike Heidegger and Sartre, for example, he showed very little interest in metaphysics and ontology, which seems to be one of the reasons he consistently denied that he was an existentialist. In short, he was not much given to speculative philosophy or any kind of abstract theorizing. His thought is instead nearly always related to current events (e.g., the Spanish Civil War, the revolt in Algeria) and is consistently grounded in down-to-earth moral and political reality.

a. Background and Influences

Though he was baptized, raised, and educated as a Catholic and invariably respectful towards the Church, Camus seems to have been a natural-born pagan who showed almost no instinct whatsoever for belief in the supernatural. Even as a youth, he was more of a sun-worshipper and nature lover than a boy notable for his piety or religious faith. On the other hand, there is no denying that Christian literature and philosophy served as an important influence on his early thought and intellectual development. As a young high school student, Camus studied the Bible, read and savored the Spanish mystics St. Theresa of Avila and St. John of the Cross, and was introduced to the thought of St. Augustine. St. Augustine would later serve as the subject of his dissertation and become—as a fellow North African writer, quasi-existentialist, and conscientious observer-critic of his own life—an important lifelong influence.

In college Camus absorbed Kierkegaard, who, after Augustine, was probably the single greatest Christian influence on his thought. He also studied Schopenhauer and Nietzsche—undoubtedly the two writers who did the most to set him on his own path of defiant pessimism and atheism. Other notable influences include not only the major modern philosophers from the academic curriculum—from Descartes and Spinoza to Bergson—but also, and just as importantly, philosophical writers like Stendhal, Melville, Dostoyevsky, and Kafka.

b. Development

The two earliest expressions of Camus’s personal philosophy are his works Betwixt and Between (1937) and Nuptials (1938). Here he unfolds what is essentially a hedonistic, indeed almost primitivistic, celebration of nature and the life of the senses. In the Romantic poetic tradition of writers like Rilke and Wallace Stevens, he offers a forceful rejection of all hereafters and an emphatic embrace of the here and now. There is no salvation, he argues, no transcendence; there is only the enjoyment of consciousness and natural being. One life, this life, is enough. Sky and sea, mountain and desert, have their own beauty and magnificence and constitute a sufficient heaven.

The critic John Cruikshank termed this stage in Camus’s thinking “naïve atheism” and attributed it to his ecstatic and somewhat immature “Mediterraneanism.” Naïve seems an apt characterization for a philosophy that is romantically bold and uncomplicated yet somewhat lacking in sophistication and logical clarity. On the other hand, if we keep in mind Camus’s theatrical background and preference for dramatic presentation, there may actually be more depth and complexity to his thought here than meets the eye. That is to say, just as it would be simplistic and reductive to equate Camus’s philosophy of revolt with that of his character Caligula (who is at best a kind of extreme or mad spokesperson for the author), so in the same way it is possible that the pensées and opinions presented in Nuptials and Betwixt and Between are not so much the views of Camus as they are poetically heightened observations of an artfully crafted narrator—an exuberant alter ego who is far more spontaneous and free-spirited than his more naturally reserved and sober-minded author.

In any case, regardless of this assessment of the ideas expressed in Betwixt and Between and Nuptials, it is clear that these early writings represent an important, if comparatively raw and simple, beginning stage in Camus’s development as a thinker where his views differ markedly from his more mature philosophy in several noteworthy respects. In the first place, the Camus of Nuptials is still a young man of twenty-five, aflame with youthful joie de vivre. He favors a life of impulse and daring as it was honored and practiced in both Romantic literature and in the streets of Belcourt. Recently married and divorced, raised in poverty and in close quarters, beset with health problems, this young man develops an understandable passion for clear air, open space, colorful dreams, panoramic vistas, and the breath-taking prospects and challenges of the larger world. Consequently, the Camus of the period 1937-38 is a decidedly different writer from the Camus who will ascend the dais at Stockholm nearly twenty years later.

The young Camus is more of a sensualist and pleasure-seeker, more of a dandy and aesthete, than the more hardened and austere figure who will endure the Occupation while serving in the French underground. He is a writer passionate in his conviction that life ought to be lived vividly and intensely—indeed rebelliously (to use the term that will take on increasing importance in his thought). He is also a writer attracted to causes, though he is not yet the author who will become world-famous for his moral seriousness and passionate commitment to justice and freedom. All of which is understandable. After all, the Camus of the middle 1930s had not yet witnessed and absorbed the shattering spectacle and disillusioning effects of the Spanish Civil War, the rise of Fascism, Hitlerism, and Stalinism, the coming into being of total war and weapons of mass destruction, and the terrible reign of genocide and terror that would characterize the period 1938-1945. It was under the pressure and in direct response to the events of this period that Camus’s mature philosophy—with its core set of humanistic themes and ideas—emerged and gradually took shape. That mature philosophy is no longer a “naïve atheism” but a very reflective and critical brand of unbelief. It is proudly and inconsolably pessimistic, but not in a polemical or overbearing way. It is unbending, hardheaded, determinedly skeptical. It is tolerant and respectful of world religious creeds, but at the same time wholly unsympathetic to them. In the end it is an affirmative philosophy that accepts and approves, and in its own way blesses, our dreadful mortality and our fundamental isolation in the world.

c. Themes and Ideas

Regardless of whether he is producing drama, fiction, or non-fiction, Camus in his mature writings nearly always takes up and re-explores the same basic philosophical issues. These recurrent topoi constitute the key components of his thought. They include themes like the Absurd, alienation, suicide, and rebellion that almost automatically come to mind whenever his name is mentioned. Hence any summary of his place in modern philosophy would be incomplete without at least a brief discussion of these ideas and how they fit together to form a distinctive and original world-view.

i. The Absurd

Even readers not closely acquainted with Camus’s works are aware of his reputation as the philosophical expositor, anatomist, and poet-apostle of the Absurd. Indeed, as even sitcom writers and stand-up comics apparently understand (odd fact: the comic-bleak final episode of Seinfeld has been compared to The Stranger, and Camus’s thought has been used to explain episodes of The Simpsons), it is largely through the thought and writings of the French-Algerian author that the concept of absurdity has become a part not only of world literature and twentieth-century philosophy but also of modern popular culture.

What then is meant by the notion of the Absurd? Contrary to the view conveyed by popular culture, the Absurd (at least in Camus’s terms) does not simply refer to some vague perception that modern life is fraught with paradoxes, incongruities, and intellectual confusion. (Although that perception is certainly consistent with his formula.) Instead, as he himself emphasizes and tries to make clear, the Absurd expresses a fundamental disharmony, a tragic incompatibility, in our existence. In effect, he argues that the Absurd is the product of a collision or confrontation between our human desire for order, meaning, and purpose in life and the blank, indifferent “silence of the universe.” (“The absurd is not in man nor in the world,” Camus explains, “but in their presence together . . . it is the only bond uniting them.”)

So here we are: poor creatures desperately seeking hope and meaning in a hopeless, meaningless world. Sartre, in his essay-review of The Stranger, provides an additional gloss on the idea: “The absurd, to be sure, resides neither in man nor in the world, if you consider each separately. But since man’s dominant characteristic is ‘being in the world,’ the absurd is, in the end, an inseparable part of the human condition.” The Absurd, then, presents itself in the form of an existential opposition. It arises from the human demand for clarity and transcendence on the one hand and a cosmos that offers nothing of the kind on the other. Such is our fate: we inhabit a world that is indifferent to our sufferings and deaf to our protests.

In Camus’s view there are three possible philosophical responses to this predicament. Two of these he condemns as evasions, and the other he puts forward as a proper solution.

The first choice is blunt and simple: physical suicide. If we decide that a life without some essential purpose or meaning is not worth living, we can simply choose to kill ourselves. Camus rejects this choice as cowardly. In his terms it is a repudiation or renunciation of life, not a true revolt.

The second choice is the religious solution of positing a transcendent world of solace and meaning beyond the Absurd. Camus calls this solution “philosophical suicide” and rejects it as transparently evasive and fraudulent. To adopt a supernatural solution to the problem of the Absurd (for example, through some type of mysticism or leap of faith) is to annihilate reason, which in Camus’s view is as fatal and self-destructive as physical suicide. In effect, instead of removing himself from the absurd confrontation of self and world like the physical suicide, the religious believer simply removes the offending world and replaces it, via a kind of metaphysical abracadabra, with a more agreeable alternative.

The third choice—in Camus’s view the only authentic and valid solution—is simply to accept absurdity, or better yet to embrace it, and to continue living. Since the Absurd in his view is an unavoidable, indeed defining, characteristic of the human condition, the only proper response to it is full, unflinching, courageous acceptance. Life, he says, can “be lived all the better if it has no meaning.”

The example par excellence of this option of spiritual courage and metaphysical revolt is the mythical Sisyphus of Camus’s philosophical essay. Doomed to eternal labor at his rock, fully conscious of the essential hopelessness of his plight, Sisyphus nevertheless pushes on. In doing so he becomes for Camus a superb icon of the spirit of revolt and of the human condition. To rise each day to fight a battle you know you cannot win, and to do this with wit, grace, compassion for others, and even a sense of mission, is to face the Absurd in a spirit of true heroism.

Over the course of his career, Camus examines the Absurd from multiple perspectives and through the eyes of many different characters—from the mad Caligula, who is obsessed with the problem, to the strangely aloof and yet simultaneously self-absorbed Meursault, who seems indifferent to it even as he exemplifies and is finally victimized by it. In The Myth of Sisyphus, Camus traces it in specific characters of legend and literature (Don Juan, Ivan Karamazov) and also in certain character types (the Actor, the Conqueror), all of whom may be understood as in some way a version or manifestation of Sisyphus, the archetypal absurd hero.

[Note: A rather different, yet possibly related, notion of the Absurd is proposed and analyzed in the work of Kierkegaard, especially in Fear and Trembling and Repetition. For Kierkegaard, however, the Absurd describes not an essential and universal human condition, but the special condition and nature of religious faith—a paradoxical state in which matters of will and perception that are objectively impossible can nevertheless be ultimately true. Though it is hard to say whether Camus had Kierkegaard particularly in mind when he developed his own concept of the absurd, there can be little doubt that Kierkegaard’s knight of faith is in certain ways an important predecessor of Camus’s Sisyphus: both figures are involved in impossible and endlessly agonizing tasks, which they nevertheless confidently and even cheerfully pursue. In the knight’s quixotic defiance and solipsism, Camus found a model for his own ideal of heroic affirmation and philosophical revolt.]

ii. Revolt

The companion theme to the Absurd in Camus’s oeuvre (and the only other philosophical topic to which he devoted an entire book) is the idea of Revolt. What is revolt? Simply defined, it is the Sisyphean spirit of defiance in the face of the Absurd. More technically and less metaphorically, it is a spirit of opposition against any perceived unfairness, oppression, or indignity in the human condition.

Rebellion in Camus’s sense begins with a recognition of boundaries, of limits that define one’s essential selfhood and core sense of being and thus must not be infringed—as when a slave stands up to his master and says in effect “thus far, and no further, shall I be commanded.” This defining of the self as at some point inviolable appears to be an act of pure egoism and individualism, but it is not. In fact Camus argues at considerable length to show that an act of conscientious revolt is ultimately far more than just an individual gesture or an act of solitary protest. The rebel, he writes, holds that there is a “common good more important than his own destiny” and that there are “rights more important than himself.” He acts “in the name of certain values which are still indeterminate but which he feels are common to himself and to all men” (The Rebel 15-16).

Camus then goes on to assert that an “analysis of rebellion leads at least to the suspicion that, contrary to the postulates of contemporary thought, a human nature does exist, as the Greeks believed.” After all, “Why rebel,” he asks, “if there is nothing permanent in the self worth preserving?” The slave who stands up and asserts himself actually does so for “the sake of everyone in the world.” He declares in effect that “all men—even the man who insults and oppresses him—have a natural community.” Here we may note that the idea that there may indeed be an essential human nature is actually more than a “suspicion” as far as Camus himself was concerned. Indeed for him it was more like a fundamental article of his humanist faith. In any case it represents one of the core principles of his ethics and is one of the tenets that sets his philosophy apart from existentialism.

True revolt, then, is performed not just for the self but also in solidarity with and out of compassion for others. And for this reason, Camus is led to conclude that revolt too has its limits. If it begins with and necessarily involves a recognition of human community and a common human dignity, it cannot, without betraying its own true character, treat others as if they were lacking in that dignity or not a part of that community. In the end it is remarkable, and indeed surprising, how closely Camus’s philosophy of revolt, despite the author’s fervent atheism and individualism, echoes Kantian ethics with its prohibition against treating human beings as means and its ideal of the human community as a kingdom of ends.

iii. The Outsider

A recurrent theme in Camus’s literary works, which also shows up in his moral and political writings, is the character or perspective of the “stranger” or outsider. Meursault, the laconic narrator of The Stranger, is the most obvious example. He seems to observe everything, even his own behavior, from an outside perspective. Like an anthropologist, he records his observations with clinical detachment at the same time that he is warily observed by the community around him.

Camus came by this perspective naturally. As a European in Africa, an African in Europe, an infidel among Muslims, a lapsed Catholic, a Communist Party drop-out, an underground resister (who at times had to use code names and false identities), a “child of the state” raised by a widowed mother (who was illiterate and virtually deaf and dumb), Camus lived most of his life in various groups and communities without really being integrated within them. This outside view, the perspective of the exile, became his characteristic stance as a writer. It explains both the cool, objective (“zero-degree”) precision of much of his work and also the high value he assigned to longed-for ideals of friendship, community, solidarity, and brotherhood.

iv. Guilt and Innocence

Throughout his writing career, Camus showed a deep interest in questions of guilt and innocence. Once again Meursault in The Stranger provides a striking example. Is he legally innocent of the murder he is charged with? Or is he technically guilty? On the one hand, there seems to have been no conscious intention behind his action. Indeed the killing takes place almost as if by accident, with Meursault in a kind of absent-minded daze, distracted by the sun. From this point of view, his crime seems surreal and his trial and subsequent conviction a travesty. On the other hand, it is hard for the reader not to share the view of other characters in the novel, especially Meursault’s accusers, witnesses, and jury, in whose eyes he seems to be a seriously defective human being—at best, a kind of hollow man and at worst, a monster of self-centeredness and insularity. That the character has evoked such a wide range of responses from critics and readers—from sympathy to horror—is a tribute to the psychological complexity and subtlety of Camus’s portrait.

Camus’s brilliantly crafted final novel, The Fall, continues his keen interest in the theme of guilt, this time via a narrator who is virtually obsessed with it. The significantly named Jean-Baptiste Clamence (a voice in the wilderness calling for clemency and forgiveness) is tortured by guilt in the wake of a seemingly casual incident. While strolling home one drizzly November evening, he shows little concern and almost no emotional reaction at all to the suicidal plunge of a young woman into the Seine. But afterwards the incident begins to gnaw at him, and eventually he comes to view his inaction as typical of a long pattern of personal vanity and as a colossal failure of human sympathy on his part. Wracked by remorse and self-loathing, he gradually descends into a figurative hell. Formerly an attorney, he is now a self-described “judge-penitent” (a combination sinner, tempter, prosecutor, and father-confessor) who shows up each night at his local haunt, a sailor’s bar near Amsterdam’s red light district, where, somewhat in the manner of Coleridge’s Ancient Mariner, he recounts his story to whoever will hear it. In the final sections of the novel, amid distinctly Christian imagery and symbolism, he declares his crucial insight that, despite our pretensions to righteousness, we are all guilty. Hence no human being has the right to pass final moral judgment on another.

In a final twist, Clamence asserts that his acid self-portrait is also a mirror for his contemporaries. Hence his confession is also an accusation—not only of his nameless companion (who serves as the mute auditor for his monologue) but ultimately of the hypocrite lecteur as well.

v. Christianity vs. “Paganism”

The theme of guilt and innocence in Camus’s writings relates closely to another recurrent tension in his thought: the opposition of Christian and pagan ideas and influences. At heart a nature-worshipper, and by instinct a skeptic and non-believer, Camus nevertheless retained a lifelong interest in, and respect for, Christian philosophy and literature. In particular, he seems to have recognized St. Augustine and Kierkegaard as intellectual kinsmen and writers with whom he shared a common passion for controversy, literary flourish, self-scrutiny, and self-dramatization. Christian images, symbols, and allusions abound in all his work (probably more so than in the writing of any other avowed atheist in modern literature), and Christian themes—judgment, forgiveness, despair, sacrifice, passion, and so forth—permeate the novels. (Meursault and Clamence, it is worth noting, are presented not just as sinners, devils, and outcasts, but in several instances explicitly, and not entirely ironically, as Christ figures.)

Meanwhile alongside and against this leitmotif of Christian images and themes, Camus sets the main components of his essentially pagan worldview. Like Nietzsche, he maintains a special admiration for Greek heroic values and pessimism and for classical virtues like courage and honor. What might be termed Romantic values also merit particular esteem within his philosophy: passion, absorption in pure being, an appreciation for and indeed a willingness to revel in raw sensory experience, the glory of the moment, the beauty of the world.

As a result of this duality of influence, Camus’s basic philosophical problem becomes how to reconcile his Augustinian sense of original sin (universal guilt) and rampant moral evil with his personal ideal of pagan primitivism (universal innocence) and with his conviction that the natural world and our life in it have intrinsic beauty and value. Can an absurd world have intrinsic value? Is authentic pessimism compatible with the view that there is an essential dignity to human life? Such questions raise the possibility that there may be deep logical inconsistencies within Camus’s philosophy, and some critics (notably Sartre) have suggested that these inconsistencies cannot be surmounted except through some sort of Kierkegaardian leap of faith on Camus’s part—in this case a leap leading to a belief not in God but in man.

Such a leap is certainly implied in an oft-quoted remark from Camus’s “Letter to a German Friend,” where he wrote: “I continue to believe that this world has no supernatural meaning…But I know that something in the world has meaning—man.” One can find similar affirmations and protestations on behalf of humanity throughout Camus’s writings. They are almost a hallmark of his philosophical style. Oracular and high-flown, they clearly have more rhetorical force than logical potency. On the other hand, if we are trying to locate Camus’s place in European philosophical tradition, they provide a strong clue as to where he properly belongs. Surprisingly, the sentiment here, a commonplace of the Enlightenment and of traditional liberalism, is much closer in spirit to the exuberant secular humanism of the Italian Renaissance than to the agnostic skepticism of contemporary post-modernism.

vi. Individual vs. History and Mass Culture

A primary theme of early twentieth-century European literature and critical thought is the rise of modern mass civilization and its suffocating effects of alienation and dehumanization. This became a pervasive theme by the time Camus was establishing his literary reputation. Anxiety over the fate of Western culture, already intense, escalated to apocalyptic levels with the sudden emergence of fascism, totalitarianism, and new technologies of coercion and death. Here then was a subject ready-made for a writer of Camus’s political and humanistic views. He responded to the occasion with typical force and eloquence.

In one way or another, the themes of alienation and dehumanization as by-products of an increasingly technical and automated world enter into nearly all of Camus’s works. Even his concept of the Absurd becomes multiplied by a social and economic world in which meaningless routines and mind-numbing repetitions predominate. The drudgery of Sisyphus is mirrored and amplified in the assembly line, the business office, the government bureau, and especially in the penal colony and concentration camp.

In line with this theme, the ever-ambiguous Meursault in The Stranger can be understood as both a depressing manifestation of the newly emerging mass personality (that is, as a figure devoid of basic human feelings and passions) and, conversely, as a lone hold-out, a last remaining specimen of the old Romanticism—and hence a figure who is viewed as both dangerous and alien by the robotic majority. Similarly, The Plague can be interpreted, on at least one level, as an allegory in which humanity must be preserved from the fatal pestilence of mass culture, which converts formerly free, autonomous, independent-minded human beings into a soulless new species.

At various times in the novel, Camus’s narrator describes the plague as if it were a dull but highly capable public official or bureaucrat:

“It was, above all, a shrewd, unflagging adversary; a skilled organizer, doing his work thoroughly and well.” (180)

“But it seemed the plague had settled in for good at its most virulent, and it took its daily toll of deaths with the punctual zeal of a good civil servant.” (235)

 This identification of the plague with oppressive civil bureaucracy and the routinization of charisma looks forward to the author’s play The State of Siege, where plague is used once again as a symbol for totalitarianism—only this time it is personified in an almost cartoonish way as a kind of overbearing government functionary or office manager from hell. Clad in a gaudy military uniform bedecked with ribbons and decorations, the character Plague (a satirical portrait of Generalissimo Francisco Franco—or El Caudillo as he liked to style himself) is closely attended by his personal Secretary and loyal assistant Death, depicted as a prim, officious female bureaucrat who also favors military garb and who carries an ever-present clipboard and notebook.

So Plague is a fascist dictator, and Death a solicitous commissar. Together these figures represent a system of pervasive control and bureaucratic micro-management that, for Camus, threatens the freedom of the individual in modern mass society.

In his reflections on this theme of post-industrial dehumanization, Camus differs from most other European writers (and especially from those on the Left) in viewing mass reform and revolutionary movements, including Marxism, as representing at least as great a threat to individual freedom as late-stage capitalism. Throughout his career he continued to cherish and defend old-fashioned virtues like personal courage and honor that other Left-wing intellectuals tended to view as reactionary or bourgeois.

vii. Suicide

Suicide is the central subject of The Myth of Sisyphus and serves as a background theme in Caligula and The Fall. In Caligula the mad title character, in a fit of horror and revulsion at the meaninglessness of life, would rather die—and bring the world down with him—than accept a cosmos that is indifferent to human fate or that will not submit to his individual will. In The Fall, a stranger’s act of suicide serves as the starting point for a bitter ritual of self-scrutiny and remorse on the part of the narrator.

Like Wittgenstein (who had a family history of suicide and suffered from bouts of depression), Camus considered suicide the fundamental issue for moral philosophy. However, unlike other philosophers who have written on the subject (from Cicero and Seneca to Montaigne and Schopenhauer), Camus seems uninterested in assessing the traditional motives and justifications for suicide (for instance, to avoid a long, painful, and debilitating illness or as a response to personal tragedy or scandal). Indeed, he seems interested in the problem only to the extent that it represents one possible response to the Absurd. His verdict on the matter is unqualified and clear: The only courageous and morally valid response to the Absurd is to continue living—“Suicide is not an option.”

viii. The Death Penalty

From the time he first heard the story of his father’s literal nausea and revulsion after witnessing a public execution, Camus began a vocal and lifelong opposition to the death penalty. Executions by guillotine were a common public spectacle in Algeria during his lifetime, but he refused to attend them and recoiled bitterly at their very mention.

Condemnation of capital punishment is both explicit and implicit in his writings. For example, in The Stranger Meursault’s long confinement during his trial and his eventual execution are presented as part of an elaborate, ceremonial ritual involving both public and religious authorities. The grim rationality of this process of legalized murder contrasts markedly with the sudden, irrational, almost accidental nature of his actual crime. Similarly, in The Myth of Sisyphus, the would-be suicide is contrasted with his fatal opposite, the man condemned to death, and we are continually reminded that a sentence of death is our common fate in an absurd universe.

Camus’s opposition to the death penalty is not specifically philosophical. That is, it is not based on a particular moral theory or principle (such as Cesare Beccaria’s utilitarian objection that capital punishment is wrong because it has not been proven to have a deterrent effect greater than life imprisonment). Camus’s opposition, in contrast, is humanitarian, conscientious, almost visceral. Like Victor Hugo, his great predecessor on this issue, he views the death penalty as an egregious barbarism—an act of blood riot and vengeance covered over with a thin veneer of law and civility to make it acceptable to modern sensibilities. That it is also an act of vengeance aimed primarily at the poor and oppressed, and that it is given religious sanction, makes it even more hideous and indefensible in his view.

Camus’s essay “Reflections on the Guillotine” supplies a detailed examination of the issue. An eloquent personal statement with compelling psychological and philosophical insights, it includes the author’s direct rebuttal to traditional retributionist arguments in favor of capital punishment (such as Kant’s claim that death is the legally appropriate, indeed morally required, penalty for murder). To all who argue that murder must be punished in kind, Camus replies:

Capital punishment is the most premeditated of murders, to which no criminal’s deed, however calculated, can be compared. For there to be an equivalency, the death penalty would have to punish a criminal who had warned his victim of the date on which he would inflict a horrible death on him and who, from that moment onward, had confined him at his mercy for months. Such a monster is not to be encountered in private life.

Camus concludes his essay by arguing that, at the very least, France should abolish the savage spectacle of the guillotine and replace it with a more humane procedure (such as lethal injection). But he still retains a hope, however scant, that capital punishment will one day be abolished completely: “In the unified Europe of the future the solemn abolition of the death penalty ought to be the first article of the European Code we all hope for.” Camus himself did not live to see the day, but he would no doubt be gratified to know that abolition of capital punishment is now an essential prerequisite for membership in the European Union.

6. Existentialism

Camus is often classified as an existentialist writer, and it is easy to see why. Affinities with Kierkegaard and Sartre are patent. He shares with these philosophers (and with the other major writers in the existentialist tradition, from Augustine and Pascal to Dostoyevsky and Nietzsche) an habitual and intense interest in the active human psyche, in the life of conscience or spirit as it is actually experienced and lived. Like these writers, he aims at nothing less than a thorough, candid exegesis of the human condition, and like them he exhibits not just a philosophical attraction but also a personal commitment to such values as individualism, free choice, inner strength, authenticity, personal responsibility, and self-determination.

However, one troublesome fact remains: throughout his career Camus repeatedly denied that he was an existentialist. Was this an accurate and honest self-assessment? On the one hand, some critics have questioned this “denial” (using the term almost in its modern clinical sense), attributing it to the celebrated Sartre-Camus political “feud” or to a certain stubbornness or even contrariness on Camus’s part. In their view, Camus qualifies as, at minimum, a closet existentialist, and in certain respects (e.g., in his unconditional and passionate concern for the individual) as an even truer specimen of the type than Sartre.

On the other hand, besides his personal rejection of the label, there appear to be solid reasons for challenging the claim that Camus is an existentialist. For one thing, it is noteworthy that he never showed much interest in (indeed he largely avoided) metaphysical and ontological questions (the philosophical raison d’être of Heidegger and Sartre). Of course there is no rule that says an existentialist must be a metaphysician. However, Camus’s seeming aversion to technical philosophical discussion does suggest one way in which he distanced himself from contemporary existentialist thought.

Another point of divergence is that Camus seems to have regarded existentialism as a complete and systematic world-view, that is, a fully articulated doctrine. In his view, to be a true existentialist one had to commit to the entire doctrine (and not merely to bits and pieces of it), and this was apparently something he was unwilling to do.

A further point of separation, and possibly a decisive one, is that Camus actively challenged and set himself apart from the existentialist motto that existence precedes essence. Ultimately, against Sartre in particular and existentialists in general, he clings to his instinctive belief in a common human nature. In his view human existence necessarily includes an essential core element of dignity and value, and in this respect he seems surprisingly closer to the humanist tradition from Aristotle to Kant than to the modern tradition of skepticism and relativism from Nietzsche to Derrida (the latter his fellow-countryman and, at least in his commitment to human rights and opposition to the death penalty, his spiritual successor and descendant).

7. Camus, Colonialism, and Algeria

One of the main topics and even preoccupations of recent Camus studies has been the writer’s attitude, as reflected in both his fiction and in his non-fiction, towards European colonialism in general and his response to the French-Algerian “problem” or “question” (as it was often termed) in particular. The first thing that can be noted in this respect is that, unlike Sartre and many other European intellectuals, Camus never delivered a formal critique of colonialism. Nor did he sign any of the frequent manifestos and declarations deploring the practice – a sin for which he was sharply criticized and even accused of moral cowardice. In 1958, partly to explain and vindicate himself, but mainly to illustrate and give voice to the painful complexities of colonial reform and decolonization, he published Algerian Chronicles, a collection of his writings on the vexing “problem” that he had personally agonized over for more than twenty years.

In addition to his perceived silence on the issue of colonialism (a silence, as Algerian Chronicles reveals, motivated by his fear that speaking out aggressively would be more likely to heighten tensions than to secure the united and independent post-colonial Algeria he hoped for), Camus has also been criticized for the virtual erasure of Arab characters and culture from his fiction. Several writers, most prominently and forcefully Edward Said, have denounced the nearly total absence of Arab characters in Camus’s novels and stories. Moreover, these critics point out, the few Arab characters who do appear are inevitably mute and anonymous. They are either shadow figures, like the nameless murder victim at the climactic center of The Stranger, or mere bodies, like the uncounted and unidentified native Algerians who presumably make up the major part of the death toll in The Plague but who otherwise have no speaking role or even visible presence in the novel. The Irish writer and politician Conor Cruise O’Brien made a partial attempt to rescue Camus from this criticism by arguing that The Fall should be read as an autobiographical work in which Camus confesses his own personal failures, including his guilt at becoming a privileged citizen in a poor country. Along the same line of criticism, The Meursault Investigation, by the Algerian writer Kamel Daoud, is a fictional and metafictional riposte to Camus: a reimagining of the characters and events of The Stranger, told from the point of view of the brother of the murdered Arab, it represents both a corrective rebuke and a literary tribute to its famous original. In the introduction to her recent expanded edition of Algerian Chronicles, Alice Kaplan addresses these and related criticisms and cites relevant passages from Camus’s own writing in response to them.

8. Significance and Legacy

Obviously, Camus’s writings remain the primary reason for his continuing importance and the chief source of his cultural legacy, but his fame is also due to his exemplary life. He truly lived his philosophy; thus it is in his personal political stands and public statements as well as in his books that his views are clearly articulated. In short, he bequeathed not just his words but also his actions. Taken together, those words and actions embody a core set of liberal democratic values—including tolerance, justice, liberty, open-mindedness, respect for personhood, condemnation of violence, and resistance to tyranny—that can be fully approved and acted upon by the modern intellectual engagé.

On a purely literary level, one of Camus’s most original contributions to modern discourse is his distinctive prose style. Terse and hard-boiled, yet at the same time lyrical, and indeed capable of great, soaring flights of emotion and feeling, Camus’s style represents a deliberate attempt on his part to wed the famous clarity, elegance, and dry precision of the French philosophical tradition to the more sonorous and opulent manner of nineteenth-century Romantic fiction. The result is something like a cross between Hemingway (a Camus favorite) and Melville (another favorite), or between Diderot and Hugo. For the most part, when we read Camus we encounter the plain syntax, simple vocabulary, and biting aphorism typical of modern theatre or noir detective fiction. (It is worth noting that Camus was a fan of the novels of Dashiell Hammett and James M. Cain and that his own work has influenced the style and the existentialist loner heroes of a succession of later crime writers, including John D. MacDonald and Lee Child.) This muted, laconic style frequently becomes a counterpoint or springboard for extended musings and lavish descriptions almost in the manner of Proust. Here we may note that this attempted reconciliation or union of opposing styles is not just an aesthetic gesture on the author’s part: it is also a moral and political statement. It says, in effect, that the life of reason and the life of feeling need not be opposed; that intellect and passion can, and should, operate together.

Perhaps the greatest inspiration and example that Camus provides for contemporary readers is the lesson that it is still possible for a serious thinker to face the modern world (with a full understanding of its contradictions, injustices, brutal flaws, and absurdities) with hardly a grain of hope, yet utterly without cynicism. To read Camus is to find words like justice, freedom, humanity, and dignity used plainly and openly, without apology or embarrassment, and without the pained or derisive facial expressions or invisible quotation marks that almost automatically accompany those terms in public discourse today.

At Stockholm Camus concluded his Nobel acceptance speech with a stirring reminder and challenge to modern writers: “The nobility of our craft,” he declared, “will always be rooted in two commitments, both difficult to maintain: the refusal to lie about what one knows and the resistance to oppression.” He left behind a body of work faithful to his own credo that the arts of language must always be used in the service of truth and the service of liberty.

9. References and Further Reading

a. Works by Albert Camus

  • The Stranger. Trans. Stuart Gilbert. New York: Vintage-Random House, 1946.
  • Camus’s first novel, a classic portrait of the “outsider” originally published in France as L’Etranger by Librairie Gallimard in 1942.
  • The Plague. Trans. Stuart Gilbert. New York: Vintage-International, 1991.
  • Camus’s second novel, originally published in France as La Peste by Librairie Gallimard in 1947.
  • The Fall. Trans. Justin O’Brien. New York: Vintage-Random House, 1956.
  • Camus’s third novel, a confessional monologue originally published in France as La Chute by Librairie Gallimard in 1956.
  • The Myth of Sisyphus and other Essays. Trans. Justin O’Brien. New York: Vintage-Random House, 1955.
  • A philosophical meditation on suicide originally published as Le Mythe de Sisyphe by Librairie Gallimard in 1942.
  • The Rebel. Trans. Anthony Bower. New York: Vintage-Random House, 1956.
  • A philosophical essay on the ethics of rebellion and political violence originally published as L’Homme Revolte by Librairie Gallimard in 1951.
  • Exile and the Kingdom. Trans. Justin O’Brien.  New York: Vintage-Random House, 1958.
  • A collection of short fiction originally published as L’Exil et le Royaume by Librairie Gallimard in 1957.
  • Lyrical and Critical Essays. Ed. Philip Thody. Trans. Ellen Conroy Kennedy. New York: Vintage-Random House, 1970.
  • A selection of critical writings, including essays on Melville, Faulkner, and Sartre, plus all the early essays from Betwixt and Between and Nuptials.
  • Resistance, Rebellion, and Death. Trans. Justin O’Brien. New York: Vintage International, 1995.
  • A collection of essays on a wide variety of political topics ranging from the death penalty to the Cold War.
  • Caligula and Three Other Plays. Trans. Stuart Gilbert. New York: Vintage-Random House, 1958.
  • A collection of four of Camus’s best-known dramatic works: Caligula, The Misunderstanding, The State of Siege, and The Just Assassins, with a foreword by the author.
  • The First Man. Trans. David Hapgood. New York: Alfred Knopf, 1995.
  • A posthumous novel, partly autobiographical.
  • Camus at Combat: Writing 1944-1947. Ed. Jacqueline Lévi-Valensi. Trans. Arthur Goldhammer. Princeton, NJ: Princeton University Press, 2006.
  • A collection of articles and editorials that Camus wrote during and after WW II for the French Resistance journal Combat.
  • Algerian Chronicles. Ed. Alice Kaplan. Trans. Arthur Goldhammer. Cambridge, MA: Belknap Press, 2013.
  • A collection of Camus’s political writings on Algeria.

b. Critical and Biographical Studies

  • Barthes, Roland. Writing Degree Zero. New York: Hill and Wang, 1968.
  • Bloom, Harold, ed. Albert Camus. New York: Chelsea House, 1989.
  • Brée, Germaine. Camus. New Brunswick, NJ: Rutgers University Press, 1961.
  • Brée, Germaine, ed. Camus: A Collection of Critical Essays. Englewood Cliffs, NJ: Prentice-Hall, 1962.
  • Cruickshank, John. Albert Camus and the Literature of Revolt. London: Oxford University Press, 1959.
  • Cruickshank, John. The Novelist as Philosopher. London: Oxford University Press, 1959.
  • Foley, John. Albert Camus: From the Absurd to Revolt. Montreal: McGill-Queens University Press, 2008.
  • Hughes, Edward J. ed. The Cambridge Companion to Camus. Cambridge, UK: Cambridge University Press, 2007.
  • Kauffman, Walter, ed. Religion from Tolstoy to Camus. New York: Harper, 1964.
  • Lottman, Herbert R. Albert Camus: A Biography. Corte Madera, CA: Gingko Press, 1997.
  • Malraux, André. Anti-Memoirs. New York: Holt, Rinehart, and Winston, 1968.
  • Margerrison, Christine, et al. Albert Camus in the 21st Century: A Reassessment of his Thinking at the Dawn of the New Millennium. Amsterdam, NL: Rodopi, 2008.
  • McBride, Joseph. Albert Camus: Philosopher and Littérateur. New York: St. Martin’s Press, 1992.
  • O’Brien, Conor Cruise. Camus. London: Faber and Faber, 1970.
  • Said, Edward. “Camus and the French Imperial Experience.” In Culture and Imperialism. New York: Vintage Books, 1994.
  • Sartre, Jean-Paul. “Camus’s The Outsider.” In Situations. New York: George Braziller, 1965.
  • Srigley, Ronald D. Albert Camus’ Critique of Modernity. Columbia, MO: University of Missouri Press, 2011.
  • Thody, Philip. Albert Camus, 1913-1960. London: Hamish Hamilton, 1961.
  • Todd, Olivier. Albert Camus: A Life. New York: Alfred A. Knopf, 1997.
  • Zaretsky, Robert. A Life Worth Living: Albert Camus and the Quest for Meaning. Cambridge, MA: Belknap Press of Harvard University Press, 2013.

 

Author Information

David Simpson
Email: dsimpson@depaul.edu
DePaul University
U. S. A.

Proper Functionalism

‘Proper Functionalism’ refers to a family of epistemological views according to which whether a belief (or some other doxastic state) was formed by way of properly functioning cognitive faculties plays a crucial role in whether it has a certain kind of positive epistemic status (such as being an item of knowledge, or a justified belief). Alvin Plantinga’s proper functionalist theory of knowledge has been the most prominent among these theories. Michael Bergmann’s (2006) proper functionalist theory of justification has also been the focus of much discussion. But proper functionalist theories of other epistemic properties have also been developed. Richard Otte (1987) and Alvin Plantinga (1993b: Chapter 9) offer proper functionalist theories of epistemic probability, for example. Nicholas Wolterstorff (2010) defends a proper functionalist theory of epistemic oughts. And Peter Graham (2010) develops a proper functionalist theory of epistemic entitlement. Since Plantinga’s theory of knowledge and Bergmann’s theory of justification are the most widely known and most discussed proper functionalist views, and because they share many features with other proper functionalist theories, this article focuses primarily on them—what can be said in their favor, the challenges they face, the ways in which they might be defended, and how they compare with some of their closest rivals.

Table of Contents

  1. Plantinga’s Proper Functionalist Theory of Knowledge
    1. Motivations of Plantinga’s Theory
    2. The Content of Plantinga’s Theory
    3. Swampman
    4. Gettier Cases
  2. Bergmann’s Proper Functionalist Theory of Justification
    1. Some Advantages of Bergmann’s Theory
    2. Some Objections to Bergmann’s Theory
  3. Rival Theories
    1. Proper Functionalism and Phenomenal Conservatism
    2. Proper Functionalism and Virtue Epistemology
  4. References and Further Reading

1. Plantinga’s Proper Functionalist Theory of Knowledge

This article begins with a discussion of Alvin Plantinga’s proper functionalist theory of knowledge. As Plantinga himself frames matters, he takes himself to be giving a proper functionalist theory of a property he calls “warrant,” where warrant is whatever precisely it is which makes the difference between knowledge and mere true belief.

a. Motivations of Plantinga’s Theory

A theory of warrant is subject to Gettier-style counterexamples if a belief can meet all the conditions the theory specifies as jointly sufficient for knowledge, but meet them merely by accident (in a manner that precludes that belief’s being an item of knowledge). Plantinga argues that any theory that fails to construe a proper function condition as necessary for warrant is subject to counterexamples of this sort. This is so whether the theory emphasizes, as most relevant to whether her belief has warrant, the believer’s internal states, external factors, or both.

By way of illustration, Plantinga (1993b: 31-37) adopts a scenario originally introduced by Roderick Chisholm, who attributes it to Alexius Meinong. The scenario envisions an aging forest ranger living in the mountains, with a set of wind chimes hanging from a bough. The ranger is unaware of the fact that his hearing has been degenerating of late, and it has gotten to the point where he can no longer hear the chimes. He is also unaware that he is occasionally subject to small auditory hallucinations in which he appears to hear the wind chimes. On one occasion, he is thus appeared to and comes to believe that the wind is blowing. As it happens, the wind is blowing and causing the ringing of the chimes. Even if we stipulate that all is going well with this belief from the ranger’s own internal perspective, it is clear nonetheless that his belief lacks warrant. The reason his belief lacks warrant, Plantinga maintains, is that it is the product of cognitive malfunction.

One might question whether this explanation is correct, however, on the ground that certain cognitively external environmental conditions are also amiss in this case. In particular, the case is one in which there is no reliable connection between the ranger’s appearing to hear the wind-chime and the wind’s blowing. And one might think that it is primarily for this reason that the ranger’s beliefs lack warrant. This thought might push one toward bypassing proper functionalism and endorsing a reliabilist theory of warrant instead (that is, an account according to which a belief having warrant is primarily a matter of it being formed or sustained in a way that involves a reliable connection to the truth). But Plantinga also argues that any reliabilist theory which does not incorporate a proper function condition is also subject to Gettier-style counterexamples.

Plantinga (1993a: 195-198, 205-207) takes this to be illustrated by The Case of the Epistemically Serendipitous Brain Lesion. Imagine Sam has a brain lesion, one that engenders cognitive processes which mostly result in false beliefs. One process the lesion engenders, however, is a process that results in the belief that one has a brain lesion. This particular process is highly reliable (it always results in one’s having a true belief). But clearly the belief that results is not an item of knowledge. What explains why this is so, Plantinga maintains, is that the belief in question (though formed by a truth-reliable process) is not the result of cognitive proper function. Accordingly, Plantinga concludes that any reliabilist account of warrant must be augmented with a proper function condition.

Kenneth Boyce and Alvin Plantinga (2012: 127-128) have emphasized that there may be an even stronger lesson to be drawn from these cases. Once these cases are on the table, one can imagine variations of them in which different combinations of internal and external conditions (other than proper function ones) are met, but in which the belief in question lacks warrant because it ends up being true merely by accident. Furthermore, Boyce and Plantinga contend that in these cases it seems that part of what explains why these are cases in which the beliefs are true merely by accident (in a way that precludes their being items of knowledge) is that they were not formed in a manner specified by cognitive proper function; that is, the way they get at the truth is accidental from the perspective of the cognitive design plan. If that is correct, however, then (as Boyce and Plantinga point out) there is reason to believe that the notion of cognitive proper function is centrally involved in the notion of non-accidentality that any adequate analysis of warrant must capture.

b. The Content of Plantinga’s Theory

Examples of the sort discussed above are used by Plantinga to motivate the claim that cognitive proper function is necessary for warrant. Plantinga (1993b: 21-24) also maintains that the relevant notion of proper function presupposes that of a design plan—something that specifies the manner in which a thing is supposed to function in various circumstances. As Plantinga conceives of it, a design plan may be modeled as a set of ordered triples, where each triple specifies a circumstance, a response, and a purpose or function. One need not initially take this notion of a design plan to involve conscious design or purpose. The notion of a design plan at issue here is whatever notion is presupposed by talk of proper function for biological systems (as when a physician determines that a human heart is functioning the way it is supposed to on account of its pumping at 70 beats per minute). Plantinga himself gives a theistic account of this notion, but other proper functionalists, such as Ruth Millikan (1984) and Peter Graham (2012), have offered naturalistic, evolutionary accounts.

While Plantinga (1993b: 46) takes cognitive proper function to be necessary for warrant, he does not take it to be sufficient (or even nearly sufficient). Other conditions must also be satisfied. To a rough, first approximation, Plantinga takes a belief to be warranted if and only if it satisfies the following four conditions:

(1) The belief in question is formed by way of cognitive faculties that are properly functioning.

(2) The cognitive faculties in question are aimed at the production of true beliefs.

(3) The design plan is a good one. That is, when a belief is formed by way of truth-aimed cognitive proper function in the sort of environment for which the cognitive faculties in question were designed, there is a high objective probability that the resulting belief is true.

(4) The belief is formed in the sort of environment for which the cognitive faculties in question were designed.

While Plantinga adds various nuances, these four conditions serve to capture the main outlines of his view.
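
Plantinga states his view in ordinary philosophical prose rather than in formal or computational terms, but because the design plan can be modeled as a set of circumstance-response-purpose triples and the four conditions above function as a joint test, a small toy sketch may help fix ideas. The following Python fragment is purely illustrative and is not drawn from Plantinga’s texts: the field names, the sample triples, and the truth-values assigned to the forest-ranger case are assumptions introduced here for the sake of the example.

from dataclasses import dataclass

# A design plan, as described above, can be modeled as a set of ordered
# triples, each specifying a circumstance, a response, and a purpose or
# function. The entries below are invented purely for illustration.
design_plan = {
    ("being appeared to treely", "believe 'there is a tree'", "producing true belief"),
    ("hearing the chimes ring", "believe 'the wind is blowing'", "producing true belief"),
}

@dataclass
class BeliefEpisode:
    properly_functioning: bool      # condition (1): formed by properly functioning faculties
    faculties_truth_aimed: bool     # condition (2): the faculties aim at producing true beliefs
    design_plan_is_good: bool       # condition (3): high objective probability of truth in the designed-for environment
    in_designed_environment: bool   # condition (4): the environment is the one the faculties were designed for

def has_warrant(b: BeliefEpisode) -> bool:
    """Toy rendering of Plantinga's four conditions, to a rough first approximation."""
    return (b.properly_functioning
            and b.faculties_truth_aimed
            and b.design_plan_is_good
            and b.in_designed_environment)

# The aging forest ranger's belief that the wind is blowing fails condition (1):
# it issues from cognitive malfunction, so it lacks warrant even though it is true.
ranger_belief = BeliefEpisode(
    properly_functioning=False,
    faculties_truth_aimed=True,
    design_plan_is_good=True,
    in_designed_environment=True,
)
assert not has_warrant(ranger_belief)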

Many objections have been raised to Plantinga’s theory. Two of the most prominent among them are considered below. The first amounts to an objection to the claim that Plantinga’s four conditions are necessary for warrant. The second amounts to an objection to the claim that they are sufficient. For a sampling of other objections, one would do well to examine the collection of essays on Plantinga’s theory of warrant edited by Jonathan L. Kvanvig (1996).

c. Swampman

Some have argued that there are counterexamples to Plantinga’s theory involving beings who have warranted beliefs but who nevertheless fail to exhibit cognitive proper function. The most well-known version of this objection comes from Ernest Sosa (1993), who adapts a scenario originally proposed by Donald Davidson, and uses it against proper functionalism. In that scenario, Davidson is standing next to a swamp when lightning strikes a nearby dead tree, thereby obliterating Davidson. Simultaneously, by sheer accident, the lightning also causes the molecules of the tree to arrange themselves into a perfect duplicate of Davidson as he was at the time of his demise. The Davidson duplicate—this “Swampman”—leaves the swamp, acting and talking as if it were Davidson, having all the same intrinsic properties that Davidson would have had, had he left the swamp without having his unfortunate encounter. According to Sosa, “it … seems logically possible for … Swampman to have warranted beliefs not long after creation if not right away” (p. 54). Yet, not being the product of intentional design, and not having any evolutionary history, it would seem that Swampman has no design plan. And so we have what appears to be a counterexample to proper functionalism.

There are various responses to the Swampman objection. Plantinga (1993c: 206-208) and Graham (2012: 466-467) have each argued, albeit for different reasons, that it is doubtful the Swampman scenario is metaphysically possible. They have also suggested, again for different reasons, that if this scenario is possible, perhaps Swampman can acquire conditions for proper functioning without natural selection or intentional design. See Plantinga (1993c: 78) and Graham (2014). Bergmann (2006: 147-149) has argued that we are intuitively inclined to assign positive epistemic status to Swampman’s beliefs only to the extent we are inclined to think that his beliefs are fitting responses to the inputs he receives. And we are inclined to think that Swampman’s beliefs are fitting, argues Bergmann, only to the extent we are inclined to think of those responses as exhibiting cognitive proper function. Boyce and Plantinga (2012: 130-131) have suggested that since it is merely by accident that Swampman is forming his beliefs reliably, we can think of this case as a Gettier scenario (or at least, relevantly analogous to one), and thereby deny that Swampman’s beliefs have warrant. For a similar response, see McNabb (2015).

In addition, Kenneth Boyce and Andrew Moon (2015) have argued that the Swampman objection relies on a false intuition concerning the conditions under which the belief of one creature has warrant if the belief of another, similar creature does. According to them, the intuition that motivates our reaction to the Swampman case may be stated as follows:

(CI) If a belief B is warranted for a subject S and another subject S* comes to hold B in the same way that S came to hold B in a relevantly similar environment to the one in which S came to hold B, then B is warranted for S*.

They argue that it is CI, in conjunction with the stipulation that Swampman forms his beliefs in the same way that an ordinary human being would (an ordinary human being to whom we would be inclined to attribute knowledge), that explains our tendency to regard Swampman as having warranted beliefs. Boyce and Moon then go on to argue that CI is subject to counterexamples, and that this undercuts the force of the Swampman objection. See Section 3b for further discussion of their argument.

d. Gettier Cases

Plantinga has conceded that his theory, as he originally formulated it, is subject to Gettier-style counterexamples. In 2000, he offered the following counterexample:

I own a Chevrolet van, drive to Notre Dame on a football Saturday, and unthinkingly park in one of the many places reserved for the football coach. Naturally, his minions tow my van away and, as befits such lèse majesté, destroy it. By a splendid piece of good luck, however, I have won the Varsity Club’s Win-a-Chevrolet-Van contest, although I haven’t yet heard the good news. You ask me what sort of automobile I own; I reply, both honestly and truthfully, “A Chevrolet van.” My belief that I own such a van is true, but ‘just by accident’ (more accurately, it is only by accident that I happen to form a true belief); hence it does not constitute knowledge. All of the non-environmental conditions for warrant, furthermore, are met. It also looks as if the environmental condition is met: after all, isn’t the cognitive environment here on earth and in South Bend just the one for which our faculties were designed?

Clearly Plantinga’s belief (though true) is not an item of knowledge in this case and thus lacks warrant. So Plantinga’s original four conditions are not jointly sufficient for warrant. Something else must be added. But what?

According to Plantinga, what the original account requires is an addition to the environmental condition. More specifically, the problem in the above case is that while the global environment that Plantinga is in is the one for which his faculties were designed, his more local environment is epistemically misleading. So in order to deal with this counterexample, Plantinga proposes adding a resolution condition. This condition involves a distinction between two different kinds of environment, what Plantinga refers to as the “maxi-environment” and what he refers to as the “mini-environment.” The maxi-environment, Plantinga stipulates, is the kind of global environment in which we live here on earth, the kind of environment for which our cognitive faculties were designed (or to which they were adapted). The mini-environment, by contrast, is a much more specific state of affairs, one that includes, for a given exercise of one’s cognitive faculties E resulting in a belief B, all of the epistemically relevant circumstances obtaining when B is formed (though diminished with respect to whether B is true).

Letting ‘MBE’ denote the cognitive mini-environment with respect to B and E (which Plantinga says may contain as large a fragment of the actual world as one likes, up to whether B is true), Plantinga maintains that the needed resolution condition may be stated as follows:

(RC) A belief B produced by an exercise of cognitive powers has warrant sufficient for knowledge only if MBE (the mini-environment with respect to B and E) is favorable for E.

This, of course, raises the question of just what it is for a mini-environment to be “favorable.” Plantinga has, in the past, offered various proposals for what favorableness consists in that he has subsequently admitted to be unsatisfactory. A proposal is found in Boyce and Plantinga (2012: 134). For other proposals, see Crisp (2000) and Chignell (2003).

2. Bergmann’s Proper Functionalist Theory of Justification

Plantinga’s theory of warrant is not the only kind of proper functionalist theory. Proper functionalist theories of other epistemic concepts have also been developed. Noteworthy among these is Michael Bergmann’s proper functionalist theory of epistemic justification. The kind of epistemic justification that Bergmann (2006: 4-5) is interested in is doxastic justification. The having of this property is frequently (though not universally) held to be a necessary condition for a belief’s being an item of knowledge. In fact, it is often held that a belief’s having this property, in conjunction with its being non-accidentally true (in a way that rules out Gettier cases), is not only necessary, but also sufficient, for its being an item of knowledge.

A major divide in the literature occurs between those philosophers who are “externalists” about this kind of justification and those who are “internalists” about it. Just how this divide should be characterized is itself a matter of dispute. But for present purposes, we may characterize internalists about justification as being committed (at least) to the view that whether a belief is justified depends entirely on which mental states that belief is based upon (in such a way that necessarily, any two believers who are exactly alike in terms of their mental states and in terms of which of those mental states their beliefs are based upon are also alike in terms of which of their beliefs are justified). Externalists, by contrast, maintain that whether a belief is justified may depend on other factors.

It should be noted, however, that Bergmann (2006: chapter 3) divides up the territory a bit differently, though not in a way that impacts the current discussion. He takes it to be a necessary condition for a view of justification to count as “internalist” that it include an awareness requirement (that is, that it require, in order for a belief to be justified, that the believing subject is actually or potentially aware of some justification-contributor to that belief). The characterization of internalism given here, by contrast, includes no such requirement (and is similar to the characterization of a view of justification that Bergmann calls “mentalism,” one which he takes to be distinct from both externalism and internalism).

As Bergmann (2006: 3-7) points out, it is not always clear that philosophers who appear to dispute the nature of justification are actually disagreeing with one another. That is because it is plausible that epistemologists sometimes use the term ‘justification’ in different ways. He notes, for example, that some epistemologists use this term to pick out a subjective notion, one that is satisfied by a belief provided that the subject is blameless in holding it. Others, by contrast, he observes, use the term to pick out a more objective notion, one according to which a belief is justified only if it is fitting with respect to the believer’s evidence or other epistemically relevant inputs. It is this objective notion of justification in which Bergmann is interested (see also pp. 111-113). He takes it to be a conceptually open question as to whether this kind of justification is necessary for knowledge (though he thinks it is). And he also takes some disputes between self-avowed externalists (like himself) and self-avowed internalists (such as Richard Feldman and Earl Conee) to involve a genuine disagreement concerning the nature of this kind of justification.

Bergmann argues that the right way to analyze this kind of justification is in terms of proper function. More specifically, Bergmann’s (2006: 132-137) theory of epistemic justification takes the first of Plantinga’s conditions, the proper function condition, to be necessary for a belief to be justified. Bergmann also takes the first three of Plantinga’s conditions (leaving out the fourth, environmental condition), in conjunction with the condition that the subject does not take the relevant belief to be defeated, to be sufficient for a belief’s being justified. The motivations for this view are perhaps best appreciated by looking to its purported advantages.

a. Some Advantages of Bergmann’s Theory

Epistemic justification of the kind Bergmann has in mind has some puzzling features. On the one hand, it involves some notion of truth-aptness. In particular, there would appear to be some important, non-trivial connection between a belief’s being justified and its being objectively likely to be true. At the very least, it would be a significant cost for a theory of justification to deny this. But which ways of forming and sustaining beliefs result in a high proportion of true beliefs depends on what sort of environment one is in. Our tending to believe that occluded objects still exist, for example, results in a high proportion of true beliefs in our environment, but it is easy to imagine environments in which this would not be the case. These considerations push in the direction of regarding what makes for epistemic justification as a contingent matter, one that depends on the sort of environment one inhabits.

On the other hand, justification is a normative concept, the satisfaction of which does not appear to depend on the sort of environment in which one is located. This aspect of justification is made especially vivid by “The New Evil Demon Problem”, originally put forward by Keith Lehrer and Stewart Cohen (1983), as a problem for reliabilist theories of justification. Consider a population of beings, just like ourselves, who form their beliefs in response to experience in just the ways that we do, but who (unlike us) are victims of a Cartesian demon who renders their belief-forming processes unreliable. From many reliabilist theories of justification, it follows that these beings have far fewer justified beliefs than we do (since most of their beliefs are not formed in a truth-reliable manner). But this seems false. These beings are in an epistemically bad situation, to be sure, but they are still forming their beliefs in ways that are appropriate given their experiences, and so it seems their beliefs are at least justified.

Bergmann’s theory of epistemic justification nicely accommodates both of these puzzling features. First, it accommodates the intuition that inhabitants of a demon world, who are like us, and who form their beliefs in response to experience in the same ways we do, have the same proportion of justified beliefs that we do. For, as Bergmann (2006: 141-143) notes, his theory entails that provided these beings have a cognitive design plan comparable to ours and are properly functioning, many of their beliefs are justified, even though their ways of forming beliefs are, for the most part, unreliable. This analysis also, as Bergmann points out, accommodates the intuition that justification is importantly and non-trivially connected with truth-aptness. For, insofar as the beings living in a demon world fulfill Bergmann’s conditions for justification, the manner in which they form their beliefs would be truth-apt if they were placed in the environment for which their cognitive faculties were designed. Finally, since different design plans may be tailored to different kinds of environments, Bergmann’s theory accommodates the possibility that what makes for justification is a contingent matter, one that depends on the kind of environment for which the cognitive faculties of the creatures at issue were designed.
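
Bergmann, like Plantinga, presents his view in ordinary philosophical prose, but the contrast between the two theories, and the demon-world verdict just described, can be made vivid with a small toy sketch along the lines of the one given in Section 1b. The Python fragment below is only an illustration of the reading offered above, not Bergmann’s own formalism; the field names and the truth-values assigned to the demon-world victim are assumptions introduced here for the example.

from dataclasses import dataclass

@dataclass
class BeliefEpisode:
    properly_functioning: bool         # Plantinga's condition (1)
    faculties_truth_aimed: bool        # condition (2)
    design_plan_is_good: bool          # condition (3)
    in_designed_environment: bool      # condition (4), which Bergmann drops for justification
    takes_belief_to_be_defeated: bool  # Bergmann's no-believed-defeater condition

def justified_on_bergmanns_view(b: BeliefEpisode) -> bool:
    """Toy rendering: conditions (1)-(3) plus the absence of a believed defeater suffice."""
    return (b.properly_functioning
            and b.faculties_truth_aimed
            and b.design_plan_is_good
            and not b.takes_belief_to_be_defeated)

# A demon-world victim with a design plan like ours: the environment is not the
# one the faculties were designed for, so the belief lacks warrant on Plantinga's
# account, yet it still comes out justified on Bergmann's.
victim_belief = BeliefEpisode(
    properly_functioning=True,
    faculties_truth_aimed=True,
    design_plan_is_good=True,
    in_designed_environment=False,
    takes_belief_to_be_defeated=False,
)
assert justified_on_bergmanns_view(victim_belief)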

b. Some Objections to Bergmann’s Theory

Like Plantinga’s theory, Bergmann’s faces the objection that it is subject to counterexamples involving creatures like Swampman. There is no need, though, to rehearse the various responses that might be given to this objection here (since many of them will be the same or similar to those described in Section 1c). As a theory of justification, however, Bergmann’s view also faces other objections, ones which are not (or not as obviously) applicable to a theory of warrant.

Todd R. Long (2012: 264-265) questions, for example, whether Bergmann’s theory does in fact do a better job than alternative views in handling the New Evil Demon Problem. He grants that Bergmann’s view does accommodate the intuition that demon-world victims with the same design plan as ours do in fact have justified beliefs (in the same proportions that we do). But he notes that Bergmann’s view also entails that the same cannot be said for demon-world victims who are mentally indistinguishable from ourselves but whose ways of forming beliefs run contrary to their design plan. And Long maintains that to deny that the beliefs of demon-world victims in the latter situation are justified also runs contrary to our intuitions. Bergmann (2006: 150), however, anticipates an objection like this. He suggests that there is an analogy between Swampman and the demon victims in such a scenario; accordingly, he adapts his reply to the former so as to apply it to the latter.

Another kind of objection to a proper functionalist theory of justification involves cases in which the design plan specifies ways of belief formation that appear to be objectively bad in some way, in spite of the fact that this component of the design plan is successfully aimed at truth. Long (2012) and Tucker (2014b) each present variations of this objection directed specifically against Bergmann’s view. There are also precedents found among objections to Plantinga’s theory of warrant (see for example Feldman 1993: 44). There are at least two kinds of cases of this sort. The following discussion will make reference to cases described by Tucker, who provides examples of each kind.

In the first kind of case, the design plan specifies coming to hold a belief on the basis of what appears to be an objectively bad form of reasoning. Tucker (2014b: 3321-3322) presents a case, for instance, in which a design plan specifies coming to hold a certain belief on the basis of the fallacy of denying the antecedent (inferring, from “if P then Q” and “not-P”, that not-Q). As Tucker points out, even though denying the antecedent is, from a logical point of view, an objectively bad form of reasoning, there are circumstances in which reasoning that way is reliable. So there is no reason in principle why a good, truth-aimed design plan could not specify forming a belief in that way, under the right conditions. Even so, it is counterintuitive to think that a belief formed by way of committing a logical fallacy could be justified (at least in the absence of having any further basis).

In the second kind of case, the design plan specifies coming to hold a belief on the basis of an input that intuitively fails to provide any kind of epistemic support for that belief. Tucker (pp. 3318-3319) presents an example, for instance, in which a person comes to believe Gödel’s incompleteness theorem solely on the basis of his belief that his students hate a particular type of beer. Since Gödel’s incompleteness theorem is a necessary truth, there is no question that this belief-forming process is reliable. So there is no reason in principle why a good, truth-aimed design plan could not specify that a belief be formed in this way. Even so, it seems wrong to say that someone could come to be justified in believing Gödel’s incompleteness theorem solely on the basis of that belief.

This is a formidable objection. But there may be things that can be said on the proper functionalist’s behalf. Consider once again the first kind of case, a case that involves coming to hold a belief on the basis of formally bad reasoning. Some things that have been said in defense of reliabilism might also be of use to the proper functionalist here. Alvin Goldman (2002: 146-153), for example, points to research on the part of cognitive psychologists (such as Amos Tversky and Daniel Kahneman) indicating that human beings tend to rely on heuristics when engaged in probabilistic reasoning. As is now well known, these heuristics make people prone to commit elementary probabilistic fallacies. The conclusion that some psychologists have drawn is that these findings indicate that human beings are terrible at probabilistic reasoning. But as Goldman notes, other psychologists have drawn a more optimistic conclusion.

Goldman points to the work of a group of evolutionary psychologists (led by Gerd Gigerenzer, Leda Cosmides, and John Tooby) who argue that, given the limited information and computational power with which organisms must contend, an inference mechanism can be advantageous if it (in Goldman’s words) “often draws accurate conclusions about real-world environments, and does so quickly and with little computational effort” (p. 152). The heuristics humans rely on in probabilistic reasoning, some of these psychologists maintain, are mechanisms of just that sort. If that is the case, then perhaps human beings often do come to hold justified beliefs by way of these mechanisms after all, in spite of the fact that they are formally suspect. And if that is so, then perhaps other kinds of beings might come to form justified beliefs on the basis of kinds of reasoning that (from a purely logical point of view) are formally suspect, but nonetheless reliable in the environments for which their cognitive faculties were designed.

Now consider the second kind of case, the case in which the design plan specifies coming to hold a belief on the basis of an input that intuitively fails to provide any kind of epistemic support for that belief. Why is it exactly (concerning Tucker’s example) that we are inclined to deny that a person’s belief that his students dislike a particular type of beer could justify the belief that Gödel’s incompleteness theorem is true? Perhaps it is because there does not appear to be any interesting logical connection between the content of the latter belief and the belief on which it is based. But a similar observation concerning the relationship between our sense experiences and the content of our perceptual beliefs is part of what motivates Bergmann’s proper functionalist theory.

As Bergmann (2006: 119) points out, “Thomas Reid emphasized that there does not seem to be any logical connection between our sense experiences and the content of the beliefs based on them.” Bergmann notes, for example, that “the tactile sensations we experience when touching a hard surface seem to have no logical relation to (nor do they resemble) the content of the hardness beliefs they prompt.” Because this is so, Bergmann argues that the evidential support relations that hold between various sensory experiences and the beliefs formed in response to them cannot be explained in terms of necessary connections. But this prompts the question as to what does explain these support relations. Bergmann (2006: 130-131) argues that proper functionalism provides a good answer to this question. The connections are to be explained by way of which belief-forming responses to sensory inputs are specified by the cognitive design plan.

To accept this motivation for proper functionalism is to accept the claim that at least some epistemic support relations hold only contingently. It is also to countenance the possibility that the epistemic support relations that hold for certain cognizers might seem utterly bizarre from the perspective of creatures like us. So, perhaps, for those who do take this motivation on board, the possibility of an agent’s coming to justifiably believe that Gödel’s incompleteness theorem is true solely on the basis of a belief concerning the beer preferences of his students no longer seems so counterintuitive. (See Bergmann (2006: 141) for a similar response to BonJour’s purported counterexamples to externalist views of justification involving reliably formed clairvoyant beliefs).

3. Rival Theories

Proper functionalist theories do not exist in a vacuum. A full appreciation of their merits or demerits requires an investigation into how well they stack up against their rivals. Two kinds of theories in particular are often put up against proper functionalism: phenomenal conservatism and virtue epistemology. It is sometimes claimed by the proponents of these theories that they satisfy many of the same motivations as proper functionalism, while having fewer costs, as well as other advantages.

a. Proper Functionalism and Phenomenal Conservatism

At least to a first approximation, a phenomenal conservative theory of doxastic justification may be characterized as the view that a belief with the content that p is justified for an agent if it seems to the agent that p, the agent appropriately bases her belief that p on that seeming, and the agent has no defeaters for that belief. (See Phenomenal Conservatism for more details). As noted in Section 2a, proper functionalists about justification point to the apparent contingency of the connection between various experiences and the beliefs they justify as a motivation for their view. Phenomenal conservatives sometimes claim that their view does just as well at accommodating this apparent contingency while preserving the claim that there is a necessary connection between the things that justify our beliefs and the beliefs they justify. For this reason, phenomenal conservatism might be thought to do a better job than proper functionalism in accommodating the New Evil Demon intuition. Some phenomenal conservatives have also contended that it does a better job in accounting for the nature of evidential support.

Tucker (2011: 58-63) presses this point in connection with his objection (discussed in Section 2b) that proper functionalism allows for inputs which intuitively fail to provide any kind of epistemic support for a belief to justify that belief. In the example previously discussed, Tucker pointed to an instance in which a belief served as such an input. But Tucker also supplies examples in which the same seems to be true of the support relations that hold between various sensory experiences and the beliefs they are purported to justify. He notes, for example, that it is counterintuitive to think that a sensory experience associated with seeing a beautiful sunset could justify the belief that Gödel’s incompleteness theorem is true. But a design plan (presumably different from ours) might well specify that this is an appropriate belief-forming process.

Here the proper functionalist might attempt once more to press the Reidian point that in general it appears true that there is no inherent connection between our sensory experiences and the contents of the beliefs based on them. But Tucker (2011: 56-58, 61-63) suggests a way the advocate of phenomenal conservatism could account for the role that sensory experience plays in justifying our beliefs that accommodates this fact. According to Tucker, sensory experience might play a role in the justification of a certain belief by triggering a seeming with the content of that belief, it being a contingent matter which sensations trigger which seemings. Andrew Cullison (2013: 34-37) makes a similar suggestion, noting that just as two different sentences from different languages might well express the same proposition, two different kinds of cognitive apparatus associated with different species might cause seemings of the same content in response to differing kinds of phenomenology. This accommodates the Reidian point while preserving the claim that there is a necessary connection between the things that justify our beliefs (that is, our seeming states) and the beliefs they justify (via the identities of their contents).

Suppose one agrees that a phenomenal conservative view of justification does better than a proper functionalist view on these counts. This of course does not commit one to agreeing that phenomenal conservatism does better than proper functionalism over all. Bergmann (2013) argues, for example, that proper functionalists can accommodate many of the intuitions that motivate phenomenal conservatism, while also doing a better job in accommodating the intuition that some belief formations, downstream from sensory experiences, are objectively fitting responses to those experiences, whereas others are not.

Bergmann notes, for instance, that proper functionalists might adopt a model according to which, for humans (though not necessarily for all cognizers), when all goes well, a belief formed in response to a sensory experience is justified via being based on an intermediate seeming (one that is appropriately caused by the experience). He argues that this model accommodates many of the intuitions to which phenomenal conservatives appeal. But it also, he points out, allows for the possibility that there is an objective mismatch between a belief formed in response to a sensory experience and the nature of that experience, one which prevents the belief in question from being justified, even when the content of that belief matches the content of the intermediate seeming.

Bergmann describes, for example, a case in which a human cognizer, suffering from brain damage, forms the belief that she is holding a hard spherical object, in response to the olfactory sensation she experiences while smelling a lilac bush. Even if it is stipulated that she bases this belief on an intermediate seeming with the same content as her belief, it can still seem that her belief is objectively unfitting (in relation to her experience) and, for that reason, unjustified. A proper functionalist can accommodate this intuition, Bergmann claims, whereas a phenomenal conservative cannot. The proper functionalist can maintain that the reason the cognizer’s belief is objectively unfitting in this case is that, even though it is based on an appropriate intermediate seeming, it is not the appropriate response to the relevant sensation; it is not the belief her design plan specifies should result.

Relatedly, one might think that proper functionalism does better than phenomenal conservatism in accounting for the relation between justification and truth-aptness. A common objection to phenomenal conservative views is that they suffer from a “cognitive penetration” problem. In certain kinds of wishful thinking cases, for example, a seeming state might be caused by a desire; and in some such cases the believer in question will be unaware of this fact, and have no defeater for the belief in question. According to phenomenal conservatives, a belief properly based on such a seeming will still be justified. But to many this seems wrong. One explanation for why this consequence seems wrong is that it threatens to radically undermine the connection between justification and truth. A proper functionalist, by contrast, might maintain that when such cognitively penetrated seemings are produced in human beings, this is due either to cognitive malfunction or to one of the non-truth aimed facets of our cognitive design plan (either of which, according to her view, would render the belief unjustified). See Tucker (2014a) however for an argument that proper functionalists also suffer from cognitive penetration problems.

b. Proper Functionalism and Virtue Epistemology

According to John Greco (1993: 414), “the central idea of virtue epistemology is that, Gettier problems aside, knowledge is true belief which results from one’s cognitive virtues.” Similarly, Sosa (1993: 64) characterizes it as consisting of a family of theories which may be seen as “varieties of a single more fundamental option in epistemology, one which puts the explicative emphasis on truth-conducive intellectual virtues or faculties.”

Virtue epistemology is often thought of as coming in at least two varieties. Virtue responsibilists emphasize character traits—intellectual virtues such as open-mindedness, conscientiousness, perseverance in seeking the truth, and so on. Virtue reliabilists emphasize cognitive faculties, abilities, or competencies. (See Virtue Epistemology for more details). Of these two, it is virtue reliabilism that is most akin to proper functionalism. Accordingly, virtue reliabilism serves as a closer competitor. Or rather, since Greco (1993: 414) and Sosa (1993: 64) have both classified proper functionalism as a version of virtue epistemology, perhaps it should be said that it is the non-proper-functionalist versions thereof which may be seen as close competitors. For ease of exposition, the following discussion will focus on Sosa’s development of such a version.

According to Sosa’s (2015: 10) virtue theory of knowledge, knowledge is “apt belief” where apt belief is “belief that gets it right through competence rather than luck.” More precisely, according to Sosa, an apt belief is a belief that sufficiently manifests an “epistemic competence” (that is, a competence to get at the truth) (p. 9), where “a competence is in turn understood as a disposition to succeed in a given field of aimings, these being performances with an aim, whether the aim be intentional and even conscious, or teleological and functional” (p. 2). Note the similarity to proper functionalism here. Sosa’s epistemic competences are akin to Plantinga’s truth-aimed cognitive faculties. Both involve the property of being aimed at the formation of true beliefs, and both (when all goes well) are exercised in a way that is conducive to the fulfilment of that aim.

One way in which Sosa’s epistemic competencies differ from Plantinga’s truth-aimed cognitive faculties, however, is that the former do not initially seem to presuppose any notion of a design plan. And this might make Sosa’s theory more adept at accommodating things like Swampman scenarios (see the discussion in Section 1c). Indeed, it was Sosa (1993) who made famous that objection to proper functionalism. It might also make Sosa’s view more appealing to those who are both naturalistically inclined and skeptical about the prospects for a naturalistically acceptable account of cognitive proper function.

Proper functionalists have called into question whether Sosa’s view does in fact have these advantages. Plantinga (1993c: 79) has argued, for example, that in order to handle the Case of the Epistemically Serendipitous Brain Lesion (discussed in Section 1a), Sosa’s epistemic virtues must involve competencies or faculties that are subject to proper function conditions. If that is right, then, as Plantinga (p. 81) points out, Sosa’s view (developed so as not to be subject to this counterexample) becomes a variety of proper functionalism. It should be noted however that virtue epistemologists may have other ways of dealing with this case. John Greco (2010: 152) has suggested, for instance, that "in the brain lesion case, the problem is not so much a lack of health as it is a lack of cognitive integration." "The cognitive processes associated with the brain lesion," claims Greco, "are not sufficiently integrated with other of the person’s cognitive dispositions so as to count as being part of cognitive character." Whether this reply is successful may turn on just what is necessary for a cognitive process to exhibit the kind of cognitive integration required. Greco (2010: 152) suggests his own, non-proper-functionalist criteria. But it is open to proper functionalists to argue that part of what is required is incorporation into one’s cognitive design plan.

More recently, Boyce and Moon (2015) have argued that there are other kinds of cases that pose a challenge to the claim that a true belief manifesting a competence is sufficient for its being an item of knowledge. As noted in Section 1c, Boyce and Moon propose a counterexample to what they regard as the central intuition underlying the Swampman Objection to proper functionalism. Their counterexample employs some of the cognitive science literature on initial knowledge, which supports the claim that human beings sometimes come to know things by way of innate, unlearned cognitive responses (see for example Spelke, 1994). Drawing from this literature (as well as from Bergmann, 2006: 116-121), Boyce and Moon argue that some of these innate responses are merely contingently appropriate ways of forming beliefs (where the appropriateness at issue is of a kind necessary for warrant). They argue that while these responses are appropriate for human beings, given the kinds of environments to which humans are adapted, the same need not have been true for other kinds of beings.

Boyce and Moon then go on to argue that these facts entail there are possible cases involving two cognitive agents, who are members of different species, coming to hold the same belief, in the same way, in the same environment, but in which that belief is warranted for one of them (on account of its resulting from a way of forming beliefs that is appropriate for members of that species) but not the other. They further argue that not only do these cases furnish counterexamples to the central intuition motivating the Swampman objection to proper functionalism, but that they also provide a challenge to alternative theories. Boyce and Moon suggest, for instance, that they afford potential counterexamples to Sosa’s theory, at least insofar as it does not recognize factors such as proper function conditions or species membership as relevant to competence possession.

Proper functionalists point to the kinds of cases alluded to above as lending support to the view that a belief having arisen by way of cognitive proper function is necessary for it to count as an item of knowledge. It should be acknowledged, however, that virtue epistemologists have pointed to other kinds of cases in which the opposite seems true. John Greco (2010: 151-153) has noted, for example, that there appear to be “cases of improper function that actually increase a person’s capacity to know.” Greco cites various cases documented by the neurologist Oliver Sacks (1970) in order to illustrate this point. “An obvious example,” says Greco, “is the story of autistic twins, who enjoyed incredible mathematical abilities associated with their autism.” Another case is that of “a man whose illness resulted in an increase in detail and vividness regarding childhood memory.” So much so, Greco notes, that when “these memories were put to use in accurate and detailed paintings of the man’s hometown in Italy…the man came to be considered an expert on the layout and appearance of that town, even though he had not visited there in decades.” Greco claims that these are cases in which “dysfunction gave rise to knowledge.”

What might a proper functionalist say in response to these scenarios? A couple of strategies are suggested by Plantinga (1993c: 74-75) in a reply to Richard Feldman (1993: 48-49). Feldman also points to these kinds of cases as creating difficulties for proper functionalism; in particular, Feldman cites the case of the autistic twins described above. As Feldman notes, these twins had the ability to “just see” (apparently without counting) that the number of matches that had fallen out of a box was 111. In his reply, Plantinga further notes that these same twins could also “just see,” it seems, whether a given six or eight digit number was prime. The first strategy Plantinga suggests for dealing with these cases is to call into question whether the individuals involved really do acquire knowledge in the scenarios described. The second is to concede (at least for the sake of argument) that they do, but argue that this is consistent with proper functionalism.

Regarding the first strategy, Plantinga notes that while the twins mentioned above can in fact reliably identify prime numbers, they lack, according to Sacks, the concept of multiplication. But if the twins lack the concept of multiplication, Plantinga argues, it is not clear that they genuinely grasp the concept of a prime number; so it is not clear that they have the relevant beliefs. Plantinga concedes, however, that this is a less plausible thing to say regarding the twins’ ability to discern the number of matches that had fallen out of a box. Here Plantinga turns to the second strategy. He concedes that while the twins’ “faculties obviously seem to malfunction in some ways,” it is doubtful that they are malfunctioning in producing the belief that there are 111 matches on the floor. Plantinga suggests that, perhaps, the twins have a different design plan than that of other human beings, and that this belief-forming tendency of theirs is subject to proper function conditions. In support of this claim, he notes that it seems possible that this remarkable ability of theirs might become damaged (in such a way that it is no longer reliable); in that case, he contends, we would be inclined to say that this ability had malfunctioned.

Another possibility open to the proper functionalists is to concede that these are cases in which cognitive malfunction enables the acquisition of knowledge, but only by way of truth-aimed proper function. If a typical human being, as a result of cognitive malfunction, suddenly found it seeming to her that she could just see that 111 matches had fallen out of a box, we might doubt that she really knows there are 111 matches. We might think that this belief, formed by way of this new-found tendency of hers, fails to count as knowledge, unless or until she has independent confirmation that the tendency is reliable. Once she does have such confirmation, we might concede that the resulting beliefs do count as knowledge, but only because she learned this to be a reliable way of getting at the truth. So perhaps, in at least some of the cases at issue, the individuals in question do acquire knowledge via belief-forming tendencies resulting from cognitive malfunction, but only by way of having learned those tendencies to be reliable. And if this learning occurs by way of cognitive processes that are in accord with proper function, these cases pose no difficulties for a proper functionalist theory.

This is perhaps not a plausible thing to say regarding all of these cases, however. It is not as plausible a thing to say regarding the individual whose illness caused him to form detailed memorial beliefs pertaining to his hometown in Italy, for example. One reason this is a less plausible thing to say concerning that case is that the person in question is (presumably) forming these beliefs in response to memory phenomenology, which is an epistemically appropriate way for human beings to form beliefs downstream from experience. We would be much less likely to judge this person as having knowledge if these same beliefs arose, say, in response to the kind of phenomenology associated with a vivid daydream, unaccompanied by memorial seemings, even if the resulting beliefs should turn out to be reliably formed. So, the proper functionalist might say, if cognitive malfunction is somehow enabling the acquisition of knowledge in this case, it is not by virtue of causing the subject to respond deviantly to his experience (since, in that regard, he is responding as proper function dictates). It must, rather, be by virtue of its causing some deviation upstream from experience (that is, by virtue of its producing an abnormality in the manner in which the subject’s memorial experiences are produced). Whether this creates a significant problem for proper functionalism, furthermore, may depend on just how the malfunction in question enables knowledge.

However exactly memory information is processed, stored, and retrieved so as to generate belief-producing memorial experiences, it is plausible that the cognitive system responsible (or set of systems responsible) has different functions associated with it. One of these functions is to generate experiences that reliably produce true beliefs. But no doubt there are other functions associated with this system that do not pertain to that goal (indeed, some may even be in tension with it). It is plausible, for instance, that some of those functions pertain to filtering information as it comes in, either by preventing some of that information from being stored in the first place, discarding some of that information after it has been stored, or preventing some of it from being encoded in the relevant experiences. The purpose of this filtration process might not be to secure the production of true beliefs, but to prevent various kinds of information overload, or to highlight important items of information at the expense of discarding others. Plausibly, what occurs in the case at issue is that a malfunction results in the suppression of these kinds of functions, leaving various other truth-aimed functions associated with the production of the relevant memorial experiences intact.

This consideration suggests yet another possible strategy the proper functionalist might have for dealing with these kinds of cases. Yes, she might grant, some of these are cases in which cognitive malfunction enables knowledge, but not by way of interfering with truth-aimed cognitive proper function (at least not with respect to the process that issued in the relevant beliefs). In at least some of these cases, the malfunction enables knowledge by preventing various non-truth-aimed aspects of cognitive proper function from interfering with or dampening various truth-aimed aspects (or perhaps by preventing some truth-aimed aspects of cognitive proper function from interfering with or dampening various other truth-aimed aspects). The consequence is that certain truth-aimed aspects of cognitive proper function result in various items of knowledge they would not have otherwise produced. So even though these are cases in which cognitive malfunction enables knowledge, the proper functionalist might say, they are not counterexamples to the claim that knowledge itself must come by way of truth-aimed cognitive proper function. Or, she might insist, to the extent to which it is unclear that these purported items of knowledge come by way of truth-aimed cognitive proper function, it is also unclear that we should count them as genuine items of knowledge.

It should be pointed out that many virtue theories of knowledge also quite naturally lend themselves to virtue theories of justification. As Sosa (2007: 22-23) points out, for instance, an agent can manifest skill in a performance even when that performance fails to achieve its aim or achieves it merely by luck. An archer might take a skillful shot (to use one of Sosa’s frequent analogies), for instance, while still missing the target (or hitting it only by luck) on account of erratic wind conditions. Similarly, a believer might manifest her skill at coming to hold true beliefs while nonetheless getting it wrong (or getting it right only by luck) on account of being in an epistemically bad environment. Under these circumstances, the belief in question may be said (in Sosa’s terminology) to be "adroit" but not "apt" (p. 23). A belief that is adroit, according to Sosa, may be said to be justified (in one good sense at least) even if it is not an item of knowledge (BonJour and Sosa 2003: 157).

As Sosa (2015: 26-27) himself is well aware, the having of a skill presupposes something like a normal environment. As Sosa points out, we do not say that a person lacks driving skill merely because she is disposed to perform poorly on an icy road in the midst of a snowstorm. What matters is whether she is disposed to perform well under ordinary driving conditions. Similarly, what matters for whether an agent is skilled at coming to hold true beliefs is whether she is capable of doing so in a certain kind of environment. But which sort of environment is the relevant one? According to Bergmann, this question points to an area in which a proper functionalist theory of justification has the advantage.

As Bergmann (2006: 142-143) notes, in a 1991 work Sosa takes justification to be relativized to an environment. The person in the demon world has justified beliefs relative to our environment, according to this view, but not relative to her own. Similarly, alien cognizers who have radically different methods of belief formation than we do (ones that are adapted to their own environment) may have beliefs that are justified relative to their environment but not relative to ours. Bergmann argues however that our ordinary concept of justification does not appear to be relativized in this way.

In later work, as Bergmann also points out, Sosa (2003) holds that there are two different senses in which a belief might be said to be justified. A belief is "adroit-justified" if the method by which it is formed is reliable in the actual world, and "apt-justified" if the method by which it is formed is reliable in the subject’s world. As Bergmann notes, however, this view does not account for our intuition that there is a single sense in which our beliefs, the alien cognizers’ beliefs, and the demon victims’ beliefs are all justified.

Proper functionalism, by contrast, Bergmann maintains, has no difficulty accommodating these intuitions, since it holds that the relevant environment is the one specified by the design plan (which is the same between us and the demon victims but different for the alien cognizers). Whether Bergmann points to a genuine advantage of his theory over Sosa’s in this regard has, however, been disputed. Markie (2009: 374-377) argues, for example, that Bergmann’s own theory faces disadvantages akin to those he attributes to Sosa’s.

As with most disputed views, the extent to which one is drawn to proper functionalist theories will depend in large measure on which intuitions one has, the relative weight one assigns to them, one’s assessment of how well the theories in question accommodate those intuitions, and whether their rivals do any better. And here one’s mileage may vary. But it is a safe bet that proper functionalist theories will continue to serve as serious contenders for the foreseeable future.

4. References and Further Reading

  • Bergmann, Michael. 2006. Justification Without Awareness: A Defense of Epistemic Externalism (Oxford: Oxford UP).
  • Bergmann, Michael. 2013. “Externalist Justification and the Role of Seemings” Philosophical Studies 166: 163-184.
  • BonJour, Laurence and Sosa, Ernest. 2003. Epistemic Justification: Internalism vs. Externalism, Foundations vs. Virtues (Malden, MA: Blackwell Publishing).
  • Boyce, Kenneth and Plantinga, Alvin. 2012. “Proper Functionalism” The Continuum Companion to Epistemology, ed. Andrew Cullison (London: Continuum International Publishing Group).
  • Boyce, Kenneth and Moon, Andrew. 2015. “In Defense of Proper Functionalism: Cognitive Science Takes on Swampman,” Synthese Online First: DOI 10.1007/s11229-015-0899-6. http://link.springer.com/article/10.1007/s11229-015-0899-6.
  • Chignell, Andrew. 2003. “Accidentally True Belief and Warrant” Synthese 137: 445-458.
  • Crisp, Thomas M. “Gettier and Plantinga’s Revised Account of Warrant” Analysis 60: 42-50.
  • Cullison, Andrew. 2013. “Seemings and Semantics” Seemings and Justification, ed. Chris Tucker (Oxford: Oxford UP).
  • Feldman, Richard. 1993. “Proper Functionalism” Nous 27: 34-50.
  • Goldman, Alvin. 2002. “The Sciences and Epistemology” The Oxford Handbook of Epistemology (Oxford: Oxford UP).
  • Graham, Peter. 2012. “Epistemic Entitlement” Nous 46: 449-482.
  • Graham, Peter. 2014. “Warrant, Functions, History” Naturalizing Epistemic Virtue, eds. Abrol Fairweather and Owen Flanagan (Cambridge: Cambridge University Press).
  • Greco, John. 1993. “Virtues and Vices of Virtue Epistemology” Canadian Journal of Philosophy 23: 413-432.
  • Greco, John. 2010. Achieving Knowledge: A Virtue-Theoretic Account of Epistemic Normativity (Cambridge: Cambridge University Press).
  • Kvanvig, Jonathan L. (ed.). 1996. Warrant in Contemporary Epistemology: Essays in Honor of Plantinga’s Theory of Knowledge (London: Rowman & Littlefield Publishers).
  • Lehrer, Keith, and Cohen, Stewart. 1983. “Justification, Truth, and Coherence” Synthese 55: 191-207.
  • Long, Todd R. 2012. “Mentalist Evidentialism Vindicated (and a Super-Blooper Epistemic Design Problem for Proper Function justification)” Philosophical Studies 157: 251-266.
  • Markie, Peter. 2009. “Justification and Awareness” Philosophical Studies 146: 361-377.
  • McNabb, Tyler Dalton. 2015. “Warranted Religion: Answering Objections to Alvin Plantinga’s Epistemology” Religious Studies 51: 477-495.
  • Millikan, Ruth. 1984. “Naturalist Reflections on Knowledge” Pacific Philosophical Quarterly 4: 315-334.
  • Otte, Richard. 1987. “A Theistic Conception of Probability” Faith and Philosophy 4: 427-447.
  • Plantinga, Alvin. 1993a. Warrant: The Current Debate (Oxford: Oxford UP).
  • Plantinga, Alvin. 1993b. Warrant and Proper Function (Oxford: Oxford UP).
  • Plantinga, Alvin. 1993c. “Why We Need Proper Function” Nous 27: 66-82.
  • Sacks, Oliver. 1970. The Man Who Mistook His Wife for a Hat (New York: HarperCollins).
  • Sosa, Ernest. 1993. “Proper Functionalism and Virtue Epistemology” Nous 27: 51-65.
  • Sosa, Ernest. 2007. A Virtue Epistemology: Apt Belief and Reflective Knowledge, Vol. 1 (Oxford: Oxford UP).
  • Sosa, Ernest. 2015. Judgement and Agency (Oxford: Oxford UP).
  • Spelke, Elizabeth. 1994. “Initial Knowledge: Six Suggestions” Cognition 50: 431-445.
  • Tucker, Chris. 2011. “Phenomenal Conservatism and Evidentialism” Evidence and Religious Belief, eds. Kelly James Clark and Raymond J. VanArragon (Oxford: Oxford UP).
  • Tucker, Chris. 2014a. “If Dogmatists Have a Problem with Cognitive Penetration, You Do Too” Dialectica 68: 35-62.
  • Tucker, Chris. 2014b. “On What Inferentially Justifies What: The Vices of Reliabilism and Proper Functionalism,” Synthese 191: 3311-3328.
  • Wolterstorff, Nicholas. 2010. “Ought to Believe—Two Concepts” Practices of Belief: Selected Essays, Vol. 2, ed. Terence Cuneo (Cambridge: Cambridge University Press).

 

Author Information

Kenneth Boyce
Email: boyceka@missouri.edu
University of Missouri
U. S. A.

Dynamic Epistemic Logic

This article tells the story of the rise of dynamic epistemic logic. The rise began in the 1960s with the creation and development of epistemic logic, the logic of knowledge. Then in the late 1980s came dynamic epistemic logic, the logic of change of knowledge. Much of it was motivated by puzzles and paradoxes.

The number of active researchers in these logics grows significantly every year because there are so many relations and applications to computer science, to multi-agent systems, to philosophy, and to cognitive science.

The modal knowledge operators in epistemic logic are formally interpreted by employing binary accessibility relations in multi-agent Kripke models (relational structures), where these relations should be equivalence relations to respect the properties of knowledge. The operators for change of knowledge correspond to another sort of modality, more akin to a dynamic modality. A peculiarity of this dynamic modality is that it is interpreted by transforming the Kripke structures used to interpret knowledge, and not, at least not at first sight, by an accessibility relation given with a Kripke model. Although called “dynamic epistemic logic,” this two-sorted modal logic applies to more general settings than the logic of S5 knowledge alone.

The present article discusses in depth the early history of dynamic epistemic logic. It then mentions briefly a number of more recent developments involving factual change, one (of several) standard translations to temporal epistemic logic, and a relation to situation calculus (a well-known framework in artificial intelligence to represent change). Special attention is then given to the relevance of dynamic epistemic logic for belief revision, for speech act theory, and for philosophical logic. The part on philosophical logic pays attention to Moore sentences, the Fitch paradox, and the Surprise Examination.

Table of Contents

  1. Introduction
  2. An Example Scenario
  3. A History of DEL
    1. Announcements
    2. Other Informative Events
  4. DEL and Belief Revision
  5. DEL and Language
  6. DEL and Philosophy
  7. References and Further Reading

1. Introduction

In this overview we tell the story of the rise of dynamic epistemic logic. It is a bit presumptuous to call it a rise, but we can only observe this rather peculiar phenomenon. The number of active researchers in these logics grows every year because there are so many relations to computer science, to multi-agent systems, to philosophy, and to cognitive science. It all began with the logic of knowledge in the 1960s, and much of it was motivated by puzzles and paradoxes.

Dynamic epistemic logic is the logic of changing knowledge. The starting point of dynamic epistemic logic (DEL) is therefore the logic of knowledge. A founding publication is [42]. We refer to [41] for an overview of epistemic logic and references. A key feature of epistemic logic is that the information state of several agents can be represented by a Kripke model. Given a set of agents and a set of propositional variables, a Kripke model consists of a set of states, a set of accessibility relations (each one a binary relation on the domain), namely one for each agent, and a valuation (that tells which propositional variables are true in which states). In epistemic logic the set of states of a Kripke model is interpreted as a set of epistemic alternatives. The information state of an agent consists of those epistemic alternatives that are possible according to the agent, which is represented by the binary accessibility relation Rα. An agent α knows that a proposition φ is true in a state a of a Kripke model M (written M, a ⊨ Kαφ) if and only if that proposition φ is true in all the states that agent α considers possible in that state (that is, which are Rα-accessible from a). A proposition known by agent α may itself pertain to the knowledge of some agent (for instance if one considers the formula KαKβψ). In this way, a Kripke model with accessibility relations for all the agents represents the (higher-order) information of all relevant agents simultaneously.
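To make these truth clauses concrete, here is a minimal sketch in Python of a Kripke model and the evaluation of formulas. The representation (a model as a triple of states, relations, and valuation; formulas as nested tuples) and all names are illustrative choices made only for this sketch, not notation taken from the literature.

def holds(M, a, phi):
    # Decide whether M, a |= phi, for M = (states, R, V).
    # Formulas are nested tuples: a propositional variable is a string,
    # ("not", phi) and ("and", phi, psi) are the Boolean connectives, and
    # ("K", alpha, phi) is the knowledge operator K_alpha phi.
    states, R, V = M
    if isinstance(phi, str):                     # propositional variable
        return a in V[phi]
    if phi[0] == "not":
        return not holds(M, a, phi[1])
    if phi[0] == "and":
        return holds(M, a, phi[1]) and holds(M, a, phi[2])
    if phi[0] == "K":                            # true in every R_alpha-accessible state
        alpha, psi = phi[1], phi[2]
        return all(holds(M, b, psi) for b in states if (a, b) in R[alpha])
    raise ValueError("unknown operator: %r" % (phi[0],))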

In DEL, information change is modeled by transforming Kripke models. Since DEL is mostly about information change due to communication, the model transformations usually do not involve factual change. The bare physical facts of the world remain unchanged, but the agents’ information about the world changes. In terms of Kripke models that means that the accessibility relations of the agents have to change (and consequently the set of states of the model might change as well). Modal operators in dynamic epistemic languages denote these model transformations. The accessibility relation associated with these operators is not one within the Kripke model, but pertains to the transformation relation between the Kripke models, as the example in the next section will show.

In Section 2 an example scenario is presented which can be captured by DEL. In Section 3 an historical overview of the main approaches in DEL is presented, with details on their modelling techniques. Section 4 discusses how to model belief revision in DEL. Section 5 connects ideas between speech act theory and DEL. Finally, Section 6 is on the relation between DEL and philosophy: it deals with Moore-sentences, the Fitch-paradox, and the Surprise Examination.

2. An Example Scenario


Figure 1: A Kripke model for the situation in which two agents are each given a red or a white card.

Consider the following scenario: Ann and Bob are each given a card that is either red or white on one side (the face side) and nondescript on the other side (the back side). They only see their own card, and so they are ignorant about the other agent’s card. There are four possibilities: both have white cards, both have red cards, Ann has a white card and Bob has a red card, or the other way round. These are the states of the model, and are represented by informative names such as rw, meaning Ann was dealt a red card (r) and Bob was dealt a white card (w). Let us assume that both have red cards, that is, let the actual state be rr. This is indicated by the double lines around state rr in Figure 1. The states of the Kripke model are connected by lines, which are labeled (α or β, denoting Ann or Bob respectively) to indicate that the agents cannot distinguish the states thus connected. (To be complete it should also be indicated that no state can ever be distinguished from itself. For readability these “reflexive lines” are not drawn, but indeed the accessibility relations Rα and Rβ are equivalence relations, since epistemic indistinguishability is reflexive, symmetric and transitive.) In the model of Figure 1 there are no α-lines between those states where Ann has different cards, that is, she can distinguish states at the top, where she has a red card, from those at the bottom, where she has a white one. Likewise, Bob is able to distinguish the left half from the right half of the model. This represents the circumstance that Ann and Bob each know the colour of their own card but not the colour of the other’s card. In the Kripke model of Figure 1 we also see that the higher-order information is represented correctly. Both agents know that the other agent knows the colour of his or her card, and they know that they know this, and so on. It is remarkable that a single Kripke model can represent the information of both agents simultaneously.
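The four-state model of Figure 1 can be spelled out in the same spirit. The state names follow the article (rw means that Ann was dealt a red card and Bob a white one); the variable names and the small knows helper below are again only illustrative.

states = {"rr", "rw", "wr", "ww"}

# An agent cannot distinguish two states that agree on his or her own card.
R_ann = {(s, t) for s in states for t in states if s[0] == t[0]}
R_bob = {(s, t) for s in states for t in states if s[1] == t[1]}

V = {"ann_red": {"rr", "rw"},    # states where Ann holds a red card
     "bob_red": {"rr", "wr"}}    # states where Bob holds a red card

def knows(R, a, prop):
    # The agent knows prop at state a iff prop is true in every accessible state.
    return all(t in V[prop] for (s, t) in R if s == a)

print(knows(R_ann, "rr", "ann_red"))   # True:  Ann knows her own card is red
print(knows(R_ann, "rr", "bob_red"))   # False: she is ignorant about Bob's card
print(knows(R_bob, "rr", "bob_red"))   # True:  Bob knows his own card is red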

Figure 2: A Kripke model for the situation after Ann tells Bob she has a red card.

Figure 3: A Kripke model for the situation after Ann might have looked at Bob’s card.

Suppose that after picking up their cards, Ann truthfully says to Bob “I have a red card”. The Kripke model representing the resulting situation is displayed in Figure 2. Now both agents know that Ann has a red card, and they know that they know she has a red card, and so on: it is common knowledge among them. (A formula φ is common knowledge among a group of agents if everyone in the group knows that φ, everyone knows that everyone knows that φ, and so on.) Hence there is no need anymore for states where Ann has a white card, so those do not appear in the Kripke model. Note that in the new Kripke model there are no longer any lines labeled β. No matter how the cards were dealt, Bob only considers one state to be possible: the actual one. Indeed, Bob is now fully informed.

Now that Bob knows the colour of Ann’s card, Bob puts his card face down on the table, and leaves the room for a moment. When he returns he considers it possible that Ann took a look at his card, but also that she didn’t. Assuming she did not look, the Kripke model representing the resulting situation is the one displayed in Figure 3. In contrast to the previous model, there are in this model lines for Bob again. This is because he is no longer completely informed about the situation. He does not know whether Ann knows the colour of his card, yet he still knows that both Ann and he have a red card. Only his higher-order information has changed. Ann on the other hand knows whether she has looked at Bob’s card and also knows whether she knows the colour of Bob’s card. She also knows that Bob considers it possible that she knows the colour of his card. In the model of Figure 3 we see that two states representing the same factual information can differ by virtue of the lines connecting them to other states: the state rr on the top and rr on the bottom only differ in higher-order information.

In this section, we have seen two ways in which information change can occur. Going from the first model to the second, the information change was public, in the sense that all agents received the same information. Going from the second to the third model involved information change where not all agents had the same information, because Bob did not know whether Ann looked at his card while he was away. The task of DEL is to provide a logic with which to describe these kinds of information change.

3. A History of DEL

DEL did not arise in a scientific vacuum. The “dynamic turn” in logic and semantics ([72], [34] and [60]) very much inspired DEL, and DEL itself can also be seen as a part of the dynamic turn. The formal apparatus of DEL is a lot like propositional dynamic logic (PDL) [40] and quantified dynamic logic (QDL) [39]. There is also a relation to update semantics (US) [36, 93], although in DEL not all formulas are interpreted dynamically, as they are there; formulas and updates are clearly distinguished.

The study of epistemic logic within computer science and AI led to the development of epistemic temporal logic (ETL) in order to model information change in multi-agent systems (see [25] and [55]). Rather than model change by modal operators that transform the model, change is modeled by the progression of time in these approaches. Yet the kinds of phenomena studied by ETL and DEL largely overlap.

After this brief sketch of the context in which DEL was developed, the remainder of the section focuses on the development of its two main approaches. The first is public announcement logic, which is presented in Section 3.1. The second, presented in Section 3.2, is the dominant approach in DEL (sometimes identified with DEL).

a. Announcements

The original publication: Plaza. The first dynamic epistemic logic, called public announcement logic (PAL), was developed by Plaza in [61], published in 1989. The example where Ann says to Bob that she has a red card is an example of a public announcement. A public announcement is a communicative event where all agents receive the same information and it is common knowledge among them that this is so. The language of PAL is given by the following Backus-Naur Form:

φ ::= p | ¬φ | (φ ∧ φ) | Kαφ | [φ]φ

Besides the usual propositional language, Kαφ is read as agent α knows that φ, and [φ]ψ is read as after φ is announced ψ is the case. In the example above, we could for instance translate “After it is announced that Ann has a red card, Bob knows that Ann has a red card” as [rα]Kβrα.

An announcement is modeled by removing the states where the announcement is false, that is, by going to a submodel. This model transformation is the main feature of PAL’s semantics.

 

(i) M, a ⊨ p iff a ∈ V(p)
(ii) M, a ⊨ ¬φ iff not M, a ⊨ φ
(iii) M, a ⊨ φ ∧ ψ iff M, a ⊨ φ and M, a ⊨ ψ
(iv) M, a ⊨ Kαφ iff M, b ⊨ φ for all b such that Rαab
(v) M, a ⊨ [φ]ψ iff M, a ⊨ φ implies M|φ, a ⊨ ψ

In clause (v) the condition that the announced formula be true at the actual state entails that only truthful announcements can take place. The model M|φ is the model obtained from M by removing all non-φ states. The new set of states consists of the φ-states of M. Consequently, the accessibility relations as well as the valuation are restricted to these states. The propositional letters true at a state remain true after an announcement. This reflects the idea that communication can only bring about information change, not factual change.
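Clause (v) and the definition of M|φ translate directly into a small update procedure. In the sketch below the announced formula is represented simply by the set of states at which it is true; the dictionary layout of the model and all names are illustrative assumptions.

def announce(model, phi_states):
    # Return the restricted model M|phi, where phi_states is the set of
    # states of `model` at which the announced formula is true.
    keep = model["states"] & phi_states
    return {
        "states": keep,
        "R": {agent: {(s, t) for (s, t) in pairs if s in keep and t in keep}
              for agent, pairs in model["R"].items()},
        "V": {p: extension & keep for p, extension in model["V"].items()},
    }

# The card model of Figure 1 and the announcement "Ann has a red card".
states = {"rr", "rw", "wr", "ww"}
card_model = {
    "states": states,
    "R": {"ann": {(s, t) for s in states for t in states if s[0] == t[0]},
          "bob": {(s, t) for s in states for t in states if s[1] == t[1]}},
    "V": {"ann_red": {"rr", "rw"}, "bob_red": {"rr", "wr"}},
}
after = announce(card_model, card_model["V"]["ann_red"])
print(sorted(after["states"]))   # ['rr', 'rw'], the model of Figure 2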

Gerbrandy and Groeneveld’s approach. A logic similar to PAL was developed independently by Gerbrandy and Groeneveld in [32], which is more extensively treated in Gerbrandy’s PhD thesis [30]. There are three main differences between this approach and Plaza’s approach. First of all, Gerbrandy and Groeneveld do not use Kripke models in the semantics of their language. Instead, they use structures called possibilities which are defined by means of non-wellfounded set theory [1], a branch of set theory where the foundation axiom is replaced by another axiom. Possibilities and Kripke models are closely linked: possibilities correspond to bisimulation classes of Kripke models [18]. Later, Gerbrandy provided semantics without using non-wellfounded set theory for a simplified version of his public announcement logic [31].

The second difference is that Gerbrandy and Groeneveld also consider announcements that are not truthful. In their view, a logic for announcements should model what happens when new information is taken to be true by the agents. Hence, according to them, what happens to be true deserves no special status. This is more akin to the notion of update in US. In terms of Kripke models this means that by updating, agents may no longer consider the actual state to be possible, that is, Rα may no longer be reflexive. In a sense it would therefore be more accurate to call this logic a dynamic doxastic logic (a dynamic logic of belief) rather than a dynamic epistemic logic, since according to most theories, knowledge implies truth, whereas beliefs need not be true.

Thirdly, their logic is more general in the sense that subgroup announcements are treated (where only a subgroup of the group of all agents acquires new information); and especially private announcements are considered, where only one agent gets information. These announcements are modeled in such a way that the agents who do not receive information do not even consider it possible that anyone has learned anything. In terms of Kripke models, this is another way in which Rα may lose reflexivity.

Adding common knowledge. Semantics for public, group and private announcements using Kripke models was proposed by Baltag, Moss, and Solecki in [14]. This semantics is equivalent to Gerbrandy’s semantics (as was shown in [58]). The main contribution in [14] to PAL was that their approach also covered common knowledge, which is an important concept when one is interested in higher-order information and plays an important role in social interaction (cf. [92]). The inclusion of common knowledge poses a number of technical problems.

b. Other Informative Events

Groeneveld and Gerbrandy’s approach. In addition to a logic for announcements Gerbrandy also developed a system for more general information change involving many agents, each of whom may have a different perspective. This is for instance the case when Ann may look at Bob’s card.

In order to model this information change it is important to realize that distinct levels of information are not distinctly represented in a Kripke model. For instance what Ann actually knows about the cards depends on Rα, but what Bob knows about what Ann knows about the cards depends on Rα as well. Therefore changing something in the Kripke model, such as cutting a line, changes the information on many levels. In order to come to grips with this issue it really pays to use non-wellfounded semantics. One of the ways to think about the possibilities defined by Gerbrandy and Groeneveld is as infinite trees. In such a tree, distinct levels of information are represented by certain paths in the tree. By manipulating the appropriate part of the tree, one can change the agents’ information at the appropriate level. This insight stems from Groeneveld [37] and was also used by Renardel de Lavalette in [62], who introduces treelike lean modal structures using ordinary set theory in the semantics of a dynamic epistemic logic.

Van Ditmarsch’s approach. Inspired by Gerbrandy and Groeneveld’s work, Van Ditmarsch developed a dynamic epistemic logic for modeling information change in knowledge games, where the goal of the players is to obtain knowledge of some aspect of the game. Clue and Battleships are typical examples of knowledge games. Players are never deceived in such games, and therefore the dynamic epistemic logic of Gerbrandy and Groeneveld, in which reflexivity might be lost, seems unsuitable. In Van Ditmarsch’s Ph.D. thesis [86], a logic is presented where all model transformations are from Kripke models with equivalence relations to Kripke models with equivalence relations, which is thus tailored to information change involving knowledge. This approach was further streamlined by Van Ditmarsch in [87] and later extended to include concurrent actions (when two or more events occur at the same time) in [90]. One of the open problems of these logics is that a completeness proof for the axiom systems has not been obtained.

The dominant approach: Baltag, Moss and Solecki. Another way of modeling complex informative events was developed by [14], which has become the dominant approach in DEL. Their approach is highly intuitive and lies at the basis of many papers in the field: indeed, many refer to this approach simply as DEL. Their key insight was that information changing events can be modeled in the same way as situations involving information. Given a situation, such as when Ann and Bob each have a card, one can easily provide a Kripke model for such a situation. One simply considers which states might occur and which of those states the agents cannot distinguish. One can do the same with events involving information. Given a scenario, such as Ann possibly looking at Bob’s card, one can determine which events might occur: either she looks and sees that it is red (she learns that Bob has a red card), or she sees that it is white (she learns that Bob has a white card), or she does not look at the card (she learns nothing new, indicated by the tautology ⊤). It is clear that Ann can distinguish these particular events, but Bob cannot. Such models are called action models or event models.

An event model A is a triple consisting of a set of events E, a binary accessibility relation over E for each agent, and a precondition function which assigns a formula to each event. This precondition determines under what circumstances the event can actually occur. Ann can only truthfully say that she has a red card if in fact she does have a red card. The event model for the event where Ann might have looked at Bob’s card is given in Figure 4, where each event is represented by its precondition.

Figure 4: An event model for when Ann might look at Bob’s card.

Figure 5: The product update for the models of Figure 2 and Figure 4.

The Kripke model of the situation following the event is constructed with a procedure called a product update. For each state in the original Kripke model one determines which events could take place in that state (that is, one determines whether the precondition of the event is true at that state). The set of states of the new model consists of those pairs of states and events (a, e), which represent the result of event e occurring in state a. The new accessibility relation is now easy to determine. If two states were indistinguishable to an agent and two events were also indistinguishable to that agent, then the result of those events taking place in those states should also be indistinguishable. This implication also holds the other way round: if the results of two events happening in two states are indistinguishable, then the original states and events should be indistinguishable as well. (Van Benthem [73] characterizes product update as having perfect recall, no miracles, and uniformity.) The basic facts about the world do not change due to a merely communicative event. And so the valuation in <a, e> simply follows the old valuation in a.

Figure 6: The product update for the models of Figure 1 and Figure 4.

The model in Figure 5 is the result of a product update of the model in Figure 2 and the event model of Figure 4. One can see that this is the same as the model in Figure 3 (except for the names of the states), which indicates that product update yields the intuitively right result.

One may wonder whether the model in Figure 4 represents the event accurately. According to the event model Bob considers it possible that Ann looks at his card and sees that it is white. Bob, however, already knows that the card is red, and therefore should not consider this event possible. This criticism is justified and one could construct an event model that takes this into account, but the beauty of the event model is precisely that it is detached from the agents’ information about the world in such a way that it provides an accurate model of just the information the agents have about the event. This means that product update yields the right outcome regardless of the Kripke model of the situation in which the event occurred. For instance, taking the product update with the model of Figure 1 yields the Kripke model depicted in Figure 6, which represents the situation where Ann might look at Bob’s card immediately after the cards were dealt. The resulting model also represents that situation correctly. This indicates that in DEL static information and dynamic information can be separated.
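The product update construction described above can be sketched in the same style as the earlier snippets. For simplicity the preconditions of events are given here as sets of states (the states at which the precondition formula is true) rather than as formulas; all names are illustrative.

def product_update(M, A):
    # Product update of a Kripke model M with an event model A.  A has a set
    # of "events", a per-agent relation "R" on events, and "pre", mapping each
    # event to the set of M-states at which its precondition is true.
    states = {(a, e) for a in M["states"] for e in A["events"] if a in A["pre"][e]}
    R = {ag: {((a, e), (b, f))
              for (a, e) in states for (b, f) in states
              if (a, b) in M["R"][ag] and (e, f) in A["R"][ag]}
         for ag in M["R"]}
    V = {p: {(a, e) for (a, e) in states if a in ext} for p, ext in M["V"].items()}
    return {"states": states, "R": R, "V": V}

# The model of Figure 2: after Ann's announcement, Bob is fully informed.
M2 = {"states": {"rr", "rw"},
      "R": {"ann": {("rr", "rr"), ("rw", "rw"), ("rr", "rw"), ("rw", "rr")},
            "bob": {("rr", "rr"), ("rw", "rw")}},
      "V": {"bob_red": {"rr"}}}

# The event model of Figure 4: Ann sees red, sees white, or does not look.
# Ann can tell the three events apart; Bob cannot.
events = {"see_red", "see_white", "nothing"}
A = {"events": events,
     "R": {"ann": {(e, e) for e in events},
           "bob": {(e, f) for e in events for f in events}},
     "pre": {"see_red": {"rr"}, "see_white": {"rw"}, "nothing": {"rr", "rw"}}}

print(len(product_update(M2, A)["states"]))   # 4 new states, as in Figure 5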

In the logical language of DEL these event models appear as modalities [A, e], where e is taken to be the event that actually occurs. The language is given by the following Backus-Naur Form

φ ::= p | ¬φ | (φ ∧ φ) | Kαφ | CΓφ | [π]φ, where π ::= (A, e) | (π ∪ π) | (π ; π)

Clauses (i)–(iv) are the same as for PAL. In clause (v) the accessibility relation interpreting the common knowledge operator CΓ is the reflexive transitive closure of the union of the accessibility relations of members of Γ. Clause (vi) is a standard clause for dynamic modalities, except that the accessibility relation for dynamic modalities is a relation on the class of all Kripke models. In clause (vii) it is required that the precondition of the event model is true in the actual state, thus ensuring that <a, e>, the new actual state, exists in the product update. Clauses (viii) and (ix) are the usual semantics for non-deterministic choice and sequential composition.

Not only can informative events where different agents have a different perspective be modeled in DEL; public announcements too can be thought of in terms of event models. A public announcement can be modeled by an event model containing just one event: the announcement. All agents know this is the actual event, so it is the only event considered possible. Indeed, DEL is a generalization of PAL.
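This generalization can be made concrete with the product update sketch above: a public announcement corresponds to an event model with a single event whose precondition is the announced formula and which every agent can identify. Up to a renaming of states, updating with this event model agrees with the announcement update sketched earlier. The helper below is again only illustrative.

def announcement_model(agents, phi_states):
    # A one-event event model for publicly announcing a formula that is true
    # exactly at the states in phi_states.
    e = "announce"
    return {"events": {e},
            "R": {agent: {(e, e)} for agent in agents},
            "pre": {e: phi_states}}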

Criticism, alternatives, and extensions. Many people feel somewhat uncomfortable with having models as syntactical objects. Baltag and Moss have tried to accommodate this by proposing different languages while maintaining an underlying semantics using event models [10, 13]. This issue is extensively discussed in [91, Section 6.1]. There are alternatives using hybrid logic [70], and algebraic logic ([11], [12]). Most papers just use event models in the language.

DEL has been extended in various ways. Operators for factual change [85, 81] and past operators from temporal logic have been added [64, 5]. DEL has been combined with probability [45], justification logic [63] and extended such that belief revision is also within its grasp. Connections have been made between DEL and various other logics. Its relation to PDL, ETL [80], AGM belief revision, and situation calculus [83] has been studied. DEL has been applied to a number of puzzles and paradoxes from recreational mathematics and philosophy. It has also been applied to problems in game theory (see [79] for a very detailed survey), as well as issues in computer security [94]. Complexity and succinctness of DEL have been investigated in [54, 69, 6]. Two recent overviews of DEL are [17, 78]. In the next section we pay attention to DEL and belief revision.

4. DEL and Belief Revision

Something you cannot model in DEL is changing your mind. Once you know a fact, you know it forever, that is, once Kαp is true, it remains true after every update. Even when we have weaker constraints on the accessibility relations (for belief, or even general accessibility), this remains the case. But sometimes, when you believe a fact, you change your mind, and you may come to believe the opposite. This is not shocking or anything; it might have been that you merely did not believe it firmly. This means a change of Kαp into Kα¬p, or, using the better suited belief modality Bα for that, a change of Bαp into Bα¬p. In a different community, that of (AGM) belief revision, this is the most natural operation around—indeed called ‘belief revision’. In this section we briefly survey interactions between such AGM belief revision and dynamic epistemic logic.

Belief revision has been studied from the perspective of structural properties of reasoning about changing beliefs [29], from the perspective of changing, growing and shrinking knowledge bases, and from the perspective of models and other structures of belief change wherein such knowledge bases may be interpreted, or that satisfy assumed properties of reasoning about beliefs. A typical approach involves preferential orders to express increasing or decreasing degrees of belief [48, 56], where these works refer to the ‘systems of spheres’ in [51, 38]. Within this tradition multi-agent belief revision has also been investigated, for example, belief merging [46]. Belief operators are normally not explicit in the logical language, so that higher-order beliefs (I know that you are ignorant of a certain proposition) cannot be formalized. Iterated belief revision may also be problematic.

The link between belief revision and modal logic, that is, explicit belief modalities and belief change modalities in the logical language, was made in a strand of research known as dynamic doxastic logic. This was proposed and investigated by Segerberg and collaborators in works such as [68, 52, 67, 22]. These works are distinct from other approaches to belief revision in modal logics without dynamic modal operators, such as [19, 50, 20], that also influenced the development of dynamic logics combining knowledge and belief change. In dynamic doxastic logics belief operators are in the logical language, and belief revision operators are dynamic modalities. Higher-order belief change, that is, revising one’s beliefs about one’s own or other agents’ beliefs and ignorance, is considered problematic in dynamic doxastic logic, see [52]. In [68, 67] belief revision is restricted to propositional formulas (factual revision). There are dynamic doxastic logics wherein [*φ] merely means belief revision with φ according to some externally defined strategy, as in AGM style (this is the general setup in [68], not unlike the nonepistemic/doxastic modal setup in [71]), but there are also dynamic doxastic logics, such as [67], wherein [*φ] is a recipe operating on a semantic structure and outputting a novel structure, the standard approach in dynamic epistemic logic.

Belief revision in dynamic epistemic logic was initiated in [4, 88, 77, 15]. Of these, [4, 88] propose a treatment involving degrees of belief, based on degrees of plausibility among states in the structures interpreting such logics, so-called quantitative dynamic belief revision; whereas [77, 15] propose a treatment involving comparative statements about plausibility (a binary relation between states denoting more/less plausible), so-called qualitative dynamic belief revision. The latter is clearly more suitable for logics of belief revision, and for notions such as conditional belief. The analogue of the AGM postulate of success must be given up when one incorporates higher-order belief change as in dynamic epistemic logic, where again a prime mover is the Moore-sentence of the form ‘proposition p is true but you don’t know it’, which after acceptance can no longer be believed by you. Many more works on dynamic belief revision have appeared since, for example, [33, 53, 24]. A prior, independent strand modeling belief revision in temporal epistemic logic was initiated in the mid 1990s in [28]. Their integrated treatment of belief, knowledge, plausibilities, and change is similar to the more recent developments modeling belief revision in dynamic epistemic logic, and the relation between the two approaches is incompletely understood.

For an example of belief revision in dynamic epistemic logic, consider one agent and a proposition p that the agent is uncertain about. The agent could be Ann, who is uncertain whether Bob has a red card, as in the proposition rb before. We get the Kripke model depicted in Figure 7, not dissimilar from that in Figure 2. There are two states of the world, one where p is false and another one where p is true. Let us suggestively call them 0 and 1, respectively. The agent has epistemic preferences among these states. Namely, she considers it most plausible that 1 is the actual state, that is, that p is true, and less plausible that 0 is the actual state. We write 1 < 0 where, as is common in the area, the minimal element in the order is the most plausible state (and not, as one might expect, the least plausible state). Let us further assume that p is false.

Figure 7: Ann believes p but considers -p epistemically possible.

The agent believes a proposition when it holds in the most plausible states. For example, she believes that p is true. This is formalized as

bap

We write ba (for belief) instead of ka (for knowledge) as beliefs may be mistaken. Indeed, the agent believes that p but in fact p is false! But we also distinguish a modality for knowledge.

The agent knows a proposition when it holds in all plausible states. These are her strongest beliefs, or knowledge. In the case of this example her factual knowledge only involves tautologies such as p ∨ -p. This is described as

ka(p ∨ -p)

Now imagine that the agent wants to revise her current beliefs. She believes that p is true, but has been given sufficient reason to be willing to revise her beliefs with -p instead. We can accomplish that when we allow a model transformation that makes the 0 state more plausible than the 1 state. There are various ways to do that. In this simple example we can simply observe that it suffices to make the state satisfying the revision formula -p, that is, 0, more plausible than the other state, 1. See Figure 8. As a consequence of that, the agent now believes -p: ba-p is true. Therefore, the revision was successful. This can already be expressed in the initial situation by using a dynamic modal operator [*-p] for the relation induced by the program “belief revision with -p”, followed by what should hold after that program is executed. In this dynamic modal setting we can then write that

-p ∧ bap ∧ [*-p]ba-p

was already true at the outset.

Figure 8: Ann revises her belief with -p

In dynamic epistemic logic, unlike in the original AGM or the subsequent DDL setting, beliefs and knowledge can also be about modal formulas. For example, we not only have that bap, but also that ba(-kap ∧ -ka-p): the agent believes that she does not know whether p. We might say: Ann is aware that her belief in p is not very strong, that it is defeasible.
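The two-state example of Figures 7 and 8 can be simulated with a few lines of Python. The sketch below is ours and deliberately simple: it assumes a single agent, represents the plausibility order as a list of groups from most to least plausible, takes belief to be truth in all most-plausible states, and implements revision with a formula as promoting the states satisfying it above all others (one of several possible revision policies).

```python
# A minimal single-agent sketch of qualitative belief revision. Illustrative only:
# real dynamic belief revision logics define the new order on the whole model.

def believes(plausibility, fact):
    """The agent believes `fact` iff it holds in every most-plausible state."""
    most_plausible = plausibility[0]          # first group = minimal in the order
    return all(fact(state) for state in most_plausible)

def revise(plausibility, fact):
    """Make the states satisfying `fact` strictly more plausible than the rest."""
    satisfying = [s for group in plausibility for s in group if fact(s)]
    rest = [s for group in plausibility for s in group if not fact(s)]
    return [group for group in (satisfying, rest) if group]

# Figure 7: states named by the truth value of p; Ann finds state 1 (p true)
# more plausible than state 0 (p false), although p is in fact false.
order = [[1], [0]]
p = lambda state: state == 1
not_p = lambda state: state == 0

print(believes(order, p))          # True: Ann believes p
order = revise(order, not_p)       # Figure 8: revision with -p
print(believes(order, not_p))      # True: Ann now believes -p
```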

5. DEL and Language

Consider the connection between DEL and speech act theory. Speech act theory started with the work of [7], who argued that language is used to perform all sorts of actions; we make promises, we ask questions, we issue commands, and so forth. An example of a speech act is a bartender who says “The bar will be closed in five minutes” [8]. Austin distinguishes three kinds of acts performed by the bartender: (i) the locutionary act of uttering the words, (ii) the illocutionary act of informing his clientele that the bar will close in five minutes, and (iii) the perlocutionary act of getting the clientele to order one last drink and leave.

Truth conditions, which determine whether an indicative sentence is true or false, are generalized to success conditions that determine whether a speech act is successful or not. In speech act theory there are several distinctions when it comes to the ways in which something can be wrong with a speech act [7, p. 18]. Here we do not make such distinctions and simply speak of success conditions. Searle gives in [66, p. 66] the following success conditions, among others, for an assertion that p by speaker S to hearer H:

  • S has evidence (reasons, and so forth) for the truth of p.
  • It is not obvious to both S and H that H knows (does not need to be reminded of, and so forth) p.
  • S believes p.

Speech act theory has been embraced by the multi-agent systems community, for example, by the Foundation for Intelligent Physical Agents (FIPA). FIPA is an IEEE Computer Society standards organization that promotes agent-based technology and the interoperability of its standards with other technologies. It published a Communicative Act Library Specification [26] that includes a specification of the inform action, which is similar to Searle’s analysis of assertions.

It is worthwhile to join this analysis of assertions to the analysis of public announcements in PAL. It is clear from the list of success conditions that one usually only announces what one believes (or knows) to be true. So, an extra precondition for an announcement that φ by an agent a should be that kaφ. Public announcements are indeed modeled in this way in [61].
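In a formula (our notation, meant to illustrate this idea rather than to quote [61]): an announcement of φ made by agent a can be treated as the truthful public announcement of kaφ, that is,

```latex
% Agent a's assertion of \varphi read as the public announcement of K_a\varphi
% (notation illustrative).
[\varphi]_a\,\psi \;:=\; [K_a\varphi]\,\psi
```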

As an example, consider the case when Ann tells Bob she has a red card: it is more appropriate to model this as an announcement that kara rather than as an announcement that ra. Fortunately, these formulas were equivalent in the model under consideration. Suppose that Ann had said “We do not both have white cards”. When this is modeled as an announcement that ¬(wa ∧ wb), we obtain the model in Figure 9(a). However, Ann only knows this statement to be true when she in fact has a red card herself. Indeed, when we look at the result of the announcement that ka¬(wa ∧ wb), we obtain the model in Figure 9(b). We see that the result of this


Figure 9: An illustration of the difference between the effect of the announcement that φ and the announcement that kaφ, and of an announcement that only changes the agents’ higher-order information

announcement is the same as when Ann says that she has a red card (see Figure 2). By making presuppositions part of the announcement, we are in a way accommodating the precondition (see also [44]).

The second success condition in Searle’s analysis conveys that an announcement ought to provide the hearer with new information. In the light of DEL, one can revise this second success condition by saying that p is not common knowledge, thus taking higher-order information into account. It seems natural to assume that a speaker wants to achieve common knowledge of p, since that plays an important role in coordinating social actions; and so lack of common knowledge of p is a condition for the success of announcing p.

Consider the situation where Ann did look at Bob’s card when he was away and found out that he has a red card (Figure 9(c)). Suppose that upon Bob’s return Ann tells him “I do not know that you have a white card”. Both Ann and Bob already know this, and they also both know that they both know it. Therefore Searle’s second condition is not fulfilled, and so according to his analysis there is something wrong with Ann’s assertion. The result of this announcement is given in Figure 9(d). We see that the information of the agents has changed. Now Bob no longer considers it possible that Ann considers it possible that Bob considers it possible that Ann knows that Bob has a white card. And so the announcement is informative. One can give more and more involved examples to show that change of common knowledge is indeed a more natural requirement for announcements than Searle’s second condition, especially in multi-agent scenarios.

Van Benthem [76] analyzes question and answer episodes using DEL. One of the success conditions of questions as speech acts is that the speaker does not know the answer [66, p. 66]. Therefore posing a question can reveal crucial information to the hearer in such a way that the hearer only knows the answer after the question has been posed ([74],[91, p. 61],[82]).

Professor a is program chair of a conference on Changing Beliefs. It is not allowed to submit more than one paper to this conference, a rule all authors abided by (although the belief that this rule makes sense is gradually changing, but this is beside the point here). Our program chair a likes to have all decisions about submitted papers out of the way before the weekend, since on Saturday he is due to travel to attend a workshop on Applying Belief Change. Fortunately, although there appears not to be enough time to notify all authors, just before he leaves for the workshop his reliable secretary assures him that she has informed all authors of rejected papers, personally giving each of them a call with the sad news concerning their paper.

Freed from this burden, Professor a is just in time for the opening reception of the workshop, where he meets the brilliant Dr. b. The program chair remembers that b submitted a paper to Changing Beliefs, but to his own embarrassment he must admit that he honestly cannot remember whether it was accepted or not. Fortunately, he does not have to demonstrate his ignorance to b, because b’s question ‘Do you know whether my paper has been accepted?’ already gives him the answer. He reasons as follows: a is sure that, had b’s paper been rejected, b would have had that information, in which case b would not have shown his ignorance to a. So, instantaneously, a updates his belief with the fact that b’s paper has been accepted, and he can now answer truthfully with respect to this newly revised belief set.

This phenomenon shows that when a question is regarded as a request [49], the success condition that the hearer is able to grant the request, that is, provide the answer to the question, must be fulfilled after the request has been made, and not before. (However, it is not commonly agreed upon in the literature that questions can be regarded as requests; cf. [35, Section 3].) This analysis of questions in DEL fits well within the broad interest in questions in dynamic semantics [3]. Recent work on DEL and questions is [2, 59, 23].

6. DEL and Philosophy

The role of public announcements as typical informative speech acts focussed attention on a number of situations wherein that form of success cannot be achieved. This has been investigated mainly within philosophical logic, under the headings of ‘Moore sentences’ and the ‘Fitch paradox’. The Moore sentence was introduced by Moore in [57], and his original analysis is that p ∧ ¬Kp (p is true and I don’t know/believe it) cannot sincerely be uttered: as this is an informative speech act, you are supposed to believe what you assert, and it seems incoherent, maybe even paradoxical, to believe a proposition stating that you do not believe it. In the DEL setting we can give this a dynamic interpretation. It is then no longer paradoxical.

If I tell you “You don’t know that I play cello”, this has the conversational implicature “You don’t know that I play cello and it is true that I play cello”. This has the form p ∧ ¬Kp. Suppose I were to tell you again “You don’t know that I play cello.” Then you can respond: “You’re lying. You just told me that you play cello.” We can analyze what is going on here in modal logic. We model your uncertainty, for which a single epistemic modality suffices. Initially, there are two possible worlds, one in which p is true and another in which p is false, which you cannot distinguish from one another. Although in fact p is true, you don’t know that: p ∧ ¬Kp. The announcement of p ∧ ¬Kp results in a restriction of these two possibilities to those where the announcement is true: in the p-world, p ∧ ¬Kp is true, but in the ¬p-world, p ∧ ¬Kp is false.

In the model restriction consisting of the single world where p is true, p is known: Kp. Given that Kp is true, so is ¬p ∨ Kp, and ¬p ∨ Kp is equivalent to ¬(p ∧ ¬Kp), the negation of the announced formula. So, announcement of p ∧ ¬Kp makes it false! Gerbrandy [30, 31] calls this phenomenon an unsuccessful update; the matter is also taken up in [89, 43, 84].
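The whole cello scenario fits in a short sketch. The code below is ours and assumes a single agent whose knowledge is truth in all worlds of the current model, with a truthful announcement modeled as restriction to the worlds where the announced sentence holds; it shows the Moore sentence true before its announcement and false afterwards.

```python
# A minimal sketch of an unsuccessful update. Illustrative only.

def K(model, fact):
    """The agent knows `fact` iff it holds in every world she considers possible."""
    return all(fact(world) for world in model)

def announce(model, fact):
    """A truthful public announcement keeps exactly the worlds where `fact` holds."""
    return [world for world in model if fact(world)]

p = lambda world: world == "p-world"

model = ["p-world", "not-p-world"]       # you cannot tell these two worlds apart

# "p is true and you don't know it"; the K operator is evaluated in whatever
# model the global name `model` refers to at the moment of evaluation.
moore = lambda world: p(world) and not K(model, p)

print(moore("p-world"))                  # True: the Moore sentence holds
model = announce(model, moore)           # announce it...
print(model)                             # ['p-world']: only the p-world survives
print(K(model, p))                       # True: now p is known,
print(moore("p-world"))                  # False: so the announced sentence has become false
```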

We continue with some words on the Fitch paradox [27]. A standard analysis of the Fitch paradox is as follows – see the excellent review of the literature on Fitch’s paradox in the Stanford Encyclopedia of Philosophy [21], and the volume dedicated to knowability [65]. The existence of unknown truths is formalized as ∃p (p ∧ ¬Kp). The requirement that all truths are knowable is formalized as ∀p (p → ◊Kp), where ◊ formalizes the existence of some process after which p is known, or an accessible world in which p is known. Fitch’s paradox is that the existence of unknown truths is inconsistent with the requirement that all truths are knowable.

The Moore-sentence p ∧ ¬Kp witnesses the existential statement ∃p (p ∧ ¬Kp). Assume that it is true. From the knowability principle ∀p (p → ◊Kp) follows the truth of its instance (p ∧ ¬Kp) → ◊K(p ∧ ¬Kp), and from that and p ∧ ¬Kp follows ◊K(p ∧ ¬Kp). Whatever the interpretation of ◊, it results in having to evaluate K(p ∧ ¬Kp). But this is inconsistent for knowledge and belief.
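Set out step by step (a standard reconstruction, with our numbering), the argument runs as follows:

```latex
\begin{align*}
&(1)\ \ p \wedge \neg K p && \text{assumption: some truth is unknown}\\
&(2)\ \ \forall q\,(q \rightarrow \Diamond K q) && \text{knowability principle}\\
&(3)\ \ (p \wedge \neg K p) \rightarrow \Diamond K(p \wedge \neg K p) && \text{instance of (2)}\\
&(4)\ \ \Diamond K(p \wedge \neg K p) && \text{from (1) and (3)}\\
&(5)\ \ K(p \wedge \neg K p) \vdash K p \wedge \neg K p && \text{distribution of } K \text{ and factivity}\\
&(6)\ \ \text{so (4) asks for an inconsistent epistemic state: contradiction}
\end{align*}
```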

We now get to the relation between knowable and DEL. The suggestion to interpret ‘knowable’ as ‘known after an announcement’ was made by van Benthem in [75], and [9] proposes a logic where ‘φ is knowable’ is interpreted in that way. In this setting, ◊p stands for ‘there is an announcement after which p (is true)’, so that ◊Kp stands for ‘there is an announcement after which p is known’, which is a form of ‘proposition p is knowable’.

For example, consider the proposition p for ‘it rains in Liverpool’. Suppose you are ignorant about p: ¬(Kp ∨ K¬p). First, suppose that p is true. I can announce to you here and now that it is raining in Liverpool (according to your expectations, maybe…), after which you know that: 〈p〉Kp stands for ‘p is true and after announcing p, p is known’ (〈φ〉 is the dual of [φ], that is, 〈φ〉ψ is defined by abbreviation as ¬[φ]¬ψ). Now, suppose that p is false. In a similar way, after I announce that, you know that; so we have 〈¬p〉K¬p. If you already knew whether p, having its value announced does not have any informative consequence for you. Therefore, 〈p〉Kp ∨ 〈¬p〉K¬p is a validity, and therefore so is 〈p〉(Kp ∨ K¬p) ∨ 〈¬p〉(Kp ∨ K¬p). We can generalize the statement ‘there is a proposition p such that after its announcement, p is known’ to ‘there exists a proposition q such that after its announcement, p is known’, where q is not necessarily the same as p. Then we have informally captured the meaning of ◊Kp. In other words, this operator is a quantification over announcements. But we have then just proved that ◊(Kp ∨ K¬p) is a validity. For more on such matters, see [9, 84].
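Collecting the validities from this paragraph in one display (our typesetting of the formulas just discussed):

```latex
% 'hence' marks the passage from one validity to the next; <φ> is the dual of [φ].
\langle p\rangle K p \,\vee\, \langle\neg p\rangle K\neg p
\quad\text{hence}\quad
\langle p\rangle (K p \vee K\neg p) \,\vee\, \langle\neg p\rangle (K p \vee K\neg p)
\quad\text{hence}\quad
\Diamond\, (K p \vee K\neg p)
```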

Another paradox in philosophical logical circles that has been analyzed with DEL methods (and that has similar ‘Moore sentence’-like symptoms) is the Surprise Examination. This has been investigated in works such as [30, 31, 89], and more recently by Baltag and Smets using plausibility epistemic structures, along the lines of [16].

Parts of the materials for this overview have been taken from [88, 47, 84], and subsequently revised to make it into a single comprehensive text.

7. References and Further Reading

  • [1] P. Aczel. Non-Well-Founded Sets. CSLI Publications, Stanford, CA, 1988. CSLI Lecture Notes 14.
  • [2] T. Ågotnes, J. van Benthem, H. van Ditmarsch, and S. Minica. Question-Answer games, 2011.
  • [3] M. Aloni, A. Butler, and P. Dekker, editors. Questions in Dynamic Semantics. Elsevier, Amsterdam, 2007.
  • [4] G. Aucher. A combined system for update logic and belief revision. In Proc. of 7th PRIMA, pages 1–17. Springer, 2005. LNAI 3371.
  • [5] G. Aucher and A. Herzig. From DEL to EDL : Exploring the power of converse events. In K. Mellouli, editor, Proc. of ECSQARU, LNCS 4724, pages 199–209. Springer, 2007.
  • [6] G. Aucher and F. Schwarzentruber. On the complexity of dynamic epistemic logic. In Proc. of 14th TARK, 2013.
  • [7] J. L. Austin. How to Do Things with Words. Clarendon Press, Oxford, 1962.
  • [8] K. Bach. Speech acts. In E. Craig, editor, Routledge Encyclopedia of Philosophy, volume 8, pages 81–87. Routledge, London, 1998.
  • [9] P. Balbiani, A. Baltag, H. van Ditmarsch, A. Herzig, T. Hoshi, and T. De Lima. ‘Knowable’ as ‘known after an announcement’. Review of Symbolic Logic, 1(3):305–334, 2008.
  • [10] A. Baltag. A logic for suspicious players: epistemic actions and belief-updates in games. Bulletin of Economic Research, 54(1):1–45, 2002.
  • [11] A. Baltag, B. Coecke, and M. Sadrzadeh. Algebra and sequent calculus for epistemic actions. Electronic Notes in Theoretical Computer Science, 126:27–52, 2005.
  • [12] A. Baltag, B. Coecke, and M. Sadrzadeh. Epistemic actions as resources. J. of Logic Computat., 17(3):555–585, 2007.
  • [13] A. Baltag and L. S. Moss. Logics for epistemic programs. Synthese, 139:165–224, 2004.
  • [14] A. Baltag, L. S. Moss, and S. Solecki. The logic of public announcements, common knowledge, and private suspicions. In I. Gilboa, editor, Proceedings of TARK 98, pages 43–56, 1998.
  • [15] A. Baltag and S. Smets. A qualitative theory of dynamic interactive belief revision. In Proc. of 7th LOFT, Texts in Logic and Games 3, pages 13–60. Amsterdam University Press, 2008.
  • [16] A. Baltag and S. Smets. Group belief dynamics under iterated revision: fixed points and cycles of joint upgrades. In Proc. of 12th TARK, pages 41–50, 2009.
  • [17] A. Baltag, H. van Ditmarsch, and L.S. Moss. Epistemic logic and information update. In J. van Benthem and P. Adriaans, editors, Handbook on the Philosophy of Information, pages 361–456, Amsterdam, 2008. Elsevier.
  • [18] P. Blackburn, M. de Rijke, and Y. Venema. Modal Logic. Cambridge University Press, Cambridge, 2001. Cambridge Tracts in Theoretical Computer Science 53.
  • [19] O. Board. Dynamic interactive epistemology. Games and Economic Behaviour, 49:49–80, 2004.
  • [20] G. Bonanno. A simple modal logic for belief revision. Synthese (Knowledge, Rationality & Action), 147(2):193–228, 2005.
  • [21] B. Brogaard and J. Salerno. Fitch’s paradox of knowability, 2004. http://plato. stanford.edu/archives/sum2004/entries/fitch-paradox/.
  • [22] J. Cantwell. Some logics of iterated belief change. Studia Logica, 63(1):49–84, 1999.
  • [23] I. Ciardelli and F. Roelofsen. Inquisitive dynamic epistemic logic. Manuscript, 2013.
  • [24] C. Dégremont. The Temporal Mind. Observations on the logic of belief change in interactive systems. PhD thesis, University of Amsterdam, 2011. ILLC Dissertation Series DS-2010-03.
  • [25] R. Fagin, J. Y. Halpern, Y. Moses, and M. Y. Vardi. Reasoning about Knowledge. MIT, Cambridge, Massachusetts, 1995.
  • [26] FIPA. FIPA communicative act library specification, 2002. http://www.fipa.org/.
  • [27] F.B. Fitch. A logical analysis of some value concepts. The Journal of Symbolic Logic, 28(2):135–142, 1963.
  • [28] N. Friedman and J.Y. Halpern. A knowledge-based framework for belief change – part i: Foundations. In Proc. of 5th TARK, pages 44–64. Morgan Kaufmann, 1994.
  • [29] P. Gärdenfors. Knowledge in Flux: Modeling the Dynamics of Epistemic States. Bradford Books, MIT Press, Cambridge, MA, 1988.
  • [30] J. Gerbrandy. Bisimulations on Planet Kripke. PhD thesis, University of Amsterdam, 1998. ILLC Dissertation Series DS-1999-01.
  • [31] J. Gerbrandy. The surprise examination in dynamic epistemic logic. Synthese, 155(1):21– 33, 2007.
  • [32] J. Gerbrandy and W. Groeneveld. Reasoning about information change. J. Logic, Lang., Inform., 6:147–169, 1997.
  • [33] P. Girard. Modal logic for belief and preference change. PhD thesis, Stanford University, 2008. ILLC Dissertation Series DS-2008-04.
  • [34] P. Gochet. The dynamic turn in twentieth century logic. Synthese, 130(2):175–184, 2002.
  • [35] J. Groenendijk and M. Stokhof. Questions. In J. van Benthem and A. ter Meulen, editors, Handbook of Logic and Language, pages 1055–1124. Elsevier, Amsterdam, 1997.
  • [36] J. Groenendijk, M. Stokhof, and F. Veltman. Coreference and modality. In S. Lappin, editor, The Handbook of Contemporary Semantic Theory, pages 179–213. Blackwell, Oxford, 1996.
  • [37] W. Groeneveld. Logical Investigations into Dynamic Semantics. PhD thesis, University of Amsterdam, 1995. ILLC Dissertation Series DS-1995-18.
  • [38] A. Grove. Two modellings for theory change. Journal of Philosophical Logic, 17:157–170, 1988.
  • [39] D. Harel. First-Order Dynamic Logic. LNCS 68. Springer, 1979.
  • [40] D. Harel. Dynamic logic. In D. Gabbay and F. Guenthner, editors, Handbook of Philosophical Logic, volume II, pages 497–604, Dordrecht, 1984. Kluwer Academic Publishers.
  • [41] Vincent Hendricks and John Symons. Epistemic logic. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Spring 2006.
  • [42] J. Hintikka. Knowledge and Belief. Cornell University Press, Ithaca, NY, 1962.
  • [43] W. Holliday and T. Icard. Moorean phenomena in epistemic logic. In L. Beklemishev, V. Goranko, and V. Shehtman, editors, Advances in Modal Logic 8, pages 178–199. College Publications, 2010.
  • [44] J. Hulstijn. Presupposition accommodation in a constructive update semantics. In G. Durieux, W. Daelemans, and S. Gillis, editors, Proceedings of CLIN VI, 1996.
  • [45] J. van Benthem, J. Gerbrandy, and B. Kooi. Dynamic update with probabilities. Studia Logica, 93(1):67–96, 2009.
  • [46] S. Konieczny and R. Pino Pérez. Merging information under constraints: A logical framework. Journal of Logic and Computation, 12(5):773–808, 2002.
  • [47] B.P. Kooi. Dynamic epistemic logic. In J. van Benthem and A. ter Meulen, editors, Handbook of Logic and Language, pages 671–690. Elsevier, 2011. Second edition.
  • [48] S. Kraus, D. Lehmann, and M. Magidor. Nonmonotonic reasoning, preferential models and cumulative logics. Artificial Intelligence, 44:167–207, 1990.
  • [49] R. Lang. Questions as epistemic requests. In H. Hiż, editor, Questions, pages 301–318. Reidel, Dordrecht, 1978.
  • [50] N. Laverny. Révision, mises à jour et planification en logique doxastique graduelle. PhD thesis, Institut de Recherche en Informatique de Toulouse (IRIT), Toulouse, France, 2006.
  • [51] D.K. Lewis. Counterfactuals. Harvard University Press, Cambridge (MA), 1973.
  • [52] S. Lindström and W. Rabinowicz. DDL unlimited: dynamic doxastic logic for introspective agents. Erkenntnis, 50:353–385, 1999.
  • [53] F. Liu. Changing for the Better: Preference Dynamics and Agent Diversity. PhD thesis, University of Amsterdam, 2008. ILLC Dissertation Series DS-2008-02.
  • [54] C. Lutz. Complexity and succinctness of public announcement logic. In Proceedings AAMAS 06, Hakodate, Japan, 2006.
  • [55] J.-J. Ch. Meyer and W. van der Hoek. Epistemic Logic for AI and Computer Science. Cambridge University Press, Cambridge, 1995.
  • [56] T.A. Meyer, W.A. Labuschagne, and J. Heidema. Refined epistemic entrenchment. Journal of Logic, Language, and Information, 9:237–259, 2000.
  • [57] G.E. Moore. A reply to my critics. In P.A. Schilpp, editor, The Philosophy of G.E. Moore, pages 535–677. Northwestern University, Evanston IL, 1942. The Library of Living Philosophers (volume 4).
  • [58] L. S. Moss. From hypersets to Kripke models in logics of announcements. In J. Gerbrandy, M. Marx, M. de Rijke, and Y. Venema, editors, JFAK. Essays Dedicated to Johan van Benthem on the Occasion of his 50th Birthday, Amsterdam, 1999. Amsterdam University Press.
  • [59] Michal Peliš and Ondřej Majer. Logic of questions and public announcements. In Nick Bezhanishvili, Sebastian Löbner, Kerstin Schwabe, and Luca Spada, editors, Logic, Language, and Computation, pages 145–157. Springer, 2011. LNCS 6618.
  • [60] J. Peregrin, editor. Meaning: the dynamic turn. Elsevier, Amsterdam, 2003.
  • [61] J. Plaza. Logics of public communications. Synthese, 158(2):165–179, 2007. This paper was originally published as Plaza, J. A. (1989). Logics of public communications. In M. L. Emrich, M. S. Pfeifer, M. Hadzikadic, and Z. W. Ras (Eds.), Proceedings of ISMIS: Poster session program (pp. 201–216). Publisher: Oak Ridge National Laboratory, ORNL/DSRD-24.
  • [62] G. R. Renardel de Lavalette. Changing modalities. J. Logic and Comput., 14(2):253–278, 2004.
  • [63] B. Renne. A survey of dynamic epistemic logic. manuscript, 2008.
  • [64] J. Sack. Adding Temporal Logic to Dynamic Epistemic Logic. PhD thesis, Indiana University, Bloomington, USA, 2007.
  • [65] J. Salerno, editor. New Essays on the Knowability Paradox. Oxford University Press, Oxford, UK, 2009.
  • [66] J. R. Searle. Speech Acts: An Essay in the Philosophy of Language. Cambridge University Press, Cambridge, 1969.
  • [67] K. Segerberg. Irrevocable belief revision in dynamic doxastic logic. Notre Dame Journal of Formal Logic, 39(3):287–306, 1998.
  • [68] K. Segerberg. Two traditions in the logic of belief: bringing them together. In H. J. Ohlbach and U. Reyle, editors, Logic, Language, and Reasoning, pages 135–147, Dordrecht, 1999. Kluwer.
  • [69] T. French, W. van der Hoek, P. Iliev, and B. Kooi. On the succinctness of some modal logics. Artificial Intelligence, 197:56–85, 2013.
  • [70] B. D. ten Cate. Internalizing epistemic actions. In M. Martinez, editor, Proceedings of the NASSLLI 2002 student session, pages 109 – 123, Stanford University, 2002.
  • [71] J. van Benthem. Semantic parallels in natural language and computation. In Logic Colloquium ’87, Amsterdam, 1989. North-Holland.
  • [72] J. van Benthem. Exploring Logical Dynamics. CSLI Publications, Stanford, 1996.
  • [73] J. van Benthem. Games in dynamic-epistemic logic. Bulletin of Economic Research, 53(4):219–248, 2001.
  • [74] J. van Benthem. Logics for information update. In J. van Benthem, editor, Proceedings of TARK 2001, pages 51–67, San Francisco, 2001. Morgan Kaufmann.
  • [75] J. van Benthem. What one may come to know. Analysis, 64(2):95–105, 2004.
  • [76] J. van Benthem. ‘One is a lonely number’: on the logic of communication. In Z. Chatzidakis, P. Koepke, and W. Pohlers, editors, Logic Colloquium ’02. ASL, Poughkeepsie, 2006.
  • [77] J. van Benthem. Dynamic logic of belief revision. Journal of Applied Non-Classical Logics, 17(2):129–155, 2007.
  • [78] J. van Benthem. Logical Dynamics of Information and Interaction. Cambridge University Press, 2011.
  • [79] J. van Benthem. Logic in Games. MIT Press, 2013. To appear.
  • [80] J. van Benthem, J.D. Gerbrandy, T. Hoshi, and E. Pacuit. Merging frameworks for interaction. Journal of Philosophical Logic, 38:491–526, 2009.
  • [81] J. van Benthem, J. van Eijck, and B. Kooi. Logics of communication and change. Information and Computation, 204(11):1620–1662, 2006.
  • [82] W. van der Hoek and R. Verbrugge. Epistemic logic: a survey. In L.A. Petrosjan and V.V. Mazalov, editors, Game theory and Applications, volume 8, pages 53–94, 2002.
  • [83] H. van Ditmarsch, A. Herzig, and T. De Lima. From situation calculus to dynamic epistemic logic. Journal of Logic and Computation, 21(2):179–204, 2011.
  • [84] H. van Ditmarsch, W. van der Hoek, and P. Iliev. Everything is knowable – how to get to know whether a proposition is true. Theoria, 78(2):93–114, 2012.
  • [85] H. van Ditmarsch, W. van der Hoek, and B. Kooi. Dynamic epistemic logic with assignment. In Proc. of 4th AAMAS, pages 141–148. ACM, 2005.
  • [86] H. P. van Ditmarsch. Knowledge games. PhD thesis, University of Groningen, 2000. ILLC Dissertation Series DS-2000-06.
  • [87] H. P. van Ditmarsch. Descriptions of game actions. J. Logic, Lang., Inform., 11:349–365, 2002.
  • [88] H. P. van Ditmarsch. Prolegomena to dynamic logic for belief revision. Synthese, 147:229– 275, 2005.
  • [89] H. P. van Ditmarsch and B. Kooi. The secret of my success. Synthese, 151(2):201–232, 2006.
  • [90] H. P. van Ditmarsch, W. van der Hoek, and B. Kooi. Concurrent dynamic epistemic logic. In V. F. Hendricks, K. F. Jørgensen, and S. A. Pedersen, editors, Knowledge Contributors, pages 45–82. Kluwer, Dordrecht, 2003.
  • [91] H. P. van Ditmarsch, W. van der Hoek, and B. Kooi. Dynamic Epistemic Logic. Springer, Berlin, 2007.
  • [92] Peter Vanderschraaf and Giacomo Sillari. Common knowledge. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Fall 2007.
  • [93] F. Veltman. Defaults in update semantics. Journal of Philosophical Logic, 25:221–261, 1996.
  • [94] Y. Wang, L. Kuppusamy, and J. van Eijck. Verifying epistemic protocols under common knowledge. In Proc. of 12th TARK, pages 257–266. ACM, 2009.

 

Author Information

Hans van Ditmarsch
Email: hans.van-ditmarsch@loria.fr
University of Lorraine
France

and

Wiebe van der Hoek
Email: wiebe@csc.liv.ac.uk
The University of Liverpool
United Kingdom

and

Barteld Kooi
Email: B.P.Kooi@rug.nl
University of Groningen
Netherlands

Classification

One of the main topics of scientific research is classification. Classification is the operation of distributing objects into classes or groups which are, in general, less numerous than the objects themselves. It has a long history that has developed over four periods: (1) Antiquity, where its lineaments may be found in the writings of Plato and Aristotle; (2) the Classical Age, with natural scientists from Linnaeus to Lavoisier; (3) the 19th century, with the growth of chemistry and information science; and (4) the 20th century, with the arrival of mathematical models and computer science. Since that time, and from an extensional viewpoint, mathematics, specifically the theory of orders and the theory of graphs or hypergraphs, has facilitated the precise study of strong and weak forms of order in the world, and the computation of all the possible partitions, chains of partitions, covers, hypergraphs or systems of classes that we can construct on a domain. With the development of computer science, Artificial Intelligence, and new kinds of languages such as object-oriented languages, an intensional approach has complemented the extensional one. Ancient discussions between Aristotle and Plato, Ramus and Pascal, Jevons and Joseph found some kind of revival in object-oriented modeling and programming, most object-oriented languages being concerned with hierarchies, or partial orders: these structures in fact reflect the relations between classes in those languages, which generally admit single or multiple inheritance. In spite of these advances, most classifications are still based on the evaluation of resemblances between the objects that constitute the empirical data. This resemblance is almost always computed by means of some notion of distance and some algorithm for aggregating classes, so all these classifications remain, for technical and epistemological reasons detailed below, very unstable. A real algebra of classifications, which could explain their properties and the relations existing between them, is still lacking. Though the aim of a general theory of classifications may be wishful thinking, a recent conjecture gives hope that a metaclassification (or classification of all classification schemes) is possible.

Table of Contents

  1. General Introduction: Classification Problems
  2. A Brief History of Classifications
    1. From Antiquity to the Renaissance
    2. From Classical Age to Victorian Taxonomy
    3. The Beginning of Modernity
  3. The Problem of Information Storage and Retrieval
  4. Ranganathan and the PMEST Scheme
  5. Order and Mathematical Models
    1. Extensional Structures
    2. A Glance at an Intensional Approach
  6. The Idea of a General Theory of Classifications
  7. References and Further Readings

1. General Introduction: Classification Problems

Classification problems are one of the basic topics of scientific research. For example, mathematics, physics, natural sciences, social sciences and, of course, library and information sciences all make use of taxonomies.  Classification is a very useful tool for ordering and organization. It has increased knowledge and helped to facilitate information retrieval.

Roughly speaking, ‘classification’ is the operation consisting of sharing, distributing or allocating objects into classes or groups which are, in general, less numerous than the objects themselves. Commonly, classifications are defined on finite sets. However, if the objects are, for example, mathematical structures, there can be infinite classifications. In this case the previous requirement must, of course, be weakened: we may only want the (infinite) cardinal of the classification to be less than or equal to the (infinite) cardinal of the set of objects to be classified. What we call a ‘classification’ is also the result of this operation. We want, as far as possible, this result to be stable, namely, that the classification itself remains unchanged under small transformations of the data (of course, the sense of this requirement will have to be made precise). Various situations may occur: the classes may or may not intersect, be finite or infinite, formal or fuzzy, hierarchically ordered or not, and so on.

The basic operation of grouping elements into classes, which simplifies the world, is a very powerful one, but it also raises many questions. In particular, a number of philosophers, from Socrates to Diderot and even post-modern philosophers, have criticized it (see, for instance, Foucault 1967). Yet the operation has multiple benefits. First is the substitution of a rational and regular order for chaotic and muddled multiplicities. Second is the reduction of the size of sets: once we have constituted equivalence classes, we can work with these classes and no longer with the elements. Third, and finally, to make a partition of a set means locating in it a symmetry that decreases the complexity of the problem and so simplifies the world. We can say with Dagognet (1984, 1990) that “less is more”: compressing the data really brings an intellectual gain.

Having outlined the main reasons for classifications, let us see how these classifications have developed and which forms they have taken over the course of time.

2. A Brief History of Classifications

The history of classifications (Dahlberg 1976) develops over four periods. From Plato and Aristotle to the 18th century, ancient classifications are hierarchical; they are finite and generally based on a single criterion. During the 18th century, new classifications appear, which are multicriteria – a domain can be co-divided in many ways, as Kant said in his Logic (see Kant 1988) – and indefinite or virtually infinite (Kant believed that we could endlessly subdivide the extension of a concept). At the end of the 18th and the beginning of the 19th century, with the chemical classifications of Lavoisier and then of Mendeleyev, one discovers combinatorial classifications or multiple crossed orders, like the chemical table of the elements, which correspond to a new concept of classification. In the 20th century, through the progress of mathematical order theory, factorial analysis of correspondences, and automatic classification, formal models begin to develop.

a. From Antiquity to the Renaissance

The French commentator of Greek philosophers R. Joly said that a typical trend of the Greek spirit was to reduce a multiple and complex reality to a few categories which satisfy reason, both by their restricted number and by the clear and precise sense attached to each of them. Indeed, Plato and Aristotle are among the great classifiers of these ancient times.

In all of Plato’s Dialogues, and especially in the late ones (Parmenides, Sophist, Politicus, Philebus), Plato obviously classified a lot of things (ways of life, political constitutions, pleasures, arts, jobs, kinds of knowledge, and so forth). Generally, for Plato, things were classified in relation to the distance that separates them from their archetypal forms, which yields some order (or pre-order) on them. Plato’s classifications are finite, hierarchical, dichotomous, and based on a single criterion. For example, in Gorgias (465c), the set of all practices is divided into two classes, the practices concerning the body and the practices concerning the soul, each of them being then divided into two others: gymnastics and medicine, on one hand, and legislation and justice, on the other hand. In the same way, in Republic (510a), the whole universe, viewed as the set of all real things, is divided into the visible world and the invisible world, each class being subdivided into images and objects or living beings, on the one hand, and mathematical objects and ideas, on the other hand.

According to Plato, the rules of classification are very simple. First, we have to make symmetric divisions in order to get well-balanced classes. For example, if we classify the peoples, we have to avoid setting the Greeks against all the other peoples, because one of the classes will be plethoric while the other will have only one element (Politicus, 262a). Second, as a good cook who cuts up an animal (this metaphor is in the Phaedrus), we must also choose the good joints or articulations. For example, in the field of numbers, it would be senseless to set 1000 against the 999 other numbers; in contrast, the opposition even/odd, or prime/not prime, is a real one. Third, in general, we must also avoid using negative determinations. For example, we have to avoid determinations like not-A, because it is impossible that non-being has sorts or species, and such determinations block the development of thought.

Plato did not observe these wise rules himself, thus incurring Aristotle’s criticisms. Against Plato’s theory, Aristotle argues that the method of division is not a powerful tool because it is non-conclusive: it does not produce syllogisms (First Analytics, I, 31). In another text (Second Analytics, II, 5), Aristotle insists on the contingency of the passage from one predicate to another; that is, in the Platonic division, for every new attribute we can wonder why it is this attribute as opposed to another one. The differences introduced by dichotomies can also be purely negative and thus do not necessarily define a real being. Moreover, binary divisions presuppose that the number of primitive species is a power of 2. In a division, a predicate can belong to different primitive species; for example, “bipedalism” applies to both birds and humans, but, according to Aristotle, the application of this term is not the same in both cases. Finally, the Platonic division confuses extensional and intensional views. It can identify the triangle, which is a kind, with one of its properties, for example, the equality of the sum of its angles to two right angles.

These questions get no answer in Plato’s theory, and Aristotle rejected Plato’s method of division. But Aristotle also rejected the Platonic doctrine of forms. According to Aristotle (Metaphysics, I, 9), Plato’s forms fail to explain how there could be permanence and order in the world. Furthermore, he argued, Plato’s theory of forms cannot explain anything at all in our material world. The properties that the forms have (according to Plato the forms are eternal, unchanging, transcendent, and so forth) are not compatible with material objects, and the metaphor of participation or imitation breaks down in a number of cases. For instance, it is unclear what it means for a white object to participate in, or to copy, the form of whiteness; that is, it is hard to understand the relationship between the form of whiteness and white objects themselves.

For all these reasons, Aristotle develops his own concepts and his own logic of classifications. In the Topics (I, chap. 1), Aristotle introduces the notions of kind, species and property, and a whole theory of basic predication that was subsequently developed in the work of Porphyry and Boethius. This theory is based on the opposition between essence, all of the characters that define a thing, and accident, the qualities whose presence or absence does not modify the thing’s essence. A commentator of the Aristotelian system, Porphyry (234-305) puts these distinctions to good use and tries to specify the hierarchy of the kinds and the species as defined by Aristotle. The famous Porphyrian Tree is the first abstract tree outlining these distinctions, and it illustrates the subordination existing between them (see Figure 1).


Figure 1: The Porphyrian Tree

In a passage of his Commentary on Aristotle’s Categories (2014), Porphyry asked the questions that were at the origin of a hotly-debated controversy over whether universals are physical or immaterial substances; that is, a contention over whether universals are separated from sensible things or involved in them, finding their consistency therein. In opposition to the traditional views (Platonic and Aristotelian or scholastic realisms), other solutions appeared. For example, Nominalism (Roscelin, 11th c.) claimed that universals are but words and that nothing corresponds to them in Nature, which knows only the singular. Against that was Conceptualism (Abélard, 12th c., and Ockham, 14th c.), the view that kinds exist as predicates of subjects that are themselves real. In the last centuries of the Middle Ages and in the Renaissance, we also find great scholars who worked on classification, in particular Francis Bacon (1561-1626), whose work on the classification of knowledge inspired the great librarians of the 19th century. But the logic of classifications, which remains, at this time, Aristotelian logic, receives practically no new development until the 18th century.

b. From Classical Age to Victorian Taxonomy

In the Classical Age, taxonomy as a fully-fledged discipline began to develop for several reasons. One important reason emerges from the birth of natural science and the need to organize floras and faunas in connection with the growth of the human population on Earth, in the context of the beginning of agronomy (Dagognet, 1970). In this period, naturalists like Tournefort (1656-1708), Linnaeus (1707-1778), De Jussieu (1748-1836), Desfontaines (1750-1833) and Cuvier (1769-1832) tried to classify plants and animals all around the world.

When classifying things or beings, one must have a criterion or an index in order to make classes and to separate varieties inside the classes. Indeed, all those naturalists differ on the criteria of their classifications. For example, concerning the classification of plants, Tournefort chose the corolla, while Linnaeus chose the sexual organs of the plant. Concerning the animals, the classification of Cuvier violates Aristotle’s recommendations by opposing vertebrates to invertebrates which, by chance, are something real. At the end of the century, Kant summarizes, in his Logic (1800), the main part of the knowledge about classifications in this period, by specifying the definitions of a certain number of terms and operations that the naturalists of the time used empirically. Kant was only interested in the forms of classifications. In his Logic he defines a logical division of a concept as “the division of all the possible contained in it”. The rules of this division are the following: 1) the members of the division are mutually exclusive, 2) their union restores the sphere of the divided concept, 3) each member of the division can itself be divided (the division of such divided members is a subdivision). Rules (1) and (2) seem to indicate that Kant was approaching our concept of a partition. But rule (3) shows that he did not have the concept of a chain of partitions, since he does not see that the subdivisions at the same level together form one and the same partition.
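In modern terms, Kant's first two rules say that the members of a division must form a partition of the divided sphere. A minimal sketch (ours, with made-up classes) of that check:

```python
# Check Kant's first two rules for a division: members mutually exclusive,
# union restoring the divided sphere -- that is, the classes partition the domain.

def is_partition(domain, classes):
    """True iff the classes are non-empty, pairwise disjoint, and cover the domain."""
    union = set()
    for cls in classes:
        cls = set(cls)
        if not cls or union & cls:      # an empty class, or overlap with an earlier one
            return False
        union |= cls
    return union == set(domain)

animals = {"sparrow", "eagle", "trout", "salmon"}
print(is_partition(animals, [{"sparrow", "eagle"}, {"trout", "salmon"}]))  # True
print(is_partition(animals, [{"sparrow"}, {"trout", "salmon"}]))           # False: eagle uncovered
```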

These problems were also discussed during the 19th century in Anglo-Saxon countries, even after Darwin’s theory of evolution. One may think that Darwin’s belief in branching evolution was based upon his familiarity with the taxonomy of his day, of which he was very much aware. There were great taxonomists in England in the Victorian age, and some of them (for instance the paleontologist H. Alleyne Nicholson, a specialist in British Stromatoporoids) were prodigious and wrote monographs still in force today (Woodward 1903). At approximately the same time, L. Agassiz (Agassiz 1957), a scholar in classification theory, wrote about taxonomic concepts like categories, divisions, forms, homologies, analogies, and so on. Among the different taxonomic systems mentioned in his Essay on Classification are the classical systems of Leuckart, Vogt, Linnaeus, Cuvier, Lamarck, de Blainville, Burmeister, Owen, Ehrenberg, Milne-Edwards, von Siebold, Stannius, Oken, Fitzinger, MacLeay, von Baer, van Beneden, and van der Hoeven. In The Origin of Species, Darwin himself said that it was a

truly wonderful fact…that all animals and all plants throughout all time and space should be related to each other in group subordinate to group, in the manner which we everywhere behold−namely, varieties of the same species most closely related together, species of the same genus less closely and unequally related together, forming sections and sub-genera, species of distinct genera much less closely related, and genera related in different degrees, forming sub-families, families, orders, subclasses, and classes. (1859, 128)

But what he called the “principle of divergence” – namely, the fact that during the modification of the descendants of any one species, and during the incessant struggle of all species to increase in numbers, the more diversified these descendants become, the better will be their chance of succeeding in the battle of life – was illustrated by his famous tree-like diagram sketched in 1837 in the notebook in which he first posited evolution. From this time on, tree-like structures, which had also been of great use in chemistry and would be formalized at the end of the century by the mathematician Arthur Cayley, tended to replace classifications.

c. The Beginning of Modernity

A new kind of classification appeared at the end of the 18th century, with the development of chemistry: combinatorial classifications or multiple crossed orders. A classification of this kind is either the crossing of two or more divisions, or the crossing of two or more hierarchies of divisions. In such a structure, as Granger (1967) said, “elements are distributed according to two or several dimensions, giving rise to a multiplication table”. In a combinatorial classification, the elements themselves are not necessarily distributed into classes; only the components of these elements are classified. For Granger, this model refers to the Cartesian plane and to the ordinal principle on which it is based. The Cartesian plane results from a will to order a certain distribution of points in space, by ordering the points in every row and then by ordering the rows themselves. The virtue of multiple orders is to place what is classified at the intersection of a line and a column. So, as Dagognet (1969) has shown, when an element is absent or a compartment is empty, it can be defined by its surroundings. This is what happened in the Mendeleyev table. This table has two main advantages. First, the table is creative: the mass of a chemical element can be calculated from those which surround it (see Figure 2), and hence chemical elements that were still unknown, and were only isolated in laboratories some 30 years later, had already been accounted for by Mendeleyev. Second, the classification is not a purely spatial picture of the world: temporality, in particular the future, is already present in it.


Figure 2: The mass of an unknown element in the Mendeleyev Table
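One common reading of this “definition by the surroundings” is interpolation: the missing mass is estimated from the masses of the neighbouring compartments in the crossed order. A toy sketch (ours, with placeholder numbers rather than historical values):

```python
# An illustrative sketch of filling an empty compartment of a combinatorial
# classification from its surroundings, here simply by averaging the neighbours.

def estimate_from_neighbours(above, below, left, right):
    """Estimate a missing value as the mean of its four neighbours in the table."""
    neighbours = [above, below, left, right]
    return sum(neighbours) / len(neighbours)

# Placeholder masses for the compartments around an empty one.
print(estimate_from_neighbours(above=10.0, below=30.0, left=18.0, right=22.0))  # 20.0
```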

3. The Problem of Information Storage and Retrieval

At the end of the 19th century, the development of scientific research, which raised the question of information storage and retrieval, encouraged the constitution of voluminous library catalogues. These included Dewey’s decimal classification, Otlet and La Fontaine’s universal decimal classification, and the Library of Congress classification. The aim of these classifications was to account for the whole of knowledge in the world, but many problems arose from this attempt of library science to organize all knowledge. Three rules were commonly respected in more natural classifications: 1) everything classified must appear in the catalogue (which must be, in principle, finite and complete), 2) there is no empty class, 3) nothing can belong to more than one class. Generally, these rules are not respected in library classifications. To face the extraordinary challenge of cataloguing knowledge that grows indefinitely over time, the big library classifications designed at the end of the 19th century adopted the principle of decimalization. This system was used because decimal numbers, used as numeral items, allow indefinite extensions of a classification. Suppose you start with 10 main classes, from 0 to 9. If you add a digit to each number, you get the possibility of forming 100 classes (from 00 to 99), and if you go on, you can obtain 1000 classes (from 000 to 999). You can then also put a comma or a point and define items like 150.234; after the point, the sequence of digits is potentially infinite and you can go as far as is needed. Another difference is that library classifications can sometimes allow for vacant classes in their hierarchy, and can also allow the inscription of classified subjects in several places. Vacant classes are used because a librarian must keep some room for new documents that are still temporarily unclassified. Multiple inscriptions are used because readers, who sometimes do not know exactly what they are looking for, need broad-ranging access to knowledge. This made way for the existence of entries like author, subject, time, place, and so forth. The requirement of decimalization is obvious in the Dewey Decimal Classification (DDC) proposed by Melvil Dewey in 1876 (Béthery 1982). This classification is made up of ten main classes or categories, each of them divided into ten secondary classes or subcategories, which in turn contain ten subdivisions. The partition of the ten main classes thus gives successively 100 divisions and 1000 sections.

DDC — main sections

  • 000 – Computer Science, Information and General Works
  • 100 – Philosophy and Psychology
  • 200 – Religion
  • 300 – Social Sciences
  • 400 – Language
  • 500 – Science (including Mathematics)
  • 600 – Technology and Applied Science
  • 700 – Arts and Recreation
  • 800 – Literature
  • 900 – History, Geography and Biography

In the same way, the Universal Decimal Classification (UDC) of Otlet and La Fontaine presents, overall, the same hierarchical organization, except for the fourth main class, which is left empty (thus applying the aforementioned principle of vacant classes).

As librarians rapidly observed, one undesirable consequence of such decimal schemes is the increasing fragmentation of subjects as the taxonomist’s work proceeds. For example, the Dewey Classification, though having the useful advantage of being infinitely extendible, rapidly turns into a list or a nomenclature. This is also the case for the UDC of Otlet and La Fontaine, and for all classifications of the same type. A first attempt to make up for this disadvantage consists of allowing some junctions between categories in the classification. A second is the possibility of using tables (7 in the DDC) to aid in the search for a complex object which may be located in different places. For instance, a book of poetry written by various poets from around the world would appear in several classes, indexed thanks to the tables. In general, DDC combines elements from different parts of the structure in order to construct a number representing the subject content, often combining two or more subject elements with linking numbers and geographical and temporal elements. The method consists of forming a new item rather than drawing upon a list containing each class and its meaning. For example, 330 (for Economics) + 9 (for Geographic Treatment) + 04 (for Europe), with the use of ‘/’, gives 330/94 (European economy). Another example: 973 (for the United States) + 05 (division for periodicals), with the use of the point ‘.’, gives 973.05 (periodicals concerning the United States generally).
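The surface form of this number building can be mimicked with a few lines of code. The helper below is ours and only reproduces the two examples from the text; actual DDC number building follows the detailed instructions of its auxiliary tables.

```python
# A toy sketch of composing DDC-style numbers by joining a base class with
# further elements through a single separator. Not a faithful DDC implementation.

def build_number(base, *elements, sep=""):
    """Concatenate a base class number with further elements, inserting `sep` once."""
    return base + sep + "".join(elements)

# 330 (Economics) with 9 (geographic treatment) and 04 (Europe), combined as 94:
print(build_number("330", "94", sep="/"))    # 330/94  (European economy)

# 973 (United States) with 05 (division for periodicals):
print(build_number("973", "05", sep="."))    # 973.05  (periodicals on the United States)
```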

Other specific features occur in library classifications which tend to make them very different from classical scientific taxonomies. One spectacular difference with hierarchical classifications in Zoology or Botany is, as we have already seen, that it is possible for subjects to appear in more than one class. For example, in DDC, a book on Mathematics could appear in the 372.7 section or in the 510 section, depending on whether the book is a monograph instructing teachers on how to teach mathematics or a mathematics textbook for children. Another difference is the relative flexibility of library classifications.

Though there exist improvements, UDC and DDC, like most of the classifications constructed at the same time (see Bliss 1929) are based on a perception of knowledge and of the relationships between academic disciplines extant from 1890 to 1910. Moreover, though updated regularly, UDC and DDC, as decimal systems, are less hospitable to the addition of new subjects. These kinds of classification are based on fixed and historically dated categories. One may observe, for example, that none of the main concepts of our present library science (digital library, knowledge organization, automatic indexing, information retrieval, and so forth) were included in the index of the 2005 UDC edition, and that technical taxonomies generally require more complex features (Dobrowolski 1964).

4. Ranganathan and the PMEST Scheme

There have been many attempts to solve the aforementioned librarian problems, some of them well known since the middle of the 20th century. In the course of the 20th century, new modes of indexing and original classification schedules appeared in library science with the Indian librarian Shiyali Ramamrita Ranganathan (1933, 2006) and his faceted classification – also called “Colon Classification” (CC) because of its use of the colon to indicate relations between subjects in the earlier editions.

Ranganathan was originally a mathematician and knew little about libraries. But he took charge of the Madras University Library and was then deputed by his university to study library science in London. There, he attended the School of Librarianship at University College and discovered, as he said later, the “charm of classifications”, as well as their problems. He saw very quickly that decimal classifications did not give satisfaction to users. Instead, he had the vision of a Meccano set: rather than having ready-made rigid toys, one can construct them from a few fundamental components. This made him think of a new kind of classification.

It appeared to Ranganathan that the new theory might be organized, at the highest level, around five fundamental categories (FC) called facets: Personality, Matter, Energy, Space and Time, in summary PMEST. Each isolate facet of a compound subject is deemed to be a manifestation of one (and only one) of the five fundamental categories. There are also subfacets, so that the facet scheme PMEST, together with the subfacets we may form from it, is used to sort subclasses within the main classes of the classification.

The difference with previous classifications lies in the way one defines ‘subfacets’. Rather than simply dividing the main classes into a series of subordinate classes, one subdivides each main class by particular characteristics into facets. Facets, labeled by Arabic numbers, are then combined to make subordinate classes as needed. For example, Literature may be divided by the characteristic of language into the facet of Language, including English, German, and French. It may also be divided by form, which yields the facet of Form, including poetry, drama and fiction. So CC contains both basic subjects and their facets, which contain isolates. A basic subject stands alone, for example Literature in the subject English Literature, while an isolate, in contrast, is a term that modifies a basic subject, for example the term ‘English’. Every isolate in every facet must be a manifestation of one of the five fundamental categories of the PMEST scheme.
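
As a minimal illustration of such composition, the following Python sketch joins a basic subject with facet isolates to form a class mark. The labels and the colon separator are invented for the example and do not reproduce Ranganathan's actual Colon Classification notation.

    # Illustrative sketch only: composing a faceted class mark from a basic
    # subject and facet isolates. Not actual Colon Classification notation.

    def compose_class_mark(basic_subject: str, isolates: list) -> str:
        """Join a basic subject with its facet isolates, colon-separated."""
        return ":".join([basic_subject] + isolates)

    # 'Literature', with an isolate from the Language facet and one from the Form facet:
    print(compose_class_mark("Literature", ["English", "Poetry"]))
    # -> Literature:English:Poetry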

The advantages of the CC are numerous. The first is a greater flexibility in determining new subjects and subject numbers. A second is the concept of phases, which allows taxonomists to readily combine most of the main classes in a subject. Consider, for example, a subject like Mathematics for biologists. In this case, single-class-number enumerative systems, such as those predominating in US libraries, tend to force classifiers to choose either Mathematics or Biology as the main subject. CC, however, supplies a specific notation to indicate this bi-phased condition.

Still, some problems remain unsolved. In CC, facets (that is, small components of larger entities or units) are similar to the flat faces of a diamond, which reflect the underlying symmetry of the crystal structure, so that the general structure of Ranganathan's classification, like that of a faceted classification in general, is a kind of permutohedron. In principle, all descriptions may be formed, whatever their order. For example, if we have to classify a paper about seasonal variations of the concentration of noradrenaline in the tissue of the rat, we must get the same access whether we use the direct sequence (1) Seasonal, variations, concentration, noradrenaline, tissue, rat, or the reversed one, (2) Rat, tissue, noradrenaline, concentration, variations, seasonal. In mathematical terms, this clearly means that the underlying structure that makes this transformation possible must be a commutative group. But this is not always the case, and for some dihedral groups this structure is even forbidden. Another potential worry is that the PMEST scheme, which certainly has some connections with Indian thought, is far from being universally accepted (see De Grolier 1962) and has rarely been implemented in libraries, even in India.

So, in spite of all the improvements they have received over time, library classifications face many problems. In particular, they were strongly challenged in the 20th century by the proliferating growth of knowledge. First, the ceaseless flux of new documents forbids a rigid topology for classifications; the problem, then, is how to construct evolving structures. Second, the successive reorderings of knowledge (groupings and revisions, and not only ramifications) have called for powerful relational and automated documentary languages. Classifications nonetheless remain necessary, because documentary languages cannot do everything, so the problem is still open. With the great development of mathematics in the last century, this general problem, which is the great problem of order, has to be investigated by means of mathematical structures.

5. Order and Mathematical Models

The first attempts to study orders in mathematics developed at the end of the 19th century with Peano, Dedekind and Cantor (especially with his theory of ordinals, which are linearly ordered sets). They were continued by Peirce (1880) and Schröder (1890) in their work on the algebra of logic. Then, in the first part of the 20th century, came the notion of partial order, with an article by MacNeille (1937) and the famous work of G. Birkhoff (1967), who introduced the notion of lattice, later developed algebraically in the great book of Rasiowa and Sikorski (1970). During the same period, mathematical models of hierarchical classifications, investigated in the USA by Sokal and Sneath (1963, 1973) and, in England, by Jardine and Sibson (1971), were developed in France in the works of Barbut and Monjardet (1970), Lerman (1970, 1981), and Benzécri (1973). All these works presupposed the great advances of the last century in mathematical order theory, especially the papers of Birkhoff (1935), Dubreil-Jacotin (1939), Ore (1942, 1943), Krasner (1953-1954) and Riordan (1958). The Belgian logician Leo Apostel (1963) and the Polish mathematicians Luszczewska-Romahnowa and Batog (1965a, 1965b) also published important articles on the subject. The ever more widespread use of computers in the search for automatic classifications was, in those years, a further reason for researchers to take an interest in mathematical models.

As there are many forms of classification in the world of knowledge (we find them, as we have seen, in mathematics, the natural sciences, library and information science, and so forth), there are also many possible mathematical models for classifications. We begin with the study of extensional structures.

a. Extensional Structures

In order to clarify the situation, we start with the weakest of these structures and move toward stronger ones. Mathematics allows us to begin with very few axioms, which usually define weak general structures; afterwards, by adding new conditions, one obtains further properties and stronger models. In our case, the weakest structure is just a hypergraph H = (X, P) in the sense of Berge (1970), with X a set of vertices and P a set of nonempty subsets called edges (see Figure 3).


Figure 3: A Hypergraph

In this case, the set of edges P does not necessarily cover the set X, and some nodes (vertices of degree zero) may belong to no edge. Assume now the following conditions:

(C0)    X ∈ P,

(C1)    For all x ∈ X, {x} ∈ P,

Accordingly, we have a system of classes (in the sense of Brucker-Barthélemy 2007).

Now add the following new conditions, for all distinct Pi, Pj ∈ P:

(C2)      Pi ∩ Pj = Ø,

(C3)      ∪ Pi = X,

Then P is a partition of X and the Pi are the blocks of the partition P.

Let now P(X) be the set of partitions of a nonempty finite set X. We may define on P(X) a partial order relation ≤ (reflexive, antisymmetric and transitive), for instance the refinement order under which P ≤ Q when every block of P is included in a block of Q, such that (P(X), ≤) is a lattice in the sense of Birkhoff (1967), that is, a partial order in which every pair of elements has a least upper bound and a greatest lower bound. One can then prove that the chains of this lattice (the linearly ordered sequences of partitions) are equivalent to hierarchical classifications. So the set C(X) of all these chains is exactly the set of all hierarchical classifications on the set. This set C(X) has itself a mathematical structure: it is a semilattice for set intersection. This model allows us to get all the possible partitions of P(X) and all the possible chains of C(X) (see Figure 4).
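
The following Python sketch, with illustrative helper names, makes these definitions concrete: it checks conditions (C2) and (C3) for a family of blocks, and checks the refinement order under which a chain of partitions amounts to a hierarchical classification.

    # Minimal sketch: partitions of a finite set and the refinement order.

    def is_partition(X: set, blocks: list) -> bool:
        """Conditions (C2) and (C3): nonempty, pairwise disjoint blocks covering X."""
        union = set()
        for b in blocks:
            if not b or union & b:      # empty block, or overlap with earlier blocks
                return False
            union |= b
        return union == X

    def refines(P: list, Q: list) -> bool:
        """P <= Q in the refinement order: every block of P lies inside some block of Q."""
        return all(any(b <= c for c in Q) for b in P)

    X = {1, 2, 3, 4}
    P = [{1}, {2}, {3, 4}]
    Q = [{1, 2}, {3, 4}]
    print(is_partition(X, P), is_partition(X, Q))   # True True
    print(refines(P, Q))                            # True: P, Q form a two-step chain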


Figure 4: The lattice of partitions of a 4-element set.

A first problem is that such partitions are very numerous. For |X| = 9, for example, there are already 21,147 partitions. So, when we want to classify some domain of objects (plants, animals, books, and so forth), it is not easy to determine which classification is the best among, say, several thousand of them.

A second problem is that the world is not made of chains of partitions. If it were, of course, the game would be over: everything could be inserted in some hierarchical classification. But the real world has no reason to present itself as a hierarchical classification. In the real world, we generally have to deal with quite chaotic entities, complicated fuzzy classes and poorly structured objects, all of which form what we may call ‘rough data’. So when we want a clear order, we have to construct it by extracting it from the complicated data. For that, we have to compare objects and to know the degree to which they are similar, and to do so we need, of course, a notion of ‘similarity’. In order to make empirical classifications we must evaluate the similarities or dissimilarities between the elements to be classified. In the history of taxonomic science, Buffon (1749) and Adanson (1757) tried to understand the meaning of this evaluation in the following way. First, they claim, we have to measure the distance between the objects by means of some index, so that we can build classes. Afterwards, we have to measure the distance between the classes themselves, so that we can group some classes into classes of classes, and so replace the initial set of objects with an ordered set of classes smaller than it.
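
A rough Python sketch of this two-step procedure, under simplifying assumptions (classes are merged pairwise, and the distance between classes is taken as the distance between their closest members), is the following; the objects and distance values are invented for illustration.

    # Illustrative sketch: agglomerating objects into classes, then classes into
    # classes of classes, using single-linkage distances. Toy data only.

    def agglomerate(objects, dist):
        """Repeatedly merge the two closest classes; return the successive groupings."""
        classes = [frozenset([o]) for o in objects]
        history = [list(classes)]
        d = lambda A, B: min(dist[a][b] for a in A for b in B)   # class-to-class distance
        while len(classes) > 1:
            A, B = min(((A, B) for A in classes for B in classes if A != B),
                       key=lambda pair: d(*pair))
            classes = [c for c in classes if c not in (A, B)] + [A | B]
            history.append(list(classes))
        return history

    dist = {"cat": {"cat": 0, "dog": 2, "oak": 9},
            "dog": {"cat": 2, "dog": 0, "oak": 8},
            "oak": {"cat": 9, "dog": 8, "oak": 0}}
    for level in agglomerate(["cat", "dog", "oak"], dist):
        print(level)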

What the old taxonomists were doing on the basis of observation alone can now be carried out with the help of mathematics, using a modern notion of distance. Lerman (1970) and Benzécri (1973) showed that a hierarchical classification, that is, a chain of partitions, is nothing but a particular kind of distance or, more precisely, a particular kind of dissimilarity (Van Cutsem 1994). It is an ultrametric distance, that is, a distance d such that d(x, z) ≤ max(d(x, y), d(y, z)) for all x, y, z. Such a distance gives tree representations (Barthélemy and Guénoche 1988) and also has the special property of corresponding exactly with the chain, so that, when considering all the chains, the set of their corresponding distance matrices forms a semiring (R, +, ×) when we interpret the lattice operations min and max in an unusual but clever manner (+ for min, × for max) (Gondran 1976). Problems arise when the distance between the objects classified is not ultrametric. In such cases, we have to choose the closest ultrametric below the given distance, and thereby obtain the best hierarchical classification available, the one closest to the data. However, this kind of approach leads, in general, to relatively unstable classifications.
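
As a small illustration, the following Python sketch checks the ultrametric inequality just stated on a toy dissimilarity matrix; the values are invented.

    # Minimal check of the ultrametric (strong triangle) inequality on toy data.

    from itertools import permutations

    def is_ultrametric(d: dict) -> bool:
        """True if d(x, z) <= max(d(x, y), d(y, z)) for all triples of points."""
        return all(d[x][z] <= max(d[x][y], d[y][z])
                   for x, y, z in permutations(d, 3))

    d = {"a": {"a": 0, "b": 1, "c": 3},
         "b": {"a": 1, "b": 0, "c": 3},
         "c": {"a": 3, "b": 3, "c": 0}}
    print(is_ultrametric(d))   # True: {a, b} merge at level 1, all three at level 3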

Indeed, there are two kinds of instability for classifications. The first, intrinsic instability, is associated with the plurality of methods (distances, algorithms and so forth) that can be used to classify the objects. The second, extrinsic instability, is connected to the fact that our knowledge changes with time, so that the definitions of the objects (or of their attributes) are evolving.

An answer to the question of intrinsic instability is a theorem of Lerman (1970), which says that if the number of attributes (or properties) possessed by the objects of a set X is constant, the associated quasi-order given by any natural metric is the same. But this result has two limits. First, when the variance of the number of attributes is large, the stability is of course lost; second, the result does not carry over if we classify the attributes instead of the objects.

For extrinsic instability the answers are more difficult to find. We may appeal to the methods used in library decimal classifications (UDC, Dewey, and so forth), which make infinite ramified extensions possible; but these classifications, as we have seen, tend to assume that the higher levels are invariant, and they also have the disadvantage of being enumerative and of degenerating rapidly into simple lists. One may also use pseudo-complemented structures (Hilman 1965), which admit some kinds of waiting boxes (or compartments) for indexing things that are not yet classified, or structures whose transformations obey rules fixed in advance, such as Hopcroft's 3-2 trees (Aho, Hopcroft, Ullman 1983) or structures close to them (Larson and Walden 1979). In recent years, new models for making classifications have come from conceptual formal analysis (Barwise and Seligman 2003), from computer science, and from views using non-classical logics in the domain of formal ontologies (Smith 1997, 2003). In computer science, for example, the concept of Abstract Data Type (ADT), related to the concept of data abstraction important in object-oriented programming, may be viewed as a generalization of mathematical structures. An ADT is a mathematical model for data types, where a data type is defined by its behavior from the point of view of a user of the data. More formally, an ADT may be defined as a “class of objects whose logical behavior is defined by a set of values and a set of operations” (Dale-Walker 1996), which is strictly analogous to an algebraic structure in mathematics. So, if we are not satisfied with a rough classification such as the partition into collections, streams, iterators (which support loops accessing data items) and relational data structures (which capture relationships between data items), we must admit that ADTs can also be regarded as a generalization of a number of algebraic structures, such as lattices, groups, and rings (Lidi 2004). Hence, classifications of ADTs turn into classifications given by algebraic specifications of ADTs (Veglioni 1996). In this context, computer science adds nothing to mathematics, and the remaining problem is that a classification of mathematical structures using, for instance, category theory, as Pierce (1970) attempted, does not provide a sufficient answer, because a category may exist while its objects are not necessarily constructible (Parrochia-Neuville 2013).
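
As a small illustration of the quoted definition, here is a standard textbook stack, sketched in Python: a type given by its set of values and its operations, with the internal representation hidden. The example is generic and not drawn from the works cited above.

    # Minimal sketch of an Abstract Data Type: values + operations, representation hidden.

    from typing import Generic, List, TypeVar

    T = TypeVar("T")

    class Stack(Generic[T]):
        """Values: finite sequences of items. Operations: push, pop, is_empty."""

        def __init__(self) -> None:
            self._items: List[T] = []       # hidden internal representation

        def push(self, item: T) -> None:
            self._items.append(item)

        def pop(self) -> T:
            return self._items.pop()        # behaves so that pop after push(x) yields x

        def is_empty(self) -> bool:
            return not self._items

    s: Stack[int] = Stack()
    s.push(1); s.push(2)
    print(s.pop(), s.is_empty())            # 2 False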

So, none of the previous approaches is fully convincing as a solution to the basic problem, which always remains the same: we lack a general theory of classifications, which alone would be able to study and, in the best case, solve the main problems of classification.

b. A Glance at an Intensional Approach

Rather than making partitions by dividing a set of entities, so that the classes obtained in this way are extensional classes, as we saw in the previous section, we can instead proceed by associating a description with a set of entities. In this case, the classes are called intensional classes. Aristotle himself mixed the two points of view in his logic, but Leibniz was the first to propose a purely intensional interpretation of classes. For a long time that view remained a minority one and never won unanimous support among philosophers and logicians (as the numerous discussions between Aristotle and Plato, Ramus and Pascal, Jevons and Joseph demonstrate). However, the development of computer science brought this view back, since for declarative languages, and particularly object-oriented languages, purely extensional classes or sets are rather uncommon. In this approach, the intension can be given either a priori, for example by a human actor from his knowledge of the domain, or a posteriori, when it is deduced from the analysis of a set of objects. In object-oriented modeling and programming, classes are traditionally defined a priori, with their extension mostly derived at run time. This is usually done manually (the intension being represented by logical predicates or tags), but techniques for a posteriori class discovery and organization also exist. In the context of programming languages, these deal with local class hierarchy modification by adding interclasses, and use similarity-based clustering techniques or the Galois lattice approach (Wille 1996).

When there is an unrelated collection of sets, which is the case in artifact-based software classification, one issue is whether to compare and organize these sets simply by inclusion or to apply conceptual clustering techniques. However, most object-oriented languages are concerned with hierarchies, whose structure may be a tree, a lattice, or any partial order. The reason is that such structures reflect the variety of languages, some of them admitting multiple inheritance (C++, Eiffel), others only single inheritance (Smalltalk). Java has a special policy on this point: it admits two kinds of concepts, classes and interfaces, with single inheritance for classes and multiple inheritance for interfaces.
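
As a rough illustration (in Python, which, like C++ and Eiffel, permits multiple inheritance), a class may be placed below several parents, so that the resulting hierarchy is a partial order rather than a tree; the class names below are invented for the example.

    # Minimal sketch of multiple inheritance: one class below two parents.

    class Document:
        def describe(self) -> str:
            return "a document"

    class Printable:
        def print_out(self) -> str:
            return "printed"

    class Report(Document, Printable):      # two parents: the hierarchy is no longer a tree
        pass

    r = Report()
    print(r.describe(), r.print_out())              # a document printed
    print([c.__name__ for c in Report.__mro__])     # linearization of the hierarchy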

The viewpoint of Aristotle was the following: the division must be exhaustive, with parts mutually exclusive, and an indirect consequence of Aristotle's principles is that only the leaves of the hierarchy should have instances. Furthermore, the divisions must be based on a common concern, whose modern name is the ‘discriminator’ in the Unified Modeling Language (UML). But usual programming practices do not necessarily satisfy these principles. Multiple inheritance, for example, contradicts the assumption of mutually exclusive parts, and instances may in general be created directly from all (non-abstract) classes. Direct subclasses of a class can be derived according to different needs with different discriminators, but there is no guarantee that this approach leads to relevant classifications. Object-oriented approaches, which transgress Aristotelian principles, are almost always practical storage modes but do not satisfy the main requisites of good classifications.

There are, however, general principles that yield good classifications, and they can be described from the intensional perspective. First, following Apostel (1963), here are some basic definitions.

From an intensional viewpoint, a division (or partition) is a closed formula F containing some assertion of the type (P ⊃ (Q1 ∨ Q2 ∨ … ∨ Qn)). A classification is thus a sequence of implicative-disjunctive propositions taking the following form: everything which has the property P also has one of the n properties Q1 … Qn; everything which has the property Qr also has the property S; and so on (Apostel 1963, 188).
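
For instance, a two-level zoological division would consist, in this form, of the formulas Animal ⊃ (Vertebrate ∨ Invertebrate) and Vertebrate ⊃ (Mammal ∨ Bird ∨ Fish ∨ …), each understood as holding of every individual; the particular properties chosen here are only illustrative.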

A division is essential if the individuals having the property P, and only those individuals, may also have one of the properties Qi. We can thus see that there are degrees of essentiality, insofar as the number of individuals having the Q's without having P is greater or smaller. At every level, a classification may be probably or necessarily essential, exhaustive, or exclusive.

We call the intensional weight w(P) of a property P the set of disjunctions implied by this property (with necessity, factuality or probability). Properties defining classes at the same level may have extremely variable intensional weights. The basis of a division is the constant relation R, if any, between the properties of two different classes of this division.

A basis of division is (partially or totally) exhausted at some level insofar as, at this level, we no longer find true disjunctive propositions that are implied by the properties of this level and whose terms are connected by this very relation R.

A division is said to follow another immediately (or to be immediately subsequent) if, for all properties P of the first and all properties Q of the second that are disjunctively implied by the P's, there exists no sequence of properties R disjunctively implied by the P's and disjunctively implying the Q's.

The form of a property defining a class is the logical form of this property (conjunction of properties, disjunction of properties, negation of properties, single property).

For Apostel, an optimal classification should satisfy the following requisites:

  1. Every level needs a basis for division;
  2. No new basis for division shall be introduced before the previous one is exhausted;
  3. Every division is essential;
  4. Intensional weights of classes at a given level are comparable, and relations between intensional weights of subsequent division properties in the classification must be constant;
  5. Properties used to define classes are conjunctive ones, not negative ones;
  6. From the intensional viewpoint, divisions must be immediately subsequent.

In real domains, these requirements, or some of them, fail to hold. Levels are often extensionally equivalent, but intensionally the basis of division, the intensional weight, and so forth, may or may not change.

A natural classification is one in which the definition of the domain classified determines in one and the same way the choice of the criteria of classification. This means that the fundamental set may be divided in such a way that the division at the first level of the classification is an essential and subsequent one.

Intensional and extensional classifications are intimately related. Gathering entities into sets to produce extensional classes implies tagging these entities by their membership in these classes. But intensional classes, built according to these descriptions, have an extension which may differ from the initial extensional classes. So, in fact, the two perspectives are not totally isomorphic, and from Peirce (Hulswit 1997) to Quine (1969) and up to the present, the question of natural classes remains open and somewhat controversial.

6. The Idea of a General Theory of Classifications

The idea of a general theory of classifications is not new. Such a project was anticipated by Kant's logic at the end of the 18th century. It was followed by many attempts to classify the sciences at the beginning of the 19th century (Kedrov 1977), and was posed by Auguste Comte in his Cours de philosophie positive (Comte 1975) as a general theory based on the study of symmetries in nature, Comte being inspired by the mathematician Gaspard Monge and his classification of surfaces in geometry. However, in the work of Comte this remained wishful thinking. In the same way, the French naturalist Augustin-Pyramus de Candolle published in 1813 an Elementary Theory of Botany, a book in which he introduced the term ‘taxonomia’, used there for the first time (de Candolle 1813). De Candolle argued that botany had to abandon artificial methods for natural ones, in order to obtain a method independent of the nature of the objects. Unfortunately, nothing very concrete or precise followed his remarks. Moreover, these earlier projects were only concerned with finite classifications, particularly biological ones. A higher and more general view came to light around the 1960s with the Belgian logician Leo Apostel. Apostel (1963) wanted to write a concrete version of set theory and, in order to do so, needed axioms that would allow him to include in the theory only the classes actually existing in the world. Apostel was thus led to question some of the well-known axioms of Zermelo-Fraenkel set theory. He did not reject the whole ZF axiomatics, but he was suspicious of axioms like the pairing axiom, the axiom of separation and the power set axiom. He also left the axiom of infinity optional and had a rather negative opinion of the axiom of choice. This project was revived with the recent book of Parrochia-Neuville (2013).

The difficulty of solving the problem of the instability of classifications has motivated the search for clear composition laws defined on the set of classifications over a set and, if possible, for a true algebra of classifications; this is very difficult because such an algebra would have to be, in principle, commutative and non-associative. The search is all the more crucial since a theorem proved by Kleinberg (2002) shows that one cannot hope to find a clustering function that is at once scale-invariant, rich enough, and consistent. This result means that we cannot find stable empirical classifications by using traditional clustering methods.
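
Kleinberg's three conditions, stated roughly, concern a clustering function f that maps a dissimilarity d on a set X to a partition f(d) of X: scale-invariance (f(α·d) = f(d) for every α > 0), richness (every partition of X is f(d) for some d), and consistency (f(d) is unchanged when d is modified by shrinking distances within its clusters and stretching distances between them). The impossibility theorem says that no such f satisfies all three at once.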

In the past, some attempts have been made to formalize non-commutative parenthesized products: Comtet (1970) and, in the 1980s, Neuville used Łukasiewicz's Reverse Polish Notation (RPN), also called postfix notation, whose advantage is not only to make brackets or parentheses superfluous, but also to allow calculations on trees to be performed in the required order. But a general algebra of classifications on a set is not known, even if some new models (Loday's dendriform algebras, for example, which work very well for trees; see Dzhumadil'daev-Löfwall 2002) are good candidates. In any event, we are invited to look for it, for two reasons. First, the world is not completely chaotic and our knowledge evolves according to certain laws. Second, there exist quasi-invariant classifications in physics (the classification of elementary particles), chemistry (Mendeleyev's table of elements), and crystallography (the 230 crystallographic space groups), among others. Most of these good classifications are based on mathematical structures (Lie groups, discrete groups, and so forth). To address questions concerning classification theory, and to clarify its different domains, one may propose the following final view (see Figure 6):

  • When our mathematical tools apply only to sense data, we get phenomenal classifications (by clustering methods): these are generally quite unstable.
  • When our mathematical tools deal with crystallographic or quantum structures, we get what we may call, using a Kantian concept, noumenal classifications (for instance, by invariance of discrete groups or Lie groups). These are generally more stable.
  • When we search for a general theory of classifications (including infinite ones), we are in the domain of pure mathematics. In this field, ordering and articulating the infinite set of classifications amounts to constructing the continuum.


Figure 6: Metaclassification

This problem is far from being solved, because there are a lot of unstable theories (Shelah 1978, 1998). However, the recent work of Parrochia-Neuville (2013) assumes the conjecture that a metaclassification, that is, a classification of all mathematical schemes of classifications, does exist. The reason is that all these forms may be expressed as ellipsoids of an n-dimensional space (Jambu 1983), which must necessarily converge on a point, the index of the classification. If a real proof is found, this will give a theorem of the existence of such a structure, from which a number of important results could follow.

7. References and Further Readings

  • Adanson, M. 1757. Histoire naturelle du Sénégal. Paris: Claude-Jean-Baptiste Bauche.
  • Aho, A.V., Hopcroft, J.E., Ullman, J.D. 1983. Data Structures and Algorithms. Reading (Mass.): Addison-Wesley Publishing Company.
  • Agassiz, L. 1962. Essay on Classification (1857), reprint. Cambridge: Harvard University Press.
  • Apostel, L. 1963. Le problème formel des classifications empiriques. La Classification dans les Sciences. Gembloux: Duculot.
  • Aristotle, 1984. The Complete Works. Princeton: Princeton University Press.
  • Barbut M., Monjardet, B. 1970. Ordre et classifications, 2 vol. Paris: Hachette.
  • Barthélemy, J.-P., A. Guénoche. 1988. Les arbres et les représentations des proximités. Paris: Masson.
  • Barwise, J., Seligman, J. 2003. The logic of distributed systems. Cambridge: Cambridge University Press.
  • Béthery, A. 1982. Abrégé de la classification décimale de Dewey. Paris: Cercle de la librairie.
  • Bliss, H. E. 1929. The organization of knowledge and the system of the sciences. New York: H. Holt and Company.
  • Benzécri, J.-P., et alii. 1973. L’analyse des données, 1, La taxinomie, 2 Correspondances. Paris: Dunod.
  • Birkhoff, G. 1935. On the structure of abstract algebras. Proc. Camb. Philos. Soc. 31, 433-454.
  • Birkhoff, G. 1967. Lattice theory (1940), 3rd ed. Providence: A.M.S.
  • Brucker F., Barthélemy, J.-P. 2007. Eléments de Classification, aspects combinatoires et algorithmiques. Paris: Hermès-Lavoisier.
  • Buffon, G. L. Leclerc de, 1749. Histoire naturelle générale et particulière (vol. 1). Paris: Imprimerie royale.
  • Candolle (de), A. P. 1813. Théorie élémentaire de la Botanique ou exposition des principes de la classification naturelle et de l’art d’écrire et d’étudier les végétaux, first edition. Paris: Deterville.
  • Comte, A. 1975. Philosophie Première, Cours de Philosophie Positive (1830), Leçons 1-45. Paris: Hermann.
  • Comtet, L. 1970. Analyse combinatoire. Paris: P.U.F..
  • Dagognet, F. 2002. Tableaux et Langages de la Chimie (1967). Seyssel: Champ Vallon.
  • Dagognet, F. 1970. Le Catalogue de la Vie. Paris: P.U.F..
  • Dagognet, F. 1984. Le Nombre et le lieu. Paris: Vrin.
  • Dagognet, F. 1990. Corps réfléchis. Paris: Odile Jacob.
  • Dahlberg, I., 1976. Classification theory, yesterday and today. International Classification 3 n°2, pp. 85-90.
  • Dale, N., Walker, H. M. 1996. Abstract Data Types: Specifications, Implementations, and Applications. Lexington, Massachusetts: D.C. Heath and Company.
  • Darwin, C.R., 1964. On the Origin of Species (1859), reprint. Cambridge: Harvard University Press.
  • De Grolier, E. 1962. Etude sur les catégories générales applicables aux classifications documentaires, Unesco.
  • Dobrowolski, Z. 1964. Etude sur la construction des systèmes de classification. Paris, Gauthier-Villars.
  • Dubreil, P., Jacotin, M.-L. 1939. Théorie algébrique des relations d’équivalence. J. Math. 18, pp. 63-95.
  • Dzhumadil’daev, A., Löfwall, C. 2002. Trees, free right-symmetric algebras, free Novikov algebras and identities. Homology, Homotopy and Applications, vol. 4(2), pp. 165-190.
  • Foucault, M. 1967. Les Mots et les Choses. Paris: Gallimard.
  • Gondran, M. 1976. La structure algébrique des classifications hiérarchiques. Annales de l’Insee, pp. 22-23.
  • Granger, G.-G. 1980. Pensée formelle et Science de l’Homme (1967). Paris: Aubier-Montaigne.
  • Hilman, D.J. 1965. Mathematical classification techniques for non-static document collections, with particular reference to the problem of relevance. Classification Research, Elsinore Conference Proceedings, Munksgaard, Copenhagen, pp. 177-209.
  • Huchard, M., Godin, R., Napoli, A. 2003. Objects and Classification. ECOOP 2000 Workshop Reader, J. Malenfant, S. Moisan, A. Moreira (Eds), LNCS 1964. Berlin-Heidelberg-New York: Springer-Verlag, pp. 123-137.
  • Hulswit, M. 1997. Peirce’s Teleological Approach to Natural Classes. Transactions of the Charles S. Peirce Society, pp. 722-772.
  • Jambu, M. 1983. Classification automatique pour l’analyse des données, 2 vol.. Paris: Dunod.
  • Jardine N., Sibson, R. 1971. Numerical Taxonomy. New York: Wiley.
  • Joly, R. 1956. Le thème philosophique des genres de vie dans l’Antiquité grecque. Bruxelles: Mémoires de l’Académie royale de Belgique, classe des Lettres et des Sciences mor. et pol., tome Ll, fasc. 3.
  • Kant, E. 1988. Logic. New York: Dover Publications.
  • Kedrov, B. 1977. La Classification des Sciences (vol. 2). Moscou: Editions du Progrès.
  • Kleinberg, J. 2002. An impossibility theorem for Clustering. Advances in Neural Information Processing Systems (NIPS), 15, pp. 463-470.
  • Krasner, M. 1953-1954. Espaces ultramétriques et ultramatroïdes. Paris: Séminaire, Faculté des Sciences de Paris.
  • Larson, J.A., Walden, W.E. 1979. Comparing insertion schemes used to update 3-2 trees. Information Systems, vol. 4, pp. 127-136.
  • Lerman, I.C. 1970. Les bases de la classification automatique. Paris: Gauthier-Villars.
  • Lerman, I.C. 1981. Classification et analyse ordinale des données. Paris: Dunod.
  • Lidi R., 2004. Abstract Algebra. Berlin-Heidelberg-New York: Springer-Verlag.
  • Luszczewska-Romahnowa S., Batog T. 1965a. A generalized classification theory I. Stud. Log., tom XVI, pp. 53-70.
  • Luszczewska-Romahnowa S., Batog T. 1965b. A generalized classification theory II. Stud. Log., tom XVII, pp. 7-30.
  • MacNeille, H.M. 1937. Partially ordered sets. Transactions Amer. Math. Soc., vol. 42, pp. 416-460.
  • Ore O. 1942. Theory of equivalence relations. Duke Math. J. 9, pp. 573-627.
  • Ore O. 1943. Some studies on closure relations. Duke Math. J. 10, pp. 761-785.
  • Parrochia, D., Neuville, P. 2013. Towards a general theory of classifications. Basel: Birkhäuser.
  • Peirce C. S. 1880. On the Algebra of Logic. American Journal of Mathematics 3, pp. 15-57.
  • Pierce, R.S. 1970. Classification problems. Mathematical System theory, vol. 4, n°1, March, pp. 65-80.
  • Plato, 1997. The Complete Works. Cambridge: Hackett Publishing Company.
  • Porphyry, 2014. On Aristotle’s Categories. London, New York: Bloomsbury Publishing Plc.
  • Quine, W.V.O. 1969. Ontological Relativity and Other Essays. New York: Columbia University Press.
  • Ranganathan, S. R. 1933. Colon Classification. Madras: Madras Library Association.
  • Ranganathan, S. R. 2006. Prolegomena to Library Classification (1937), Reprint. New Delhi: Ess Pub..
  • Rasiowa H., Sikorski, R. 1970. The Mathematics of Metamathematics. Cracovia: Drukarnia Uniwersytetu Jagiellonskiego.
  • Riordan, J. 1958. Introduction to combinatorial analysis. New York: Wiley.
  • Roux, M. 1985. Algorithmes de classification. Paris: Masson.
  • Shelah, S. 1988. Classification Theory (1978). Amsterdam: North Holland.
  • Schröder, E. 1890. Vier Kombinatorische Probleme. Z. Math. Phys. 15, pp. 361-376.
  • Smith, B. 1997. Boundaries: An Essay in Mereotopology. L. Hahn (ed.), The Philosophy of Roderick Chisholm. La Salle, Open Court: Library of Living Philosophers, pp. 534-561.
  • Smith, B. 2003. Groups, sets and wholes. Rivista di estetica, NS (P. Bozzi Festschrift), 24-3, 1209-130.
  • Sokal R. R., Sneath, P.H. 1963. Principles of numerical taxonomy. San Francisco: W. H. Freeman.
  • Sokal, R. R., and Sneath, P. H. 1973. Numerical Taxonomy, the principles and practice of numerical classifications. San Francisco: W. H. Freeman.
  • Van Cutsem B. (ed.) 1994. Classification and dissimilarity analysis. New York-Berlin-Heidelberg: Springer Verlag.
  • Veglioni, S. 1996. Classifications in Algebraic specifications of Abstract Data Types. CiteSeerX
  • Winsor, M. P. 2009. Taxonomy was the foundation of Darwin’s evolution. Taxon 58, 1, pp. 43-49.
  • Wille, R. 1996. Restructuring lattice theory: an approach based on hierarchy of concepts. Rival, I (ed.) Ordered Sets. Boston: Reidel, pp. 445-470.
  • Woodward, H. 1903. Memorial to Henry Alleyne Nicholson. M.D., D.Sc., F.R.S. Geological Magazine, 10, pp. 451-452.

 

Author Information

Daniel Parrochia
Email: daniel.parrochia@wanadoo.fr
Université Jean Moulin – Lyon III
France

The Aim of Belief

It is often said that belief has an aim. This aim has been traditionally identified with truth and, since the late 1990s, with knowledge. With this claim, philosophers designate a feature of belief according to which believing a proposition carries with it some sort of commitment or teleological directedness toward the truth (or knowledge) of that proposition. This feature is taken to be constitutive of belief (that is, it is part of what a belief is that it is an attitude having this aim) and individuative of that type of mental state (that is, it is sufficient for distinguishing beliefs from other types of mental attitude like desire and imagining). Philosophers appeal to belief’s aim mainly for explanatory purposes: the aim is supposed to explain a number of other features of belief, such as the impossibility of believing at will, the infelicity of asserting Moorean sentences (for example, “I believe that it is raining, but it is not raining”), and the normative force of evidential considerations in the processes of belief-formation and revision.

Though many tend to agree on the above aspects of the aim, there are major disagreements over two further issues: (1) how to interpret the claim that belief has an aim, and (2) what this aim is. With respect to (1), the claim has received very different interpretations. Some have interpreted it literally, taking the aim as an intentional purpose of believers or a functional goal of beliefs; others have interpreted it metaphorically, as some kind of commitment or norm governing beliefs and their regulation (formation, maintenance, and revision); still others deny that beliefs aim at truth in a substantive sense and endorse minimalist accounts of belief’s truth-directedness. With respect to (2), there is an ongoing debate on whether the aim of belief is truth, knowledge, or some other condition such as epistemic justification.

Table of Contents

  1. The Truth-Directedness of Belief
    1. The Aim as Constitutive and Individuative of Belief
    2. Differences between the Aim and Other Properties of Belief
    3. The Explanatory Role of the Aim
  2. Interpretations of the Aim
    1. Teleological Interpretations
    2. Normative Interpretations
    3. Minimalist Interpretations
  3. What Does Belief Aim At?
  4. Relevance of the Topic
  5. References and Further Reading

1. The Truth-Directedness of Belief

The claim that “belief aims at truth” was coined by Bernard Williams (1973) to designate a set of properties of beliefs, namely (1) that truth and falsehood are dimensions of assessment of beliefs as opposed to other psychological states and dispositions; (2) that to believe that p is to believe that p is true; and (3) that to say “I believe that p” carries, in general, a claim that p is true; that is, it is a qualified way of asserting that p is true (Williams, 1973, p. 137).

Since Williams, many have taken up the claim that belief aims at truth. However, with such an expression, these philosophers do not refer to a set of properties as Williams did, but to a unique feature of belief (sometimes also called truth-directedness). This feature (that is, aiming at truth) is supposed to capture the specific relation of belief with truth. This relation seems to be peculiar to belief, and to play an important role in the characterization of this type of attitude. No other attitude seems to entertain such a special relation with truth. Like belief, the content of attitudes like (propositional) desires, imaginings, and mere thoughts can be true or false. But differently from these attitudes, beliefs are considered defective if their content is false, or correct if it is true: if I imagine that snow is black, there is nothing defective in my imagination; but if I believe that snow is black, there is something wrong with my belief. Also, we can arbitrarily decide to form or revise attitudes like imagining and assuming regardless of whether we take their contents to be true or false, but this seems not to be possible for beliefs. In short, these attitudes are not sensitive to truth-regarding considerations in the way beliefs are (in both normative and descriptive ways). The relation of belief with truth also differs from that of factive attitudes like knowledge and regret. Differently from beliefs, these attitudes imply the truth of their content. If I know that it is raining now in Paris, then it is true that it is raining now in Paris. But if I believe that, the content of my belief may be false. The relation of belief to truth is thus neither as weak as that of other attitudes like imagining, nor as strong as that of knowledge. This is why it is often conceived as an aim or a commitment toward the truth (or knowledge) of the believed proposition: beliefs may fail to be true (to achieve that aim), but they cannot fail to aim at truth.

That granted, a further question is how to interpret the claim that beliefs aim at truth. Philosophers conceive of truth-directedness in very different ways: as an intentional aim of the believer to accept a proposition if and only if it is true; as a function regulating our cognitive processes; as a norm requiring one to believe a proposition only if true; as a value attached to believing truly. In this section I remain neutral on the specific interpretations of the aim, postponing a discussion of these interpretations to §2. The objective of the present section is to introduce some properties commonly attributed to truth-directedness, independent of its specific interpretation. For ease of exposition, it will also be assumed that truth is the aim of belief until §3, where alternative candidates are considered.

Section 1.a introduces two properties commonly attributed to truth-directedness: (1) that it is a constitutive or essential feature of belief, and (2) that it is individuative of belief with respect to other mental attitudes. Section 1.b considers the differences between truth-directedness and other truth-related properties of belief such as the direction of fit and the value of having true beliefs. The truth-aim is usually attributed to belief in order to explain a number of characteristics of this attitude concerning its relation with truth. Section 1.c lists the main features that truth-directedness is supposed to explain.

a. The Aim as Constitutive and Individuative of Belief 

When philosophers attribute an aim to belief, they conceive of this property as constitutive of this type of attitude. This means, roughly, that it is part of what a belief is (that is, part of the essence or the concept of belief) that it is a mental attitude directed at the truth. Let us label this the constitutivity thesis. Depending on how we conceive truth-directedness, there will be different ways of working up to this thesis. If, for example, we interpret truth-directedness as a goal of the agent (compare §2.a), we can conceive of beliefs as analogous to acts like concealing (Steglich-Petersen, 2006, p. 512). Part of what it is to conceal an object X is that it is a type of act involving the goal that someone will not find X. It is in virtue of this goal that an action counts as an instance of concealing. Similarly, a way of stating the constitutivity thesis for belief is that it is part of what S’s believing that p is that S has an aim or goal (or that it is a function of S’s cognitive system) to retain that attitude only if it is true. It is in virtue of this aim of the agent who believes (or this function of her cognitive system) that that attitude counts as belief.

Alternatively, if one interprets truth-directedness as a norm to believe only the truth (compare §2.b), the constitutivity thesis amounts to understanding this norm by analogy to rules constitutive of practices like games (Wedgwood, 2002, p. 268). A practice is constituted by a set of rules if and only if it is part of what that practice is that this set of rules is in force for agents engaged in that practice (Glüer & Pagin, 1998). Consider a specific example: chess is a game constituted by a set of rules stating which moves are legal or permissible in the game. If one plays chess, one is thereby committed by the rules of the game to perform only legal moves. The performance of a particular act does not count as a chess-move if it cannot be assessed (justified, criticized…) according to the constitutive rules of the game. Similarly, if it is part of what a belief is that it is an attitude governed by a norm to believe only the truth, a mental attitude does not count as a belief if it cannot be assessed (criticized, justified…) on the basis of this norm, as right or correct if true and wrong or incorrect if false. One can also conceive of the constitutivity thesis by analogy to other types of entity essentially constituted by norms or values. For example, it is constitutive of what it is to be a citizen to be subject to certain rights and commitments, and it is constitutive of murder to be an act of killing in a wicked, inhumane, or barbarous way (for the latter example, see Dretske, 2000, pp. 243-245).

The claim that truth-directedness is constitutive of belief can be conceived of in at least two ways, as relative to the concept of belief or to its nature. According to the conceptual interpretation, it is a condition of understanding the concept of belief that we conceive of beliefs as mental attitudes directed toward truth (Boghossian, 2003; Engel, 2004; Shah, 2003). A proper understanding of the concept of bachelor implies conceiving of a bachelor as an unmarried man. Analogously, if one has a correct grasp of the concept of belief and conceives of a mental attitude as a belief, she understands it as one that, in some sense to be specified, is directed toward truth.

Other philosophers consider truth-directedness as constitutive of the nature or essence of belief (Brandom, 2001; Railton, 1994; Velleman, 2000a; Wedgwood, 2002, 2007). The relation between belief and truth-directedness is here conceived of as one of metaphysical dependence of the former on the latter: as it is essential to water that it has a certain chemical composition (H2O), it is essential to belief that it is an attitude involving a commitment to or an aim at truth. A mental attitude counts as a belief at least partially in virtue of aiming at the truth. It is simply impossible for an attitude to be a belief if it lacks this property.

It is usually held that the essentialist interpretation of the thesis does not entail the conceptual one (for example, Wedgwood, 2007, ch. 6). It is part of the essence of water, but not of its concept, that water is H2O—we can understand the concept of water without conceiving water as having that specific chemical composition. Similarly the truth-aim may be constitutive of the essence of belief but not of its concept (see Zangwill, 2005 for a similar view). Also, some philosophers have argued that the conceptual interpretation does not entail the essentialist one (Papineau, 2013; Shah, 2003, fn. 41; Shah & Velleman, 2005, fn. 43; Wedgwood, 2007).

The second property commonly attributed to truth-directedness is the individuativity of belief: the aim is the feature that individuates belief as that type of mental state and distinguishes beliefs from other mental attitudes (Engel, 2004; Lynch, 2009; Railton 1994; Velleman, 2000a; Wedgwood, 2002). Though many other attitudes entertain relations with truth (compare §1.b), it is claimed that belief is the only attitude aiming at truth. The truth-aim plays a fundamental role in sorting out beliefs from other mental attitudes, being the distinctive feature of beliefs with respect to other types of attitude like thoughts, suppositions, desires, and imagining.

Philosophers usually appeal to the individuativity of truth-directedness for belief for two main reasons: (1) singling out the aim as a peculiarly distinctive property of belief helps to achieve a better grasp of what truth-directedness is and to distinguish this property from other properties of belief (a philosopher who assumes individuativity in order to define truth-directedness is Velleman, 2000a, pp. 247-252); and (2) individuativity provides an argument to the best explanation for the claim that belief aims at truth: as the argument goes, without assuming that belief’s truth-directedness has this peculiar individuative role, one cannot account for the difference between beliefs and other attitudes (Engel, 2004; Railton, 1994).

It has also been suggested that if truth-directedness is the distinctive feature of belief with respect to other mental attitudes, this would provide an argument for the claim that this property is also constitutive of belief (Lynch, 2009b, 81; McHugh & Whiting, 2014; Velleman, 2000a; Wedgwood, 2002). Here is a way in which this argument may proceed: if the truth-aim were not a necessary and constitutive feature of belief, it would be possible for a belief not to aim at truth. But then, assuming that the aim is the only feature distinguishing beliefs from other mental attitudes, it would be impossible to classify that attitude as a belief rather than as a different type of attitude. Thus, the truth-aim must be a feature that beliefs possess necessarily and essentially. The argument from individuativity is not the only one supporting the constitutivity of truth-directedness for belief. Since other arguments partially depend on normativist interpretations of the aim, they will be considered in 2.b.

A number of critics have pointed out that it is possible to distinguish beliefs from other types of attitude without stipulating that it involves a constitutive aim at truth. These philosophers identify the attitude of believing a proposition with that of merely holding it true or accepting it (Glüer & Wikforss, 2013; Vahid, 2009), or they take other dispositional or motivational properties of belief as distinctive of this type of attitude. For a discussion of some of these views see §2.c.

b. Differences between the Aim and Other Properties of Belief

According to many philosophers engaged in the present debate (in particular those endorsing teleological and normative interpretations), truth-directedness is supposed to characterize and distinguish belief from other types of mental attitude. This property is conceived of as unique to belief, not possessed by any other attitude. These philosophers are careful to distinguish it from other properties relating belief to truth that other attitudes also possess. In this subsection I will introduce some of these properties and explain in which respects they are supposed to differ from the aim of belief. Mentioning these other properties will provide a rough idea of what truth-directedness is not. However, before considering these properties, it is worth mentioning that some philosophers endorsing minimalist conceptions of truth-directedness tend to identify the aim with some of these properties; these alternative interpretations of the aim will be briefly mentioned in this subsection and considered in more detail in §2.c.

An obvious truth-related feature of belief is the fact that believing something is believing it to be true (Velleman, 2000a). In other words, beliefs have propositions as content, and propositions can be true or false. This property is obviously not individuative of belief, and thus cannot be identified with truth-directedness. All propositional attitudes share it with beliefs. For instance, believing that p is believing that p is true, hoping that p is hoping that p is true, imagining that p is imagining that p is true, and so on (Engel, 2004; Velleman, 2000a).

It is also commonly held that beliefs involve specific causal, functional, and dispositional-motivational roles with respect to action and behavior. Some of these roles determine another aspect under which beliefs are related to truth. Using Ramsey’s (1931) metaphor, beliefs are like maps by which we steer in the world and upon which we are disposed to act. Belief is an attitude involving dispositions to act and behave as if its content were true and to use it as premise in reasoning (Armstrong, 1973; Stalnaker, 1984). Some have argued that belief’s aim at truth can be identified with the possession of similar dispositional and functional properties. In response to this challenge, it has been argued that these properties are not sufficient to set belief apart from other mental attitudes, and thus to capture the distinctive relationship between belief and truth (Engel, 2004; Velleman, 2000a). Other types of attitude seem to possess these very same properties. For instance, attitudes like acceptance and pretense all seem to dispose the subject to act as if their content were true and have the same motivational role.

Another property commonly attributed to belief, and concerning the way it is related to truth, is its mind-to-world direction of fit. On the one hand, some attitudes, like desires, have a world-to-mind direction of fit: if what is desired is not the case, the world should be changed in order to fit what is desired, and not vice versa. On the other hand, other attitudes, like beliefs, have a mind-to-world direction of fit: if what is believed is not the case (that is, it does not fit what it is supposed to represent), the belief’s contents should be revised to fit the world, and not vice versa. This is only one way of fleshing out the distinction (see Frost, 2014 and Humberstone, 1992 for overviews of the distinction). Another popular way is to distinguish between cognitive and conative states, where cognitive states are such that the proposition in their content is regarded as something that is true, while conative states are such that they involve regarding the proposition in the content as something to be made true (Velleman, 2000a). It is difficult to evaluate the relation of the truth-directedness of belief with direction of fit, since this depends on which account of direction of fit one accepts, and there is no unique and undisputed account. Some philosophers seem to identify belief’s direction of fit with its aim at truth (Humberstone, 1992; Platts, 1979). Others (Engel, 2004; Shah & Velleman, 2005; Velleman, 2000a) distinguish the two features, arguing that other mental attitudes such as suppositions, assumptions, and imagining possess the same direction of fit as beliefs, and thus this property cannot be identified with truth-directedness, which is distinctive of beliefs. Notice that the persuasiveness of this argument depends on whether one endorses an account of direction of fit according to which other attitudes would have the same direction of fit as belief.

It is also important to distinguish the truth-directedness of belief from the value of possessing true beliefs. It has been argued that having true beliefs is something valuable (David, 2005; Horwich, 2006; Kvanvig, 2003; Lynch, 2004). We naturally prefer to have true rather than false beliefs, and tend to attribute some sort of value to true beliefs and disvalue to false ones. It seems to be a platitude that true beliefs are at least extrinsically and instrumentally valuable. For example, we might prefer true beliefs to false ones because the former are more conducive to the satisfaction of one’s desires and the avoidance of dangers. Some philosophers have argued that true beliefs have also epistemic value. For example, it has been argued that believing the truth is an intrinsically valuable cognitive success. Though one might expect there to be important connections between the two topics, the issue of whether true beliefs are valuable must be distinguished from the further issue of whether truth is the aim of belief. While the former is a matter of aims, goals, and evaluations extrinsic to the notion of belief (for example, the goal of believing truths and not believing falsehoods), the latter is a property intrinsic and constitutive of such a mental state (Vahid, 2006, 2009, p. 19). Another respect in which the two features must be distinguished is that the value of true beliefs is hardly individuative of beliefs: other types of mental state such as guesses, hypotheses, and conjectures are evaluable according to their being true or false. In spite of these important differences, some philosophers have suggested that the value of true beliefs can be at least in part related to and explained by the constitutive aim of belief, even if not identified with it (Engel, 2004; Lynch, 2004 Railton, 1994; Williams, 2002).

c. The Explanatory Role of the Aim

The hypothesis that beliefs involve an aim at truth has been used to explain a number of features specific to this mental attitude. Before considering such features, it is important to stress that not everyone who endorses some version of this hypothesis thinks that it can explain all of these features. The main features supposed to be explained by truth-directedness are the following:

  • The difficulty or impossibility of believing at will,
  • The infelicity of asserting Moorean sentences and the absurdity of having Moorean beliefs,
  • The normativity of mental content,
  • The motivational force of evidential considerations in deliberative contexts,
  • The nature of epistemic normativity and the norms governing belief and theoretical reasoning, and
  • The correctness standard of belief.

(1) As famously argued by Williams (1973), belief’s truth-aim would enable one to explain the difficulty of believing at will (see also Velleman, 2000a). Believing a proposition p at will would entail believing it without regard to whether p is true. However, if beliefs constitutively involve aiming at truth, the only considerations relevant to forming and maintaining a belief would be those in conformity to its constitutive aim; that is, truth-relevant considerations. Believing at will would thus be either impossible or very difficult. This line of argument has been widely discussed in the literature. For critical discussions see, for example, Frankish (2007); Hieronymi (2006); Setiya (2008); and Yamada (2012).

(2) Belief’s truth-directedness could also explain the infelicity of asserting Moorean sentences and the absurdity of thinking Moorean thoughts—sentences and thoughts having the form “I believe that p, but not p” (for example, Baldwin, 2007; Littlejohn, 2010; Millar, 2009; Moran, 1997; Railton, 1994). Though these sentences are not self-contradictory, if asserted, they sound odd and infelicitous. As Moore (1942, p. 543) observes, this feature of belief-ascription seems to show that self-ascribing a belief in the first person carries with it an implied claim to the truth of the believed proposition. Similar ascriptions relative to many other mental states involve either no infelicity (there is no paradox in asserting “I assume that p but it is false that p”) or a contradiction (it is contradictory to assert “I know that p but it is false that p”). The infelicity of asserting Moorean sentences can be explained as follows: on the one hand, an assertion is an act by which the speaker commits herself to the truth of what she says; on the other hand, a belief is a mental state involving an aim at the truth of the believed proposition. We can also think of this aim as a sort of commitment (Baldwin, 2007; Millar, 2009; see §2.a for normative interpretations of the aim). The infelicity would thus be due to a conflict between the respective constitutive commitments or aims of assertion and belief. By asserting a Moorean sentence like “p and I do not believe that p,” a speaker would both endorse a commitment to the truth of p and deny such a commitment at the same time. This explanation can be easily extended to an explanation of the unreasonableness of Moorean thoughts and judgments, since a judgment, like an assertion, can be considered an act involving a commitment to the truth of what is adjudged.

(3) Many philosophers argue that mental content is normative (for an overview and references, see Glüer & Wikforss, 2010). This thesis is often interpreted as the claim that there are norms governing the correct use of concepts in the content of propositional mental attitudes. An example of such a norm is that the concept white is correctly applied to an object x if and only if x is white. Some have suggested that the aim of belief can provide an explanation of the normativity of mental content. In particular, Velleman (2000a) has suggested that the normativity of content can be entirely reduced to the truth-directedness of belief: if there is a norm governing mental content, this norm applies only to the contents of attitudes that aim at truth; that is, to beliefs. Boghossian (2003) has provided an argument according to which the normativity of mental content would derive from that of belief. First, he argues that the truth-directedness of belief has to be conceived as a norm constitutive of the concept of belief. Second, he argues that there is a constitutive connection between the notions of content and belief: our grasp of the concept of content depends on the grasp of the concept of belief. The normativity of content would thus be inherited from the normativity of belief. This argument has been the target of several criticisms; see, in particular, Glüer & Wikforss (2009) and Miller (2008).

(4) Belief’s truth-directedness has also been invoked to explain certain aspects of doxastic deliberation (namely deliberation concerning what to believe). One such aspect is the motivational force of evidential considerations in deliberative contexts. In particular, Shah (2003) and Shah and Velleman (2005) have argued that truth-directedness can explain doxastic transparency, the phenomenon according to which, in the context of doxastic deliberation, the question whether to believe that p is invariably settled by the answer to the further question whether p is true. Roughly, the idea is that when an agent engages in deliberation whether to believe a given proposition, only evidential (truth-regarding) considerations can be treated as reasons for believing. Other types of considerations (for example, practical) have no motivational force in the deliberation. This can be explained by the hypothesis that the concept of belief is constitutively governed by a norm to believe p only if p is true, and that in doxastic deliberation, the agent deploying that concept in the question whether to believe that p is motivated by the truth-norm to form a belief only if it is true. This in turn explains why only truth-relevant considerations matter in answering the question. Other philosophers have provided similar explanations of doxastic transparency—and more generally of the central role of evidence in deliberative belief-formation processes—compatible with non-normative interpretations of the truth-aim (for example, Steglich-Petersen, 2006, §5). It is worth noting here that similar explanations of the impossibility of believing in response to non-evidential considerations can also be used to explain the impossibility of believing at will (see (1) above).

(5) Belief’s aim has also been invoked to explain the various norms governing belief and theoretical reasoning, and to shed light on the nature of epistemic normativity in general. For example, according to Velleman, belief’s truth-directedness accounts for the justificatory force of theoretical reasoning. Theoretical reasoning justifies a belief by adducing considerations that indicate it to be true (2000a, p. 246). This is the case because being true is what satisfies the aim of belief. Other philosophers have argued that belief’s aim helps to explain norms of rationality and justification governing beliefs, and, more generally, the nature of epistemic normativity (Boghossian, 2003; Millar, 2004, 2009; Shah & Velleman, 2005; Sosa, 2007; Wedgwood, 2002, 2013). A common explanation takes these norms as instrumentally conducive to the satisfaction of the constitutive truth-aim of belief. This approach to epistemic normativity is not new in the literature. Many philosophers of the past have argued that epistemic standards of justification and rationality would be derivable from the fundamental goal of believing truly and avoiding falsehoods (for an overview see Alston, 2005, chs. 1 and 2). Criticisms of this type of approach to epistemic normativity typically mirror arguments against similar approaches in the practical domain. See, for example, Berker (2013); Firth (1981); Kelly (2003); and Maitzen (1995).

The various attempts to reduce or explain epistemic normativity in terms of a fundamental aim or norm of truth governing belief are considered by some philosophers as part of a wider project directed at providing analogous accounts for other normative domains. In particular, some have argued that practical normativity can be traced back to constitutive norms of action and agency, which in turn would determine derivative norms of practical rationality and justification (Korsgaard, 1996; Shah, 2008; Velleman, 2000a; Wedgwood, 2007).

(6) According to some philosophers, the aim at truth would also explain why a belief is correct if and only if it is true, that is, the so-called correctness standard of belief (Steglich-Petersen, 2006, 2009; Velleman, 2000a). Philosophers endorsing teleological interpretations of the aim hold that the standard would be an instrumental assessment indicating the measure of success that a belief must attain in order to achieve its constitutive aim. However, this thesis is the subject of major disagreements. Philosophers giving normative interpretations of truth-directedness either identify the correctness standard with the constitutive aim of belief (Engel, 2007; Wedgwood, 2002), or argue for the independence of the two (Shah & Velleman, 2005).

2. Interpretations of the Aim

In the contemporary debate, there is wide disagreement about how to interpret the claim that belief aims at truth. There are two main interpretations of the aim: teleological and normativist. According to teleological accounts, the aim of belief is an intentional purpose of subjects holding beliefs, or a functional goal of cognitive systems regulating the formation, maintenance, and revision of beliefs. Normativist accounts hold that the claim that beliefs have an aim must be interpreted metaphorically. According to normativists, truth-directedness is better understood as a commitment, a norm governing the regulation of beliefs (their formation, maintenance, and revision). Other philosophers have endorsed minimalist accounts of truth-directedness, denying that beliefs aim at truth in a substantive sense.

a. Teleological Interpretations

Teleological (also called “teleologist”) interpretations hold that beliefs are literally directed at truth as an aim, an end, or a goal (telos in Greek). This aim would be realized in truth-conducive processes and practices of belief-regulation, whose role is the formation, maintenance, and revision of beliefs. An attitude would count as a belief only if it is formed and regulated by these processes and practices. An advantage commonly attributed to teleological interpretations is that these interpretations seem more compatible with a naturalistic account of belief than rival interpretations (in particular, normativist ones). The thought is that intentions, goals, or functions can be accounted for in naturalistic terms. Furthermore, this interpretation would naturally fit with broadly instrumentalist, naturalistically unproblematic conceptions of epistemic normativity and epistemic rationality (note however that these conceptions have been the target of many criticisms; for example, Berker, 2013; Kelly, 2003).

Teleological interpretations differ with respect to how they conceive the aim at truth. Some teleologists interpret the aim of belief as an intentional goal of the subject, like an interest in accepting a proposition only if it is true. For example, according to Steglich-Petersen (2006), believing is accepting a proposition with the purpose of getting its truth-value right. According to such an interpretation, the aim is realized through deliberative practices like judgments, in which an agent accepts a proposition only if she has evidence in support of its truth, and maintains that acceptance in the absence of contrary evidence. Steglich-Petersen recognizes that many of our beliefs are regulated in entirely sub-intentional ways. However, he argues that only beliefs considered at an intentional level are connected to a literal aim:

cognitive states and processes that are not connected with any literal aim or intention of a believer can nevertheless count as ‘beliefs’ in virtue of […] being to some degree conducive to the hypothetical aim of someone intending to form a belief in the primary strong sense. (2006, p. 515)

Other philosophers have advanced sub-intentional interpretations of the aim, conceiving it as a functional goal of the attitude or the psychological system to form true beliefs and revise false ones. This function would be regulated at a sub-personal, often unconscious level. A similar approach has been defended by Bird (2007) and McHugh (2012b). Some authors also interpret certain functionalist accounts such as those of Burge (2003), Millikan (1984), and Plantinga (1993) as teleological in this sense (see, for example, McHugh, 2012a, fn. 6, 2012b, fn. 49).

The most popular interpretation of the aim is a “mixed” one, according to which truth-directedness would be constituted by both intentional and sub-intentional processes. In particular, Velleman (2000a) maintains that there is a broad spectrum of ways in which the aim can be regulated. While sometimes it is realized in the intentional aim of a subject in an act of judgment about a certain matter, at other times there are cognitive systems in charge of the regulation of belief designed to ensure the truth of such mental states. Such systems would carry out this function more or less automatically, not relying on the subject’s intentions. Other philosophers who distinguish between intentional and sub-personal levels of regulation of the aim include Millar (2004, ch. 2) and Sosa (2007, 2009).

A well-known objection to teleological accounts, provided by Owens (2003), is specifically directed at intentional and mixed interpretations of the aim (for similar objections see Kelly, 2003). Owens observes that if beliefs aim at truth as argued by teleologists, believing would be similar to guessing. Guesses are mental acts aiming at truth, in the sense that when one guesses, one strives to give the true answer to a question. As Owens writes,

a guesser intends to guess truly. The aim of a guess is to get it right: a successful guess is a true guess and a false guess is a failure as a guess. Someone who does not intend to guess truly is not really guessing. (2003, p. 290)

According to the teleological perspective, similar considerations apply to belief, which is a mental state held with the purpose of holding it only if it is true. But there are at least two important disanalogies between the intentional aim involved in guessing and the aim of belief.

First, the aim of belief does not interact with other aims of the subject the way the truth-aim of guesses does. The aim of guessing (as well as that of other goal-directed activities) can interact with other goals and objectives of the subject; it can conflict with these other goals, and it can be weighed against them. In particular, when we guess, we integrate the truth-aim constitutive of guessing with other purposes, such as the practical relevance of guessing, and we consider guessing that p reasonable when aiming at the truth by means of a guess that p would maximize expected utility (Owens, 2003, p. 292). If beliefs, like guesses, constitutively involve an aim at truth, then we should expect that, on at least some occasions, we would weigh the aim of belief against other aims. For example, when engaged in deliberation about whether to believe a given proposition, our pursuit of the truth-goal may be constrained by other goals and purposes of the subject in the usual way. But belief’s aim does not work like this. A large reward for believing that it is not sunny today gives me a reason to try to believe it, but, in deliberation about what to believe, such considerations do not interact with and cannot be weighed against the truth-aim of belief in the way they do with other aims and purposes of the subject. In this respect, belief appears to be “insulated” from all but one aim, in a way that aim-directed behaviors in general are not (McHugh, 2012a, p. 430).

The second disanalogy suggested by Owens is that, in guessing, we can exercise a kind of voluntary control that is not possible in the case of belief. The guesser can compare different considerations and then decide whether to terminate her inquiry and guess. Nothing similar happens in deliberation about whether to believe a given proposition, where one cannot decide when to conclude one’s inquiry and start believing a proposition. The deliberation is concluded more or less automatically and cannot be controlled by reflection on how best to achieve the aim. Given these disanalogies, Owens concludes that while a guess is an attitude regulated by an intentional aim at truth, belief is not.

Teleologists have provided some replies to Owens’ argument. In particular, it has been argued that the aim of belief does in fact interact with, and can be weighed against, other aims (Steglich-Petersen, 2009); it has been denied that evidential considerations play the exclusive role in belief-formation that Owens’ argument suggests (McHugh, 2012a; for a similar point, though not directly related to Owens’ argument, see Frankish, 2007); and it has been argued that the direct form of control we have over the formation of guesses, but not of beliefs, can be explained by the fact that belief is a mental state, while guessing is a mental act (Shah & Velleman, 2005).

A related problem for a teleological interpretation of the aim is that sometimes we are completely indifferent to certain matters, and sometimes we even prefer (have the goal or aim) not to have any belief on certain matters. Nevertheless, evidence bearing on these matters still constitutes a reason for us to form beliefs about them, and if presented with such evidence in normal circumstances, we cannot refrain from forming beliefs on these matters. This seems to show that truth-directedness, and more generally epistemic rationality, cannot be reduced to aim-directed activities in the common sense of the term (Kelly, 2003).

Another very popular argument against teleological accounts of truth-directedness is Shah’s (2003) “teleologist’s dilemma”. The dilemma relies on the following observation: on the one hand, in practices of doxastic deliberation—deliberation directed at forming a belief about a certain matter—considerations concerning the evidence in support of the truth of a given proposition are the only ones that are relevant in order to answer the question whether to believe that proposition (this is what Shah calls the phenomenon of doxastic transparency; compare §1.c). On the other hand, some belief-formation processes can be influenced by non-evidential factors (for example, cases of wishful thinking). In an attempt to explain these two types of belief formation, the teleologist is pushed in two incompatible directions: she can consider the truth-aim as a disposition so weak as to allow cases in which beliefs are caused by non-evidential processes, in which case she cannot account for the exclusive influence of evidential considerations in deliberative contexts of belief-formation; alternatively, in order to account for the exclusive role of evidence in doxastic deliberation, she can strengthen the disposition that constitutes aiming at truth so that it excludes the influence of non-truth-regarding considerations from such kinds of reasoning—but then she cannot accommodate non-deliberative cases in which non-evidential factors influence belief-formation. In either case, the teleologist cannot explain the truth-regulation of belief in both deliberative and non-deliberative contexts. Therefore, a teleologist interpretation of the aim is not sufficient alone to provide an explanation for the truth-directedness of beliefs in all processes of belief formation.

In order to address this problem, Shah & Velleman (2005) argue that belief is regulated by two levels of truth-directedness: a sub-intentional teleological mechanism responsible for weak regulation in non-deliberative contexts, and one conceived in normative terms, able to explain the strong truth-regulation in deliberative contexts (see §2.b). For accounts of the dilemma compatible with a teleologist perspective, see, for example, Steglich-Petersen (2006) and Toribio (2013). For other objections to the teleological account, see Engel (2004) and Zalabardo (2010).

b. Normative Interpretations

Another way of interpreting belief’s truth-directedness has been in normative terms. According to normativist accounts of the aim of belief, the claim that “belief aims at truth” is just a metaphorical way of expressing the thought that beliefs are constitutively governed by a norm prescribing (or permitting) one to believe the truth (or only the truth). For example, if Mary forms the belief that it is now 12 a.m., she does what the norm requires (that is, she possesses a correct belief) if that proposition is true, and she violates the norm if that proposition is false. Many normativists identify the norm of belief with a standard of correctness:

(C) a belief is correct if and only if the believed proposition is true

These philosophers take this standard to be constitutive of the essence or the concept of belief: belief would be a mental state characterized by the fact of being correct if and only if it is true (see §1.a for more details on normativist interpretations of the constitutivity thesis). This interpretation of the aim is probably the most popular in the early 21st century. It has been defended by, among others, Boghossian (2003); Engel (2004, 2013); Gibbard (2005); Millar (2004); Shah (2003); Wedgwood (2002, 2007, 2013).

Let us here clarify a common confusion about the claim that belief is constitutively governed by a norm: that a truth-norm constitutively governs belief does not mean that all beliefs necessarily satisfy that norm. What is constitutive of belief is not the satisfaction of the norm (as a matter of fact, many beliefs happen to be false, and thus incorrect), but that the norm is in force and believers and their beliefs can be assessed and criticized according to it—as correct if the belief is true, and incorrect if it is false.

One of the best known arguments for a normative interpretation of truth-directedness, suggested by Shah (2003), is the argument to the best explanation of doxastic transparency. As mentioned in §2.a, transparency is the (alleged) phenomenon according to which the deliberative question whether to believe a given proposition p is invariably settled by answering the further question whether it is true that p. The two questions are answerable to the same set of considerations; that is, considerations concerning the evidence for or against the truth of p. This phenomenon is specific to deliberative contexts in which an agent explicitly considers whether to believe a given proposition. In such contexts, only evidential (truth-relevant) considerations can influence belief-formation. In contexts in which a subject forms a belief without passing through a deliberative process, on the contrary, non-evidential considerations could influence the belief-formation.

According to Shah, only a normative interpretation of the aim of belief can explain these facts—doxastic transparency, why this phenomenon is specific to doxastic deliberation, and the exclusive role of evidential considerations in deliberative contexts. The explanation is the following: let us assume that it is constitutive of the concept of belief that a belief is correct if and only if it is true. This is interpreted as the claim that someone believing a proposition p is under a normative commitment to believe p only if it is true. When a subject engages in doxastic deliberation and asks herself whether to believe a given proposition, she deploys the concept of belief. Assuming she understands this concept and is aware of its application conditions, she interprets this question as whether to form a mental attitude that she should have only if the proposition is true. This in turn determines a disposition to be moved only by considerations relevant to the truth of p. This explains transparency and the exclusive role of evidential considerations in deliberative contexts. In contrast, in non-deliberative contexts where belief-formation works at a sub-intentional level, the subject does not explicitly consider the question whether to believe p, does not deploy the concept of belief, and is not thereby motivated by the norm to regard only truth-relevant considerations as relevant in the process of belief-formation. For this reason, non-evidential factors can influence belief-formation in these contexts. In sum, Shah’s normativist account allows him to explain both the strong role of truth in the belief regulation in deliberative contexts, and its weak role in non-deliberative ones.

An objection to Shah’s argument is that it assumes an implausibly strong form of motivational internalism according to which the norm of belief necessarily and immediately motivates the agent when she recognizes and accepts it. This contrasts with the ways in which, in general, norms tend to motivate agents (McHugh, 2013; Steglich-Petersen, 2006).

Another argument for a normativist account of truth-directedness, suggested by Wedgwood (2002), is composed of two steps. First, it is argued that the correctness standard of belief expresses a relation of strong supervenience (correctness of a belief strongly supervenes on the truth of that belief’s content). The standard thereby articulates a necessary feature of belief: necessarily, all true beliefs are correct and all false beliefs are incorrect. Second, since the standard articulates a necessary feature of belief, it is an essential feature of beliefs. Both steps of the argument have been criticized (for example, Steglich-Petersen, 2008, pp. 277-278). Against the second step, one cannot infer from the fact that a thing necessarily possesses a certain property that the property is essential to it—to use a well-known example from Fine (1994, pp. 4-5), one cannot infer from the necessary claim that Socrates is the only member of the singleton having as its only member Socrates to the claim that it is essential to Socrates that he is the only member of that singleton. Against the first step, it has been argued that it relies on contentious assumptions about normative supervenience: it is an error to deduce from the supervenience of a normative property N over a non-normative property G the necessity of the claim that every object having property G also has property N. The most one can conclude is that, necessarily, if some object has property N in virtue of having property G, then anything with property G also has property N (where the necessity operator takes wide scope over the conditional). For similar considerations on normative relations of supervenience, see Blackburn (1993, p. 132) and Steglich-Petersen (2008).
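To make the scope point easier to track, the contrast can be rendered schematically as follows; the notation (N for the normative property, G for the non-normative property, □ for necessity) is only an illustrative reconstruction of the point made in the paragraph above, not a formalization drawn from Wedgwood or Steglich-Petersen.

```latex
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
% Schematic reconstruction (assumed notation, not from the cited texts):
% N = a normative property, G = a non-normative property, \Box = necessity.

Not licensed by the supervenience claim alone:
\[ \Box\,\forall x\,(Gx \rightarrow Nx) \]

Licensed, with the necessity operator taking wide scope over the whole conditional:
\[ \Box\,\bigl[\exists x\,(Nx \text{ in virtue of } Gx) \rightarrow \forall y\,(Gy \rightarrow Ny)\bigr] \]
\end{document}
```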

Other arguments often used in support of the normativist interpretation do not clearly favor this interpretation over alternative substantive conceptions of truth-directedness, such as teleological ones. For example, it has been argued that unless one assumes that belief is constitutively governed by a truth-norm, one is not in a position to distinguish beliefs from other cognitive propositional attitudes, such as assuming, thinking, or imagining. The assumption that belief is constitutively governed by a truth-norm has also been used in arguments to the best explanation of a number of features of belief such as (1) the infelicity of asserting Moorean sentences; (2) the disposition to rely on a believed proposition as a reason for action and a premise in practical reasoning (Baldwin, 2007, p. 83); and (3) the relation between belief, assertion, evidence, and action (Griffiths, 1962). See §1.c for discussion of some of these arguments. These various arguments have received formulations both in normativist and teleological terms (for teleological formulations, simply replace occurrences of “truth-norm” with “truth-aim”); for this reason, they do not favor either interpretation. It is also worth mentioning that the claim that belief is constitutively normative has received indirect support from views that, for independent reasons, hold that intentional attitudes in general are constitutively normative (Brandom, 1994; Millar, 2004; Wedgwood, 2007).

Though the normativist interpretation has been the most popular in the last two decades, it has also been the target of several criticisms. According to the No Guidance Argument, a truth-norm is incapable of guiding an agent in the formation and revision of her beliefs. One can conform one’s beliefs to a norm requiring one to believe only true propositions only by first forming beliefs about whether these propositions are true. The only way to follow this norm will thus be continuing to believe what one already believes. Such a norm would not provide any guidance as to what a subject should do in order to comply with it. More precisely, this norm would have no guiding role in processes of belief regulation (formation, maintenance, and revision). Versions of this argument have been given by Glüer & Wikforss (2009, pp. 44-45); Horwich (2006, p. 354); and Mayo (1963, p. 141). A reply to this argument consists in arguing that even if the truth-norm does not provide any direct guidance, it can guide belief regulation indirectly, via some other derived principle like norms of evidence and rationality (Boghossian, 2003; Wedgwood, 2002); or it could guide in specific contexts, such as in doxastic deliberation where an agent explicitly considers her evidence for a given proposition p with the aim of making up her mind about whether p (Shah & Velleman, 2005). For a further defense of the argument see Glüer & Wikforss (2010b, 2013).

Another criticism of the normative interpretation is that, in general, an agent subject to a norm should have some form of intentional control over the actions necessary to satisfy it and be free to choose whether to conform to the norm or not. These conditions on control and freedom to comply seem to be constraints on norms in general. However, belief formation is (at least often) an involuntary process and is realized at an automatic, non-inferential level. It is thus unclear how a truth-norm governing belief can satisfy the above constraints. For versions of this objection, see Glüer & Wikforss (2009); and Steglich-Petersen (2006).

Another problem for normative interpretations of truth-directedness concerns the formulation of the alleged norm of belief. If beliefs are constitutively governed by a truth-norm, it should be possible to state this norm in terms of some duty, prescription, or permission. However, all the suggested formulations seem to be affected by some problem. Bykvist and Hattiangadi (2007), in particular, consider several possible formulations of the norm and conclude that none of them is free from problems. The best known formulations are the following:

(1) For any S, p: S ought to (believe that p) iff p

(2) For any S, p: if S ought to (believe that p), then p

(3) For any S, p: S ought to (believe that p iff p)

All these formulations are flawed in some way. (1) implies that one ought to believe every true proposition, including trivial and uninteresting ones (see also Sosa, 2003, for a similar point). Furthermore, provided there are true propositions that it is impossible to believe (for example, “it is raining and nobody believes that it is raining”), (1) violates the commonly accepted rule according to which “ought” implies “can.” (2) seems not to be normatively interesting because it is unable to place any requirement on believers—if p is true, nothing follows from it about what S ought to believe; and if p is false, it only follows that it is not the case that S ought to believe that p; it does not follow that S ought not to believe that p. (3) is problematic because it does not allow one to derive claims about what one ought or ought not to believe. For example, from (3) and the falsity of p, one cannot derive that one ought not to believe p. Furthermore, (3) seems to be subject to the same objections raised against (1).
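To illustrate why (2) places no requirement on believers, its contraposed form can be written out schematically; the deontic shorthand below (O for “ought,” B_S p for “S believes that p”) is merely an illustrative sketch of the point made above, not a formalization drawn from Bykvist and Hattiangadi.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Assumed shorthand (ours, not Bykvist and Hattiangadi's):
% O = "it ought to be that", B_S p = "S believes that p".

Formulation (2):
\[ O(B_S\,p) \rightarrow p \]

Its contraposition:
\[ \neg p \rightarrow \neg O(B_S\,p) \]

So from the falsity of $p$ one may conclude only that it is \emph{not} the case
that $S$ ought to believe $p$; the stronger claim $O(\neg B_S\,p)$, that $S$
ought not to believe $p$, does not follow.
\end{document}
```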

Bykvist and Hattiangadi raise similar objections to other formulations of the truth-norm. From this, they conclude that this general failure could be considered a clue that belief is not at all a normative concept, at least not in the way suggested by normativists. Many have considered this conclusion too hasty. First, even if all the available formulations of the norm are wrong, this does not mean that it is impossible to formulate the norm of belief in “ought” terms; it could simply mean that the right formulation is yet to come. Second, some have suggested alternative formulations that seem to avoid the above problems. For example, Whiting (2010) has suggested that interpreting the truth-norm as a norm of permissibility could avoid most of the problems. Other formulations avoiding these problems have been suggested by Littlejohn (2010), Fassio (2011), and Raleigh (2013). For a discussion, see Bykvist and Hattiangadi (2013). A third way of avoiding these objections is to deny that the norm of belief is a truth-norm (see §3).

One reply to the objections just considered consists in abandoning a deontic conception of the truth-norm, according to which the norm would be a prescription, directive, or permission. Adopting alternative, non-deontic interpretations of the norm would allow one to avoid these objections. Some have suggested interpreting the normativity of belief in evaluative terms; that is, in terms of what it is good (in a certain sense of “good” to be specified) to believe (Fassio, 2011; McHugh, 2012b). Others have interpreted the norm of belief as involving a type of normativity sui generis (McHugh, 2014; Rosen, 2001, p. 621), as an ideal of reason (Engel, 2013), or as an “ought to be” in the Sellarsian sense, not requiring addressees of the norm to be capable of voluntarily following it (Chrisman, 2008).

For other criticisms of normativist interpretations of truth-directedness, see also Davidson (2001); Dretske (2000); Horwich (2006, 2013); and Papineau (1999, 2013). The common factor in these criticisms is the defense of the thesis that if there are norms governing beliefs, these are practical, contingent, and not constitutive of belief. Replies to some of the above objections are in Engel (2007, 2013); Shah & Velleman (2005); and Wedgwood (2013).

c. Minimalist Interpretations

The label “minimalist interpretations” is used here for a range of different views. The common factor of these views is that they deny that there is such a property as a truth-aim of belief, at least if one identifies it with some feature different from those considered in §1.b. Minimalists hold that the features supposedly explained by the aim of belief (see §1.c) can be explained by other properties commonly ascribed to these mental states, such as their causal, dispositional, functional, or motivational roles, or their direction of fit (Davidson, 2001; Dretske, 2000; Papineau, 1999).

Given the present characterization, one may wonder whether there is a clear-cut dividing line between teleological and minimalist views; in particular, between sub-intentional teleological views, identifying the aim at truth with functional mechanisms of the cognitive system, and dispositionalist and functionalist minimalist accounts. A way of distinguishing these two approaches is by looking at the dispositional or functional role distinctive of belief (compare McHugh, 2012a, fn. 6): while functionalist accounts congenial to teleological approaches to the truth-aim focus on the input side of belief’s functional role, and exclusively identify this role with a truth-directed goal (for example, forming true beliefs), minimalists think that the role that individuates belief is at least partially on the output side, and they are mainly concerned with practical roles of belief, such as satisfying the subject’s desires or providing reasons for action.

Some philosophers have endorsed accounts of belief according to which causal, dispositional, and/or functional roles of beliefs with respect to action and behavior would be sufficient to characterize and individuate this type of mental attitude. Some have argued that the specific relation between belief and truth can be fully explained by the functional role of beliefs of providing reasons for action. Others have argued that all that is necessary for an attitude to qualify as a belief is that it dispose the subject to behave in ways that would promote the satisfaction of his desires if its content were true. For similar views see, for example, Stalnaker (1984). Armstrong (1973) argues that an essential function of beliefs is moving a subject to action given the presence of suitable dominant desires and purposes, and locates in this causal role the peculiar difference between belief and other mental attitudes such as mere thoughts. Still others have identified the aim at truth of belief with its direction of fit (Humberstone, 1992; Platts, 1979).

A “deflationary” interpretation of truth-directedness has been defended by Vahid (2006, 2009). Vahid first considers the feature of accepting-as-true, introduced by Velleman (2000a), as common to all cognitive states (beliefs, assumptions, conjectures, imaginations…). He suggests that to capture the truth-directedness of belief, one should not add any further (teleological or normative) property to the fact that belief is an attitude of regarding-as-true. Rather, what is distinctive of belief according to Vahid is the specific way in which one regards-as-true a given proposition. While other attitudes involve regarding a proposition as true for the sake of something else, in order to reach certain specific goals (for example, assuming is regarding-as-true for the sake of argument, imagining involves regarding a proposition as true for motivational purposes), believing is regarding a proposition as true for its own sake, as an end in itself.

The main criticism directed at minimalist interpretations is that other mental states such as suppositions and assumptions possess these same properties (same causal, functional, dispositional, and motivational roles; same direction of fit) and, thus, that these properties are not sufficient alone to individuate the peculiar truth-directedness of belief, to explain the special features of belief listed in §1.c, and to distinguish beliefs from other types of mental attitude (Engel, 2004; Velleman, 2000a). For a reply, see, for instance, Zalabardo (2010, §10), who challenges the claim that a purely motivational conception of belief would not be sufficient to distinguish beliefs from other mental attitudes. See also Glüer & Wikforss (2009, p. 42).

3. What Does Belief Aim At?

There is debate concerning whether the aim of belief is truth, as has been traditionally argued, or some other property. Since the late 1990s, an increasing number of philosophers have defended the claim that knowledge is the fundamental aim or norm of belief. Upholders of this view are, among others, Adler (2002); Bird (2007); Huemer (2007); Littlejohn (2013); Peacocke (1999); Sutton (2007); and Williamson (2000). The best known defender of the thesis that belief aims at knowledge is Williamson (2000). Williamson’s main motivations for holding this thesis derive from his view about the nature of knowledge and its relation to belief. Williamson criticizes the idea that it is possible to provide an analysis of knowledge in terms of other more fundamental notions. Rather, other epistemic notions such as belief and justification should be understood as derivative from the more fundamental notion of knowledge—this is the so-called Knowledge First approach in epistemology. In particular, Williamson suggests that belief be considered roughly as the attitude of treating a proposition as if one knew it. Knowledge would thus fix the standard of appropriateness or the success condition for a belief, and merely believing p without knowing it would be a sort of “botched knowing” (2000, p. 47). In this sense, belief would not aim merely at truth but at knowledge.

A well-known argument for the knowledge aim is based on a parallel between assertion and belief. On the one hand, many have argued that assertion is constitutively governed by a knowledge norm (Adler, 2002; Bird, 2007; Sutton, 2007; Williamson, 2000):

(KNA) one should assert p only if one knows p.

On the other hand, some philosophers have suggested that occurrent belief is the inner analogue of assertion (for example, Williamson, 2000, pp. 255-256). More precisely, the idea is that (flat-out) assertion is the verbal counterpart of a judgment, and a judgment is a form of occurrent (outright) belief. If so, it is plausible that assertion and belief are governed by the same norm, and knowledge would be the norm of belief too:

(KNB) one should believe p only if one knows p.

Similar arguments have been suggested by Adler (2002); Bird (2007); McHugh (2011); Sosa (2010, p. 48); Sutton (2007); and Williamson (2000). To this line of argument, it may be objected that knowledge is not the norm of assertion. Some philosophers have suggested counterexamples to this thesis (Brown 2008; Lackey 2007). Others have argued that assertion is governed by other norms such as truth or justification (Douven, 2006; Weiner, 2005). Another way to criticize this argument consists in challenging the similarity between belief and assertion, arguing in particular that they would not be governed by the same norms. Whiting (2013, pp. 187-188) provides some reasons why one should expect standards of belief and assertion to diverge: since assertion is an “external” act, involving a social dimension, in evaluating an assertion one might have to take into account the expectations and needs of interlocutors and the role of speech acts in the unfolding conversation. Furthermore, assertion is a potential source of testimony. In asserting, one takes on responsibility for others’ beliefs. All these considerations are extraneous to belief, which is a “private” state of mind. It would thus not be surprising if assertions were governed by more demanding epistemic standards than belief due to their social character and their communicative role. Brown (2012, pp. 137-144) provides another argument against the claim that assertion and belief share the same epistemic standard: she first argues that whether an assertion or belief is epistemically appropriate partially depends on its consequences (for example, the epistemic propriety of asserting varies with the stakes), and second, that the consequences of asserting p may differ from those of believing p. It follows that there can be cases in which it is epistemically appropriate for a subject to believe that p, but not to assert that p, and vice versa.

Considerations about versions of Moore’s paradox with “know” in place of “believe” provide another argument for the claim that knowledge is the aim or norm of belief (Adler, 2002; Gibbons, 2013; Huemer, 2007; Sutton, 2007). As it sounds absurd or infelicitous to assert sentences like “it is raining but I do not know it is raining,” it seems incoherent to believe that it is raining and at the same time that one does not know that it is raining. An explanation of this fact could be that knowledge is the aim or norm of belief. A subject believing that it is raining but that she does not know it would violate (KNB). This type of argument has been the target of two objections: first, some have argued that a weaker standard, like a truth-norm, would be sufficient to explain the absurdity of this sort of Moorean belief (see, for example, the explanation considered in §1.c). Second, it has been argued that while asserting Moorean propositions of the form “it is raining but I do not know it is raining” sounds absurd, there is no such absurdity in believing these same propositions. It seems both reasonable and appropriate to believe something even while believing that one does not know it (McGlynn, 2013; Whiting, 2013, pp. 188-189). To use an example from McGlynn (2013, p. 387), there is nothing unreasonable or incongruous in Jane’s believing that her ticket will lose, that this belief is justified, and that it nonetheless fails to count as knowledge. A reply to the latter criticism consists in distinguishing between outright (or full) belief and partial belief. Only outright belief would be subject to a knowledge norm. For replies to this move, see McGlynn (2013, §3) and Whiting (2013, p. 189).

Another argument for the knowledge aim/norm of belief is provided by the way in which we tend to assess (justify and criticize) our beliefs. Williamson (2005, p. 109) provides the following case: John is at the zoo and sees what appears to him to be a zebra in a cage. The animal in the cage is really a zebra. However, unbeknownst to John, to save money, most of the other animals in the zoo have been replaced by cleverly disguised farm animals. In this scenario, John’s belief is true and fully reasonable (after all, he has no reason to believe that the animal in the cage could not be a zebra). Still, John does not know it is a zebra. Intuitively, John needs an excuse for believing that the animal is a zebra. He can excuse himself by pointing out that he is in no position to distinguish his state from one of knowing. The need for an excuse indicates that it is wrong for John to believe that the animal is a zebra. Despite the fact that John’s belief is both reasonable and true, it is somewhat defective. This contrasts with knowing that it is a zebra, which provides a full justification for believing it, not a mere excuse. A reply to this argument is that John is not wrong in believing that the animal is a zebra (and thus does not need an excuse for this). Rather, if John stands in need of correction, this is due to the false background beliefs he has—for example, the belief that animals in this area are all zebras (Littlejohn, 2010; Whiting, 2013).

Some philosophers have argued that knowledge is the aim or norm of belief on the ground that knowledge has more value than mere true belief (Bird, 2007). However, on the one hand, there is no general agreement on whether knowledge is more valuable than true belief. On the other hand, from the fact that knowledge is more valuable than true belief, it does not follow that belief is governed by a norm or involves a constitutive aim at knowledge.

For other arguments in support of the claim that the aim or norm of belief is knowledge, see Bird (2007); Engel (2004); McHugh (2011); and Sutton (2007). For other criticisms of the claim, see Littlejohn (2010); McGlynn (2013); and Whiting (2013).

Though truth and knowledge are widely identified as the main candidates for being the aim or norm of belief, some philosophers have suggested other properties. Another available option is that the aim of belief is reasonability or justification. For a defense of the claim that non-factive justification is the condition of epistemic success for belief, see, for example, Feldman (2002, pp. 378-379). It should be noted that this view has not found many proponents in the 21st century literature, at least if we exclude philosophers for whom justified belief is factive or requires knowledge (for example, Gibbons, 2013; Littlejohn, 2012; Sutton, 2007). An original approach is that of Smithies (2012), who argues that the fundamental norm of belief is that one be in a position to know what one believes, where “[o]ne is in a position to know a proposition just in case one satisfies all the epistemic, as opposed to psychological, conditions for knowledge, such as having ungettiered justification to believe a true proposition.” (2012, p. 4)

Some philosophers have identified the aim of belief with some specific kind of epistemic virtue one could manifest in the possession of belief (Zagzebski, 2004) or with understanding (Kvanvig, 2003). According to other philosophers, the fundamental aim of belief consists in the satisfaction of practical goals such as survival, utility, or the satisfaction of desires and wants. Similar views are particularly popular among philosophers favoring naturalistic approaches in the philosophy of mind. For example, Millikan (1984) argues that beliefs are integrated in a naturally selected cognitive system having the function of tracking features of the world in order to help in the satisfaction of biological needs such as survival and reproduction. A similar view has been more or less explicitly endorsed by, for example, Horwich (2006); Kornblith (1993); and Papineau (1999, 2013).

Another option is that belief possesses multiple aims or norms equally fundamental and irreducible to each other. For example, Weiner (2014) endorses pluralism about epistemic norms, arguing that there are many different epistemic norms, each valid from a different standpoint, and that no one of these standpoints need be better than another. In a virtue-theoretic framework, Wright (2014) argues that there are two fundamental epistemic aims: believing in accordance with the intellectual virtues (such as intellectual courage, carefulness, and open-mindedness), and believing the truth and avoiding falsehoods.

4. Relevance of the Topic

The topic discussed in the present article has relevance for several more general philosophical debates. It is clearly relevant in those areas of philosophy directly concerned with the notion of belief, such as the Philosophy of Mind, and in particular, the ontology of mental attitudes. One of the issues that traditionally have most concerned philosophers is that of individuating the distinctive feature of belief with respect to other mental attitudes such as trusting, mere thinking, imagining, guessing, and so on. As David Hume notes (1739, book I, part III, §7), the distinction between belief and mere thought was the first philosophical problem that the Scottish philosopher set himself, and also one of the hardest he found to solve (for a discussion, see Armstrong, 1973, part I, §5). Whereas the difference between belief and other types of mental attitude seems to reside in the specific relationship that belief bears to truth (or knowledge), it has been extremely difficult to grasp the peculiar nature of such a relationship. The 21st century debate on the aim of belief promises to provide an answer to such a problem. In answering questions about the nature of the aim (see §2), it also promises to shed some light on the issue of whether belief is a normative attitude or whether it can be characterized by a fully naturalistic account.

Progress in the debate about the aim/norm of belief also contributes substantially to the study of the norms and aims of other attitudes such as desires, emotions, and intentions. The aim of such studies is to provide a unified and coherent representation of the various norms and aims of mental attitudes and of their reciprocal relations (for example, Millar, 2009; Railton, 1997; Shah, 2008; Velleman, 2000b; Wedgwood, 2007). For instance, Wedgwood (2007) interprets the aim of attitudes as norms of correctness and argues that similar norms are constitutive and individuative of all intentional attitudes. Similarly, Shah (2008) applies an argument analogous to the transparency argument for belief (considered in §2.b) to other attitudes. In particular, he argues that the hypothesis that the concept of intention is governed by a constitutive norm would best explain the presumed fact that in order to conclude deliberation on whether to intend to A one must answer the question whether to A.

The debate on the aim of belief also has particular relevance for certain views in the philosophy of normativity. According to a prominent view in meta-ethics (so-called Constitutivism), normative facts can be grounded in facts about the constitution of action or agency. According to this view, agency is constitutively governed by practical norms (for example, Korsgaard, 1996; Velleman, 2000b). Some philosophers have tried to extend the view to other normative domains such as epistemology and aesthetics. The epistemological analogue of Ethical Constitutivism holds that epistemic normativity can be grounded in the constitutive aim or norm of belief. For a criticism of constitutivist approaches, see Enoch (2006).

The view that epistemic normativity is grounded in a fundamental truth-aim of belief has deep roots in the 20th century history of epistemology. Many philosophers in the past have argued that there is a strict dependence relation between the fundamental aim or norm of belief (sometimes presented as a conjunction of two values or goals of believing truly and avoiding falsehoods) and other derivable normative epistemic standards such as justification and rationality. Versions of this view have been defended by many well-known epistemologists, including Chisholm, Goldman, Lehrer, Plantinga, Alston, Foley, and Sosa (see Alston, 2005, ch. 1, for an overview). Accounts of this relation differ depending on the notions of justification and rationality adopted by philosophers, and on how philosophers conceive the relation between the truth-aim and other derived normative properties (for example, consequentialist, deontological, or virtue-based).

Another approach in epistemology concerned with the topic of the aim or norm of belief is the so-called Knowledge First approach introduced in §3. According to this view, knowledge has a prominent role among epistemic notions and constitutes the fundamental epistemic standard of assertion, belief, action, practical reasoning, and disagreement (compare “Knowledge Norms”). This approach has generated a comparative study of these standards (for example, Smithies, 2012). An example is the reformulation of various arguments for the knowledge norm of assertion in order to defend other knowledge norms, such as those of belief and action (compare §3). In this perspective, the debate on the aim of belief can help in understanding important aspects of epistemic norms of assertion, action, practical reasoning, and disagreement, and in turn can receive important contributions from advances in the debates about norms governing these other practices.

5. References and Further Reading

  • Adler, J. (2002). Belief’s own ethics. Cambridge, MA: MIT Press.
  • Alston, W. (2005). Beyond “justification”: Dimensions of epistemic evaluation. Ithaca: Cornell University Press.
  • Armstrong, D. M. (1973). Belief, truth and knowledge. London: Cambridge University Press.
  • Baldwin, T. (2007). The normative character of belief. In M. S. Green & J. N. Williams (Eds.), Moore’s paradox: New essays on belief, rationality, and the first person (pp. 76–89). Oxford: Oxford University Press.
  • Berker, S. (2013). The rejection of epistemic consequentialism. Philosophical Issues, 23(1), 363–387.
  • Bird, A. (2007). Justified judging. Philosophy and Phenomenological Research, 74(1), 81–110.
  • Blackburn, S. (1993). Essays in quasi-realism. New York, NY: Oxford University Press.
  • Boghossian, P. A. (2003). The normativity of content. Philosophical Issues, 13(1), 31–45.
  • Brandom, R. B. (1994). Making it explicit: Reasoning, representing, and discursive commitment. Cambridge, MA: Harvard University Press.
  • Brandom, R. B. (2001). Modality, normativity, and intentionality. Philosophy and Phenomenological Research, 63(3), 611–623.
  • Brown, J. (2008). The knowledge norm for assertion. Philosophical Issues, 18(1), 89–103.
  • Brown, J. (2012). Assertion and practical reasoning: Common or divergent epistemic standards? Philosophy and Phenomenological Research, 84(1), 123–157.
  • Burge, T. (2003). Perceptual entitlement. Philosophy and Phenomenological Research, 67(3), 503–548.
  • Bykvist, K., & Hattiangadi, A. (2007). Does thought imply ought? Analysis, 67(296), 277–285.
  • Bykvist, K., & Hattiangadi, A. (2013). Belief, truth, and blindspots. In T. Chan (Ed.), The aim of belief (pp. 100–122). New York, NY: Oxford University Press.
  • Chrisman, M. (2008). Ought to believe. Journal of Philosophy, 105(7), 346–370.
  • David, M. (2005). Truth as the primary epistemic goal: A working hypothesis. In M. Steup & E. Sosa (Eds.), Contemporary debates in epistemology (pp. 296–312). Oxford: Blackwell.
  • Davidson, D. (2001). Comments on Karlovy Vary papers. In P. Kotatko (Ed.), Interpreting Davidson (pp. 285–308). Stanford, CA: CSLI Publications.
  • Douven, I. (2006). Assertion, knowledge, and rational credibility. Philosophical Review, 115(4), 449–485.
  • Dretske, F. (2000). Norms, history and the constitution of the mental. In Perception, knowledge and belief (pp. 242–258). Cambridge: Cambridge University Press.
  • Engel, P. (2004). Truth and the aim of belief. In D. Gillies (Ed.), Laws and models in science (pp. 77–97). London: King’s College Publications.
  • Engel, P. (2007). Belief and normativity. Disputatio, 2(23), 179–203.
  • Engel, P. (2013). Doxastic correctness. Aristotelian Society Supplementary Volume, 87(1), 199–216.
  • Enoch, D. (2006). Agency, shmagency: Why normativity won’t come from what is constitutive of action. Philosophical Review, 115(2), 169–198.
  • Fassio, D. (2011). Belief, correctness and normativity. Logique et Analyse, 54(216), 471–486.
  • Feldman, R. (2002). Epistemological duties. In P. Moser (Ed.), The Oxford handbook of epistemology (pp. 362–384). Oxford: Oxford University Press.
  • Fine, K. (1994). Essence and modality. Philosophical Perspectives, 8, 1–16.
  • Firth, R. (1981). Epistemic merit, intrinsic and instrumental. Proceedings and Addresses of the American Philosophical Association, 55(1), 5–23.
  • Frankish, K. (2007). Deciding to believe again. Mind, 116(463), 523–547.
  • Frost, K. (2014). On the very idea of direction of fit. Philosophical Review, 123(4), 429–484.
  • Gibbard, A. (2005). Truth and correct belief. Philosophical Issues, 15(1), 338–350.
  • Gibbons, J. (2013). The norm of belief. Oxford: Oxford University Press.
  • Glüer, K., & Pagin, P. (1998). Rules of meaning and practical reasoning. Synthese, 117(2), 207–227.
  • Glüer, K., & Wikforss, Å. (2009). Against content normativity. Mind, 118(469), 31–70.
  • Glüer, K., & Wikforss, Å. (2010a). The normativity of meaning and content. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy.
  • Glüer, K., & Wikforss, Å. (2010b). The truth norm and guidance: A reply to Steglich-Petersen. Mind, 119(475), 757–761.
  • Glüer, K., & Wikforss, Å. (2013). Against belief normativity. In T. Chan (Ed.), The aim of belief. New York, NY: Oxford University Press.
  • Griffiths, A. P. (1962). On belief. Proceedings of the Aristotelian Society, 63(n/a), 167–186.
  • Hieronymi, P. (2006). Controlling attitudes. Pacific Philosophical Quarterly, 87(1), 45–74.
  • Horwich, P. (2006). The value of truth. Noûs, 40(2), 347–360.
  • Horwich, P. (2013). Belief-truth norms. In T. Chan (Ed.), The aim of belief (pp. 17–31). New York, NY: Oxford University Press.
  • Huemer, M. (2007). Moore’s paradox and the norm of belief. In S. Nuccetelli & G. Seay (Eds.), Themes from G.E. Moore (pp. 142–57). Oxford: Oxford University Press.
  • Humberstone, I. L. (1992). Direction of fit. Mind, 101(401), 59–83.
  • Hume, D. (1739). A treatise of human nature. Oxford: Oxford University Press.
  • Kelly, T. (2003). Epistemic rationality as instrumental rationality: A critique. Philosophy and Phenomenological Research, 66(3), 612–640.
  • Kornblith, H. (1993). Epistemic normativity. Synthese, 94(3), 357–376.
  • Korsgaard, C. M. (1996). The sources of normativity (Vol. 110). Cambridge: Cambridge University Press.
  • Kvanvig, J. L. (2003). The value of knowledge and the pursuit of understanding (Vol. 113). Cambridge: Cambridge University Press.
  • Lackey, J. (2007). Norms of assertion. Noûs, 41(4), 594–626.
  • Littlejohn, C. (2010). Moore’s paradox and epistemic norms. Australasian Journal of Philosophy, 88(1), 79–100.
  • Littlejohn, C. (2012). Justification and the truth-connection. Cambridge: Cambridge University Press.
  • Littlejohn, C. (2013). The Russellian retreat. Proceedings of the Aristotelian Society, 113, 293–320.
  • Lynch, M. (2004). True to life: Why truth matters. Cambridge, MA: MIT Press.
  • Lynch, M. (2009a). The value of truth and the Truth of Values. In A. Haddock, A. Millar, & D. Pritchard (Eds.), Epistemic value. Oxford: Oxford University Press.
  • Lynch, M. (2009b). Truth, value and epistemic expressivism. Philosophy and Phenomenological Research, 79(1), 76–97.
  • Maitzen, S. (1995). Our errant epistemic aim. Philosophy and Phenomenological Research, 55(4), 869–876.
  • Mayo, B. (1963). Belief and constraint. Proceedings of the Aristotelian Society, 64, 139–156.
  • McGlynn, A. (2013). Believing things unknown. Noûs, 47(2), 385–407.
  • McHugh, C. (2011). What do we aim at when we believe? Dialectica, 65(3), 369–392.
  • McHugh, C. (2012a). Belief and aims. Philosophical Studies, 160(3), 425–439.
  • McHugh, C. (2012b). The truth norm of belief. Pacific Philosophical Quarterly, 93(1), 8–30.
  • McHugh, C. (2013). Normativism and doxastic deliberation. Analytic Philosophy, 54(4), 447–465.
  • McHugh, C. (2014). Fitting belief. Proceedings of the Aristotelian Society, 114(2pt2), 167–187.
  • McHugh, C., & Whiting, D. (2014). The normativity of belief. Analysis, 74(4), 698–713.
  • Millar, A. (2004). Understanding people: Normativity and rationalizing explanation. New York, NY: Oxford University Press.
  • Millar, A. (2009). How reasons for action differ from reasons for belief. In S. Robertson (Ed.), Spheres of reason (pp. 140–163). New York, NY: Oxford University Press.
  • Miller, A. (2008). Thoughts, oughts and the conceptual primacy of belief. Analysis, 68(299), 234–238.
  • Millikan, R. G. (1984). Language, thought and other biological categories. Cambridge, MA: MIT Press.
  • Moore, G. E. (1942). A reply to my critics. In P. A. Schilpp (Ed.), The philosophy of G. E. Moore. Chicago, IL: Open Court.
  • Moran, R. A. (1997). Self-knowledge: Discovery, resolution, and undoing. European Journal of Philosophy, 5(2), 141–61.
  • Owens, D. J. (2003). Does belief have an aim? Philosophical Studies, 115(3), 283–305.
  • Papineau, D. (1999). Normativity and judgment. Proceedings of the Aristotelian Society, 73(73), 16–43.
  • Papineau, D. (2013). There are no norms of belief. In T. Chan (Ed.), The Aim of Belief (pp. 64–79). New York, NY: Oxford University Press.
  • Peacocke, C. (1999). Being known. Oxford: Oxford University Press.
  • Plantinga, A. (1993). Warrant and proper function. New York, NY: Oxford University Press.
  • Platts, M. B. (1979). Ways of meaning: An introduction to a philosophy of language. London: Routledge & K. Paul.
  • Railton, P. (1994). Truth, reason, and the regulation of belief. Philosophical Issues, 5, 71–93.
  • Railton, P. (1997). On the hypothetical and non-hypothetical in reasoning about belief and action. In G. Cullity & B. N. Gaut (Eds.), Ethics and practical reason (pp. 53–79). New York, NY: Oxford University Press.
  • Raleigh, T. (2013). Belief norms and blindspots. Southern Journal of Philosophy, 51(2), 243–269.
  • Ramsey, F. P. (1931). Foundations of mathematics and other logical essays. London: Routledge.
  • Rosen, G. (2001). Brandom on modality, normativity and intentionality. Philosophy and Phenomenological Research, 63(3), 611–623.
  • Setiya, K. (2008). Believing at will. Midwest Studies in Philosophy, 32(1), 36–52.
  • Shah, N. (2003). How truth governs belief. Philosophical Review, 112(4), 447–482.
  • Shah, N. (2008). How action governs intention. Philosophers’ Imprint, 8(5), 1–19.
  • Shah, N., & Velleman, J. D. (2005). Doxastic deliberation. Philosophical Review, 114(4), 497–534.
  • Smithies, D. (2012). The normative role of knowledge. Noûs, 46(2), 265–288.
  • Sosa, E. (2003). The place of truth in epistemology. In L. Zagzebski & M. DePaul (Eds.), Intellectual virtue: Perspectives from ethics and epistemology (pp. 155–180). New York, NY: Oxford University Press.
  • Sosa, E. (2007). A virtue epistemology: Apt belief and reflective knowledge, Volume I. Oxford: Oxford University Press.
  • Sosa, E. (2009). Knowing full well: The normativity of beliefs as performances. Philosophical Studies, 142(1), 5–15.
  • Sosa, E. (2010). Knowing full well. Princeton, NJ: Princeton University Press.
  • Stalnaker, R. (1984). Inquiry. Cambridge, MA: MIT Press.
  • Steglich-Petersen, A. (2006). No norm needed: On the aim of belief. Philosophical Quarterly, 56(225), 499–516.
  • Steglich-Petersen, A. (2008). Against essential normativity of the mental. Philosophical Studies, 140(2), 263–283.
  • Steglich-Petersen, A. (2009). Weighing the aim of belief. Philosophical Studies, 145(3), 395–405.
  • Sutton, J. (2007). Without justification. Cambridge, MA: MIT Press.
  • Toribio, J. (2013). Is there an “ought” in belief? Teorema: Revista Internacional de Filosofía, 32(3), 75–90.
  • Vahid, H. (2006). Aiming at truth: Doxastic vs. epistemic goals. Philosophical Studies, 131(2), 303–335.
  • Vahid, H. (2009). The epistemology of belief. London: Palgrave Macmillan.
  • Velleman, D. (2000a). On the aim of belief. In D. Velleman (Ed.), The possibility of practical reason (pp. 244–281). New York, NY: Oxford University Press.
  • Velleman, D. (2000b). The possibility of practical reason (Vol. 106). New York, NY: Oxford University Press.
  • Wedgwood, R. (2002). The aim of belief. Philosophical Perspectives, 16(s16), 267–97.
  • Wedgwood, R. (2007). The nature of normativity. New York, NY: Oxford University Press.
  • Wedgwood, R. (2013). Doxastic correctness. Aristotelian Society Supplementary Volume, 87(1), 217–234.
  • Weiner, M. (2005). Must we know what we say? Philosophical Review, 114(2), 227–251.
  • Weiner, M. (2014). The spectra of epistemic norms. In J. Turri & C. Littlejohn (Eds.), Epistemic norms: New essays on action, belief, and assertion (pp. 201–218). Oxford: Oxford University Press.
  • Whiting, D. (2010). Should I believe the truth? Dialectica, 64(2), 213–224.
  • Whiting, D. (2013). Nothing but the truth: On the norms and aims of belief. In T. Chan (Ed.), The Aim of Belief. New York, NY: Oxford University Press.
  • Williams, B. (1973). Deciding to believe. In B. Williams (Ed.), Problems of the Self (pp. 136–51). Cambridge, MA: Cambridge University Press.
  • Williams, B. (2002). Truth and truthfulness: An essay in genealogy. Princeton, NJ: Princeton University Press.
  • Williamson, T. (2000). Knowledge and its limits. Oxford: Oxford University Press.
  • Williamson, T. (2005). Knowledge, context, and the agent’s point of view. In G. Preyer & G. Peter (Eds.), Contextualism in philosophy: Knowledge, meaning, and truth (pp. 91–114). New York, NY: Oxford University Press.
  • Wright, S. (2014). The dual-aspect norms of belief and assertion: A virtue approach to epistemic norms. In C. Littlejohn & J. Turri (Eds.), Epistemic norms: New essays on action, belief, and assertion (pp. 239–258). New York, NY: Oxford University Press.
  • Yamada, M. (2012). Taking aim at the truth. Philosophical Studies, 157(1), 47–59.
  • Zagzebski, L. (2004). Epistemic value and the primacy of what we care about. Philosophical Papers, 33(3), 353–377.
  • Zalabardo, J. L. (2010). Why believe the truth? Shah and Velleman on the aim of belief. Philosophical Explorations, 13(1), 1–21.
  • Zangwill, N. (2005). The normativity of the mental. Philosophical Explorations, 8(1), 1–19.

 

Author Information

Davide Fassio
Email: davide.fassio@unige.ch
University of Geneva
Switzerland

Ernst Cassirer (1874-1945)

Ernst Cassirer was the most prominent, and the last, Neo-Kantian philosopher of the twentieth century. His major philosophical contribution was the transformation of his teacher Hermann Cohen’s mathematical-logical adaptation of Kant’s transcendental idealism into a comprehensive philosophy of symbolic forms intended to address all aspects of human cultural life and creativity. In doing so, Cassirer paid equal attention to both sides of the traditional Neo-Kantian division between the Geisteswissenschaften and Naturwissenschaften, that is, between the social sciences and the natural sciences. This is expressed most systematically in his masterwork, the multi-volume Philosophie der symbolischen Formen (1923-9). Here Cassirer marshaled the widest learning of human cultural expression—in myth, religion, language, philosophy, history, art, and science—for the sake of completing and correcting Kant’s transcendental program. The human being, for Cassirer, is not simply the rational animal, but the animal whose experience with and reaction to the world is governed by symbolic relations. Cassirer was a quintessential humanistic liberal, believing freedom of rational expression to be coextensive with liberation. Cassirer was also the twentieth century’s greatest embodiment of the Enlightenment ideal of comprehensive learning, having written widely-acclaimed histories of the ideas of science, historiography, mathematics, mythology, political theory, and philosophy. Though Cassirer was cordial with both Moritz Schlick and Martin Heidegger, his popularity was eclipsed by the simultaneous rise of logical positivism in the English-speaking world and of phenomenology on the European continent. His professional career was the victim, too, of the political events surrounding the ascendancy of Nazism in German academies.

Table of Contents

  1. Biography
  2. Philosophy of Symbolic Forms
  3. Cultural Anthropology
  4. Myth
  5. Language
  6. Science
  7. Political Philosophy
  8. The Davos Conference
  9. References and Further Reading
    1. Cassirer’s Major Works
    2. Further Reading

1. Biography

Ernst Cassirer was born in 1874, the son of the established Jewish merchant Eduard Cassirer, in the former German city of Breslau (modern-day Wrocław, Poland). He matriculated at the University of Berlin in 1892. His father intended that he study law, but Cassirer’s interest in literature and philosophy prevented him from doing so. Sampling various courses at the universities at Leipzig, Munich, and Heidelberg, Cassirer was first exposed to the Neo-Kantian philosophy by the social theorist Georg Simmel in Berlin. In 1896, Cassirer began his doctoral studies under Hermann Cohen at the University of Marburg.

Cassirer’s interests at Marburg ran, as they would always, toward framing Neo-Kantian thought in the wider contexts of historical thinking. These interests culminated in his dissertation, Descartes: Kritik der mathematischen und naturwissenschaftlichen Erkenntnis (1899). Three years later, Cassirer published a similarly historical book on Leibniz’ System in seinen wissenschaftlichen Grundlagen (1902). Cassirer was also the editor of Leibniz’ Philosophische Werke (1906). His focus on the development of modern idealist epistemology and its foundational importance for the history of the various natural sciences and mathematics reached its apex in Cassirer’s three-volume Das Erkenntnisproblem in der Philosophie und Wissenschaft der neueren Zeit (1906-1920), for which he was awarded the Kuno Fischer Medal by the Heidelberg Academy. The first volume, Cassirer’s Habilitationsschrift at the University of Berlin (1906), examines the development of epistemology from the Renaissance through Descartes; the second (1907) continues from modern empiricism through Kant; the third (1920) deals with the development of epistemology after Kant, especially the division between Hegelians and Neo-Kantians; and the fourth volume of Das Erkenntnisproblem, on contemporary epistemology and science, was written in exile in 1940, but only published after the end of the war in 1946.

Although his quality as a scholar of ideas was unquestioned, anti-Jewish sentiment in German universities made finding suitable employment difficult for Cassirer. Only through the personal intervention of Wilhelm Dilthey was Cassirer given a Privatdozent position at the University of Berlin in 1906. His writing there was prolific and continued the Neo-Kantian preoccupation with the intersections among epistemology, mathematics, and natural science. Cassirer’s work on, and with, Einstein exemplifies the quality of his contributions to the philosophy of science: Substanzbegriff und Funktionsbegriff (1910) and Zur Einsteinschen Relativitätstheorie (1921). These works also mark Cassirer’s conviction that an historian of ideas could make a major contribution to the most contemporary problems in every field.

After the First World War, and in the more tolerant Weimar Republic, Cassirer was invited to a chair at the new University of Hamburg in 1919. There, Cassirer came into the cultural circle of Erwin Panofsky and the Warburg Library of the Cultural Sciences. Immediately Cassirer was absorbed into the vast cultural-anthropological data collected by the Library, effecting the widest expansion of Neo-Kantian ideas into the previously uncharted philosophical territories of myth, the evolution of language, zoology, primitive cultures, fine art, and music. The acquaintance with the Warburg circle transformed Cassirer from a student of the Marburg School’s analysis of the transcendental conditions of thinking into a philosopher of culture whose inquisitiveness touched nearly all areas of human cultural life. This intersection of Marburg and Warburg was indeed the necessary background of Cassirer’s masterwork, the multi-volume Philosophie der symbolischen Formen (1923-1929).

In addition to his programmatic work, Cassirer was a major contributor to the history of ideas and the history of science. In conscious contrast with Hegelian accounts of history, Cassirer does not begin with the assumption of a theory of dialectical progress that would imply the inferiority of earlier stages of historical developments. By starting instead with the authors, cultural products, and historical events themselves, Cassirer finds characteristic frames of mind that are defined by the kinds of philosophical questions and responses that frame them, which are in turn constituted by characteristic forms of rationality. Among his works at this time, which influenced a generation of historians of ideas from Arthur Lovejoy to Peter Gay, are Individuum und Kosmos in der Philosophie der Renaissance (1927); Die Platonische Renaissance in England und die Schule von Cambridge (1932); Die Philosophie der Aufklärung (1932); Das Problem Jean-Jacques Rousseau (1932); and Descartes: Lehre, Persönlichkeit, Wirkung (1939). Cassirer’s philosophy of science had a similar influence on the historical analyses of Alexander Koyré and, through him, Thomas Kuhn.

In 1929, Cassirer was chosen Rektor of the University of Hamburg, making him the first Jewish person to hold that position in Germany. However, even as Cassirer’s star was rising, the situation for Jewish academics was deteriorating. With Hitler’s appointment as Chancellor came the ban on Jews holding academic positions. Cassirer saw the writing on the wall and emigrated with his family in 1933. He spent two years at Oxford and then six at Göteborg, where he wrote Determinismus und Indeterminismus in der modernen Physik (1936), Descartes: Lehre, Persönlichkeit, Wirkung (1939), and Zur Logik der Kulturwissenschaften (1942). During his Swedish years he also wrote the first comprehensive study of the Swedish legal theorist and proto-analytic philosopher Axel Hägerström.

In 1941, Cassirer boarded the last ship the Germans permitted to sail from Sweden to the United States, where he would hold positions at Yale for two years and then at Columbia for one. His final books, written in English, were the career-synopsis, An Essay on Man (1944), and his first philosophical foray into contemporary politics, The Myth of the State (1946), published posthumously. Cassirer’s death in New York City on April 13, 1945, preceded that of Hitler and the surrender of Germany by mere weeks.

2. Philosophy of Symbolic Forms

“The Philosophy of Symbolic Forms is not concerned exclusively or even primarily with the purely scientific, exact conceiving of the world; it is concerned with all the forms assumed by man’s understanding of the world” (Philosophy of Symbolic Forms, vol. III, 13). For Cassirer, Neo-Kantianism was less about doctrinal allegiance than it was about a common commitment to explore the cognitive structures that underlie the variety of human experience. After the death of Cohen, Cassirer became increasingly interested in value and culture. Inspired by the Warburg Library, Cassirer cast his net into an ocean of cultural expression, trying to find the common thread that united the manifold of cultural forms, that is, to move from the critique of reason to the critique of culture.

As to what precisely symbolic forms are, Cassirer offers perhaps his clearest definition in an early lecture at the Warburg Library (1921):

By ‘symbolic form’ I mean that energy of the spirit through which a mental meaning-content is attached to a sensual sign and inwardly dedicated to this sign. In this sense language, the mythical-religious world, and the arts each present us with a particular symbolic form. For in them all we see the mark of the basic phenomenon, that our consciousness is not satisfied to simply receive impressions from the outside, but rather that it permeates each impression with a free activity of expression. In what we call the objective reality of things we are thus confronted with a world of self-created signs and images. (“Der Begriff der Symbolischen Form im Aufbau der Geisteswissenschaften”)

An illustration Cassirer uses is that of the curved line on a flat plane. To the geometer, the line means a quantitative relation between the two dimensions of the plane; to the physicist, the line perhaps means a relation of energy to mass; and to the artist, the line means a relation between light and darkness, shape and contour. More than simply a reflection of different practical interests, Cassirer believes each of these brings different mental energies to bear in turning the visual sensation of the line into a distinct human experience. No one of these ways of experiencing is the true one, though each has its distinctive pragmatic uses within its own field. The task of the philosopher is to understand the internal directedness of each of these mental energies independently and in relation to the others as the sum total of human mental expression, which is to say, culture.

The first two forms Cassirer discusses, in the first two volumes respectively, are language and myth. The third volume of the Philosophy of Symbolic Forms concerns contemporary advances in epistemology and natural science: “We shall show how the stratum of conceptual, discursive knowledge is grounded in those other strata of spiritual life which our analysis of language and myth has laid bare; and with constant reference to this substructure we shall attempt to determine the particularity, organization, and architectonics of the superstructure – that is, of science” (Philosophy of Symbolic Forms, vol. III, xiii). Cassirer works historically, tracing the problem of philosophical knowledge through the Ancient Greeks up through the Neo-Kantian tradition. The seemingly endless battle between intuition and conceptualization has been contended in various forms between the originators of myths and the earliest theorists of number, between the Milesians and Eleatics, between the empiricists and rationalists, and again right up to Ernst Mach and Max Planck. Cassirer’s position here is conciliatory: both sides have contributed and will continue to contribute their perspectives on the eternal questions of philosophy insofar as both recognize their efforts as springing from the human being’s multifaceted and spontaneous creativity—as symbol-forming rather than merely designating endeavors which, in their dialectic with one another, construct ever more elaborate and yet universal ways to navigate our world:

Physics gains this unity and extension by advancing toward ever more universal symbols. But in this process it cannot jump over its own shadow. It can and must strive to replace particular concepts and signs with absolutely universal ones. But it can never dispense with the function of concepts and signs as such: this would demand an intellectual representation of the world without the basic instruments of representation. (Philosophy of Symbolic Forms, vol. III, 479)

The fourth volume, The Metaphysics of Symbolic Forms, was published posthumously. Along with other papers left at the time of his death, the German original is now found in the first volume of Cassirer’s Nachgelassene Manuskripte und Texte, edited by John Michael Krois and Oswald Schwemmer in 1995. The English volume, assembled and edited by Donald Philip Verene and John Michael Krois in 1996, contains two texts from different periods in Cassirer’s writings. The first, from 1928, deals with human nature rather than metaphysics proper. In agreement with Heidegger, curiously, Cassirer seeks to replace traditional metaphysics with a fundamental study of human nature. Much of the thematic discussion of this part receives a refined and more complete expression in Cassirer’s 1944 Essay on Man. What is of novel interest here concerns his discussion of then-contemporary philosophical anthropologists like Dilthey, Bergson, and Simmel, and also the Lebensphilosophen, Schopenhauer, Kierkegaard, and Nietzsche, who otherwise receive short shrift in his work. His critical remarks on these latter thinkers concern their treatment of life as a new sort of metaphysics, one marred, however, by the sort of dogmatism that characterized pre-Kantian metaphysics.

The second text in Verene and Krois’s assembled volume comes from 1940, well after the project had been otherwise finished, and its theme is what Cassirer terms “basis phenomena”: phenomena so fundamental that they cannot be derived from anything else. The main basis phenomenon concerns how the tripartite structure of the self’s personal relation to the environment is mirrored in a tripartite social structure of the “I,” the “you,” and that which binds society: “work.” Work here is not to be confused with the Marxist conception: for Cassirer, work is anything made or effected, any subjective operation on the objective world. The initial and most fundamental production of work, for Cassirer, is culture—the sphere in which the “I” and “you” come together in active life.

Several objections to Cassirer’s masterwork have been raised. First, the precise identity and number of forms is ambiguous over Cassirer’s corpus. In the lecture from 1921, Cassirer names language, myth-religion, and art as forms, but that number cannot be considered exhaustive. Even in his summatory Essay on Man, consecutive pages maintain different lists: “myth, language, art, religion, history, science” (222) and then “language, myth, art, religion, science” (223); elsewhere science is omitted (63); mathematics is sometimes added; and religion is sometimes considered part of mythic thinking. The first two of the four volumes of The Philosophy of Symbolic Forms—on language and myth respectively—would seem to indicate that each volume would treat a specific form. But the latter two volumes break the trend to deal with a host of different forms. Moreover, it is ambiguous how precisely those forms are related. For example, myth is sometimes treated as a primitive form of language and sometimes non-developmentally as an equal correlate. Arithmetic and geometry are the logic that undergirds the scientific symbolic form, but in no way do they undergird primitive forms of science that have been superseded. Whether the forms are themselves developmental or whether development takes place by the instantiation of a new form is also left vague. For example, Cassirer indicates that the move from Euclidean to non-Euclidean geometry involves not just progress but an entirely new system of symbolization. However, myth does not seem to develop itself into anything else other than into something wholly different, that is, representational language.

There is, however, a certain necessity to Cassirer’s imprecision on these points. Taken together, the Philosophy of Symbolic Forms is a grand narrative that exposits how various human experiences evolve out of an originally animalistic and primitive articulation of expressive signs into the complicated and more abstract forms of culture in the twenty-first century. As “energies of the spirit” they cannot be affixed with the kind of rigid architectonic featured in Kant’s transcendental deduction of purely logical forms. Though spontaneous acts of mental energy, symbolic forms are both developmental and pragmatic insofar as they adapt over time to changing environments in response to real human needs, something that resists an overly rigid structuralism. Those responses feature a loose sort of internal-logic, but one characterized according to contingent cultural interactions with the world. Therefore, one ought not to expect Cassirer to offer the same logical precision that comes with the typical Neo-Kantian discernment of mental forms insofar as logic is only one form among many cultural relations with life.

3. Cultural Anthropology

Cassirer’s late Essay on Man (1944) expresses neatly his lifelong attempt to combine his Neo-Kantian view of the actively-constituting subject with his Warburgian appreciation for the diversity of human culture. Here, as ever, Cassirer begins with the history of views up into his present time, culminating in the presentation of a definitive scientific thesis that he would then proceed to refute. Jakob von Uexküll’s Umwelt und Innenwelt der Tiere (1909) argued that evolutionary biology has taken too far the view that animal parts and functions develop as a response to environmental factors. In its place Uexküll offers the “functional circle” of animal activity, which identifies the interaction of distinct receptor and effector systems. Animals are not simply reacting to the environment as it presents itself in sensory stimuli. They adapt themselves, consciously and unconsciously, to their environments, sometimes with clear signs of intelligence and insight. Different animals use diverse and sometimes highly complex systems of signals to better respond to and manipulate their environments to their advantage. Dogs, for example, are adroit at reading signals in body language, vocal tones, and even hormone changes while being remarkably effective in expressing a complex range of immediate inner states in terms of the vocalized pitch of their whimpers, grunts, or barks, as well as the bends of their tails, or the posture of their spines. In Pavlov’s famous experiments, dogs were conditioned to react both to the immediate signals of meat—its visual appearance and smell—and also to mediate signals, like a ringing bell, to the same effect.

Cassirer thinks this theory makes good sense of the animal world as a corrective to a too-simple version of evolution, but doubts this can be applied to humans. Over and above the signals received and expressed by animals, human beings evolved to use symbols to make their world meaningful. The same ringing of the bell would not be considered by man a physical signal so much as a symbol whose meaning transcends its real, concrete stimulation. For man, a bell does not indicate simply that food is coming, but induces him to wonder why that bell might indicate food, or perhaps whether an exam is over, or the fulfillment of a sacrament, or that someone is on the telephone. None of those symbols would lead necessarily to a response in the way the conditioned dog salivates at the bell. They instead prompt a range of freely creative responses in human knowers within distinct spheres of meaning:

Symbols—in the proper sense of this term—cannot be reduced to mere signals. Signals and symbols belong to two different universes of discourse: a signal is a part of the physical world of being; a symbol is a part of the human world of meaning. Signals are ‘operators’; symbols are ‘designators’. Signals, even when understood and used as such, have nevertheless a sort of physical or substantial being; symbols have only a functional value. (Essay on Man 32)

Between the straightforward reception of physical stimuli and the expression of an inner world lies, for Cassirer, the symbolic system: “This new acquisition transforms the whole of human life. As compared with the other animals man lives not merely in a broader reality; he lives, so to speak, in a new dimension of reality” (Essay on Man 24). That dimension is distinctively Kantian: the a priori forms of space and time. Animals have little trouble working in three-dimensional space; their optical, tactile, acoustic, and kinesthetic apprehension of spatial distances functions at least as well as it does in humans. But only to the human is the symbol of pure geometrical space meaningful, a universal, non-perceptual, theoretical space that persists without immediate relationship to his or her own interaction with the world: “Geometrical space abstracts from all the variety and heterogeneity imposed upon us by the disparate nature of our senses. Here we have a homogenous, a universal space” (Essay on Man 45). In terms of time, too, there can be no doubt that higher animals remember past sensations, or that memory affects the manner in which they respond when similar sensations are presented. But in the human person the past is not simply repeated in the present, but transformed creatively and constructively in ways that reflect values, regrets, hopes, and so forth:

It is not enough to pick up isolated data of our past experience; we must really re-collect them, we must organize and synthesize them, and assemble them into a focus of thought. It is this kind of recollection which gives us the characteristic human shape of memory, and distinguishes it from all the other phenomena in animal or organic life. (Essay on Man 51)

As animals recall pasts and live within sensory space, human beings construct histories and geometries. Both history and geometry, then, are symbolic engagements that render the world meaningful in an irreducibly human fashion.

This symbolic dimension of the person carries him or her above the effector-receptor world of environmental facts and subjective responses. He or she lives instead in a world of possibilities, imaginations, fantasy, and dreams. However, just as there is a kind of logic to the language of contrary-to-fact conditionals or to the rules of poetic rhythm, so too is there a natural directedness expressed in how human beings construct a world of meaning out of those raw effector and receptor processes. That directedness cannot, however, be restricted to rational intentionality, though reason is indeed an essential component. In distinction from the Neo-Kantian theories of experience and representation, Cassirer thinks there is a wider network of forms that enable a far richer engagement between subject and object than reason could produce: “Hence, instead of defining man as an animal rationale, we should define him as an animal symbolicum” (Essay on Man 26).

With his definition of man as the symbolic animal, Cassirer is in a position to reenvision the task of philosophy. Philosophy is much more than the analysis and eventual resolution of a set of linguistic problems, as Wittgenstein would have it; nor is it restricted, as it was for many Neo-Kantians, to transcendentally deducing the logical forms that would ground the natural sciences. Philosophy’s “starting point and its working hypothesis are embodied in the conviction that the varied and seemingly dispersed rays may be gathered together and brought into a common focus” (Essay on Man 222). The functions of the human person are not merely aggregate, loosely-connected expressions and factual conditions. Philosophy seeks to understand the connections that unite those expressions and conditions as an organic whole.

4. Myth

Max Müller was the leading theorist of myth in Cassirer’s day. In the face of Anglophone linguistic analysis, Müller held myth to be the necessary means by which earlier people communicated, one which left a number of traces within more-developed contemporary languages. What is needed for the proper study of myth, beyond this appreciation of its utility, is a step-by-step un-riddling of mythical objects into non-mythical concepts so as to rationally articulate what a myth really means. Sigmund Freud, of course, also considered myth to be a sort of unconscious expression, one that stands as a primitive version of the naturally-occurring expression of subconscious drives.

Cassirer considers myth in terms of the Neo-Kantian reflex by first examining the conditions for thinking and then analyzing the objects which are thought. In his Sprache und Mythos (1925), which is a sort of condensed summary of the first two volumes of Philosophy of Symbolic Forms, Cassirer comes to criticize Müller, more so than Freud, for an unreflective realism about the objects of myth. To say that objects of any sort are what they are independent of their representation is to misunderstand the last century of transcendental epistemology. Accordingly, to treat myth as a false representation of those objects, one waiting to be “corrected” by a properly rational representation, is to ignore the wider range of human intellectual power. Naturalizing myths, as Müller and his followers sought to do, does not dissolve an object’s mythical mask so much as transplant it into the foreign soil of an alternative symbolic form:

From this point of view all artistic creation becomes a mere imitation, which must always fall short of the original. Not only simple imitation of a sensibly presented model, but also what is known as idealization, manner, or style, must finally succumb to this verdict; for measured by the naked ‘truth’ of the object to be depicted, idealization is nothing but subjective misconception and falsification. And it seems that all other processes of mental gestation involve the same sort of outrageous distortion, the same departure from objective reality and the immediate data of experience. (Language and Myth, trans. Langer [1946], 6)

Müller’s view of myth is a symptom of a wider problem. For if myth is akin to art or language in falsifying the world as it really is, then language is limited to merely expressing itself without any claim to truth either: “From this point it is but a single step to the conclusion which the modern skeptical critics of language have drawn: the complete dissolution of any alleged truth content of language, and the realization that this content is nothing but a sort of phantasmagoria of the spirit” (Language and Myth, trans. Langer [1946], 7). Cassirer rejects such fictionalism in myth and language both as an appeal to psychologistic measures of truth that fail to see a better alternative in the philosophy of symbolic forms.

For Cassirer, myth (and language, discussed below) does reflect reality: the reality of the subject. Accordingly, the study of myth must focus on the mental processes that create myth instead of the presupposed ‘real’ objects of myth:

Instead of measuring the content, meaning, and truth of intellectual forms by something extraneous which is supposed to be reproduced in them, we must find in these forms themselves the measure and criterion for their truth and intrinsic meaning. Instead of taking them as mere copies of something else, we must see in each of these spiritual forms a spontaneous law of generation; an original way and tendency of expression which is more than a mere record of something initially given in fixed categories of real existence. (Language and Myth, trans. Langer [1946], 8)

The mythic symbol creates its own “world” of meaning distinct from that created by language, mathematics, or science. The question is no longer whether mythic symbols, or any of the other symbolic forms, correspond to a reality distinct from their mode of representation, but rather how myths relate to those other forms as limitations and supplementations of them. No matter how heterogeneous and variegated are the myths that come down to us, they move along definite avenues of feeling and creative thought.

An example Cassirer uses to illustrate his understanding of myth-making is the Avesta myth of Mithra. Attempts to identify Mithra as the sun-god, and thereby analogize it to the sun-god of the Egyptians, Greeks, and other early peoples, are misguided insofar as they stem from the attempt to explain away the object of mythical thinking in naturalistic rational terms. Cassirer points out that the analogy does not hold for strictly interpretive reasons: Mithra is said to appear on mountain tops before dawn and is said to illuminate the earth at night as well, and so cannot be the mythical analog of the sun. Mithra is not a thing to be naturalized, but evidence of an alternate spiritual energy that fashions symbolic responses to experiential confusions. What Mithra specifically reflects is a mode of thinking as it struggles to make sense of how the qualities of light and darkness result from a single essential unity: the cosmos.

As historical epochs provide new and self-enclosed worlds of experience, so too does myth evolve in conjunction with the needs of the age as an expression of overlapping but quite distinct patterns of mental life. Myths are hardly just wild stories with a particular pragmatic lesson. There is a specific mode of perception that imbues mythic thinking with its power to transcend experience. Similar to Giambattista Vico’s vision of historical epochs, Cassirer views the development of culture out of myth as a narrative of progressively more abstract systems of representation that serve as the foundation for human culture. Like Vico, too, there is continuity between the most elevated systems of theoretical expression of modern day—namely, religion, philosophy, and above all natural science—and a more primitive mind’s reliance upon myth and magic. However, Cassirer shares more with Enlightenment optimism than with Vico’s pessimistic conviction about the progressive degeneracy of scientific abstraction.

5. Language

The first volume of The Philosophy of Symbolic Forms (1923), on language, is guided by the search for epistemological reasons sufficient to explain the origin and development of human speech. Language is neither a nominal nor arbitrary designation of objects, nor, however, does language hold any immediate or essential connection to the object of its designation. The use of a word to designate an object is already caught in a web of intersubjectively-determined meanings which of themselves contain much more than the simple reference. Words are meaningful within experience, and that experience lies, as it did for Kant, in a sort of middle ground between the pure reception of objects and the autonomous activity of reason to generate forms within which content could be meaningful. In contrast to Kant and the Neo-Kantians, however, those forms cannot be presumed to be identical among all rational agents over the spans of history. Animal language is essentially a language of emotion, expressions of desires and aversions in response to environmental factors. Similarly, the earliest words uttered by our primitive ancestors were signs to deal with objects, every bit a tool alongside other tools to deal with the primitive’s sensed reality. As the human mind evolved to add spatio-temporal intuitions to mere sensation, a representational function overtook the mind’s merely expressive operations. The primitive vocalized report of received sensations became representations of enduring objects within fixed spatial points: “The difference between propositional language and emotional language is the real landmark between the human and the animal world. All the theories and observations concerning animal language are wide of the mark if they fail to recognize that fundamental difference” (Essay on Man 30). The features of those objects were further abstracted such that from commonalities there emerged a host of types, kinds, and eventually universals, whose meaning allowed for the emergence of mathematics, science, and philosophy.

The animal’s emotive signals operate as a practical imagination in a world of immediate experience. Proper human propositional speech, on the other hand, is already imbued at even its most basic levels with theoretical structures that involve quintessentially spatio-temporal forms linking subjects and their objects: “Language has a new task wherever such relationships are signified linguistically, where ‘here’ is distinguished from ‘there,’ where the location of the speaker is distinguished from the one spoken to, or where the greater nearness or distance is rendered by various indicative particles” (“The Problem of the Symbol and its Place in the System of Philosophy” in Luft [2015], 259). The application of dimensionality, and temporality as well, transforms the subjective sensation into an objective representation. Prepositions, participles, subjunctives, conditionals, and the rest, all involve either temporal or spatial prescriptions, and none of them seems to be a feature of animal speech. The older animalistic content is not entirely discarded as the same basic desires and emotions are expressed. The means of that expression, however, are formally of an entirely different character that binds the subject to the object in ways supposed to be binding for other rational agents. Although the interjection “ouch!” expresses pain well enough, and although animals have variously similar yelps and cries, it lacks the representational form of the proposition “I (this one, here and now) am (presently) in pain.” In the uniquely human sphere of ethics, too, the reliance on subjunctive and conditional verbal forms—“I ought not to have done that,” for example—always carries language beyond simple evocations of pleasures and aversions into the symbolic realm of meaningfulness.

The Neo-Kantian position on language allows Cassirer to address two contemporary anomalies in linguistic science. The first is the famous case of Helen Keller, the unfortunate deafblind girl from Alabama, who, with the help of her teacher Anne Sullivan, went on to become a prolific author and social activist. Sullivan had taught Helen signs by using a series of taps on her hand to correspond to particular sense impressions. Beyond her impaired sensory capacities, Cassirer argued, Helen was at first unable to cognize in the characteristically human way. One day at a water pump, Sullivan tapped “water” and Helen recognized the disjunction between the various sensations of water (varying temperatures, viscosities, and degrees of pressure) and the “thing” which is universally referred to as such. That moment opened up for Helen an entire world of names, not as mere expressive signals covering various sensations but as intersubjectively valid objective symbols. This discovery marked her entry into a new, symbolic mode of thinking: “The child had to make a new and much more significant discovery. She had to understand that everything has a name—that the symbolic function is not restricted to particular cases but is a principle of universal applicability which encompasses the whole field of human thought” (Essay on Man 34f).

The second case is the pathology of aphasia. As in the case of Helen Keller, what had long been thought a deficiency of the senses was revealed by Cassirer to be a cognitive failing. In the case of patients with traumatic injuries to certain areas of the brain, particular classes of speech act became impossible. The mechanical operation of producing the words was not the problem, but an inability to speak objectively about “unreal” conditions: “A patient who was suffering from a hemiplegia, from a paralysis of the right hand, could not, for instance, utter the words: ‘I can write with my right hand,’ because this was to him the statement of a fact, not of a hypothetical or unreal case” (Essay on Man 57). These types of aphasiacs were confined to the data provided by their sense impressions and therefore could not make the crucial symbolic move to theoretical possibility. For Cassirer, this was good evidence that language was neither mere emotional expression nor free-floating propositional content that could be analyzed logically only a posteriori.

In addition to these cases of abnormal speech pathology, Cassirer’s attention to the evolution of language enabled him to take a much wider view of both the form of utterance and its content than his more famous counterparts among the linguistic analysts. In Carnap’s Logical Syntax of Language, for example, the attempt is made to reduce semantic rules to syntax. The expected outcome was a philosophical grammar, a sound and complete system of words in the sort of logical relation that would be universally valid. For Cassirer, however, “human speech has to fulfill not only a universal logical task but also a social task which depends on the specific social conditions of the speaking community. Hence we cannot expect a real identity, a one-to-one correspondence between grammatical and logical forms” (Essay on Man 128). Contrary to the early analytical school, language cannot be considered a given thing waiting to be assessed according to independent logical categories, but instead needs to be assessed according to the a priori application of those categories to verbal expressions. Accordingly, the task of the philosopher of language must be refocused to account for the diversity and creativity of linguistic dynamics in order to better encapsulate the human rational agent in the fullest possible range of his or her powers.

6. Science

Cassirer was perhaps the last systematic philosopher to have both exhaustive knowledge of the historical development of each of the individual sciences and thorough familiarity with his day’s most important advancements. Substance and Function (1910) could still serve as a primer for the history of major scientific concepts prior to the twentieth century. The first part examines the concepts of number, space, and a vast array of special problems such as Emil du Bois-Reymond’s “limiting concepts”; Robert Mayer’s methodological advancements in thermodynamics; the spatial continuities of atoms in the physics of Roger Boscovich and Gustav Fechner; Galileo’s concept of inertia; Heinrich Hertz’s mechanics; and John Dalton’s law of multiple proportions. Each of these is examined with a view toward the epistemological presuppositions that gave rise to those problems and how each scientist’s innovations represented a novel way of posing problems through an application of spatio-temporal concepts.

This historical survey allows Cassirer to offer his own contributions to these problems along recognizably Neo-Kantian lines in the second part of Substance and Function. Science cannot be considered a collection of empirical facts. Science discovers no absolute qualities but only qualities in relation to other qualities within a particular field, such as the concept of mass as the sum of relations with respect to external impulses in motion, or energy as the momentary condition of a given physical system. Concrete sensuous impressions are only transformed into empirical objects by the determination of spatial and temporal form. The properties of objects, in bringing them into meaningful discourse by means of measurement, are thus mathematized as a field of relations: “The chaos of impressions becomes a system of numbers; but these numbers first gain their denomination, and thus their specific meaning, from the system of concepts which are theoretically established as universal standards of measurement” (Substance and Function 149). Objects as they stand outside possible experience are not the proper subject matter of science, any more than they are for mathematics. Proper science examines the logical connections among the spatio-temporal relationships of objects precisely as they are constituted by experience.

Abandoning the particular sensuous properties of objects for their logical relations as members of a system refocuses the scientific inquiry on how the natural world is symbolized by mathematical logic. Science becomes anthropomorphized insofar as whatever content is available to experience will be content that the human being spontaneously and creatively renders meaningful: “No content of experience can ever appear as something absolutely strange; for even in making it a content of our thought, in setting it in spatial and temporal relations with other contents we have thereby impressed it with the seal of our universal concepts of connection, in particular those of mathematical relations” (Substance and Function 150). However, this in no way reduces science to a mere relativism of personal inner projections, as if one way of representing the world were no better than any other. Though we do not know objects independent of mental representation, scientific understanding functions objectively by fixing the permanent logical elements and their connections within a uniform manifold of experience: “The object marks the logical possession of knowledge, and not a dark beyond forever removed from knowledge” (Substance and Function 303). Thus, science is absolutely tied to empirical reality, by which Cassirer means the sum of logical relations through which humans cognize the world. Therefore science, too, as much as language or myth, symbolically constitutes the world in its particular idiom: “The symbol possesses its adequate correlate in the connection according to law, that subsists between the individual members, and not in any constitutive part of the perception; yet it is this connection that gradually reveals itself to be the real kernel of the thought of empirical ‘reality’” (Substance and Function 149).

This Neo-Kantian vision of science is not something Cassirer thinks stands to “correct” science as currently practiced. On the contrary, the great modern scientists themselves have assumed precisely the same view, though in terms lacking the proper philosophical rigor. Newton’s assumption of absolute space and time put science on its first firm foundation, and in doing so he had to relinquish a purely sense-certain view of experience. Space and time in classical physics fix natural processes within a geometric schema, and fix mass as a self-identical thing within infinitely different spaces and different times. What Newton failed to realize was that this vision of space and time imputed ideal forms into what he believed was the straightforward observation of real objects. Kant had already shown as much. James Clerk Maxwell’s theory of light waves breaks with this system of transcribing observational circumstances with mathematical equations that associate spatial positions with states of affairs. Maxwell’s spatial point simultaneously has two correlate directional quantities: the magnetic and electrical vectors, whose representations in mathematics are readily cognizable but whose observation as such is impossible. The theory of Maxwell was therefore functionally meaningful without requiring a substantial ontology behind it. The definitive theory of light he discovered was not about a permanent thing situated within space and time but a set of interrelated magnitudes that could be functionally represented as a universal constant.

Hermann Ludwig von Helmholtz was among the first natural scientists to properly acknowledge the difference between observational descriptions of reality and symbolic theoretical constructions of it. As Cassirer quotes Helmholtz:

[I]n investigating [phenomena] we must proceed on the supposition that they are comprehensible. Accordingly, the law of sufficient reason is really nothing more than the urge of our intellect to bring all our perceptions under its own control. It is not a law of nature. Our intellect is the faculty of forming general conceptions. It has nothing to do with our sense-perceptions and experiences unless it is able to form general conceptions or laws. (Essay on Man 220)

The alleged sensory manifold held so dear in naively realist science gave way before Helmholtz’s demonstration that it is an ideally defined totality, ordered according to the rule which distinguishes properties on the basis of numerical series. That ideal unit is, for Helmholtz, the “symbol,” which cannot be considered a “copy” of a non-signifying object-in-itself (for how could that be conceived?) but the functional correspondence between two or more conceptual structures. Thus what Helmholtzian science discovers are the laws of interrelation among phenomena, laws which are the very condition of our experiencing something as an object in the first place.

To Helmholtz’s experimental demonstration, Cassirer is able to add the relational but still universal nature of scientific designation; that is, the crucial differentiation between substance-concepts and function-concepts:

For laws are never mere compendia of perceptible facts, in which the individual phenomena are merely placed end to end as on a string. Rather every law, as compared to immediate perception, comprises a […] transition to a new perspective. This can occur only when we replace the concrete data provided by experience with symbolic representations, which on the basis of certain theoretical presuppositions that the observer accepts as true and valid are thought to correspond to them. (The Philosophy of Symbolic Forms III, 21)

Accordingly, the truth of science does not depend upon an accurate conceptualization of substances so much as on demonstrating the limits of conceptual thinking about those substances, that is, on their symbolic functions.

The scientist cannot attain his end without strict obedience to the facts of nature. But this obedience is not passive submission. The work of all the great natural scientists – of Galileo and Newton, of Maxwell and Helmholtz, of Planck and Einstein—was not mere fact collecting; it was theoretical, and that means constructive, work. This spontaneity and productivity is the very center of all human activities. It is man’s highest power and it designates at the same time the natural boundary of our human world. In language, in religion, in art, in science, man can do no more than to build up his own universe – a symbolic universe that enables him to understand and interpret, to articulate and organize, to synthesize and universalize his human experience. (Essay on Man 221)

Cassirer’s essay Zur Einsteinschen Relativitätstheorie (1921) was his last major thematic enterprise before the first volume of The Philosophy of Symbolic Forms. In it he sees himself following Cohen’s task of updating Kant’s philosophical groundwork for science. Kant had taken for granted that the forms of science in his own day represented scientific thinking as such. His epistemological groundwork accordingly needed to support Newtonian physics. After Kant’s death, science leapt past the limits set by Newton just as mathematics pushed the limits of Euclidean three-dimensional geometry. Einstein’s theories of relativity effectively dismantled the authority of both; the fact that they did proved to Cassirer the non-absolute status of scientific symbolization as a doctrine about objects. An elucidation of the epistemological conditions that could allow for Einstein’s relativity was now necessary.

Cassirer replaced Kant’s static formalism with attention to the varied and alterable features of mathematical science that could accommodate radical new forms of mathematical logic and, by extension, systems of natural science. Pure Euclidean geometry was so influential because it dealt concretely and intuitively with real things as uniform and absolute substances. And it still works for most material applications. When non-Euclidean geometry came to the fore with Gauss, Riemann, and Christoffel, it was considered a mere play of analytical concepts that held some logical curiosity but no applicability. Over time, however, a gradual shift ensued as the concept of experience widened to include non-uniform concepts of space.

Pure Euclidean space stands, as it now seems, not closer to the demands of empirical and physical knowledge than the non-Euclidean manifolds but rather more removed. For precisely because it represents the logically simplest form of spatial construction it is not wholly adequate to the complexity of content and the material determinateness of the empirical. Its [i.e., the Euclidean] fundamental property of homogeneity, its axiom of the equivalence in principle of all points, now marks it as an abstract space; for, in the concrete and empirical manifold, there never is such uniformity, but rather thorough-going differentiation reigns in it. (“Euclidean and non-Euclidean Geometry,” in Luft [2015], 243)

It is thus not the case, as traditionally thought, that the new physical sciences simply adopted a more abstract vision of mathematics as their basis. The new physics represents a more widely encompassing symbolic representation that expresses a new mode of experience, one less concerned with the sense impressions of real objects than with the reality of their logical relations.

Einstein needed a geometry of curvature that varied according to the relation of mass and energy in order for general relativity to work, but this of itself does not mean Euclidean geometry was or even could be proven wrong by Minkowski space-time. In the terminology of symbolic forms, Cassirer thinks Einstein’s relativity transcended the symbolic forms of natural objects in favor of those of pure mathematical relations. The result is a fracturing into non-commensurable ways of analyzing one and the same “substance”: physically, chemically, mathematically, and so forth. Those forms ought not to be reduced to a single “meta” method that levels their differences as merely partial views. Each ought to be retained as an equally valid part of the total determination of the object. Thus Einstein was right to abandon absolute Newtonian space-time for relative Minkowski space-time. But his reason for doing so did not concern the former’s falsity. In place of a single absolutist description, the new relativism embraced an epistemology that featured a wider variety of equally valid modes of thinking about one and the same object. Objects, in Cassirer’s idiom, are relative to the symbolic form under which they are expressed.

The One reality can only be disclosed and defined as the ideal limit of diversely changing theories; but the setting of this limit itself is not arbitrary; it is inescapable, since the continuity of experience is established only thereby. No particular astronomical system, the Copernican no more than the Ptolemaic…may be taken as an expression of the ‘true’ cosmic order, but only the whole of these systems as they continuously unfold in accordance with a certain context. …We do not need the objectivity of absolute things, but we do require the objective determinacy of the way of experience itself. (Philosophy of Symbolic Forms III, 476)

Cassirer’s view of the evolution of science may be compared with Thomas Kuhn’s view insofar as both reject a single consistent progress toward absolute truth. Cassirer’s symbolic forms echo in Kuhn’s paradigms as incommensurable frameworks of meaning that stand in discomfited relationships with one another. But where Kuhn sees the conditions for shifted paradigms in the quasi-sociological language of the community crises brought about by insoluble intra-paradigm problems, Cassirer sees a more epistemological metamorphosis in the evolution and expansion of human thinking. More than just a professional and social shift away from Pythagoras or Galileo to Einstein or Planck, Cassirer thinks rational agency matures to embrace more variegated, more useful, and more precise symbols. This evolution does not bring the rational agent closer to the truth of objects, but it does bring more useful and exacting means by which to think about those objects. Insofar as science, more so than myth or language, cultivates that progression through its activity, it presents, for Cassirer, the prospect of carrying human nature to the very highest cultural achievements possible: “Science is the last step in man’s mental development and it may be regarded as the highest and most characteristic attainment of human culture” (Essay on Man 207).

7. Political Philosophy

Cassirer’s political philosophy has its roots in Renaissance humanism and the classics of Modern thought: Machiavelli, Rousseau, Kant, Goethe, and Humboldt. Ever concerned with a subject’s connection to the wider sphere of cultural life, Cassirer noted that the Ancient, Medieval, and Renaissance conceptions of politics were framed within a holistic worldview. In Modern times, a holistic order still obtained, but after Machiavelli this order was based upon interpersonal relationships rather than the divine or the natural. These social and political relationships are, like symbolic forms, neither entirely objective nor entirely subjective. They represent the construction of ourselves in the framework of our ideal comprehensive social life.

Man’s social consciousness depends upon a double act of identification and discrimination. Man cannot find himself, he cannot become aware of his individuality, except through the medium of his social life. […] Man, like the animals, submits to the rules of society but, in addition, he has an active share in bringing about, and an active power to change, the forms of social life. (Essay on Man 223)

As for Kant, human dignity derives from the capacity of rational agents to pose and constrain themselves by normative laws of their own making. Cassirer stresses, against Marx and Heidegger respectively, that it is neither the material nor the ontological conditions that man is born or thrown into that determine political order or social value. Rather, it is the active processes by which the human person creates laws, social institutions, and norms for themself that are paramount in determining the place of the human being in society. Politics is not simply the study of the relations between social institutions, as Marx and his sociological disciples believed, but of their meaningful construction within the symbolic forms of myth-making, art, poetry, religion, and science.

Human culture taken as a whole may be described as the process of man’s progressive self-liberation. Language, art, religion, science, are various phases in this process. In all of them man discovers and proves a new power – the power to build up a world of his own, an ‘ideal’ world. Philosophy cannot give up its search for a fundamental unity in this ideal world (Essay on Man, 228).

The opponent in Cassirer’s last work, The Myth of the State, is Heidegger and the kind of twentieth-century totalitarian mythologies of “crisis” by which he and so much of Germany were then entranced. Even if he did stand mostly alone, Cassirer stood firmly against the myth of Aryan supremacy, the myth of the eternal Jew, and the myth of Socialist utopia. He did not oppose the creative acts that gave rise to these myths but the unthinking allegiance they demanded of their acolytes. In so doing, Cassirer felt Germany, and not just Germany, had abandoned its heritage of classical liberalism, its tradition of laws, and its belief in the rational progress of both science and religion for a worldview based in power and struggles for personal gain masquerading as equality. With obvious reference to Heidegger and the National Socialists, Cassirer laments:

Perhaps the most important and the most alarming feature in this development of modern political thought is the appearance of a new power: the power of mythical thought. The preponderance of mythical thought over rational thought in some of our modern systems is obvious. (Myth of the State, 3)

Cassirer’s focus in Myth of the State is mostly not, however, the contemporary state of European politics. In fact, only in the last chapter is the word Nazi mentioned. The great majority of the book is caught up instead with history, almost jarringly so given the immediate crisis and Cassirer’s personal place in it. He has far more to say about medieval theories of grace, Plato’s Republic, and Hegel than he does about the rise of Hitler or the War. Of the First World War, Cassirer’s wife Toni would later write in her memoir, Mein Leben mit Ernst Cassirer, that despite some limited clerical duties on behalf of Germany, their major wartime concerns were whether there was sufficient electricity to write and whether the train tickets were first class (Toni Cassirer, 1948, 116-20): “We weren’t politicians, and didn’t even know any politicians” (Ibid., 117). And that aloofness stayed with Cassirer until the end. Charles W. Hendel, who was responsible for Cassirer’s appointment at Yale and who later became the posthumous editor of Myth of the State, illustrates how frustrating Cassirer’s silence on contemporary political matters was: “Won’t you tell us the meaning of what is happening today, instead of writing about past history, science, and culture? You have so much knowledge and wisdom—we who are working with you know that so well—but you should give others, too, the benefit of it” (Myth of the State x). In the early twenty-first century, Edward Skidelsky decried Cassirer’s reluctance to speak about contemporary politics as a symptom of a greater philosophical shortcoming:

“[Cassirer’s] is an enchanting vision. But it is also a fundamentally innocent one. Liberalism may have triumphed in the political sphere, but it was the illiberal philosophy of Heidegger that won the day at Davos and went on to leave the deepest stamp on twentieth-century culture. Who now shares Cassirer’s faith in the humanizing power of art or the liberating power of science? Who now believes that the truth will make us free?” (Skidelsky 2008, 222)

8. The Davos Conference

The historical event for which Cassirer is best known is the famous conference held in Davos, Switzerland in 1929. Planned as a symposium to bring together French- and German-speaking academics in a spirit of international collaboration, the conference was set in the resort town made famous by Thomas Mann’s epic The Magic Mountain (1924). Counting nearly 1,300 attendees, more than 900 of whom were the town’s residents, the conference featured 56 lectures delivered over the span of three weeks. Among those in attendance were contemporary heavyweights like Fritz Heinemann and Karl Joël, and rising stars like Emmanuel Lévinas, Joachim Ritter, Maurice de Gandillac, Ludwig Binswanger, and a young Rudolf Carnap. The centerpiece of the conference was to have been the showdown between the two most important philosophers in Germany: Cassirer and Heidegger. Curiously, there never was a disputation proper, in the sense of an official point-by-point debate, in part because neither man was up for it: Cassirer was bed-ridden by illness and Heidegger was less interested in attending lectures than in the resort town’s recreational activities. As a characteristic expression of his disdain toward stuffy academic conferences, Heidegger even gave one of his own talks while wearing his ski-suit.

Cassirer was the student and heir of Hermann Cohen, the unchallenged leader of Marburg Neo-Kantianism. Heidegger was the most brilliant student of the Southwest Neo-Kantian Heinrich Rickert, but was recommended to the chair of Marburg by none other than Marburger Paul Natorp. On at least three separate occasions, Cassirer and Heidegger were considered for the same academic post, as successor to Husserl, then to Rickert, and finally for the leading position in Berlin in 1930 (Gordon, 2010, 40). Cassirer and Heidegger were thus the two greatest living thinkers in the tradition of Kantian philosophy, and were invited to Davos to defend their rival positions on the question of whether an ontology could be derived from Kant’s epistemology. Their positions were contradictory in clear ways: Cassirer held the Marburg line that Kant’s entire project required that the thing-in-itself be jettisoned for a transcendental analysis of the forms of knowing. Heidegger wanted to recast not only Kant but philosophy itself as a fundamental investigation into the meaning of Being, and by specific extension, the human way of Being: Dasein. The debate about the proper interpretation of Kant went nearly nowhere, and Heidegger’s interpretation had more to do with Heidegger than with Kant. Cassirer, the co-editor of the critical edition of Kant’s works and the author of a superb intellectual biography, was no doubt the superior exegete. Nevertheless, Heidegger was doubtless the more captivating and original philosopher.

Beyond their divergent interpretations of Kant, the debate brought to the fore two competing intellectual forces that were at genuine odds: Cassirer’s Neo-Kantian maintenance of the spontaneous mental freedom requisite for the production of symbolic forms was pitted against Heidegger’s existential-phenomenological concentration on the irrevocable “thrownness” of human beings into a world whose common denominator was the realization of death. Cassirer thought Heidegger vastly overstated Dasein’s thrownness and understated its spontaneity, and that his subjectivism discounted the objectivity of the sciences and of moral laws. Also, if both the character of rationality and the inviolable value of the human person lie in a subject’s spontaneous use of theoretical and practical forms of reasoning, then the danger was clear: Heidegger’s Dasein had one foot in irrationality and the other in nihilism.

The historical significance of the Davos Conference thus lay, ironically, in its symbolic meaning. Primed by the cultural clash between humanism and iconoclasm represented by Thomas Mann’s characters Settembrini and Naphta, the participants in Davos expected the same battle between the stodgy old enlightenment Cassirer and the exciting, young, radical Heidegger. No doubt some in the audience fancied themselves a Hans Castorp, whose soul, and the very fate of Europe, was caught in the tug of war between Settembrini/Cassirer’s liberal rationalism and Naphta/Heidegger’s conservative mysticism. (Though, to be sure, Mann’s model for Naphta was György Lukács and not Heidegger.) In the Weimar Republic’s “Age of Crisis,” it was not so much what either man said, but what each symbolized that mattered. As Rudolf Carnap wrote in his journal, “Cassirer speaks well, but somewhat pastorally. […] Heidegger is serious and objective, as a person very attractive” (Friedman, 2000, 7). In a subsequent satirical reenactment, a young Emmanuel Lévinas mocked Cassirer by performing in buffo what he took to be the salient point of his lectures at Davos: “Humboldt, culture, Humboldt, culture” (Skidelsky, 2008, 1). Indeed what Cassirer defended was then subject to parody among the young. Cassirer was the last of the great polymaths like Goethe, the last comprehensive historian like Ranke, the last optimist like Humboldt, and the last of the Neo-Kantian academic establishment. Heidegger represented the revolution of a new German nation, one that would sweep away the old ways of philosophy as much as Hitler would sweep away Wilhelmine politics. Heidegger welcomed crisis as the condition for new growth and invention; Cassirer saw in crisis the collapse of a culture that took so long to achieve. Cassirer was the great scholar. Heidegger was the great philosopher. Cassirer clung to rational optimism and humanist culture while Heidegger championed existential fatalism. In 1929, the Zeitgeist clearly favored the latter.

The consequences of Davos, like the meaning of the conference itself, operated on two levels. On the level of the factual, Cassirer and Heidegger would maintain a somewhat detached respect for each other, with mutually critical yet professionally cordial responses in print over the years to come. Neither man came to change either his interpretation of Kant or his philosophy generally in any major way due to the conference. Symbolically, however, Davos was a disaster for Cassirer and for Neo-Kantianism. Europe was immediately swept up in increasingly violent waves of nationalism. Soon after Hitler’s appointment as Chancellor in 1933, Jews were banned from teaching in state schools. The Night of the Long Knives happened five years after Davos, and then the Night of Broken Glass four years after that. Neo-Kantian philosophers, especially the followers and friends of Hermann Cohen, were mainly Jewish. Cassirer fled to England and then Sweden in 1933 in fear of the Nazis, even while Heidegger was made Rektor at Freiburg.

9. References and Further Reading

What follows is a list of Cassirer’s major works. For an exhaustive bibliography, see http://www1.uni-hamburg.de/cassirer/bib/bibgr.htm. For the contents of Cassirer’s archive at Yale, see http://www1.uni-hamburg.de/cassirer/bib/yale1.htm.

a. Cassirer’s Major Works

  • (1899) Descartes: Kritik der Mathematischen und Naturwissenschaftlichen Erkenntnis (dissertation at Marburg).
  • (1902) Leibniz’ System in seinen wissenschaftlichen Grundlagen. Marburg: Elwert.
  • (1906) Das Erkenntnisproblem in der Philosophie und Wissenschaft der neueren Zeit. Erster Band. Berlin: Bruno Cassirer.
  • (1907) Das Erkenntnisproblem in der Philosophie und Wissenschaft der neueren Zeit. Zweiter Band. Berlin: Bruno Cassirer.
  • (1910) Substanzbegriff und Funktionsbegriff: Untersuchungen über die Grundfragen der Erkenntniskritik. Berlin: Bruno Cassirer.
  • (1916) Freiheit und Form: Studien zur deutschen Geistesgeschichte. Berlin: Bruno Cassirer.
  • (1921) Zur Einsteinschen Relativitätstheorie. Erkenntnistheoretische Betrachtungen. Berlin: Bruno Cassirer.
  • (1923) Philosophie der symbolischen Formen. Erster Teil: Die Sprache. Berlin: Bruno Cassirer.
  • (1925) Philosophie der symbolischen Formen. Zweiter Teil: Das mythische Denken. Berlin: Bruno Cassirer.
  • (1925) Sprache und Mythos: Ein Beitrag zum Problem der Götternamen. Leipzig: Teubner.
  • (1927) Individuum und Kosmos in der Philosophie der Renaissance. Leipzig: Teubner.
  • (1929) Die Idee der republikanischen Verfassung. Hamburg: Friedrichsen.
  • (1929) Philosophie der symbolischen Formen. Dritter Teil: Phänomenologie der Erkenntnis. Berlin: Bruno Cassirer.
  • (1932) Die Platonische Renaissance in England und die Schule von Cambridge. Leipzig: Teubner.
  • (1932) Die Philosophie der Aufklärung. Tübingen: J.C.B. Mohr.
  • (1936) Determinismus und Indeterminismus in der modernen Physik. Göteborg: Göteborgs Högskolas Årsskrift.
  • (1939) Axel Hägerström: Eine Studie zur Schwedischen Philosophie der Gegenwart. Göteborg: Högskolas Årsskrift.
  • (1939) Descartes: Lehre, Persönlichkeit, Wirkung. Stockholm: Bermann-Fischer Verlag.
  • (1942) Zur Logik der Kulturwissenschaften. Göteborg: Högskolas Årsskrift.
  • (1944) An Essay on Man. New Haven: Yale University Press.
  • (1945) Rousseau, Kant, Goethe: Two Essays. New York: Harper & Row.
  • (1946) The Myth of the State. New Haven: Yale University Press.

b. Further Reading

  • Barash, Jeffrey Andrew (2008), The Symbolic Construction of Reality: The Legacy of Ernst Cassirer. Chicago: University of Chicago Press.
  • Bayer, Thora Ilin (2001), Cassirer’s Metaphysics of Symbolic Forms: A Philosophical Commentary. New Haven: Yale University Press.
  • Braun, H.J., Holzhey, H., & Orth, E.W. (eds.) (1998), Über Ernst Cassirers Philosophie der symbolischen Formen. Frankfurt: Suhrkamp.
  • Cassirer, Toni (1948), Mein Leben mit Ernst Cassirer. Hildesheim: Gerstenberg.
  • Friedman, Michael (2000), A Parting of the Ways: Carnap, Cassirer, and Heidegger. Peru, IL: Open Court.
  • Gaubert, Joël (1996), La science politique d’Ernst Cassirer: pour une réfondation symbolique de la raison pratique contre le mythe politique contemporain. Paris: Éd. Kimé.
  • Gordon, Peter E. (2010), Continental Divide: Heidegger, Cassirer, Davos. Cambridge: Harvard University Press.
  • Hamlin, C., & Krois, J.M. (eds.) (2004), Symbolic Forms and Cultural Studies: Ernst Cassirer’s Theory of Culture. New Haven: Yale University Press.
  • Hanson, J. & Nordin, S. (2006), Ernst Cassirer: The Swedish Years. Bern: Peter Lang.
  • Heidegger, Martin (1928), “Ernst Cassirer: Philosophie der symbolischen Formen. 2. Teil: Das mythische Denken.” Deutsche Literaturzeitung, 21: 1000–1012.
  • Itzkoff, Seymour (1971), Ernst Cassirer: Scientific Knowledge and the Concept of Man. South Bend, IN: Notre Dame University Press.
  • Krois, John Michael (1987), Cassirer: Symbolic Forms and History. New Haven: Yale University Press.
  • Langer, Susanne (1942), Philosophy in a New Key: A Study in the Symbolism of Reason, Rite and Art. Cambridge, Mass.: Harvard University Press.
  • Lipton, D. R. (1978), Ernst Cassirer: The Dilemma of a Liberal Intellectual in Germany, 1914-1933. Toronto: University of Toronto Press.
  • Lofts, S.G. (2000), Cassirer: A “Repetition” of Modernity. Albany: SUNY Press.
  • Lübbe, Hermann (1975), Cassirer und die Mythen des zwanzigsten Jahrhunderts. Göttingen: Vandenhoeck & Ruprecht.
  • Luft, Sebastian (ed.) (2015), The Neo-Kantian Reader. New York: Routledge.
  • Paetzold, Heinz (1995), Ernst Cassirer — Von Marburg nach New York: eine philosophische Biographie. Darmstadt: Wissenschaftliche Buchgesellschaft.
  • Renz, Ursula (2002), Die Rationalität der Kultur: Zur Kulturphilosophie und ihrer transzendentalen Begründung bei Cohen, Natorp und Cassirer. Hamburg: Felix Meiner.
  • Rudolph, Enno (ed.) (1999), Cassirers Weg zur Philosophie der Politik. Hamburg: Felix Meiner.
  • Schilpp, Paul Arthur (ed.) (1949), The Philosophy of Ernst Cassirer. Evanston: Library of Living Philosophers.
  • Schultz, William (2000), Cassirer and Langer on Myth: An Introduction. New York: Routledge.
  • Schwemmer, Oswald (1997), Ernst Cassirer. Ein Philosoph der europäischen Moderne. Berlin: Akademie Verlag.
  • Skidelsky, Edward (2008), Ernst Cassirer: The Last Philosopher of Culture. Princeton: Princeton University Press.
  • Tomberg, Markus (1996), Der Begriff von Mythos und Wissenschaft bei Ernst Cassirer und Kurt Hübner. Münster: LIT Verlag.
  • Verene, Donald Phillip (ed.) (1979), Symbol, Myth, and Culture: Essays and Lectures of Ernst Cassirer, 1935-1945. New Haven: Yale University Press.
  • Verene, Donald Phillip (2011) The Origins of the Philosophy of Symbolic Forms: Kant, Hegel, and Cassirer. Evanston, IL: Northwestern University Press.

 

Author Information

Anthony K. Jensen
Email: Anthony.Jensen@providence.edu
Providence College
U. S. A.

Legal Hermeneutics

The question of how best to determine the meaning of a given text (legal or otherwise) has always been the chief concern of the general field of inquiry known as hermeneutics. Legal hermeneutics is rooted in philosophical hermeneutics and takes as its subject matter the nature of legal meaning. Legal hermeneutics asks the following sorts of questions: How do we come to decide what a given law means? Who makes that decision? What are the criteria for making that decision? What should be the criteria? Are the criteria that we use for deciding what a given law means good criteria? Are they necessary criteria? Are they sufficient? In whose service do our interpretive criteria operate? How were these criteria chosen and by whom? Within what sociopolitical, sociocultural, and sociohistorical contexts were these criteria generated? Are the criteria we have used in the past to ascertain the meaning of a given law the criteria we should still use today? Why or why not? What personal or political goals do the meanings of laws serve? How can we come up with better meanings of laws? On what basis can one meaning of a given law be justifiably prioritized over another? Through an interrogation into these meta-interpretive questions, legal hermeneutics serves the critical role of helping the interpreter of laws reach a higher level of self-reflexivity about the interpretive process. From a legal hermeneutical point of view, it is primarily through this heightened transparency about the process of interpretation that better meaning assessments are generated.

Some distinctive features of legal hermeneutics are (1) it is rooted in philosophical hermeneutics; (2) within the schema of mainstream philosophies of law, it is most closely conceptually related to legal interpretivism; (3) it shares an antifoundationalist sensibility with many alternative theories of law; and (4) within jurisprudence proper (legal theory), its primary substantive focus is on the debate in constitutional theory between the interpretive methods of originalism and non-originalism.

Table of Contents

  1. Roots: Philosophical Hermeneutics
  2. Legal Hermeneutics and Mainstream Philosophy of Law
  3. Legal Hermeneutics and Alternative Theories of Law
  4. Legal Hermeneutics in Jurisprudence Proper
  5. Conclusion
  6. References and Further Reading
    1. References
    2. Further Reading

1. Roots: Philosophical Hermeneutics

The term hermeneutics can be traced back at least as far as Ancient Greece. David Hoy traces the origin of the term to the Greek god Hermes, who was, among other things, the inventor of language and an interpreter between the gods and humanity. In addition, the Greek term ἑρμηνεύω, or hermeneutice, is central to Aristotle’s On Interpretation (Περὶ Ἑρμηνείας), which concerns the relationship among language, logic, and meaning.

Hermeneutical approaches to meaning are thematized and utilized in many academic disciplines: archaeology, architecture, environmental studies, international relations, political theory, psychology, religion, and sociology. Philosophical hermeneutics, specifically, is unique in that, rather than taking a particular approach to meaning, it is concerned with the nature of meaning, understanding, or interpretation.

Legal hermeneutics is rooted in philosophical hermeneutics, which asks not only the question of how best to interpret a given text, but also the deeper question of what it means to interpret a text at all. In other words, philosophical hermeneutics takes as its object of inquiry the interpretive process itself and seeks interpretive practices designed to respect that process (Dostal 2002; Malpas 2014; Wachterhauser 1994). Philosophical hermeneutics, then, can be alternately described as the philosophy of interpretation, the philosophy of understanding, or the philosophy of meaning.  The central problem of philosophical hermeneutics is how to successfully ascertain anything on the order of an objective interpretation, understanding, or meaning in light of the apparent fact that all meaning is ascertained through the filter of at least one interpreter’s subjectivity (Bleicher 1980: 1). Philosophical hermeneutics seeks transparency in the interpretive process en route to better meaning determinations. On this view, better theories of interpretation (1) capture the key features of the interpretive process, (2) recognize each act of understanding as an interpretation, and (3) are able to distinguish between more and less legitimate or objective interpretations, understandings, or meanings.

Philosophical hermeneutics has its theoretical origins in the work of 19th century German philologist Friedrich Ast. Ast’s Basic Elements of Grammar, Hermeneutics, and Criticism (Grundlinien der Grammatik, Hermeneutik und Kritik) of 1808 contains an early articulation of the main components of what later became known as the hermeneutic circle. Ast wrote that the basic principle of all understanding was a cyclical process of coming to understand the parts through the whole and the whole through the parts. This basic principle derived, for Ast, from the “original unity of all being” (Ast 1808: Section 72) or what Ast called spirit or Geist. (Ast’s Geist is commonly understood to have been derived from Herder’s concept of Volksgeist.)

To understand a text, for Ast, was to determine its inner meaning or spirit, its own internal development, through a circularity of reason, a dialectical relation between the parts of a given work and the whole (Ast 1808: Section 76).  What Ast called the hermeneutic of the spirit involved, in turn, the development of an understanding of the spirit of the writer and her era and an attempt to identify the one idea, or Grundidee, that unified a given text and that provided clarification regarding the relationship of the whole to the parts and the parts to the whole. In this process, for Ast, it was incumbent upon the interpreter to always remain cognizant of the historical period in which the text was situated.

Friedrich August Wolf was a contemporary of Ast’s and a fellow philologist. His Lecture on the Encyclopedia of Classical Studies (Vorlesung über die Enzyklopädie der Altertumswissenschaft) of 1831 defined hermeneutics as the science of the rules by which the meaning of signs is determined. These rules pointed, for Wolf, to a knowledge of human nature. Both historical and linguistic facts have a proper role in the interpretive process, according to Wolf, and help us to understand the organic whole that is the text. For Wolf, however, the primary task of hermeneutics was not the identification of the Grundidee or focal point of the text à la Ast, but the much more practical goal of the achievement of a high level of communication or dialogue between the interpreter of the text and the author, as well as between the interpreter and those to whom the text is to be explained.

Although aspects of the hermeneutics of both Ast and Wolf have survived into contemporary philosophical hermeneutics, the hermeneutics of both are generally understood to be concerned with what later became known as regional hermeneutics, or hermeneutics applicable to specific fields of study. Friedrich D. E. Schleiermacher, by contrast, was the first to define hermeneutics as the art of understanding itself, irrespective of field of study (Palmer 1969: 84). Underlying and grounding the specific rules of interpretation of the various fields of study, for Schleiermacher, was a unity grounded in the fact that all interpretation takes place in language (Ibid.). Schleiermacher thought that a general, rather than regional, hermeneutics was possible and that such a general hermeneutics would consist of the principles for the understanding of language (Ibid.). Specifically, for Schleiermacher, proper interpretation, or understanding, was not merely a function of grasping the thoughts of the author, but of coming to grips with the extent to which the language in which the thoughts took place affected, constrained, and informed those thoughts. Schleiermacher, then, is calling our philosophical attention to the fact that when we say we understand something, we are essentially just comparing it to something we already know, most basically a given language. Here, to understand is to place something within a pre-existing context of intelligibility. Understanding, for Schleiermacher, was therefore decidedly circular, but for him this did not amount to the conclusion that understanding was impossible. Instead, circularity is how understanding is defined. Understanding necessarily and structurally entails that the text and the interpreter share the same language and the same context of intelligibility.

Wilhelm Dilthey continued Schleiermacher’s pursuit of understanding qua understanding, but he sought to do so within the specific context of what he called the human sciences, or the Geisteswissenschaften (Dilthey 1883). The methods of scientific knowledge, for Dilthey, were too “reductionist and mechanistic” to capture the fullness of human-created phenomena (Palmer 1969, 100). The human sciences, or humanities, required instead two particular processes: (1) the development of an appreciation for the role of historical consciousness in our conceptions of meaning, and (2) a recognition that human-created phenomena are generated from “life itself” rather than through theory or concepts (Ibid.). In contemporary hermeneutic theory, the first process is often referred to as the historicality, or Geschichtlichkeit, of meaning and the second as life-philosophy, or Lebensphilosophie, the phenomenological view that meaning can only be generated and understanding can only be had through lived experience (Erlebnis) and not through the examination of concepts, theories, or other purely idealistic or rational methods (Nenon 1995).

While Dilthey observed that the categorical methods of understanding useful in science were inappropriate for use in the human sciences, Martin Heidegger switched the entire hermeneutic enterprise from an epistemological focus to an ontological one. This switch is customarily referred to as the ontological turn in hermeneutics (Kisiel 1993; Tugendhat 1992). For Heidegger, in his classic, Being and Time (1962/2008), the question of the nature of understanding, or Verstehen, could only be answered by first answering the question of the nature of what it means to be.  Accordingly, Heidegger set out in Being and Time to discover the nature of being qua being. To do so Heidegger went to the things themselves, or die Sachen selbst, in keeping with the phenomenological methodology he learned from his teacher, Edmund Husserl. Heidegger called his phenomenological inquiry into the nature of being qua being fundamental ontology. He also called it hermeneutic ontology, which highlights that, for Heidegger, being and interpretation are inextricably linked almost to the point of identity.

The idea that, for Heidegger, being and interpretation were virtually the same phenomenon is arguably best captured in two of Heidegger’s key concepts: Dasein and being-in-the-world. Dasein can be roughly translated as the human way of being, but its literal translation is “being there” or “being here.” With these concepts, Heidegger is attempting to stress that the human way of being is interactive both with one’s environment and with others in the world. To be human is to be active and involved in one’s world and with other people rather than to be in a particular static state. There are no isolated human subjects separate from the world, for Heidegger, and the human way of being is not adequately characterized by the traditional philosophical distinction between subject and object, or by the distinction between subject and other subjects (or minds and other minds, as this polarity is sometimes described) that originates, for Heidegger, in Descartes’s Meditations. Instead, being, for humans, is being-in-the-world, a term meant to highlight the lack of clear barriers between human beings and the contexts, or schemas of intelligibility, in which they find themselves (Dreyfus & Wrathall 2002). According to Heidegger, what this means for the phenomenon of understanding is that it is always a function of how a given human being is in the world, that is, a function of context. The relationship between being, or context, and understanding is reciprocal. Understanding, for Heidegger, discloses to us what it means to be, and who we are affects how we understand things. In other words, understanding, for Heidegger, is not a sort of apprehension of the way things really are, as the canonical, modern, philosophical tradition might think of it, but rather it is the process of appreciating the manner in which things are there for a particular person, or group of persons, in the world. Further, the manner in which things are there for us in the world is a function primarily of shared social and cultural practices. To understand something, then, is to be able to place it within a schema of intelligibility, which is generated by the shared social and cultural practices in which one finds oneself (Dreyfus & Wrathall 2005).

In his Truth and Method (1975), Hans-Georg Gadamer picks up on Heidegger’s concept of the hermeneutic circle of understanding that is at the core of what it means to be human in the world, but while it is true that Gadamer works within the Heideggerian paradigm to the extent that he fully accepts the ontological turn in hermeneutics, Gadamer’s own stated project in Truth and Method is to get at the question of understanding qua understanding. Specifically, Gadamer observes that the traditional paths to truth are wrong-headed and run counter to the reality that being and interpretive understanding are intertwined. In the traditional paths to truth, truth and method are at odds. The methods used in the Western tradition will not get us to truth. These methods are critical interpretation, or traditional hermeneutics, and the Enlightenment focus on reason as the path to truth. Both of these methods have what Gadamer calls a pre-judgment against pre-judgment. That is, they both fail to acknowledge the role of the interpreter in determining truth. Traditional critical interpretation is inadequate because it seeks original intent or original meaning, that is, it holds on to the fiction that the meaning of the text can be found in the original intent of the author or in the words of the text. The Enlightenment focus on reason is an equally inadequate path to truth because it retains the subject/object distinction and thinks the path to truth is through the scientific method, both of which are wrong-headed.

For Gadamer, the word pre-judgment, or Vorurteil, means the same thing as Heidegger’s fore-structure of understanding. Gadamer claims that today’s negative connotation of pre-judgment only developed with the Enlightenment (Schmidt 2006: 100). The original meaning of pre-judgment, according to Gadamer, was neither positive nor negative, but simply a view we hold, either consciously or unconsciously. All understanding necessarily starts with pre-judgments. The pre-judgments of the interpreter, for Gadamer, rather than being a barrier to truth, actually facilitate its generation. The pre-judgments of the interpreter—held as a result of the interpreter’s personal facticity—not only contribute to the generation of the question being raised in the first instance, but, if taken into account on the path to truth, are capable of being critically evaluated and revised, with the result that the quality of the interpretation is improved. Additionally, pre-judgments are either legitimate or illegitimate. Legitimate pre-judgments lead to understanding. Illegitimate pre-judgments do not. One of the goals of Truth and Method is to provide a theoretically sound basis upon which to distinguish between legitimate and illegitimate pre-judgments (Schmidt 2006: 102). Understanding or meaning, for Gadamer, is a function of legitimate pre-judgments.

The model for how understanding actually operates, for Gadamer, is the conversation or dialogue. In an authentic dialogue, says Gadamer, understanding or meaning is something that occurs inside of a tradition, which is just a set of cultural assumptions and beliefs. A tradition is a worldview, or Weltanschauung, a system of intelligibility, a framework of ideas and beliefs through which a given culture experiences and interprets the world. A tradition, in this Gadamerian sense, is the theoretical grandchild of what Ast called a given text’s Grundidee, or one idea that unified it. For Gadamer, a legitimate pre-judgment is a pre-judgment that survives throughout time, eventually becoming a central part of a given culture, a part of its tradition. Understanding or meaning is an event, a happening, the substance of which is a fusion of this narrowly defined concept of tradition and the pre-judgments of the interpreter. In this sense, understanding is not willed by the participants. If it were, the dialogue would not be authentic and understanding or meaning could never be achieved. Instead, the conversation or dialogue wills the path to understanding. The thing itself reveals the truth.

In the course of the dialogue, and as a direct and organic result of the things being discussed by the particular participants of the conversation, a question arises. This question becomes the matter at hand, the topic of the conversation. As the conversation proceeds, the answer will show up as well, and it will be a function of the “fusion of horizons” between the perspectives or pre-judgments of the participants of the conversation (Gadamer 1975). This fusion is understanding/meaning. It is the answer to the question and the closest thing there is to truth. In this way, both the things themselves and the participants of the conversation together generate both the conversation topic (the question) and the answer. Together, the things and the participants of the conversation generate the truth of the matter. Moreover, all of this takes place within a tradition that gives legitimacy and weight to the meaning generated.

It is important, for Gadamer, that the path to truth is phenomenological, that is, we must go to the things themselves, and the path is also hermeneutic in that it appreciates that the pre-judgment against pre-judgment is unavoidable. Every interpreter arrives at a text with what Gadamer calls a given horizon, or conglomeration of pre-judgments, which is analogous to a Heideggerian world or fore-structure of understanding and which has been described as a given schema of intelligibility in which an interpreter finds himself or herself. A Gadamerian horizon is a shared system of social and cultural practices that provides the scope of what shows up as meaningful for an interpreter as well as for how things show up. Picking up on the hermeneutic circle, Gadamer holds that an act of understanding is always interpretive.

Another key element of Gadamerian philosophical hermeneutics is Gadamer’s insistence that interpretation, understanding, or meaning cannot take place outside of practical application. Interpretation is more than mere explication for Gadamer. It is more than mere exegesis. Beyond these things, interpretation of a given text—and it is important that everything is a text—always and necessarily takes place through the lens of present concerns and interests. The interpreter always and necessarily, in other words, comes to the table of the interpretive conversation or dialogue with a present concern that is grounded in a given epistemological or metaphysical horizon in which the interpreter dwells. In this way, for Gadamer, Aristotle got it right that understanding necessarily occurs through practical reasoning, or phronesis. For Gadamer, “[a]pplication does not mean first understanding a given universal in itself and then afterward applying it to a concrete case. It is the very understanding of the universal…itself” (Gadamer 1975). That phronesis is central to Gadamer’s hermeneutics is not disputed (Arthos 2014).

But, even more important than this, for Gadamer, the distance in time between the interpreter and the text is not a barrier to understanding but that which enables it. Temporal distance between text and interpretation is a “positive and productive condition enabling understanding” (Gadamer 1975). When we seek to interpret a text, we are trying to figure out not the author’s original intent but “what the text has to say to us” (Schmidt 2006: 104), and this is a function of the extent to which the author’s original intent and the meaning generated by the contemporary context and the contemporary interpreter agree, that is, the extent to which the horizons of the author and the instant interpreter fuse or blend. (Gadamer specifically discusses legal hermeneutics in Truth and Method. He writes that there are two commonly understood ways of determining meaning in the law. The first is when a judge decides a case. In such a scenario, the judge must necessarily factor the present facts into the decision. The second is the case of the legal historian. In this second scenario, although it may seem as if the task is to discover the meaning of the law by only considering the history of the law, the reality is that it is impossible for the legal historian to understand the law solely in terms of its historical origin to the exclusion of considerations of the continuing effect of the law. In other words, determinations of meaning in the law, as is the case with all determinations of meaning, necessarily and at all times involve practical application.)

Post-Gadamerian philosophical hermeneutics takes many forms but can arguably be said to begin with the work of Emilio Betti. Finding what he saw as an epistemological relativism in the philosophical hermeneutics of Gadamer, Betti returns to the general hermeneutics of Schleiermacher and Dilthey and resists the tide of the ontological turn (Pinton 1972, 1973). Betti was a legal theorist who tried to bring the hermeneutic project back to one of interpretation without reference to the human way of being. Betti believed in and sought objective understanding or objective interpretation, or Auslegung, while at the same time stressing that texts reflected human intentions. Accordingly, he thought it was possible to ascertain the meaning of the text through replicating the original creative process, the train of thought, so to speak, of the text’s author. Betti believed in the autonomy of the text (Bleicher 1980: 58). Objective interpretation was possible, for him, but this objectivity was grounded both in a priori epistemological existence à la Plato’s forms and in historical and cultural coherence (Bleicher 1980: 28-29).

Jürgen Habermas, like Emilio Betti, seeks objective understanding (Habermas 1971), but, unlike Betti and in agreement with Gadamer, Habermas believes that hermeneutics is not and cannot be merely a matter of trying to find the best method of interpretation. Instead, objectivity of interpretation is grounded in something Habermas called communicative action, a sort of Gadamerian dialogue modified by the recognition that power imbalances often distort what passes for collective understanding, and that real consensus—the closest thing available to truth and/or objective understanding—can only be had where that consensus has been generated impartially and in circumstances where agreement has been unconstrained. (Habermas’s communicative action concept is also known as communicative praxis or communicative rationality.) While Gadamer’s philosophical hermeneutics grounded a kind of quasi-objectivity in the authority of tradition, however, Habermas found this approach insufficiently able to guide social liberation and progress. The task of hermeneutics is not merely to deconstruct the process of understanding and/or to somehow ground that understanding in either method à la Betti or tradition à la Gadamer, but to determine rules of ascertaining universal validity in the social sciences en route to social change. In this way, Habermas claims that hermeneutics can and does permit the kind of value judgments of which some critics say hermeneutics is incapable.

Paul Ricoeur was a contemporary philosophical hermeneutist who is known for creating what is often described as a critical hermeneutics. For Ricoeur, meaning and understanding are to be obtained through culture and narrative, as these take place in time. Influenced by Freud, Ricoeur thought all ideology required a critique to uncover repressed and hidden meanings that exist behind surface meanings that pass for truth.  In The Conflict of Interpretations (1974), Ricoeur argued that there were many and various paths to understanding and that each uniquely adds to meaning. Ricoeur’s work has been taken up recently by phenomenologists interested in questions of the nature of paradox (Geniusas 2015).

The work of Jacques Derrida (Derrida 1976, 1978) is more commonly associated with a 20th century movement in French philosophy known as deconstruction than with philosophical hermeneutics per se. However, there are important similarities between the two movements. First, deconstruction on its own terms, like hermeneutics, is not a method. Instead, deconstruction is a critique of authoritative systems of intelligibility or meaning that exposes the hierarchies of power within those systems. In understanding itself as outside of existing theoretical schemas, in other words, Derrida’s deconstruction is within the hermeneutic tradition. Second, deconstruction is based on Heidegger’s concept of Destruktion, a central concept in his hermeneutic ontology. But, while Heidegger’s Destruktion, a project of critiquing authoritative systems of meaning that are based on structures of foundationalist metaphysics or epistemology, concludes that every act of understanding is an act of interpretation (Heidegger, 2008/1962), Derrida’s deconstruction involves showing that language, or text, contains conceptual oppositions that involve the prioritizing of one side of a given conceptual opposition over the other, for example, writing over speech. Still, Derrida’s deconstruction is clearly in the hermeneutic tradition in that it is designed to highlight the elliptical and enigmatic nature of language and meaning. This is particularly evident in Derrida’s concept of différance, according to which every word in a given language implicates other words, which implicate other words, in a process of infinite reference and therefore what Derrida calls absence, meaning an absence of definitive meaning.

Susan-Judith Hoffman argues that Gadamerian hermeneutics furthers feminist objectives and can be understood as a form of feminist theorizing. Highlighting Gadamer’s account of the importance of difference, his notion of understanding as an inclusive dialogue, his account of pre-judgments as conditions for understanding that must always remain provisional, his account of tradition as that which is transformed by our reflection, and his account of language (Hoffman, 2003: 103), Hoffman argues that Gadamer’s philosophical hermeneutics is in line with feminist theorizing in that it “overthrows the false universalism of the natural sciences as the privileged model of human understanding” (Ibid.: 81).  In the process, Gadamer’s hermeneutics amounts to feminist theorizing in two important ways. First, it contains a sensitivity to the historical and cultural situation of knowledge and knowledge seekers, and second, it contains the critical power to challenge reductive universalizing tendencies in traditional canons of thought (Ibid.: 82).

Linda Martín Alcoff also sees value for feminist theory in Gadamer’s hermeneutics. For Alcoff, Gadamer’s “openness to alterity,” his “move from knowledge to understanding,” and his “holism in justification and immanent realism” all align themselves with feminist theorizing (Alcoff 2003: 256). That Gadamer’s philosophical hermeneutics contained these elements was insisted upon by Gadamer himself, who saw his philosophical hermeneutics as a critique of the Enlightenment view that truth could be had through abstract reasoning, divorced from historical considerations, as well as a call for the acknowledgment that the path to truth was through the particular rather than through the universal (Gadamer 1975). Miranda Fricker has recently developed hermeneutical themes into what she calls hermeneutical injustice, according to which an injustice is done when the collective hermeneutical resources available to a given individual or group are inadequate for expressing one or more important areas of their experience (Fricker 2007).

The work of Donatella di Cesare, Günter Figal, and James Risser is at the forefront of contemporary hermeneutics. For Cesare, the ground of hermeneutics is in Heideggerian existentialism, but this does not mean that hermeneutics is a kind of nihilism. Instead, hermeneutics, or the philosophy of understanding, is aimed at consensus; it is a constructive enterprise (Cesare 2005). For Figal, hermeneutics is most fundamentally a critique of objectivity and a call to understand things previously understood as objective elements of human life (Figal 2010). Risser has been interpreted as attempting to advance beyond Gadamer’s philosophical hermeneutics by acknowledging the radical finitude at stake in the phenomenon of tradition (George 2014).

2. Legal Hermeneutics and Mainstream Philosophy of Law

Within mainstream philosophy of law, legal hermeneutics is most closely aligned with legal interpretivism. Legal interpretivism is conceptually positioned between the two main subfields of philosophy of law: legal positivism and natural law theory. While mainstream philosophy of law has many faces, and includes, among other theories, legal realism, legal formalism, legal pragmatism, and legal process theory, legal positivism and natural law theory form the theoretical poles between which each of the mainstream theories can be understood to lie. Legal positivism is the view, in broad strokes, that there is no necessary connection between law and morality and that law owes neither its legitimacy nor its authority to moral considerations (Feinberg and Coleman 2008; Patterson 2003). The validity of law, for the legal positivist, is determined not by its moral content but by certain social facts (Hart 1958, 1961; Dickson 2001; Coleman 2001; Gardner 2001). Natural law theory is grounded in the work of two main thinkers: John Finnis and Lon Fuller. For Finnis, an unjust law has no authority (Finnis 1969; 1980; 1991), and for Fuller, an immoral law is no law at all (Fuller 1958). Natural law theory, generally speaking, then, is the view that there is a necessary connection between law and morality and that an immoral law is not a law (Raz 2009; Simmonds 2007; Murphy 2006).

Sometimes called a third, main theory of law, legal interpretivism, developed by Ronald Dworkin, is the view that the law is essentially interpretive in nature and that it gains its authority and legitimacy from legal principles. Dworkin understands these principles to be neither bare rules nor moral tenets, but a set of guidelines to interpretation that are generated from legal practice (Dworkin 2011, 1996, 1986, 1985, 1983). Some describe legal interpretivism as a hybrid between legal positivism and natural law theory for the reason that Dworkin’s principles seem to qualify both as rules and to have a kind of normative quality that is similar to moral tenets (Hiley et al. eds. 1991; Brink 2001; Burley 2004; Greenberg 2004). But, what distinguishes legal interpretivism from both legal positivism and natural law theory is its insistence that legal meaning is tempered by the legal tradition within which it operates (Greenberg 2004; Hershovitz 2006). For the legal interpretivist, in other words, the line between legal positivism and natural law theory is not clearly drawn. Instead, rules and normative guidelines together shape and form both what the law is and what it means. This approach to legal ontology and meaning is known as the interpretive turn in analytic jurisprudence (West 2000; Feldman 1991).

Arguably, however, what legal positivism, natural law theory, and legal interpretivism all have in common is epistemological and metaphysical foundationalism. While, for the legal positivist, the answer to both the question of what the law is and the question of what the law means can be found in rules (Hart 1958), for the natural law theorist, the answer to both questions can be found in morality (Fuller 1958).  Similarly, for the legal interpretivist, the answer to both questions is found in legal principles.  In other words, for the legal interpretivist, law gains its legitimacy and authority from principles emanating from legal practice. Although law is interpretive in nature, on this view, the interpretative process stops at the point at which a judgment call has to be made as to what the/a law means, preferably by someone well-versed in the relevant legal tradition. Once that judgment call is made, we have our answer. Meaning has been determined.

The legal hermeneutical approach is similar but different in the important respect that no meaning determination is ever understood to be fixed. As is the case for the legal interpretivist, for the legal hermeneutist, law is interpretive in nature, but at no point can any meaning determination rise to the level of definitive. Things like Dworkinian principles are acknowledged and considered, along with myriad other factors relevant to good interpretive practice, but the story of the meaning of the law, for the legal hermeneutist, most certainly cannot end at any point. There is no foundation to the law, for the legal hermeneutist, and there can be none. Instead, there can only be better or worse interpretations, measured comparatively and by the quality of the interpretive practices used to generate the various interpretations. More importantly, however, for the legal hermeneutist, objective interpretation simply is not and cannot be the project. Instead, the search for legal meaning is a critical project. The search for legal meaning involves critical engagement with previous interpretations and the current interpretation and includes critical analysis of the conditions for the possibility for both.

Legal hermeneutics, then, while similar to legal interpretivism in many respects, provides an alternative to the three main theories of law in that its approach to legal meaning can be understood to avoid engagement with the question of foundationalism that is characteristic of the traditional approaches. Rather than offering a new theory of law, legal hermeneutics “provides us with the necessary protocols for determining meaning” (Douzinas, Warrington, and McVeigh 1992: 30). Legal hermeneutics provides no specific theory of law and privileges no particular methodology or ideology. Instead, legal hermeneutics directs the interpreter of legal texts first and foremost to the fact that every act of understanding a law is an act of interpretation, and at the same time, highlights that better interpretation takes conscious and proactive account of what philosophical hermeneutics, as described above, reveals as the necessary structures and components of the interpretive process. Some might describe this feature of legal hermeneutics as taking the determinacy of meaning to be context-dependent and open-ended. While this account is on track, another key feature of legal hermeneutics is that it is a descriptive rather than a normative project. Legal hermeneutics, then, is more a way of clarifying the nature of how legal interpretation actually works than a theory of how legal interpretation ought to work. In this way, legal hermeneutics can be understood to provide the tools with which to investigate, clarify, and help solve what appear from other perspectives to be insoluble legal problems, particularly problems based in conflicts of interpretation.

3. Legal Hermeneutics and Alternative Theories of Law

Legal hermeneutics shares an antifoundationalist sensibility with many alternative theories of law, including the critical legal studies movement, Marxist legal theory, deconstructionist legal theory, postmodernist legal theory, outsider jurisprudence, and the law and literature movement. For each of these theories of law, the goal of locating law’s ultimate legitimacy, authority, or meaning anywhere at all is understood as an exercise in futility. Some characterize this feature of these theories as the failure of complete determinacy as a semantic thesis, rather than as a failure of ultimate justification as an epistemological thesis. However, for others, this distinction is not meaningful and fails to adequately account for the radical rejection of the entire project of justification inherent in alternative theories.

The critical legal studies movement was an intellectual movement of the late 1970s and early 1980s that stood for the proposition that there is radical indeterminacy in the law. Conceptually based in the critical theory of the Frankfurt School, critical legal studies holds that legal doctrine is an empty shell. There is no such thing as the law, for the critical legal theorist, insofar as the law is understood as an entity that exists outside of any context (Binder 1996/1999: 282). Instead, law is produced by power differentials that have their origins in differences in levels of property ownership. The liberal ideal of the rule of law devoid of influence from power differentials, contained in all analytic approaches to jurisprudence, is an illusion. For this reason, law is inherently self-contradictory and self-defeating and can never be a mere formality, as liberal theory and analytic jurisprudence would have us believe. This way of understanding the law is known as the indeterminacy thesis. For some, this does not necessarily mean that law is indeterminable; rather, it means that determinability is context-dependent. Others do not find this distinction meaningful.

Marxist legal theory begins with the work of Evgeny Pashukanis and continues in contemporary form in the work of Alan Hunt, among others. For Pashukanis, law was inextricably linked to capitalism and hopelessly bourgeois. Outside of capitalism, things like legal rights are unnecessary, since outside of capitalism there are no conflicting interests or rights to be adjudicated or fought over. In the socialist society that Pashukanis envisions on the other side of capitalism, what would take the place of law and all talk of individual rights would be a sort of quasi-utilitarianism that values collective satisfaction over the perceived need to protect the individual interests of individual legal subjects (Pashukanis 1924). What contemporary Marxist legal theory retains from Pashukanis is the view that law is inescapably political, merely one form of politics. In this way, law is always potentially coercive and expressive of prevailing economic relations, and the content of law always manifests the interests of the dominant class (Hunt 1996, 1999: 355). So described, the content of law, for Marxist legal theorists, has no theoretical or practical basis in anything epistemologically foundational or universal.

Deconstructionist legal theories can be considered post-structuralist like critical legal studies but are unique in that they center around conceptual oppositions or binary concepts, also known as binaries. According to the deconstructionist approach, within a given conceptual opposition, one term in the opposition has been traditionally privileged over the other in a particular context, or text. A text can be a written text, an argument, a historical tradition, or a social practice. Jacques Derrida, considered the forerunner of deconstruction as a philosophy of language and meaning, famously identified a conceptual opposition between speech and writing, for example, with speech having traditionally been the privileged term (Derrida 1976). Privileged, in deconstruction, means truer, more valuable, more important, or more universal than the opposing term (Balkin 1996, 1999: 368). According to deconstructionist theories of law, legal distinctions are often masked conceptual oppositions that privilege one term over another. For example, individualism is privileged over altruism, and universalizability is privileged over the attention to the particular that is an inherent part of equitable distribution. These binary concepts and the privileging of one term in each binary lend an instability to the law, on deconstructionist terms, that is decidedly anti-foundationalist. J.M. Balkin, for example, argues that the true nature of the legal subject is ignored and obliterated by conventional legal theory (Balkin 2010; 1993). Balkin argues that when an attempt is made to understand a law, we bring our subjective experiences to bear on that attempted understanding (Ibid.). For Balkin, mainstream philosophy of law’s failure to acknowledge that this is the case is its very deep and abiding flaw.

Postmodernist legal theories are grounded in a 20th century movement in aesthetic and intellectual thought, which departed from interpretation based in universal truths, essences, and foundations. Postmodern legal theory departs from a belief in the rule of law, or any generalized or universalizable Grand Theory of Jurisprudence, in favor of using “local, small-scale problem-solving strategies to raise new questions about the relation of law, politics, and culture” (Minda 1995: 3). Other than this statement, it is difficult to describe postmodernist legal theory in any general way, since the entire point of postmodernist legal theory is that generalized theories are vacuous, even impossible. Instead, there are only individual theories, individual authors of theories, and individual texts/laws. It is fair to say, however, that postmodern legal theorists generally resist the sort of conceptual theorization routinely practiced by more mainstream legal academics and analytic philosophers for the reason that more mainstream approaches unduly emphasize abstract theory at the expense of pragmatic concerns (Ibid.). The postmodern rejection of ultimate theories can be construed as a form of antifoundationalism.

Outsider jurisprudence is an area of legal theorizing that is highly skeptical of the ability of mainstream legal theory to address the needs of members of historically marginalized groups. Although there has been a proliferation of kinds of outsider jurisprudence in the early 21st century, including LatCrit and QueerCrit (Mahmud 2014; Valdes 2003; Eskridge 1994), there are two main kinds of outsider jurisprudence: critical race theory and feminist jurisprudence (Parks 2008; Jones 2002; Delgado 2012; Levit and Verchick 2006). Critical race theorists are concerned with the particularized experiences of African Americans in American jurisprudence. They share with the postmodernists a rejection of the idea of the existence of one grand and universally applicable theory of law that applies equally to everyone: “There is a hidden category of persons to whom the laws do not equally and universally apply, for the critical race theorists, and that category of persons is African Americans” (Minda 1995: 167). Key themes in critical race theory include a call for contextualized theorizing about the law that acknowledges that the lives and experiences of African Americans in America have a juridical tenor very different from those of other Americans; a critique of political liberalism, which bases its apportionment of rights on the fiction that African Americans as a group have the same degree of access to rights in American society as other Americans; and a call for juridical acknowledgment of the persistence of racism in American society (Ladson-Billings 2011; Whyte 2005; Delgado 1995: xv).

Feminist jurisprudence “[goes] beyond rules and precedents to explore the deeper structures of the law” (Chamallas 2003: xix). It operates under the belief that gender is a significant factor in American life and explores the ways in which gender, and related power dynamics between men and women throughout American legal history, have affected how American law has developed (Ibid.). Feminist jurisprudence concerns itself with legal issues of particular significance to women, such as sexual harassment, domestic violence, and pay equity. It also approaches legal theory in a way that comports with many women’s lived experiences, that is, without pretending, as mainstream jurisprudence tends to do, that gender is irrelevant to the outcome of legal disputes (Ibid.). Of primary concern to feminist legal scholars is the systemic nature of women’s inequality and the pervasiveness of female subordination through law in America. The methodology of feminist jurisprudence is the excavation and examination of hidden legalized mechanisms of discrimination to uncover hierarchies in law that operate to the detriment of the ideal of equal rights for women (Ibid.). The feminist legal scholar’s identification of hidden power dynamics at work in American law can be construed as yet another antifoundationalist perspective on law.

A recent development in outsider jurisprudence is intersectionality theory, or the idea that oppression takes place across multiple, intersecting systems, or axes, of oppression (Cho, Crenshaw, and McCall 2013; MacKinnon 2013; Walby 2007). Intersectionality theory is grounded in the thought of Kimberlé Crenshaw (Crenshaw 1989, 1991) and reinforces the idea from critical race theory and feminist jurisprudence that law operates differently on the bodies of the oppressed. For Crenshaw, race and gender discrimination combine on the bodies of black women in a way that neither race discrimination nor gender discrimination alone can capture or address. Crenshaw’s point is that ignoring race when taking up gender reinforces the oppression of people of color, and anti-racist perspectives that ignore patriarchy reinforce the oppression of women (Crenshaw 1991, 1252). But, more specifically, taking up any form of oppression in a vacuum ignores the way that oppression actually works in the lives of the oppressed. For the law to help combat oppression, it must grapple with the complexities and nuances of lived experience.

Containing very similar themes to legal hermeneutics is what is known as the law and literature movement (Fish 1999; Rorty 2007, 2000, 1998, 1991, 1979; Bruns 1992; Fiss 1982). The law and literature movement, like certain forms of legal hermeneutics, is heavily influenced by the deconstructionist philosophy of Jacques Derrida (Derrida 1990, 1992). The literary legal theorist has developed an appreciation for the costs of excluding certain types of questions from the process of ascertaining meaning in the law (Levinson and Mailloux 1988: xi). Moreover, there is an active attempt on the part of the literary legal theorist to dismantle or undo the conventional illusion that the structures that support claims to authentic, legitimate, or official meaning are built on solid ground. The role of the interpreter is also highlighted in these approaches, as is the inextricability of determinations of meaning from the power dynamics in which they take place (Thorsteinsson 2015; Surette 1995).

4. Legal Hermeneutics in Jurisprudence Proper

Legal hermeneutics in jurisprudence proper, that is, in legal theory, can be traced back to the publication of Francis Lieber’s 19th century work, Legal and Political Hermeneutics (Lieber 2010/1880). There, Lieber tried to identify principles of legal interpretation that would bring consistency and objectivity to the interpretation of the U.S. Constitution, and at the same time exposed strict intentionalist interpretive methods—defined as those in which the so-called intent of the Framers had interpretive authority—as incoherent (Binder and Weisberg 2000: 48). More than 125 years after Lieber’s landmark text, contemporary legal hermeneutics is still trying to strike that balance between interpretive objectivity and the incoherence of strict intentionalism. Contemporary legal hermeneutics retains Lieber’s goal of objectivity of interpretation and his attention to the roles of history, temporality, politics, and socio-historical context in credible meaning assessments. The central question of legal hermeneutics in constitutional theory is: What sorts of interpretive methods should we use to come up with an interpretation of the constitution that approaches objectivity despite the fact that, owing to certain realities about how the interpretive process works, it is impossible for us to ascertain the intent of the Framers?

Another question at the core of legal hermeneutics, however, is: Even if we could ascertain the intent of the Framers, which all legal hermeneutists think is impossible, why would we want to do so, given the nature of what a constitution is—a living, breathing text designed to govern real people in real life contexts—and the fact that legal hermeneutical principles based in philosophical hermeneutics dictate that the particular time and place, that is, the context, of a given application of a given law significantly influences, and should influence, the content of the interpretation? This is an example of the hermeneutic circle at work in legal interpretation. That is, from the vantage point of legal hermeneutics, what the constitution means in a particular instance is importantly influenced by the context in which the interpretation is taking place (the application), and that context of application is, in turn, importantly influenced by what the constitution means in that same context.

The primary focus of contemporary legal hermeneutics is the debate in constitutional theory between the interpretive methods of originalism and non-originalism.  Originalism is the view, generally, that the meaning of the constitution is to be found by determining the original intent of the Framers, understood to be most prudently found in the text of the constitution itself (Scalia and Garner 2012; Calabresi 2007; Monaghan 2004). By contrast, non-originalism is the view, generally, that the constitution is a living, breathing document meant more as a set of guidelines for future lawmakers than as a strict rulebook demanding literal compliance (Cross 2013; Balkin 2011; Goodwin 2010).  For clarification purposes, it should be noted that the divide between originalism and non-originalism is akin to the divide between epistemological foundationalism and antifoundationalism.

Within the debate between originalists and non-originalists, clearly all legal hermeneutists are necessarily non-originalists, since by the basic tenets of legal hermeneutics, original intent cannot be ascertained. But, what separates the legal hermeneutist from the average non-originalist is a high degree of respect for the text of the constitution as an interpretive starting point, together with a call to heightened self-reflexivity regarding the degree to which one’s own pre-judgments, and the pre-judgments of previous interpreters, may be affecting the interpretive process. By the same token, just as the legal interpretivist is constrained by the principles of legal practice in the interpretive process, the legal hermeneutist is similarly constrained by the spirit of the text. Finally, while the goal of the average non-originalist is a definitive interpretation of the text, however at odds with the original intent of the Framers, the legal hermeneutist has the more modest goal of deconstructing the mosaic of considerations that went into previous interpretations in an effort to examine each tile of the mosaic, one by one, more in the service of understanding the text/law within a given context than in the service of producing anything on the order of a definitive interpretation for posterity.

Another way of thinking about legal hermeneutics, however, is to see it as neither originalist nor non-originalist, but orthogonal to the originalist/non-originalist continuum. In other words, it is consistent with the themes of legal hermeneutics that it rejects the originalist/non-originalist continuum itself as wrong-headed and unproductive. Indeed, legal hermeneutics rejects interpretive method altogether in favor of a call to an increased level of self-reflexivity on the part of the interpreter, a call that is meant to actively and consciously engage the interpreter in the interpretive process in a way that neither originalism nor non-originalism demands.

On the contemporary scene, George Taylor’s work in legal hermeneutics follows Ricoeur’s in philosophical hermeneutics. In his “Hermeneutics and Critique in Legal Practice,” Taylor argues that Ricoeur’s approach to hermeneutics gets it right when it attempts to mediate the difference between understanding and explanation (Taylor 2000: 1101 et seq.).  Understanding, on this view, is obtained through hermeneutic methods, but explanation is obtained through science.  Ricoeur, according to Taylor, sees the interpretive enterprise as containing both elements. The way Taylor sees it, Ricoeur’s emphasis on the narrative nature of meaning acknowledges the roles of both understanding and explanation in a successful interpretation (Taylor 2000: 1123). The usefulness of legal hermeneutics, for Taylor, is that it correctly identifies and brings to the forefront that there is explanation or fact in understanding or interpretation, and there is understanding or interpretation in explanation or fact, shedding a kind of glaring light on all understandings that might deny this reality. The goals of originalism, on this view, are simply impossible to reach.

Francis J. Mootz, III agrees with Taylor about the impossibility of ascertaining original meaning (Mootz 1994). Accordingly, instead of engaging in what he understands as the necessarily fruitless exercise of attempting to ascertain original meaning, Mootz argues, we should instead attempt to find the interpretation that “allows the text to be most fully realized in the present situation” (Mootz 1988: 605).

Georgia Warnke describes the interpretive turn in the study of justice as an abandonment of the attempt to discern universally valid principles of justice in favor of attempts to “articulate those principles of justice that are suitable for a particular culture and society” in light of that society’s culture and traditions, “the meanings of its social goods,” and its public values (Warnke 1993: 158). We would then appeal to hermeneutical standards of coherence to reject interpretations that fail to respect that culture or those traditions, or meanings (Ibid.). Such an approach, for Warnke, “[shifts] the emphasis from a conflict between two opposing rights…to a conflict between two interpretations of…actions and practices that are consonant with [a given culture’s] traditions and self-understandings” (Ibid.: 162).

For Gregory Leyh, legal hermeneutics reveals to us the political nature of every act of constitutional interpretation. This includes both originalist approaches to constitutional interpretation as well as non-originalist approaches. However, for Leyh, legal hermeneutics also provides us with some constructive lessons for improving the quality of our necessarily political acts of interpretation. Specifically, in “Toward a Constitutional Hermeneutics,” Leyh makes the case for a legal hermeneutics based in the philosophical hermeneutics of Hans-Georg Gadamer (Gadamer 1975) in which, as the self-understanding of the interpreters of legal texts is increased, the quality of the interpretation produced by those interpreters is increased (Leyh 1988). This self-understanding would include primarily an explicit acknowledgment of the role that history plays in the development of both understanding and meaning (Ibid.: 370), an explicit acknowledgment of the “irreducible conditions of all human knowing” (Ibid.: 371), and attentiveness to the kinds of issues characteristically associated with the interpretation of all texts, including legal texts (Ibid.). For Leyh, a call to the constitution’s original meaning, à la a standard originalist approach, for example, entails certain assumptions about historical understanding, for example, that it is fixed and identifiable by subsequent interpreters, which legal hermeneutics exposes as impossible. What constitutional theorists need, for Leyh, is not greater insight into the intent of the framers, for this is not obtainable, but deeper reflection on the issue of the conditions that make historical knowledge possible at all (Ibid.: 372). For Leyh, legal hermeneutics “sets for itself an ontological task, namely, that of identifying the ineluctable relationships between text and reader, past and present, that allow for understanding to occur in the first place” (Ibid.).

There are two key aspects to Leyh’s legal hermeneutics: (1) an appreciation for the role of language in understanding, which sharpens our awareness of the “historical structures constitutive of all knowledge,” and (2) a recognition of the “enabling character of our prejudgments and preconceptions as windows to the past” (Ibid.: 372). Taking these things into consideration, it is impossible, according to Leyh, for us to obtain an understanding of historical texts like the constitution without going through the language we use today and our present-day prejudgments and preconceptions, or what Hans-Georg Gadamer called our pre-judgments. For Leyh, all reason is historical, and there is a historicity to all inquiry (Ibid.: 375). “No text simply sits before us and announces its meaning,” Leyh writes (1988: 375).

Rather than understanding the historicity of all inquiry as an impediment between the contemporary interpreter and the text, however, Leyh suggests that this information should aid us in recognizing that reason “finds its expression only as it is applied concretely” (Ibid.). In other words, interpretation is always practical: it always occurs in a particular set of circumstances at a particular time and place, and it applies itself to a particular set of facts. An acknowledgement of this reality on the part of the interpreter, for Leyh, adds a level of awareness vis-à-vis the interpretive process that can only aid in making sound judgments of constitutional interpretation.

David Hoy’s take on legal hermeneutics involves a focused critique of the intentionalist position in constitutional theory, according to which the so-called intent of the framers is the ultimate authority on constitutional meaning (Hoy 1992). For Hoy, while the intentionalist believes that no interpretation is needed to locate the intent of the framers, the hermeneutist understands that the concept of intended meaning presupposes a prior understanding of meaning in a different sense of the word. The concept of an ambiguous sentence highlights this prior understanding of meaning. A given sentence can have two different meanings in this prior sense, Hoy explains, whether either or both of them were intended or not (Hoy 1992: 175). The hermeneutist acknowledges, in other words, according to Hoy, a difference between sentence meaning and speaker’s meaning. However, while the intentionalist incorrectly presumes that there are only two possible bases for a theory of meaning—intention and convention (Ibid.)—the hermeneutist understands that there can be no fact of the matter vis-à-vis sentence meaning. Hoy writes, “[Hermeneutics] acknowledges semantic complexity. It does not exclude questions about intention when these are relevant to interpretation, but it believes that since textual meaning is not reducible to intended meaning, there are many other kinds of questions that can be asked about texts” (Hoy 1992: 178).

At the same time, Hoy’s hermeneutics stands for the proposition that the traditional way law is practiced operates as a constraint on judicial discretion. It provides a schema of intelligibility in which a judge must necessarily decide a case. As Hoy indicates, using discretion to decide what the law means within the tradition of the practice of law is what judges do all the time. “Only when the judges know that the law entails one decision and they nevertheless decide something else could they be said to be rewriting,” writes Hoy (1992: 183), and the hermeneutic claim is that this is almost never the case (see also Hoy 1985; 1987). If Hoy is right, then, as Leyh points out as well, there is no act of judicial understanding that takes place without interpretation. Such a possibility is an illusion. Instead, all acts of understanding are acts of interpretation, including originalist and/or intentionalist acts of understanding.

In the early 21st century, John T. Valauri argued that the new questions for legal hermeneutics are different from the ones of the late 1980s and early 1990s (Valauri 2010). For Valauri, the continuing significance of hermeneutics for legal theory is to help us sort through the varieties of originalism that compete for our allegiance in the aftermath of what he sees as a kind of unanimous consent to originalism’s legitimacy. In other words, for Valauri, the hermeneutical question is no longer whether originalism is valuable, but what kind of originalism is valuable (Ibid.). The remaining questions that need to be answered to help us sort through the varieties of originalism, for Valauri, are (1) whether the various forms of originalism share a common conception of understanding and interpretation, and (2) whether hermeneutics is a descriptive or normative practice. To address these questions, says Valauri, we need to “[recover]…the fundamental hermeneutical problem” (Gadamer 1975), which means focusing on three key hermeneutical paradigms: (1) the process of application, (2) Aristotle’s practical wisdom, and (3) a focus on the “Aristotelian face” of hermeneutics over the Heideggerian one (Valauri 2010). The significance of paradigms (1) and (2) is self-explanatory and common to all forms of hermeneutics, legal and otherwise. By paradigm (3), Valauri hopes to recover legal hermeneutics from its Heideggerian, full-scale rejection of method that many mainstream legal theorists find so unpalatable.

Drawing themes and seeking overlap between the various contemporary legal hermeneutists, a legal hermeneutical approach to constitutional theory can be understood as a call to the interpreter of the constitution to take into conscious consideration the following factors when engaged in constitutional interpretation: (1) the identity of the interpreter, of previous interpreters, and of the original author, (2) the sociohistorical context in which the text was written and in which the interpretation is taking place, (3) the political climate at the time the text was written and in which the interpretation is taking place, (4) the extent to which the meaning of words and concepts relevant to the interpretation has changed or not changed over time, (5) the particularity of experience of those affected by a given law, (6) the extent to which that experience is acknowledged or unacknowledged by previous interpretations, (7) the relationship between who the interpreter is, who the interpreter takes herself to be, and the kinds of interpretive choices the interpreter makes, (8) the necessary truth that original meaning is an illusion and cannot be ascertained, and (9) the extent to which one’s own pre-judgments enter into one’s attempt at ascertaining meaning. This final aspect adds a level of self-reflexivity to the interpretive enterprise that is understood to significantly improve the quality of the interpretation. In other words, from the vantage point of legal hermeneutics, the more that assumptions customarily unacknowledged in mainstream legal theory are excavated and examined, the greater the degree of legitimacy of the interpretation.

5. Conclusion

Legal hermeneutics is an approach to legal texts that understands that the legal text is always historically embedded and contextually informed, so that it is impossible to understand the law simply as a product of reason and argument. Instead, meaning in the law takes place according to practical, material, and context-dependent factors such as power, social relations, and other contingent considerations. As Gerald Bruns has put it:

Legal hermeneutics is what occurs in the give-and-take—the dialogue—between meaning and history. The historicality of the law means that its meaning is always supplemented whenever the law is understood. This understanding is always situated, always an answer to some unique question that needs deciding, and so is different from the understanding of the law in its original meaning, say, the understanding a legal historian would have in figuring the law in terms of the situation in which it was originally handed down. The historicality of the law means that its meaning is always supplemented whenever it is understood or interpreted. Supplementation always takes the form of self-understanding; that is, it is generated by the way we understand ourselves—how we see and judge ourselves—in light of the law. But, this self-understanding throws its light on the law in turn, allowing us to grasp the original meaning of the law in a new way. The present gives the past its point. (Bruns 1992)

This seems to mean, at a minimum, that every Supreme Court decision is an interpretation, which directly undermines all originalist approaches to constitutional theory.

The claim that every Supreme Court decision is an act of interpretation, however, is not a claim about the indeterminacy of meaning itself but a more modest claim about the impossibility of ascertaining original meaning. The difference between these two positions is subtle but important. While for the non-originalist, the possibility of authoritative meaning is an illusion, for the legal hermeneutist more and less authoritative meanings are possible and are a function of the interpreter’s taking conscious account of several key factors that inform and shape the interpretive process. Taking conscious account of each of these factors when attempting to interpret a given legal text lends to the interpretative process a sort of legitimacy and authority, the possibility of which most non-originalist positions deny.

Legal hermeneutics, then, can be understood as an anti-method in constitutional theory. As Gregory Leyh has observed, “[H]ermeneutics neither supplies a method for correctly reading texts nor underwrites an authoritative interpretation of any given text, legal or otherwise” (Leyh 1992: xvii). Instead, “the activity of questioning and adopting a suspicious attitude toward authority is at the heart of hermeneutical discourse. Hermeneutics involves confronting the aporias that face us, and it attempts to undermine, at least in partial ways, the calm assurances transmitted by the received views and legal orthodoxies” (Ibid.). Arguably, any approach to legal hermeneutics that rejects its distinctively critical enterprise, then, misses the point of legal hermeneutics entirely. As an approach to legal interpretation, it is necessarily, following Heidegger and Gadamer, a complete rejection of the gods of both truth and method in favor of a call to the interpreter of laws to cast an incisive and self-reflexive gaze on all that is called mainstream legal orthodoxy.

6. References and Further Reading

a. References

  • Alcoff, Linda Martín (2003) “Gadamer’s Feminist Epistemology” in L. Code (ed.) Feminist Interpretations of Hans-Georg Gadamer, University Park: Penn State University Press.
  • Aquinas, Thomas (1998) On Law, Morality, and Politics. Ed. William P. Baumgarth and Richard J. Regan. Indianapolis: Hackett Publishing Co.
  • Aristotle (350 B.C.E./ 2000) On Interpretation. Trans. E.M. Edghill. Adelaide: University of Adelaide Library.
  • Arthos, John (2014) “What is Phronesis? Seven Hermeneutic Differences in Gadamer and Ricoeur,” Philosophy Today, 58(1): 53-66.
  • Ast, F. (1808) Grundlinien der Grammatik, Hermeneutik und Kritik. Landshut, Ger.: Jos. Thomann.
  • Balkin, J.M. (2011) Living Originalism. Cambridge: Belknap Press of Harvard University Press.
  • Balkin, J.M. (2010) “Deconstruction” in A Companion to Philosophy of Law and Legal Theory (Second Edition), Dennis Patterson, ed., Malden: Wiley-Blackwell.
  • Balkin, J.M. (1999, 1996) “Deconstruction” in A Companion to Philosophy of Law and Legal Theory. Ed. Dennis Patterson, Malden: Blackwell Publishing, 367-374.
  • Balkin, J.M. (1993) “Understanding Legal Understanding: The Legal Subject and the Problem of Legal Coherence,” 103 Yale Law Journal 105.
  • Binder, Guyora (1996/1999) “Critical Legal Studies,” in A Companion to Philosophy of Law and Legal Theory. Ed. Dennis Patterson, Malden: Blackwell Publishing, 280-290.
  • Bleicher, J. (1980) Contemporary Hermeneutics: Hermeneutics as Method, Philosophy and Critique. London, Boston: Routledge & Kegan Paul.
  • Brink, David (2001) “Legal Interpretation and Morality,” in B. Leiter (ed.), Objectivity in Law and Morals. Cambridge: Cambridge University Press.
  • Bruns, Gerald L. (1992) “Law and Language: A Hermeneutics of the Legal Text,” in Legal Hermeneutics: History, Theory, and Practice, ed. Gregory Leyh, Berkeley: University of California Press, 23-40.
  • Burley, Justine (ed.) (2004) Dworkin and His Critics: With Replies by Dworkin. Oxford: Blackwell.
  • Calabresi, Steven (2007) Originalism: A Quarter-Century of Debate. Washington, D.C.: Regnery Pub.
  • (Di) Cesare, Donatella (2005) “Reinterpreting Hermeneutics,” Philosophy Today, 49(4): 325-332.
  • Chamallas, Martha (2003) Introduction to Feminist Legal Theory. 2nd Ed. New York: Aspen Publishers.
  • Cho, S., K.W. Crenshaw and L. McCall (2013) “Toward a Field of Intersectionality Studies: Theory, Applications, and Praxis,” Signs, 38(4): 785-810.
  • Coleman, Jules (2001) The Practice of Principle. Oxford: Clarendon Press.
  • Crenshaw, K.W. (1991) “Mapping the Margins: Intersectionality, Identity Politics, and Violence against Women of Color,” Stanford Law Review, 43(6): 1241-99.
  • Crenshaw, K.W. (1989) “Demarginalizing the Intersection of Race and Sex: A Black Feminist Critique of Antidiscrimination Doctrine, Feminist Theory and Antiracist Politics,” University of Chicago Legal Forum, 140: 139-67.
  • Cross, Frank B. (2013) The Failed Promise of Originalism. Stanford: Stanford Law Books, an imprint of Stanford University Press.
  • Delgado, Richard (ed.) (1995) Critical Race Theory: The Cutting Edge. Philadelphia: Temple University Press.
  • Derrida, Jacques (1976) Of Grammatology. Baltimore: Johns Hopkins University Press.
  • Derrida, Jacques (1978) Writing and Difference. Trans. A. Bass. London: Routledge & Kegan Paul.
  • Derrida, Jacques (1990) “Force of Law: ‘The Mystical Foundation of Authority.’” Cardozo Law Review, 97, 1, 276.
  • Derrida, Jacques (1992) “Before the Law” in Acts of Literature. Ed. Derek Attridge. New York and London: Routledge, 181-220.
  • Dickson, Julie (2001) Evaluation and Legal Theory. Oxford: Hart Publishing.
  • Dilthey, Wilhelm (1883) Introduction to the Human Sciences. In Makkreel, Rudolf A. and Frithjof Rodi, eds. 1989. Selected Works: Volume I: Introduction to the Human Sciences. Princeton: Princeton University Press.
  • Dostal, Robert J. (ed.) (2002) The Cambridge Companion to Gadamer. Cambridge: Cambridge University Press.
  • Douzinas, Costas, Ronnie Warrington and Shaun McVeigh (1991) Postmodern Jurisprudence: The Law of Text in the Texts of Law. London, New York: Routledge.
  • Dreyfus, H.L. and M.A. Wrathall (eds.) (2005) A Companion to Heidegger. Oxford: Blackwell.
  • Dreyfus, H.L. and M.A. Wrathall (eds.) (2002) Heidegger Reexamined (4 Volumes). London: Routledge.
  • Dworkin, Ronald (1983) “My Reply to Stanley Fish (and Walter Benn Michaels): Please Don’t Talk about Objectivity Any More,” in Mitchell, W.J.T., ed., The Politics of Interpretation. Chicago: University of Chicago Press, 287-313.
  • Dworkin, Ronald (1985) A Matter of Principle. Cambridge: Harvard University Press.
  • Dworkin, Ronald (1986) Law’s Empire. Cambridge: Harvard University Press.
  • Dworkin, Ronald (1996) “Objectivity and Truth: You’d Better Believe It,” Philosophy and Public Affairs, 25:88.
  • Dworkin, Ronald (2011) Justice for Hedgehogs. Cambridge: Harvard University Press.
  • Feldman, Stephen Matthew (1991) “The New Metaphysics: The Interpretive Turn in Jurisprudence,” Iowa Law Review, Vol. 76, 1991.
  • Figal, Günter (2010) Objectivity: The Hermeneutical and Philosophy. Albany: State University of New York Press.
  • Finnis, John (1980) Natural Law and Natural Rights. Oxford: Clarendon Press.
  • Finnis, John (ed.) (1991) Natural Law, 2 Vols. New York: New York University Press.
  • Finnis, John (1969) The Morality of Law. New Haven: Yale University Press, rev. edn.
  • Fiss, Owen (1982) “Objectivity and Interpretation,” Stanford Law Review, 34: 739-763.
  • Fish, Stanley (1999) Doing What Comes Naturally: Change, Rhetoric, and the Practice of Theory in Literary and Legal Studies. Durham, London: Duke University Press.
  • Fricker, M. (2007) Epistemic Injustice: Power and the Ethics of Knowing, Oxford: Oxford University Press.
  • Fuller, Lon (1958) “Positivism and fidelity to law—a response to Professor Hart.” Harvard Law Review, 71: 630-72.
  • Gadamer, Hans-Georg (1975) Truth and Method. London: Sheed & Ward.
  • Gardner, John (2001) “Legal Positivism: 5 ½ Myths,” 46 American Journal of Jurisprudence, 199.
  • Geniusas, Saulius (2015) “Between Phenomenology and Hermeneutics: Paul Ricoeur’s Philosophy of Imagination,” Human Studies: A Journal for Philosophy and the Social Sciences, 38: 2, 223-241.
  • Gibbons, Michael T. (2006) “Hermeneutics, Political Inquiry, and Practical Reason: An Evolving Challenge to Political Science,” The American Political Science Review, 100(4): 563-571.
  • Goodwin, Liu (2010) Keeping Faith with the Constitution. Oxford: Oxford University Press.
  • Greenberg, Mark (2004) “How Facts Make Law,” Legal Theory, 10: 157-98.
  • Habermas, Jürgen (1971) “Der Universalitätsanspruch der Hermeneutik” (The Hermeneutic Claim to Universality) in Karl-Otto Apel et al., eds., Hermeneutik und Ideologiekritik (Hermeneutics and the Critique of Ideology). Frankfurt: Suhrkamp, 120-158.
  • Hart, H.L.A. (1961) The Concept of Law. Oxford: Oxford University Press.
  • Hart, H.L.A. (1958) “Positivism and the Separation of Law and Morals.” Harvard Law Review 71: 593-629.
  • Heidegger, Martin (2008) Being and Time. Trans. John Macquarrie and Edward Robinson. New York: HarperPerennial/Modern Thought. (Translation of Sein und Zeit. Reprint. Originally published: Harper & Row, 1962).
  • Hekman, Susan (1986) Hermeneutics and the Sociology of Knowledge. Notre Dame: University of Notre Dame Press.
  • Hershovitz, Scott (ed.) (2006) Exploring Law’s Empire. Oxford: Oxford University Press.
  • Hiley, David R. et al. (eds.) (1991) The Interpretive Turn: Philosophy, Science, Culture. Ithaca: Cornell University Press.
  • Hinchman, Lewis P. (1995) “Aldo Leopold’s Hermeneutic of Nature,” The Review of Politics, 57(2): 225-249.
  • Hoffman, S.J. (2003) “Gadamer’s Philosophical Hermeneutics and Feminist Projects,” in L. Code (ed.) Feminist Interpretations of Hans-Georg Gadamer, University Park: Penn State University Press.
  • Hoy, David Couzens (1992) “Intentions and the Law: Defending Hermeneutics,” in Legal Hermeneutics: History, Theory, and Practice. Ed. Gregory Leyh. Berkeley, Los Angeles, Oxford: University of California Press.
  • Hoy, David Couzens (1987) “Dworkin’s Constructive Optimism v. Deconstructive Legal Nihilism,” Law and Philosophy 6: 321-56.
  • Hoy, David Couzens (1985) “Interpreting the Law: Hermeneutical and Poststructuralist Perspectives,” Southern California Law Review 58: 136-76.
  • Hunt, Alan (1996, 1999) “Marxist Theory of Law”. In A Companion to Philosophy of Law and Legal Theory. Ed. Dennis Patterson, Malden: Blackwell Publishing, 355-366.
  • Johnsen, Harald and Bjornar Olsen (1992) “Hermeneutics and Archaeology: On the Philosophy of Contextual Archaeology,” American Antiquity, 57(3): 419-436.
  • Kisiel, Theodore (1993) The Genesis of Heidegger’s Being and Time. Berkeley: University of California Press.
  • Kornprobst, Markus (2009) “International Relations as Rhetorical Discipline: Toward (Re-) Newing Horizons,” International Studies Review, 11(1): 87-108.
  • Levinson, Sanford and Steven Mailloux (1988) Interpreting Law and Literature: A Hermeneutic Reader. Evanston: Northwestern University Press.
  • Leyh, Gregory (1988) “Toward a Constitutional Hermeneutics.” American Journal of Political Science. 32(2): 369-387.
  • Leyh, Gregory (ed.) (1992) Legal Hermeneutics: History, Theory, and Practice. Berkeley, Los Angeles, Oxford: University of California Press.
  • Lieber, Francis (2010/1880) Legal and Political Hermeneutics. Lawbook Exchange, Ltd.
  • Malpas, Jeff and Hans-Helmuth Gander (eds.) (2014) The Routledge Companion to Hermeneutics. London, New York: Routledge, Taylor & Francis Group.
  • Malpas, Jeff and Hans-Helmuth Gander (1992) “Analysis and Hermeneutics,” Philosophy & Rhetoric, 25(2): 93-123.
  • Minda, Gary (1995) Postmodern Legal Movements. New York, London: New York University Press.
  • Monaghan, Henry Paul (2004) “Doing Originalism,” Columbia Law Review, 104(1): 32-38.
  • Mootz, Francis J., III (1994) “The New Legal Hermeneutics,” 47 Vand. L. Rev. 116.
  • Mootz, Francis J., III (1988) “The Ontological Basis of Legal Hermeneutics: A Proposed Model of Inquiry Based on the Work of Gadamer, Habermas, and Ricoeur,” 68 B.U.L. Rev. 523.
  • Murphy, Mark C. (2006) Natural Law in Jurisprudence & Politics. Cambridge: Cambridge University Press.
  • Nenon, Tom (1995) “Hermeneutical Truth and the Structure of Human Experience: Gadamer’s Critique of Dilthey,” in The Specter of Relativism: Truth, Dialogue, and Phronesis in Philosophical Hermeneutics, ed. Schmidt, Lawrence. Evanston: Northwestern University Press. 39-55.
  • Palmer, Richard E. (1969) Hermeneutics: Interpretation Theory in Schleiermacher, Dilthey, Heidegger, and Gadamer. Evanston: Northwestern University Press.
  • Pashukanis, Evgeny (1924/2002) The General Theory of Law and Marxism. New Brunswick: Transaction Publishers.
  • Pérez-Gómez, Alberto (1999) “Hermeneutics as Discourse in Design,” Design Issues, 15(2) Design Research: 71-79.
  • Pinton, Giorgio Alberto (1972/1973) “Emilio Betti’s (1890-1969) Theory of General Interpretation.” Ph.D. Dissertation. Hartford Seminary Foundation.
  • Ricoeur, P. (1974) The Conflict of Interpretations. Evanston: Northwestern University Press.
  • Rorty, Richard (2007) Philosophy as Cultural Politics. Cambridge: Cambridge University Press.
  • Rorty, Richard (2000) Philosophy and Social Hope. London: Penguin.
  • Rorty, Richard (1998) Achieving Our Country: Leftist Thought in Twentieth Century America. Cambridge: Harvard University Press.
  • Rorty, Richard (1991) Essays on Heidegger and Others: Philosophical Papers, Volume 3. Cambridge: Cambridge University Press.
  • Rorty, Richard (1979) Philosophy and the Mirror of Nature. Princeton: Princeton University Press.
  • Scalia, Antonin (1997) A Matter of Interpretation: Federal Courts and the Law: An Essay. Princeton: Princeton University Press.
  • Scalia, Antonin and Bryan Garner (2012) Reading Law: The Interpretation of Legal Texts. St. Paul: Thomson/West.
  • Schmidt, Lawrence (2006) Understanding Hermeneutics. Stocksfield: Acumen Publishing Ltd.
  • Surette, Leon (1995) “Richard Rorty Lays Down the Law,” Philosophy and Literature, 19(2): 261-275.
  • Taylor, George H. (2000) “Hermeneutics and Critique in Legal Practice: Critical Hermeneutics: The Intertwining of Explanation and Understanding as Exemplified in Legal Analysis,” 76 Chi.-Kent L. Rev. 1101.
  • Tugendhat, Ernst (1992) “Heidegger’s Idea of Truth.” trans. Christopher Macann, in Macann, Christopher (ed.) Martin Heidegger: Critical Assessments. 4 Vols. London: Routledge.
  • Valauri, John T. (2010) “As Time Goes By: Hermeneutics and Originalism,” Nevada Law Review, August 23, 2010.
  • Wachterhauser, Brice R (ed.) (1994) Hermeneutics and Truth. Evanston: Northwestern University Press.
  • Walby, S. (2007) “Complexity Theory, Systems Theory and Multiple Intersecting Social Inequalities,” Philosophy of the Social Sciences, 37(4): 449-70.
  • Warnke, Georgia (1993) Justice and Interpretation. Cambridge: MIT Press.
  • Wolf, Friedrich August (1831) Vorlesung über die Enzyklopädie der Altertumswissenschaft. Vorlesungen über die Altertumswissenschaft series, Ed. J.D. Gürtler, Vol. I. Leipzig: Lehnhold.

b. Further Reading

  • Attridge, Derek (ed.) (1992) Acts of Literature. New York, London: Routledge.
  • Austin, John (1832) The Province of Jurisprudence Determined.
  • Austin, John (1987) “Deconstructive practice and legal theory.” Yale Law Journal, 96, 743.
  • Barnett, Randy E. (1995-1996) The Relevance of the Framers’ Intent. 19 Harv. J. L. & Pub. Pol’y 403.
  • Bentham, Jeremy (1970) Of Laws in General. Ed. H.L.A. Hart. London: University of London, Athlone Press.
  • Binder, Guyora and Robert Weisberg (2000) Literary Criticisms of Law. Princeton: Princeton University Press.
  • Bix, Brian (1999) “Natural Law Theory,” in A Companion to Philosophy of Law and Legal Theory. Ed. Dennis Patterson. Malden, Oxford: Blackwell Publishing.
  • Bobbitt, Philip (1996, 1999) “Constitutional Law and Interpretation.” In A Companion to Philosophy of Law and Legal Theory. Ed. Dennis Patterson, Malden: Blackwell Publishing.
  • Bobbitt, Philip (1982) “A Typology of Constitutional Arguments.” Constitutional Fate: Theory of the Constitution. Oxford: Oxford University Press.
  • Brennan, Jr. William (1986) “The Constitution of the United States: Contemporary Ratification,” The Great Debate: Interpreting Our Written Constitution. Washington, D.C.: The Federalist Society.
  • Brest, Paul (1980) “The Misconceived Quest for the Original Understanding,” 60 B.U. L. Rev. 204.
  • Campos, Paul (1992) “Against Constitutional Theory,” Yale Journal of Law and the Humanities 4.
  • Campos, Paul (1993) “That Obscure Object of Desire: Hermeneutics and the Autonomous Legal Text,” Minnesota Law Review 77.
  • Cicero, Marcus Tullius (1988) De Re Publica; (On the Commonwealth). trans. C.W. Keyes, Cambridge: Harvard University Press.
  • Cicero, Marcus Tullius (1988) De Legibus (On the Laws). trans. C.W. Keyes, Cambridge: Harvard University Press.
  • Delgado, Richard (2012) “Centennial Reflections on the California Law Review’s Scholarship on Race: The Structure of Civil Rights Thought,” 100 Calif. L. Rev. 431.
  • Derrida, Jacques (1986) “But beyond…(Open Letter to Anne McClintock and Rob Nixon)” Trans. Peggy Kamuf. Critical Inquiry 13 (Autumn). 167-168.
  • Derrida, Jacques (1973) “Différance.” Speech and Phenomena, and Other Essays on Husserl’s Theory of Signs. Evanston: Northwestern University Press.
  • Dilthey, Wilhelm (1958) Gesammelte Schriften. Vol. II. Stuttgart: B.G. Teubner.
  • Epictetus (1926-1928) The Discourses as reported by Arrian, the Manual, and Fragments. London, W. Heinemann; New York, G.P. Putnam’s Sons.
  • Epictetus (2008) Enchiridion. Auckland: Floating Press.
  • Eskridge, Jr., William N. (1994) “Gaylegal Narratives,” 46 Stan. L. Rev. 607, 633.
  • Eskridge, Jr., William N. (1990) “Gadamer/Statutory Interpretation,” 90 Colum. L. Rev. 609.
  • Feinberg, Joel and Jules Coleman (2008) Philosophy of Law. Belmont: Thomson Wadsworth.
  • Fish, Stanley (1984) “Fish v. Fiss,” 36 Stanford Law Review.
  • Fiss, Owen (1982) “Objectivity and Interpretation,” 34 Stanford Law Review 739.
  • George, Theodore D (2014) “Remarks on James Risser’s ‘The Life of Understanding: A Contemporary Hermeneutics,’” Philosophy Today, 58(1): 107-116.
  • Grotius, Hugo (1625) De iure belli ac pacis libri tres. Paris: Buon.
  • Heidegger, Martin (1999) Ontology: Hermeneutics of Facticity. John van Buren (trans.) Bloomington: Indiana University Press.
  • Hoy, David Couzens (1978) The Critical Circle: Literature, History, and Philosophical Hermeneutics. Berkeley, Los Angeles, Oxford: University of California Press.
  • Hutchinson, A. (ed.) (1989) Critical Legal Studies. Totowa: Rowman & Littlefield.
  • Jones, Bernie D. (2002) “Critical Race Theory: New Strategies for Civil Rights in the New Millennium?” 18 Harv. BlackLetter J. 1.
  • Ladson-Billings, Gloria (2011) “Race…to the Top, Again: Comments on the Genealogy of Critical Race Theory,” 43 Conn. L. Rev. 1439.
  • Levinson, Sanford (1980) “Law as Literature,” 60 Texas Law Review 373-403.
  • Mahmud, Tayyab (2014) “Foreword: Looking Back, Moving Forward: Latin Roots of the Modern Global and Global Orientation of Latcrit,” 12 Seattle J. Soc. Just. 699.
  • Levit, Nancy and R.M. Verchick (eds.) (2006) Feminist Legal Theory: A Primer. New York, London: New York University Press.
  • MacKinnon, C. (2013) “Intersectionality as a Method: A Note,” Signs, 38(4), Intersectionality: Theorizing Power, Empowering Theory, 1019-30.
  • Marx, Karl (1967) Capital. A Critical Analysis of Capitalist Production, Vol. 1, New York: International.
  • Marx, Karl (1859/1994) Preface to A Contribution to the Critique of Political Economy, in Simon, supra, 209.
  • Parks, Gregory Scott (2008) “Note: Toward a Critical Race Realism,” 17 Cornell J.L. & Pub. Pol’y 683.
  • Patterson, Dennis (ed.) (2003) Philosophy of Law and Legal Theory. Malden: Blackwell Publishing.
  • Patterson, Dennis (1996/1999) “Postmodernism.” In A Companion to Philosophy of Law and Legal Theory. Ed. Dennis Patterson, Malden: Blackwell Publishing, 375-384.
  • Patterson, Dennis (1996) Law and Truth. Oxford, New York: Oxford University Press.
  • Poteat, W. H. (1985) Polanyian Meditations: in Search of a Post-Critical Logic. Durham: Duke University Press.
  • Raz, Joseph (2009) Between Authority and Interpretation: On the Theory of Law & Practical Reason, Oxford: Oxford University Press.
  • Rorty, R. (1988) The Linguistic turn: recent essays in philosophical method. Midway Reprint edition. Chicago: University of Chicago Press.
  • Rorty, R. (1979) Philosophy and the mirror of nature. Princeton: Princeton University Press.
  • Schmidt, Lawrence (ed.) (1995) The Specter of Relativism: Truth, Dialogue, and Phronesis in Philosophical Hermeneutics, Evanston: Northwestern University Press.
  • Simmonds, N.E. (2007) Law as a Moral Idea, Oxford: Oxford University Press.
  • Teo, Thomas (2011) “Empirical Race Psychology and the Hermeneutics of Epistemological Violence,” Human Studies, 34(3): 237-255.
  • Thorsteinsson, Björn (2015) “From ‘Différance’ to Justice: Derrida and Heidegger’s ‘Anaximander’s Saying’,” Continental Philosophy Review, 48(2): 255-271.
  • Tushnet, Mark V. (1983) “Following the Rules Laid Down: A Critique of Interpretivism and Neutral Principles” 96 Harvard Law Review 781.
  • Valauri, John T. (1991) “Constitutional Hermeneutics” in The Interpretive Turn: Philosophy, Science, Culture, David R. Hiley, James F. Bohman, and Richard Shusterman, eds. Ithaca: Cornell University Press.
  • Valdes, Francisco (2003) “Outsider Jurisprudence, Critical Pedagogy and Social Justice Activism: Marking the Stirrings of Critical Legal Education,” Asian American Law Journal, 10(1): Article 7.
  • Vedder, Ben (2002) “Religion and Hermeneutic Philosophy,” International Journal for Philosophy and Religion, 51(1): 39-54.
  • Warner, Richard (1996, 1999) “Legal Pragmatism” In A Companion to Philosophy of Law and Legal Theory. ed. Dennis Patterson, Malden: Blackwell Publishing, 385-393.
  • West, Robin L. (2000) “Commentary: Are There Nothing but Texts in this Class? Interpreting the Interpretive Turns in Legal Thought,” Chicago-Kent Law Review, 76 Chi.-Kent L. Rev. 1125.
  • Whyte, Megan K. (2005) “Going Back to Class? The Reemergence of Class in Critical Race Theory Symposium: Introduction: From Discourse to Struggle: A New Direction in Critical Race Theory,” 11 Mich. J. Race & L. 1.

 

Author Information

Tina Botts
Email: tina.botts@oberlin.edu
Oberlin College
U. S. A.

Charles Sanders Peirce: Logic

Charles Sanders Peirce (1839-1914) was an accomplished scientist, philosopher, and mathematician, who considered himself primarily a logician. His contributions to the development of modern logic at the turn of the 20th century were colossal, original and influential. Formal, or deductive, logic was just one of the branches in which he exercised his logical and analytical talent. His work developed upon Boole’s algebra of logic and De Morgan’s logic of relations. He worked on the algebra of relatives (1870-1885), the theory of quantification (1883-1885), graphical or diagrammatic logic (1896-1911), trivalent logic (1909), and higher-order and modal logics. He also contributed significantly to the theory and methodology of induction, and discovered a third kind of reasoning, different from both deduction and induction, which he called abduction or retroduction, and which he identified with the logic of scientific discovery.

Philosophically, logic became for Peirce a broad discipline with internal divisions and external architectonic relations to other parts of scientific inquiry. Logic depends upon, or draws its principles from, mathematics, phaneroscopy (=phenomenology), and ethics, while metaphysics and psychology depend upon logic. One of the most important characteristics of Peirce’s late logical thought is that logic becomes coextensive with semeiotic (his preferred spelling), namely the theory of signs. Peirce divides logic, when conceived as semeiotic, into (i) speculative grammar, the preliminary analysis, definition, and classification of those signs that can be used by a scientific intelligence; (ii) critical logic, the study of the validity and justification of each kind of reasoning; and (iii) methodeutic or speculative rhetoric, the theory of methods. Peirce’s logical investigations cover all these three areas.

Table of Contents

  1. Logic among the Sciences
  2. Logic as Semeiotic
    1. Speculative Grammar
    2. Logical Critics
      1. From Three Types of Inference to Three Stages of Inquiry
      2. Abductive Logic
      3. Deductive Logic
      4. Inductive Logic
    3. Methodeutic
  3. Peirce’s Logic in Historical Perspective
  4. References and Further Reading

1. Logic among the Sciences

Peirce’s idea of logic is guided by finding the location of logic in the map of the sciences. Peirce’s mature classification of the sciences (CP 1.180-202, 1903; see Brent 1987), which is a “ladder-like scheme” (MS 328, p. 20, c. 1905), takes superordinate sciences to provide principles to subordinated sciences, forming a ladder of decreasing generality.

According to Peirce’s 1903 scheme, which he as late as 1911 considered a satisfactory account (MS 675), sciences are either sciences of discovery, sciences of review, or practical sciences. Logic is a science of discovery. The sciences of discovery are divided into mathematics, philosophy and idioscopy. Mathematics studies the necessary consequences of purely hypothetical states of things. Philosophy, by contrast, is a positive science, concerning matters of fact. Idioscopy embraces more special physical and psychical sciences, and depends upon philosophy. Philosophy in turn divides into phaneroscopy, normative sciences and metaphysics. Phaneroscopy is the investigation of what Peirce calls the phaneron: whatever is present to the mind in any way. The normative sciences (aesthetic, ethics, and logic) introduce dichotomies, in that they are, in general, the investigation of what ought and what ought not to be. Metaphysics gives an account of the universe in both its physical and psychical dimensions. Since every science draws its principles from the ones above it in the classification, logic must draw its principles from mathematics, phaneroscopy, aesthetics and ethics, while metaphysics, and a fortiori psychology, draw their principles from logic (EP 2, pp. 258-262, 1903).

In sharp contrast to the logicist hypothesis, Peirce did not believe that mathematics depends upon deductive logic. On the contrary, in a sense it is deductive logic that depends upon mathematics. For Peirce, mathematics is the practice of deduction, logic its description and analysis: Peirce’s father Benjamin Peirce had defined mathematics as the science which draws necessary conclusions (B. Peirce 1870, p. 1). Hence deductive logic for Charles became the science of drawing necessary conclusions (CP 4.239, 1902). Logic cannot furnish any justification of a piece of deductive reasoning: deduction in general is in the first place mathematically, rather than logically, valid. And deductive logic is at any rate only a part of logic: “logic is the theory of all reasoning, while mathematics is the practice of a particular kind of reasoning” (MS 78, p. 4; see Haack 1993 and Houser 1993). Logic rather draws its principles from phaneroscopy, as the latter analyzes the structure of appearance but does not pronounce upon the veracity of such appearance. Logic also draws its principles from the normative sciences of ethics and esthetics (Peirce’s preferred spelling), which precede normative logic in the ladder of generality. Ethics depends on esthetics because ethics draws from esthetics the principles involved in the idea of a summum bonum, the highest good. Since ethics is the science that distinguishes good from bad conduct, it must be concerned with deliberate, self-controlled conduct, because only by deliberate conduct is it possible to say whether the conduct is good or bad. Logic treats of a special kind of deliberate conduct, thought, and distinguishes good from bad thinking, that is, valid from invalid reasoning. Since deliberate thought is a species of deliberate conduct, logic must draw its principles from ethics (CP 5.120-50; EP 2, pp. 196-207, 1903).

Moving down the ladder of generality, metaphysics and psychology come next. Peirce had learnt from Kant that metaphysical conceptions mirror those of formal logic. Peirce’s criticism ever since the 1860s had been that Kant’s table of categories was mistaken not because he based it upon formal logic but because the formal logic that Kant had used was itself poor and ultimately wrong (see NEM 4, p. 162, 1898). The only way to arrive at a good metaphysics is to begin with a good logical theory (EP 2, pp. 30-31, 1898). Psychology, too, depends upon logic. According to Peirce, different versions of logical psychologism characterized the logics of his time, especially in Germany. Logic for Peirce considers not what or how we in fact think but how we ought to think; logic is a normative, not a descriptive, science. The validity of an argument consists in the fact that its conclusion is true, always or for the most part, when its premises are true; it has nothing to do with reference to a mind. Logical necessity is a necessity of (non-empirical) facts, not a necessity of thinking. Hence no appeal to psychology is of any aid in logic. On the contrary, it is psychology that stands in need of a science of logic (EP 2, pp. 242-257, 1903).

2. Logic as Semeiotic

In the 1890s (MS 595, 787), Peirce divided logic into three branches: speculative grammar (also called stechiology), logical critics (or just critics) and methodeutic (also called speculative rhetoric). The division echoes the three sciences of the medieval Trivium: grammar, dialectic and rhetoric.

Perhaps the most salient feature of Peirce’s logic as a whole is that in his later works (MS L 75, 1902; MS 478, 1903; MS 693, 1904; MS 640, 1909) logic is identified with semeiotic, the science and philosophy of signs and representations. Already in his early works on the theory of inference Peirce had affirmed that logic is the branch of semeiotic that treats of one particular kind of representation, namely symbols, in their reference to their objects (W1, p. 309, 1865). By the beginning of the 20th century, he had shifted from the idea of “logic-within-semeiotic” to that of “logic-as-semeiotic.” He thus needed to distinguish between logic in the narrow sense, which he now calls logical critics, and logic in the wide sense; the latter is made coextensive with semeiotic. “Logic, in its general sense, is, as I believe I have shown, only another name for semiotic (σημειωτική), the quasi-necessary, or formal, doctrine of signs” (CP 2.227, c.1897; cf. Fisch 1986, pp. 338-341).

According to Peirce’s mature views, an enlargement of logic to cover all varieties of signs provided valuable methodological guidance for the building of an objective, anti-psychological and formal logical theory: “The study of the provisional table of the Divisions of Signs will, if I do not deceive myself, help a student to many a lesson in logic” (MS S 46, 1906; cf. MS 283, c. 1905; MS 675-676, 1911; MS 12, 1912). Therefore, his logic contains, as a proper part of it, a study of its own scope and expansions. In homage to Thomas of Erfurt’s grammatica speculativa, which in Peirce’s time was misattributed to Duns Scotus, Peirce names this part of logic “speculative grammar.”

a. Speculative Grammar

In the 1890s Peirce regarded speculative grammar as an analysis of the nature of assertion (MS 409-8, 1894; CP 3.432, 1896; MS 787, c. 1897). Starting with the Syllabus of his 1903 Lowell Lectures (A Syllabus of Certain Topics of Logic, MS 478, MS 540), speculative grammar becomes a classification of signs. In the Syllabus Peirce defines a Sign or Representamen as

the first Correlate of a triadic relation, the second Correlate being termed its Object, and the possible Third Correlate being termed its Interpretant, by which triadic relation the possible Interpretant is determined to be the first correlate of the same triadic relation to the same Object, and for some possible Interpretant. (MS 540, CP 2.242; EP 2, p. 290).

A sign for Peirce is something that represents an independent object and which thereby brings another sign, called the interpretant, to represent that object as the sign does. In keeping with a long tradition in the history of logic, Peirce declares that the principal classes of signs that logic is concerned with are terms, propositions, and arguments. But by 1903 these three elements become parts of a larger taxonomic scheme.

From the Syllabus until at least 1909, Peirce continued experimenting with principles and terminologies, without, however, settling on any definitive division. This section presents the main principles of the Syllabus classification.

Signs are divisible by three trichotomies; first, according as the sign in itself is a mere quality, is an actual existent, or is a general law; secondly, according as the relation of the sign to its Object consists in the sign’s having some character in itself, or in some existential relation to that Object, or in its relation to an Interpretant; thirdly, according as its Interpretant represents it as a sign of possibility, or as a sign of fact, or a sign of reason. (CP 2.243, 1903)

The first trichotomy considers signs as (i) tones, when taken in their material qualities (such as the blueness of the ink), as (ii) tokens (such as any instance of the word “the”), and as (iii) general types (such as the word “the”). The second trichotomy is the best known, namely that of (i) icons, or signs that bear similarity or resemblance to their objects, (ii) indices, which have factual connections to their objects, and (iii) symbols, which have rational connections to their objects. The third trichotomy divides signs into terms, propositions, and arguments: through his work on the logic of relatives (see § 2.b.iii.), Peirce had come to consider terms as (i) rhemas, which are unsaturated predicates with logical bonds or subject-positions, in some ways similar to Frege’s Begriff and Russell’s propositional function; (ii) propositions, which unify subject and predicate and thus assert something, or, as the Syllabus has it, dicisigns, signs that tell; and (iii) arguments, which embody the ultimate perfection and end of signs as representations of facts that are signs of other facts, such as the premises being the sign of the conclusion. [Peirce’s theory of the proposition is well articulated and highly original, and has been thoroughly investigated in Hilpinen 1982, 1992; Ferriani 1987; Chauviré 1994; Stjernfelt 2014.]

There are cross-divisions of these three trichotomies across speculative grammar. A term or rhema is a symbol which is represented by its interpretant as an icon of its object, while a proposition or dicisign is a symbol which is represented by its interpretant as an index of its object. Arguments themselves are considered as symbols that represent their conclusion in three different ways: iconically in abduction, indexically in deduction, and symbolically in induction. (In an early cross-division proposed in 1867 these last two were interchanged. See W2, p. 58). Other outcomes of the classifications consisted in further divisions of objects and interpretants into various subtypes. [For more on Peirce’s classifications, see Weiss & Burks 1945; Short 2007, chs. 7-9; and Burch 2011.]

Grammatical taxonomy shows that there are three kinds of arguments, each manifesting a different semiotic principle. But it is up to the second branch of logic, critics, to investigate the question of logical validity and justification of such arguments. The analysis of the conditions of validity of these three kinds of reasoning is a critical, not grammatical, question.

b. Logical Critics

i. From Three Types of Inference to Three Stages of Inquiry

Logical critics is the heart of Peirce’s logic. It covers what usually goes under the name of logic proper, that is, the investigation of inference and arguments. Many 19th-century logicians (for example, John S. Mill, George Boole, John Venn and William Stanley Jevons) took the range of logic to include deductive as well as inductive logic. As appears from the classification, the remarkable novelty of Peirce’s logical critics is that it embraces three essentially distinct though not entirely unrelated types of inference: deduction, induction, and abduction. Initially, Peirce had conceived deductive logic as the logic of mathematics, and inductive and abductive logic as the logic of science. Later in his life, however, he saw these as three different stages of inquiry rather than different kinds of inference employed in different areas of scientific inquiry.

Peirce had formulated a definite theory of logical leading principles as early as the late 1860s. His argument is roughly as follows. In any inference, we pass from some fact to some other fact that follows logically from it. The former is the premise (for in cases where there is more than one they may be colligated or compounded into one copulative premise), the latter is the conclusion.

P
∴C

The conclusion follows from the premise logically, that is, according to some leading principle, L. As logic supposes inferences to be analyzed and criticized, as soon as the logician asks what it is that warrants the passage from such a premise to the conclusion, she is obliged to express the leading principle L in a proposition and to lay it down as an additional premise:

P
L
∴C

This gives what Peirce calls a complete argument, in opposition to incomplete, rhetorical or enthymematic arguments. This second argument has its own leading principle L1, which may again be expressed in a proposition and laid down as a further premise:

P
L
L1
∴C

When L1 is not a leading principle substantially different from L, then L is said to be a logical leading principle. In Peirce’s words:

This second argument has certainly itself a leading principle, although it is a far more abstract one than the leading principle of the original argument. But you might ask, why not express this new leading principle as a premise, and so obtain a third argument having a leading principle still more abstract? If, however, you try the experiment, you will find that the third argument so obtained has no more abstract a leading principle than the second argument has. Its leading principle is indeed precisely the same as that of the second argument. This leading principle has therefore attained a maximum degree of abstractness; and a leading principle of maximum abstractness may be termed a logical principle. (NEM 4, p.175, 1898)

A logical leading principle is therefore a formal or logical proposition which, when explicitly stated, adds nothing to the premises of the inference which it governs. The central question of logical critics becomes that of determining different kinds of logical leading principles.
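For illustration (a schematic example, not Peirce’s own wording): from the premise “Enoch was a man” one passes to the conclusion “Enoch will die” according to the leading principle “all men die.” Stating that principle as an additional premise yields the complete argument “Enoch was a man; all men die; therefore Enoch will die,” whose leading principle is now the purely formal one that whatever is predicated of all of a class may be predicated of any member of it. Stating this in turn as a further premise produces no more abstract a principle, so it is a logical leading principle in Peirce’s sense.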

Peirce’s initial strategy to prove that there are three and only three irreducible kinds of reasoning was to use the syllogism. He demonstrated that the second and the third figures are reducible to the first only through the employment of the very figure that is to be reduced. The principles involved in the three syllogistic figures cannot then be reduced to a combination of other, more primitive principles, as they invariably enter as parts into the reduction proof itself. From this Peirce drew the broader conclusion that the three figures of syllogism correspond to the three kinds of inference in general: deduction corresponds to the first figure, abduction to the second, and induction to the third:
[Peirce graphic: the three syllogistic figures and the corresponding kinds of inference]

In Peirce 1878 and Peirce 1883, abduction and induction are described as inversions of a deductive syllogism. If we call the major premise of a syllogism in the first figure Rule, its minor premise Case, and its conclusion Result, then abduction may be said to be the inference of a Case from a Result and a Rule, while induction may be said to be the inference of a Rule from a Case and a Result:

[Peirce graphic: abduction and induction as inversions of the deductive syllogism, in terms of Rule, Case, and Result]
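Peirce’s familiar bean-bag illustration from the 1878 paper makes these inversions concrete. Deduction: Rule, all the beans from this bag are white; Case, these beans are from this bag; Result, these beans are white. Induction passes from the Case and the Result to the Rule (these beans are from this bag and they are white; so, probably, all the beans from this bag are white), while hypothesis, or abduction, passes from the Rule and the Result to the Case (all the beans from this bag are white and these beans are white; so, perhaps, these beans are from this bag).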

Later, in 1903, Peirce came to the conclusion that the three kinds of reasoning are in fact three stages in scientific research. First comes abduction, now often also called retroduction, by which a hypothesis or conjecture that explains some surprising fact is set forth. Then comes deduction, which traces the necessary consequences of the hypothesis. Lastly comes induction, which puts those consequences to the test and generalizes its conclusions.

Any inquiry is for Peirce bound to follow this pattern: abduction–deduction–induction. Each kind of inference retains its validity and modus operandi and is logically irreducible to the other two; yet all three of them are necessary in any complete process of inquiry. Of the three methods, Peirce took deduction to be the most secure and the least fertile, while abduction is the most fertile and the least secure.

All three departments of critics epitomize the originality of Peirce’s contributions. The following sections deal with Peirce’s abductive, deductive, and inductive logics, respectively.

ii. Abductive Logic

The central question of abductive or retroductive logic is: is there a logic of scientific discovery? If so, what are its justification and method?

Initially, Peirce described abduction as the inference of a Case from a Rule and a Result:

Hypothesis proceeds from Rule and Result to Case; it is the formula of the […] process by which a confused concatenation of predicates is brought into order under a synthetizing predicate. (Peirce 1883, p. 145)

Its general formula is this:

Result: S is M1 M2 M3 M4
Rule: P is M1 M2 M3 M4
Case: Therefore, S is P.

A certain number of surprising facts have been observed which call for explanation, and a single predicate embracing all of them is found which would explain them. When I notice that light manifests such-and-such complicated and surprising phenomena, and I know that ether waves exhibit those same phenomena, I conclude abductively that, if light were ether waves, it would be normal for it to manifest those phenomena. This offers rational ground for the hypothesis that light is ether waves.

In 1900, Peirce began viewing this description of abduction as inadequate. What he in 1883 had called hypothesis or abduction was actually induction about characters instead of things and is therefore better called qualitative induction (see § 2.b.iv): its leading principle is inductive and not abductive. Abduction is no longer constrained by the syllogistic framework. Most generally, it is the non-inductive process of forming an explanatory hypothesis. In Peirce’s words, abduction “is the only logical operation which introduces any new idea” (CP 5.172, 1903). Although abduction asserts its conclusions only conjecturally, it has a definite logical form. The following has become its standard, albeit not ultimately satisfactory, description after Peirce’s pronouncement of it in the seventh of the Harvard Lectures of 1903:

The surprising fact, C, is observed;
But if A were true, C would be a matter of course,
Hence, there is reason to suspect that A is true. (CP 5.189, 1903)

This schema reveals why abduction is also called retroduction: it is reasoning that leads from a consequent of an admitted consequence to its antecedent.

Another description of the logical form of abduction is contained in a later, unpublished manuscript:

In the inquiry, all the possible significant circumstances of the surprising phenomenon are mustered and pondered, until a conjecture furnishes some possible Explanation of it, by which I mean a syllogism exhibiting the surprising fact as necessarily following from the circumstances of its occurrence together with the truth of the conjecture as premisses. (MS 843, p. 41, 1908)

The explaining syllogism is the inversion of the 1903 formula:

If A were true, C would be observable.
A is true.
Therefore, C is observable.

One more and hitherto unknown formulation of retroduction is found in an unpublished letter to Lady Welby:

[The] “interrogative mood” does not mean the mere idle entertainment of an idea. It means that it will be wise to go to some expense, dependent upon the advantage that would accrue from knowing that Any/Some S is M, provided that expense would render it safe to act on that assumption supposing it to be true. This is the kind of reasoning called reasoning from consequent to antecedent. For it is related to the Modus Tollens thus:

[Peirce graphic: the relation of this reasoning to the Modus Tollens]

Instead of “interrogatory”, the mood of the conclusion might more accurately be called “investigand”, and be expressed as follows:

It is to be inquired whether A is not true.

The reasoning might be called “Reasoning from Surprise to Inquiry.” (Peirce to Welby, July 16, 1905, LoF, pp. 907-908)

The whole course of thought, consisting in noticing the surprising phenomenon, searching for pertinent circumstances, asking a question, forming a conjecture, remarking that the conjecture appears to explain the surprising phenomenon, and adopting the conjecture as plausible, constitutes the first, abductive stage of inquiry. Nonetheless, its crucial phase is that of forming the conjecture itself. This is often described by Peirce as an act of insight, or an instinct for guessing right, or what Galileo called il lume naturale. [That Peirce actually got the phrase from Galileo has sometimes been contested. But see the story by Victor Baker in Bellucci, Pietarinen & Stjernfelt (2014) on the “Myth of Galileo.” Baker refers to Jaime Nubiola’s finding of Peirce’s copy of Galileo’s Opere that had that phrase underlined in Peirce’s hand. That fifteen-volume edition was, at least in 2012, still to be found at the Robbins Library of the Department of Philosophy, Harvard University.]

However, to pronounce reasoning to be instinctive would amount to excluding it from the realm of logic. For logic only considers reasoning, and reasoning is a deliberate act subject to self-control. According to Peirce, abduction is an inference type based upon a logical principle. In its most abstract shape, such a logical principle gives abduction its justification, and the justification of abduction is the bottom question of logical critics (EP 2, p. 443, 1908).

According to Peirce, abduction “consists in studying the facts and devising a theory to explain them” (CP 5.145, 1903). Its only justifications are that “if we are ever to understand things at all, it must be in that way,” and that “its method is the only way in which there can be any hope of attaining a rational explanation” (CP 2.777, 1902). The only justification for a hypothesis is that it might explain the facts. But in general, an inference is valid if its leading principle is an instance of a logical principle which is conducive to the acquisition of new information. Therefore, the logical leading principle of all abductions is that nature, in general, is explainable. To suppose something inexplicable is contrary to the principles of logic: such supposition only has the appearance of an explanation conducive to the acquisition of new information, but to really suppose something inexplicable is to renounce knowledge.

That nature is explicable is therefore the primary abduction underlying all possible abductions. Human powers of insight may also be justified inductively, that is, as testified by the history of science. But abduction’s primary justification is abductive rather than inductive: if we are to acquire new knowledge at all, sooner or later we must reason abductively. [See Bellucci & Pietarinen 2014; Burks 1946; Fann 1970; Kapitan 1992; Kapitan 1997; Paavola 2004 for further details on Peirce’s theory of abduction or retroduction, and Ma & Pietarinen 2015 for a dynamic logic approach to Peirce’s interrogative abduction.]

Of these three stages of reasoning, abduction is the most fertile but the least secure. For this reason, Peirce affirms that abduction is the principal kind of reasoning for which, after logical critics has pronounced it valid, it remains to be inquired whether and how it is advantageous. Carrying out such tasks pertains to the third branch of logic, methodeutic, which is discussed in § 2.c below.

iii. Deductive Logic

The works of George Boole and Augustus De Morgan provided the essential backdrop for Peirce’s development of deductive logic. Benjamin Peirce’s Linear Associative Algebra (B. Peirce 1870) also influenced his son’s early development of the algebraic logic of relatives. Peirce’s dissatisfaction with how Boole represented syllogisms as algebraic equations led him to develop new algebraic approaches to logic, which he did by combining Boole’s calculus (Boole 1847, 1854) with De Morgan’s treatment of relations (De Morgan 1847, 1860).

Some of Peirce’s most important contributions to the development of modern logic are highlighted below.

1867: In the paper “An Improvement in Boole’s Calculus of Logic” published in the Proceedings of the American Academy of Arts and Sciences (Peirce 1867), Peirce subscripted all operation symbols with a comma to differentiate the logical from the arithmetical interpretation. He also made Boole’s union operator inclusive rather than exclusive (he was anticipated in this by Jevons 1864). Peirce became aware of the limitations of Boole’s algebraic logic, such as its inability to express particular categorical propositions (Some X is Y), so that it fails to properly represent quantification.

1870: Peirce’s development of De Morgan’s theory of relations is fully expounded in his paper “Description of a Notation for the Logic of Relatives, resulting from an Amplification of the Conceptions of Boole’s Calculus of Logic” (Peirce 1870), which was communicated to the American Academy of Arts and Sciences in January 1870. In this paper, Peirce combines De Morgan’s theory with Boole’s calculus. The result is a logical algebra equivalent in expressive power to first-order predicate logic without identity. Peirce’s paper introduces a number of original innovations. Among them is a new process of logical differentiation, explained in Welsh (2012, pp. 166-180). Peirce also introduces the copula of inclusion, −<, later also termed the sign of illation and also expressed in a cursive form. Inclusion is for Peirce a wider and logically simpler concept than that of equality. This difference marks another important departure from Boole’s mathematical algebra towards new types of logical algebras. Beginning with C. I. Lewis’s (1918) comments on Peirce’s algebra, the literature has discussed whether in this 1870 paper what Peirce calls relative terms are to be equated with relations (see Merrill 1997 for a summary and further references). It appears that at the very least they are what verbs and phrases express linguistically, such as lover of___, whatever is a lover of____, or buyer of____for____from____. Here blanks stand for nouns that are required to complete the expressions. Beginning in 1882, Peirce comes to define relative terms as classes of ordered pairs (later understood as relations). He denotes such terms by rhemas with blanks for expressions standing for subjects, such as ____ is a lover of ____, whatever ____ is a lover of ____, and so forth. Of note is that the algebra of Peirce’s 1870 paper is able to express various forms of quantification, although the term “quantifier” and its modern conception were to emerge only later in his works, after the early 1880s.
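Read extensionally, a dyadic relative term can be pictured as a set of ordered pairs; on that reading the copula of inclusion behaves as the subset relation and the relative product as the composition of relations. The following minimal sketch is a present-day reconstruction given only for illustration (the relatives and individuals are invented), not Peirce’s notation:

# A dyadic relative term, read extensionally, is a set of ordered pairs.
lover = {("amy", "ben"), ("ben", "cal")}        # ___ is a lover of ___
servant = {("ben", "dee"), ("cal", "amy")}      # ___ is a servant of ___

def included_in(r, s):
    # Peirce's copula of inclusion, -<, read extensionally as the subset relation.
    return r <= s

def relative_product(r, s):
    # "lover of a servant of ___": pairs (i, j) such that for some x,
    # i bears r to x and x bears s to j.
    return {(i, j) for (i, x) in r for (y, j) in s if x == y}

print(relative_product(lover, servant))     # {('amy', 'dee'), ('ben', 'amy')}
print(included_in(lover, lover | servant))  # True: every lover-pair is in the union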

1880a: The long paper “On the Algebra of Logic” (Peirce 1880a), which was published in the American Journal of Mathematics, introduces a number of further developments of which the following six are listed here:

(1) The copula, expressed as a binary relation between classes of propositions, Pi −< Ci, is now understood to express the notion of semantic consequence, namely that “every state of things in which a proposition of the class Pi is true is a state of things in which the corresponding propositions of the class Ci are true” (W4, p. 166). The binary relation here is thus a truth-functional implication. Moreover, the remarks in his Logic Notebook published in W4 (p. 216) were written in the same year and appear to be the first instance presenting variables v and f to denote the truth values true and false.

(2) A dash over a symbol is used to denote the negative of the symbol. A dash over the sign of illation in Pi −< Ci indicates the class complement. Constants ∞ and 0 are taken to mean the values of the possible and the impossible. An important modal component which Peirce would develop later on is thus emerging in this work.

(3) The totality of all that is possible is according to Peirce the “universe of discourse, and may be very limited,” that is, limited to that which “actually occurs,” rendering “everything which does not occur” impossible (W4, p. 170). The important idea of working with variable and restricted domains marks a sharp difference not only from Frege, whose logic is well known to quantify over the entirety of “logical thought”, but also from Schröder, who, though also working on the algebra of logic, had nonetheless rendered Peirce’s 1885 algebra of logic (see below) so as to quantify, in a Fregean fashion, over what Peirce later in 1903 remarked to be “the whole universe of logical possibility” (MS 478, pp. 163-4).

(4) A new operation on relatives, which Peirce termed transaddition (º), is then introduced (W4, p. 204). Taking two relatives, such as being a lover of___ (l) and being a servant of___ (s), their relative product ls denotes whatever is a lover of a servant of___. Their transaddition l º s denotes whatever is not a lover of everything but servants of___, that is, it denotes the relative product of the complements of the two terms l and s.

(5) The 1880 paper is also the first in which the idea of a relative sum, which is the complement of the transaddition and which Peirce in 1882 will denote by the dagger (†), is employed. For example, l † s reads lover of everything but servants of___. Hence this 1880 paper marks a decisive move towards a theory of quantification which will see its emergence in his 1883 Note B in the Studies in Logic (see below) and which comes to be completed in his 1885 “Algebra of Logic” paper (see below).

(6) The 1880 paper also suggests a mathematical theory of lattices for the treatment of the algebra of logic (W4, pp. 183-188).

Arthur Prior (1958, 1964) showed that Peirce’s 1880 paper provides a complete basis for propositional logic.

1880b: In an unpublished manuscript (MS 378, Peirce 1880b) entitled “A Boolian Algebra with One Constant”, which still in 1926 was tagged “to be discarded” at the Department of Philosophy at Harvard University, Peirce reduces the number of logical operations to one constant. He states that “this notation … uses the minimum number of different signs … shows for the first time the possibility of writing both universal and particular propositions with but one copula” (W4, p. 221). Peirce’s notation was later termed the Sheffer stroke, and is also well known as the NAND operation, in Peirce’s terms the operation by which “[t]wo propositions written in a pair are considered to be both denied” (W4, p. 218). In the same manuscript, Peirce also discovers the expressive completeness of the NOR operation, today rightly recognized as the Peirce arrow.
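The expressive completeness in question means that the familiar connectives are all definable from the single pair-forming operation. A minimal modern sketch, using NAND for definiteness (the derivations go through dually with NOR, the Peirce arrow):

from itertools import product

def nand(p, q):
    # "Sheffer stroke": true unless both p and q are true.
    return not (p and q)

def nor(p, q):
    # "Peirce arrow": both members of the pair are denied.
    return not (p or q)

# Deriving negation, conjunction and disjunction from NAND alone.
neg = lambda p: nand(p, p)
conj = lambda p, q: nand(nand(p, q), nand(p, q))
disj = lambda p, q: nand(nand(p, p), nand(q, q))

for p, q in product([True, False], repeat=2):
    assert neg(p) == (not p)
    assert conj(p, q) == (p and q)
    assert disj(p, q) == (p or q)
print("negation, conjunction and disjunction are all definable from NAND")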

1881: “On the Logic of Number” (Peirce 1881), published in American Journal of Mathematics and read before the National Academy of Sciences, was noted by Gerrit Mannoury (1909, pp. 51, 78) to be the first successful axiomatization of natural numbers. Shields (1981/2012) has shown Peirce’s axiom system to be equivalent to the better-known systems of Dedekind (1888) and Peano (1889). Peirce’s paper formulates, presumably for the first time, the notions of partial and total linear orders, recursive definitions for arithmetical operations, and the general definition of cardinal numbers in terms of ordinals. The paper also provides a purely cardinal definition of a finite set (Dedekind-finite) by checking whether De Morgan’s syllogism of transposed quantity is valid. [The syllogism of transposed quantity is expressed in the following mode of inference: Every Texan kills a Texan; Nobody is killed by but one person; Hence, every Texan is killed by a Texan.] Peirce then derives in this paper the latter property of finiteness from the ordinal one. Doing the converse assumes the axiom of choice.
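The link between the syllogism of transposed quantity and finiteness can be pictured computationally: over a finite collection every injective self-map is surjective, which is exactly what the Texan inference needs; over an infinite collection (for example, n mapped to n+1 on the natural numbers) the inference fails, and that failure is what Peirce turns into a definition of finiteness. A modern illustrative sketch, not Peirce’s own formulation:

def is_injective(f, domain):
    # No two distinct members are sent to the same victim.
    images = [f[x] for x in domain]
    return len(images) == len(set(images))

def is_surjective(f, domain):
    # Every member is the image of someone.
    return {f[x] for x in domain} == set(domain)

# A finite set of Texans; each Texan kills exactly one Texan.
texans = {"t1", "t2", "t3"}
kills = {"t1": "t2", "t2": "t3", "t3": "t1"}

# Premises of the syllogism of transposed quantity: totality plus injectivity.
# The conclusion (every Texan is killed) follows only because the set is finite.
if is_injective(kills, texans):
    print(is_surjective(kills, texans))   # True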

During 1881-2, Peirce edited a book, published in 1883 and entitled Studies in Logic by Members of the Johns Hopkins University (Peirce 1883). It contained significant graduate work by his students Benjamin I. Gilman, Christine Ladd(-Franklin), Allan Marquand and Oscar Howard Mitchell. Peirce contributed to the volume a paper “A Theory of Probable Inference”, together with Note A, “A Limited Universe of Marks”, and Note B, “The Logic of Relatives.” Some developments in Mitchell’s paper as well as in Note B are worth highlighting.

Mitchell’s “On a New Algebra of Logic” was hailed by his teacher as “one of the greatest contributions that the whole history of logic can show” (MS 492; LoF, p. 225). Peirce attributed to Mitchell two major discoveries: first, the invention of the basic form of proof transformation and second, the interpretation of quantifiers in multiple dimensions, one of which is time. The former is similar to the resolution rule in logic programming, and consists of a series of insertions (by adding to premises) and erasures (by elimination of consequents). In Peirce’s words, “the passage from a premiss or premisses … to a necessary conclusion in the manner to which is alone usually called necessary reasoning, can always be reached by adding to the stated antecedents and subtracting from stated consequents, being understood that if an antecedent be itself a conditional proposition, its antecedent is of the nature of a consequent” (MS 905; LoF, pp. 731-732). The latter discovery—universes in multiple dimensions—has its correlate in the idea of interpreted domains and in the modern notion of temporal logics and many-sorted quantification. It can also be seen as a development of new languages that take the role of indices in the quantifiers to be mappings from contexts to values in universes of discourse. Having in mind Mitchell’s pioneering idea of logical dimensions, Peirce goes on to mention that the study of Mitchell’s paper was for him necessary in order to break “ground in the gamma [modal logic] part of the subject” of existential graphs (MS 467; LoF, p. 332; see below). Years later, Peirce defines the term “dimension” in the Dictionary of Philosophy and Psychology by noting that it is

an element or respect of extension of a logical universe of such a nature that the same term which is individual in one such element of extension is not so in another. Thus, we may consider different persons as individual in one respect, while they may be divisible in respect to time, and in respect to different admissible hypothetical states of things, etc. This is to be widely distinguished from different universes, as, for example, of things and of characters, where any given individual belonging to one cannot belong to another. The conception of a multidimensional logical universe is one of the fecund conceptions which exact logic owes to O. H. Mitchell. Schröder, in his then second volume, where he is far below himself in many respects, pronounces this conception ‘untenable’. But a doctrine which has, as a matter of fact, been held by Mitchell, Peirce, and others, on apparently cogent grounds, without meeting any attempt at refutation in about twenty years, may be regarded as being, for the present, at any rate, tenable enough to be held. (DPP 2, p. 27)

Mitchell develops, for the first time, the idea and the notation for existential and universal quantifiers, and notices that it is from alternations of these quantifiers that logic derives its expressive power. Peirce testifies in the same dictionary entry that placing Σ and Π in alternating orders “was probably first introduced by O. H. Mitchell in his epoch-making paper” (DPP 2, p. 650). However, being limited to monadic predicates, Mitchell’s language was deprived of some expressive power.

Having supervised and perused Mitchell’s paper, in Note B of the Studies in Logic, Peirce generalizes the groundwork Mitchell had laid on the theory of quantification. His theory of relatives adds indices as individual variables to the operators Σ and Π to denote individual objects. Relative products and relative sums are then defined as (lb)ij = Σx(l)ix(b)xj and (l † b)ij = Πx {(l)ix + (b)xj}, thus becoming species of existential and universal quantification: the lover of a benefactor is “a particular combination, because it implies the existence of something loved by its relate and a benefactor of its correlate.” The lover of everything but benefactors is “universal, because it implies the non-existence of anything except what is either loved by its relate or a benefactor of its correlate” (Peirce 1883, p. 189). Peirce had already had the relative sum at his disposal, and the idea of it expressing the non-existence of exceptions naturally led to its dual, existential quantification. Towards the end of Note B Peirce writes something is a lover of something as Σi Σj lij, everything is a lover of something as Πi Σj lij, there is something which stands to something in the relation of loving everything except benefactors of it as Σi Σk Πj (lij + bjk), and so on. Taking α to denote accuser to___of___, ε excuser to___of___, and π preferrer to___of___, Πi Σj Σk (α)ijk ((ε)jki + (π)kij) means that “having taken any individual i whatever, it is always possible so to select two, j and k, that i is an accuser to j of k, and also is either excused by j to k or is something to which j is preferred by k” (Peirce 1883, p. 201). The phrasing Peirce uses here (such as “having taken any individual”, “it is always possible so to select”) is indicative of a new semantic treatment of quantifiers and sequences of quantifiers which he goes on to pursue further in later papers, and which Hilpinen (1982), Hintikka (1996) and Pietarinen (2006) have shown to agree with game-theoretic semantics. Interestingly, Peirce’s examples are all stated in prenex normal form, which highlights the idea of sequences of dependent quantifiers. Peirce’s quantifiers bind variables ranging over interpreted domains. In this 1883 paper, he provides the basic inference rules, such as Σi Πj −< Πj Σi, for manipulating the strings of quantifiers. The language is not inductively defined; it lacks a notation for functions, and it uses neither constants nor an equality sign, but in other respects it coincides with that of first-order predicate calculus.
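The Note B definitions can be checked over a small finite universe; the sketch below is a modern reconstruction whose universe and relatives are invented for illustration. (lb)ij holds when some x is loved by i and benefits j, and (l † b)ij holds when every x is either loved by i or benefits j:

# Two dyadic relatives over a small universe, given as Boolean-valued tables.
U = ["i1", "i2", "i3"]
l = {(i, j): False for i in U for j in U}    # ___ is a lover of ___
b = {(i, j): False for i in U for j in U}    # ___ is a benefactor of ___
l[("i1", "i2")] = True
b[("i1", "i3")] = True
b[("i2", "i3")] = True
b[("i3", "i3")] = True

def rel_product(r, s):
    # (rs)_ij = Sigma_x r_ix s_xj : some x with r(i, x) and s(x, j).
    return {(i, j): any(r[(i, x)] and s[(x, j)] for x in U) for i in U for j in U}

def rel_sum(r, s):
    # (r † s)_ij = Pi_x (r_ix + s_xj) : every x has r(i, x) or s(x, j).
    return {(i, j): all(r[(i, x)] or s[(x, j)] for x in U) for i in U for j in U}

lb = rel_product(l, b)    # lover of a benefactor of ___
ls = rel_sum(l, b)        # lover of everything but benefactors of ___
print(lb[("i1", "i3")])   # True: i1 loves i2 and i2 benefits i3
print(ls[("i1", "i3")])   # True: every x is either loved by i1 or a benefactor of i3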

Alfred Tarski’s summary concerning Peirce’s contributions to the logical theory of relatives is illuminating:

[t]he title of creator of the theory of relations was reserved for C. S. Peirce. In several papers published between 1870 and 1882, he introduced and made precise all the fundamental concepts of the theory of relations and formulated and established its fundamental laws. Thus Peirce laid the foundation for the theory of relations as a deductive discipline; moreover he initiated the discussion of more profound problems in this domain. In particular, his investigations made it clear that a large part of the theory of relations can be presented as a calculus which is formally much like the calculus of classes developed by G. Boole and W. S. Jevons, but which greatly exceeds it in richness of expression and is therefore incomparably more interesting from the deductive point of view. (Tarski 1941, p. 73)

However, it was his 1885 theory of quantification that Peirce calculated would settle the problems of deductive logic and logical analysis, in a way that decidedly brought him beyond the algebraic approach to the logic of relatives.

1885: Peirce’s logic of quantifiers comes to full blossom in his paper, written in the summer of 1884, “On the Algebra of Logic: A Contribution to the Philosophy of Notation”, and published in the American Journal of Mathematics the following year (Peirce 1885). This massive paper defies any condensed exposition; but in summary, it contains Peirce’s “five icons of algebra” as a system of natural deduction based on introduction and elimination rules. Peirce had repeatedly stated that his having supervised and examined Mitchell’s paper was essential in order to arrive at the idea of these two basic operations. There is an abundant use of truth-functional propositions and an anticipation of the truth-table method to test tautologies. One of the examples comes close to the tableaux method, later proposed by Evert Beth and Jaakko Hintikka, which spells out a systematic search for counter-models by deriving contradictions from the negations of the formula to be proved. In order “to find whether a formula is necessarily true,” he says, “substitute f and v for the letters and see whether it can be supposed false by any such assignment of values” (Peirce 1885, p. 224; Pietarinen 2006; Anellis 2012a).
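Peirce’s recipe of substituting f and v for the letters is, in effect, the brute-force truth-table test. A minimal sketch in modern notation (the formulas below are chosen for illustration; the first is the conditional ((x → y) → x) → x, today known as Peirce’s law, which figures among the “icons” of the 1885 paper):

from itertools import product

def imp(p, q):
    # Material implication.
    return (not p) or q

def is_tautology(formula, arity):
    # Substitute every combination of v (True) and f (False) for the letters
    # and see whether the formula can be supposed false by any assignment.
    return all(formula(*values) for values in product([True, False], repeat=arity))

peirce_law = lambda x, y: imp(imp(imp(x, y), x), x)

print(is_tautology(peirce_law, 2))              # True: no assignment falsifies it
print(is_tautology(lambda x, y: imp(x, y), 2))  # False: fails when x is v and y is f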

When he moves on to the first-order (“first-intentional”) logic, Peirce seeks to devise a notation that is as iconic as possible, building on his semiotic insight that the more iconic a notation is, the better suited it would be for logical analysis. He starts by using “Σ for some, suggesting a sum, and Π for all, suggesting a product” (1885, p. 180). Once again, Peirce credits Mitchell, now for the method of separating the “quantifying part”—which he later termed the “Hopkinsian” to honour its place of discovery (MS 515, 1902)—from the pure Boolean expression: the latter refers to an individual by its use of indices (like pronouns in language) while the former states what that individual is. The quantifying operators are, however, “only similar to a sum and product,…because the individuals of the universe may be denumerable” (1885, p. 180). Peirce’s consideration illustrates lines of thought similar to those that prompted Löwenheim to formulate his famous 1915 theorem: if a first-order sentence has a model, then it also has a countable model, or more generally, the existence of models of some cardinality for sets of formulas implies the existence of models of some other infinite cardinality (Badesa 2004). (Associating infinite products and sums with conjunctions and disjunctions was what Wittgenstein took to be his own biggest mistake in logic.) The 1885 paper continues introducing rules for quantifier manipulation, including “putting the Σs to the left, as far as possible” (1885, p. 182), which is a prelude to the idea of Skolem normal forms. One could say that it is the sequences of quantifiers, especially sequences of dependent quantifiers, that make a linear logical notation maximally iconic, and that it is the prenex and Skolem normal forms that bring out the maximal analyticity which logical icons exploit. The 1885 paper then presents many examples drawn from natural language to be analyzed logically with this new notation. The paper also extensively deals with issues having to do with the representation of mathematical notions such as one-to-one correspondence and identity in the second-intentional logic, developed in the third part of the paper, and in which variables range over relations. There is an early attempt at axiomatizing set theory as well as some profound philosophical considerations on the possibility of developing a “method for the discovery of methods in mathematics,” which is to be based on these new approaches that aim at formulating a general theory of deductive logic.
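Over a finite universe, Peirce’s Σ and Π behave exactly like iterated disjunction and conjunction, which can be mimicked with any() and all(). The sketch below, a present-day illustration with an invented domain and relation, also shows why a Σ standing to the left of a Π asserts more than the reverse order:

# Sigma as iterated disjunction, Pi as iterated conjunction, over a finite domain.
D = ["a", "b", "c"]
loves = {("a", "b"), ("b", "b"), ("c", "a")}

def l(i, j):
    return (i, j) in loves

# Pi_i Sigma_j l_ij : everyone loves someone (possibly different someones).
everyone_loves_someone = all(any(l(i, j) for j in D) for i in D)

# Sigma_j Pi_i l_ij : some one individual is loved by everyone.
someone_loved_by_all = any(all(l(i, j) for i in D) for j in D)

print(everyone_loves_someone)   # True
print(someone_loved_by_all)     # False: no single j is loved by a, b and c alike

The second formula implies the first but not conversely, an instance of the quantifier rule Σ Π −< Π Σ noted above in connection with Note B.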

Thanks to the volumes that have appeared in the Chronological Edition of the Writings between 1982 and 2010, and which have covered Peirce’s work up to 1892, these earlier phases of Peirce’s deductive logic are now relatively well understood. But the research from that point on has been hampered by the unavailability of systematic editions concerning Peirce’s later logical writings. Yet the mid-1890s mark only the beginning of a new and by far the most productive era in Peirce’s logical investigations, which was to last until the last months of his life. This situation has by no means been adequately reflected in the secondary literature.

Although Peirce would continue his investigations on the algebra of logic throughout his life, the algebraic element would no longer assume a central position in his overall oeuvre:

In 1895 Schröder published the third huge volume of his logic, which consisted mainly of a vast elaboration in detail of the logical algebra of my Note B. That I never considered that algebra to be a great masterpiece is sufficiently shown by my giving my exposition of it no other title than “Note B.” The perusal of Schröder’s book convinced me that the algebra was not what was wanted, and in the Monist for January 1897 I produced a system of graphs which I now term Entitative Graphs. I shortly after abandoned that and took up Existential Graphs (MS 467; LoF, p. 332).

Although it was Schröder’s elaboration that was to influence the works of the early model theorists such as Löwenheim and Skolem (Brady 2000), it was Peirce’s and Mitchell’s works that germinated the concept of first-order statements being true-in-a-model (Pietarinen 2006; Bellucci & Pietarinen 2015a). Moreover, Peirce’s incessant hunt for new logical notations and methods was much more ambitious and philosophical than his early algebraic investigations revealed.

What was to take the place of algebra were the ideas that emerged from diagrammatic, iconic and topological considerations on logical representation and reasoning. These considerations were at first prompted by logical analogues to algebraic invariants in chemistry first developed by Peirce’s Johns Hopkins colleague J. J. Sylvester (1878) and investigated in Kempe (1886). Peirce was initially fascinated by the analogy in which a chemical atom is like a relative “in having a definite number of loose ends or “unsaturated bonds”, corresponding to the blanks of the relative” (CP 3.469, 1897). But the continual search for better and better notations for the overall purposes of logical analysis would also reveal the reasons why Peirce had to overcome this analogy between logic and chemistry.

Peirce’s theory of Existential Graphs (EGs), first conceived in summer 1896 and developed in subsequent years (for example Peirce 1897, 1906), was in part motivated by his need to respond to the expressive insufficiency and lack of analytic power of the systems described in his Note B, which he later termed the algebra of dyadic (dual) relatives, and in the 1885 general (universal) algebra of logic. The analytic power comes from the idea of subsuming what the algebraic operations do when composing concepts under one mode of composition. This composition of concepts is effected in the theory of EGs by the device of ligatures. A ligature is a complex line, composed of what Peirce terms the lines of identity, which connects various parts and areas of the graphs: [See e.g. Zeman 1964; Roberts 1973; Shin 2002; Dipert 2006; Pietarinen 2005, 2006, 2011, 2015a.]
Fig. 1

Fig. 2

Fig. 3

The meaning of these lines is that the two or more descriptions apply to the same thing. For example, in Figure 2 there is a horizontal line attached to the predicate term “is obedient.” It means that “something exists which is obedient.” There is also another line which connects to the predicate term “is a catholic,” and that composition means that “something exists which is a catholic”, which is equivalent to the graph-instance in Figure 3. Since in Figure 1 these two lines are in fact connected by a continuous line, the graph-instance in Figure 1 means that “there exists a catholic which is obedient,” that is, “there exists an obedient catholic.” Ligatures, representing continuous connections composed of two or more lines of identity, stand for quantification, identity and predication, all in one go.

These EGs are drawn on a sheet of assertion that represents what the modeller knows or what mutually has been agreed upon to be the case by those who undertake the investigation of logic. The sheet thus represents the universe of discourse. The graph that is drawn on the sheet puts forth an assertion, true or false, that there is something in the universe to which it applies. This is the reason why Peirce terms these graphs existential. Drawing a circle around the graph, or alternatively, shading the area on which the graph-instance rests, means that nothing exists of the sort of description intended. In Figure 4, the assertion “something is a catholic” is denied by drawing an oval around it and thus severing that assertion from the sheet of assertion:


Fig. 4

The graph-instance depicted in Figure 4 thus means that “something exists that is not catholic.”

Peirce aimed at a diagrammatic syntax that would use a minimal number of logical signs but at the same time be maximally expressive and as analytic as possible. His ovals, for instance, have different notational functions: “The first office which the ovals fulfill is that of negation. […] The second office of the ovals is that of associating the conjunctions of terms. […] This is the office of parentheses in algebra” (MS 430, pp. 54-56, 1902). The ovals are thus not only the diagrammatic counterpart to negation but also serve to represent the compositionality of a graph-formula. He held (MS 430, 1902; MS 670, 1911) that a notation that does not separate the sign of truth-function from the representation of its scope is more analytic than some other notation, such as that of an ordinary “symbolic” language, where such a separation is forced by the one-dimensional notation. The role of ovals as denials is in fact derived from more primitive considerations of inclusion and implication (Bellucci & Pietarinen 2015; MS 300, 1908).

As to expressivity, Peirce had already recognized that the notion of dependent quantification was essential in any system expressive enough to serve the purposes of logical analysis of any assertions. The nested system of ovals in EGs effectuates this in a natural way, much in contrast to algebras that resort to an explicit use of parentheses and other punctuation devices. For example, the graph in Figure 5 means that “Every Catholic adores some woman.” The graph in Figure 6 means that “Some woman is adored by every Catholic.” Peirce notes that the latter asserts more since it states that all Catholics adore the same woman, whereas the former allows different Catholics to adore different women (see the linear transcription given after Figures 5 and 6 below).

Fig. 5

Fig. 6
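In a rough linear transcription into the notation of the 1885 algebra (an illustrative rendering, not Peirce’s own transcription of these graphs):

Figure 5: Πi Σj ((i is a Catholic) implies ((j is a woman) and (i adores j))) — each Catholic i finds some woman j whom i adores.
Figure 6: Σj Πi ((j is a woman) and ((i is a Catholic) implies (i adores j))) — one and the same woman j is adored by every Catholic i.

The second implies the first but not conversely, which is the sense in which the graph of Figure 6 asserts more.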

The graph in Figure 7 means that “anything whatever is unloved by something that benefits it,” that is, “everything is benefitted by something or other that does not love it”:


Fig. 7

Lastly, Figure 8 provides an example of a very complex graph taken from MS 504 (1898):

 

Fig. 8

Peirce provided the meaning in natural language this way:

Every being unless he worships some being who does not create all beings either does not believe any being (unless it be not a woman) to be any mother of a creator of all beings or else he praises that woman to every being unless to a person whom he does not think he can induce to become anything unless it be a non-praiser of that woman to every being.

It is on the level of semantics that the power of dependent quantification comes to the fore. Peirce carried out the semantics by defining the basics of what is today recognized as two-player, zero-sum semantic games. For Peirce these games take place between the Graphist/Utterer and the Grapheus/Interpreter. [Sometimes, especially in Peirce’s model-building games, these roles split so that the Grapheus and the Interpreter are playing separate roles, see Pietarinen 2013.] Peirce’s semantic games were not limited to EGs; he applied the same idea also to interpret quantificational expressions and connectives in his general algebra of logic.

It speaks to the superiority of EGs over algebraic systems that in them deduction, following Mitchell’s work, is reduced to a minimum number of permissive operations. Peirce termed these operations illative rules of transformation, and in effect they consist only of two: insertions (permissions to draw a graph-instance on the sheet of assertion) and erasures (permissions to erase a graph-instance from the sheet). More precisely, the oddly enclosed areas of graphs (areas within an odd number of enclosures) permit inserting any graph on that area, while evenly enclosed areas permit erasing any graph from that area. A copy of a graph-instance is permitted to be pasted on that same area or any area deeper within the same nest of enclosures (the rule of iteration), and a copy thus iterated is permitted to be erased (the converse rule of deiteration). An interpretational corollary is that the double enclosure with no intervening graphs in the middle area can be inserted and erased at will.
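A small worked example in the Alpha part may convey how these permissions operate. Writing an oval as a pair of parentheses (a common linearization used here for convenience, not Peirce’s two-dimensional notation), a sheet carrying P together with (P (Q)) asserts both P and “if P then Q”; Q then follows by three permissions:

P (P (Q))   the premises scribed on the sheet of assertion
P ((Q))     deiteration: the inner copy of P, already present on the sheet, is erased
P Q         removal of the double enclosure around Q
Q           erasure of P from the evenly enclosed sheet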

A more detailed exposition of these illative rules of transformation would need to show their application to quantificational expressions, namely applying insertions and erasures to ligatures. A flavor of such proofs is given by inspecting Figures 1, 2 and 3: an application of a permissible erasure on the line of identity in Figure 1 amounts to the graph-instance in Figure 2, and another application of a permissible erasure on the upper part of the graph-instance in Figure 2 amounts to the graph-instance depicted in Figure 3. Thus what is represented in Figure 2 is a logical consequence of the graph-instance in Figure 1, and what is represented in Figure 3 is a logical consequence of the graph-instance given in Figure 2.

Roberts (1973) was the first to prove that these transformation rules, first given by Peirce in 1898, form a semantically complete system of deduction. Roberts did not mention, however, that Peirce had demonstrated their soundness in 1898 and again in 1903 and that he had argued for their completeness in terms of what he termed the “perfect archegetic rules of transformation” in the unpublished parts of the Syllabus for the Lowell Lectures that Peirce delivered in 1903.

The polarity of the outermost ends or portions of ligatures determines whether the quantification is existential (that end or portion resting on even/positive areas) or universal (if it rests on an odd/negative area). Unlike in Tarski-type semantics, but just as in game-theoretic semantics, the preferred rule of interpretation of the graphs is what Peirce termed “endoporeutic”: one looks for the outermost portions of ligatures on the sheet of assertion first, assigns semantic values to those parts, and then proceeds inwards into the areas enclosed with ovals. In non-modal contexts, ligatures are not well-formed graphs because they may cross the enclosures.

The diagrammatic nature of EGs consists in the iconic relationship between forms of relations exhibited in the diagrams and the real relations in the universe of discourse. Peirce was convinced that, since these graphical systems exploit a proper diagrammatic syntax, they—together with any of their extensions that would be introduced to cover modalities, non-declarative expressions, speech acts, and so forth—can express any assertion, however intricate. Guided by the precepts laid out by the diagrammatic forms of expression, and together with the simple illative permissions by which deductive inference proceeds, the conclusions from premises can be “read before one’s eyes”; these graphs present what Peirce believed is a “moving picture of the action of the mind in thought” (MS 298; LoF, p. 655; late 1906-1907).

If upon one lantern-slide there be shown the premisses of a theorem as expressed in these graphs and then upon other slides the successive results of the different transformations of those graphs; and if these slides in their proper order be successively exhibited, we should have in them a veritable moving picture of the mind in reasoning. (MS 905; LoF, p. 723; late 1907-1908)

The theory of EGs that uses only the notation of ovals and the spatial notion of juxtaposition of graphs is termed by Peirce the Alpha part of the EGs, and it corresponds to propositional logic. The extension of the alpha part with ligatures and rhemas (also termed spots by Peirce) gives rise to the Beta part, and it corresponds to fragments of first-order predicate calculus. What Peirce termed the Gamma part was a boutique of a number of developments, including various modalities such as metaphysical, epistemic and temporal modalities, as well as extensions of such graphs with ligatures. In Peirce’s writings, there are developments of graphical systems for higher-order logics and abstraction (Peirce’s “logic of potentials”), the logic of collections, and investigation of meta-logical expressions that use the language of graphs to talk about notions and properties of the graphs in that language (Peirce’s “graphs of graphs”). He mentions late in 1911 that the Delta part would also need to be added, most likely because of the ever-expanding systems that had been mushrooming in the Gamma part.

Peirce’s further contributions to deductive logic. While the development of the theory of the logic of existential graphs was his chief contribution, Peirce’s other contributions to the development of modern logic were numerous. In the Logic Notebook (1909) he defined a number of operations for three-valued logic and gave semantics for them in terms of defining truth-tables for such new connectives (Fisch & Turquette 1966). In these systems, which he called triadic logic, the third value is “the limit” between “true” and “not true,” and it applies to what Lane (1999) has identified as boundary-propositions: in Peirce’s terms, boundary-propositions have “a lower mode of being” which can “neither be determinately P, nor determinately not-P,” but are “at the limit between P and not P” (MS 399, p. 344r, 1909). Peirce defined several connectives to realize this idea in alternative ways, including four one-place connectives which were later reinvented as strong negation, two Post negations and the Tertium function, as well as six two-place connectives, including one that pertains to the logic of ordinary discourse.
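The truth-table idea behind a third value “at the limit” can be sketched as follows. The particular tables below, built on the ordering F < L < T with a min-style conjunction and a max-style disjunction, are given only for illustration and are not claimed to reproduce Peirce’s own triadic connectives, several of which are tabulated in Fisch & Turquette 1966:

# Three values: F (false), L (the limit), T (true), ordered F < L < T.
# Illustrative min/max tables only, not a reconstruction of Peirce's own tables.
ORDER = {"F": 0, "L": 1, "T": 2}
VALUES = ["F", "L", "T"]

def conj(p, q):
    return min(p, q, key=ORDER.get)

def disj(p, q):
    return max(p, q, key=ORDER.get)

def neg(p):
    return {"F": "T", "L": "L", "T": "F"}[p]

# Truth table for the illustrative conjunction, row by row.
for p in VALUES:
    print(p, ":", "  ".join(conj(p, q) for q in VALUES))
# F : F  F  F
# L : F  L  L
# T : F  L  T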

Generally, Peirce divided deduction along two lines: on the one hand, deduction is either necessary or probable (deductive reasoning about probabilities); on the other, it is either corollarial or theorematic. Corollarial deduction is reasoning “where it is only necessary to imagine any case in which the premisses are true in order to perceive immediately that the conclusion holds in that case.” Theorematic deduction “is deduction in which it is necessary to experiment in the imagination upon the image of the premiss in order from the result of such experiment to make corollarial deductions to the truth of the conclusion” (MS L 75, 1902). He considered the theorematic/corollarial distinction his first real discovery in the philosophy of mathematics. Theorematic deductions can be of different kinds and degrees of complexity, and he took the classification of various types of theorematic deductions to be of the utmost value in the theory of logic (MS 617; MS 201; Peirce 1908). Stjernfelt (2014) proposes a new classification of theorematic inferences. Hintikka (1980) has argued that reasoning is theorematic if it increases the number of layers of quantifiers, and that an argument is the more theorematic the more new individuals are used in it (see also Ketner 1985; Zeman 1986; Hoffmann 2010).

Zooming into some of the details of Peirce’s systems of logic, including those of diagrammatic logics, one finds a treasury of developments the meaning of which is only beginning to unravel over a century later (Bellucci, Pietarinen & Stjernfelt 2014). In 1886, Peirce suggested in a letter to his former student Allan Marquand, who had designed mechanical logic machines for syllogistic reasoning, that “it is by no means hopeless to expect to make a machine for really difficult problems. But you would have to proceed step by step. I think electricity would be the best thing to rely on” (L 269, Peirce to Marquand, 30 December, 1886; W5, p. 422). He then showed how switching circuits can be connected serially and in parallel, noting that these two configurations correspond to logical multiplication (the algebraic product, conjunction) and logical addition (the algebraic sum, disjunction), respectively. In addition to the idea of real logical machines running on electricity, Peirce was also very interested in the philosophical question of whether living intelligence is required in performing deductive reasoning, an issue of continuing relevance to A.I. and to the prospects of automated theorem proving. In 1902 he developed two notational systems with sixteen binary connectives to map out all of the possible truth functions of the binary propositional calculus (Clark 1997; Zellweger 1997). According to Max Fisch, “No other logician compares with Peirce in attention to systems of notation and to sign-creation” (Fisch 1982, p. 132). Peirce’s work on these notational systems anticipated geometrical structures of logic, including spaces revealed by the study of the geometry of negation and other operators. Based on Peirce’s conceptual and sign-theoretic considerations, an apparatus for displaying and performing a complete set of the sixteen binary connectives in two-valued propositional logic was patented in the U.S.A. in 1981 by Shea Zellweger.
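The correspondence Peirce pointed out to Marquand is easy to reproduce. The sketch below, written in modern Boolean terms rather than in Peirce’s own algebraic signs, models switches in series as logical multiplication (conjunction) and switches in parallel as logical addition (disjunction), and then enumerates the sixteen binary truth functions that occupied his 1902 notational experiments; the function names and the enumeration scheme are illustrative assumptions.

```python
# Series and parallel switch configurations, read as conjunction and disjunction.

def series(a, b):       # current flows only if both switches are closed
    return a and b

def parallel(a, b):     # current flows if either switch is closed
    return a or b

assert series(True, False) is False and parallel(True, False) is True

# Every binary connective is fixed by its outputs on the four input pairs,
# so there are exactly 2**4 = 16 of them.
inputs = [(False, False), (False, True), (True, False), (True, True)]
for n in range(16):
    outputs = [bool((n >> i) & 1) for i in range(4)]
    table = dict(zip(inputs, outputs))
    print(n, [int(table[p]) for p in inputs])
```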

Peirce also worked on early forms of topology (Havenel 2010), including studies on what might be recognized as rudimentary versions of homologies and knots, in his attempts to find pathways not only to logical issues but also to questions in philosophy of mathematics (Murphey 1961; Moore 2010).

Moreover, his diagrammatic systems of modal logic included suggestions for defining several types of multi-modal logics in terms of tinctures of the areas of graphs. Tinctures enable logic to assert, among other things, modalities such as necessities and metaphysical possibilities, and so call for changes in how the corresponding logics behave, including the identification of individuals in the presence of multiple universes of discourse. He defined epistemic operators in terms of subjective possibilities, which, just as in contemporary epistemic logic, are epistemic possibilities defined as duals of knowledge operators. He analyzed the meaning of identities between actual and possible objects in quantified multi-modal logics. As an example, the two graphs given in Figures 9 and 10, which he presented in a 1906 draft of the Prolegomena paper (MS 292), illustrate the nature of the interplay between epistemic modalities and quantification.

Fig. 9

Fig. 10

The graph in Figure 9 is read “There is a man who is loved by one woman and loves a woman known by the Graphist to be another.” The reason is this. In the equivalent graph depicted in Figure 10 the woman who loves is denoted by the name A, and the woman who is loved is denoted by the name B. The shaded area is a tincture that refers to the modality of subjective possibility. Thus the graph in Figure 10 means that it is subjectively impossible, by which Peirce means that “it is contrary to what is known by the Graphist” (= the modeller of the graph), that A should be B. In other words, the woman who loves and the woman who is loved (whom the graph does not assert to be otherwise known to the Graphist) are known by the Graphist not to be the same person.

Peirce’s work highlights the philosophical significance of ideas that were rediscovered later, largely after the mid-twentieth century, though often in different clothes: in Peirce’s largely unpublished works one finds him addressing such topics as multi-modal logics and possible-worlds semantics, quantification into modal contexts, cross-world identities (which in MS 490 he termed special relations connecting objects in different possible worlds; for references, see Pietarinen 2005), cumulative and branching quantifiers (the latter being related to independence-friendly logic, see Pietarinen 2015b), as well as what later became known as “Peirce’s Puzzle” (Dekker 2001; Hintikka 2011; Pietarinen 2015b), namely the question of the meaning of indefinites in conditional sentences, which Peirce himself analyzed in quantified modal extensions of EGs.

Far from merely anticipating later discoveries, then, Peirce’s logic puts what later came to be explored in the fields of philosophical logic, formal semantics and pragmatics, philosophy of logic, mathematics, mind and language, cognitive and computing sciences, and history and philosophy of science, into a systematic logico-semeiotic perspective. From time to time, his ideas even surpass stagnant contemporary discussions, especially in the philosophy of logic and mathematics. [See for example Bellucci, Pietarinen & Stjernfelt 2014; Lupher & Adajian 2015; Sowa 2006; Zalamea 2012a; Zalamea 2012b; PM. For further details on Peirce’s deductive logic, see the collection of Houser and others, eds. 1997. Hilpinen 2004 provides a useful overview.]

iv. Inductive Logic

In 1865 (W1, pp. 263-64) Peirce defines induction as inference from Case and Result to Rule. Its general form is:

Case: M1, M2, M3, M4 are S.
Result: M1, M2, M3, M4 are P.
Rule: Therefore, all S are P.

A certain number of objects (M1, M2, M3, M4), known to belong to a certain class (S), possess a certain character (P); therefore, it can be inferred inductively that the whole class S possesses that character. I notice that neat, swine, sheep, and deer, which I know are cloven-hoofed, are herbivores. Therefore, I infer inductively that all cloven-hoofed animals are herbivores.

Later, Peirce came to divide induction into three principal kinds. Crude induction is the lowest form of induction, based upon the common practice of generalizing about future events on the ground of previous experience. For example, “No instance of a genuine power of clairvoyance has ever been established: So I presume there is no such thing”; “cancer is incurable, because every known case has proved to be so.” Its general form is “All observed As are B. Therefore, all As are B.” It is the weakest form of inductive reasoning in terms of security. Qualitative induction is the intermediate kind in terms of security. It is what Peirce had earlier called hypothetical reasoning or abduction. It consists in testing a hypothesis by sampling the possible predications that may be made on the basis of it (CP 7.216). Qualitative induction is thus reasoning that tests hypotheses already formulated. It should not be confused with abduction, which is reasoning that originates new hypotheses. Quantitative induction is the highest form of induction in terms of security. It investigates the real probability that a member of a certain class will have a certain character. Its procedure consists in finding a representative sample of the class and noting the proportion of its members that possess the character P. The inference is then drawn that the proportion holds for the whole class. Its logical form is

S1, S2, S3, S4, and so forth, are taken at random from the Ms.
The proportion p of S1, S2, S3, S4 is P.
Hence, probably and approximately, the same proportion p of the Ms are P.

The inversion of a quantitative induction gives us a statistical deduction, whose form is

The proportion p of the Ms are P.
S1, S2, S3, S4, and so forth, are taken at random from the Ms.
Hence, the proportion p of them is P.
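
A small numerical sketch may help to fix the idea of quantitative induction just schematized: draw a random sample of the Ms, note the proportion possessing the character P, and project that proportion, “probably and approximately,” onto the whole class. The population, sample size, and true proportion below are invented purely for illustration.

```python
# Quantitative induction as sampling: infer a class-wide proportion from a
# random sample. The population and sample size are invented for illustration.
import random

random.seed(0)
population = [random.random() < 0.3 for _ in range(10000)]   # the Ms; P holds of ~30%

sample = random.sample(population, 200)                      # S1, S2, S3, ... taken at random
observed = sum(sample) / len(sample)                         # proportion of the sample that is P

print(f"sample proportion: {observed:.2f}")                  # close to the true value
print(f"true proportion:   {sum(population) / len(population):.2f}")
```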

Although crude, qualitative, and quantitative induction are different in kind, their justification is, according to Peirce, the same:

The validity of Induction consists in the fact it proceeds according to a method which though it may give provisional results that are incorrect will yet if steadily pursued, eventually correct any such error. […] all Induction possesses this kind of validity, and […] no Induction possesses any other kind that is more than a further determination of this kind. (MS 293, 1907)

The validity rests upon induction being self-corrective: in the long run induction is bound to lead us ever closer to the correct representation of reality. Its validity is therefore linked to esse in futuro, to the possibility of self-correction of the very method itself. Any actual induction that is performed may well be wrong or partly wrong, but it remains valid because its leading principle is valid, that is, is conducive to truth in the long run.

Peirce’s polemical target was a theory that would make the validity of induction rest upon some principles of uniformity or regularity in nature. According to Peirce, that was how John Stuart Mill and Philodemus of Gadara (ca. 110–ca. 30 B.C.E.) attempted, unsoundly, to justify induction. Of the several objections that Peirce raised from time to time against this way of justifying induction, one is worth reporting. Mill argues that a universe without any regularity is imaginable, and that in that universe inductions would be invalid. But the absence of uniformity, that is, the absence among certain objects S of the character P, is itself a uniformity. No universe is imaginable in which induction is not valid. According to Peirce, “even if nature were not uniform, induction would be sure to find it out, so long as inductive reasoning could be performed at all” (CP 2.775).

Cheng 1969, Goudge 1946, Merrill 1975 and Forster 1989 provide further details on Peirce’s inductive logic.

c. Methodeutic

The third branch of Peirce’s logic is methodeutic, which he also called speculative rhetoric. He defined it to be “the study of the proper way of arranging and conducting an inquiry” (MS 606, p. 17), depicting it as being “not so exact in its conclusions as is critical logic” (MS L 75, 1902) and as involving “certain psychological principles” (MS 633, 1909). But it nevertheless is a theoretical study and not an art. Methodeutic is based upon critics, and considers not what is admissible (logical validity) but what is advantageous (logical economy). It is a “theoretical study of advantages” (MS L 75, 1902).

Abduction is of special interest to methodeutic, because abduction is the only mode of inference that can initiate a scientific hypothesis. But being justifiable is not a sufficient property of good hypotheses:

Any hypothesis which explains the facts is justified critically. But among justifiable hypotheses we have to select that one which is suitable for being tested by experiment. (MS L 75, 1902)

Among critically equivalent hypotheses (that is, hypotheses that explain the facts), one should be able to select for testing those that are capable of experimental verification. [Being capable of experimental verification is in Peirce’s philosophy of science to be conceived in the wide sense, including mental experimentation and imaginative activities in our thoughts (Bellucci & Pietarinen 2015b). It is not the same thing as the empirical verification criterion of the positivists, which Peirce often criticized.] This is the core of Peirce’s philosophy of pragmati(ci)sm, which teaches that the whole meaning of a hypothesis is in its conceivable practical (that is, experienceable) effects; pragmaticism therefore is “nothing else than the question of the logic of abduction” (CP 5.196, 1903).

In turn, among “pragmatistically” equivalent hypotheses (that is, hypotheses that are capable of experimental verification) one should select those that in the sense of Peirce’s economy of research are the cheapest ones. His argument for the economic character of methodeutic is roughly as follows: the logical validity of abduction presupposes that nature be in principle explainable. This means that to discover is simply to expedite an event that would sooner or later occur. Therefore, the real service of a logic of abduction is of the nature of an economy. Economy itself depends on three factors: cost (of money, time, energy, thought), the value of the hypothesis itself, and its effects upon other projects and hypotheses (MS L 75, 1902; MS 690, CP 7.164-231, 1901).
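
Merely as a toy illustration of weighing these three factors against one another, one might score candidate hypotheses as in the sketch below. The numerical scores and the simple additive weighting are our own assumptions, introduced only to make the economic idea tangible; Peirce proposed no such formula.

```python
# A toy ranking of hypotheses by Peirce's three economic factors: cost of
# testing, the intrinsic value of the hypothesis, and its expected effect on
# other projects. Scores and weighting are invented purely for illustration.
hypotheses = {
    'H1': {'cost': 5, 'value': 7, 'effect_on_others': 2},
    'H2': {'cost': 2, 'value': 6, 'effect_on_others': 4},
    'H3': {'cost': 8, 'value': 9, 'effect_on_others': 1},
}

def economy_score(h):
    # Higher value and broader positive effects count for a hypothesis;
    # higher cost counts against it.
    return h['value'] + h['effect_on_others'] - h['cost']

ranked = sorted(hypotheses, key=lambda name: economy_score(hypotheses[name]), reverse=True)
print(ranked)   # hypotheses ordered by this (invented) economic score
```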

Although primarily concerned with abduction, methodeutic also has an interest in deduction and induction. Theorematic deductions (see §2.b.iii) manifest peculiar logical steps that are abductive rather than deductive. In order to overcome the lack of critical instruments for the investigation of those steps, Peirce emphasizes the need to have an inventory and logical classification of valuable steps in the history of mathematics which would become part of a methodeutic of necessary reasoning (Peirce 1908, MS 200-201).

Peirce also considered the study of the properties of different logical and mathematical notations and symbolisms as belonging to the department of methodeutic. In this respect, he coined the maxim of the ethics of terminology and of notation:

The person who introduces a conception into science has both the right and the duty of prescribing a terminology and a notation for it; and his terminology and notation should be followed except so far as it may prove positively and seriously disadvantageous to the progress of science. If a slight modification is sufficient to remove the objection, a much greater one should be avoided. (MS 530, 1902)

Induction too has its methodological side. The methods of the three classes of induction are all based on “samples,” and they all presuppose that the samples are representative of the class from which they are drawn: methodeutic should therefore teach methods of producing fair samplings. Peirce’s own experimental work is exemplary in this respect, in that it developed new statistical methods to ensure that truly randomized samples are achieved and that fully blinded testing conditions are secured. He emphasizes the method of predesignation, which prescribes that the characters with respect to which the class is to be sampled be chosen beforehand, so that the sampler is not influenced by any chance agreement among the members of the sample (see Goudge 1946).

Other things Peirce considers to pertain to methodeutic include the principles of definition, the methods of classification in general, and the doctrine of the clearness of ideas.

Peirce’s logic, conceived as semeiotic, characterizes a broad philosophical, methodological and scientific area of investigation. Although the present article has presented a number of developments in Peirce’s studies in deductive logic, the deductive part is only a fraction of the wider project of semeiotic, the theory and philosophy of signs, and the logic of science. From a contemporary perspective, deductive, formal, and mathematical logic may have become the mainstay of logic as such, but for Peirce other areas of logic, such as speculative grammar and the critics and methodeutic of abduction and induction, are at least as important as the deductive part of logic.

3. Peirce’s Logic in Historical Perspective

Peirce’s algebraic work in formal logic influenced Ernst Schröder (1841–1902), who drew heavily upon Peirce’s work in the three volumes of his Vorlesungen über die Algebra der Logik (Schröder 1890–1905). Peirce also successfully initiated a school in logic during his Johns Hopkins period (1879–1884), whose most evident manifestation is the richness and originality of the papers contained in the Studies in Logic (Peirce 1883). For the main part of his career, Peirce had been in contact and correspondence with the most prominent logicians, mathematicians and scientists of the time, and his works appeared in leading scientific journals and proceedings.

All these facts notwithstanding, the reception of Peirce’s deductive logic has been strangely erratic, even in the early days. Especially in his later period (1892–1914), Peirce worked virtually alone in an adverse environment and without much intellectual and material support. It is true that the recognition of his contributions has suffered from a long-term unavailability of his vast Nachlass of over 100,000 surviving pages of manuscripts and correspondence. In some cases at least, the explanation may be found in the unprecedented technical and mathematical standard and rigor characterizing his work. But what is certainly a chief reason behind the general neglect of Peirce’s logic is the rise, at the end of the 19th century, of what has later been named the Frege-Russell tradition in logic.

The historiography of logic seems to have accepted the idea, initially promoted by Bertrand Russell and subsequently canonized by the historian of logic Jean van Heijenoort (1912–1986), of a “Fregean revolution” in logic. In this narrative, modern mathematical logic (also deceptively called symbolic logic) has replaced traditional or Aristotelian logic. According to such a picture, the work of the “algebraists” (including Boole, De Morgan, Peirce and Schröder) belongs to the pre-Fregean logical paradigm.

Anellis (2012b) identified seven features of such a “Fregean” revolution: (1) A propositional calculus with a truth-functional definition of connectives, especially the conditional. (2) Decomposition of propositions into function and argument instead of into subject and predicate. (3) A quantification theory, based on a system of axioms and inference rules. (4) Definitions of infinite sequence and natural number in terms of logical notions (that is, the logicization of mathematics). (5) Presentation and clarification of the concept of a formal system. (6) Relevance and use of logic for philosophical investigations (especially for the philosophy of language). (7) Separation of singular propositions, such as “Socrates is mortal,” from universal propositions, such as “All Greeks are mortal.” All these characteristics, Anellis argued, can be found in Peirce’s work, which therefore falls within the parameters of van Heijenoort’s conception of the Fregean revolution and the definition of mathematical logic. One also needs to remember that there are many characteristics of Peirce’s logic and philosophy of logic, vitally important to his logical vision, that either add to, modify, or reject those that have been taken to typify the Fregean tradition. What may be ill-named a Fregean revolution is found in a different, and perhaps more penetrating and consequential, shape in Peirce’s work.

Peirce and Frege discovered quantification theory around the same time (1879–1883). Frege’s work was at the time largely ignored. Russell credited Frege a posteriori with having founded modern logic in the Begriffsschrift (Frege 1879). However, while Frege’s notation was hardly ever used, the Peirce-Schröder notation was widely adopted by others. The important results of Löwenheim and Skolem at the beginning of the 20th century were presented in the Peirce-Schröder system without any trace of influence by Frege or Russell. Peano’s use of the existential and universal quantifiers derives from Schröder and Peirce, not from Frege. Unlike Frege, Peirce recognized the utmost importance of dependent quantifiers and experimented with that idea in various ways in the algebra of logic and in existential graphs, and he proposed new systems and dimensions of quantification that involve independent quantification (MS 430). Peirce’s overall influence upon the development of modern logic was considerable, though its nature and scope remained ill-understood for a long time (Putnam 1982; Dipert 1995; Pietarinen 2015a).

Peirce’s philosophy of logic had no better fate. Aside from Josiah Royce and especially Lady Victoria Welby, with whom Peirce corresponded on the logic of signs and semiotics during 1903-1910, Peirce’s radical idea of “logic as semeiotic” largely passed unnoticed. In the 1930s Charles Morris took, misleadingly, Peirce’s trivium of speculative grammar, critics and methodeutic to correspond to the division of the study of language into syntax, semantics, and pragmatics (Morris 1938, pp. 21-22). Carnap (1942) adopted Morris’ trichotomy and made it popular. Peirce’s philosophy of signs has since been studied by semioticians, led by the pioneering explorations of Roman Jakobson and Umberto Eco (see Eco 1975; Jakobson 1977; Eco 1984). Other aspects of Peirce’s philosophy of logic, such as the distinction between corollarial and theorematic deduction, his ideas on diagrammatic reasoning, and the evolution of new logical notations and meanings, are gaining the interest not only of logicians and historians of logic, but also of philosophers of science, cognitive scientists, and many other scholars, scientists, artists and practitioners looking for ways to overcome the boundaries of narrow conceptions of logic, reasoning and representation, as well as the outdated 20th-century scientific methodologies that have characterized their respective fields. [See for example the 2014 Peirce Centennial Conference at Lowell as well as the Applying Peirce Conference series at Helsinki in 2007 and 2014, which have brought together scholars and scientists interested in Peirce’s thought from virtually any field of science.]

From the perspective of the history and philosophy of modern logic, it may not be entirely right to talk in strict terms about two traditions in logic, namely the algebraic and the symbolic ones. On the one hand, Peirce’s line of work in the algebra of logic led to the invention of a spectrum of methods in the semantic and model-theoretic tradition, while the logic that Schröder, for example, preferred was to quantify over the entire universe and was thus at bottom a universalist one, thereby sharing the same preference as Frege. On the other hand, Peirce’s continuous search for new notations for the purposes of logical analysis and representation made what others may have considered the subject of symbolic notations really the subject of diagrams and icons. Algebraic notations were for Peirce iconic, and often very graphically so. What mattered to him was to remain clear about the significations of logical signs. Logical signs were to be interpreted in proper contexts and according to the purposes of the investigation at hand. Thus, Peirce’s philosophy of logic stands in stark contrast to purely formal, mathematical and proof-theoretic approaches to logic, which do not care so much for signification. Peirce should accordingly be counted in the pragmatic, rather than just the semantic, tradition in the philosophy of logic and language (Pietarinen 2006; Tiercelin 1991).

The famous van Heijenoort–Hintikka distinction between “logic as calculus” and “logic as a universal medium” is nonetheless instructive here (van Heijenoort 1967; Hintikka 1997; Peckhaus 2004). According to the former view of logic as calculus, methods and languages are many, they are reinterpretable according to the context and purposes at hand, and they admit of many and varying universes as well as modal and intensional considerations. The latter, universalist position means, in contrast, that there is one logic to “rule them all,” and so our thought is bounded by what that logic can express. Peirce fits squarely into the former camp. Here again it is not that all who worked on the algebra of logic would be members of that same camp (Schröder is a counterexample), or that all of those who in the literature have been tagged as formalists would share the universalist presuppositions (David Hilbert may serve as another kind of a counterexample). It may be one of the lessons of Peirce’s pragmaticism and the methodological pluralism which he exercised in his logic that one does not fix in advance what may in the future be considered to fall within the scope of logic.

4. References and Further Reading

  • Peirce’s works
  • 1867. An Improvement in Boole’s Calculus of Logic. Proceedings of the American Academy of Arts and Sciences 7, pp. 249-261.
  • 1870. Description of a Notation for the Logic of Relatives. Memoirs of the American Academy of Arts and Sciences 9, pp. 317-378.
  • 1880a. On the Algebra of Logic.  American Journal of Mathematics 3, pp. 15–57.
  • 1881. On the Logic of Number. American Journal of Mathematics 4, pp. 85-95.
  • 1883 (ed.). Studies in Logic by Members of the Johns Hopkins University. Boston: Little, Brown, and Co. 1883.
  • 1885. On the Algebra of Logic. A Contribution to the Philosophy of Notation. American Journal of Mathematics 7, pp. 197–202.
  • 1897. The Logic of Relatives. The Monist 7, pp. 161–217.
  • 1901-1902. Entries in Dictionary of Philosophy and Psychology, 3 vols, edited by Baldwin, James Mark. Cited as DPP followed by volume and page number.
  • 1906. Prolegomena to an Apology for Pragmaticism. The Monist 16, pp. 492–546.
  • 1908. Some Amazing Mazes. The Monist 18 (3), pp. 416-464.
  • 1931–1966. The Collected Papers of Charles S. Peirce, 8 vols., ed. by Hartshorne, C, Weiss, P. and Burks, A. W. Cambridge: Harvard University Press. Cited as CP followed by volume and paragraph number.
  • 1967. Manuscripts in the Houghton Library of Harvard University, as identified by Richard Robin, “Annotated Catalogue of the Papers of Charles S. Peirce,” Amherst: University of Massachusetts Press, 1967, and in “The Peirce Papers: A supplementary catalogue,” Transactions of the C. S. Peirce Society 7 (1971): 37–57. Cited as MS followed by manuscript number and, when available, page number.
  • 1976. The New Elements of Mathematics by Charles S. Peirce, 4 vols., ed. by Eisele, C. The Hague: Mouton. Cited as NEM followed by volume and page number.
  • 1982 – …. Writings of Charles S. Peirce: A Chronological Edition, 7 vols., ed. by Moore, E. C., Kloesel, C. J. W. et al. Bloomington: Indiana University Press. Cited as W followed by volume and page number.
  • 2010. Philosophy of Mathematics: Selected Writings, ed. by M. E. Moore, Bloomington and Indianapolis, IN: Indiana University Press. Cited as PM.
  • 2015. Logic of the Future. Peirce’s Writings on Existential Graphs, ed. by A.-V. Pietarinen, Bloomington: Indiana University Press. Cited as LoF.
  • Other works
  • Anellis, I. 2012a. Peirce’s Truth-Functional Analysis and the Origin of the Truth Table. History and Philosophy of Logic 33, pp. 37–41.
  • Anellis, I. 2012b. How Peircean was the ‘Fregean’ Revolution in Logic? arXiv:1201.0353.
  • Badesa, C. 2004. The Birth of Model Theory: Löwenheim’s Theorem in the Frame of the Theory of Relatives, Princeton: Princeton University Press.
  • Bellucci, F. & Pietarinen, A.-V. (2014). New Light on Peirce’s Concept of Retroduction and Scientific Reasoning, International Studies in the Philosophy of Science 28(2), pp. 1-21.
  • Bellucci, F. & Pietarinen, A.-V. (2015). Existential Graphs as an Instrument of Logical Analysis. Part I: Alpha, to appear.
  • Bellucci, F., Pietarinen, A.-V. & Stjernfelt, F. eds. 2014. Peirce: 5 Questions. VIP/Automatic Press.
  • Boole, G. 1847. The Mathematical Analysis of Logic. Cambridge: Macmillan, Barclay, & Macmillan.
  • Boole, G. 1854. An Investigation of the Laws of Thought. Cambridge: Walton & Maberly.
  • Brady, G. 2000. From Peirce to Skolem. Amsterdam: Elsevier Science.
  • Brent, B. 1987. Charles S. Peirce: Logic and the Classification of the Sciences. Kingston/Montreal: McGill-Queen’s University Press.
  • Burch, R. W. 2011. Peirce’s 10, 28, and 66 Sign-Types: The Simplest Mathematics. Semiotica 184, pp. 93–98.
  • Burks, A. W. 1946. Peirce’s Theory of Abduction. Philosophy of Science 13, pp. 301-306.
  • Carnap, R. 1942. Introduction to Semantics, Cambridge, Mass: MIT Press.
  • Chauviré, Ch. 1994. Logique et Grammaire Pure. Propositions, Sujets et Prédicats Chez Peirce. Histoire Epistémologie Langage 16, pp. 137–175.
  • Cheng, C.-Y. 1969. Peirce’s and Lewis’s Theories of Induction, The Hague: Martinus Nijhoff.
  • Clark, G. 1997. New Light on Peirce’s Iconic Notation for the Sixteen Binary Connectives. In Houser and others 1997, pp. 304-333.
  • Dedekind, R. 1888. Was sind und was sollen die Zahlen. Braunschweig: Vieweg.
  • Dekker, Paul 2001. Dynamics and Pragmatics of ‘Peirce’s Puzzle’, Journal of Semantics 18, pp. 211-241.
  • De Morgan, A. 1847. Formal Logic. London: Taylor and Walton.
  • De Morgan, A. 1860. On the Syllogism IV; and on the Logic of Relations. Transactions of the Cambridge Philosophical Society 10, pp. 331-358.
  • Dipert, R. 1995. Peirce’s Underestimated Place in the History of Logic: A Response to Quine. In Ketner, K. L. ed. Peirce and Contemporary Thought. New York: Fordham University Press, pp. 32-58.
  • Dipert, R. 2006. Peirce’s Deductive Logic: Its Development, Influence, and Philosophical Significance. In: Misak, C. (ed.). The Cambridge Companion to Peirce. Cambridge: Cambridge University Press, pp. 287-324.
  • Eco, U. 1975. Trattato di semiotica generale. Milano: Bompiani.
  • Eco, U. 1984. Semiotica e filosofia del linguaggio. Torino: Einaudi.
  • Fann, K. T. 1970. Peirce’s Theory of Abduction. The Hague: Martinus Nijhoff.
  • Ferriani, M. 1987. Peirce’s Analysis of the Proposition: Grammatical and Logical Aspects. In Ferriani, M. & Buzzetti, D. (eds.), Speculative grammar, universal grammar and philosophical analysis of language. Amsterdam: Benjamins, pp. 149-172.
  • Fisch, M. H. 1982. The Range of Peirce’s Relevance, The Monist 65, pp. 123-141. Reprinted in Fisch 1986, pp. 422-448.
  • Fisch, M. H. 1986. Peirce, Semeiotic and Pragmatism. Ed. by K. L. Ketner and C. J. W. Kloesel, Bloomington: Indiana University Press.
  • Fisch, M. H. & Turquette, A. 1966. Peirce’s Triadic Logic. Transactions of the Charles S. Peirce Society 2, pp.71-85.
  • Forster, P. 1989. Peirce on the Progress and Authority of Science. Transactions of the Charles S. Peirce Society 25, pp. 421–452.
  • Frege, G. 1879. Begriffsschrift: eine der arithmetischen nachgebildete Formelsprache des reinen Denkens. Halle: Louis Nebert.
  • Goudge, T. 1946. Peirce’s Treatment of Induction. Philosophy of Science 7, pp. 56-68.
  • Haack, S. 1993. Peirce and Logicism: Notes Towards an Exposition. Transactions of the Charles S. Peirce Society 29, pp. 33–56.
  • Havenel, J. 2010. Peirce’s Topological Concepts. In Moore 2010, pp. 283-322.
  • Hilpinen, R. 1982. On C. S. Peirceʼs Theory of the Proposition: Peirce as a Precursor of Game-Theoretical Semantics. The Monist 65, pp. 182-188.
  • Hilpinen, R. 1992. On Peirce’s Philosophical Logic: Propositions and Their Objects. Transactions of the Charles S. Peirce Society 28, pp. 467–488.
  • Hilpinen, R. 2004. Peirce’s Logic, in Gabbay, D.M., and J. Woods. 2004. Handbook of the History of Logic. Vol. 3: The Rise of Modern Logic From Leibniz to Frege. Vol. 3. Amsterdam: Elsevier North-Holland, pp. 611-658.
  • Hintikka, J. 1980. C. S. Peirce’s ‘First Real Discovery’ and Its Contemporary Relevance. Monist 63, pp. 304-315.
  • Hintikka, J. 1996. The Place of C. S. Peirce in the History of Logical Theory. In J. Brunning, J. & Forster, P. eds. The Rule of Reason: The Philosophy of Charles Sanders Peirce, Toronto: University of Toronto Press, pp. 13–33.
  • Hintikka, J. 1997. Lingua Universalis vs. Calculus Ratiocinator: An Ultimate Presupposition of Twentieth Century Philosophy. Dordrecht: Kluwer.
  • Hintikka, J. 2011. What the bald man can tell us. In: Biletzky, A. (ed.) Hues of Philosophy: Essays in Memory of Ruth Manor. London: College Publications.
  • Hoffmann, M. 2010. Theoric Transformations. Transactions of the Charles S. Peirce Society 46, pp. 570–590.
  • Houser, N. 1993. On ‘Peirce and Logicism’: A Response to Haack. Transactions of the Charles S. Peirce Society 29, pp. 57–67.
  • Houser, N., Roberts, D., Van Evra, J. eds. 1997. Studies in the Logic of Charles S. Peirce. Bloomington and Indianapolis: Indiana University Press.
  • Jakobson, R. 1977. A Few Remarks on Peirce, Pathfinder in the Science of Language. MLN 92, pp. 1026–1032.
  • Kapitan, T. 1992. Peirce and the Autonomy of Abductive Reasoning. Erkenntnis 37, pp. 1–26.
  • Kapitan, T. 1997. Peirce and the Structure of Abductive Inference. In Houser and others eds. 1997, pp. 477-496.
  • Kempe, A. B. 1886. A Memoir on the Theory of Mathematical Form. Philosophical Transactions of the Royal Society of London 177, pp. 1-70.
  • Ketner, K. L. 1985. How Hintikka Misunderstood Peirce’s Account of Theorematic Reasoning. Transactions of the Charles S. Peirce Society 21, pp. 407–418.
  • Lane, R. 1999. Peirce’s Triadic Logic Revisited. Transactions of the Charles S. Peirce Society 35, pp. 284–311.
  • Lewis, C. I. 1918.  A Survey of Symbolic Logic. Berkeley: University of California Press.
  • Lupher, T. and Adajian, T. eds. 2015. Philosophy of Logic: 5 Questions. Copenhagen: Automatic Press.
  • Ma, Minghui & Pietarinen, A.-V. 2015. A dynamic approach to Peirce’s interrogative construal of abductive reasoning, IFCoLog Journal of Logics and their Applications, in press.
  • Mannoury, G. 1909. Methodologisches und Philosophisches zur Elementar-Mathematik. Haarlem: P. Visser.
  • Merrill, D. D. 1997. Relations and Quantification in Peirce’s Logic, 1870-1885. In Houser and others eds. 1997, pp. 158-172.
  • Merrill, G. H. 1975. Peirce on Probability and Induction. Transactions of the Charles S. Peirce Society 11, pp. 90–109.
  • Moore, M. ed. 2010. New Essays on Peirce’s Mathematical Philosophy. Chicago: Open Court.
  • Morris, C.  W. 1938. Foundations of the Theory of Signs. In Morris, C. 1971. Writings on the General Theory of Signs. The Hague: Mouton.
  • Murphey, M. G. 1961. The Development of Peirce’s Philosophy, Cambridge, Mass: Harvard University Press, 2nd ed. 1993, Indianapolis: Hackett.
  • Paavola, S. 2004. Abduction as a Logic and Methodology of Discovery: The Importance of Strategies. Foundations of Science 9, pp. 267–283.
  • Peano, G. 1889. Arithmetices Principia. Nova Methodo Exposita. Torino: Bocca.
  • Peckhaus, V. 2004. Calculus Ratiocinator vs. Characteristica Universalis? The Two Traditions in Logic, Revisited. History and Philosophy of Logic 25, pp. 3-14.
  • Peirce, B. 1870. Linear Associative Algebra. Lithographed ed., Washington D.C.
  • Pietarinen, A.-V. 2005. Compositionality, Relevance and Peirce’s Logic of Existential Graphs. Axiomathes 15, pp. 513-540.
  • Pietarinen, A.-V. 2006. Signs of Logic: Peircean Themes on the Philosophy of Language, Games and Communication. Dordrecht: Springer.
  • Pietarinen, A.-V. 2011. Existential Graphs: What a Diagrammatic Logic of Cognition Might Look Like. History and Philosophy of Logic 32, pp. 265–281.
  • Pietarinen, A.-V. 2013. Logical and Linguistic Games from Peirce to Grice to Hintikka (with comments by J. Hintikka). Teorema 33, pp. 121-136.
  • Pietarinen, A.-V. 2015a. Exploring the Beta Quadrant. Synthese 192, pp. 941-970.
  • Pietarinen, A.-V. ed. 2015b. Two Papers on Existential Graphs by Charles S. Peirce: 1. Recent Developments of Existential Graphs and their Consequences for Logic (MS 498, 499, 490, S-36, 1906), 2. Assurance through Reasoning (MS 669, 670, 1911). Synthese 192, pp. 881-922.
  • Pietarinen, A.-V. & Bellucci, F. 2015a. Habits of Reasoning: On the Grammar and Critics of Logical Habits. In D. E. West & M. Anderson eds. Consensus on Peirce’s Concept of Habit: Before and Beyond Consciousness. Dordrecht: Springer.
  • Pietarinen, A.-V. & Bellucci, F. 2015b. The Iconic Moment: Towards a Peircean Theory of Diagrammatic Imagination. In J. Redmond, A. N. Fernàndez, O. Pombo eds. Epistemology, Knowledge, and the Impact of Interaction, Dordrecht: Springer.
  • Prior, A. N. 1958. Peirce’s Axioms for Propositional Calculus. The Journal of Symbolic Logic 23, pp. 135–136.
  • Prior, A. N. 1964. The Algebra of the Copula. In Moore, E. & Robin, R. eds. Studies in the Philosophy of Charles Sanders Peirce. Amherst: The University of Massachusetts Press, pp. 79-94.
  • Putnam, H. 1982. Peirce the Logician. Historia Mathematica 9, pp. 290-301.
  • Roberts, D. D. 1973. The Existential Graphs of Charles S. Peirce. The Hague- Paris: Mouton.
  • Schröder, E. 1890-1905. Vorlesungen über die Algebra der Logik, 3 vols. Leipzig: Teubner.
  • Shields, P. 1981. Charles S. Peirce on the Logic of Number, 2nd ed. Boston: Docent Press, 2012.
  • Shin, S.-J. 2002. The Iconic Logic of Peirce’s Graphs. Cambridge, MA: MIT Press.
  • Short, T. L. 2007. Peirce’s Theory of Signs. Cambridge: Cambridge University Press.
  • Sowa, J. 2006. Peirce’s Contributions to the 21st Century, 14th International Conference on Conceptual Structures, Aalborg, Denmark, July 16-21, Lecture Notes in Computer Science, 4068, pp. 54-69.
  • Stjernfelt, F. 2014. Natural Propositions. The Actuality of Peirce’s Doctrine of Dicisigns, Boston: Docent Press.
  • Sylvester, J. J. 1878. On an Application of the New Atomic Theory to the Graphical Representation of the Invariants and Covariants of Binary Quantics. American Journal of Mathematics 1, pp. 64-104.
  • Tarski, A. 1941. On the Calculus of Relations. Journal of Symbolic Logic 6, pp. 73-89.
  • Tiercelin, C. 1991. Peirce’s Semiotic Version of the Semantic Tradition in Formal Logic. In New Inquiries into Meaning and Truth, N. Cooper & P. Engel eds. Harvester Wheatsheaf: St. Martin’s Press, pp. 187–210.
  • Van Heijenoort, J. 1967. Logic as Calculus and Logic as Language. Synthese 17, pp. 324–330.
  • Walsh, A. 2012. Relations between Logic and Mathematics in the Work of Benjamin and Charles S. Peirce. Boston: Docent Press.
  • Weiss, P. & Burks, A. 1945. Peirce’s Sixty-Six Signs. The Journal of Philosophy 42, pp. 383–388.
  • Zalamea, F. 2012a. Synthetic Philosophy of Contemporary Mathematics, Urbanomic.
  • Zalamea, F. 2012b. Peirce’s Logic of Continuity: A Mathematical and Conceptual Approach, Docent Press.
  • Zellweger, S. 1997. Untapped Potential in Peirce’s Iconic Notation for the Sixteen Binary Connectives. In Houser and others eds. 1997, pp. 334-386.
  • Zeman, J. 1964. The Graphical Logic of Charles S. Peirce, Ph.D. dissertation, University of Chicago.
  • Zeman, J. 1986. Peirceʼs Philosophy of Logic. Transactions of the Charles S. Peirce Society 22, pp. 1-22.

 

Author Information

Francesco Bellucci
Email: bellucci.francesco@gmail.com
Tallinn University of Technology
Estonia

and

Ahti-Veikko Pietarinen
Email: ahti-veikko.pietarinen@ttu.ee
Tallinn University of Technology
Estonia

Thomas S. Kuhn (1922—1996)

Thomas Samuel Kuhn, although trained as a physicist at Harvard University, became an historian and philosopher of science through the support of Harvard’s president, James Conant. In 1962, Kuhn’s renowned The Structure of Scientific Revolutions (Structure) helped to inaugurate a revolution—the 1960s historiographic revolution—by providing a new image of science. For Kuhn, scientific revolutions involved paradigm shifts that punctuated periods of stasis or normal science. Towards the end of his career, however, Kuhn underwent a paradigm shift of his own—from a historical philosophy of science to an evolutionary one.

In this article, Kuhn’s philosophy of science is reconstructed chronologically. To that end, the following questions are entertained: What was Kuhn’s early life and career? What was the road towards Structure? What is Structure? Why did Kuhn revise Structure? What was the road Kuhn took after Structure? At the heart of the answers to these questions is the person of Kuhn himself, especially the intellectual and social context in which he practiced his trade. This chronological reconstruction of Kuhn’s philosophy begins with his work in the 1950s on physical theory in the Lowell lectures and on the Copernican revolution and ends with his work in the 1990s on an evolutionary philosophy of science. Rather than present Kuhn’s philosophy as a finished product, this approach endeavors to capture it in the process of its formation so as to represent it accurately and faithfully.

Table of Contents

  1. Early Life and Career
  2. The Road to Structure
    1. The Lowell Lectures
    2. The Copernican Revolution
    3. The Last Mile to Structure
  3. The Structure of Scientific Revolutions
  4. The Road after Structure
    1. Historical and Historiographic Studies
    2. Metahistorical Studies
    3. Evolutionary Philosophy of Science
  5. Conclusion
  6. References and Further Reading
    1. Kuhn’s Work
    2. Secondary Sources

1. Early Life and Career

Kuhn was born in Cincinnati, Ohio, on 18 July 1922. He was the first of two children born to Samuel L. and Minette (née Stroock) Kuhn, with a brother Roger born several years later. His father was a native Cincinnatian and his mother a native New Yorker. Kuhn’s father, Sam, was a hydraulic engineer, trained at Harvard University and at the Massachusetts Institute of Technology (MIT) prior to World War I. He entered the war and served in the Army Corps of Engineers. After leaving the armed services, Sam returned to Cincinnati for several years before moving to New York to help his recently widowed mother, Setty (née Swartz) Kuhn. Kuhn’s mother, Minette, was a liberally educated person who came from an affluent family.

Kuhn’s early education reflected the family’s liberal progressiveness. In 1927, Kuhn began schooling at the progressive Lincoln School in Manhattan. His early education taught him to think independently, but by his own admission, there was little content to the thinking. He remembered that by the second grade, for instance, he was unable to read proficiently, much to the consternation of his parents.

When Kuhn was beginning the sixth grade, his family moved to Croton-on-Hudson, a small town about fifty miles from Manhattan, and the adolescent Kuhn attended the progressive Hessian Hills School. According to Kuhn, the school was staffed by left-oriented radical teachers, who taught the students pacifism. When he left the school after the ninth grade, Kuhn felt he was a bright and independent thinker. After spending an uninspired year at the Solebury preparatory school in Pennsylvania, Kuhn spent his last two years of high school at the Yale-preparatory Taft School in Watertown, Connecticut. He graduated third in his class of 105 students and was inducted into the National Honor Society. He also received the prestigious Rensselaer Alumni Association Medal.

Kuhn matriculated at Harvard College in the fall of 1940, following in his father’s and uncles’ footsteps. At Harvard, he acquired a better sense of himself socially by participating in various organizations. During his first year, Kuhn took a yearlong philosophy course: in the first semester he studied Plato and Aristotle, and in the second Descartes, Spinoza, Hume, and Kant. He intended to take additional philosophy courses but could not find the time. He did attend several of George Sarton’s lectures on the history of science, but he found them boring.

At Harvard, Kuhn agonized over majoring in either physics or mathematics. After seeking his father’s counsel, he chose physics because of its career opportunities. Interestingly, the attraction of physics and mathematics was their problem-solving traditions. In the fall of his sophomore year, the Japanese attacked Pearl Harbor, and Kuhn expedited his undergraduate education by going to summer school. The physics department focused predominantly on teaching electronics, and Kuhn followed suit.

Kuhn underwent another radical transformation, also during his sophomore year. Although he had been trained as a pacifist, the atrocities perpetrated in Europe during World War II, especially by Hitler, horrified him. Kuhn experienced a crisis, since he was unable to defend pacifism reasonably. The outcome was that he became an interventionist, which was the position of many at Harvard—especially its president, Conant. The episode left a lasting impact upon him. In a Harvard Crimson editorial, Kuhn supported Conant’s effort to militarize the universities in the United States. The editorial came to the attention of the administration, and eventually Conant and Kuhn met.

In the spring of 1943, Kuhn graduated summa cum laude from Harvard College with an S.B. After graduation, he worked for the Radio Research Laboratory, located in Harvard’s biology building, where he conducted research on radar countermeasures under John van Vleck’s supervision. The job procured for Kuhn a deferment from the draft. After a year, he requested a transfer to England and then to the continent, where he worked in association with the U.S. Office of Scientific Research and Development. The trip was Kuhn’s first abroad, and he felt invigorated by the experience. However, Kuhn realized that he did not like radar work, which led him to reconsider whether he wanted to continue as a physicist. But these doubts did not dampen his enthusiasm for or belief in science. During this time, Kuhn had the opportunity to read what he wanted; he read in the philosophy of science, including authors such as Bertrand Russell, P. W. Bridgman, Rudolf Carnap, and Philipp Frank.

After V.E. day in 1945, Kuhn returned to Harvard. As the war abated with the dropping of atomic bombs on Japan, Kuhn activated an earlier acceptance into graduate school and began studies in the physics department. Although Kuhn persuaded the department to permit him to take philosophy courses during his first year, he again chose the pragmatic course and focused on physics. In 1946, Kuhn passed the general examinations and received a master’s degree in physics. He then began dissertation research on theoretical solid-state physics, under the direction of van Vleck. In 1949, Harvard awarded Kuhn a doctorate in physics.

Although Kuhn had high regard for science, especially physics, he was unfulfilled as a physicist and continually harbored doubts during graduate school about a career in physics. He had chosen both a dissertation topic and an advisor to expedite obtaining a degree. But, he was to find direction for his career through Conant’s invitation in 1947 to help prepare a historical case-based course on science for upper-level undergraduates. Kuhn accepted the invitation to be one of two assistants for Conant’s course. He undertook a project investigating the origins of seventeenth-century mechanics, a project that would transform his image of science.

That transformation came, as Kuhn recounted later, on a summer day in 1947 as he struggled to understand Aristotle’s idea of motion in Physics. The problem was that Kuhn tried to make sense of Aristotle’s idea of motion using Newtonian assumptions and categories of motion. Once he realized that he had to read Aristotle’s Physics using assumptions and categories contemporary to when the Greek philosopher wrote it, suddenly Aristotle’s idea of motion made sense.

After this experience, Kuhn realized that he wanted to become a philosopher of science by doing history of science. His interest was not strictly the history of science but philosophy, for he felt that philosophy was the way to truth, and truth was what he was after. To achieve that goal, Kuhn asked Conant to sponsor him as a junior fellow in the Harvard Society of Fellows. Harvard had initiated the society to provide promising young scholars freedom from teaching for three years to develop a scholarly program. Kuhn’s colleagues stimulated him professionally, especially a senior fellow by the name of Willard Quine. At the time, Quine was publishing his critique of the distinction between the analytic and the synthetic, which Kuhn found reassuring for his own thinking.

Kuhn began as a fellow in the fall of 1948, which provided him the opportunity to retool as a historian of science. Kuhn took advantage of the opportunity and read widely over the next year and a half in the humanities and sciences. Just prior to his appointment as a fellow, Kuhn was also undergoing psychoanalysis. This experience allowed him to see other people’s perspectives and contributed to his approach for conducting historical research.

2. The Road to Structure

a. The Lowell Lectures

In 1950, the trustee of the Lowell Institute, Ralph Lowell, invited Kuhn to deliver the 1951 Lowell lectures. In these lectures, Kuhn outlined a conception of science in contrast to the traditional philosophy of science’s conception in which facts are slowly accumulated and stockpiled in textbooks. Kuhn began by assuring his audience that he, as a once practicing scientist, believed that science produces useful and cumulative knowledge of the world, but that traditional analysis of science distorts the process by which scientific knowledge develops. He went on to inform the audience that the history of science could be instructive for identifying the process by which creative science advances, rather than focusing on the finished product promulgated in textbooks. Because textbooks only state the immutable scientific laws and marshal forth the experimental evidence to support those laws, they cover over the creative process that leads to the laws in the first place.

Kuhn then presented an alternative, historical approach to scientific methodology. He claimed that the traditional account, on which Galileo rejected Aristotle’s physics because of his experiments, is a fallacy. Rather, Galileo rejected Aristotelianism as an entire system. In other words, Galileo’s evidence was necessary but not sufficient; it was the Aristotelian system as a whole, including its logic, that was under evaluation. Next, Kuhn proposed an alternative image of science based on the new approach to the history of science. He introduced the notion of conceptual frameworks, and drew from psychology to defend the advancement of science through scientists’ predispositions. These predispositions allow scientists to negotiate a professional world and to learn from their experiences. Moreover, they are important in organizing the scientist’s professional world, and scientists do not dispense with them easily. Change in them represents a foundational alteration in a professional world.

Kuhn argued that although logic is important for deriving meaning and for managing and manipulating knowledge, scientific language, being a natural language, outstrips such formalization. He thereby turned the tables on an important tool of the traditional analysis of science. By revealing the limitations of logical analysis, he showed that logic is necessary but insufficient for justifying scientific knowledge. Logic, then, cannot guarantee the traditional image of science as the progressive accumulation of scientific facts. Kuhn next examined logical analysis in terms of language and meaning. His position was that language is a way of dissecting the professional world in which scientists operate. But there is always ambiguity or overlap in the meaning of terms as that world is dissected. Certainly, scientists attempt to increase the precision of their terms, but not to the point that they can eliminate ambiguity. Kuhn concluded by distinguishing between creative and textbook science.

In the same year as the Lowell lectures, Harvard appointed Kuhn as an instructor, and the following year as an assistant professor. Kuhn’s primary teaching duty was in the general education curriculum, where he taught Natural Sciences 4 along with Leonard Nash. He also taught courses in the history of science, and it was during this time that Kuhn developed a course on the history of cosmology. Kuhn utilized course preparation for scholarly writing projects; for example, he handed out draft chapters of The Copernican Revolution to his classes.

A part of Kuhn’s motivation for developing a new image of science was the misconceptions of science held by the public. He blamed these misconceptions on introductory courses that stressed the textbook image of science as a fixed body of facts. After discussing this state of affairs with friends and with Conant, Kuhn set out to provide students with a more accurate image of science. The key to that image, claimed Kuhn, was science’s history, which displays the creative and dynamic nature of science.

b. The Copernican Revolution

In The Copernican Revolution, Kuhn claimed he had identified an important feature of the revolution, which previous scholars had missed: its plurality. What Kuhn meant by plurality was that scientists have philosophical and even religious commitments, which are important for the justification of scientific knowledge. This stance was anathema to traditional philosophers of science, who believed that such commitments played little—if any—role in the justification of scientific knowledge and relegated them to the discovery process.

Kuhn began his reconstruction of the Copernican revolution by establishing the genuine scientific character of ancient cosmological conceptual schemes, especially the two-sphere cosmology composed of an inner sphere for the earth and an outer sphere for the heavens. For Kuhn, conceptual schemes exhibit three important features: they are comprehensive in terms of scientific predictions, there is no final proof for them, and they are derived from other schemes. Finally, to be successful, conceptual schemes must perform logical and psychological functions. The logical function is expressed in explanatory terms, while the psychological function is expressed in existential terms. Although the logical function of the two-sphere cosmology continued to be problematic, its psychological function afforded adherents a comprehensive worldview that included even religious elements.

The major logical problem with the two-sphere cosmology was the movement and positions of the planets. The conceptual scheme Ptolemy developed in the second century guided research for the next millennium. But problems surfaced with the scheme, and Ptolemy’s successors could correct it only so far with ad hoc modifications. Kuhn asked at this point in the narrative why the Ptolemaic system, given its imperfection, was not overthrown sooner. The answer, for Kuhn, depended on a distinction between the logical and psychological dimensions of scientific revolutions. According to Kuhn, there are logically different conceptual schemes that can organize and account for observations. The difference among these schemes is their predictive power. Consequently, if an observation is made that is not compatible with a prediction, the scheme must be replaced. But before change can occur, there is also the psychological dimension to a revolution.

Copernicus had to overcome not only the logical dimension of the Ptolemaic system but also its psychological dimension. Aristotle had established this latter dimension by wedding the two-sphere cosmology to a philosophical system. Through the Aristotelian notion of motion among the earthly and heavenly spheres, the inner sphere was connected and depended on the outer sphere. The ability to presage future events linked astronomy to astrology. Such an alliance, according to Kuhn, provided a formidable obstacle to change of any kind.

But change began to take place, albeit slowly. From Aristotle to Ptolemy, a sharp distinction arose between the psychological dimensions of cosmology and the mathematical precision of astronomy. By Ptolemy’s time, astronomy was less concerned with the psychological dimensions of data interpretation and more with the accuracy of theoretical prediction. To some extent, this aided Copernicus, since whether the earth moved could be determined by theoretical analysis of the empirical data. But still, the earth as center of the universe gave existential consolation to people. The strands of the Copernican revolution, then, included not only astronomical concerns but also theological, economic, and social ones. Besides the Scholastic tradition, with its impetus theory of motion, other factors also paved the way for the Copernican revolution, including the Protestant revolution, navigation for oceanic voyages, calendar reform, and Renaissance humanism and Neoplatonism.

Copernicus, according to Kuhn, was the immediate inheritor of the Aristotelian-Ptolemaic cosmological tradition and, except for the position of the earth, was closer to that tradition than to modern astronomy. For Kuhn, De Revolutionibus precipitated a revolution and was not the revolution itself. Although the problem Copernicus addressed was the same as for his predecessors, that is, planetary motion, his solution was to revise the mathematical model for that motion by making the earth a planet that moves around the sun. Essentially, Copernicus maintained the Aristotelian-Ptolemaic universe but exchanged the sun for the earth as the universe’s center. Although Copernicus had eliminated major epicycles, he still used minor ones, and the accuracy of his planetary positions was no better than Ptolemy’s. Kuhn concluded that Copernicus did not really solve the problem of planetary motion.

Initially, according to Kuhn, there were only a few supporters of Copernicus’ cosmology. Although the majority of astronomers accepted the mathematical harmonies of De Revolutionibus after its publication in 1543, they rejected or ignored its cosmology. Tycho Brahe, for example, although relying on Copernican harmonies to explain astronomical data, proposed a system in which the earth was still the universe’s center. Essentially, it was a compromise between ancient cosmology and Copernican mathematical astronomy. However, Brahe recorded accurate and precise astronomical observations, which helped to compel others towards Copernicanism, particularly Johannes Kepler, who used their precision to solve the planetary motion problem. The final player Kuhn considered in the revolution was Galileo, who, Kuhn claimed, provided through telescopic observations not proof of but rather propaganda for Copernicanism.

Although astronomers achieved consensus during the seventeenth century, Copernicanism still faced serious resistance from Christianity. The Copernican revolution was completed with the Newtonian universe, which had an impact not only on astronomy but also on other sciences and even non-sciences. For instance, Newton’s universe changed the nature of God to that of a clockmaker. For Kuhn, the Newtonian universe’s impact on disciplines other than astronomy was an example of its fruitfulness. Scientific progress, concluded Kuhn, is not the linear process championed by traditional philosophers of science, in which scientific facts are stockpiled in a warehouse. Rather, it is the repeated destruction and replacement of scientific theories.

The professional reviews of The Copernican Revolution signaled Kuhn’s acceptance into the philosophical and historical communities. His reconstruction of the revolution was considered for the most part scientifically accurate and methodologically appropriate. Reviewers considered the integration of the scientific and the social an advance over other histories that ignored these dimensions of the historical narrative. Although philosophers appreciated the historical dimension of Kuhn’s study, they found its analysis imprecise by their standards. Overall, both the historical and philosophical communities expressed no major objections to the image of science that animated Kuhn’s narrative.

Kuhn’s reconstruction of the Copernican revolution portrayed a radically different image of science than that of traditional philosophers of science. Justification of scientific knowledge was not simply a logical or objective affair but also included non-logical or subjective factors. According to Kuhn, scientific progress is not a clear-sighted linear process aimed directly at the truth. Rather, there are contingencies that can divert and forestall the progress of science. Moreover, Copernicus’ revolution changed the way astronomers and non-astronomers viewed the world. This change in perceiving the world was the result of new sets of challenges, new techniques, and a new hermeneutics for interpreting data.

Besides differing from traditional philosophers of science, Kuhn’s image of science put him at odds with Whig historians of science. These historians underrated ancient cosmologies by degrading them to myth or religious belief. Such a move was often a rhetorical ploy on the part of the victors to enhance the status of the current scientific theory. Only by showing how Aristotelian-Ptolemaic geocentric astronomy was authentic science could Kuhn argue for the radical transformation (revolution) that Copernican heliocentric astronomy involved. Kuhn also asserted that Copernicus’ theory was not accepted simply for its predictive ability, since it was no more accurate than the original conceptual scheme, but because of non-empirical factors, such as the simplicity of Copernicus’ system, in which certain ad hoc modifications for accounting for the orbits of various planets were eliminated.

In 1956, Harvard denied Kuhn tenure because the tenure committee felt his book on the Copernican revolution was too popular in its approach and analysis. A friend of Kuhn’s knew Stephen Pepper, the chair of the philosophy department at the University of California at Berkeley, and told Pepper that Kuhn was looking for an academic position. Pepper’s department was searching for someone to establish a program in the history and philosophy of science. Berkeley eventually offered Kuhn a position in the philosophy department and later asked if he also wanted an appointment in the history department. Kuhn accepted both positions and joined the Berkeley faculty as an assistant professor.

Kuhn found in Stanley Cavell, a member of the philosophy department, a soulmate to replace Nash. Kuhn had met Cavell earlier while they were both fellows at Harvard. Cavell was an ethicist and aesthetician, whom Kuhn found intellectually stimulating. He introduced Kuhn to Wittgenstein’s notion of language games. Besides Cavell, Kuhn developed a professional relationship with Paul Feyerabend, who was also working on the notion of incommensurability.

In 1958, Berkeley promoted Kuhn to associate professor and granted him tenure. Moreover, having completed several historical projects, he was ready to return to the philosophical issues that first attracted him to the history of science. Beginning in the fall of 1958, he spent a year as a fellow at the Center for Advanced Study in the Behavioral Sciences at Stanford, California. What struck Kuhn about the relationships among behavioral and social scientists was their inability to agree on the fundamental problems and practices of their discipline. Although natural scientists do not necessarily have the right answers to their questions, there is an agreement over fundamentals. This difference between natural and social scientists eventually led Kuhn to formulate the paradigm concept.

c. The Last Mile to Structure

Although The Copernican Revolution represented a significant advance in Kuhn’s articulation of a revolutionary theory of science, several issues still needed attention. What was missing from Kuhn’s reconstruction of the Copernican revolution was an understanding of how scientists function on a daily basis, when an impending revolution is not looming. That understanding emerged gradually during the last mile on the road to Structure in terms of three papers written from the mid-fifties to the early sixties.

In the first paper, ‘The function of measurement in modern physical science’, Kuhn challenged the belief that if scientists cannot measure a phenomenon then their knowledge of it is inadequate or not scientific. Part of the reason for Kuhn’s concern over measurement in science was its textbook tradition, which he believed perpetuates a misleading myth about measurement. Kuhn compared the textbook presentation of measurement to a machine in which scientists feed laws and theories along with initial conditions into the machine’s hopper at the top, turn a handle on the side representing logical and mathematical operations, and then collect numerical predictions exiting the machine’s chute in the front. Scientists finally compare experimental observations to theoretical predictions. These measurements thus serve as a test of the theory; this is the confirmation function of measurement.

Kuhn claimed that the above function is not why measurements are reported in textbooks; rather, measurements are reported to give the reader an idea of what the professional community believes is reasonable agreement between theoretical predictions and experimental observations. Reasonable agreement, however, depends upon approximate, not exact, agreement between theory and data and differs from one science to the next. Moreover, external criteria do not exist for determining reasonableness. For Kuhn, the actual function of normal measurement in science is found in its journal articles. That function is neither the invention of novel theories nor the confirmation of older ones. Discovery and exploratory measurements are instead rare in science. The reason is that changes in theories, which require discovery or confirmation, occur during revolutions, which are also quite rare. Once a revolution occurs, moreover, the new theory only exhibits potential for ordering and explaining natural phenomena. The function of normal measurement is to tighten the reasonable agreement between the new theory’s predictions and experimental observations.

The textbook tradition is also misleading in terms of normal measurement’s effects. It claims that theories must conform to quantitative facts. Such facts are not the given but the expected, and the scientist’s task is to obtain them. This obligation to obtain the expected quantitative fact is often the incentive for developing novel technology. Moreover, a well-developed theoretical system is required for meaningful measurement in science. Besides the function of normal or expected measurement, Kuhn also examined the function of extraordinary measurement, which pertains to unexpected results. It is this latter type of measurement that exhibits the discovery and confirmatory functions. When normal scientific practice results consistently in unexpected anomalies, a crisis ensues, and extraordinary measurement often helps to resolve it. Crisis then leads to the invention of new theories. Again, extraordinary measurement plays a critical role in this process. Theory invention in response to quantitative anomalies leads to decisive measurements for judging a novel theory’s adequacy, whereas qualitative anomalies generally lead to ad hoc modifications of theories. Extraordinary measurement allows scientists to choose among competing theories.

Kuhn was moving closer towards a notion of normal science through an analysis of normal measurement, in contrast to extraordinary measurement, in science. His conception of science continued to distance him from traditional philosophers of science. But the notion of normal measurement was not as robust as he needed. Importantly, Kuhn was changing the agenda for philosophy of science from the justification of scientific theories as finished products in textbooks to the dynamic process by which theories are tested and assimilated into the professional literature. A robust notion of normal science was the revolutionary concept he needed to overturn the traditional image of science as an accumulated body of facts.

With the introduction of normal and extraordinary measurement, the step towards the notions of normal and extraordinary science in Kuhn’s revolutionary image of science was imminent. Kuhn worked out those notions in The Essential Tension. He began by addressing the assumption that science advances through unbridled imagination and divergent thinking, which involves identifying multiple avenues by which to solve a problem and determining which one works best. Kuhn acknowledged that such thinking is responsible for some scientific progress, but he proposed that convergent thinking, which limits itself to well-defined, often logical, steps for solving a problem, is also an important means of progress. While revolutions, which depend on divergent thinking, are an obvious means for scientific progress, Kuhn insisted that few scientists consciously design revolutionary experiments. Rather, most scientists engage in normal research, which represents convergent thinking. But occasionally scientists may break with the tradition of normal science and replace it with a new tradition. Science, as a profession, is both traditional and iconoclastic, and the tension between them often creates a space in which to practice it.

Next, Kuhn utilized the term paradigm, while discussing the pedagogical advantages of convergent thinking—especially as displayed in science textbooks. Whereas textbooks in other disciplines include the methodological and conceptual conflicts prevalent within the discipline, science textbooks do not. Rather, science education is the transmission of a tradition that guides the activities of practitioners. In science education, students are taught not to evaluate the tradition but to accept it.

Progress within normal research projects represents attempts to bring theory and observation into closer agreement and to extend a theory’s scope to new phenomena. Given the convergent and tradition-bound nature of science education and of scientific practice, how can normal research be a means for the generation of revolutionary knowledge and technology? According to Kuhn, a mature science provides the background that allows practitioners to identify non-trivial problems or anomalies with a paradigm. In other words, without mature science there can be no revolution.

Kuhn continued to develop the notion of normal research and its convergent thinking in ‘The function of dogma in scientific research’. He began with the traditional image of science as an objective and critical enterprise. Although this is the ideal, the reality is that scientists often already know what to expect from their investigations of natural phenomena. If the expected is not forthcoming, then scientists must struggle to bring what they observe into conformity with what they expect; these expectations are encoded in textbooks as dogmas. Dogmas are critical for the practice of normal science and for advancement in it because they define the puzzles for the profession and stipulate the criteria for their solution.

Kuhn next expanded the range of paradigms to embrace scientific practice in general, rather than simply as a model for research. Specifically, paradigms include not only a community’s previous scientific achievements but also its theoretical concepts, the experimental techniques and protocols, and even the natural entities. In short, they are the community’s body of beliefs or foundations. Paradigms are also open-ended in terms of solving problems. Moreover, they are exclusive in their nature, in that there is only one paradigm per mature science. Finally, they are not permanent fixtures of the scientific landscape, for eventually paradigms are replaceable. Importantly, for Kuhn, when a paradigm replaces another the two paradigms are radically different.

Having done this paradigmatic spadework, Kuhn then discussed the notion of normal scientific research. The process of matching paradigm and nature includes extending and applying the paradigm to expected but also unexpected parts of nature. This means not so much discovering the unknown as explaining the known. Although the dogma paper is only a fragment of the solution to problems associated with the traditional image of science, the complete solution was soon to appear in Structure.

3. The Structure of Scientific Revolutions

In July 1961, Kuhn completed a draft of Structure; and in 1962, it was published as the final monograph in the second volume of Neurath’s International Encyclopedia of Unified Science. Charles Morris was instrumental in its publication and Carnap served as its editor. Structure was not a single publishing event in 1962; rather, it covered the years from 1962 to 1970. After its publication, Kuhn was engrossed for the rest of the sixties in addressing criticisms directed at the ideas contained in it, especially the paradigm concept. During this time, he continued to develop and refine his new image of science. The endpoint was a second edition of Structure that appeared in 1970. The text of the revised edition, however, remained essentially unaltered; only a ‘Postscript—1969’ was added, in which Kuhn addressed his critics.

What Kuhn proposed in Structure was a new image of science. That image differed radically from the traditional one. The difference hinged on a shift from a logical analysis and an explanation of scientific knowledge as finished product to a historical narration and description of the scientific practices by which a community of practitioners produces scientific knowledge. In short, it was a shift from the noun (the product) to the verb (to produce).

According to the traditional image, science is a repository of accumulated facts, discovered by individuals at specific periods in history. One of the central tasks of traditional historians, given this image of science, was to answer questions about who discovered what and when. Even though the task seemed straightforward, many historians found it difficult and doubted whether these were the right kinds of questions to ask concerning science’s historical record. The historiographic revolution in the study of science changed the sorts of questions historians asked by revising the underlying assumptions concerning the approach to reading the historical record. Rather than reading history backwards and imposing current ideas and values on the past, texts are read within their historical context thereby maintaining their integrity. The historiographic revolution also had implications for how to analyze and understand science philosophically. The goal of Structure, declared Kuhn, was to cash out those implications.

The structure of scientific development, according to Kuhn, may be illustrated schematically, as follows: pre-paradigm science → normal science → extraordinary science → new normal science. The step from pre-paradigm science to normal science involves consensus of the community around a single paradigm, where no prior consensus existed. This is the step required for transitioning from immature to mature science. The step from normal science to extraordinary science includes the community’s recognition that the reigning paradigm is unable to account for accumulating anomalies. A crisis ensues, and community practitioners engage in extraordinary science to resolve its anomalies. A scientific revolution occurs with crisis resolution. Once a community selects a new paradigm, it discards the old one and another period of new normal science follows. The revolution or paradigm shift is now complete, and the cycle from normal science to new normal science through revolution is free to occur again.

For Kuhn, the origin of a scientific discipline begins with the identification of a natural phenomenon, which members of the discipline investigate experimentally and attempt to explain theoretically. But, each member of that nascent discipline is at cross-purposes with other members; for each member often represents a school working from different foundations. Scientists, operating under these conditions, share few, if any, theoretical concepts, experimental techniques, or phenomenal entities. Rather, each school is in competition for monetary and social resources and for the allegiance of the professional guild. An outcome of this lack of consensus is that all facts seem equally relevant to the problem(s) at hand and fact gathering itself is often a random activity. There is then a proliferation of facts and hence little progress in solving the problem(s) under these conditions. Kuhn called this state pre-paradigm or immature science, which is non-directed and flexible, providing a community of practitioners little guidance.

To achieve the status of a science, a discipline must reach consensus with respect to a single paradigm. This is realized when, during the competition involved in pre-paradigm science, one school makes a stunning achievement that catches the professional community’s attention. The candidate paradigm elicits the community’s confidence that the problems are solvable with precision and in detail. The community’s confidence in a paradigm to guide research is the basis for the conversion of its members, who now commit to it. After paradigm consensus, Kuhn claimed that scientists are in the position to commence with the practice of normal science. The prerequisite of normal science then includes a commitment to a shared paradigm that defines the rules and standards by which to practice science. Whereas pre-paradigm science is non-directed and flexible, normal or paradigm science is highly directed and rigid. Because of its directedness and rigidity, normal scientists are able to make the progress they do.

The paradigm concept loomed large in Kuhn’s new image of science. He defined the concept in terms of the community’s concrete achievements, such as Newtonian mechanics, which the professional can commonly recognize but cannot fully describe or explain. A paradigm is certainly not just a set of rules or algorithms by which scientists blindly practice their trade. In fact, there is no easy way to abstract a paradigm’s essence or to define its features exhaustively. Moreover, a paradigm defines a family resemblance, à la Wittgenstein, of problems and procedures for solving problems that are part of a single research tradition.

Although scientists rely, at times, on rules to guide research, these rules do not precede paradigms. Importantly, Kuhn was not claiming that rules are unnecessary for guiding research but rather that they are not always sufficient, either pedagogically or professionally. Kuhn compared the paradigm concept to Polanyi’s notion of tacit knowledge, in which knowledge production depends on the investigator’s acquisition of skills that do not reduce to methodological rules and protocols.

As noted above, Newtonian mechanics represents an example of a Kuhnian paradigm. The three laws of motion comprising it provided the scientific community with the resources to investigate natural phenomena in terms of both precision and predictability. In terms of precision, Newtonian mechanics allowed physicists to measure and explain accurately—with clockwork exactitude—the motion not only of celestial but also terrestrial bodies. With respect to prediction, physicists used the Newtonian paradigm to determine the potential movement of heavenly and earthly bodies. Thus, Newtonian mechanics qua paradigm equipped physicists with the ability to explain and manipulate natural phenomena. In sum, it became a way of viewing the world.

According to Kuhn, a paradigm allows scientists to ignore concerns over a discipline’s fundamentals and to concentrate on solving its puzzles—as the Newtonian paradigm permitted physicists to do for several centuries. It not only guides scientists in terms of identifying soluble puzzles, but it also prevents scientists from tackling insoluble ones. Kuhn compared paradigms to maps that guide and direct the community’s investigations. Only when a paradigm guides the community’s activities is scientific advancement as cumulative progress possible.

The activity of practitioners engaged in normal science is paradigm articulation and extension to new areas. Indeed, the Newtonian paradigm was adapted even for medicine. When a new paradigm is established, it solves only a few critical problems that faced the community. But, it does offer the promise for solving many more problems. Much of normal science involves mopping up, in which the community forces nature into a conceptually rigid framework—the paradigm. Rather than being dull and routine, however, such activity, according to Kuhn, is exciting and rewarding and requires practitioners who are creative and resourceful.

Normal scientists are not out to make new discoveries or to invent new theories outside the paradigm’s aegis. Rather, they are involved in using the paradigm to understand nature precisely and in detail. From the experimental end of this task, normal scientists go to great pains to increase the precision and reliability of their measurements and facts. They are also involved in closing the gap between observations and theoretical predictions, and they attempt to clarify ambiguities left over from the paradigm’s initial adoption. They also strive to extend the scope of the paradigm by including phenomena not heretofore investigated. Much of this activity requires exploratory investigation, in which normal scientists make discoveries that are novel but anticipated vis-à-vis the paradigm. To solve these experimental puzzles often requires considerable technological ingenuity and innovation on the part of the scientific community. As Kuhn notes, Atwood’s machine, developed almost a century after Newton, is a good illustration of this.

Besides experimental puzzles, there are also the theoretical puzzles of normal science, which obviously mirror the types of experimental puzzles. Normal scientists conduct theoretical analyses to enhance the match between theoretical predictions and experimental observations, especially in terms of increasing the paradigm’s precision and scope. Again, just as experimental ingenuity is required so is theoretical ingenuity to explain natural phenomena successfully.

Normal science, according to Kuhn, is puzzle-solving activity, and its practitioners are puzzle solvers and not paradigm testers. The paradigm’s power over a community of practitioners is that it can transform seemingly insoluble problems into soluble ones through the practitioner’s ingenuity and skill. Besides the assured solution, Kuhn’s paradigm concept also involved rules of the puzzle-solving game not in a narrow sense of algorithms but in a broad sense of viewpoints or preconceptions. Besides these rules of the game, as it were, there are also metaphysical commitments, which inform the community as to the types of natural entities, and methodological commitments, which inform the community as to kinds of laws and explanations. Although rules are often necessary for normal scientific research, they are not always required. Normal science can proceed in the absence of such rules.

Although scientists engaged in normal science do not intentionally attempt to make unexpected discoveries, such discoveries do occur. Paradigms are imperfect, and rifts in the match between paradigm and nature are inevitable. For Kuhn, discoveries not only occur in terms of new facts, but there is also invention in terms of novel theories. Both the discovery of new facts and the invention of novel theories begin with anomalies, which are violations of paradigm expectations during the practice of normal science. Anomalies can lead to unexpected discoveries. For Kuhn, unexpected discoveries involve complex processes that include the intertwining of both new facts and novel theories. Facts and theories go hand-in-hand, for such discoveries cannot be made by simple inspection. Because discoveries depend upon the intertwining of observations and theories, the discovery process takes time for the conceptual integration of the novel with the known. Moreover, that process is complicated by the fact that novelties are often resisted due to prior expectations. Because of allegiance to a paradigm, scientists are loath to abandon it simply because of an anomaly or even several anomalies. In other words, anomalies are generally not counter-instances that falsify a paradigm.

Just as anomalies are critical for the discovery of new facts or phenomena, so they are essential for the invention of novel theories. Although facts and theories are intertwined, the emergence of novel theories is the outcome of a crisis. The crisis is the result of the paradigm’s breakdown, its inability to provide solutions to its anomalies. The community then begins to harbor questions about the ability of the paradigm to guide research, which has a profound impact upon it. The chief characteristic of a crisis is the proliferation of theories. As members of a community in crisis attempt to resolve its anomalies, they offer more and varied theories. Interestingly, the anomalies responsible for the crisis are not necessarily new; they may have been present all along. This helps to explain why such anomalies lead to a period of crisis in the first place: the paradigm promised their resolution but was unable to fulfill its promise. The overall effect is a return to a situation very similar to pre-paradigm science.

Closure of a crisis occurs in one of three possible ways, according to Kuhn. First, on occasion the paradigm proves sufficiently robust to resolve the anomalies and to restore the practice of normal science. Second, even the most radical methods may be unable to resolve the anomalies; under these circumstances, the community tables them for future investigation and analysis. Third, the crisis is resolved with the replacement of the old paradigm by a new one, but only after a period of extraordinary science.

Kuhn stressed that the initial response of a community in crisis is not to abandon its paradigm. Rather, its members make every effort to salvage it through ad hoc modifications until the anomalies can be resolved, either theoretically or experimentally. The reason for this strong allegiance, claimed Kuhn, is that a community must first have an alternative candidate to take the original paradigm’s place. For science, at least normal science, is possible only with a paradigm, and to reject it without a substitute is to reject science itself, which reflects poorly on the community and not on the paradigm. Moreover, a community does not reject a paradigm simply because of a fissure in the paradigm-nature fit. Kuhn’s aim was to reject a naïve Popperian falsificationism in which single counter-instances are sufficient to reject a theory. In fact, he turned the tables and contended that counter-instances are essential for the practice of vibrant normal science. Although the goal of normal science is not necessarily to generate counter-instances, normal science practice does provide the occasion for their possible occurrence. Normal science, then, serves as an opportunity for scientific revolutions. If there are no counter-instances, reasoned Kuhn, scientific development comes to a halt.

The transition from normal science through crisis to extraordinary science involves two key events. First, the paradigm’s boundaries become blurred when faced with recalcitrant anomalies; and, second, its rules are relaxed leading to proliferation of theories and ultimately to the emergence of a new paradigm. Often relaxing the rules allows practitioners to see exactly where the problem is and how to solve it. This state has tremendous impact upon a community’s practitioners, similar to that during pre-paradigm science. Extraordinary scientists, according to Kuhn, behave erratically—because scientists are trained under a paradigm to be puzzle-solvers, not paradigm-testers. In other words, they are not trained to do extraordinary science and must learn as they go. For Kuhn, this type of behavior is more open to psychological than logical analysis. Moreover, during periods of extraordinary science practitioners may even examine the discipline’s philosophical foundations. To that end, they analyze their assumptions in order to loosen the old paradigm’s grip on the community and to suggest alternative approaches to the generation of a new paradigm.

Although the process of extraordinary science is convoluted and complex, a replacement paradigm may emerge suddenly. Often the source of its inspiration is rooted in the practice of extraordinary science itself, in terms of the interconnections among various anomalies. Finally, whereas normal science is a cumulative process, adding one paradigm achievement to the next, extraordinary science is not; rather, it is like—using Herbert Butterfield’s analogy—grabbing hold of a stick’s other end. That other end of the stick is a scientific revolution.

The transition from extraordinary science to a new normal science represents a scientific revolution. According to Kuhn, a scientific revolution is a non-cumulative episode in which a newer paradigm replaces an older one, either partially or completely. It can come in two sizes: a major revolution, such as the shift from a geocentric to a heliocentric universe, or a minor revolution, such as the discovery of X-rays or oxygen. But whether big or small, all revolutions have the same structure: generation of a crisis through irresolvable anomalies and establishment of a new paradigm that resolves the crisis-producing anomalies.

Because of the extreme positions taken by participants in a revolution, opposing camps often become galvanized in their positions, and communication between them breaks down and discourse fails. The ultimate source for the establishment of a new paradigm during a crisis is community consensus, that is, when enough community members are convinced by persuasion and not simply by empirical evidence or logical analysis. Moreover, to accept the new paradigm, community practitioners must be assured that there is no chance for the old paradigm to solve its anomalies.

Persuasion loomed large in Kuhn’s scientific revolutions because the new paradigm solves the anomalies the old paradigm could not. Thus, the two paradigms are radically different from each other, often with little overlap between them. For Kuhn, a community can only accept the new paradigm if it considers the old one wrong. The radical difference between old and new paradigms, such that the old cannot be derived from the new, is the basis of the incommensurability thesis. In essence, there is no common measure or standard for the two paradigms. This is evident, claimed Kuhn, when looking at the meaning of theoretical terms. Although the terms from an older paradigm can be compared to those of a newer one, the older terms must be transformed with respect to the newer ones. But, there is a serious problem with restating the old paradigm in transformed terms. The older, transformed paradigm may have some utility, for example pedagogically, but a community cannot use it to guide its research. Like a fossil, it reminds the community of its history but it can no longer direct its future.

The establishment of a new paradigm resolves a scientific revolution and issues forth a new period of normal science. With its establishment, Kuhn’s new image of a mature science comes full circle. Only after a period of intense competition among rival paradigms does the community choose a new paradigm, and scientists once again become puzzle-solvers rather than paradigm-testers. The resolution of a scientific revolution is not a straightforward process that depends only upon reason or evidence. Part of the problem is that proponents of competing paradigms cannot agree on the relevant evidence or proof, or even on the relevant anomalies that require resolution, since their paradigms are incommensurable.

Another factor that leads to difficulties in resolving scientific revolutions is that communication among members of a community in crisis is only partial. This results from the new paradigm borrowing theoretical terms, concepts, and laboratory protocols from the old paradigm. Although the two paradigms share borrowed vocabulary and technology, the new paradigm gives them new meanings and uses. The net result is that members of competing paradigms talk past one another. Moreover, the change in paradigms is not a gradual process in which different parts of the paradigm are changed piecemeal; rather, the change must occur as a whole and all at once. Convincing scientists to make such a wholesale transformation takes time.

How then does one segment of the community convince or persuade another to switch paradigms? Members who worked for decades under the old paradigm may never accept the new one. Rather, it is often the younger members who accept the new paradigm through something like a religious conversion. According to Kuhn, faith is the basis for conversion, especially faith in the potential of the new paradigm to solve future puzzles. By invoking the terms conversion and faith, Kuhn was not implying that arguments and reason are unimportant in a paradigm shift. Indeed, the most common reason for accepting a new paradigm is that it solves the anomalies the old paradigm could not. However, Kuhn’s point was that argument and reason alone are insufficient. Aesthetic or subjective factors also play an important role in a paradigm shift, since the new paradigm solves only a few, but critical, anomalies. These factors weigh heavily in the shift initially by reassuring community members that the new paradigm represents the discipline’s future.

From the resolution of revolutions, Kuhn made several important philosophical points concerning the principles of verification and falsification. As Kuhn acknowledged, philosophers no longer search for absolute verification, since no theory can be tested exhaustively; rather, they calculate the probability of a theory’s verification. According to probabilistic verification, every imaginable theory must be compared with the others vis-à-vis the available data. The problem in terms of Kuhn’s new image of science is that a theory is tested with respect to a given paradigm, and such a restriction precludes access to every imaginable theory. Moreover, Kuhn rejected falsification by single instances because no paradigm resolves every problem facing a community; under such a standard, no paradigm would ever be accepted. For Kuhn, the process of verification and falsification must accommodate the imprecision associated with the theory-fact fit.

An interesting feature of scientific revolutions, according to Kuhn, is their invisibility. What he meant by this is that in the process of writing textbooks, popular scientific essays, and even philosophy of science, the path to the current paradigm is sanitized to make it appear as if it was in some sense born mature. Disguising a paradigm’s history is an outcome of a belief about scientific knowledge that considers it invariable and its accumulation linear. This disguising serves the winner of the crisis by establishing its authority, especially as a pedagogical aid for indoctrinating students into a community of practitioners. Another important effect of a revolution, related to a paradigm shift, is a shift in the community’s image of science. The change in science’s image should be no surprise, since the prevailing paradigm defines science. Change that paradigm and science itself changes, at least in how it is practiced. In other words, the shift in science’s image is a result of a change in the community’s standards for what constitutes its puzzles and its puzzles’ solutions. Finally, revolutions transform scientists from practitioners of normal science, who are puzzle-solvers, to practitioners of extraordinary science, who are paradigm-testers. Besides transforming science, revolutions also transform the world that scientists inhabit and investigate.

One of the major impacts of a scientific revolution is a change of the world in which scientists practice their trade. Kuhn’s world-changes thesis, as it has become known, is certainly one of his most radical and controversial ideas, besides the associated incommensurability thesis. The issue is how far, ontologically, the change goes, or whether it is simply an epistemological ploy to reinforce the comprehensive effects of scientific revolutions. In other words, does the world really change, or simply the worldview, that is, one’s perspective on or perception of the world? For Kuhn, the answer relied not on a logical or even a philosophical analysis of the change but rather on a psychological one.

Kuhn analyzed changes in worldview by analogizing them to a gestalt switch, for example, the duck-rabbit. Although the gestalt analogy is suggestive, it is limited to perceptual changes and says little about the role of previous experience in such transformations. Previous experience is important because it influences what a scientist sees when making an observation. Moreover, with a gestalt switch, the person can stand above or outside of it, acknowledging with certainty that one sees now a duck or now a rabbit. Such an independent perspective, which amounts to an authoritative stance, is not available to the community of practitioners; there is no answer sheet, as it were. Because the community’s access to the world is limited by what it can observe, any change in what is observed has important consequences for the nature of what is observed, that is, the change has ontological significance.

Thus, for Kuhn, the change revolution brings about is more than simply seeing or observing a different world; it also involves working in a different world. The perceptual transformation is more than reinterpretation of data. For, data are not stable but they too change during a paradigm shift. Data interpretation is a function of normal science, while data transformation is a function of extraordinary science. That transformation is often a result of intuitions. Moreover, besides a change in data, revolutions change the relationships among data. Although traditional western philosophy has searched for three centuries for stable theory-neutral data or observations to justify theories, that search has been in vain. Sensory experience occurs through a paradigm of some sort, argued Kuhn, even articulations of that experience. Hence, no one can step outside a paradigm to make an observation; it is simply impossible given the limits of human physiology.

Kuhn then took on the nature of scientific progress. For normal science, progress is cumulative in that the solutions to puzzles form a repository of information and knowledge about the world. This progress is the result of the direction a paradigm provides a community of practitioners. Importantly, the progress achieved through normal science, in terms of the information and knowledge, is used to educate the next generation of scientists and to manipulate the world for human welfare. Scientific revolutions change all that. For Kuhn, revolutionary progress is not cumulative but non-cumulative.

What, then, does a community of practitioners gain by going through a revolution or paradigm shift? Has it made any kind of progress in its rejection of a previous paradigm and the fruit that paradigm yielded? Of course, the victors of the revolution are going to claim that progress was made after the revolution. To do otherwise would be to admit that they were wrong. Rather, advocates of the new normal science are going to do everything they can to ensure that their winning paradigm is seen as pushing forward a better understanding of the world. The progress achieved through a revolution is two-fold, according to Kuhn. The first is the successful solution of anomalies that a previous paradigm could not solve. The second is the promise to solve additional problems or puzzles that arise from these anomalies.

But has the community gotten closer to the truth, that is, to verisimilitude, by going through a revolution? According to Kuhn, the answer is no. For Kuhn, progress in science is not activity directed towards some goal like truth. Rather, scientific progress is evolutionary. Just as natural selection operates during biological evolution in the emergence of a new species, so community selection during a scientific revolution functions similarly in the emergence of a new theory. And, just as species are adapted to their environments, so theories are adapted to the world. Kuhn had no answer to the question of why this should be, other than that the world and the community that investigates it exhibit unique features. What these features are, Kuhn did not know, but he concluded that the new image of science he had proposed would resolve these problems, like a new paradigm after a scientific revolution. He invited the next generation of philosophers of science to join him in a new philosophy of science incommensurate with its predecessor.

The reaction to Kuhn’s Structure was at first congenial, especially among historians of science, but within a few years it turned critical, particularly among philosophers. Critics charged him with irrationalism and epistemic relativism. Although he felt the reviews of Structure were good, his chief concerns were the tags of irrationalism and relativism, at least a pernicious kind of relativism. Kuhn believed the charges were inaccurate, simply because he maintained that science does not progress toward a predetermined goal; rather, as in evolutionary change, one theory replaces another through a better fit between theory and nature vis-à-vis competitors. Moreover, he believed that Darwinian evolution was the correct framework for discussing science’s progress, but he felt no one took it seriously.

On 13 July 1965, Kuhn participated in an International Colloquium in the Philosophy of Science, held at Bedford College in London. The colloquium was organized jointly by the British Society for the Philosophy of Science and by the London School of Economics and Political Science. Kuhn delivered the initial paper comparing his and Karl Popper’s conceptions of the growth of scientific knowledge. John Watkins then delivered a paper criticizing Kuhn’s notion of normal science, with Popper chairing the session. Popper also presented a paper criticizing Kuhn, as did several other members of the philosophy of science community, including Stephen Toulmin, L. Pearce Williams, and Margaret Masterman, who identified twenty-one senses of Kuhn’s use of paradigm in Structure. Masterman concluded her paper inviting others to join in clarifying Kuhn’s paradigm concept.

Kuhn himself took up Masterman’s challenge and clarified the paradigm concept in the second edition of Structure, particularly in its ‘Postscript—1969’. To that end, he divided paradigm into disciplinary matrix and exemplars. The former represents the milieu of the professional practice, consisting of symbolic generalizations, models, and values, while the latter represents solutions to concrete problems that a community accepts as paradigmatic. In other words, exemplars serve as templates for solving problems or puzzles facing the scientific community and thereby for advancing the community’s scientific knowledge. For Kuhn, scientific knowledge is not localized simply within theories and rules; rather, it is localized within exemplars. The basis for an exemplar to function in puzzle solving is the scientist’s ability to see the similarity between a previously solved puzzle and a currently unsolved one.

In the early sixties, van Vleck invited Kuhn to direct a project collecting materials on the history of quantum mechanics. In August 1960, Hunter Dupree, Charles Kittel, Kuhn, John Wheeler, and Harry Wolff, met in Berkeley to discuss the project’s organization. Wheeler next met with Richard Shryock and a joint committee of the American Physical Society and the American Philosophical Society on the History of Theoretical Physics in the Twentieth Century was formed to sponsor and develop the project. The project lasted for three years, with the first and last years of the project conducted in Berkeley and the middle year in Europe. The National Science Foundation funded the project.

The project led to a publication, by John Heilbron and Kuhn, on the origins of the Bohr atom. They provided a revisionist narrative of Bohr’s path to the quantized atom, beginning with his 1911 doctoral dissertation and concluding with his 1913 three-part paper on atomic structure. The intrigue of this historical study was that within a six-week period in mid-1912 Bohr went from little interest in models of the atom to producing a quantized model of Ernest Rutherford’s atom and applying that model to several perplexing problems. The authors explored Bohr’s sudden interest in atomic models. They proposed that his interest stemmed from specific problems, which guided Bohr in terms of both his reading and research toward the potential of atomic structure for solving them. The solutions to those problems resulted from what Heilbron and Kuhn called the February 1913 transformation in Bohr’s research. What initiated the transformation, claimed the authors, was that Bohr had read, a few months earlier, J.W. Nicholson’s papers on the application of Max Planck’s constant to generate an atomic model. Although Nicholson’s model was incorrect, it led Bohr in the right direction. Then in February 1913, Bohr, in a conversation with H.M. Hansen, obtained the last piece of the puzzle. After the transformation, Bohr completed the atomic model project within the year.

Besides completing a draft of Structure in 1961, Kuhn was made full professor at Berkeley, but only in the history department. Members of the philosophy department voted to deny him promotion in their department, a denial that angered and hurt Kuhn tremendously. Princeton University made Kuhn an offer to join its faculty while he was in Europe. The university had recently inaugurated a history and philosophy of science program. The program’s chair was Charles Gillispie and its staff included John Murdoch, Hilary Putnam, and Carl Hempel. Upon returning to the United States in 1963, Kuhn visited Princeton. He decided to accept the offer and joined its faculty in 1964. He became the program’s director in 1967 and the following year Princeton appointed him the Moses Taylor Pyne Professor of History. As the sixties ended, Structure was becoming increasingly popular, especially among student radicals who believed it liberated them from the tyranny of tradition.

4. The Road after Structure

In 1979, Kuhn moved to M.I.T.’s Department of Linguistics and Philosophy. In 1983, he was appointed the Laurance S. Rockefeller Professor of Philosophy. At M.I.T., he took a linguistic turn in his thinking, reflecting his new environment, which had a major impact on his subsequent work, especially on the incommensurability thesis.

Structure’s success not only established the historiographic revolution in the study of science, both historically and philosophically, in what came to be called the discipline of history and philosophy of science, but also supported the rise of science studies in general and of the sociology and anthropology of science in particular, especially the sociology of scientific knowledge. Kuhn rejected both these trajectories often attributed to Structure in favor of what he called historical philosophy of science. He conducted, as he categorized his work in The Essential Tension, either historical studies of science and their historiographic implications, or metahistorical studies and their philosophical implications. In other words, his scholarly work was either historical or philosophical.

a. Historical and Historiographic Studies

Kuhn’s final major historical study was on Planck’s black-body radiation theory and the origins of quantum discontinuity. The transition from classical physics, in which particles pass through intermediate energy stages, to quantum physics, in which energy change is discontinuous, is traditionally attributed to Planck’s 1900 and 1901 quantum papers. According to Kuhn, this traditional account was inaccurate, and the transition was initiated by Albert Einstein’s and Paul Ehrenfest’s independent 1906 quantum papers. Kuhn’s realization of this inaccuracy was similar to the enlightenment he experienced when struggling to make sense of Aristotle’s notion of mechanical motion. His initial epiphany occurred while reading Planck’s 1895 paper on black-body radiation. Through that experience, he realized that Planck’s 1900 and 1901 quantum papers were not the initiation of a new theory of quantum discontinuity; rather, they represented Planck’s effort to derive the black-body distribution law from classical statistical mechanics. Kuhn concluded the study with an analysis of Planck’s second black-body theory, first published in 1911, in which Planck used the notion of discontinuity in its derivation. Contrary to the traditional position, which claimed the second theory represents a regression on Planck’s part to classical physics, Kuhn argued that it represents the first time Planck incorporated into his theoretical work a theory in which he was not completely confident.

In the black-body radiation and quantum discontinuity study, Kuhn did not use the terms paradigm, normal science, anomaly, crisis, or incommensurability, which he had championed in Structure. Critics, especially within the history and philosophy of science discipline, were disappointed. Kuhn bemoaned the book’s reception, even by its supporters. However, he later explored the historiographic and philosophical issues raised in Black-Body Theory with respect to Structure. The historiographic issues that the former book addressed were the same as those raised in the 1962 monograph. Specifically, he claimed that current historiography should attempt to understand previous scientific texts in terms of their contemporary context and not in terms of modern science. Kuhn’s concern was more than historical accuracy; rather, he was interested in recapturing the thought processes that lead to a change in theory. Although Structure was Kuhn’s articulation of this process of scientific change, the terminology in the monograph did not represent a straitjacket for narrating history. For Kuhn, the terminology and vocabulary used in Structure, like paradigm and normal science, were not products, such as metaphysical categories, to which a historical narrative must conform; rather, they had a different metaphysical function, as presuppositions towards an historical narrative as process. In other words, Structure’s terminology and vocabulary were tools by which to reconstruct a scientific historical narrative and not a template for articulating it.

The purpose of history of science, according to Kuhn, was not just getting the facts straight but providing philosophers of science with an accurate image of science to practice their trade. Kuhn fervently believed that the new historiography of science would prevent philosophers from engaging in the excesses and distortions prevalent within traditional philosophy of science. He envisioned history of science informing philosophy of science as historical philosophy of science rather than history and philosophy of science, since the relationship was asymmetrical.

Prior to 1950, history of science was a discipline practiced mostly by eminent scientists, who generally wrote heroic biographies or sweeping overviews of a discipline, often for pedagogical purposes. Within the past generation, historians of science, such as Alexandre Koyré, Anneliese Maier, and E.J. Dijksterhuis, developed an approach to the history of science that was more than simply chronicling science’s theoretical and technical achievements. An important factor in that development was the recognition of institutional and sociological factors in the practice of science. A consequence of this historiographic revolution was the distinction between internal and external histories of science. Internal history of science is concerned with the development of the theories and methods employed by scientists. In other words, it studies the history of events, people, and ideas internal to scientific advancement. The historian as internalist attempts to climb inside the mind of scientists as they push forward the boundaries of their discipline. External history of science concentrates on the social and cultural factors that impinge on the practice of science.

For Kuhn, the distinction between internal and external histories of science mapped onto his pattern of scientific development. External or cultural and social factors are important during a scientific discipline’s initial establishment; however, once established, those factors no longer have a major impact on a community’s practice or its generation of scientific knowledge. They can have a minor impact on a mature science’s practice, such as the timing of technological innovation. Importantly for Kuhn, internal and external approaches to the history of science are not necessarily mutually exclusive but complementary.

b. Metahistorical Studies

As mentioned already, Kuhn considered himself a practitioner of both the history of science and the philosophy of science and not the history and philosophy of science, for a very practical reason. Crassly put, the goal for history is the particular while for philosophy the universal. Kuhn compared the differences between the two disciplines to a duck-rabbit Gestalt switch. In other words, the two disciplines are so fundamentally different in terms of their goals, that the resulting images of science are incommensurable. Moreover, to see the other discipline’s image requires a conversion. For Kuhn, then, the history of science and the philosophy of science cannot be practiced at the same time but only alternatively, and then with difficulty.

How then can the history of science be of use to philosophers of science? The answer for Kuhn was: by providing an accurate image of science. Rejecting the covering law model of historical explanation because it reduces historians to mere social scientists, Kuhn advocated an image based on the ordering of historical facts into a narrative, analogous to the puzzle solving he described under the aegis of a paradigm in the physical sciences. Historians of science, as they narrate change in science, provide an image of science that reflects the process by which scientific knowledge develops, rather than the image provided by traditional philosophers of science in which scientific knowledge is simply a product of logical verification or falsification. Kuhn insisted that the history of science and the philosophy of science remain distinct disciplines, so that historians of science can provide an image of science that corrects the distortion produced by traditional philosophers of science.

According to Kuhn, the social history of science also distorts the image of science. For social historians, scientists construct rather than discover scientific knowledge. Although Kuhn was sympathetic to this type of history, he believed it created a gap between older constructions and the ones replacing them, which he challenged historians of science to fill. Besides social historians of science, Kuhn also accused sociologists of science of distorting the image of science. Although Kuhn acknowledged that factors such as interests, power, and authority, among others, are important in the production of scientific knowledge, their predominant use by sociologists eclipses other factors, such as nature itself. The key to rectifying the distortion introduced by sociologists is to shift from a rationality of belief, that is, the reasons scientists hold specific beliefs, to a rationality of change in belief, that is, the reasons scientists change their beliefs. For Kuhn, a historical philosophy of science was the means for correcting these distortions of the scientific image.

Kuhn’s historical philosophy of science focused on the metahistorical issues derived from historical research, particularly scientific development and the related issues of theory choice and incommensurability. Importantly for Kuhn, both theory choice and incommensurability are intimately linked to one another. The former cannot be reduced to an algorithm of objective rules but requires subjective values because of the latter.

Kuhn explored scientific development using three different approaches. The first was in terms of problem versus puzzle solving. According to Kuhn, problems have no ready solution; problem solving is generally pragmatic and is the hallmark of an underdeveloped or immature science. Puzzles, on the other hand, occupy the attention of scientists involved in a developed or mature science. Although puzzles have guaranteed solutions, the methods for solving them are not assured. Scientists who solve them demonstrate their ingenuity and are rewarded by the community.

With this distinction in mind, Kuhn envisioned scientific development as the transition of a scientific discipline from an underdeveloped problem-solving state to a developed puzzle-solving one. The question then arises as to how this occurs. The answer that many took from Structure was: adopt a paradigm. However, Kuhn found this answer incorrect, in that paradigms are not unique to the sciences. But does articulating the question in terms of puzzle solving help? Kuhn’s answer was pragmatic: keep trying different solutions until one works. In other words, philosophers of science had no exemplars by which to solve their problems.

Kuhn’s second approach to scientific development was in terms of the growth of knowledge. He proposed an alternative to the traditional view that scientific knowledge grows by a piecemeal accumulation of facts. To shed light on this alternative, Kuhn offered a different reconstruction of science. The central ideas of a science cohere with one another, forming the core of a particular science. Besides the core, there is a periphery, an area in which scientists can investigate problems associated with a research tradition without changing core ideas.

Kuhn then drew parallels between the current reconstruction of science and the earlier one in Structure. Obviously, the transition in cores from one research tradition to another is a scientific revolution. Moreover, the core represents a paradigm that defines a particular research tradition. Finally, the periphery is identified with normal science. The core then provides the means by which to practice science, and to change the core requires significant retooling that practitioners naturally resist.

Is this change in the core a growth of knowledge? To answer the question, Kuhn examined the standard account of knowledge as justified true belief. What he found problematic with the account is the amount or nature of the evidence needed to justify a belief. And this, of course, raises the issue of truth for which he had no ready solution. Ultimately, Kuhn equivocated on the question of the growth of knowledge.

Kuhn’s final approach to scientific development was through the analysis of three scientific revolutions: the shift from Aristotelian to Newtonian physics, Volta’s invention of the electric battery, and Planck’s black-body radiation research and quantum discontinuity. From these examples, Kuhn derived three characteristics of scientific revolutions. The first was that revolutions are holistic, all-or-none events. The second was the way referents change after revolutions, especially in terms of taxonomic categories; according to Kuhn, revolutions redistribute objects among these categories. The final characteristic was a change in a discipline’s analogy, metaphor, or model, which represents the connection between taxonomic categories and the world’s structure.

According to traditional philosophers of science, the objective features of a good scientific theory include accuracy, consistency, scope, simplicity, and fecundity. However, these features, when used individually as criteria for theory choice, argued Kuhn, are imprecise and often conflict with one another. Although necessary for theory choice, they are insufficient and must include the characteristics of the scientists making the choices. These characteristics involve personal experiences or biography and personality or psychological traits. In other words, not only does theory choice rely on a theory’s objective features but also on individual scientists’ subjective characteristics.

Why have traditional philosophers of science ignored or neglected subjective factors in theory choice? Part of the answer is that they confined the subjective to the context of discovery, while restricting the objective to the context of justification. Kuhn insisted that this distinction does not fit observations of scientific practice; it is artificial, reflecting science pedagogy. Actual scientific practice reveals that textbook presentations of theory choice are stylized so as to convince students, who rely on the authority of their instructors. What else can students do? Textbook science discloses only the product of science, not its process. For Kuhn, since subjective factors are present at the discovery phase of science, they should also be present at the justification phase.

According to Kuhn, objective criteria function as values, which do not dictate theory choice but rather influence it. Values help to explain scientists’ behavior, which to the traditional philosopher of science may at times appear irrational. Most importantly, values account for disagreement over theories and help to distribute risk during debates over them. Kuhn’s position had important consequences for the philosophy of science. He maintained that critics misinterpreted his position on theory choice as subjective. For them, the term denoted a matter of taste that is not rationally discussable; but his use of the term did leave room for discussion with respect to standards. Moreover, Kuhn denied that facts are theory independent and that there is a strictly rational choice to be made. Rather, he contended that scientists do not choose a theory based on objective criteria alone but are converted on the basis of subjective values.

Finally, Kuhn discussed theory choice with respect to the incommensurability thesis. The question he entertained was what type of communication is possible among community members holding competing theories. The answer, according to Kuhn, is that communication is partial. This answer raised a second, and more important, question for Kuhn and his critics: is good reason vis-à-vis empirical evidence available to justify theory choice, given such partial communication? The answer would be straightforward if communication were complete, but it is not. For Kuhn, this situation meant that ultimately reasonable evaluation of the empirical evidence is not compelling for theory choice, and it of course raised the charge of irrationality, which he denied.

Kuhn identified two common misconceptions of his version of the incommensurability thesis. The first was that since two incommensurable theories cannot be stated in a common language, they cannot be compared to one another in order to choose between them. The second was that since an older theory cannot be translated into modern expression, it cannot be articulated meaningfully.

Kuhn addressed the first misconception by distinguishing between incommensurability as no common measure and as no common language, defining the thesis in terms of the latter rather than the former. Most theoretical terms are homophonic and can have the same meaning in two competing theories; only a handful of terms are incommensurable or untranslatable. Kuhn considered this a modest version of the incommensurability thesis, calling it local incommensurability, and claimed that it was his original intention. Although there may be no common language in which to compare terms that change their meaning during a scientific revolution, there is a partially common language, composed of the invariant terms, that does permit some comparison. Thus, Kuhn argued, the first criticism fails, although, and this was his main point, an incommensurable residue remains even with a partially common language.

As for the second misconception, Kuhn claimed that critics conflate translation and interpretation. The conflation is understandable, since translation often involves interpretation. Translation, for Kuhn, is the process by which words or phrases of one language are substituted for those of another. Interpretation, however, involves attempts to make sense of a statement or to make it intelligible. Incommensurability, then, does not mean that a theoretical term cannot be interpreted, that is, cannot be made intelligible; rather, it means that the term cannot be translated, that is, that there is no equivalent for the term in the competing theoretical language. In other words, in order for the theoretical term to have meaning, the scientist must go native in its use.

Kuhn introduced the notion of the lexicon and its attendant taxonomy to capture both a term’s reference and its intension or sense. In the lexicon, referring terms are interrelated with other referring terms, that is, the holistic principle. The lexicon’s structure of interrelated terms resembles the world’s structure in terms of its taxonomic categories. A particular scientific community uses its lexicon to describe and explain the world in terms of this taxonomy. Members of a community, or of different communities, must share the same lexicon if they are to communicate fully with one another. Moreover, claimed Kuhn, if full translation is to be achieved, the two languages must share a similar structure with respect to their respective lexicons. Incommensurability, then, reflects lexicons with different taxonomic structures by which the world is carved up and articulated.

Kuhn also addressed a problem that involves communication among communities who hold incommensurable theories, or who occupy positions across a historical divide. Kuhn noted that although lexicons can change dramatically, this does not deter members from reconstructing their past in the current lexicon’s vocabulary. Such reconstruction obviously plays an important function in the community. But the issue is that, given the incommensurable nature of theories, assessments of true and false or right and wrong are unwarranted, for which critics charged Kuhn with a relativist position—a position he was less inclined to deny.

The charge stemmed from the fact that Kuhn advocated no privileged position from which to evaluate a theory. Rather, evaluations must be made within the context of a particular lexicon, and thus they are relative to the relevant lexicon. But Kuhn found the charge of relativism trivial. He acknowledged that his position on the relativity of truth and objectivity, with respect to the community’s lexicon, left him no option but to take literally the world changes associated with lexical changes. But is this an idealist position? Kuhn admitted that it appears to be, but he claimed that it is an idealism like none other: on the one hand, the world is composed of the community’s lexicon, but on the other hand, preconceived ideas cannot mold it.

c. Evolutionary Philosophy of Science

From the mid-1980s to the early 1990s, Kuhn transitioned from a historical philosophy of science and the paradigm concept to an evolutionary philosophy of science and the notion of the lexicon. To that end, he identified an alternative role for the incommensurability thesis with respect to segregating or isolating lexicons and their associated worlds from one another. Incommensurability now functioned for Kuhn as a mechanism that isolates one community’s lexicon from another’s and as a means to underpin a notion of scientific progress as the proliferation of scientific specialties. In other words, as the taxonomic structures of two lexicons become isolated and thereby incommensurable with one another, according to Kuhn, a new specialty and its lexicon split off from the old or parent specialty and its lexicon. This process accounts for a notion of scientific progress as an increase in the number of scientific specialties after a revolution.

Scientific progress, then, is akin to biological speciation, argued Kuhn, with incommensurability serving as the isolation mechanism. The result is a tree-like structure with increased specialization at the tips of the branches. Finally, Kuhn’s evolutionary philosophy of science is non-teleological in the sense that science progresses not towards an ultimate truth about the world but simply away from a lexicon that cannot be used to solve its anomalies to one that can. However, he still articulated incommensurability in terms of no common language, with its attendant problems involving the notion of meaning, and did not transform it fully with respect to an evolutionary philosophy of science.

Kuhn was working out an evolutionary philosophy of science in a proposed book, Words and Worlds: An Evolutionary View of Scientific Development. He divided it into three parts, with three chapters in each. In the first part, Kuhn framed the problem associated with the incommensurability thesis and addressed the difficulties of accessing past scientific achievements. In the first chapter, he presented an evolutionary view of scientific development. Without an Archimedean platform to guide theory assessment, Kuhn proposed a comparative method for assessing theoretical changes; the method forbids assessing theories in isolation and methodological solecism. In the next chapter, he discussed the problems associated with examining past historical studies in science. Based on several historical cases, he claimed that anomalies in older scientific texts could be understood only through an interpretative process involving an ethnographic or hermeneutical reading. He had now laid the groundwork for examining the incommensurability thesis. In the third chapter, Kuhn discussed changes of word-meanings as changes in a taxonomy embedded in a lexicon—an apparatus of a language’s referring terms. The result of these changes was an untranslatable gap between two incommensurable theories. Finally, the lexical terms referring to objects change as scientific specialties proliferate.

In the book’s second part, Kuhn continued to explore the nature of a community’s lexicon, which he explicated in terms of taxonomic categories. These categories are grouped as contrast sets, and no overlap of categories exists within the same contrast set, which Kuhn called the no-overlap principle. The principle prohibits the reference of terms to objects unless they are related to one another as species to genus. Moreover, the properties of the categories are reflected in the properties of their names. A term’s meaning is thus a function of its taxonomic category, and this restriction is the origin of untranslatability. In the first chapter of this part, Kuhn discussed the nature of substances in terms of sortal predicates. This move allowed Kuhn to introduce plasticity into the lexicon’s usage. Moreover, the differentiating set is not strictly conventional but relies on the world to which the different sets connect. In the next chapter, Kuhn extended the lexicon notion to artifacts, abstractions, and theoretical entities.

In the final chapter of the second part, Kuhn specified the means by which community members acquire a lexicon. First, they must already possess a vocabulary about physical entities and forces. Second, definitions play little, if any, role in learning new terms; rather, those terms are acquired through ostensive examples, especially through problem solving and laboratory demonstrations. Third, a single example is inadequate for learning the meaning of a term; multiple examples are required. Fourth, acquisition of a new term within a statement also requires acquisition of other new terms within that statement. Finally, students can acquire the terms of a lexicon through different pedagogical routes.

In the book’s concluding part, Kuhn discussed what occurs during a change in the lexicon and the implications for scientific development. In chapter seven, he examined the means by which lexicons change and the repercussions such change has for communication among communities with different lexicons. Moreover, he explored the role of arguments in lexical change. In the subsequent chapter, Kuhn identified the type of progress achieved with changes in lexicons. He maintained that progress is not the type that aims at a specific goal but rather is instrumental. In the final chapter, he broached the issues of relativism and realism not in traditional terms of truth and objectivity but rather with respect to the capability of making a statement. Statements from incommensurable theories that cannot be translated are ultimately ineffable. They can be neither true nor false but their capability of being stated is relative to the community’s history.

In sum, the book’s aim was certainly to address the philosophical issues left over from Structure, but more importantly, it was to resolve the problems generated by a historical philosophy of science. Although others were also responsible for creating those problems, Kuhn assumed responsibility for resolving them; and the sine qua non for resolving them was the incommensurability thesis. For Kuhn, the thesis was required more than ever to defend rationality from the post-modern development of the strong program.

5. Conclusion

In May 1990, a conference—or, as Hempel called it, a Kuhnfest—was held in Kuhn’s honor at MIT, sponsored by the Sloan Foundation and organized by Paul Horwich and Judith Thomson. The speakers included Jed Buchwald, Nancy Cartwright, John Earman, Michael Friedman, Ian Hacking, John Heilbron, Ernan McMullin, N. M. Swerdlow, and Norton Wise, and their papers reflected Kuhn’s impact on the history and the philosophy of science. Hempel made a special appearance on the last day, followed by Kuhn’s remarks on the conference papers. As he approached the podium after Hempel’s remarks, before a standing-room-only audience, Kuhn was visibly moved by the outpouring of professional appreciation for his contributions to a discipline that he cherished and from members whom he truly respected.

Kuhn retired from teaching in 1991 and became an emeritus professor at MIT. During his career, Kuhn received numerous awards and accolades. He was the recipient of honorary degrees from around a dozen academic institutions, including the University of Chicago, Columbia University, the University of Padua, and the University of Notre Dame. He was elected a member of the National Academy of Sciences—the most prestigious society for U.S. scientists—and was an honorary life member of the New York Academy of Sciences and a corresponding fellow of the British Academy. He was president of the History of Science Society from 1968 to 1970, and the society awarded him its highest honor, the Sarton Medal, in 1982. Kuhn was also the recipient in 1977 of the Howard T. Behrman Award for distinguished achievement in the humanities and in 1983 of the celebrated John Desmond Bernal Award. Kuhn died on 17 June 1996 in Cambridge, Massachusetts, after suffering for two years from cancer of the throat and bronchial tubes.

6. References and Further Reading

a. Kuhn’s Work

  • Kuhn Papers, MIT MC 240, Institute Archives and Special Collections, MIT Libraries, Cambridge, MA.
  • Kuhn, T. S. (1957) The Copernican Revolution: Planetary Astronomy in the Development of Western Thought. Cambridge, MA: Harvard University Press.
  • Kuhn, T. S. (1962) The Structure of Scientific Revolutions. Chicago, IL: University of Chicago Press.
  • Kuhn, T. S. (1963) ‘The function of dogma in scientific research’, in A.C. Crombie, ed. Scientific Change: Historical Studies in the Intellectual, Social and Technical Conditions for Scientific Discovery and Technical Invention, From Antiquity to the Present. New York: Basic Books, pp. 347-69.
  • Kuhn, T. S., Heilbron, J. L., Forman, P. and Allen, L. (1967) Sources for History of Quantum Physics: An Inventory and Report. Philadelphia, PA: American Philosophical Society.
  • Heilbron, J. L., and Kuhn, T. S. (1969) ‘The genesis of the Bohr atom’. Historical Studies in the Physical Sciences, 1, 211-90.
  • Kuhn, T. S. (1970) The Structure of Scientific Revolutions (2nd edition). Chicago, IL: University of Chicago Press.
  • Kuhn, T. S. (1977) The Essential Tension: Selected Studies in Scientific Tradition and Change. Chicago: University of Chicago Press.
  • Kuhn, T. S. (1987) Black-Body Theory and the Quantum Discontinuity, 1894-1912 (revised edition). Chicago: University of Chicago Press.
  • Kuhn, T. S. (1990) ‘Dubbing and redubbing: the vulnerability of rigid designation’, in C.W. Savage, ed. Scientific Theories. Minneapolis, MN: University of Minnesota Press, pp. 298-318.
  • Kuhn, T. S. (2000) The Road since Structure: Philosophical Essays, 1970-1993, with an Autobiographical Interview. Chicago: University of Chicago Press.
    • Contains a comprehensive interview with Kuhn covering his life and work.

b. Secondary Sources

  • Andersen, H. (2001) On Kuhn. Belmont, CA: Wadsworth Publishing.
    • A general introduction to Kuhn and his philosophy.
  • Andersen, H., Barker, P. and Chen, X. (2006) The Cognitive Structure of Scientific Revolutions. New York: Cambridge University Press.
  • Barnes, B. (1982) T.S. Kuhn and Social Science. London: Macmillan Press.
    • Discusses the impact of Kuhn’s philosophy for the sociology of science.
  • Bernardoni, J. (2009) Knowing Nature without Mirrors: Thomas Kuhn’s Antirepresentationalist Objectivity. Saarbrücken, DE: VDM Verlag Dr. Müller.
  • Bird, A. (2000) Thomas Kuhn. Princeton, NJ: Princeton University Press.
    • A critical introduction to Kuhn’s philosophy of science.
  • Bird, A. (2012) ‘The Structure of Scientific Revolutions and its significance: an essay review of the fiftieth anniversary edition’. British Journal for the Philosophy of Science, 63, 859-83.
  • Buchwald, J. Z. and Smith, G. E. (1997) ‘Thomas S. Kuhn, 1922-1996’. Philosophy of Science, 64, 361-76.
  • D’Agostino, F. (2010) Naturalizing Epistemology: Thomas Kuhn and the Essential Tension. New York: Palgrave Macmillan.
  • Davidson, K. (2006) The Death of Truth: Thomas S. Kuhn and the Evolution of Ideas. New York: Oxford University Press.
  • Favretti, R. R., Sandri, G. and Scazzieri, R., eds. (1999) Incommensurability and Translation: Kuhnian Perspectives on Scientific Communication and Theory Change. Northampton, MA: Edward Elgar.
  • Fuller, S. (2000) Thomas Kuhn: A Philosophical History of Our Times. Chicago, IL: University of Chicago Press.
    • A revisionist account of Kuhn as a foot soldier in Conant’s agenda to educate the public about science.
  • Fuller, S. (2004) Kuhn vs. Popper: The Struggle for the Soul of Science. New York: Columbia University Press.
  • Gattei, S. (2008) Thomas Kuhn’s ‘Linguistic Turn’ and the Legacy of Logical Empiricism: Incommensurability, Rationality and the Search for Truth. Burlington, VT: Ashgate.
  • Gutting, G., ed. (1980) Paradigms and Revolutions: Appraisals and Applications of Thomas Kuhn’s Philosophy of Science. Notre Dame, IN: University of Notre Dame Press.
    • A collection of articles addressing Kuhn’s philosophy of science.
  • Heilbron, J. L. (1998) ‘Thomas Samuel Kuhn’. Isis, 89, 505-15.
  • Horgan, J. (1991) ‘Reluctant revolutionary’. Scientific American, 264, 40-9.
    • Is based on an interview with Kuhn about his philosophy.
  • Hufbauer, K. (2012) ‘From student of physics to historian of science: TS Kuhn’s education and early career, 1940–1958’. Physics in Perspective, 14, 421-70.
    • A detailed reconstruction of Kuhn’s education and early career at Harvard.
  • Horwich, P., ed. (1993) World Changes: Thomas Kuhn and the Nature of Science. Cambridge, MA: MIT Press.
    • The published papers from the 1990 Kuhnfest.
  • Hoyningen-Huene, P. (1993) Reconstructing Scientific Revolutions: Thomas S. Kuhn’s Philosophy of Science. Chicago, IL: University of Chicago Press.
  • Hoyningen-Huene, P. and Sankey, H., eds. (2001) Incommensurability and Related Matters. Boston, MA: Kluwer.
  • Hung, E. H. -C. (2006) Beyond Kuhn: Scientific Explanation, Theory Structure, Incommensurability, and Physical Necessity. Burlington, VT: Ashgate.
  • Kindi, V. and Arabatzis T., eds. (2012) Kuhn’s The Structure of Scientific Revolutions Revisited. New York: Routledge.
    • A collection of essays examining the impact of Structure on contemporary philosophy of science.
  • Kuukkanen, J. M. (2008) Meaning Changes: A Study of Thomas Kuhn’s Philosophy. Saarbrücken, DE: VDM Verlag Dr. Müller.
  • Lakatos, I. and Musgrave, A., eds. (1970) Criticism and the Growth of Knowledge. Cambridge, U.K.: Cambridge University Press.
    • The published papers from the 1965 London colloquium.
  • Marcum, J.A. (2015) Thomas Kuhn’s Revolutions: A Historical and an Evolutionary Philosophy of Science? London: Bloomsbury.
  • Nickles, T., ed. (2003) Thomas Kuhn. Cambridge, UK: Cambridge University Press.
  • Onkware, K. (2010) Thomas Kuhn and Scientific Progression: Investigation on Kuhn’s Account of How Science Progresses. Saarbrücken: Lambert Academic Publishing.
  • Preston, J. M. (2008) Kuhn’s The Structure of Scientific Revolutions: A Reader’s Guide. London: Continuum.
  • Ruse, M. (1999) The Darwinian Revolution: Science Red in Tooth and Claw (2nd edition). Chicago, IL: University of Chicago Press.
  • Sankey, H. (1994) The Incommensurability Thesis. London: Ashgate.
  • Sardar, Z. (2000) Thomas Kuhn and the Science Wars. New York: Totem Books.
  • Sharrock, W., and Read, R. (2002) Kuhn: Philosopher of Scientific Revolution. Cambridge, UK: Polity.
  • Sigurdsson, S. (1990) ‘The nature of scientific knowledge: an interview with Thomas Kuhn’. Harvard Science Review, Winter issue, 18-25.
  • Suppe, F., ed. (1977) The Structure of Scientific Theories (2nd edition). Urbana, IL: University of Illinois Press.
  • Swerdlow, N.M. (2013) ‘Thomas S. Kuhn, 1922-1996’. Biographical Memoir, National Academy of Sciences USA. http://www.nasonline.org/publications/biographical-memoirs/memoir-pdfs/kuhn-thomas.pdf.
  • Torres, J. M., ed. (2010) On Kuhn’s Philosophy and Its Legacy. Lisbon: Faculdade de Ciências da Universidade de Lisboa.
  • von Dietze, E. (2001) Paradigms Explained: Rethinking Thomas Kuhn’s Philosophy of Science. Westport, CT: Praeger.
  • Wade, N. (1977) ‘Thomas S. Kuhn: revolutionary theorist of science’. Science, 197, 143-5.
  • Wang, X. (2007) Incommensurability and Cross-Language Communication. Burlington, VT: Ashgate.
  • Wray, K. B. (2011) Kuhn’s Evolutionary Social Epistemology. New York: Cambridge University Press.

 

Author Information

James A. Marcum
Email: James_Marcum@baylor.edu
Baylor University
U. S. A.

Morality and Cognitive Science

What do we know about how people make moral judgments? And what should moral philosophers do with this knowledge? This article addresses the cognitive science of moral judgment. It reviews important empirical findings and discusses how philosophers have reacted to them.

Several trends have dominated the cognitive science of morality in the early 21st century. One is a move away from a strict opposition between biological and cultural explanations of morality’s origin, toward a hybrid account in which culture greatly modifies an underlying common biological core. Another is the fading of strictly rationalist accounts in favor of those that recognize an important role for unconscious or heuristic judgments. Along with this has come expanded interest in the psychology of reasoning errors within the moral domain. Another trend is the recognition that moral judgment interacts in complex ways with judgment in other domains; rather than simply being caused by judgments about intention or free will, moral judgment may partly influence them. Finally, new technologies and neuroscientific techniques have led to novel discoveries about the functional organization of the moral brain and the roles that neurotransmitters play in moral judgment.

Philosophers have responded to these developments in a variety of ways. Some deny that the cognitive science of moral judgment has any relevance to philosophical reflection on how we ought to live our lives, or on what is morally right to do. One argument to this end follows the traditional is/ought distinction and insists that we cannot generate a moral ought from any psychological is. Another argument insists that the study of morality is autonomous from scientific inquiry, because moral deliberation is essentially first-personal and not subject to any third-personal empirical correction.

Other philosophers argue that the cognitive science of moral judgment may have significant revisionary consequences for our best moral theories. Some make an epistemic argument: if moral judgment aims at discovering moral truth, then psychological findings can expose when our judgments are unreliable, like faulty scientific instruments. Other philosophers focus on non-epistemic factors, such as the need for internal consistency within moral judgment, the importance of conscious reflection, or the centrality of intersubjective justification. Certain cognitive scientific findings might require a new approach to these features of morality.

The first half of this article (sections 1 to 4) surveys the cognitive science literature, describing key experimental findings and psychological theories in the moral domain. The second half (sections 5 to 10) discusses how philosophers have reacted to these findings, discussing different ways philosophers have sought to immunize moral inquiry from empirical revision, or enthusiastically taken up psychological tools to make new moral arguments.

Note that the focus of this article is on moral judgment. See the article “Moral Character” for discussion of the relationship between cognitive science and moral character.

Table of Contents

  1. Biological and Cultural Origins of Moral Judgment
  2. The Psychology of Moral Reasoning
  3. Interaction between Moral Judgment and Other Cognitive Domains
  4. The Neuroanatomy of Moral Judgment
  5. What Do Moral Philosophers Think of Cognitive Science?
  6. Moral Cognition and the Is/Ought Distinction
    1. Semantic Is/Ought
    2. Non-semantic Is/Ought
  7. The Autonomy of Morality
  8. Moral Cognition and Moral Epistemology
  9. Non-epistemic Approaches
    1. Consistency in Moral Reasoning
    2. Rational Agency
    3. Intersubjective Justification
  10. Objections and Alternatives
    1. Explanation and Justification
    2. The Expertise Defense
    3. The Regress Challenge
    4. Positive Alternatives
  11. References and Further Reading

1. Biological and Cultural Origins of Moral Judgment

One key empirical question is this: are moral judgments rooted in innate factors, or are they fully explained by acquired cultural traits? During the 20th century, scientists tended to adopt extreme positions on this question. The psychologist B. F. Skinner (1971) saw moral rules as socially conditioned patterns of behavior; given the right reinforcement, people could be led to judge virtually anything morally right or wrong. The biologist E. O. Wilson (1975), in contrast, argued that nearly all of human morality could be understood via the application of evolutionary biology. Around the early 21st century, however, most researchers on moral judgment began to favor a hybrid model, allowing roles for both biological and cultural factors.

There is evidence that at least the precursors of moral judgment are present in humans at birth, suggesting an evolutionary component. In a widely cited study, Kiley Hamlin and colleagues examined the social preferences of pre-verbal infants (Hamlin, Wynn, and Bloom 2007). The babies, seated on their parents’ laps, watched a puppet show in which a character trying to climb a hill was helped up by one puppet, but pushed back down by another puppet. Offered the opportunity to reach out and grasp one of the two puppets, babies showed a preference for the helping puppet over the hindering puppet. This sort of preference is not yet full-fledged moral judgment, but it is perhaps the best we can do in assessing the social responses of humans before the onset of language, and it suggests that however human morality comes about, it builds upon innate preferences for pro-social behavior.

A further piece of evidence comes from the work of the theorists Leda Cosmides and John Tooby (1992), who argue that the minds of human adults display an evolutionary specialization for moral judgment. The Wason Selection Task is an extremely well-established paradigm in the psychology of reasoning, which shows that most people make persistent and predictable mistakes in evaluating abstract inferences. A series of studies by Cosmides, Tooby, and their colleagues shows that people do much better on a form of this task when it is restricted to violations of social norms. So, for instance, rather than being asked to evaluate an abstract rule linking numbers and colors, people were asked to evaluate a rule prohibiting those below a certain age from consuming alcohol. Participants in these studies made the normal mistakes when looking for violations of abstract rules, but made fewer mistakes in detecting violations of social rules. According to Cosmides and Tooby, these results suggest that moral judgment evolved as a domain-specific capacity, rather than as an application of domain-general reasoning. If this is right, then there must be at least an innate core to moral judgments.
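To make the shared structure of the two versions of the task concrete, here is a minimal sketch of the selection-task logic in Python. The specific card contents, the even-number/red rule, and the age limit of 21 are illustrative assumptions, not details taken from the studies cited above; the point is only that both versions have the same underlying form, "if P then Q," so the cards worth turning over are exactly those that might show P or not-Q.

```python
# A sketch of the selection task's shared logical form, "if P then Q".
# Card contents and the age limit are illustrative assumptions, not details
# taken from the studies discussed above.

def cards_to_check(cards, shows_p, shows_not_q):
    """Cards that could reveal a violation of 'if P then Q': those visibly
    showing P (the hidden side might fail Q) or visibly showing not-Q (the
    hidden side might show P)."""
    return [c for c in cards if shows_p(c) or shows_not_q(c)]

# Abstract version: "if a card shows an even number, its other side is red."
abstract_cards = ["8", "3", "red", "brown"]
print(cards_to_check(
    abstract_cards,
    shows_p=lambda c: c.isdigit() and int(c) % 2 == 0,   # visible even number
    shows_not_q=lambda c: c == "brown",                   # visible non-red color
))  # ['8', 'brown'] -- many participants instead choose '8' and 'red'

# Social-rule version: "if a person is drinking alcohol, they must be over 21."
social_cards = ["drinking beer", "drinking cola", "age 25", "age 16"]
print(cards_to_check(
    social_cards,
    shows_p=lambda c: c == "drinking beer",               # visibly drinking alcohol
    shows_not_q=lambda c: c == "age 16",                  # visibly under the limit
))  # ['drinking beer', 'age 16'] -- most participants answer this version correctly
```

The logically correct selections have the same form in both versions; what the experiments show is that participants find them far more reliably in the social-rule version.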

Perhaps the most influential hybrid model of innate and cultural factors in moral judgment research is the linguistic analogy research program (Dwyer 2006; Hauser 2006; Mikhail 2011). This approach is explicitly modeled on Noam Chomsky’s (1965) generative linguistics. According to Chomsky, the capacity for language production and some basic structural parameters for a functioning grammar are innate, but the enormous diversity of human languages comes about through myriad cultural settings and prunings within the evolutionarily allowed range of possible grammars. By analogy, the moral grammar approach holds that the capacity for making moral judgments is innate, along with some basic constraints on the substance of the moral domain—but within this evolutionarily enabled space, culture works to select and prune distinct local moralities.

The psychologist Jonathan Haidt (2012) has highlighted the role of cultural difference in the scientific study of morality through his influential Moral Foundations account. According to Haidt, all documented moral beliefs can be classified as fitting within a handful of moral sub-domains, such as harm-aversion, justice (in distribution of resources), respect for authority, and purity. Haidt argues that moral differences between cultures reflect differences in emphasis on these foundations. In fact, he suggests, industrialized western cultures appear to have emphasized the harm-aversion and justice foundations almost exclusively, unlike many other world cultures. Yet Haidt also insists upon a role for biology in explaining moral judgment; he sees each of the foundations as rooted in a particular evolutionary origin. Haidt’s foundations account remains quite controversial, but it is a prominent example of contemporary scientists’ focus on avoiding polarized answers to the biology versus culture question.

The rest of this article mostly sets aside further discussion of the distal—evolutionary or cultural—explanation of moral judgment. Instead it focuses on proximal cognitive explanations for particular moral judgments. This is because the ultimate aim is to consider the philosophical significance of moral cognitive science, whereas the moral philosophical uptake of debates over evolution is discussed elsewhere. Interested readers can see the articles on “Evolutionary Ethics” and “Moral Relativism.”

2. The Psychology of Moral Reasoning

One crucial question is whether moral judgments arise from conscious reasoning and reflection, or are triggered by unconscious and immediate impulses. In the 1970s and 1980s, research in moral psychology was dominated by the work of Lawrence Kohlberg (1971), who advocated a strongly rationalist conception of moral judgment. According to Kohlberg, mature moral judgment demonstrates a concern with reasoning through highly abstract social rules. Kohlberg asked his participants (boys and young men) to express opinions about ambiguous moral dilemmas and then explain the reasons behind their conclusions. Kohlberg took it for granted that his participants were engaged in some form of reasoning; what he wanted to find out was the nature and quality of this reasoning, which he claimed emerged through a series of developmental stages. This approach came under increasing criticism in the 1980s, particularly through the work of feminist scholar Carol Gilligan (1982), who exposed the trouble caused by Kohlberg’s reliance on exclusively male research participants. See the article “Moral Development” for further discussion of that issue.

Since the turn of the twenty-first century, psychologists have placed much less emphasis on the role of reasoning in moral judgment. A turning point was the publication of Jonathan Haidt’s paper, “The Emotional Dog and Its Rational Tail” (Haidt 2001). There Haidt discusses a phenomenon he calls “moral dumbfounding.” He presented his participants with provocative stories, such as one about a brother and sister who engage in deliberate incest, or a man who eats his dog after it has been accidentally killed. Asked to explain their moral condemnation of these characters, participants cited reasons that seem to be ruled out by the description of the stories—for instance, participants said that the incestuous siblings might create a child with dangerous birth defects, though the original story made clear that they took extraordinary precautions to avoid conception. When reminded of these details, participants did not withdraw their moral judgments; instead they said things like, “I don’t know why it’s wrong, it just is.” According to Haidt, studies of moral dumbfounding confirm a pattern of evidence that people do not really reason their way to moral conclusions. Moral judgment, Haidt argues, arrives in spontaneous and unreflective flashes, and reasoning is only done post hoc, to rationalize already-made judgments.

Many scientists and philosophers have written about the evidence Haidt discusses. A few take extremist positions, absolutely upholding the old rationalist tradition or firmly insisting that reasoning plays no role at all. (Haidt himself has slightly softened his anti-rationalism in later publications.) But it is probably right to say that the dominant view in moral psychology is a hybrid model. Many of our everyday moral judgments do arise in sudden flashes, without any need for explicit reasoning. But when faced with new and difficult moral situations, and sometimes even when confronted with explicit arguments against our existing beliefs, we are able to reason our way toward new moral judgments.

The dispute between rationalists and their antagonists is primarily one about procedure: are moral judgments produced through conscious thought or unconscious psychological mechanisms? There is a related but distinct set of issues concerning the quality of moral judgment. However moral judgment works, explicitly or implicitly, is it rational? Is the procedure that produces our moral judgments a reliable procedure? (It is important to note that a mental procedure might be unconscious and still be rational. Many of our simple mathematical calculations are accomplished unconsciously, but that alone does not keep them from counting as rational.)

There is considerable evidence suggesting that our moral judgments are unreliable (and so arguably irrational). Some of this evidence comes from experiments imported from other domains of psychology, especially from behavioral economics. Perhaps most famous is the Asian disease case first employed by the psychologists Amos Tversky and Daniel Kahneman in the early 1980s. Here is the text they gave to some of their participants:

Imagine that the U.S. is preparing for the outbreak of an unusual Asian disease, which is expected to kill 600 people. Two alternative programs to combat this disease have been proposed. Assume that the exact scientific estimate of the consequences of the programs are as follows:

If Program A is adopted, 200 people will be saved.

If Program B is adopted, there is a 1/3 probability that 600 will be saved, and 2/3 probability that no people will be saved. (Tversky and Kahneman 1981, 453)

Other participants read a modified form of this scenario, with the first option being that 400 people will die and the second option being that there is a 1/3 probability that nobody will die and a 2/3 probability that 600 people will die. Notice that this is a change only in wording: the first option is the same either way, whether you describe it as 200 people being saved or 400 dying, and the second option gives the same probabilities of outcomes under either description. Notice also that, in terms of expected outcomes, the two programs are mathematically equivalent (the certain survival of one third of the people is equivalent, in expectation, to a one-in-three chance of all the people surviving). Yet participants responded strongly to the way the choices were described; those who read the original phrasing preferred Program A by roughly three to one, while those who read the other phrasing showed almost precisely the opposite preference. Apparently, describing the choice in terms of saving makes people prefer the certain outcome (Program A), while describing it in terms of dying makes people prefer the risky outcome (Program B).
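Because the equivalence claim is pure arithmetic, a short calculation makes it explicit. The following sketch is illustrative only, not drawn from Tversky and Kahneman’s materials; it simply computes the expected numbers of people saved and lost under each program in each framing.

```python
# Illustrative arithmetic only (not part of Tversky and Kahneman's materials):
# expected outcomes for the two programs under both framings.

total = 600

# "Saving" frame
expected_saved_A = 200                         # 200 saved for certain
expected_saved_B = (1/3) * 600 + (2/3) * 0     # 200 saved in expectation

# "Dying" frame (the same programs, redescribed)
expected_dead_A = 400                          # 400 die for certain
expected_dead_B = (1/3) * 0 + (2/3) * 600      # 400 dead in expectation

# Both descriptions pick out the same expected outcomes.
assert expected_saved_A == expected_saved_B == total - expected_dead_A == total - expected_dead_B
print(expected_saved_A, expected_saved_B, expected_dead_A, expected_dead_B)  # 200 200.0 400 400.0
```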

This study is one of the most famous in the literature on framing effects, which shows that people’s judgments are affected by merely verbal differences in how a set of options is presented. Framing effects have been shown in many types of decision, especially in economics, but when the outcomes concern the deaths of large numbers of people it is clear that they are of moral significance. Many studies have shown that people’s moral judgments can be manipulated by mere changes in verbal description, without (apparently) any difference in features that matter to morality (see Sinnott-Armstrong 2008 for other examples).

Partly on this basis, some theorists have advocated a heuristics and biases approach to moral psychology. A cognitive bias is a systematic defect in how we think about a particular domain, where some feature influences our thinking in a way that it should not. Some framing effects apparently trigger cognitive biases; the saving/dying frame exposes a bias away from risk-taking in saving frames and a bias toward it in dying frames. (Note that this leaves open which, if either, is the correct response—the point is that the mere difference of frame should not affect our responses, so at least one of the divergent responses must be mistaken.)

In the psychology of (non-moral) reasoning, a heuristic is a sort of mental short-cut, a way of skipping lengthy or computationally demanding explicit reasoning. For instance, if you want to know which of several similar objects is the heaviest, you could simply assume that it is the largest. Heuristics are valuable because they save time and are usually correct—but in certain types of cases an otherwise valuable heuristic will make predictable errors (some small objects are denser, and so heavier, than larger ones). Perhaps some of our simple moral rules (“do no harm”) are heuristics of this sort—right most of the time, but unreliable in certain cases (perhaps it is sometimes necessary to harm people in emergencies). Some theorists (for example, Sunstein 2005) argue that large sectors of our ordinary moral judgment can be shown to exhibit heuristics and biases. If this is right, most moral judgment is systematically unreliable.
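As a toy illustration of how such a heuristic succeeds and fails, the sketch below uses entirely made-up objects and densities: the “largest is heaviest” short-cut gives a quick answer that is usually right, but it misfires when a small, dense object is present.

```python
# A toy example with made-up objects: the "largest is heaviest" short-cut is
# usually right, but fails predictably when a small object is very dense.

objects = [
    {"name": "foam block", "volume_cm3": 8000, "density_g_cm3": 0.05},
    {"name": "wooden box", "volume_cm3": 3000, "density_g_cm3": 0.6},
    {"name": "steel cube", "volume_cm3": 1000, "density_g_cm3": 7.8},
]

def heuristic_heaviest(items):
    """Fast short-cut: assume the largest object is the heaviest."""
    return max(items, key=lambda o: o["volume_cm3"])["name"]

def actual_heaviest(items):
    """Explicit computation: mass = volume * density."""
    return max(items, key=lambda o: o["volume_cm3"] * o["density_g_cm3"])["name"]

print(heuristic_heaviest(objects))  # 'foam block' -- the short-cut misfires here
print(actual_heaviest(objects))     # 'steel cube'
```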

A closely related type of research shows that moral judgment is affected not only by irrelevant features within the questions (such as phrasing) but also by completely accidental features of the environment in which we make judgments. To take a very simple example: if you sit down to record your judgments about morally dubious activities, the severity of your reaction will depend partly on the cleanliness of the table (Schnall et al. 2008). You are likely to give a harsher assessment of the bad behavior if the table around you is sticky and covered in pizza boxes than if it is nice and clean. Similarly, you are likely to render more negative moral judgments if you have been drinking bitter liquids (Eskine, Kacinik, and Prinz 2011) or handling dirty money (Yang et al. 2013). Watching a funny movie will make you temporarily less condemning of violence (Strohminger, Lewis, and Meyer 2011).

Some factors affecting moral judgment are internal to the judge rather than the environment. If you have been given a post-hypnotic suggestion to feel queasy whenever you hear a certain completely innocuous word, you will probably make more negative moral judgments about characters in stories containing that triggering word (Wheatley and Haidt 2005). Such effects are not restricted to laboratories; one Israeli study showed that parole boards are more likely to be lenient immediately after meals than when hungry (Danziger, Levav, and Avnaim-Pesso 2011).

Taken together, these studies appear to show that moral judgment is affected by factors that are morally irrelevant. The degree of wrongness of an act does not depend on which words are used to describe it, or on the cleanliness of the desk at which it is considered, or on whether the thinker has eaten recently. There seems to be strong evidence, then, that moral judgments are at least somewhat unreliable; moral judgment is not always of high quality. The extent of this problem, and what difference it might make to philosophical morality, is discussed later in the article.

3. Interaction between Moral Judgment and Other Cognitive Domains

Setting aside for now the reliability of moral judgment, there are other questions we can ask about its production, especially about its causal structure. Psychologically speaking, what factors go into producing a particular moral judgment about a particular case? Are these the same factors that moral philosophers invoke when formally analyzing ethical decisions? Empirical research appears to suggest otherwise.

One example of this phenomenon concerns intention. Most philosophers have assumed that in order for something done by a person to be evaluable as morally right or wrong, the person must have acted intentionally (at least in ordinary cases, setting aside negligence). If you trip me on purpose, that is wrong, but if you trip me by accident, that is merely unfortunate. Following this intuitive point, we might think that assessment of intentionality is causally prior to assessment of morality. That is, when I go to evaluate a potentially morally important situation, I first work out whether the person involved acted intentionally, and then use this judgment as an input to working out whether what they have done is morally wrong. Hence, in this simple model, the causal structure of moral judgment places it downstream from intentionality judgment.

But empirical evidence suggests that this simple model is mistaken. A very well-known set of studies concerning the side-effect effect (also known as the Knobe effect, for its discoverer Joshua Knobe) appears to show that the causal relationship between moral judgment and intention-assessment is much more complicated. Participants were asked to read short stories like the following:

The vice-president of a company went to the chairman of the board and said, “We are thinking of starting a new program. It will help us increase profits, but it will also harm the environment.” The chairman of the board answered, “I don’t care at all about harming the environment. I just want to make as much profit as I can. Let’s start the new program.” They started the new program. Sure enough, the environment was harmed. (Knobe 2003, 191)

Other participants read the same story, except that the program would instead help, rather than harm, the environment as a side effect. Both groups of participants were asked whether the executive intentionally brought about the side effect. Strikingly, when the side effect was a morally bad one (harming the environment), 82% of participants thought it was brought about intentionally, but when the side effect was morally good (helping the environment) 77% of participants thought it was not brought about intentionally.

There is a large literature offering many different accounts of the side-effect effect (which has been repeatedly experimentally replicated). One plausible account is this: people sometimes make moral judgments prior to assessing intentionality. Rather than intentionality-assessment always being an input to moral judgment, sometimes moral judgment feeds input to assessing intentionality. A side effect judged wrong is more likely to be judged intentional than one judged morally right. If this is right, then the simple model of the causal structure of moral judgment cannot be quite correct—the causal relation between intentionality-assessment and moral judgment is not unidirectional.

Other studies have shown similar complexity in how moral judgment relates to other philosophical concepts. Philosophers have often thought that questions about causal determinism and free will are conceptually prior to attributing moral responsibility. That is, whether or not we can hold people morally responsible depends in some way on what we say about freedom of the will. Some philosophers hold that determinism is compatible with moral responsibility and others deny this, but both groups start from thinking about the metaphysics of agency and move toward morality. Yet experimental research suggests that the causal structure of moral judgment works somewhat differently (Nichols and Knobe 2007). People asked to judge whether an agent can be morally responsible in a deterministic universe seem to base their decision in part on how strongly they morally evaluate the agent’s actions. Scenarios involving especially vivid and egregious moral violations tend to produce greater endorsement of compatibilism than more abstract versions. The interpretation of these studies is highly controversial, but at minimum they seem to cast doubt on a simple causal model placing judgments about free will prior to moral judgment.

A similar pattern holds for judgments about the true self. Some moral philosophers hold that morally responsible action is action resulting from desires or commitments that are a part of one’s true or deep self, rather than momentary impulses or external influences. If this view reflects how moral judgments are made, then we should expect people to first assess whether a given action results from an agent’s true self and then render moral judgment about those that do. But it turns out that moral judgment provides input to true self assessments. Participants in one experiment (Newman, Bloom, and Knobe 2014) were asked to consider a preacher who explicitly denounced homosexuality while secretly engaging in gay relationships. Asked to determine which of these behaviors reflected the preacher’s true self, participants first employed their own moral judgment; those disposed to see homosexuality as morally wrong thought the preacher’s words demonstrated his true self, while those disposed to accept homosexuality thought the preacher’s relationships came from his true self. The implication is that moral judgment is sometimes a causal antecedent of other types of judgments, including those that philosophers have thought conceptually prior to assessing morality.

4. The Neuroanatomy of Moral Judgment

Physiological approaches to the study of moral judgment have taken on an increasingly important role. Employing neuroscientific and psychopharmacological research techniques, these studies help to illuminate the functional organization of moral judgment by revealing the brain areas implicated in its exercise.

An especially central concern in this literature is the role of emotion in moral judgment. An early influential study by Jorge Moll and colleagues (2005) demonstrated selective activity during moral judgment in a network of brain areas generally understood to be central to emotional processing. This work employed functional magnetic resonance imaging (fMRI), the workhorse of modern neuroscience, in which a powerful magnet is used to produce a visual representation of relative levels of blood oxygenation (a proxy for cellular energy use) in various brain areas. Employing fMRI allows researchers to get a near-real-time representation of the brain’s activities while the conscious subject makes judgments about morally important scenarios.

One extremely influential fMRI study of moral judgment was conducted by the philosopher-neuroscientist Joshua D. Greene and colleagues. They compared brain activity in people making deontological moral judgments with brain activity in people making utilitarian moral judgments. (To oversimplify: a utilitarian moral judgment is one primarily attentive to the consequences of a decision, even allowing the deliberate sacrifice of an innocent to save a larger number of others. Deontological moral judgment is harder to define, but for our purposes means moral judgment that responds to factors other than consequences, such as the individual rights of someone who might be sacrificed to save a greater number. See “Ethics.”)

In a series of empirical studies and philosophical papers, Greene has argued that his results show that utilitarian moral judgment correlates with activity in cognitive or rational brain areas, while deontological moral judgment correlates with activity in emotional areas (Greene 2008). (He has since softened this view a bit, conceding that both types of moral judgment involve some form of emotional processing. He now holds that deontological emotions are of a type that triggers automatic behavioral responses, whereas utilitarian emotions are flexible prompts to deliberation (Greene 2014).) According to Greene, learning these psychological facts gives us reason to distrust our deontological judgments; in effect, his is a neuroscience-fueled argument on behalf of utilitarianism. This argument is at the center of a still very spirited debate. Whatever its outcome, Greene’s research program has had an undeniable influence on moral psychology; his scenarios (which derive from philosophical thought experiments by Philippa Foot and Judith Jarvis Thomson) have been adopted as standard across much of the discipline, and the growth of moral neuroimaging owes much to his project.

Alongside neuroimaging, lesion study is one of the central techniques of neuroscience. Recruiting research participants who have pre-existing localized brain damage (often due to physical trauma or stroke) allows scientists to infer the function of a brain area from the behavioral consequences of its damage. For example, participants with damage to the ventromedial prefrontal cortex, who have persistent difficulties with social and emotional processing, were tested on dilemmas similar to those used by Greene (Koenigs et al. 2007). These patients showed a greater tendency toward utilitarian judgments than did healthy controls. Similar lesion studies have since found a range of different results, so the empirical debate remains unsettled, but the technique continues to be important.

Two newer neuroscientific research techniques have begun to play important roles in moral psychology. Transcranial magnetic stimulation (TMS) uses magnetic pulses to suppress or heighten activity in a brain region. In effect, this allows researchers to (temporarily) alter healthy brains and correlate this alteration with behavioral effects. For instance, one study (Young et al. 2010) used TMS to suppress activity in the right temporoparietal junction, an area associated with assessing the mental states of others (see “Theory of Mind”). After the TMS treatment, participants’ moral judgments showed less sensitivity to whether characters in a dilemma acted intentionally or accidentally. Another technique, transcranial direct current stimulation (tDCS), has been shown to increase compliance with social norms when applied to the right lateral prefrontal cortex (Ruff, Ugazio, and Fehr 2013).

Finally, it is possible to study the brain not only at the gross structural scale, but also by examining its chemical operations. Psychopharmacology is the study of the cognitive and behavioral effects of chemical alteration of the brain. In particular, the levels of neurotransmitters, which regulate brain activity in a number of ways, can be manipulated by introducing pharmaceuticals. For example, participants’ readiness to make utilitarian moral judgments can be altered by administration of the drugs propranolol (Terbeck et al. 2013) and citalopram (Crockett et al. 2010).

5. What Do Moral Philosophers Think of Cognitive Science?

So far this article has described the existing science of moral judgment: what we have learned empirically about the causal processes by which moral judgments are produced. The rest of the article discusses the philosophical application of this science. Moral philosophers try to answer substantive ethical questions: What are the most valuable goals we could pursue? How should we resolve conflicts among these goals? Are there ways we should not act even if doing so would promote the best outcome? What is the shape of a good human life and how could we acquire it? How is a just society organized? And so on.

What cognitive science provides is causal information about how we typically respond to these kinds of questions. But philosophers disagree about what to make of this information. Should we ever change our answers to ethical questions on the basis of what we learn about their causal origins? Is it ever reasonable (or even rationally mandatory) to abandon a confident moral belief because of a newly learned fact about how one came to believe it?

We now consider several prominent responses to these questions. We start with views that deny much or any role for cognitive science in moral philosophy. We then look at views that assign to cognitive science a primarily negative role, in disqualifying or diminishing the plausibility of certain moral beliefs. We conclude by examining views that see the findings of cognitive science as playing a positive role in shaping moral theory.

6. Moral Cognition and the Is/Ought Distinction

We must start with the famous is/ought distinction. Often attributed to Hume (see “Hume”), the distinction is a commonsensical point about the difference between descriptive claims that characterize how things are (for example, “the puppy was sleeping”) and prescriptive claims that assert how things should or should not be (for instance, “it was very wrong of you to kick that sleeping puppy”). These are widely taken to be two different types of claims, and there is a lot to be said about the relationship between them. For our purposes, we may gloss it as follows: people often make mistakes when they act as if an ought-claim follows immediately and without further argument from an is-claim. The point is not (necessarily) that it is always a mistake to draw an ought-statement as a conclusion from an is-statement, just that the relationship between them is messy and it is easy to get confused. Some philosophers do assert the much stronger claim that ought-statements can never be validly drawn from is-statements, but this is not what everyone means when the issue is raised.

For our purposes, we are interested in how the is/ought distinction might matter to applying cognitive science to moral philosophy. Cognitive scientific findings are is-type claims; they describe facts about how our minds actually do work, not about how they ought to work. Yet the kind of cognitive scientific claims of interest here are claims about moral cognition—is-claims about the origin of ought-claims. Not surprisingly then, if the is/ought distinction tends to mark moments of high risk for confusion, we should expect this to be one of those moments. Some philosophers have argued that attempts to change moral beliefs on the basis of cognitive scientific findings are indeed confusions of this sort.

a. Semantic Is/Ought

In the mid-twentieth century it was popular to understand the is/ought distinction as a point about moral semantics. That is, the distinction pointed to a difference in the implicit logic of two uses of language. Descriptive statements (is-statements) are, logically speaking, used to attribute properties to objects; “the puppy is sleeping” just means that the sleeping-property attaches to the puppy-object. But prescriptive statements (ought-statements) do not have this logical structure. Their surface descriptive grammar disguises an imperative or expressive logic. So “it was very wrong of you to kick that sleeping puppy” is not really attributing the property of wrongness to your action of puppy-kicking. Rather, the statement means something like “don’t kick sleeping puppies!” or even “kicking sleeping puppies? Boo!” Or perhaps “I do not like it when sleeping puppies are kicked and I want you not to like it as well.” (See “Ethical Expressivism.”)

If this analysis of the semantics of moral statements is right, then we can easily see why it is a mistake to draw ought-conclusions from is-premises. Logically speaking, simple imperatives and expressives do not follow from simple declaratives. If you agree with “the puppy was sleeping” yet refuse to accept the imperative “don’t kick sleeping puppies!” you haven’t made a logical mistake. You haven’t made a logical mistake even if you also agree with “kicking sleeping puppies causes them to suffer.” The point here isn’t about the moral substance of animal cruelty—the point is about the logical relationship between two types of language use. There is no purely logical compulsion to accept any simple imperative or expressive on the basis of any descriptive claim, because the types of language do not participate in the right sort of logical relationship.

Interestingly, this sort of argument has not played much of a role in the debate about moral cognitive science, though seemingly it could. After all, cognitive scientific claims are, logically speaking, descriptive claims, so we could challenge their logical relevance to assessing any imperative or expressive claims. But, historically speaking, the rise of moral cognitive science came mostly after the height of this sort of semantic argument. Understanding the is/ought distinction in semantic terms like these had begun to fade from philosophical prominence by the 1970s, and especially by the 1980s when modern moral cognitive science (arguably) began. Contemporary moral expressivists are often eager to explain how we can legitimately draw inferences from is to ought despite their underlying semantic theory.

It is possible to see the simultaneous decline of simple expressivism and the rise of moral cognitive science as more than mere coincidence. Some historians of philosophy see the discipline as having pivoted from the linguistic turn of the late nineteenth and early twentieth centuries to the cognitive turn (or conceptual turn) of the late twentieth century. Philosophy of language, while still very important, has receded from its position at the center of every philosophical inquiry. In its place, at least in some areas of philosophy, is a growing interest in naturalism and consilience with scientific inquiry. Rather than approaching philosophy via the words we use, theorists often now approach it through the concepts in our minds—concepts which are in principle amenable to scientific study. As philosophers became more interested in the empirical study of their topics, they were more likely to encourage and collaborate with psychologists. This has certainly contributed to the growth of moral cognitive science since the 1990s.

b. Non-semantic Is/Ought

Setting aside semantic issues, we still have a puzzle about the is/ought distinction. How do we get to an ought conclusion from an is premise? The idea that prescriptive and descriptive claims are different types of claim retains its intuitive plausibility. Some philosophers have argued that scientific findings cannot lead us to (rationally) change our moral beliefs because science issues only descriptive claims. It is something like a category mistake to allow our beliefs in a prescriptive domain to depend crucially upon claims within a descriptive domain. More precisely, it is a mistake to revise your prescriptive moral beliefs because of some purely descriptive fact, even if it is a fact about those beliefs.

Of course, the idea here cannot be that it is always wrong to update moral beliefs on the basis of new scientific information. Imagine that you are a demolition crew chief and you are just about to press the trigger to implode an old factory. Suddenly one of your crew members shouts, “Wait, look at the thermal monitor! There’s a heat signature inside the factory—that’s probably a person! You shouldn’t press the trigger!” It would be absurd for you to reply that whether or not you should press the trigger cannot depend on the findings of scientific contraptions like thermal monitors.

What this example shows is that the is/ought problem, if it is a problem, is obviously not about excluding all empirical information from moral deliberation. But it is important to note that the scientific instrument itself does not tell us what we ought to do. We cannot just read off moral conclusions from descriptive scientific facts. We need some sort of bridging premise, something that connects the purely descriptive claim to a prescriptive claim. In the demolition crew case, it is easy to see what this bridging premise might be, something like: “if pressing a trigger will cause the violent death of an innocent person, then you should not press the trigger.” In ordinary moral interactions we often leave bridging premises implicit—it is obvious to everyone in this scenario that the presence of an innocent human implies the wrongness of going ahead with implosion.

But there is a risk in leaving our bridging premises implicit. Sometimes people seem to be relying upon implicit bridging premises that are not mutually agreed on, or that may not make any sense at all. Consider: “You shouldn’t give that man any spare change. He is a Templar.” Here you might guess that the speaker thinks Templars are bad people and do not deserve help, but you might not be sure—why would your interlocutor even care about Templars? And you are unlikely to agree with this premise anyway, so unless the person makes their anti-Templar views explicit and justifies them, you do not have much reason to follow their moral advice. Another example: “The tiles in my kitchen are purple, so it’s fine for you to let those babies drown.” It is actually hard to interpret this utterance as something other than a joke or metaphor. If someone tried to press it seriously, we would certainly demand to be informed of the bridging premise between linoleum color and nautical infanticide, and we would be skeptical that anything plausible might be provided.

So far, so simple. Now consider: “Brain area B is extremely active whenever you judge it morally wrong to cheat on your taxes. So it is morally wrong to cheat on your taxes.” What should we make of this claim? The apparent implicit bridging premise is: If brain area B gets excited when you judge it wrong to X, then it is wrong to X. But this is a very strange bridging premise. It does not make reference to any internal features of tax-cheating that might explain its wrongness. In fact, the premise appears to suggest that an act can be wrong simply because someone thinks it is wrong and some physical activity is going on inside that person’s body. This is not the sort of thing we normally offer to get to moral conclusions, and it is not immediately clear why anyone would find it convincing. Perhaps, as we said, this is an example of how easy it is to get confused about is and ought claims.

Two points come out here. First, some attempts at drawing moral conclusions from cognitive science involve implicit bridging premises that fall apart when anyone attempts to make them explicit. This is often true of popular science journalism, in which some new psychological finding is claimed to prove that such-and-such moral belief is mistaken. At times, philosophers have accused their psychologist or philosopher opponents of doing something similar. According to Berker (2009), Joshua Greene’s neuroscience-based debunking of deontology (discussed above) lacks a convincing bridging premise. Berker suggests that Greene avoids being explicit about this premise, because when it is made explicit it is either a traditional moral argument that does not use cognitive science to motivate its conclusion, or it employs cognitive scientific claims but does not lead to any plausible moral conclusion. Hence, Berker says, the neuroscience is normatively insignificant; it does not play a role in any plausible bridging premise to a moral conclusion.

Of course, even if this is right, it implies only that Greene’s argument fails. But Berker and other philosophers have expressed doubt that any cognitive science-based argument could generate a moral conclusion. They suggest that exciting newspaper headlines and book subtitles (for example, How Science Can Determine Human Values (Harris 2010)) trade on our leaving their bridging premises implicit and unchallenged. For—this is the second point—if there were a successful bridging principle, it would be a very unusual one. Why, we want to know, could the fact that such-and-such causal process precedes or accompanies a moral judgment give us reason to change our view about that moral judgment? Coincident causal processes do not appear to be morally relevant. What your brain is doing while you are making moral judgments seems to be in the same category as the color of your kitchen tiles—why should it matter?

Of course, the fact that causal processes are not typically used in bridging premises does not show that they could not be. But it does perhaps give us reason to be suspicious, and to insist that the burden of proof is on anyone who wishes to present such an argument. They must explain to us why their use of cognitive science is normatively significant. A later section considers different attempts to meet this burden of proof. First, though, consider an argument that it can never be met, even in principle.

7. The Autonomy of Morality

Some philosophers hold that it is a mistake to try to draw moral conclusions from psychological findings because doing so misunderstands the nature of moral deliberation. According to these philosophers, moral deliberation is essentially first-personal, while cognitive science can give us only third-personal forms of information about ourselves. When you are trying to decide what to do, morally speaking, you are looking for reasons that relate your options to the values you uphold. I have moral reason not to kick puppies because I recognize value in puppies (or at least the non-suffering of puppies). Psychological claims about your brain or psychological apparatus might be interesting to someone else observing you, but they are beside the point of first-personal moral deliberation about how to act.

Here is how Ronald Dworkin makes this point. He asks us to imagine learning some new psychological fact about why we have certain beliefs concerning justice. Suppose that the evidence suggests our justice beliefs are caused by self-interested psychological processes. Then:

It will be said that it is unreasonable for you still to think that justice requires anything, one way or the other. But why is that unreasonable? Your opinion is one about justice, not about your own psychological processes . . . You lack a normative connection between the bleak psychology and any conclusion about justice, or any other conclusion about how you should vote or act. (Dworkin 1996, 124–125)

The idea here is not just that beliefs about morality and beliefs about psychology are about different topics. Part of the point is that morality is special—it is not just another subject of study alongside psychology and sociology and economics. Some philosophers put this point in terms of an a priori / a posteriori distinction: morality is something that we can work out entirely in our minds, without needing to go and do experiments (though of course we might need empirical information to apply moral principles once we have figured them out). Notice that when philosophers debate moral principles with one another, they do not typically conduct or make reference to empirical studies. They think hard about putative principles, come up with test cases that generate intuitive verdicts, and then think hard again about how to modify principles to make them fit these verdicts. The process does not require empirical input, so there is no basis for empirical psychology to become involved.

This view is sometimes called the autonomy of morality (Fried 1978; Nagel 1978). It holds that, in the end, the arbiter of our moral judgments will be our moral judgments—not anything else. The only way you can get a moral conclusion from a psychological finding is to provide the normative connection that Dworkin refers to. So, for instance: if you believe that it is morally wrong to vote in a way triggered by a selfish psychological process, then finding out that your intending to vote for the Egalitarian Party is somehow selfishly motivated could give you a reason to reconsider your vote. But notice that this still crucially depends upon a moral judgment—the judgment that it is wrong to vote in a way triggered by selfish psychological processes. There is no way to get entirely out of the business of making moral judgments; psychological claims are morally inert unless accompanied by explicitly moral claims.

The point here can be made in weaker and stronger forms. The weaker form is simply this: empirical psychology cannot generate moral conclusions entirely on its own. This weaker form is accepted by most philosophers; few deny that we need at least some moral premises in order to get moral conclusions. Notice though that even the weaker form casts doubt on the idea that psychology might serve as an Archimedean arbiter of moral disagreement. Once we’ve conceded that we must rely upon moral premises to get results from psychological premises, we cannot claim that psychology is a value-neutral platform from which to settle matters of moral controversy.

The stronger form of the autonomy of morality claim holds that a psychological approach to morality fundamentally misunderstands the topic. Moral judgment, in this view, is about taking agential responsibility for our own values and actions. Re-describing these in psychological terms, so that our commitments are just various causal levers, represents an abdication of this responsibility. We should instead maintain a focus on thinking through the moral reasons for and against our positions, leaving psychology to the psychologists.

Few philosophers completely accept the strongest form of the autonomy of morality. That is, most philosophers agree that there are at least some ways psychological genealogy could change the moral reasons we take ourselves to have. But it will be helpful for us to keep this strong form in mind as a sort of null hypothesis. As we now turn to various theoretical arguments in support of a role for cognitive science in moral theory, we can understand each as a way of addressing the strong autonomy of morality challenge. They are arguments demonstrating that psychological investigation is important to our understanding of our moral commitments.

8. Moral Cognition and Moral Epistemology

Many moral philosophers think about their moral judgments (or intuitions) as pieces of evidence in constructing moral theories. Rival moral principles, such as those constituting deontology and consequentialism, are tested by seeing if they get it right on certain important cases.

Take the well-known example of the Trolley Problem. An out-of-control trolley is rumbling toward several innocents trapped on a track. You can divert the trolley, but only by sending it onto a side track where it will kill one person. Most people think it is morally permissible to divert the trolley in this case. Now imagine that the trolley cannot be diverted—it can be stopped only by physically pushing a large person into its path, causing an early crash. Most people do not think this is morally permitted. If we treat these two intuitive reactions as moral evidence, then we can affirm that how an individual is killed makes a difference to the rightfulness of sacrificing a smaller number of lives to save a larger number. Apparently, it is permissible to indirectly sacrifice an innocent as a side-effect of diverting a threat, but not permissible to directly use an innocent as a means to stop the threat. This seems like preliminary evidence against a moral theory that says that outcomes are the only things that matter morally, since the outcomes are identical in these two cases.

This sort of reasoning is at the center of how most philosophers practice normative ethics. Moral principles are proposed to account for intuitive reactions to cases. The best principles are (all else equal) those that cohere with the largest number of important intuitions. A philosopher who wishes to challenge a principle will construct a clever counterexample: a case where it just seems obvious that it is wrong to do X, but the targeted principle allows us to do X in the case. Proponents of the principle must now (a) show that their principle has been misapplied and actually gives a different verdict about the case; (b) accept that the principle has gone wrong but alter it to give a better answer; (c) bite the bullet and insist that even if the principle seems to have gone wrong here, it is still trustworthy because it is right in so many other cases; or (d) explain away the problematic intuition, by showing that the test case is underdescribed or somehow unfair, or that the intuitive reaction itself likely results from confusion. If all of this sounds a bit like the testing of scientific hypotheses against experimental data, that is no accident. The philosopher John Rawls (1971) explicitly modeled this approach, which he called “reflective equilibrium,” on hypothesis testing in science (see “Rawls”).

There are various ways to understand what reflective equilibrium aims at doing. In a widely accepted interpretation, reflective equilibrium aims at discovering the substantive truths of ethics. In this understanding, those moral principles supported in reflective equilibrium are the ones most likely to be the moral truth. (How to interpret truth in the moral domain is left aside in this article, but see “Metaethics.”) Intuitions about test cases are evidence for moral truth in much the same way that scientific observations are evidence for empirical truth. In science, our confidence in a particular theory depends on whether it gets evidential support from repeated observations, and in moral philosophy (in this conception) our confidence in a particular ethical theory depends on whether it gets evidential confirmation from multiple intuitions.

This parallel allows us to see one way in which the psychology of moral judgment might be relevant to moral philosophy. When we are testing a scientific theory, our trust in any experimental observation depends on our confidence in the scientific instruments used to generate it. If we come to doubt the reliability of our instruments, then we should doubt the observations we get from them, and so should doubt the theories they appear to support. What, then, if we come to doubt the reliability of our instruments in moral philosophy? Of course, philosophers do not use microscopes or mass spectrometers. Our instruments are nothing more than our own minds—or, more precisely, our mental abilities to understand situations and apply moral concepts to them. Could our moral minds be unreliable in the way that microscopes can be unreliable?

This is certainly not an idle worry, since we know that we make consistent mental mistakes in other domains. Think about persistent optical illusions or predictably bad mathematical reasoning. There is an enormous literature in cognitive science showing that we make consistent mistakes in domains other than morality. Against this background, it would be surprising if our moral intuitions were somehow exempt from similar mistakes.

In earlier sections we discussed psychological evidence showing systematic deficits in moral reasoning. We saw, for instance, that people’s moral judgments are affected by the verbal framing in which test cases are presented (save versus die) and by the cleanliness of their immediate environment. If the readings of a scientific instrument appeared to be affected by environmental factors that had nothing to do with what the instrument was supposed to measure, then we would rightly doubt the trustworthiness of readings obtained from that instrument. In moral philosophy, a mere difference in verbal framing and the dirtiness of the desk you are now sitting at certainly do not seem like things that matter to whether or not a particular act is permissible. So it seems that our moral intuitions, like defective lab equipment, sometimes cannot be trusted.

Note how this argument relates to earlier claims about the autonomy of morality or the is/ought distinction. Here no one is claiming that cognitive science tells us directly which moral judgments are right and which wrong. All cognitive science can do is show us that particular intuitions are affected by certain causal factors—it cannot tell us which causal factors count as distorting and which are acceptable. It is up to us, as moral judges, to determine that differences of verbal framing (save/die) or desk cleanliness do not lead to genuine moral differences. Of course, we do not have to think very hard to decide that these causal factors are morally irrelevant—but the point remains that we are still making moral judgments, even very easy ones, and not getting our verdict directly from cognitive science.

Many proponents of a role for cognitive science in morality are willing to concede this much: in the end, any revision of our moral judgments will be authorized only by some other moral judgments, not by the science itself. But, they will now point out, there are some revisions of our moral judgments that we ought to make, and that we are able to make only because of input from cognitive science. We all agree that our moral judgments should not be affected by the cleanliness of our environment, and the science is unnecessary for our agreeing on this. But we would not know that our moral judgments are in fact affected by environmental cleanliness without cognitive science. So, in this sense at least, improving the quality of our moral judgments does seem to require the use of cognitive science. Put more positively: paying attention to the cognitive science of morality can allow us to realize that some seemingly reliable intuitions are not in fact reliable. Once these are set aside like broken microscopes, we can have greater confidence that the theories we build from the remainder will capture the moral truth. (See, for example, Sinnott-Armstrong 2008; Mason 2011.)

This argument is one way of showing the relevance of cognitive science to morality. Note that it is an epistemic debunking argument. The argument undermines certain moral intuitions as default sources of evidential justification. Cognitive science plays a debunking role by exposing the unreliable causal origins of our intuitions. Philosophers who employ debunking arguments like this do so with a range of aims. Some psychological debunking arguments are narrowly targeted, trying to show that a few particular moral intuitions are unjustified. Often this is meant to be a move within normative theory, weakening a set of principles the philosopher rejects. For instance, if intuitions supporting deontological moral theory are undermined, then we may end up dismissing deontology. Other times, philosophers intend to aim psychological debunking arguments much more widely. If it can be shown that all moral intuitions are undermined in this way, then we have grounds for skepticism about moral judgment [see “Moral Epistemology.”] The plausibility of all these arguments remains hotly debated. But if any of them should turn out to be convincing, then we have a clear demonstration of how cognitive science can matter to moral theory.

9. Non-epistemic Approaches

Epistemic debunking arguments presuppose that morality is best understood as an epistemic domain. That is, these arguments assume that there is (or could be) a set of moral truths, and that the aim of moral judgment is to track these moral truths. What if we do not accept this conception of the moral domain? What if we do not expect there to be moral truths, or do not think that moral intuition aims at tracking any such truth? In that case, should we care about the cognitive science of morality?

a. Consistency in Moral Reasoning

Obviously the answer to this question will depend on what we think the moral domain is, if it is not an epistemic domain. One common view, often associated with contemporary Kantians, is that morality concerns consistency in practical reasoning. Though we do not aim to uncover independent moral truths, we can still say that some moral beliefs are better than others, because some moral beliefs are better at cohering with the way in which we conceive of ourselves, or what we think we have most reason to do. In this understanding, a bad moral belief is not bad because it is mistaken about the moral facts (on this view, there are no moral facts) but is bad because it does not fit well with our other normative beliefs. The assumption here is that we want to be coherent, in that being a rational agent means aiming at consistency in the beliefs that ground one’s actions. [See “Moral Epistemology.”]

In this conception of the moral domain, cognitive science is useful because it can show us when we have unwittingly stumbled into inconsistency in our moral beliefs. A very simple example comes from the universalizability condition on moral judgment. It is incoherent to render different moral judgments about two people doing the exact same thing in the exact same circumstances. This is because morality (unlike taste, for instance) aims at universal prescription. If it is wrong for you to steal from me, then it is wrong for me to steal from you. (Assuming you and I are similarly situated—if I am an unjustly rich oligarch and you are a starving orphan, maybe things are different. But then this moral difference is due to different features of our respective situations. The point of universal prescription is to deny that there can be moral differences when situations are similar.) A person who explicitly denied that moral requirements applied equally to herself and to other people would not seem to really have gotten the point of morality. We would not take such a person very seriously when she complained of others violating her moral rights, if she claimed to be able to ignore those same rights in others.

We all know that we fail at universalizing our moral judgments sometimes; we all suffer moments of weakness where we try to make excuses for our own moral failings, excuses we would not permit to others. But some psychological research suggests that we may fail in this way far more often than we realize. Nadelhoffer and Feltz (2008) found that people make different judgments about moral dilemmas when they imagine themselves in the dilemma than when imagining others in the same dilemma. Presumably most people would not explicitly agree that there is such a moral difference, but they can be led into endorsing differing standards depending on whether they are presented with a me versus someone else framing of the case.  This is an unconscious failure of universalization, but it is still an inconsistency. If we aim at being consistent in our practical reasoning, we should want to be alerted to unconscious inconsistencies, so that we might get a start on correcting them. And in cases like this one, we do not have introspective access to the fact that we are inconsistent, but we can learn it from cognitive science.

Note how this argument parallels the epistemic one. The claim is, again, not that cognitive science tells us what counts as a good moral judgment. Rather cognitive science reveals to us features of the moral judgments we make, and we must then use moral reasoning to decide whether these features are problematic. Here the claim is that inconsistency in moral judgments is bad, because it undermines our aim to be coherent rational agents. We do not get that claim from cognitive science, but there are some cases where we could not apply it without the self-knowledge we gain from cognitive science. Hence the relevance of cognitive science to morality as aimed at consistency.

b. Rational Agency

There is another way in which cognitive science can matter to the coherent rational agency conception of morality. Some findings in cognitive science may threaten the intelligibility of this conception altogether. Recall, from section 2, the psychologist Jonathan Haidt’s work on moral dumbfounding; people appear to spontaneously invent justifications for their intuitive moral verdicts, and stick with these verdicts even after the justifications are shown to fail. If pressed hard enough, people will admit they simply do not know why they came to the verdicts, but hold to them nevertheless. In Haidt’s interpretation, these findings show that moral judgment happens almost entirely unconsciously, with conscious moral reasoning mostly a post hoc epiphenomenon.

If Haidt is right, Jeanette Kennett and Cordelia Fine (2009) point out, then this poses a serious problem for the ideal of moral agency. For us to count as moral agents, there needs to be the right sort of connection between our conscious reasoning and our responses to the world. A robot or a simple animal can react, but a rational agent is one that can critically reflect upon her reasons for action and come to a deliberative conclusion about what she ought to do. Yet if we are morally dumbfounded in the way Haidt suggests, then our conscious moral reasoning may lack the appropriate connection to our moral reactions. We think that we know why we judge and act as we do, but actually the reasons we consciously endorse are mere post hoc confabulations.

In the end, Kennett and Fine argue that Haidt’s findings do not actually lead to this unwelcome conclusion. They suggest that he has misinterpreted what the experiments show, and that there is a more plausible interpretation that preserves the possibility of conscious moral agency. Note that responding in this way concedes that cognitive science might be relevant to assessing the status of our moral judgments. The dispute here is only over what the experiments show, not over what the implications would be if they showed a particular thing. This leaves the door open for further empirical research on conscious moral agency.

One possible approach is the selective application of Haidt’s argument. If it could be shown that certain moral judgments—those about a particular topic or sub-domain of morality—are especially prone to moral dumbfounding, then we might have the basis for disqualifying them from inclusion in reflective moral theory. This seems, at times, to be the approach adopted by Joshua Greene (see section 4) in his psychological attack on deontology. According to Greene (2014), deontological intuitions are of a psychological type distinctively disconnected from conscious reflection and should accordingly be distrusted. Many philosophers dispute Greene’s claims (see, for instance, Kahane 2012), but this debate itself shows the richness of engagement between ethics and cognitive science.

c. Intersubjective Justification

There is one further way in which cognitive science may have relevance to moral theory. In this last conception, morality is essentially concerned with intersubjective justification. Rather than trying to discover independent moral truths, my moral judgments aim at determining when and how my preferences can be seen as reasonable by other people. A defective moral judgment, in this conception, is one that reflects only my own personal idiosyncrasies and so will not be acceptable to others. For instance, perhaps I have an intuitive negative reaction to people who dress their dogs in sweaters even when it is not cold. If I come to appreciate that my revulsion at this practice is not widely shared, and that other people cannot see any justification for it, then I may conclude that it is not properly a moral judgment at all. It may be a matter of personal taste, but it cannot be a moral judgment if it has no chance of being intersubjectively justified.

Sometimes we can discover introspectively that our putative moral judgments are actually not intersubjectively justifiable, just by thinking carefully about what justifications we can or cannot offer. But there may be other instances in which we cannot discover this introspectively, and where cognitive science may help. This is especially so when we have unknowingly confabulated plausible-sounding justifications in order to make our preferences appear more compelling than they are (Rini 2013). For example, suppose that I have come to believe that a particular charity is the most deserving of our donations, and I am now trying to convince you to agree. You point out that other charities seem to be at least as effective, but I insist. By coincidence, the next day I participate in a psychological study of color association. The psychologists’ instruments notice that I react very positively to the color forest green—and then I remember that my favorite charity’s logo is a deep forest green. If this turns out to be the explanation for why I argued for the charity, then I should doubt that I have provided an intersubjective justification. Maybe my fondness for the charity’s logo is an okay reason for me to make a personal choice (if I am otherwise indifferent), but it certainly is not a reason for you to agree. Now that I am aware of this psychological influence, I should consider the possibility that I have merely confabulated the reasons I offered to you.

10. Objections and Alternatives

The preceding sections have focused on negative implications of the cognitive science of moral judgment. We have seen ways in which learning about a particular moral judgment’s psychological origins might lead us to disqualify it, or at least reduce our confidence in it. This final section briefly considers some objections to drawing such negative implications, and also discusses more positive proposals for the relationship between cognitive science and moral philosophy.

a. Explanation and Justification

One objection to disqualifying a moral judgment on cognitive scientific grounds is that this involves a confusion between explanatory reasons and justifying reasons. The explanatory reason for the fact that I judge X to be immoral could be any number of psychological factors. But my justifying reason for the judgment is unlikely to be identical with the explanatory reason. Consider my judgment that it is wrong to tease dogs with treats that will not be provided. Perhaps the explanatory reason for my believing this is that my childhood dog bit me when I refused to share a sandwich. But this is not how I justify my judgment—the justifying reason I have is that dogs suffer when led to form unfulfilled expectations, and the suffering of animals is a moral bad. As long as this is a good justifying reason, the explanatory reason does not really matter. So, runs the objection, those who disqualify moral judgments on cognitive scientific grounds are looking at the wrong thing—they should be asking about whether the judgment is justified, not why (psychologically speaking) it was made (Kamm 1998; van Roojen 1999).

One problem with this objection is that it assumes we have a basis for affirming the justifying reasons for a judgment that is unaffected by cognitive scientific investigation. Obviously if we had oracular certainty that judgment X is correct, then we should not worry about how we came to make the judgment. But in moral theory we rarely (if ever) have such certainty. As discussed earlier (see section 8), our justification for trusting a particular judgment is often dependent upon how well it coheres with other judgments and general principles. So if a cognitive scientific finding showed that some dubious psychological process is responsible for many of our moral judgments, their ability to justify one another may be in question. To see the point, consider the maximal case: suppose you learned that all of your moral judgments were affected by a chemical placed in the water supply by the government. Would this knowledge not give you some reason to second-guess your moral judgments? If that is right, then it seems that our justifying reasons for holding to a judgment can be sensitive to at least some discoveries about the explanatory reasons for them. (For related arguments, see Street 2006 and Joyce 2006.)

b. The Expertise Defense

Another objection claims to protect the judgments used in moral theory-making even while allowing the in-principle relevance of cognitive scientific findings. The claim is this: cognitive science uses research subjects who are not experts in making moral judgments. But moral philosophers have years of training at drawing careful distinctions, and also typically have much more time than research subjects to think carefully about their judgments. So even if ordinary participants in cognitive science studies make mistakes due to psychological quirks, we should not assume that the judgments of experts will resemble those of non-experts. We do not doubt the competence of expert mathematicians simply because the rest of us make arithmetic mistakes (Ludwig 2007). So, the objection runs, if it is plausible to think of moral philosophers as experts, then moral philosophers can continue to rely upon their judgments whatever the cognitive science says about the judgments of non-experts.

Is this expertise defense plausible? One major problem is that it does not appear to be well supported by empirical evidence. In a few studies (Schwitzgebel and Cushman 2012; Tobia, Buckwalter, and Stich 2013), people with doctorates in moral philosophy have been subjected to the same psychological tests as non-expert subjects and appear to make similar mistakes. There is some dispute about how to interpret these studies (Rini 2015), but if they hold up then it will be hard to defend the moral judgments of philosophers on grounds of expertise.

c. The Regress Challenge

A final objection comes in the form of a regress challenge. Henry Sidgwick first made the point, in his Methods of Ethics, that it would be self-defeating to attempt to debunk judgments on the grounds of their causal origins. The debunking itself would rely on some judgments for its plausibility, and we would then be led down an infinite regress in querying the causal origins of these judgments, and causal origins of the judgments responsible for our judgments about those first origins, and so on. Sidgwick seems to be discussing general moral skepticism, but a variant of this argument presents a regress challenge to even selective cognitive scientific debunking of particular moral judgments. According to the objection, once we have opened the door to debunking, we will be drawn into an inescapable spiral of producing and challenging judgments about the moral trustworthiness of various causal origins. Perhaps, then, we should not start on this project at all.

This objection is limited in effect; it applies most obviously to epistemic forms of cognitive scientific debunking. In non-epistemic conceptions of the aims of moral judgment, it may be possible to ignore some of the objection’s force. The objection is also dependent upon certain empirical assumptions about the interdependence of the causal origins driving various moral judgments. But if sustained, the regress challenge for epistemic debunking seems significant.

d. Positive Alternatives

Finally, we might consider an alternative take on the relationship between moral judgment and cognitive science. Unlike most of the approaches discussed above, this one is positive rather than negative. The idea is this: if cognitive science can reveal to us that we already unconsciously accept certain moral principles, and if these fit with the judgments we think we have good reason to continue to hold, then cognitive science may be able to contribute to the construction of moral theory. Cognitive science might help you to explicitly articulate a moral principle that you already accepted implicitly (Mikhail 2011; Kahane 2013). In a sense, this is simply scientific assistance to the traditional philosophical project of making explicit the moral commitments we already hold—the method of reflective equilibrium developed by Rawls and employed by most contemporary ethicists. In this view, the use of cognitive science is likely to be less revolutionary, but still quite important. Though negative approaches have received most discussion, the positive approach seems to be an interesting direction for future research.

11. References and Further Reading

  • Berker, Selim. 2009. “The Normative Insignificance of Neuroscience.” Philosophy & Public Affairs 37 (4): 293–329. doi:10.1111/j.1088-4963.2009.01164.x.
  • Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
  • Cosmides, Leda, and John Tooby. 1992. “Cognitive Adaptations for Social Exchange.” In The Adapted Mind: Evolutionary Psychology and the Generation of Culture, edited by Jerome H. Barkow, Leda Cosmides, and John Tooby, 163–228. New York: Oxford University Press.
  • Crockett, Molly J., Luke Clark, Marc D. Hauser, and Trevor W. Robbins. 2010. “Serotonin Selectively Influences Moral Judgment and Behavior through Effects on Harm Aversion.” Proceedings of the National Academy of Sciences of the United States of America 107 (40): 17433–38. doi:10.1073/pnas.1009396107.
  • Danziger, Shai, Jonathan Levav, and Liora Avnaim-Pesso. 2011. “Extraneous Factors in Judicial Decisions.” Proceedings of the National Academy of Sciences 108 (17): 6889–92. doi:10.1073/pnas.1018033108.
  • Dworkin, Ronald. 1996. “Objectivity and Truth: You’d Better Believe It.” Philosophy & Public Affairs 25 (2): 87–139.
  • Dwyer, Susan. 2006. “How Good Is the Linguistic Analogy?” In The Innate Mind: Culture and Cognition, edited by Peter Carruthers, Stephen Laurence, and Stephen Stich. Oxford: Oxford University Press.
  • Eskine, Kendall J., Natalie A. Kacinik, and Jesse J. Prinz. 2011. “A Bad Taste in the Mouth: Gustatory Disgust Influences Moral Judgment.” Psychological Science 22 (3): 295–99. doi:10.1177/0956797611398497.
  • Fried, Charles. 1978. “Biology and Ethics: Normative Implications.” In Morality as a Biological Phenomenon: The Presuppositions of Sociobiological Research, 187–97. Berkeley, CA: University of California Press.
  • Gilligan, Carol. 1982. In a Different Voice: Psychological Theory and Women’s Development. Cambridge, MA: Harvard University Press.
  • Greene, Joshua D. 2008. “The Secret Joke of Kant’s Soul.” In Moral Psychology, Vol. 3. The Neuroscience of Morality: Emotion, Brain Disorders, and Development, edited by Walter Sinnott-Armstrong, 35–80. Cambridge, MA: MIT Press.
  • Greene, Joshua D. 2014. “Beyond Point-and-Shoot Morality: Why Cognitive (Neuro)Science Matters for Ethics.” Ethics 124 (4): 695–726. doi:10.1086/675875.
  • Haidt, Jonathan. 2001. “The Emotional Dog and Its Rational Tail: A Social Intuitionist Approach to Moral Judgment.” Psychological Review 108 (4): 814–34.
  • Haidt, Jonathan. 2012. The Righteous Mind: Why Good People Are Divided by Politics and Religion. 1st ed. New York: Pantheon.
  • Hamlin, J. Kiley, Karen Wynn, and Paul Bloom. 2007. “Social Evaluation by Preverbal Infants.” Nature 450 (7169): 557–59. doi:10.1038/nature06288.
  • Harris, Sam. 2010. The Moral Landscape: How Science Can Determine Human Values. First Edition. New York: Free Press.
  • Hauser, Marc D. 2006. “The Liver and the Moral Organ.” Social Cognitive and Affective Neuroscience 1 (3): 214–20. doi:10.1093/scan/nsl026.
  • Joyce, Richard. 2006. The Evolution of Morality. 1st ed. Cambridge, MA: MIT Press.
  • Kahane, Guy. 2012. “On the Wrong Track: Process and Content in Moral Psychology.” Mind & Language 27 (5): 519–45. doi:10.1111/mila.12001.
  • Kahane, Guy. 2013. “The Armchair and the Trolley: An Argument for Experimental Ethics.” Philosophical Studies 162 (2): 421–45. doi:10.1007/s11098-011-9775-5.
  • Kamm, F. M. 1998. “Moral Intuitions, Cognitive Psychology, and the Harming-Versus-Not-Aiding Distinction.” Ethics 108 (3): 463–88.
  • Kennett, Jeanette, and Cordelia Fine. 2009. “Will the Real Moral Judgment Please Stand up? The Implications of Social Intuitionist Models of Cognition for Meta-Ethics and Moral Psychology.” Ethical Theory and Moral Practice 12 (1): 77–96.
  • Knobe, Joshua. 2003. “Intentional Action and Side Effects in Ordinary Language.” Analysis 63 (279): 190–94. doi:10.1111/1467-8284.00419.
  • Koenigs, Michael, Liane Young, Ralph Adolphs, Daniel Tranel, Fiery Cushman, Marc Hauser, and Antonio Damasio. 2007. “Damage to the Prefrontal Cortex Increases Utilitarian Moral Judgements.” Nature 446 (7138): 908–11. doi:10.1038/nature05631.
  • Kohlberg, Lawrence. 1971. “From ‘Is’ to ‘Ought’: How to Commit the Naturalistic Fallacy and Get Away with It in the Study of Moral Development.” In Cognitive Development and Epistemology, edited by Theodore Mischel. New York: Academic Press.
  • Ludwig, Kirk. 2007. “The Epistemology of Thought Experiments: First Person versus Third Person Approaches.” Midwest Studies In Philosophy 31 (1): 128–59. doi:10.1111/j.1475-4975.2007.00160.x.
  • Mason, Kelby. 2011. “Moral Psychology And Moral Intuition: A Pox On All Your Houses.” Australasian Journal of Philosophy 89 (3): 441–58. doi:10.1080/00048402.2010.506515.
  • Mikhail, John. 2011. Elements of Moral Cognition: Rawls’ Linguistic Analogy and the Cognitive Science of Moral and Legal Judgment. 3rd ed. Cambridge: Cambridge University Press.
  • Moll, Jorge, Roland Zahn, Ricardo de Oliveira-Souza, Frank Krueger, and Jordan Grafman. 2005. “The Neural Basis of Human Moral Cognition.” Nature Reviews Neuroscience 6 (10): 799–809. doi:10.1038/nrn1768.
  • Nadelhoffer, Thomas, and Adam Feltz. 2008. “The Actor–Observer Bias and Moral Intuitions: Adding Fuel to Sinnott-Armstrong’s Fire.” Neuroethics 1 (2): 133–44.
  • Nagel, Thomas. 1978. “Ethics as an Autonomous Theoretical Subject.” In Morality as a Biological Phenomenon: The Presuppositions of Sociobiological Research, edited by Gunther S. Stent, 198–205. Berkeley, CA: University of California Press.
  • Newman, George E., Paul Bloom, and Joshua Knobe. 2014. “Value Judgments and the True Self.” Personality and Social Psychology Bulletin 40 (2): 203–16. doi:10.1177/0146167213508791.
  • Nichols, Shaun, and Joshua Knobe. 2007. “Moral Responsibility and Determinism: The Cognitive Science of Folk Intuitions.” Noûs 41 (4): 663–85. doi:10.1111/j.1468-0068.2007.00666.x.
  • Rawls, John. 1971. A Theory of Justice. 1st ed. Cambridge, MA: Harvard University Press.
  • Rini, Regina A. 2013. “Making Psychology Normatively Significant.” The Journal of Ethics 17 (3): 257–74. doi:10.1007/s10892-013-9145-y.
  • Rini, Regina A. 2015. “How Not to Test for Philosophical Expertise.” Synthese 192 (2): 431–52.
  • Ruff, C. C., G. Ugazio, and E. Fehr. 2013. “Changing Social Norm Compliance with Noninvasive Brain Stimulation.” Science 342 (6157): 482–84. doi:10.1126/science.1241399.
  • Schnall, Simone, Jonathan Haidt, Gerald L. Clore, and Alexander H. Jordan. 2008. “Disgust as Embodied Moral Judgment.” Personality & Social Psychology Bulletin 34 (8): 1096–1109. doi:10.1177/0146167208317771.
  • Schwitzgebel, Eric, and Fiery Cushman. 2012. “Expertise in Moral Reasoning? Order Effects on Moral Judgment in Professional Philosophers and Non-Philosophers.” Mind and Language 27 (2): 135–53.
  • Sinnott-Armstrong, Walter. 2008. “Framing Moral Intuition.” In Moral Psychology, Vol 2. The Cognitive Science of Morality: Intuition and Diversity, 47–76. Cambridge, MA: MIT Press.
  • Skinner, B. F. 1971. Beyond Freedom and Dignity. New York: Knopf.
  • Street, Sharon. 2006. “A Darwinian Dilemma for Realist Theories of Value.” Philosophical Studies 127 (1): 109–66.
  • Strohminger, Nina, Richard L. Lewis, and David E. Meyer. 2011. “Divergent Effects of Different Positive Emotions on Moral Judgment.” Cognition 119 (2): 295–300. doi:10.1016/j.cognition.2010.12.012.
  • Sunstein, Cass R. 2005. “Moral Heuristics.” Behavioral and Brain Sciences 28 (4): 531–42.
  • Terbeck, Sylvia, Guy Kahane, Sarah McTavish, Julian Savulescu, Neil Levy, Miles Hewstone, and Philip Cowen. 2013. “Beta Adrenergic Blockade Reduces Utilitarian Judgement.” Biological Psychology 92 (2): 323–28.
  • Tobia, Kevin, Wesley Buckwalter, and Stephen Stich. 2013. “Moral Intuitions: Are Philosophers Experts?” Philosophical Psychology 26 (5): 629–38. doi:10.1080/09515089.2012.696327.
  • Tversky, A., and D. Kahneman. 1981. “The Framing of Decisions and the Psychology of Choice.” Science 211 (4481): 453–58. doi:10.1126/science.7455683.
  • Van Roojen, Mark. 1999. “Reflective Moral Equilibrium and Psychological Theory.” Ethics 109 (4): 846–57.
  • Wheatley, Thalia, and Jonathan Haidt. 2005. “Hypnotic Disgust Makes Moral Judgments More Severe.” Psychological Science 16 (10): 780–84. doi:10.1111/j.1467-9280.2005.01614.x.
  • Wilson, E. O. 1975. Sociobiology: The New Synthesis. Cambridge, MA: Harvard University Press.
  • Yang, Qing, Xiaochang Wu, Xinyue Zhou, Nicole L. Mead, Kathleen D. Vohs, and Roy F. Baumeister. 2013. “Diverging Effects of Clean versus Dirty Money on Attitudes, Values, and Interpersonal Behavior.” Journal of Personality and Social Psychology 104 (3): 473–89. doi:10.1037/a0030596.
  • Young, Liane, Joan Albert Camprodon, Marc Hauser, Alvaro Pascual-Leone, and Rebecca Saxe. 2010. “Disruption of the Right Temporoparietal Junction with Transcranial Magnetic Stimulation Reduces the Role of Beliefs in Moral Judgments.” Proceedings of the National Academy of Sciences 107 (15): 6753–58. doi:10.1073/pnas.0914826107.

 

Author Information

Regina A. Rini
Email: gina.rini@nyu.edu
New York University
U. S. A.

Theories of Religious Diversity

Religious diversity is the fact that there are significant differences in religious belief and practice. It has always been recognized by people outside the smallest and most isolated communities. But since early modern times, increasing information from travel, publishing, and emigration has forced thoughtful people to reflect more deeply on religious diversity. Roughly, pluralistic approaches to religious diversity say that, within bounds, one religion is as good as any other. In contrast, exclusivist approaches say that only one religion is uniquely valuable. Finally, inclusivist theories try to steer a middle course by agreeing with exclusivism that one religion has the most value while also agreeing with pluralism that others still have significant religious value.

What values are at issue? Literature since 1950 focuses on the truth or rationality of religious teachings, the veridicality (conformity with reality) of religious experiences, salvific efficacy (the ability to deliver whatever cure religion should provide), and alleged directedness towards one and the same ultimate religious object.

The exclusivist-inclusivist-pluralist trichotomy has become standard since the 1980s. Unfortunately, it is often used with some mix of the above values in mind, leaving it unclear exactly which values are pertinent. While this trichotomy is sometimes thought of in terms of general attitudes that a religious person may have towards other religions—approximately the attitudes of rejection, limited openness, and wide acceptance respectively—in this article the three positions figure as theories concerning the facts of religious diversity. “Religious pluralism” in some contexts means an informed, tolerant, and appreciative or sympathetic view of the various religions. In other contexts, “religious pluralism” is a normative principle requiring that peoples of all or most religions be treated the same. In this article, “religious pluralism” refers to a theory about the diversity of religions. Finally, some authors use “descriptive religious pluralism” to mean what is here called “religious diversity,” and use “normative religious pluralism” for views that are here called varieties of “religious pluralism.” While the trichotomy has been repeatedly challenged, it is still widely used, and can be precisely defined in various ways.

Table of Contents

  1. Facts and Theories of Religious Diversity
    1. History
    2. Theories and Associations
  2. Religious Pluralism
    1. Naive Pluralisms
    2. Core Pluralisms
    3. Hindu Pluralisms
    4. Ultimist Pluralisms
    5. Identist Pluralisms
    6. Expedient Means Pluralism
  3. Exclusivism
    1. Terminological Problems
    2. Naive Exclusivisms
    3. Christian Exclusivisms
  4. Inclusivism
    1. Terminological Problems
    2. Abrahamic Inclusivisms
    3. Buddhist Inclusivisms
    4. Plural Salvations Inclusivism
  5. References and Further Reading

1. Facts and Theories of Religious Diversity

Scholars distinguish seven aspects of religious traditions: the doctrinal and philosophical, the mythic and narrative, the ethical and legal, the ritual and practical, the experiential and emotional, the social and organizational, and the material and artistic. (Smart 1996) Religious traditions differ along all these dimensions. These are the undisputed facts of religious diversity. Some authors, usually ones who wish to celebrate these facts, call them “religious pluralism,” but this entry reserves this label for a family of theories about the facts of religious diversity.

It is arguably the doctrinal and philosophical aspects of a religion which are foundational, in that the other aspects can only be understood in light of them. Particularly central to any religion’s teaching are its diagnosis of the fundamental problem facing human beings and its suggested cure, a way to positively and permanently resolve this problem. (Prothero 2010, 13-6; Yandell 1999, 16-9)

a. History

Scholarly study of a wide range of religions, and comparison and evaluation of them, was to a large extent pioneered by Christian missionaries in the nineteenth century seeking to understand those whom they sought to convert. This led to both the questioning and the defense of various “exclusivist” traditional Christian claims. (Netland 2001, 23-54) Theories of religious diversity have largely been driven by attacks on and defenses of such claims, and discussions continue within the realm of Christian theology. (Kärkkäinen 2003; Netland 2001) The most famous of these has been the view (held by some Christians) that all non-Christians are doomed to an eternity of conscious suffering in hell. (Hick 1982 ch. 2; see section 3c below)

All the theories discussed in this article are ways that (usually religious) people regard other religions, but here we discuss them abstractly, without descending much into the details of how they would be worked out in the teachings and practices of any one religion. Such would be the work of a religiously embedded and committed theology of religious diversity, not of a general philosophy of religious diversity.

b. Theories and Associations    

Many people associate any sort of pluralist theory of religious diversity with a number of arguably good qualities. These qualities include but are not limited to: being humble, reasonable, kind, broad-minded, open-minded, informed, cosmopolitan, modern, properly appreciative of difference, non-bigoted, tolerant, being opposed to proselytizing (attempts to convince those outside the religion to convert to it), anti-colonialist, and anti-imperialist. In contrast, any non-pluralist theory of religious diversity is associated with many arguably bad qualities. These negative qualities include but are not limited to: being arrogant, unreasonable, mean, narrow-minded, closed-minded, uninformed, provincial, out of date, xenophobic, bigoted, intolerant, in favor of proselytism, colonialist, and imperialist.

These, however, are mere associations; there seem to be no obvious entailments between the theories of religious diversity and the above qualities. In principle, it would seem that an exclusivist or inclusivist may have all or most of the good qualities, and one who accepts a theory of religious pluralism may have all or most of the bad qualities. These connections between theory and character – which are believed by some to provide practical arguments for or objections to various theories – need to be argued for. But it is very rare for a scholar to go beyond merely assuming or asserting some sort of causal connection between the various theories about religious diversity and the above virtues and vices.

2. Religious Pluralism

A theory of religious pluralism says that all religions of some kind are the same in some valuable respect(s). While this is compatible with some religion being the best in some other respect(s), the theorists using this label have in mind that many religions are equal regarding the central value(s) of religion. (Legenhausen 2009)

The term “religious pluralism” is almost always used for a theory asserting positive value for many or most religions. But one may talk also of “negative religious pluralism” in which most or all religions have little or no positive value and are equal in this respect. This would be the view of many naturalists, who hold that all religions are the product of human imagination, and fail to have most or all of the values claimed for them. (Byrne 2004; Feuerbach 1967)

a. Naive Pluralisms

Though naive pluralisms are not common amongst scholars in relevant fields, they are important to mention because they are entertained by many people as they begin to reflect on religious diversity.

An uninformed person, noting certain commonalities of religious belief and practice, may suppose that all religions are the same, namely, that there are no significant differences between religious traditions. This naive pluralism is refuted by accurate information on religious differences. (Prothero 2010)

A common form of negative pluralism may be called “verificationist pluralism.” This is the view that all religious claims are meaningless, and so incapable of rational evaluation, because they cannot be empirically verified, that is, because their truth or falsity is not known by way of observational evidence.

There are three serious problems with verificationist pluralism. First, some religious claims can be empirically confirmed or disconfirmed. For example, people have empirically disconfirmed claims that Jesus will visibly return to rule the earth from Jerusalem in 1974, or that magical “ghost shirts” will protect the wearer from bullets, or that saying a certain mantra three times will protect one from highway robbers. Second, the claim that meaningfulness requires the possibility of empirical verification has little to recommend it, and is self-refuting (that is, the claim itself is not empirically verifiable). (Peterson et al. 2013, 268-72) Third, religions differ in how much, if at all, they make empirically verifiable claims, so it is unclear that all religions will be equal in making meaningless claims.

While there are other sorts of negative naive pluralism, we shall concentrate on positive kinds here, as most of the scholarly literature focuses on those.

Some forms of naive pluralism suppose that all religions will turn out to be complementary. One idea is that all religions would turn out to be parts of one whole (either one religion or at least one conglomeration of religions). This unified consistency may be hoped for in terms of truth, or in terms of practice. With truth, the problem is that it is hard to see how the core claims of the religions could all be true. For instance, some religions teach that the ultimate reality (the most real, only real, or primary thing) is ineffable (such that no human concept can apply to it). But others teach that the ultimate reality is a perfect self, a being capable of knowledge, will, and intentional action.

What about the religions’ practices – are they all complementary? Some practices seem compatible, such as church attendance and mindfulness meditation. On the other hand, others seem to make little or no sense outside the context of the home religion, and others are simply incompatible. What sense, for instance, would it make for a Zen Buddhist to undergo the Catholic rites of confession and penance? Or what sense would it make for an Orthodox Jew, whose religion teaches him to “be fruitful and multiply,” to employ the Buddhist practice of viewing corpses at a burial ground so as to expunge the unwanted liability of sexual desire? Nor can he be fruitful and multiply while living as a celibate Buddhist monk. Dabblers and hobbyists freely stitch together unique quilts of religious beliefs and practices, but such constructions seem to make little sense once a believer has accepted any particular religion. Many religious claims will be logically incompatible with the accepted diagnosis, and many religious practices will be useless or counter-productive when it comes to getting what one believes to be the cure.

Another way in which pluralism can be naive is the common assumption that absolutely all religions are good in significant ways, for example, by improving their adherents’ lives, facilitating interaction with God, or leading to eternal salvation. However, someone making this assumption is probably thinking only of large, respectable, and historically important religions. It is not hard to find religions or “religious cults” which would not plausibly be thought of as good in the way(s) that a pluralist has in mind. For example, a religious group may function only to satisfy the desires of its founder, discourage the worship of God, encourage the sexual abuse of children, or lead to the damnation of its members.

Carefully worked out theories of religious pluralism often sound all-inclusive. However, they nearly always have at least one criterion for excluding religions as inferior in the aspect(s) they focus on. A difficulty for any pluralist theory is how to restrict the group of equally good religions without losing the appearance of being all-accepting or wholly non-judgmental. A common strategy here is to simply ignore disreputable religious traditions, only discussing the prestigious ones.

b. Core Pluralisms

An improvement upon naive pluralism acknowledges differences in all the aspects of religions, but separates peripheral from core differences. A core pluralist claims that all religions of some kind share a common core, and that this is really what matters about the religions; their equal value is found in this common core. If the core is true teachings, they’ll all be true (to some degree). If the core is veridical experiences, all religions will enable ways to perceive whatever the objects of religious experience are. If the core is salvifically effective practice, then all will be equal in that each is equally well a means to obtaining the cure. Given that any core pluralism inevitably downplays the other non-core elements of the religions, this approach has also been called “reductive pluralism.” (Legenhausen 2006)

The most influential recent proponent of a version of core pluralism has been Huston Smith (b. 1919). In his view, the common core of religions is a tiered worldview. This encompasses the idea that physical reality, the terrestrial plane, is contained within and controlled by a more real intermediate plane (that is, the subtle, animic, or psychic plane), which is in turn contained and controlled by the celestial plane. This celestial plane is a personal God. Beyond this is infinite, unlimited Being (also called “Absolute Truth,” “the True Reality,” “the Absolute,” “God”). Given that it is ineffable, this Being is neither a god, nor the God of monotheism. It is more real than all that comes from it. The various “planes” are not distinct from it, and it is the ultimate object of all desire, and the deepest reality within each human self. Some experience this Being as if it were a god, but the most able gain a non-conceptual awareness of it in its ineffable glory. Smith holds that in former ages, and among primitive peoples now, such a worldview is near universal. Only modern people, blinded by the misunderstanding that science reveals all, have forgotten it. (Smith 1992, 2003 ch. 3) The highest level in some sense is the human “Spirit,” the deep self which underlies the self of ordinary experience. Appropriating Hindu, Buddhist, and Christian language, Smith says that this “spirit is the Atman that is Brahman, the Buddha-nature that appears when our finite selves get out of its way, my istigkeit (is-ness) which…we see is God’s is-ness.” (Smith 2003 ch. 3, 3-4)

Such an outlook, often called “the perennial philosophy” or “traditionalism,” owes much to nineteenth and twentieth century occult literature, and to neo-Platonism and its early modern revivals. (Sedgwick 2004) Like traditional religions, it too offers a diagnosis of the human condition and a cure: a fall from primordial spirituality into modern spiritual poverty, to be cured by adopting the outlook sketched above. Most importantly, it offers a chance to discover the deep self as Being. A muted ally in this was the influential religious scholar Mircea Eliade (1907-86), whose work focused on comparing mythologies, and on what he viewed as an important, primitive religious outlook, which separates things into the sacred and the profane.

This “perennial philosophy” appeals to many present-day people, particularly those who, like Smith, have moved from a childhood religious faith, in his case, Christianity, to a more naturalistic and, hence, atheistic outlook. Such an outlook is commonly perceived as meaningless, hopeless, and devoid of value. (Smith 2003, 2012)

Dissenters are found among historians of religion, who deny that there is and has always been a common core in all of the world’s religions. Others dissent because they accept the incompatible diagnosis and cure taught by some other religion, such as the ones found in Islam or Christianity. (Legenhausen 2002) Those who believe the ultimate reality to be a unique god object to Smith’s view that the ultimate reality is ineffable, and so not, in itself, a god. Others find it excessive that Smith accepts other traditional doctrines, such as the claims that Plato’s Forms are not only real but alive, that in dreaming the “subtle body” leaves the body and roams free in the intermediate realm, and that there really are siddhis (supernatural powers gained by meditation), possession by spirits, psychic phenomena, and so on.

This sort of core pluralism was propounded by some members of the Theosophical Society such as co-founder Helena Petrovna Blavatsky (1831-91) in her widely read The Secret Doctrine (1888), and by French convert to Sufi Islam René Guénon (1886-1951) and those influenced by him, such as the eclectic Swiss-German writer Frithjof Schuon (1907-98). This sort of pluralism, following Guénon and Schuon, has been championed by Iranian philosopher Seyyed Hossein Nasr (b. 1933) and English convert to Islam, Martin Lings (1909-2005) who was a biographer of Muhammad. (Legenhausen 2002)

While Smith’s view rests on belief in an impersonal Ultimate, other versions of core pluralism rest upon monotheism. Thus, the Hindu intellectual Ram Mohun Roy (1772/4-1833) held that Hinduism, Islam, and Christianity, when understood in their original, non-idolatrous and non-superstitious ways, all teach the important truths about God and humankind, enabling humans to love and serve God. Roy, however, always retained his Hindu and Brahmin identities. (Sharma 1999, ch. 2) He was what we now call a “pluralistic Hindu” (and most Hindus would add that he was also an unorthodox Hindu).

Swedish philosopher Mikael Stenmark explores what he calls the “some-are-equally-right view” about religious diversity, and discusses a version of it on which Judaism, Christianity, and Islam are held to possess equal amounts of religiously important truths. (Stenmark 2009) He does not advocate this view, but explores it as an alternative to exclusivism, inclusivism, and Hickian identist pluralism. Stenmark views it as most similar to identist pluralism (see 2e below). But Stenmark’s “some-are-equally-right view” can also be seen as a form of core pluralism, the core being truths about the one God and our need for relationship with God. On this view, “all” the religions are right to the same degree, that is, all versions of monotheism (or perhaps, ethical monotheisms, or Abrahamic monotheisms). This account is narrower than “pluralism” is usually thought to be, but it is arguably a version of it.

c. Hindu Pluralisms

The tradition now called “Hinduism” is and always has been very internally diverse. In modern times, it tries to equalize other religions in the same ways it equalizes the apparently contrary claims and practices internal to it. While elements within it have been sectarian and exclusivistic, modern Hindu thought is usually pluralistic. Furthermore, Hindu thought has shifted in modern times from a scriptural to an experiential emphasis. (Long 2011) Still, some Hindus object to various kinds of pluralism. (Morales 2008)

Within the pluralistic mainstream of Hinduism, a popular slogan is that “all religions are true,” but this may be an expression of almost any sort of positive religious pluralism. Moreover, some influential modern Hindu leaders have adopted a complicated rhetoric of “universal religion,” which often assumes some sort of religious pluralism. (Sharma 1999, ch. 6)

At bare minimum, the slogan that “all religions are true” means that all religions are in some way directed towards one Truth, where this is understood as an ultimate reality. Thus, it has been observed that identist religious pluralism (see 2e below) “is essentially a Hindu position,” and closely resembles Advaita Vedanta thought. (Long 2005) The slogan may also imply that all religions feature veridical experience of that one object, by way of a non-cognitive, immediate awareness. (Sharma 1990) This modern Hindu outlook has proven difficult to formulate in any clear way. One prominent scholar argues that the “neo-Hindu” position on religious diversity (that is, modern Hindu pluralism) is not the view that all religions are equal, one, true, or the same. Instead, it is the view that all religions are “valid,” meaning that they have some degree of (some kind of) value. (Sharma 1979) But if there is no one clear modern Hindu pluralism, it remains that various modern Indian thinkers have held to versions of core or identist pluralism.

Paradoxically, such pluralism is often expressed along with claims that Hinduism is greatly superior in various ways to other religions. (Datta 2005; Morales 2008) It has been argued that whether and how a Hindu embraces a theory of religious pluralism will depend crucially on what she takes “Hinduism” to be. (Long 2005)

d. Ultimist Pluralisms

Building on the speculative metaphysics of Alfred North Whitehead’s (1861–1947) Process and Reality (1929), and work by his student Charles Hartshorne (1897-2000), theologians John Cobb and David Ray Griffin have advocated what the latter calls “deep,” “differential,” and “complementary” pluralism – what is here described as “ultimist.” (Cooper 2006, ch. 7; Griffin 2005b)

Cobb and Griffin assume that there is no supernatural intervention (any miraculous interruption of the ordinary course of nature) by God or other beings. This, it is hoped, rules out anyone having grounds for believing any particular religion to be the uniquely best religion. (Griffin 2005a) They do, however, take seriously at least many of the unusual religious experiences people report. They hypothesize that some religious mystics really do perceive (without using ordinary sense organs) some “ultimate” (that is, something regarded as ultimate). Thus, in experiencing what they call “Emptiness” or the Dharmakaya (truth body), Mahayana Buddhists really do perceive what Cobb calls Creativity (or Being Itself), as do Advaita Vedanta Hindus when they perceive Nirguna Brahman (Brahman without qualities). Other Buddhists experience Amida Buddha, while Christians experience Christ, and Jews Yahweh, Hindus Isvara, and Muslims Allah. All such religious mystics really perceive a personal, supreme God, understood panentheistically, as being “in” the cosmos, akin to how a soul is in a body. Yet other religious mystics perceive the Cosmos, that is, the totality of finite things (the “body” of the World Soul).

These three – Creativity, God, Cosmos – are such that none could exist without the others. Further, it is really Creativity that is ultimate, and it is “embodied in” and does not exist apart from or as an object in addition to God and the Cosmos. Sometimes God and Cosmos are described as aspects of Creativity. The underlying metaphysics here is that of process philosophy, in which events are the basic or fundamental units of reality. On such a metaphysics, any apparent substance (being, entity) turns out to be one or more events or processes. Even God, the greatest concrete, actual being in this philosophy is, in the end, an all-encompassing “unity of experience,” and is to be understood as a process of “Creative-Responsive love.” (Griffin 2005b; Cooper 2006, ch. 7)

All the major religions, then, are really oriented towards, and involve the experience of, some reality regarded as “ultimate” (Creativity, God, or Cosmos). It is also allowed that each major religion really does deliver the cure it claims to (for example, salvation and heaven, Nirvana, Moksha), and is entitled to operate by its own moral and epistemic values. Further, this approach respects, and does not try to eliminate, all these differences, and so makes genuine dialogue between members of the religions possible. Finally, Cobb and Griffin emphasize that this approach does not endorse any unreasonable form of relativism and, as such, allows one to remain distinctively Christian or Buddhist and so forth. They hope that each religion can, while remaining distinct, begin to construct “global” theologies, influenced by the truths and values of other religions. In all these ways, they argue that their ultimist pluralism is superior to other pluralisms.

This view has not been widely accepted because the Process theology and philosophy on which it is based has not been widely accepted.

One may object that the above proposal is counter to the equalizing spirit of pluralism. Griffin and Cobb seem to attribute the deepest insight to those who think the ultimate reality is an impersonal, indescribable non-thing. In their view, those who confess experience of Emptiness, Nirguna Brahman, or the One (of Neoplatonism) behold the ultimate reality (Creativity) as it really is, in contrast to monotheists or cosmos-focused religionists, who latch on to what are limited aspects of Creativity. But these monotheists and cosmos-worshipers each take their object to be ultimate, and would deny the existence of any further-back entity or non-entity, that is, of Creativity. It would seem that for a Christian, for example, to accept this ultimist pluralism, she will have to reinterpret what many Christians will regard as a core commitment, namely, that the ultimate reality is personal. Even a Mahayana Buddhist may have a lot of adjusting to do, if she is to admit that believers in a personal God really do experience the greatest entity, and something which is not separate from Emptiness. Wouldn’t this be to attribute more reality to God than she’s willing to? And how can the ultimist pluralist demand such changes?

A similar pluralism is advanced by Japanese Zen scholar Masao Abe (1915-2006). He applies the Mahayana doctrine of the “three bodies” of the Buddha to other religions. In Mahayana Buddhism, the ultimate reality, a formless but active non-thing, is Emptiness, or the Truth Body (Dharmakaya). This in some sense manifests as, acts as, and is not different from a host of Enjoyment Bodies (Sambhogakaya), each of which is a Buddha outside of space and time, a historical Buddha now escaped from samsara and dwelling in a Buddha-realm. The historical Buddha, the man Gautama is, in this doctrine, a Transformation Body (or Apparitional Body, Nirmanakaya) of one of these, as are other Buddhas in time and space. In some sense these three are one; still, the Truth Body manifests or acts as various Enjoyment Bodies, which in turn manifest or act as various Transformation Bodies. The latter two classes of beings, but not the first, may be described as “personal.”

As to religious diversity, Abe suggests that we view the dynamic activity Emptiness (also called “Openness”) as ultimate, and as manifesting as various “Gods,” that is various monotheistic deities, and “Lords,” which are human religious teachers, whether manifestations of a god, as in the case of Jesus, or just pre-eminent servants of a god, as with Moses or Muhammad.

It is a mistake, Abe holds, to regard any god as ultimate, and monotheists must revise their understanding as above, if true inter-religious dialogue and peace are to be achievable. Equally, Advaita Vedanta Hindus must let go of their insistence on Nirguna Brahman as ultimate. It is a mistake to think that the ultimate is any substantial, self-identical thing, even an ineffable one. (Abe 1985)

Must the Mahayanist make any significant revision to accept the proposed “threefold reality” of Emptiness, gods, and lords? Presumably not, as she already believes in levels of truth and levels of reality. At the highest level there is only Emptiness, the ultimate. The gods and lords will stay at the “provisional” levels of truth and reality, the levels which a fully enlightened person, as it were, sees beyond.

Abe’s views have been criticized by other scholars as misunderstanding some other religions’ claims, and as privileging Mahayana Buddhist doctrines, insofar as he understands these doctrines as being truer than others. It can be argued that Abe is an inclusivist, maintaining that Buddhism is the best religion, rather than a true pluralist. (Burton 2010)

e. Identist Pluralisms

In much of the religious studies, theology, and philosophy of religion literature of the 1980s through the 2000s, the term “religious pluralism” means the theory of philosopher John Hick. (Hick 2004; Quinn and Meeker 2000) Hick’s approach is original, thorough, and informed by a broad and deep knowledge of many religions. His theory is at least superficially clear, and is rooted in his own spiritual journey. It attracted widespread discussion and criticism, and Hick has engaged in a spirited debate with all comers. It is here described as “identist” pluralism because his theory claims that people in all the major religions interact with one and the same transcendent reality, variously called “God,” “the Real,” and “the Ultimate Reality.”

Hick viewed religious belief as rationally defensible, and held that one may be rational in trusting the veridicality of one’s religious experiences. However, he thought that it is arbitrary and indefensible to hold that only one’s own experiences or the experiences of those in one’s group are veridical, while those of people in other religions are not. Subjectively, those other people have similar grounds for belief. These ideas, and the fact that religious belief is strongly correlated with birthplace, convinced him that the facts of religious diversity pose irresolvable problems for any exclusivist or inclusivist view, leaving only some form of pluralism as a viable option. (Hick 1997)

Starting as a traditional, non-pluralistic Christian, Hick attended religious meetings and studied with people of other religions. As a result, he became convinced that basically the same thing was going on with followers of these other religions as with Christians, namely, people responding in culturally conditioned but transformative ways to one and the same Real or Ultimate Reality. In his earlier writings, monotheistic concerns seem important. How could a perfect being fail to be available to all people in all the religions? Later on, Hick firmly settled on the view that this Real should be thought of as ineffable. Appropriating Immanuel Kant’s distinction between phenomena, how things appear, and noumena, things in themselves, Hick postulated that the Real is ineffable and is not directly experienced by anyone. However, he maintained that people in the religions interact with it indirectly, by way of various personae and impersonae, personal and impersonal appearances or phenomena of the Real. In other words, this Ultimate Reality, due to the various qualities of human minds, appears to various people as personae, such as God, the Trinity, Allah, Vishnu, and also as impersonae such as Emptiness, Nirvana, Nirguna Brahman, and the Dao. These objects of religious experience are mind-dependent, in that they depend for their existence, in part, on people with certain religious backgrounds. By contrast, the Real in itself, that is, the Ultimate Reality as it intrinsically is, is never experienced by anyone, but is only hypothesized as overall the most reasonable explanation of the facts of religious diversity.

Among these purported facts, for Hick, is that the great religions equally well facilitate the ethical transformation of their adherents, what Hick calls a transformation from self-centeredness to other-centeredness and Reality-centeredness. (Sometimes, however, Hick makes the weaker claim that we’re unable to pick any religion as best at effecting this transformation.) This transformation, Hick theorizes, is really the point of religion. All religions, then, are equal in that they are responses to the ineffable Ultimate Reality which equally well—or for all we can tell equally well—bring about an ethical improvement in humans, away from self-centeredness and towards other humans and the Ultimate Reality.

Hick realizes the incoherence of dubbing all religions “true,” for they have core teachings that conflict, and most religions are not shy about pointing out such conflicts. We could loosely paraphrase Hick’s view as being that all religions are false, not that all their teachings are false (for there is much ethical and practical agreement among them), but rather that their core claims about the main object of religious experience and pursuit equally contain falsehoods. Monotheists, after all, take the ultimate being to be a personal god while others, variously called ultimists, absolutists, or monists, hold the ultimate to be impersonal, such as the Dao, Emptiness, Nirguna Brahman, and so forth. These, Hick holds, are all mistaken; the Ultimate reality is neither personal nor impersonal. (Hick 1995, ch. 3) To say that it is either, Hick realizes, would be to hand an epistemic victory to either the monotheists or the absolutists (ultimists). This, he will not do.

Instead, Hick downgrades the importance of true belief to religion. Though not true, doctrines such as the Trinity or the Incarnation, he argues, may be interpreted to have “mythological truth,” that is, a tendency to influence people towards getting what Hick postulates is the cure offered by the religions, the ethical transformation described above.

Hick doesn’t argue for the salvific or cure-delivering equality of all religions. Rather, he only argues for the equality of what he calls “post-axial” religions – major religious traditions which have arisen since around 800 B.C.E. (Hick 2004, ch. 2-4; Meeker 2003)

Hick’s identist religious pluralism has been objected to as thoroughly as any recent theory in philosophy. Here we can survey only a few of the criticisms that have been made. (For others see Hick 2004, Introduction.)

Many have objected that Hick’s pluralism is not merely a theory about religions, but is itself a competitor in the field, offering a diagnosis and cure which disagrees with those of the world religions. It is hard to see, then, how his theory enables one to be, as Hick claimed to be, a “pluralistic Christian,” given that one has replaced the diagnosis and cure of Christianity with those of Hick’s pluralism. In reply, Hick urges that his claims are not themselves religious, but are rather about religious matters, and are, as such, philosophical.

Hick’s claim that no human concept applies to the Ultimate Reality has been criticized by many, who’ve pointed out that Hick applies these concepts to it: being one, being ultimate, being a partial cause of the impersonae and personae, and being ineffable. Moreover, it seems a necessary truth that if the concept personal doesn’t apply to the Real, then the concept non-personal must apply to it. (King 2008; Rowe 1999; Yandell 1999) In response, Hick concedes that some concepts, “formal” ones, can be applied to the Real, while “substantial” ones cannot. He switches to the term “transcategorial,” points out historical versions of this thesis, and urges that the Real simply is not in the domain of entities to which concepts like personal and non-personal apply. His critics, he argues, are merely asserting without reason that there cannot be a transcategorial reality. (Hick 2000, 2004)

As to Hick’s idea that the correlation of birthplace and religious belief somehow undermines the rationality of religious belief, it has been pointed out that religious pluralism too is correlated with birthplace. In response to his claims that non-pluralistic religious believers are being arrogant, irrational, or arbitrary in believing that one religion (theirs) is the most true one, it has been pointed out that Hick too, as a religious pluralist, holds views which are inconsistent with many or most religions, seemingly preferring his own insights or experiences to theirs, which would, by Hick’s own lights, be just as arrogant, irrational, or arbitrary. (O’Connor 1999; King 2008; Bogardus 2013)

Others object that given the transcategoriality or ineffability of the Real, even with the above qualifications, there is no reason to think that interaction with the Real should be ethically beneficial, or that it should have any connection at all to any religious value. (Netland 2001)

Others object that Hick’s pluralism requires arbitrarily reinterpreting religious language non-literally, and usually as having to do with morality, contrary to what most proponents of those religions believe. (Yandell 2013)

Again, it has been objected that Hick, contrary to many religions, downgrades religious practice and belief as inessential to a religion, the only important features of a religion being that it is a response to the Ultimate Reality and that it fosters the ethical transformation noted above. Further, Hick presupposes the correctness of recent socially “liberal” ethics, for example, “sexual liberation,” and thus rules out as inessential to any religion any conflicting ethical demands. (Legenhausen 2006)

Other objections have been centered on the status of Hick’s personae. If, for example, in his view Allah, Vishnu, and Yahweh are all real and distinct, is Hick thereby committed to polytheism? Or are those gods mere fictions? (Hasker 2011) At first Hick evades the issue of polytheism by describing his theory not as a kind of “polytheism,” but rather as “poly-something.” He then suggests that two views of the personae are compatible with his theory: that they are mental projections, or that they are real, but angel- or deva-like beings, intermediaries and not really gods. Finally, Hick revises his view: the monotheistic gods people experience are mental projections in response to the Real, and not real selves, but since religious people really do encounter great selves in religious experience, we should posit personal intermediaries between humans and the Real, with whom religious people interact. Perhaps these are the angels, devas, and heavenly Buddhas of the religions, great but nonetheless finite beings. Thus Christians, for example, imagine that in prayer they interact with the ultimate, a monotheistic god, but in fact they interact with angels, and perhaps different Christians with different angels. It is a mistake, he now holds, to suppose that the personae (that is, Vishnu, Yahweh, Allah, and so on) are angel-like selves. This is not compatible with his thesis that Vishnu and others are phenomena of the Real, that is, culturally conditioned ways that the Real appears to us. (Hick 2011)

A less developed identist pluralism is explored by Peter Byrne (1995, 2004). All the major religions are equal in that they (1) refer to and facilitate cognitive contact with a single, transcendent reality, (2) offer a similarly moral- and eternal-oriented “cure,” and (3) include revisable and incomplete accounts of this transcendent reality. It has been objected that this theory is not promising because it is hard to see how we could ever have sufficient evidence for some of its claims, while others are implausible in light of the evidence we do have. (Mawson 2005)

f. Expedient Means Pluralism

Historically, Buddhist thought about other religions has almost never been pluralistic. (Burton 2010; Kiblinger 2005; and section 4c below) But in modern times, some have constructed a novel and distinctively Buddhist pluralism using the Mahayana doctrine of “expedient means” (Sanskrit: upaya). The classic discussion of this is in the Lotus Sutra (before 255 C.E.), which argues that previous versions of Buddhist teaching were mere expedient means, that is, non-truths taught because in his great wisdom, the Buddha knew that at its then immature stage, humanity would be aided only by those teachings. This was a polemic against non-Mahayana versions of Buddhist dharma (teaching). Now that the time is right, the truth may be told, that is, Mahayana doctrine, superseding the old. However, more recently, it has been argued that all religious doctrines, even Mahayana ones, are expedient means, helpful non-truths, ladders to be kicked away upon attainment of the cure, here understood as a non-cognitive awareness of the ultimate reality. (Burton 2010)

3. Exclusivism

a. Terminological Problems

The term “exclusivist” was originally a polemical term, chosen in part for its negative connotations. Some have urged that it be replaced by the more neutral terms “particularism” or “restrictivism.” (Netland 2001, 46; Kärkkäinen 2003, 80-1) This article retains the common term because it is widespread and many have adopted the label for their own theory of religious diversity.

In this article “exclusivism” about religious diversity denies any form of pluralism; it denies that all religions, or all “major ones,” are the same in some important respect. Insofar as a religion claims to possess a diagnosis of the fundamental problem facing humans and a cure, that is, a way to permanently and positively resolve this problem, it will then assume that other, incompatible diagnoses and cures are incorrect. Because of this, arguably exclusivism (or inclusivism, see section 4 below) is a default view in religious traditions. Thus, for example, the earliest Buddhist and Christian sources prominently feature staunch criticisms of various rival teachings and practices as, respectively, false and useless or harmful. (Netland 2001; Burton 2010)

Some philosophers, going against the much-discussed identist pluralism of John Hick (see 2e above) use “exclusivism” to mean reasonable and informed religious belief which is not pluralist. (O’Connor 1999) This “exclusivism” is compatible with both exclusivism and inclusivism in this article. It is difficult to make a fully clear distinction between exclusivist and inclusivist approaches. The basic idea is that the inclusivist grants more of the values in question to religions other than the single best religion – more truth, more salvific efficacy, more veridical experience of the objects of religious experience, more genuine moral transformation, and so forth.

Finally, because of their fit with many traditional religious beliefs and commitments, sometimes exclusivism and inclusivism are considered as two varieties of “confessionalism,” views on which “one religion is…true and…we must view other religions in the light of that fact.” (Byrne 2004, 203)

b. Naive Exclusivisms

An exclusivist stance is often signaled by the claim that there is “only one true religion.” Other religions, then, are “false.” A naive person may infer from this that no claim, or at least no central claim, of any other religion is true, but that all such are false. This position cannot be self-consistently maintained. Consider the claim that the cosmos was intentionally made. An informed Christian must concede that Jews and Muslims too believe this, and that they teach it as a central doctrine. Thus, if central Christian teachings are true, then so is at least one central teaching of these two rival religions.

Another naive exclusivist view which is rejected by most theorists is that all who are not full-fledged members of the best religion fail to get the cure. For example, all non-Christians go to hell, or all non-Buddhists fail to gain Nirvana, or to make progress towards it. Theorists nearly always loosen the requirement with regard to what they view as the one most true and/or most effective religion. Thus, Christian exclusivists usually allow that those who die as babies, the severely mentally handicapped, or friends of God who lived before Christian times may avoid hell and attain heaven despite their not being fully-fledged, believing and practicing Christians. (Dupuis 2001; Meeker 2003; section 3c below) Similarly, Buddhists usually allow that a person may gain positive karma, and so a better rebirth, by the practice of various other religions, helping her to advance, life by life, towards getting the cure by means of the distinctive Buddhist beliefs and practices.

While such naive exclusivist positions are rarely expounded by scholars, they frequently appear in the work of pluralists and inclusivists, held up as unfortunate, harmful, and unreasonable theories which are in urgent need of replacement.

c. Christian Exclusivisms

Early bishop Ignatius of Antioch (c. 35-107) writes that “if any follow a schismatic [that is, the founder of a religious group outside of the bishop-ruled catholic mainstream] they will not inherit the Kingdom of God.” (Letter of Ignatius to the Philadelphians 3:3) Leading catholic theologian Origen of Alexandria (c. 186-255) wrote: “outside the Church no one is saved.” (Dupuis 2001, 86-7) Yet Origen also held, at least tentatively, that eventually all rational beings will be saved.

Thus, the slogan that there is no salvation outside the church (Latin: Extra ecclesiam nulla salus) was meant to communicate at bare minimum the uniqueness of the Christian church as God’s instrument of salvation since the resurrection of Jesus. The slogan was nearly always, in the first three Christian centuries, wielded in the context of disputes with “heretical” Christian groups, the point being that one can’t be saved through membership in such groups. (Dupuis 2001, 86-9)

However, what about Jews, pagans, unbaptized babies, or people who never have a chance to hear the Christian message? After catholic Christianity became the official religion of the empire (c. 381), it was usually assumed that the message had been preached throughout the world, leaving all adult non-Christians without excuse. Thus, Augustine of Hippo (354-430) and Fulgentius of Ruspe (468-533) interpreted the slogan as implying that all non-Christians are damned, because they bear the guilt of “original sin” stemming from the sin of Adam, which has not been as it were washed away by baptism. (Dupuis 2000, 91-2)

Water baptism, from the beginning, had been the initiation rite into Christianity, but it was still unclear what church membership strictly required. Some theorized, for instance, that a “baptism of blood,” that is, martyrdom, would be enough to save unbaptized catechumens. Later theologians added a “baptism of desire,” which was either a desire to be baptized or the inclination to form such a desire, either way enough to secure saving membership in the church. In the first case, a person who is killed in an accident on her way to be baptized would nonetheless be in the church. In the second, even a virtuous pagan might be a church member. This “baptism of desire” was officially affirmed by the Roman Catholic Council of Trent in 1547.

With the split of the catholic movement into Roman Catholic and Eastern Orthodox branches, “the church” was understood in Western contexts to be specifically the Roman Catholic church. Thus, famously, in a papal bull of 1302, called by its first words Unam Sanctam (that is, “One Holy”), Pope Boniface VIII (r. 1294-1303) declared that outside the Roman Catholic church, “there is neither salvation nor remission of sins,” and “it is altogether necessary to salvation for every human creature to be subject to” the pope. (Plantinga 1999, 124-5; Neuner and Dupuis 2001, 305) Note that this might still be interpreted with or without the various non-standard ways to obtain church membership mentioned above. The context of this statement was not a discussion of the fate of non-Christians, but rather a political struggle between the pope and the king of France.

In the Decree for the Copts of the General Council of Florence (1442), a papal bull issued by pope Eugene IV (r. 1431-47), for the first time in an official Roman Catholic doctrinal document the slogan was asserted not only with respect to heretics and schismatics, but also concerning Jews and pagans. (Neuner and Dupuis 2001, 309-10) It also seems to close the door to non-standard routes to church membership, saying that “never was anyone, conceived by a man and a woman, liberated from the devil’s dominion except by faith in our Lord Jesus Christ.” (Tanner 1990, 575) Non-Catholics will “go into the everlasting fire…unless they are joined to the catholic church before the end of their lives…nobody can be saved, no matter how much he has given away in alms and even if he has shed his blood in the name of Christ, unless he has persevered in the bosom and the unity of the catholic church.” (Tanner 1990, 578)

This exclusivistic or “rigorist” way of understanding the slogan, on which only the Roman Catholic church could provide the “cure” needed by all humans, was the most common Catholic stance on religious diversity until the mid-nineteenth century. But some had always held on to theories about ways into the church other than water baptism, and since the European discovery of the New World it had become clear that the gospel had not been preached to the whole world, and many held that such pagans were non-culpably ignorant of the gospel. This view was affirmed by Pope Pius IX (r. 1846-78) in his Singulari Quadam (1854): “outside the Apostolic Roman Church no one can be saved…On the other hand…those who live in ignorance of the true religion, if such ignorance be invincible, are not subject to any guilt in this matter before the eyes of the Lord.” (Neuner and Dupuis 2001, 311)

Nineteenth century popes condemned Enlightenment-inspired theories of religious pluralism about truth and salvation, then called “indifferentism,” it being, allegedly, indifferent which major religion one chose, since all were of equal value. At the same time, they argued that many people who are outside the one church cannot be blamed for this, and so will not be condemned by God.

Such views are consistent with exclusivism in the sense that Roman Catholic Christianity is the one divinely provided and so most effective instrument of salvation, as well as the most true religion, and the “true religion” in the sense that any claim which contradicts its official teaching is false. Letters by Pius XII (r. 1939-58) declared that “by an unconscious desire and longing” non-Catholics may enjoy a saving relationship with the church. (Dupuis 2001, 127-9) Whether these non-Catholics are thought to be in the church by a non-standard means, or whether they are said to be not in the church “in reality” but only “in desire,” it was held that they were saved by God’s grace. (Neuner and Dupuis 2001, 329)

Since the Vatican II council (1962-5), many Catholic theologians have embraced what most philosophers will consider some form of inclusivism rather than a suitably qualified exclusivism, with a minority opting for some sort of pluralism. (On the majority inclusivism, see section 4b below.) The impetus for this change was fueled by statements from that council (their Latin titles: Lumen Gentium, Ad Gentes, Nostra Aetate, Gaudium et Spes, Heilsoptimismus), which are in various ways positive towards non-Catholics. One asserts not merely the possibility, but the actuality of salvation for those who are inculpably ignorant of the gospel but who seek God and try to follow his will as expressed through their own conscience. Another, without saying that people may be saved through membership in them, affirms various positive values in other religions, including true teachings, which serve as divinely ordained preparations for reception of the gospel. Catholics are exhorted to patient, friendly dialogue with members of other religions. (Dupuis 2000, 161-5) Some Catholic theologians have seen the seeds or even the basic elements of inclusivism in these statements, while others view them as within the orbit of a suitably articulated exclusivism. (Dupuis 2000, 165-170) A key area of disagreement is whether or not these imply that a person may be saved by means of their participation in some other religion. Still other Catholic theologians have found these moves to be positive but not nearly different enough from the more pessimistic sort of exclusivism. Such theologians, prominently Hans Küng (b. 1928) and Paul Knitter (b. 1939), have formulated various pluralist theories. (Kärkkäinen 2003, 197-204, 309-17)

Protestant versions of exclusivism can be at least as strict as Augustine’s. Recently called “restrictivism,” this position insists that explicit knowledge of the gospel of Jesus Christ is necessary for salvation, and there is no hope for those who die without having heard the gospel. (McDermott and Netland 2014, 148) But these are sometimes tempered with loopholes such as: a universal chance to hear the gospel at or after death, a free pass to people who die before the “age of accountability,” or the view that less was required to be saved in pre-Christian times. Another view which is taken by Bible-oriented evangelical Protestants allows the possibility of non-Christians receiving saving grace, but is firmly agnostic as to whether this actually occurs, and if it does, how often, because of the paucity of relevant biblical statements. (McDermott and Netland 2014) Other Protestants choose forms of inclusivism similar to Rahner’s (see 4b below).

4. Inclusivism

a. Terminological Problems

On the one hand, it is difficult to consistently distinguish inclusivism from exclusivism, because the latter nearly always concedes some significant value to other religions. “Inclusivism” for some authors just means a friendlier or more open-minded exclusivism. On the other hand, many theorists want to adopt the friendly and broad-minded sounding label “pluralism” for their theory, even though they clearly hold that one religion is uniquely valuable. For example, both Christians and Buddhists have adopted religious-diversity-celebrating rhetoric while clearly denying anything described above in this article as a kind of “pluralism” about religious diversity. (Dupuis 2001; Burton 2010)

b. Abrahamic Inclusivisms

Historically, Jewish intellectuals have usually adopted an inclusivist rather than an exclusivist view about other religions. A typical Rabbinic view is that although non-Jews may be reconciled to God, and thus gain life in the world to come, by keeping a lesser covenant which God has made with them, still Jews enjoy a better covenant with God. Beginning in the late twentieth century, however, some Jewish thinkers have argued for pluralism along the lines of various Christian authors, revising traditional Jewish theology. (Cohn-Sherbok 2005)

Since the latter part of the twentieth century many Roman Catholic theologians have explored non-exclusivist options. As explained above (section 3c), a major impetus for this has been statements issued by the latest official council (Vatican II, 1962-5). One goes so far as to say that “the Holy Spirit offers to all [humans] the possibility of being associated, in a way known to God, with the Paschal Mystery [that is, the saving death and resurrection of Jesus].” (Gaudium et Spes 22, quoted in Dupuis 2001, 162) Some Catholic theologians see the groundwork or beginning in these documents for an inclusivist theory, on which other religions have saving value.

Influential German theologian Karl Rahner (1904-84), in his essay “Christianity and the Non-Christian Religions,” argues that before people encounter Christianity, other religions may be the divinely appointed means of their salvation. Insofar as they in good conscience practice what is good in their religion, people in other religions receive God’s grace and are “anonymous Christians,” people who are being saved through Christ, though they do not realize it. All Christians believe that some were saved before Christianity, through Judaism. So too at least some other religions must still be means for salvation, though not necessarily to the same degree, for God wills the salvation of all humankind. But these lesser ways should and eventually will give way to Christianity, the truest religion, intended for all humankind. (Plantinga 1999, 288-303)

Subsequent papal statements have moved cautiously in Rahner’s direction, affirming the work of the Holy Spirit not only in the people in other religions, but also in those religions themselves, so that in the practice of what is good in those religions, people may respond to God’s grace and be saved, unbeknownst to them, by Christ. Nonetheless, the Roman Catholic church remains the unique divine instrument; no one is saved without some positive relation to it. (Dupuis 2001, 170-9; Neuner and Dupuis 2001, 350-1)

Although many traditional Protestant Christians hold some form of exclusivism, others favor an inclusivism much like Rahner’s. (Peterson et al. 2013, 333-40) Theologically liberal Protestants most often hold on to some form of religious pluralism.

As a relative latecomer which has always acknowledged the legitimacy of previous prophets, including Abraham, Moses, and Jesus, while proclaiming its prophet to be the greatest and last, Islam has, like Judaism, tended towards inclusivist views of other religions. The traditional Islamic perspective is that while in one sense “Islam” was initiated by Muhammad (570-632 CE), “Islam” in the sense of submission to God was taught by all prior prophets, and so their followers were truly Muslims, that is, truly submitted to God. Still, given that Muhammad is the seal of the prophets, his teachings and practices should, and some day will, supersede all previous ones. Recent Islamic thinkers have independently come to conclusions parallel to those of Rahner, while critiquing various pluralist theories as entailing the sin of unbelief (kufr), the rejection of Islam.

It is a matter of dispute whether certain famous Sufi Muslims such as Rumi (1207-73) and Ibn ‘Arabi (1165-1240) have held to some form of religious pluralism. (Legenhausen 2006, 2009)

c. Buddhist Inclusivisms

While there have been Buddhist teachers and movements who have been exclusivists, in general Buddhism has been inclusivist. Buddhism has long been very doctrinally diverse, and many schools of Buddhism argue that theirs is the truest teaching or the best practice, while other versions of the dharma are less true or less conducive to getting the cure, and have now been superseded. It has been typical also for Buddhist thinkers to hold that at best, the same is true of other religious traditions. (Burton 2010) On the other hand, some religions’ teachings are simply false and their practices are unhelpful; the contents of their prescribed beliefs and practices matter.

Some Buddhist texts teach that there can be a solitary Buddha (pratyekabuddha), a person who has gained enlightenment by his own efforts, independently of Buddhist teaching. Such a person is outside of the tradition, yet obtains the cure taught by the tradition. This is an inclusivist view about getting the cure, and about central religious truths. There are even cases of “Buddhists seeking to turn devotees of other religions traditions into ‘anonymous Buddhists’ who worship Buddhist deities without realizing that this is the case.” (Burton 2010, 11)

d. Plural Salvations Inclusivism

Forming his views by way of a detailed critique of various core and identist pluralist theories, Baptist theologian S. Mark Heim (b. 1950) proposes what he calls “a true religious pluralism,” which is nonetheless best understood as a version of inclusivism, as it allows its proponent to maintain the superiority of her own religion. (Heim 1995, 7)

Heim notes that pluralists like Hick insist on one true goal or “salvation” which is achieved by all the equally valuable religions, a goal which is proposed by the pluralist and which differs from those proposed by most of those religions. Heim suggests that we should instead assume that other religions both pursue and achieve real and distinct religious “salvations” (goals or ends). For instance, as an inclusivist Christian, Heim holds that Buddhists really do attain Nirvana. But doesn’t Christian tradition demand that each person eventually either achieves fellowship or union with God, or is irrevocably damned? Heim suggests that those who attain Nirvana would be, from a Christian perspective, either a subgroup of the saved or of the damned, depending on just what, metaphysically, is actually going on with such people. (Heim 1995, 163) This is consistent with the Christian thinking that the end pursued by Christians is in fact better than all others; thus, heaven is better than Nirvana. However, God has ordained Nirvana as a goal suitable for some non-Christians to both pursue and attain. In this and in a later book Heim asserts that such a plurality of ordained religious goals is implied by the doctrine of the Trinity.

It is far from clear that Heim is correct that this stance will be consistent with the claims of the “home” religion. Importantly, he construes the various religious goals as “experiences” obtaining in this life and continuing beyond. (Heim 1995, 154-5, 161) This is an important qualifier, as various religious goals clearly presuppose contrary claims. For example, in Theravada Buddhism, one must realize that there is no self, whereas in Advaita Vedanta Hinduism one must gain awareness that one’s true self is none other than the ultimate reality, Brahman. Similarly, in Christianity, one must realize that one’s self is a sinner in need of God’s grace. It is impossible that all three experiences are veridical. Heim’s theory, however, does not require them to be veridical; it requires only that they occur and may plausibly be regarded as fulfilling by those who have them.

Heim strenuously objects that pluralist theories impose uniformity on the various religions. However, his theory seems to depend crucially on the existence of many human problems, each of which may be solved by participation in some religion or other. In contrast, each of the various religions claims to have discerned the one fundamental problem facing humans, namely, the problem from which other problems derive. In the terms explained above, a religion claims to have a diagnosis (section 1 above). This seems incompatible with Heim’s agnosticism about which, if any, of the diagnosed problems is the fundamental one. If a religion cures only a shallow, derivative human problem, leaving the deeper problem intact, then what it offers would not deserve the name “salvation,” for it would leave those who achieve it still in need of the cure. (Peterson et al. 2013, 333) For instance, if Theravada Buddhism is correct that humans are trapped in the cycle of rebirth by craving and ignorance, even if one goes to a heavenly realm upon death, such as is envisaged by non-Buddhist religions, one is still trapped in samsara, in this realm of suffering, albeit at a higher tier. How can a Theravada Buddhist accept that such a heavenly next life is a good and final end for non-Buddhists? Again, if a Christian diagnosis is correct, that humans are alienated from and need to be reconciled to God, yet some manage to attain Nirvana, they would still lack the cure, for it is no part of Nirvana that one is reconciled to God.

5. References and Further Reading

  • Abe, Masao. “A Dynamic Unity in Religious Pluralism: A Proposal from the Buddhist Point of View.” The Experience of Religious Diversity. Ed. John Hick and Hasan Askari. Brookfield, Connecticut: Gower Publishing Company, 1985. 167-227. Partially reprinted in Readings in Philosophy of Religion: East Meets West. Ed. Andrew Eshleman. Malden, Massachusetts: Blackwell, 2008. 395-404.
    • Presents an ultimist pluralism modeled on the Mahayana Buddhist “three bodies of the Buddha” doctrine.
  • Bogardus, Tomas. “The Problem of Contingency for Religious Belief,” Faith and Philosophy 30.4 (2013): 371-92.
    • Rebuts sophisticated arguments by Hick, Kitcher, and others, that you cannot know that some religious claim is true because had you been born in another place or time, you would not have believed that claim.
  • Burton, David. “A Buddhist Perspective.” The Oxford Handbook of Religious Diversity. New York: Oxford University Press, 2010. 321-36.
    • Surveys Buddhist views on religious diversity.
  • Byrne, Peter. Prolegomena to Religious Pluralism: Reference and Realism in Religion. New York: St. Martin’s Press, 1995.
    • Explores in depth without endorsing an identist religious pluralism.
  • Byrne, Peter. “It is not Reasonable to Believe that Only One Religion is True.” Contemporary Debates in Philosophy of Religion. Ed. Michael Peterson and Raymond VanArragon. Malden, MA: Blackwell, 2004. 201-10.
    • Argues for an identist pluralism and against “confessionalism” (either inclusivism or exclusivism).
  • Cohn-Sherbok, Dan. “Judaism and Other Faiths.” The Myth of Religious Superiority: A Multifaith Exploration. Ed. Paul F. Knitter. Maryknoll, New York: Orbis Books, 2005. 119-32.
    • An overview of traditional, inclusivist Jewish views of other religions, then arguing that Jewish theology should be revised to accommodate an identist pluralism.
  • Cooper, John W. Panentheism: The Other God of the Philosophers: From Plato to the Present. Grand Rapids, Michigan: Baker Academic, 2006.
    • Chapter 7 is an accessible introduction to the metaphysics and theology of Whitehead and Hartshorne, without which the ultimist pluralism of Cobb and Griffin can’t be understood.
  • Datta, Narendra [Swami Vivekananda]. “Hinduism.” The Penguin Swami Vivekananda Reader. Ed. Makarand Paranjape. New Delhi: Penguin Books India, 2005 [1893]. 43-55.
    • One of several hit speeches given at the first World Parliament of Religions in Chicago in 1893, asserting that all religions share one object and goal, although Hinduism is more tolerant, peaceful, and flexible than other traditions.
  • de Cea, Abraham Vélez. “A Cross-cultural and Buddhist-Friendly Interpretation of the Typology Exclusivism-Inclusivism-Pluralism.” Sophia 50 (2011): 453-80.
    • An attempt to clarify and expand the common trichotomy, adding a fourth category, “pluralistic inclusivism.”
  • Dupuis, Jacques. Toward a Christian Theology of Religious Pluralism. Maryknoll, New York: Orbis Books, 2001 [1997].
    • Survey of the long evolution of Roman Catholic thought on religious diversity, arguing for an inclusivist theory.
  • Feuerbach, Ludwig. Lectures on the Essence of Religion. Translated by Ralph Manheim. New York: Harper and Row, 1967 [1851].
    • A naturalistic, humanistic, atheistic critique of belief in God as a product of human desire and imagination; a form of negative pluralism.
  • Griffin, David Ray. “Religious Pluralism: Generic, Identist, Deep.” Deep Religious Pluralism. Ed. David Ray Griffin. Louisville, Kentucky: Westminster John Knox Press, 2005a. 3-38.
    • Surveys sophisticated recent pluralist theories by Hick, Smith, Knitter, Cobb, and criticisms of these. Argues for the superiority of his own ultimist (“deep”) religious pluralism.
  • Griffin, David Ray. “John Cobb’s Whiteheadian Complementary Pluralism.” Deep Religious Pluralism. Ed. David Ray Griffin. Louisville, Kentucky: Westminster John Knox Press, 2005b. 39-66.
    • Presentation of ultimist pluralism as developed by Cobb and Griffin.
  • Hasker, William. “The Many Gods of Hick and Mavrodes,” Evidence and Religious Belief. Ed. Kelly James Clark and Raymond J. VanArragon. New York: Oxford University Press, 2011. 186-98.
    • Critical discussion of Hick’s views on the personae and impersonae of religious experience, which are supposed to be manifestations of the Ultimate Reality.
  • Heim, S. Mark. Salvations: Truth and Difference in Religion. Maryknoll, New York: Orbis, 1995.
    • Critiques various pluralistic theories as insufficiently respectful of the real differences between religions and proposes a plural salvations inclusivism.
  • Hick, John. God Has Many Names. Philadelphia: The Westminster Press, 1982 [1980].
    • A short book written at a crucial juncture in Hick’s thinking about religious diversity; probably the best place to start in understanding Hick’s views.
  • Hick, John. The Rainbow of Faiths [U.S. title: A Christian Theology of Religions: The Rainbow of Faiths]. London: SCM Press, 1995.
    • A short and popular exposition of, and development of his mature views as expounded in his 1989 An Interpretation; mostly written in the form of imagined dialogues.
  • Hick, John. “The Epistemological Challenge of Religious Pluralism.” Faith and Philosophy 14.3 (1997): 277-86. Reprinted in Hick 2010, 25-36.
    • Argues that religious exclusivism and inclusivism face devastating epistemological problems; see Hick 2010 for his exchanges with some leading Christian philosophers about this piece.
  • Hick, John. “Ineffability,” Religious Studies 36.1 (2000): 35-46. Reprinted in Hick 2010, ch. 3.
    • Replies to criticisms of the ineffability or trancategoriality of “the Real” by Rowe and Insole.
  • Hick, John. An Interpretation of Religion: Human Responses to the Transcendent, 2nd ed. New Haven, Connecticut: Yale University Press, 2004 [1989].
    • The main exposition of what is widely considered the best-developed pluralist theory (esp. ch. 14-16), espousing the practical equality of “post-axial” religions (ch. 2-4). Its long introduction summarizes his replies to many critics.
  • Hick, John, ed. Dialogues in the Philosophy of Religion. New York: Palgrave-MacMillan, 2010 [2001].
    • Reprints and continues Hick’s exchanges in the late 1990s with a number of prominent philosophers and theologians.
  • Hick, John. “Response to Hasker.” Evidence and Religious Belief. Ed. Kelly James Clark and Raymond J. VanArragon. New York: Oxford University Press, 2011. 199-201.
    • Hick clarifies his claims regarding the personae and impersonae by means of which people interact with the Ultimate Reality.
  • Ignatius of Antioch, “The Letter of Ignatius to the Philadelphians,” in The Apostolic Fathers: Greek Texts and English Translations, 3rd ed. Ed. Michael W. Holmes. Grand Rapids, Michigan: Baker Academic, 2007. 236-47.
    • Early Christian writing, probably from the first half of the second century, in which the bishop says that followers of schismatic leaders will not be saved.
  • Kärkkäinen, Veli-Matti. An Introduction to the Theology of Religions. Biblical, Historical, and Contemporary Perspectives. Downers Grove, Illinois: InterVarsity Press, 2003.
    • Wide-ranging discussion of Christian responses to religious diversity from biblical times up till the present, valuable for its summaries of ancient, early modern, and recent theological sources.
  • King, Nathan. “Religious Diversity and its Challenges to Religious Belief.” Philosophy Compass 3/4 (2008): 830-53.
    • Lucid survey of varieties of exclusivism, inclusivism, the identist pluralism of John Hick, and epistemological difficulties arising from disagreements about religious matters.
  • Kiblinger, Kristin. Buddhist Inclusivism: Attitudes Towards Religious Others. Burlington, Vermont: Ashgate, 2005.
    • Sympathetic description and criticism of Buddhist inclusivism by a non-Buddhist scholar.
  • Legenhausen, Hajj Muhammad [Gary Carl]. “Why I am not a Traditionalist.” 2002.
    • Online overview of traditionalist core pluralism and critique from the perspective of Shia Islam.
  • Legenhausen, Hajj Muhammad [Gary Carl]. “A Muslim’s Proposal: Non-Reductive Religious Pluralism.” 2006.
    • Insightful online article classifying theories of religious pluralism and arguing for what the author calls “non-reductive pluralism” (here described as an example of Abrahamic Inclusivism, 4b above) by a philosopher who is an American convert to Shia Islam.
  • Legenhausen, Hajj Muhammad [Gary Carl]. “On the Plurality of Religious Pluralisms.” International Journal of Hekmat 1 (2009): 6-42.
    • The most comprehensive classification of varieties of theories of religious pluralism.
  • Long, Jeffery D. “Anekanta Vedanta: Towards a Deep Hindu Religious Pluralism.” Deep Religious Pluralism. Ed. David Ray Griffin. Louisville, Kentucky: Westminster John Knox Press, 2005. 130-57.
    • Exploration of ultimist (“deep”) religious pluralism by a scholar who is an American convert to Hinduism; argues that whether or not “Hinduism” is pluralistic or inclusivist depends on whether it is understood as Vedic tradition, Indian tradition, or Sanatana Dharma [eternal religion].
  • Long, Jeffery D. “Universalism in Hinduism.” Religion Compass 5/6 (2011): 214-23.
    • Survey of historical and recent pluralist theories in Hinduism (here called versions of “universalism”) and criticisms thereof.
  • Mawson, T.J. “‘Byrne’s’ religious pluralism.” International Journal for Philosophy of Religion 58.1 (2005): 37-54.
    • Negative critique of the identist pluralism of Peter Byrne.
  • McDermott, Gerald and Netland, Harold. A Trinitarian Theology of Religions: An Evangelical Proposal. New York: Oxford University Press, 2014.
    • A recent evangelical Protestant version of exclusivism, embedded in a Christian theology of religions.
  • Meeker, Kevin. “Exclusivism, Pluralism, and Anarchy.” God Matters: Readings in the Philosophy of Religion. Ed. Raymond Martin and Christopher Bernard. New York: Pearson, 2003. 524-35.
    • Shows how versions of exclusivism, inclusivism, and pluralism can be viewed on a continuum, as excluding more or fewer religions; contrasts Hick’s “altruistic pluralism” with a more open-minded but less plausible “anarchic pluralism.”
  • Morales, Frank [Sri Dharma Pravartaka Acharya]. Radical Universalism: Does Hinduism Teach That All Religions Are the Same? New Delhi: Voice of India, 2008.
    • American-born Hindu teacher and scholar argues against the pluralism of Datta [Vivekananda] and Chattopadhyay [Ramakrishna] that it is: incoherent, inconsistent with the facts of religious diversity, foreign to Hinduism, relativistic, intolerant, destructive of Hinduism, and based on misinterpretations of Hindu scriptures.
  • Netland, Harold. Encountering Religious Pluralism: The Challenge to Christian Faith and Mission. Downers Grove, Illinois: InterVarsity Press, 2001.
    • A defense of Christian “particularism” (compatible with the descriptions “exclusivism” or “inclusivism” in this article) by an evangelical theologian, with summaries of earlier missionary literature and criticisms of pluralist theories.
  • Neuner, Josef and Dupuis, Jacques, eds. The Christian Faith in the Doctrinal Documents of the Catholic Church, Seventh Revised and Enlarged Edition. Bangalore: St. Peter’s Seminary, 2001.
    • Collection of Roman Catholic primary sources, including documents relating to the uniqueness of Catholicism, the idea that there is no salvation outside the church, and views on non-Catholic Christianity and non-Christian religions.
  • O’Connor, Timothy. “Religious Pluralism.” Reason for the Hope Within. Ed. Michael Murray. Grand Rapids, Michigan: Eerdmans, 1999. 165-81
    • Defends “exclusivism” (rejection of any pluralism) against arguments that it is arbitrary, arrogant, or irrational, and argues that Hickian identist pluralism is incoherent.
  • Peterson, Michael, et al. Reason and Religious Belief, 5th ed. New York: Oxford University Press, 2013.
    • Leading philosophy of religion textbook with excellent chapter (14) on pluralism, exclusivism, and inclusivism.
  • Plantinga, Cornelius, ed. Christianity and Plurality: Classic and Contemporary Readings. Malden, Massachusetts: Blackwell, 1999.
    • Collection of Christian documents concerning religious diversity, starting with the Bible and ending with a statement by Pope John Paul II.
  • Prothero, Stephen. God is Not One. The Eight Rival Religions that Run the World – and Why Their Differences Matter. New York: HarperOne, 2010.
    • Introductory overview of eight religious traditions which aims to undermine “the new orthodoxy” of naive or core pluralism.
  • Quinn, Philip L. and Kevin Meeker, eds. The Philosophical Challenge of Religious Diversity. New York: Oxford University Press, 2000.
    • Important anthology of philosophical pieces, largely consisting of attacks on and defenses of Hick’s identist pluralism.
  • Rowe, William. “Religious Pluralism.” Religious Studies 35.2 (1999): 139-50.
    • Argues that Hick’s central claim that the Ultimate Reality is ineffable is incoherent.
  • Schmidt-Leukel, Perry. “Exclusivism, Inclusivism, Pluralism: The Tripolar Typology – Clarified and Reaffirmed.” The Myth of Religious Superiority: A Multifaith Exploration. Ed. Paul F. Knitter. Maryknoll, New York: Orbis Books, 2005, 13-27.
    • Catalogues and responds to the many objections various authors have given to the standard trichotomy, and precisely defines it in terms of giving knowledge sufficient to give people the cure which religion offers.
  • Sedgwick, Mark. Against the Modern World: Traditionalism and the Secret Intellectual History of the Twentieth Century. New York: Oxford University Press, 2004.
    • Intellectual history of “traditionalism” or “perennialism” (core pluralism).
  • Sharma, Arvind. “All religions are – equal? one? true? same?: a critical examination of some formulations of the Neo-Hindu position.” Philosophy East and West 29.1 (January 1979): 59-72.
    • An attempt to clarify the sort of pluralism popular in Hinduism since Datta.
  • Sharma, Arvind. A Hindu Perspective on the Philosophy of Religion. London: MacMillan, 1990.
    • Chapter 9 is a basic introduction to the pluralistic orientation of modern Hinduism, interacting with a few western scholars (W.A. Christian, W.C. Smith, and Hick).
  • Sharma, Arvind. The Concept of Universal Religion in Modern Hindu Thought. New York: St. Martin’s Press, 1999.
    • Surveys the views of leading 19th and 20th century Hindu intellectuals on the theme of “universal religion,” which can be an (alleged) fact or an unrealized ideal; some versions of religious pluralism are discussed.
  • Smart, Ninian. Dimensions of the Sacred: An Anatomy of the World’s Beliefs. Berkeley, California: University of California Press, 1996.
    • Presents a seven-fold analysis of the different aspects of religious traditions.
  • Smith, Huston. Forgotten Truth: The Common Vision of the World’s Religions. San Francisco: HarperSanFrancisco, 1992 [1976].
    • Thorough presentation of a core pluralism, as a part of what he calls “perennial philosophy,” or “traditionalism.”
  • Smith, Huston. Beyond the Postmodern Mind: The Place of Meaning in a Global Civilization, Updated and Revised. Wheaton, Illinois: Quest Books, 2003 [1982].
    • Further exposition of Smith’s “perennial philosophy,” put in the context of his diagnoses of the historical mistakes of “Modern” and “Postmodern” thinking, and his practical suggestions for the future.
  • Smith, Huston. “No Wasted Journey: A Theological Autobiography.” The Huston Smith Reader. Ed. Huston Smith and Jeffrey Paine. Berkeley: University of California Press, 2012, 3-12.
    • Smith explains his journey from Christian to religious naturalist to eclectic seeker to perennialist core pluralist.
  • Stenmark, Mikael. “Religious Pluralism and the Some-Are-Equally-Right View.” European Journal for Philosophy of Religion 2 (2009): 21-35.
    • Articulates the position named in the title as an alternative to the exclusivism-inclusivism-pluralism trichotomy, in part motivated by what he calls “the problem of emptiness” for Hick’s pluralism – roughly, that his (nearly) inconceivable Real is irrelevant to any religious concerns.
  • Tanner, Norman, ed. Decrees of the Ecumenical Councils, Volume One: Nicea I to Lateran V. Washington, D.C.: Georgetown University Press, 1990.
    • First of two volumes with the official original language texts and English translations of all twenty-one official councils recognized by the Roman Catholic church; the “Bull of union with the Copts” from the council of Florence (1442) expresses an exclusivist stance.
  • Yandell, Keith. Philosophy of Religion: A Contemporary Introduction. New York: Routledge, 1999.
    • Focuses on the differences between monotheistic religions, Advaita Vedanta Hinduism, Jainism, and Theravada Buddhism, with a thorough critique of Hick’s pluralism. (ch. 6)
  • Yandell, Keith. “Has Normative Religious Pluralism a Rationale?” Can Only One Religion Be True? Paul Knitter and Harold Netland in Dialogue. Ed. Robert B. Stewart. Minneapolis: Fortress Press, 2013, 163-79.
    • Spirited attack on pluralist theories as poorly motivated and inconsistent with traditional religious beliefs.

 

Author Information

Dale Tuggy
Email: filosofer@gmail.com
State University of New York at Fredonia
U. S. A.

Adolf Lindenbaum

photo by permission of the Archives of the University of Warsaw

Adolf Lindenbaum was a Polish mathematician and logician who worked in topology, set theory, metalogic, general metamathematics and the foundations of mathematics. He represented an attitude typical of the Polish Mathematical School, which consisted in using all admissible methods, regardless of whether they were finitary. For example, the axiom of choice was freely applied, but on the other hand, proofs avoiding this axiom were also welcomed. In set theory, Lindenbaum and Tarski posed an important conjecture that the generalized continuum hypothesis entails the axiom of choice. Among the most important metalogical and metamathematical results obtained by Lindenbaum are the following: every system of propositional calculus has an at most denumerably infinite normal matrix; the construction of the so–called Lindenbaum algebra; and the maximalization theorem.

Lindenbaum studied mathematics under Wacław Sierpiński, Stefan Mazurkiewicz and Kazimierz Kuratowski in Warsaw. He was part of the Lvov–Warsaw School, a powerful group of Polish analytic philosophers, and he also belonged to the Polish Mathematical School and the Warsaw School of Logic. He began his career as a topologist, and his doctoral dissertation, written under Sierpiński, was devoted to properties of point–sets. Then in the mid–1920s he switched to logic and joined the Warsaw School of Logic that was established by Jan Łukasiewicz and Stanisław Leśniewski after World War I. Lindenbaum was a close friend and collaborator of Alfred Tarski.

Table of Contents

  1. Curriculum Vitae
  2. A General Outline of Lindenbaum’s Scientific Career and His Views
  3. Lindenbaum and Set Theory
  4. Lindenbaum and Logical Calculi
  5. Lindenbaum and General Metamathematics
  6. Final Remarks
  7. References and Further Readings

1. Curriculum Vitae

Adolf Lindenbaum was born into an assimilated (Polonized), well-to-do Jewish family in Warsaw on June 12, 1904 (see Mostowski–Marczewski 1971, Surma 1982, Zygmunt–Purdy 2014 for biographical data on Lindenbaum). He took his secondary education at M. Kreczmar’s Gymnasium in Warsaw (1915–1922) and next entered Warsaw University to study mathematics (1922–1926) under such teachers as Kazimierz Kuratowski, Stefan Mazurkiewicz and Wacław Sierpiński in mathematics as well as Stanisław Leśniewski and Jan Łukasiewicz in logic. Lindenbaum also attended courses by Alfred Tarski on cardinal numbers and elementary mathematics, Tadeusz Kotarbiński’s course in logic and some classes in humanities, including history of philosophy (Władysław Tatarkiewicz; a general course and a special class on Kant), aesthetics (also Tatarkiewicz, on French art), psychology (Władysław Witwicki), linguistics (Karol Appel), literature (Józef Ujejski on Adam Mickiewicz, the most important Polish national poet), and the history and culture of Palestine (Moses Schorr).

Sierpiński supervised Lindenbaum’s PhD dissertation entitled O własnościach mnogości punktowych (On Properties of Point–Sets). The thesis was defended in 1928 and Lindenbaum received the title of Doctor of Philosophy. In 1934 he presented his Habilitation thesis, based on several published papers, to the Faculty of Mathematics and Natural Sciences at Warsaw University and obtained the degree of Docent (a person who could lecture). This resulted in his appointment as adjunct professor at the Philosophical Seminar at the same faculty. The Philosophical Seminar was an independent unit at the Faculty of Mathematics and Natural Sciences, directed for years by Łukasiewicz; in fact, it was the Logical Seminar. Lindenbaum lectured on various mathematical and logical topics from 1935 to 1939. His courses, for example, concerned the following topics: “On New Investigations into the Foundations of Mathematics and the Mathematical Foundations of other Disciplines”, “On Superposition of Functions” and “Selected Topics from Metrology and from the Theory of Functions”. He stood little chance of being promoted to an academic position higher than docent because of the anti–Semitic policy in Polish universities after 1935, his involvement in the communist movement, and the shortage of university positions at the time. He was also a tutor at the Scientific Circle of Jewish Students at Warsaw University, established after the exclusion (in the 1890s) of the Jews from general students’ associations existing in Poland.

Lindenbaum was a typical good–looking bon vivant. He married a Jewish beauty, Janina Hosiasson, also a logician, who successfully worked on induction and confirmation. As Janina Hosiasson informed Alfred Tarski in one of her letters in early 1941 (this letter is in Tarski’s Archive in U.C. Berkeley’s Bancroft Library), she and Adolf had separated at the beginning of World War II. Lindenbaum was a declared leftist. He belonged to the Polish Communist Party (KPP) until its dissolution by Stalin in 1938; he was also an activist in intelligentsia circles. In 1936 Lindenbaum signed a petition demanding that Carl von Ossietzky, a journalist imprisoned by the Nazis, be awarded the Nobel Peace Prize, and he protested against the massacre of workers in Lvov in 1936. Mrs. Janina Kotarbiński, the wife of Tadeusz Kotarbiński and a close friend of the Lindenbaums, reported the following story:

It happened that I and Antoni Pański [also a philosopher – J. W.] visited the Lindenbaums in their apartment. I noticed on Dolek’s [the diminutive of Adolf – J. W.] desk the Short Philosophical Dictionary [the dictionary written by Pavel Yudin and Mark Rozental and published in Russian in the Soviet Union at the beginning of the 1930s. The authors represented the Stalinist version of Marxism. The Short Philosophical Dictionary became a symbol of the orthodox Marxist ideology – J. W.]. The book was opened at the entry ‘Dialectical Contradiction’. I was surprised, and on our way back I asked Pański why Dolek, such a clever person, read such stupidities about the concept of contradiction. Pański answered that Dolek believed in every word of this book.

She added, however, that Lindenbaum had considerable interests in philosophy without any kind of dogmatism. He was ready for an open discussion on any philosophical issue. Lindenbaum was strongly interested in literature and art, and was famous as a passionate climber.

After the outbreak of World War II, Lindenbaum immediately realized that for his own security he should escape before the German army took Warsaw. His Jewish origin was probably not the only reason behind this decision. More importantly, it was obvious that the Germans possessed lists of Polish communists and other critics of Hitler’s regime. Lindenbaum was particularly afraid of the consequences of the support he had given to Ossietzky, as such actions were strongly, even furiously, criticized in Germany. Although Lindenbaum could have emigrated to the West or obtained a position in Moscow as a scientist, he decided to remain in Poland, because he had hopes for the socialist future of the country. The Lindenbaums left Warsaw on September 6, 1939 and went to Vilna (presently Vilnius); this city, formerly in Poland, became the capital of Lithuania after September 17, 1939. Lithuania was an independent country in 1918–1939 with Kaunas as its capital; in 1939–1941 Lithuania was formally independent, but entirely dependent on the Soviet Union. It was occupied by Germany after its attack on the Soviet Union on June 22, 1941. Janina remained there, but Adolf moved to Bialystok, a city occupied by the Soviet Army after its invasion of Poland on September 17; he probably expected this part of Poland to become the nucleus of the future Polish communist state. Lindenbaum was appointed as a docent in the Bialystok Pedagogical Institute established by the Soviet authorities, and he taught mathematics there. The German–Soviet War started on June 22, 1941, and the Germans soon came to Bialystok. The reasons why Lindenbaum did not leave the city are unknown. In September 1941, he was arrested by the Gestapo, transported to Vilnius, and killed in Ponary (the site of many massacres, particularly of Jews, in 1941–1944) near the Lithuanian capital. Janina Hosiasson–Lindenbaum was murdered in Vilnius in 1942. The exact dates of the deaths of the Lindenbaums remain unknown.

2. A General Outline of Lindenbaum’s Scientific Career and His Views

Lindenbaum began his scientific career as a topologist, but he soon converted to logic and the foundations of mathematics. He and Tarski (three years older) became friends in the early 1920s, and the latter influenced Lindenbaum in the direction of mathematical logic and the foundations of mathematics. They shared not only scientific interests, but also a negative attitude towards any form of religion, various leftist political ideas, and a love of mountains and literature; both were also very sensitive to their fate as secular Jews striving to be assimilated into and accepted by Polish society. Neither of them, however, was successful in the last respect. At the beginning of his scientific career, Lindenbaum was very active in the Student Mathematical Scientific Circle as well as in the Student Philosophical Circle, and he successfully promoted logic among his colleagues in both groups. In particular, he delivered several lectures on logical problems at the meetings of both circles. He was a mentor in logic to students in Warsaw. For instance, he wrote a section on logic in the Mathematical-Physical Study: Information Book for Newcomers, published in 1926, in which he reported how logic was taught in Warsaw. It is a very interesting document which shows how powerful logic was in Warsaw in the mid–1920s. The revelation that the Principia Mathematica was recommended as a textbook for advanced students seems really shocking. Lindenbaum’s brilliant personality, charming style of life and powerful mathematical skills fascinated the Warsaw scientific community. Not surprisingly, he was commonly considered one of the most gifted Polish mathematicians of his generation. Tarski (1949, p. XII) described Lindenbaum as “a man of unusual intelligence”. Mostowski once called Lindenbaum the most lucid mind in the foundations of mathematics. Legendary stories told by Lindenbaum’s friends and colleagues document many cases of theorems discovered by him but proved by someone else, as he had no time to complete his ideas. Yet the list of his scientific contributions is quite long; it comprises more than 40 papers, abstracts and reviews (see Surma 1982 for Lindenbaum’s bibliography), mostly published in German and French. Some of Lindenbaum’s papers were co–authored, in particular by Tarski and Andrzej Mostowski. Lindenbaum and Tarski worked on the book Theorie der eindeutigen Abbildungen. It was announced as volume 8 of the series “Monografie Matematyczne” (Mathematical Monographs), to be published in 1938. This information is given on the back cover of S. Saks, Theory of the Integral, Monografie Matematyczne, Warszawa–Lwów 1937 (co–published by G. E. Stechert, New York); Saks’ book appeared as volume 7 of this series. This suggests that the book by Lindenbaum and Tarski was close to completion. Lindenbaum was also well regarded internationally. Leading logicians and mathematicians, including Wilhelm Ackermann, Friedrich Bachmann, Abraham Fraenkel, Andrei Kolmogoroff and Arnold Schmidt, reviewed his writings. Lindenbaum actively participated in the Polish Mathematical Congresses and in the International Congress of Scientific Philosophy held in Paris in 1935.

Lindenbaum belonged to three scientific schools, namely the Polish Mathematical School, with Sierpiński, Mazurkiewicz and Kuratowski as its leaders in Warsaw; the Warsaw School of Logic (Łukasiewicz, Leśniewski, Tarski—the last joined the top of the School in the 1920s; see Woleński 1995 for a general presentation of logic in Poland in the interwar period); and the Lvov–Warsaw School (Kazimierz Twardowski and his disciples from Lvov, in particular Łukasiewicz, Leśniewski and Kotarbiński). The second affiliation is perhaps the most important. The Warsaw School of Logic was a “child” of mathematicians and philosophers. Łukasiewicz and Leśniewski, the main figures in this group, were, as I have already noted, philosophers by training. Nevertheless, they became professors at the Faculty of Mathematics and Natural Sciences at Warsaw University and were active in the mathematical environment. Zygmunt Janiszewski, another founding father of the Polish Mathematical School, who died prematurely before Lindenbaum entered the university, developed the so–called Janiszewski program, a very ambitious plan for the development of mathematics in Poland that attributed crucial significance to logic and the foundations of mathematics. The Fundamenta Mathematicae, a journal established by Janiszewski and serving from its inception as an official scientific journal of the Polish Mathematical School, published many papers by logicians. It is noteworthy that Mazurkiewicz, Sierpiński, Leśniewski and Łukasiewicz—two professional mathematicians and two logicians originating from philosophy—formed the Editorial Board of the Fundamenta. It was important for the subsequent stormy development of logic in Warsaw that the mathematical milieu accepted philosophers as professional teachers of students of mathematics.

This double heritage, philosophical as well as mathematical, determined the scientific ideology of Warsaw logicians. One point should be mentioned here as particularly important in this context: the Polish Mathematical School did not assume any specific philosophical standpoint concerning the nature of mathematics. Methodologically speaking, all fruitful mathematical methods, particularly those coming from set theory, could be used in logical investigations, provided that they did not lead to contradictions. The last statement should be understood in the following way. Clearly, proofs of consistency are important and required; however, even before Gödel’s second incompleteness theorem (roughly speaking, the theorem that the consistency of arithmetic cannot be proved in arithmetic itself) was announced, Polish mathematicians maintained that if a theory of given concepts is empirically known to be contradiction–free, it can be faithfully used in mathematical investigations, including logical research. As a consequence, the Polish Mathematical School did not subscribe to logicism, formalism or intuitionism as the main foundational currents in the philosophy of mathematics, even though they were widely regarded as such in 1900–1930. On the other hand, Polish logicians worked on many problems suggested by Russell, Hilbert and Brouwer, the main exponents of the mentioned schools. Moreover, there was sometimes a tension between the private, so to speak, philosophical views of some Polish logicians and their research practices. For instance, Tarski, influenced by Leśniewski’s and Kotarbiński’s nominalism, expressed explicit sympathy with this view, but on the other hand, he did not hesitate to use higher set theory and inaccessible cardinals in the foundations of mathematics.

There is practically nothing known about Lindenbaum’s philosophical views concerning mathematics. Clearly, his inclinations to dialectical materialism had no influence on his philosophy of mathematics and foundational views. In fact, he shared the general attitude of the Polish Mathematical School mentioned above. Lindenbaum published some papers directly related to general foundational problems (for example, Lindenbaum 1930, Lindenbaum 1931) in which he recommended the use of mathematics in logical investigations without any hesitation with respect to employing infinitary methods; he pointed out that such methods are present even in elementary arithmetic. He also published reviews of works by Polish radical nominalists, such as Leon Chwistek and Władysław Hetper, but he entirely abstained from philosophical comments. Two papers—Lindenbaum 1936 and Lindenbaum–Tarski 1934–1935—should be particularly mentioned. The former paper, Lindenbaum’s contribution to the already mentioned Paris Congress of 1935, concerns the formal simplicity of concepts. Although Lindenbaum points out that this question arises in many fields, he does not offer any general definition of simplicity. Lindenbaum distinguishes seven relevant problems concerning simplicity: (a) of systems of concepts (terms); (b) of propositions and their systems; (c) of inference rules; (d) of proofs; (e) of definitions and constructions; (f) of deductive theories; (g) of formal languages; but he restricts his further considerations to (a). The idea is that measuring the number of letters occurring in a given concept can tell us about the simplicity of the term (Lindenbaum follows Leśniewski in this respect). Two points are interesting. First, Lindenbaum assumes the simple theory of types. This suggests that he preferred a more elementary construction whenever it was possible and adequate for a given problem. This attitude was also very popular among Warsaw mathematicians and logicians. Second, Lindenbaum observes, obviously under Tarski’s influence, that simplicity has not only a syntactic dimension (in Carnap’s sense), but should also be considered semantically.

The paper Lindenbaum–Tarski 1934–1935 basically concerns some metamathematical problems about the limitations of the means of expression (expressive power, to use present terminology) in deductive theories. The authors claim that all relations between objects (individuals, classes, relations, and so forth) expressible by purely logical means remain invariant under an arbitrary one–one mapping of the “world” (that is, the collection of all individuals) onto itself. Moreover, this invariance is logically provable. This idea was more fully developed by Tarski in his paper on logical concepts (see Tarski 1986). In a sense, the understanding of logical concepts as invariant under all one–one mappings has affinities with the Erlangen Program in the foundations of geometry formulated by Felix Klein (Tarski stressed this point). Lindenbaum and Tarski proved a general theorem justifying the intuitive explanation of logical relations as invariant under one–one mappings. The consequences of this approach to logical concepts for the philosophy of logic are far–reaching. In particular, the Lindenbaum–Tarski definition of logical concepts motivates the theorem that logic does not distinguish any extralogical concept (roughly speaking, what can be proved in logic about an extralogical item, for instance an individual, can be proved about any other individual). Consequently, logical theorems are true in all possible worlds (models). Thus, the definition in question naturally leads to seeing logic as invariant with respect to any specific content.
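
In modern notation the invariance criterion discussed above is often rendered as follows; the display is a standard contemporary paraphrase in LaTeX (amsmath assumed), not a quotation from the 1934–1935 paper.

% A notion O (an individual, a class, a relation, and so forth) defined over the
% "world" D of all individuals counts as logical just in case it is invariant
% under every permutation of D:
\[
  O \ \text{is a logical notion} \quad\Longleftrightarrow\quad
  \pi(O) = O \ \ \text{for every one--one mapping } \pi \colon D \to D ,
\]
% where \pi(O) denotes the image of O under the transformation that \pi induces
% on objects of O's type (individuals, classes, relations, and so on).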

Lindenbaum’s works related to logic and the foundations of mathematics concern set theory and logical calculi, including their metalogical properties. This article deals with general set theory in section 3, while sections 4 and 5 are devoted to logical matters. (Section 3 skips special topics, including those belonging to other mathematical fields; the borderline between general and special set theory is somewhat arbitrary; also, some of Lindenbaum’s results will be mentioned without entering into formal details.)

Lindenbaum’s results in set theory were achieved both individually and in collaboration with other authors, particularly Tarski and Mostowski. In the 1920s and 1930s Łukasiewicz conducted a seminar in mathematical logic. Its participants were a group of young logicians including, in addition to Lindenbaum and Tarski, Stanisław Jaśkowski, Andrzej Mostowski, Jerzy Słupecki, Bolesław Sobociński and Mordechaj Wajsberg. This seminar soon became a factory of new results in mathematical logic. Its participants collaboratively worked on problems. Lindenbaum’s results about logical calculi, as Łukasiewicz explicitly says, were achieved at this seminar. Lindenbaum frequently stated theorems, usually without proofs. His most important results were mentioned by others; some of them are to be found in Łukasiewicz–Tarski 1930 (I will refer to it as Ł–T1930).

3. Lindenbaum and Set Theory

In 1926, Lindenbaum and Tarski published the joint paper “Communication sur les recherches de la théorie de ensembles” (Lindenbaum–Tarski 1926; the abbreviation L–T1926 is used in further references). This paper is very compact. In 30 pages the authors announced many results in set theory and its applications that they had achieved within the “last few years”. More particularly, the theorems and definitions concerned cardinal and ordinal numbers, the relations between them and the theory of one–one mappings. The results were stated without proofs. The authors noted that proofs and further developments would appear in subsequent writings. (Perhaps the already mentioned monograph Theorie der eindeutigen Abbildungen was intended as a continuation of L–T1926; several related results are contained in Tarski 1949; Lindenbaum is mentioned in the Preface to this book as a person particularly effective in conducting research on cardinal numbers.) The spirit of the Polish Mathematical School is evident in this paper. The purely mathematical text is interrupted by historical and methodological comments; Polish mathematicians considered (and they still do) such remarks as a very important feature of mathematical prose. Due to the role of the axiom of choice (AC) in set theory and its controversial nature, results obtained without use of this axiom are grouped in a section different from the sections containing the theorems based on AC. The paper lists 102 theorems or lemmas about cardinal numbers, 5 theorems or lemmas about properties of one–one mappings, 16 theorems or lemmas about order types, 4 theorems on the arithmetic of ordinal numbers and 19 theorems or lemmas on point–sets. In many cases, the investigations by Lindenbaum and Tarski continue the earlier works and achievements of Cantor, Dedekind, Bernstein, Fraenkel, Hartogs, Korselt, König, Lebesgue, von Neumann, Russell and Whitehead, Schröder, Zermelo, Banach, Kuratowski, Leśniewski and Sierpiński. In a sense, L–T1926 can serve as a very important historical report on the state of the art in set theory and its foundational problems in the mid–1920s.

Perhaps the most important result (theorem 94) announced in L–T1926 concerns the generalized continuum hypothesis (GCH) and AC. Lindenbaum posed the problem of how AC (one of the main focuses of the Polish Mathematical School) is related to Cantor’s hypothesis on alephs (the name of GCH used in L–T1926). Theorem 94 states that GCH entails AC. This result was proved by Sierpiński in 1947 (see Sierpiński 1965, pp. 43–44, Moore 1982, pp. 215–217 for a brief survey). The search for equivalents of AC became one of the Polish mathematical specialties de la maison. Lindenbaum (L–T1926, theorem 82(L)) claimed that AC is equivalent to the assertion that for arbitrary cardinal numbers m and n, m ≤* n or n ≤* m (where m ≤* n holds if and only if either m = 0 or every set of power n is the sum of m mutually disjoint non–empty sets). This theorem was finally proved by Sierpiński in 1949 (see Sierpiński 1965, pp. 435–436). L–T1926 also presents material on the Cantor–Bernstein theorem (CBT; Lindenbaum and Tarski used the label “the Schröder–Bernstein theorem”), which says that for any cardinal numbers m, n, if m ≤ n and n ≤ m, then m = n (see Hinkis 2013 for a very detailed historical exposition). In particular, Lindenbaum proposed some equivalents of CBT, also in terms of order–types and of one–one mappings. One of the equivalents of CBT is the following proposition: for any order–types α, β, γ, δ, if α = β + γ and γ = α + δ, then α = γ (in terms of one–one mappings: if an ordered set A is similar to a segment of an ordered set B, and B is similar to a residue of A, then the sets A and B are similar). Another interesting result is related to the Bernstein Division Theorem (BDT), which says that for any natural number k ≥ 1 and any cardinal numbers m, n, if km = kn, then m = n. L–T1926 claims that Lindenbaum proved BDT in its full generality, and did so without AC. Yet there is a slight historical controversy concerning the scope of Bernstein’s original proof (see Hinkis 2013, p. 139). Leaving this question aside, there is no doubt that L–T1926 played an important role in the development of the foundations of set theory.
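
For convenience, the statements just discussed may also be displayed compactly; the formulas below are a modern restatement in LaTeX (amsmath and amssymb assumed), not quotations from L–T1926, with fraktur letters standing for arbitrary cardinal numbers.

\[ \mathrm{GCH} \ \Rightarrow\ \mathrm{AC} \qquad \text{(theorem 94)} \]
\[ \mathrm{AC} \ \Longleftrightarrow\ \forall\, \mathfrak{m}, \mathfrak{n}\ \bigl(\mathfrak{m} \le^{*} \mathfrak{n} \ \lor\ \mathfrak{n} \le^{*} \mathfrak{m}\bigr) \qquad \text{(theorem 82(L))} \]
\[ \mathfrak{m} \le \mathfrak{n} \ \wedge\ \mathfrak{n} \le \mathfrak{m} \ \Rightarrow\ \mathfrak{m} = \mathfrak{n} \qquad \text{(CBT)} \]
\[ k \cdot \mathfrak{m} = k \cdot \mathfrak{n} \ \Rightarrow\ \mathfrak{m} = \mathfrak{n} \quad (1 \le k < \aleph_0) \qquad \text{(BDT)} \]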

Finally, consider the paper by Lindenbaum and Mostowski (Lindenbaum–Mostowski 1938) on the independence of AC from the other axioms of Zermelo–Fraenkel set theory (ZF). Abraham Fraenkel claimed that he had proved the independence from ZF of AC (in its standard version, postulating the existence of a choice set for any family of non–empty and disjoint sets) and of two related principles (every set can be ordered; and: if a choice set exists for every family of finite and mutually disjoint sets, then there exists a choice set for every countable family of mutually disjoint sets). Lindenbaum and Mostowski remark that Fraenkel’s investigations, though of great value, cannot be regarded as fully successful “because a dangerous confusion of metamathematical and mathematical notions is inherent in it”. More specifically, Fraenkel’s notions of model and function are obscure. Lindenbaum and Mostowski propose to understand the concept of function in a semantic manner, that is, as determined by the formula “a set satisfies a propositional function”. Moreover, they point out an error in Fraenkel’s proof consisting in his treatment of permutations. Lindenbaum and Mostowski propose a modification of Fraenkel’s construction in order to correct his proof. This is done by improving the axiomatization with respect to the axioms of separation, replacement and infinity. This step allows one to prove that other equivalents of AC are also not derivable in ZF. The authors observe that ZF remains (relatively) consistent if we add the axiom “there exists an infinite set having non-sets as elements”. Moreover, the results cannot be obtained in a system in which only sets are admitted as elements. These results were later completed by Mostowski. The models of the improved ZF with urelements (items that are not sets) are called Fraenkel–Mostowski models. Perhaps a particularly interesting feature of Lindenbaum–Mostowski 1938 consists in the conscious use of semantic methods in the metamathematics of set theory.

4. Lindenbaum and Logical Calculi

This section reviews Lindenbaum’s results in the metalogic of the propositional calculus. They were announced at Łukasiewicz’s seminar in mathematical logic in 1926–1930. Łukasiewicz initiated a research program devoted to propositional (sentential) logic, classical as well as many–valued (the results are collected in Ł–T1930). Łukasiewicz’s group pursued two main directions of investigation. First, constructions of sentential logic as axiomatic systems were undertaken. Łukasiewicz and his students looked for independent and possibly economical axioms (economical in the sense of having the smallest number of axioms built from the smallest number of symbols). Second, Warsaw logicians—mostly Tarski—invented and systematized the basic metalogical tools for investigating propositional calculi. Lindenbaum, contrary to the majority of the members of Łukasiewicz’s seminar, had no interest in proposing new axiomatizations of logic. His activities belonged entirely to metalogic. More specifically, Lindenbaum contributed to the matrix method, that is, to investigating sentential logic via logical matrices. Logical matrices are generalizations of the well–known truth–tables.

Abstractly speaking (note that the level of abstraction is higher in Ł–T1930), a logical matrix is an ordered quadruple M = ⟨U, U′, f, g⟩ such that U and U′ are two arbitrary sets (in order to exclude trivial cases, both sets are assumed to be non–empty), U has at least two elements, U′ ⊆ U, f is a binary function, and g is a unary function. Both functions are defined on U and take values in U. The intended interpretation is as follows: U is the set of logical values and U′ the set of designated logical values. In the case of two–valued logic (1 – truth, 0 – falsehood), U = {1, 0} and U′ = {1}. Assume that L is a language of sentential logic with implication and negation as its connectives. A valuation v assigns to every formula A ∈ L a value v(A) ∈ U in such a way that v(¬A) = g(v(A)) and v(A ⇒ B) = f(v(A), v(B)); intuitively speaking, f and g are the counterparts in M of implication and negation. If language is not taken into account, M is simply an algebra of (logical or other) values. Write v(A) for the logical value of A in M under the valuation v. The matrix M is normal provided that whenever v(A) ∈ U′ and v(B) ∉ U′, also v(A ⇒ B) ∉ U′. Roughly speaking, in a normal matrix an implication whose antecedent takes a designated value and whose consequent does not cannot itself take a designated value; as a result, the set of formulas verified by M is closed under detachment (modus ponens). If A ∈ L and v(A) ∈ U′ for every valuation v, then A is a tautology of M (A is verified by M). In the two–valued case, g corresponds to negation and f to implication: v(¬A) = 1 if v(A) = 0, and v(¬A) = 0 if v(A) = 1; v(A ⇒ B) = 1 if v(A) = 0 or v(B) = 1, and otherwise v(A ⇒ B) = 0. These equalities show that the truth–tables for implication and negation are special cases of logical matrices in the abstract sense.
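
As a concrete illustration of the matrix method, the two–valued matrix can be set out in a few lines of Python. This is only a minimal sketch: the function and variable names, and the choice of test formulas (Peirce’s law and double-negation elimination), are introduced here for illustration and are not taken from Ł–T1930.

from itertools import product

# The two-valued matrix M2 = <U, U', f, g>: U = {0, 1}, designated values U' = {1},
# f interpreting implication and g interpreting negation.
U = (0, 1)
DESIGNATED = {1}

def f(x, y):   # two-valued implication
    return 1 if x == 0 or y == 1 else 0

def g(x):      # two-valued negation
    return 1 - x

def verified(formula, arity):
    # A formula is verified by the matrix when every valuation of its
    # variables gives it a designated value.
    return all(formula(*vals) in DESIGNATED for vals in product(U, repeat=arity))

peirce  = lambda p, q: f(f(f(p, q), p), p)   # ((p => q) => p) => p
dbl_neg = lambda p: f(g(g(p)), p)            # (not not p) => p

print(verified(peirce, 2), verified(dbl_neg, 1))   # True True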

Lindenbaum established several important theorems connecting propositional calculi with logical matrices (they are listed here in a different order than they appear in Ł–T1930; that paper contains no proofs). Let the symbol LOG0 refer to the many–valued logic with a denumerably infinite set of logical values. Łukasiewicz defined matrices for such a system. Lindenbaum (theorem 16 in Ł–T1930) established that LOG0 can be characterized by a matrix in which U′ = {1}, the functions f and g satisfy the conditions f(x, y) = min(1, 1 − x + y) and g(x) = 1 − x, and U is an arbitrary infinite set of numbers which satisfies the condition 0 ≤ x ≤ 1 for every element x of U and is closed under the operations f and g. Lindenbaum also proved (theorem 19 in Ł–T1930) the following logico–arithmetical result (the converse of a result obtained earlier by Łukasiewicz): for 2 ≤ m ≤ ℵ0 and 2 ≤ n ≤ ℵ0, we have the inclusion LOGm ⊆ LOGn if and only if n − 1 is a divisor of m − 1. The next theorem, established by Lindenbaum for n = 3 (Tarski generalized this result to any prime number n), says that there are only two systems, L (the entire language) and LOG2, which contain LOGn as a proper part. Lindenbaum also proved (Ł–T1930, theorem 23) that every logic LOGn with 1 ≤ n < ℵ0 is axiomatizable.
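
The finite Łukasiewicz matrices can be sketched in the same illustrative style, assuming (as is standard) that the value set of LOGn is {0, 1/(n−1), …, 1} with 1 as the only designated value and with f and g given by Lindenbaum’s conditions above; the code below is only a sketch under these assumptions. The small experiment shows that Peirce’s law, verified by the two–valued matrix, fails in the three–valued one, while p ⇒ p is verified by all of them. It also illustrates one direction of theorem 19: since 3 − 1 divides 5 − 1, the three–valued value set is contained in the five–valued one and is closed under f and g, so every formula verified by the five–valued matrix is verified by the three–valued one, that is, LOG5 ⊆ LOG3.

from fractions import Fraction
from itertools import product

# The n-valued Lukasiewicz matrix: values {0, 1/(n-1), ..., 1}, designated value 1,
# with f(x, y) = min(1, 1 - x + y) and g(x) = 1 - x.
def values(n):
    return [Fraction(i, n - 1) for i in range(n)]

def f(x, y):   # Lukasiewicz implication
    return min(Fraction(1), 1 - x + y)

def g(x):      # Lukasiewicz negation
    return 1 - x

def verified(formula, n, arity):
    # True when every valuation over the n-valued matrix gives the formula value 1.
    return all(formula(*vals) == 1 for vals in product(values(n), repeat=arity))

peirce = lambda p, q: f(f(f(p, q), p), p)   # ((p => q) => p) => p
ident  = lambda p: f(p, p)                  # p => p

print(verified(peirce, 2, 2), verified(peirce, 3, 2))                        # True False
print(verified(ident, 2, 1), verified(ident, 3, 1), verified(ident, 5, 1))   # True True True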

Although the above results are general, they were mainly intended as reports of facts about many–valued logic. Lindenbaum also obtained results on arbitrary sentential calculi and their matrices. The definition of M (together with the definition of a sentential calculus as a set of formulas closed under the consequence operation) implies that the set of tautologies of a matrix, that is, LOG(M), is a system, provided that M is normal. Lindenbaum announced (theorem 3 in Ł–T1930) that every sentential calculus has an at most denumerably infinite normal matrix. This theorem was proved by Jerzy Łoś (see Łoś 1949; this work contains the first systematic treatment of logical matrices). Systematic investigations of matrix semantics for propositional calculi originated from this last work (see Wójcicki 1989 for an extensive report on this field of logical research; several important results are also described in Pogorzelski 1994). The following historical speculation can illustrate the importance of theorem 3. When Heyting formalized intuitionistic sentential logic (ISC) in 1930, the question of finding a normal matrix for this logic became important. Gödel showed (in 1932) that no finite matrix (he used the term “realization”) verifies exactly the theorems of ISC. By Lindenbaum’s result (Gödel did not refer to it), there exists an at most denumerably infinite normal matrix for ISC. Jaśkowski constructed one in 1936. He certainly must have known Ł–T1930, but he did not refer to it. This story makes a nice example of how influences could have interplayed in the search for an adequate matrix for intuitionistic propositional logic; but we have no accessible evidence that Lindenbaum’s theorem actually inspired Jaśkowski.

5. Lindenbaum and General Metamathematics

Most of Lindenbaum’s contributions to general metamathematics are mentioned in Tarski 1956 (referred to as T1956 below). The two most important results achieved by Lindenbaum are his construction of the so–called Lindenbaum algebra (LIA) and the maximalization theorem (LMT, frequently called the Lindenbaum Lemma). Lindenbaum observed that formulas themselves can be the elements of logical matrices (see Surma 1967, Surma 1973, Surma 1982 for the reconstruction of Lindenbaum’s path to LIA). Then he, as well as other logicians (Łoś, for instance), generalized this idea to arbitrary languages. LIA is presented in this article for the classical sentential calculus. Let L be a formal language with ¬, ∧, ∨, ⇒, ⇔ as connectives. Formulas as such, that is, variables and well–formed strings built from them, do not constitute an algebra. The symbol [A] refers to the Lindenbaum class of formulas with respect to A. We stipulate that B ∈ [A] if and only if ├ A ⇔ B (this step defines a congruence in L), and then define –[A] as [¬A], [A] ∩ [B] as [A ∧ B], and [A] ∪ [B] as [A ∨ B]; moreover, [A] ⊆ [B] holds if and only if ├ A ⇒ B, and [A] = [B] if and only if ├ A ⇔ B. If we denote the set of classes of formulas produced by the defined congruence by the symbol L[], the structure ⟨L[], –, ∩, ∪, ⊆, =⟩ is a Boolean algebra of formulas, that is, LIA for sentential logic. An interesting feature of this construction is that the building blocks for LIA come from language. This justifies the use of the same symbols for propositional connectives in the object language (the language of propositional calculus) and in the metalanguage (the language of LIA). Lindenbaum’s construction of algebras has important applications in algebraic proofs of metalogical theorems (see Surma 1967, Rasiowa–Sikorski 1970, Surma 1973, Rasiowa 1974, Surma 1982, Zygmunt–Purdy 2014), including the completeness theorem for classical logic as well as for many non–classical systems.
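
In compact modern notation (a standard restatement in LaTeX, amsmath assumed, rather than a quotation from Lindenbaum or from T1956), the construction reads as follows, with ≈ denoting the congruence of provable equivalence and ⊆ the ordering of the resulting classes:

% Identify formulas up to provable equivalence and read off the Boolean structure:
\[
  A \approx B \ \Longleftrightarrow\ \vdash A \Leftrightarrow B, \qquad
  [A] = \{ B \in L : A \approx B \},
\]
\[
  -[A] = [\neg A], \qquad [A] \cap [B] = [A \wedge B], \qquad [A] \cup [B] = [A \vee B],
\]
\[
  [A] \subseteq [B] \ \Longleftrightarrow\ \vdash A \Rightarrow B, \qquad
  [A] = [B] \ \Longleftrightarrow\ \vdash A \Leftrightarrow B .
\]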

There is a controversy concerning the origin of LIA. According to Surma 1967, p. 128, Lindenbaum presented his idea at the 1st Polish Mathematical Congress in 1927. This fact, however, is known only from the Polish oral tradition. Significant information can be found in Rasiowa–Sikorski 1970, 245–246, footnote 1. The first published mention was made by McKinsey in 1941, with reference to Tarski’s oral communication. However, Tarski complained that the discovery of LIA should be credited to him (this is reported in the mentioned footnote in Rasiowa–Sikorski 1970). Since Tarski’s historical claim concerning LIA became known, some authors have used the label “the Lindenbaum–Tarski algebra”.

LMT is perhaps the most important result achieved by Lindenbaum. Its formulation is very simple: every consistent formal system has a maximal consistent extension. The theorem was probably inspired by the concept of Post–completeness, well known in Łukasiewicz’s group. It is mentioned several times in T1956; its proof is to be found on pp. 98–100. An important feature of LMT is that it is not intuitionistically provable (although we know that maximal extensions exist, there is no general method of constructing them) and that it requires infinitistic methods (it is, however, weaker than AC; see below). T1956 points out many applications of LMT, for instance, that classical logic is the only consistent and complete extension of intuitionistic logic (a result due to Tarski). The great career of LMT began after Henkin used it in his proof of the completeness of first–order logic. This proof combines LIA with the construction of maximal consistent sets of formulas. Henkin’s method has since been adapted for proving the completeness of many logical systems. LMT is effectively equivalent to the Stone ultrafilter theorem. One can prove (see Surma 1968) that AC implies the Gödel–Malcev completeness theorem, and the latter entails LMT. The standard version of LMT works for countable languages (see Łoś 1955 for LMT for uncountable languages) and a compact consequence operation (dropping compactness is not particularly significant). On the other hand, if LMT is strengthened by an additional assumption concerning individual constants or by admitting uncountable languages, it becomes provably equivalent to AC (see also Gazzari 2014). These results help in placing LMT on the scale of infinitary methods in metamathematics.
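The usual proof of LMT for a countable language and a compact consequence operation enumerates all formulas and adds each one whose addition preserves consistency; the union of the stages is then maximal and consistent. The sketch below shows the shape of that construction; it is schematic only, since the `is_consistent` test stands in for an oracle that in general cannot be computed, and the toy one–variable example at the end is purely illustrative.

```python
# A schematic sketch of the Lindenbaum maximalization construction.
# `formulas` enumerates all formulas of a countable language and
# `is_consistent` is an assumed consistency oracle (in general no such
# decision procedure exists, which is why the argument is non-constructive).

def lindenbaum_extension(x, formulas, is_consistent):
    """Return a maximal consistent extension of the consistent set x."""
    extension = set(x)
    for a in formulas:                      # step n: consider the n-th formula
        if is_consistent(extension | {a}):  # add it if consistency is preserved
            extension.add(a)
    return extension                        # maximal: no further formula can be added

# Toy demonstration over a single variable p: a "formula" is named by a string,
# and a set of formulas is consistent iff some valuation of p satisfies them all.
TRUE_AT = {"p": {"p=1"}, "not p": {"p=0"}, "p or not p": {"p=0", "p=1"}}

def toy_consistent(fs):
    common = {"p=0", "p=1"}
    for f in fs:
        common &= TRUE_AT[f]
    return bool(common)

# Different enumeration orders give different maximal consistent extensions,
# so maximal extensions need not be unique.
assert lindenbaum_extension(set(), ["p", "not p", "p or not p"], toy_consistent) == {"p", "p or not p"}
assert lindenbaum_extension(set(), ["not p", "p", "p or not p"], toy_consistent) == {"not p", "p or not p"}
```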

The program of reverse mathematics (see Simpson 2009) gives a more general perspective in this respect. The symbol RCA0 refers to Peano arithmetic minus the full induction scheme (induction is restricted to Σ⁰₁ formulas) plus the recursive comprehension scheme; it is a relatively weak subsystem of second–order arithmetic. LMT is equivalent over RCA0 to each of the following propositions: the weak König lemma, the Gödel–Malcev completeness theorem, the Gödel compactness theorem, the completeness theorem for propositional logic for countable languages, and the compactness theorem for propositional logic for countable languages. Since the weak König lemma is a rather mathematical (not metalogical) result, its equivalence with LMT exactly characterizes the mathematical content of the latter. If LMT is associated with a consequence operation admitting the rule of substitution, we obtain the so–called relative Lindenbaum extensions (see Pogorzelski 1994, p. 318; this theorem was proved by Asser in 1959). Roughly speaking, if we take a consistent set X and a formula A that X leaves undecided (neither A nor ¬A is derivable from X), we have two consistent extensions of X, namely X ∪ {A} and X ∪ {¬A}. Clearly, by the standard LMT, there then exist at least two different maximally consistent extensions. The problem of how many such relative Lindenbaum sets are associated with a given consistent set X has no unique solution.

Here is a list of some of Lindenbaum’s other metamathematical results (see Tarski 1956, 32, 33, 36, 71, 297, 307, 338 for the technical details):

  • the number of all deductive systems is equal to 2^ℵ0;
  • the number of all axiomatizable systems is equal to ℵ0;
  • the condition which must be satisfied in order for the sum of deductive systems to be a deductive system;
  • the structural type of a theory;
  • theorems on degrees of completeness;
  • atomic (atomistic, according to an older terminology) Boolean algebra;
  • the independence of primitive concepts in mathematical systems.

The last three results in the above list were achieved jointly by Lindenbaum and Tarski (see Tarski–Lindenbaum 1927), and the other results inspired Tarski in his metamathematical investigations. Moreover, Tarski credited Lindenbaum with pointing out the role of set–theoretical methods in metamathematical investigations (T1956, p. 75).

6. Final Remarks

Helena Rasiowa (in Rasiowa 1974, p. v) says that the introduction of the Lindenbaum–Tarski algebra became “one of the turning points in algebraic study of logic”. This tradition was continued, systematized, and conceptually unified by Tarski himself as well as by his American students, particularly J. C. C. McKinsey, Bjarni Jónsson, and Don Pigozzi, and by Polish logicians, notably Jerzy Łoś, Helena Rasiowa, and Roman Sikorski. Rasiowa–Sikorski 1970 can be considered the opus magnum of this direction. A similar role should be attributed to LMT (the label “Lindenbaum Lemma” is not entirely apt, because its actual importance exceeds that of an auxiliary device for proving other results) as a mark of the mathematical content of the tools used in metamathematics. Thus, Adolf Lindenbaum appears as one of the main masters in the development of mathematics for metamathematics. His results on logical matrices opened a new stage in metalogical investigations concerning the propositional calculus.

7. References and Further Reading

  • Gazzari, R. 2014, “Direct Proofs of Lindenbaum Conditionals”, Logica Universalis 8, Issue 3–4, 321–343.
  • Hinkis, A. 2013, Proofs of the Cantor-Bernstein Theorem. A Mathematical Excursion, Basel: Birkhäuser.
  • Lindenbaum, A. 1930, “Remarques sur une question de la méthode mathématique”, Fundamenta Mathematicae 15, 313–321.
  • Lindenbaum, A. 1931, “Bemerkungen zu den vorhergehenden “Bemerkungen” des Herrn J. v. Neumann”, Fundamenta Mathematicae 17, 335–336.
  • Lindenbaum, A. 1936, “Sur la simplicité formelle des notions”, in Actes du Congrès International de Philosophie Scientifique, VII Logique (Actualités Scientifiques et Industrielles 394), 29–38.
  • Lindenbaum, A.–Mostowski, A. 1938, “Über die Unabhängigkeit des Auswahlaxioms und einiger seiner Folgerungen”, Comptes Rendus des Séances de la Société des Sciences et des Lettres de Varsovie, Classe III, 31, 27–32; Eng. tr. in A. Mostowski, Foundational Studies. Selected Works, Vol. II, Warszawa–Amsterdam: PWN–Polish Scientific Publishers – North–Holland Publishing Company, 70–74.
  • Lindenbaum, A.–Tarski, A. 1926, “Communication sur les recherches de la théorie des ensembles”, Comptes Rendus des Séances de la Société des Sciences et des Lettres de Varsovie, Classe III, 19, 299–330; repr. in A. Tarski, Collected Papers, vol. 1, 1921–1934, Basel: Birkhäuser 1986, 171–204.
  • Lindenbaum, A.–Tarski, A. 1934–1935, “Über die Beschränktheit der Ausdrucksmittel deduktiver Theorien”, Ergebnisse eines mathematischen Kolloquiums 7, 15–22; Eng. tr. in Tarski 1956, 384–392.
  • Łoś, J. 1949, O matrycach logicznych (On Logical Matrices), Wrocław: Wrocławskie Towarzystwo Naukowe.
  • Łoś, J. 1955, “The Algebraic Treatment of the Methodology of Elementary Deductive Systems”, Studia Logica 2, 151–212.
  • Łukasiewicz, J.–Tarski, A. 1930, “Untersuchungen über den Aussagenkalkül”, Comptes Rendus des Séances de la Société des Sciences et des Lettres de Varsovie, Classe III, 23, 30–50; Eng. tr. in Tarski 1956, 38–59.
  • Marczewski, E.–Mostowski, A. 1971, “Lindenbaum Adolf (1904–1941)”, Polski Słownik Biograficzny 17, 364b–365b.
  • Moore, G. H. 1982, Zermelo’s Axiom of Choice. Its Origins, Development, and Influence, Springer Verlag: New York – Heidelberg – Berlin.
  • Pogorzelski, W. 1994, Notions and Theories of Elementary Formal Logic, Białystok: Warsaw University – Białystok Branch.
  • Rasiowa, H. 1974, An Algebraic Approach to Non-Classical Logics, Amsterdam – Warszawa: PWN–Polish–Scientific Publishers – North–Holland Publishing Company.
  • Rasiowa, H.–Sikorski, R. 1970, The Mathematics of Metamathematics, Warszawa: PWN– Polish Scientific Publishers.
  • Sierpiński, W. 1965, Cardinal and Ordinal Numbers, Warszawa: PWN – Polish Scientific Publishers.
  • Simpson, S. G. 2009, Subsystems of Second Order Arithmetic, Cambridge: Cambridge University Press.
  • Surma, S. J. 1967, “History of Logical Applications of the Method of Lindenbaum’s Algebra”, Analele Universităţii Bucureşti, Acta Logica 10, 127–138.
  • Surma, S. J. 1968, “Some Metamathematical Equivalents of the Axiom of Choice”, Prace z Logiki 3, 71–80.
  • Surma, S. J. 1973, “The Concept of Lindenbaum Algebra and Its Genesis”, in Studies in the History of Mathematical Logic, ed. by S. J. Surma, Wrocław: Ossolineum, 239–253.
  • Surma, S. J. 1982, “On the Origins and Subsequent Applications of the Concept of Lindenbaum Algebra”, in Logic, Methodology and Philosophy of Science VI. Proceedings of the Sixth International Congress of Logic, Methodology and Philosophy of Science, Hannover 1979, ed. by L. J. Cohen, J. Łoś, H. Pfeiffer and K.–P. Podewski, Warszawa – Amsterdam: PWN Polish Scientific Publishers – North–Holland Publishing Company, 719–734.
  • Tarski, A.–Lindenbaum, A. 1927, “Sur l’indépendance des notions primitives dans les systèmes mathématiques”, Annales de la Société Polonaise de Mathématique 7, 111–113.
  • Tarski, A. 1949, Cardinal Algebras, New York: Oxford University Press.
  • Tarski, A. 1956, Logic, Semantics, Metamathematics. Papers from 1923 to 1939, Oxford: Clarendon Press.
  • Tarski, A. 1986, “What are Logical Notions?”, History and Philosophy of Logic 7, 143–154.
  • Woleński, J. 1995, “Mathematical Logic in Poland 1900–1939: People, Circles, Institutions, Ideas”, Modern Logic V(4), 363–405; repr. in J. Woleński, Essays in the History of Logic and Logical Philosophy, Kraków: Jagiellonian University Press 1999, 59–84.
  • Wójcicki, R. 1989, Theory of Logical Calculi. Basic Theory of Consequence Operations, Dordrecht: Kluwer Academic Publishers.
  • Zygmunt, J.–Purdy, R. 2014, “Adolf Lindenbaum: Notes on His Life with Bibliography and Selected References”, Logica Universalis 8, Issue 3–4, 285–320.

 

Author Information

Jan Woleński
Email: wolenski@if.uj.edu.pl
University of Technology, Management and Information
Poland

Francisco Suárez (1548—1617)

Sometimes called the “Eminent Doctor” after Paul V’s designation of him as doctor eximius et pius, Francisco Suárez was the leading theological and philosophical light of Spain’s Golden Age, alongside such cultural icons as Miguel de Cervantes, Tomás Luis de Victoria, and El Greco. Although initially rejected on grounds of deficient health and intelligence when he attempted to join the rapidly growing Society of Jesus, he attained international prominence within his lifetime. He taught at the schools in Segovia, Valladolid, Rome, Alcalá, Salamanca, and finally at Coimbra, the last at Philip II’s insistence.

Not all of the attention Suárez received was positive. His Defensio fidei, published in 1613, defended a theory of political power that was widely perceived to undermine any monarch’s absolute right to rule. He explicitly permitted tyrannicide and argued that even monarchs who come to power legitimately can become tyrants and thereby lose their authority. Such views led to the book being publicly burned in London and Paris.

Suárez was clearly a scholastic in style and temperament, despite coming after the rise of humanism and living on the cusp of what is usually identified as the era of modern philosophy. His writings are sometimes said to contain the whole of scholastic philosophy because in addressing a question he surveys the full range of scholastic positions—Thomist, Scotist, nominalist, and others—before affirming one of those positions or presenting his own variant. The position he ultimately settles on is likely to be a via media.

Suárez’s greatness as a philosopher comes precisely from his magisterial weighing of all the competing positions across an extraordinarily broad range of theological and philosophical issues. The combination of broad systematicity, detailed elaboration, and thorough argumentation for his preferred view and against contrary views finds few rivals. He is a philosopher’s philosopher.

The even-handed presentation of the panoply of scholastic positions also explains why Suárez’s writing served as one of the key conduits through which medieval philosophy influenced early modern philosophy. Descartes, Leibniz, and Wolff, among others, learned scholasticism at least in part from reading Suárez, a scholasticism from which they then borrowed in developing their own philosophical theories.

Table of Contents

  1. Life
  2. Writings
  3. Thought
    1. Metaphysics and Theology
    2. Distinctions
    3. Esse and Essentia
    4. Efficient and Final Causation
    5. Existence of God
    6. Categories and Genera
    7. Beings of Reason
    8. Middle Knowledge, Grace, and Providence
    9. Natural Law and Obligation
    10. Political Authority
  4. Legacy
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life

Francisco Suárez was born on January 5, 1548 in Granada to Gaspar Suárez de Toledo and Antonia Vázquez de Utiel, a mere half-century after the Catholic Monarchs, Ferdinand and Isabella, wrested the city from eight centuries of Muslim control. The family was prosperous, although members of earlier generations on the maternal side ran into trouble with the Inquisition due to Jewish lineage, and several members were burned at the stake. Suárez’s uncle, Francisco Toledo de Herrera, was a prominent professor of philosophy who became the first Jesuit cardinal. His extended family included others of some note, including an archbishop cardinal and a viceroy of Peru.

Suárez was the second son of eight children. His brother Baltasar also joined the Jesuit order and was sent to the Philippines but died en route. Another brother became a priest, and three sisters joined a Jeronymite convent. Whether these facts indicate an exceptionally devout family or an attempt to mitigate some converso ancestry is less clear.

His childhood seems to have been unexceptional. He was schooled in Latin and rhetoric until thirteen, at which point he went to Salamanca to study canon law for three years. His academic performance at this point was lackluster. During his time at Salamanca he heard the legendary sermons of the Jesuit Juan Ramirez and felt the call to join the Jesuit order. In a decision stunning in retrospect, Suárez was rejected from the then-fledgling order on grounds of insufficient intellectual aptitude. Suárez persisted, and finally, after numerous appeals, he was admitted to the order in 1564 as an indifferent, that is, provisionally accepted with the understanding that he might be refused entry into the priesthood.

At this point, Suárez’s academic abilities seem to have flowered so suddenly that some biographers attribute the flowering to the miraculous intervention of Mary, the Mother of God. His newfound academic abilities did not go unnoticed, and he was sent to study theology at the University of Salamanca, then one of the most prestigious universities at the height of its glory and at the center of the Iberian revival of scholasticism. In 1570 he performed the “Grand Act” at Salamanca, something done only by the most gifted students. The Grand Act was a public academic exercise, resembling a medieval quodlibetal dispute, in which the student needed to be able to answer questions and resolve difficulties posed by professors and visitors. Suárez had enough of a reputation already to ensure prominent guests and a full hall. His new reputation also led his superiors to have him start teaching philosophy rather than first teaching grammar or rhetoric as was usually the case. Between 1570 and 1580, Suárez taught at several institutions around Spain: Salamanca, Segovia, Valladolid, and Avila. His earliest works come from this period.

Charges of novelty also started in this period. On Suárez’s telling, the problem was that he refused to teach by dictation from copy books but rather searched “for truth at its very roots.” The resulting controversy may have factored into his new position. In 1580 Suárez was called to Rome to join the Collegio Romano and to contribute to the development of the famous Jesuit pedagogical document, the Ratio Studiorum. The Roman College was an intellectually stimulating place to be, including such luminaries as the humanist Francisco Sanches, the theologians Robert Bellarmine and Gregory of Valencia, and the mathematician Christopher Clavius. Suárez also became a close colleague of the extraordinarily young general of the Jesuit order, Claudio Acquaviva.

In 1585 Suárez started teaching at the University of Alcalá. His seven years there were marked by strife with other theologians, including, notably, with his colleague Gabriel Vasquez, a dispute that left its mark on Jesuit philosophical history for generations. Suárez was unhappy with the distraction of all these conflicts and requested release from his position. Finally, in 1592, he was sent back to the university of his student days, Salamanca, where he wrote his best-known work, the Disputationes metaphysicae.

In 1597, the same year that Disputationes metaphysicae was published, Suárez moved yet again, this time to Coimbra in Portugal at the behest of Philip II. The first time Philip asked Suárez to move to Coimbra, in 1596, Suárez declined since he feared Coimbra’s teaching responsibilities would keep him from his writing projects. Because Spain had claimed Portugal during the 1580 Portuguese succession crisis, Suárez may also have feared the political situation, since the Portuguese were likely to be less than welcoming of a Spanish professor appointed by a Spanish king (even if the Spanish king was also Philip I of Portugal). Furthermore, Suárez would have occupied a new Jesuit chair at a Dominican university and tensions were running high between the two orders. Philip was disinclined to accept Suárez’s refusal, but granted it after a personal visit. Unfortunately, the person appointed in his stead died at the end of 1596, raising the issue all over again. This time Philip insisted that Suárez move to Coimbra. An amusing consequence was that Suárez now needed to acquire a doctoral degree, since the Coimbra faculty objected to Suárez’s post without one. The Jesuit Provincial in Lisbon was happy to confer one, but this failed to satisfy. Finally, Suárez made a trip to the University of Évora in southern Portugal, where he directed a public theological debate and was rewarded with a doctorate.

Suárez stayed at Coimbra until his retirement in 1613, although this was a retirement only from teaching. Among other projects, Suárez hoped to revise an earlier set of lecture notes into a commentary on Aristotle’s De anima. The revision remained unfinished, however, at Suárez’s death on September 25, 1617.

2. Writings

By Suárez’s time, Aquinas’s Summa theologiae (henceforth, ST) had to some extent replaced Lombard’s Sentences as the standard theological textbook and subject of commentaries. Many of Suárez’s works are offered as commentaries on particular sections of ST. Suárez generally does not, however, offer line-by-line comments on Aquinas’s texts; still, the arrangement of subjects in his work mirrors that in Aquinas’s text, the questions he considers often grow out of it, and he constantly cites passages from it.

In correspondence to ST Ia (that is, the First Part), Suárez wrote De Deo uno et trino (“On God, One and Triune”), De angelis (“On Angels”), De opere sex dierum (“On the Work of the Six Days [of Creation]”), and De anima (“On the Soul”). The last work, in particular, has received significant attention from Suárez scholars and is the primary source for his psychological views. One should also note in passing that titles are apt to be a source of confusion for those less familiar with Suárez’s works, since scholars will sometimes use titles for smaller or larger divisions of his work. For example, references to De divina substantia ejusque attributis, De divina praedestinatione, and De SS. Trinitatis mysterio are all to treatises that make up the De Deo uno et trino mentioned above. Conversely, the latter three works mentioned above are sometimes gathered under the title De Deo effectore creaturarum omnium, though they are more commonly cited individually. Listing all the variations would be tedious, but readers should note that a citation of a seemingly unfamiliar work of Suárez’s might simply be using the title of a collection of treatises or the title of a part of a larger collection.

In correspondence to ST IaIIae (that is, the First Section of the Second Part), Suárez wrote De ultimo fine hominis (“On the Ultimate End of Man”), De voluntario et involuntario (“On the Voluntary and Involuntary”), De bonitate et malitia actuum humanorum (“On the Goodness and Evil of Human Acts”), De passionibus et habitibus (“On Passions and Habits”), and De vitiis atque peccatis (“On Vices and Sins”), all published together in one volume under the title Tractatus quinque ad primam secundae D. Thomae. These works have received less scholarly attention, although they are obviously of great relevance for determining Suárez’s ethical views. De legibus seu de Deo legislatore (“On Laws or on God the Lawgiver”) also corresponds to IaIIae and has received more attention as one of Suárez’s greatest and most influential works. Seeing De legibus as a commentary on Aquinas’s ST helps to understand it properly, as well. It is sometimes read as a comprehensive presentation of Suárez’s ethical views, but Suárez himself conceives of it as corresponding to questions 90-108 of ST IaIIae. Those questions, of course, constitute a small fraction of Aquinas’s ethical thought. Finally, Suárez’s massive De gratia (“On Grace”) and some of the shorter theological works he wrote in connection with the controversy De auxiliis (to be addressed later) present an extraordinarily detailed account of grace and connected theological matters.

Suárez wrote fewer works on ST IIaIIae (a fact that is grist for the scholarly mill), but De fide theologica (“On Theological Faith”), De spe (“On Hope”), De caritate (“On Charity”), and the again massive De virtute et statu religionis (“On the Virtue and State of Religion”) do correspond to it. The last work offers a defense of the Society of Jesus along with a detailed examination of its principles. These works generally have received rather little scholarly attention, with the exception of the well-known disputation on war included in De caritate. That disputation is the main source for Suárez’s just-war theory.

Corresponding to ST IIIa are De verbo incarnato (“On the Incarnate Word”), De mysteriis vitae Christi (“On the Mysteries of Christ’s Life”), De sacramentis (“On the Sacraments”), and De censuris (“On Censures”). These works have received little attention from recent philosophers, but slightly more from theologians. De mysteriis vitae Christi, in particular, is significant for including several hundred pages of discussion of questions related to the Blessed Virgin Mary. It is this work that has earned Suárez a prominent place in the history of systematic Mariology.

Not all of Suárez’s works fit into the ST framework, especially those of his works sparked by religious and political controversies of his day. The famous controversy De auxiliis raged between the Dominicans and Jesuits during Suárez’s lifetime and he contributed a number of works defending the Jesuit position. The central issue was how to explain human free will on the one side and divine foreknowledge, providence, and grace on the other side in such a way that they were compatible with each other and without falling into one heresy or another. Suárez contributed a number of works dealing with these matters: De vera intelligentia auxilii efficacis (“On the True Understanding of Efficacious Aid”), De concursu, motione et auxilio Dei (“On God’s Concurrence, Motion, and Aid”), De scientia Dei futurorum contingentium (“On God’s Knowledge of Future Contingents”), De auxilio efficaci (“On Efficacious Aid”), De libertate divinae voluntatis (“On the Freedom of the Divine Will”), De meritis mortificatis, et per poenitentiam reparatis (“On Merits Destroyed and Revived through Penance”), and De justitia qua Deus reddit praemia meritis, et poenas pro peccatis (“On the Justice by Which God Gives Rewards for Merits and Punishment for Sins”).

More political in nature is the treatise De immunitate ecclesiastica a Venetis violata (“About the Ecclesiastical Immunity Violated by the Venetians”), a work written in defense of the papal position in the dispute between the papacy and the Republic of Venice about the extent of papal jurisdiction.

Best-known of Suárez’s controversial works is his response to the English monarch James I, Defensio fidei catholicae adversus anglicanae sectae errores (“A Defense of the Catholic Faith Against the Errors of the Anglican Sect”), written at the request of the papal nuncio in Madrid. Suárez’s initial aim was to respond to James I’s arguments for requiring Catholic subjects to take an oath of loyalty. The work became much more than that, however, and offers much of interest to theorists of political power. In it, Suárez opposes the absolute right of monarchs, argues that the papacy has indirect power over temporal rulers, and, perhaps most inflammatory, argues that there are situations in which citizens may legitimately resist a tyrannical monarch to the point of tyrannicide. The work was promptly condemned by the English king and publicly burned in London. James tried to enlist the support of other European monarchs in condemning Suárez and the Jesuit order more generally, with some success, especially in France.

Last but certainly not least, Suárez’s best-known and most influential work is his Disputationes metaphysicae (“Metaphysical Disputations”; henceforth, DM). The work is meant to cover the questions pertaining to the twelve books of Aristotle’s Metaphysics, but part of its significance lies in its not being a commentary on Aristotle’s work and not following its organization. Suárez is unhappy with Aristotle’s organization and so writes a large, two-volume work in which he sets out a comprehensive treatment of metaphysics in systematic fashion, organized into fifty-four disputations. The work is widely credited with being the first work to offer a comprehensive, systematically organized metaphysics and with initiating a long tradition of such works. DM was immediately and extraordinarily popular; it was widely reprinted all over Europe and quickly became the standard work to consult on metaphysical matters.

DM is divided into two main parts. The first part deals with the object of metaphysics, the concept of being, the transcendentals (that is, the essential properties of being as such, namely, unity, truth, and goodness), and the causes of beings. The second part deals with the divisions of being, first into infinite and finite. Finite being is then divided into substance and accident, with the latter then divided into the nine categories familiar from Aristotle. The last disputation concerns beings of reason, which, strictly speaking, fall outside the scope of metaphysics on Suárez’s conception, but an understanding of which is helpful for understanding metaphysics.

Aside from its systematic scope, few readers of DM have failed to be impressed by the extraordinary erudition displayed in its discussions. Each section includes a careful cataloguing of all the different positions that have been—and might be—taken with respect to the issue under question. DM’s two volumes include thousands of citations of hundreds of authors: Christian scholastic, Muslim, Patristic, ancient, and others. This may be the feature Schopenhauer has in mind when he characterizes DM as the “authentic compendium of the whole of scholastic wisdom.”

It is worth noting at this point that Suárez comes after centuries of a continuous tradition of professional theology and philosophy and consequently inherits a formidable accumulation of distinctions, technical terminology, and the like. Standard arguments and positions are often assumed or indicated in summary fashion. The resulting work is exceptionally sophisticated and can be highly rewarding to a reader with some familiarity with that tradition, but it can also be forbidding and alien to readers not so familiar.

The situation is not helped by the fact—astonishing in light of Suárez’s caliber and influence—that very little of his work is available either in critical edition or translation. Suárez’s works suffer far fewer textual issues than many medieval works do, so the lack of critical editions impedes less than it might. A significant number of his works were published posthumously, however, and the editorial decisions of his literary executor, Baltasar Alvarez, sometimes leave something to be desired. The value of translations, of course, is evident, yet aside from significant portions of DM very little has been translated into English (or any other language, for that matter). As the preceding presentation should have made clear, of course, Suárez was awe-inspiringly prolific, so a complete translation would be a monumental task. Furthermore, in addition to the mentioned works, there are still several unedited and unpublished works.

3. Thought

The first thing to note when trying to get one’s bearings with Suárez’s thought is that he, like many other scholastic authors, is a Christian Aristotelian. His thought is so thoroughly imbued with both Christianity and Aristotelianism that it would be difficult to find a single page in his writings not containing obvious traces of both. As is well-known, Aristotle says some things incompatible with orthodox Christianity, such as that the world had no beginning, so an orthodox Christian must modify Aristotelianism to some degree. Nonetheless, one can better understand what Suárez says and why he says it once recognizing that he is committed to orthodox Christian doctrines—more particularly, Roman Catholic doctrines—and that his philosophical framework and conceptions are rooted in Aristotle.

On the Christianity side, Suárez is committed to there being a God, a God who created and sustains human beings and the world they inhabit. God is a Trinity of Father, Son, and Holy Spirit, but also One, indeed, utterly simple. God exists necessarily, is infinite, immutable, and eternal. God is perfectly good, omniscient, and omnipotent, and orders everything in the universe providentially. Nevertheless, human beings sinned, consequently requiring God’s grace to attain their salvation and their ultimate end, that is, God. A central part of this story is the Incarnation of the second member of the Trinity, in which Christ assumes a human nature in addition to his divine nature. A moment’s reflection suffices to see that these doctrines raise a host of philosophical issues; much of Suárez’s work is devoted to such issues.

On the Aristotelian side, the most notable inheritances are hylomorphism (the view that objects are composed of matter and form); the four-causes explanatory paradigm (material, formal, efficient, and final); the categorial scheme of substance and nine categories of accidents; the view that the human soul is the form of the body; and the language of ultimate ends, happiness, and virtue in ethics. As scholars of medieval philosophy well know, this Aristotelian legacy leaves considerable room for philosophical disputes, in part due to unanswered questions, in part due to answers susceptible to differing interpretations.

The three authors most frequently cited by Suárez are Aristotle, Aquinas, and Scotus, in that order. Counting citations is not generally an infallible guide to influence, but in this case the citations seem an accurate reflection of influence. Aristotle’s influence is pervasive. Suárez claims to follow Aquinas throughout, though he no doubt exaggerates his fidelity to Aquinas. (The Jesuit order of which he was a member required fairly close adherence to Aristotle in matters of philosophy and Aquinas in matters of theology.) The number of references to Scotus, however, suggests that Suárez also had great respect for his thought. How much this respect led to influence is a matter of some controversy. Detecting such influence is made more difficult by Suárez’s practice of presenting his own view as that of Aquinas (properly interpreted), even where one might doubt their harmony. Closer examination of Suárez’s views confirms that his respect for Scotus at least occasionally led to adopting broadly Scotistic views.

The enormous range of issues addressed in Suárez’s writings ensures the impossibility of surveying all of them in an encyclopedia entry. What follows is a sampling of the issues that have received at least some scholarly attention, but it should be noted that a variety of significant topics, for example, his just war theory and his psychological views, have been omitted. The order roughly follows the order of DM with several issues drawing primarily on other works appended at the end.

a. Metaphysics and Theology

There are at least two issues concerning Suárez’s conception of metaphysics that have been the source of some controversy. Both issues also relate to the perennially enticing question of whether Suárez is the last medieval or the first modern.

Suárez opens his systematically ordered Disputationes metaphysicae by asking what the object of metaphysics is. The question received much discussion in scholastic philosophy, no doubt in part because Aristotle suggests several candidates that do not look equivalent in his Metaphysics. Suárez canvasses and criticizes six answers before giving his own answer: “being insofar as it is real being” (DM 1.1.26). Real being includes both infinite being (God) and finite being, both substances and real accidents. What it excludes, notably, is beings of reason, that is, beings that cannot exist other than as objects of thought. Note the “cannot.” On Suárez’s conception, real being includes both actual being and possible being. It does not, however, include privations, for example, blindness, and other such beings of reason.

Suárez considers metaphysics a unified science in the Aristotelian sense. The function of a science is to demonstrate the properties of its object through the latter’s principles and causes (DM 1.1.27 and 1.3.1). The term “properties” here is not being used in its wide contemporary sense but rather to refer to features necessarily possessed by the members of a kind yet not essential to them. The classic example is the capacity to laugh, which scholastics generally deem a feature necessarily possessed by human beings yet do not deem an essential feature, as rationality is. The function of metaphysics, then, is to identify the properties of being insofar as it is real being, and to demonstrate their necessary possession by appeal to the principles and causes of being insofar as it is real being.

The immediate objection is that being, insofar as it is real being, has neither necessary features that can be demonstrated nor principles and causes (DM 1.1.27). Consequently, it fails to make a suitable object for metaphysics. With respect to the first part, the thought is that being insofar as it is real being is so far abstracted that it has no properties. Trees, rocks, angels, and so forth all have properties, but what property does the being common to every being have? With respect to the second part, on the view that Suárez shares, the being par excellence is God, and God has no causes. Suárez, however, denies both claims. Being insofar as it is real being may not have any properties really distinct from it, but it does have properties that are at least distinct in reason, namely, the transcendentals: unity, truth, and goodness. In a similar move, Suárez denies that a science need appeal to causes, strictly speaking. Rather, appeal to principles or cause in a looser sense is sufficient. His example is that we can demonstrate God’s unity from God’s perfection, even though the latter is not strictly speaking a cause of the former (DM 1.1.28-29).

This conception of metaphysics might seem much narrower than more recent conceptions. It might suggest that all Suárez needed to do to finish DM is to demonstrate being’s unity, truth, and goodness and he would be done. As the size of DM attests, however, Suárez addresses a great many more topics. In the first place, he also includes disputations on the contraries of the transcendentals, namely distinction, falsity, and evil. Secondly, he includes one of the most thorough treatments of causation ever written on grounds (i) that even if being insofar as it is real being does not have a cause, most beings do and (ii) that all being is in some way a cause even if not itself caused. Finally, the entire second half of DM deals with the divisions of beings and includes lengthy disputations about the nature of particular kinds of beings such as substances, qualities, and so forth. In practice, then, the range of subjects discussed in DM comes much closer to the range of subjects that would be discussed in a modern metaphysics textbook.

Turning to the two matters of controversy mentioned earlier, the first question one may ask is how realist Suárez’s account of metaphysics is. An oft-told narrative has it that Aristotle and his medieval followers held metaphysics to concern the real rather than the mental, while at some point in the modern era metaphysics came to be about structures of thought (or something similarly mental) rather than about the extramental world. On the face of it, Suárez’s account seems impeccably realist. After all, metaphysics’ object is being insofar as it is real being. Despite seeming this way, Suárez has often been identified as one of the key figures in the transition to a non-realist approach to metaphysics. The room for debate comes with a later passage in which Suárez says that metaphysics’ object is “the objective concept of being as such” (DM 2.1.1). This is not obviously contradictory to the earlier statements that the object is being insofar as it is real being, but neither is it obviously in harmony. The question now becomes how Suárez understands objective concepts. If objective concepts borrow their ontological status from their objects, so to speak, then Suárez can readily be read as a realist. (And note that on Suárez’s view, objective concepts are concepts by extrinsic denomination; they are not really distinct from the objects. See, for example, DM 2.1.1 and 6.5.3.) On the other hand, if an objective concept is a mental item, then this later passage opens the door to less realist readings. Resolving this protracted dispute in the literature is obviously beyond the scope of this article.

The second question concerns the relationship of metaphysics to theology. Suárez makes it perfectly clear in the preface to DM that he adheres to the standard medieval view that metaphysics, albeit perfective of the human mind in its own right, ought to be in service of theology. He is, however, frustrated with piecemeal metaphysical discussions scattered throughout theological treatises, and so he writes DM to present a comprehensive metaphysics in the proper “order of teaching.” In this project, some detect a distinctly modern attitude and an early example of an increasing trend to see metaphysics as a separate discipline from theology. Others, however, are skeptical and think that Suárez sees metaphysics as distinct in the same way that his medieval predecessors did (namely, insofar as metaphysics proceeds by the “natural light of reason” apart from divine revelation) but in no additional way (for example, not by thinking that metaphysics should be conducted wholly autonomously).

b. Distinctions

The question of whether one item is distinct from another comes up in many contexts. One such place was already seen; namely, whether objective concepts are distinct from their objects. A naïve approach might just ask whether two items, A and B, are distinct or not. Is the human mind distinct from the brain? Are relations distinct from their relata? Is God’s mercy distinct from God’s justice? And so forth. Sustained philosophical reflection soon shows, however, that different sorts of distinctions might be posited between entities. Perhaps God’s mercy is identical to God’s justice in one sense but not distinct in another sense. Suárez’s scholastic predecessors frequently felt the need for a distinction between distinctions, so to speak, and so posited a plethora of distinctions. Suárez also recognizes the need for different distinctions but aims to prune the list down to three basic kinds: real, modal, and of reason. He argues in DM 7 that all other putative distinctions are actually one of these three kinds.

The two most obvious kinds of distinction are real distinctions and distinctions of reason (the two kinds that already made a showing in the previous section). As might be expected, a real distinction is the sort one expects between one thing and another thing. It is crucially an extramental distinction, one whose sign is mutual separability, that is, the possibility of either thing existing without the other. Distinctions of reason are distinctions in mind only. There may be a distinction of reason between Mark Twain and Samuel Clemens but there is no real distinction. A notable context in which Suárez wishes to appeal to distinctions of reason is when discussing God’s attributes. The doctrine of divine simplicity entails that the divine attributes can only be distinct in reason. One complication for the notion of distinctions of reason is that Suárez thinks some of them have some sort of foundation in reality while others do not.

Suárez argues that in addition to real distinctions and distinctions of reason there is a third kind of distinction, intermediate between the other two. The mark of a real distinction is mutual separability while the mark of a modal distinction is one-way separability. On Suárez’s view, for example, the union of form and matter cannot exist without the form and matter, but form and matter can exist without their union. If the union itself were a really distinct thing, Suárez thinks, then a further union would be needed to unite the union to the form and matter, and so an infinite regress would have started. So union is not some really distinct thing. But union is not merely distinct in reason, since the form and matter can exist without being united. Consequently, an intermediate kind of distinction is needed.

The reason mutual separability and one-way separability are marks or signs of the corresponding distinctions rather than simply constituting those distinctions is that there are some important exceptions. On Suárez’s view there is only a one-way separability between God and creatures. God can exist without creatures but not vice versa. Assuming one is not a monist à la Spinoza, that requires granting a case of real distinction without mutual separability.

c. Esse and Essentia

In Being and Some Philosophers, Etienne Gilson famously distinguishes four fundamental traditions in metaphysics on the basis of how being is understood, two of which are existentialism and essentialism. Both terms have been used in an unhelpfully wide array of senses, but for Gilson an existentialist privileges existence over essence while an essentialist does the converse. In Gilson’s story, Thomas Aquinas is the hero who recognizes being as the very act of existing and metaphysics as the science of being insofar as it is being. Suárez, however, he sees as the paradigmatic essentialist in whose philosophy existence is no longer significant and for whom metaphysics becomes the science of essences. One may recall here that Suárez identifies the object of metaphysics as being insofar as it is real being, but includes both actual and merely possible being in real being. Gilson thinks, furthermore, that this essentialism gives Suárez a pivotal role in history, albeit a malignant one, since essentialism leads to a variety of further philosophical ills.

Suárez’s conception of metaphysics figures in this story, but so does his account of the distinction between esse (“being,” meaning here actual existence) and essentia (“essence,” meaning here an individual nature), a matter related to a variety of issues regarding necessary versus contingent existence, eternal truths, the possibility of Aristotelian science about contingent things, and so forth. The usual Thomist view—how precisely to understand Thomas himself is a matter of some controversy—is that there is an actual or real distinction between essentia and esse. Or, rather, there is such a distinction for created things, which exist contingently. God, however, exists necessarily and from himself (a se); in God there is no distinction between essentia and esse.

Suárez, however, rejects a real distinction between esse and essentia and, furthermore, rejects a modal distinction. This is, in part, because of his understanding of real and modal distinctions: a real distinction would suggest that an essentia could exist without esse and esse without essentia, and a modal distinction would suggest at least the former (one source of confusion in these discussions is that a Thomist real distinction is probably not the same as a Suárezian real distinction). There were some who were willing to bite one or both of those bullets, but Suárez argues at length for their unacceptability. On Suárez’s view, then, even in created beings there is only a distinction of reason between esse and essentia. Since he is committed to the Christian doctrine of creation ex nihilo, the consequence is that an uncreated essentia is absolutely nihil, while, as expected, in a created being its essentia and esse are really the same, albeit conceptually distinct.

Much more would need to be said, and has been said by various scholars, to establish how significant this Suárezian claim is especially with respect to Aquinas’s position or even whether it is significant at all. Inter-school polemics and differing terminologies have resulted in more heat than light on this issue.

d. Efficient and Final Causation

Suárez’s extraordinarily detailed explication of the four kinds of Aristotelian causes—material, formal, efficient, and final—has received increasing attention in the early 21st century, perhaps in part because seven of the relevant disputations have finally been published in English translation. The resulting scholarly discussions have often been tied up with questions about Suárez’s place in the history of philosophy. Is he a loyal member of the medieval guild or is he setting the stage for mechanism or modernity? Or, perhaps both? On one reading of his account of formal causation, Suárez is the tragic hero making a valiant attempt to defend substantial forms, but in the course of doing so he alters the conception of substantial forms from the traditional model, thereby inadvertently making substantial forms more susceptible to mechanist critique and dismissal. If so, then he is a loyal member of the medieval guild and yet sets the stage for the modern mechanical philosophy. Not all scholars think it is so, however; some think he actually offers remarkably persuasive arguments on behalf of a more or less traditional conception of substantial forms, arguments to which the early modern mechanists would have done well to pay more attention.

Similar issues arise with Suárez’s account of final causality and its relationship to efficient causality. Ends are that for the sake of which an agent acts, as good health is the end for the sake of which a doctor prescribes medicine and finding small invertebrates to eat is the end for the sake of which curlews push their long curved bills into mud. Aristotle—and Aquinas—famously take ends to be a kind of cause, namely, final causes. Appeal to final causes is essential for offering complete explanations. In fact, Aquinas goes so far as to say that all the other kinds of causation, including efficient causation, presuppose final causation. Without final causation there would be no efficient causation on his view. Some scholars, however, have argued that Suárez departs from Aquinas on this score and prioritizes efficient causation, perhaps even reducing final causation to efficient causation. If so, this would make Suárez look like an intermediate between Aquinas’s position and the widespread dismissal of final causation in early modern philosophy.

At first glance, one might think Suárez straightforwardly endorses a traditional picture. He divides his discussion of causation according to the Aristotelian fourfold classification, explicitly defends the status of all four causes as real causes (DM 12.3), and, most importantly, defends at length the claim that ends are real causes (DM 23.1) and argues that final causation is present in the actions of God, rational created agents, and natural agents (that is, non-rational agents such as cows and oak trees).

But a closer look reveals some grounds for those wishing to argue that Suárez emphasizes efficient causation at the expense of final causation. First, he explicitly states that the definition of “cause” applies most properly to efficient causes (DM 12.3.3). Second, when talking about efficient causation he uses the term “real motion,” but when talking about final causation he uses the term “metaphorical motion.” Third, final causality depends on efficient causality, since an end is an actual final cause only if an efficient cause acts on its behalf. Fourth, an end is a final cause only if cognized by a rational agent (see DM 23.7 and 23.10.6); this stands in contrast with Aristotle’s confidence in final causation without the thought and intention of a rational agent. Besides, Suárez devotes far more pages to efficient causation than to final causation.

Nonetheless, scholars who wish to attribute a more traditional view to Suárez can also find support from a closer look at Suárez’s text. Taking the previous paragraph’s points in turn, one may first note that the term “cause” most properly applying to efficient causes is entirely compatible with the term properly applying to final causes, and that the significance of saying that the term applies most properly in one case but not the other is not immediately obvious. Second, it is true that Suárez uses the term “metaphorical motion” when describing the motion of final causes, but he is simply following well-entrenched terminological practice. Also, when pressed, he explicitly denies that metaphorical motion is so-called because it fails to be real motion (DM 23.1.14). Third, on Suárez’s view, actual final causation does seem to depend on efficient causation, but the converse is true as well. He grants Aquinas’s point that efficient causation presupposes final causation, that efficient causes would not act were they not to have ends for the sake of which to act. At the very least, there seems to be a sort of mutual dependence. In some passages, Suárez appears also to endorse the priority of final causation, though whether his view licenses that conclusion is a more complicated matter.

The fourth issue cannot be fully addressed without drawing in a number of other philosophical issues. One may briefly note, however, that, while Suárez does demand that ends be cognized in order to final-cause, he thinks this condition is satisfied even in the case of natural agents. This is a result of his concurrentism, according to which all actions of created things also have God as a concurring agent. That is, one and the same action has two agents, at least one of which is a rational agent. Of course, this account leaves final causation in the natural realm dependent on final causation in the divine realm. But final causation in the latter realm is not unproblematic. A central scholastic assumption, which Suárez shares, is that God is never subject to causation. So how can a final cause move God? And if it cannot, then how can natural actions inherit final causes from God? Suárez is well aware of this problem (DM 23.9.1). His response is to concede that there is no final causality in the case of God’s immanent actions, but that there is no problem with saying that God’s transeunt actions, that is, actions not located in God, have final causes. Whether this answer can be made to work in light of the details of his account of metaphorical motion in DM 23.4 is a further matter. With respect to the fourth issue, there is also a historical question whether the cognition requirement represents a change just from Aristotle or also a change from Aquinas.

These questions about how to fit Suárez’s account of causation into a broader history exemplify a common approach to Suárez. His place in time ensures that it is always tempting to read him as a transitional figure, as standing between the medieval view—where the “medieval view” typically means the view of Aquinas—and the views of the early modern mechanists. Consequently, a strand running through much Suárez scholarship concerns whether he in fact holds transitional views or not.

e. Existence of God

In DM 28, Suárez argues that the best primary division of being is into infinite and finite being, a division he considers equivalent to a number of other divisions including between necessary and contingent being, essential and participated being, and uncreated and created being. It is worth noting, however, that he does not take the term “being” to be used univocally when predicated of both infinite and finite being (DM 28.3). Nor does he go to the opposite extreme and consider it equivocal. Rather, he argues that “being” is used analogously in this case in virtue of the intrinsic characters of both infinite being and finite being.

As Suárez notes, the arguments of DM 28, assuming they work, already go a long way towards establishing the existence of God, since they purport to show that there must be some uncreated being. He devotes an entire disputation, however, to the question of God’s existence. His goal in DM 29 is to prove by natural reason, without any appeal to special revelation, that God exists, a goal that he thinks can be achieved.

His optimism about the possibility of demonstrating that God exists does not result in an uncritical attitude to previous efforts to do so. He rejects, for example, versions of the ontological argument that claim God’s existence to be evident from the fact that necessarily existing is part of what it means to be God (De Deo uno et trino 1.1.1.9). Of course, so did Aquinas. Perhaps more surprising, given Suárez’s debt to Aquinas, is that he also rejects the cosmological argument from motion made by Aristotle and made famous as Aquinas’s first way (Summa theologiae Ia.2.3). This argument starts from the motion or change that we observe, claims that whatever is moved is moved by another, but that there cannot be an infinite chain of moved movers, and so concludes that there must be an unmoved mover at the foundation. A key reason for Suárez to worry about this argument is that he not only considers the status of the Aristotelian principle that whatever is moved is moved by another to be uncertain, but also thinks that we ourselves provide counterexamples via our free actions. Consequently, he thinks the physical cosmological argument (“physical” because motion pertains to physics) relies on a false premise.

Instead, he turns to a metaphysical version of the cosmological argument (“metaphysical” because being is the object of metaphysics). This argument starts from the observation that there are things that exist, notes that every being either is made by something else or is not (that is, is created or is uncreated), argues that not every being can be made by another being, and concludes that there must be some being that is uncreated (DM 29.1.20-40). The alternatives to the claim that not every being can be made by another being would be either that there is an infinite chain of beings, each made by a prior being, or that there is a circle of beings, each making the next one in the circle. Suárez argues that these alternatives are impossible.

Long before Hume, Suárez recognizes that this cosmological argument hardly suffices to show that there is an uncreated being that merits being called God. Multiple worries might be raised, but Suárez focuses on the observation that for all that the cosmological argument shows, there might be many uncreated beings making other beings. In response, he moves to the next stage of his argument and enlists the aid of teleological arguments, arguing that attending to the order, structure, and beauty of the world shows that there is only one uncreated being (DM 29.2). He considers a variety of objections, ranging from the claim that the order only indicates at most that there is one governor of the world to the possibility of the world having been created and governed by a committee of uncreated beings working in consensus to the possibility that our world is only one of many worlds, each with its own uncreated creator. Suárez argues that some of these objections fail, but he concedes that the teleological or a posteriori argument he is considering cannot show that there are not other worlds with their own creators.

For the final stage, then, he turns to what he calls an a priori argument (DM 29.3). Strictly speaking, there can be no a priori arguments for God’s existence on the scholastic understanding of a priori arguments. For such arguments are arguments from causes to effects and God has no causes. Suárez accepts this point, but suggests that once we have an a posteriori demonstration of a divine attribute, it is possible to demonstrate a priori further attributes from that attribute (cf. the aforementioned example of using God’s perfection to demonstrate God’s unity). Suárez then proposes to demonstrate that there can be only one uncreated being from an uncreated being’s existing necessarily and a se (from itself). The resulting stretch of arguments is complex and relies on premises whose truth is not always obvious. Suárez himself is modest about the force of the argument, granting at the start that the proposed project is difficult and noting at the end that not all of the steps are immune to evasions. He does, however, think that the whole argument taken together will have some persuasive force for a reader who is not obstinate.

f. Categories and Genera

Thanks to the Aristotelian legacy, category theory was a prominent feature of scholastic philosophical discussions. Aristotle famously enumerated ten categories but left it unclear what the justification was for listing ten rather than fewer or more categories. A project that occasioned significant interest among medieval philosophers, then, was to provide the argument that would establish ten and only ten categories; such arguments were called sufficientiae. Deference to Aristotle was not universal, however, and so other philosophers argued that there are fewer than ten categories. There was also disagreement about what the categories are classifying. Extramental objects? Words? Concepts?

In these discussions, the terms “categories” (“praedicamenta”) and “highest genera” (“generalissima” or “suprema genera”) are often used interchangeably. At least some of the time, however, Suárez distinguishes between categories and genera and says that it is the business of logicians to deal with categories, since logic is concerned with the mind’s concepts, while it is the business of metaphysics to deal with the highest genera of beings, revealing their natures and essences (DM 39.pr.1). This suggests a view in which there is a kind of correspondence between the classification of concepts and the classification of extramental beings.

Be that as it may, Suárez qua metaphysician devotes the bulk of the second half of DM to dividing finite being (God falls outside the scope of this division) into the ten highest genera and discussing each genus in turn. Substance, of course, occupies a special role in Aristotelianism, and so Suárez first divides finite being into substance and accident and discusses substance at some length. He then turns to a discussion of the division of accidents, that is, the remaining nine genera on Aristotle’s view.

The first question he considers is whether nine genera of accidents—or ten genera in total—is too many (DM 39.1). He ends up affirming Aristotle’s number but gives it a somewhat deflationary spin. He concedes that a variety of intermediate genera can be devised: real accidents versus accidents that are mere modes, absolute accidents versus respective accidents, and so forth. This concession raises questions about the significance of Aristotle’s ten genera, if they are not the most basic or immediate divisions. Suárez, however, thinks Aristotle’s division is nonetheless “most apt.”

The second question concerns the sufficiency of the nine genera of accidents (DM 39.2). Suárez divides this into two issues: (i) are there genera beyond these nine? and (ii) are all nine genera distinct from each other? As usual, Suárez has great respect for his philosophical forebears and says that it would be “rash” to doubt Aristotle’s number. That said, when he discusses the sufficientiae of Aquinas and Scotus, he is obviously dissatisfied. He concludes that a proper a priori demonstration of the sufficiency of the ten highest genera cannot be given. This, he claims, should be no surprise, since a science presupposes its subject rather than demonstrating it.

But what sort of distinction is there between the genera? One would be surprised to find that a given bird belongs to two different genera, say, Ara and Tangara. Similarly, one might be surprised to find that Aristotle’s genera overlap or even coincide entirely. Yet Suárez quickly rejects the view that there is always a real distinction between items belonging to two or more of the highest genera. He is more sympathetic to the view that there is a modal distinction. There are, however, cases that keep him from accepting that view as well.

Relations, the fourth highest genus, are of special concern. There is evidence that at one point in his career Suárez took relations to be modally distinct from their foundations. It is also a view to which he gives more time and attention when he gives an explicit treatment of relations in DM 47, though ultimately he rejects the view in that disputation. Instead, he concludes that relations are only distinct in reason from their foundations. For example, if Socrates and Plato are similar to each other in virtue of each being white, then the foundation for Socrates’s relation of similarity to Plato is Socrates’s whiteness. On Suárez’s mature view, Socrates’s similarity relation is neither really nor modally distinct from Socrates’s whiteness. The relation and quality are only conceptually distinct. There is a persuasive reason for such a reductionist account of relations. If God creates a world with nothing but substances and absolute qualities, it seems the relations would ipso facto follow. No separate creative act would be needed to ensure that white Socrates and white Plato were similar.

Consequently, Suárez concludes that a distinction of reason suffices for a separate highest genus (DM 39.2.22 and 47.2.22). Sometimes there in fact is a real distinction. Suárez thinks items in the second and third genera, quality and quantity, are really distinct from substance and from each other. In the remainder of the cases, however, there is either only a modal distinction or only a distinction of reason. He does stress that the distinction of reason needs to be one with a foundation in reality, which raises challenging questions about what that foundation is. The standard example of distinctions of reason with a foundation in reality in scholastic philosophy is the distinction between God’s attributes. That example may have been dialectically effective insofar as it was widely accepted, but it is not thereby an illuminating example. In the case of action and passion, two genera only distinct in reason, Suárez seems to suggest that the foundation in reality can be comparisons to extrinsic things. He says that action is distinct from passion insofar as action is compared to the principle that acts and passion to the principle that undergoes (DM 39.2.23).

It is worth noting in passing that Suárez thinks “being” is used analogically across the nine highest genera of accidents (DM 39.2.3). Consequently, one of the key texts relevant to the controversies about Suárez’s doctrine of the analogy of being is found in his discussion of the division of accidental being.

g. Beings of Reason

A moment’s reflection shows that we not only think and talk about existent objects such as stars and oak trees, we also think and talk about non-existent objects and even objects that could not exist, such as square circles or goat-stags. But, to use one of Suárez’s examples, what is one talking about if one says that two chimeras are similar but that goat-stags and chimeras are different? Ordinarily one might think that propositions are made true by beings. The proposition that many oaks have lobate leaves is made true by the many real oak trees with lobate leaves. But what makes the proposition that goat-stags and chimeras are different true? Besides, how can a thought be directed at one non-existent object rather than another?

Suárez calls such objects of thought “beings of reason,” and he ends DM with a disputation devoted to them (DM 54). The inclusion of this disputation might come as a surprise, given the systematic nature of the work and given that Suárez has carefully defined the object of metaphysics as “real being.” Well aware of his earlier definition, Suárez opens DM 54 with a prologue defending the treatment of beings of reason. Given that our thought inevitably involves beings of reason, someone needs to give an account of them. Suárez thinks the metaphysician is best suited to the task, since beings of reason are “shadows of being” and consequently should be treated in analogy to real being. Since real being is metaphysics’ object, metaphysicians are well-positioned to give an account of beings of reason also. Suárez’s is a controversial claim; some scholastics thought beings of reason the logician’s domain. Note, too, that the analogy between real being and being of reason is a rather weak one. Not only do the analogates not fall under any unitary concept, but beings of reason do not even have in themselves anything proportional to real beings (beings of reason have nothing in themselves). Suárez, however, points out that beings of reason are thought of as having a proportion to real beings, and insists that is sufficient for a kind of analogy of proportionality (DM 54.1.10).

As is his wont, Suárez attempts to thread a middle course. There are those who deny that there are beings of reason or that beings of reason are needed in order to teach about or conceive of real beings. On the other side are those who grant beings of reason and, furthermore, claim that there is a single concept of being that includes both real beings and beings of reason. Suárez’s own view is that there are beings of reason but that the “are” in that claim does not indicate the same thing as the “are” in the sentence “there are oak trees.” There is no single concept covering both real beings and beings of reason.

When first introducing his own view of beings of reason, Suárez characterizes them as “not true real beings, since they are not capable of true and real existence” (DM 54.1.4). It is worth noting the modal force in that characterization and recalling that for Suárez real being includes both actual and possible being. What Suárez has in mind when talking about beings of reason are things that cannot exist. That class turns out to be rather motley. Mythical beasts, negations, privations (for example, blindness), and even logical concepts such as genus and species are all beings of reason.

Suárez’s explicit definition comes two paragraphs later: “a being of reason is usually, and rightly, defined as that which has being only objectively in the intellect or as that which is thought by reason as a being even though it has no entity in itself” (DM 54.1.6; italics in the original). One issue in Suárez scholarship concerns whether the two disjuncts amount to equivalent definitions or not. If Suárez also thinks that it is possible to think of non-beings in the manner of non-beings, then it would look like there can be things that satisfy the first disjunct without satisfying the second disjunct. Another issue concerns how to understand the talk of being objectively in the intellect. On the traditional interpretation, Suárez’s view is that beings of reason have a peculiar mode of being, namely, as pure objects of thinking. Consequently, their being depends on minds actually thinking about them. That interpretation can and has been challenged, however, by a more eliminativist account that denies that “being only objectively in the intellect” is any sort of being at all. Rather, it just describes the state of affairs where an intellect has a contentful thought about something that simply fails to exist.

Most of the interest in Suárez’s account of beings of reason concerns his general characterization. DM 54, however, goes on to ask what sorts of causes, if any, beings of reason have and whether the traditional division of beings of reason into negations, privations, and relations of reason is right and sufficient. He answers that, rightly understood, it is.

Questions concerning beings of reason go back at least to the ancient Greek philosophers, but Suárez’s discussion is a landmark in its detail and systematicity. It also leaves some matters unclear, however, and of course other philosophers found things with which to disagree. The result was a train of similarly extended, sophisticated discussions of beings of reason after Suárez. Some offered what might be seen as developments of Suárez’s account, for example, Bartolomeo Mastri and Bonaventura Belluto, while others argued against Suárez and offered contrasting accounts, for example, Pedro Hurtado de Mendoza and John Punch.

h. Middle Knowledge, Grace, and Providence

As mentioned earlier, Suárez contributed to the raging debate about how to reconcile human free will with divine grace, foreknowledge, providence, and predestination that is known as the controversy De auxiliis. One of the developments for which he is best-known, albeit more in theological circles than philosophical circles, is a doctrine called Congruism. Suárez and Robert Bellarmine are the two Jesuits usually credited with formulating Congruism in detail.

To understand Congruism, it helps to step back and first look at the Molinism of which it is a species. A relatively straightforward way of reconciling human free will with the various theological doctrines in question is to provide a compatibilist account of free will, that is, an account of free will compatible with determinism. Luis de Molina, also a Jesuit, rejects that method, vigorously defending a libertarian account of free will. To be free means to be able to choose and able not to choose an option once all the prerequisites for acting have been posited. Related to this emphasis on libertarian freedom is the belief that the divine grace (whereby sinful humans can attain salvation) is not intrinsically efficacious. Rather, God’s grace is rendered efficacious by a human being’s free consent. These beliefs naturally raise questions about compatibility with traditional doctrines such as God’s foreknowledge and providential control. Molina famously proposes middle knowledge (scientia media) to show their compatibility. Middle knowledge—“middle” because it falls between the natural and free knowledge traditionally ascribed to God—is God’s prevolitional knowledge of what any possible free creature would do in any scenario. This ensures God’s foreknowledge, even of free actions, and allows God the appropriate providential control.

Suárez wholeheartedly agrees with Molina on the importance of libertarian freedom (DM 19.2-9). If, for example, God determines Peter to steal, then Peter cannot be held responsible for stealing. Suárez also agrees with the Molinist strategy of appealing to middle knowledge (De scientia Dei futurorum contingentium 2). There are, however, disagreements on this as well.

Molina argues that the reason God knows what creatures would freely do is because God’s infinitely surpassing the finite nature of creatures allows God to “super-comprehend” their natures, and thereby to know what they would do in given situations. Suárez, however, denies that any special explanation is needed for God’s middle knowledge. To know whether God can do something, we do not need to investigate God’s omnipotence. Rather, we merely need to establish that the thing is possible. Similarly, he thinks that to find out whether God can know something, we do not need to investigate God’s abilities. All we need to do is establish that the claim in question is true. If it is, then God’s omniscience ensures God’s knowledge of that claim. God knows propositions about what free creatures would do in the same way he knows any other proposition: by a simple intuition of its truth (De scientia Dei futurorum contingentium 1.8).

But Congruism’s main dispute with Molina concerns the reason for the efficaciousness of divine grace and the place of predestination. According to Molina, God bestows grace on, say, Mary and Martha, knowing that Mary will freely act so as to render the grace efficacious while Martha will not. Mary’s free acceptance is the sole extrinsic reason rendering the grace given to her efficacious, while Martha’s free failure to accept the grace is the sole extrinsic reason the grace given to her is sufficient rather than efficacious. Furthermore, God predestines Mary to salvation on the basis of his knowledge that she will freely accept his grace.

Suárez, however, attempts to carve out a position that is broadly Molinist but closer to the position of the Jesuits’ Thomist opponents. On his view, Mary’s free acceptance is not the only extrinsic reason rendering the grace efficacious. Nor does God predestine her to salvation on the basis of his middle knowledge. Rather, God antecedently elects some people to salvation but does not elect others. This election is gratuitous in the sense that it does not rest on God’s knowledge of any human merits. Having thus elected Mary, he then knows, thanks to his middle knowledge, what graces to bestow such that Mary will freely accept. If God had known that the grace given to Mary would not be such that Mary would freely accept it, then he would have given some other grace to her such that she would. In other words, the grace is efficacious not only because Mary freely accepts it, but also because of God’s antecedent decision to give whatever grace is needed to ensure Mary’s free acceptance, that is, his antecedent decision to give her a “congruous” grace (De concursu et auxilio Dei 3.14.9).

Suárez’s and Bellarmine’s Congruist version of Molinism was declared the official doctrine of the Jesuit order in 1613, supplanting Molina’s own version.

i. Natural Law and Obligation

A central question in the history of ethics goes back to Socrates in Plato’s Euthyphro: do the gods love the pious because it is pious or is it pious because the gods love it? Suárez directly addresses a variant of this question, albeit in a section whose title might not immediately reveal its philosophical significance: “Is the natural law truly a preceptive divine law?” (De legibus 2.6; henceforth, DL).

Scholastics customarily distinguish between natural and positive law. Explaining this distinction in neutral terms is made difficult by the widely varying theories of law on offer, but perhaps the central feature of natural law is an epistemological one. Natural law is law that is accessible to all human beings, regardless of their access to some holy text or other special revelation. As Suárez puts it, natural law “is that law which sits within the human mind in order to distinguish the fine from the wicked” (DL 1.3.9 or 10, depending on edition). Other features often associated with natural law are universality (that is, applicability across times and places) and being grounded in nature rather than in an arbitrary will. Positive law, on the other hand, is in some sense arbitrary law that is added to natural law and that is not, in principle, accessible apart from special promulgation or communication. A paradigm example is a law that requires people to drive on the right-hand side of the road. One might well think the choice between that law and the law requiring people to drive on the left is arbitrary, and one certainly cannot figure out which side of the road to drive on in a given country by introspecting. It is worth noting that Suárez recognizes two species of positive law: human and divine. In some contemporary discussions, positive law is characterized as human law and natural law as divine law. Understanding the terms in that way makes a hash of Suárez’s discussion, as well as those of other scholastics.

Now we can see why natural law is of special interest with respect to the question Socrates posed to Euthyphro. All scholastic theorists of law think that there are at least some cases where laws get their obligatory force solely from a legislator. That is what positive law is like, including divine positive law. A standard example is the ceremonial law of the Old Testament, which Christians thought obligatory in one time and place but not in another. The question is whether all law is like that. Natural law is, of course, the place to look if one is wondering if there are obligations that are not grounded in a superior’s command or prohibition.

Suárez stresses that the question to be asked is whether natural law is divine law in the sense that it is grounded in God qua legislator. He deems it entirely obvious that God is the cause of natural law in some sense, since God is the creator of everything, including any nature in which natural law might be grounded (DL 2.6.2). But even if God is the cause in that sense, there is still the question whether created nature already indicates what ought to be done and what ought to be avoided or whether a further legislative act prescribing or forbidding actions is needed. Suárez calls law in the former sense indicative law and law in the latter sense preceptive law.

One position is extreme naturalism or intellectualism, which Suárez attributes to Gregory of Rimini and several others (DL 2.6.3). On this view, no legislative act on God’s part is needed. Rather, natural law simply indicates what should be done or not done on the basis of what is intrinsically good or bad. Loss of life, for example, is bad: murdering King Duncan deprives him of life, and so Macbeth ought not to stab Duncan. On the extreme naturalism espoused by Gregory, Macbeth’s duty not to murder Duncan would obtain even if God had not given the Ten Commandments and even if God had not existed at all.

On the other side is an extreme voluntarism that says that natural law consists entirely in a command or prohibition coming from God’s will, a view that Suárez attributes to William of Ockham (DL 2.6.4). On this view, what one ought or ought not to do is wholly determined by God’s legislative acts and, furthermore, God’s legislative acts are unconstrained. That is, there is no act that is intrinsically bad such that God is compelled to prohibit it or even prevented from commanding it and no act that is intrinsically good such that God is compelled to command it. Had God commanded us to murder and steal, then doing so would have been obligatory and good.

Characteristically, Suárez charts a middle course. He first agrees with the extreme voluntarists that natural law is genuinely preceptive law, and argues that for a law to be genuine law and not just law in name it must be grounded in the legislative act of a superior (DL 2.6.5-10). The obligatory force of natural law comes from God’s will. Contra Gregory of Rimini, that obligation would not be present had God not legislated or not existed at all.

But then comes the crucial qualification that ends Suárez’s agreement with extreme voluntarism: “Second, I say that this will of God—that is, this prohibition or precept—is not the whole reason for the goodness and badness that is found in observing or transgressing the natural law, but that the natural law presupposes in the acts themselves a certain necessary fineness or wickedness and adjoins to these a special obligation of divine law” (DL 2.6.11). The extreme voluntarist thinks that God is free to command as he wishes, unconstrained by the natures of things. Suárez, on the other hand, thinks that God’s commands and prohibitions are constrained by natural goodness and badness. As befits a perfect being, God prohibits some actions precisely because they are evil. Suárez thinks it absurd to suggest that there are no actions such that they are too evil for God to command or even just to permit. To this extent, then, Suárez agrees with the naturalist; the obligations of natural law are rooted in natural goodness and badness.

That Suárez attempts to chart a middle course of this sort is uncontroversial. There are, however, controversies about how much Suárez allows on the natural, pre-legislative side, as well as how what is allowed on the natural side relates to the “special obligation” resulting from God’s legislative acts. On one interpretation, Suárez’s account is incoherent, because he gives a voluntarist account of obligation and yet grants that performing a naturally bad action is sufficient for blameworthiness. Since being blameworthy requires violating an obligation, Suárez is thereby implicitly committed to saying that natural badness is sufficient to give rise to obligations. But this contradicts his voluntarist account of obligation. One way to avoid the charge of incoherence would be to understand Suárez as giving a voluntarist account of the “special obligation” imposed by natural law, and to understand that obligation as an additional obligation. In other words, the natural goodness and badness intrinsic to actions is sufficient to give rise to one sort of obligation and consequently sufficient to make agents who observe the obligations praiseworthy and agents who violate them blameworthy. Natural law then adds a further obligation to that natural obligation in the same way that human legislators might add further obligation by prohibiting what is already morally prohibited. As Suárez notes, there is nothing incoherent about adding one obligation to another (DL 2.6.12).

Noting that Suárez uses both the terms “duties” (“debita”) and “obligations” (“obligationes”), an alternative strategy is to interpret him as giving a voluntarist account of all obligation while granting that natural goodness and badness give rise to duties. In other words, natural goodness and badness would be sufficient to give rational agents duties to act in certain ways and not in other ways, but none of this would count as genuine obligation. Obligation, on this view, is a peculiar sort of force that arises from someone with legitimate authority issuing a command or prohibition to some subject to that authority. A task for defenders of this interpretation is to spell out what duties and obligations are, such that being subject to a duty is not sufficient for being under obligation.

j. Political Authority

One could hold the view that what gives some individuals political power over other people is that God bestowed such authority on them directly. Suárez rejects that view. He insists that men are by nature free and subject to no one (DL 3.1.1). (I shall use the term “men” in this section, since Suárez does not grant the same natural liberty to women and children. See DL 3.2.3.) Europeans had, of course, stumbled into the Americas shortly before Suárez’s time and so one of the questions that exercised his contemporaries concerned the standing of Native Americans, especially in light of Aristotle’s infamous claim that some human beings are natural slaves. Suárez has no use for the suggestion that Native Americans are natural slaves. Men are naturally disposed to be free and being free is one of their perfections; suggesting that all the people in some region or other happen to have been born “monsters,” that is, with defective natures, is incredible.

Suárez does, however, think that some rulers have legitimate authority over men. Where does that authority come from? The short answer is that political communities are needed and so men consent to join together in such communities, and in a political community the power to govern and to look after the common good of the community must be vested in an authority. Suárez gives two primary reasons why political communities are needed (DL 3.1.3). First, he agrees with Aristotle that human beings are social animals that desire to live in community. The most natural community is a family, but this is an imperfect community, insufficient to include within it all the skills and knowledge needed for life. Consequently, multiple families need to join together in a perfect (that is, complete) community. Second, anticipating Hobbes’ famous point, Suárez notes that individual families not joined together in a political community would be unlikely to remain at peace and would have no means of averting or avenging wrongs (see also Defensio fidei catholicae 3.1.4).

A lively question in Suárez’s day was whether human beings in the state of innocence would have lived in political communities or whether such communities only became necessary after the Fall with its introduction of sin. The second reason given above, in particular, would not seem to apply in the state of innocence. Suárez, however, is confident that political communities would have formed even in the state of innocence, had it continued (De opere sex dierum 5.7). Human beings are social animals, whether in the state of innocence or not. Furthermore, Suárez thinks that even in the state of innocence some people would surpass others in virtue and knowledge. So even though joining together in a political community might not be necessary for mere survival in that state, it would, nonetheless, be most useful for the sharing of knowledge and for encouragement to greater virtue.

That a political community requires some authority to govern it Suárez takes as well-nigh self-evident (DL 3.1.4-5). He does note, however, that one reason a governing authority is needed is that each individual of a political community looks after his or her own good. Individuals’ goods, however, sometimes conflict with the common good. Consequently, a government looking after the common good is needed. Following Aristotle, Suárez thinks that a monarchy best fills the role of governing authority (DL 3.4.1). He does not think that monarchy is dictated by natural law, however, and grants that other forms of government, including democracy, may be “good and useful.” Which form of government to adopt is left to human choice.

A political community does not result merely from a group of families living in proximity, even if they consequently become familiar friends. The formation of a political union requires an “explicit or tacit pact” to help each other and a subordination of the families and individuals to some governing authority (De opere sex dierum 5.7.3; cf. DL 3.2.4). The details of how this works are subject to scholarly dispute. Some commentators argue that on Suárez’s view, the community’s consent creates a political community but does not directly cause obligation to a political authority. Others argue that the consent does directly cause the obligation and authority.

An important feature of Suárez’s view is that political power does not just reside in the community initially. It always remains there. As he puts it, “after that power has been transferred to some individual person, even if it has been passed on to a number of people through various successions or elections, it is still always regarded as possessed immediately by the community” (DL 3.4.8). Suárez is, of course, aware that the needed stability of political communities would be in question if communities could withdraw their transfer of power to the government at every whim. So even though in some sense the power always remains in the community, Suárez argues that the transferred power may not ordinarily be withdrawn (DL 3.4.6). Suárez recognizes exceptions, however. Should the government become tyrannical, the door may be opened to legitimate revolt and even tyrannicide (Defensio fidei catholicae 6.4 and De charitate 13.8). This is the doctrine that gained Suárez the ire of James I of England.

4. Legacy

Although largely unknown among non-specialists (at least in Anglo-American philosophy), Suárez’s influence has never been in doubt among historians of early modern philosophy. There are difficulties with establishing the extent of his influence. First, many of the canonical early modern figures seldom cite their sources. Descartes is perhaps best-known for this, but citations are hardly abundant in any of the other extrascholastic early moderns such as Spinoza, Malebranche, and Locke. The absence of citations is, of course, especially striking in comparison to the texts of Suárez and his fellow scholastics, which are replete with them. Second, encountering an idea or term in a modern text that looks very much like something in Suárez is insufficient to establish Suárezian influence, since there were hundreds of other scholastic theologians and philosophers, many of them also quite influential and many, especially Suárez’s Jesuit confrères, saying things more or less similar to what Suárez is saying and making use of the same terms and distinctions standard in scholastic circles.

Consequently, there is substantial debate about just how indebted Descartes, for example, is to Suárez. Some historians emphasize that his philosophical formation would have occurred in his education at the Jesuit La Flèche and that he himself writes in a letter that the first seeds of everything he learned came from the Jesuits. Other historians, however, point out that in a different letter he claims to remember only two Jesuits, Antonio Rubio and Francisco de Toledo, and that he does not in general seem to have been a very attentive reader of other authors’ texts.

However much uncertainty there may be about the extent of Suárez’s influence, it is certain that Hobbes, Descartes, Malebranche, Leibniz, and Berkeley all mention Suárez explicitly at least once and say a variety of things that might well be thought to borrow from or be inspired by Suárez. Wolff, too, cites Suárez and thought highly of him, to the extent that Wolff has been characterized as begotten of Suárez. Insofar as there is a path from Wolff to Kant, there might, then, reasonably be thought a path from Suárez to Kant as well.

The story of influence just told is the one most frequently encountered. It omits, however, Suárez’s main influence. Due to the vagaries of academic fashion and dubious historiographies, large swaths of early modern philosophy receive virtually no attention today. Yet Suárez was most influential in these neglected realms, which saw the rise of a Suárezian school of philosophy.

Suárez and Gabriel Vasquez were seen as rival fathers of Jesuit theology and philosophy, leading to near-endless discussion of Suárez’s views by early modern Jesuits. Pedro Hurtado de Mendoza and Rodrigo de Arriaga, for example, discuss Suárez extensively in their own work (in fact, they are sometimes treated as faithful Suárezians, although that is a mistake). It must be remembered, too, that the Jesuits were active worldwide, leading to a remarkably wide dissemination of Suárez’s views. His influence was perhaps most profound in the early modern universities of Latin America, but Jesuit missionaries also spread his work to Africa and Asia, even starting a Chinese translation of DM in the seventeenth century.

Outside the Jesuit order, Suárez’s stature in Scotism—at its height in early modern Europe—is also striking. Texts by Scotist authors often include frequent and detailed discussion of Suárez’s views. His influence even transcended the main religious division of modern Western Europe. Protestant scholastics such as Francis Turretin and David Hollaz, departing from Luther’s contempt for scholasticism, borrowed freely from Suárez. Suárez is sometimes described as providing the received metaphysics for seventeenth-century Lutheran universities.

Contemporary readers usually come to Suárez via the canonical early modern philosophers such as Descartes and Leibniz, but noting these additional lines of influence does two things. In the first place, it helps provide a more accurate picture of the extent of Suárez’s influence. But, second, it also gives some indication of how many philosophers and theologians identified in Suárez a thinker of the highest caliber, with whose work it was worth engaging at length.

5. References and Further Reading

a. Primary Sources

  • Suárez, Francisco. Opera omnia. Paris: Louis Vivès, 1856-78.
    • This standard edition is the most readily available, including freely online. It is not, however, a critical edition and does not include quite all of Suárez’s works. All major philosophical works are included. No English translation of a complete work has been published. Significant portions of some works, especially Disputationes metaphysicae (DM), have been translated, however, and additional translations in various stages of polish are available online.

b. Secondary Sources

This abbreviated bibliography focuses on English works published in the last several years.

  • Doyle, John P. Collected Studies on Francisco Suárez, S. J. (1548-1617). Edited by Victor M. Salas. Leuven: Leuven University Press, 2010.
    • Doyle has perhaps done more than anyone else for Suárez studies in the U.S.A. This is a collection of essays drawn from forty years of work. The theme that receives the most attention is Suárez’s conception of metaphysics and being, but the volume also includes several papers on Suárez’s account of law and human rights.
  • Fichter, Joseph H. Man of Spain: Francis Suárez. New York: Macmillan, 1940.
    • Still the standard English biography of Suárez, although it is rather hagiographical by contemporary standards.
  • Freddoso, Alfred J. “God’s General Concurrence with Secondary Causes: Why Conservation Is Not Enough.” Philosophical Perspectives 5 (1991): 553-85.
    • A superb account of Suárez’s arguments in DM 22 against mere conservationism, that is, his arguments for God’s constant concurrence with the actions of created things.
  • Freddoso, Alfred J. “Introduction: Suárez on Metaphysical Inquiry, Efficient Causality, and Divine Action.” On Creation, Conservation, and Concurrence: Metaphysical Disputations 20-22, xi-cxxi. By Francisco Suárez. South Bend: St. Augustine’s Press, 2002.
    • The focus is on Suárez’s account of creation, conservation, and concurrence, but the incisive introduction to Suárez’s metaphysics in general and account of efficient causation more particularly makes this an especially valuable essay.
  • Gracia, Jorge J. E. “Suárez’s Conception of Metaphysics: A Step in the Direction of Mentalism?” American Catholic Philosophical Quarterly 65.3 (1991): 287-309.
    • Argues for a realist interpretation of Suárez’s account of metaphysics.
  • Gracia, Jorge J. E. “Francisco Suárez: The Man in History.” American Catholic Philosophical Quarterly 65.3 (1991): 259-66.
    • A brief, accessible introduction to Suárez, setting him in his historical context.
  • Heider, Daniel. Universals in Second Scholasticism: A Comparative Study with Focus on the Theories of Francisco Suárez S. J. (1548-1617), Joao Poinsot O. P. (1589-1644), and Bartolomeo Mastri da Meldola O. F. M. Conv. (1602-1673)/Bonaventura Belluto O. F. M. Conv. (1600-1676). New York: John Benjamins Publishing Company, 2014.
    • The title is an accurate guide. Exemplary historical scholarship, but challenging reading for those unfamiliar with scholastic terminology.
  • Hill, Benjamin, and Henrik Lagerlund, eds. The Philosophy of Francisco Suárez. Oxford: Oxford University Press, 2012.
    • One of the first volumes that a student of Suárez should turn to; includes papers on a variety of topics.
  • Novotný, Daniel D. Ens rationis from Suárez to Caramuel: A Study in Scholasticism of the Baroque Era. New York: Fordham University Press, 2013.
    • A fine study of Suárez’s account of beings of reason, followed by discussions of the accounts offered subsequently by Hurtado, Mastri/Belluto, and Caramuel. Novotný’s work is marked by both erudite historical scholarship and keen philosophical analysis.
  • Penner, Sydney. “Suárez on the Reduction of Categorical Relations.” Philosophers’ Imprint 13 (2013): 1-24.
    • Argues that Suárez gives a realist but reductionist account of relations, albeit with some problematic results.
  • Perler, Dominik. “Suárez on Consciousness.” Vivarium 52.3-4 (2014): 261-86.
    • An illuminating examination of Suárez’s account of our access to our own acts of perception and thinking. Looks at his distinction between first-order sensory consciousness and second-order intellectual consciousness, as well as what explains the unity of consciousness.
  • Salas, Victor and Robert Fastiggi, eds. A Companion to Francisco Suárez. Brill’s Companions to the Christian Tradition 53. Leiden: Brill, 2015.
    • Covers a number of areas that are neglected by the Schwartz and Hill/Lagerlund collections.
  • Schwartz, Daniel, ed. Interpreting Suárez: Critical Essays. Cambridge: Cambridge University Press, 2012.
    • An excellent set of essays on Suárez, treating a selection of key topics.
  • Shields, Christopher. “Virtual Presence: Psychic Mereology in Francisco Suárez.” In Partitioning the Soul: Ancient, Medieval, and Early Modern Debates, edited by K. Corcilius and D. Perler, 199-219. Berlin: W. de Gruyter, 2014.
    • Examines Suárez’s account of the soul and its parts and what the talk of parts comes to.
  • Shields, Christopher and Daniel Schwartz. “Francisco Suárez.” Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta. 2014. Accessed 30 Mar. 2015.
    • A fine survey of Suárez’s life and philosophy that covers some topics neglected in the present entry.

 

Author Information

Sydney Penner
Email: sfp@sydneypenner.ca
Asbury University
U. S. A.

Scientific Representation

To many philosophers, science aims to represent reality. For example, some philosophers of science would say Newton’s theory of gravity uses the theoretical terms ‘center of mass’ and ‘gravitational force’ in order to represent how a solar system of planets behaves—the changing positions and velocities of the planets, but not their color changes. It is very difficult, however, to give a precise account of what scientific representation is. Roughly, scientific representation is the important and useful relationship that holds between scientific sources (for example, models, theories, and data models) and their targets (for example, real-world systems and theoretical objects). There is a long history within philosophy of describing the nature of the representational relationship between concepts and their objects, but the discussion of scientific representation began in twentieth-century philosophy of science.

There are a number of different questions one can ask when thinking about scientific representation. The question which has received the most attention, and which will receive the most attention here, is what might be called (following Callender and Cohen 2006, 68) the “constitution question” of scientific representation: “In virtue of what is there representation between scientific sources and their targets?” This has been answered in a wide variety of ways, with some arguing that it is a structural identity or similarity which ensures representation and others arguing that there is only a pragmatic relationship. Other questions about scientific representation relate more specifically to the ways in which representations are used in science. These questions are more typically asked directly about certain sorts of representational objects, especially scientific models, as well as from the perspective of sociology of science.

Table of Contents

  1. Substantive Accounts
    1. Structuralist Views
      1. Isomorphism
      2. Partial Isomorphism
      3. Homomorphism
    2. Similarity Views
    3. Critiques of Substantive Accounts
  2. Deflationary and Pragmatic Accounts
    1. DDI
    2. Inferential
      1. Suárez
      2. Contessa
    3. Agent-Based Versions of Substantive Accounts
      1. Agent-Based Isomorphism
      2. Agent-Based Similarity
    4. Gricean
    5. Critiques of Deflationary and Pragmatic Accounts
  3. Model-Based Representation
    1. Models as Representations (and More)
    2. Model-Building
    3. Idealization
  4. Sociology of Science
    1. Representation and Scientific Practice
    2. Circulating Reference
    3. Critiques of Sociology of Science
  5. References and Further Reading

1. Substantive Accounts

Scientific representation became a rising topic of interest with the development of the semantic view of theories which was itself developed partly as a response to the syntactic view of theories. Briefly, on the syntactic view, theoretical terms are defined in virtue of relationships of equivalence with observational entities (Suppe 1974). This was done through the creation of a first-order predicate calculus which contained a number of logical operators as well as two sets of terms, one set filled with theoretical terms and the other with observational terms. Each theoretical term was defined in terms of a correspondence rule linking it directly to an observational term. In this way a theoretical term such as ‘mass’ was given an explicit observational definition. This definition used only phenomenal or physical terms [such as “Drop the ball from the tower in this way” and “observe the time until the ball hits the ground”] plus logical terminology [such as ‘there is’ and ‘if…then’]; but the definition did not use any other theoretical terms [such as ‘gravitational force’ or ‘center of mass’]. The logical language also included a number of axioms, which were relations between theoretical terms. These axioms were understood as the scientific laws, since they showed relationships that held among the theoretical terms. Given this purely syntactic relationship between theory and observed phenomena, there was no need to give any more detailed account of the representation relationship that held between them. The correspondence rule syntactically related the theory with observations.

The details of the rejection of the syntactic view are beyond the scope of this article, but suffice it to say that this view of the structure of theories was widely rejected. With this rejection came a different account of the structure of theories, what is often called the semantic view. Since there was no longer any direct syntactic relationship between theory and observation, it became of interest to explain what relationship does hold between theories and observations, and ultimately the world.

Before examining the accounts of scientific representation that arose to explain this relationship, we should get a basic sense of the semantic view of theories. The common feature of the semantic approach to scientific theories was that theories should not be thought of as a set of axioms together with syntactically defined correspondences between theory and observation. Instead, theories are “extralinguistic entities which may be described or characterised by a number of different linguistic formulations” (Suppe 1974, 221). That is to say, theories are not tied to a single formulation or even to a particular logical language. Instead, theories are thought of as being a set of related models. This is better understood through Bas van Fraassen’s (1980) example.

Van Fraassen (1980, 41-43) asks us to consider a set of axioms which are constituents of a theory which will be called T1:

A0 There is at least one line.

A1 For any two lines, there is at most one point that lies on both.

A2 For any two points, there is exactly one line that lies on both.

A3 On every line there lie at least two points.

A4 There are only finitely many points.

In Figure 1, we can see a model which shows that T1 is consistent, since each of the axioms is satisfied by this model.

[Figure 1 – Model of Consistency of T1]

Notice this is just one model which shows the consistency of T1, since there are other models which could be constructed to satisfy the axioms, like van Fraassen’s Seven Point Geometry (1980, 42). Note that what is meant here by ‘model’ is whatever “satisfies the axioms of a theory” (van Fraassen 1980, 43). Another, perhaps more intuitive, way of expressing this is that a model of a theory T is any structure which would make T true if it were the entirety of the universe. For example, if Figure 1 were the entirety of the universe, then clearly T1 would be true. Notice also that, on the semantic view, the axioms themselves are not central in understanding the theory. Instead, what is important in understanding a theory is understanding the set of models which are each truth-makers for that theory, insofar as they satisfy the theory.
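This model-theoretic notion of satisfaction can be made concrete with a small computational sketch. The Python snippet below is only an illustration and is not drawn from van Fraassen: it encodes a hypothetical three-point, three-line structure (a stand-in, not Figure 1 itself) and checks axioms A0-A4 directly, in the spirit of asking whether a given structure is a model of T1.

    from itertools import combinations

    # A hypothetical finite structure: three points and three lines,
    # each line containing exactly two points.
    points = {"p1", "p2", "p3"}
    lines = [frozenset({"p1", "p2"}), frozenset({"p2", "p3"}), frozenset({"p1", "p3"})]

    a0 = len(lines) >= 1                                               # A0: at least one line
    a1 = all(len(l1 & l2) <= 1 for l1, l2 in combinations(lines, 2))   # A1: two lines share at most one point
    a2 = all(sum(1 for l in lines if {p, q} <= l) == 1
             for p, q in combinations(points, 2))                      # A2: two points lie on exactly one line
    a3 = all(len(l) >= 2 for l in lines)                               # A3: each line has at least two points
    a4 = True                                                          # A4: finitely many points (finite set by construction)

    print(all([a0, a1, a2, a3, a4]))  # True: this structure satisfies the axioms of T1

A structure for which such a check comes out true is, in the semantic-view sense, a model of T1.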

This account of the structure of theories can be applied to an actual scientific theory, like classical mechanics. Here, following Ronald Giere (1988, 78-79), we can take up the example of the idealized simple systems in physics. These are, he argues, models for the theory of classical mechanics. For example, the simple harmonic oscillator is a model which is a truth-maker for (part of) classical mechanics. The simple harmonic oscillator can be described as a machine: “a linear oscillator with a linear restoring force and no others” (Giere 1988, 79); or mathematically: F = -kx. This model, were it the entirety of the universe, would make classical mechanics true.
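The same point can be put in back-of-the-envelope computational terms. The sketch below is purely illustrative, with arbitrary parameter values, and is not Giere’s own presentation: it integrates the restoring-force law F = -kx with a crude Euler step, generating the kind of idealized oscillatory behaviour the model describes.

    # A minimal sketch of the simple harmonic oscillator, F = -kx.
    # Parameter values are arbitrary and purely illustrative.
    k, m = 1.0, 1.0      # spring constant and mass
    x, v = 1.0, 0.0      # initial displacement and velocity
    dt = 0.01            # time step

    positions = []
    for _ in range(1000):
        a = -k * x / m   # Newton's second law with the linear restoring force
        v += a * dt
        x += v * dt
        positions.append(x)

    print(min(positions), max(positions))  # displacement oscillates about zero

Nothing in the sketch refers to any actual spring; taken on its own, it is simply a structure which, were it all there is, would make this fragment of classical mechanics true.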

The targets of theoretical models on the semantic view are not always real-world systems. On some views, there is at least one other set of models which serve as the targets for theoretical models. These are variously called empirical substructures (van Fraassen 1980) or data models. These are ways of structuring the empirical data, typically with some mathematical or algebraic method. When scientists gather and describe empirical data, they tend to think of and describe it in an already partially structured way. Part of this structure is the result of the way in which scientists measure the phenomena while being particularly attentive to certain features (and ignoring or downplaying others). Another part of this structuring is due to the patterns seen in the data which are in need of explanation. On some views, most notably van Fraassen’s (1980), the empirical model is the phenomenon which is being represented. That is to say, there is no further representational relationship holding between data models and the world, at least as far as scientific practice is concerned (for a discussion of this, see Brading and Landry 2006). Others argue that the relationship between theoretical models and data models is only one of a number of interesting representational relationships to be described, which set themselves up in a hierarchical structure (French and Ladyman 1999, 112-114).

With this semantic account of the structure of scientific theories in place, there arose an interest in giving an account of the representational relationship. The views which arose with the semantic view of theories are here called “substantive,” because they all attempt to give an account of the representational relationship which looks to substantive features of the source and target. Another way of putting this (following Knuuttila 2005) is to say that the substantive accounts of representation seek to explain representation as a dyadic relationship which holds between only the source and the target. As will be discussed below, this is different from the deflationary and pragmatic accounts, which view scientific representation as at least a triadic relationship insofar as they add an agent to the relationship. There are two major classifications of substantive accounts of the representational relationship. The first are the structuralist views, which are divided into three main types: isomorphism, partial isomorphism, and homomorphism. The second category is the similarity views.

a. Structuralist Views

Generically, the structuralist views claim that scientific representation occurs in virtue of what might be called “mapping” relationships that hold between the structure of the source and the structure of the target, i.e. the parts of the theoretical models point to the parts of the data models.

i. Isomorphism

Isomorphism holds between two objects provided that there is a bijective function—that is, a function both injective (or one-to-one) and surjective (or onto)—between the source and the target. Formally, suppose there are two sets, set A and set B. Set A is isomorphic to set B (and vice versa) if and only if there is a function, call it f, from A to B which maps each member of set A to exactly one member of set B, maps no two members of A to the same member of B, and maps onto every member of set B.

To make the point more clear, let us suppose that set A is full of the capital letters of the English alphabet and set B is full of the natural numbers 1 through 26. We could create a function which, when given a letter of the alphabet, will output a number. Let’s make the function easy to understand and let f(A) = 1, f(B) = 2, and so on, according to typical alphabetical order. This function is bijective because each letter is mapped to one and only one number and every number (1 – 26) is being picked out by one and only one letter. Notice that since we can draw a bijective function from the letters to the numbers, we can also create one from the numbers to the letters: most simply, let f’(1) = A, f’(2) = B, and so on. Of course, there is nothing apart from the ease of our understanding which requires that we link A and 1, since we could have linked 1 with any letter and vice versa.
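As a quick supplementary illustration (a sketch of the formal point, not part of the original example), the letter-to-number mapping and its bijectivity can be written out in a few lines of Python.

    import string

    letters = list(string.ascii_uppercase)   # 'A' ... 'Z'
    numbers = list(range(1, 27))             # 1 ... 26

    # The function f with f('A') = 1, f('B') = 2, and so on.
    f = dict(zip(letters, numbers))

    injective = len(set(f.values())) == len(f)        # no two letters share a number
    surjective = set(f.values()) == set(numbers)      # every number 1-26 is picked out
    print(injective and surjective)                   # True: f is a bijection

    # Because f is bijective, the inverse mapping f' exists as well.
    f_inv = {n: letter for letter, n in f.items()}
    print(f_inv[1], f_inv[2])                         # A B

Any other pairing of the letters with the numbers would serve just as well, which is the point about the arbitrariness of linking A with 1.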

Isomorphism has frequently been used to explain representation (see van Fraassen 1980, Brading and Landry 2006). Since theories, on the semantic view, are a group of related models, there is a certain sort of structure that each of these models has. Most of the time, they are thought of as mathematical models though they need not be only mathematical as long as they have a structure. Van Fraassen also identifies what he calls “appearances,” which he defines as “the structures which can be described in experimental and measurement reports” (1980, 64). So, the appearances are the measurable, observable structures which are being represented (the targets of the representation). On van Fraassen’s account (1980), a theory will be successfully representational provided that there is an isomorphic relationship between the empirical substructures (the sources) and the appearances (as targets), and an isomorphic relationship between the theoretical models (as sources) and the empirical substructures (as targets). (Or at any rate, this is how he has commonly been interpreted (see Ladyman, Bueno, Suárez, and van Fraassen 2010)). As described by Mauricio Suárez (2003, 228), this isomorphism between the models shows that there is an identity that holds between the “relational framework of the structures” of the source and the target. And it is this relational framework of structures which is being maintained.

So, on the isomorphism view of scientific representation, some scientific theory represents some target phenomena in virtue of a bijective mapping between the structures of a theory and data, and a bijective mapping between the data and the phenomena. Notice that on the isomorphic view, the bijections which account for representation are external to the theoretical language. That is to say that the relationship that holds between the theory and the phenomena is not internal to the language in which the theory is presented. This is an important feature of this account because it allows for a mapping between very different kinds of structures. Presumably the (mainly mathematical) structures of theories are quite different from the structures of data models and are certainly very different from the structures of the phenomena (because the phenomena are not themselves mathematical entities). However, since the functions are external, we can create a function which will map these very different types of structures to one another.

ii. Partial Isomorphism

Isomorphism has much to recommend it, especially when focusing on those theories which are expressed mathematically. This is especially true in more mathematically-driven fields like physics. It seems that the mathematical models in physics represent the structure that holds between various real-world phenomena. For example, F = ma represents the way in which certain features of an object (its mass, the rate at which it is being accelerated) correspond to another (the net force acting on it). However, many philosophers (for example, Cartwright 1983 and Cartwright, Shomar, and Suárez 1995) have pointed out that there are cases where a theory or model truly represents some phenomena even though there are features of the phenomena which do not have any corresponding structure in the theory or model, due to abstraction or idealization.

Take a rather simple example, the billiard ball model of a gas (French and Ladyman 1999). Drawing on Mary Hesse’s (1966) important work on models, French and Ladyman argue that there are certain features of the model which are taken to be representative, for example, the mass and the velocity of the billiard balls represent the mass and velocity of gas atoms. There are also certain features of the billiard balls which are non-representational, for example, the colors of the balls. Most importantly, though, as a critique of isomorphism, there are typically also some undetermined features of the balls. That is to say, for some of the features of the model, it is unknown whether they are representational or not. (For a more detailed scientific example, see Cartwright, Shomar, and Suárez 1995).

To respond to problems of this sort, many (French and Ladyman 1999; Bueno 1997; French 2003; da Costa and French 2003) have argued for partial isomorphism. The basic idea is that there are partial structures of a theory for which we can define three sets of members for some relation. The first set will be those members which do have the relevant relation, the second set will be those members which do not have that relation, and the third will be those members for which it is unknown whether or not they have that relation. It is possible to think of each of these sets of individuals as being a relation itself (since a relation, semantically speaking, is extensionally defined), and so we could draw a bijective function between these relations. But, as long as the third relation (the third set of individuals for which it was unknown whether or not they had the relation) is not empty, then the isomorphism will be only partial because there are some relations for which we are unsure whether or not they hold in the target.

As a more concrete example, consider the billiard ball and atom example from above. In order for there to be a partial isomorphism between the two, we must be able to identify two partial structures of each system, that is, a partial structure of the billiard ball model and a partial structure of the gas atoms. Between these partial structures, there must be a bijective function which maps relations of the model to relations of the gas-system. For example, the velocity of the billiard balls will be mapped to the velocity of the respective atoms. There must be a second function which maps the non-representational relations of the model to features of the gas-system which are not being represented. For example, a non-representative feature of the model, like the color of the billiard balls, will be mapped to some feature of the system which is not being represented, like the non-color of the atoms. All the same, this will still remain partial because there will be certain relations that the model has which are unknown (or undefined) in relation to the gas-system.
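The following sketch illustrates, under simplified assumptions, the three-part division that defines a partial relation on this sort of account: pairs known to be representational, pairs known not to be, and pairs whose status is undetermined. The feature names are invented for illustration.

```python
# A partial relation over the features of the billiard-ball model, divided into:
# R_pos: pairs (model feature, gas feature) known to be representational,
# R_neg: pairs known not to be representational,
# R_und: pairs whose representational status is undetermined.
R_pos = {("ball mass", "atom mass"), ("ball velocity", "atom velocity")}
R_neg = {("ball color", None)}          # the colors of the balls represent nothing
R_und = {("ball elasticity", None)}     # status left open by the modelers

def is_partial(R_pos, R_neg, R_und):
    """An isomorphism between such structures is at best partial when the
    undetermined component is non-empty."""
    return len(R_und) > 0

print(is_partial(R_pos, R_neg, R_und))  # True: the mapping is only a partial isomorphism
```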

iii. Homomorphism

Homomorphism, defended by Bartels (2006), is more general than isomorphism insofar as all isomorphisms are homomorphisms, but not all homomorphisms are isomorphisms. Homomorphisms still rely on a function being drawn between two sets, but they do not require that the function be bijective; that is, the function need not be one-to-one or onto. This means that several parts or relations of the theory may be mapped onto the same part or relation of the target system, and that there may be parts and relations of the target system onto which nothing is mapped. Homomorphisms thereby allow a great deal of flexibility with regard to misrepresentations.
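A quick sketch of the contrast: the mapping below is a perfectly good function from model elements to target elements, but it is neither injective (two model elements share a target) nor surjective (one target element is never hit), so it is a homomorphism-style mapping rather than an isomorphism. The element names are invented, and the relation-preservation requirement of a full homomorphism is omitted for brevity.

```python
model_elements = {"m1", "m2", "m3"}
target_elements = {"t1", "t2", "t3", "t4"}

# Every model element gets exactly one image, but the map is not bijective:
# m1 and m2 collapse onto t1 (not injective), and t4 is never reached (not surjective).
h = {"m1": "t1", "m2": "t1", "m3": "t2"}

injective = len(set(h.values())) == len(h)        # False
surjective = set(h.values()) == target_elements   # False
print(injective, surjective)                      # False False
```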

b. Similarity

Isomorphism (and the other -morphisms) places a fairly strict requirement on the relevant constitutive features of representation, which are on these views structural. But, as Giere points out (1988, 80-81), this is often not the relevant relationship. Oftentimes, scientists are working with theories or models which are valuable not for their salient structural features, but rather for some other reason. For example, when modeling the behavior of water flowing through pipes, scientists often model the water as a continuous fluid, even though it is actually a collection of discrete molecules (Giere 2004). Here, the representational value of the model is not between the structure of the model and the structure of the world (since water is structurally not continuous, but rather a collection of discrete molecules). Instead, the relevant representational value comes from a more general relationship which holds between the behavior of the modeled and real world systems. Giere suggests that what is needed is “a weaker interpretation of the relationship between model and real system” (1988, 81). His suggestion is that we explain representation in virtue of similarity. On his account a model will represent some real world system insofar as it is similar to the real world system. Notice that this is a much weaker account of representation than the structural accounts, since similarity includes structural similarities, and so encompasses isomorphism, partial isomorphism, and homomorphism.

Of course, if we try hard enough, we can notice similarities between any two objects. For example, any two material objects are similar at least insofar as they are each material. Thus, Giere suggests that an account of scientific representation which appeals to similarity requires an “implicit” (or explicit) “specification of relevant respects and degrees” (81). Respects indicate the relevant parts and ways in which the model is taken to be representative. Perhaps it is some dynamical relationship expressed in an equation; perhaps it is some physical similarity that exists between some tangible model and some target object (for example, a plastic model of a benzene ring); perhaps it is the way in which two parts of a model are able to interact with one another, which shows how two objects in the target system might interact (like the relevant behavior of the model of water flowing through pipes). Claims about the respects of similarity are limited only by what scientists know or take to be the case about the model and the target system. For example, a scientist could not claim that there was a similarity between the color of a benzene model and a benzene ring since benzene rings have no color. Similarly, a scientist could not claim that there is similarity between the color of a mathematical model and the color of a species of bacteria since a mathematical model does not have any color. Notice that it is insufficient merely to specify the respects in which a model is similar since similarity can come in degrees. Of course, there is a whole spectrum of degrees of similarity on which any particular similarity can fall. A source can be anywhere from an extremely vague approximation of its target to being nearly identical to its target (what Giere calls “exact” (1988, 93)) and everywhere in between.

Giere’s own example is that, “The positions and velocities of the earth and moon in the earth-moon system are very close to those of a two-particle Newtonian model with an inverse square central force” (1988, 80). Here, the relevant respects are the position and velocity of the earth and moon. The relevant degree is that the positions and velocities in the earth-moon system are “very close” to the two-particle Newtonian model. These respects and degrees thus give us an account of how we should think of the similarity between the model and the target system.
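To make the “respects and degrees” idea concrete, here is an illustrative sketch (not Giere’s own formalism): similarity is claimed only for named respects, and the degree is expressed as a tolerance on the relative difference between model and target values. The numerical values are approximate placeholders.

```python
def similar_in_respects(model, target, respects, tolerance):
    """Return True if model and target agree, within the given relative
    tolerance (the 'degree'), on every named respect."""
    return all(
        abs(model[r] - target[r]) <= tolerance * abs(target[r])
        for r in respects
    )

# Placeholder values for the earth-moon system and a two-particle Newtonian model.
newtonian_model = {"moon_distance_km": 384_000, "moon_speed_km_s": 1.02}
earth_moon_data = {"moon_distance_km": 384_400, "moon_speed_km_s": 1.022}

# "Very close" cashed out, for illustration, as agreement to within one percent.
print(similar_in_respects(newtonian_model, earth_moon_data,
                          respects=["moon_distance_km", "moon_speed_km_s"],
                          tolerance=0.01))  # True
```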

Giere uses similarity to describe the relationship between models and the real-world systems they represent, and sometimes between different models (one model may be a generalization of another, and so on). Theories themselves are constituted by a set of these models as well as some hypotheses that link the models to the real world which define the respect and degree of the similarity between the models and their targets.

More recently, Michael Weisberg (2013) has argued for a similarity account of representation. In brief, his view requires that two sets of things be distinguished in both source and target: the attributes and the mechanisms. Once these are distinguished, an equation can be written in which the shared attributes and mechanisms are the intersection of the attributes of the model and of the target system, and the intersection of the mechanisms of the model and of the target system. The dissimilarities can be identified in a similar fashion. He adds weighting terms and functions to these sets, which allow users to indicate which similarities are more important than others. Rewriting the equation as a ratio between similarities and dissimilarities results in a method by which we can make comparative judgments about different models. In this way, we can say, for example, that one model is more or less similar to a given target than another.
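The following is a simplified sketch in the spirit of Weisberg’s weighted, feature-matching approach, not a reproduction of his exact equation: shared features count toward similarity, features possessed by only one side count against it, and user-supplied weights mark which features matter most. The feature sets and weights are invented.

```python
def weighted_similarity(model_features, target_features, weights):
    """Ratio of weighted shared features to weighted shared-plus-differing
    features; 1.0 means the model matches the target in every weighted respect."""
    shared = model_features & target_features
    model_only = model_features - target_features
    target_only = target_features - model_features
    w = lambda fs: sum(weights.get(f, 1.0) for f in fs)
    return w(shared) / (w(shared) + w(model_only) + w(target_only))

# Invented feature sets for a predator-prey model and its target system.
model = {"prey growth", "predation", "predator death", "continuous populations"}
target = {"prey growth", "predation", "predator death", "discrete individuals", "seasonal migration"}
weights = {"predation": 3.0, "seasonal migration": 0.5}

print(round(weighted_similarity(model, target, weights), 2))  # 0.67
```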

c. Critiques of Substantive Accounts

While similarity and isomorphism continue to have some support in the contemporary literature (especially in modified versions; see below, section 2c), the versions described above have faced serious criticisms. One of the most common arguments against the substantive views is that they are unable to handle misrepresentations (Suárez 2003, 233-235; Frigg 2006, 51). Many models in science do not accurately reflect the world, and, in fact, a model is often viewed as particularly useful because of (not in spite of) its misrepresentations. Nancy Cartwright (1983) has famously argued for a fictional account of modeling and made this case for the laws of physics. Others have shown that similar things are true in other scientific domains (Weisberg 2007a). When theories are intentionally inaccurate, it is difficult to explain, by reference to isomorphism or similarity, the way in which these theories are representational (as scientists and philosophers often take them to be).

Suárez (2003, 235-237) has also argued that similarity and isomorphism are each neither necessary nor sufficient for representation.  Consider first isomorphism. It is not necessary for representation, given that scientists often take certain theories to be representative of their real-world targets even though there is no isomorphic relationship between the theory and the target system. The same is true of similarity. Using his example, suppose that there is an artist painting an ocean view, using some blue and green paints. This painting has all sorts of similarities to the ocean view she is representing: both the painting and the ocean are on the same relative side of the moon, both are in her line of vision at time t, they share certain colors, and so forth. But which similarities are relevant to its being representative and which are merely contingent is up to the discretion of the agent who takes it to be representative of the ocean view in certain respects (as Giere argued). If this is the case, however, then it turns out that A represents B if and only if A and B are similar in those respects in which A represents B. This ultimately leaves representation unexplained.

Supposing we can give some account of salience or attention or some other socially-based response to this first problem (which seems possible), we are left with the problem that plenty of salient similarities are non-representational. Suárez makes this point with Picasso’s Guernica (2003, 236). The bull, the crying mother, the eye, the knife, and so forth, are all similar to certain real-world objects. But the painting is not a representation of those things. It is a representation of some of the horrific atrocities of the Spanish Civil War.

Suárez also argues that both similarity and isomorphism are insufficient for representation. Consider the first, similarity. Take any given manufactured item, for example, an Acer C720 Chromebook, a computer which is similar to many other computers (hundreds of thousands). Notice that the fact of its similarity is insufficient to make it represent any of the other computers. Even if we add in Giere’s requirement that there be hypotheses which define the respects and degrees of the similarity, the insufficiency will remain. In fact, it seems as though there are hypotheses which define the relevant respects and degrees of similarity between the computers: Acer’s engineers and quality control have made sure that the production of these computers will result in similar computers. All the same, even with these hypotheses which give respects and degrees, we would not want to say that any given computer represents the others.

The non-sufficiency problem holds for isomorphism as well. Suppose someone were to write down some equation which had various constants and variables, and expressed certain relationships that held between the parts of the equation. Suppose now that, against all odds, this equation turns out to be isomorphic to some real-world system, say, that it describes the relationship between rising water temperatures and the reproduction rate of some species of fish which is native to mountain streams in the Colorado Rockies. To many, it appears to be counterintuitive to think that representations could happen accidentally. However, if isomorphism is sufficient for representation, then we would have to admit that the randomly composed equation does represent this fish species, even if no one ever uses or even recognizes the isomorphic relationship.

There are other arguments against these views in general, an important one being that they lack the right logical properties. Drawing on the work of Goodman (1976), both Suárez (2003, 232-233) and Roman Frigg (2006, 54) argue that representation has certain logical properties which are not shared by similarity or isomorphism. Representation is non-symmetric: when some A represents B, it does not follow that B represents A. Representation is non-transitive: if A represents B and B represents C, it does not follow that A represents C. It is also non-reflexive: it does not follow that A represents itself. Since isomorphism is reflexive, transitive, and symmetric, and similarity is reflexive and symmetric, they do not have the properties required to account for representation.

There are replies to these arguments on behalf of the substantive views. First, there is a general question about whether or not we are justified in making inferences from representation in art to representation in science. As was discussed above, many of the criticisms against substantive views draw examples from the domain of art (for example, Suárez (2003) uses many examples of paintings and draws upon Goodman (1976), which discusses representation in art). But it should not be taken as given that what holds in art must translate to science. In fact, in many cases, the practices in art seem to be quite different from the practices in science. As Bueno and French say, “After all, what do paintings—in particular those that are given as counter-examples to our approach, which are drawn from abstract art—really have to do with scientific representation?” (2011, 879).

Following Anjan Chakravartty (2009), Otávio Bueno and French (2011) argue that something like similarity or partial isomorphism is, in fact, necessary for successful representation in science. If there were no similarity or isomorphism at all, the successful use of models “would be nothing short of a miracle” (885). That is to say, while similarity or partial isomorphism might not be the whole story, they are at least part of the story. Using the aforementioned example of Picasso’s Guernica, they note that “there has to be some partial isomorphism between the marks on the canvass and specific objects in the world in order for our understanding of what Guernica represents to get off the ground” (885).

Replies have been made to the other arguments as well. Bueno and French (2011) argue that their account of partial isomorphism can meet all of the criticisms raised by Suárez (2003) and Frigg (2006). Adam Toon (2012) discusses some of the ways in which supporters of a similarity account of representation might respond to criticisms. Bartels (2006) defends the homomorphism account against these criticisms.

2. Deflationary and Pragmatic Accounts

If, as these scholars have argued, these substantive views will not work to explain scientific representation, what will? Suárez (2015) argues that what is needed instead is a deflationary account. A deflationary account claims “that there is no substantive property or relation at stake” (37) in debates about scientific representation. Deflationary accounts are typically marked by a couple of features. First, a deflationary account will deny that there are any necessary and sufficient conditions of scientific representation, or if there are, they will lack any explanatory value with regard to the nature of scientific representation. Second, these accounts will typically view representation as a relationship which is deeply tied to scientific practice. As Suárez puts it, “it is impossible, on a deflationary account, for the concept of representation in any area of science to be at variance with the norms that govern representational practice in that area…representation in that area, if anything at all, is nothing but that practice” (2015, 38).

Already we can see that these views will be quite different from the substantive views. Each of those views was substantive in the sense that it gave necessary and sufficient conditions for representation. There was also a distinct way in which those views were detached from scientific practice, since whether something was representational had little to do with whether or not it was accepted by scientists as representational and more to do with the features of the source and target. In each case, representation was a relationship that was entirely accounted for by features of the theory or model and the target system. As Knuuttila (2005) describes it, these were all dyadic (two-place) accounts insofar as the relationship held between only two things. The deflationary accounts take a markedly different direction by moving to at least a triadic (three-place) account of representation.

Some of the views that have developed follow the general lead of deflationary accounts in giving a central role to the work of an agent in representation, yet do not qualify as deflationary, given that they still give necessary and sufficient conditions of representation. Given the importance of the role of agents and aims, we might call these views pragmatic. Although pragmatic and deflationary views are importantly distinct in their aims, they share many common threads and, in many cases, the views could be reinterpreted as deflationary or pragmatic with little effort. As such, they will be grouped together in this section.

a. DDI

The earliest deflationary account of representation was RIG Hughes’ DDI Account (1997). The DDI Account consists of three parts: denotation, demonstration, and interpretation. Denotation is the way in which a model or theory can reference, symbolize, or otherwise act as a stand-in for the target system. The sort of denotation being invoked by Hughes is broad enough to include the denotation of concrete particulars (for example, a model of the solar system will denote particular planets), the denotation of specific types (for example, Bohr’s theory models not just this hydrogen atom, but all hydrogen atoms), and the denotation of a model of some global theory (for example, this particular model is “represented as a quantum system” (S331)). In each case, the model denotes something else; it stands in for some particular concrete object, some type of theoretical object, or some type of dynamical system.

We might think this relationship sufficient for representation, since the fact that scientists treat certain objects or parts of models as being stand-ins or symbols for some target system seems to answer the question of the relationship between a model and the world. Hughes, though, thinks that in order to understand scientific representation, we need to examine how it is actually used in scientific practice. This requires additional steps of analysis. The second part of Hughes’ DDI Account is demonstration. This is a feature by which models “contain resources which enable us to demonstrate the results we are interested in” (S332). That is, models are typically “representations-as,” meaning not only do scientists represent some target object or system, but they also represent it in a certain way, with certain features made salient. The nature of this salience is such that it allows users to draw certain types of conclusions and make certain predictions, both novel and not. This is demonstration in the sense that the models are the vehicles through which (or in which) these insights can be drawn or demonstrated, physically, geometrically, mathematically, and so forth. This requires that they be workable or used in certain ways.

The final part of the DDI Account is interpretation. It is insufficient that the models demonstrate some particular insight. The insight must be interpreted in terms of the target system. That is to say, scientists can use the models as vehicles of the demonstration, but in doing so, part of the representational process as defended in the DDI Account is that scientists interpret the demonstrated insights or results not as features of the model, but rather as features which apply to the target system (or at least, the way scientists are thinking of the target system).

In summary, with denotation, we are moving in thought from some target system to a model. We take a model or its parts to stand in or symbolize some target system or object. In demonstration, we use the model as a vehicle to come to certain insights, predictions, or results with regard to the relationship that holds internal to the model. It is in interpretation that we move from the model back to the world, taking the results or insights gained through use of the model to be about the target system or object in the world.
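As a toy illustration of the denotation-demonstration-interpretation sequence (and only that; this is not Hughes’ own example), consider an exponential-growth model of a bacterial colony: the symbols are first taken to denote features of the colony, a result is then demonstrated inside the model, and the result is finally interpreted as a claim about the colony. The numbers are invented.

```python
import math

# Denotation: the model's symbols are taken to stand in for features of the target.
# N0 denotes the initial colony size; k denotes its growth rate per hour.
N0, k = 1_000, 0.7

# Demonstration: working entirely inside the model, derive the doubling time
# from N(t) = N0 * exp(k * t).
doubling_time = math.log(2) / k

# Interpretation: the result is read back as a claim about the target system.
print(f"Interpreted claim: the colony doubles roughly every {doubling_time:.1f} hours.")
```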

b. Inferential

i. Suárez

After criticizing the substantive accounts in his (2003), Suárez (2004) developed his own account of representation which focused centrally on inference and inferential capacities, what he calls an inferential conception of representation. As he describes it there, this account involves two parts. The first part is what he calls representational force. Representational force is defined as “the capacity of a source to lead a competent and informed user to a consideration of the target” (2004, 768). Representational force can exist for a number of reasons. One way to get representational force is to repeatedly use the source as a representation of the target. Another way is in virtue of intended representational uses, that is, in virtue of the intention of the creator or author of some source viewed within the context of a broader scientific community. Oftentimes, the representational force will occur as a combination of the two. It is also a contextual property, insofar as it requires that the agent using the source has the relevant contextual knowledge to be able to go from the source to the (correct/intended) target.

So, for example, in the upper left-hand corner of my word processor is a little blueish square with a smaller white square and a small dark circle inside of it (it is supposed to be an image of a floppy disk). This has representational force insofar as it allows me to go from the source (the image of the floppy disk) to the target (a means of saving the document which I am currently writing). In this case, the representational force exists in virtue of both the intended representational uses (the creators of this word processor surely intend this symbol to stand in for this activity) as well as repeated uses (I am part of a society which has, in the past, repeatedly used an image of a floppy disk to get to this target, not only in this program but in many others as well). It is also contextual: someone who had never used computers would not have the requisite knowledge to be able to use the icon correctly.

This is part of the story for Suárez, but in order to have scientific representation there must be something more than mere representational force. On his view, scientific representations are subject to a sort of objectivity which does not necessarily exist for other representations, such as the save icon in the example above. The objectivity is not meant to indicate that there is somehow an independent representational relationship that exists in the world when scientists are engaged in scientific representation. Instead, the objectivity is present insofar as representations are constrained in various ways by the relevant features of the target system which is being represented. That is, because there is some real feature which scientists are intentionally trying to represent in their scientific models and theories, the representation cannot be arbitrary but must respond to these relevant features. So the constraints are themselves objective, but this does not commit Suárez to identifying some reified relationship that holds between sources and targets.

According to Suárez, if we are going to get this objectivity in representations, we must turn to a second feature: the capacity of a source to allow for surrogate reasoning. This second feature requires that informed and competent agents be led to draw specific inferences regarding the target. These inferences can be the result of “any type of reasoning…as long as [the source] is the vehicle of the reasoning that leads an agent to draw inferences regarding [the target]” (2004, 773). Suárez’s point here is that the source not only leads the agent to the target, but also leads the agent to think about the target in a particular way, coming to particular insights and inferences about the target by means of the source.

More recently, Suárez (2010) has argued that this second feature, the capacity for surrogate reasoning, typically requires that three things be in place. First, the source must have internal structure such that certain relations between parts can be identified and examined. Secondly, when examining the parts of the source, scientists must do so in terms of the target’s parts. Finally, there must be a set of norms defined by the scientific practice which define and limit which inferences are “correct” or intended. It is in virtue of these norms of the practice that an agent will be able to draw the relevant and intended inferences, making the representation a part of that particular scientific practice. Of course, he takes his view to be deflationary, so these are not to be understood as necessary and sufficient conditions of the capacity for surrogate reasoning, but rather features which are frequently in place.

Consider an example of a mathematical model, for example, the Lotka-Volterra equations. The model is supposed to be representational of predator-prey relationships. Part of this is Suárez’s representational force—the fact that competent agents will be led to consider predator-prey relationships when considering the source. However, as Suárez notes, this is insufficient for scientific representation because in science the terms interact in a non-arbitrary way. To account for this, he argues that there is another feature of the model, which is the capacity to allow for surrogate reasoning. In this case, that means that individuals who examine or manipulate the model in terms of its parts (the multiple variables) will be able to draw certain inferences about the nature of real-world interactions between predators and prey (the parts of the target system). These insights will occur in part due to the nature of the model as well as the norms of scientific practice, which means that the inferences will be non-arbitrarily related to the real-world phenomena and will allow us to draw certain specified inferences of scientific interest.
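A minimal sketch of the kind of surrogate reasoning described here: the Lotka-Volterra equations (dx/dt = αx − βxy, dy/dt = δxy − γy) are manipulated, in this case by crude numerical integration with invented parameter values, and the behaviour found in the model (oscillating populations) is then read as an inference about real predator-prey systems.

```python
# Invented parameters: prey growth (alpha), predation (beta),
# predator reproduction (delta), predator death (gamma).
alpha, beta, delta, gamma = 1.1, 0.4, 0.1, 0.4
x, y = 10.0, 5.0          # initial prey and predator populations
dt, steps = 0.01, 5000

history = []
for _ in range(steps):
    dx = (alpha * x - beta * x * y) * dt
    dy = (delta * x * y - gamma * y) * dt
    x, y = x + dx, y + dy
    history.append((x, y))

# Surrogate inference: prey numbers rise and fall rather than settling,
# which the modeler interprets as oscillation in real predator-prey systems.
prey = [p for p, _ in history]
print("prey min/max over the run:", round(min(prey), 1), round(max(prey), 1))
```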

ii. Contessa

Suárez’s inferential account has been further developed by Gabriele Contessa (2007, 2011). He is explicit in his claim that the interpretational view he is defending is not a deflationary account, but is rather a substantive version of the inferential account insofar as he takes the account to give necessary and sufficient conditions of representation. All the same, the account he defends is clearly pragmatic in nature. Contessa begins by noting an important distinction he has drawn from Suárez’s work, that of the difference between three types of representation. The first is mere denotation, in which some (arbitrarily) chosen sign is taken to stand for some object. He gives the example of the logo of the London Underground denoting the actual system of trains and tracks.

The second sort of representation is what Contessa calls “epistemic representation” (2007, 52). An epistemic representation is one which allows surrogate reasoning of the sort described by Suárez. The London Underground logo does not have this feature since no one would be able to use it to figure out how to navigate. A map of the London Underground, on the other hand, would have this feature insofar as it could be used by an agent to draw these sorts of inferences.

The final sort of representation is what he calls “faithful epistemic representation” (2007, 54-55). Whether or not a representation is faithful is a matter of degree: something will be a completely faithful epistemic representation provided that all of the valid inferences which can be drawn about the target using the source as a vehicle are also sound. Notice that this does not require that a model user be able to draw every possible inference about the target, but rather that the inferences licensed by the source which are actually drawn are sound (both following from the source and true of the target). In this sense, a map of the London Underground produced yesterday will be more faithful than one produced in the 1930s.
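The idea of degrees of faithfulness can be pictured with a small sketch: of the inferences an agent is licensed to draw from the source, the faithful ones are those that are also true of the target. The inference lists below are invented for illustration.

```python
# Inferences an agent might draw from an old map versus a current map,
# and the facts that actually hold of today's London Underground (illustrative only).
inferences_from_old_map = {"Aldwych station is open", "Oxford Circus is on the Central line"}
inferences_from_new_map = {"Aldwych station is closed", "Oxford Circus is on the Central line"}
true_of_target = {"Aldwych station is closed", "Oxford Circus is on the Central line"}

def faithfulness(inferences, facts):
    """Fraction of licensed inferences that are also sound (true of the target)."""
    return len(inferences & facts) / len(inferences)

print(faithfulness(inferences_from_old_map, true_of_target))  # 0.5
print(faithfulness(inferences_from_new_map, true_of_target))  # 1.0
```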

Using this framework, Contessa goes on to describe a scientific model as an epistemic representation of features of particular target systems (56). The scientific model will be representational for a user when she interprets the source in terms of the target. He remains open to there being multiple sorts of interpretation which are relevant, but suggests that the most common sort of interpretation is “analytic,” which functions quite similarly to an isomorphism in which every part and relation of the source is interpreted as denoting one and only one part and relation in the target (and all of the target’s parts and relations are denoted by some part or relation from the source).

Of course, given that this is determined by the agent’s use, it is not necessary that the agent believe that her interpretation is actually true of the system. Here is where Contessa draws on the distinction of faithfulness. Since models are often misrepresentations and idealizations, as has been discussed above, they need not be completely faithful in order to be useful. This is not the end of the story, though, because the circumstances also play an important role in understanding whether or not something is a scientific representation.

c. Agent-Based Versions of Substantive Accounts

In light of some of the insights of Suárez and others, many of the views described above as substantive views were altered and updated to make more explicit and central reference to the role of an agent, making them what could be called agent-centered approaches. Of most importance, given their role in the substantive views as described above, are recent advances made by van Fraassen and Giere.

i. Agent-Based Isomorphism

The view of isomorphism commonly attributed to van Fraassen, which was described above, was the one drawn from his 1980 book, The Scientific Image. More recently, in his (2008), van Fraassen has presented an altered account of representation which places much more emphasis on the role of an agent. Van Fraassen notes that while some reference to an agent was a part of his earlier views (Ladyman, Bueno, Suárez, and van Fraassen, 2010), Suárez’s important work on deflationary accounts was influential in the development of the view he defends (2008, 7, 25-26; 2010).

He begins his account by looking primarily to the way in which a representation is used, saying that a source’s being representative of some target “depends largely, and sometimes only” on the way in which the source is being used (2008, 23). Though he does not take himself to be offering any substantive theory of representation, he does call this the Hauptsatz or primary claim of his account of representation: “There is no representation except in the sense that some things are used, made, or taken, to represent things thus and so” (2008, 23).   Van Fraassen notices that this places some restrictions on what can possibly be representational. Mental images are limited, because they are not made or used in some way. That is to say, we do not give our mental states representational roles. Similarly, there is no such thing as a representation produced naturally. What it is to be a representation is to be taken or used as a representation, and this is not something that happens spontaneously without the influence of an agent.

Van Fraassen also notices an important distinction in two ways of representing: representation of and representation as. When scientists take or use some source to be representational, they take it to be a representation of some target. This target can change based on context, and sometimes scientists might not use the source as a representation at all. Consider van Fraassen’s example: we can use a graph to represent the growth of bacterial colonies under certain conditions, and so the graph will be a representation of bacterial growth (2008, 27). But we could also use that graph to represent other phenomena, perhaps the acceleration of an object as it is dropped from some height. Part of what this captures is the way in which our perspectives can change how we are representing a particular appearance. Thus, by using a source in some distinct way, we can represent some particular appearance of some particular phenomenon.

In intentionally using a source as a representation, scientists do not only make it a representation of something; they also represent it in a certain light, making certain features salient. This is what van Fraassen calls representation as. Two representations can be of the same target, but might represent that target as something different. Van Fraassen offers an example: everything that has a heart also has a kidney, but representing some organism as having a heart does not mean the same thing as representing it as having kidneys (2008, 27). Similarly, we might represent the growth of bacteria mentioned above as an example of a certain sort of growth model, or as the worsening of some infection as it is seen as part of a disease process.

Of course, all of this is very general, which van Fraassen acknowledges. However, in a true deflationary attitude, he notices that there is no good way of getting more specific about scientific representation since it has “variable polyadicity: for every such specification we add there will be another one” (2008, 29). Nonetheless, he still maintains that the link between a good or useful representation and phenomena requires a similarity in structure. As it stands, then, there is still an appeal to isomorphism present in his account: “A model can (be used to) represent a given phenomenon accurately only if it has a substructure isomorphic to that phenomenon” (2008, 309). Just as before, we have an account of representation which relies on isomorphism between the structure of the theoretical models and the (structure of the) phenomena. All the same, this is still a markedly different view from his earlier view described above. No longer is it the isomorphism or structural relationship alone which is representational. Now, on van Fraassen’s view, what makes a model representational is the fact that a scientific community uses it or takes it to be representational.

ii. Agent-Based Similarity

Ian Hacking (1983) has famously argued that, in philosophical discussions of the role and activity of science, too much emphasis is put on representation. Instead, he suggests that much of what is done in science is intervening, and this concept of intervention is key to understanding the reality with which science is engaged. All the same, he still thinks that science can and does represent. Representation, on his account, is a human activity which exhibits itself in a number of different styles. It is people who make representations, and typically they do so in terms of a likeness, which he takes to be a basic concept. Representation in terms of likeness, he thinks, is essential to being human, and he even speculates that it may have played a role in human development in the way many think language did. In creating a likeness, though, he argues that there is no analyzable relation being made. Instead, “[likeness] creates the terms in a relation…First there is representation, and then there is ‘real’” (139). Representation on his view is not interested in being true or false, since the representation precedes the real.

Giere (2004, 2010) has also made pragmatics more central and explicit in his account of scientific representation. He claims that in attempting to understand representation in science we should not begin with some independent two-place relationship which substantially exists in the world. Instead, we should begin with the activity of representing. If we are going to view this activity as a relationship, it will have more than two places. He proposes a four-place relation: “S uses X to represent W for purposes P” (2004, 743). Here, S is some agent broadly construed, such that it could be some individual scientist or, less specifically, some group of scientists. X is any representational object, including models, graphs, words, photographs, computational models, and theories. W is some aspect or feature of the world, and P comprises the aims and goals of the representational activity; that is, the reasons why the scientist is using the source to represent the target. Giere identifies a number of different potential purposes of representation. These include things like learning what something is actually like, but they are fairly contextual and depend upon the question being asked. So the way in which something is modeled might change depending on the purposes of the representation (2004, 749-750).
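Giere’s four-place schema can be displayed as a simple record; the fields below simply mirror S, X, W, and P, and the concrete values are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class RepresentationalAct:
    agent: str      # S: the scientist or group doing the representing
    source: str     # X: the model, graph, equation, or other representational object
    target: str     # W: the aspect of the world being represented
    purpose: str    # P: the aims that guide how the source is used

act = RepresentationalAct(
    agent="a population ecologist",
    source="Lotka-Volterra model",
    target="lynx and hare populations",
    purpose="predicting how the populations cycle over time",
)
print(act)
```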

Giere is still working from what should be considered a semantic conception of theories, in which a theory is a set of models which are created according to a set of principles and certain specific conditions. The principles are what we might otherwise think of as being empirical laws, but he does not conceive of them as having empirical truth. Instead, by thinking of them as principles by which scientists can form models, it is these scientists who construct and use the models who make particular the otherwise general and idealized principles. On this view, then, it is the models which are representational and will link up to the empirical world.

There are many ways a scientist can use a model to represent the world, on Giere’s view, but the most important way remains similarity. Giere is quick to note that this does not mean that we need to think of the representational relationship as some objective or substantive relationship in the world. Instead, the scientist who uses the model does the representing and she will often do this in virtue of picking out certain salient features of a model which are similar to the target system. In doing so, the scientist specifies the relevant aspects and degrees of similarity which she is using in her act of representation.

One of the advantages of this updated version of the similarity view is the wide range of models which can be effectively representational on this account (Giere 2010). Giere gives an example of a time when he saw a nuclear physicist treat a pencil as a model of a beam of protons, explaining how the beam could be polarized. It is in virtue of the invoked similarity between the pencil and the beam of protons, that is, the fact that the physicist specifically used a relevant similarity, that he was able to use the pencil to represent the beam. By noting the importance of the role of the agent, Giere is better able to explain representation across the whole range of cases found in science.

A similar yet importantly distinct account of representation as similarity is defended by Paul Teller (2001). Teller argues that we should abandon what he calls the perfect model model, on which we take scientists to model in a way that corresponds perfectly to real-world targets. Instead, he thinks that models are rarely, if ever, perfect matches for their targets. This does not mean that models are not representations. He argues that models represent their targets in virtue of similarity, though he denies that any general account of similarity can be given. What makes something a similarity depends deeply upon the circumstances at hand, including the interests of the model user.

d. Gricean

One way to ‘deflate’ the problem of scientific representation is to claim that there is no special problem for scientific representation, and instead argue that we should understand the question of scientific representation as part of the already widely discussed literature on representation in general. This is the project taken up by Craig Callender and Jonathan Cohen (2006). According to their view, representation in many different fields (art, science, language, and so forth) can be explained by more fundamental representations, which are common to each of the fields.

To explain this, they appeal to what they call “General Griceanism”, which takes its general framework from the insights of Paul Grice. On their General Gricean view, the representational nature of scientific objects will be explained in terms of something more fundamentally representational. The more fundamentally representational objects in this case are mental states. This, in effect, pushes the hard philosophical problem back a stage, since some account must be given of the representational nature of mental states. They remain uncommitted to any particular account of the representational nature of mental states, leaving that as something to be argued about in the philosophy of mind. All the same, they mention a few popular candidates: functional role theories, informational theories, and teleological theories.

There are, on their view, significant advantages to taking this General Gricean viewpoint. For one, it has a certain sort of simplicity to it. By explaining all representation in terms of the fundamental representations of mental states, we do not need to give wildly different explanations as to why a scientific model represents its target and why, for example, a green light represents ‘go’ to a driver. Each occurs because, in virtue of what the scientist or “hearer” knows, a certain mental state will be activated which carries with it the relevant representational content.

They can also explain why similarity or isomorphism will commonly be used (though neither is necessary), since these are strong pragmatic tools for bringing about the relevant mental state with its representational content. This, they argue, is clearly one of the reasons why people from Michigan use an upturned left hand to help them explain the relative location of their hometown: the upturned left hand is similar in shape to Michigan. The reasons why similarity is a useful tool here are identical to the reasons why similarity would be useful in scientific contexts: it makes the relevant instance of communication more effective, meaning that the hearer (or user of a model or scientific representation) will be better able to arrive at the relevant mental states which represent the target system.

In short, their view is that while there might be a general philosophical problem of representation, there is not anything special about scientific practice that makes its stake in this problem any different from any other field or the general problem. Of course, as they amusingly note, this passes the buck to the more fundamental question: “Once one has paid the admittedly hefty one-time fee of supplying a metaphysics of representation for mental states, further instances of representation become extremely cheap” (71).

e. Critiques of Deflationary and Pragmatic Accounts

These deflationary and pragmatic accounts of representation have not avoided criticisms of their own. Many of these criticisms are presented as part of the defense of one of the views over another. For example, Contessa (2011) argues against a purely denotational account of scientific representation, such as the one seen in Callender and Cohen’s (2006). As he says, “Whereas denotation seems to be a necessary condition for epistemic representation, it does not, however, seem to be a sufficient condition” (2011, 125). As Contessa argues, merely being able to stipulate a denotational relationship is insufficient for the sort of representation which is useful to scientists. For example, we might use any given equation (for example, F=ma) to denote the relationship which holds between the size of predator and prey populations. But, while this equation could successfully denote this relationship, it will not be of much use to scientists because they will not be able to draw many insights about the predator-prey relationship from it. Therefore, Contessa argues, while denotation is a necessary condition of representation, it cannot alone be the whole story. In addition, he suggests the need for interpretation in terms of the target, as described above.

Mathias Frisch (2015, 296-304) has raised a worry which he addresses specifically to van Fraassen’s (2008) account, but which is applicable to many of the pragmatic and deflationary accounts described above. The worry is that if we take van Fraassen’s Hauptsatz (“There is no representation except in the sense that some things are used, made, or taken, to represent things thus and so” (2008, 23)) literally, then it seems to be impossible that some models can represent. Taking Frisch’s example, say we wanted to construct a quantum mechanical model of a macroscopic body of water. To do this, “we would have to solve the Schrödinger equation for on the order of 10²⁵ variables—something that is simply impossible to do in practice” (297). But if this is so, then it turns out that the Schrödinger equation cannot be used to represent a macroscopic body of water—since we could never use the equation in this way, it is not representational in this way. Notice that this concern applies to other pragmatic and deflationary accounts: if we are unable to make inferences or interpret the source in terms of the target (which, given the complexity here, it seems we would not be able to do), then it will also fail to be representational on these other accounts. But this leads to a fairly strong conclusion that we can only use a model to represent a system once we have actually applied the model to that system. For example, the Lotka-Volterra model seems to represent only those systems for which scientists have used it; it does not represent all predator-prey relationships in general.

Frisch (2015, 301-304) does not think that this argument is ultimately fatal to the pragmatic accounts, since he argues that because there are constraints on the use of models which are part of the scientific practice, there is a sense in which the Lotka-Volterra model, for example, represents all predator-prey relationships (even though it has not yet been used in this way). There is no problem in extending models “horizontally,” that is, to other instances which are in the same domain of validity of the model. There is, Frisch argues, a problem in extending models “vertically,” that is, using a model to represent some phenomenon which is outside the domain of validity. This can be seen in the quantum mechanics example from above, since we have no practice in place of using the Schrödinger equation to describe macroscopic bodies of water. So, he claims, van Fraassen’s view (and, by extension, the other pragmatic and deflationary views) must be committed to an anti-foundationalism (a view that the sciences cannot be reduced to one foundational theory) that denies that the models of quantum mechanics can adequately represent macroscopic phenomena. Of course, the anti-foundationalist commitments might be viewed as a desirable feature of these views, rather than a flaw, depending upon other commitments.

Another important critique which applies more generically to a number of these deflationary and pragmatic views comes from Chakravartty (2009). As described above, many of those who argue for a deflationary or pragmatic account of representation offer their view as an alternative to the substantive accounts. That is to say, they deny that scientific representation is adequately described by the substantive accounts and do not merely add to these accounts, but rather reject them and offer their deflationary or pragmatic account instead. Chakravartty argues that this is a mistaken move. We should not think of deflationary or pragmatic accounts as alternatives to the substantive accounts, but rather as complements. On the deflationary or pragmatic accounts, representation occurs when inferences can be made about the target in virtue of the source. But, “how, one might wonder, could such practices be facilitated successfully, were it not for some sort of similarity between the representation and the thing it represents—is it a miracle?” (201). That is to say, the very function which proponents of the deflationary or pragmatic accounts take to be the central explainer of scientific representation seems to require some sort of similarity or isomorphism (Bueno and French 2011). On Chakravartty’s view, the pragmatic or deflationary accounts go too far in eliminating the role for some substantive feature. In doing so, they leave an important part of scientific representation behind.

3. Model-Based Representation

The question of scientific representation has received important attention in the context of scientific modeling. There is a vast literature on models, and much of it is at least tangentially related to the questions of representation. An examination of this literature provides an opportunity to see other sorts of insights with regard to representation and the relationship between the world and representational objects.

a. Models as Representations (and More)

Much of the literature on models focuses on the various roles of models within scientific practice, representational and otherwise. In an influential volume on models, Margaret Morrison and Mary Morgan (1999) use a number of examples of models to defend the view that models are partially independent from theories and data and function as instruments of scientific investigation. We can learn from models due to their representational features. Morrison and Morgan start out by focusing on the construction of models. Models, on their account, are constructed by combining and mixing a range of disparate elements. Some of the elements will be theoretical and some will be empirical, that is, from the data or phenomena. Thus far, this view is mostly in line with what has been discussed in the above sections. What makes models unique in their construction is that they often involve other outside elements. Sometimes these are stories (ways of explaining some unexpected data which are not part of a theory); other times they are a sort of structure which is imposed onto the data. These other elements, they argue, give models a sort of partial independence or autonomy. This is true even when the outside elements are not as obviously present, for example when a model is an idealized, simplified, or approximated version of a theory. This independence is crucial if we are to use models to help understand both theories and data, as we often use them to do.

According to Morrison and Morgan, models function like tools or instruments for a number of purposes. There are three main classifications of the uses of models. The first is in interacting with theories: models can be used to explore a theory or to make usable a theory which is otherwise unusable. They can also be used to help understand and explore areas for which we do not yet have a theory. Other times, the models are themselves the objects of experimentation. The second classification of the use of models is in measurement: not only can models structure and present measurements, they can also function directly as instruments of measurement. Finally, models are useful when designing and creating technology.

Models are not valuable only insofar as they have these functions. Models, Morrison and Morgan argue, are also importantly representational. Their representational value relies in part on the way in which they are constructed from both the theory and the data or phenomena. Models can represent theories or data, or can be representational instruments which mediate between data and theory. Whatever the case, representation, on their view, is not taken to be some mirroring or direct correspondence between the model and its representational target. Instead, “a representation is seen as a kind of rendering–a partial representation that either abstracts from, or translates into another form, the real nature of the system or theory, or one that is capable of embodying only a portion of a system” (1999, 27). Sometimes models can be used to represent non-existent or otherwise inaccessible theories, as they claim is the case with simulations.

The final role of models described by Morrison and Morgan is the way that models afford the possibility of learning. Sometimes the learning comes in the construction of the model. Most frequently, though, we learn with models by using and manipulating them. In doing so, we can learn because the models have the other features already described: the wide range of sources for construction, the functions, and their status as representations. Oftentimes, the learning takes place internal to the model. In these cases, the model serves as what they call a representative rather than a representation. With representatives, the insights we can gain from manipulating the model are all about the model itself. But in gaining them, we come to a place from which we can better understand other systems, both real-world systems and other models. Other times, we take the world into the model and then manipulate the world inside the model, as a sort of experiment.

Daniela Bailer-Jones (2003, 2009) defends a slightly different but related account of the representational nature of models. On her account, models entail certain propositions about the target of the model. As propositions, these are subject to being true or false. One way of thinking about the representation of models is to say that models are representational insofar as their entailed propositions are true. However, this cannot be exactly right since, as was mentioned above, models oftentimes intentionally entail false propositions. Since models are about those aspects of a phenomenon which are selected, they will fail to say things about other aspects of the phenomenon. In some cases, the propositions entailed may be true for one aspect but false for another. This calls for the role of model users, who decide what function the model has, the ways in which and degree to which the model can be inaccurate, and which aspects of the phenomenon are actually being represented. In sum, on her view, models are representational in part due to their entailed propositions, but also due to the role of the model users.

Tarja Knuuttila (2005, 2011) has argued that in thinking about models too much emphasis has been placed on their representational features, even in accounting for their epistemic value. Following and expanding on Morrison and Morgan (1999), she argues that we should think of models as material epistemic artefacts, that is, “intentionally constructed things that are materialized in some medium and used in our epistemic endeavors in a multitude of ways” (2005, 1266). The key to their epistemic functioning is to be found in their constrained and experimentable nature. Models, on this account, are constrained by their construction in such a way that they make certain scientific problems more accessible and amenable to systematic treatment. This is one of the main roles of idealizations, simplifications, and approximations. On the other hand, the representational means used also impose their own constraints on modeling. The representational modes and media through which models are constructed (for example, diagrams, pictures, scale models, symbols, language) all afford and limit scientific reasoning in their different ways. Considered in this respect, Knuuttila argues, models have far more than mere representational capacities: they are themselves targets of experimentation and can be thought of as creating a sort of conceptual parallel reality.

b. Model-Building

In addressing model-building, Weisberg (2007b) and Godfrey-Smith (2006) both take up the idea that the characteristic way in which models are constructed is indirect. This comes about in a three-step process in which a scientist first constructs a model, then analyzes and refines the model, and finally examines the relationship between the model and the world (Weisberg 2007b, 209). Scientists use and understand models by way of “construals” of the model (Godfrey-Smith 2006). The construal, on Weisberg’s account, is made up of four parts. The first is an assignment, which maps various parts of the model onto the phenomena being investigated. The second part of a construal is the scope, which tells us which aspects of the phenomena are being modeled. The final two parts of the construal are fidelity criteria. One of these is the dynamical fidelity criterion, which identifies a sort of error tolerance for the predictions of the model. The other is the representational fidelity criterion, which gives standards for judging whether the model gives the right predictions for the right reasons, that is, whether or not the model is linking up to the causal structure which explains the aspects of the phenomenon being modeled.

This strategy of model-based science is contrasted with a different sort of strategy, what Weisberg calls abstract direct representation. Abstract direct representation is the strategy of science in which study of the world is unmediated by models. He gives the example of Mendeleev’s development of the periodic table of elements. This process did not begin with a hypothetical abstract model which is refined and then used representationally (as Weisberg thinks the process of model-based science proceeds). Instead, it starts with the phenomena and abstracts away to more general features. This distinction between modeling and abstract direct representation underlines the possibility that not all scientific representations need be achieved in the same way.

There are some worries about Weisberg’s understanding of the process of model-making (Knuuttila and Loettgers, in press). Through a close examination of the development of the Lotka-Volterra model, Knuuttila and Loettgers argue that the process of model-building often begins with certain sorts of templates, or characteristic ways of modeling some phenomena, typically adopted from other fields. Such already familiar modeling methods and forms offer the modeler a sort of scaffolding upon which they can imagine and describe the target system. They also argue that another distinct feature of model-making is its outcome-orientation. That is, in developing a model, a scientist will typically do so with an eye to the anticipated insights or features of the target system that they wish to represent. Thus, on their view, the modeler pays close attention to the target system or empirical questions in all stages of the development of the model (not just at the end, as Weisberg suggests).
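
For readers unfamiliar with the example, the Lotka-Volterra model that Knuuttila and Loettgers revisit is standardly written (this gloss is an editorial illustration, not part of their argument) as a pair of coupled differential equations, dx/dt = αx − βxy for the prey population x and dy/dt = δxy − γy for the predator population y, where α, β, γ, and δ are positive parameters governing prey growth, predation, predator reproduction, and predator mortality. The very form of these equations illustrates the template idea: essentially the same coupled scheme has been carried between fields, Lotka himself having first explored it in connection with oscillating chemical reactions before it became a staple of population ecology.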

c. Idealization

One of the important discussions that has developed primarily in the literature on models concerns idealization. Weisberg (2007a) argues that there are three different kinds of idealization, which he generically describes as “the intentional introduction of distortion into scientific theories” (639). The first kind of idealization he calls Galilean idealization. This is the sort of idealization in which a theory or model is intentionally distorted so as to make it simpler and thereby computationally tractable. This sort of idealization occurs when scientists ignore certain features of a system or theory, not because those features play no role in what actually happens, but rather because including them makes applying the theory or model so complex that no traction can be gained on the problem. By removing these complexities, scientists distort their model (because it lacks complexities which reflect the target). But after gaining some initial computational tractability, they can slowly reintroduce the complexities and thus remove the distortions.
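
A standard illustration of this pattern (offered here for exposition; it is not an example drawn from Weisberg’s text) is the introductory treatment of projectile motion with air resistance ignored: the trajectory reduces to x(t) = v t cos θ and y(t) = v t sin θ − (1/2)g t², which can be solved by hand, even though drag genuinely affects the motion. Once this tractable version is in hand, drag terms can be reintroduced and the distortion progressively removed, which is just the de-idealizing pattern that Galilean idealization envisages.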

The second type of idealization is what Weisberg calls minimalist idealization. In a minimalist idealization, the only features carried into the model or theory are those causal features which make a difference to the outcomes. So, if some feature of a target can be left behind without losing predictive power, a minimalist idealization will leave that feature behind. As an example, Weisberg notes that when explaining Boyle’s law, it is often assumed that there are no collisions between gas molecules. This is, in fact, false, since collisions between gas molecules are known to take place in low-pressure gases. But, “low pressure gases behave as if there were no collisions” (2007a, 643). So, since these collisions do not make any difference to our understanding of this system, scientists can (and do) leave this fact behind.
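
For reference (again an editorial gloss rather than Weisberg’s own wording), Boyle’s law says that for a fixed quantity of gas held at constant temperature the product of pressure and volume is constant, PV = k, so that halving the volume doubles the pressure. The elementary kinetic-theory treatment that recovers this law models the molecules as point particles whose mutual collisions are ignored, which is precisely the false-but-harmless assumption Weisberg cites as a minimalist idealization.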

Notice that this is distinct from Galilean idealization insofar as minimalist idealizations leave certain features out of their theories or models because they make no difference to the relevant tasks or goals at hand. Galilean idealization, on the other hand, leaves certain features out even when they do make a difference, simply because leaving them in would make the model more complex and less tractable.

The final sort of idealization described by Weisberg is what he calls multiple-models idealization. This is the practice of using a number of different, often incompatible models to represent or understand some phenomenon. In this case, none of the models by itself is capable of accurately modeling the relevant target system. All the same, each of the models is good at representing certain features of the target system. Thus, by using not just a single model but rather this group of models, each of which is distorted, scientists can get a better sense of the target system. Weisberg offers a helpful example: the National Weather Service uses a number of different models in making its weather forecasts. Each of the models used represents the target in a different way, each being inaccurate in some way or another. It is the use of all of these models that permits forecasts of higher accuracy since attempts to make a single model have resulted in less accurate predictions.

4. Sociology of Science

a. Representation and Scientific Practice

Many important insights on the nature of scientific representation have come not from the philosophy of science but rather from thinkers who would typically be considered part of the field of sociology of science. The insights from this field can serve both as a source of insight on the nature of representation in scientific practice and as a challenge to the primarily epistemically-oriented insights from the philosophy of science. Michael Lynch and Steve Woolgar (1990) edited an important collection of papers, Representation in Scientific Practice, on scientific representation in practice written from the perspective of sociology of science. More recently (2014), Lynch and Woolgar edited another collection with Catelijne Coopmans and Janet Vertesi. Treating representation from the perspective of sociology of science involves asking a different sort of question than the one so far addressed in this article. Instead of asking about the constitution of scientific representation, sociologists of science are more interested in asking, “What do the participants, in this case, treat as representation?” (Lynch and Woolgar 1990, 11). In the introduction to the 1990 volume, Lynch and Woolgar provide a general overview of some of the important insights from this perspective.

Since sociology of science treats scientific practice as its object of inquiry, it is keen to describe precisely how representations are actually used by scientists. They note the importance of “the heterogeneity of representational order” (2). That is, there is a wide range of devices which are representational as well as a wide range of ways in which the representations are used and in which they are useful. Importantly, sociologists are often interested in discussing more than merely the epistemic or informational role and use of representations, viewing them as significantly social, contextualized, and otherwise embedded in a complex set of activities and practices. Sociologists of science attempt to pay attention to the whole gamut of representations and representational uses to better understand precisely the role they play within scientific investigation.

Another important insight, Lynch and Woolgar note, is that the relation between representations is not to be thought of as directional in the sense that the representations move from or towards some “originary reality” (8). Instead, any directionality of representations is to be thought of as “movement of an assembly line” (8). That is to say, representational practice must be seen as constructing not only a representation, but also (re)constructing a phenomenon in a way so that it can be represented. This is something that can be seen in much of the literature from sociologists of science, including the work of Latour (see below).

In paying close attention to the way representations are actually used, some sociologists of science note settings in which there are “discrepancies between representations of practice and the practices using (and composing) such representations” (9). These discrepancies and other problems encountered in the actual practice of science allow for improvisation and creativity which can help advance the particular domains of which they are a part. Sociologists of science are interested in studying this creativity, not only for its productivity in science, but also as an interesting phenomenon in its own right.

b. Circulating Reference

A particularly telling example of these insights, especially from the philosophical point of view, is provided in Bruno Latour’s “Circulating Reference” (Latour 1999). Latour’s photo-philosophical case study is based on the work of a group of scientists who were examining the relationship between a savannah ecosystem and a forest ecosystem. At the end of their project, they collectively published a paper on their findings which included a figure of the interaction between the ecosystems, detailing the change in soil composition among other features. Latour asks how it is that this abstract drawing, which takes a perspective no individual could possibly have had and which ignores so many of the features of the ecosystem, can be about that stretch of land. That is to say, here we have a drawing, something made of ink and paper, and there we have the forest-savannah ecosystem; how is it that the former can be about the latter?

Latour’s method of answering, which takes the form of a strikingly well-written case study in which he presents pictures from the expedition to structure and represent the process he describes, is to look carefully at all the details and steps by which the scientists got from the expedition to the figure in the paper. What happens, says Latour, is that there is a series of steps through which the scientists abstract from the world in some intentional fashion. In doing so, they maintain some relevant feature of the world, but they are simultaneously constructing the phenomena they are studying. In the process, the representations produced become progressively more abstract.

An example will make this clearer. At one stage in the process, soil samples are collected from a vertical stretch of ground. These samples are transferred into a device which allows the whole vertical stretch of earth to be viewed synoptically. In taking the sample, the scientist has already begun to construct: already this particular bit of dirt is taken to be representative of the dirt for a much wider area of land. Once the soil has been collected, various features of the soil are maintained through intentional actions on the part of the scientist. For example, the scientists will label the soil as being of a certain sort of consistency. The scientists then use a clever tool perforated with pinholes and bearing the various Munsell colors and numbers, a scheme which is itself a construction with a long history. In looking through the pinholes, the scientist can abstract away from the dirt sample itself, taking, in some sense, only the color (which is done in virtue of a construction of numbers associated with particular colors). Something has clearly been lost, namely, the full materiality of the dirt. But something has also been gained, in this case, a number which corresponds to the color of the dirt; some usable, manipulable data.

Latour’s essay carefully describes many of these transitions from the savannah-forest system to the published figure. As he claims, it is this series of transitions (each of which involves abstraction and construction due to the intentional decisions of a scientist) that ensures that the figure at the end references or represents the savannah-forest system. There is not a single gap between the figure and the world which must be accounted for by some representational relation. Instead, on his account, there is a large series of gaps, each of which is crossed by a scientist’s actions in abstracting and maintaining, constructing and discovering. This series, he thinks, can be extended infinitely in either direction. By abstracting further from the already quite abstract figure, certain hypotheses might be suggested, which would result in a return to the savannah-forest system to gather data which might be more basic than the data already gathered. On his view, there is no such thing as “the world” which is the most basic thing-in-itself. Nor is there any most-abstracted element.

c. Critiques of Sociology of Science

While these insights from the sociology of science literature have been both sources of support and criticism for the philosophical literature, they have also been subject to criticisms. One important criticism comes from Giere’s (1994) review of Lynch and Woolgar’s (1990) Representation in Scientific Practice. Giere’s primary target is the extremely constructivist nature of the sociology of science literature. The constructivist approach claims that science is socially constructed: that is, science is filled with socially-dependent knowledge and aimed at understanding socially-constructed objects. There is no such thing, on this view, as a non-constructed world to be understood by scientists, and therefore, no such world to be represented. The attempt to explain representation in this framework results in a “no representation theory of representation” (Giere 1994, 115). But, Giere thinks there is a straightforward counter-slogan to a view of this sort: “no representation without representation” (115). That is to say that if there is nothing ‘out there’ in the world being represented, it cannot be that this is an instance of representation. This is not to reject the importance of paying attention to the role of the practices and representational devices in particular case studies. All the same, Giere argues that if we want a general account of scientific representation, “we must also go beyond the historical cases” (119). Put otherwise, the sociology of science perspective is an important part of explaining scientific representation, but this work by itself leaves representation unexplained.

Knuuttila (2014) takes up a similar line of criticism. While she places great importance on the insights of sociologists of science, she thinks that many of their views have developed with a false target in mind. Many sociologists of science position their views as a contrast to a traditional philosophical view of science as something which perfectly represents the world. The alternative, they suggest, is their constructivist approach, as described above. However, Knuuttila argues, this motivation runs into problems. First, when sociologists select certain practices to investigate rather than others, by what criterion are they distinguishing those practices as representational? In doing so, they seem to be relying on some traditional account of representation to delineate the cases of interest. Further, it seems that these studies do not show that representation is a defunct concept and that we are bound to a purely constructivist account of science. Instead, “these cases actually reveal…what a complicated phenomenon scientific representation is…and give us clues as to how, through the laborious art of representing, scientists are seeking and gaining new knowledge” (Knuuttila 2014, 304). We need not think that, just because there is no perfect representation of the world, there is therefore no world to be represented. Their insights could equally contribute to an intermediate view in which we reject the perfect-representation view of science but still maintain that science is giving us knowledge of the real world. That is, we can simultaneously deny that representations “are some kind of transparent imprints of reality with a single determinable relationship to their targets” while still affirming that the “artificial features of scientific representations…result from well-motivated epistemic strategies that in fact enable scientists to know more about their objects” (Knuuttila 2014, 304).

5. References and Further Reading

  • Bailer-Jones, D. (2003). When Scientific Models Represent. International Studies in the Philosophy of Science 17: 59-74.
    • Representation in models is linked to entailed propositions and pragmatics.
  • Bailer-Jones, D. (2009). Scientific Models in Philosophy of Science. Pittsburgh: University of Pittsburgh Press.
    • Extended discussion of the history of philosophy of models and defends an account of models.
  • Bartels, A. (2006). Defending the Structural Concept of Representation. Theoria 55: 7-19.
    • Homomorphism as an account of representation.
  • Brading, K. and E. Landry. (2006). Scientific Structuralism: Presentation and Representation. Philosophy of Science 73: 571-581.
    • Structuralism as a strong methodological approach in science.
  • Bueno, O. (1997). Empirical Adequacy: A Partial Structures Approach. Studies in History and Philosophy of Science 28: 585-610.
    • Representation as partial isomorphism.
  • Bueno, O. and S. French. (2011). How Theories Represent. British Journal for the Philosophy of Science 62: 857-894.
    • Representation as partial isomorphism; replies to arguments against partial isomorphism.
  • Callender, C. and J. Cohen. (2006). There Is No Special Problem About Scientific Representation. Theoria 21: 67-85.
    • Scientific representation is to be explained like other problems of representation.
  • Cartwright, N. (1983). How the Laws of Physics Lie. New York: Oxford University Press.
    • Distortion and idealization in the laws of physics.
  • Cartwright, N., T. Shomar, and M. Suárez. (1995). The Tool Box of Science: Tools for the Building of Models with a Superconductivity Example. Poznan Studies in the Philosophy of the Sciences and the Humanities 44: 137-149.
    • Models are not only theory-driven, but also phenomena-driven.
  • Chakravartty, A. (2009). Informational versus Functional Theories of Scientific Representation. Synthese 172: 197-213.
    • Argues that substantive and pragmatic accounts are complementary.
  • Contessa, Gabriele. (2007). Scientific Representation, Interpretation, and Surrogative Reasoning. Philosophy of Science 74: 48-68.
    • A substantive inferential account of representation.
  • Contessa, Gabriele. (2011). Scientific Models and Representation. In S. French and J. Saatsi (eds.) The Bloomsbury Companion to the Philosophy of Science, pp. 120-137. New York: Bloomsbury Academic.
    • A substantive inferential account of representation.
  • Coopmans, C., J. Vertesi, M. Lynch, and S. Woolgar (eds.). (2014). Representation in Scientific Practice Revisited. Cambridge: MIT Press.
    • Reexamines the question of representation in scientific practice in contemporary sociology of science.
  • da Costa, N. and S. French. (2003). Science and Partial Truth: A Unitary Approach to Models and Scientific Reasoning. Oxford: Oxford University Press.
    • Representation as partial isomorphism.
  • French, S. (2003). A Model-Theoretic Account of Representation (Or, I Don’t Know Much about Art… but I Know It Involves Isomorphism). Philosophy of Science 70: 1472-1483.
    • Representation as partial isomorphism.
  • French, S. and J. Ladyman. (1999). Reinflating the Semantic Approach. International Studies in the Philosophy of Science 13: 103-119.
    • Representation as partial isomorphism.
  • Frigg, R. (2006). Scientific Representation and the Semantic View of Theories. Theoria 55: 49-65.
    • Criticisms of the semantic approaches to representation.
  • Frisch, M. (2015). Users, Structures, and Representation. British Journal for the Philosophy of Science 66: 285-306.
    • Discusses van Fraassen’s (2008); presents and responds to a criticism.
  • Giere, R. (1988). Explaining Science: A Cognitive Approach. Chicago: University of Chicago Press.
    • A semantic account of theories and representation as similarity.
  • Giere, R. (1994). No Representation without Representation. Biology and Philosophy 9: 113-120.
    • Argues that an appeal only to practice will leave scientific representation unexplained.
  • Giere, R. (2004). How Models Are Used to Represent Reality. Philosophy of Science 71: 742-752.
    • Representation as similarity with input of agents.
  • Giere, R. (2010). An Agent-Based Conception of Models and Scientific Representation. Synthese 172: 269-281.
    • Representation as similarity with input of agents.
  • Godfrey-Smith, P. (2006). The Strategy of Model-Based Science. Biology and Philosophy 21: 725-740.
    • Discusses the use of models as being classified by a particular strategy in science.
  • Goodman, N. (1976). Languages of Art. Indianapolis: Hackett.
    • Argues for an account of representation in art which has been influential to accounts of scientific representation.
  • Hacking, I. (1983). Representing and Intervening. New York: Cambridge University Press.
    • Representation in terms of likeness, argues for the importance of intervention.
  • Hesse, M. (1966). Models and Analogies in Science. Notre Dame, Indiana: University of Notre Dame Press.
    • Argues that models are central to scientific practice.
  • Hughes, R.I.G. (1997). Models and Representation. Philosophy of Science 64: S325-S336.
    • The DDI account of representation.
  • Knuuttila, T. (2005). Models, Representation, and Mediation. Philosophy of Science 72: 1260-1271.
    • Models as epistemic tools.
  • Knuuttila, T. (2011). Modelling and Representing: An Artefactual Approach to Model-Based Representation. Studies in the History and Philosophy of Science 42: 262-271.
    • Relates representation to representationalism, and expands the notion of models as epistemic tools.
  • Knuuttila, T. (2014). Reflexivity, Representation, and the Possibility of Constructivist Realism. In M. C. Galavotti, S. Hartmann, M. Weber, W. Gonzalez, D. Dieks, and T. Uebel (eds.) New Directions in the Philosophy of Science, pp. 297-312. Dordrecht, Netherlands: Springer.
    • Criticizes sociology of science accounts and argues that they are compatible with philosophical accounts.
  • Knuuttila, T. and A. Loettgers. (in press). Modelling as Indirect Representation? The Lotka-Volterra Model Revisited. British Journal for the Philosophy of Science.
    • Focuses on the interdisciplinary, historical and empirical aspects of model construction.
  • Latour, B. (1999). Circulating Reference. In Pandora’s Hope. Cambridge: Harvard University Press.
    • Traces and discusses the steps from some phenomena to a representation.
  • Ladyman, J., O. Bueno, M. Suárez, and B. C. van Fraassen. (2011). Scientific Representation: A Long Journey from Pragmatics to Pragmatics. Metascience 20: 417-442.
    • Discussion of van Fraassen’s (2008), with replies by van Fraassen.
  • Lynch, M. and S. Woolgar (eds.). (1990). Representation in Scientific Practice. Cambridge: MIT Press.
    • Collection of essays on scientific representation by sociologists of science.
  • Morgan, M. and M. Morrison (eds.). (1999). Models as Mediators: Perspectives on Natural and Social Science. New York: Cambridge University Press.
    • Collection of essays on the uses and nature of models.
  • Suárez, M. (2003). Scientific Representation: Against Similarity and Isomorphism. International Studies in the Philosophy of Science 17: 225-244.
    • Criticisms of similarity and isomorphism.
  • Suárez, M. (2004). An Inferential Conception of Scientific Representation. Philosophy of Science 71: 767-779.
    • Inferential account of representation.
  • Suárez, M. (2010). Scientific Representation. Philosophy Compass 5: 91-101.
    • Gives a brief overview of accounts of representation.
  • Suárez, M. (2015). Deflationary Representation, Inference, and Practice. Studies in History and Philosophy of Science 49: 36-47.
    • Discusses deflationary accounts and argues that both his inferential account and the DDI account are deflationary.
  • Suppe, F. (1974). The Structure of Scientific Theories. Urbana: University of Illinois Press.
    • Criticizes the syntactic view and introduces a semantic conception of theories.
  • Teller, P. (2001). Twilight of the Perfect Model Model. Erkenntnis 55: 393-415.
    • Argues for a deflationary account of similarity as representation.
  • Toon, A. (2012). Similarity and Scientific Representation. International Studies in the Philosophy of Science 26: 241-257.
    • Explores responses to criticisms on behalf of similarity.
  • van Fraassen, B. C. (1980). The Scientific Image. New York: Oxford University Press.
    • Representation as isomorphism (among other things).
  • van Fraassen, B. C. (2008). Scientific Representation: Paradoxes of Perspective. New York: Oxford University Press.
    • Representation as isomorphism with important role for agents (among other things).
  • Weisberg, M. (2007a). Three Kinds of Idealization. Journal of Philosophy 104: 639-659.
    • Three different sorts of idealization.
  • Weisberg, M. (2007b). Who Is a Modeler? British Journal for the Philosophy of Science 58: 207-233.
    • Models are indirect representations, a strategy which is distinct from abstract direct representation.
  • Weisberg, M. (2013). Simulation and Similarity: Using Models to Understand the World. New York: Oxford University Press.
    • Representation as similarity.
  • Winther, R. (2015). The Structure of Scientific Theories. In E.N. Zalta (ed.), Stanford Encyclopedia of Philosophy. (Spring 2015 Edition). http://plato.stanford.edu/
    • Detailed discussion of accounts of the structure and representation of scientific theories with extensive bibliography.

 

Author Information

Brandon Boesch
Email: boeschb@email.sc.edu
University of South Carolina
U. S. A.

Olympe de Gouges (1748—1793)

“Woman has the right to mount the scaffold; she must equally have the right to mount the rostrum” wrote Olympe de Gouges in 1791 in the best known of her writings, The Rights of Woman (often referenced as The Declaration of the Rights of Woman and the Female Citizen), two years before she would be the third woman beheaded during France’s Reign of Terror. The only woman executed for her political writings during the French Revolution, she refused to toe the revolutionary party line in France that was calling for Louis XVI’s death (a refusal particularly evident in her pamphlet Les Trois Urnes, ou le Salut de la Patrie [The Three Ballot Boxes, or the Welfare of the Nation], 1793). Simone de Beauvoir recognizes her, in Le Deuxième sexe [The Second Sex] (1949 [1953]), as one of the few women in history who “protested against their harsh destiny.” Favorably described by commentators alternately as a stateswoman, a femme philosophe, an artist, a political analyst, and an activist, she can be considered all of these, but not without some qualification. While contradictions abound in her writings, she never wavered in her belief in the right to free speech and in its role in social and political critique.

On the death of her husband after a brief unhappy marriage, she moved to Paris, where, with the support of a wealthy admirer, she began a life’s work focused wholly on mounting the rostrum denied to women. Defying social convention in every direction and molding a life evocative of feminism of a much later age, Gouges spent her adult life advocating for victims of unjust systems, helping to create a public conversation on women’s rights and the economically disadvantaged, and attempting to bring taboo social issues to the theatrical stage and to the larger social discourse. There is disagreement on whether she participated in or even occasionally hosted some of the literary and philosophic salons of the day: her biographer Olivier Blanc suggests that she did; historian John R. Cole (2011) turns up no evidence. Still, she was recognized among the fashionable and intellectual elite in Paris, and she was well-versed in the main themes of the most influential thinkers of her day, at least for a time. Her name appears in the Almanach des addresses (a kind of social registry) from 1774 to 1784. Despite harsh criticism for using her voice in the political arena and thus challenging deeply entrenched gender norms (“having forgotten the virtues that belong to her sex,” wrote Pierre Gaspard Chaumette, then-President of the Paris Commune, warning other politically active women of Gouges’s fate), she was certainly the author of 40 plays (12 survive), two novels, and close to 70 political pamphlets.

While not a philosopher in any strict sense, she deserves attention for her morally astute analysis of women’s condition in society, for her re-imagining of the intersection of gender and political engagement, for her conception of civic virtue and her pacifist stance, and for her advocacy of selfhood for women, blacks, and children (especially in their right to know their origins). She was among the first to demand the emancipation of slaves, the rights of women (including divorce and unwed motherhood), and the protection of orphans, the poor, the unemployed, the aged and the illegitimate. She had a talent for emulating those she admired, including especially Rousseau but also Condorcet, Voltaire, and the playwright Beaumarchais.

Table of Contents

  1. Early Life
  2. Intellectual Pursuits
    1. Literary
    2. Political
    3. The Rights of Woman (1791)
    4. Philosophical
      1. Gouges and Rousseau
  3. Relevance and Legacy
  4. References and Further Reading
    1. Extant Works by Olympe de Gouges (in French)
    2. On-line English Translations of Gouges’s Original Works
    3. Secondary Sources in English (except Blanc)

1. Early Life

Details are limited. Born Marie Gouze in Montauban, France in 1748 to petite-bourgeois parents Anne Olympe Moisset Gouze, a maidservant, and her second husband, Pierre Gouze, a butcher, Marie grew up speaking Occitan (the dialect of the region). She was possibly the illegitimate daughter of Jean-Jacques Le Franc de Caix (the Marquis de Pompignan), himself a man of letters and a playwright (among whose claims to fame is an accusation of plagiarism by Voltaire). In her semi-autobiographical Mémoire de Madame de Valmont sur l’ingratitude et la cruauté de la famille de Flaucourt (Memoir of Mme de Valmont) (1788), Gouges publishes what purport to be transcriptions of letters from Pompignan, in which he takes pains to distance himself from Valmont/Gouges. These letters stop short of an unequivocal denial of his paternity.

She was married at 16 or 17, in 1765, unhappily, to Louis Aubrey (an associate of Pierre’s), with whom she had a son, also named Pierre, and by the age of 18 or 19 she was widowed. Denouncing marriage on the strength of this experience and disguising her widowhood (which would have given her a modicum of social and legal status), she adopted her mother’s middle name and the more aristocratic-sounding “de Gouges” and moved to Paris. Literate (schooled likely by Ursuline nuns in Montauban) but not particularly well-read, she spent the next decade informing herself on intellectual and political matters and integrating into Parisian society, supported by Jacques Biétrix de Roziéres, a wealthy weapons merchant, whom she may have met in Montauban shortly after the death of her husband or through her married sister, Jeanne Raynart. Biétrix secured her circumstances until the decline in his family’s resources in 1788. The year of her first published work is 1778, and it marks the end of her first decade in Paris.

She began to write in earnest around 1784. A literary compatriot and admirer of hers was Louis-Sébastien Mercier (1740-1814), with whom she shared many political views, including clemency for the King and a general abhorrence of violence. Mercier helped her navigate the tricky internal politics of the Comédie Française—the prestigious national theater of France—assisting her in publishing several of her plays and in staging a handful. Charlotte Jeanne Béraud de la Haye de Riou, Marchioness of Montesson (1739-1806), wife of the Duke of Orleans, a playwright herself and a woman of much influence and wealth, was among the other friends who came to her aid.

With little formal education and as a woman boldly unconventional, she found, once she began her life of letters, that her detractors were eager to find fault. She was often accused of being illiterate, yet her familiarity with Molière, Paine, Diderot, Rousseau, Voltaire, and many others, the breadth of her interests, and the speed with which she replied to published criticism all attest to the unlikelihood of the accusation. As French was not her native tongue and since her circumstances permitted, she maintained secretaries for most of her literary career.

2. Intellectual Pursuits

a. Literary

All of her plays and novels carry the theme of her life’s work: indignation at injustice. Her literary pursuits began with playwriting. Gouges wrote as many as 40 plays, as inventoried at her arrest. Twelve of those plays survive, and four found the requisite influential, wealthy, mostly male backing needed for their staging. Ten were published. While many of the plays by the dozen or so women playwrights staged at the Comédie Française were published anonymously or under male pseudonyms, those playwrights who were successful on stage in their own names (most notably Julie Candeille) stuck to themes seen as suitable to their gender. Gouges broke with this tradition—publishing under her own name and pushing the boundaries of what was deemed appropriate subject matter for women playwrights—and withstood the consequences. Reviews of her early productions were mixed—some fairly favorable, others patronizing and condescending or skeptical of her authorship. Those of her plays read by the Comédie Française were often ridiculed by the actors themselves. Her later plays, more strongly political and controversial, were met with outright sarcasm and hostility by some reviewers: “[t]o write a good play, one needs a beard,” wrote one critic.

Her first play, L’Homme Généreux [The Generous Man], written in 1785 and published the following year though never produced for the stage, explored the political powerlessness of women through the representation of a socially privileged man’s struggle with sexual desire. The play also shines a light on the injustice of imprisonment for debt. Le Mariage Inattendu de Chérubin [Cherubin’s Unexpected Marriage] (1786), one of the several homages of the time written after Beaumarchais’ critically acclaimed Le Mariage de Figaro [The Marriage of Figaro] (1784), is a sequel to Figaro. Intent to rape is a theme in this play as it was in L’Homme Généreux; a privileged husband’s misplaced lust brings damage to the family, while the suffering of the victim is given significant attention. Gouges’s first staged production was originally titled Zamore et Mirza; ou L’Heureux Naufrage [Zamore and Mirza; or, The Happy Shipwreck] (1788). Written in 1784 and later revised, it was finally performed in 1789 under the title L’Esclavage des Nègres, ou l’Heureux naufrage [Black Slavery; or, the Happy Shipwreck]. Accepted by the Comédie Française when submitted anonymously in 1785, it was then shelved for four years once the identity (and gender) of the playwright was confirmed. Winning praise from abolitionist groups, it was the first French play to focus on the inhumanity of slavery; it is, not surprisingly, also the first to feature the first-person perspective of the slave. It saw three performances before it was shut down by sabotaging actors and protests organized by enraged French colonists who, deeply reliant on the slave trade, hired hecklers to wreak havoc on the production. Gouges fought back through the press, through her social and literary connections, and through the National Assembly. Understanding that her gender was connected to her lack of success, she called for a second national theatre dedicated solely to women’s productions, and for reforms within the Comédie Française itself.

Le Philosophe corrigé, ou le cocu supposé [The Philosopher Chastised, or the Supposed Cuckold] (1787) represents women, despite expectations, as capable of agency around their own sexual desire, and of uniting and supporting each other, giving voice to the phenomenon that Simone de Beauvoir would address much later in The Second Sex with the observation that “women do not say ‘we’.” The play also depicts a male, the titular husband, as capable of acquiring moral knowledge through an evaluation of emotional response. Sympathy for his inexperienced wife and, later, an innocent baby gives him insights he uses for moral reflection, a theme found in David Hume (1711-1776), Josiah Royce (1855-1916), and much modern feminist ethics. When French theatre in the decade of the Revolution turned to lighter vaudevillian fare, Gouges tried her hand at light comedy: La Bienfaisance récompensée ou la vertu couronnée [Beneficence Rewarded, or Virtue Crowned] (1788), a one-act comedy, portrayed the then-current Duc d’Orléans, an anti-royalist, as a doer of good.

But most of Gouges’s playwriting was dedicated to crafting principally dramatic—if melodramatic by today’s standards—pieces responding to the issues of the day. And she had a unique voice on many matters. Molière chez Ninon ou le siècle des grands hommes [Molière at Ninon’s, or the Century of Great Men] (1788), for instance, challenges the double standard between the sexes by depicting the famous, fiercely independent, literary courtesan Ninon de Lenclos (1620-1705) as a noble person positively influencing the male intellectuals in her circle, including Molière, all of whom are present to honor a visit by another notable intellectual, Queen Christina of Sweden (1626-1689).

Playwriting for Gouges was a political activity. In addition to slavery, she highlighted divorce, the marriageability of priests and nuns, girls forcibly sent to convents, the scandal of imprisonment for debt, and the sexual double standard as social issues, returning to some repeatedly. Such activism was not unheard-of on the stage, but Gouges carried it to new heights. By 1790, her writings had become more explicitly political. She had three plays published: Les Démocrates et les Aristocrates; ou le Curieux du Champ de Mars [The Democrats and the Aristocrats], a satire of political extremists on both sides; La Nécessité du Divorce [The Necessity of Divorce], again illustrating the powerlessness of women trapped in marriage, and written simultaneously with a debate on the topic in the National Assembly (France would be the first Western country to legalize divorce two years later); and Le Couvent ou les Vœux Forcés [The Convent, or Vows Compelled]. The Convent was her second play to see the light of day, and her greatest success. In the year of its publication it saw approximately 80 performances. Highlighting the political impotence of women, it illuminated the injustice of the Church’s complicity in male relatives’ right to force females into convents against their will.

As she gained momentum on the political front, her next play to be produced for the stage was Mirabeau aux Champs-Elysées [Mirabeau in the Elysian Fields] (1791). Depicting the Enlightenment philosophers the Baron de Montesquieu (1689-1755), Voltaire (1694-1778), and Rousseau, along with Benjamin Franklin (1706-1790), Madame Deshoulières (1638-1694), the Marquise de Sévigné (1626-1696), and Ninon de Lenclos—the latter three major female influences in France during the Enlightenment—the play awakens numerous notable historical figures to welcome Mirabeau (1749-1791) as a hero to the afterlife, in effect honoring his stance as a supporter of constitutional monarchy. Most notable, perhaps, is the appearance of the three women as worthy of a place of honor and a voice, platforms they use to assert, among other things, that the success of the Revolution pivots on the inclusion of women. (This is also the year Gouges wrote The Rights of Woman, discussed separately below.)

In 1792, one year before her trial and execution, she worked on two plays: the unfinished La France Sauvée, ou le Tyran détrôné [France Preserved, or the Tyrant Dethroned] and the completed L’Entrée de Dumourier [sic] à Bruxelles [Dumouriez’s Entry into Brussels]. The former, confiscated at her arrest, was used as proof of sedition at her trial because of its sympathetic depiction of Marie-Antoinette, even as Gouges used it to demonstrate support for her own case. The latter depicted General Dumouriez’s defense of the Revolution against foreign anti-revolutionary forces, assisted by male and female warriors, and challenged her own privileging of aristocracy by suggesting that commoners were the true nobles. It was also the fourth and final play to be staged during Gouges’s lifetime.

The publication of Memoir of Madame de Valmont (1788) ironically begins, rather than summarizes, her political career. This fictionalized self-examination grappled with idealized father figures and fragmented selves, and served to package and compartmentalize her pre-Parisian life and move her forward wholly into a literary existence. Using a version of her (Gouges’s) personal story to make a political point, the narrator of the story sees clearly how gender works to constrain women. Mme de Valmont’s father’s refusal to acknowledge paternity raises issues of legitimacy for Valmont with financial and social repercussions. While he also expressed no patience for women’s forays into traditional male arenas such as writing, the narrator’s solution is to call for women to stop undermining other women and work to support each other. Rejection of the symbolic paternal voice of the culture has political power, and the Memoir presents an 18th century illustration of making the personal political—a vivid theme in 20th century feminism.

Gouges’s second novel, Le Prince Philosophe (The Philosopher Prince), also written in 1788 but not published until 1792, reconceives monarchical rule, positing the best kind of ruler as one who would prefer not to rule, and proposing that all rule must be founded on the obligation not to take life. Scholars are mixed on whether she maintained her monarchist stance throughout her life. Her literary output and her pamphleteering often suggest some version of a monarchy as her default position. However, Azoulay (2009) suggests that, at least in Gouges’s later writings (and perhaps in this second novel as well), her supposed attention to the monarchy is rather an attention to the preservation of the state and to the injustice of taking a life—namely, the King’s. Gouges’s depiction in this novel of the cause of the friction between the sexes offers commentary on gender relations that appears more conservative than her later positions. The male characters still hesitate to share the reins with women. Yet women are encouraged to develop reason rather than charm, and one female character submits a carefully drawn-up plan advocating education and job opportunities for females, eventually winning a small victory by receiving permission to run a women’s academy.

Gabrielle Verdier (1994) notes several distinguishing features of Gouges’s plays: (1) young women have active roles to play, (2) women of any age have agency, (3) female rivalry is absent, (4) mature women are protectors, benefactors, and mentors, and (5) the abuses that women experience are inevitably tied to larger social injustices. While not prepared to offer up a fully formed theory of oppression, Gouges is readying the space where that work can be done.

In all of her writings, both literary and political, one finds an unflinching self-confidence and a desire for justice. What Lisa Beckstrand (2009) refers to as the “theme of the global family” also runs through Gouges’s literary work. Familial obligations dominate and are responses to the inadequacies of the state. The plights of the illegitimate child, the unmarried mother, the poor, the commoner (at least by 1792), the orphan, the unemployed, the slave, even the King when he is most vulnerable, are all brought to light, with family connection and sympathy for the most disadvantaged as the pivotal plot points. Women characters regularly displace men at center stage. It is women, unified with each other and winning the recognition of men, who most characterize what Gouges conveys in her work. She is the first to bring several taboo issues to the stage, divorce and slavery among them. Especially prolific in the four years from 1789 to her death in 1793, she displayed a political passion, labeled conservative by many, that is still astonishing for its persistence in a culture too often working strenuously to stifle women’s voices.

b. Political

The impact of Olympe de Gouges’s political activism is commemorated by her inclusion as the only French woman on revolutionary and abolitionist Abbé Henri Grégoire’s (1750-1831) list of “all those men [sic] who have had the courage to plead the cause” of abolition. The list is included in the introduction to his On the Cultural Achievements of Negroes (1808), which was written as a counter to Thomas Jefferson’s less admiring look at race in Notes on the State of Virginia (1782). As with so much that came to prominence with modern feminism, indignation at injustice must have started for Gouges with her own marginalization as a woman, but it shifted to the external world with a recognition of the inhumanity of slavery. While she was not an immediatist like some in the next generation of abolitionists, such as William Lloyd Garrison (1805-1879) in the U.S., abolitionist thought permeated her understanding of the world and of herself as a writer, and soon grounded her thinking on women. The French Revolution itself transformed Gouges’s thinking further when the rights of citizens, despite pleas from the Girondins, were not applied to the female citizen. In fact, female political participation of all kinds was formally banned by the French National Convention in 1793, after one of several uprisings led by women. Gouges’s political tenacity showed itself most virulently in prison, where she “mounted the rostrum” at least two final times, smuggling out pamphlets that condemned prison conditions and that accepted—indeed recklessly demanded—responsibility for her ideas, challenging how the rights of freedom of speech were embodied in the new Constitution.

Her path from social nonconformist, to political activist and reformer, to martyr was one untrodden by women. Many of the most influential eighteenth-century intellectuals—with a few exceptions—were convinced that women did not have the intellectual capacity for politics. Gouges challenged that perception, while also problematizing it by writing hastily, sometimes dictating straight to the printer. While her uneven education opened her to ridicule, it gave her a critical affinity for Rousseauean ideas, as will be discussed below. Both her playwriting and her productivity as a pamphleteer gained her a celebrity which she used as a podium for her advocacy of the marginalized, and for drawing attention to the importance of the preservation of the state.

Gouges’s formal petitions to the National Assembly, and her public calls for governmental and social reform through the press and through her pamphleteering (common in France), went far beyond The Rights of Woman of 1791 (see next section) and her stance on slavery. Among her many calls for change in this medium were: (as mentioned above) a demand for a national theatre dedicated to the works of women; a voluntary tax system (one of the few demands she published anonymously, and one which saw implementation the following year); state-sponsored working-groups for the unemployed; social services for widows, the elderly, and orphans; civil rights for illegitimate children and unmarried mothers; suppression of the dowry system; regulation of prostitution; sanitation; rights of divorce; rights to marriage for priests and nuns; people’s juries for criminal trials; and the abolition of the death penalty. She petitioned the National Assembly on a number of occasions on these and other matters. Whether or not related to her efforts, the National Assembly did pass laws in 1792 giving illegitimate children some of the civil rights for which she fought and granting women the right to divorce, even while women remained legal nonentities overall.

She is frequently touted as a (sometimes the) founder of modern feminism for her unrelenting advocacy for women’s rights in writing and in action. While the historical record is more complicated than that, in historian John R. Cole’s accounting (2011), “she published on current affairs and public policy more often and more boldly than any other woman. . . and [s]he made a more formal and sweeping demand for the extension of full civil and political rights to women than any prior person, male or female, French or foreign” (231). Her call for women to identify as women and band together in support of each other can also be considered a contribution to revolutionary thought and to the concept of citizenship, and it remains today an important focus for modern feminism. Others in France were also proposing feminist ideas, although none as actively and comprehensively as Gouges: most notably, François Poulain de la Barre (1647-1725) in the 17th century and Marie Madeleine Jodin (1741-90), Louise de Keralio-Robert (1758?-1822), Nicolas de Condorcet (1743-1794), and Etta Palm d’Aelders (Dutch, 1732-1799) in the 18th century. The latter two petitioned the National Assembly unsuccessfully in 1790 to ensure legal rights for women.

Gouges’s writings share many themes with what would become classic texts in the feminist movement of the 20th century: independence of mind and body, access to political rights and political voice, education, and elimination of the sexual double standard. Her awareness of these themes springs from her experience as a woman, solidified by her unhappy early marriage and her unapologetic and ostensibly scandalous first years in Paris, through to her sometimes-thwarted, oft-derided attempts at participation in cultural, literary, and political realms. She experienced firsthand how the rights of the citizen were denied to women. Her early history, her frustration at being denied a voice in the public sphere or dismissed when she claimed one, and the ridicule, aimed at her gender, that she withstood gave shape to insights emblematic of much later feminist theory and concretized for her an understanding of the link between the public and private realms. Gouges contributed markedly to the depth and breadth of the discourse on women’s rights in late 18th century France, and on the plight of the underprivileged in general. As Cole summarizes: “she tried to rally other women behind a radical extension of liberty and equality into domestic relationships . . . and she [advocated for] the extension of rights to free persons of color and free blacks and to vindicate the full humanity of slaves in the Caribbean colonies” (231).

Perhaps most indicative of Gouges’s political courage and intellectual self-reliance was the stance which led to her death. Her decision to continue to publish works deemed seditious even as the danger of arrest grew shows courage and commitment to her advocacy of the less fortunate and exemplifies her self-definition as a political activist. She was the only woman executed for sedition during the Reign of Terror (1793-1794). That fact, along with comments such as those of Pierre Chaumette: “[r]emember the shameless Olympe de Gouges, . . . who abandoned the cares of her household to involve herself in the republic, and whose head fell under the avenging blade of the laws. Is it for women to make motions? Is it for women to put themselves at the head of our armies?” (Andress, 234), suggests that her outspoken views were greeted with increased hostility because of her gender. In May or June of 1793 her poster The Three Urns [or Ballot Boxes] appeared, calling for a referendum to let the people decide the form the new government should take. Proposing a choice among three forms of government (republic, federation, or constitutional monarchy), the essay was interpreted as a defense of the monarchy and used as justification for her arrest in September. Her continuing preference for a constitutional monarchy was likely propelled in part by her disappointment with the Revolution, but more specifically by her opposition to the death penalty and her general humanitarian inclinations. She appears to have had no elemental dispute with monarchy per se, problematizing any philosophical understanding of her commitment to human rights. The Rights of Woman, for instance, is dedicated to the Queen—as a woman, but presumably because she is the Queen.

c. The Rights of Woman (1791)

By far her most well-known and distinctly feminist work, The Rights of Woman (1791) was written as a response to The Declaration of the Rights of Man and of the Citizen, written in 1789 but officially the preamble to the French Constitution as of September 1791. Despite women’s participation in the Revolution and regardless of sympathies within the National Assembly, that document was the death knell for any hopes of inclusion of women’s rights under the “Rights of Man.” The patriarchal understanding of female virtue and sexual difference held sway, supported by Rousseau’s perspective on gender relations, perpetuating the view that women’s nurturing abilities and responsibilities negated political participation. Political passivity was itself seen as a feminine responsibility.

The Rights of Woman appeared originally as a pamphlet printed with five parts: 1) the dedication to the Queen, 2) a preamble addressed to “Man,” 3) the Articles of the Declaration, 4) a baffling description of a disagreement about a fare between herself and a cab driver, and 5) a critique of the marriage contract, modeled on Rousseau’s Social Contract. It most often appears (at least in English translation) without the fourth part (Cole, 2011). Forceful and sarcastic in tone and militant in spirit, its third section takes up each of the seventeen Articles of the Preamble to the French Constitution in turn and highlights the glaring omission of the female citizen within each article. Meant to be a document ensuring universal rights, the Declaration of the Rights of Man and the Citizen is exposed thereby as anything but. The immediacy of the implications of the Revolution finally fully awakened Gouges to the ramifications of being denied equal rights, but her entire oeuvre had been aiming in this direction. Gouges wrote a document that highlights her personal contradictions (most obviously, her own monarchist leanings as they hinder full autonomy), while bringing piercing illumination to contradictions in the French Constitution. Despite the lack of attention Gouges’s pamphlet received at the time, her greatest contribution to modern political discourse is her highlighting of the inadequacy of attempts at universality during the Enlightenment. The demands contained within the original document assert the universality of “Man” while denying the specificity required for “Woman,” and the document therefore collapses, at least logically, under its own efforts. Alert to the powerlessness of women and the injustice such a condition implies, Article 4 of The Rights of Woman, for instance, particularly calls for protection from tyranny, as “liberty and justice” demand; that is, as nature and reason demand in personal as well as political terms. To harmonize this document with her devotion to the monarchy for most of her political career takes significant effort.

For Gouges, the most important expression of liberty was the right to free speech; she had been exercising that right for almost a decade. Access to the rostrum required more than an early version of “add women and stir.” While Gouges’s Rights is rife with such pluralizing—extending any right of Man to Woman as well—there is also a clear acknowledgement that blind application of universal principles is insufficient for the pursuit of equality. Article XI, for example, demands the right of women to name the father of their children. The peculiarity of the need for this right on the part of women stands out because of its specificity and demonstrates the contradictions created by blindness to gender. The citizen of the French Revolution—the idealistic universal—is the free white adult male, leaving in his wake many injustices peculiar to individuals excluded from that “universal.” The Rights of Woman unapologetically highlights that problem.

The Enlightenment presumption of the “natural rights” of the citizen (as in “inalienable rights” in the U.S. Declaration of Independence) is in direct contradiction to the equally firmly-held belief in natural sexual differences—both of which are so-called “founding principles of nature.” While Gouges is not fully aware of the implications of this conflict, she holds unequivocally in The Rights of Woman that those natural rights do indeed grant equality to all, just as the French Declaration states but does not intend. The rights such equality implies need to be recognized as having a more far-reaching application; if rights are natural and if these rights are somehow inherent in bodies, then all bodies are deserving of such rights, regardless of any particularities, like gender or color.

Marriage, as the center for political exploitation, is thoroughly lambasted in the postamble, Part 5, to The Rights of Woman. Gouges describes marriage as the “tomb of trust and love,” and the place of “perpetual tyranny.” The primary site of institutionalized inequality, marriage creates the conditions for the development of women’s unreliability and capacity for deception. Just as Mary Wollstonecraft (1759-1797) does in A Vindication of the Rights of Woman (1792), Gouges points to female artifice and weakness as a consequence of woman’s powerless place in this legalized sexual union. Gouges, much like Wollstonecraft, attempts to combat societal deficiencies: the vicious circle which neglects the education of its females and then offers their narrower interests as the reason for the refusal of full citizenship. Both, however, see the resulting fact of women’s corruption and weak-mindedness as a major source of the problems of society, but herein also lies the solution. Borrowing from Rousseau, Gouges proposes a “social contract” as replacement for traditional marriage, reformulating his social contract with a focus that obliterates his gendered conception of citizen, and creates the conditions for both parties to flourish. In the “Form for a Social Contract Between Man and Woman,” Gouges offers a kind of civil union based on equality, which will create the “moral means of achieving the perfection of a happy government!” The state is a (reconceived) marriage writ large, for Gouges. What ails government are fixed social hierarchies impossible to maintain. What heals a government is an equal balance of powers and a shared virtue (consistent with her continuing approval of a constitutional monarchy). Marriages are to be voluntary unions by equal rights-bearing partners who hold property and children mutually and dispose of same by agreement. All children produced during this union have the right to their mother’s and father’s name, “from whatever bed they come.”

Given that she remained a monarchist almost to the end, her authorship of this document and her lack of formal education suggest that Gouges possessed less than full comprehension of what we now view as the discourse on universal human rights. That said, the production of this document has influenced exactly that conversation, and thus her presence in the list of historical figures who matter philosophically has to be acknowledged. Even Wollstonecraft, in her Vindication of 1792, does not call for the complete reinvention of women as political selves, as does Gouges in 1791.

d. Philosophical

As with most Enlightenment thinking, a natural rights tradition—although not any kind of comprehensive theory of natural rights—can be found in Gouges’s views on the origins of citizenship and rights for women and blacks. By 1791, she is arguing that equality is natural; it “had only to be recognized.” It appears that Gouges did not see any contradiction in her royalist leanings; nevertheless, she may no longer have been a monarchist by this point. She held that the human mind has no sex, an idea traceable in the modern era as far back as Poulain de la Barre (1673); men and women are equally human, therefore capable of the same thoughts. While her lack of education precluded the use of any systematic methodology, the consistency of her advocacy for the powerless, of pacifism, and (eventually) for the universal application of moral and legal rights is of great merit, and remains, if not based in rigorous philosophical analysis, yet philosophically astute. Her writings, both literary and political, point in directions contemporary feminist philosophy traversed for much of the twentieth century and beyond. She presciently foreshadows the “masculine universal” of liberal democracy identified by much contemporary feminist thought. She rejected the perceptions of sexual difference used to drive women out of the political arena, while she advocated for women’s “special interests.” Echoing Plato’s attention to gender in The Republic, she saw natural differences between genders, but not of a kind relevant to the tasks of the citizens of the state. Despite appearing after the French constitution had been decreed and constitutionally frozen, Gouges’s The Rights of Woman was aimed at expanding, even supplanting, the official French Declaration. Focusing on women as human and thus equal, but with pregnancy and motherhood as special differences, Gouges seemed comfortable with the resulting conceptual dissonance.

On the heels of The Rights of Woman, she published The Philosopher Prince (1792), a novel where ideas in the realm of political philosophy (perhaps influenced by the historical events of the previous three years) are most on display. With the marriage contract from The Rights of Woman as a template, she unpacks reasons for the lack of solidarity between the sexes; she depicts women living in a mythical society where education becomes a requirement for civic virtue; access to reason is necessary so that women grow up equal to men and engaged in public life. Azoulay (2009) gives the most scholarly attention to date to this novel, arguing that it provides evidence that Gouges was a monarchist only insofar as monarchy was the best means to preserve the nation. “Gouges did not seek the preservation of the monarchy but rather the revival of the kingdom” (43).

Earlier, in 1789, the pamphlet Le Bonheur primitif de l’homme [The Original Happiness of Man] also gives hints of a political philosophy. Gouges imagines a society where women are granted an education and encouraged in the development of their agency. Individual happiness depends on collective happiness, but collective happiness comes from the natural qualities found within families (Beckstrand’s “theme of the global family”). While agreeing with Rousseau that civilization corrupts, she parts ways with him on the education of females. While she is never directly critical of Rousseau, the implication here is that the corruptibility of civilization can be countered only if we raise “Sophie” within a nurturing egalitarian family with as much freedom and natural exploration as Rousseau proposes for “Emile.” The happiness of all requires, among many other things, that “[g]irls will go to the fields and guard the animals” (tr. Harth, 1992, 222). This pamphlet also contains her call for a national theatre for women.

We find fragments of a larger philosophical perspective wherever we look. The female as subject rather than object, especially in political discourse, is among the most important and prevalent. Her understanding of the value of her own voice creates an understanding of self that challenges gender norms head on, withstands all public criticism, and refuses to collapse under the weight of taboo. A nascent moral philosophy can be unearthed by considering her lifelong attention to the plight of the disadvantaged. An early example is the postface to the second printing of her first play, written prior to its first staging, but after a long battle with the Comédie-Française to have it staged. Réflexions sur les hommes nègres [Reflections on Black Men] raises questions about personhood and race. Gouges identifies race as a social construct insofar as slavery condemns blacks to being bought and sold “like cows at market.” She is horrified at what privileged men will do in the name of profit. “It is only color” that differentiates the African from the European. And, difference in color is simply the beauty of nature. If “they are animals, are we not like they?” There is explicit criticism here of a binary way of seeing the world.

An atheist, she critiques religion—particularly Catholicism—by focusing on its oppressiveness, especially towards women. Religion should not prohibit one from listening to reason or encourage one to be “deaf to nature.” The celibacy of priests and nuns lays the ground for corruption and plays a role in religion’s oppressiveness as well, she proclaimed. Throughout her writings, respect for the individual appears more vividly than Enlightenment philosophers generally could conceive, grounds her pacifism, inspires her attention to children, and underscores her political vision. And, in part, through her reverence for Rousseau, she sees problems with the separation, both devastating in its implications in practice and invigorating in its theoretical possibilities, between the private and the public spheres.

i. Gouges and Rousseau

The writings of Jean-Jacques Rousseau (1712-1778) were a major influence on the French Revolution, as was the then-recent success of the American Revolution (1776). Among Gouges’s lost plays is one titled Les Rêveries de Rousseau, la Mort de Jean-Jacques à Ermenonville [Reveries of Rousseau, the Death of Jean-Jacques at Ermenonville] (1791)—she was an ardent admirer, calling him her “spiritual father”. While it is clear Rousseau’s philosophy as a systematizable whole was ignored by the revolutionaries, his idea that the power of the government should come from “the consent of the governed” inspired the overthrow of an absolutist monarchy. His advocacy of rule by the general will helped inspire the French to shed their monarchist allegiances and take to the streets. His theory of education for boys promoted non-interference and encouraged conditions that would allow nature to take its course. This education, held to be the most direct route to virtue and to the best kind of self, was aimed at the man a boy like Emile would become, and it was taken seriously by men and women alike in France and beyond.

Gouges described herself as a “pupil of pure nature,” embracing a Rousseauian perspective on education while imposing on it her own perspective on gender. The education Rousseau proposed for girls was mind-numbingly stifling; they were to be raised to understand they were “made for man’s delight.”

“They must be subject, all their lives, to the most constant and severe restraint, which is that of decorum: it is, therefore, necessary to accustom them early to such confinement, that it may not afterwards cost them too dear; and to the suppression of their caprices, that they may the more readily submit to the will of others” (Emile, or On Education, V:1297).

Rousseau claimed that virtues for the male arose most purely when the individual was not constrained by civilization. He proposed that boys turn out best when left to themselves. He advocated freedom for Emile. But when it came to females, a system of cultural constraints was necessary in order to ensure the properly compliant nature for a companion for Emile. “[A]mong men, opinion is the tomb of virtue; among women it is the throne” (Emile, V:1278). Universal principles and the “masculine virtues” applied only to those dominant men. In terms of gender, Rousseau was influencing the Revolution in precisely the way that Gouges found fault with.

Gouges agreed with Rousseau’s understanding of how education of the citizenry could transform society. Yet, seeing well beyond Rousseau in terms of gender, she proclaimed in The Rights of Woman that the failure of society to educate its women was the “sole cause” of the corruption of government. Gouges, in a number of documents, as has been noted, anticipated contemporary feminist philosophy’s claim that modern liberal democracy operated with a deficient notion of the universal because it lacked inclusiveness. Historically, woman is seen as the complementary and contrasting counterbalance to man; if man is a political animal, woman is a domestic one. While the prominent French revolutionaries worked to apply Rousseauian themes to, among other things, the exclusion of women from political participation, Gouges fought to raise awareness of what is largely the misapplication of Rousseau’s social critique while never naming Rousseau. According to her, if social systems are human-made and they tend to cause the evils of the world because they interfere with nature, then just as it is for males, so must it be for females. Despite his intentions, Rousseau’s egalitarian vision had immense feminist implications. Men’s tyranny over women, for Gouges, is clearly contrary to nature, not in sync with it. Her use of “social contract” in the postscript to the Declaration is a direct appropriation of Rousseau. Her social contract proclaims that equal property, parental, and inheritance rights in marriage are the only way to build a society of “perfect harmony.” As with Beauvoir a century and a half later, Gouges calls on women to take responsibility for their condition and to demand equality. She conceived of it in the masculine, and applied it to herself. She accepted Rousseau’s understanding of “nature” and his discourse on rights. She wrote The Original Happiness of Man in 1789 as a demonstration of her debt to Rousseau. There, she acknowledges her lack of formal education (as she often did in her writings), suggesting that she could see some things more clearly because of that deficit (she is “at once placed and displaced in this enlightened age . . .”). Her freedom comes from her lack of constraint, originating, in part, in her lack of formal education. The artificial constraints she encounters are unjustly thrust upon her by her society.

Rousseau’s condemnation of class distinctions also spoke directly to Gouges’s experience. But he did not question the right of the sovereign over the governed, and Gouges, despite her monarchism, does at times do so with vigor. (When that sovereign is in one’s own household, there is all the more need for vigor.) For instance, in her pamphlet containing a proposal for a female national guard (Sera-t-il Roi, ne le sera-t-il pas? [Will he, or will he not, be King?], 1791), she criticizes a sovereign who would ask the citizenry to go to war, suggesting that such a request contradicts the very essence of citizenship. Being a citizen requires one to honor one’s relationship to the state, not to deliberately put that relationship in jeopardy. At the center of an understanding of political life should be a commitment not to take life—that is, to preserve the polis as a whole.

3. Relevance and Legacy

In addition to the political activism just mentioned, Gouges foreshadowed Henry David Thoreau (1817-1862), Mahatma Gandhi (1869-1948), and Martin Luther King, Jr. (1929-1968) by calling for disobedience to obviously unjust laws. Her argument for protections for the deposed French king comes, not so much from her royalist tendencies, but from her understanding of the “global family” and from her pacifism, as well as from her understanding of the separability of sovereign power from the individual who inhabits that power. Once the sovereignty is removed, the individual, she believed, was no longer synonymous with that figurehead. Putting to death the man who held that title but has since relinquished it is a miscarriage of justice.

Through the several articles of the third part of The Rights of Woman she helped to problematize the notion of universalizability as a moral good, and she helped to formulate and to popularize the idea that the word ‘man,’ when used generically, may be problematic.

Gouges critiqued the principle of equality touted in France because it gave no attention to whom it left out, and she worked to claim the rightful place of women and slaves within its protection. She moved the Querelle des Femmes (“the woman question”)—which had its origin in France in the Middle Ages and pivoted around what role(s) women should rightly play in society—out of the abstract and into the political arena. She moved the discussion of slavery from an abstract distant one (an issue for the colonies only) literally to center stage and specifically highlighted the moral irrelevance of color. The “color of man is nuanced,” she wrote; and she questioned how, if blonds are not superior to brunettes and mulattos not superior to Negroes, whites can be any different. Color cannot be a criterion for dehumanization. Her challenge to traditional binaries wherever she found them may be the culminating arc of her work and where we can find our greatest debt to her. Her fictional characters all strain against the straitjackets of their identities: strong women vie for their independence in conversation with sympathetic men rather than pitting themselves against each other in rivalry over men; men and women bond together to right some significant wrong; women seek strength in other women and unify to accomplish morally worthy goals; men often relinquish their arbitrary right to power over women to work in tandem to accomplish just goals. Men are depicted applauding women’s success. Her political pamphlets demonstrate her commitment to an overhaul of society. Revolutions, she insisted, could not succeed without the inclusion of women. And, since blacks demonstrated their humanity with every step and every breath in her plays, their enslavement was an indictment of French society.

Though Gouges was silenced for a time by history, scholarship on her has increased steadily since the publication of Olivier Blanc’s first biography in 1981 (no English translation; supplanted by his 2003 publication). A singular figure in the French Revolution and a founding influence on the direction of women’s and human rights, Gouges is an important historical figure for her resistance to gendered social norms and her insistence on the revolutionary nature of the application of women’s rights. If not herself a philosopher, she had the stamina and the intellect to shape ideas that have been and continue to be philosophically relevant and valuable. Her ability to attain status and power and the public rostrum despite her background and her gender is astonishing. Her refusal to be silent in the face of injustices, both personal and social, contains the roots of her legacy.

4. References and Further Reading

a. Extant Works by Olympe de Gouges (in French)

  • Théâtre politique I, preface by Gisela Thiele-Knobloch (Paris: Côté-femmes éditions, 1992)–includes Le Couvent, Mirabeau aux Champs-Elysées, and L’Entrée de Dumourier à Bruxelles.
  • Oeuvres complètes: Théâtre, Félix-Marcel Castan, ed. Montauban: Cocagne, 1993 (comprises the twelve extant plays, including two in manuscript).
  • Théâtre politique II, preface by Gisela Thiele-Knobloch (Paris: Côté-femmes éditions, 1993)—comprises L’Homme généreux, Les Démocrates et les aristocrates, La Nécessité du divorce, La France sauvée ou le tyran détrôné, and Le Prélat d’autrefois, ou Sophie et Saint-Elme.
  • Ecrits politiques, 1788-1791, volume 1, preface by Olivier Blanc (Paris: Côté-femmes éditions, 1993).
  • Action héroïque d’une françoise, ou La France sauvée par les femmes. Paris: Guillaume Junior, 1789.
  • L’entrée de Dumouriez à Bruxelles, ou Les vivandiers. gallica.bnf.fr.
  • Ecrits politiques, 1792-1793, volume 2, preface by Blanc (Paris: Côté-femmes éditions, 1993).
  • Mémoire de Madame de Valmont, 1788, roman (Paris: Côté-femmes éditions, 1995).
  • La France sauvée, ou Le tyran détrôné. gallica.bnf.fr.
  • Mon dernier mot à mes chers amis. gallica.bnf.fr.
  • Oeuvres, ed. Benoîte Groult. Paris: Mercure de France, 1986.
  • Repentir de Madame de Gouges. 1791. gallica.bnf.fr.
  • Les fantômes de l’opinion publique, 1791? gallica.bnf.fr.
  • Pour sauver la patrie, il faut respecter les trois ordres : c’est le seul moyen de conciliation qui nous reste. 1789. gallica.bnf.fr.
  • Le cri du sage, par une femme. 1789? gallica.bnf.fr.
  • Dialogue allégorique entre la France et la vérité, dédié aux états généraux. 1789. gallica.bnf.fr.

b. On-line English Translations of Gouges’s Original Works

  • On-going translation project by Clarissa Palmer:  Available at www.olympedegouges.eu.
  • The Rights of Woman (1791) [titled here Declaration of the Rights of Women and the Female Citizen, 1791]: www.fordham.edu/halsall/mod/1791degouge1.asp.
  • Transcript of her trial:  chnm.gmu.edu/revolution/d/488/.
  • “Reflections on Negroes” (trans. Sylvie Molta) www.uga.edu/slavery/texts/literary_works/reflections.pdf.
  • “Response to the American Champion” (trans. Maryann DeJulio) www.uga.edu/slavery/texts/literary_works/reponseenglish.pdf.
  • Additional material: www.uga.edu/slavery/texts/other_works.htm#1789.

c. Secondary Sources in English (except Blanc)

  • Andress, David. The Terror: the Merciless War for Freedom in Revolutionary France. New York: Farrar, Straus and Giroux, 2006.
  • Azoulay, Ariella. “The Absent Philosopher-Prince: Thinking Political Philosophy with Olympe de Gouges.” Radical Philosophy 158 (Nov/Dec 2009).
  • Beckstrand, Lisa. Deviant Women of the French Revolution and the Rise of Feminism. Associated University Presses, 2009.
  • Beauvoir, Simone de. The Second Sex. Trans. H. M. Parshley. Vintage Books (Random House) (1989) [1952].
  • Blanc, Olivier. Une Femme de Libertés: Olympe de Gouges. Syros/Alternatives, 1989.
  • Blanc, Olivier. Marie-Olympe de Gouges: Une Humaniste à la fin du XVIIIe siècle. Paris: Éditions René Viénet, 2003.
  • Brown, Gregory S. “The Self-Fashionings of Olympe de Gouges, 1784-1789.” Eighteenth-Century Studies, 34(3) (2001), 383-401.
  • Cole, John. Between the Queen and the Cabby. Montreal, Quebec, Canada:  McGill-Queen’s University Press, 2011 (contains the only full length translation of all five parts of The Rights of Woman).
  • Diamond, Marie Josephine. “The Revolutionary Rhetoric of Olympe de Gouges.” Feminist Issues, 14(1) (1994), 3-23.
  • Fraisse, Genevieve, and Michelle Perrot, eds. A History of Women in the West, Vol. 4: Emerging Feminism from Revolution to World War. Cambridge, MA: Harvard University Press, 1993.
  • Garrett, Aaron. “Human Nature.” The Cambridge History of Eighteenth-Century Philosophy, Volume 1, ed. Knud Haakonssen. New York, NY: Cambridge University Press, 2006. 160-233.
  • Green, Karen.  A History of Women’s Political Thought in Europe, 1700-1800. New York, NY:  Cambridge University Press, 2014.  (Especially Chapter 9: “Anticipating and experiencing the revolution in France,” 203-234.)
  • Groult, Benoîte, ed. (French) “Olympe de Gouges: la première féministe moderne,” in Olympe de Gouges: Oeuvres. Paris: Mercure de France, 1988.
  • Harth, Erica. Cartesian Women: Versions and Subversions of Rational Discourse in the Old Regime.  Ithaca, NY:  Cornell University Press, 1992.
  • Levy, Darline Gay, Harriet Branson Applewhite and Mary Durham Johnson. Women in Revolutionary Paris: 1789-1795: Selected Documents Translated with Notes and Commentary. Urbana: University of Illinois Press, 1979.
  • Mattos, Rudy Frederic de. The Discourse of Women Writers in the French Revolution: Olympe de Gouges and Constance de Salm. 2007.  Unpublished dissertation.  University of Texas at Austin Electronic Theses and Dissertations.
  • Maclean, Marie. “Revolution and Opposition: Olympe de Gouges and the Déclaration des droits de la femme.” In Literature and Revolution. Ed. David Bevan. Amsterdam: Rodopi, 1989.
  • Melzer, Sara E. and Leslie W. Rabine, eds. Rebel Daughters:  Women and the French Revolution. New York: Oxford University Press, USA, 1992.
  • Miller, Christopher L. The French Atlantic Triangle:  Literature and Culture of the Slave Trade.  Durham, NC:  Duke University Press, 2008.
  • Monedas, Mary Cecilia.  “Neglected Texts of Olympe de Gouges, Pamphleteer of the French Revolution of 1789,” Advances in the History of Rhetoric 1.1 (1996).  43-54.
  • Montfort-Howard, Catherine, ed. Literate Women and the French Revolution of 1789. Birmingham, AL: Summa Publications, 1994.
  • Mousset, Sophie.  Women’s Rights and the French Revolution: a Biography of Olympe de Gouges. New Brunswick, NJ: Transaction Publishers, 2007.
  • Nielson, Wendy C. “Staging Rousseau’s Republic:  French Revolutionary Festivals and Olympe de Gouges.” Eighteenth-Century: Theory and Interpretation, 43(3) (Fall 2003), 265-85.
  • Nielson, Wendy C. Women Warriors in Romantic Drama. Lanham, MD: The University of Delaware Press, 2013.
  • O’Neill, Eileen. “Early Modern Women Philosophers and the History of Philosophy.” Hypatia 20(3) (2005), 185-197.
  • Scott, Joan Wallach. “French Feminists and the Rights of ‘Man’: Olympe de Gouges’s Declarations.” History Workshop 28 (1989), 1-21.
  • Scott, Joan Wallach. Only Paradoxes to Offer:  French Feminists and the Rights of Man. Cambridge, MA: Harvard University Press, 1996.
  • Sherman, Carol L.  Reading Olympe de Gouges. New York, NY:  Palgrave Macmillan, 2013.
  • Spencer, Samia I., ed.  French Women and the Age of Enlightenment. Bloomington, IN: Indiana University Press, 1984.
  • Trouille, Mary Seidman. “Eighteenth-Century Amazons of the Pen: Stéphanie de Genlis & Olympe de Gouges.” Femmes Savants et Femmes d’Esprit: Women Intellectuals of the French Eighteenth Century. Eds. Roland Bonnel and Catherine Rubinger. New York: Peter Lang, 1994.
  • Trouille, Mary Seidman. Sexual Politics in the Enlightenment: Women Writers Read Rousseau. Albany: State U of New York Press, 1997.
  • Vanpée, Janie. “La Déclaration des Droits de la Femme: Olympe de Gouges’s Re-Writing of La Déclaration des Droits de l’Homme.” In Montfort-Howard, Catherine, ed. Literate Women and the French Revolution of 1789. Birmingham, AL: Summa Publications, 1994. 55-80.
  • Verdier, Gabrielle. “From Reform to Revolution: The Social Theater of Olympe de Gouges.” In Montfort-Howard, Catherine, ed. Literate Women and the French Revolution of 1789. Birmingham, AL: Summa Publications, 1994. 189-224.

 

Author Information

Joan Woolfrey
Email: jwoolfrey@wcupa.edu
West Chester University of Pennsylvania
U. S. A.

Kant: Philosophy of Mind

Immanuel Kant (1724-1804) was one of the most important philosophers of the Enlightenment Period (c. 1650-1800) in Western European history. This encyclopedia article focuses on Kant’s views in the philosophy of mind, which undergird much of his epistemology and metaphysics. In particular, it focuses on metaphysical and epistemological doctrines forming the core of Kant’s mature philosophy, as presented in the Critique of Pure Reason (CPR) of 1781/87 and elsewhere.

There are certain aspects of Kant’s project in the CPR that should be very familiar to anyone versed in the debates of seventeenth- and eighteenth-century European philosophy. For example, Kant argues, like Locke and Hume before him, that the boundaries of substantive human knowledge stop at experience, and thus that we must be extraordinarily circumspect concerning any claim made about what reality is like independent of all possible human experience. But, like Descartes and Leibniz, Kant thinks that central parts of human knowledge nevertheless exhibit characteristics of necessity and universality, and that, contrary to Hume’s skeptical arguments, there is good reason to think so.

Kant carries out a ‘critique’ of pure reason in order to show its nature and limits, thereby curbing the pretensions of various metaphysical systems articulated on the basis that reason alone allows us to scrutinize the depths of reality. But Kant also argues that the legitimate domain of reason is more extensive and more substantive than previous empiricist critiques had allowed. In this way Kant salvages (or attempts to) much of the prevailing Enlightenment conception of reason as an organ for knowledge of the world.

This article discusses Kant’s theory of cognition, including his views of the various mental faculties that make cognition possible. It distinguishes between different conceptions of consciousness at the basis of this theory of cognition and explains and discusses Kant’s criticisms of the prevailing rationalist conception of mind, popular in Germany at the time.

Table of Contents

  1. Kant’s Theory of Cognition
    1. Mental Faculties and Mental Representation
      1. Sensibility, Understanding, and Reason
      2. Imagination and Judgment
    2. Mental Processing
  2. Consciousness
    1. Phenomenal Consciousness
    2. Discrimination and Differentiation
    3. Self-Consciousness
      1. Inner Sense
      2. Apperception
    4. Unity of Consciousness and the Categories
  3. Concepts and Perception
    1. Content and Correctness
    2. Conceptual Content
    3. Conceptualism and Synthesis
    4. Objections to Conceptualism
  4. Rational Psychology and Self-Knowledge
    1. Substantiality (A348-51/B410-11)
    2. Simplicity (A351-61/B407-8)
    3. Numerical Identity (A361-66/B408)
    4. Relation to Objects in Space (A366-80/B409)
      1. The Immediacy Argument
      2. The Argument from Imagination
    5. Lessons of the Paralogisms
  5. Summary
  6. References and Further Reading
    1. Kant’s Works in English
    2. Secondary Sources

1. Kant’s Theory of Cognition

Kant is primarily interested in investigating the mind for epistemological reasons. One of the goals of his mature “critical” philosophy is articulating the conditions under which our scientific knowledge, including mathematics and natural science, is possible. Achieving this goal requires, in Kant’s estimation, a critique of the manner in which rational beings like ourselves gain such knowledge, so that we might distinguish those forms of inquiry that are legitimate, such as natural science, from those that are illegitimate, such as rationalist metaphysics. This critique proceeds via an examination of those features of the mind relevant to the acquisition of knowledge. This examination amounts to a survey of the conditions for “cognition” [Erkenntnis], or the mind’s relation to an object. Although there is some controversy about the best way to understand Kant’s use of this term, this article will understand it as involving relation to a possible object of experience, and as being a necessary condition for positive substantive knowledge (Wissen). Thus to understand Kant’s critical philosophy, we need to understand his conception of the mind.

a. Mental Faculties and Mental Representation

Kant characterizes the mind along two fundamental axes – first by the various kinds of powers which it possesses and second by the results of exercising those powers.

At the most basic explanatory level, Kant conceives of the mind as constituted by two fundamental capacities [Fähigkeiten], or powers, which he labels “receptivity” [Receptivität] and “spontaneity” [Spontaneität]. Receptivity, as the name suggests, constitutes the mind’s capacity to be affected by something, whether itself or something else. In other words, the mind’s receptive power essentially requires some external prompt to engage in producing “representations” [Vorstellungen], which are best thought of as discrete mental events or states, of which the mind is aware, or in virtue of which the mind is aware of something else (it is controversial whether representations are objects of ultimate awareness or are merely a vehicle for such awareness). In contrast, the power of spontaneity needs no such prompt. It is able to initiate its activity from itself, without any external trigger.

These two capacities of the mind are the basis for all (human) mental behavior. Kant thus construes all mental activity as resulting either from affection (receptivity) or from the mind’s self-prompted activity (spontaneity). From these two very general aspects of the mind Kant then derives three further basic faculties or “powers” [Vermögen], which he terms “sensibility” [Sinnlichkeit], “understanding” [Verstand], and “reason” [Vernunft]. These faculties are specific cognitive powers that cannot be reduced to one another, and each is assigned a particular cognitive task.

i. Sensibility, Understanding, and Reason

Kant distinguishes the three fundamental mental faculties from one another in two ways. First, he construes sensibility as the specific manner in which human beings, as well as other animals, are receptive. This is in contrast with the faculties of understanding and reason, which are forms of spontaneity belonging to humans or, more broadly, to all rational beings. Second, Kant distinguishes the faculties by their output. All of the mental faculties produce representations. We can see these distinctions at work in what is generally called the “stepladder” [Stufenleiter] passage from the Transcendental Dialectic of Kant’s major work, the Critique of Pure Reason (1781/7). This is one of the few places in the entire Kantian corpus where Kant explicitly discusses the meanings of and relations between his technical terms, and defines and classifies varieties of representation.

The genus is representation (representatio) in general. Under it stand representations with consciousness (perceptio). A perception [Wahrnehmung], that relates solely to a subject as a modification of its state, is sensation (sensatio). An objective perception is cognition (cognitio). This is either intuition or concept (intuitus vel conceptus). The first relates immediately to the object and is singular; the second is mediate, conveyed by a mark, which can be common to many things. A concept is either an empirical or a pure concept, and the pure concept, insofar as it has its origin solely in the understanding (not in a pure image of sensibility), is called notio. A concept made up of notions, which goes beyond the possibility of experience, is an idea or a concept of reason. (A320/B376–7).

As Kant’s discussion here indicates, the category of representation contains sensations [Empfindungen], intuitions [Anschauungen], and concepts [Begriffe]. Sensibility is the faculty that provides sensory representations. Sensibility generates representations based on being affected either by entities distinct from the subject or by the subject herself. This is in contrast to the faculty of understanding, which generates conceptual representations spontaneously – i.e. without advertence to affection. Reason is that spontaneous faculty by which special sorts of concepts, which Kant calls ‘ideas’ or ‘notions’, may be generated, and whose objects could never be met with in “experience,” which Kant defines as perceptions connected by fundamental concepts. Some of reason’s ideas include those concerning God and the soul.

Kant claims that all the representations generated via sensibility are structured by two “forms” of intuition—space and time—and that all sensory aspects of our experience are their “matter” (A20/B34). The simplest way of understanding what Kant means by “form” here is that anything one might experience will either have spatial features, such as extension, shape, and location, or temporal features, such as being successive or simultaneous. So the formal element of an empirical intuition, or sense perception, will always be either spatial or temporal. Meanwhile, the material element is always sensory (in the sense of determining the phenomenal or “what it is like” character of experience) and tied either to one or more of the five senses or to the feelings of pleasure and displeasure.

Kant ties the two forms of intuition to two distinct spheres or domains, the “inner” and the “outer.” The domain of outer intuition concerns the spatial world of material objects while the domain of inner intuition concerns temporally ordered states of mind. Space is thus the form of “outer sense” while time is the form of “inner sense” (A22/B37; cf. An 7:154). In the Transcendental Aesthetic, Kant is primarily concerned with “pure” [rein] intuition, or intuition absent any sensation, and often only speaks in passing of the sense perception of physical bodies (for example A20–1/B35). However, Kant more clearly links the five senses with intuition in his 1798 work Anthropology from a Pragmatic Point of View, in the section entitled “On the Five Senses.”

Sensibility in the cognitive faculty (the faculty of intuitive representations) contains two parts: sense and the imagination…But the senses, on the other hand, are divided into outer and inner sense (sensus internus); the first is where the human body is affected by physical things, the second is where the human body is affected by the mind (An 7:153).

Kant characterizes intuition generally in terms of two characteristics—namely immediacy [Unmittelbarkeit] and particularity [Einzelheit] (cf. A19/B33, A68/B93; JL 9:91). This is in contrast to the mediacy and generality [Allgemeinheit] characteristic of conceptual representation (A68/B93; JL 9:91).

Kant contrasts the particularity of intuition with the generality of concepts in the “stepladder” passage. Specifically, Kant says a concept is related to its object via “a mark, which can be common to many things” (A320/B377). This suggests that intuition, in contrast to concepts, puts a subject in cognitive contact with features of an object that are unique to particular objects and are not had by other objects. Some debate whether the immediacy of intuition is compatible with an intuition’s relating to an object by means of marks, or whether relation by means of marks entails mediacy and, thus, that only concepts relate to objects by means of marks. See Smit (2000) for discussion. Spatio-temporal properties seem like excellent candidates for such features, as no two objects of experience can have the very same spatio-temporal location (B327-8). But perhaps any non-repeatable, non-universal feature of a perceived object will do. For relevant discussion see Smit (2000); Grüne (2009), 50, 66-70.

Though Kant’s discussion of intuition suggests that it is a form of perceptual experience, this might seem to clash with his distinction between “experience” [Erfahrung] and “intuition” [Anschauung]. In part, this is a terminological issue. Kant’s notion of an “experience” is typically quite a bit narrower than our contemporary English usage of the term. Kant actually equates, at several points, “experience” with “empirical cognition” (B166, A176/B218, A189/B234), which is incompatible with experience being falsidical in any way. He also gives indications that experience, in his sense, is not something had by a single subject. See, for example, his claim that there is only one experience (A230/B282-3).

Kant also distinguishes intuition from “perception” [Wahrnehmung], which he characterizes as the conscious apprehension of the content of an intuition (Pr 4:300; cf. A99, A119-20, B162, and B202-3). “Experience,” in Kant’s sense, is then construed as a set of perceptions that are connected via fundamental concepts that Kant entitles the “categories.” As he puts it, “Experience is cognition through connected perceptions [durch verknüpfte Wahrnehmungen]” (B161; cf. B218; Pr 4:300).

Empirical intuition, perception, and experience, in Kant’s usage of these terms, all denote kinds of “experience” as we use the term in contemporary English. At its most primitive level, empirical intuition presents some feature of the world to the mind in a sensory manner. Empirical intuition does so in such a way that the intuition’s subject is in a position to distinguish that feature from others. A perception, in Kant’s sense, requires awareness of the basis by which the feature is different from other things. Kant uses the term in a variety of ways, however—JL 9:64-5, for instance—so there is some controversy surrounding the proper understanding of this term. One has a perception, in Kant’s sense, when one can not only discriminate one thing from another, or between the parts of a single thing, based on a sensory apprehension of it, but also can articulate exactly which features of the object or objects distinguish it from others. For instance, one can say it is green rather than red, or that it occupies this spatial location rather than that one. Intuition thus allows for the discrimination of distinct objects via an awareness of their features, while perception allows for an awareness of what specifically distinguishes an object from others. “Experience,” in Kant’s sense, is even further up the cognitive ladder (see JL 9:64-5), insofar as it indicates an awareness of features such as the substantiality of a thing, its causal relations with other beings, and its mereological features, that is, its part-whole dependence relations.

Kant thus believes that the capacity to cognitively ascend from mere discriminatory awareness of one’s environment (intuition), to an awareness of those features by means of which one discriminates (perception), and finally to an awareness of the objects which ground these features (experience), depends on the kinds of mental processes of which the subject is capable.

Before turning to the issue of mental processing, which figures centrally in Kant’s overall critical project, there are two further faculties of the mind that are worth discussing—the faculties of judgment and imagination. These faculties are not obviously as fundamental as the faculties of sensibility, understanding, and reason, but they nevertheless play a central role in Kant’s thinking about the structure of the mind and its contributions to our experience of the world.

ii. Imagination and Judgment

Kant links the faculty of imagination closely to sensibility. For example, in his Anthropology he says,

Sensibility in the cognitive faculty (the faculty of intuitive representations) contains two parts: sense and the power of imagination. The first is the faculty of intuition in the presence of an object, the second is intuition even without the presence of an object. (An 7:153; cf. 7:167; B151; LM 29:881; LM 28:449, 673)

The contrast Kant makes here is not entirely obvious, but includes at least the difference between cases of occurrent sensory experience of a perceived object—seeing the brown table before you—and cases of sensory recollection of a previously perceived object—visually imagining the brown table that was once in front of you. Kant makes this clearer in the process of further distinguishing between different kinds of imagination.

The power of imagination (facultas imaginandi), as a faculty of intuition without the presence of the object, is either productive, that is, a faculty of the original presentation [Darstellung] of the object (exhibitio originaria), which thus precedes experience; or reproductive, a faculty of the derivative presentation of the object (exhibitio derivativa), which brings back to mind an empirical intuition that it had previously (An 7:167).

So, in the operation of productive imagination, one brings to mind a sensory experience that is not itself based on any object previously so experienced. This is not to say the productive imagination is totally creative. Kant explicitly denies (An 7:167) that the productive imagination has the power to generate wholly novel sensory experience. It could not, in a person born blind, produce the phenomenal quality associated with the experience of seeing a red object, for example. If the productive imagination is instrumental in producing sensory fictions, the reproductive imagination is instrumental in producing sensory experiences of previously perceived objects.

Imagination thus plays a central role in empirical cognition by serving as the basis for both memory and the creative arts. In addition, it plays a kind of mediating role between the faculties of sensibility and understanding. Kant calls this mediating role a “transcendental function” of the imagination (A124). It mediates and transcends by being tied in its functioning to both faculties. On one hand, it produces sensible representations, and is thus connected to sensibility. On the other hand, it is not a purely passive faculty but rather engages in the activity of bringing together various representations, as does memory, for example. Kant explicitly connects understanding with this kind of active mental processing.

Kant also goes so far as to claim that the activity of imagination is a necessary part of what makes perception, in his technical sense of a string of connected, conscious sensory experiences, possible (A120, note). Though Kant’s view concerning the exact role of imagination in sensory experience is contested, two points emerge as central. First, Kant believes imagination plays a crucial role in the generation of complex sensory representations of an object (see Sellars (1978) for an influential example of this interpretation). It is imagination that makes it possible to have a sensory experience of a complex, three-dimensional geometric figure whose identity remains constant even as it is subject to translations and rotations in space. Second, Kant regards imagination’s mediating role between sensibility and understanding as crucial for at least some kinds of concept application (see Guyer (1987) and Pendlebury (1995) for further discussion). This mediating role involves what Kant calls the “schematization” of a concept and an additional mental faculty, that of judgment.

Kant defines the faculty of judgment as “the capacity to subsume under rules, that is, to distinguish whether something falls under a given rule” (A132/B171). However, he spends comparatively little time discussing this faculty in the first Critique. There, it seems to be discussed as an extension of the understanding in that it applies concepts to empirical objects. It is not until the third Critique—Kant’s 1790 Critique of Judgment—that Kant distinguishes judgment as an independent faculty with a special role. There Kant specifies two different ways it might function (CJ 5:179; cf. CJ (First Introduction) 20:211).

In one, judgment subsumes given objects under concepts, which are themselves already given. This role appears identical to the role he assigns judgment in the Critique of Pure Reason. The basic idea is that judgment functions to assign an intuited object—a dog—to the correct concept—such as domestic animals. This concept is presumed to be one already possessed by the subject. In this activity, the faculty overlaps with the role Kant singles out for imagination in the section of the first Critique entitled ‘On the Schematism of the Pure Concepts of the Understanding.’ Both are conceived of here in terms of the ultimate functioning of understanding, since it is understanding that generates concepts.

The second role for the faculty of judgment, and what seems to make it a distinctive faculty in its own right, is that of finding a concept under which to “subsume” experienced objects. This is called judgment’s “reflecting” role (CJ 5:179). Here, the subject exercises judgment in generating an appropriate concept for what is given by intuition (CJ (First Introduction) 20:211-13; JL 9:94–95; for discussion see Longuenesse (1998), 163–166 and 195–197; Ginsborg (2006)).

In addition to the generation of empirical concepts, Kant also describes reflective judgment as responsible for scientific inquiry. It must sort and classify objects in nature into a hierarchical taxonomy of genus/species relationships. Kant also utilizes the notion of reflective judgment to unify the otherwise seemingly unrelated topics of the Critique of Judgment—aesthetic judgments and teleological judgments concerning the order of nature.

Thus far, the discussion of Kant’s view of the mind has focused primarily on the various mental faculties and their corresponding representational output. Both the faculty of imagination and that of judgment operate on representations given from sensibility and understanding. In general, Kant conceives of the mind’s activity in terms of different methods of “processing” representations.

b. Mental Processing

Kant’s term for mental processing is “combination” [Verbindung], and the form of combination with which he is primarily concerned is what he calls “synthesis.” Kant characterizes synthesis as that activity by which understanding “runs through” and “gathers together” representations given to it by sensibility in order to form concepts and judgments, and, ultimately, for any cognition to take place at all (A77-8/B102-3). Synthesis is not something people are typically aware of doing. As Kant says, it is “a blind though indispensable function of the soul…of which we are only seldom even conscious” (A78/B103).

Synthesis is carried out by the unitary subject of representation upon representations either given to the subject by sensibility or produced by the subject through thought. Intellectual synthesis occurs when synthesis is used on representations and forms the content of a concept or judgment. When carried out by the imagination on material provided by sensibility, it is called “figurative” synthesis (B150-1). In the Critique of Pure Reason, Kant is primarily concerned with synthesis performed on representations provided by sensibility, and he discusses three central kinds of synthesis—apprehension, reproduction (or imagination), and recognition (or conceptualization) (A98-110/B159-61). Though Kant discusses these forms of synthesis as if they were discrete types of mental acts, it seems that the first two forms must occur together, while the third may, but need not, occur as well (compare Brook (1997); Allais (2009)).

One of the central topics of debate in the interpretation of Kant’s views on synthesis is whether Kant endorses conceptualism. Roughly, conceptualism claims the capacity for conscious sensory experience of the objective world depends, at least in part, on the repertoire of concepts possessed by the experiencing subject, insofar as those concepts are exercised in acts of synthesis by understanding.

Kant typically contrasts synthesis with other ways in which representations might be related, most importantly, by association (for example B139-40). Association is primarily a passive process by which the mind comes to connect representations due to repeated exposure of the subject to certain kinds of regularities. One might, for example, associate thoughts of chicken soup with thoughts of being ill, if one only had chicken soup when one was ill. In contrast, synthesis is a fundamentally active process that depends upon the mind’s spontaneity and is the means by which genuine judgment is possible.

Consider, for example, the difference between the merely associative transition from holding a stone to feeling its weight and the judgment that the stone is heavy (B142). The association of holding the stone and feeling its weight is not yet a judgment about the stone, but a kind of involuntary connection between two states of oneself. In contrast, thinking the stone is heavy moves beyond associating two feelings to a thought about how things are objectively, independent of one’s own mental states (Pereboom (1995), Pereboom (2006)). One of Kant’s most important points concerning mental processing is that association cannot explain the possibility of objective judgment. What is required, he says, is a theory of mental processing by an active subject capable of acts of synthesis.

Several of the important differences between synthesis and association can be summarized as follows (Pereboom (1995), 4-7):

  1. The source of synthesis is to be found in a subject, and the subject is distinct from its states.
  2. Synthesis can employ a priori concepts, concepts independent of experience, as modes of processing representations, whereas association never does.
  3. Synthesis is the product of a causally active subject. It is produced by a cause that is realized in the subject’s faculty, either the imagination or the understanding.

Kant’s conception of synthesis and judgment is tied to his conception of “consciousness” [Bewußtsein] and “self-consciousness” [Selbstbewußtsein]. However, both notions require some significant unpacking.

2. Consciousness

The notion of consciousness [Bewußtsein] plays an important role in Kant’s philosophy. There are, however, several different senses of “consciousness” in play in Kant’s work, not all of which line up with contemporary philosophical usage. Below, several of Kant’s most central notions and their differences from and relations to contemporary usage are explained.

a. Phenomenal Consciousness

Philosophical discussions of consciousness typically focus on phenomenal consciousness, or “what it is like” to have a conscious experience of a particular kind, such as seeing the color red or smelling a rose. Such qualitative features of consciousness have been of major concern to philosophers of the late 20th Century. However, the metaphysical issue of phenomenal consciousness is almost entirely ignored by Kant, perhaps because he is unconcerned with problems stemming from commitments to naturalism or physicalism. He seems to attribute all qualitative characteristics of consciousness to sensation and what he calls “feeling” [Gefühl] (CJ 5:206). Kant distinguishes between sensation and feeling in terms of an objective/subjective distinction. Sensations indicate or present features of objects, distinct from the subject. Feelings, by contrast, present only states of the subject to consciousness. Kant’s typical examples of such feelings include pain and pleasure (B66-7; CJ 5:189, 203-6).

Kant clearly assigns a cognitive role to sensation and allows that it is “through sensation” that we cognitively relate to objects given in sensibility (A20/B34). Despite that, he does not focus in any substantive or systematic way on the phenomenal aspects of sensory consciousness, nor does he focus on how exactly they aid in cognition of the empirical world.

b. Discrimination and Differentiation

The central notion of “consciousness” with which Kant is concerned is that of discrimination or differentiation. This is the conception of consciousness most commonly used in Kant’s time, particularly by his major predecessors Gottfried Wilhelm Leibniz (1646–1716) and Christian Wolff (1679-1754), and Kant gives little indication that he departs from their general practice.

According to Kant, any time a subject can discriminate one thing from another, the subject is, or can be, conscious of that thing (An 7:136-8). Representations which allow for discrimination and differentiation are “clear” [klar]. Representations which allow not only for the differentiation of one thing from others (such as differentiating one person’s face from another’s), but also for the differentiation of parts of the thing so discriminated (such as differentiating the different parts of a person’s face) are called “distinct” [deutlich].

Kant does seem to deny the claim of the Leibniz-Wolff tradition that clarity can simply be equated with consciousness (B414-15, note). Primarily, he seems motivated to allow that one’s discriminatory capacities may outrun one’s capacity for memory or even the explicit articulation of that which is discriminated. In such cases, one does not have a fully clear representation.

Kant’s conception of “obscure” [dunkel] representation is that it allows the subject to discriminate differentially between aspects of her environment without any explicit awareness of how she does so. This connects him with the Leibniz-Wolff tradition of recognizing the existence of unconscious representations (An 7:135-7). Kant says the majority of representations that people appeal to in order to explain the complex, discriminatory behaviors of living organisms are “obscure” in a technical sense. Likening the mind to a map, Kant goes so far as to say,

The field of sensuous intuitions and sensations of which we are not conscious, even though we can undoubtedly conclude that we have them; that is, obscure representations in the human being (and thus also in animals), is immense. Clear representations, on the other hand, contain only infinitely few points of this field which lie open to consciousness; so that as it were only a few places on the vast map of our mind are illuminated. (An 7:135)

Thus, we have no direct or non-inferential awareness of obscure representations; they must be posited to explain our fine-grained, differential, and discriminatory capacities. They constitute the majority of the mental representations with which the mind busies itself.

Though Kant does not make it explicit in his discussion of discrimination and consciousness, it is clear that he takes the capacity to discriminate between objects and parts of objects to be ultimately based on sensory representation of those objects. His views on consciousness as differential discrimination intersect with his views on phenomenal consciousness. Because humans are receptive through their sensibility, the ultimate basis on which we differentially discriminate between objects must be sensory. Thus, though Kant seems to take for granted the fact that conscious beings are in states with a particular phenomenal character, it must be the clarity and distinctness of this character that allows a conscious subject to differentially discriminate between the various elements of her environment (see Kant’s discussion of aesthetic perfection in the 1800 Jäsche Logic, 9:33-9).

c. Self-Consciousness

As the discussion of unconscious representation indicates, Kant believes we are not directly aware of most of our representations. They are nevertheless, to some degree, conscious, because they allow differential discrimination of elements of the subject’s environment. Kant thinks the process of making a representation clear, or fully conscious, requires a higher-order representation of the relevant representation. In other words, it requires that one can have representations of other representations. As Kant says, “consciousness is really the representation that another representation is in me” (JL 9:33). Because this higher-order representation is one of another representation in the subject, Kant’s position here suggests that consciousness requires at least the capacity for self-consciousness. This position is reinforced by Kant’s famous claim in the Transcendental Deduction of the Critique of Pure Reason:

The I think must be able to accompany all my representations; for otherwise something would be represented in me that could not be thought at all, which is as much as to say that the representation would either be impossible or else at least would be nothing for me. (B131-2; emphasis in the original)

Kant might give the impression here of saying that for representation to be possible for a subject, the subject must possess the capacity for self-ascribing her representations. If so, then representation, and thus the capacity for conscious representation, would depend on the capacity for self-consciousness. Because Kant ties the capacity for self-consciousness to spontaneity (B132, 137, 423) and restricts spontaneity to the class of rational beings, the demand for self-ascription would seem to deny that any non-rational animal (for example, dogs, cats, and birds) could have phenomenal or discriminatory consciousness.

However, there is little evidence to show that Kant endorses the self-ascription condition. Instead, he distinguishes between two distinct modes in which one is aware of oneself and one’s representations—inner sense and apperception (See Ameriks (2000) for extensive discussion). Only the latter form of awareness seems to demand a capacity for self-ascription.

i. Inner Sense

Inner sense is, according to Kant, the means by which we are aware of alterations in our own state. Hence all moods, feelings, and sensations, including such basic alterations as pleasure and pain, are the proper subject matter of inner sense. Kant argues that all sensations, feelings, and representations attributable to a subject must ultimately occur in inner sense and conform to its form—time (A22-3/B37; A34/B51).

Thus, to be aware of something in inner sense is to be minimally, phenomenally conscious, at least in the case of awareness of sensations and feelings. To say a subject is aware of her own states via inner sense is to say that she has a temporally ordered series of mental states, and is phenomenally conscious of each, though she may not be conscious of the series as a whole. This could still count as a kind of self-awareness, as when an animal is aware of being in pain. But it is not an awareness of the subject as a self. Kant himself indicates such a position in a letter to his friend and former student Marcus Herz in 1789.

[Representations] could still (I consider myself as an animal) carry on their play in an orderly fashion, as connected according to empirical laws of association, and thus they could even have influence on my feeling and desire, without my being aware of my own existence [meines Daseins unbewußt] (assuming that I am even conscious of each individual representation, but not of their relation to the unity of representation of their object, by means of the synthetic unity of their apperception). This might be so without my cognizing the slightest thing thereby, not even what my own condition is (C 11:52, May 26, 1789).

Hence, according to Kant, one may be aware of one’s representations via inner sense, but one is not and cannot, through inner sense alone, be aware of oneself as the subject of those representations. That requires what Kant, following Leibniz (1996), calls “apperception”.

ii. Apperception

Kant uses the term “apperception” to denote the capacity for the awareness of some state or modification of one’s self as a state. For one capable of apperception, there is a difference between feeling pain, and thus having an inner sense of it, and apperceiving that one is in pain, and thus ascribing, or being able to ascribe, a certain property or state of mind to one’s self. For example, while a non-apperceptive animal is aware of its own pain, and its awareness is partially explanatory of its behavior, such as avoidance, Kant construes the animal as incapable of making any self-attribution of its pain. Kant thinks of such a mind as incapable of construing itself as a subject of states, and it is thus unable to construe itself as persisting through changes of those states. This is not necessarily to say an animal incapable of apperception lacks any subject or self. But, at the very least, such an animal would be incapable of conceiving or representing itself in this way (see Naragon (1990); McLear (2011)).

Kant considers the capacity for apperception as importantly tied to the capacity to represent objects as complexes of properties attributable to a single underlying entity (for example, an apple as a subject of the complex of the properties red and round). Kant’s argument for this connection is notorious both for its complexity and for its obscurity. The next sub-section will give an overview, though not an exhaustive discussion, of some of Kant’s most important points concerning these matters, as they relate to the issue of apperception.

d. Unity of Consciousness and the Categories

In order to better understand Kant’s views on apperception and the unity of consciousness, one must step back and look at the wider context of the argument in which he situates these views. One of the core projects of Kant’s most famous work, the Critique of Pure Reason, is to provide an argument for the legitimacy of a priori knowledge of the natural world. Though Kant’s conception of the a priori is complex, he shares one central aspect of his view with his German rationalist predecessors (for example, Leibniz (1996), preface): that we have knowledge of universal and necessary truths concerning aspects of the empirical world (B4-5), including the truth that every event in the empirical world has a cause (B231). This tradition tended to explain the possession of knowledge of such universal and necessary truths by appeal to innate concepts which could be analyzed to yield the relevant truths. Kant importantly departs from the rationalist tradition, arguing that not all knowledge of universal and necessary truths is acquired via the analysis of concepts (B14-18). Instead, he says there are some “synthetic” a priori truths that are known on the basis of something other than conceptual analysis. Thus, according to Kant, the activity of pure reason achieves relatively little on its own. All of our ampliative knowledge (knowledge that cannot be derived by conceptual analysis alone) that is also necessary and universal consists in what Kant calls “synthetic a priori” judgments or propositions. He then pursues the central question: how is knowledge of such synthetic a priori propositions possible?

Kant’s basic answer to the question of synthetic a priori knowledge involves what he calls the “Copernican Turn.” According to the “Copernican Turn,” the objects of human knowledge must “conform” to the basic faculties of human knowledge—the forms of intuition (space and time) and the forms of thought (the categories).

Kant thus engages in a two-part strategy for explaining the possibility of such synthetic a priori knowledge. The first part consists of arguing that the pure forms of intuition provide the basis for our synthetic a priori knowledge of mathematical truths. Mathematical knowledge is synthetic because it goes beyond mere conceptual analysis to deal with the structure of, or our representation of, space itself. It is a priori because the structure of space is accessible to us as it is merely the form of our intuition and not a real mind-independent thing.

In addition to the representation of space and time, Kant also thinks that possession of a particular, privileged set of a priori concepts is necessary for knowledge of the empirical world. But this raises a problem. How can an a priori concept, which is not itself derived from any particular experience, nevertheless be legitimately applicable to objects of experience? Moreover, it is not the mere possibility of applying a priori concepts to objects of experience that worries Kant, for this could just be a matter of pure luck. Kant wants more than mere possibility; he wants to show that a privileged set of a priori concepts applies necessarily and universally to all objects of experience, and does so in a way that people can know independently of experience.

This brings us to the second part of Kant’s argument, which is directly relevant for understanding Kant’s views on the importance of apperception. Not only must objects of knowledge conform to the forms of intuition, they also must conform to the most basic concepts (or categories) governing our capacity for thought. Kant’s strategy shows how a priori concepts legitimately apply to their objects by being partly constitutive of the objects of representation. This contrasts with the traditional view, according to which the objects of representation were the source or explanatory ground of our concepts (B, xvii-xix). Now, exactly what this means is deeply contested, in part because it is rather unclear what Kant intends by his doctrine of Transcendental Idealism. Does Kant intend that the objects of representation are themselves nothing other than representations? This would be a form of phenomenalism similar to that offered by Berkeley. Kant, however, seems to want to deny that his view is similar to Berkeley’s, asserting instead that the objects of representation exist independently of the mind, and that it is only the way that they are represented that is mind-dependent (A92/B125; compare Pr 4:288-94).

Kant’s strategy for validating the legitimacy of the a priori categories proceeds by way of a “transcendental argument.” It takes the conditions necessary for consciousness of the identity of oneself as the subject of different self-attributed mental states and ties them to those necessary for grounding the possibility of representing an object distinct from oneself, an object of which various properties may be predicated. In this sense, Kant argues that the intellectual representations of subject and object stand and fall together. Kant thus denies the possibility of a self-conscious subject who could conceptualize and self-ascribe her representations, but whose representations could not represent law-governed objects in space, and thus the material world or ‘nature’ as the subject conceives of it.

Though Kant’s views regarding the unity of the subject are contested, there are several points which can be made fairly clearly. First, Kant conceives of all specific, intellectual activity, including the most basic instances of discursive thought, as requiring what he calls the “original unity of apperception” (B132). This unity, as original, is not itself brought about by some mental act of combining representations, but, as Kant says, is “what makes the concept of combination possible” (B131). It is itself the ground of the “possibility of the understanding” (B131).

Second, the original unity of apperception requires whatever form of self-consciousness characteristically relates to the “I think.” As Kant famously says, “the I think must be able to accompany all my representations” (B131). Moreover, the “I think” essentially involves activity on the part of the subject—it is an expression of the subject’s free activity or “spontaneity” (B132). This means that, according to Kant, only beings capable of spontaneous activity—self-initiated activity that is ultimately traced to causes outside the reach of natural causal laws—are going to be capable of thought in the sense with which Kant is concerned.

Third, and related to the previous point, Kant seems to deny that a subject could attain the kind of representational unity characteristic of thought if her only resources were aggregative methods. Kant makes this point later in the Critique when he says, “representations that are distributed among different beings (for instance, the individual words of a verse) never constitute a whole thought (a verse)” (A 352). William James provides a vivid articulation of the idea: “Take a sentence of a dozen words, and take twelve men and tell to each one word. Then stand the men in a row or jam them in a bunch, and let each think of his word as intently as he will; nowhere will there be a consciousness of the whole sentence” (James (1890), 160). Kant construes consciousness as the “holding-together” of the various components of a thought. He does so in a manner that seems radically opposed to any conception of unitary thought which tries to explain it in terms of some train or succession of its components (Pr 4:304; see Kitcher (2010); Engstrom (2013) for contrasting treatments of this issue).

The exact content of Kant’s argument for the connection between subject and object in the Transcendental Deduction is highly disputed, and it is likely that no single reconstruction of the argument can capture all the points Kant supports in the Deduction. At least one strand of Kant’s argument in the first half of the Deduction focuses on his denial that the unity of the subject and its powers of representational combination could be accounted for by a merely associationist (or Humean) conception of mental combination, sometimes termed his “argument from above” (see A119; Carl (1989); Pereboom (1995)). Kant argues as follows (see Pereboom (2009)):

  1. I am conscious of the identity of myself as the subject of different self-attributions of mental states.
  2. I am not directly conscious of the identity of this subject of different self-attributions of mental states.
  3. If (1) and (2) are true, then this consciousness of identity is accounted for indirectly by my consciousness of a particular kind of unity of my mental states.
  4. Therefore, this consciousness of identity is accounted for indirectly by my consciousness of a particular kind of unity of my mental states. (1, 2, 3)
  5. If (4) is true, then my mental states indeed have this particular kind of unity.
  6. This particular kind of unity of my mental states cannot be accounted for by association. (5)
  7. If (6) is true, then this particular kind of unity of my mental states is accounted for by synthesis by a priori concepts.
  8. Therefore, this particular kind of unity of my mental states is accounted for by synthesis by a priori concepts. (6, 7)

Premise (1) says that I am aware of myself as the subject of different states (or at least able to be so aware). For example, right now I might be hungry as well as sleepy. Previously, I was sleepy and slightly bored. Premise (2) claims I have no immediate or direct awareness of the being which has all of these states. In Kant’s terms, I lack any intuition of the subject of such self-ascribed states, instead having intuition only of the states themselves. Nevertheless, I am aware of all these states as related to a subject (it is I who am bored, hungry, sleepy), and it is in virtue of these connections that I can call one and all of these states mine. Hence, as premise (3) argues, there must be some unity to my mental states, consciousness of which (indirectly) accounts for my consciousness of my identity as their subject. My representations must have some basis on which they go together, and it is the basis for their ‘togetherness’ that explains how I can consider them, one and all, to be mine. Premises (4) and (5) unpack this point, and premise (6) argues that association could not account for such unity (the theory of association was articulated in a particularly influential form by David Hume (Hume (1888); Hume (2007)), and the reader should look there for relevant background discussion).

Kant’s point, in premise (6) of the above argument, is that forces of association acting on mental representations, whether impressions or ideas, cannot account either for the experience of a train of representations as mine or for the “togetherness” of those representations, whether as a single thought or as a series of inferences. Hume argues we have no impression, and thus no ensuing idea, of an empirical self (Hume (1888), I.iv.6). Kant also accepts this point when he says, “the empirical consciousness that accompanies different representations is by itself dispersed and without relation to the identity of the subject” (B133). By this, Kant means that when we introspect in inner sense, all we ever get are particular mental states, such as boredom, happiness, or particular thoughts. We lack any intuition of a subject of those mental states. Hume concludes that the idea of a persisting self which grounds all of these mental states as its subject must be fictitious. Kant disagrees. His contrasting view takes the mineness and togetherness of one’s introspectible mental states as data needing explanation. Because an associative psychological theory like Hume’s cannot explain these features of first-person consciousness (see Hume (1888), III. Appendix), we need to find another theory, such as Kant’s theory of mental synthesis.

Recall that, prior to the argument of the Transcendental Deduction, Kant links the operations of synthesis to possession of a set of a priori concepts, or categories, not derived from experience. Hence, in arguing that synthesis is required to explain the mineness and togetherness of one’s mental states, and by linking synthesis to the application of the categories, Kant argues we could not have the experience of the mineness and togetherness of our mental states without applying the categories.

While this argument is only half of Kant’s argument in the first part of the Deduction, it shows how tightly Kant took the connection to be between the capacities for spontaneity, synthesis, and apperception, and the legitimacy of the categories. (The other half consists of an “argument from below,” which discerns the conditions necessary for the representation of unitary objects; see Pereboom (1995), (2009).) According to Kant, there is only one possible explanation of one’s apperceptive awareness of one’s psychological states as one’s own and of all such states being related to one another: as the subject of such states, one possesses a spontaneous power for synthesizing one’s representations according to general principles or rules, the content of which is given by pure a priori concepts—the categories. The fact that the categories play such a fundamental role in the generation of self-conscious psychological states is thus a powerful argument demonstrating their legitimacy.

Given that Kant leverages certain aspects of our capacity for self-knowledge in his argument for the legitimacy of the categories, the extent to which he argues for radical limits on our capacity for self-knowledge may be surprising. The final section below discusses Kant’s arguments concerning our capacity for a priori knowledge of the self and its fundamental features. First, however, the next section looks at one of the central debates in the interpretation of Kant’s views on the role of concepts in perceptual experience.

3. Concepts and Perception

During the discussion of synthesis above, conceptualism was characterized as the claim that there is a relation of dependence between a subject’s having conscious sensory experience of an objective world and the repertoire of concepts possessed by the subject and exercised by her faculty of understanding.

As a first pass at sharpening this formulation, understand conceptualism as a thesis consisting of two claims: (i) sense experience has correctness conditions determined by the ‘content’ of the experience, and (ii) the content of an experience is a structured entity whose components are concepts.

a. Content and Correctness

An important background assumption governing the conceptualism debate construes mental states as related to the world cognitively, as opposed to merely causally, if and only if they possess correctness conditions. That which determines the correctness condition for a state is that state’s content (see Siegel (2010), (2011); Schellenberg (2011)).

Suppose, for example, that an experience E has the following content C:

C: That cup is white.

This content determines a correctness condition V:

V: S’s experience E is correct if and only if the cup visually presented to the subject as the content of the demonstrative is white and the content C corresponds to how things seem to the subject to be visually presented.

Here, the content of the experiential state functions much like the content of a belief state to determine whether the experience, like the belief, is or is not correct.

A state’s possession of content thus determines a correctness condition, through which the state can be construed as mapping, mirroring, or otherwise tracking aspects of the subject’s environment.

There are reasons for questioning whether Kant endorses the content assumption articulated above. Kant seems to deny several claims integral to it. First, in various places he explicitly denies that intuition, or the deliverances of the senses more generally, are the kind of thing which could be correct or incorrect (A293–4/B350; An §11 7:146; compare LL 24:83ff, 103, 720ff, 825ff). Second, Kant’s conception of representational content requires an act of mental unification (Pr 4:304; compare JL §17 9:101; LL 24:928), something which Kant explicitly denies is present in an intuition (B129-30; compare B176-7). This is not to deny that Kant uses a notion of “content,” in some other sense, but rather only that he fails to use it in the sense required by interpretations endorsing the content assumption (see Tolley (2014), (2013)). Finally, Kant’s “modal” condition of cognition, that it provides a demonstration of what is really actual rather than merely logically possible, seems to preclude an endorsement of the content assumption (B, xxvii, note; compare Chignell (2014)). However, for the purposes of understanding the conceptualism debate, assume Kant does endorse the content assumption. The question then is how to understand the nature of the content so understood.

b. Conceptual Content

In addition to the content assumption, conceptualism is defined as committed to a conception of intuition’s content being completely composed of concepts. Against this, Clinton Tolley (Tolley (2013), Tolley (2014)) has argued that the immediacy/mediacy distinction between intuition and concept entails a difference in the content of intuition and concept.

If we understand by ‘content’…a representation’s particular relation to an object…then it is clear that we should conclude that Kant accepts non-conceptual content. This is because Kant accepts that intuitions put us in a representational relation to objects that is distinct in kind from the relation that pertains to concepts. I argued, furthermore, that this is the meaning that Kant himself assigns to the term ‘content’. (Tolley (2013), 128)

Insofar as Kant often speaks of the ‘content’ [Inhalt] of a representation as consisting of a particular kind of relation to an object (Tolley (2013), 112; compare B83, B87), Tolley’s proposal thus gives ground for a simple and straightforward argument for a non-conceptualist reading of Kant. However, it does not necessarily prove that the content of what Kant calls an intuition is not something that would be construed by others as conceptual, in a wider sense of that term. For example, both pure—that, this—and complex demonstrative expressions—that color, this person—have conceptual form, and have been proposed as appropriate for capturing the content of experience (McDowell (1996), ch. 3; for discussion see Heck (2000)). Demonstratives are not, in Kant’s terms, ‘conceptual’ since they do not exhibit the requisite generality which, according to Kant, all conceptual representation must.

c. Conceptualism and Synthesis

If it isn’t textually plausible to understand the content of an intuition in conceptual terms, at least as Kant understands the notion of a concept, then what would it mean to say that Kant endorses conceptualism with regard to experience? The most plausible interpretation, endorsed by a wide variety of interpreters, reads Kant as arguing that the generation of an intuition, whether pure or sensory, depends at least in part on the activity of the understanding. On this way of carving things, conceptualism does not consist in the narrow claim that intuitions have concepts as contents or components. Instead, it consists in the broader claim that the occurrence of an intuition depends at least in part on the discursive activity of understanding. The specific activity of understanding is that which Kant calls ‘synthesis,’ the “running through, and gathering together” of representations (A99).

The conceptualist further argues that taking intuitions as generated via acts of synthesis, which are directed by or otherwise dependent upon conceptual capacities, provides some basis for the claim that whatever correctness conditions might be had by intuition must accord with the conceptual synthesis which generated them. This arguably fits well with Kant’s much quoted claim,

The same function that gives unity to the different representations in a judgment also gives unity to the mere synthesis of different representations in an intuition, which, expressed generally, is called the pure concept of understanding. (A79/B104-5)

The link between intuition, synthesis in accordance with concepts, and relation to an object is made even clearer by Kant’s claim in §17 of the B-edition Transcendental Deduction:

Understanding is, generally speaking, the faculty of cognitions. These consist in the determinate relation of given representations to an object. An object, however, is that in the concept of which the manifold of a given intuition is united. (B137; emphasis in the original)

However else we are to understand this passage, Kant here indicates that the unity of an intuition necessary for it to stand as a cognition of an object requires a synthesis by the concept “object.” In other words, cognition of an object requires that intuition be unified by an act or acts of the understanding.

According to the conceptualist interpretation, one must understand the notion of a representation’s content as a relation to an object, which in turn depends on a conceptually guided synthesis. So we can revise our initial definition of conceptualism to read it as claiming (i) the content of an intuition is a kind of relation to an object, (ii) the relation to an object depends on a synthesis directed in accordance with concepts, and (iii) synthesis in accordance with concepts sets correctness conditions for the intuition’s representation of a mind-independent object.

d. Objections to Conceptualism

At the heart of non-conceptualist readings of Kant stands the denial that mental acts of synthesis carried out by understanding are necessary for the occurrence of cognitive mental states of the type which Kant designates by the term “intuition” [Anschauung]. Though it is controversial what might be considered the “natural” or “default” reading of Kant’s mature critical philosophy, there are at least four considerations which lend strong support to a non-conceptualist interpretation of Kant’s mature work.

First, Kant repeatedly and forcefully states that in cognition there is a strict division of cognitive labor—objects are given by sensibility and thought via understanding:

Objects are given to us by means of sensibility, and it alone yields us intuitions; they are thought through the understanding, and from the understanding arise concepts (A19/B33; compare A50/B74, A51/B75–6, A271/B327).

As Robert Hanna has argued, when Kant discusses the dependence of intuition on conceptual judgment in the Analytic of Concepts, he specifically talks about cognition rather than what others would consider to be perceptual experience (Hanna (2005), 265-7).

Second, Kant characterizes the representational capacities characteristic of sensibility as more primitive than those characteristic of understanding, or reason, and he characterizes those capacities as a plausible part of what humans share with the rest of the animal kingdom (Kant connects the possession of a faculty of sensibility to animal nature in various places, for example A546/B574, A802/B830; An 7:196). For example, Kant’s distinction between the faculties of sensibility and understanding seems intended to capture the difference between the “sub-rational” powers of the mind that are shared with non-human animals and the “rational or higher-level cognitive powers” that are special to human beings (Hanna (2005), 249; compare Allais (2009); McLear (2011)).

If one were to deny that, according to Kant, sensibility alone is capable of producing mental states cognitive in character, then it would seem that any animal which lacks a faculty of understanding would thereby lack any capacity for genuinely perceptual experience. The mental lives of non-rational animals would thus, at best, consist of non-cognitive sensory states causally correlated with changes in the animal’s environment. Aside from yielding an unappealing and implausible characterization of animals’ cognitive capacities (for relevant discussion of some of the issues in contemporary cognitive ethology see Bermúdez (2003); Lurz (2009); Andrews (2014), as well as the papers in Lurz (2011)), this reading also faces textual hurdles. Kant is on record in various places as saying that animals have sensory representations of their environment (CPJ 5:464; LM 28:449; compare An 7:212), that they have intuitions (LL 24:702), and that they are acquainted with objects though they do not cognize them (JL 9:64–5) (see Naragon (1990); Allais (2009); McLear (2011)).

Hence, if Kant’s position is that synthetic acts carried out by the understanding are necessary for the cognitive standing of a mental state, then Kant is contradicting fundamental elements of his own position in crediting intuitions or their possibility to non-rational animals.

Third, any position which regards perceptual experience as dependent upon acts of synthesis carried out by the understanding would presumably also construe the ‘pure’ intuitions of space and time as dependent upon acts of synthesis (see Longuenesse (1998), ch. 9; Griffith (2012)). However, Kant’s discussion of space, and, analogously, time, in the third and fourth arguments (fourth and fifth in the case of time) of the Metaphysical Exposition of Space in the Transcendental Aesthetic seems incompatible with such a proposed relation of dependence.

Kant’s point in the third and fourth arguments of the Metaphysical Exposition of space and time is that no finite intellect could grasp the extent and nature of space as an infinite whole via a synthetic process involving movement from representation of a part to representation of the whole. If the unity of the forms of intuition were itself something dependent upon intellectual activity, then this unity would necessarily involve the discursive, though not necessarily conceptual, running through and gathering together of a given multiplicity (presumably of different locations or moments) into a combined whole. Kant believes this is characteristic of synthesis generally (A99).

But Kant’s arguments in the Metaphysical Expositions require that the fundamental basis of the representation of space and time not proceed from a grasp of the multiple features of an intuited particular to the whole with those features. Instead, the form of pure intuition constitutes a representational whole that is prior to its component parts (compare CJ 5:407-8, 409).

Hence, Kant’s position is that the pure intuitions of space and time possess a unity wholly different from that given by the discursive unity of understanding (whether in conceptual judgment or in the intellectually guided, imaginative synthesis of intuited objects). The unity of aesthetic representation—characterized by the forms of space and time—has a structure in which the representational parts depend upon the whole. The unity of discursive representation—representation where the activity of understanding is involved—has a structure in which the representational whole depends upon its parts (see McLear (2015)).

Finally, there has been extensive discussion on the non-conceptuality of intuition in the secondary literature on Kant’s philosophy of mathematics. For example, Michael Friedman has argued that the expressive limitations of prevailing logic in Kant’s time required the postulation of intuition as a form of singular, non-conceptual representation (Friedman (1992), ch. 2; Anderson (2005); Sutherland (2008)). In contrast to Friedman’s view, Charles Parsons and Emily Carson argued that the immediacy of intuition, both pure and empirical, should be construed in a ‘phenomenological’ manner. Space in particular is understood on their interpretation as an original, non-conceptual representation, which Kant takes to be necessary for the demonstration of the real possibility of constructed, mathematical objects as required for geometric knowledge (Parsons (1964); Parsons (1992); Carson (1997); Carson (1999); compare Hanna (2002). For a general overview of related issues in Kant’s philosophy of mathematics, see Shabel (2006) and the works cited therein at p. 107, note 29.)

Ultimately, however, there are difficulties in assessing whether Kant’s philosophy of mathematics can have relevance for the conceptualism debate. It is not obvious that the claim that intuition must be non-conceptual in order to account for mathematical knowledge is incompatible with the claim that intuitions themselves are dependent upon a conceptually-guided synthesis.

The non-conceptualist reading clearly commits to allowing that sensibility alone provides, perhaps in a very primitive manner, objective representation of the empirical world. Sensibility is construed as an independent cognitive faculty, which humans share with other non-rational animals, and which is the jumping-off point for more sophisticated, conceptual representation of empirical reality.

The next and final section looks at Kant’s views regarding the nature and limits of self-knowledge and the ramifications of this for traditional rationalist views of the self.

4. Rational Psychology and Self-Knowledge

Kant discusses the nature and limits of our self-knowledge most extensively in the first Critique, in a section of the Transcendental Dialectic called the “Paralogisms of Pure Reason.” Here, Kant is concerned to criticize the claims of what he calls “rational psychology.” Specifically, he is concerned about the claim that we can have substantive, metaphysical knowledge of the nature of the subject, based purely on an analysis of the concept of the thinking self. As Kant typically puts it:

I think is thus the sole text of rational psychology, from which it is to develop its entire wisdom…because the least empirical predicate would corrupt the rational purity and independence of the science from all experience. (A343/B401)

There are four “Paralogisms.” Each argument is presented as a syllogism, consisting of two premises and a conclusion. According to Kant, each argument is guilty of an equivocation on a term common to the premises, such that the argument is invalid. Kant’s aim, in his discussion of each Paralogism, is to diagnose the equivocation, and explain why the rational psychologist’s argument ultimately fails. In so doing, Kant provides a great deal of information about his own views concerning the mind (See Ameriks (2000) for extensive discussion). The argument of the first Paralogism concerns knowledge of the self as substance; the second, the simplicity of the self; the third, the numerical identity of the self; and the fourth, knowledge of the self versus knowledge of things in space.

a. Substantiality (A348-51/B410-11)

Kant presents the rationalist’s argument in the First Paralogism as follows:

  1. What cannot be thought otherwise than as subject does not exist otherwise than as subject, and is therefore substance.
  2. Now a thinking being, considered merely as such, cannot be thought of as other than a subject.
  3. Therefore, a thinking being also exists only as such a thing, i.e., as substance.

Kant’s presentation of the argument is rather compressed. In more explicit form we can put it as follows (see Proops (2010)):

  1. All entities that cannot be thought otherwise than as subjects are entities that cannot exist otherwise than as subjects, and are therefore substances. (All M are P)
  2. All entities that are thinking beings are entities that cannot be thought otherwise than as subjects. (All S are M)
  3. Therefore, all entities that are thinking beings are entities that cannot exist otherwise than as subjects, and are therefore substances. (All S are P)

The relevant equivocation is in the term that occupies the ‘M’ place in the argument— “entities that cannot be thought otherwise than as subjects”. Kant specifically locates the ambiguity in the use of the term “thought” [das Denken]. In the first premise, he claims, “thought” concerns an object in general, and thus an object that could be given in a possible intuition. In the second premise, the use of “thought” is supposed to apply only to a feature of thought and, thus, not to an object of a possible intuition (B411-12).

While it isn’t obvious what Kant means by this claim, it could be the following: Kant takes the first premise to make a claim about the objects of thought, namely that they exist as independent subjects or bearers of properties and cannot be conceived of as anything else. This is thus a metaphysical claim about what kinds of objects could really exist, which explains Kant’s reference to an “object in general” that could be given in intuition.

In contrast, premise (2) makes a merely logical claim concerning the role of the representation <I> in a possible judgment. Kant says one cannot use the representation <I> in any position in a judgment other than that of the subject. For example, while I can make the claim “I am tall,” it would make no sense to claim “the tall is I.”

Against the rational psychologist, Kant argues that one cannot make any legitimate inference from the conditions under which representation <I> may be thought, or employed in a judgment, to the status of the ‘I’ as a metaphysical subject of properties. Kant makes this point explicit when he says,

The first syllogism of transcendental psychology imposes on us an only allegedly new insight when it passes off the constant logical subject of thinking as the cognition of a real subject of inherence, with which we do not and cannot have the least acquaintance, because consciousness is the one single thing that makes all representations into thoughts, and in which, therefore, as in the transcendental subject, our perceptions must be encountered; and apart from this logical significance of the I, we have no acquaintance with the subject in itself that grounds this I as a substratum, just as it grounds all thoughts. (A350)

Kant thus argues that one should differentiate between different conceptions of “substance” and the role they play in thoughts concerning the world.

Substance0:

x is a substance0 if and only if the representation of x cannot be used as a predicate in a categorical judgment

Substance1:

x is a substance1 if and only if its existence is such that it can never inhere, or exist, in anything else (B288, 407)

The first conception of substance is merely logical or grammatical. The second conception is explicitly metaphysical. Finally, there is an even more metaphysically demanding usage of “substance” that Kant employs.

(Empirical) Substance2:

x is a substance2 if and only if it is a substance1 that persists at every moment (A144/B183, A182)

According to Kant, the rational psychologist attempts to move from claims about substance0 to the more robustly metaphysical claims characteristic of conceptions and uses of substance1 and substance2. However, without further substantive assumptions, which go beyond anything given in an analysis of the concept <I>, no legitimate inference can be made from our notion of a substance0 to either of the other conceptions of substance.

Because Kant denies that humans have any intuition, empirical or otherwise, of themselves as subjects, they cannot come to have any knowledge concerning what they are in terms of being either substance1 or substance2. At least they cannot do so by reflecting on the conditions of thinking of themselves using first-person concepts. No amount of introspection or reflection on the content of the first-person concept <I> will yield such knowledge.

b. Simplicity (A351-61/B407-8)

Kant’s discussion of the proposed metaphysical simplicity of the subject largely depends on points he made in the previous Paralogism concerning its proposed substantiality. Kant articulates the Second Paralogism as follows:

  1. The subject, whose action can never be regarded as the concurrence of many acting things, is simple. (All A is B)
  2. The self is such a subject. (C is A)
  3. Therefore, the self is simple. (C is B)

Here, the equivocation concerns the notion of a “subject.” Kant’s point, as with the previous Paralogism, is that, from the fact that one’s first-person representation of the self is always a grammatical or logical subject, nothing follows concerning the metaphysical status of that representation’s referent.

Of perhaps greater interest in this discussion of the Paralogism of simplicity is Kant’s analysis of what he calls the “Achilles of all dialectical inferences” (A351). According to the Achilles argument, the soul or mind is known to be a simple, unitary substance, because only such a substance could think unitary thoughts. Called the “unity claim” (see Brook (1997)), it says:

(UC):

If a multiplicity of representations are to form a single representation, they must be contained in the absolute unity of the thinking substance. (A352)

Against UC, Kant argues that there is no reason to think the structure of a thought, as a complex of representations, isn’t mirrored in the complex structure of an entity that thinks thoughts. UC is not analytic, which is to say that there is no contradiction entailed by its negation. UC also fails to be a synthetic a priori claim, in that it follows neither from the nature of intuition’s forms, nor from categories. Hence, UC could only be shown to be true empirically, and because people have no empirical intuition of the self, people have no basis for thinking that UC must be true (A353).

Kant here makes a point similar to one made by contemporary functionalist accounts of the mind (see Meerbote (1991); Brook (1997)). Mental functions, including the unity of conscious thought, are consistent with a variety of different media in which those functions are realized. Kant says there is no contradiction in thinking that a plurality of substances might succeed in generating a single, unified thought. Hence, we cannot know that the mind is such that it must be simple in nature.

c. Numerical Identity (A361-66/B408)

Kant articulates the Third Paralogism as follows:

  1. What is conscious of the numerical identity of its Self in different times, is, to that extent, a person. (All C is P)
  2. Now, the soul is conscious of the numerical identity of its Self in different times. (S is C)
  3. Therefore, the soul is a person. (S is P)

Rational psychologists’ interest in establishing the personality of the soul or mind stems from the importance of proving that not only would the mind persist after the destruction of its body, but also that this mind would be the same person as before, not just some sort of bare consciousness or worse (for example, existing only as a “bare monad”).

Kant here makes two main points. First, the rational psychologist cannot infer from the sameness of the first-person representation (the “I think”) across applications of it in judgment to any conclusion concerning the sameness of the metaphysical subject referred to by that representation. Kant thus again makes a functionalist point. The medium in which a series of representational states inheres may change over time, and there is no contradiction in conceiving of a series of representations as being transferred from one substance to another (A363-4, note).

Second, Kant argues that we can be confident of the soul’s possession of personality by virtue of apperception’s persistence. The relevant notion of “personality” here distinguishes between a rational being and an animal. While the persistence of apperception (the persistence of the “I think” as being able to attach to all of one’s representations) does not provide an apperceiving subject with any insight into the true metaphysical nature of the mind, it does provide evidence of the soul’s possession of an understanding. Animals, by contrast, do not possess an understanding but, at best, according to Kant, only an analogue thereof. As Kant says in the Anthropology,

That man can have the I among his representations elevates him infinitely above all other living beings on earth. He is thereby a person […] that is, by rank and worth a completely distinct being from things, such as reason-less animals, with which one can do as one pleases. (An 7:127, §1)

Hence, so long as a soul possesses the capacity for apperception, this capacity signals the possession of an understanding, and thus serves to distinguish the human soul from that of an animal (see Dyck (2010), 120).

d. Relation to Objects in Space (A366-80/B409)

Finally, the Fourth Paralogism concerns the relation between awareness of one’s own mind and one’s awareness of other objects distinct from oneself. Thus, it also concerns the relation between the mind and objects in space. Kant describes the Fourth Paralogism as follows:

  1. What can be only causally inferred is never certain. (All I is not C)
  2. The existence of outer objects can only be causally inferred, not immediately perceived by us. (O is I)
  3. Therefore, we can never be certain of the existence of outer objects. (O is not C)

Kant locates the damaging ambiguity in the conception of “outer” objects. This is puzzling because it doesn’t play the relevant role as middle term in the syllogism. But Kant is quite clear that this is where the ambiguity lies and distinguishes between two distinct senses of the “outer” or “external”:

Transcendentally Outer/External:

A separate existence, in and of itself.

Empirically Outer/External:

An existence in space.

Kant’s point here is that all appearances in space are empirically external to the subject who perceives or thinks about them, while nevertheless being transcendentally internal. Such spatial appearances do not have an entirely independent metaphysical nature, because their spatial features depend at least in part on our forms of intuition.

Kant then uses this distinction not only to argue against the assumption of the rational psychologist that the mind is better known than any object in space (famously argued by Descartes), but also against those forms of external world skepticism championed by Descartes and Berkeley. Kant identifies Berkeley with what he calls “dogmatic idealism” and Descartes with what he calls “problematic idealism” (A377). He defines them thus:

Problematic Idealism:

We cannot be certain of the existence of any material body.

Dogmatic Idealism:

We can be certain that no material body exists – the notion of a body is self-contradictory.

Kant brings two arguments to bear against the rational psychologist’s assumption about the immediacy of our self-knowledge, as well as these two forms of skepticism, with mixed results. The two arguments are from “immediacy” and “imagination.”

i. The Immediacy Argument

In an extended passage in the Fourth Paralogism (A370-1) Kant makes the following argument:

External objects (bodies) are merely appearances, hence also nothing other than a species of my representations, whose objects are something only through these representations, but are nothing separated from them. Thus external things exist as well as my self, and indeed both exist on the immediate testimony of my self-consciousness, only with this difference: the representation of my Self, as the thinking subject, is related merely to inner sense, but the representations that designate extended beings are also related to outer sense. I am no more necessitated to draw inferences in respect of the reality of external objects than I am in regard to the reality of the objects of my inner sense (my thoughts), for in both cases they are nothing but representations, the immediate perception (consciousness) of which is at the same time a sufficient proof of their reality. (A370-1)

It helps to understand the argument as follows:

  1. Rational Psychology (RP) privileges awareness of the subject and its states over awareness of non-subjective states.
  2. But, transcendental idealism entails that people are aware of both subjective and objective states, as they appear, in the same way—via a form of intuition.
  3. So, either both kinds of awareness are immediate or they are both mediate.
  4. Because awareness of subjective states is obviously immediate, then awareness of objective states must also be immediate.
  5. Therefore, we are immediately aware of the states or properties of physical objects.

Here, Kant displays what he takes to be an advantage of Transcendental Idealism. Because both inner and outer sense depend on intuition, there is nothing special about inner intuition that privileges it over outer intuition. Both are, as intuitions, immediate presentations of objects, at least as they appear. Unfortunately, Kant never makes clear what he means by the term “immediate” [unmittelbar]. This issue is much contested (see Smit (2000)). At the very least, he means to signal that awareness in intuition is not mediated by any explicit or conscious inference, as when he says that the transcendental idealist “grants to matter, as appearance, a reality which need not be inferred, but is immediately perceived” (A371).

It is not obvious that an external world skeptic would find this argument convincing, as part of the grip of such skepticism lies in the seemingly convincing point that things could seem to one just as they currently do, even if there really were no external world causing one’s experiences. However, this point may simply beg the question against Kant (particularly premise (2) of the above argument). Certainly, Kant seems to think that his arguments for the existence of pure intuitions of space and time in the Transcendental Aesthetic lend some weight to his position. Thus, Kant is not so much arguing for Transcendental Idealism here as explaining some of the further benefits that come when the position is adopted. He does, however, present at least one further argument against the skeptical objection articulated above—the argument from imagination.

ii. The Argument from Imagination

Kant’s attempt to respond to the skeptical worry that things might appear to be outside us while not actually existing outside us appeals to the role imagination would have to play to make such a possibility plausible (A373-4; compare Anthropology, 7:167-8).

This material or real entity, however, this Something that is to be intuited in space, necessarily presupposes perception, and it cannot be invented by any power of imagination or produced independently of perception, which indicates the reality of something in space. Thus sensation is that which designates a reality in space and time, according to whether it is related to the one or the other mode of sensible intuition.

What follows is a reconstruction of this argument.

  1. If problematic idealism is correct, then it is possible for one to have never perceived any spatial object but only to have imagined doing so.
  2. But imagination cannot fabricate—it can only re-fabricate.
  3. So, if one has sensory experience of outer spatial objects, then one must have had at least one successful perception of an external spatial object.
  4. Therefore, it is certain that an extended spatial world exists.

Kant’s idea here is that the imagination is too limited to generate the various qualities that people experience as instantiated in external physical objects. Hence, it would not be possible to simply imagine an external physical world without having been originally exposed to the qualities instantiated in the physical world. Ergo, the physical world must exist. Even Descartes seems to agree with this, noting in Meditation I that “[certain simple kinds of qualities] are as it were the real colours from which we form all the images of things, whether true or false, that occur in our thought” (Descartes (1984), 13-14). Though Descartes goes on to doubt our capacity to know even such basic qualities given the possible existence of an evil deceiver, it is notable that the deceiver must be something other than ourselves, in order to account for all the richness and variety of what we experience (however, see Meditation VI (Descartes (1984), 54), where Descartes wonders whether there could be some hidden faculty in ourselves producing all of our ideas).

Unfortunately, it isn’t clear that the argument from imagination gets Kant the conclusion he wants, for all it shows is that there was at one time a physical world, which affected one’s senses and provided the material for one’s sense experiences. This might be enough to show that one has not always been radically deceived, but it is not enough to show that one is not currently being radically deceived. Even worse, it isn’t clear that a physical world must exist to generate the requisite material for the imagination. Perhaps all that is needed is something distinct from the subject, something capable of generating in it the requisite sensory experiences, whether or not they are veridical. This conclusion is thus compatible with that “something” being Descartes’s evil demon, or, in contemporary epistemology, with the subject’s being a brain in a vat. Hence, it is not obvious that Kant’s argument succeeds in refuting the skeptic. And even to the extent that it does, the argument still does not show that there is a physical world, as opposed merely to the existence of something distinct from the subject.

e. Lessons of the Paralogisms

Beyond the specific arguments of the Paralogisms and their conclusions, they present us with two central tenets of Kant’s conception of the mind. First, we cannot move from claims concerning the character or role of the first-person representation <I> to claims concerning the nature of the referent of that representation. This is a key part of his criticism of rational psychology. Second, people do not have privileged access to themselves as compared with things outside them. Both the self (or its states) and external objects are on a par with respect to intuition. This also means that people only have access to themselves as they appear, and not as they fundamentally, metaphysically, are (compare B157). Hence, according to Kant, self-awareness, just as much as awareness of anything distinct from the self, is conditioned by sensibility. Intellectual access to the self in apperception, Kant argues, does not reveal anything about one’s metaphysical nature, in the sense of the kind of thing that must exist to realize the various cognitive powers that Kant describes as characteristic of a being capable of apperception—a spontaneous understanding or intellect.

5. Summary

Kant’s conception of the mind, his distinction between sensory and intellectual faculties, his functionalism, his conception of mental content, and his work on the nature of the subject/object distinction, were all hugely influential. His work immediately inspired the German Idealist movement. He also became central to emerging ideas concerning the epistemology of science in the late 19th and early 20th centuries, in what became known as the “Neo-Kantian” movement in central and southern Germany. Though Anglophone interest in Kant ebbed somewhat in the early 20th century, his conception of the mind and criticisms of rationalist psychology were again influential mid-century via the work of “analytic” Kantians such as P.F. Strawson, Jonathan Bennett, and Wilfrid Sellars. In the early 21st century Kant’s work on the mind remains a touchstone for philosophical investigation, especially in the work of those influenced by Strawson or Sellars, such as Quassim Cassam, John McDowell, and Christopher Peacocke.

6. References and Further Reading

Quotations from Kant’s work are from the German edition of Kant’s works, the Akademie Ausgabe, with the first Critique cited by the standard A/B edition pagination, and the other works by volume and page. English translations are by the author of this article, though he has regularly consulted, and in most cases closely followed, the translations in the Cambridge Editions. Specific texts are abbreviated as follows:

  • An: Anthropology from a Pragmatic Point of View
  • C: Correspondence
  • CPR: Critique of Pure Reason
  • CJ: Critique of Judgment
  • JL: Jäsche Logic
  • LA: Lectures on Anthropology
  • LL: Lectures on Logic
  • LM: Lectures on Metaphysics
  • Pr: Prolegomena to Any Future Metaphysics

a. Kant’s Works in English

The most used scholarly English translations of Kant’s work are published by Cambridge University Press as the Cambridge Editions of the Works of Immanuel Kant. The following are from that collection and contain some of Kant’s most important and influential writings.

  • Correspondence, ed. Arnulf Zweig. Cambridge: Cambridge University Press, 1999.
  • Critique of Pure Reason, trans. Paul Guyer and Allen Wood. Cambridge: Cambridge University Press, 1998.
  • Critique of the Power of Judgment, trans. Paul Guyer and Eric Matthews. Cambridge: Cambridge University Press, 2000.
  • History, Anthropology, and Education, eds. Günter Zöller and Robert Louden. Cambridge: Cambridge University Press, 2007.
  • Lectures on Anthropology, ed. and trans. Allen W. Wood and Robert B. Louden. Cambridge: Cambridge University Press, 2012.
  • Lectures on Logic, trans. J. Michael Young. Cambridge: Cambridge University Press, 1992.
  • Lectures on Metaphysics, ed. and trans. Karl Ameriks and Steve Naragon. Cambridge: Cambridge University Press, 2001.
  • Practical Philosophy, ed. Mary Gregor. Cambridge: Cambridge University Press, 1996.
  • Theoretical Philosophy 1755-1770, ed. David Walford. Cambridge: Cambridge University Press, 2002.
  • Theoretical Philosophy after 1781, eds. Henry Allison and Peter Heath. Cambridge: Cambridge University Press, 2002.

b. Secondary Sources

  • Allais, Lucy. 2009. “Kant, Non-Conceptual Content and the Representation of Space.” Journal of the History of Philosophy 47 (3): 383–413.
  • Allison, Henry E. 2004. Kant’s Transcendental Idealism: Revised and Enlarged. New Haven: Yale University Press.
  • Ameriks, Karl. 2000. Kant and the Fate of Autonomy: Problems in the Appropriation of the Critical Philosophy. Cambridge: Cambridge University Press.
  • Anderson, R Lanier. 2005. “Neo-Kantianism and the Roots of Anti-Psychologism.” British Journal for the History of Philosophy 13 (2): 287–323.
  • Andrews, Kristin. 2014. The Animal Mind: An Introduction to the Philosophy of Animal Cognition. London: Routledge.
  • Bennett, Jonathan. 1966. Kant’s Analytic. Cambridge: Cambridge University Press.
  • Bennett, Jonathan. 1974. Kant’s Dialectic. Cambridge: Cambridge University Press.
  • Bermúdez, José Luis. 2003. “Ascribing Thoughts to Non-Linguistic Creatures.” Facta Philosophica 5 (2): 313–34.
  • Brook, Andrew. 1997. Kant and the Mind. Cambridge: Cambridge University Press.
  • Buroker, Jill Vance. 2006. Kant’s Critique of Pure Reason: An Introduction. Cambridge: Cambridge University Press.
  • Carl, Wolfgang. 1989. “Kant’s First Drafts of the Deduction of the Categories.” In Kant’s Transcendental Deductions, edited by Eckart Förster, 3–20. Stanford: Stanford University Press.
  • Carson, Emily. 1997. “Kant on Intuition and Geometry.” Canadian Journal of Philosophy 27 (4): 489–512.
  • Carson, Emily. 1999. “Kant on the Method of Mathematics.” Journal of the History of Philosophy 37 (4): 629–52.
  • Caygill, Howard. 1995. A Kant Dictionary. London: Blackwell.
  • Chignell, Andrew. 2014. “Modal Motivations for Noumenal Ignorance: Knowledge, Cognition, and Coherence.” Kant-Studien 105 (4): 573–97.
  • Descartes, Rene. 1984. The Philosophical Writings of Descartes. Edited by John Cottingham, Robert Stoothoff, and Dugald Murdoch. Vol. 2. Cambridge: Cambridge University Press.
  • Dicker, Georges. 2004. Kant’s Theory of Knowledge: An Analytical Introduction. Oxford: Oxford University Press.
  • Dyck, Corey W. 2010. “The Aeneas Argument: Personality and Immortality in Kant’s Third Paralogism.” In Kant Yearbook, edited by Dietmar Heidemann, 95–122.
  • Engstrom, Stephen. 2013. “Unity of Apperception.” Studi Kantiani 26: 37–54.
  • Friedman, Michael. 1992. Kant and the Exact Sciences. Cambridge, MA: Harvard University Press.
  • Gardner, Sebastian. 1999. Kant and the Critique of Pure Reason. London: Routledge.
  • Ginsborg, Hannah. 2006. “Kant and the Problem of Experience.” Philosophical Topics 34 (1 and 2): 59–106.
  • Griffith, Aaron M. 2012. “Perception and the Categories: A Conceptualist Reading of Kant’s Critique of Pure Reason.” European Journal of Philosophy 20 (2): 193–222.
  • Grüne, Stefanie. 2009. Blinde Anschauung. Frankfurt am Main: Vittorio Klostermann.
  • Guyer, Paul. 1987. Kant and the Claims of Knowledge. Cambridge: Cambridge University Press.
  • Guyer, Paul. 2014. Kant. London: Routledge.
  • Hanna, Robert. 2002. “Mathematics for Humans: Kant’s Philosophy of Arithmetic Revisited.” European Journal of Philosophy 10 (3): 328–52.
  • Hanna, Robert. 2005. “Kant and Nonconceptual Content.” European Journal of Philosophy 13 (2): 247–90.
  • Heck, Richard G. 2000. “Nonconceptual Content and the ‘Space of Reasons’.” The Philosophical Review 109 (4): 483–523.
  • Hume, David. 1888. A Treatise of Human Nature. Edited by L A Selby-Bigge. Oxford: Clarendon Press.
  • Hume, David. 2007. An Enquiry Concerning Human Understanding. Edited by Peter Millican. Oxford: Oxford University Press.
  • James, William. 1890. The Principles of Psychology. New York: Holt.
  • Keller, Pierre. 1998. Kant and the Demands of Self-Consciousness. Cambridge: Cambridge University Press.
  • Kitcher, Patricia. 1993. Kant’s Transcendental Psychology. New York: Oxford University Press.
  • Kitcher, Patricia. 2010. Kant’s Thinker. New York: Oxford University Press.
  • Leibniz, Gottfried Wilhelm Freiherr. 1996. New Essays on Human Understanding. Edited by Jonathan Bennett and Peter Remnant. Cambridge: Cambridge University Press.
  • Longuenesse, Béatrice. 1998. Kant and the Capacity to Judge. Princeton: Princeton University Press.
  • Lurz, Robert W. 2011. Mindreading Animals: The Debate over What Animals Know About Other Minds. Cambridge, MA: MIT Press.
  • Lurz, Robert W., ed. 2009. The Philosophy of Animal Minds. Cambridge: Cambridge University Press.
  • McDowell, John. 1996. Mind and World: With a New Introduction. Cambridge, MA: Harvard University Press.
  • McLear, Colin. 2011. “Kant on Animal Consciousness.” Philosophers’ Imprint 11 (15): 1–16.
  • McLear, Colin. 2015. “Two Kinds of Unity in the Critique of Pure Reason.” Journal of the History of Philosophy 53 (1): 79–110.
  • Meerbote, Ralf. 1991. “Kant’s Functionalism.” In Historical Foundations of Cognitive Science, edited by J-C Smith, 161–87. Dordrecht: Kluwer Academic Publishers.
  • Naragon, Steve. 1990. “Kant on Descartes and the Brutes.” Kant-Studien 81 (1): 1–23.
  • Parsons, Charles. 1964. “Infinity and Kant’s Conception of the ‘Possibility of Experience’.” The Philosophical Review 73 (2): 182–97.
  • Parsons, Charles. 1992. “The Transcendental Aesthetic.” In The Cambridge Companion to Kant, edited by Paul Guyer, 62–100. Cambridge: Cambridge University Press.
  • Paton, H J. 1936. Kant’s Metaphysic of Experience. Vol. 1 & 2. London: G. Allen & Unwin, Ltd.
  • Pendlebury, Michael. 1995. “Making Sense of Kant’s Schematism.” Philosophy and Phenomenological Research 55 (4): 777–97.
  • Pereboom, Derk. 1995. “Determinism Al Dente.” Noûs 29: 21–45.
  • Pereboom, Derk. 2006. “Kant’s Metaphysical and Transcendental Deductions.” In A Companion to Kant, edited by Graham Bird, 154–68. Oxford: Blackwell Publishing.
  • Pereboom, Derk. 2009. “Kant’s Transcendental Arguments.” Stanford Encyclopedia of Philosophy.
  • Proops, Ian. 2010. “Kant’s First Paralogism.” The Philosophical Review 119 (4): 449.
  • Schellenberg, S. 2011. “Perceptual Content Defended.” Noûs 45 (4): 714–50.
  • Sellars, Wilfrid. 1956. “Empiricism and the Philosophy of Mind.” Minnesota Studies in the Philosophy of Science 1: 253–329.
  • Sellars, Wilfrid. 1968. Science and Metaphysics: Variations on Kantian Themes. London: Routledge & Kegan Paul.
  • Sellars, Wilfrid. 1978. “Berkeley and Descartes: Reflections on the Theory of Ideas.” In Studies in Perception, edited by P K Machamer and R G Turnbull, 259–311. Columbus: Ohio State University Press.
  • Shabel, Lisa. 2006. “Kant’s Philosophy of Mathematics.” In The Cambridge Companion to Kant and the Critique of Pure Reason, edited by Paul Guyer, 94–128. Cambridge: Cambridge University Press.
  • Siegel, Susanna. 2010. “The Contents of Perception.” In The Stanford Encyclopedia of Philosophy, edited by Edward N Zalta.
  • Siegel, Susanna. 2011. The Contents of Visual Experience. Oxford: Oxford University Press.
  • Smit, Houston. 2000. “Kant on Marks and the Immediacy of Intuition.” The Philosophical Review 109 (2): 235–66.
  • Strawson, Peter Frederick. 1966. The Bounds of Sense. London: Routledge.
  • Strawson, Peter Frederick. 1970. “Imagination and Perception.” In Experience and Theory, edited by Lawrence Foster and Joe William Swanson. Amherst: University of Massachusetts Press.
  • Sutherland, Daniel. 2008. “Arithmetic from Kant to Frege: Numbers, Pure Units, and the Limits of Conceptual Representation.” In Kant and Philosophy of Science Today, edited by Michela Massimi, 135–64. Cambridge: Cambridge University Press.
  • Tolley, Clinton. 2013. “The Non-Conceptuality of the Content of Intuitions: A New Approach.” Kantian Review 18 (01): 107–36.
  • Tolley, Clinton. 2014. “Kant on the Content of Cognition.” European Journal of Philosophy 22 (2): 200–228.
  • Van Cleve, James. 1999. Problems from Kant. Oxford: Oxford University Press.
  • Wood, Allen W. 2005. Kant. Oxford: Blackwell Publishing.


Author Information

Colin McLear
Email: mclear@unl.edu
University of Nebraska
U. S. A.