Karl Popper: Political Philosophy

Among philosophers, Karl Popper (1902-1994) is best known for his contributions to the philosophy of science and epistemology. Most of his published work addressed philosophical problems in the natural sciences, especially physics; and Popper himself acknowledged that his primary interest was nature and not politics. However, his political thought has arguably had as great an impact as has his philosophy of science. This is certainly the case outside of the academy.  Among the educated general public, Popper is best known for his critique of totalitarianism and his defense of freedom, individualism, democracy and an “open society.” His political thought resides squarely within the camp of Enlightenment rationalism and humanism. He was a dogged opponent of totalitarianism, nationalism, fascism, romanticism, collectivism, and other kinds of (in Popper’s view) reactionary and irrational ideas.

Popper’s rejection of these ideas was anchored in a critique of the philosophical beliefs that, he argued, underpinned them, especially a flawed understanding of the scientific method. This approach is what gives Popper’s political thought its particular philosophical interest and originality—and its controversy, given that he locates the roots of totalitarianism in the ideas of some of the West’s most esteemed philosophers, ancient as well as modern. His defense of a free and democratic society stems in large measure from his views on the scientific method and how it should be applied to politics, history and social science.  Indeed, his most important political texts—The Poverty of Historicism (1944) and The Open Society and Its Enemies (1945)—offer a kind of unified vision of science and politics.  As explained below, the people and institutions of the open society that Popper envisioned would be imbued with the same critical spirit that marks natural science, an attitude which Popper called critical rationalism. This openness to analysis and questioning was expected to foster social and political progress as well as to provide a political context that would allow the sciences to flourish.

Table of Contents

  1. The Critique of the Closed Society
    1. Open versus Closed Societies
    2. Holism, Essentialism and Historicism
    3. Hegel, Marx and Modern Historicism
    4. Utopian Social Engineering
  2. Freedom, Democracy and the Open Society
    1. Minimalist Democracy
    2. Piecemeal Social Engineering
    3. Negative Utilitarianism
    4. Libertarian, Conservative or Social Democrat?
  3. References and Further Reading
    1. Primary Literature
    2. Secondary Literature

1. The Critique of the Closed Society

A central aim of The Open Society and Its Enemies as well as The Poverty of Historicism was to explain the origin and nature of totalitarianism. In particular, the rise of fascism, including in Popper’s native Austria, and the ensuing Second World War prompted Popper to begin writing these two essays in the late 1930s and early 1940s, while he was teaching in New Zealand. He described these works as his “war effort” (Unended Quest, 115).

The arguments in the two essays overlap a great deal. (In fact, The Open Society began as a chapter for Poverty.) Yet there is a difference in emphasis.  The Poverty of Historicism is concerned principally with the methodology of the social sciences and, in particular, with how a flawed idea, which Popper dubbed “historicism,” had led historians and social scientists astray methodologically and had also served as a handmaiden to tyranny. The Open Society, a much longer and, according to Popper, a more important work, included in-depth discussion of historicism and the methods of the social sciences. But it also featured an inquiry into the psychological and historical origins of totalitarianism, which he located in the nexus of a set of appealing but, he argued, false ideas. These included not only historicism but also what he labeled “holism” and “essentialism.” Together they formed the philosophical substrate of what Popper called the “closed society,” which in turn breeds totalitarianism.

a. Open versus Closed Societies

According to Popper, totalitarianism was not unique to the 20th century. Rather, it “belongs to a tradition which is just as old or just as young as our civilization itself” (Open Society Vol. 1, 1). In The Open Society, Popper’s search for the roots of totalitarianism took him back to ancient Greece. There he detected the emergence of what he called the first “open society” in democratic Athens of the 5th century B.C.E. Athenians, he argued, were the first to subject their own values, beliefs, institutions and traditions to critical scrutiny, and Socrates and the city’s democratic politics exemplified this new attitude. But reactionary forces were unnerved by the instability and rapid social change that an open society had unleashed. (Socrates was indicted on charges of corrupting the youth and introducing new gods.) They sought to turn back the clock and return Athens to a society marked by rigid class hierarchy, conformity to the customs of the tribe, and uncritical deference to authority and tradition—a “closed society.” This move back to tribalism was motivated by a widely and deeply felt uneasiness that Popper called the “strain of civilization.” The structured and organic character of closed societies helps to satisfy a deep human need for regularity and a shared common life, Popper said.  In contrast, the individualism, freedom and personal responsibility that open societies necessarily engender leave many feeling isolated and anxious, but this anxiety, Popper said, must be borne if we are to enjoy the greater benefits of living in an open society: freedom, social progress, growing knowledge, and enhanced cooperation. “It is the price we have to pay for being human” (Open Society Vol. 1, 176).

Popper charged that Plato emerged as the philosophical champion of the closed society and in the process laid the groundwork for totalitarianism. Betraying the open and critical temper of his mentor Socrates, in his Republic Plato devised an elaborate system that would arrest all political and social change and turn philosophy into an enforcer, rather than a challenger, of authority.  It would also reverse the tide of individualism and egalitarianism that had emerged in democratic Athens, establishing a hierarchical system in which the freedom and rights of the individual would be sacrificed to the collective needs of society.

Popper noted that Plato’s utopian vision in the Republic was in part inspired by Sparta, Athens’ enemy in the Peloponnesian War and, for Popper, an exemplar of the closed society. Spartan society focused almost exclusively on two goals: internal stability and military prowess. Toward these ends, the Spartan constitution sought to create a hive-like, martial society that always favored the needs of the collective over the individual and required a near total control over its citizenry. This included a primitive eugenics, in which newborn infants deemed insufficiently vigorous were abandoned to die. Spartan males judged healthy enough to merit life were separated from their families at a young age and provided an education consisting mainly of military training. The training produced fearsome warriors who were indifferent to suffering, submissive to authority, and unwaveringly loyal to the city. Fighting for the city was an honor granted solely to the male citizenry, while the degrading toil of cultivating the land was the lot reserved for an enslaved tribe of fellow Greeks, the helots. Rigid censorship was imposed on the citizenry, as well as laws that strictly limited contact with foreigners. Under this system, Sparta became a dominant military power in ancient Greece, but, unsurprisingly, made no significant contributions to the arts and sciences. Popper described Sparta as an “arrested tribalism” that sought to stymie “equalitarian, democratic and individualistic ideologies,” such as those found in Athens (Open Society Vol. 1, 182). It was no coincidence, he said, that the Nazis and other modern-day totalitarians were also inspired by the Spartans.

b. Holism, Essentialism and Historicism

Popper charged that three deep philosophical predispositions underpinned Plato’s defense of the closed society and, indeed, subsequent defenses of the closed society during the next two-and-a-half millennia. These ideas were holism, essentialism, and historicism.

Holism may be defined as the view that adequate understanding of certain kinds of entities requires understanding them as a whole. This is often held to be true for biological and social systems, for example, an organism, an ecosystem, an economy, or a culture.  A corollary that is typically held to follow from this view is that such entities have properties that cannot be reduced to the entities’ constituent parts. For instance, some philosophers argue that human consciousness is an emergent phenomenon whose properties cannot be explained solely by the properties of the physical components (nerve cells, neurotransmitters, and so forth) that comprise the human brain. Similarly, those who advocate a holistic approach to social inquiry argue that social entities cannot be reduced to the properties of the individuals that comprise them. That is, they reject methodological individualism and support methodological holism, as Popper called it.

Plato’s holism, Popper argued, was reflected in his view that the city—the Greek polis—was prior to and, in a sense, more real than the individuals who resided in it. For Plato “[o]nly a stable whole, the permanent collective, has reality, not the passing individuals” (Open Society Vol. 1, 80). This view in turn implied that the city has real needs that supersede those of individuals and was thus the source of Plato’s ethical collectivism.  According to Popper, Plato believed that a just society required individuals to sacrifice their needs to the interests of the state. “Justice for [Plato],” he wrote, “is nothing but health, unity and stability of the collective body” (Open Society Vol. 1, 106). Popper saw this as profoundly dangerous. In fact, he said, the view that some collective social entity—be it, for example, a city, a state, society, a nation, or a race—has needs that are prior and superior to the needs of actual living persons is a central ethical tenet of all totalitarian systems, whether ancient or modern. Nazis, for instance, emphasized the needs of the Aryan race to justify their brutal policies, whereas communists in the Soviet Union spoke of class aims and interests as the motor of history to which the individual must bend. The needs of the race or class superseded the needs of individuals. In contrast, Popper held, members of an open society see the state and other social institutions as human designs, subject to rational scrutiny, and always serving the interests of individuals—and never the other way around. True justice entails equal treatment of individuals, rather than Plato’s organicist view, which identifies justice with a well-functioning state.

Also abetting Plato’s support for a closed society was a doctrine that Popper named “methodological essentialism”. Adherents of this view claim “that it is the task of pure knowledge or ‘science’ to discover and to describe the true nature of things, i.e., their hidden reality or essence” (Open Society Vol. 1, 31).  Plato’s theory of the Forms exemplified this approach.  According to Plato, understanding of any kind of thing—for example, a bed, a triangle, a human being, or a city—requires understanding what Plato called its Form. The Forms are timeless, unchanging and perfect exemplars of sensible things found in our world. Coming to understand a Form, Plato believed, requires rational examination of its essence. Such understanding is governed by a kind of intuition rather than empirical inquiry. For instance, mathematical intuition provides the route to understanding the essential nature of triangles—that is, their Form—as opposed to attempting to understand the nature of triangles by measuring and comparing actual sensible triangles found in our world.

Although Forms are eternal and unchanging, Plato held that the imperfect copies of them that we encounter in the sensible world invariably undergo decay. Extending this theory presented a political problem for Plato. In fact, according to Popper, the disposition to decay was the core political problem that Plato’s philosophy sought to remedy. The very nature of the world is such that human beings and the institutions that they create tend to degrade over time. For Plato, this included cities, which he believed were imperfect copies of the Form of the city. This view of the city, informed by Plato’s methodological essentialism, produced a peculiar political science, Popper argued. It required, first, understanding the true and best nature of the city, that is, its Form. Second, in order to determine how to arrest (or at least slow) the city’s decay from its ideal nature, the study of politics must seek to uncover the laws or principles that govern the city’s natural tendency towards decay and thereby to halt the degradation. Thus Plato’s essentialism led him to seek a theory of historical change—a theory that brings order and intelligibility to the constant flux of our world. That is, Plato’s essentialism led to what Popper labeled “historicism.”

Historicism is the view that history is governed by historical laws or principles and, further, that history has a necessary direction and end-point. This being so, historicists believe that the aim of philosophy—and, later, history and social science—must be to predict the future course of society by uncovering the laws or principles that govern history. Historicism is a very old view, Popper said, predating Athens of the 5th century B.C.E. Early Greek versions of historicism held that the development of cities naturally and necessarily moves in cycles: a golden age followed by inevitable decay and collapse, which in some versions paves the way for rebirth and a new golden age.  In Plato’s version of this “law of decay,” the ideal city degenerates by stages into timarchy (rule by a military class), then oligarchy, then democracy, and, finally, tyranny. But Plato did not merely describe the gradual degeneration of the city; he offered a philosophical explanation of it, which relied upon his theory of the Forms and thus methodological essentialism. Going further, Plato sought to provide a way to arrest this natural tendency toward decay. This, Popper argued, was the deep aim of the utopian society developed in the Republic—a newly fabricated closed society as the solution to the natural tendency toward moral and political decline. It required creation of a rigid and hierarchical class society governed by philosopher kings, whose knowledge of the Forms would stave off decay as well as ensure the rulers’ incorruptibility. Tumultuous democratic Athens would be replaced with a stable and unchanging society. Plato saw this as justice, but Popper argued that it had all the hallmarks of totalitarianism, including rigid hierarchy, censorship, collectivism, and central planning—all of which would be reinforced through propaganda and deception, or, as Plato called them, “noble lies.”

Plato’s deep mistrust of democracy was no doubt in part a product of experience. As a young man he saw the citizens of Athens, under the influence of demagogues, back ill-advised military campaigns that ultimately led to the Spartan victory over the city in 404 B.C.E. After democracy was reestablished following the Spartan occupiers’ departure in 403 B.C.E., he witnessed the Athenian people’s vote to execute its wisest citizen, Socrates.  Popper as a young man had also witnessed the collapse of democracy, in his native Austria and throughout Europe. But he drew very different lessons from that experience. For him, democracy remained a bulwark against tyranny, not its handmaiden.  For reasons explained in the next section, Popper held that by rejecting democracy Plato’s system destroyed not only individual freedom but also the conditions for social, political, scientific and moral progress.

Popper’s criticism of Plato sparked a lively and contentious debate. Prior to publication of The Open Society, Plato was widely regarded as the wellspring of enlightened humanism in the Western tradition.  Popper’s recasting of Plato as a proto-fascist was scandalous. Classicists rose to Plato’s defense and accused Popper of reading Plato ahistorically, using dubious or tendentious translations of his words, and failing to appreciate the ironic and literary elements in Plato’s dialogues. These criticisms exposed errors in Popper’s scholarship. But Popper was nonetheless successful in drawing attention to potential totalitarian dangers of Plato’s utopianism. Subsequent scholarship could not avoid addressing his arguments.

Although Plato was the principal target of Popper’s criticisms in The Open Society, he also detected dangerous tendencies in other ancient Greek philosophers’ ideas, most notably Aristotle’s. Plato’s greatest student, Popper argued, had inherited his teacher’s essentialism but had given it a teleological twist. Like Plato, Aristotle believed that knowledge of an entity required grasping its essence. However, Plato and Aristotle differed in their understanding of the relationship between an entity’s essence and how that essence was manifested in the sensible world. Plato held that the entities found in the sensible world were imperfect, decaying representations of the Forms. Thus his understanding of history, Popper argued, was ultimately pessimistic: the world degrades over time. Plato’s politics was an attempt to arrest or at least slow this degradation.  In contrast, Aristotle understood an entity’s essence as a bundle of potentialities that become manifest as the entity develops through time. An entity’s essence acts as a kind of internal motor that impels the entity toward its fullest development, or what Aristotle called its final cause. The oak tree, for example, is the final cause of an acorn, the end towards which it strives.

Herein Popper detected an implicit historicism in Aristotle’s epistemology. Though Aristotle himself produced no theory of history, his essentialism wedded to his teleology naturally lent itself to the notion that a person’s or a state’s true nature can only be understood as it is revealed through time. “Only if a person or a state develops, and only by way of its history, can we get to know anything about its ‘hidden undeveloped essence’” (Open Society Vol. 2, 7). Further, Popper argued that Aristotle’s essentialism naturally aligned with the notion of historical destiny: a state’s or a nation’s development is predetermined by its “hidden undeveloped essence.”

Popper believed that he had revealed deep links between ancient Greek philosophy and hostility toward the open society. In Plato’s essentialism, collectivism, holism and historicism, Popper detected the philosophical underpinning for Plato’s ancient totalitarian project. As we shall see in the next section, Popper argued that these very same ideas were at the heart of modern totalitarianism, too. Though for Popper Plato was the most important ancient enemy of the open society, in Aristotle’s teleological essentialism Popper found a key link connecting ancient and modern historicism. In fact, the idea of historical destiny that Aristotle’s thought generated was at the core of the thought of two 19th century philosophers, G.W.F. Hegel and Karl Marx, whom Popper charged with facilitating the emergence of modern closed societies. The “far-reaching historicist consequences” of Aristotle’s essentialism “were slumbering for more than twenty centuries, ‘hidden and undeveloped’,” until the advent of Hegel’s philosophical system (Open Society Vol. 2, 8).

c. Hegel, Marx and Modern Historicism

History was central to both Hegel’s and Marx’s philosophy, and for Popper their ideas exemplified historicist thinking and the political dangers that it entailed. Hegel’s historicism was reflected in his view that the dialectical interaction of ideas was the motor of history. The evolution and gradual improvement of philosophical, ethical, political and religious ideas determines the march of history, Hegel argued. History, which Hegel sometimes described as the gradual unfolding of “Reason,” comes to an end when all the internal contradictions in human ideas are finally resolved.

Marx’s historical materialism famously inverted Hegel’s philosophy. For Marx, history was a succession of economic and political systems, or “modes of production” in Marx’s language. As technological innovations and new ways of organizing production led to improvements in a society’s capacity to meet human material needs, new modes of production would emerge. In each new mode of production, the political and legal system, as well as the dominant moral and religious values and practices, would reflect the interests of those who controlled the new productive system. Marx believed that the capitalist mode of production was the penultimate stage of human history. The productive power unleashed by new technologies and factory production under capitalism was ultimately incompatible with capitalism as an economic and political system, which was marked by inefficiency, instability and injustice.  Marx predicted that these flaws would inevitably lead to revolution followed by establishment of communist society. This final stage of human development would be one of material abundance and true freedom and equality for all.

According to Popper, though they disagreed on the mechanism that directed human social evolution, both Hegel and Marx, like Plato, were historicists because they believed that trans-historical laws governed human history.  This was the key point for Popper, as well as the key error and danger.

The deep methodological flaw of historicism, according to Popper, is that historicists wrongly see the goal of social science as historical forecast—to predict the general course of history. But such prediction is not possible, Popper said. He provided two arguments that he said demonstrated its impossibility. The first was a succinct logical argument: Human knowledge grows and changes over time, and knowledge in turn affects social events. (That knowledge might be, for example, a scientific theory, a social theory, or an ethical or religious idea.) We cannot predict what we will know in the future (otherwise we would already know it); therefore we cannot predict the future.  As long as it is granted that knowledge affects social behavior and that knowledge changes over time—two premises that Popper considered incontestable—then the view that we can predict the future cannot be true and historicism must be rejected. This argument, it should be noted, also reflected Popper’s judgment that the universe is nondeterministic: that is, he believed that prior conditions and the laws of nature do not completely causally determine the future, including human ideas and actions. Our universe is an “open” universe, he said.

Popper’s second argument against the possibility of historical forecasting focused on the role of laws in social explanations. According to Popper, historicists wrongly believe that genuine social science must be a kind of “theoretical history” in which the aim is to uncover laws of historical development that explain and predict the course of history (Poverty of Historicism, 39). But Popper contended that this represents a fundamental misunderstanding of scientific laws. In fact, Popper argued, there is no such thing as a law of historical development. That is, there are no trans-historical laws that determine the transition from one historical period to the next.  Failure to understand why this is so represented a deep philosophical error. There may be sociological laws that govern human behavior within particular social systems or institutions, Popper said. For instance, the laws of supply and demand are kinds of social laws governing market economies. But the future course of history cannot be predicted and, in particular, laws that govern the general trajectory of history do not exist. Popper does not deny that there can be historical trends—a tendency towards greater freedom and equality, more wealth or better technology, for instance—but unlike genuine laws, trends are always dependent upon conditions. Change the conditions and the trends may alter or disappear. A trend towards greater freedom or knowledge could be disrupted by, say, the outbreak of a pandemic disease or the emergence of a new technology that facilitates authoritarian regimes. Popper acknowledges that in certain cases natural scientists can predict the future—even the distant future—with some confidence, as is the case with astronomy, for instance. But this type of successful long-range forecasting can occur only in physical systems that are “well-isolated, stationary and recurrent,” such as the solar system (Conjectures and Refutations, 339). Social systems, however, can never be isolated and stationary.

d. Utopian Social Engineering

So historicism as social science is deeply defective, according to Popper. But he also argued that it was politically dangerous and that this danger stemmed from historicism’s natural and close allegiance with what Popper called “utopian social engineering.” Such social planning “aims at remodeling the ‘whole of society’ in accordance with a definite plan or blueprint,” as opposed to social planning that aims at gradual and limited adjustments. Popper admitted that the alliance between historicism and utopian engineering was “somewhat strange” (Poverty of Historicism, 73). Because historicists believe that laws determine the course of history, from their vantage it is ultimately pointless to try to engineer social change. Just as a meteorologist can forecast the weather, but not alter it, the same holds for social scientists, historicists believe. They can predict future social developments, but not cause or alter them. Thus “out-and-out historicism” is against utopian planning—or even against social planning altogether (Open Society Vol. 1, 157). For this reason Marx rejected attempts to design a socialist system; in fact he derided such projects as “utopian.” Nonetheless, the connection between historicism and utopian planning remains strong, Popper insisted. Why?

First, historicism and utopian engineering share a connection to utopianism. Utopians seek to establish an ideal state of some kind, one in which all conflicts in social life are resolved and ultimate human ends—for example, freedom, equality, true happiness—are somehow reconciled and fully realized. Attaining this final goal requires radical overhaul of the existing social world and thus naturally suggests the need for utopian social engineering.  Many versions of historicism are thus inclined towards utopianism. As noted above, both Marx’s and Hegel’s theories of history, for instance, predict an end to history in which all social contradictions will be permanently resolved. Second, historicism and utopian social engineering both tend to embrace holism. Popper said that historicists, like utopian engineers, typically believe that “society as a whole” is the proper object of scientific inquiry. For the historicist, society must be understood in terms of social wholes, and to understand the deep forces that move these wholes, one must understand the laws of history. Thus the historicists’ anticipation of the coming utopia, and their knowledge of the historical tendencies that will bring it about, may tempt them to try to intervene in the historical process and thereby, as Marx said, “lessen the birth pangs” associated with the arrival of the new social order. So while a philosophically consistent historicism might seem to lead to political quiescence, the fact is that historicists often cannot resist political engagement. In addition, Popper noted that even less radical versions of historicism, such as Plato’s, permit human intervention.

Popper argued that utopian engineering, though superficially attractive, is fatally flawed: it invariably leads to multitudinous unintended and usually unwelcome consequences. The social world is so complex, and our understanding of it so incomplete, that the full impact of any imposed change to it, especially grand scale change, can never be foreseen. But, because of their unwarranted faith in their historical prophecies, the utopian engineers will be methodologically ill-equipped to deal with this reality. The unintended consequences will be unanticipated, and the engineers will be forced to respond to them in a haphazard and ill-informed manner: “[T]he greater the holistic changes attempted, the greater are their unintended and largely unexpected repercussions, forcing on the holistic engineer the expedient of piecemeal improvisation” or the “notorious phenomenon of unplanned planning” (Poverty of Historicism, 68-69). One particularly important cause of unintended consequences that utopian engineers are generally blind to is what Popper called the “human factor” in all institutional design. Institutions can never wholly govern individuals’ behavior, he said, as human choice and human idiosyncrasies will ensure this. Thus no matter how thoroughly and carefully an institution is designed, the fact that institutions are filled with human beings results in a certain degree of unpredictability in their operation. But the historicists’ holism leads them to believe that individuals are merely pawns in the social system, dragged along by larger social forces outside their control. The effect of the human factor is that utopian social engineers inevitably are forced, despite themselves, to try to alter human nature itself in their bid to transform society. Their social plan “substitutes for [the social engineers’] demand that we build a new society, fit for men and women to live in, the demand that we ‘mould’ these men and women to fit into this new society” (Poverty of Historicism, 70).

Achieving such molding requires awesome and total power, and thus utopian engineering naturally tends toward the most severe authoritarian dictatorship. But this is not the only reason that utopian engineering and tyranny are allied. The central planning that it requires invariably concentrates power in the hands of the few, or even the one. This is why even utopian projects that officially embrace democracy tend towards authoritarianism. Authoritarian societies are in turn hostile to any public criticism, which deprives the planners of needed feedback about the impact of their policies and thereby further undermines the effectiveness of utopian engineering. In addition, Popper argued that the utopian planners’ historicism makes them indifferent to the misery that their plans cause. Having uncovered what they believe is inevitable en route to utopia, they all too easily countenance any suffering as a necessary part of that process, and, moreover, they will be inclined to see such suffering as outweighed by the benefits that will flow to all once utopia is reached.

Popper’s discussion of utopian engineering and its link to historicism is highly abstract. His criticisms are generally aimed at “typical” historicists and utopian planners, rather than actual historical or contemporary figures.  This reluctance to name names is somewhat surprising, given that Popper himself later stated that the political disasters of the 1930s and 40s were the impetus for his foray into political philosophy. Exactly whom did Popper think was guilty of social science malpractice? A contemporary reader with a passing familiarity with 20th-century history is bound to suppose that Popper had in mind the horrors of the Soviet Union when he discussed utopian planning. Indeed, the attempts to transform the Soviet Union into a modern society—the “five year plans,” rapid industrialization, collectivization of agriculture, and so forth—would seem to feature all the elements of utopian engineering. They were fueled by Marxist historicism and utopianism, centrally planned, aimed at wholesale remodeling of Russian society, and even sought to create a new type of person—“New Soviet Man”—through indoctrination and propaganda. Moreover, the utopian planning had precisely the pernicious effects that Popper predicted. The Soviet Union soon morphed into a brutal dictatorship under Stalin, criticism of the leadership and their programs was ruthlessly suppressed, and the various ambitious social projects were bedeviled by massive unintended consequences. The collectivization of agriculture, for instance, led to a precipitous drop in agricultural production and some 10 million deaths, partly from the unintended consequence of mass starvation and partly from the Soviet leaders’ piecemeal improvisation of murdering incorrigible peasants. However, when writing Poverty and The Open Society, Popper regarded the Soviet experiments, at least the early ones, as examples of piecemeal social planning rather than the utopian kind. 
His optimistic assessment is no doubt explained partly by his belief at the time that the Russian revolution was a progressive event, and he was thus reluctant to criticize the Soviet Union (Hacohen, 396-397). In any event, the full horrors of the Soviet social experiments were not yet known to the wider world. In addition, the Soviets during the Second World War were part of the alliance against fascism, which Popper saw as a much greater threat to humanity. In fact, initially Popper viewed totalitarianism as an exclusively right-wing phenomenon. However, he later became an unambiguous opponent of Soviet-style communism, and he dedicated the 1957 publication in book form of The Poverty of Historicism to the “memory of the countless men, women and children of all creeds or nations or races who fell victims to the fascist and communist belief in Inexorable Laws of Historical Destiny.”

2. Freedom, Democracy and the Open Society 

Having uncovered what he believed were the underlying psychological forces abetting totalitarianism (the strain of civilization) as well as the flawed philosophical ideas (historicism, holism and essentialism), Popper provided his own account of the values and institutions needed to sustain an open society in the contemporary world.  He viewed modern Western liberal democracies as open societies and defended them as “the best of all political worlds of whose existence we have any historical knowledge” (All Life Is Problem Solving, 90). For Popper, their value resided principally in the individual freedom that they permitted and their ability to self-correct peacefully over time. That they were democratic and generated great prosperity was merely an added benefit. What gives the concept of an open society its interest is not so much the originality of the political system that Popper advocated, but rather the novel grounds on which he developed and defended this political vision. Popper’s argument for a free and democratic society is anchored in a particular epistemology and understanding of the scientific method. He held that all knowledge, including knowledge of the social world, was conjectural and that freedom and social progress ultimately depended upon the scientific method, which is merely a refined and institutionalized process of trial and error.  Liberal democracies in a sense both embodied and fostered this understanding of knowledge and science.

a. Minimalist Democracy

Popper’s view of democracy was simple, though not simplistic, and minimalist. Rejecting the question “Who should rule?” as the fundamental question of political theory, Popper proposed a new one: “How can we so organize political institutions that bad or incompetent rulers can be prevented from doing too much damage?” (Open Society Vol. 1, 121). This is fundamentally a question of institutional design, Popper said. Democracy happens to be the best type of political system because it goes a long way toward solving this problem by providing a nonviolent, institutionalized and regular way to get rid of bad rulers—namely by voting them out of office. For Popper, the value of democracy did not reside in the fact that the people are sovereign. (And, in any event, he said, “the people do not rule anywhere, it is always governments that rule” [All Life Is Problem Solving, 93].) Rather, Popper defended democracy principally on pragmatic or empirical grounds, not on the “essentialist” view that democracy by definition is rule by the people or on the view that there is something intrinsically valuable about democratic participation. With this move, Popper is able to sidestep altogether a host of traditional questions of democratic theory: On what grounds are the people sovereign? Who, exactly, shall count as “the people”? How shall they be represented? The role of the people is simply to provide a regular and nonviolent way to get rid of incompetent, corrupt or abusive leaders.

Popper devoted relatively little thought toward the design of the democratic institutions that permit people to remove their leaders or otherwise prevent them from doing too much harm. But he did emphasize the importance of instituting checks and balances in the political system. Democracies must seek “institutional control of the rulers by balancing their power against other powers” (Ibid.). This idea, which was a key component of the “new science” of politics in the 18th century, was expressed most famously by James Madison in Federalist Paper #51. “A dependence on the people is, no doubt, the primary control on the government,” Madison wrote, “but experience has taught mankind the necessity of auxiliary precautions.” That is, government must be designed such that “ambition must be made to counteract ambition.” Popper also argued that two-party systems, such as those found in the United States and Great Britain, are superior to proportional representation systems; he reasoned that in a two-party system voters are more easily able to assign failure or credit to a particular political party, that is, the one in power at the time of an election. This in turn fosters self-criticism in the defeated party: “Under such a system … parties are from time to time forced to learn from their mistakes” (All Life Is Problem Solving, 97). For these reasons, government in a two-party system better mirrors the trial-and-error process found in science, leading to better public policy. In contrast, Popper argued that proportional representation systems typically produce multiple parties and coalition governments in which no single party has control of the government. This makes it difficult for voters to assign responsibility for public policy, and thus elections are less meaningful and government less responsive. (It should be noted that Popper ignored that divided government is a typical outcome in the U.S. system. It is relatively infrequent for one party to control the presidency along with both chambers of the U.S. Congress, making it difficult for voters to determine responsibility for public policy successes and failures.)

Importantly, Popper’s theory of democracy did not rely upon a well-informed and judicious public. It did not even require that the public, though ill-informed, nonetheless exercises a kind of collective wisdom. In fact, Popper explicitly rejected vox populi vox dei as a “classical myth.” “We are democrats,” Popper wrote, “not because the majority is always right, but because democratic traditions are the least evil ones of which we know” (Conjectures and Refutations, 351). Better than any other system, democracies permit the change of government without bloodshed. Nonetheless Popper expressed the hope that public opinion and the institutions that influence it (universities, the press, political parties, cinema, television, and so forth) could become more rational over time by embracing the scientific tradition of critical discussion—that is, the willingness to submit one’s ideas to public criticism and the habit of listening to another person’s point of view.

b. Piecemeal Social Engineering

So the chief role of the citizen in Popper’s democracy is the small but important one of removing bad leaders. How then is public policy to be forged and implemented? Who forges it? What are its goals? Here Popper introduced the concept of “piecemeal social engineering,” which he offered as a superior alternative to the utopian engineering described above. Unlike utopian engineering, piecemeal social engineering must be “small scale,” Popper said, meaning that social reform should focus on changing one institution at a time. Also, whereas utopian engineering aims for lofty and abstract goals (for example, perfect justice, true equality, a higher kind of happiness), piecemeal social engineering seeks to address concrete social problems (for example, poverty, violence, unemployment, environmental degradation, income inequality). It does so through the creation of new social institutions or the redesign of existing ones. These new or reconfigured institutions are then tested through implementation and altered continually in light of their effects. Institutions thus may undergo gradual improvement over time, and social ills may be gradually reduced. Popper compared piecemeal social engineering to physical engineering. Just as physical engineers refine machines through a series of small adjustments to existing models, social engineers gradually improve social institutions through “piecemeal tinkering.” In this way, “[t]he piecemeal method permits repeated experiments and continuous readjustments” (Open Society Vol. 1, 163). Only such social experiments, Popper said, can yield reliable feedback for social planners. In contrast, as discussed above, social reform that is wide-ranging, highly complex and involves multiple institutions will produce social experiments in which it is too difficult to untangle causes and effects.
The utopian planners suffer from a kind of hubris, falsely and tragically believing that they possess reliable experimental knowledge about how the social world operates.  But the “piecemeal engineer knows, like Socrates, how little he knows. He knows that we can learn only from our mistakes” (Poverty of Historicism, 67).

Thus, as with his defense of elections in a democracy, Popper’s argument for piecemeal social engineering rests principally on its compatibility with the trial-and-error method of the natural sciences: a theory is proposed and tested, errors in the theory are detected and eliminated, and a new, improved theory emerges, starting the cycle over. Via piecemeal engineering, the process of social progress thus parallels scientific progress. Indeed, Popper says that piecemeal social engineering is the only approach to public policy that can be genuinely scientific: “This—and no Utopian planning or historical prophecy—would mean the introduction of scientific method into politics, since the whole secret of scientific method is a readiness to learn from mistakes” (Open Society Vol. 1, 163).

c. Negative Utilitarianism

If piecemeal social engineers should target specific social problems, what criteria should they use to determine which problems are most urgent? Here Popper introduced a concept that he dubbed “negative utilitarianism,” which holds that the principal aim of politics should be to reduce suffering rather than to increase happiness. “[I]t is my thesis,” he wrote, “that human misery is the most urgent problem of a rational public policy” (Conjectures and Refutations, 361). He made several arguments in favor of this view.

First, he claimed that there is no moral symmetry between suffering and happiness: “In my opinion … human suffering makes a direct moral appeal, namely, an appeal for help, while there is no similar call to increase the happiness of a man who is doing well anyway” (Open Society Vol. 1, 284). He added:

A further criticism of the Utilitarian formula ‘Maximize pleasure’ is that it assumes, in principle, a continuous pleasure-pain scale which allows us to treat degrees of pain as negative degrees of pleasure. But, from a moral point of view, pain cannot be outweighed by pleasure, and especially not one man’s pain by another man’s pleasure (Ibid.).

In arguing against what we might call “positive utilitarianism,” Popper stressed the dangers of utopianism. Attempts to increase happiness, especially when guided by some ideal of complete or perfect happiness, are bound to lead to perilous utopian political projects. “It leads invariably to the attempt to impose our scale of ‘higher’ values upon others, in order to make them realize what seems to us of greatest importance for their happiness; in order, as it were, to save their souls. It leads to Utopianism and Romanticism” (Open Society Vol. II, 237). In addition, such projects are dangerous because they tend to justify extreme measures, including severe human suffering in the present, as necessary means of securing a much greater human happiness in the future. “[W]e must not argue that the misery of one generation may be considered as a mere means to the end of securing the lasting happiness of some later generation or generations” (Conjectures and Refutations, 362). Moreover, such projects are doomed to fail anyway, owing to the unintended consequences of social planning and the irreconcilability of the ultimate human ends of freedom, equality, and happiness. Thus Popper’s rejection of positive utilitarianism becomes part of his broader critique of utopian social engineering, while his advocacy of negative utilitarianism is tied to his support for piecemeal social engineering. It is piecemeal engineering that provides the proper approach to tackling the identifiable, concrete sources of suffering in our world.

Finally, Popper offered the pragmatic argument that the negative utilitarian approach “adds to clarity in the field of ethics” by requiring that “we formulate our demands negatively” (Open Society Vol. 1, 285). Properly understood, Popper says, the aim of science is “the elimination of false theories … rather than the attainment of established truths” (Ibid.). Similarly, ethical public policy may benefit by aiming at “the elimination of suffering rather than the promotion of happiness” (Ibid.). Popper thought that reducing suffering provides a clearer target for public policy than chasing after the will-o’-the-wisp, never-ending goal of increasing happiness. In addition, he argued, it is easier to reach political agreement to combat suffering than to increase happiness, thus making effective public policy more likely. “For new ways of happiness are theoretical, unreal things, about which it may be difficult to form an opinion. But misery is with us, here and now, and it will be with us for a long time to come. We all know it from experience” (Conjectures and Refutations, 346). Popper thus calls for a public policy that aims at reducing and, hopefully, eliminating such readily identifiable and universally agreed upon sources of suffering as “poverty, unemployment, national oppression, war, and disease” (Conjectures and Refutations, 361).

d. Libertarian, Conservative or Social Democrat?

Popper’s political thought would seem to fit most comfortably within the liberal camp, broadly understood. Reason, toleration, nonviolence and individual freedom formed the core of his political values, and, as we have seen, he identified modern liberal democracies as the best-to-date embodiment of an open society. But where, precisely, did he reside within liberalism? Here Popper’s thought is difficult to categorize, as it includes elements of libertarianism, conservatism, and socialism—and, indeed, representatives from each of these schools have claimed him for their side.

The case for Popper’s libertarianism rests mainly on his emphasis on freedom and his hostility to large-scale central planning. He insisted that freedom—understood as individual freedom—is the most important political value and that efforts to impose equality can lead to tyranny. “Freedom is more important than equality,” he wrote, and “the attempt to realize equality endangers freedom” (Unended Quest, 36). Popper also had great admiration for Friedrich Hayek, the libertarian economist of the so-called Austrian school, and he drew heavily upon Hayek’s ideas in his critique of central planning. However, Popper also espoused many views that would be anathema to libertarians. Although he acknowledged “the tremendous benefit to be derived from the mechanism of free markets,” he seemed to regard economic freedom as important mainly for its instrumental role in producing wealth rather than as an important end in itself (Open Society Vol. II, 124). Further, he warned of the dangers of unbridled capitalism, even declaring that “the injustice and inhumanity of the unrestrained ‘capitalist system’ described by Marx cannot be questioned” (Ibid.). The state therefore must serve as a counteracting force against the predations of concentrated economic power: “We must construct social institutions, enforced by the power of the state, for the protection of the economically weak from the economically strong” (Open Society Vol. II, 125). This meant that the “principle of non-intervention, of an unrestrained economic system, has to be given up” and replaced by “economic interventionism” (Ibid.). Such interventionism, which he also called “protectionism,” would be implemented via the piecemeal social engineering described above.
This top-down and technocratic vision of politics is hard to reconcile with libertarianism, whose adherents, following Hayek, tend to believe that such social engineering is generally counterproductive, enlarges the power and thus the danger of the state, and violates individual freedom.

It is in this interventionist role for the state where the socialistic elements of Popper’s political theory are most evident. In his intellectual autobiography Unended Quest, Popper says that he was briefly a Marxist in his youth, but soon rejected the doctrine for what he saw as its adherents’ dogmatism and embrace of violence.  Socialism nonetheless remained appealing to him, and he remained a socialist for “several years” after abandoning Marxism (Unended Quest, 36). “For nothing could be better,” he wrote, “than living a modest, simple, and free life in an egalitarian society” (Ibid.). However, eventually he concluded that socialism was “no more than a beautiful dream,” and the dream is undone by the conflict between freedom and equality (Ibid.).

But though Popper saw utopian efforts to create true social and economic equality as dangerous and doomed to fail anyway, he continued to support efforts by the state to reduce and even eliminate the worst effects of capitalism. As we saw above, he advocated the use of piecemeal social engineering to tackle the problems of poverty, unemployment, disease and “rigid class differences.” And it is clear that for Popper the solutions to these problems need not be market-oriented solutions. For instance, he voiced support for establishing a minimum income for all citizens as a means to eliminate poverty. It seems then that his politics put into practice would produce a society more closely resembling the so-called social democracies of northern Europe, with their more generous social welfare programs and greater regulation of industry, than the United States, with its more laissez-faire capitalism and comparatively paltry social welfare programs. That said, it should be noted that in later editions of The Open Society, Popper grew somewhat more leery of direct state intervention to tackle social problems, preferring tinkering with the state’s legal framework, if possible, to address them. He reasoned that direct intervention by the state always empowers the state, which endangers freedom.

Evidence of Popper’s conservatism can be found in his opposition to radical change. His critique of utopian engineering at times seems to echo Edmund Burke’s critique of the French Revolution. Burke depicted the bloodletting of the Terror as an object lesson in the dangers of sweeping aside all institutions and traditions overnight and replacing them with an abstract and untested social blueprint. Also like Burke and other traditional conservatives, Popper emphasized the importance of tradition for ensuring order, stability and well-functioning institutions. People have an inherent need for regularity and thus predictability in their social environment, Popper argued, which tradition is crucial for providing. However, there are important differences between Popper’s and Burke’s understandings of tradition. Popper included Burke, as well as the influential 20th-century conservative Michael Oakeshott, in the camp of the “anti-rationalists.” This is because “their attitude is to accept tradition as something just given”; that is, they “accept tradition uncritically” (Conjectures and Refutations, 120, 122, Popper’s emphasis). Such an attitude treats the values, beliefs and practices of a particular tradition as “taboo.” Popper, in contrast, advocated a “critical attitude” toward tradition (Ibid., Popper’s emphasis). “We free ourselves from the taboo if we think about it, and if we ask ourselves whether we should accept it or reject it” (Ibid.). Popper emphasized that a critical attitude does not require stepping outside of all traditions, something he denied was possible. Just as criticism in the sciences always targets particular theories and always takes place from the standpoint of some theory, so too for social criticism with respect to tradition. Social criticism necessarily focuses on particular traditions and does so from the standpoint of a tradition.
In fact, the critical attitude toward tradition is itself a tradition—namely the scientific tradition—that dates back to the ancient Greeks of the 6th and 5th centuries B.C.E.

Popper’s theory of democracy also arguably contained conservative elements insofar as it required only a limited role for the average citizen in governing.  As we saw above, the primary role of the public in Popper’s democracy is to render a verdict on the success or failure of a government’s policies. For Popper public policy is not to be created through the kind of inclusive public deliberation envisioned by advocates of radical or participatory democracy. Much less is it to be implemented by ordinary citizens. Popper summed up his view by quoting Pericles, the celebrated statesman of Athenian democracy in 5th-century B.C.E.: “’Even if only a few of us are capable of devising a policy or putting it into practice, all of us are capable of judging it’.” Popper added, “Please note that [this view] discounts the notion of rule by the people, and even of popular initiative. Both are replaced with the very different idea of judgement by the people” (Lessons of This Century, 72, Popper’s emphasis). This view in some ways mirrors traditional conservatives’ support for rule by “natural aristocrats,” as Burke called them, in a democratic society. Ideally, elected officials would be drawn from the class of educated gentlemen, who would be best fit to hold positions of leadership owing to their superior character, judgment and experience.  However, in Popper’s system, good public policy in a democracy would result not so much from the superior wisdom or character of its leadership but rather from their commitment to the scientific method. As discussed above, this entailed implementing policy changes in a piecemeal fashion and testing them through the process of trial and error. Popper’s open society is technocratic rather than aristocratic.

3. References and Further Reading

The key texts for Popper’s political thought are The Open Society and Its Enemies (1945) and The Poverty of Historicism (1944/45). Popper continued to write and speak about politics until his death in 1994, but his later work was mostly refinement of the ideas that he developed in those two seminal works. Much of that refinement is contained in Conjectures and Refutations (1963), a collection of essays and addresses from the 1940s and 50s that includes in-depth discussions of public opinion, tradition and liberalism. These and other books and essay collections by Popper that include sustained engagement with political theory are listed below:

a. Primary Literature

  • Popper, Karl. 1945/1966. The Open Society and Its Enemies, Vol. 1, Fifth Edition. Princeton: Princeton University Press.
  • Popper, Karl. 1945/1966. The Open Society and Its Enemies, Vol. II, Fifth Edition. Princeton: Princeton University Press.
  • Popper, Karl. 1957. The Poverty of Historicism. London: Routledge.
    • A revised version of “The Poverty of Historicism,” first published in the journal Economica in three parts in 1944 and 1945.
  • Popper, Karl. 1963/1989. Conjectures and Refutations. Fifth Edition. London: Routledge and Kegan Paul.
  • Popper, Karl. 1976. Unended Quest. London: Open Court.
    • Popper’s intellectual autobiography.
  • Popper, Karl. 1985. Popper Selections. David Miller (ed.). Princeton: Princeton University Press.
    • Contains excerpts from The Open Society and The Poverty of Historicism, as well as a few other essays on politics and history.
  • Popper, Karl. 1994. In Search of a Better World. London: Routledge.
    • Parts II and III contain, respectively, essays on the role of culture clash in the emergence of open societies and the responsibility of intellectuals.
  • Popper, Karl. 1999. All Life Is Problem Solving. London: Routledge.
    • Part II of this volume contains essays and speeches on history and politics, mostly from the 1980s and 90s.
  • Popper, Karl. 2000. The Lessons of This Century. London: Routledge.
    • A collection of interviews with Popper, dating from 1991 and 1993, on politics, plus two addresses from the late 1980s on democracy, freedom and intellectual responsibility.

b. Secondary Literature

  • Bambrough, Renford (ed.). 1967. Plato, Popper, and Politics: Some Contributions to a Modern Controversy. New York: Barnes and Noble.
    • Contains essays addressing Popper’s controversial interpretation of Plato.
  • Corvi, Roberta. 1997. An Introduction to the Thought of Karl Popper. London: Routledge.
    • Emphasizes connections between Popper’s epistemological, metaphysical and political works.
  • Currie, Gregory, and Alan Musgrave (eds.). 1985. Popper and the Human Sciences. Dordrecht: Martinus Nijhoff Publishers.
    • Essays on Popper’s contribution to the philosophy of social science.
  • Raphael, Frederic. 1999. Popper. New York: Routledge.
    • This short monograph offers a lively, sympathetic but critical tour through Popper’s critique of historicism and utopian planning.
  • Hacohen, Malachi Haim. 2000. Karl Popper: The Formative Years, 1902-1945. Cambridge: Cambridge University Press.
    • This definitive and exhaustive biography of the young Popper unveils the historical origins of his thought.
  • Jarvie, Ian and Sandra Pralong (eds.). 1999. Popper’s Open Society after 50 Years. London: Routledge.
    • A collection of essays exploring and critiquing key ideas of The Open Society and applying them to contemporary political problems.
  • Magee, Bryan. 1973. Popper. London: Fontana/Collins.
    • A brief and accessible introduction to Popper’s philosophy.
  • Notturno, Mark. 2000. Science and Open Society. New York: Central European University Press.
    • Examines connections between Popper’s anti-inductivism and anti-positivism and his social and political values, including opposition to institutionalized science, intellectual authority and communism.
  • Schilpp, P. A. (ed.). 1974. The Philosophy of Karl Popper. 2 vols. La Salle, IL: Open Court.
    • Essays by various authors that explore and critique his philosophy, including his political thought. Popper’s replies to the essays are included.
  • Shearmur, Jeremy. 1995. The Political Thought of Karl Popper. London: Routledge.
    • Argues that the logic of Popper’s own ideas should have made him more leery of state intervention and more receptive to classical liberalism.
  • Stokes, Geoffrey. 1998. Popper: Philosophy, Politics and Scientific Method. Cambridge: Polity Press.
    • Argues that we need to consider Popper’s political values to understand the unity of his philosophy.


Author Information

William Gorton
Email: bill_gorton@msn.com
Alma College
U. S. A.

The Computational Theory of Mind

The Computational Theory of Mind (CTM) claims that the mind is a computer, so the theory is also known as computationalism. It is generally assumed that CTM is the main working hypothesis of cognitive science.

CTM is often understood as a specific variant of the Representational Theory of Mind (RTM), which claims that cognition is the manipulation of representations. The most popular variant of CTM—classical CTM, or simply CTM without qualification—is related to the Language of Thought Hypothesis (LOTH), which has been forcefully defended by Jerry Fodor. However, there are several other computational accounts of the mind that either reject LOTH—notably connectionism and several accounts in contemporary computational neuroscience—or do not subscribe to RTM at all. In addition, some authors explicitly disentangle the question of whether the mind is computational from the question of whether it manipulates representations. There seems to be no inconsistency in maintaining that cognition requires computation without subscribing to representationalism, although most proponents of CTM agree that an account of cognition in terms of computation over representations is the most cogent. (But this need not mean that representation is reducible to computation.)

One of the basic philosophical arguments for CTM is that it can make clear how thought and content are causally relevant in the physical world. It does this by saying that thoughts are syntactic entities that are computed over: their form makes them causally relevant in just the same way that the form of fragments of source code makes them causally relevant in a computer. This basic argument may be made more specific in various ways. For example, Allen Newell couched it in terms of the physical symbol system hypothesis, according to which being a physical symbol system (a physical computer) is a necessary and sufficient condition of thinking. John Haugeland framed the claim in formalist terms: if you take care of the syntax, the semantics will take care of itself. Daniel Dennett, in a slightly different vein, claimed that while semantic engines are impossible, syntactic engines can approximate them quite satisfactorily.

This article focuses only on specific problems with the Computational Theory of Mind (CTM), for the most part leaving RTM aside. There are four main sections. The first section introduces the three most important variants of CTM: classical CTM, connectionism, and computational neuroscience. The second section discusses the most important conceptions of computational explanation in cognitive science, namely functionalism and mechanism. The third section introduces the skeptical arguments against CTM raised by Hilary Putnam, and presents several accounts of implementation (or physical realization) of computation. Common objections to CTM are listed in the fourth section.

Table of Contents

  1. Variants of Computationalism
    1. Classical CTM
    2. Connectionism
    3. Computational Neuroscience
  2. Computational Explanation
    1. Functionalism
    2. Mechanism
  3. Implementation
    1. Putnam and Searle against CTM
    2. Semantic Account
    3. Causal Account
    4. Mechanistic Account
  4. Other objections to CTM
  5. Conclusion
  6. References and Further Reading

1. Variants of Computationalism

The generic claim that the mind is a computer may be understood in various ways, depending on how the basic terms are understood. In particular, some theorists claim that only cognition is computation, while emotional processes are not computational (Harnish 2002, 6), and other theorists explain neither motor nor sensory processes in computational terms (Newell and Simon 1972). These differences are relatively minor compared to the variety of ways in which “computation” is understood.

The main question here is just how much of the mind’s functioning is computational. The crux of this question comes with trying to understand exactly what computation is. In its most generic reading, computation is equated with information processing; in stronger versions, it is explicated in terms of digital effective computation, which is assumed in the classical version of CTM; in some other versions, analog or hybrid computation is admissible. Although Alan Turing defined effective computation using his notion of a machine (later called a ‘Turing machine’; see section 1.a below), there is a lively debate in the philosophy of mathematics as to whether all physical computation is Turing-equivalent. Even if all mathematical theories of effective computation that we know of right now (for example, lambda calculus, Markov algorithms, and partial recursive functions) turn out to be equivalent to Turing-machine computation, it is an open question whether they are adequate formalizations of the intuitive notion of computation. Some theorists, for example, claim that it is physically possible that hypercomputational processes (that is, processes that compute functions that a Turing machine cannot compute) exist (Copeland 2004). For this reason, the assumption, frequently made in debates over computationalism, that CTM has to assume Turing computation is controversial.

One can distinguish several basic kinds of computation: digital, analog, and hybrid. These map onto the most popular variants of CTM as follows: classical CTM assumes digital computation; connectionism may also involve analog computation; and several theories in computational neuroscience assume hybrid analog/digital processing.

a. Classical CTM

Classical CTM is understood as the conjunction of RTM (and, in particular, LOTH) and the claim that cognition is digital effective computation. The best-known account of digital, effective computation was given by Alan Turing in terms of abstract machines (which were originally intended to be conceptual tools rather than physical entities, though sometimes they are built physically simply for fun). Such abstract machines can only do what a human computer would do mechanically, given a potentially indefinite amount of paper, a pencil, and a list of rote rules. More specifically, a Turing machine (TM) has at least one tape, on which symbols from a finite alphabet can appear; the tape is read, written, and erased by a machine head, which can also move left or right along the tape. The functioning of the machine is described by machine table instructions, each of which includes five pieces of information: (1) the current state of the TM; (2) the symbol read from the tape; (3) the symbol written to the tape; (4) the movement of the head, left or right; and (5) the next state of the TM. The machine table must be finite, as must the number of states; the length of the tape, in contrast, is potentially unbounded.

As it turns out, all known effective algorithms (that is, ones that halt, necessarily ending their operation with the expected result) can be encoded as lists of instructions for a Turing machine. For example, a basic Turing machine can be built to perform logical negation of an input propositional letter. The alphabet may consist of all 26 Latin letters, a blank symbol, and a tilde. The machine table instructions then need to specify the following operations: if the head is scanning the tilde, erase the tilde (this effectively realizes the double negation rule); if the head is scanning a letter and the state of the machine is not “1”, move the head left and change the state of the machine to 1; if the state is “1” and the head is scanning the blank symbol, write the tilde. (Note: this list of instructions is vastly simplified for presentation purposes. In reality, it would be necessary to rewrite symbols on the tape when inserting the tilde and to decide when to stop operation; based on the current list, the machine would simply cycle infinitely.) Writing Turing machine programs is rather time-consuming and useful only for purely theoretical purposes, but all other digital effective computational formalisms are essentially similar in requiring (1) a finite number of different symbols in what corresponds to a Turing machine alphabet (digitality); and (2) a finite number of steps from the beginning to the end of operation (effectiveness). (Correspondingly, one can introduce hypercomputation by positing an infinite number of symbols in the alphabet, an infinite number of states or steps in the operation, or by introducing randomness into the execution of operations.) Note that digitality is not equivalent to binary code; it is just technologically easier to produce physical systems responsive to two states rather than ten. Early computers operated, for example, on decimal rather than binary code (von Neumann 1958).
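The machine-table format just described can be made concrete in a few lines of code. The following sketch is my own illustrative rendering (the state names and the simulator interface are invented; the article gives no formal table); it implements the negation machine: erase a leading tilde, or step left from a letter and write one.

```python
# A minimal Turing machine simulator and the negation machine sketched
# above. State names and the simulator interface are illustrative
# choices, not taken from the article.
def run_tm(tape, head, state, table, halt="halt"):
    """Run a TM given as {(state, symbol): (write, move, next_state)}."""
    tape = list(tape)
    while state != halt:
        symbol = tape[head]
        write, move, state = table[(state, symbol)]
        tape[head] = write
        head += {"L": -1, "R": 1, "N": 0}[move]
    return "".join(tape)

BLANK = "_"
negation = {
    # Head on a tilde: erase it (double negation) and halt.
    ("start", "~"): (BLANK, "N", "halt"),
    # Head on a letter: step left to the blank cell before it.
    ("start", "p"): ("p", "L", "insert"),
    # Blank cell reached: write the tilde and halt.
    ("insert", BLANK): ("~", "N", "halt"),
}

print(run_tm("~p", 0, "start", negation))  # "_p": tilde erased
print(run_tm("_p", 1, "start", negation))  # "~p": tilde inserted
```

As the text notes, a complete machine would also need to shift symbols on the tape and handle every letter of the alphabet; here only the letter ‘p’ is wired in, for brevity.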

There is a particularly important variant of the Turing machine, one which played a seminal role in justifying CTM: the universal Turing machine. A Turing machine is a formally defined mathematical entity; hence, it has a unique description that identifies it. Since such descriptions can be encoded on the tape of another TM, they can be operated upon, and these operations can be made to conform to the definition of the first TM. In this way, a TM that has the encoding of any other TM on its input tape will act accordingly, faithfully simulating the other TM. Such a machine is called universal. The notion of universality is very important in the mathematical theory of computability: the universal TM can compute any Turing-computable function and, if the Church–Turing thesis is correct, any effectively computable function. In addition, the idea of using a description of one TM to determine the functioning of another gave rise to the idea of programmable computers. At the same time, flexibility is supposed to be the hallmark of general intelligence, and many theorists supposed that this flexibility could be explained by universality (Newell 1980). This gave the universal TM a special role in CTM, one that motivated an analogy between the mind and the computer: both were supposed to solve problems whose nature cannot be exactly predicted (Apter 1970).

These points notwithstanding, the analogy between the universal TM and the mind is not necessary for classical CTM to be true. For example, it may turn out that human memory is essentially much more bounded than the tape of the TM. In addition, the significance of the TM in modeling cognition is not obvious: the universal TM was never used directly to write computational models of cognitive tasks, and its role may be seen as merely instrumental, serving to analyze the computational complexity of the algorithms posited to explain these tasks. Some theorists question whether anything at all hinges on the notion of equivalence between the mind’s information-processing capabilities and the Turing machine (Sloman 1996): CTM may leave open the question whether all physical computation is Turing-equivalent, or it may even embrace hypercomputation.

The first digital model of the mind was (probably) presented by Warren McCulloch and Walter Pitts (1943), who suggested that the operation of the brain’s neurons essentially corresponds to logical connectives (in other words, neurons were equated with what were later called ‘logic gates’, the basic building blocks of contemporary digital integrated circuits). In philosophy, the first avowal of CTM is usually linked with Hilary Putnam (1960), even though Putnam’s paper does not explicitly assert that the mind is equivalent to a Turing machine, but rather uses the concept to defend his functionalism. The classical CTM also became influential in early cognitive science (Miller, Galanter, and Pribram 1967).
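The equation of neurons with logic gates can be illustrated with a simple threshold unit. The weights and thresholds below are illustrative choices, not taken from McCulloch and Pitts’ paper; the point is only that firing-iff-the-weighted-sum-reaches-threshold suffices to realize AND, OR, and NOT.

```python
# A McCulloch-Pitts-style threshold unit: outputs 1 exactly when the
# weighted sum of its binary inputs reaches the threshold.
def mp_neuron(inputs, weights, threshold):
    return int(sum(i * w for i, w in zip(inputs, weights)) >= threshold)

AND = lambda a, b: mp_neuron((a, b), (1, 1), 2)   # fires only on (1, 1)
OR  = lambda a, b: mp_neuron((a, b), (1, 1), 1)   # fires on any active input
NOT = lambda a:    mp_neuron((a,),  (-1,),  0)    # inhibitory weight inverts

print([AND(1, 1), OR(0, 1), NOT(1)])  # [1, 1, 0]
```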

In 1975, Jerry Fodor linked CTM with LOTH. He argued that cognitive representations are tokens of the Language of Thought and that the mind is a digital computer that operates on these tokens. Fodor’s forceful defense of LOTH and CTM as inextricably linked prompted many cognitive scientists and philosophers to equate the two. In Fodor’s version, CTM furnishes psychology with the proper means for dealing with the question of how thought, framed in terms of propositional attitudes, is possible. Propositional attitudes are understood as relations of the cognitive agent to the tokens in its LOT, and the operations on these tokens are syntactic, or computational. In other words, the symbols of LOT are transformed by computational rules, which are usually supposed to be inferential. For this reason, classical CTM is also dubbed symbolic CTM, and the existence of symbol-transformation rules is supposed to be a feature of this approach. However, the very notion of the symbol is used differently by various authors: some mean entities equivalent to symbols on the tape of the TM; some think of physically distinguishable states, as in Newell’s physical symbol system hypothesis (Newell’s symbols, roughly speaking, point to the values of some variables); whereas others frame them as tokens in LOT. For this reason, major confusion over the notion of the symbol is prevalent in the current debate (Steels 2008).

The most compelling case for classical CTM can be made by showing its aptitude for dealing with abstract thinking, rational reasoning, and language processing. For example, Fodor argued that productivity of language (the capacity to produce indefinitely many different sentences) can be explained only with compositionality, and compositionality is a feature of rich symbol systems, similar to natural language. (Another argument is related to systematicity; see (Aizawa 2003).) Classical systems, such as production systems, excel in simulating human performance in logical and mathematical domains. Production systems contain production rules, which are, roughly speaking, rules of the form “if a condition X is satisfied, do Y”. Usually there are thousands of concurrently active rules in production systems (for more information on production systems, see (Newell 1990; Anderson 1983).)
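A toy production system can make the “if a condition X is satisfied, do Y” format concrete. The rules and the first-match conflict-resolution policy below are hypothetical simplifications of my own; the architectures discussed by Newell and Anderson are vastly richer.

```python
# A toy production system: rules fire against a working memory (a set
# of facts) until no rule's condition matches (quiescence).
def run_productions(memory, rules, max_cycles=100):
    for _ in range(max_cycles):
        for condition, action in rules:
            if condition(memory):       # "if a condition X is satisfied..."
                action(memory)          # "...do Y"
                break                   # first matching rule wins this cycle
        else:
            break                       # quiescence: no rule fired, stop
    return memory

rules = [
    (lambda m: "goal" in m and "plan" not in m, lambda m: m.add("plan")),
    (lambda m: "plan" in m and "done" not in m, lambda m: m.add("done")),
]
print(run_productions({"goal"}, rules))  # working memory grows to goal, plan, done
```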

In his later writings, however, Fodor (2001) argued that only peripheral (that is, mostly perceptual and modular) processes are computational, in contradistinction to central cognitive processes, which, owing to their holism, cannot be explained computationally (or in any other way, really). This pessimism about classical CTM seems to contrast with the successes of the classical approach in its traditional domains.

Classical CTM is silent about the neural realization of symbol systems, and for this reason it has been criticized by connectionists as biologically implausible. For example, Miller et al. (1967) supposed that there is a specific cognitive level which is best described as corresponding to reasoning and thinking, rather than to any lower-level neural processing. Similar claims have been framed in terms of an analogy between the software/hardware distinction and the mind/brain distinction. Critics stress that the analogy is relatively weak, and neurally quite implausible. In addition, perceptual and motor functioning does not seem to fit the symbolic paradigm of cognitive science.

b. Connectionism

In contrast to classical CTM, connectionism is usually presented as a more biologically plausible variant of computation. Although some artificial neural networks (ANNs) are vastly idealized (for an evaluation of the neural plausibility of typical ANNs, see (Bechtel and Abrahamsen 2002, sec. 2.3)), many researchers consider them to be much more realistic than rule-based production systems. Connectionist systems do well in modeling perceptual and motor processes, which are much harder to model symbolically.

Some early ANNs are clearly digital (for example, the early proposal of McCulloch and Pitts, see section 1.a above, is both a neural network and a digital system), while some modern networks are supposed to be analog. In particular, the connection weights are continuous values, and even if these networks are usually simulated on digital computers, they are supposed to implement analog computation. Here an interesting epistemological problem becomes evident: because all measurement is of finite precision, we can never be sure whether a measured value is actually continuous or discrete; the discreteness may just be a feature of the measuring apparatus. For this reason, continuous values are always theoretically posited rather than empirically discovered, as there is no way to decide empirically whether a given value is actually discrete or not. Having said that, there might be compelling reasons in some domains of science to assume that measured values should be mathematically described as real numbers rather than approximated digitally. (Note that a Turing machine cannot compute all real numbers, though it can approximate any computable real number to any desired degree; relatedly, the Nyquist–Shannon sampling theorem shows that a band-limited continuous signal can be fully reconstructed from discrete samples.)
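The point about finite-precision measurement can be put in code. In this toy sketch (the signal and the resolution step are arbitrary choices), once a continuous signal has been recorded at the instrument’s resolution, the record is indistinguishable from one produced by a genuinely discrete source.

```python
import math

# Quantize a reading to the instrument's resolution (arbitrary step size).
def quantize(x, step=0.1):
    return round(x / step) * step

continuous = [math.sin(t / 3) for t in range(5)]   # a "truly continuous" signal
discrete = [quantize(v) for v in continuous]       # a genuinely discrete source

# Measured at finite precision, the two records coincide exactly:
print([quantize(v) for v in continuous] == [quantize(v) for v in discrete])  # True
```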

Importantly, the relationship between connectionism and RTM is more debatable than in the case of classical CTM. Some proponents of connectionist models are anti-representationalists or eliminativists: the notion of representation, according to them, can be discarded in connectionist cognitive science. Others claim that the mention of representation in connectionism is at best honorific (for an extended argument, see (Ramsey 2007)). Nevertheless, the position that connectionist networks are representational as wholes, by being homomorphic to their subject domains, has been forcefully defended (O’Brien and Opie 2006; O’Brien and Opie 2009). There are, then, important and serious differences among connectionist models in the way they explain cognition.

In simpler models, the nodes of artificial neural networks may be treated as atomic representations (for example, as individual concepts); they are usually called ‘symbolic’ for that very reason. However, these representations represent only by fiat: it is the modeler who decides what they represent. For this reason, they do not seem to be biologically plausible, though some might argue that, at least in principle, individual neurons may represent complex features: in biological brains, so-called grandmother cells do exactly that (Bowers 2009; Gross 2002; Konorski 1967). More complex connectionist models do not tie individual representations to individual nodes; instead, each representation is distributed over multiple nodes, which may be activated to different degrees. These models may plausibly implement the prototype theory of concepts (Wittgenstein 1953; Rosch and Mervis 1975). Distributed representation therefore seems much more biologically and psychologically plausible to proponents of the prototype theory (though this theory is itself debated; see (Machery 2009) for a critical review of theories of concepts in psychology).
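The contrast between one-node-per-concept (‘symbolic’) coding and distributed coding can be sketched with toy vectors. The feature values below are invented for illustration, not taken from any trained network; the point is that distributed codes let similar concepts overlap in a graded way, while local codes share nothing.

```python
import math

# Cosine similarity between two activation patterns.
def similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Local ("symbolic") scheme: one dedicated node per concept.
local_dog, local_cat = [1.0, 0.0], [0.0, 1.0]

# Distributed scheme: graded activation over shared feature nodes,
# say [furry, barks, meows, size]; similar concepts overlap.
dist_dog = [0.9, 0.8, 0.0, 0.6]
dist_cat = [0.9, 0.0, 0.8, 0.3]

print(similarity(local_dog, local_cat))          # 0.0: no shared structure
print(round(similarity(dist_dog, dist_cat), 2))  # 0.59: graded overlap
```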

The proponents of classical CTM have objected to connectionism by pointing out that distributed representations do not seem to explain the productivity and systematicity of cognition, as these representations are not compositional (Fodor and Pylyshyn 1988). Fodor and Pylyshyn present connectionists with the following dilemma: if representations in ANNs are compositional, then ANNs are mere implementations of classical systems; if not, they are not plausible models of higher cognition. Obviously, both horns of the dilemma are unattractive for connectionism. This has sparked a lively debate. (For a review, see Connectionism and (Bechtel and Abrahamsen 2002, chap. 6).) In short, some reject the premise that higher cognition is actually as systematic and productive as Fodor and Pylyshyn assume, while others defend the view that implementing a compositional symbolic system in an ANN does not render it mere uninteresting technical gadgetry, because further aspects of cognitive processes can be explained this way.

In contemporary cognitive modeling, ANNs have become standard tools (see, for example, (Lewandowsky and Farrell 2011)). They are also prevalent in computational neuroscience, though that discipline includes some important hybrid digital/analog systems that deserve separate treatment.

c. Computational Neuroscience

Computational neuroscience employs many diverse methods, and it is hard to find modeling techniques applicable to a wide range of task domains. Yet it has been argued that, in general, computation in the brain is neither completely analog nor completely digital (Piccinini and Bahar 2013). On the one hand, neurons seem to be digital, since they spike only when the input signal exceeds a certain threshold (hence, the continuous input value becomes discrete); on the other, their spiking forms continuous patterns in time. For this reason, it is customary to describe the functioning of spiking neurons both as dynamical systems, meaning that they are represented in terms of continuous parameters evolving in time in a multi-dimensional space (the mathematical representation takes the form of differential equations in this case), and as networks of information-processing elements (usually in a way similar to connectionism). Hybrid analog/digital systems are also often postulated as situated in different parts of the brain. For example, the prefrontal cortex is said to manifest bi-stable behavior and gating (O’Reilly 2006), which is typical of digital systems.
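The hybrid character just described can be illustrated with a leaky integrate-and-fire neuron, a standard textbook model (the parameter values here are arbitrary): the membrane potential evolves continuously, while the output is an all-or-none spike.

```python
# Leaky integrate-and-fire neuron: analog membrane dynamics, digital output.
def lif_neuron(inputs, dt=1.0, tau=10.0, threshold=1.0, v_reset=0.0):
    v, spikes = 0.0, []
    for i in inputs:
        v += dt * (-v / tau + i)     # Euler step of dv/dt = -v/tau + I(t)
        if v >= threshold:           # threshold crossing: all-or-none spike
            spikes.append(1)
            v = v_reset              # reset after spiking
        else:
            spikes.append(0)
    return spikes

# Constant drive produces a regular spike train:
print(lif_neuron([0.3] * 10))  # [0, 0, 0, 1, 0, 0, 0, 1, 0, 0]
```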

Unifying frameworks in computational neuroscience are relatively rare. Of special interest are the Bayesian brain theory and the Neural Engineering Framework (Eliasmith and Anderson 2003). The Bayesian brain theory has become one of the major theories of brain functioning: it assumes that the brain’s main function is to predict probable outcomes (for example, the causes of sensory stimulation) based on its earlier sensory input. One major theory of this kind is the free-energy theory (Friston, Kilner, and Harrison 2006; Friston and Kiebel 2011). This theory presupposes that the brain uses hierarchical predictive coding, which is an efficient way to deal with probabilistic reasoning, known to be computationally hard; indeed, one of the major criticisms of this approach is that predictive coding may turn out not to be Bayesian at all (compare (Blokpoel, Kwisthout, and Van Rooij 2012)). Predictive coding (also called predictive processing) is thought by Andy Clark to be a unifying theory of the brain (Clark 2013): brains predict future sensory input (or its causes) in a top-down fashion and minimize the error of such predictions either by revising the predictions or by acting upon the world. However, as critics of this line of research have noted, such predictive coding models lack plausible neural implementation (usually they lack any implementation and remain sketchy; compare (Rasmussen and Eliasmith 2013)). Some suggest that this lack of implementation is true of Bayesian models in general (Jones and Love 2011).
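The bare schema of prediction-error minimization described above can be sketched in a few lines. This is a toy gradient scheme of my own, not any published predictive-coding or free-energy model: a top-down prediction is repeatedly revised to reduce bottom-up error.

```python
# Toy prediction-error minimization: revise a prediction of the sensory
# input until the prediction error is (nearly) eliminated.
def minimize_error(sensory_input, prediction=0.0, rate=0.5, steps=20):
    for _ in range(steps):
        error = sensory_input - prediction   # bottom-up prediction error
        prediction += rate * error           # top-down revision
    return prediction

print(round(minimize_error(3.0), 3))  # converges toward the input, 3.0
```

(Minimizing error by acting on the world, the other route mentioned above, would correspond to changing `sensory_input` instead of `prediction`.)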

The Neural Engineering Framework (NEF) differs from the predictive brain approach in two respects: it does not posit a single function for the brain, and it offers detailed, biologically plausible models of cognitive capacities. A recent version (Eliasmith 2013) features the world’s largest functional brain model. The main principles of the NEF are: (1) neural representations are understood as combinations of nonlinear encoding and optimal linear decoding (this includes temporal and population representations); (2) transformations of neural representations are functions of the variables represented by a population; and (3) neural dynamics are described with neural representations as control-theoretic state variables. (‘Transformation’ is the term used for what would traditionally be called computation.) NEF models are at the same time representational, computational, and dynamical, and they use control theory (which is mathematically equivalent to dynamical systems theory). Of special interest is that the NEF enables the building of plausible architectures that tackle symbolic problems. For example, a 2.5-million-neuron model of the brain (called ‘Spaun’) has been built that is able to perform eight diverse tasks (Eliasmith et al. 2012). Spaun features so-called semantic pointers, which can be seen as elements of a compressed neural vector space and which enable the execution of higher-cognition tasks. At the same time, NEF models are usually less idealized than classical CTM models, and they do not presuppose that the brain is as systematic and compositional as Fodor and Pylyshyn claim. NEF models deliver the required performance without positing an architecture that is entirely reducible to a classical production system.

2. Computational Explanation

The main aim of computational modeling in cognitive science is to explain and predict mental phenomena. (In neuroscience and psychiatry, therapeutic intervention is another major aim of the inquiry.) There are two main competing theories of computational explanation: functionalism, in particular David Marr’s account; and mechanism. Although some argue for the Deductive-Nomological account in cognitive science, especially proponents of dynamicism (Walmsley 2008), the dynamical models in question are contrasted with computational ones. What’s more, the relation between mechanical and dynamical explanation is a matter of a lively debate (Zednik 2011; Kaplan and Craver 2011; Kaplan and Bechtel 2011).

a. Functionalism

One of the most prominent views of functional explanation (for a general overview see Causal Theories of Functional Explanation) was developed by Robert Cummins (Cummins 1975; Cummins 1983; Cummins 2000). Cummins rejects the idea that explanation in psychology is subsumption under a law. For him, psychology and other special sciences are interested in various effects, understood as exercises of various capacities. A given capacity is to be analyzed functionally, by decomposing it into a number of less problematic capacities, or dispositions, that jointly manifest themselves as the effect in question. In cognitive science and psychology, this joint manifestation is best understood in terms of flowcharts or computer programs. Cummins claims that computational explanations are just top-down explanations of a system’s capacity.

A specific problem with Cummins’ account is that an explanation counts as correct if the dispositions are merely sufficient for the joint manifestation of the effect to be displayed. For example, a computer program that produces the same output as a human subject, given the same input, is held to be explanatory of the subject’s performance. This seems problematic, given that computer simulations have traditionally been evaluated not only at the level of their inputs and outputs (in which case they would be merely ‘weakly equivalent’ in Fodor’s terminology, see (Fodor 1968)), but also at the level of the process that transforms the input data into the output data (in which case they are ‘strongly equivalent’ and genuinely explanatory, according to Fodor). Note, for example, that dropping an atomic bomb would have been sufficient to kill U.S. President John F. Kennedy, but this fact does not explain his actual assassination. In short, critics of functional explanation stress that it is too liberal and that it should require causal relevance as well. They argue that functional analyses devoid of causal relevance are in the best case incomplete, and in the worst case explanatorily irrelevant (Piccinini and Craver 2011).

One way to make the functional account more robust is to introduce a hierarchy of explanatory levels. In the context of cognitive science, the most influential proposal for such a hierarchy comes from David Marr (1982), who proposed a three-level model of explanation. This model introduces several additional constraints that have since been widely accepted in modeling practice. In particular, Marr argued that the complete explanation of a computational system should feature the following levels: (1) the computational level; (2) the level of representation and algorithm; and (3) the level of hardware implementation.

At the computational level, the modeler is supposed to ask what operations the system performs and why it performs them. Interestingly, the term Marr proposed for this level has proved confusing to some. For this reason, it is usually characterized in semantic terms, such as knowledge or representation, but this, too, may be somewhat misleading. At this level, the modeler is supposed to assume that a device performs a task by carrying out a series of operations. She needs to identify the task in question and justify her explanatory strategy by ensuring that her specification mirrors the performance of the machine, and that the performance is appropriate in the given environment. Marrian “computation” refers to computational tasks and not to the manipulation of particular semantic representations. No wonder that other terms for this level have been put forth to prevent misunderstanding, perhaps the most appropriate of which is Sterelny’s (1990) “ecological level.” Sterelny makes it clear that the justification of why the task is performed includes the relevant physical conditions of the machine’s environment.

The level of representation and algorithm concerns the following questions: How can the computational task be performed? What is the representation of the input and output? And what is the algorithm for the transformation? The focus is on the formal features of the representation (which are required to develop an algorithm in a programming language) rather than on whether the inputs really represent anything. The algorithm is correct when it performs the specified task, given the same input as the computational system in question. The distinction between the computational level and the level of representation and algorithm amounts to the difference between what and how (Marr 1982, 28).

The level of hardware implementation refers to the physical machinery realizing the computation; in neuroscience, of course, this will be the brain. Marr’s methodological account is based on his own modeling in computational neuroscience, but stresses the relative autonomy of the levels, which are also levels of realization. There are multiple realizations of a given task (see Mind and Multiple Realizability), so Marr endorses the classical functionalist claim of relative autonomy of levels, which is supposed to underwrite antireductionism (Fodor 1974). Most functionalists subsequently embraced Marr’s levels as well (for example, Zenon Pylyshyn (1984) and Daniel Dennett (1987)).

Although Marr introduces more constraints than Cummins, because he requires descriptions at three different levels of realization, his theory suffers from the abovementioned problems as well: it does not require the causal relevance of the representation-and-algorithm level; sufficiency is all that is demanded. Moreover, it remains relatively unclear why exactly there are three, and not, say, five levels in a proper explanation (note that some philosophers have proposed introducing intermediary levels). For these reasons, mechanists have criticized Marr’s approach (Miłkowski 2013).

b. Mechanism

According to mechanism, to explain a phenomenon is to explain its underlying mechanism. Mechanistic explanation is a species of causal explanation, and explaining a mechanism involves the discovery of its causal structure. While mechanisms are defined variously, the core idea is that they are organized systems, comprising causally relevant component parts and operations (or activities) thereof (Bechtel 2008; Craver 2007; Glennan 2002; Machamer, Darden, and Craver 2000). Parts of the mechanism interact and their orchestrated operation contributes to the capacity of the mechanism. Mechanistic explanations abound in special sciences, and it is hoped that an adequate description of the principles implied in explanations (those that are generally accepted as sound) will also furnish researchers with normative guidance. The idea that computational explanation is best understood as mechanistic has been defended by (Piccinini 2007b; Piccinini 2008) and (Miłkowski 2013). It is closely linked to causal accounts of computational explanation, too (Chalmers 2011).

Constitutive mechanistic explanation is the dominant form of computational explanation in cognitive science. This kind of explanation includes at least three levels of mechanism: a constitutive (-1) level, which is the lowest level in the given analysis; an isolated (0) level, where the parts of the mechanism are specified, along with their interactions (activities or operations); and the contextual (+1) level, where the function of the mechanism is seen in a broader context (for example, the context for human vision includes lighting conditions). In contrast to how Marr (1982) or Dennett (1987) understand them, levels here are not just levels of abstraction; they are levels of composition. They are tightly integrated, but not entirely reducible to the lowest level.

Computational models explain how the computational capacity of a mechanism is generated by the orchestrated operation of its component parts. To say that a mechanism implements a computation is to claim that the causal organization of the mechanism is such that the input and output information streams are causally linked and that this link, along with the specific structure of information processing, is completely described. Note that the link is sometimes cyclical and can be very complex.

In some respects, the mechanistic account of computational explanation may be viewed as a causally constrained version of functional explanation. However, developments in the theory of mechanistic explanation, which is now one of the most active fields in the philosophy of science, make the mechanistic account much more sensitive to the actual scientific practice of modelers.

3. Implementation

One of the most difficult questions for proponents of CTM is how to determine whether a given physical system is an implementation of a formal computation. Note that computer science does not offer any theory of implementation, and the intuitive view that one can decide whether a system implements a computation by finding a one-to-one correspondence between physical states and the states of a computation may lead to serious problems. In what follows, I will sketch out some objections to the objectivity of the notion of computation, formulated by John Searle and Hilary Putnam, and examine various answers to their objections.

a. Putnam and Searle against CTM

Putnam and Searle’s objection may be summarized as follows. There is nothing objective about physical computation; computation is ascribed to physical systems by human observers merely for convenience. For this reason, there are no genuine computational explanations. Needless to say, such an objection, if sound, would invalidate most research that has been done in cognitive science.

In particular, Putnam (1991, 121–125) constructed a proof that any open physical system implements any finite automaton (a model of computation with lower computational power than a Turing machine; note that the proof can easily be extended to Turing machines as well). The purpose of Putnam’s argument is to demonstrate that functionalism, were it true, would imply behaviorism: if any open system implements any automaton, then internal structure is completely irrelevant to deciding which function is actually realized. The idea of the proof is as follows. Any physical system has at least one state. This state obtains for some time, and its duration can be measured by an external clock. By appeal to the clock, one can identify as many states as one wishes, especially if states may be constructed by set-theoretic operations (or their logical equivalent, disjunction). For this reason, one can always find as many states in a physical system as the finite machine requires (it has, after all, a finite number of states). Likewise, the machine’s evolution in time can easily be mapped onto the physical system thanks to disjunctions and the clock. If so, there is nothing explanatory about the notion of computation.
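The structure of the construction can be sketched as follows. This is a deliberately toy rendering of my own, not Putnam’s formal proof: the “physical states” are bare clock ticks, which is exactly the point of the objection, since nothing beyond their number and ordering is used.

```python
# Putnam-style mapping: pair each clock-indexed "physical state" with the
# automaton state occupied at the same step. Disjunctions of such physical
# states then "realize" each automaton state, whatever the physics is.
def putnam_mapping(automaton_run, clock_ticks):
    assert len(clock_ticks) >= len(automaton_run)
    return dict(zip(clock_ticks, automaton_run))

run = ["s0", "s1", "s0", "s2"]      # any finite-automaton trajectory
ticks = ["t0", "t1", "t2", "t3"]    # any open system, carved up by a clock

print(putnam_mapping(run, ticks))   # {'t0': 's0', 't1': 's1', 't2': 's0', 't3': 's2'}
```

Because the mapping can always be built this way, "implementing the automaton" places no constraint at all on the physical system, which is why the notion needs to be restricted (as the replies below attempt to do).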

Searle’s argument is similar. He argues that being a digital computer is a matter of ascribing 0s and 1s to a physical system, and that for any program and any sufficiently complex object there is a description of the object under which it realizes the program (Searle 1992, 207–208). On this view, even an ordinary wall would be a computer. In essence, both objections make the point that, given enough freedom, one can always map physical states (whose number can be adjusted by logical means or simply by making more measurements) onto the formal system. If we talk of both systems in terms of sets, then all that matters is the cardinality of the two sets (in this respect, these arguments resemble the objection once made against Russell’s structuralism; compare (Newman 1928)). Because the arguments are similar, replies to them usually address both at the same time and try to limit the admissible ways of carving up physical reality. The idea is that reality should somehow be carved at its joints, and only then made to correspond with the formal model.

b. Semantic Account

The semantic account of implementation is by far the most popular among philosophers. It simply requires that there is no computation without representation (Fodor 1975). But the semantic account seems to beg the question, given that some computational models, notably in connectionism, require no representation. Besides, other objections to CTM (in particular, the arguments based on the Chinese Room thought experiment) question the assumption that computer programs ever represent anything by themselves. For this reason, at least in this debate, one can only assume that programs represent because they are ascribed meaning by external observers. But in that case, the observer may just as easily ascribe meaning to a wall. Thus, the semantic account has no resources of its own to deal with these objections.

I do not mean to suggest that the semantic account is completely wrong; indeed, the intuitive appeal of CTM is based on its close links with RTM. Yet the assumption that computation always represents has been repeatedly questioned (Fresco 2010; Piccinini 2006; Miłkowski 2013). For example, an ordinary logic gate (the computational counterpart of a logical connective), such as an AND gate, does not seem to represent anything; at least, it does not refer to anything. Yet it is a simple computational device.
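The point can be seen in a minimal sketch (the function name and setup here are illustrative, not drawn from the literature): an AND gate is fully specified by its truth table, and nothing in that specification refers to anything outside the gate.

```python
# A minimal AND gate, sketched as a Python function: a bona fide (if
# trivial) computational device, yet nothing about it refers to or
# represents anything in the world.

def and_gate(a: int, b: int) -> int:
    """Return 1 iff both inputs are 1 (inputs assumed to be 0 or 1)."""
    return a & b

# The gate's complete specification is its truth table; no semantic
# facts are needed to state it.
table = {(a, b): and_gate(a, b) for a in (0, 1) for b in (0, 1)}
```

The gate computes the conjunction function over its inputs, but whether its inputs stand for voltages, votes, or anything at all is left entirely to external users, which is the worry pressed against the semantic account.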

c. Causal Account

The causal account requires that the physical states taken to correspond to the mathematical description of computation be causally linked (Chalmers 2011). This means that there have to be counterfactual dependencies between the states (a requirement already proposed by Copeland (1996), though without requiring that the states be causally relevant) and that the methodological principles of causal explanation have to be followed. These include theoretical parsimony (already used by Fodor among the constraints of his semantic account of computation) and the causal Markov condition. In particular, states that are not causally related, be it in Searle’s wall or in Putnam’s logical constructs, are automatically discarded.

There are two open questions for the causal account, however. First, for any causal system there will be a corresponding computational description. This means that even if it is no longer true that all physical systems implement all possible computations, every physical system still implements at least one computation (and if there are multiple causal models of a given system, the number of corresponding computations grows accordingly). Causal theorists usually bite the bullet and reply that this does not make computational explanation void; it merely allows a weak form of pancomputationalism, the claim that everything is computational (Müller 2009; Piccinini 2007a). The second question is how the boundaries of causal systems are to be drawn. Should we include a computer’s distal causes (such as the operations at the production site of its electronic components) in the causal model brought into correspondence with the formal model of computation? This seems absurd, but the causal account offers no explicit reply to the problem.

d. Mechanistic Account

The mechanistic account is a specific version of the causal account, defended by Piccinini and Miłkowski. The first move made by both is to take into account only functional mechanisms, which excludes even weak pancomputationalism. (The requirement that computational systems have the function, in some robust sense, of computing has also been defended by other authors; compare (Lycan 1987; Sterelny 1990).) Another move is to argue that computational systems should be understood as multi-level systems, which fits naturally with the mechanistic account of computational explanation. Note that mechanists in the philosophy of science have already faced the difficult question of how to draw a boundary around a system, for example by including only the components constitutively relevant to the capacity of the mechanism; compare (Craver 2007). For this reason, the mechanistic account is supposed to deliver a satisfactory way of delineating computational mechanisms from their environment.

Another specific feature of the mechanistic account of computation is that it makes clear how the formal account of computation corresponds to the physical mechanism. Namely, the isolated level of the mechanism (level 0; see section 2.c above) is supposed to be described by a mechanistically adequate model of computation. The description of the model usually comprises two parts: (1) an abstract specification of the computation, which should include all the causally relevant variables (a formal model of the mechanism); and (2) a complete blueprint of the mechanism at this level of its organization.

Even if one remains skeptical about causation or physical mechanisms, Putnam’s and Searle’s objections can be rejected within the mechanistic account of implementation, to the extent that these theoretical posits are admissible in the special sciences. What is clear from this discussion is that implementation is not a matter of any simple mapping, but of satisfying a number of additional constraints of the kind usually required by causal modeling in science.

4. Other Objections to CTM

The objection discussed in section 3 is by no means the only objection discussed in philosophy, but it is special because of its potential to completely trivialize CTM. Another very influential objection against CTM (and against the very possibility of creating genuine artificial intelligence) stems from Searle’s Chinese Room thought experiment. The debate over this thought experiment is, at best, inconclusive, so it does not show that CTM is doomed (for more discussion of the Chinese Room, see also (Preston and Bishop 2002)). Similarly, all arguments that purport to show that artificial intelligence (AI) is in principle impossible seem equally unconvincing, even if they appeared cogent at some point in time with respect to some domains of human competence (for example, it was long thought that decent machine translation is impossible, and it was even argued that funding research into machine speech recognition is morally wrong; compare (Weizenbaum 1976, 176)). The relationship between AI and CTM is complex: even if non-human AI is impossible, this does not imply that CTM is wrong, as it may turn out that only biologically inspired AI is achievable.

One group of objections against CTM focuses on its alleged reliance on the claim that cognition should be explained merely in terms of computation. This motivates, for example, claims that CTM ignores emotional or bodily processes (see Embodied Cognition). Such claims are, however, unsubstantiated: proponents of CTM may more often than not ignore emotions (though even early computer simulations focused on motivation and emotion; compare (Tomkins and Messick 1963; Colby and Gilbert 1964; Loehlin 1968)) or embodiment, but this is not at the core of their claims. Furthermore, according to the most successful theories of implementation, both causal and mechanistic, a physical computation always has properties over and above its computational features. It is these physical features that make the computation possible in the first place, and ignoring them (for example, ignoring the physical constitution of neurons) simply leaves the implementation unexplained. For this reason, it seems quite clear that CTM cannot really involve a rejection of all other explanations: the causal relevance of computation implies the causal relevance of other physical features, which means that embodied cognition is implied by CTM rather than excluded by it.

Jerry Fodor has argued that it is central cognition that cannot be explained computationally, in particular in the symbolic way (and that no other explanation is forthcoming). This claim seems to fly in the face of the success of production systems in such domains as reasoning and problem solving. Fodor justifies his claim by pointing out that central cognitive processes are cognitively penetrable: an agent’s knowledge and beliefs may influence any of his or her other beliefs (which also means that beliefs are strongly holistic). But even if one accepts the claim that there is a substantial (and computational) difference between cognitively penetrable and impenetrable processes, this still would not rule out a scientific account of both (Boden 1988, 172).

Arguments against the possibility of a computational account of common sense (Dreyfus 1972) also appeal to holism. Some also claim that holism leads to the frame problem in AI, though this has been debated, and the significance of the frame problem for CTM remains unclear (Pylyshyn 1987; Shanahan 1997; Shanahan and Baars 2005).

A specific group of arguments against CTM is directed against the claim that cognition is digital effective computation: some propose that the mind is hypercomputational and try to prove this by reference to Gödel’s incompleteness results (Lucas 1961; Penrose 1989). These arguments are not satisfactory, because they assume without justification that human beliefs are not contradictory (Putnam 1960; Krajewski 2007). And even granting that assumption, the claim that the mind is not a computational mechanism cannot be proven this way: as Krajewski has argued, the purported proof itself leads to a contradiction.

5. Conclusion

The Computational Theory of Mind (CTM) is the working assumption of the vast majority of modeling efforts in cognitive science, though there are important differences among the various computational accounts of mental processes. With the growing sophistication of modeling and testing techniques, computational neuroscience offers ever more refined versions of CTM, more complex than early attempts to model the mind as a single computational device (such as a Turing machine). What is much more plausible, at least biologically, is a complex organization of various computational mechanisms, some permanent and some ephemeral, in a structure that does not form a strict hierarchy. The general agreement in cognitive science, however, is that the generic claim that minds process information, though an empirical hypothesis that might in principle prove wrong, is highly unlikely to turn out false. Yet it is far from clear what kind of processing is involved.

6. References and Further Reading

  • Aizawa, Kenneth. 2003. The Systematicity Arguments. Boston: Kluwer Academic.
  • Anderson, John R. 1983. The Architecture of Cognition. Cambridge, Mass.: Harvard University Press.
  • Apter, Michael. 1970. The Computer Simulation of Behaviour. London: Hutchinson.
  • Arbib, Michael, Carl Lee Baker, Joan Bresnan, Roy G. D’Andrade, Ronald Kaplan, Samuel Jay Keyser, Donald A. Norman, et al. 1978. Cognitive Science, 1978.
  • Bechtel, William. 2008. Mental Mechanisms. New York: Routledge (Taylor & Francis Group).
  • Bechtel, William, and Adele Abrahamsen. 2002. Connectionism and the Mind. Blackwell.
  • Blokpoel, Mark, Johan Kwisthout, and Iris van Rooij. 2012. “When Can Predictive Brains Be Truly Bayesian?” Frontiers in Psychology 3 (November): 1–3.
  • Boden, Margaret A. 1988. Computer Models of Mind: Computational Approaches in Theoretical Psychology. Cambridge [England]; New York: Cambridge University Press.
  • Bowers, Jeffrey S. 2009. “On the Biological Plausibility of Grandmother Cells: Implications for Neural Network Theories in Psychology and Neuroscience.” Psychological Review 116 (1) (January): 220–51.
  • Chalmers, David J. 2011. “A Computational Foundation for the Study of Cognition.” Journal of Cognitive Science (12): 325–359.
  • Clark, Andy. 2013. “Whatever Next? Predictive Brains, Situated Agents, and the Future of Cognitive Science.” The Behavioral and Brain Sciences 36 (3) (June 10): 181–204.
  • Colby, Kenneth Mark, and John P Gilbert. 1964. “Programming a Computer Model of Neurosis.” Journal of Mathematical Psychology 1 (2) (July): 405–417.
  • Copeland, B. Jack. 1996. “What Is Computation?” Synthese 108 (3): 335–359.
  • Copeland, B. Jack. 2004. “Hypercomputation: Philosophical Issues.” Theoretical Computer Science 317 (1-3) (June): 251–267.
  • Craver, Carl F. 2007. Explaining the Brain. Mechanisms and the Mosaic Unity of Neuroscience. Oxford: Oxford University Press.
  • Cummins, Robert. 1975. “Functional Analysis.” The Journal of Philosophy 72 (20): 741–765.
  • Cummins, Robert. 1983. The Nature of Psychological Explanation. Cambridge, Mass.: MIT Press.
  • Cummins, Robert. 2000. “‘How Does It Work’ Versus ‘What Are the Laws?’: Two Conceptions of Psychological Explanation.” In Explanation and Cognition, ed. F Keil and Robert A Wilson, 117–145. Cambridge, Mass.: MIT Press.
  • Dennett, Daniel C. 1983. “Beyond Belief.” In Thought and Object, ed. Andrew Woodfield. Oxford University Press.
  • Dennett, Daniel C. 1987. The Intentional Stance. Cambridge, Mass.: MIT Press.
  • Dreyfus, Hubert. 1972. What Computers Can’t Do: A Critique of Artificial Reason. New York: Harper & Row, Publishers.
  • Eliasmith, Chris. 2013. How to Build a Brain: A Neural Architecture for Biological Cognition. New York: Oxford University Press.
  • Eliasmith, Chris, and Charles H. Anderson. 2003. Neural Engineering. Computation, Representation, and Dynamics in Neurobiological Systems. Cambridge, Mass.: MIT Press.
  • Eliasmith, Chris, Terrence C Stewart, Xuan Choo, Trevor Bekolay, Travis DeWolf, Yichuan Tang, Charlie Tang, and Daniel Rasmussen. 2012. “A Large-scale Model of the Functioning Brain.” Science (New York, N.Y.) 338 (6111) (November 30): 1202–5.
  • Fodor, Jerry A. 1968. Psychological Explanation: An Introduction to the Philosophy of Psychology. New York: Random House.
  • Fodor, Jerry A. 1974. “Special Sciences (or: The Disunity of Science as a Working Hypothesis).” Synthese 28 (2) (October): 97–115.
  • Fodor, Jerry A. 1975. The Language of Thought. 1st ed. New York: Thomas Y. Crowell Company.
  • Fodor, Jerry A. 2001. The Mind Doesn’t Work That Way. Cambridge, Mass.: MIT Press.
  • Fodor, Jerry A., and Zenon W. Pylyshyn. 1988. “Connectionism and Cognitive Architecture: a Critical Analysis.” Cognition 28 (1-2) (March): 3–71.
  • Fresco, Nir. 2010. “Explaining Computation Without Semantics: Keeping It Simple.” Minds and Machines 20 (2) (June): 165–181.
  • Friston, Karl, and Stefan Kiebel. 2011. “Predictive Coding: A Free-Energy Formulation.” In Predictions in the Brain: Using Our Past to Generate a Future, ed. Moshe Bar, 231–246. Oxford: Oxford University Press.
  • Friston, Karl, James Kilner, and Lee Harrison. 2006. “A Free Energy Principle for the Brain.” Journal of Physiology, Paris 100 (1-3): 70–87.
  • Glennan, Stuart. 2002. “Rethinking Mechanistic Explanation.” Philosophy of Science 69 (S3) (September): S342–S353.
  • Gross, Charles G. 2002. “Genealogy of the ‘Grandmother Cell’.” The Neuroscientist 8 (5) (October 1): 512–518.
  • Harnish, Robert M. 2002. Minds, Brains, Computers: An Historical Introduction to the Foundations of Cognitive Science. Malden, MA: Blackwell Publishers.
  • Haugeland, John. 1985. Artificial Intelligence: The Very Idea. Cambridge, Mass.: MIT Press.
  • Jones, Matt, and Bradley C. Love. 2011. “Bayesian Fundamentalism or Enlightenment? On the Explanatory Status and Theoretical Contributions of Bayesian Models of Cognition.” Behavioral and Brain Sciences 34 (04) (August 25): 169–188.
  • Kaplan, David Michael, and William Bechtel. 2011. “Dynamical Models: An Alternative or Complement to Mechanistic Explanations?” Topics in Cognitive Science 3 (2) (April 6): 438–444.
  • Kaplan, David Michael, and Carl F Craver. 2011. “The Explanatory Force of Dynamical and Mathematical Models in Neuroscience: A Mechanistic Perspective.” Philosophy of Science 78 (4) (October): 601–627.
  • Konorski, Jerzy. 1967. Integrative Activity of the Brain; an Interdisciplinary Approach. Chicago: University of Chicago Press.
  • Krajewski, Stanisław. 2007. “On Gödel’s Theorem and Mechanism: Inconsistency or Unsoundness Is Unavoidable in Any Attempt to ‘Out-Gödel’ the Mechanist.” Fundamenta Informaticae 81 (1) (January 1): 173–181.
  • Lewandowsky, Stephan, and Simon Farrell. 2011. Computational Modeling in Cognition: Principles and Practice. Thousand Oaks: Sage Publications.
  • Loehlin, John. 1968. Computer Models of Personality. New York: Random House.
  • Lucas, J.R. 1961. “Minds, Machines and Gödel.” Philosophy 36 (137): 112–127.
  • Lycan, William G. 1987. Consciousness. Cambridge, Mass.: MIT Press.
  • Machamer, Peter, Lindley Darden, and Carl F Craver. 2000. “Thinking About Mechanisms.” Philosophy of Science 67 (1): 1–25.
  • Machery, Edouard. 2009. Doing Without Concepts. Oxford: Oxford University Press, USA.
  • Marr, David. 1982. Vision. A Computational Investigation into the Human Representation and Processing of Visual Information. New York: W. H. Freeman and Company.
  • McCulloch, Warren S., and Walter Pitts. 1943. “A Logical Calculus of the Ideas Immanent in Nervous Activity.” Bulletin of Mathematical Biophysics 5: 115–133.
  • Miller, George A., Eugene Galanter, and Karl H. Pribram. 1967. Plans and the Structure of Behavior. New York: Holt.
  • Miłkowski, Marcin. 2013. Explaining the Computational Mind. Cambridge, Mass.: MIT Press.
  • Müller, Vincent C. 2009. “Pancomputationalism: Theory or Metaphor?” In The Relevance of Philosophy for Information Science, ed. Ruth Hagengruber. Berlin: Springer.
  • Von Neumann, John. 1958. The Computer and the Brain. New Haven: Yale University Press.
  • Newell, Allen. 1980. “Physical Symbol Systems.” Cognitive Science: A Multidisciplinary Journal 4 (2): 135–183.
  • Newell, Allen. 1990. Unified Theories of Cognition. Cambridge, Mass. and London: Harvard University Press.
  • Newell, Allen, and Herbert A Simon. 1972. Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall.
  • Newman, M H A. 1928. “Mr. Russell’s ‘Causal Theory of Perception’.” Mind 37 (146) (April 1): 137–148.
  • O’Brien, Gerard, and Jon Opie. 2006. “How Do Connectionist Networks Compute?” Cognitive Processing 7 (1) (March): 30–41.
  • O’Brien, Gerard, and Jon Opie. 2009. “The Role of Representation in Computation.” Cognitive Processing 10 (1) (February): 53–62.
  • O’Reilly, Randall C. 2006. “Biologically Based Computational Models of High-level Cognition.” Science 314 (5796) (October 6): 91–4.
  • Penrose, Roger. 1989. The Emperor’s New Mind. London: Oxford University Press.
  • Piccinini, Gualtiero. 2006. “Computation Without Representation.” Philosophical Studies 137 (2) (September): 205–241.
  • Piccinini, Gualtiero. 2007a. “Computational Modelling Vs. Computational Explanation: Is Everything a Turing Machine, and Does It Matter to the Philosophy of Mind?” Australasian Journal of Philosophy 85 (1): 93–115.
  • Piccinini, Gualtiero. 2007b. “Computing Mechanisms.” Philosophy of Science 74 (4) (October): 501–526.
  • Piccinini, Gualtiero. 2008. “Computers.” Pacific Philosophical Quarterly 89 (1) (March): 32–73.
  • Piccinini, Gualtiero, and Sonya Bahar. 2013. “Neural Computation and the Computational Theory of Cognition.” Cognitive Science 37 (3) (April 5): 453–88.
  • Piccinini, Gualtiero, and Carl Craver. 2011. “Integrating Psychology and Neuroscience: Functional Analyses as Mechanism Sketches.” Synthese 183 (3) (March 11): 283–311.
  • Preston, John, and Mark Bishop. 2002. Views into the Chinese Room: New Essays on Searle and Artificial Intelligence. Oxford; New York: Clarendon Press.
  • Putnam, Hilary. 1960. “Minds and Machines.” In Dimensions of Mind, ed. Sidney Hook. New York University Press.
  • Putnam, Hilary. 1991. Representation and Reality. Cambridge, Mass.: The MIT Press.
  • Pylyshyn, Zenon W. 1984. Computation and Cognition: Toward a Foundation for Cognitive Science. Cambridge, Mass.: MIT Press.
  • Pylyshyn, Zenon W. 1987. Robot’s Dilemma: The Frame Problem in Artificial Intelligence. Norwood, New Jersey: Ablex Publishing Corporation.
  • Ramsey, William M. 2007. Representation Reconsidered. Cambridge: Cambridge University Press.
  • Rasmussen, Daniel, and Chris Eliasmith. 2013. “God, the Devil, and the Details: Fleshing Out the Predictive Processing Framework.” The Behavioral and Brain Sciences 36 (3) (June 1): 223–4.
  • Rosch, Eleanor, and Carolyn B Mervis. 1975. “Family Resemblances: Studies in the Internal Structure of Categories.” Cognitive Psychology 7 (4) (October): 573–605.
  • Searle, John R. 1992. The Rediscovery of the Mind. Cambridge, Mass.: MIT Press.
  • Shanahan, Murray. 1997. Solving the Frame Problem: a Mathematical Investigation of the Common Sense Law of Inertia. Cambridge, Mass.: MIT Press.
  • Shanahan, Murray, and Bernard Baars. 2005. “Applying Global Workspace Theory to the Frame Problem.” Cognition 98 (2) (December): 157–76.
  • Sloman, A. 1996. “Beyond Turing Equivalence.” In Machines and Thought: The Legacy of Alan Turing, ed. Peter Millican, 1:179–219. New York: Oxford University Press, USA.
  • Steels, Luc. 2008. “The Symbol Grounding Problem Has Been Solved, so What’s Next?” In Symbols and Embodiment: Debates on Meaning and Cognition, ed. Manuel de Vega, Arthur M. Glenberg, and Arthur C. Graesser, 223–244. Oxford: Oxford University Press.
  • Sterelny, Kim. 1990. The Representational Theory of Mind: An Introduction. Oxford, OX, UK; Cambridge, Mass., USA: B. Blackwell.
  • Tomkins, Silvan, and Samuel Messick. 1963. Computer Simulation of Personality: Frontier of Psychological Theory. New York: Wiley.
  • Walmsley, Joel. 2008. “Explanation in Dynamical Cognitive Science.” Minds and Machines 18 (3) (July 2): 331–348.
  • Weizenbaum, Joseph. 1976. Computer Power and Human Reason: From Judgment to Calculation. San Francisco: W.H. Freeman.
  • Wittgenstein, Ludwig. 1953. Philosophical Investigations. New York: Macmillan.
  • Zednik, Carlos. 2011. “The Nature of Dynamical Explanation.” Philosophy of Science 78 (2): 238–263.

 

Author Information

Marcin Milkowski
Email: marcin.milkowski@gmail.com
Institute of Philosophy and Sociology
Polish Academy of Sciences
Poland

David Hume: Religion

David Hume (1711-1776) was called “Saint David” and “The Good David” by his friends, but his adversaries knew him as “The Great Infidel.” His writings on religion have had a lasting impact and contemporary significance. Taken individually, they offer novel insights into many aspects of revealed and natural theology. Taken together, however, they amount to an attempt at a systematic undermining of the justifications for religion. Religious belief is often defended through revealed theology, natural theology, or pragmatic advantage, and across his various philosophical writings Hume works to critique each of these avenues of religious justification.

Though Hume’s final view on religion is not clear, what is certain is that he was not a theist in any traditional sense. He gives a sweeping argument that we are never justified in believing testimony that a miracle has occurred, because the evidence for uniform laws of nature will always be stronger. If correct, this claim would undermine the veracity of any sacred text, such as the Bible, that testifies to miracles and relies on them as its guarantor of truth. As such, Hume rejects the truth of any revealed religion, and further shows that, when corrupted by inappropriate passions, religion has harmful consequences for both morality and society. Further, he argues, rational arguments cannot lead us to a deity. Hume develops what are now standard objections to the analogical design argument by insisting that the analogy is drawn only from limited experience, making it impossible to conclude that a cosmic designer is infinite, morally just, or a single being. Nor can we use such depictions to inform other aspects of the world, such as whether there is a desert-based afterlife. He also defends what is now called “the Problem of Evil,” namely, that the concept of an all-powerful, all-knowing, and all-good God is inconsistent with the existence of suffering.

Lastly, Hume is one of the first philosophers to systematically explore religion as a natural phenomenon, suggesting how religious belief can arise from natural, rather than supernatural, means.

Table of Contents

  1. Hume’s Publications on Religious Belief
  2. Interpretations of Hume’s View
  3. Miracles
  4. Immortality of the Soul
  5. The Design Argument
  6. The Cosmological Argument
  7. The Problem of Evil
  8. The Psychology of Religious Belief
  9. The Harms of Religion
  10. References and Further Reading
    1. Hume’s Works on Religion
    2. Works in the History of Philosophy

1. Hume’s Publications on Religious Belief

Hume is one of the most important philosophers to have written in the English language, and many of his writings address religious subjects either directly or indirectly. His very first work had the charge of atheism leveled against it, and this led to his being passed over for the Chair of Moral Philosophy at the University of Edinburgh. In fact, Hume’s views on religion were so controversial that he never held a university position in philosophy.

Hume addressed most of the major issues within the philosophy of religion, and even today theists feel compelled to confront Hume’s challenges. He leveled moral, skeptical, and pragmatic objections against both popular religion and the religion of the philosophers. These run the gamut from highly specific topics, such as metaphysical absurdities entailed by the Real Presence of the Eucharist, to broad critiques like the impossibility of using theology to infer anything about the world.

Hume’s very first work, A Treatise of Human Nature, includes considerations against an immortal soul, develops a system of morality independent of a deity, attempts to refute occasionalism, and argues against a necessary being, to name but a few of the religious topics that it addresses. Hume’s Enquiry Concerning Human Understanding re-emphasizes several of the challenges from the Treatise, but also includes a section against miracles and a section against the fruitfulness of theology. Hume’s major non-philosophical work, The History of England, discusses specific religious sects, largely in terms of their (often bloody) consequences. He also wrote numerous essays discussing various aspects of religion, such as the anti-doctrinal essays “Of the Immortality of the Soul” and “Of Suicide,” and critiques of organized religion and the clergy in “Of Superstition and Enthusiasm” and “Of National Characters.” Hume also wrote two major works entirely dedicated to religion: The Natural History of Religion (Natural History) and the Dialogues concerning Natural Religion (Dialogues), which merit brief discussions of their own.

Hume wrote the Natural History roughly in tandem with the first draft of the Dialogues, but while the former was published during his lifetime (as one of his Four Dissertations), the latter was not. In the introduction to the Natural History, Hume posits that there are two types of inquiry to be made into religion: its foundations in reason and its origin in human nature. While the Dialogues investigate the former, the task of the Natural History is to explore the latter. In the Natural History, he focuses on how various passions can give rise to common or false religion. It is an innovative work that brings together threads from philosophy, psychology, and history to provide a naturalistic account of how the various world religions came about.

Though Hume began writing the Dialogues at roughly the same time as the Natural History, he ultimately arranged to have the former published posthumously. In the twenty-five years between the time at which he first wrote them and his death, the Dialogues underwent three sets of revisions, including a final revision from his deathbed. The Dialogues are a rich discussion of Natural Theology, and are generally considered to be the most important book ever written on the subject. Divided into twelve parts, the Dialogues follow the discussion of three thinkers debating the nature of God. Of the three characters, Philo takes up the role of the skeptic, Demea represents the orthodox theologian of Hume’s day, and Cleanthes follows a more philosophical, empirical approach to his theology. The work is narrated by Pamphilus, a professed student of Cleanthes.

Both Hume’s style and the fact of posthumous publication give rise to interpretive difficulties. Stylistically, Hume’s Dialogues are modeled after On the Nature of the Gods, a dialogue by the Roman philosopher Cicero. In Cicero’s works, unlike the dialogues of Plato, Leibniz, and Berkeley, a victor is not established from the outset, and all characters make important contributions. Hume ridicules such one-sided dialogues on the grounds that they put “nothing but Nonsense into the Mouth of the Adversary” (L1, Letter 72). The combination of this stylistic preference with Hume’s use of irony, an infrequently discussed but frequently employed literary device in his writings, makes the work a delight to read, but creates interpretive difficulties in determining who speaks for Hume on any given topic.

In the Dialogues, all the characters make good Humean points, even Pamphilus and Demea. The difficulty comes in determining who speaks for Hume when the characters disagree. Hume has been interpreted as Cleanthes/Pamphilus, as Philo, as an amalgamation, and as none of them. The most popular view, though not without dissent, construes Hume as Philo. Philo certainly has the most to say in the Dialogues. His arguments and objections often go unanswered, and he espouses many opinions on both religion and other philosophical topics that Hume endorses in other works, such as the hypothesis that causal inference is based on custom. The more significant challenge to interpreting Hume as Philo concerns the collection of remarks at the beginning of Part XII of the Dialogues, known as Philo’s Reversal. After spending the bulk of the Dialogues raising a barrage of objections against the design argument, Philo opens Part XII by admitting, “A purpose, an intention, a design strikes everywhere the most careless, the most stupid thinker…” (D 12.2). Whether Philo’s Reversal is sincere or not is thus fundamentally tied to Hume’s own views on religion.

2. Interpretations of Hume’s View

Given the comprehensive critique that Hume levels against religion, it is clear that he is not a theist in any traditional sense. However, acknowledging this point does little to settle Hume’s considered views on religion. There remain three positions open to Hume: atheist naturalism, skeptical agnosticism, or some form of deism. The first position has Hume denying any form of supernaturalism, and is much more popular outside of Hume scholarship than within. The reason for this is that it runs contrary to Hume’s attitude regarding speculative metaphysics. It has him making a firm metaphysical commitment by allowing an inference from our having no good reason for thinking that there are supernatural entities, to a positive commitment that in fact there are none. However, Hume would not commit the Epistemic Fallacy and thereby allow the inference from “x is all we can know of subject y” to “x constitutes the real, mind-independent essence of y.” Indeed, in Part XII of the first Enquiry, Hume explicitly denies the inference from what we can know from our ideas to what is the case in reality.

These considerations against a full-fledged atheist position motivate the skeptical view. While atheism saddles Hume with too strong a metaphysical commitment, the skeptical view also holds that he does not affirm the existence of any supernatural entities. This view has Hume doubting the existence of supernatural entities, but still allowing their possibility. It has the advantage of committing Hume to the sparse ontology of the naturalist without actually committing him to potentially dogmatic metaphysical positions. Hence, Hume can be an atheist for all intents and purposes without actually violating his own epistemic principles.

Both the atheist and skeptical interpretations must, then, take Philo’s Reversal as insincere. Perhaps Hume feared the political consequences of publicly denouncing theism; alternatively, he may have used Philo’s Reversal simply as a dialectical tool of the Dialogues. Many scholars tend to steer clear of the former for several reasons. First, while it was true that, early in his career, Hume edited his work to avoid giving offense, this was not the case later. For example, Hume excised the miracles argument from the Treatise, but it later found its way into print in the Enquiry. Second, Hume arranged to have the Dialogues published after his death, and therefore had no reason to fear repercussions for himself. Further, Hume did not seem to think that the content would bring grief to his nephew who brought it to publication, as he revealed in a letter to his publisher (L2, Appendix M). Third, it is not only in the Dialogues that we get endorsements of a deity or of a design argument. J.C.A. Gaskin (1988: 219) provides an extensive (though not exhaustive) list of several other places in which we get similar pro-deistic endorsements from Hume. Lastly, it is generally considered hermeneutically appropriate to invoke disingenuousness only if an alternative interpretation cannot be plausibly endorsed.

Norman Kemp Smith, in his commentary on the Dialogues, argues in favor of just such an alternative interpretation. Though he interprets Hume as Philo, he takes the Reversal to be insincere, made not from fear but as a dialectical tool. In his Ciceronian dialogue, Hume does not want the reader, upon finishing the piece, to interpret any of the characters as victorious, instead encouraging them to reflect further upon these matters. Thus, Philo’s Reversal is part of a “dramatic balance” intended to help mask the presence of a clear victor.

Nelson Pike, in his own commentary on the Dialogues, roundly criticizes Kemp Smith’s position, arguing that we should instead look for reasons to take the Reversal as genuine. One possibility he considers is the presence of the “irregular arguments” of Part III. Here, instead of presenting design arguments based on standard analogical reasoning, Cleanthes presents considerations in which design will “immediately flow in upon you with a force like that of sensation” (D 3.7). Pike therefore interprets these “irregular arguments” as non-inferential. If this is right, and the idea of a designer comes upon us naturally rather than inferentially, as Ronald Butler, Stanley Tweyman, and others have argued, then Philo’s Reversal is not a reversal at all. He can consistently maintain that the inference of the design argument is insufficient for grounding one’s belief in God, and that, nonetheless, we have a natural inclination to accept it.

There is, therefore, support for interpreting Hume as a deist of a limited sort. Gaskin calls this Hume’s “attenuated deism,” attenuated in that the analogy to something like human intelligence is incredibly remote, and that no morality of the deity is implied, due especially to the Problem of Evil. However, scholars who attribute weak deism to Hume are split in regard to the source of the belief. Some, like Gaskin, think that Hume’s objections to the design argument apply only to analogies drawn too strongly. Hence, Hume does not reject all design arguments and, provided that the analogs are properly qualified, might allow the inference. This is different from the picture suggested by Butler and discussed by Pike, in which the belief is provided by a natural, non-rational faculty and thereby simply strikes us, rather than being the product of an inferential argument. Therefore, though the defenders of a deistic Hume generally agree about the remote, non-moral nature of the deity, there is a fundamental schism regarding the justification and generation of this belief. Both sides, however, agree that the belief should not come from special revelation, such as miracles or revealed texts.

3. Miracles

Because Hume’s denial of all miracles in section X of the Enquiry entails a denial of all revealed theology, it is worthwhile to consider his arguments in detail. The section is divided into two parts. While Part I provides an argument against believing in miracles in general, Part II gives four specific considerations against miracles based on particular facts about the world. Therefore, we may refer to the argument of Part I as Hume’s Categorical Argument against miracles and those of Part II as the four Evidential Arguments against miracles. Identifying Hume’s intentions with these arguments is notoriously difficult. Though the Evidential Arguments are fairly straightforward in and of themselves, there are two major interpretive puzzles: what the Categorical Argument of Part I is supposed to be, and how it fits with the Evidential Arguments of Part II. Some see the two parts as entirely separable, while others insist that they provide two parts of a cohesive whole. The following reconstructions attempt to stay interpretively neutral on these disputes.

Hume begins Part I with rules for the appropriate proportioning of belief. First, he divides arguments that justify beliefs regarding cause and effect into proofs and probabilities. Proofs are arguments supported by evidence in which the effects have been constant, such as the sun rising every day. However, there are stronger and weaker proofs—consider a professor showing up for class every day versus the sun rising every day—and only the strongest proofs, those supporting our beliefs in the laws of nature, have been attested to “in all countries and all ages.” Effects, however, are not always constant. When faced with a “contrariety of effects,” we must instead use probabilities, which are evidentially weaker than proofs. Since the strength of both proofs and probabilities varies in degree, we have the potential for “all imaginable degrees of assurance.” Hume maintains that, “The wise man…proportions his beliefs to the evidence.” In cases where effects have been constant and therefore supported by proof, our beliefs are held with a greater degree of assurance than those supported by mere probability (EHU 10.1-4).

Having explained Hume’s model for proportioning beliefs, we can now consider its ramifications for attested miracles:

A miracle is a violation of the laws of nature; and as a firm and unalterable experience has established these laws, the proof against a miracle, from the very nature of the fact, is as entire as any argument from experience can possibly be imagined. (EHU 10.12)

Here, Hume defines a miracle as a “violation of the laws of nature” though he then “accurately” defines a miracle in a footnote as “a transgression of a law of nature by a particular volition of the Deity or by the interposition of some invisible agent.” As to which definition is more relevant, the second more adequately captures the notion of a miracle. In a 1761 letter to Blair, Hume indicates that, as an empirical fact, miracles always have religious content: “I never read of a miracle in my life that was not meant to establish some new point of religion” (L1, Letter 188). A Humean miracle is, therefore, a violation of a law of nature whose cause is an agent outside of nature, though the incompatibility with a law of nature is all that the Categorical Argument requires.

We must, therefore, consider Hume’s conception of the laws of nature. Following Donald Livingston, we may draw out some of its explicit features. First, laws of nature are universal: any falsification of a supposed law, or any failure of a law to be upheld, would be sufficient to rob it of its nomological status. Laws, therefore, admit of no empirical counterexamples. Second, laws of nature are matters of fact, not relations of ideas, as their denial is always coherent. Indeed, like any other matter of fact, they must have some empirical content. As Livingston concludes, “…it must be possible to discipline theoretical talk about unobservable causal powers with empirical observations” (Livingston 1984: 203).

Utilizing this conception of the laws of nature, Hume draws his conclusion:

There must, therefore, be a uniform experience against every miraculous event, otherwise the event would not merit that appellation. And as the uniform experience amounts to a proof, then there is here a direct and full proof, from the nature of the fact, against the existence of any miracle; nor can such a proof be destroyed, or the miracle rendered credible, but by an opposite proof, which is superior….no testimony is sufficient to establish a miracle, unless the testimony be of such a kind, that its falsehood would be more miraculous, than the fact, which it endeavors to establish…. (EHU 10.12-10.13; SBN 115-116, Hume’s emphasis)

The interpretation of this passage requires considerable care. As many commentators have pointed out, if Hume’s argument is: a miracle is a violation of a law of nature, but laws of nature do not admit of counterexamples, therefore there are no miracles, then Hume clearly begs the question. Call this the Caricature Argument. William Paley first attributed this argument to Hume, and the interpretation has had proponents ever since; but this cannot be Hume’s argument. The Caricature Argument faces three major obstacles, two of which are insurmountable. However, considering the inaccuracies of the Caricature Argument will help us to arrive at a more accurate reconstruction.

First, the Caricature Argument is an a priori, deductive argument from definition. This would make it a demonstration in Hume’s vernacular, not a proof. Nonetheless, both the argument of Section X and the letter in which he elucidates it repeatedly appeal to the evidence against miracles as constituting a proof. If the Caricature Argument were correct, then the argument against miracles could not be labeled as such.

A second, related problem is that, if one accepts the Caricature Argument, then one must accept the entailed modality. From the conclusion of the a priori deductive argument, it follows that the occurrence of a miracle would be impossible. If this were the case, then no testimony could persuade a person to believe in the existence of a miracle. However, many take Hume to implicitly reject such an assumption. Such critics point to Hume’s acceptance of the claim that if a sufficient number of people testify to an eight-day darkness, then this constitutes a proof of its occurrence (EHU 10.36). Therefore, there are hypothetical situations in which our belief in a miracle could be established by testimony, implying that the conclusion of the Caricature Argument is too strong. This reply, however, is incorrect. Hume’s description of the proof for total darkness is generally interpreted as his establishing criteria for the rational justification of a belief, based on testimony, that a miracle has occurred. However, we must note that the passage that immediately precedes the example contains an ambiguous disjunct: “…there may possibly be miracles, or violations of the usual course of nature, of such a kind as to admit proof from human testimony” (EHU 10.36 emphasis added). From this passage alone, it is not clear whether Hume means for the darkness scenario to count as an example of the former, the latter, or both. Nevertheless, in Hume’s letter to Blair, he presents a similar example with an unambiguous conclusion. In considering Campbell’s complaint that it is a contradiction for Hume to introduce a fiction in which the testimony of miracle constitutes a proof, he has us consider his previous example concerning the

…supposition of testimony for a particular miracle [that might] amount to a full proof of it. For instance, the absence of the sun during 48 hours; but reasonable men would only conclude from this fact, that the machine of the globe was disordered during this time. (L1, Letter 188)

The conclusion Hume draws is that, even if testimony of a strange event were to amount to a full proof, it would be more reasonable to infer a hiccup in the natural regularity of things (on par with an eclipse: an apparent disturbance, but not a disturbance of a higher-level regularity) rather than to conclude a miracle. Therefore, when presented with a situation that is either a miracle or a “violation of the usual course of nature,” we ought to infer the latter.

This preference for a naturalistic explanation is reemphasized in Hume’s discussion of Joan of Arc in the History of England. Hume states:

It is the business of history to distinguish between the miraculous and the marvelous; to reject the first in all narrations merely profane and human; to doubt the second; and when obliged by unquestionable testimony…to admit of something extraordinary, to receive as little of it as is consistent with the known facts and circumstances. (H 2.20, Hume’s emphasis)

Here, he once more suggests that we always reject the miraculous testimony and only accept as much of the marvelous as is required to remain consistent with the “unquestionable testimony.” For Hume, testimony of a miracle is always to be rejected in favor of the naturalistic interpretation. He therefore never grants a proof of a miracle as a real possibility, so the Caricature Argument may surmount at least this objection.

However, a final difficulty related to the modality of the conclusion concerns the observation that Hume couches his argument in terms of appropriate belief. Hume’s conclusion should, therefore, be interpreted as epistemic, but the Caricature Argument instead requires a metaphysical conclusion: miracles are impossible. The Caricature Argument cannot be correct, because Hume’s entire argument hinges on the way that we apportion our beliefs, and a fortiori, beliefs about testimony. Hume speaks of “our evidence” for the truth of miracles, belief in them being “contrary to the rules of just reasoning,” and miracles never being “established on…evidence.” “A miracle can never be proved” is a far cry from saying that a miracle has never occurred and never could occur. This gives us reason to reject the metaphysical conclusion of the Caricature Argument.

There are also logical implications against the metaphysical conclusion, such as Hume’s avowal that miracles have an essence, and that there can be un-witnessed miracles. Hume does not say that violations are impossible, only unknowable. Of course, it could be that Hume grants this merely for the sake of argument, but then the stronger conclusion would still have a problem. For whether or not Hume grants the occurrence of miracles, he certainly allows for their conceivability, something the Caricature Argument cannot allow since, for Hume, conceivability implies possibility. Finally, there is the fact that Part II exists at all. If Hume did indeed think that Part I established that miracles could never occur, the entire second part, where he shows that “…there never was a miraculous event established on… [sufficient] evidence” (EHU 10.14), would be logically superfluous. The proper conclusion is, therefore, the epistemic one.

In overcoming the weaknesses of the Caricature Argument, a more plausible Humean argument takes form. Hume’s Categorical Argument of Part I may be reconstructed as follows:

  1. Beliefs about matters of fact are supported only by proofs (stronger) or probabilities (weaker) that come in varying degrees of strength. [Humean Axiom- T 1.3.11.2, EHU 6.1, EHU 10.6]
  2. When beliefs about matters of fact conflict, assent ought to be given only to the sufficiently supported belief with the greatest degree of evidential support. [Humean Axiom- EHU 10.4, EHU 10.11]
  3. Belief in the occurrence of a miracle would be a matter of fact belief that conflicts with belief in at least one law of nature. [Humean Axiom- EHU 10.2]
  4. Laws of nature are matter of fact beliefs evidentially supported by proofs of the strongest possible type. [Empirical Premise- EHU 10.2]
  5. Both testimonial probabilities supporting the occurrence of a miracle and (hypothetical) testimonial proofs supporting the occurrence of a miracle would be evidentially weaker than the proofs supporting the laws of nature. [Empirical Premise- EHU 10.2, EHU 10.13, EHU 10.36. The first clause is true by definition for probabilities, but Hume also establishes it more clearly in Part II.]
  6. Therefore, we should never believe testimony that a miracle has occurred.

There is much to be said for this reconstruction. First, in addition to Humean axioms, we have empirical premises rather than definitions that support the key inferences. Hence, the reconstruction is a proof, not a demonstration. Second, given that Hume has ancillary arguments for these empirical premises, there is no question-begging of the form that the Caricature Argument suggests. For instance, he argues for (4) by drawing on his criterion of “in all countries and all ages.” He does not simply assert that laws of nature automatically meet this criterion.

However, there is a separate worry of question-begging in (4) that needs to be addressed before moving on to the arguments of Part II. The challenge is that, in maintaining Hume’s position that men in all ages testify to the constancy of the laws of nature, any testimony to the contrary (that is, testimony of the miraculous) must be excluded. However, there are people that do testify to miracles. The worry is that, in assigning existence to laws of nature without testimonial exception, Hume may beg the question against those that maintain the occurrence of miracles.

This worry can be overcome, however, if we follow Don Garrett in realizing what Hume is attempting to establish in the argument:

… [when] something has the status of “law of nature”- that is, plays the cognitive role of a “law of nature”- for an individual judger…it has the form of a universal generalization, is regarded by the judger as causal, and is something for which the judger has firm and unalterable experience….This is, of course, compatible with there actually being exceptions to it, so long as one of those exceptions has, for the judger, the status of experiments within his or her experience. (Garrett 1997: 152, Hume’s emphasis)

Garrett rightly points out that in Hume’s argument laws of nature govern our belief, and fulfill a certain doxastic role for the judger. Once this is realized, we can strengthen Garrett’s point by recognizing that this role is, in fact, a necessary condition for testimony of a miracle. To believe in a miracle, the witness must believe that a law of nature has been violated. However, this means that, in endorsing the occurrence of the miracle, the witness implicitly endorses two propositions: that there is an established law of nature in place and that it has been broken. Thus, in order for a witness to convince me of a miracle, we must first agree that there is a law in place. The same testimony which seeks to establish the miracle reaffirms the nomological status of the law as universally believed.

This leads to the second point that Garrett raises. Only after this common ground is established do we consider, as an experiment, whether we should believe that the said law has been violated. Hence, even such a testimonial does not count against the universality of what we, the judges, take to be a law of nature. Instead, we are setting it aside as experimental in determining whether we should offer assent to the purported law or not. If this is right, then (4) does not beg the question. This leaves us with empirical premise (5), which leads to Part II.

Hume begins Part II by stating that, in granting that the testimonies of miracles may progress beyond mere probability, “we have been a great deal too liberal in our concession…” (EHU 10.14). He then gives four considerations as to why this is the case, three of which are relatively straightforward.

First, Hume tells us that, as an empirical fact, “there is not to be found, in all history, any miracle attested by a sufficient number of men, of such unquestioned good sense, education, and learning…” to secure its testimony (EHU 10.15). To be persuaded of a miracle, we would need to be sure that no natural explanation, such as delusion, deception, and so forth, was more likely than the miraculous, a task which, for Hume, would simply take more credible witnesses than have ever attested to a miracle.

Second, it is a fact of human nature that we find surprise and wonder agreeable. We want to believe in the miraculous, and we are much more likely to pass along stories of the miraculous than of the mundane. For Hume, this explains why humans tend to be more credulous with attested miracles than should reasonably be the case, and also explains why the phenomenon is so widespread.

His third, related presumption against miracles is that testimony of their occurrence tends to be inversely proportionate to education: miracles “are observed chiefly to abound among ignorant and barbarous nations” (EHU 10.20). Hume’s explanation for this is that purported miracles are generally born of ignorance. Miracles are used as placeholders when we lack the knowledge of natural causes. However, as learning progresses, we become increasingly able to discover natural causes, and no longer need to postulate miraculous explanations.

Hume’s fourth consideration is also his most difficult:

Every miracle, therefore, pretended to have wrought in any of these religions…as its direct scope is to establish the particular system to which it is attributed; so has it the same force, though more indirectly, to overthrow every other system. In destroying a rival system, it likewise destroys the credit of those miracles, on which that system was established; so that all the [miracles] of different religions are to be regarded as contrary facts, and evidence of these…as opposite to each other. (EHU 10.24)

His general idea is that, since multiple, incompatible religions testify to miracles, they cancel each other out in some way, but scholars disagree as to how this is supposed to happen. Interpreters such as Gaskin (1988: 137-138) and Keith Yandell (1990: 334) focus on Hume’s claim that miracles are generally purported to support or establish a particular religion. Therefore, a miracle wrought by Jesus is opposed and negated by one wrought by Mohammed, and so forth. However, as both Gaskin and Yandell point out, this inference would be flawed, because miracles are rarely such that they entail accepting one religion exclusively. Put another way, the majority of miracles can be interpreted and accepted by almost any religion.

However, there is a more charitable interpretation of Hume’s fourth Evidential Argument. As the rest of the section centers around appropriate levels of doxastic assent, we should think that the notion is at play here too. A less problematic reconstruction therefore has his fourth consideration capturing something like the following intuition: the testifiers of miracles have a problem. In the case of their own religion, their level of incredulity is sufficiently low so as to accept their own purported miracles. However, when they turn to those attested by other religions, they raise their level of incredulity so as to deny these miracles of other faiths. Thus, by participating in a sect that rejects at least some miracles, they thereby undermine their own position. In claiming sufficient grounds for rejecting the miracles of the other sects, they have thereby rejected their own. For Hume, the sectarians cannot have their cake and eat it. Intellectual honesty requires a consistent level of credulity. By rejecting their opponent’s claims to miracles, they commit to the higher level of incredulity and should thereby reject their own. Hence, Hume’s later claim that, in listening to a Christian’s testimony of a miracle, “we are to regard their testimony in the same light as if they had mentioned that Mahometan miracle, and had in express terms contradicted it, with the same certainty as they have for the miracle they relate” (EHU 10.24). Thus, the problem for Hume is not that the sectarians cannot interpret all purported miracles as their own but that they, in fact, do not.

These are the four evidential considerations against miracles Hume provides in Part II. However, if the above reconstruction of Part I is correct, and Hume thinks that the Categorical Argument has established that we are never justified in believing the testimony of miracles, we might wonder why Part II exists at all. Its presence can be justified in several ways. First, on the reconstruction above, Part II significantly bolsters premise (5). Second, even if Part II were logically superfluous, Michael Levine rightly points out that the arguments of Part II can still have a buttressing effect for persuading the reader to the conclusion of Part I, thereby softening the blow of its apparently severe conclusion. A third, related reason is a rhetorical consideration. In order for one’s philosophical position to be well-grounded, it is undesirable to hang one’s hat on a single consideration. As Hume himself acknowledges, resting one part of his system on another would unnecessarily weaken it (T 1.3.6.9). Therefore, the more reasons he can present, the better. Fourth, Hume, as a participant in many social circles, is likely to have debated miracles in many ways against many opponents, each with his or her own favored example. Part II, therefore, gives him the opportunity for more direct and specific redress, and he does indeed address many specific miracles there. Finally, the considerations of Part II, the second and third especially, have an important explanatory effect. If Hume is right that no reasonable person would believe in the existence of miracles based on testimony, then it should seem strange that millions have nevertheless done so. Like the Natural History discussed below, Part II can disarm this worry by explaining why, if Hume is right, we have this widespread phenomenon despite its inherent unreasonableness.

4. Immortality of the Soul

In his essay, “Of the Immortality of the Soul,” Hume presents many pithy and brief arguments against the doctrine of an afterlife. He offers them under three broad headings: metaphysical, moral, and physical. Written for a popular audience, they should be treated as challenges or considerations against, rather than decisive refutations of, the doctrine.

Hume’s metaphysical considerations largely target the rationalist project of establishing a mental substance a priori (such as the discovery of the “I” in Descartes’ Meditations). His first two considerations against this doctrine draw on arguments from his Treatise, referring to his conclusion that we have only a confused and insufficient idea of substance. If this is the case, however, then it becomes exceedingly difficult to discover the essence of such a notion a priori. Further, Hume says, we certainly have no conception of cause and effect a priori, and are therefore in no position to make a priori conclusions about the persistence conditions of a mental substance, or to infer that this substance grounds our thoughts. Indeed, even if we admit a mental substance, there are other problems.

Assuming that there is a mental substance, Hume tells us that we must treat it as relevantly analogous to physical substance. The physical substance of a person disperses after death and loses its identity as a person. Why think that the mental substance would behave otherwise? If the body rots, disperses, and ceases to be human, why not say the same thing of the soul? If we reply by saying that mental substances are simple and immortal, then for Hume, this implies that they would also be non-generable, and should not come into being either. If this were true, we should have memories from before our births, which we clearly do not. Note that here we see Hume drawing on his considerations against miracles, implicitly rejecting the possibility of a system whereby God continuously and miraculously brings souls into existence. Finally, if the rationalists are right that thought implies eternal souls, then animals should have them as well since, in the Treatise, Hume argued that mental traits such as rationality obtain by degree throughout the animal world, rather than by total presence or total absence; but this is something that the Christians of Hume’s day explicitly denied. In this way, Hume’s metaphysical considerations turn the standard rationalist assumptions of the theists, specifically the Christian theists of his day, against them.

The moral considerations, however, require no such presuppositions beyond the traditional depictions of heaven and hell. Hume begins by considering two problems involving God’s justice. First, he addresses the defender of an afterlife who posits its existence as a theodicy, maintaining that there is an afterlife so that the good can be appropriately rewarded and the wicked appropriately punished. For reasons considered in detail below, Hume holds that we cannot infer God’s justice from the world, which means we would need independent reasons for positing an alternate existence. However, the success of the arguments discussed above would largely undercut the adequacy of such reasons. Second, Hume points out that this system would not be just in any case. For one thing, Hume claims it is unwarranted to put so much emphasis on this world if it is so fleeting and minor in comparison to an infinite afterlife. If God metes out infinite punishment for finite crimes, then God is omni-vindictive, and it seems equally unjust to give infinite rewards for finitely meritorious acts. According to Hume, most men are somewhere between good and evil, so what sense is there in making the afterlife absolute? Further, Hume raises difficulties concerning birth. If all but Christians of a particular sect are doomed to hell, for instance, then being born in, say, Japan would be like losing a cosmic lottery, a notion difficult to reconcile with perfect justice. Finally, Hume emphasizes that punishment without purpose, without some chance of reformation, is not a satisfactory system, and should not be endorsed by a perfect being. Hence, Hume holds that considerations of an afterlife seem to detract from, rather than bolster, God’s perfection.

Lastly are the physical (empirical) considerations, which Hume identifies as the most relevant. First, he points out how deeply and entirely connected the mind and body are. If two objects work so closely together in every other aspect of their existence, then the end of one should also be the end of the other. Two objects so closely linked, and that began to exist together, should also cease to exist together. Second, again in opposition to the rationalist metaphysicians, he points out that dreamless sleep establishes that mental activity can be at least temporarily extinguished; we therefore have no reason to think that it cannot be permanently extinguished. His third consideration is that we know of nothing else in the universe that is eternal, or at least that retains its properties and identity eternally, so it would be strange indeed if there were exactly one thing in all the cosmos that did so. Finally, Hume points out that nature does nothing in vain. If death were merely a transition from one state to another, then nature would be incredibly wasteful in making us dread the event, in providing us with mechanisms and instincts that help us to avoid it, and so forth. That is, it would be wasteful for nature to place so much emphasis on survival. Because of these skeptical considerations, Hume posits that the only argument for an immortal soul is from special revelation, a source he rejects along with miracles.

5. The Design Argument

Having discussed Hume’s rejection of revealed theology, we now turn to his critiques of the arguments of Natural Theology, the most hopeful of which, for Hume, is the Design Argument. His assaults on the design argument come in two very different types. In the Dialogues, Hume’s Philo provides many argument-specific objections, while Section XI of the Enquiry questions the fruitfulness of this type of project generally.

In the Dialogues, Cleanthes defends various versions of the design argument (based on order) and the teleological argument (based on goals and ends). Generally, he does not distinguish between the two, and they are similar in logical form: both are arguments by analogy. In analogical arguments, relevant similarities between two or more entities are used as a basis for inferring further similarities. In this case, Cleanthes draws an analogy between artifacts and nature: artifacts exhibit certain properties and have a designer/creator; parts, or the totality, of nature exhibit similar properties; therefore, we should infer a relevantly analogous designer/creator. Hume’s Philo raises many objections against such reasoning, most of which are still considered legitimate challenges to be addressed by contemporary philosophers of religion. Replies, however, will not be addressed here. Though Philo presents numerous challenges to this argument, they can be grouped under four broad headings: the scope of the conclusion, problems of weak analogy, problems with drawing the inference, and problems with allowing the inference. The first two types of problem are related in many cases, but not all. After the objections from the Dialogues are discussed, we will turn to Hume’s more general critique from the first Enquiry.

Scope of the Conclusion: Philo points out that, if the analogy is to be drawn between an artifact and some experienced portion of the universe, then the attributes of the designer must be inferred only from the phenomena. That is, we can only make merited conclusions about the creator based on the experienced part of the universe that we treat as analogous to an artifact, and nothing beyond this. As Philo argues in Part V, since the experienced portion of the world is finite, we cannot reasonably infer an infinite creator. Similarly, our limited experience would not allow us to make an inference to an eternal creator, since everything we experience in nature is fleeting. An incorporeal creator is even more problematic, because Hume maintains that the experienced world is corporeal. In fact, even a unified, single creator becomes problematic if we are drawing an analogy between the universe and any type of complex artifact. If we follow someone like William Paley, who maintains that the universe is relevantly similar to a watch, then we must further pursue the analogy in considering how many people contributed to that artifact’s coming to be. Crafting a watch requires that many artificers work on various aspects of the artifact in order to arrive at a finished product. Finally, Philo insists that we also lack the ability to infer a perfect creator or a morally estimable creator, though the reasons for this will be discussed below in the context of the Problem of Evil. Given these limitations that we must place on the analogy, we are left with a very vague notion of a designer indeed. As Philo claims, a supporter of the design analogy is only “…able, perhaps, to assert, or conjecture, that the universe, sometime, arose from something like design: But beyond that position, he cannot ascertain one single circumstance, and is left afterward to fix every point on his [revealed] theology…” (D 5.12). This is Gaskin’s “attenuated deism” mentioned above.
However, even weakening the conclusion to this level of imprecision still leaves a host of problems.

Problems of Weak Analogy: As mentioned above, many of Philo’s objections can be classified as either a problem with the scope of the conclusion or as a weak analogy. For instance, inferring an infinite creator from a finite creation would significantly weaken the analogy by introducing a relevant disanalogy, but the argument is not vulnerable in this way if the scope of the conclusion is properly restricted. However, beyond these problems of scope, Philo identifies two properties that serve to weaken the analogy but that cannot be discharged via a sufficient limitation of the conclusion. In Part X, Philo points out the apparent purposelessness of the universe. Artifacts are designed for a purpose. An artifact does something. It works toward some goal. Thus, there is a property that all artifacts have in common but that we cannot locate in the universe as a whole. For Philo, the universe is strikingly disanalogous to, for instance, a watch, precisely because the former is not observed to work toward some goal. This weakness cannot be discharged by restricting the conclusion, and any attempt to posit a purpose to the universe will either rely on revealed theology or be simply implausible. To show why Philo thinks this, take a few simplified examples: if we say that the universe exists “for the glory of God,” we not only beg the question about the existence of God, but we also saddle our conception of God with anthropomorphized attributes Hume would find unacceptable, such as pride and the need for recognition. Similar problems exist if we say that the universe was created for God’s amusement.
However, if we change tactics and claim that the universe was created for the flourishing of humans, or any other species, then for Hume, we end up ignoring the phenomena in important ways, such as the numerous aspects of the universe that detract from human flourishing (such as mosquitoes) rather than contribute to it, and the vast portions of the universe that seem utterly irrelevant to human existence.

Beyond this, Philo finds another intractably weak analogy between artifacts and natural objects: the two are simply too fundamentally different. Philo holds that the more we learn about nature, the more striking the disanalogy between nature and artifacts becomes. Consider, for instance, that many aspects of nature are self-maintaining and even self-replicating. Even if there are important analogies to be drawn between a deer and a watch, the dissimilarities, for Philo, will always outweigh them.

Problems with Drawing the Inference: There are further problems with the design inference that go beyond the mere dissimilarity of the analogs. Hume’s Philo raises two such objections based on experience. First, there is no clear logical relationship between order and a designer. In Part VII, Philo argues that we do in fact experience order without agency: an acorn growing into an oak tree shows that one does not need knowledge or intent to bestow order. Nor can we reply that the acorn was designed to produce a tree, for this is the very issue in question, and to import design in this way would beg the question. But if we can have order without a designer, then the mere presence of order cannot allow us to infer the presence of a designer.

His second problem with making the design inference is that, like all inductive inferences, the design argument essentially involves a causal component. However, for Hume, knowledge of causal efficacy requires an experienced constant conjunction of phenomena; that is, only after we have seen that events of type B always follow events of type A do we infer a causal relationship from one to the other (see Hume: Causation). However, the creation of the universe necessarily would be a singular event. Since we do not have experience of multiple worlds coming into existence, causal inferences about any cosmogony become unfathomable for Hume in an important sense. This objection is often interpreted as peculiar to Hume’s own philosophical framework, relying heavily on his account of causation, but the point can be made more generally while still raising a challenge for the design argument. Because of our limited knowledge of the origins, if any, of the universe (especially in the 18th century), it becomes metaphysical hubris to think that we can make accurate inferences pertaining to issues such as: its initial conditions, persistence conditions, what it would take to cause a universe, whether the event has or requires a cause, and so forth. This relates to Philo’s next objection.

Problems when the Inference is Allowed: The previous two objections teach us that there are multiple origins of order, and that we are in a poor epistemic state to make inferences about speculative cosmogony. Taking these two points together, it becomes possible to postulate many hypothetical origins of the universe that are, for Hume, on as solid a footing as that of a designer, but that rely on a different principle of order. Though Philo indicates that there are many, he specifically identifies only four principles that we have experienced producing order in our part of the universe alone: reason (that is, rational agency), instinct, generation, and vegetation. Though Cleanthes defends reason as the only relevant principle of order, Philo develops alternative cosmogonies based on vegetation, where the universe grows from a seed, and generation, where the universe is like an animal or is like something created instinctively, such as a spider’s web; but Philo should not be taken as endorsing any of these alternative cosmogonies. Instead, his point is that, since we have experience of order arising from vegetation just as we have experience of order arising from rational agency, there is no obvious reason to think that the inference to the latter as the source of the order of the universe is any better than the inference to the former; we can make just as good an analogy with any of these principles. If order can come from multiple sources, and we know nothing about the creation of the universe, then Cleanthes is not in a position to give one a privileged position over the others. This means that, if we are to follow Cleanthes in treating the design inference as satisfactory, then we should treat the other inferences as satisfactory as well. However, since we cannot accept multiple conflicting cosmogonies, Philo maintains that we should refrain from attempting any such inferences.
As he says in a different context: “A total suspense of judgement is here our only reasonable resource” (D 8.12).

A second problem Philo raises with allowing the design inference is that doing so can lead to a regress. Let us assume that the designer inference is plausible, that is, that a complex, purposive system requires a designing mind as its principle of order. But a creative mind is itself a complex, purposive system: a mind is complex, and its various parts work together to achieve specific goals. Thus, if all such purposive systems require a designing mind as their principle of order, then it follows that we would need a designing mind for the designing mind as well. Using the same inference, we would need a designing mind for that mind, and so on. Hence, allowing that complex, purposive systems require a designing mind as their principle of order leads to an infinite regress of designing minds. In order to stop this regress while still maintaining the design inference, one must insist that the designer of the universe does not require a designer, and there are two ways to make this claim. Either one could say that the designing mind that created the universe is a necessary being whose existence does not require a causal explanation, or one could simply say that the designer’s existence is brute. Cleanthes rejects the former option in his refutation of Demea’s “argument a priori” and, more generally, Hume does not think that this form of necessity is coherent. The only option then is to declare that the designer’s existence is brute, and therefore does not require a designer for its explanation. However, if this is the case, and we are allowing brute, undesigned existences into our ontology, then Philo asks why we should not declare that the universe itself is the brute existence instead. If we are allowing one instance where complexity and purposiveness do not imply a designer, then why posit an extraneous entity based on what is, for Philo, a dubious inference, when parsimony should lead us to prefer a brute universe?

Setting aside the Problem of Evil for later, these are the major specific challenges Hume raises for the design argument in the Dialogues. However, Hume generalizes our inability to use theology to make analogical inferences about the world in Section XI of the Enquiry. Call it the Inference Problem. Rather than raising specific objections against the design argument, the Inference Problem instead questions the fruitfulness of the project of natural theology generally. Roughly stated, the Inference Problem is that we cannot use facts about the world to argue for the existence of some conception of a creator, and then use that conception of the creator to reveal further facts about the world, such as the future providence of this world, and so forth.

First, it is important to realize that the Inference Problem is a special case of an otherwise unproblematic inference. In science, we make this type of inference all the time; for instance, using phenomena to infer laws of nature and then using those laws of nature to make further predictions. Since Hume is clearly a proponent of scientific methodology, we must ask why the creator of the universe is a special and problematic case. The short answer lies in the worry from the Dialogues discussed above: the creation of the cosmos is necessarily a singular event. This means that the Inference Problem for a creator is a special case for two reasons. First, when inferring the existence and attributes of a creator deity, Hume demands that we use all available data, literally anything available in the cosmos that might be relevant to our depiction of the creator, rather than limiting the scope of our inquiry to a specific subset of phenomena. Hence, the deity we posit would represent our best guess based on all available information, unlike the case of discovering specific laws. Second, because the creation was a singular event, Hume insists that we cannot use analogy, resemblance, and so forth, to make good inductive inferences beyond what we have already done in positing the deity to begin with. On account of these two unique factors, there is a special Inference Problem that will arise whenever we try to use our inferred notion of a creator in order to discover new facts about the world.

In order to better understand the Inference Problem, let us take a concrete example: inferring a creator deity who is also just. There are only two possibilities: either the totality of the available evidence of the experienced cosmos does not imply the existence of a just creator or it does. If it does not, then we simply are not warranted in positing a just deity, and we therefore are not justified in assuming, for instance, that the deity’s justice will be discovered later, say in an afterlife. But if the evidence does imply a just creator deity (that is, the world is sufficiently just to allow the inference to a just creator), then Hume says we have no reason to think that a just afterlife is needed in order to supplement and correct an unjust world. In either case, says Hume, we are not justified in inferring further facts about the world based on our conception of the deity beyond what we have already experienced. Mutatis mutandis, this type of reasoning will apply to any conclusion drawn from natural theology. Our conception of the deity should be our best approximation based on the totality of available evidence. This means that for Hume, there are only two possibilities: either any relevant data is already considered and included in inferring our conception of the creator to begin with, and we therefore learn nothing new about the world; or the data is inconclusive, simply insufficient to support the inference to that conception of the deity, and hence we cannot reasonably make the inference. If the data is not already there, then it cannot be supplied by a permissible inference from the nature of the deity. However, if this is right, then the religious hypothesis of natural theology supplies no new facts about the world and is therefore explanatorily impotent.

6. The Cosmological Argument

Hume couches his concerns about theological inference as emanating from problems with drawing an analogical design inference. Since this is not the only type of argument in natural theology, we must now consider Hume’s reasons for rejecting other arguments that support the existence of a creator deity. Hume never makes a clear distinction between what Immanuel Kant later dubbed ontological and cosmological arguments; instead, he lumps them together under the heading of arguments a priori. Note that this is not as strange as it might first appear, because although cosmological arguments are now uniformly thought of as a posteriori rather than a priori, this was not the case in Hume’s day. It took Hume’s own insights about the a posteriori nature of causation and of the Principle of Sufficient Reason to make us realize this. For Hume, what is common among such ontological and cosmological arguments is that they infer the existence of a necessary being. Hume seems to slip here, failing to distinguish between the logical necessity of the deity concluded by ontological arguments and the metaphysical necessity of the deity concluded by cosmological arguments. He therefore uniformly rejects all such arguments due to the incoherence of a necessary being, a rejection found in both the Dialogues and the first Enquiry.

In Part IX of the Dialogues, Demea presents his “argument a priori,” a cosmological argument based on considerations of necessity and contingency. The argument was intentionally similar to a version proffered by Samuel Clarke, but is also similar to arguments defended by both Leibniz and Aquinas. Before discussing the rejection of this argument, it is significant to note that it is not Philo who rejects Demea’s “argument a priori” but Cleanthes; Philo simply sits back and lets the assault occur without his help. This is telling because Cleanthes is a theist, though, for Hume, one ultimately misguided about the success of the design argument. The implication, then, is that for Hume, even the philosophical theist who erroneously believes that natural theology can arrive at an informative conception of a deity should still reject the cosmological argument as indefensible.

Cleanthes’ rejection of the argument a priori is ultimately fourfold. The first problem he suggests is a Category Mistake involved in trying to show that the existence of God is something that can be known a priori. For Hume and for Cleanthes, claims about existence are matters of fact, and matters of fact can never be demonstrated a priori. The important distinction between relations of ideas and matters of fact is that the denial of the former is inconceivable, whereas the denial of the latter is not. Hume maintains that we can always imagine a being not existing without contradiction; hence, all existential claims are matters of fact. Cleanthes finds this argument “entirely decisive” and is “willing to rest the whole controversy upon it” (D 9.5), and it is a point Philo affirms in Part II. Hume argues similarly in the first Enquiry, maintaining that, “The non-existence of any being, without exception, is as clear and distinct an idea as its existence” (EHU 12.28). Hence, its denial is conceivable, and it must be a matter of fact.

A related objection is that, since, for Hume, we can always conceive of a being not existing, there can be nothing essential about its existence. It is therefore not the type of property that can be found in a thing’s essence. Hume’s Cleanthes goes so far as to imply that the appellation “necessary existence” actually has no “consistent” meaning and therefore cannot be used in a philosophically defensible argument.

Thirdly, there is the worry mentioned above of allowing the design inference. Even if the inference is correct and we must posit a causeless being, this does not imply that this being is the deity. The inference is only to a necessary being, and for Philo, it is at least as acceptable to posit the universe as necessary in this way rather than positing an extra entity above and beyond it. This is true whether we posit a necessary being in order to stop a designer regress as above, or if we posit it to explain the contingent beings in the universe.

Finally, Hume points to the dubiousness of the inference itself. A crucial premise of the argument a priori is that an infinite regress is impossible, because it violates the Principle of Sufficient Reason. However, Cleanthes takes issue with this claim. Imagine an infinitely long chain in which each event in that chain is explained through the previous members of the series. In this picture, every member of the series is explained, because for any given member, there is always a prior set of members that fully explains it; but if each member of the series has been explained, then the series itself has been explained. It is unnecessary and inappropriate to insist on an explanation of the series as a whole. For these reasons, Hume concludes that, “The existence, therefore, of any being can only be proved by arguments from its cause or its effect” (EHU 12.29).

7. The Problem of Evil

In addition to his refutations of the arguments of natural theology, Hume gives positive reasons for rejecting a theistic deity with the Problem of Evil. Hume holds that the evidence of the Problem of Evil counts much more significantly against the theist’s case than the other objections that he raises against a designer, and it is in this area that Philo claims to “triumph” over Cleanthes. Hume’s discussion of the Problem takes place mainly in Parts X and XI of the Dialogues. The discussion is quite thorough, and includes presentations of both the Logical Problem of Evil and the Evidential Problem of Evil. Philo also considers and ultimately rejects several general approaches to solutions.

In Part X, Demea becomes Philo’s unwitting accomplice in generating the Problem of Evil. The two join together to expound an eloquent presentation of moral and natural evil, but with different motives. Demea presents evil as an obstacle that can only be surmounted with the assistance of God. Religion becomes the only escape from this brutish existence. Philo, however, raises the old problem of Epicurus, that the existence of evil is incompatible with a morally perfect and omnipotent deity. Hence, in Part X, Philo defends a version of the logical Problem. Although Philo ultimately believes that, “Nothing can shake the solidity of this reasoning, so short, so clear, so decisive”, he is “contented to retire still from this entrenchment” and, for the sake of argument, is willing to “allow, that pain or misery in man is compatible with infinite power and goodness in the deity” (D 10.34-35, Hume’s emphasis). Philo does not believe that a solution to the logical Problem of Evil is possible but, by granting this concession, he shifts the discussion to the evidential Problem in Part XI.

Hume generally presents the evidential Problem of Evil in two ways: in terms of prior probability and in terms of the likelihood of gratuitous evil. Taking them in order, Demea first hypothesizes a stranger to this world who is dropped into it and shown its miseries. Philo continues along these lines with a similar example in which someone is first shown a house full of imperfections, and is then assured that each flaw prevents a more disastrous structural flaw. For Hume, the lesson of both examples is the same: just as the stranger to the world would be surprised to find that this world was created by a perfect being, the viewer of the house would be surprised to learn that its builder was considered a great or perfect architect. Philo asks, “Is the world considered in general…different from what a man…would, beforehand, expect from a very powerful, wise, and benevolent Deity?” (D 11.4, Hume’s emphasis). Since it would be surprising rather than expected, we have reason to think that a perfect creator is unlikely, and that the phenomena do not support such an inference. Moreover, pointing out that each flaw prevents a more disastrous problem does not improve matters, according to Philo.

Apart from these considerations from prior probability, Philo also argues for the likelihood of gratuitous evil. To this end, Philo presents four circumstances that account for most of the natural evil in the world. Briefly, these are a) the fact that pain is used as a motivation for action, b) that the world is conducted by general laws, c) that nature is frugal in giving powers, and d) that nature is “inaccurate,” that is, a given phenomenon, such as rain, can and does occur at more or less than its optimum level. As Philo presents these sources of evil during the discussion of the evidential Problem of Evil, his point must be interpreted accordingly. In presenting these sources, all Philo needs to show is that it is likely that at least one of these circumstances could be modified so as to produce less suffering. For instance, regarding the third circumstance, it seems that, were humans more resistant to hypothermia, this would lead to a slightly better world. In this way, Philo bolsters the likelihood of gratuitous evil by arguing that things could easily have been better than they are.

Having presented the Problem of Evil in these ways, Hume explicitly rejects some approaches to a solution while implicitly rejecting others. Demea first appeals to Skeptical Theism, positing a deity that is moral in ways that we cannot fathom, but Hume rebuffs this position in several ways. First, Cleanthes denies any appeal to divine mystery, insisting that we must be empiricists rather than speculative theologians. Second, Hume’s Cleanthes insists that, if we make God too wholly other, then we ultimately abandon religion. Hence, in Part XI Cleanthes presents the theist as trapped in a dilemma: either the theist anthropomorphizes the morality of the deity and, in doing so, is forced to confront the Problem of Evil, or he abandons human analogy and thereby “abandons all religion, and retain[s] no conception of the great object of our adoration” (D 11.1). For Cleanthes, if we cannot fathom the greatness of God, then the deity cannot be an object of praise, nor can we use God to inform some notion of morality. But without these interactions, there is little left for religion to strive toward. We might add a third rejection of the skeptical theist approach: to rationally reject the Problem of Evil without providing a theodicy, we must have independent grounds for positing a good deity. However, Hume has been quite systematic in his attempts to remove these other grounds, rejecting the design and cosmological arguments earlier in the Dialogues, rejecting miracles (and therefore divine revelation) in the Enquiry, and rejecting any pragmatic justification in many works by drawing out the harms of religion. Hence, for Hume, an appeal to divine mystery cannot satisfactorily discharge the Problem of Evil.

Turning to other solutions, Hume does not consider specific theodicies in the Dialogues. Instead, he seems to take the arguments from prior probability and the four circumstances as counting against most or all of them. Going back to the house example, Hume does not seem to think that pointing out that the flaws serve a purpose by preventing more disastrous consequences is sufficient to exonerate the builder. A perfect being should at least be able to reduce the number of flaws or the amount of suffering from its current state. Furthermore, recall that, in focusing on the empirical and in rejecting revealed texts, Hume would not accept any possible retreat to doctrine-specific theodicies such as appeals to the Fall Theodicy or the Satan Theodicy.

Given the amount of evil in the world, Philo ultimately holds that an indifferent deity best explains the universe. There is too much evil for a good deity, too much good for an evil deity, and too much regularity for multiple deities.

8. The Psychology of Religious Belief

Hume wrote the Dialogues roughly in tandem with another work, the Natural History. In its introduction, Hume posits that there are two types of inquiry to be made into religion: its foundations in reason and its origin in human nature. While the Dialogues investigates the former, the explicit task of the Natural History is to explore the latter. In the Natural History, he discharges the question of religion’s foundations in reason by gesturing at the design argument (and the interpretive puzzles discussed above regarding Hume’s views still apply) before focusing on his true task: how various passions give rise to vulgar or false religion.

According to Hume, all religion began as polytheism. This was due largely to an ignorance of nature and a tendency to assign agency to things. In barbarous times, we did not have the time or ability to contemplate nature as a whole, as uniform. On account of this, we did not understand natural causes generally. In the absence of such understanding, human nature is such that we tend to assign agency to effects, since that is the form of cause and effect with which we are most familiar. This has been well documented in children, who will, for instance, talk of a hammer wanting to pound nails. The tendency is especially strong regarding effects that seem to break regularity. Seeing two hundred pounds of meat seemingly moving in opposition to the laws of gravity is not a miracle, but simply a person walking. Primitive humans focused on these breaks in apparent regularity rather than on the regularity itself. While focusing on the latter would lead us to something like a design argument, focusing on the former brings about polytheism. Irregularity can be beneficial, such as a particularly bountiful crop, or detrimental, such as a drought. Thus, on his account, as we exercise our propensity to assign agency to irregularities, a variety of effects gives rise to a variety of anthropomorphized agents. We posit deities that help us and deities that oppose us.

Eventually, Hume says, polytheism gives way to monotheism not through reason, but through fear. In our obsequious praising of these deities, motivated by fear rather than admiration, we dare not assign them limitations, and it is from this fawning praise that we arrive at a single, infinite deity who is perfect in every way, thus transforming us into monotheists. Were this monotheism grounded in reason, its adherence would be stable. Since it is not, there is “flux and reflux,” an oscillation back and forth between anthropomorphized deities with human flaws and a perfect deity. This is because, as we get farther from anthropomorphism, we make our deity insensible to the point of mysticism. Indeed, as Hume’s Cleanthes points out, this is to destroy religion. Therefore, to maintain a relatable deity, we begin to once more anthropomorphize and, when taken too far, we once more arrive at vulgar anthropomorphic polytheism.

Hume insists that monotheism, while more reasonable than polytheism, is still generally practiced in the vulgar sense; that is, as a product of the passions rather than of reason. As he repeatedly insists, the corruption of the best things leads to the worst, and monotheism has two ugly forms which Hume calls “superstition” and “enthusiasm.” Discussed in both the Natural History and the essay “Of Superstition and Enthusiasm,” both of these corrupt forms of monotheism are grounded in inappropriate passions rather than in reason. If we believe that we have invisible enemies, agents who wish us harm, then we try to appease them with rituals, sacrifices, and so forth. This gives rise to priests who serve as intermediaries and petitioners for these invisible agents. This emphasis on fear and ritual is the hallmark of Hume’s “superstition,” of which the Catholicism of his day was his main example. Superstition arises from the combination of fear, melancholy, and ignorance.

Enthusiasm, on the other hand, comes from excessive adoration. In the throes of such obsequious praise, one feels a closeness to the deity, as if one were a divine favorite. The emphasis on perceived divine selection is the hallmark of Hume’s “enthusiasm,” a view Hume saddled to many forms of Protestantism of his day. Enthusiasm thereby arises from the combination of hope, pride, presumption, imagination, and ignorance.

In this way, Hume identifies four different forms of “false” or “vulgar” religion. The first is polytheism, which he sometimes calls “idolatry.” Then there are the vulgar monotheisms: superstition, enthusiasm, and mysticism. Though Hume does not explicitly call the last a vulgar religion, he does insist that it must be faith-based, and therefore does not have a proper grounding in reason. True religion, by contrast, supports the “principles of genuine theism,” and seems to consist mainly in assigning a deity as the source of nature’s regularity. Note that this entails that breaks in regularity, such as miracles, count against genuine theism rather than for it. In the Dialogues, Philo presents the essence of true religion as the claim “that the cause or causes of order in the universe probably bear some remote analogy to human intelligence” (D 12.33). This deity is stripped of the traits that make the design analogy weak, and is further stripped of human passions as, for Philo, it would be absurd to think that the deity has human emotions, especially a need to be praised. Cleanthes, however, supplements his version of true religion by adding that the deity is “perfectly good” (D 12.24). Because of this added moral component, Cleanthes sees religion as giving morality and order, a position that both Philo and Hume, in the Enquiry Concerning the Principles of Morals, deny. Instead, the true religion described by both Hume and Philo is independent of morality. As Yandell (1990: 29) points out, it does not superimpose new duties and motives onto the moral framework. True religion does not, therefore, affect morality, and does not lead to “pernicious consequences.” In fact, it does not seem to inform our actions at all. Because true religion cannot guide our actions, Philo holds that the dispute between theists and atheists is “merely verbal.”

9. The Harms of Religion

A historian by profession, Hume spent much effort in his writings examining religion in its less savory aspects. He deplored the Crusades, and saw Great Britain torn asunder on multiple occasions over the disputes between Catholicism and Protestantism. Based on these historical consequences, Hume saw enthusiasm as affecting society like a violent storm, doing massive damage quickly before petering out. Superstition, however, he saw as a more lingering corruption, involving the invasion of governments, and so forth. Hume argued that, because both belief systems are monotheistic, both must be intolerant by their very nature. They must reject all other deities and ways of appeasing those deities, unlike polytheism which, having no fixed dogma, sits lighter on men’s minds. Generally, Hume held that religion, especially popular monotheism, does more harm than good and he thereby develops a critique of religion based on its detrimental consequences.

Yandell (1990: 283) questions the methodology of such an attack. For him, it is not clear what religion’s socio-political consequences tell us about its truth. However, if we view Hume’s attack against religion as systematic, then consequence-based critiques fulfill a crucial role. Setting aside faith-based accounts, there seem to be three ways to justify one’s belief in religion: through revealed theology, through natural theology, or via pragmatic advantage. Hume denies revealed theology, as his argument against miracles, if successful, entails the unsustainability of most divine experiences and of revealed texts. The Dialogues are his magnum opus on natural theology, working to undermine the reasonability of religion and therefore the appeal to natural theology. If these Humean critiques are successful, then the only remaining path for justifying religious belief is from a practical standpoint, that we are somehow better off for having it or for believing it. Cleanthes argues this way in Part XII of the Dialogues, insisting that corrupt religion is better than no religion at all. However, if Hume is right that religion detracts from rather than contributes to morality, and that its consequences are overall negative, then Hume has closed off this avenue as well, leaving us nothing but faith, or perhaps human nature, on which to rest our beliefs.

10. References and Further Reading

Hume wrote all of his philosophical works in English, so there is no concern about the accuracy of an English translation. For the casual reader, any edition of his work should be sufficient. However, Oxford University Press has recently begun to produce the definitive Clarendon Edition of most of his works. For the serious scholar, these are a must-have, because they contain copious helpful notes about Hume’s changes between editions, and so forth. The general editor of the series is Tom L. Beauchamp.

a. Hume’s Works on Religion

  • Hume, David. A Treatise of Human Nature. Clarendon Press, Oxford, U.K., 2007, edited by David Fate Norton and Mary J. Norton. (T)
  • Hume, David. An Enquiry Concerning Human Understanding. Clarendon Press, Oxford, U.K., 2000, edited by Tom L. Beauchamp. (EHU)
  • Hume, David. An Enquiry Concerning the Principles of Morals. Reprinted in David Hume Enquiries, edited by L. A. Selby-Bigge, Third Edition, Clarendon Press, Oxford, U.K., 2002. (EPM)
  • Hume, David. Dialogues Concerning Natural Religion. In David Hume Dialogues and Natural History of Religion. Oxford University Press, New York, New York, 1993. (D)
  • Hume, David. Essays: Moral, Political, and Literary. Edited by Eugene F. Miller. Liberty Fund Inc., Indianapolis, Indiana, 1987. (ES)
  • Hume, David. Natural History of Religion. Reprinted in A Dissertation on the Passions, The Natural History of Religion, The Clarendon Edition of the Works of David Hume, Oxford University Press, 2007. (NHR)
  • Hume, David. New Letters of David Hume. Edited by Raymond Klibansky and Ernest C. Mossner. Oxford University Press, London, England, 1954. (NL)
  • Hume, David. The History of England. Liberty Classics, the Liberty Fund, Indianapolis, Indiana, 1983. (In six volumes) (H1-6)
  • Hume, David. The Letters of David Hume. Edited by J. Y. T. Greig, Oxford University Press, London, England, 1932. (In two volumes) (L1-2)

b. Works in the History of Philosophy

  • Broad, C. D. “Hume’s Theory of the Credibility of Miracles”, Proceedings of the Aristotelian Society, New Series, Volume 17 (1916-1917), pages 77-94.
    • This is one of the earliest contemporary analyses of Hume’s essay on miracles. It raises objections that have become standard difficulties, such as the circularity of the Caricature Argument and the seeming incompatibility of Hume’s strong notion of the laws of nature with his previous insights about causation.
  • Butler, Ronald J. “Natural Belief and Enigma in Hume,” Archiv für Geschichte der Philosophie, 1960, pages 73-100.
    • Butler is the first scholar to argue that religious belief, for Hume, is natural or instinctual. This would mean that, though adherence to a deity is not a product of reason, it may nevertheless be supported as doxastically appropriate. The argument itself has been roundly criticized due to problematic entailments, such as there being no atheists, but the originality of the idea makes the piece worth reading.
  • Coleman, Dorothy. “Baconian Probability and Hume’s Theory of Testimony.” Hume Studies, Volume 27, Number 2, November 2001, pages 195-226.
    • Coleman is an extremely careful, accurate, and charitable reader of Hume on miracles. She excels at clearing up misconceptions. In this article, she refocuses Hume’s argument from an anachronistic Pascalian/Bayesian model to a Baconian one, and argues that the “straight rule” of Earman and others is irrelevant to Hume, who insists that probability is only invoked when there has been a contrariety of phenomena.
  • Coleman, Dorothy. “Hume, Miracles, and Lotteries”. Hume Studies. Volume 14, Number 2, November 1988, pages 328-346.
    • Coleman is an extremely careful, accurate, and charitable reader of Hume on miracles. She excels at clearing up misconceptions. In this article, she responds to criticisms of Hambourger and others that Hume’s probability calculus in support of the miracles argument commits him to absurdities.
  • Earman, John. Hume’s Abject Failure—The Argument Against Miracles. Oxford University Press, New York, New York, 2000.
    • In this extremely critical work, Earman argues that the miracles argument fails on multiple levels, especially with regard to the “straight rule of induction.” The work is highly technical, interpreting Hume’s argument using contemporary probability theory.
  • Fogelin, Robert J. A Defense of Hume on Miracles. Princeton University Press, Princeton, New Jersey, 2003.
    • In this book, Fogelin takes on two tasks: reconstructing Hume’s argument of Part X, and defending it from the recent criticisms of Johnson and Earman. He provides a novel reading in which Part I sets epistemic standards of credulity while Part II shows that miracles fall short of this standard. The subsequent defense relies heavily on this reading, and largely stands or falls based on how persuasive the reader finds Fogelin’s interpretation.
  • Garrett, Don. Cognition and Commitment in Hume’s Philosophy. Oxford University Press. New York, New York, 1997.
    • This is a great introduction to some of the central issues of Hume’s work. Garrett surveys the various positions on each of ten contentious issues in Hume scholarship, including the miracles argument, before giving his own take.
  • Gaskin, J.C.A. Hume’s Philosophy of Religion—Second Edition. Palgrave-MacMillan, 1988.
    • This is perhaps the best work on Hume’s philosophy of religion to date on account of both its scope and careful analysis. This work is one of only a few to provide an in-depth treatment of the majority of Hume’s writings on religion rather than focusing on one work. Though points of disagreement were voiced above, this should not detract from the overall caliber of Gaskin’s analysis, which is overall fair, careful, and charitable. The second edition is recommended because, in addition to many small improvements, there are significant revisions involving Philo’s Reversal.
  • Geisler, Norman L. “Miracles and the Modern Mind”, in In Defense of Miracles—A Comprehensive Case for God’s Action in History, edited by Douglas Geivett and Gary R. Habermas, InterVarsity Press, Downers Grove, Illinois, 1997, pages 73-85.
    • In this article, Geisler raises an important worry that Hume cannot draw a principled distinction between the miraculous and the merely marvelous. If this is the case, then Hume must reject the marvelous as well, which would have the disastrous consequence of stagnating science.
  • Hambourger, Robert. “Belief in Miracles and Hume’s Essay.” Noûs, Volume 14, November 1980, pages 587-604.
    • In this essay, Hambourger lays out a problem known as the lottery paradox, in which he tries to show that a commitment to Humean probabilistic doxastic assent leads to counterintuitive consequences.
  • Holden, Thomas. Spectres of False Divinity. Oxford University Press, Oxford, U.K., 2010.
    • In this careful work, Holden argues that Hume goes beyond mere skepticism to “moral atheism,” the view that the deity cannot have moral attributes. He gives a valid argument supporting this and shows how Hume supports each premise, drawing on a wide variety of texts.
  • Huxley, Thomas Henry. Hume. Edited by John Morley, Dodo Press, U.K., 1879.
    • Huxley is an early commentator on Hume, and this work is the first to raise several worries with Hume’s miracles argument.
  • Johnson, David. Hume, Holism, and Miracles. Cornell University Press, Ithaca, New York, 1999.
    • This is another recent critique of Hume’s account of miracles. Johnson’s work is more accessible than Earman’s, and it is novel in the sense that it addresses several different historical and contemporary reconstructions of Hume’s argument.
  • Kemp Smith, Norman. (ed.) Dialogues Concerning Natural Religion. The Bobbs-Merrill Company, Inc., Indianapolis, Indiana, 1947.
    • In Kemp Smith’s edition of Hume’s Dialogues, he provides extensive interpretation and commentary, including his argument that Hume is represented entirely by Philo and that seeming evidence to the contrary merely serves to build stylistic “dramatic balance.”
  • Levine, Michael. Hume and the Problem of Miracles: A Solution. Kluwer Academic Publishers, Dordrecht, Netherlands, 1989.
    • Levine argues that Hume’s miracles argument cannot be read independently of his treatment of causation, and that the two are inconsistent. Nevertheless, a Humean argument can be made against belief in the miraculous.
  • Livingston, Donald W. Hume’s Philosophy of Common Life. University of Chicago Press, Chicago, Illinois, 1984.
    • This is one of the standard explications of Humean causal realism. It stresses Hume’s position that philosophy should conform to and explain common beliefs rather than conflict with them. It is included here because, in the course of his project, Livingston includes a helpful discussion of Humean laws of nature.
  • Paley, William. A View of the Evidences of Christianity, in The Works of William Paley, Edinburgh, 1830.
    • Paley is the first to attribute the Caricature Argument to Hume.
  • Pike, Nelson. Dialogues Concerning Natural Religion, Bobbs-Merrill Company Inc., Indianapolis, IN, 1970.
    • In Pike’s edition of Hume’s Dialogues, he provides extensive interpretation and commentary, as well as a text-based critique of Kemp Smith’s position.
  • Penelhum, Terence. “Natural Belief and Religious Belief in Hume’s Philosophy.” The Philosophical Quarterly, Volume 33, Number 131, 1983.
    • Penelhum previously offered a careful argument that some form of religious belief, for Hume, is natural. However, unlike Butler, he is not committed to the view that religious beliefs are irresistible and necessary for daily life. In this more recent work, he confronts some difficulties with the view and updates his position.
  • Swinburne, Richard. The Concept of Miracle. Macmillan, St. Martin’s Press, London, U.K., 1970.
    • Though Swinburne is generally critical of Hume’s position, he is a careful and astute reader. In this general defense of miracles, his reconstruction and critique of Hume is enlightening.
  • Tweyman, Stanley. “Scepticism and Belief in Hume’s Dialogues Concerning Natural Religion.” International Archives of the History of Ideas, Martinus Nijhoff Publishers, 1986.
    • Tweyman presents a holistic reading of the Dialogues, starting with a dogmatic Cleanthes who is slowly exposed to skeptical doubt, a doubt that must ultimately be corrected by the common life. Tweyman ultimately argues that belief in a designer is natural for Hume.
  • Wieand, Jeffery. “Pamphilus in Hume’s Dialogues”, The Journal of Religion, Volume 65, Number 1, January 1985, pages 33-45.
    • Wieand is one of the few recent scholars that argues against Hume as Philo and for a Hume as Cleanthes/Pamphilus view. This interpretation focuses largely on the role of the narrator and Pamphilus’ discussion about the dialogue form.
  • Yandell, Keith E. Hume’s “Inexplicable Mystery”—His Views on Religion. Temple University Press, Philadelphia, Pennsylvania, 1990.
    • Apart from Gaskin, Yandell’s work is the only other major comprehensive survey of Hume on religion. The work is highly technical and highly critical, and is sometimes more critical than accurate. However, he at least provides the general form of some theistic responses to Hume and identifies a few important lapses on Hume’s part, such as a lack of response to religious experience.
  • Yoder, Timothy S. Hume on God. Continuum International Publishing, New York, New York, 2008.
    • Yoder’s text is an extended argument, defending Hume’s “amoral theism”. He makes important contributions in his treatment of false/vulgar religion, the background for English deism, and Hume’s use of irony.

 

Author Information

C. M. Lorkowski
Email: clorkows@kent.edu
Kent State University- Trumbull Campus
U. S. A.

Rights and Obligations of Parents

Historically, philosophers have had relatively little to say about the family. This is somewhat surprising, given the pervasive presence and influence of the family upon both individuals and social life. Most philosophers who have addressed issues related to the parent-child relationship—Kant and Aristotle, for example—have done so in a fairly terse manner. At the end of the twentieth century, this changed. Contemporary philosophers have begun to explore, in a substantial way, a range of issues connected with the rights and obligations of parents. For example, if there are parental rights, what is their foundation? Most contemporary philosophers reject the notion that children are their parents’ property and thus reject the notion that parents have rights to and over their children. Some philosophers argue for a biological basis of parental rights, while others focus on the best interests of children or a social contract as the grounds of such rights. Still others reject outright the notion that parents have rights, as parents. Some do so because of skepticism about the structure of the putative rights of parents, while others reject the idea of parental rights in view of the nature and extent of the rights of children.

The claim that parents have obligations, as parents, is less controversial. Nevertheless, there is disagreement about the basis of such obligations. Apart from biological, best interests, and social contract views, there is also the causal view of parental obligations, which includes the claim that those who bring a child into existence are thereby obligated to care for that child. Philosophers are concerned not merely with these theoretical questions related to parental rights and obligations; they also focus their attention on practical questions in this realm of human life. There are many distinct positions to consider with respect to medical decision making, the autonomy of children, child discipline, the licensing of parents, and the propriety of different forms of moral, political, and religious upbringing of children. While both the theoretical and practical aspects of the rights and obligations of parents are receiving increased attention, there remains much room for substantial work to be done on this important topic.

Table of Contents

  1. Introduction
  2. Philosophical Accounts of Parental Rights and Obligations
    1. Proprietarianism
    2. Biology
    3. Best Interests of the Child
    4. Constructionism
    5. Causation
    6. Fundamental Interests of Parents and Children
  3. Skepticism about Parental Rights and Obligations
    1. Children’s Liberation
    2. The Myth of Parental Rights
  4. Applied Parental Ethics
    1. Parental Licensing
    2. The Child’s Right to an Open Future
    3. Medical Decision Making
    4. Disciplining Children
    5. The Religious Upbringing of Children
    6. Parental Love
  5. References and Further Reading

1. Introduction

What is a parent? The answer one gives to this question will likely include, either implicitly or explicitly, particular assumptions about the grounds of parental rights and obligations. Parenthood and biological parenthood are often seen as synonymous. But of course, adoptive parents are also parents by virtue of assuming the parental role. This commonsense fact opens the door for a consideration not only of the possible connections between biology and parenthood, but other issues as well, such as the role of consent in acquiring parental rights and obligations, which then leads to a host of other questions that are not only theoretically important, but existentially significant as well. What does it mean for a parent to possess rights, as a parent? Why think that such rights exist? What obligations do parents have to their children? What is the role of the state, if any, concerning the parent-child relationship? These questions are central for our understanding of the moral, social, personal, and political dimensions of the parent-child relationship.

2. Philosophical Accounts of Parental Rights and Obligations

When considering the rights of parents, both positive and negative rights are involved. A negative right is a right of non-interference, such as the right to make medical decisions on behalf of one’s child without intervention from the state. A positive right in this context is a right to have the relevant interests one has as a parent in some way promoted by the state. For example, some argue that parents have a right to maternity and paternity leave, funded in part or whole by the state. Regarding parental obligations, the focus in what follows will be on moral obligations, rather than legal ones, with a few exceptions. A parent might have a moral obligation to her child to provide her with experiences such as musical education or opportunities to participate in sports that enrich her life, without being legally bound to do so. In this section, the various accounts of the grounds of the moral rights and obligations of parents will be discussed.

a. Proprietarianism

An advocate of proprietarianism holds that children are the property of their parents, and that this serves to ground parental rights (and perhaps obligations). Proprietarianists argue that, because parents in some sense produce their children, children are their parents’ property in some sense of the term. Aristotle held this type of view, insofar as he takes children and slaves to be property of the father (Nicomachean Ethics, 1134b). At least one contemporary philosopher, Jan Narveson, has argued that children are the property of their parents, and that this grounds parental rights. This does not relieve parents of obligations regarding their children, even though children do not yet possess rights (Narveson 1988). For Narveson, how parents treat their children is limited by how that treatment impacts other rights-holders. Nevertheless, parents have the right to direct the lives of their children, because they exerted themselves as producers, bringing children into existence. A different sort of proprietarianism centers on the idea that parents own themselves, including their genetic material; since children are a product of that material, it follows that parents have rights over their genetic offspring. Critics of proprietarianism primarily reject it on the grounds that it is immoral to conceive of children as property. Children are human beings, and as such, cannot rightly be owned by other human beings. It follows from this that children are not the property of their parents. Most contemporary philosophers reject proprietarianism.

Historically, proprietarianism is often connected with absolutism, the idea that parental authority over children is, in an important sense, limitless. Absolutists held that fathers have the right to decide whether their child lives or dies. This view is no longer advocated in the contemporary philosophical literature, of course, but in the past some thought that this extreme level of parental authority was morally justified. Some advocates of this view thought that because a child is the creation of the parent, absolutism follows. Other reasons offered in support of this view include the notion that both divine and natural law grant such authority to parents; that this level of authority fosters moral development in the young by preventing them from exemplifying vice; and that the family is a model of the commonwealth, such that as children obey their father, they will also learn to obey the commonwealth (Bodin 1576/1967). According to Bodin, the natural affection that fathers have towards their children will prevent them from abusing their authority. Critics of absolutism reject it for reasons similar to those offered against proprietarianism. They claim that it is clearly immoral to grant parents the power to end the lives of their children. While some absolutists seek to ground this power in the fact that the parent created the child in question, critics argue that the possession and exercise of this power over one’s children simply does not follow from the fact that one created those children.

b. Biology

Is a biological relationship between a parent and child necessary or sufficient for parenthood? That is, does biology in some sense ground the rights and obligations of parents? Two types of biological accounts of parenthood have emerged which are more detailed than those which emphasize the general value of biology in the parent-child relationship. Advocates of the first type emphasize the genetic connection between parent and child, while advocates of the second take gestation to be crucial. The advocates of the genetic account believe that the genetic connection between parent and child grounds parenthood. The fact that a particular child is derived from the genetic material of an individual or is “tied by blood” to that individual is what yields parental rights and obligations. A person has rights and obligations with respect to a particular child insofar as that person and the child share the requisite DNA. Historically speaking, perceived blood ties have been decisive in the transfer of wealth, property, and power from one generation to the next.

Critics of genetic accounts claim that several of the arguments advanced for these accounts are flawed in important ways. For instance, those who hold that the genetic connection is necessary for parental rights and obligations must deal with counterexamples to the claim, such as adoptive parenthood and step-parenthood. In addition, if two adults who are identical twins have the same level of genetic connection to a child it does not follow that both are that child’s mother or father, though at least some genetic accounts would seem committed to such a view.

Gestational accounts of parental rights and obligations, in their strongest form, include the claim that gestation is necessary for parental rights. On this view, men only acquire parental rights and obligations via marriage, the gestational mother consenting to co-parenthood with the male, or the mother allowing him to adopt her child. Some gestational accounts—including those which only include the claim that gestation is sufficient for parental rights or gives the mother a prima facie claim to such rights—focus on the risk, effort, and discomfort that gestational mothers undergo as that which grounds their claims to parenthood. Others center on the intimacy that obtains and the attachment which occurs during gestation between the mother and child as the basis for a claim to parenthood. A final type of gestationalism is consequentialist, insofar as advocates of this view hold that when there is a conflict concerning custody between gestational and genetic mothers, a social and legal policy favoring gestational mothers will have more favorable consequences for mothers and their children. It is argued that an emphasis on gestation, and preference for gestational mothers in such cases, would increase women’s social standing by emphasizing their freedom to make such choices concerning health on behalf of themselves and their children. This in turn will have the likely result of benefitting the health and welfare of such mothers and their children. Positive inducements are preferable to punitive sanctions, given the positive consequences of the former. This view also implies that the claims to parenthood of gestational mothers carry more weight than those of fathers, at least when disputes over custody arise.

Critics of gestationalism reply that it is objectionably counterintuitive, insofar as it is inconsistent with the belief that mothers and fathers have equal rights and obligations regarding their children. Many of the goods available to individuals via parenthood, including intimacy, meaning, and satisfaction that can be obtained or acquired in the parent-child relationship, are equally available to both mothers and fathers. This equality of parental interests, then, is thought to justify the conclusion that the presumptive claims to parenthood on the parts of mothers and fathers are equal in weight.

There is a more general issue concerning the relationship between biology and parenthood, which has to do with the value of biological connections in the parent-child relationship. A particularly strong view concerning the relationship between biology and parenthood is that biology is essential to the value of parenthood for human beings (Page 1984). On this view, there is a necessary connection between biology and parental rights. The entire process of creating, bearing, and rearing a child is thought to be a single process which is valuable to parents insofar as they seek to create a person who in some sense reflects a part of themselves. The aim is to create someone else in the image of the parent. This is why being a parent has value for us; it is why we desire it. In reply, it has been argued that while biology may have value for many people with respect to the parent-child relationship, a biological connection is neither necessary nor sufficient for parental rights and obligations. Rather, the more valuable aspects of the parent-child relationship are personal, social, and moral. It has been argued that biological ties between parents and children are morally significant in other ways (Velleman 2005). Some believe that children have families in the most important sense of the term if they will be raised by parents who want them, love them, and desire what is best for them, regardless of whether a biological connection exists. The lack of such a connection does little harm to children in such families. Against this, Velleman argues that knowledge of one’s biological relatives, especially one’s parents, is crucial because the self-knowledge one gains from knowing them is central for forging a meaningful human life. Lack of such knowledge, then, is harmful to children. In reply, it has been argued that knowledge of one’s biological progenitors is unnecessary for self-knowledge and for having and leading a good life (Haslanger 2009).

c. Best Interests of the Child

According to this account of parenthood, children ought to be raised by a parent or parents who will best serve their interests. On this account, parental rights are grounded in the ability of parents to provide the best possible context for childrearing. While the best interests criterion of parenthood is useful in cases of conflicting claims to custody in the context of divorce or in situations where child abuse and neglect are present, several criticisms have emerged with respect to its application as the fundamental grounding of parental rights and obligations. One criticism of this view is that it fails to sufficiently take into account the interests of parents, which leads to potential counterexamples. For instance, consider a case in which it is in the best interests of a child to be raised by an aunt or uncle, rather than the child’s biological or custodial parents, when the current parents are fit and fulfilling their obligations to the child in question. Removing the child from the custody of those parents solely on the basis of the comparative superiority of others seems problematic to many. Moreover, this account may entail that the state should remove newborns from the custody of their parents, if they are poor, and transfer parental rights to someone who has greater financial stability, all else being equal. For critics of the best interests account, this is deeply counterintuitive and is sufficient for rejecting this account of parenthood.

Perhaps the account can be modified to deal with such criticisms. The modified account need not entail that a child should be removed from the custody of its natural parents and given to better caretakers, who then possess parental rights with respect to that child, even if those caretakers share the child’s nationality, ethnicity, and social origins. This is because it is in the best interests of the child to maintain her developing self-identity and provide her with a stable environment. Still, a primary objection to all best-interests accounts is that they fail to take into account, in an adequate manner, the relevant interests of a child’s current parents. The point is not that parental interests trump the interests of the child, but rather that best-interests accounts fail to weigh parental interests in a proper manner.

d. Constructionism

Some philosophers argue that the rights and obligations of parenthood are not grounded in biology or a natural relationship between parents and their offspring. Rather, they hold that the rights and obligations of parents are social constructs. One form of this view includes the claim that parenthood is a type of social contract. Advocates of such a view argue that the rights and responsibilities of parenthood arise from a social agreement between the prospective parent and the moral community (such as the state) that appoints the prospective parent to be the actual parent. In some cases, social contract accounts emphasize causation (see section e. below) as a way in which individuals may implicitly consent to taking on the rights and responsibilities of parenthood. Contractual and causal accounts can come apart, however, and be treated separately. It has also been argued that social conventions have priority over biological ties when determining who will raise a child. On this view, in social contexts where biological parents generally have the duty to raise their offspring, individual responsibility for children is produced by the choice to undertake the duties of raising a child, whether by deciding to procreate or by deciding not to avoid parental obligations via abortion or adoption.

Others who take parenthood to be a social construct emphasize the individual choice to undertake the rights and responsibilities of parenthood with respect to a particular child. This way of incurring special obligations is familiar. For instance, an employer takes on special obligations to another when that person becomes her employee. Spouses take on special obligations to one another and acquire certain rights with respect to each other via marriage. In these and many other instances, one acquires particular rights and obligations by choice, or voluntary consent. Similarly, then, when an individual voluntarily undertakes the parental role, that individual acquires parental rights and obligations. This can happen via intentional procreation, adoption, and step-parenthood.

Critics of constructionism argue that advocates of this view fail to appreciate certain facts of human nature related to the interests of children. Many constructionists, according to their critics, tend to weigh the interests of adults more heavily than those of the relevant children. They maintain that children have deep and abiding interests in being raised by their biological progenitors, or at least having significant relationships with them. Intentionally creating children who will lack such connections seems problematic, and some critics are especially concerned about intentionally creating children who will lack either a custodial mother or father. Other versions of constructionism are not vulnerable to this critique, insofar as they include the claim that children’s interests and in some cases rights are at least equally important relative to the rights and interests of adults.

Related to the use of reproductive technology, the creation of a child by gamete donors is thought by some to be immoral or at least morally problematic because such donors often fail to take their obligations to their genetic offspring seriously enough when they transfer them to the child’s custodial parents. Given that parental obligations include not only minimal care but also seeking to care for children in deeper ways which foster their flourishing, the claim is that in such cases donors do not take their obligations as seriously as is warranted. Constructionists reply that as long as the custodial parents nurture and provide sufficient care for children, the biological connections as well as the presence of both a mother and father are at least relatively, if not entirely, insignificant. In order to resolve these issues, both philosophical argumentation and empirical data are important.

e. Causation

Most, if not all, contemporary philosophers who defend a causal account of parenthood focus on parental obligations rather than rights. Simply stated, the claim is that individuals have special obligations to those offspring which they cause to come into existence. Defenders of the causal account argue that genetic and gestational parents incur moral obligations to their offspring in virtue of their causal role concerning the existence of the children in question. In many cases, of course, the causal parents of a child would incur obligations because they voluntarily consent to take on such when they choose to have a child. Defenders of the causal account often focus on cases in which procreation is not intentional, in order to isolate the causal role as being sufficient for the generation of parental obligations.

Advocates of the causal account set aside cases such as rape, where coercion is present. They maintain that in other important cases one can incur obligations to offspring, even if one does not intend to procreate or consent to take on such obligations. The general idea is that when a person voluntarily engages in a behavior which can produce reasonably foreseeable consequences, and the agent is a proximate and primary cause of those consequences, then it follows that the agent has obligations with respect to those consequences. In the case of procreation, the child needs care. To fail to provide it is to allow harmful consequences to obtain. Since the agent is causally responsible for the existence of a child in need of care, the agent is morally responsible for providing that care. This is similar to other situations in which an agent is causally responsible for harm or potential harm and is thereby thought to also bear moral responsibility relative to that harm. For instance, if a person damages his neighbor’s property via some action, then that person thereby incurs the moral responsibility to compensate his neighbor for that damage. By parity of reasoning, defenders of the causal account of parental obligations argue that causal responsibility for the existence of a child—when coercion is not present—entails moral responsibility with respect to preventing the child’s experiencing various kinds of suffering and harm.

The heart of the disagreement between proponents of the causal account and their critics is whether or not the voluntary acceptance of the special obligations of parenthood is necessary for incurring those obligations. Critics of the causal account argue that it is difficult to isolate parents as those who bear causal responsibility for a child’s existence, given the causal roles others play (such as medical practitioners). Given the variety of individuals that are causally connected to the existence of a particular child, the connections between causal responsibility and moral responsibility in this particular realm of life are unclear. A defense of the causal account against this objection includes the claim that the interests of children are in play here and deeply connected with the causal parents and not medical practitioners. This may be a hybrid account however, coupling causation with an interests-based account of parental obligation, which is the focus of the next section.

f. Fundamental Interests of Parents and Children

This view of parenthood focuses on fundamental interests—those which are crucial for human flourishing—as the grounds for the rights and obligations of parents. The general picture is a familiar one in which such interests generate correlative rights and obligations. In the parent-child relationship, there are several such interests in play, including psychological well-being, the forging and maintenance of intimate relationships, and the freedom to pursue that which brings satisfaction and meaning to life. The interests of children connected with their custodial parents are numerous and significant. If a child receives caring, intimate, and focused attention from a parent, this can help her to become an autonomous agent capable of pursuing and enjoying intimate relationships and psychological and emotional health. It can also contribute to her having the ability to create and pursue valuable ends in life. The lack of such attention and care often has very detrimental effects on the development and life prospects of a child. These interests are thought to generate the obligations of parenthood.

How is it that these interests are thought to generate parental rights? Parents can experience meaning and satisfaction in life via the various actions related to parenting, as they offer care, guidance, and knowledge to their children. By playing a role in satisfying the fundamental interests of their children, parents have many of their own interests satisfied, including the ones mentioned above: psychological well-being, the forging and maintenance of intimate relationships, and experiencing satisfaction with and meaning in life. On interests-based accounts of parental rights, the satisfaction of the relevant interests often requires that the parent-child relationship be relatively free from intrusion. If the state exercises excessive control in this realm of human life, the parent becomes a mediator of the will of the state and many of the goods of parenthood are then lost. The parent is not making as significant a personal contribution to the well-being of her child as she might otherwise be able to do, and so is not able to achieve some of the goods that more autonomous parenting makes possible, including intimacy in the parent-child relationship. There are certainly cases in which intrusion is warranted, such as instances of abuse and neglect, but in these types of cases there is no longer a genuine intimacy present to be threatened, given that abuse blocks relational intimacy. Finally, defenders of this view of parenthood conclude that if children need parental guidance and individualized attention based on an intimate knowledge of their preferences and dispositions, then the state has an interest in refraining from interfering in that relationship until overriding conditions obtain. Parents have rights, as parents, to this conditional freedom from intrusion.

3. Skepticism about Parental Rights and Obligations

a. Children’s Liberation

Advocates of children’s liberation hold that parents should have no rights over children because such paternalistic control is an unjustified inequality; it is both unnecessary and immoral. Those who support children’s liberation argue that children should possess the same legal and moral status as adults. This entails that children should be granted the same rights and freedoms that adults possess, such as self-determination, voting, and sexual autonomy, as well as the freedom to select guardians other than their parents. While advocates of liberationism disagree on the particular rights that children should be granted, they agree that the status quo regarding paternalism with respect to children is unjust. Clearly such a view is a challenge to the legal and moral status of parents. One argument in favor of this view focuses on the consistency problem. If rights are grounded in the possession of certain capacities, then it follows that when an individual has the relevant capacities—such as autonomy—then that individual should possess the rights in question. Consistency may require either denying certain rights to particular adults who do not possess the relevant capacities in order to preserve paternalistic control of children, or granting full human rights to particular children who possess the relevant capacities. Alternatively, it has been suggested that children should be granted all of the rights possessed by adults, even if they do not yet possess the relevant capacities (Cohen 1980). Rather than being left to themselves to exercise those rights, children could borrow the capacities they lack from others who are obligated to help them secure their rights and who possess the relevant capacities. Once children actualize these capacities, they may then act as agents on their own behalf. The upshot is that a difference in capacities does not justify denying rights to children.

Critics of children’s liberation argue that paternalistic treatment of children enables them to develop their capacities and become autonomous adults with the attendant moral and legal status. They also worry that in a society in which children are liberated in this way, many will forego education and other goods which are conducive to and sometimes necessary for their long-term welfare. It has also been suggested that limiting children’s right of self-determination fosters their development and protects them from exploitative employment. Granting equal rights to children might also prevent parents from providing the moral training children need, and cause adolescents to be even less likely to consider seriously the guidance offered by their parents. In addition, critics point out that autonomy is not the only relevant issue with respect to granting equal rights to children. The capacity for moral behavior is also important, and should be taken into account given the facts of moral development related to childhood. Finally, if a child possesses the relevant actualized capacities, then perhaps theoretical consistency requires that she be granted the same moral and legal status accorded to adults. However, the critic of children’s liberation may hold that this is simply a case where theory and practice cannot coincide due to the practical barriers in attempting to bring the two together. Perhaps the best way in which to bring theory and practice together is to emphasize the moral obligations of parents to respect the developed and developing autonomy and moral capacities of their children.

b. The Myth of Parental Rights

It has been argued that parents do not possess even a qualified or conditional moral right to impact the lives of their children in significant ways (Montague 2000). The reason that Montague rejects the notion of parental rights is that such rights lack two essential components of moral rights. First, moral rights are oriented towards their possessors. Second, moral rights have a discretionary character. Since the putative rights of parents have neither of these features, such rights should be rejected. If there were parental rights, their function would be to protect either the interests that parents have or the choices they make regarding the parent-child relationship. The problem for the proponent of parental rights is that no other right shares a particular feature of such rights, namely, that the relevant set of interests or autonomy is only worth protecting because of the value of protecting the interests or autonomy of others. Moreover, Montague argues that parental rights to care for children are in tension with parental obligations to do so. The notion of parental rights is in tension with the fact that parents are obligated to protect their children’s interests and assist them in the process of developing into autonomous individuals. Practically speaking, an emphasis on parental rights focuses on what is good for parents, while a focus on parental obligations emphasizes the well-being of children. He concludes that we have strong reasons for rejecting the notion that parents have a right to impact, in a significant way, the lives of their children. So, the view is that parental rights are incompatible with parental obligations. Parents have discretion regarding how to fulfill their obligations, but they do not have such discretion regarding whether to do so. If there were parental rights, parents would have discretion regarding whether to protect and promote the interests of their children, and this is unacceptable. 
In reply, one critic of Montague’s argument has pointed out that while it is true that parents do not have discretion regarding what counts as fulfilling their obligations towards their children, they nevertheless have discretion regarding how to do so, and perhaps this is sufficient for thinking that there are some parental rights (Austin 2007).

4. Applied Parental Ethics

While the vast majority of philosophers agree that children have at least some rights—such as the right to life, for example—the extent of those rights and how they relate to the rights and obligations of parents is an issue that generates much controversy. The existence and extent of parental rights, the rights of children, and the relevant interests of the state all come together when one considers issues in applied parental ethics. The theoretical conception of rights one holds as well as one’s view of the comparative strength of those rights will often inform what one takes to be the personal, social, and public policy implications with respect to these issues.

a. Parental Licensing

Hugh LaFollette’s defense of the claim that the state should license parents is perhaps the most influential and widely discussed version of the philosophical argument in favor of parental licensing (LaFollette 1980).  LaFollette argues that if an activity (i) is potentially harmful to others, (ii) requires a certain level of competence, and (iii) that competence can be demonstrated via a reliable test, then the activity in question should be regulated by the state. These criteria justify current licensing programs. For instance, we require that physicians obtain medical licenses from the state to ensure their competency due to the potential harm caused by medical malpractice. In order to drive an automobile, a level of skill must be demonstrated because of the potential harm to others that can be done by incompetent drivers. These criteria also apply to parenting. It is clear that parents can harm their children through abuse, neglect, and lack of love, which often results in physical and psychological trauma. Children who suffer such harms may become adults who are neither well-adjusted nor happy, which can lead to cyclical patterns of abuse and other negative social consequences. Parenting also requires a certain competency that many people lack due to temperament, ignorance, lack of energy, and psychological instability. LaFollette believes that we can create a moderately reliable psychological test that will identify those individuals who will likely abuse or neglect their children. At the time of his paper, such tests were just beginning to be formulated. Since then, however, accurate parenting tests have been developed which could serve as useful tools for identifying individuals who are likely to be extremely bad parents (McFall 2009). Given that parenting is potentially harmful and requires competence that can be demonstrated via a reliable test, by parity of reasoning the state should also require licenses for parents.
Moreover, given that we screen adoptive parents and require that they demonstrate a level of competence before they are allowed to adopt a child in order to reduce the chances of abuse or neglect, there is no compelling reason not to require the same of biological parents. The aim of parental licensing is not to pick out parents who will be very good, but rather to screen those who will likely be very bad by abusing or neglecting their children. The intent is to prevent serious harm to children, as well as the harms others suffer because of the social impact of child abuse. LaFollette concludes that since a state program for licensing parents is desirable, justifiable, and feasible, it follows that we should implement such a program.

Critics argue that there are both theoretical and practical problems with such proposals. Some worry about cases where a woman is pregnant before acquiring a license and fails to obtain one before giving birth. The picture of the state removing a newborn infant in such cases and transferring custody to suitable adoptive parents is problematic because no abuse or neglect has yet occurred. A variety of alternatives, including less invasive licensing as well as non-licensing alternatives, have been proposed. LaFollette himself puts forth the possibility that instead of prohibiting unlicensed parents from raising children, the state could offer tax incentives for licensed parents and other types of interventions, such as scrutiny by protective services of unlicensed parents, on the condition that such measures would provide adequate protection for children. Others have proposed different requirements for a parental license, with both fewer and greater restrictions than those proposed by LaFollette. These include minimum and maximum age requirements, mandatory parenting education, signing a contract in which a parent agrees to care for and not maltreat his or her child (so that if a child is maltreated, removal of the child would be based on a breach of contract rather than criminal liability), financial requirements, and cognitive requirements. Others argue for alternatives to licensing, such as mandatory birth control, extended (and perhaps paid) maternity and paternity leave, and universal daycare provided by the government.

Finally, some argue that legally mandated family monitoring and counseling is preferable to a program of licensing parents because it better accounts for the interests people have in becoming and being parents and the welfare of children. It is also claimed to be preferable to licensing because it avoids the possible injustices that may occur given the fallibility of any test aimed at predicting human behavior. If people who are or will soon be parents can develop as parents, it is better to give them the opportunity to do so under close supervision, monitoring, and counseling, allowing them to be with their children when they are young and a significant amount of bonding occurs. This practice would protect the interests of children, society, and parents. For those parents whose incompetence is severe or who fail to deal with their incompetence in a satisfactory manner, the monitoring/counseling proposal rightly prevents them from raising children, according to advocates of this approach.

b. The Child’s Right to an Open Future

A significant concept shaping much of the debate concerning the ethics of childrearing is that of the child’s right to an open future (Feinberg 1980). According to this argument, children have a right to have their options kept open until they become autonomous and are able to decide among those options for themselves, according to their own preferences. Parents violate the child’s right to an open future when they ensure that certain options will be closed to the child when she becomes an autonomous adult. For example, a parent who is overly directive concerning the religious views of her child, or who somehow limits the career choices of her child, is violating this right. When parents violate this right, they are violating the autonomy rights of the adult that the child will become. According to Feinberg, parents are obligated to offer their children as much education as is feasible, as this will enable them to choose from a maximally broad range of potential life options upon reaching adulthood. When parents do engage in more directive parenting, they should do so in the preferred directions of the child, or at least not counter to those preferences. In this way, parents respect the preferences and autonomy of their children, allowing them to exercise their rights in making significant choices in life that are in line with their own natural preferences.

One direct criticism of Feinberg’s view includes the observation that steering one’s child toward particular options in the context of parenthood is unavoidable (Mills 2003). According to Mills, there are three options relative to the future which parents may choose from as they determine how directive they ought to be. First, as Feinberg claims, parents may provide their children with a maximally open future. Second, parents may direct their children toward a future which the parents value and endorse. Third, parents may opt for a compromise between these two options. Whether or not one considers some particular set of options to be open is connected to one’s perspective. Given this, one’s judgment concerning whether or not a particular child has an open future is also connected to that perspective. For instance, someone outside of the Amish community would likely contend that children in that community do not have an open future; by virtue of being Amish, careers in medicine, science, and technology are closed to such children. Yet from an Amish perspective, children have a variety of options including farming, blacksmithing, woodworking, etc. Rather than speaking of an option as open or closed, Mills argues that we should think of options as encouraged, discouraged, fostered, or inhibited. Practically speaking, in order to encourage a child toward or away from some option in life, other options must be closed down.  Finally, Mills criticizes Feinberg’s view on the grounds that it places more value on the future life of the child, rather than the present.

c. Medical Decision Making

Many are concerned about state intervention in medical decision making as it is performed by parents on behalf of their children. Most would agree that the interests of all relevant parties, including children, parents, and the state, must be taken into account when making medical decisions on behalf of children. The worry is that state intrusion into this arena is an improper invasion of family privacy. And yet among those who generally agree that such decisions should be left to parents, the claim is not that parents have absolute authority to make such decisions on behalf of their children. Given the weight of the interests and rights at issue, exceptions to parental autonomy are usually made at least in cases where the life of the child is at stake, on the grounds that the right to life trumps the right to privacy, when those rights come into conflict. While some parents may have religious reasons for foregoing certain kinds of medical treatment with respect to their children, it is controversial to say the least that parental rights to the exercise of religion are strong enough to trump a child’s right to life. According to some, the state, in its role of parens patriae, can legitimately intervene on behalf of children in many such cases. The courts have done so in cases where the illness or injury in question is life-threatening and yet a child’s parents refuse treatment. In less serious cases, the state has been more reluctant to intervene. However, the state’s interest in healthy children is apparently leading to a greater willingness to intervene in less drastic cases as well (Foreman and Ladd 1996).

A different set of issues arises with respect to medical decision making as it applies to procreative decisions, both those that are now available and those that for now are mere future possibilities. With respect to the former, it is now possible for parents to engage in attempted gender selection. An increasing number of couples are using reproductive technologies to select the sex of their children. One technique for making such a selection involves using the process of in vitro fertilization and then testing the embryos at three days of age for the desired sex. Those that are the preferred sex are then implanted in the womb and carried to term. Another technology which can be employed by couples who are seeking to select the sex of their children is sperm sorting.  Female-producing sperm and male-producing sperm are separated, and then the woman is artificially inseminated with the sperm of the desired sex.  This is easier and less expensive than the in vitro procedure, though not as reliable.

Parents might have a variety of reasons for seeking to determine the gender of their offspring, related to the gender of their current children, family structure, or other preferences. One criticism of this practice is that it transforms children into manufactured products, which we design rather than receive. That is, children become the result, at least in part, of a consumer choice which is thought by some to be problematic in this context. In addition, this practice is thought by some to place too much weight on the desires of parents related to the traits of their (future) children. Ideally, at least, parental love for children is to be unconditional, but in cases where parents choose the gender of their offspring it may be that their love is already contingent upon the child having a certain trait or traits. Finally, given the scarcity of resources in health care, some argue that we should employ those resources in other, less frivolous areas of medical care. Similar worries are raised with respect to the future use of human cloning technology. The technology would likely be costly to develop and deploy. And if such a technology comes into existence, parents may be able to select beforehand a wide variety of traits, which could also undermine morally and psychologically significant aspects of the parent-child relationship, in the view of some critics.

d. Disciplining Children

There are a variety of ways in which parents discipline or punish their children. These include corporal forms of punishment, and other forms such as time-outs, loss of privileges, fines, and verbal corrections. Of these, corporal forms of punishment are the most controversial.

Critics of corporal punishment offer many reasons for thinking that it is both immoral and a misguided practice. The use of violence and aggression is taken by many to be wrong in the context of the parent-child relationship, which they believe should be characterized by intimacy and love with no place for the infliction of physical pain. It is thought that children may learn that violence, or inflicting pain, is a permissible way to attempt to control others. Some argue that reasoning with the child and other forms of verbal and moral persuasion are more effective, as are alternative forms of discipline and punishment such as verbal reprimands or time-outs. Others believe that the negative effects on children of corporal punishment are often compounded or confused by other forms of maltreatment that are also present, such as parental expressions of disgust towards the child. This makes determining the effects of the punishment itself difficult. Still others think there is a place for corporal punishment, but only as a last resort.

One philosophical assessment of corporal punishment includes a limited defense of it, which is open to revision or abandonment if future findings in psychology and child development warrant this (Benatar 1998). When such punishment is harsh or frequent, it is argued that this amounts to child abuse. However, when corporal punishment is understood as the infliction of physical pain without injury, then it may be permissible.

Several arguments in favor of banning such forms of punishment have been offered, but Benatar raises potential problems for each of them. Some critics of corporal punishment argue that it leads to abuse. Benatar responds that the relevant evidence in support of this claim is not conclusive. And while some parents who engage in corporal punishment do abuse their children, it does not follow that corporal punishment is never permissible. If this were the case, then by parity of reasoning the abuse of alcohol or automobiles by some would justify banning their use in moderate and appropriate ways by all. The abusive use of corporal punishment is wrong, but this does not mean that non-abusive forms of such punishment are wrong. Others argue that corporal punishment degrades children, but there is no proof that it actually lowers their self-regard, or at least that it does so in an unacceptable manner. Others are concerned that corporal punishment produces psychological damage, such as anxiety, depression, or lowered self-esteem. There is evidence that excessive forms of such punishment have such effects, but not when it is mild and infrequent. Other critics argue that corporal punishment teaches the wrong lesson, namely, that our problems can be solved with the use of physical violence, and that it fosters violent behavior in children who receive it. Yet the evidence does not show that the use of corporal punishment has this effect when it is mild and infrequent. Finally, critics argue that corporal punishment should not be used because it is ineffective in changing the behavior of children, though defenders of the practice dispute this claim as well (Cope 2010).

Whatever one concludes about the proper forms of punishment, corporal or non-corporal, one proposed function of morally permissible punishment in the family is the promotion of trust in filial relationships (Hoekema 1999). Trust is important in the family because it is essential for the flourishing of the parent-child relationship. Given facts about childhood development, children must trust their parents; and ideally, as their development warrants it, parents should trust their children. On this view, punishment is justified when children fail to live up to the trust placed in them by their parents, and proper forms of punishment both reflect and reinforce that trust. If children destroy or damage property, for example, fining them can restore trust, release them from the guilt resulting from their betrayal of it, and so reestablish the trust that is conducive to their continued development and to the quality of the parent-child relationship. A form of punishment that fails to foster trust, or that fosters fear, would be morally problematic.

e. The Religious Upbringing of Children

While it is commonplace for parents to seek to impart their own religious, moral, and political beliefs and practices to their children, some philosophers object to this form of parental influence.

Some hold that parents should remain neutral with respect to the religion of their children, and not seek to influence the religious beliefs and practices of their offspring (Irvine 2001). One reason offered in support of this claim is that parents who rear their children within their preferred religious framework, insisting that the children adopt their faith, are being hypocritical, because at some point in the past the ancestors of those parents rejected the religion of their own parents. For example, if parents today insist that their child adopt some Protestant form of Christianity, they are being hypocritical because at some point their ancestors rejected Roman Catholicism, perhaps to the dismay of their own parents. One reply is that no hypocrisy obtains if the parents (and their ancestors) converted because they genuinely believed that the religion in question is true (Austin 2009).

There are other problems, having to do with autonomy, with parents insisting that their children adopt their religious faith. Because of their faith, parents may limit their children's access to certain kinds of knowledge, such as knowledge concerning sexuality. In the name of religion, some parents also restrict access to certain forms of education, which limits the autonomy of children by preventing them from coming to know about various conceptions of the good life. This may also limit their options and opportunities as adults, restricting their future autonomy.

One important view concerning parenting and religious faith holds that justice restricts the freedom of parents to inculcate belief in a comprehensive doctrine, that is, in a broad view of the good life for human beings (Clayton 2006). This includes secular as well as religious frameworks. The primary reason is that the autonomy of children must be safeguarded: children have an interest in being raised in an environment which allows them to choose from a variety of options with respect to the good life, both religious and non-religious. On this view, children may be reared within a comprehensive doctrine, such as Christianity, Islam, or humanism, only if they are able to and in fact do give autonomous consent, or have the intellectual capacities required to conceive of the good and of the good life. If neither of these requirements obtains, then it is wrong for parents to seek to impart their beliefs to their children; once their children can conceive of the good and the good life, or are able to consent to believe and practice the religion or other comprehensive doctrine, parents may seek to do so. Parents may still seek to encourage the development of particular virtues, such as generosity, in their children, as this does not threaten autonomy and helps children develop a sense of justice; indeed, since parents are obligated to help their children develop such a sense, this type of moral instruction and encouragement is not only permissible but obligatory. In reply, it has been argued that there are ways for parents to bring their children up within a particular religion or other comprehensive doctrine that protect their autonomy and help them gain a deep understanding of the nature and value of such a doctrine. Perhaps a middle ground between indoctrination and the foregoing restrictive approach is possible.

f. Parental Love

It is fitting to close with what is arguably the most important parental obligation, the obligation to love one's children. Some philosophers, Kant for example, believe that there is not and indeed cannot be an obligation to love another person, because love is an emotion and emotions are not under our control; since we cannot be obligated to do what we cannot will ourselves to do, there is no duty to love. However, some contemporary philosophers have challenged this conclusion and argued that parents do have a moral obligation to love their children (Austin 2007, Boylan 2011, Liao 2006). One reason is that parents are obligated to try to develop in their children the capacities needed for a flourishing life, and there is ample empirical evidence that a lack of love can harm a child's psychological, cognitive, social, and physical development. Given this, parents are obligated to foster in their children the capacities for engaging in close and loving personal relationships, and a primary way to do so is by loving their children and seeking to form such a relationship with them. Moreover, there are ways in which parents can successfully bring about the emotions associated with loving their children. For example, a parent can give herself reasons for having loving emotions toward her children, and can bring about circumstances and situations in which she is likely to feel such emotions. In these and many other ways, the dispositions to feel parental love can be strengthened; the claim that emotions, including those associated with parental love, cannot be commanded by morality because they cannot be controlled by us is too strong. Finally, there are also reasons for thinking that it is not merely parents who are responsible for loving their children, but that all of us owe a certain kind of love to children (Boylan 2011). If this is true, then much more needs to be done not only to encourage parents to love their children in ways that will help them flourish, but also to change social structures so that they more effectively satisfy this central interest of children.

5. References and Further Reading

  • Almond, Brenda. The Fragmenting Family. New York: Oxford University Press, 2006.
    • Criticizes arguments for the claim that the family is merely a social construct.
  • Archard, David and David Benatar, eds. Procreation and Parenthood. New York: Oxford University Press, 2010.
    • Several essays focus on the ethics of bringing a child into existence, while the others center on the grounds and form of parental rights and obligations, once a child exists.
  • Archard, David, and Colin Mcleod, eds. The Moral and Political Status of Children. New York: Oxford University Press, 2002.
  • Archard, David. Children: Rights and Childhood, 2nd edition. New York: Routledge, 2004.
    • Extensive discussion of the rights of children and their implications for parenthood and the state’s role in family life.
  • Austin, Michael W. Wise Stewards: Philosophical Foundations of Christian Parenting. Grand Rapids, MI: Kregel Academic, 2009.
    • A discussion of the parent-child relationship that combines theological and philosophical reflection in order to construct an everyday ethic of parenthood that is distinctly Christian.
  • Austin, Michael W. Conceptions of Parenthood: Ethics and the Family. Aldershot: Ashgate, 2007.
    • A comprehensive critical overview of the main philosophical accounts of the rights and obligations of parents (including an extensive defense of the causal view of parental obligations) and their practical implications.
  • Austin, Michael W. “The Failure of Biological Accounts of Parenthood.” The Journal of Value Inquiry 38 (2004): 499-510.
    • Rejects biological accounts of parental rights and obligations.
  • Bassham, Gregory, Marc Marchese, and Jack Ryan. “Work-Family Conflict: A Virtue Ethics Analysis.” Journal of Business Ethics 40 (2002): 145-154.
    • Discussion of balancing work and family responsibilities, from the perspective of virtue ethics.
  • Bayne, Tim. “Gamete Donation and Parental Responsibility.” Journal of Applied Philosophy 20 (2003): 77-87.
    • Criticizes arguments that gamete donors take their responsibilities to their offspring too lightly.
  • Benatar, David. “The Unbearable Lightness of Bringing into Being.” Journal of Applied Philosophy 16 (1999): 173-180.
    • Argues that gamete donation is almost always morally wrong.
  • Benatar, David. “Corporal Punishment.” Social Theory and Practice 24 (1998): 237-260.
    • Evaluates many of the standard arguments against corporal punishment.
  • Blustein, Jeffrey. Parents and Children: The Ethics of the Family. New York: Oxford University Press, 1982.
    • Includes a historical overview of what philosophers have had to say about the family, an account of familial obligations, and a discussion of public policy related to the family.
  • Bodin, Jean. Six Books of the Commonwealth. Translated by M. J. Tooley. New York: Barnes and Noble, 1967.
    • Contains Bodin’s statement of absolutism.
  • Boylan, Michael. “Duties to Children.” The Morality and Global Justice Reader. Michael Boylan, ed. Boulder, CO: Westview Press, 2011, pp. 385-403.
    • Argues that all people, including but not limited to parents, have duties to children related to the basic goods of human agency.
  • Brennan, Samantha, and Robert Noggle, eds. Taking Responsibility for Children. Waterloo: Wilfrid Laurier University Press, 2007.
  • Brighouse, Harry and Adam Swift. “Parents’ Rights and the Value of the Family.” Ethics 117 (2006): 80-108.
    • An argument in favor of limited and conditional parental rights, based upon the interests of parents and children.
  • Clayton, Matthew. Justice and Legitimacy in Upbringing. New York: Oxford University Press, 2006.
    • Applies particular principles of justice to childrearing.
  • Cohen, Howard. Equal Rights for Children. Totowa, NJ: Littlefield, Adams, and Co., 1980.
    • Makes a case for the claim that children should have equal rights and discusses social policy implications of this view.
  • Cope, Kristin Collins. “The Age of Discipline: The Relevance of Age to the Reasonableness of Corporal Punishment.” Law and Contemporary Problems 73 (2010): 167-188.
    • Includes a discussion of the legal issues and debates surrounding corporal punishment, as well as references to recent research on both sides of this debate concerning its efficacy and propriety.
  • Donnelly, Michael, and Murray Straus, eds. Corporal Punishment of Children in Theoretical Perspective. New Haven, CT: Yale University Press, 2005.
    • A collection of essays from a variety of disciplines which address a wide range of issues concerning corporal punishment.
  • Feinberg, Joel. “The Child’s Right to an Open Future.” In Whose Child?: Children’s Rights, Parental Authority, and State Power. Edited by William Aiken and Hugh LaFollette. Totowa, NJ: Littlefield, Adams, and Co., 1980, pp. 124-153.
    • Argues that the future autonomy of children limits parental authority in important ways.
  • Feldman, Susan. “Multiple Biological Mothers: The Case for Gestation.” Journal of Social Philosophy 23 (1992): 98-104.
    • Consequentialist argument for a social policy favoring gestational mothers when conflicts over custody arise.
  • Foreman, Edwin and Rosalind Ekman Ladd. “Making Decisions—Whose Choice?” Children’s Rights Re-Visioned. Rosalind Ekman Ladd, ed. Belmont, CA: Wadsworth, 1996, pp. 175-183.
    • A brief introduction to the core issues concerning medical decision making and the family.
  • Gaylin, Willard and Ruth Macklin, eds. Who Speaks for the Child: The Problems of Proxy Consent. New York: Plenum Press, 1982.
    • A collection of essays addressing medical decision making in the family.
  • Hall, Barbara. “The Origin of Parental Rights.” Public Affairs Quarterly 13 (1999): 73-82.
    • Explores the connections between the concept of self-ownership, biological parenthood, and parental rights.
  • Harris, John. “Liberating Children.” The Liberation Debate: Rights at Issue. Michael Leahy and Dan Cohn-Sherbok, eds. New York: Routledge, 1996, pp. 135-146.
    • Discusses and argues for children’s liberation, including discussion of the consistency problem.
  • Haslanger, Sally. “Family, Ancestry and Self: What is the Moral Significance of Biological Ties?” Adoption & Culture 2.
    • A criticism of David Velleman’s argument that knowing our biological parents is crucial for forging a meaningful life.
  • Hoekema, David. “Trust and Punishment in the Family.” Morals, Marriage, and Parenthood. Laurence Houlgate, ed. Belmont, CA: Wadsworth, 1999, pp. 256-260.
    • Argues that punishment in the family should both result from and maintain trust.
  • Irvine, William B. Doing Right by Children. St. Paul, MN: Paragon House, 2001.
    • Offers a stewardship account of parenthood, contrasted with ownership approaches.
  • Kass, Leon. “The Wisdom of Repugnance.” The New Republic 216 (1997): 17-26.
    • Argues that human cloning should be banned.
  • Kolers, Avery and Tim Bayne. “Are You My Mommy? On the Genetic Basis of Parenthood.” Journal of Applied Philosophy 18 (2001): 273-285.
    • Argues that certain genetic accounts of parental rights are flawed, while one is more promising.
  • LaFollette, Hugh. “Licensing Parents.” Philosophy and Public Affairs 9 (1980): 182-197.
    • Argues in favor of the claim that the state should require licenses for parents.
  • Liao, S. Matthew. “The Right of Children to be Loved.” The Journal of Political Philosophy 14 (2006): 420-440.
    • Defends the claim that children have a right to be loved by parents because such love is an essential condition for having a good life.
  • McFall, Michael. Licensing Parents: Family, State, and Child Maltreatment. Lanham, MD: Lexington Books, 2009.
    • Contains arguments related to political philosophy, the family, and parental licensing.
  • Mills, Claudia. “The Child’s Right to an Open Future?” Journal of Social Philosophy 34 (2003): 499-509.
    • Critically evaluates the claim that children have a right to an open future.
  • Millum, Joseph. “How Do We Acquire Parental Rights?” Social Theory and Practice 36 (2010): 112-132.
    • Argues for an investment theory of parental rights, grounded in the work individuals have done as parents of a particular child.
  • Millum, Joseph. “How Do We Acquire Parental Responsibilities?” Social Theory and Practice 34 (2008): 74-93.
    • Argues that parental obligations are grounded in certain acts, the meaning of which is determined by social convention.
  • Montague, Phillip. “The Myth of Parental Rights.” Social Theory and Practice 26 (2000): 47-68.
    • Rejects the existence of parental rights on the grounds that such rights lack essential components of moral rights.
  • Narayan, Uma and Julia Bartkowiak, eds. Having and Raising Children. University Park, PA: The Pennsylvania State University Press, 1999.
    • A collection of essays focused on a variety of ethical, political, and social aspects of the family.
  • Narveson, Jan. The Libertarian Idea. Philadelphia: Temple University Press, 1988.
    • Contains a statement of proprietarianism.
  • Nelson, James Lindemann. “Parental Obligations and the Ethics of Surrogacy: A Causal Perspective.” Public Affairs Quarterly 5 (1991): 49-61.
    • Argues that causing children to come into existence, rather than decisions concerning reproduction, is the primary source of parental obligations.
  • Page, Edgar. “Parental Rights.” Journal of Applied Philosophy 1 (1984): 187-203.
    • Argues that biology is the basis of parental rights; advocates a version of proprietarianism without absolutism.
  • Purdy, Laura. In Their Best Interests?: The Case against Equal Rights for Children. Ithaca: Cornell University Press, 1992.
    • Criticizes children’s liberationism and argues that granting children equal rights is in neither their interest nor society’s.
  • Richards, Norvin. The Ethics of Parenthood. New York: Oxford University Press, 2010.
    • Contains a discussion of the significance of biological parenthood, the obligations of parents, and the nature of the relationship between adult children and their parents.
  • Rothman, Barbara Katz. Recreating Motherhood. New York: W.W. Norton and Company, 1989.
    • A feminist treatment of a wide range of issues concerning the family.
  • Scales, Stephen. “Intergenerational Justice and Care in Parenting,” Social Theory and Practice 4 (2002): 667-677.
    • Argues for a social contract view, in which the moral community has the power to determine whether a person is capable of fulfilling the parental role.
  • Schoeman, Ferdinand. “Rights of Children, Rights of Parents, and the Moral Basis of the Family.” Ethics 91 (1980): 6-19.
    • An argument for parental rights based on filial intimacy.
  • Tittle, Peg, ed. Should Parents be Licensed? Amherst, NY: Prometheus Books, 2004.
    • An anthology of essays addressing a wide range of issues as they relate to the parental licensing debate.
  • Turner, Susan. Something to Cry About: An Argument Against Corporal Punishment of Children in Canada. Waterloo: Wilfrid Laurier University Press, 2002.
  • Velleman, J. David. “Family History.” Philosophical Papers 34 (2005): 357-378.
    • Argues that biological family ties are crucial with respect to the quest for a meaningful life.
  • Vopat, Mark. “Justice, Religion and the Education of Children.” Public Affairs Quarterly 23 (2009): 223-226.
  • Vopat, Mark. “Parent Licensing and the Protection of Children.” Taking Responsibility for Children. Samantha Brennan and Robert Noggle, eds. Waterloo: Wilfrid Laurier University Press, 2007, pp. 73-96.
  • Vopat, Mark. “Contractarianism and Children.” Public Affairs Quarterly 17 (2003): 49-63.
    • Argues that parental obligations are grounded in a social contract between parents and the state.
  • Willems, Jan C.M., ed. Developmental and Autonomy Rights of Children. Antwerp: Intersentia, 2007.

 

Author Information

Michael W. Austin
Email: mike.austin@eku.edu
Eastern Kentucky University
U. S. A.

Distributive Justice

Theories of distributive justice seek to specify what is meant by a just distribution of goods among members of society. All liberal theories (in the sense specified below) may be seen as expressions of laissez-faire with compensations for factors that they consider to be morally arbitrary. More specifically, such theories may be interpreted as specifying that the outcome of individuals acting independently, without the intervention of any central authority, is just, provided that those who fare ill (for reasons that the theories deem to be arbitrary, for example, because they have fewer talents than others) receive compensation from those who fare well.

Liberal theories of justice consider the process, or outcome, of individuals’ free actions to be just except insofar as this depends on factors, in the form of personal characteristics, which are considered to be morally arbitrary. In the present context these factors may be individuals’ preferences, their abilities, and their holdings of land. Such theories may, then, be categorized according to which of these factors each theory deems to be morally arbitrary.

There is a certain tension between the libertarian and egalitarian theories of justice. Special attention below is given to the views of Dworkin, Rawls, Nozick, and Sen.

Table of Contents

  1. A Taxonomy
    1. A Simple World
    2. Liberalism
  2. Justice as Fairness
    1. Two Principles
    2. A Social Contract
    3. The Difference Principle
    4. Choice Behind the Veil
    5. Summary
  3. Equality of Resources
    1. Initial Resources
    2. Fortune
    3. Handicaps
    4. Talents
    5. Summary
  4. Entitlements
    1. The Basic Schema
    2. Patterns
    3. Justice in Acquisition
    4. Justice in Transfer
    5. Justice in Rectification
    6. Summary
  5. Common Ownership
    1. A Framework
    2. The Transfer of Property
    3. The Holding of Property
    4. The Social Fund
    5. Summary
  6. Conclusions
  7. References and Further Reading
    1. References
    2. Further Reading

1. A Taxonomy

a. A Simple World

We begin with a simple hypothetical world in which there are a number of individuals and three commodities: a natural resource, called land; a consumption good, called food; and individuals’ labour. There is a given amount of land, which is held by individuals, but no stock of food: food may be created from land and labour. An individual is characterized by his preferences between food and leisure (leisure being the obverse of labour); by his ability, or productivity in transforming land and labour into food; and by his holding of land.


Equality has various interpretations in this simple world: these correspond to the theories discussed below. Liberty has two aspects: self-ownership, that is, rights to one’s body, one’s labour, and the fruits thereof; and resource-ownership, that is, rights to own external resources and the produce of these. Theories that fail to maintain self-ownership may be divided into those that recognize personal responsibility, in that the extent of the incursions they make into self-ownership is independent of how people exercise their abilities (for example, in being industrious or lazy), and those that do not.

In a liberal context there is (as is justified below) no basis for comparing one individual’s wellbeing with another’s, so that theories of justice which require such comparisons cannot be accommodated. Accordingly, the theories of utilitarianism, which defines a distribution to be just if it maximizes the sum of each individual’s wellbeing, and of equality of welfare, which defines a distribution to be just if each individual has the same level of wellbeing, are not considered.

Four theories of justice are discussed: Rawlsian egalitarianism, or justice as fairness; Dworkinian egalitarianism, or equality of resources; Steiner-Vallentyne libertarianism, or common ownership; and Nozickian libertarianism, or entitlements. The following specification of the theories sets out, for each theory: its definition of justice; the personal characteristics that it considers to be arbitrary and therefore makes adjustments for; the nature of the institution under which this may be achieved; the justification of any inequalities which it accepts; and the extent to which it is consistent with liberty.

Justice as fairness defines a distribution to be just if it maximizes the food that the individual with the least food receives (this is the “maximin” outcome in terms of food, which is the sole primary good). It adjusts for preferences, ability, and land holdings. It is achieved by taxes and subsidies on income (that is, on the consumption of food). Inequalities in income, subject to the maximin requirement, are accepted because of the benefit they bring to the individual with the least income; all inequalities in leisure are accepted. Rights to neither self-ownership nor resource-ownership are maintained, and responsibility is not recognized.
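The maximin criterion can be illustrated with a small, purely hypothetical sketch (the distributions and numbers below are invented for illustration, not drawn from the text): among candidate distributions of food, it selects the one whose worst-off individual fares best, even if that distribution is unequal.

```python
# Hypothetical distributions of food among three individuals.
# Each tuple gives the amount of food each person receives.
distributions = {
    "equal":     (5, 5, 5),    # everyone receives the same
    "unequal_a": (9, 6, 4),    # unequal; worst-off gets 4
    "unequal_b": (12, 8, 6),   # unequal; worst-off gets 6
}

# Maximin: choose the distribution maximizing the minimum share.
best = max(distributions, key=lambda name: min(distributions[name]))
print(best)  # "unequal_b": its least-advantaged member receives 6, more than 5 or 4
```

The sketch shows why justice as fairness can accept inequality: the unequal distribution "unequal_b" is preferred to strict equality because the least-advantaged person is better off under it.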

Equality of resources defines a distribution to be just if everyone has the same effective resources, that is, if for some given amount of work each person could obtain the same amount of food. It adjusts for ability and land holdings, but not for preferences. It is achieved by taxes and subsidies on income. Inequalities in both food and leisure are accepted because they arise solely from choices made by individuals who have the same options. Rights to neither self-ownership nor resource-ownership are maintained, but responsibility is recognized.

Common ownership theories define a distribution to be just if each person initially has the same amount of land and all transactions between individuals are voluntary. It adjusts for land holdings, but not for preferences or abilities. It is achieved by a reallocation of holdings of land. Inequalities in both food and leisure are accepted because these arise solely from people having different preferences or abilities. Rights to self-ownership are maintained but rights to resource-ownership are not.

An entitlements theory defines a distribution to be just if the distribution of land is historically justified, that is if it arose from the appropriation by individuals of previously unowned land and voluntary transfers between individuals, and all other transactions between individuals are voluntary. It makes no adjustments (other than corrections for any improper acquisitions or transfers) and thus requires no imposed institution to achieve it. All inequalities are accepted. Rights to both self-ownership and resource-ownership are maintained.

As is apparent, the first two theories emphasize outcomes while the second two emphasize institutions. These four theories form a hierarchy, or decreasing progression, in terms of the personal characteristics that they consider to be morally arbitrary, and thus for which adjustments are made. The first theory adjusts for preferences, ability, and land holdings; the second only for ability and land holdings; the third only for land holdings; and the fourth for none of these (other than the corrections noted above). The four theories form a corresponding hierarchy, or increasing progression, in terms of the liberties (self-ownership, with or without personal responsibility, and resource-ownership) that they maintain: the first maintains neither, and does not recognize responsibility; the second maintains neither, but does recognize responsibility; the third maintains self-ownership but not resource-ownership; and the fourth maintains both self-ownership and resource-ownership.

These corresponding hierarchies are illustrated schematically in the table below (from Allingham, 2014, 4).

Theory             | Arbitrary factors            | Liberties maintained
Rawls              | Preferences – Ability – Land | (none)
Dworkin            | Ability – Land               | Responsibility
Steiner-Vallentyne | Land                         | Responsibility – Self-ownership
Nozick             | (none)                       | Responsibility – Self-ownership – Resource-ownership

The remainder of this survey develops these theories of justice. It demonstrates that they also form a third hierarchy in terms of equality (of outcome), with Rawls’s justice as fairness as the most egalitarian, followed by Dworkin’s equality of resources, then common ownership in the Steiner-Vallentyne vein, and finally Nozick’s entitlements theory as the least egalitarian. The order in which these theories are discussed differs from that of the decreasing progression in terms of what they consider to be arbitrary: specifically, the discussion of entitlements precedes that of common ownership. The reason for this is that common ownership theories follow temporally, and draw on, Nozick’s entitlements theory.

b. Liberalism 

The theories of justice considered are liberal in that they do not presuppose any particular conception of the good. They subscribe to what Sandel calls deontological liberalism: “society, being composed of a plurality of persons, each with his own aims, interests, and conceptions of the good, is best arranged when it is governed by principles that do not themselves presuppose any particular conception of the good” (1998, 1).

The importance of deontological liberalism is that it precludes any interpersonal comparisons of utility. As Scanlon (who supports interpersonal comparisons) accepts, “interpersonal comparisons present a problem insofar as it is assumed that the judgements of relative well-being on which social policy decisions, or claims of justice, are based should not reflect value judgements” (1991, 17). And Hammond, who also supports interpersonal comparisons, accepts that such comparisons “really do require that an individual’s utility be the ethical utility or worth of that individual to the society” (1991, 237). If we are not prepared to take a position on someone’s worth to society then we cannot engage in interpersonal utility comparisons. It is in the light of this that Arrow notes that “it requires a definite value judgement not derivable from individual sensations to make the utilities of different individuals dimensionally compatible and a still further value judgement to aggregate them”, and accordingly concludes that “interpersonal comparison of utilities has no meaning and, in fact, … there is no meaning relevant to welfare comparisons in the measurability of individual utility” (2012, 9-11).

2. Justice as Fairness

Justice as fairness, as developed by Rawls, treats all personal attributes as being morally arbitrary, and thus defines justice as requiring equality, unless any departure from this benefits everyone. This view is summarized in Rawls’s “general conception of justice”, which is that “all social values – liberty and opportunity, income and wealth, and the social bases of self-respect – are to be distributed equally unless an unequal distribution of any, or all, of these values is to everyone’s advantage”: injustice “is simply inequalities that are not to the benefit of all” (1999, 24).

a. Two Principles

Rawls’s interpretation is made more precise in his two principles of justice. He proposes various formulations of these; the final formulation is that of Political Liberalism:

a. Each person has an equal claim to a fully adequate scheme of equal basic rights and liberties, which scheme is compatible with the same scheme for all; and in this scheme the equal political liberties, and only those liberties, are to be guaranteed their fair value.

b. Social and economic inequalities are to satisfy two conditions: first, they are to be attached to positions and offices open to all under conditions of fair equality of opportunity; and second, they are to be to the greatest benefit of the least advantaged members of society (2005, 5-6).

These principles are lexically ordered: the first principle has priority over the second; and in the second principle the first part has priority over the second part. For the specific question of distributive justice, as opposed to the wider question of political justice, it is the final stone in the edifice that is crucial: this is the famous difference principle.

b. A Social Contract

Rawls justifies his two principles of justice by a social contract argument. For Rawls, a just state of affairs is a state on which people would agree in an original state of nature. Rawls seeks “to generalize and carry to a higher order of abstraction the traditional theory of the social contract as represented by Locke, Rousseau, and Kant”, and to do so in a way “that it is no longer open to the more obvious objections often thought fatal to it” (1999, xviii).

Rawls sees the social contract as being neither historical nor hypothetical but a thought-experiment for exploring the implications of an assumption of moral equality as embodied in the original position. To give effect to this Rawls assumes that the parties to the contract are situated behind a veil of ignorance where they do not know anything about themselves or their situations, and accordingly are equal. The intention is that as the parties to the contract have no information about themselves they necessarily act impartially, and thus as justice as fairness requires. As no one knows his circumstances, no one can try to impose principles of justice that favour his particular condition.

c. The Difference Principle

Rawls argues that in the social contract formed behind a veil of ignorance the contractors will adopt his two principles of justice, and in particular the difference principle: that all inequalities “are to be to the greatest benefit of the least advantaged members of society”. This requires the identification of the least advantaged. There are three aspects to this: what constitutes the members of society; what counts as being advantaged; and how the advantages of one member are to be compared with those of another.

It would seem natural in defining the least advantaged members of society to identify the least advantaged individuals, but Rawls does not do this. Instead, he seeks to identify representatives of the least advantaged group.

The wellbeing of representatives is assessed by their allocation of what Rawls terms primary goods. There are two classes of primary goods. The first class comprises social primary goods, such as liberty (the subject matter of the first principle of justice) and wealth (the subject matter of the second part of the second principle). The second class comprises natural primary goods, such as personal characteristics. Justice as fairness is concerned with the distribution of social primary goods; and of these the difference principle is concerned with those that are the subject matter of the second part of the second principle of justice, such as wealth.

Rawls’s primary goods are “things which it is supposed a rational man wants whatever else he wants”: regardless of what precise things someone might want “it is assumed that there are various things which he would prefer more of rather than less”. More specifically, “primary social goods, to give them in broad categories, are rights, liberties, and opportunities, and income and wealth”. These fall into two classes: the first comprises rights, liberties, and opportunities; and the second, which is the concern of the difference principle, income and wealth. The essential difference between these classes is that “liberties and opportunities are defined by the rules of major institutions and the distribution of income and wealth is regulated by them” (1999, 79).

The construction of an index of primary social goods poses a problem, for income and wealth comprise a number of disparate things and these cannot immediately be aggregated into a composite index. Rawls proposes to construct such an index “by taking up the standpoint of the representative individual from this group and asking which combination of primary social goods it would be rational for him to prefer”, even though “in doing this we admittedly rely upon intuitive estimates” (1999, 80).

d. Choice Behind the Veil

Each contractor considers all feasible distributions of primary goods and chooses one. Because the contractors have been stripped of all distinguishing characteristics they all make the same choice, so there is in effect only one contractor. The distributions that this contractor considers allocate different amounts of primary goods to different positions, not to named persons.

The contractor does not know which position he will occupy, and as he is aware that he may occupy the least advantaged position he chooses the distribution that allocates the highest index of primary goods to that position. That is, he chooses the distribution that maximizes the index of the least advantaged, or minimum, position. Rawls thus considers his “two principles as the maximin solution to the problem of social justice” since “the maximin rule tells us to rank alternatives by their worst possible outcomes: we are to adopt the alternative the worst outcome of which is superior to the worst outcomes of the others” (1999, 132-133).

A major problem with Rawls’s theory of justice is that rational contractors will not, except in a most extreme case, choose the maximin outcome. Despite Rawls’s claim that “extreme attitudes to risk are not postulated” (1999, 73), it appears that they are: to choose the maximin distribution is to display the most extreme aversion to risk. In global terms, it is to prefer the distribution of world income in which 7 billion people have just $1 above a widely accepted subsistence income level of $365 a year to the distribution in which all of these except one (who has $365 a year) have the income of the average Luxembourger with $80,000 a year. It is to choose a world of universal abject poverty over one of comfortable affluence for all but one person. As Roemer expresses it, “the choice, by such a [representative] soul, of a Rawlsian tax scheme is hardly justified by rationality, for there seems no good reason to endow the soul with preferences that are, essentially, infinitely risk averse” (1996, 181).
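The contrast between the maximin rule and a rival decision rule can be made concrete in a short sketch. The two distributions follow the global-income example above in simplified form; the use of average income as the rival rule is an illustrative assumption, not Rawls's or Roemer's:

```python
# Maximin vs average-income choice between two world-income distributions.
# Distribution A: 7 billion people at $366 (just above a $365 subsistence level).
# Distribution B: all but one person at $80,000; one person at $365.
POPULATION = 7_000_000_000

dist_a = {366: POPULATION}                      # income -> number of people
dist_b = {80_000: POPULATION - 1, 365: 1}

def maximin_value(dist):
    """Rank a distribution by its worst-off position (the maximin rule)."""
    return min(dist)

def average_value(dist):
    """Rank a distribution by average income (a non-maximin alternative)."""
    total_people = sum(dist.values())
    return sum(income * n for income, n in dist.items()) / total_people

# Maximin prefers A (worst position $366 > $365) despite near-universal
# poverty; an average-income rule prefers B by a wide margin.
best_by_maximin = max([dist_a, dist_b], key=maximin_value)
best_by_average = max([dist_a, dist_b], key=average_value)
```

The sketch simply makes vivid the point in the text: ranking by worst outcomes alone selects the distribution of universal near-subsistence incomes.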

Rawls appreciates that “there is a relation between the two principles and the maximin rule for choice under uncertainty”, and accepts that “clearly the maximin rule is not, in general, a suitable guide for choices under uncertainty”. However, he claims that it is a suitable guide if certain features obtain, and seeks to show that “the original position has these features to a very high degree”. He identifies three such features. The first is that “since the rule takes no account of the likelihoods of the possible circumstances, there must be some reason for sharply discounting estimates of these probabilities”. The second is that “the person choosing has a conception of the good such that he cares very little, if anything, for what he might gain above the minimum stipend that he can, in fact, be sure of by following the maximin rule”. The third is that “the rejected alternatives have outcomes that one can hardly accept” (1999, 132-134). However, none of these three features appears to justify the choice by a rational contractor of the maximin distribution. Accordingly, Roemer concludes that “the Rawlsian system is inconsistent and cannot be coherently reconstructed” (1996, 182).

e. Summary

The strength of Rawls’s theory of justice as fairness lies in its combination of the fundamental notion of equality with the requirement that everyone be better off than they would be under pure equality. However, the theory has a number of problems. Some of these may be avoided by inessential changes, but other problems are unavoidable, particularly that of identifying the least advantaged (with the related problems of defining primary goods and the construction of an index of these), and that of the supposedly rational choice of the maximin principle with, as Harsanyi puts it, its “absurd practical implications” (1977, 47 as reprinted).

3. Equality of Resources

Equality of resources, as developed by Dworkin, treats individuals’ abilities and external resources as arbitrary, but makes no adjustments for their preferences. The essence of this approach is the distinction between ambition-sensitivity, which recognizes differences which are due to differing ambitions, and endowment-sensitivity, which recognizes differences that are due to differing endowments.

a. Initial Resources

Dworkin holds that inequalities are acceptable if they result from voluntary choices, but not if they result from disadvantages that have not been chosen. However, initial equality of resources is not sufficient for justice. Even if everyone starts from the same position one person may fare better than another because of her good luck, or, alternatively, because of her lesser handicaps or greater talents.

Dworkin motivates his theory of justice with the example of a number of survivors of a shipwreck who are washed up, with no belongings, on an uninhabited island with abundant natural resources. The survivors accept that these resources should be allocated among them in some equitable fashion, and agree that for a division to be equitable it must meet “the envy test”, which requires that no one “would prefer someone else’s bundle of resources to his own bundle” (1981, 285). The envy test, however, is too weak a test: Dworkin gives examples of allocations that meet this test but appear inequitable.

To deal with such cases Dworkin proposes that the survivors appoint an auctioneer who gives each of them an equal number of tokens. The auctioneer divides the resources into a number of lots and proposes a system of prices, one for each lot, denominated in tokens. The survivors bid for the lots, with the requirement that their total proposed expenditure in tokens not exceed their endowment of tokens. If all markets clear, that is, if there is precisely one would-be purchaser for each lot, then the process ends. If not all markets clear, the auctioneer adjusts the prices, and continues to adjust them until they do.
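The auctioneer's procedure resembles a price-adjustment (tâtonnement) process. The following is a minimal sketch under simplifying assumptions (two survivors, two lots, invented valuations, and a crude rule in which each survivor bids for the single lot offering the largest surplus of value over price), none of which is part of Dworkin's own specification:

```python
# A toy version of Dworkin's auction: the auctioneer adjusts token prices
# until each lot attracts exactly one would-be purchaser (markets clear).
# Survivors, lots, valuations, and the adjustment rule are all illustrative
# assumptions, not Dworkin's own text.

valuations = {                      # survivor -> value (in tokens) of each lot
    "Ann": {"clothing": 10, "shelter": 4},
    "Bob": {"clothing": 8, "shelter": 7},
}
prices = {"clothing": 1, "shelter": 1}   # initial trial prices
TOKENS = 100                             # equal endowment of tokens each

def demand(survivor):
    """Each survivor bids for the affordable lot with the largest surplus."""
    surplus = {lot: value - prices[lot]
               for lot, value in valuations[survivor].items()
               if prices[lot] <= TOKENS}
    return max(surplus, key=surplus.get)

for _ in range(100):                     # adjust prices until markets clear
    bids = {lot: [s for s in valuations if demand(s) == lot]
            for lot in prices}
    if all(len(bidders) == 1 for bidders in bids.values()):
        break                            # exactly one purchaser per lot
    for lot, bidders in bids.items():    # raise over-demanded, cut unwanted
        prices[lot] = max(0, prices[lot] + (len(bidders) - 1))

allocation = {bids[lot][0]: lot for lot in prices}
```

With these numbers both survivors initially bid for the clothing; its price rises and the shelter's falls until each lot has a single purchaser. Nothing here guarantees convergence in general, which is one of the difficulties with the auction noted later in the article.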

b. Fortune

Dworkin seeks to make people responsible for the effects of their choices, but not for matters beyond their control. To take account of the latter, he distinguishes between “option luck” and “brute luck”. Option luck is “a matter of how deliberate and calculated gambles turn out”. Brute luck is “a matter of how risks fall out that are not in that sense deliberate gambles” (1981, 293). People should be responsible for the outcomes of option luck, but not of brute luck.

Dworkin’s key argument concerning luck is that “insurance, so far as it is available, provides a link between brute and option luck, because the decision to buy or reject catastrophe insurance is a calculated gamble”. Then because people should be responsible for the outcomes of option luck they should be responsible for the outcomes of all luck, at least if they could have bought insurance. Accordingly, Dworkin amends his envy test by requiring that “any resources gained through a successful gamble should be represented by the opportunity to take the gamble at the odds in force, and comparable adjustments made to the resources of those who have lost through gambles” (1981, 293-295).

c. Handicaps

Insurance cannot remove all risks: if someone is born blind he cannot buy insurance against blindness. Dworkin seeks to take account of this through a hypothetical insurance scheme. He asks how much an average person would be prepared to pay for insurance against being handicapped if in the initial state everyone had the same, and known, chance of being handicapped. He then supposes that “the average person would have purchased insurance at that level” (1981, 298), and proposes to compensate those who do develop handicaps out of a fund that is collected by taxation but designed to match the fund that would have been provided through insurance premiums. The compensation that someone with a handicap is to receive is the contingent compensation that he would have purchased, knowing the risk of being handicapped, had actual insurance been available.
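On an actuarially fair reading, the tax-collected fund simply matches the premiums that the hypothetical insurance would have raised. The following is a minimal sketch under that assumption; the population, risk, and level of cover are invented figures, not Dworkin's:

```python
# Hypothetical insurance for handicaps, read actuarially: everyone faces
# the same known risk, pays the same premium, and the fund collected by
# taxation matches what the premiums would have raised.  All figures are
# illustrative assumptions.
POPULATION = 1_000
RISK = 0.02                        # known, equal chance of being handicapped
COVER = 50_000                     # compensation the average person would buy

fair_premium = RISK * COVER        # actuarially fair premium per person
fund = fair_premium * POPULATION   # raised by a uniform tax on everyone

expected_claims = RISK * POPULATION * COVER
# The fund exactly covers expected claims: each of the (on average) 20
# handicapped members receives the cover chosen behind the veil of ignorance.
```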

Accordingly, the auction procedure is amended so that the survivors “now establish a hypothetical insurance market which they effectuate through compulsory insurance at a fixed premium for everyone based on speculations about what the average immigrant… would have purchased by way of insurance had the antecedent risk of various handicaps been equal” (1981, 301).

This process establishes equality of effective resources at the outset, but this equality will typically be disturbed by subsequent economic activity. If some survivors choose to work more than others they will produce, and thus have, more than their more leisurely compatriots. Thus at some stage the envy test will not be met. This, however, does not create a problem because the envy test is to be applied diachronically: “it requires that no one envy the bundle of occupation and resources at the disposal of anyone else over time, though someone may envy another’s bundle at any particular time” (1981, 306). Since everyone had the opportunity to work hard it would violate rather than endorse equality of resources if the wealth of the hardworking were from time to time to be distributed to the more leisurely.

d. Talents

The essential reason why differential talents create a problem is that equality of resources at the outset will typically be disturbed, not because of morally acceptable differences in work habits, but because of morally arbitrary differences in talents.

Requiring equality of resources only at the outset would be what Dworkin calls a starting-gate theory of fairness, which Dworkin sees as being “very far from equality of resources” and strongly rejects: “indeed it is hardly a coherent political theory at all”. Such a theory holds that justice requires equality of initial resources, but accepts laissez-faire thereafter. The fundamental problem with a starting-gate theory is that it relies on some purely arbitrary starting point. If the requirement of equality of resources is to apply at one arbitrary point, then presumably it is to apply at other points. If justice requires a Dworkinian auction when the survivors arrive, then it must require such an auction from time to time thereafter; and if justice accepts laissez-faire thereafter, it must accept it when they arrive. Dworkin requires neither that there be periodic auctions nor that there be laissez-faire at all times. His theory does not suppose that an equal division of resources is appropriate at one point in time but not at any other; it argues only that the resources available to someone at any moment be a function of the resources available to or consumed by him at others.

Dworkin’s aim is to specify a scheme that allows the distribution of resources at any point of time to be ambition-sensitive, in that it reflects the cost or benefit to others of the choices people make, but not endowment-sensitive, in that it does not reflect differences in ability among people with the same ambitions. To achieve this, Dworkin proposes a hypothetical insurance scheme that is analogous to that for handicaps. In this scheme it is supposed that people know what talents they have, but not the income that these will produce, and choose a level of insurance accordingly. An imaginary agency knows each person’s talents and preferences, and also knows what resources are available and the technology for transforming these into other resources. On the basis of this it computes the income structure, that is, the number of people earning each level of income that will emerge in a competitive market. Each person may buy insurance from the agency to cover the possibility of his income falling below whatever level he cares to name. Dworkin asks “how much of such insurance would the survivors, on average, buy, at what specified level of income coverage, and at what cost?” (1981, 317) and claims that the agency can answer this question.

This, however, is not clear. Consider four very weak requirements of such a scheme: it should distribute resources in such a way that not everyone could be better off under any alternative scheme; an increase in the resources available for allocation should not make anyone worse off; if two people have the same preferences and abilities then they should be allocated the same resources; and the scheme should not damage those whom it seeks to help. As is shown by Roemer, there is in Dworkin’s framework no scheme that satisfies these requirements, so that “resource egalitarianism is an incoherent notion” (1985, 178).

e. Summary

The strength of Dworkin’s equality of resources theory of justice is that it seeks to introduce ambition-sensitivity without allowing endowment-sensitivity. To the extent to which it succeeds in this it thus, in Cohen’s words, incorporates within egalitarianism “the most powerful idea in the arsenal of the anti-egalitarian right: the idea of choice and responsibility” (1989, 933).

However, it is not entirely successful in this endeavour. There are a number of problems with Dworkin’s auction scheme: for example, it is not clear that the auctioneer will ever discover prices at which there is precisely one would-be purchaser for each lot. However, these may be avoided by adopting the intended outcome of the auction, that is, a free-market outcome in which everyone has the same wealth, as a specification of justice in its own right. But the problems with the insurance scheme are deeper, as Roemer’s argument (presented above) demonstrates.

4. Entitlements

Nozick’s entitlements theory (as an extreme) treats no personal attributes as being arbitrary, and thus defines justice simply as laissez-faire, provided that no one’s rights are infringed. In this view “the complete principle of distributive justice would say simply that a distribution is just if everyone is entitled to the holdings they possess under the distribution” (1974, 151).

a. The Basic Schema

Nozick introduces his approach to “distributive justice” by noting that the term is not a neutral one, but presupposes some central authority that is effecting the distribution. But that is misleading, for there is no such body. Someone’s property holdings are not allocated to her by some central planner: they arise from the sweat of her brow or through voluntary exchanges with, or gifts from, others. There is “no more a distributing or distribution of shares than there is a distributing of mates in a society in which persons choose whom they shall marry” (1974, 150).

Accordingly, Nozick holds that the justice of a state of affairs is a matter of whether individuals are entitled to their holdings. In Nozick’s schema, individuals’ entitlements are determined by two principles, justice in acquisition and justice in transfer:

If the world were wholly just, the following inductive definition would exhaustively cover the subject of justice in holdings.

1. A person who acquires a holding in accordance with the principle of justice in acquisition is entitled to that holding.

2. A person who acquires a holding in accordance with the principle of justice in transfer, from someone else entitled to the holding, is entitled to the holding.

3. No one is entitled to a holding except by (repeated) applications of 1 and 2. (1974, 151)

However, the world may not be wholly just: as Nozick observes, “not all actual situations are generated in accordance with the two principles of justice in holdings”. The existence of past injustice “raises the third major topic under justice in holdings: the rectification of injustice in holdings” (1974, 152).

b. Patterns

Nozick distinguishes entitlement principles of justice from patterned principles. A principle is patterned if “it specifies that a distribution is to vary along with some natural dimension, weighted sum of natural dimensions, or lexicographic ordering of natural dimensions”. A distribution that is determined by people’s ages or skin colours, or by their needs or merits, or by any combination of these, is patterned. Nozick claims that “almost every suggested principle of distributive justice is patterned” (1974, 156), where by “almost” he means “other than entitlement principles”.

The fundamental problem with patterned principles is that liberty upsets patterns. As Hume expresses it, “render possessions ever so equal, men’s different degrees of art, care, and industry will immediately break that equality” (1751, 3.2). Nozick argues this using his famous Wilt Chamberlain example.

Suppose that a distribution that is (uniquely) specified as just by some patterned principle of distributive justice is realized: this may be one in which everyone has an equal share of wealth, or where shares vary in any other patterned way. Now there is a basketball player, one Wilt Chamberlain, who is of average wealth but of superior ability. He enters into a contract with his employers under which he will receive 25 cents for each admission ticket sold to see him play. As he is so able a player a million people come to watch him. Accordingly, Mr Chamberlain earns a further $250,000. The question is, is this new distribution, in which Mr Chamberlain is much better off than in the original distribution, and also much better off than the average person, just? One answer must be that it is not, for the new distribution differs from the old, and by hypothesis the old distribution (and only that distribution) was just. On the other hand, the original distribution was just, and people moved from that to the new distribution entirely voluntarily. Mr Chamberlain and his employers voluntarily entered into the contract; all those who chose to buy a ticket to watch Mr Chamberlain play did so voluntarily; and no one else was affected. All holdings under the original distribution were, by hypothesis, just, and people have used them to their advantage: if people were not entitled to use their holdings to their advantage (subject to not harming others) it is not clear why the original distribution would have allocated them any holdings. If the original distribution was just and people voluntarily moved from it to the new distribution then the new distribution must be just.
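The arithmetic of the example, and the way a single round of voluntary transfers upsets an initially equal pattern, can be checked directly; the starting wealth figure is an invented illustration, not Nozick's:

```python
# Liberty upsets patterns: start from a strictly equal distribution, then
# let a million fans each voluntarily pay 25 cents to watch Chamberlain.
# The starting wealth of $1,000 per person is an illustrative assumption.
FANS = 1_000_000
TICKET_SHARE = 0.25                # Chamberlain's cut per admission ticket
START = 1_000.0                    # equal wealth under the patterned principle

chamberlain = START + FANS * TICKET_SHARE   # $250,000 the richer
fan = START - TICKET_SHARE                  # each fan 25 cents the poorer

# Every transfer was voluntary, yet the equal pattern no longer holds.
pattern_holds = (chamberlain == fan)
```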

c. Justice in Acquisition

Acquisition of material is considered to be just if what is acquired is freely available and if acquiring it leaves sufficient material for others. Giving an operational meaning to this requires the specification of what acquisition means, what is freely available, and how leaving sufficient material for others is to be interpreted. In these, Nozick, albeit with reservations, follows Locke.

Locke interprets “acquiring” as “mixing one’s labour with” (1689, 2.5.27). I own my labour, and if I inextricably mix my labour with something that no one else owns then I make that thing my own. However, as Nozick points out (without proposing any resolution of these) there are a number of problems with this interpretation. It is not clear why mixing something that I own with something that I do not own implies that I gain the latter rather than lose the former. In Nozick’s example, “if I own a can of tomato juice and spill it in the sea … do I thereby come to own the sea, or have I foolishly dissipated my tomato juice?” Further, it is not clear what determines how much of the unowned resource I come to own. If I build a fence around a previously unowned plot of land do I own all that I have enclosed, or simply the land under the fence? In Nozick’s example, “if a private astronaut clears a place on Mars, has he mixed his labor with (so that he comes to own) the whole planet, the whole uninhabited universe, or just a particular plot?” (1974, 174-175).

Locke interprets “freely available” as being “in the state that nature hath provided”, and Nozick (without any argument) follows Locke in equating “freely available” with “unowned”. There are, however, other possibilities. Virgin resources may be seen as being owned in common, or as being jointly owned in equal shares.

Locke interprets leaving sufficient for others as there being “enough, and as good, left in common for others” (1689, 2.5.27); this is the famous Lockean proviso. There are two possible interpretations of this: I may be made worse off by your appropriating a particular plot of land by no longer being able to appropriate it myself, or by no longer being able to use it. Nozick adopts the second, weaker, version.

d. Justice in Transfer

The essence of Nozick’s principle of justice in transfer is that a transfer is just if it is voluntary, in that each party consents to it. Justice in transfer also involves the satisfaction of the Lockean proviso. This is both indirect and direct. It is indirect in that I cannot legitimately transfer to you something that has been acquired, by me or by anyone else, in violation of the proviso, for that thing is not rightfully mine to transfer. But the proviso is also direct, in that I may not by a series of transfers, each of which is legitimate on its own, acquire property that does not leave enough, and as good, for others.

e. Justice in Rectification

Nozick’s basic schema applies to a world that is “wholly just”. However, the world may not be wholly just: people may have violated the principle of justice in acquisition, for example, by appropriating so much of a thing that an insufficient amount is left for others; or they may have violated the principle of justice in transfer, for example, by theft or fraud. Then, as Nozick observes, “the existence of past injustice (previous violations of the first two principles of justice in holdings) raises the third major topic under justice in holdings: the rectification of injustice in holdings”. Nozick identifies a number of questions that this raises: if past injustice has shaped present holdings in ways that are not identifiable, what should be done; how should violators compensate the victims; how does the position change if compensation is delayed; how, if at all, does the position change if the violators or the victims are no longer living; is an injustice done to someone whose holding which was itself based upon an injustice is appropriated; do acts of injustice lose their force over time; and what may the victims of injustice themselves do to rectify matters? However, these questions are not answered: as Nozick admits, “I do not know of a thorough or theoretically sophisticated treatment of such issues” (1974, 152).

f. Summary

The strength of Nozick’s entitlements theory of justice is that it uncompromisingly respects individual liberty, and thus avoids all the problems associated with patterned approaches to justice. However, by avoiding patterns it introduces its own problems, for in asking how distributions came about, rather than in simply assessing them as they are, Nozick necessarily delves into the mists of time. Here lie the two most significant, and related, problems with Nozick’s theory: that of the relatively unsatisfactory nature of the principle of justice in initial acquisition, and that of the predominantly unexplained means of rectifying any injustice resulting from that.

5. Common Ownership

Common ownership theories in the Steiner-Vallentyne vein treat individuals’ holdings of external resources as arbitrary, but (at least directly) make no adjustments for their preferences or abilities. Such theories are diverse, but they all have in common the basic premise that individuals are full owners of themselves but external resources are owned by society in common. The theories differ in what they consider to be external resources, and in what is entailed by ownership in common.

a. A Framework

Common ownership theories, like entitlement theories, emphasize institutions, or processes, rather than outcomes. In essence, they consider an institution to be just if, firstly, it recognizes the principle of self-ownership and a further principle of liberty which may be called free association, and secondly, it involves some scheme of intervention on the holding or transmission of external resources that results, if not in common ownership itself, in a distribution of resources that shares some of the aspects of common ownership.

The principle of self-ownership, as Cohen expresses it, is that “each person enjoys, over herself and her powers, full and exclusive rights of control and use, and therefore owes no service or product to anyone else that she has not contracted to supply” (1995, 12). I have full ownership of myself if I have all the legal rights that someone has over a slave. Since a slaveholder has the legal rights to the labour of his slave and the fruits of that labour, each person is the morally rightful owner of his labour and of the fruits thereof.

The motivation for introducing a principle of free association is that what is legitimate for you and for me should be legitimate for us, subject to the satisfaction of the Lockean proviso (if relevant). Allingham proposes the principle that “each person has a moral right to combine any property to which he is entitled with the (entitled) property of other consenting persons (and share in the benefits from such combination in any manner to which each person agrees) provided that this does not affect any third parties” (2014, 110).

Schemes of intervention on the holding or transmission of property may take the form of absolute restrictions or of taxes on the holding or transfer of property.

b. The Transfer of Property

It might be thought that my rights to my property are empty if they do not permit me to do what I will with it (provided that this does not affect others), and in particular to give it to you. On the other hand, the passing down of wealth through the generations is one of the less intuitively appealing implications of this right. There are three ways of reconciling these two positions: restrictions or taxes on all gifts, on bequests, and on re-gifting.

The first proposal is based on Vallentyne’s claim that the right to transfer property to others does not guarantee that others have an unencumbered right to receive that property, and that, accordingly, the receipt of gifts may legitimately be subject to taxation. This would be to say that (the donor) having control rights in the property, and in particular the right to give it to someone, does not imply (the donee) having income rights in the property, and in particular the unencumbered right to enjoy it.

The motivation underlying the second proposal is, in Steiner’s words, “that an individual’s deserts should be determined by reference to his ancestor’s delinquencies is a proposition which doubtless enjoys a degree of biblical authority, but its grounding in any entitlement conception of justice seems less obvious” (1977, 152). Steiner’s argument in support of this position is that, contrary to Nozick’s view, bequests are fundamentally different to gifts inter vivos. Put simply, dead people do not exist, so cannot make gifts. Accordingly, the recipients of all bequests are to be taxed.

A third proposal is that people have rights to make and receive gifts, but not that these rights last for ever. More precisely, Allingham proposes a scheme that “adopts the position that each person has a moral right to make any gifts (inter vivos or by bequest) to any other person (which person has a moral right to receive such gifts), but that any gifts that are deemed to be re-gifted may be subject to taxation” (2014, 120). If the gifts a person makes are less than those he receives then the former are deemed to be re-gifted; if the gifts he makes are greater than those he receives then the latter are deemed to be re-gifted. Thus I may freely give to you anything that I have created or earned but not consumed, but if I pass on anything that I myself have been given then this may be taxed.
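The deeming rule just described reduces to taking the smaller of the gifts a person makes and the gifts he receives; a minimal sketch of that reading:

```python
# Allingham's re-gifting rule, as described above: whichever is smaller of
# the gifts a person makes and the gifts he receives is deemed re-gifted,
# and only that amount may be taxed.
def deemed_regifted(gifts_made, gifts_received):
    """If gifts made < gifts received, all gifts made are deemed re-gifted;
    otherwise all gifts received are.  Either way, this is the minimum."""
    return min(gifts_made, gifts_received)

# Someone who receives 100 and passes on 30 may be taxed on 30; someone
# who receives 30 but gives away 100 may also be taxed only on 30: the
# extra 70 came from his own earnings and passes tax-free.
```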

c. The Holding of Property

Interventions on the holding of property may be seen as falling into three classes. One seeks to impose taxes on land by virtue of the fact that it is God-given, one on all natural resources by virtue of the fact that they are natural, and one on all property by virtue of the fact that it is property.

The claim that land, by natural right, belongs to all, like the claim that a person belongs to himself, is made by Locke: “God … hath given the world to men in common” (1689, 2.5.26). The claim is developed by a number of nineteenth-century writers, and is most notably associated with George. As any improvements are not due to God it is only unimproved land, not developed land, which is relevant. In a typical contribution scheme proposed by Steiner, each “owner owes to the global fund a sum equal to the site’s rental value, that is, equal to the rental value of the site alone, exclusive of the value of any alterations in it wrought by labour” (1994, 272-273).

Land is not the only natural resource: what other property is to count is not clear. As Steiner notes, in any intervention scheme involving natural resources everything “turns on the isolation of what counts as ‘natural’” (1994, 277). There are many candidates. These, as summarized by Fried, include “gifts and bequests from the preceding generation; all traditional public goods (laws, police force, public works); the community’s physical productive capacity; and well-functioning markets” (2004, 85-86). Under these schemes all natural resources would be taxed in the same way as is land.

There are three possible justifications for taxing property per se: extending the concept of bequests; removing one of the incidents of ownership; and requiring a fee for protection. The first is based on a deemed lack of personal continuity over time: that “I tomorrow” am not the same person as “I today”. If this position is adopted then “I am holding property overnight” really means “I today” am bequeathing property to “I tomorrow”; the property is a bequest, not a gift inter vivos, as “I today” ceases to exist at midnight. The second involves limiting the rights of ownership in external objects, that is, acknowledging only less than full ownership, specifically by excluding the incident of the absence of term, that is, the condition that one’s rights to property do not expire. If this incident is excluded then I have no unencumbered right to continue my ownership of some property from today until tomorrow; if I do so, the state may legitimately require that I pay for that privilege. The third justification distinguishes between the rights to enjoy property and to hold it through time. The former does not involve the state in any way, other than in non-interference, but the latter may, through the need for protection. If the state is to provide this protection it may legitimately charge a fee for doing so, and this fee may take the form of a tax on the holding of property.

d. The Social Fund

As common ownership theories typically involve the imposition of taxes, they need to determine how the social fund created by these taxes is to be applied. One natural way to do this is to specify that the social fund be distributed to everyone in equal shares. As an alternative, Nozick, with respect to the case where the social fund is collected explicitly to rectify historical injustices, suggests that the fund be distributed in such a way that the end result is close to Rawls’s difference principle.

A radically different way of dividing the social fund would be to use it to compensate those with unchosen disadvantages, as would be justified, for example, by the argument that such disadvantages were morally arbitrary. There is, however, something perverse about any proposal to apply the social fund in a way that compensates for unchosen personal endowments when all means of collecting the taxes that form that fund have, because of an adherence to the self-ownership principle, ruled out taxing people on that basis. As Fried expresses it, “schemes, which judge the tax and transfer sides of fiscal policy by wholly different distributive criteria, seem morally incoherent” (2004, 90).

e. Summary

The strength of common ownership theories is that, as Fried puts it, they “have staked out a middle ground between the two dominant strains of contemporary political philosophy: the conventional libertarianism of those such as Robert Nozick on the right, and the egalitarianism of those such as Rawls, Dworkin, and Sen on the left” (2004, 67). However, the open question remains as to whether such theories are, in Fried’s terms, “just liberal egalitarianism in drag” (2004, 84).

6. Conclusions

As regards internal consistency, Dworkin’s equality of resources theory may have the greatest problems. Some of the problems with Dworkin’s auction construction may be avoided by adopting its outcome, of an equal wealth equilibrium, as a specification of justice in its own right. The insurance scheme, however, has more serious and unavoidable problems. The fundamental flaw is that shown by Roemer: that no Dworkinian scheme can satisfy four very weak consistency conditions, so that “resource egalitarianism is an incoherent notion”.

Rawlsian justice as fairness fares a little better, but, if it is to be grounded in choice from behind a veil of ignorance, has the serious flaws of that construction. Some of these can be avoided by inessential changes, but other problems are unavoidable, particularly those of identifying the least advantaged (with the related problems of defining primary goods and the construction of an index of these), and of the supposedly rational choice of the maximin principle with its “absurd practical implications”.

Common ownership theories, being diverse, are harder to assess as a group. Theories that involve interventions on the transfer of property have a variety of arbitrariness problems, and typically violate some aspect of the principle of free association. Those that involve interventions on the holding of property have, on the whole, some serious arbitrariness problems, particularly as regards the definition of property.

Nozickian entitlements theory may have the fewest problems of consistency. But although these problems may be few, they are not trivial, particularly those relating to justice in initial acquisition and to the rectification of past injustice.

It is not clear that it is useful, let alone possible, to identify some most satisfactory theory of justice, and thus identify some most appropriate point in the liberty-equality spectrum. Since self-ownership is a cornerstone of liberty, the problem is given specific focus in Cohen’s claim that “anyone who supports equality of condition must oppose (full) self-ownership, even in a world in which rights over external resources have been equalized” (1995, 72).

In an absolute sense, it seems hard to disagree with Cohen. There may, however, be some room for compromise. From one end of the spectrum, equality of resources moves in that direction, particularly in making Rawlsian egalitarianism more ambition-sensitive without at the same time making it more endowment-sensitive. From the other end, some versions of common ownership also move in that direction. This is particularly the case for versions that embody rectification of past injustice: as Nozick accepts, “although to introduce socialism as the punishment for our sins would be to go too far, past injustices might be so great as to make necessary in the short run a more extensive state in order to rectify them” (1974, 231).

If an accommodation is to be found, it will be found towards the centre of the liberty-equality spectrum, that is, in equality of resources or in common ownership theories. Given the greater internal problems of the former, the latter may prove to be the more fruitful. However, common ownership theories are diverse, so this does not provide a complete prescription. But as Nozick reminds us, “there is room for words on subjects other than last words” (1974, xii).

7. References and Further Reading

a. References

  • Allingham, M. (2014) Distributive Justice, London: Routledge.
  • Arrow, K. J. (2012) Social Choice and Individual Values (third edition), New Haven: Yale University Press.
  • Cohen, G. A. (1989) “On the currency of egalitarian justice”, Ethics, 99: 906-944.
  • Cohen, G. A. (1995) Self-Ownership, Freedom, and Equality, Cambridge: Cambridge University Press.
  • Dworkin, R. (1981) “What is equality? Part 2: equality of resources”, Philosophy & Public Affairs 10: 283-345.
  • Fried, B. (2004) “Left-libertarianism: a review essay”, Philosophy and Public Affairs, 32: 66–92.
  • Hammond, P. J. (1991) “Interpersonal comparisons of utility: why and how they are and should be made”, in Interpersonal Comparisons of Well-Being (editors J. Elster and J. E. Roemer) Cambridge: Cambridge University Press, 200-254.
  • Harsanyi, J. (1977) “Morality and the theory of rational behavior”, Social Research, 44; reprinted in Utilitarianism and Beyond (editors A. Sen and B. Williams) Cambridge: Cambridge University Press, 39-62.
  • Hume, D. (1751/1998) An Enquiry Concerning the Principles of Morals, edited by T. L. Beauchamp, Oxford: Oxford University Press.
  • Locke, J. (1689/1988) Two Treatises of Government, edited by P. Laslett, Cambridge: Cambridge University Press.
  • Nozick, R. (1974) Anarchy, State, and Utopia, Oxford: Blackwell.
  • Rawls, J. (1999) A Theory of Justice (revised edition), Oxford: Oxford University Press.
  • Rawls, J. (2005) Political Liberalism (expanded edition), New York: Columbia University Press.
  • Roemer, J. E. (1985) “Equality of talent”, Economics and Philosophy, 1: 151-187.
  • Roemer, J. E. (1996) Theories of Distributive Justice, Cambridge MA: Harvard University Press.
  • Sandel, M. J. (2009) Justice: What’s the Right Thing to Do?, London: Allen Lane.
  • Scanlon, T. (1991) “The moral basis of interpersonal comparisons”, in Interpersonal Comparisons of Well-Being (editors J. Elster and J. E. Roemer) Cambridge: Cambridge University Press, 17-44.
  • Steiner, H. (1977) “Justice and entitlement”, Ethics, 87: 150-152.
  • Steiner, H. (1994) An Essay on Rights, Cambridge, MA: Blackwell.

b. Further Reading

  • Overviews
  • Vallentyne, P. (2007) “Distributive justice”, in A Companion to Contemporary Political Philosophy (editors R. Goodin, P. Pettit, and T. Pogge), Oxford: Blackwell, 548-562.
  • Wellman, C. H. (2002) “Justice”, in The Blackwell Guide to Social and Political Philosophy (edited by R. L. Simon), Oxford: Blackwell.
  • Justice as fairness
  • Freeman, S. (editor) (2003) The Cambridge Companion to Rawls, Cambridge: Cambridge University Press.
  • Equality of resources
  • Brown, A. (2009) Ronald Dworkin’s Theory of Equality, London: Macmillan.
  • Entitlements
  • Bader R. M. and Meadowcroft J. (editors) (2011) The Cambridge Companion to Nozick’s Anarchy, State, and Utopia, Cambridge: Cambridge University Press.
  • Common ownership
  • Vallentyne, P. and Steiner, H. (editors) (2000) Left Libertarianism and Its Critics: The Contemporary Debate, Basingstoke: Palgrave.


Author Information

Michael Allingham
Email: michael.allingham@magd.ox.ac.uk
Oxford University
United Kingdom

Phenomenal Conservatism

Phenomenal Conservatism is a theory in epistemology that seeks, roughly, to ground justified beliefs in the way things “appear” or “seem” to the subject who holds a belief. The theory fits with an internalistic form of foundationalism—that is, the view that some beliefs are justified non-inferentially (not on the basis of other beliefs), and that the justification or lack of justification for a belief depends entirely upon the believer’s internal mental states. The intuitive idea is that it makes sense to assume that things are the way they seem, unless and until one has reasons for doubting this.

This idea has been invoked to explain, in particular, the justification for perceptual beliefs and the justification for moral beliefs. Some believe that it can be used to account for all epistemic justification. It has been claimed that the denial of Phenomenal Conservatism (PC) leaves one in a self-defeating position, that PC naturally emerges from paradigmatic internalist intuitions, and that PC provides the only simple and natural solution to the threat of philosophical skepticism. Critics have objected that appearances should not be trusted in the absence of positive, independent evidence that appearances are reliable; that the theory allows absurd beliefs to be justified for some subjects; that the theory allows irrational or unreliable cognitive states to provide justification for beliefs; and that the theory has implausible implications regarding when and to what degree inferences produce justification for beliefs.

Table of Contents

  1. Understanding Phenomenal Conservatism
    1. Species of Appearance
    2. Defeasibility
    3. Kinds of Justification
    4. Comparison to Doxastic Conservatism
  2. The Nature of Appearance
    1. The Belief that P
    2. The Disposition to Believe that P
    3. The Belief that One Has Evidence for P
    4. The Experience View
    5. Appearance versus Acquaintance
  3. Arguments for Phenomenal Conservatism
    1. Intuitive Internalist Motivation
    2. An Internal Coherence Argument
    3. The Self-Defeat Argument
    4. Avoiding Skepticism
    5. Simplicity
  4. Objections
    1. Crazy Appearances
    2. Metajustification
    3. Cognitive Penetration and Tainted Sources
    4. Inferential Justification
  5. Summary
  6. References and Further Reading

1. Understanding Phenomenal Conservatism

a. Species of Appearance

The following is a recent formulation of the central thesis of phenomenal conservatism:

PC If it seems to S that P, then, in the absence of defeaters, S thereby has at least some justification for believing that P (Huemer 2007, p. 30; compare Huemer 2001, p. 99).

The phrase “it seems to S that P” is commonly understood in a broad sense that includes perceptual, intellectual, memory, and introspective appearances. For instance, as I look at the squirrel sitting outside the window now, it seems to me that there is a squirrel there; this is an example of a perceptual appearance (more specifically, a visual appearance). When I think about the proposition that no completely blue object is simultaneously red, it seems to me that this proposition is true; this is an intellectual appearance (more specifically, an intuition). When I think about my most recent meal, I seem to remember eating a tomatillo cake; this is a mnemonic (memory) appearance. And when I think about my current mental state, it seems to me that I am slightly thirsty; this is an introspective appearance.

b. Defeasibility

Appearances sometimes fail to correspond to reality, as in the case of illusions, hallucinations, false memories, and mistaken intuitions. Most philosophers agree that logically, this could happen across the board – that is, the world as a whole could be radically different from the way it appears. These observations do not conflict with phenomenal conservatism. Phenomenal conservatives do not hold that appearances are an infallible source of information, or even that they are guaranteed to be generally reliable. Phenomenal conservatives simply hold that to assume things are the way they appear is a rational default position, which one should maintain unless and until grounds for doubt (“defeaters”) appear. This is the reason for the phrase “in the absence of defeaters” in the above formulation of PC (section 1a).

These defeaters may take two forms. First, there might be rebutting defeaters, that is, evidence that what appears to be the case is in fact false. For instance, one might see a stick that appears bent when half-submerged in water. But one might then feel the stick, and find that it feels straight. The straight feel of the stick would provide a rebutting defeater for the proposition that the stick is bent.

Second, there might be undercutting defeaters, that is, evidence that one’s appearance (whether it be true or false) is unreliable or otherwise defective as a source of information. For instance, suppose one learns that an object that appears red is in fact illuminated by red lights. The red lighting is not by itself evidence that the object isn’t also red; however, the red lighting means that the look of the object is not a reliable indicator of its true color. Hence, the information about the unusual lighting conditions provides an undercutting defeater for the proposition that the object is red.

c. Kinds of Justification

Epistemologists commonly draw a (misleadingly named) distinction between “propositional justification” and “doxastic justification”, where propositional justification is justification that one has for believing something (whether or not one in fact believes it) and doxastic justification is justification that an actual belief possesses. The distinction is commonly motivated by pointing out that a person might have good reasons to believe a proposition and yet not believe it for any of those reasons, but instead believe it for some bad reason. For instance, I might be in possession of powerful scientific evidence supporting the theory of evolution, and yet my belief in the theory of evolution might actually be based entirely upon trust in the testimony of my tarot card reader. In that case, I would be said to have “propositional justification” but not “doxastic justification” for the theory of evolution.

It is commonly held that to have doxastic justification for P, an individual must satisfy two conditions: first, the individual must have propositional justification for P; second, the individual must base a belief that P on that propositional justification (or whatever confers that propositional justification). If we accept this view, then the phenomenal conservative should hold (i) that the appearance that P gives one propositional justification, in the absence of defeaters, for believing that P, and (ii) that if one believes that P on the basis of such an undefeated appearance, one thereby has doxastic justification for P.

Phenomenal conservatism was originally advanced as an account of foundational, or noninferential, justification (Huemer 2001, chapter 5). That is, it was advanced to explain how a person may be justified in believing that P without basing the belief that P on any other beliefs. Some hold that a variation of phenomenal conservatism may also be used to account for inferential justification – that is, that even when a person believes that P on the basis of other beliefs, the belief that P is justified in virtue of appearances (especially the “inferential appearance” that in the light of certain premises, P must be or is likely to be true) (Huemer 2013b, pp. 338-41); this last suggestion, however, remains controversial even among those sympathetic to PC.

d. Comparison to Doxastic Conservatism

A related but distinct view, sometimes called “epistemic conservatism” but better labeled “doxastic conservatism”, holds that a person’s merely believing that P gives that person some justification for P, provided that the person has no grounds for doubting that belief (Swinburne 2001, p. 141). (Etymological note: the term “doxastic” derives from the Greek word for belief [doxa], while “phenomenal” derives from the Greek word for appearance [phainomenon].)

Doxastic conservatism is an unpopular view, as it seems to endorse circular reasoning, or something very close to it. A thought experiment due to Richard Foley (1983) illustrates the counterintuitiveness of doxastic conservatism: suppose that S has some evidence for P which is almost but not quite sufficient to justify P. Suppose that S forms the belief that P anyway. If doxastic conservatism is correct, it seems, then as soon as S formed this belief, it would immediately become justified, since in addition to the evidence S already had for P, S would now have his belief that P serving as a source of justification, which would push S over the threshold for justified belief.

The phenomenal conservative aims to avoid this sort of implausibility. PC does not endorse circular reasoning, since it does not hold that a belief (or any other mental state) may justify itself; it holds that an appearance may justify a belief. Provided that no appearance is a belief, this view avoids the most obviously objectionable form of circularity, and it avoids the Foley counterexample. Suppose that S has almost enough justification to believe that P, and then, in addition, S acquires an appearance that P. Assume also that S has no defeaters for a belief in P. In this case, it is not counterintuitive to hold that S would then be justified in believing that P.

2. The Nature of Appearance

Phenomenal conservatism ascribes justificatory significance to appearances. But what are appearances? Philosophers have taken a number of different views about the nature of appearances, and which view one takes may dramatically affect the plausibility of PC. In this section, we consider some views philosophers have taken about what it is for it to “seem to one that P.”

a. The Belief that P

Here is a very simple theory: to say that it seems to one that P is to report a tentative sort of belief that P (Chisholm [1957, ch. 4] suggested something in this neighborhood). This, however, is not how “seems” is understood by phenomenal conservatives when they state that if it seems to one that P and one lacks defeaters for P, then one has justification for P.

To motivate the distinction between its seeming to one that P and one’s believing that P, notice that in some cases, it seems to one that P even though one does not believe that P. For instance, when one experiences perceptual illusions, the illusions typically persist even when one learns that they are illusions. That is to say, things continue to appear a certain way even when one does not believe that things are as they appear, indeed, even when one knows that things are not as they appear. This shows that an appearance that P is not a belief that P.

b. The Disposition to Believe that P

Some thinkers suggest that an appearance that P might be identified with a mere inclination or disposition to believe that P (Sosa 1998, pp. 258-9; Swinburne 2001, pp. 141-2; Armstrong 1961). Typically, when it appears to one that P, one will be disposed to believe that P. However, one may be disposed to believe that P when it doesn’t seem to one that P. For instance, if one is inclined to believe that P merely because one wants P to be true, or because one thinks that a virtuous person would believe that P, this would not be a case in which it seems to one that P. Even in cases where it seems to one that P, its seeming to one that P is not to be identified with the disposition to believe that P, since one is disposed to believe that P because it seems to one that P, and not the other way around. Thus, its seeming to one that P is merely one possible ground for the disposition to believe that P.

c. The Belief that One Has Evidence for P

Some philosophers hold that its seeming to one that P is a matter of one’s believing, or being disposed to believe, that some mental state one has is evidence for P (Conee 2013; Tooley 2013). This would undermine the plausibility of PC, since it is not very plausible to think that one’s merely being disposed to believe (whether rightly or wrongly) that one has evidence for P actually gives one justification for believing P.

Fortunately, phenomenal conservatives can reasonably reject that sort of analysis, on grounds similar to those used to reject the idea that its seeming to one that P is just a matter of one’s being disposed to believe that P. Suppose that Jon is disposed to believe that he has evidence for the reality of life after death merely because Jon wants it to be true that he has evidence for life after death (a case of pure wishful thinking). This surely would not count as its seeming to Jon that there is life after death.

d. The Experience View

Most phenomenal conservatives hold that its seeming to one that P is a matter of one’s having a certain sort of experience, which has propositional content but is not analyzable in terms of belief (for discussion, see Tucker 2013, section 1). Sensory experiences, intellectual intuitions, (apparent) memories, and introspective states are either species of this broad type of experience, or else states that contain an appearance as a component.

Some philosophers have questioned this view of appearance, on the ground that intellectual intuitions, perceptual experiences, memories, and episodes of self-awareness are extremely different mental states that have nothing interesting in common (DePaul 2009, pp. 208-9).

In response, one can observe that intuitions, perceptual experiences, memories, and states of self-awareness are all mental states of a kind that naturally incline one to believe something (namely, the content of that very mental state, or, the thing that appears to one to be the case). And it is not merely that one is inclined to believe that proposition for some reason or other. We can distinguish many different reasons why one might be inclined to believe P: because one wants P to be true, because one thinks a good person would believe P, because one wants to fit in with the other people who believe P, because being a P-believer will annoy one’s parents . . . or because P just seems to one to be the case. When we reflect on these various ways of being disposed to believe P, we can see that the last one is interestingly different from all the others and forms a distinct (non-disjunctive) category. Admittedly, I have not just identified a new characteristic or set of characteristics that all and only appearances have in common; I have not defined “appearance”, and I do not believe it is possible to do so. What I have done, I hope, is simply to draw attention to the commonality among all appearances by contrasting appearances with various other things that tend to produce beliefs. When Jon believes [for all numbers x and y, x+y = y+x] because that proposition is intuitively obvious, and Mary believes [the cat is on the couch] because she seems to see the cat on the couch, these two situations are similar to each other in an interesting respect – which we see when we contrast both of those cases with cases such as that in which Sally thinks her son was wrongly convicted because Sally just cannot bear the thought that her son is a criminal (Huemer 2009, pp. 228-9).

e. Appearance versus Acquaintance

Appearances should be distinguished from another sort of non-doxastic mental state sometimes held to provide foundational justification for beliefs, namely, the state of acquaintance (Russell 1997, chs. 5, 9; Fumerton 1995, pp. 73-9). Acquaintance is a form of direct awareness of something. States of acquaintance differ from appearances in that the occurrence of an episode of acquaintance entails the existence of an object with which the subject is acquainted, whereas an appearance can occur without there being any object that appears. For example, if a person has a fully realistic hallucination of a pink rat, we can say that the person experiences an appearance of a pink rat, but we cannot say the person is acquainted with a pink rat, since there is no pink rat with which to be acquainted. In other words, an appearance is an internal mental representation, whereas acquaintance is a relation to some object.

3. Arguments for Phenomenal Conservatism

a. Intuitive Internalist Motivation

Richard Foley (1993) has advanced a plausible account of rationality, on which, roughly, it is rational for S to do A provided that, from S’s own point of view, doing A would seem to be a reasonably effective way of satisfying S’s goals. Foley goes on to suggest that epistemic rationality is rationality from the standpoint of the goal of now believing truths and avoiding falsehoods. Though Foley does not draw this consequence, his account of epistemic rationality lends support to PC, for if it seems to S that P is true and S lacks grounds for doubting P, then from S’s own point of view, believing P would naturally seem to be an effective way of furthering S’s goal of believing truths and avoiding falsehoods. Therefore, it seems, it would be epistemically rational for S to believe that P (Huemer 2001, pp. 103-4; compare McGrath 2013, section 1).

b. An Internal Coherence Argument

Internalism in epistemology is, roughly, the view that the justification or lack of justification of a belief is entirely a function of the internal mental states of the believer (for a fuller account, see Fumerton 1995, pp. 60-9). Externalism, by contrast, holds that a belief’s status as justified or unjustified sometimes depends upon factors outside the subject’s mind.

The following is one sort of argument for internalism and against externalism. Suppose that externalism is true, and that the justification of a belief depends upon some external factor, E. There could be two propositions, P and Q, that appear to one exactly alike in all epistemically relevant respects—for instance, P and Q appear equally true, equally justified, and equally supported by reliable belief-forming processes; however, it might be that P is justified and Q unjustified, because P but not Q possesses E. Since E is an external factor, this need have no impact whatsoever on how anything appears to the subject. If such a situation occurred, the externalist would presumably say that one ought to believe that P, while at the same time either denying Q or withholding judgment concerning Q.

But if one took this combination of attitudes, it seems that one could have no coherent understanding of what one was doing. Upon reflecting on one’s own state of mind, one would have to hold something like this: “P and Q seem to me equally correct, equally justified, and in every other respect equally worthy of belief. Nevertheless, while I believe P, I refuse to believe Q, for no apparent reason.” But this seems to be an irrational set of attitudes to hold. Therefore, we ought to reject the initial externalist assumption, namely, that the justificatory status of P and Q depends on E.

If one accepts this sort of motivation for internalism, then it is plausible to draw a further conclusion. Not only does the justificatory status of a belief depend upon the subject’s internal mental states; it depends, more specifically, on the subject’s appearances (that is, on how things seem to the subject). On this view, it is impossible for P and Q to seem the same to one in all relevant respects and yet for P to be justified and Q unjustified. This is best explained by something like PC (Huemer 2006).

c. The Self-Defeat Argument

One controversial argument claims that PC is the only theory of epistemic justification that is not self-defeating (Huemer 2007; Skene 2013). The first premise of this argument is that all relevant beliefs (all beliefs that are plausible candidates for being doxastically justified) are based on appearances. I think there is a table in front of me because it appears that way. I think three plus three is six because that seems true to me. And so on. There are cases of beliefs not based on how things seem, but these are not plausible candidates for justified beliefs to begin with. For instance, I might believe that there is life after death, not because this seems true but because I want it to be true (wishful thinking) – but this would not be a plausible candidate for a justified belief.

The second premise is that a belief is doxastically justified only if what it is based on is a source of propositional justification. Intuitively, my belief is justified only if I not only have justification for it but also believe it because of that justification.

From here, one can infer that unless appearances are a source of propositional justification, no belief is justified, including the belief that appearances are not a source of propositional justification. Therefore, to deny that appearances are a source of propositional justification would be self-defeating. Huemer (2007) interprets this to mean that the mere fact that something appears to one to be the case must (in the absence of defeaters) suffice to confer justification. Some critics maintain, however, that one need only hold that some appearances generate justification, allowing that perhaps other appearances fail to generate justification even in the absence of defeaters (BonJour 2004, p. 359).

A related objection holds that there may be “background conditions” for a belief’s justification – conditions that enable an appearance to provide justification for a belief but which are not themselves part of the belief’s justification. Thus, PC might be false, not because appearances fail to constitute a source of justification, but because they only do so in the presence of these background conditions, which PC neglects to mention. And these background conditions need not themselves be causally related to one’s belief in order for one’s belief to be doxastically justified. (For this objection, see Markie 2013, section 2; for a reply, see Huemer 2013b, section 4.)

Other critics hold that the first premise of the self-defeat argument is mistaken, because it often happens that one justifiedly believes some conclusion on the basis of an inference from other (justified) beliefs, where the conclusion of the inference does not itself seem true; hence, one can be justified in believing P without basing that belief on a seeming that P (Conee 2013, pp. 64-5). In reply, the first premise of the self-defeat argument need not be read as holding that the belief that P (in relevant cases) is always based on an appearance that P. It might be held that the belief that P (in relevant cases) is always based either on the appearance that P or on some ultimate premises which are themselves believed because they seem correct.

d. Avoiding Skepticism

Skeptics in epistemology maintain that we don’t know nearly as much as we think we do. There are a variety of forms of skepticism. For instance, external world skeptics hold that no one knows any contingent propositions about the external world (the world outside one’s own mind). These skeptics argue that to know anything about the external world, one would need to be able to figure out what the external world is likely to be like solely on the basis of facts about one’s own experiences, but that in fact nothing can be legitimately inferred about non-experiential reality solely from one’s own experiences (Hume 1975, section XII, part 1). Most epistemologists consider this conclusion to be implausible on its face, even absurd, so they have sought ways of rebutting the skeptic’s arguments. However, rebutting skeptical arguments has proved very difficult, and there is no generally accepted refutation of external world skepticism.

Another form of skepticism is moral skepticism, the view that no one knows any substantive evaluative propositions. On this view, no one ever knows that any action is wrong, that any event is good, that any person is vicious or virtuous. Again, this idea seems implausible on its face, but philosophers have found it difficult to explain how, in general, someone can know what is right, wrong, good, or bad. Skeptical views may also be held in a variety of other areas – skeptics may challenge our knowledge of the past, of other people’s minds, or of all things not presently observed. As a rule, epistemologists seek to avoid skeptical conclusions, yet it is often difficult to do so plausibly.

Enter phenomenal conservatism. Once one accepts something in the neighborhood of PC, most if not all skeptical worries are easily resolved. External world skepticism is addressed by noting that, when we have perceptual experiences, there seem to us to be external objects of various sorts around us. In the absence of defeaters, this is good reason to think there are in fact such objects (Huemer 2001). Moral skepticism is dealt with in a similarly straightforward manner. When we think about certain kinds of situations, our ethical intuitions show us what is right, wrong, good, or bad. For instance, when we think about pushing a man in front of a moving train, the action seems wrong. In the absence of defeaters, this is good enough reason to think that pushing the man in front of the train would be wrong (Huemer 2005). Similar observations apply to most if not all forms of skepticism. Thus, the ability to avoid skepticism, long considered an elusive desideratum of epistemological theories, is among the great theoretical advantages of phenomenal conservatism.

e. Simplicity

If we accept phenomenal conservatism, we have a single, simple principle to account for the justification of multiple very different kinds of belief, including perceptual beliefs, moral beliefs, mathematical beliefs, memory beliefs, beliefs about one’s own mind, beliefs about other minds, and so on. One may even be able to unify inferential and non-inferential justification (Huemer 2013b, pp. 338-41). To the extent that simplicity and unity are theoretical virtues, then, we have grounds for embracing PC. There is probably no other (plausible) theory that can account for so many justified beliefs in anything like such a simple manner.

4. Objections

a. Crazy Appearances

Some critics have worried that phenomenal conservatism commits us to saying that all sorts of crazy propositions could be non-inferentially justified. Suppose that when I see a certain walnut tree, it just seems to me that the tree was planted on April 24, 1914 (this example is from Markie 2005, p. 357). This seeming comes completely out of the blue, unrelated to anything else about my experience – there is no date-of-planting sign on the tree, for example; I am just suffering from a brain malfunction. If PC is true, then as long as I have no reason to doubt my experience, I have some justification for believing that the tree was planted on that date.

More ominously, suppose that it just seems to me that a certain religion is true, and that I should kill anyone who does not subscribe to the one true religion. I have no evidence either for or against these propositions other than that they just seem true to me (this example is from Tooley 2013, section 5.1.2). If PC is true, then I would be justified (to some degree) in thinking that I should kill everyone who fails to subscribe to the “true” religion. And perhaps I would then be morally justified in actually trying to kill these “infidels” (as Littlejohn [2011] worries).

Phenomenal conservatives are likely to bravely embrace the possibility of justified beliefs in “crazy” (to us) propositions, while adding a few comments to reduce the shock of doing so. To begin with, any actual person with anything like normal background knowledge and experience would in fact have defeaters for the beliefs mentioned in these examples (people can’t normally tell when a tree was planted by looking at it; there are many conflicting religions; religious beliefs tend to be determined by one’s upbringing; and so on).

We could try to imagine cases in which the subjects had no such background information. This, however, would render the scenarios even more strange than they already are. And this is a problem for two reasons. First, it is very difficult to vividly imagine these scenarios. Markie’s walnut tree scenario is particularly hard to imagine – what is it like to have an experience of a tree’s seeming to have been planted on April 24, 1914? Is it even possible for a human being to have such an experience? The difficulty of vividly imagining a scenario should undermine our confidence in any reported intuitions about that scenario.

The second problem is that our intuitions about strange scenarios may be influenced by what we reasonably believe about superficially similar but more realistic scenarios. We are particularly unlikely to have reliable intuitions about a scenario S when (i) we never encounter or think about S in normal life, (ii) S is superficially similar to another scenario, S’, which we encounter or think about quite a bit, and (iii) the correct judgment about S’ is different from the correct judgment about S. For instance, in the actual world, people who think they should kill infidels are highly irrational in general and extremely unjustified in that belief in particular. It is not hard to see how this would incline us to say that the characters in Tooley’s and Littlejohn’s examples are also irrational. That is, even if PC were true, it seems likely that a fair number of people would report the intuition that the hypothetical religious fanatics are unjustified.

A further observation relevant to the religious example is that the practical consequences of a belief may impact the degree of epistemic justification that one needs in order to be justified in acting on the belief, such that a belief with extremely serious practical consequences may call for a higher degree of justification and a stronger effort at investigation than would be the case for a belief with less serious consequences. PC only speaks of one’s having some justification for believing P; it does not entail that this is a sufficient degree of justification for taking action based on P.

b. Metajustification

Some argue that its merely seeming to one that P cannot suffice (even in the absence of defeaters) to confer justification for believing P; in addition, one must have some reason for thinking that one’s appearances are reliable indicators of the truth, or that things that appear to one to be the case are likely to actually be the case (BonJour 2004, pp. 357-60; Steup 2013). Otherwise, one would have to regard it as at best an accident that one managed to get to the truth regarding whether P. We can refer to this alleged requirement on justified belief as the “metajustification requirement”. (When one has an alleged justification for P, a “metajustification” is a justification for thinking that one’s alleged justification for P actually renders P likely to be true [BonJour 1985, p. 9].)

While perhaps superficially plausible, the metajustification requirement threatens us with skepticism. To begin with, if we think that appearance-based justifications require metajustifications (to wit, evidence that appearances are reliable indicators of the truth), it is unclear why we should not impose the same requirement on all justifications of any kind. That is, where someone claims that belief in P is justified because of some state of affairs X, we could always demand a justification for thinking that X – whatever it is – is a reliable indicator of the truth of P. And suppose X’ explains why we are justified in thinking that X is a reliable indicator of the truth of P. Then we’ll need a reason for thinking that X’ is a reliable indicator of X’s being a reliable indicator of the truth of P. And so on, ad infinitum.

One can avoid this sort of infinite regress by rejecting any general metajustification requirement. The phenomenal conservative will most likely want to maintain that one need not have positive grounds for thinking one’s appearances to be reliable; one is simply entitled to rely upon them unless and until one acquires grounds for doubting that they are reliable.

c. Cognitive Penetration and Tainted Sources

Another class of objection to PC adverts to cases of appearances that are produced by emotions, desires, irrational beliefs, or other kinds of sources that would normally render a belief unjustified (Markie 2006, pp. 119-20; Lyons 2011; Siegel 2013; McGrath 2013). That is, where a belief produced by a particular source X would be unjustified, the objector contends that an appearance produced by X should not be counted as conferring justification either (even if the subject does not know that the appearance has this source).

Suppose, for instance, that Jill, for no good reason, thinks that Jack is angry (this example is from Siegel 2013). This is an unjustified belief. If Jill infers further conclusions from the belief that Jack is angry, these conclusions will also be unjustified. But now suppose that Jill’s belief that Jack is angry causes Jill to see Jack’s facial expression as one of anger. This “seeing as” is not a belief but a kind of experience – that is, Jack’s face just looks to Jill like an angry face. This is, however, a misinterpretation on Jill’s part, and an ordinary observer, without any preexisting beliefs about Jack’s emotional state, would not see Jack as looking angry. But Jill is not aware that her perception has been influenced by her belief in this way, nor has she any other defeaters for the proposition that Jack is angry. If PC is true, Jill will now have justification for believing that Jack is angry, arising directly from the mere appearance of Jack’s being angry. Some find this result counter-intuitive, since it allows an initially unjustified belief to indirectly generate justification for itself.

Phenomenal conservatives try to explain away this intuition. Skene (2013, section 5.1) suggests that the objectors may confuse the evaluation of the belief with that of the person who holds the belief in the sort of example described above, and that the person should be adjudged irrational but her belief judged rational. Tucker (2010, p. 540) suggests that the person possesses justification but lacks another requirement for knowledge and is epistemically blameworthy (compare Huemer 2013a, pp. 747-8). Huemer (2013b, pp. 343-5) argues that the subject has a justified belief in this sort of case by appealing to an analogy involving a subject who has a hallucination caused (unbeknownst to the subject) by the subject’s own prior action.

d. Inferential Justification

Suppose S bases a belief in some proposition P on (his belief in) some evidence E. Suppose that the inference from E to P is fallacious, such that E in fact provides no support at all for P (E neither entails P nor raises the probability of P). S, however, incorrectly perceives E as supporting P, and thus, S’s belief in E makes it seem to S that P must be true as well. (It does not independently seem to S that P is true; it just seems to S that P must be true given E.) Finally, assume that S has no reason for thinking that the inference is fallacious, even though it is, nor has S any other defeaters for P. It seems that such a scenario is possible. If so, one can raise the following objection to PC:

1. In the described scenario, S is not justified in believing P.

2. If PC is true, then in this scenario, S is justified in believing P.

3. So PC is false.

Many would accept premise (1), holding that an inferential belief is unjustified whenever the inference on which the belief is based is fallacious. (2) is true, since in the described scenario, it seems to S that P, while S has no defeaters for P. (For an objection along these lines, see Tooley 2013, p. 323.)

One possible response to this objection would be to restrict the principle of phenomenal conservatism to the case of non-inferential beliefs and to hold a different view (perhaps some variation on PC) of the conditions for inferential beliefs to be justified.

Another alternative is to maintain that in fact, fallacious inferences can result in justified belief. Of course, if a person has reason to believe that the inference on which he bases a given belief is fallacious, then this will constitute a defeater for that belief. It is consistent with phenomenal conservatism that the belief will be unjustified in this case. So the only cases that might pose a problem are those in which a subject makes an inference that is in fact fallacious but that seems perfectly good to him, and he has no reason to suspect that the inference is fallacious or otherwise defective. In such a case, one could argue that the subject rationally ought to accept the conclusion. If the subject refused to accept the conclusion, how could he rationally explain this refusal? He could not cite the fact that the inference is fallacious, nor could he point to any relevant defect in the inference, since by stipulation, as far as he can tell the inference is perfectly good. Given this, it would seem irrational for the subject not to accept the conclusion (Huemer 2013b, p. 339).

Here is another proposed condition on doxastic justification: if S believes P on the basis of E, then S is justified in believing P only if S is justified in believing E. This condition is very widely accepted. But again, PC seems to flout this requirement, since all that is needed is for S’s belief in E to cause it to seem to S that P (while S lacks defeaters for P), which might happen even if S’s belief in E is unjustified (McGrath 2013, section 5; Markie 2013, section 2).

A phenomenal conservative might try to avoid this sort of counterexample by claiming that whenever S believes P on the basis of E and E is unjustified, S has a defeater for P. This might be true because (i) per epistemological internalism, whenever E is unjustified, the subject has justification for believing that E is unjustified, (ii) whenever S’s belief that P is based on E, the subject has justification for believing that his belief that P is based on E, and (iii) the fact that one’s belief that P is based on an unjustified premise would be an undercutting defeater for the belief that P.

Alternately, and perhaps more naturally, the phenomenal conservative might again restrict the scope of PC to non-inferential beliefs, while holding a different (but perhaps closely related) view about the justification of inferential beliefs (McGrath 2013, section 5; Tooley 2013, section 5.2.1). For instance, one might think that in the case of a non-inferential belief, justification requires only that the belief’s content seem true and that the subject lack defeaters for the belief; but that in the case of an inferential belief, justification requires that the premise be justifiedly believed, that the premise seem to support the conclusion, and that the subject lack defeaters for the conclusion (Huemer 2013b, p. 338).

5. Summary

Among the most central, fundamental questions of epistemology is that of what, in general, justifies a belief. Phenomenal Conservatism is among the major theoretical answers to this question: at bottom, beliefs are justified by “appearances,” which are a special type of experience one reports when one says “it seems to me that P” or “it appears to me that P.” This position is widely viewed as possessing important theoretical virtues, including the ability to offer a very simple account of many kinds of justified belief while avoiding troublesome forms of philosophical skepticism. Some proponents lay claim to more controversial advantages for the theory, such as the unique ability to avoid self-defeat and to accommodate central internalist intuitions.

The theory remains controversial among epistemologists for a variety of reasons. Some harbor doubts about the reality of a special type of experience called an “appearance.” Others believe that an appearance cannot provide justification unless one first has independent evidence of the reliability of one’s appearances. Others cite alleged counterexamples in which appearances have irrational or otherwise unreliable sources. And others object that phenomenal conservatism seems to flout widely accepted necessary conditions for inferential justification.

6. References and Further Reading

  • Armstrong, David. 1961. Perception and the Physical World. London: Routledge & Kegan Paul.
  • BonJour, Laurence. 1985. The Structure of Empirical Knowledge. Cambridge: Harvard University Press.
  • BonJour, Laurence. 2004. “In Search of Direct Realism.” Philosophy and Phenomenological Research 69, 349-367.
    • Early objections to phenomenal conservatism.
  • Brogaard, Berit. 2013. “Phenomenal Seemings and Sensible Dogmatism.” In Chris Tucker (ed.), Seemings and Justification: New Essays on Dogmatism and Phenomenal Conservatism (pp. 270-289). Oxford: Oxford University Press.
    • Objections to phenomenal conservatism.
  • Chisholm, Roderick. 1957. Perceiving: A Philosophical Study. Ithaca: Cornell University Press.
    • Chapter 4 offers a widely cited discussion of three uses of “appears” and related terms.
  • Conee, Earl. 2013. “Seeming Evidence.” In Chris Tucker (ed.), Seemings and Justification: New Essays on Dogmatism and Phenomenal Conservatism (pp. 52-68). Oxford: Oxford University Press.
    • Objections to phenomenal conservatism.
  • Cullison, Andrew. 2010. “What Are Seemings?” Ratio 23, 260-274.
  • DePaul, Michael. 2009. “Phenomenal Conservatism and Self-Defeat.” Philosophy and Phenomenological Research 78, 205-212.
    • Objections to phenomenal conservatism, especially the self-defeat argument.
  • DePoe, John. 2011. “Defeating the Self-defeat Argument for Phenomenal Conservativism.” Philosophical Studies 152, 347-359.
    • Objections to phenomenal conservatism, especially the self-defeat argument.
  • Foley, Richard. 1983. “Epistemic Conservatism.” Philosophical Studies 43, 165-182.
    • Objections to doxastic conservatism.
  • Foley, Richard. 1993. Working without a Net. New York: Oxford University Press.
  • Fumerton, Richard. 1995. Metaepistemology and Skepticism. Lanham: Rowman & Littlefield.
  • Hanna, Nathan. 2011. “Against Phenomenal Conservatism.” Acta Analytica 26, 213-221.
    • Objections to phenomenal conservatism.
  • Huemer, Michael. 2001. Skepticism and the Veil of Perception. Lanham: Rowman & Littlefield.
    • Chapter 5 defends phenomenal conservatism and contains a version of the self-defeat argument. This is the original source of the term “phenomenal conservatism.”
  • Huemer, Michael. 2005. Ethical Intuitionism. New York: Palgrave Macmillan.
    • Chapter 5 uses phenomenal conservatism to explain moral knowledge.
  • Huemer, Michael. 2006. “Phenomenal Conservatism and the Internalist Intuition.” American Philosophical Quarterly 43, 147-158.
    • Defends phenomenal conservatism using internalist intuitions.
  • Huemer, Michael. 2007. “Compassionate Phenomenal Conservatism.” Philosophy and Phenomenological Research 74, 30-55.
    • Defends phenomenal conservatism using the self-defeat argument. Responds to BonJour 2004.
  • Huemer, Michael. 2009. “Apology of a Modest Intuitionist.” Philosophy and Phenomenological Research 78, 222-236.
    • Responds to DePaul 2009.
  • Huemer, Michael. 2013a. “Epistemological Asymmetries Between Belief and Experience.” Philosophical Studies 162, 741-748.
    • Responds to Siegel 2013.
  • Huemer, Michael. 2013b. “Phenomenal Conservatism Uber Alles.” In Chris Tucker (ed.), Seemings and Justification: New Essays on Dogmatism and Phenomenal Conservatism (pp. 328-350). Oxford: Oxford University Press.
    • Responds to several critiques of phenomenal conservatism found in the same volume.
  • Hume, David. 1975. “An Enquiry Concerning Human Understanding.” In L. A. Selby-Bigge (ed.), Enquiries Concerning Human Understanding and Concerning the Principles of Morals. Oxford: Clarendon.
  • Littlejohn, Clayton. 2011. “Defeating Phenomenal Conservatism.” Analytic Philosophy 52, 35-48.
    • Argues that PC may lead one to endorse terrorism and cannibalism.
  • Lycan, William. 2013. “Phenomenal Conservatism and the Principle of Credulity.” In Chris Tucker (ed.), Seemings and Justification: New Essays on Dogmatism and Phenomenal Conservatism (pp. 293-305). Oxford: Oxford University Press.
  • Lyons, Jack. 2011. “Circularity, Reliability, and the Cognitive Penetrability of Perception.” Philosophical Issues 21, 289-311.
  • Markie, Peter. 2005. “The Mystery of Direct Perceptual Justification.” Philosophical Studies 126, 347-373.
    • Objections to phenomenal conservatism.
  • Markie, Peter. 2006. “Epistemically Appropriate Perceptual Belief.” Noûs 40, 118-142.
    • Objections to phenomenal conservatism.
  • Markie, Peter. 2013. “Searching for True Dogmatism.” In Chris Tucker (ed.), Seemings and Justification: New Essays on Dogmatism and Phenomenal Conservatism (pp. 248-268). Oxford: Oxford University Press.
    • Objections to phenomenal conservatism.
  • McGrath, Matthew. 2013. “Phenomenal Conservatism and Cognitive Penetration: The ‘Bad Basis’ Counterexamples.” In Chris Tucker (ed.), Seemings and Justification: New Essays on Dogmatism and Phenomenal Conservatism (pp. 225-247). Oxford: Oxford University Press.
    • Uses the cognitive penetration counterexamples to motivate a modification of phenomenal conservatism.
  • Russell, Bertrand. 1997. The Problems of Philosophy. New York: Oxford University Press.
  • Siegel, Susanna. 2013. “The Epistemic Impact of the Etiology of Experience.” Philosophical Studies 162, 697-722.
    • Criticizes phenomenal conservatism and related views using the tainted source objection.
  • Skene, Matthew. 2013. “Seemings and the Possibility of Epistemic Justification.” Philosophical Studies 163, 539-559.
    • Defends the self-defeat argument for phenomenal conservatism and offers an account of why epistemic justification must derive from appearances.
  • Sosa, Ernest. 1998. “Minimal Intuition.” In Michael DePaul and William Ramsey (eds.), Rethinking Intuition (pp. 257-270). Lanham: Rowman & Littlefield.
  • Steup, Matthias. 2013. “Does Phenomenal Conservatism Solve Internalism’s Dilemma?” In Chris Tucker (ed.), Seemings and Justification: New Essays on Dogmatism and Phenomenal Conservatism (pp. 135-153). Oxford: Oxford University Press.
  • Swinburne, Richard. 2001. Epistemic Justification. Oxford: Oxford University Press.
  • Tolhurst, William. 1998. “Seemings.” American Philosophical Quarterly 35, 293-302.
    • Discusses the nature of seemings.
  • Tooley, Michael. 2013. “Michael Huemer and the Principle of Phenomenal Conservatism.” In Chris Tucker (ed.), Seemings and Justification: New Essays on Dogmatism and Phenomenal Conservatism (pp. 306-327). Oxford: Oxford University Press.
    • Objections to phenomenal conservatism.
  • Tucker, Chris. 2010. “Why Open-Minded People Should Endorse Dogmatism.” Philosophical Perspectives 24, 529-545.
    • Defends phenomenal conservatism, appealing to its explanatory power.
  • Tucker, Chris. 2013. “Seemings and Justification: An Introduction.” In Chris Tucker (ed.), Seemings and Justification: New Essays on Dogmatism and Phenomenal Conservatism (pp. 1-29). Oxford: Oxford University Press.

 

Author Information

Michael Huemer
Email: owl232@earthlink.net
University of Colorado
U. S. A.

Francis Hutcheson (1694-1745)

Francis Hutcheson was an eighteenth-century Scottish philosopher whose meticulous writings and activities influenced life in Scotland, Great Britain, Europe, and even the newly formed North American colonies. For historians and political scientists, the emphasis has been on his theories of liberalism and political rights; for philosophers and psychologists, Hutcheson’s importance comes from his theories of human nature, which include an account of an innate care and concern for others and of the internal senses (including the moral sense). The latter were pivotal to the Scottish Enlightenment’s empirical aesthetics, and all of Hutcheson’s theories were important to moral sentimentalism. One cannot properly study the works of Adam Smith, Hutcheson’s most famous student, or David Hume’s moral and political theories, without first understanding Hutcheson’s contributions and influence.

Popular and widely read in his day, Hutcheson is now enjoying a resurgence, specifically among libertarians and contemporary moral psychologists and philosophers. The latter are taking another and more in-depth look at Hutcheson and the rest of the sentimentalists because present-day empirical studies seem to support many of their claims about human nature. This is not surprising, because the philosophical theories of the Scottish Enlightenment were based on human observations and experiences, much of which would be considered psychology today.

Hutcheson’s writings concentrate on human nature, in part as an attempt to defend Shaftesbury against the attacks of Bernard Mandeville. Hutcheson also promoted a natural benevolence against the egoism of Thomas Hobbes and against the reward/punishment view of Samuel Pufendorf by appealing to our own experiences of ourselves and others.

What follows is an overview of Hutcheson’s life, works and influence, with special attention paid to his writings on aesthetics, morality, and the importance of the internal senses of beauty, harmony, and the moral sense.

Table of Contents

  1. Life
  2. Internal Senses
  3. Moral Sense Faculty
    1. Operations of moral sense faculty
    2. Sense vs. Reason
    3. Basis of Moral Determinations
  4. Benevolence: Response to Hobbes and Pufendorf
  5. Influences on Hume and Smith
  6. References and Further Reading
    1. Works by Hutcheson
      1. Collected Works and Correspondence
    2. Secondary Readings

1. Life

Francis Hutcheson was born to Scottish parents on August 8, 1694 in Ireland. Though remembered primarily as a philosopher, he was also a Presbyterian minister, as were his father and grandfather before him. After attending the University of Glasgow in Scotland, beginning in 1711, he returned to Dublin in 1716. Rather than taking a ministry position, he was asked to start an academy in Dublin, and it was here that he wrote his most influential works. At this time he also married Mary Wilson, with whom he had one son, Francis. He was eventually appointed professor and chair of Moral Philosophy at the University of Glasgow in 1729, following the death of his mentor and teacher, Gershom Carmichael.

Hutcheson was a popular lecturer, perhaps because he was the first professor to lecture in English rather than the commonly used Latin, and also because, possibly influenced by his preaching experience, he was more animated than was typical of an eighteenth-century academic. Throughout his career he retained a commitment to the liberal arts, as his thoughts and theories were always connected to the ancient traditions, especially those of Aristotle and Cicero. His writings were respected even before his Glasgow appointment, and this reputation continued throughout his lifetime. His most influential pieces, first published anonymously in Dublin, were An Inquiry into the Original of Our Ideas of Beauty and Virtue (1725) and An Essay on the Nature and Conduct of the Passions and Affections, with Illustrations of the Moral Sense (1728). Hutcheson’s moral theory was influenced most by Lord Shaftesbury, while his aesthetics were in many ways influenced by, and a response to, John Locke’s distinction between primary and secondary qualities. Those who read and were influenced by Hutcheson’s theories included David Hume and Adam Smith, his student at Glasgow, while Thomas Reid and Immanuel Kant both cited Hutcheson in their writings.

Francis Hutcheson died in 1745, after 16 years at Glasgow, while on a visit to Ireland, where he is buried. After his death, his son and namesake published another edition of Hutcheson’s Illustrations on the Moral Sense in 1746 and, in 1755, A System of Moral Philosophy, a text written specifically for college students.

2. Internal Senses

Though Shaftesbury could be called the father of modern aesthetics, it was Hutcheson’s thorough treatment of the internal senses in the Inquiry—especially of beauty, grandeur, harmony, novelty, order and design—that moved the focus of study from rational explanations to the sensations. For Hutcheson the perception of beauty does depend on the external sense of sight; the internal sense of beauty, however, operates as an internal or reflex sense. The same is the case with hearing: hearing music does not necessarily give the perception of harmony, which is distinct from the hearing itself (Inquiry I, 1, 10). Yet the internal senses are genuine senses because, like the external senses, they yield immediate perceptions: no knowledge of cause or advantage is needed to receive the idea of beauty. Both the external and internal senses are characterized by a passive and involuntary nature, and the internal senses are a source of pleasure and pain. With a nod to Locke’s primary and secondary qualities (Inquiry I, 1, 7), Hutcheson described the perception of beauty and harmony in terms of simple and complex ideas. Without the internal sense of beauty there is no perception of it: “This superior power of perception is justly called a sense, because of its affinity to the other senses in this, that the pleasure does not arise from any knowledge of principles, proportions, causes, or of the usefulness of the object; but strikes us at first with the idea of beauty: nor does the most accurate knowledge increase this pleasure of beauty, however it may super-add a distinct rational pleasure from prospects of advantage, or from the increase of knowledge” (Inquiry I, 1, 8).

The perception of beauty, though excited by an object, is not possible without this internal sense of beauty. There is a specific type of absolute beauty, and there are figures that excite this idea of beauty. We experience this when recognizing what Hutcheson calls “uniformity amidst variety” (Inquiry I, 2, 3). This happens with both mathematical and natural objects, which, although multifaceted and complex, are perceived with a distinct uniformity. The proportions of an animal or human figure can also excite and touch the internal sense as absolute beauty. Imitative beauty, on the other hand, is perceived in comparison to something else or as an imitation in art, poetry, or even in an idea. The comparison is what excites this sense of beauty, even when the original being imitated is not singularly beautiful.

Hutcheson wondered why there should be any question about whether there are internal senses since they, like the external ones, are prominent in our own experience. Perhaps one reason the internal senses are questioned more than the external is that there are no common names for them such as ‘hearing’ and ‘seeing’ (Inquiry I, 6, 9). There is no easy way to describe the sense that feels beauty, yet we all experience it in the presence of beauty. Though this internal sense can be influenced by knowledge and experience, it is not consciously controlled and is involuntary. Moving aesthetics away from logic and mathematical truths does not make it any less real or any less important for our pleasure, as felt in the appreciation and experience of beauty and harmony. The internal senses also include the moral sense, so called by Shaftesbury and developed thoroughly by Hutcheson.

3. Moral Sense Faculty

a. Operations of the Moral Sense Faculty

Hutcheson, like Shaftesbury, claimed moral judgments were made in the human faculty that Shaftesbury called a moral sense. Both believed human nature contained all it needed to make moral decisions, along with inclinations to be moral.

The process, as Hutcheson described it, begins with a feeling of pleasure or advantage felt in the moral sense faculty—an advantage not necessarily to us but to someone, or to everyone generally. This perception of pleasure has a specific moral flavor and causes us to feel moral approbation. We feel this pleasure when considering what is good or beneficial to others, as part of our natural instinct of benevolence. The things pursued for this pleasure are wanted because of our self-love and our interest in the good of others. So first there is a sense of pleasure; then there is the interest in what causes the pleasure. From there, experience or reason can tell us which objects have given, and may continue to give, us pleasure or advantage (Hutcheson 1725, 70). For Hutcheson, the moral sense thus described is from God—implanted, not like innate ideas, but as an innate sense of pleasure in objects that are not necessarily to our advantage, and in nobler pleasures like caring for others or the appreciation of harmony (Hutcheson 1725, I.VIII, 83).

Evaluating what is good or not—what we morally approve or disapprove of—is done by this moral sense. The moral sense is not the basis of moral decisions or the justification of our disapproval, as the rationalists claim; it is better explained as the faculty with which we feel the value of an action. It does not justify our evaluation; it gives us our evaluation. The moral faculty gives us a sense of valuing—not a feeling in the emotional sense, which would be something like sadness or joy. There is feeling, but it is a valuing kind of feeling.

Like the other internal senses of beauty and harmony, the moral sense is one people are born with. We know this because we experience moral feelings of approbation and disapprobation. We do not choose to make moral approvals or disapprovals; they simply happen to us, and we feel the approvals when they occur. Hutcheson put it this way: “approbation is not what we can voluntarily bring upon ourselves” (Hutcheson 1728, I. 412). He continued that although it is a pleasurable experience to approve of actions, we cannot simply approve of anything or anyone whenever we want to. Hutcheson gives illustrations of this: for instance, people do not “approve as virtuous the eating a bunch of grapes, taking a glass of wine, or sitting down when tired” (ibid.). The point is that the moral approvals and disapprovals made by our moral sense are specific in nature and operate only when there is an action that can appropriately be judged by our moral sense (ibid.). Another way to make this point is to compare the moral sense to the olfactory sense. I can put my nose to the ceramic cup in front of me, but my nose will not smell anything if there is nothing to smell. The moral sense operates when an idea touches it, just as a nose smells when an odor reaches it. No odor, no smell; no moral issue, no moral sentiment. For Hutcheson, the moral sense is engaged when the agent reflects on an action, or a spectator observes it, in reference to the action’s circumstances, specifically those whom it affects (Hutcheson 1728, I. 408). So when an action has consequences for others, it is more likely to awaken our moral sensibility.

Reasoning and information can change the evaluation of the moral sense, but no amount of reasoning can or does precede the moral sense in regard to its approval of what is for the public good. Reason does, however, inform the moral sense, as discussed below. The moral sense approves of the good for others. This concern for others by the moral sense is what is natural to humankind, Hutcheson contended. Reason gives content to the moral sense, informing it of what is good for others and the public good (Hutcheson 1728, I. 411).

Some may think Hutcheson a utilitarian, and certainly no thorough accounting of historical utilitarianism is complete without mention of him. Consider the following statement from Hutcheson: “In the same manner, the moral evil, or vice, is as the degree of misery, and number of sufferers; so that, that action is best, which procures the greatest happiness for the greatest numbers; and that, worst, which, in like manner, occasions, misery.” Preceding this, though, is the phrase, “…we are led by our moral sense of virtue to judge thus…” (Inquiry II, 3, 8). So it is our moral sense that evaluates goodness and evil, and it does seem to evaluate much as a utilitarian would; but it is not bound by the utilitarian rule—moral sense evaluations are normatively privileged and prior to moral rules of any kind.

In Illustrations upon the Moral Sense (1728), Hutcheson gives definitions of both the approbation of our own actions and those of others. Approbation of our own action is given when we are pleased with ourselves for doing the action and/or pleased with our intentions for doing the action. Hutcheson puts it this way: “[A]pprobation of our own action denotes, or is attended with, a pleasure in the contemplation of it, and in reflection upon the affections which inclined us to it” (I. 403). Consider what happens when someone picks up and returns something that another person drops. In response to the action, the person who picked up the dropped item would have feelings of approbation toward their own action. This person would be happy with what they did, especially after giving it some thought. Further, they would be pleased if their own intentions were ones with which they could also be pleased. The intention could possibly be that they just wanted to help this person; however, if the intention was to gain advantage with the other person, then they would not be as pleased with themselves. Approbation of another’s action is much the same except that the observer is pleased to witness the action of the other person and feels affection toward the agent of the action. Again Hutcheson:

[A]pprobation of the action of another has some little pleasure in attending it in the observer, and raises love toward the agent, in whom the quality approved is deemed to reside, and not in the observer, who has a satisfaction in the act of approving (Hutcheson 1728, 403).

There is a distinction, Hutcheson claimed, between choosing to do an action or wanting someone else to do an action and our approbation of the action. According to Hutcheson, we often act in ways we disapprove of (ibid. 403). All I have to think of is the extra cookie I have just consumed: upon reflection I am not pleased with my choice; I disapprove of eating the cookie.

b. Sense vs. Reason

In response to the difficulty philosophers seem to have in understanding the separate operations of sensing—done by the moral sense—and intellectual reasoning, Hutcheson referred to the ancients—a common element in his writing—and their division of the soul between the will (desires, appetites, ‘sensus’) and the intellect. Philosophers who think reasons motivate and/or judge have conflated the will with the intellect (Hutcheson 1728, 405). In this same discussion, Hutcheson, borrowing from Aristotle, explained that reason and the intellect help determine how to reach an end or goal. Yet the desire for that goal is the job of the will. The will is moved by the desire for that end, which, of course, for Aristotle was happiness (ibid. I. 405-6).

There has to be a desire for the will to choose something. Something is chosen because it is seen as a possible fulfillment of a human desire. For Hutcheson, there is a natural instinct and desire for the good of others. Without this natural desire, Hutcheson claimed, no one would care whether an action benefits or harms one person or many. Information about the dangers of an action may be sound and true, yet without the instinct to care about those who would be benefited or harmed, the information would not move our passions (ibid. I. 406-7). The only reason to care about a natural disaster 1,000 miles away, where we do not know anyone and are not affected even indirectly, is that we care about others in general and do not wish harm on them. A person can only want something if the desire for it is connected to, or understood to be satisfying, a certain natural instinct or affection (ibid. I. 404). This instinct or desire for the welfare of others is what influences our moral sense to approve or disapprove of an action.

Reasons and discussions that excite and motivate presuppose instincts and affections (ibid.). To be moved means there is an instinct that is moved. Consider a different type of instinct like one’s instinct for happiness. Hutcheson explained it this way: “[T]here is an instinct or desire fixed in his nature, determining him to pursue his happiness: but it is not this reflection on his own nature, or this [some] proposition which excites or determines him, but the instinct itself” (ibid. I. 406). It is not the proposition that a certain act will produce lots of money that excites a person, but rather the instinct toward happiness and the belief that money will bring the desired happiness. So reasoning that leads a person to believe that money will bring happiness presupposes an instinct that values happiness. Reasons that justify or explain something as being moral or immoral presuppose a moral sense (ibid. 404). If there are reasons for something and those reasons are considered, a moral sense must exist that cares about and utilizes the information.

Hutcheson thought one reason for the confusion about, and opposition to, the idea that moral judgment comes from one’s instincts or affections is the violent, passionate behavior observed in people, which would not be effective in a moral evaluator. Yet Hutcheson was not claiming that these passions and out-of-control desires are the source of moral judgment; it is “the calm desire or affection which employs our reason freely…” (ibid. IV. 413). Also, for Hutcheson, “the most perfect virtue consists in the calm, impassionate benevolence, rather than in particular affection” (ibid.). So not only are the moral passions calm, they naturally respond positively to behaviors that benefit the public good. Hutcheson did not claim that this should be the case—his is not the normative claim utilitarianism makes; rather, he argued that his experiences and moral sense find this to be the case.

To the criticism that a person’s moral sense might itself be judged good or evil, Hutcheson replied that this was not possible. He compared judging the moral sense as good or evil to calling the “power of tasting, sweet or bitter; or of seeing, strait or crooked, white or black” (ibid. I. 409). So a person cannot have a morally evil moral sense, even if this person disagrees with another. Hutcheson granted that people can and do differ in taste, and that the moral sense can be silenced or ignored (ibid. 410). He contended, however, that these differences in taste and evaluation do not indicate evil in the moral sense itself.

Hutcheson did address the issue of uniformity in moral sentiments by asking whether we can know that others will also approve of what we approve (ibid. IV. 414). Though there is no certainty of agreement, the moral sense, as natural to humankind, is largely uniform. Hutcheson added that God approves of benevolence and kindness, and so he created human nature with the capability to make the same types of approvals; this is done by the moral sense. Our moral sense naturally, according to Hutcheson, approves of kindness and caring for others, and unless there is a prejudiced view of whether the action is truly kind and publicly useful, it is improbable that a person would judge incorrectly (ibid.). So, yes, there is sometimes disagreement, but the disagreement is not rooted in self-interest.

c. Basis of Moral Determinations

For Hutcheson, the foundation of our moral determinations is not self-love. What is basic to morality is our inclination toward benevolence—an integral part of our moral evaluations, which is more fully examined in the following section. In response to the Hobbesian doctrine of egoism as advanced by authors like Bernard Mandeville, Hutcheson set out to prove the existence of natural feelings like benevolence, in order to show that not every action is performed out of self-interest. Although the following quote demonstrates that Hutcheson worried our natural benevolence could become entangled with our selfish nature, he hoped people could realize that natural benevolence, once freed, makes a nobler character possible and allows us to understand and encourage what is best for everyone:

Let the misery of excessive selfishness, and all its passions, be but once explain’d, that so self-love may cease to counteract our natural propensity to benevolence, and when this noble disposition gets loose from these bonds of ignorance, and false views of interest, it shall be assisted even by self-love, and grow strong enough to make a noble virtuous character. Then he is to enquire, by reflection upon human affairs, what course of action does most effectually promote the universal good… (Hutcheson 1725, VII. 155).

However, even when selfishness drowns out our benevolent instincts, our moral sense still operates in response to what is good for others.

Hutcheson’s moral sense theory helped circumvent the problems that stem from a strict doctrine of egoism. He claimed that it is natural for us to want good things for others. When someone’s moral sense operates and they judge an action as morally wrong, the moral sense is not why they feel the wrongness; it is how they feel it. It is like an applause meter registering the morality expressed in the sentiment: “I morally disapprove of that.” That statement reports the deliverance of the moral sense as an opinion of morality, moving from a feeling to an idea. Yet if the moral sense faculty works the way Hutcheson describes, there must be an innate benevolence, and that case is made by Hutcheson.

4. Benevolence: Response to Hobbes and Pufendorf

Hutcheson’s arguments for an instinctual benevolence appear in both Reflections on the Common Systems of Morality (1724) and the Inaugural Lecture on the Social Nature of Man (1730), both found in Francis Hutcheson: On Human Nature (Mautner 1993). In these texts Hutcheson responds to both Thomas Hobbes and Samuel Pufendorf, arguing that from our own experiences we can see that there are, in fact, disinterested motivations common in humankind. Hutcheson specifically claims that the term ‘state of nature’ as used by Hobbes and Pufendorf creates a misunderstanding of what is actually present in human nature. The actual ‘state of nature,’ for Hutcheson, includes the benevolence he claimed as instinctual to humankind. The particular claim of Pufendorf’s that concerned Hutcheson was that people would not be virtuous unless they believed in divine punishment and reward (Mautner 1993, 18). This is not unlike Hobbes, who claimed that without civil authority, life for humankind would be “solitary, poor, nasty, brutish and short” (Hobbes 1651, 13.8). For both Hobbes and Pufendorf, the ‘state of nature’ is unappealing and full of egoistic, defensive protections against others. In opposition, Hutcheson claims that the nature of humankind as created by God includes a natural instinct for benevolence. Hutcheson considered the state of nature as described by Hobbes and Pufendorf to be an uncultivated state (Hutcheson 1730, 132). He described the cultivated state as one in which a person’s mind is actively learning and developing. These cultivated persons are, for Hutcheson, truly following their own nature as designed by God. In this cultivated state, persons take care of themselves and want all of humankind to be safe and sound (Hutcheson 1730, 133). Hutcheson would have preferred that Hobbes and Pufendorf had used a term other than ‘state of nature’—perhaps ‘state of freedom’—to describe the uncultivated state.
This may seem like an unimportant distinction, but consider it for a moment: if humankind is naturally as Hobbes and Pufendorf described, then they need to be forced to develop in cooperative ways, which would be against their nature. If humankind were by nature caring of others, as Hutcheson proposed, then individuals would not need to be forced to cooperate.

Beyond the label ‘state of nature,’ Hutcheson had other objections to the negative characterization of humankind offered by Pufendorf and Hobbes. Surely we experience other aspects of people that are not cruel or selfish. We also experience in ourselves a caring and a concern for others. Hutcheson wondered why Hobbes and Pufendorf gave no attention or acknowledgement to people’s natural propensity and:

kind instinct [s] to associate; of natural affections, of compassion, of love of company, a sense of gratitude, a determination to honour and love the authors of any good offices toward any part of mankind, as well as of those toward our selves… (Hutcheson 1724, 100).

These characteristics, for Hutcheson, are certainly a part of what we experience in ourselves and in others. We reach out to people for friendship and are impressed and grateful to people who kindly help others as well as ourselves.

Hutcheson also added that human beings naturally care what others think of them. He described this characteristic, observed in others and experienced in ourselves, as “a natural delight men take in being esteemed and honoured by others for good actions…” These characteristics, which “all may be observed to prevail exceedingly in humane life,” are ones we witness daily in people, yet they are ignored and therefore unaccounted for by Hobbes and Pufendorf (Hutcheson 1724, 100-1). Here Hutcheson took care to describe his own experiences, and those of others for whom caring about people is not uncommon; yet these characteristics are missing from the Hobbesian model of humankind. And it is not a meek or quiet instinct: “we shall find one of the greatest springs of their [men in general] actions to be love toward others…a strong delight in being honoured by others for kind actions…” (Hutcheson 1724, 101). Along with his disagreement with the Hobbesian characterization of humankind, Hutcheson also discussed whether all human action comes from self-interest, arguing against psychological egoism. Hutcheson acknowledged that it is to everyone’s advantage to form cooperative units and that this interdependence is necessary for mankind’s survival (Hutcheson 1730, 134-5). This view agrees partially with what is referred to as prudentialism, as discussed by Hobbes and Pufendorf. Prudentialism is the theory that all cooperation and sociability come from a self-interested motive. On this view, people make friends or are kind because they know that in the long run the effort will benefit their projects and survival—it is prudent at least to feign care for others. Where Hutcheson disagreed with Hobbes and Pufendorf was over the claim that self-interest is the only motive for social life and caring for others.
Hutcheson claimed that human beings have other natural affections and appetites “immediately implanted by nature, which are not directed towards physical pleasures or advantage but towards certain higher things which in themselves depend on associating with others” (Hutcheson 1730, 135).

Hutcheson could not imagine a rational creature sufficiently satisfied or happy in a state that did not include love and friendship with others. He allowed that such a person could have all the pleasant sensations of the external senses along with “the perceptions of beauty, order, harmony.” But that would not be enough (ibid. V. 144). When discussing the pleasures of wealth and other external pleasures, Hutcheson connected their enjoyment with our experiences and involvement with others. For Hutcheson, even in an imaginary state of wealth, we include others. He asked whether these kinds of ideas of wealth do not always include “some moral enjoyments of society, some communication of pleasure, something of love, of friendship, of esteem, of gratitude” (ibid. VI. 147). Hutcheson asked more directly, “Who ever pretended to a taste of these pleasures without society?” (ibid. VI. 147). So even in our imagination, while enjoying great wealth and material success, we are doing so in the company of others.

There is one further, minor disagreement between Hobbes and Hutcheson, over what is considered funny, specifically what makes us laugh. Though it occupies only small sections of Hobbes’ Human Nature (9. 13) and Leviathan (I. 6. 42), Hobbes’ claim that infirmity causes laughter was addressed by Hutcheson in “Thoughts [Reflections] on Laughter and Observations on ‘The Fable of the Bees.’” In this collection of six letters, Hutcheson also addresses his disagreements with Mandeville. These letters, though not well known today, may well have been influential essays when they were originally published in the Dublin Journal. They are also an excellent illustration of Hutcheson’s skills in argumentation.

5. Influences on Hume and Smith 

The moral sentimentalist theories of David Hume and Adam Smith were able to move past the Hobbesian view of human nature because both men considered Hutcheson to have handily defeated Hobbes’ argument. Hume does not take on Hobbes directly, explaining that “[m]any able philosophers have shown the insufficiency of these systems” (EPM, Appendix 2.6.17). Without Hutcheson’s successful argument for natural benevolence in human nature, Hume’s and Smith’s moral theories would not have been feasible, because an innate care and concern for others and for society is basic to both of their theories.

As a professor at the University of Glasgow, Hutcheson taught Smith, and his writings influenced both Smith and Hume by setting the empirical and psychological tone for their moral theories. Hutcheson set up Hume’s moral theory in three ways in particular. First, he argued—as far as Hume was concerned, successfully—against the view that humankind is completely self-interested. Second, he described the mechanism of the internal moral sense that generates moral sentiments (although Hume’s description differed slightly, the mechanism in Hume’s account has many of the same characteristics). Third, in connection with these two themes, he made an argument for a naturally occurring instinct of benevolence in humankind. It was with these three Hutchesonian themes that Hume and Smith began articulating their respective moral theories.

It is impossible to know exactly how much Smith was influenced by Hutcheson, but many of Smith’s theories, especially concerning government regulation, property rights and unalienable rights, certainly resemble those espoused by Hutcheson. These were all addressed in the second treatise of the Inquiry (sections V-VII), where Hutcheson aligns naturally occurring benevolence with feelings of honor, shame and pity, and with the evaluations of the moral sense—and also explains the way benevolence affects human affairs and the happiness of others. Smith’s ideas in the Wealth of Nations align with Hutcheson’s on such issues as the division of labor and the correspondence between the amount and difficulty of labor and its value. Smith was also influenced by Hutcheson’s discussion of the cost of goods as dependent on the difficulty of acquiring them plus the demand for them (System II. 10. 7). Also of note in the same chapter is an insightful account of the use of coinage, gold and silver in the exchange of goods and of the role of government in the use of coins. Overall, Hutcheson’s timely and meticulous attention to these kinds of social, economic and political details was instrumental not only to Smith’s development but also to that of the American colonies. The latter could have resulted specifically from Hutcheson’s A Short Introduction to Moral Philosophy being translated from Latin into English and used at American universities such as Yale.

6. References and Further Reading

a. Works by Hutcheson

  • Hutcheson, Francis. 1724. Reflections on the Common Systems of Morality. In Francis Hutcheson: On Human Nature, ed. Thomas Mautner, 1993. 96-106. Cambridge: Cambridge University Press.
  • Hutcheson, Francis. Philosophical Writings, ed. R. S. Downie. Everyman’s Library. 1994. London: Orion Publishing Group.
  • Hutcheson’s Writings (selection) ed. John McHugh in the Library of Scottish Philosophy series ed. Gordon Graham. Forthcoming 2014
  • Hutcheson, Francis. 1725. An Inquiry Concerning the Original of Our Ideas of Virtue or Moral Good. Selections reprinted in British Moralists, ed. L. A. Selby-Bigge, 1964. 69-177. Indianapolis: Bobbs-Merrill.
  • Hutcheson, Francis. 1728. Illustrations upon the Moral Sense. Selections reprinted in British Moralists, ed. L. A. Selby-Bigge. 1964. 403-418. Indianapolis: Bobbs-Merrill.
  • Hutcheson, Francis. 1730. Inaugural Lecture on the Social Nature of Man. In Francis Hutcheson: On Human Nature, ed. Thomas Mautner. 1993. 124-147. Cambridge: Cambridge University Press.
  • Hutcheson, Francis. 1742. An Essay on the Nature and Conduct of the Passions and Affections. Selections reprinted in British Moralists, ed. L. A. Selby-Bigge. 1964. 392-402. Indianapolis: Bobbs-Merrill.
  • Hutcheson, Francis. 1755. A System of Moral Philosophy. Selection reprinted in British Moralists, ed. L. A. Selby-Bigge. 1964. 419-425. Indianapolis: Bobbs-Merrill.

i. Collected Works and Correspondence

  • Liberty Fund Natural Law and Enlightenment series: General Editor, Knud Haakonssen. Liberty Fund, Indianapolis, Indiana U.S.A.
  • 1725 An Inquiry into the Original of Our Ideas of Beauty and Virtue. 2004
  • 1742 An Essay on the Nature and Conduct of the Passions and Affections, with the Illustrations on the Moral Sense. 2002
  • 1742 Logic, Metaphysics, and the Natural Sociability of Mankind. 2006
  • 1745 (Translated into English 1747) Philosophiae Moralis Institutio Compendiaria with A Short Introduction to Moral Philosophy. ed. Luigi Turco. 2007
  • 1755 Meditations of the Emperor Marcus Aurelius Antoninus. 2008
  • 1729 “Thoughts on Laughter and Observations on ‘The Fable of the Bees’” in The Correspondence and Occasional Writings of Francis Hutcheson  2014

b. Secondary Readings

  • Berry, Christopher J. 2003. “Sociality and Socialization.” The Cambridge Companion to the Scottish Enlightenment, ed. Alexander Broadie, Cambridge University Press.
  • Blackstone, William T. 1965. Francis Hutcheson & Contemporary Ethical Theory. University of Georgia Press.
  • Broadie, Alexander, ed. 2003. The Cambridge Companion to the Scottish Enlightenment. Cambridge University Press.
  • Brown, Michael. 2002. Francis Hutcheson in Dublin 1719-1730: The Crucible of his Thought. Four Courts Press.
  • Carey, Daniel. 1999. Hutcheson. In The Dictionary of Eighteenth-Century British Philosophers, eds. John Yolton, John Valdimir Price, and John Stephens. Two volumes. Vol. II: 453-460. Bristol, England: Thoemmes Press.
  • D’Arms, Justin and Daniel Jacobson. 2000. Sentiment and Value. In Ethics 110 (July): 722-748. The University of Chicago.
  • Daniels, Norman and Keith Lehrer. Eds. 1998. Philosophical Ethics. Dimensions of Philosophy Series. Boulder, Colorado: Westview Press.
  • Darwall, Stephen. 1995. The British Moralists and the Internal ‘Ought’ 1640 – 1740. Cambridge: Cambridge University Press.
  • Darwall, Stephen, Allan Gibbard, and Peter Railton, eds. 1997. Moral Discourse and Practice. Oxford University Press.
  • Emmanuel, Steven, ed. 2001. The Blackwell Guide to the Modern Philosophers. Massachusetts: Blackwell Press.
  • Gill, Michael. 1996. Fantastic Associations and Addictive General Rules: A fundamental difference between Hutcheson and Hume. Hume Studies vol. XXII, no. 1 (April): 23-48.
  • Graham, Gordon. 2001. Morality and Feeling in the Scottish Enlightenment. Philosophy. Volume 76.
  • Haakonssen, Knud. 1996. Natural Law and Moral Philosophy: From Grotius to the Scottish Enlightenment. Cambridge: Cambridge University Press.
  • Haakonssen, Knud. 1998. Adam Smith. Aldershot, England: Dartmouth Publishing Company Limited and Ashgate Publishing Limited.
  • Harman, Gilbert. 2000. Explaining Value. Oxford: Clarendon Press.
  • Herman, Arthur. 2002. How the Scots Invented the Modern World: The True Story of How Western Europe’s Poorest Nation Created Our World. Broadway Books.
  • Hope, Vincent. 1989. Virtues by Consensus: The Moral Philosophy of Hutcheson, Hume, and Adam Smith. Oxford University Press.
  • Hobbes, Thomas. 1651. Leviathan, ed. Edwin Curley. 1994. Indiana, USA: Hackett Press.
  • Hobbes, Thomas. 1651. Human Nature: or the Fundamental Elements of Policy. In British Moralists, ed. D.D. Raphael, 1991. Pp. 3-17. Indiana USA: Hackett Press.
  • Hume, David. 1740. A Treatise of Human Nature, eds. L. A. Selby-Bigge and P. H. Nidditch. second edition, 1978. Oxford: Clarendon Press.
  • Hume, David. 1751. Enquiries Concerning the Human Understanding and Concerning the Principles of Morals, eds. L. A. Selby-Bigge and P. H. Nidditch. Revised third edition, 1975. Oxford: Clarendon Press.
  • LaFollette, Hugh, ed. 2000. The Blackwell Guide to Ethical Theory. Massachusetts: Blackwell Publishers.
  • LaFollette, Hugh. 1991. The truth in Ethical Relativism. Journal of Social Philosophy. 146-54.
  • Kivy, Peter. 2003. The Seventh Sense: A Study of Francis Hutcheson’s Aesthetics and Its Influence in Eighteenth-Century Britain. 2nd edition. New York: Franklin.
  • Mackie, J. L. 1998. The Subjectivity of Values. In Ethical Theories, third edition, ed. Louis Pojman. 518-537. Wadsworth Publishing.
  • Mautner, Thomas, ed. 1993. Francis Hutcheson: On Human Nature. Cambridge University Press.
  • McDowell, John. 1997. Projection and Truth in Ethics. In Moral Discourse and Practice, eds. Stephen Darwall, Allan Gibbard, Peter Railton. Chapter 12: 215-225. Oxford University Press.
  • McNaughton, David. 1999. Shaftesbury. In The Dictionary of Eighteenth-Century British Philosophers, eds. John Yolton, John Valdimir Price, and John Stephens. Two volumes. Vol.1: 781-788. Bristol, England: Thoemmes Press.
  • Mercer, Philip. 1972. Sympathy and Ethics. Oxford: Clarendon Press.
  • Mercer, Philip. 1995. “Hume’s concept of sympathy.” Ethics, Passions, Sympathy, ‘Is’ and ‘Ought.’  David Hume: Critical Assessments. Volume IV: 437 – 60. London and New York: Routledge Press.
  • Moore, James. 1990. “The Two Systems of Francis Hutcheson: On the Origins of the Scottish Enlightenment,” in Studies in the Philosophy of the Scottish Enlightenment, ed. M.A. Stewart. Pp. 37-59. Oxford: Clarendon Press.
  • Moore, James. 1995. “Hume and Hutcheson.” Hume and Hume’s Connections, eds. M. A. Stewart and James P. Wright. 23-57. The Pennsylvania State University Press.
  • Price, John Valdimir. 1999. “Hume.” The dictionary of eighteenth-century British philosophers, eds. John Yolton, John Valdimir Price, and John Stephens. Two volumes. Volume II: 440-446. Bristol, England: Thoemmes Press.
  • Russell, Paul. 1995. Freedom and Moral Sentiments, Oxford University Press.
  • Schneewind, J. B. 1990. Moral Philosophy from Montagne to Kant: An Anthology. Volumes I and II. Cambridge University Press.
  • Schneider, Louis. 1967. The Scottish Moralists: On Human Nature and Society. Phoenix Books, University of Chicago.
  • Scott, William Robert. 1900. Francis Hutcheson, His Life, Teaching and Position in the History of Philosophy. Cambridge: Cambridge University Press, Reprint 1966 New York: A. M. Kelley.
  • Strasser, Mark. 1990. Francis Hutcheson’s Moral Theory. Wakefield, New Hampshire: Longwood Academic.
  • Strasser, Mark. 1991-2. “Hutcheson on Aesthetic Perception.” Philosophia 21: 107-18
  • Stewart, M. A. and Wright, John P., eds. 1995. Hume and Hume’s Connections. The Pennsylvania State University Press.
  • Taylor, W. L. 1965. Francis Hutcheson and David Hume as Predecessors of Adam Smith. Duke University Press.
  • Turco, Luigi. 2003. “Moral Sense and the Foundations of Morals.” The Cambridge Companion to the Scottish Enlightenment, ed. Alexander Broadie, Cambridge University Press.
  • Yolton, John, John Valdimir Price, and John Stephens, eds.1999. The Dictionary of Eighteenth-Century British Philosophers, Two volumes. Bristol, England: Thoemmes Press.

 

Author Information

Phyllis Vandenberg
Email: vandenbp@gvsu.edu
Grand Valley State University
U. S. A.

and

Abigail DeHart
Email: dehartab@mail.gvsu.edu
Grand Valley State University
U. S. A.

Emergence

If we were pressed to give a definition of emergence, we could say that a property is emergent if it is a novel property of a system or an entity, one that arises when that system or entity reaches a certain level of complexity and that, even though it exists only insofar as the system or entity exists, is distinct from the properties of the parts from which it emerges. However, as will become apparent, things are not so simple, because “emergence” is a term used in different ways both in science and in philosophy, and how it is to be defined is a substantive question in itself.

The term “emergence” comes from the Latin verb emergo, which means to arise, to rise up, to come up or to come forth. The term was coined by G. H. Lewes, who, in Problems of Life and Mind (1875), drew the distinction between emergent and resultant effects.

Effects are resultant if they can be calculated by the mere addition or subtraction of causes operating together, as with the weight of an object, which can be calculated merely by adding the weights of the parts that make it up. Effects are emergent if they cannot be thus calculated, because they are qualitatively novel compared to the causes from which they emerge. For Lewes, examples of such emergent effects are mental properties, which emerge from neural processes yet are not properties of the parts of those processes. In Lewes’ work, three essential features of emergence are laid out. First, emergentism is a theory about the structure of the natural world and, consequently, has ramifications concerning the unity of science. Second, emergence is a relation between the properties of an entity and the properties of its parts. Third, the question of emergence is related to the question of the possibility of reduction. These three features will structure this article’s discussion of emergence.

Table of Contents

  1. The British Emergentists
    1. J. S. Mill
    2. Samuel Alexander
    3. C. Lloyd Morgan
    4. C. D. Broad
  2. Later Emergentism
    1. Kinds of Emergence
      1. Strong and Weak Emergence
        1. Strong Emergence: Novelty as Irreducibility and Downward Causation
        2. Weak Emergence: Novelty as Unpredictability
      2. Synchronic and Diachronic Emergence
    2. Emergence and Supervenience
  3. Objections to Emergentism
    1. The Supervenience Argument
    2. Do Cases of Genuine (Strong) Emergence Exist?
  4. References and Further Reading

1. The British Emergentists

The group of emergentists that Brian McLaughlin (1992) has dubbed the “British emergentists” were the first to make emergence the core of a comprehensive philosophical position, in the second half of the nineteenth century and the beginning of the twentieth century. A central question at that time was whether life, mind and chemical bonding could be given a physical explanation and, by extension, whether special sciences such as psychology and biology were reducible to more “basic” sciences and, eventually, to physics. Views were divided between the reductionist mechanists and the anti-reductionist vitalists. The mechanists claimed that the properties of an organism are resultant properties that can be fully explained, actually or in principle, in terms of the properties and relations of its parts. The vitalists claimed that organic matter differs fundamentally from inorganic matter and that what accounts for the properties of living organisms is not the arrangement of their constitutive physical and chemical parts, but some sort of entelechy or spirit. In this debate the emergentists proposed a middle way on which, against the mechanists, the whole is more than just the sum and arrangement of its parts yet, against the vitalists, nothing is added to it “from the outside”—that is, there is no need to posit any mysterious intervening entelechy to explain irreducible emergent properties.

Though the views of the British emergentists differ in their details, we can generally say they were monists regarding objects or substances, inasmuch as they held that the world is made of fundamentally one kind of thing, matter. However, they also held that at different levels of organization and complexity matter exhibits different properties that are novel relative to the lower levels of organization from which they emerged, and this makes the emergentist view one of property dualism (or pluralism). It should also be noted that the British emergentists identified their view as a naturalist position, firstly because whether something is emergent or not is to be established or rejected by empirical evidence alone, and secondly because no extra-natural powers, entelechies, souls and so forth figure in emergentist explanations. The main texts of this tradition are J. S. Mill’s System of Logic, Samuel Alexander’s Space, Time and Deity, C. Lloyd Morgan’s Emergent Evolution and C. D. Broad’s The Mind and its Place in Nature. Beyond these figures, traditional brands of emergentism can be found in the work of R. W. Sellars (1922), A. Lovejoy (1927), Roger Sperry (1980, 1991), Karl Popper and John Eccles (1977) and Michael Polanyi (1968).

a. J. S. Mill

Though he did not use the term ‘emergence,’ it was Mill’s System of Logic (1843) that marked the beginning of British emergentism.

Mill distinguished between two modes of what he called “the conjoint action of causes,” the mechanical and the chemical. In the mechanical mode the effect of a group of causes is nothing more than the sum of the effects that each individual cause would have were it acting alone. Mill calls the principle according to which the whole effect is the sum of the effects of its parts the “principle of composition of causes” and illustrates it by reference to the vector sum of forces. The effects thus produced in the mechanical mode are called “homopathic effects” and they are subject to causal “homopathic laws.” Mill contrasts the mechanical mode with the chemical mode, in which the principle of composition of causes does not hold. In the chemical mode causal effects are not additive but, instead, “heteropathic,” which means that the conjoint effect of different causes differs from the sum of the effects the causes would have in isolation. The paradigmatic examples of such effects were, for Mill, the products of chemical reactions, which have different properties and effects than those of the individual reactants. Take, for example, a typical substitution reaction:

Zn + 2HCl → ZnCl2 + H2.

In such a reaction zinc reacts with hydrogen chloride and replaces the hydrogen in the latter to produce effects that are more than just the sum of the parts that came together at the beginning of the reaction. The newly formed zinc chloride has properties that neither zinc nor hydrogen chloride possess separately.

Mill’s heteropathic effects are the equivalent of Lewes’ emergent effects, whereas homopathic effects are the equivalent of Lewes’ resultants. Heteropathic effects are subject, according to Mill, to causal “heteropathic” laws which, though novel relative to the laws of the levels from which they emerged, do not counteract them. Such laws are found in the special sciences such as chemistry, biology and psychology.
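Mill’s contrast can be made concrete with a small sketch. The following Python fragment is purely illustrative (the function name is ours, not Mill’s): it computes a homopathic effect, the vector sum of forces, by mere addition, which is exactly the operation that fails for heteropathic effects.

```python
# Illustrative sketch of Mill's "composition of causes" (function
# name is ours, not Mill's): a homopathic (resultant) effect is
# computed by simple addition, here the vector sum of forces.
def resultant_force(forces):
    """Component-wise vector sum: in Mill's mechanical mode the
    joint effect of several causes is just the sum of the effects
    each would have acting alone."""
    return tuple(sum(components) for components in zip(*forces))

f1 = (3.0, 0.0)  # 3 N along x
f2 = (0.0, 4.0)  # 4 N along y
print(resultant_force([f1, f2]))  # (3.0, 4.0)

# No analogous "summing" over the properties of Zn and HCl yields
# the properties of ZnCl2; that failure of additivity is what marks
# Mill's chemical mode.
```

No comparable additive rule takes one from the properties of the reactants to the properties of zinc chloride, which is precisely why Mill classes chemical effects as heteropathic.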

b. Samuel Alexander

In Space, Time and Deity (1920), Samuel Alexander built a complex metaphysical system that has been subject to a number of different interpretations. As we shall see, Alexander in effect talks of different levels of explanation as opposed to the more robust ontological emergence we find in the works of the other British emergentists.

According to Alexander, all processes are physico-chemical processes, but as their complexity increases they give rise to emergent qualities that are distinctive of the new complex configurations. These are subject to special laws that are treated by autonomous special sciences, which give higher-order explanations of the behavior of complex configurations. One kind of such emergent qualities is mental qualities (others are biological and chemical qualities). Since for Alexander all processes are physico-chemical processes, mental processes are identical to neural processes. However, Alexander claims that mental qualities are distinctive of higher-order configurations. Furthermore, Alexander claims, mental qualities are not epiphenomenal. A neural process that lost its mental qualities would not be the same process, because it is in virtue of its mental qualities that the “nervous”—neural—process has the character and effects that it has. So though emergent qualities are co-instantiated with a physico-chemical process in a single instance, they are distinct from that process due to their novel causal powers.

Alexander also holds that emergent qualities and their behavior cannot be deduced, even by a Laplacean calculator, from knowledge of the qualities and laws of the lower—physiological—order. To be precise, though a Laplacean calculator could predict all physical processes (and hence all mental processes, since mental processes are physical processes), he would not be able to predict the emergent qualities of those events, because their configuration, though being in its entirety physico-chemical, exhibits behavior different in kind from that with which the physico-chemical sciences are concerned, and this behavior is, in turn, captured by emergent laws. Hence the emergence of such qualities should be taken as a brute empirical fact that can be given no explanation and should be accepted with “natural piety”. However, it should be noted here that Alexander leaves open the possibility that, if chemical properties were to be reduced without residue to physico-chemical processes, then they would not be emergent, and he adds that the same holds for mental properties.

c. C. Lloyd Morgan

In Emergent Evolution (1923) (and subsequently in Life, Spirit and Mind [1926] and The Emergence of Novelty [1933]) the biologist C. Lloyd Morgan introduced the notion of emergence into the account of evolution, maintaining that in the course of evolution new properties and behaviors emerge (like life, mind and reflective thought) that cannot be predicted from the already existing entities from which they emerged. Building on Mill and Lewes, Morgan cites as the paradigmatic case of an emergent phenomenon the products of chemical reactions, which are novel and unpredictable. These novel properties, moreover, are not merely epiphenomenal but bring about “a new kind of relatedness”—new lawful connections—that affects the “manner of go” of lower-level events in a way that would not occur had they been absent. Thus emergent properties are causally autonomous and have downward causal powers.

d. C. D. Broad

The last major work in the British emergentist tradition and, arguably, the historical foundation of contemporary discussions of emergence in philosophy, was C. D. Broad’s The Mind and its Place in Nature (1925).

Broad identified three possible answers to the question of how the properties of a complex system are related to the properties of its parts: the “component theory” of the vitalists, the reductive answer of the mechanists, and the emergentist view that the behavior of the whole cannot in principle be deduced from knowledge of the parts and their arrangement. From this latter view—Broad’s own—it follows that, contrary to the mechanist’s view of the world as homogeneous throughout, reality is structured in aggregates of different orders. Different orders in this sense exhibit different organizational complexity, and the kinds that make up each order are made up of the kinds to be found in lower orders. This lack of unity is, in turn, reflected in the sciences, where there is a hierarchy with physics at the lowest order and then, ascending, chemistry, biology and psychology—the subject matter of each being properties of a given order that are irreducible to properties of lower orders. According to Broad these different orders are subject to different kinds of laws: trans-ordinal laws that connect properties of adjacent orders and intra-ordinal laws that hold between properties within the same order. Trans-ordinal laws, Broad writes, cannot be deduced from intra-ordinal laws together with principles that connect the vocabularies of the two orders between which they hold; trans-ordinal laws are irreducible to intra-ordinal laws and, as such, are fundamental emergent laws—they are metaphysical brute facts.

Broad considered the question of whether a trans-ordinal law is emergent to be an empirical one. Though he considered the behavior of all chemical compounds irreducible and thus emergent, he admitted, like Alexander, that if it were one day reduced to the physical characteristics of the chemical compound’s components it would then not count as emergent. However, unlike Alexander, he did not think the same possible concerning the phenomenal experiences that “pure”—secondary—qualities of objects cause in us. Broad calls trans-ordinal laws that hold between physical properties and secondary qualities “trans-physical laws”. Though he is willing to grant that we may mistakenly consider some trans-ordinal laws to be emergent purely on the basis of our incomplete knowledge, trans-physical laws are necessarily emergent—we could never have formed the concept of blue, no matter how much knowledge we had of colors, unless we had experienced it. Broad puts forward an a priori argument to this effect that can be seen as a precursor of the knowledge argument against physicalism. These qualities, he says, could not have been predicted even by a “mathematical archangel” who knows everything there is to know about the structure and working of the physical world and can perform any mathematical calculation—they are in principle irreducible, only inductively predictable and hence emergent.

In this we see that Broad’s emergentism concerning the phenomenal experience of secondary qualities is not epistemological (as is sometimes suggested by his writings) but is a consequence of an ontological distinction of properties. That is, the impossibility of prediction which he cites as a criterion of emergence is a consequence of the metaphysical structure of the world; the “mathematical archangel” could not have predicted emergent properties not because of complexity or because of limits to what can be expressed by lower-level concepts, but because emergent facts and laws are brute facts or else are laws that are in principle not reductively explainable.

2. Later Emergentism

Beginning in the late 1920s, advances in science such as the explanation of chemical bonding by quantum mechanics and the development of molecular biology put an end to claims of emergence in chemistry and biology, and thus marked the end of the emergentist heyday and the beginning of an era of reductionist enthusiasm. However, beginning with Putnam’s arguments for multiple realizability in the 1960s, Davidson’s anomalous monism of the psychophysical and Fodor’s argument for the autonomy of the special sciences, the identity theory and reductionism were dealt a severe blow. Today, within a predominantly anti-reductivist monist climate, emergentism has reappeared in complex systems theory, cognitive science and the philosophy of mind.

a. Kinds of Emergence

Because emergent properties are novel properties, there are different conceptions of what counts as emergent depending on how novelty is understood, and this is reflected in the different ways the concept of emergence is used in the philosophy of mind and in the natural and cognitive sciences. To capture this difference, David Chalmers (2006) drew the distinction between weak and strong emergence. A different distinction has been drawn by O’Connor and Wong (2002) between epistemological and ontological emergence, but this can be incorporated into the distinction between weak and strong emergence because ultimately both differentiate between an epistemological emergence couched in terms of higher- and lower-level explanations or descriptions and a robust ontological difference between emergent and non-emergent phenomena. Beyond this, accounts of emergence differ in whether novelty is understood as occurring over time or as a phenomenon restricted to a particular time. This difference is meant to be captured in the distinction between synchronic and diachronic emergence.

i. Strong and Weak Emergence

1. Strong Emergence: Novelty as Irreducibility and Downward Causation

The metaphysically interesting aspect of emergence is the question of what it takes for there to be genuinely distinct things. In other words, the question is whether a plausible metaphysical distinction can be made between things that are “nothing over and above” what constitutes them and those things that are “something over and above” their constituent parts. The notion of strong emergence that is predominant in philosophy is meant to capture this ontological distinction that was part of the initial motivation of the British emergentists and which is lacking in discussions of weak emergence.

Though a phenomenon is often said to be strongly emergent because it is not deducible from knowledge of the lower-level domain from which it emerged—as was the case for C. D. Broad—what distinguishes the thesis of strong emergence from a thesis merely about our epistemological predicament is that this in-principle non-deducibility is a consequence of an ontological distinction. The question, then, is what sort of novelty a property must exhibit in order for it to be strongly emergent.

Even reductive physicalists can agree that a property can be novel to a whole even though it is nothing more than the sum of the related properties of the parts of the whole. For instance, a whole weighs as much as the sum of the weights of its parts, yet the weight of the whole is not something that its parts share. In this sense resultant systemic properties, like weight, are novel but not in the sense required for them to be strongly emergent. Also, numerical novelty, the fact that a property is instantiated for the first time, is not enough to make it strongly emergent for, again, that would make many resultant properties emergent, like the first time a specific shape or mass is instantiated in nature.

For this reason the criterion often cited as essential for the ontological autonomy of strong emergents (along with in-principle irreducibility or non-deducibility) is causal novelty. That is, the basic tenet of strong emergentism is that at a certain level of physical complexity novel properties appear that are not shared by the parts of the object they emerge from, that are ontologically irreducible to the more fundamental matter from which they emerge and that contribute causally to the world. That is, emergent properties have new downward causal powers that are irreducible to the causal powers of the properties of their subvenient or subjacent (to be more etymologically correct) base. Ontological emergentism is therefore typically committed not only to novel fundamental properties but also to fundamental emergent laws, as was the case with the British emergentists who, with the exception of Alexander, were all committed to downward causation—that is, causation from macroscopic levels to microscopic levels. (It should be noted also that this ontological autonomy of emergents implies the existence of irreducible special sciences.) Thus Timothy O’Connor (1994) defines strong emergent properties as properties that supervene on properties of the parts of a complex object, that are not shared by any of the object’s parts, that are distinct from any structural property of the complex, and that have downward causal influence on the behavior of the complex’s parts.

However, though downward causal powers are commonly cited along with irreducibility as a criterion for strong emergence, there is no consensus regarding what is known as “Alexander’s dictum” (that is, that for something to be real it must have causal powers), and hence not everyone agrees that strong emergentism requires downward causation. For example, David Chalmers (2006), who is neutral on the question of epiphenomenalism, does not take downward causation to be an essential feature of emergentism. Rather, Chalmers defines a high-level phenomenon as strongly emergent when it is systematically determined by low-level facts but truths concerning that phenomenon are nevertheless in principle not deducible from truths in the lower-level domain. The question is posed by Chalmers in terms of conceptual entailment failure. That is, emergent phenomena are nomologically but not logically supervenient on lower-level facts, and therefore novel fundamental laws are needed to connect properties of the two domains.

A different approach is offered by Tim Crane (2001, 2010), who bases his account of strong emergence on the distinction between two kinds of reduction: (1) ontological reduction, which identifies entities in one domain with those in another, more fundamental one, and (2) explanatory reduction, a relation that holds between theories aimed at understanding phenomena of one level of reality in terms of a “lower” level. In other words, one theory, T2, is explanatorily reduced to another, T1, when T1 sheds light on the phenomena treated in T2; that is, when it shows from within T1 why T2 is true. Crane argues that the difference between strong emergentism and non-reductive physicalism lies in their respective attitudes to reduction: though both deny ontological reduction, non-reductive physicalism requires explanatory reduction (at least in principle), whereas the distinguishing feature of emergentism is that it denies explanatory reduction and is committed to an explanatory gap. Crane argues that if you have supervenience with in-principle irreducibility and downward causation, then you have dependence without explanatory reduction and, hence, strong emergence.

2. Weak Emergence: Novelty as Unpredictability

Weak emergence is the kind of emergence that is common in the early twenty-first century primarily (though not exclusively) in cognitive science, complex systems theory and, generally, scientific discussions of emergence in which the notions of complexity, functional organization, self-organization and non-linearity are central. The core of this position is that a property is emergent if it is a systemic property of a system—a property of a system that none of its smaller parts share—and it is unpredictable or unexpected given the properties and the laws governing the lower-level, more fundamental, domain from which it emerged. Since weak emergence is defined in terms of unpredictability or unexpectedness, it is an epistemological rather than a metaphysical notion. Commonly cited examples of such weakly emergent phenomena range from emergent patterns in cellular automata and systemic properties of connectionist networks to phase transitions, termite organization, traffic jams, the flocking patterns of birds, and so on.

Weak emergence is compatible with reduction, since a phenomenon may be unpredictable yet also reducible. For instance, processes composed of many parts may fall under strict deterministic laws yet be unpredictable due to the unforeseeable consequences of minute differences in initial conditions. And, as Chalmers (2006) argues, weak emergence is also compatible with deducibility of the emergent phenomenon from its base: in cellular automata, for instance, higher-level patterns may be unexpected yet are in principle deducible given the initial state of the base entities and the basic rules governing the lower level.
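The cellular-automaton case can be made vivid with a short simulation. The sketch below is purely illustrative (the choice of Rule 30 is ours; the text mentions cellular automata only generically): a one-dimensional automaton whose global pattern is hard to anticipate by inspection, yet is deducible in exactly the weak emergentist’s sense, namely by exhaustively applying the simple local rule.

```python
# Illustrative sketch: a one-dimensional elementary cellular
# automaton (Rule 30, our choice of example). The large-scale
# pattern is unexpected, yet fully deducible in principle: it is
# generated by nothing more than repeated application of the rule.
def step(cells, rule=30):
    """Apply an elementary CA rule to one row of 0/1 cells
    (periodic boundary). Each cell's next state is the rule-table
    bit indexed by its three-cell neighborhood."""
    n = len(cells)
    new = []
    for i in range(n):
        left, mid, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        neighborhood = (left << 2) | (mid << 1) | right  # 0..7
        new.append((rule >> neighborhood) & 1)
    return new

def run(width=31, steps=15, rule=30):
    row = [0] * width
    row[width // 2] = 1  # single live cell in the middle
    history = [row]
    for _ in range(steps):
        row = step(row, rule)
        history.append(row)
    return history

for row in run():
    print("".join("#" if c else "." for c in row))
```

Every feature of the printed triangle is fixed by the initial row and the eight-entry rule table; nothing is added “from the outside,” which is why such patterns count as weakly, not strongly, emergent.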

Mario Bunge’s “rational emergentism” (1977) is a form of weak emergence according to which emergent properties are identified with systemic properties that none of the parts of the system share and that are reducible to the parts of the system and their organization. Bunge identifies his view as an emergentism of sorts because he claims that, unlike reductionist mechanism, it appreciates the novelty of systemic properties. At the same time, he thinks of novelty as having a reductive explanation, and for this reason he calls this emergence “rational.”

William Wimsatt (2000) also defends an account according to which emergence is compatible with reduction. Wimsatt defines emergence negatively as the failure of aggregativity; aggregativity is the state in which “the whole is nothing more than the sum of its parts” in which, that is, systemic properties are the result of the component parts of a system rather than their organization. Contrasting emergence to aggregativity, Wimsatt defines a systemic property as emergent relative to the properties of the parts of a system if the property is dependent on their mode of organization (and is also context-sensitive) rather than solely on the system’s composition. He argues that, in fact, it is aggregativity which is very rare in nature, while emergence is a common phenomenon (even if in different degrees).

Robert Batterman (2002), who focuses on emergence in physics, also believes that emergent phenomena are common in our everyday experience of the physical world. According to Batterman, what is at the heart of the question of emergence is not downward causation or the distinctness of emergent properties, but rather inter-theoretic reduction and, specifically, the limits of the explanatory power of reducing theories. Thus, a property is emergent, according to this view, if it is a property of a complex system at limit values that cannot be derived from lower level, more fundamental theories. As examples of emergent phenomena Batterman cites phase transitions and transitions of magnetic materials from ferromagnetic states to paramagnetic states, phenomena in which novel behavior is exhibited that cannot be reductively explained by the more fundamental theories of statistical mechanics. However, Batterman wants to distinguish explanation from reduction and so claims that though emergent phenomena are irreducible they are not unexplainable per se because they can have non-reductive explanations.

More recently Mark Bedau (1997, 2007, 2008) has argued that the characteristic of weak emergence is that, though macro-phenomena of complex systems are in principle ontologically and causally reducible to micro-phenomena, their reductive explanation is intractably complex, save by derivation through simulation of the system’s microdynamics and external conditions. In other words, though macro-phenomena are explainable in principle in terms of micro-phenomena, these explanations are incompressible, in the sense that they can only be had by “crawling the micro-causal web”—by aggregating and iterating all local micro-interactions over time. Bedau argues that this is the only kind of real emergence and champions what he calls the “radical view” of emergence according to which emergence is a common phenomenon that applies to all novel macro-properties of systems. (He contrasts this to what he calls the “sparse view” which he characterizes as the view that emergence is a rare phenomenon found only in “exotic” phenomena such as consciousness that are beyond the scope of normal science.) However, though this is a weak kind of emergence in that it denies any strong form of downward causation and it involves reducibility of the macro to the micro (even if only in principle), Bedau denies that weak emergence is merely epistemological, or merely “in the mind” since explanations of weak emergent phenomena are incompressible because they reflect the incompressible nature of the micro-causal structure of reality which is an objective feature of complex systems.

Andy Clark (1997, 2001) also holds a weak emergentist view according to which emergent phenomena need not be restricted to unpredictable or unexplainable phenomena but are, instead, systemic phenomena of complex dynamical systems that are the products of collective activity. Clark distinguishes four kinds of emergence. First, emergence as collective self-organization (a system becomes more organized due solely to the collective effects of the local interaction of its parts, such as flocking patterns of birds, or due to the collective effects of its parts and the environment, such as termite nest building). Second, emergence as unprogrammed functionality, that is, emergent behavior that arises from repeated interaction of an agent with the environment, such as wall-following behavior in “veer and bounce” robots (Clark, 1997). Third, emergence as interactive complexity, in which effects, patterns or capacities of a system emerge from complex, cyclic interaction of its components. For example, Bénard and Couette convection cells result from a repetitive cycle of movement caused by differences in density within a fluid body, in which the colder fluid forces the warmer fluid to rise until the latter loses enough heat to descend and cause the former to rise again, and so on. And fourth, emergence as uncompressible unfolding (phenomena that cannot be predicted without simulation). All of these formulations of emergence are compatible with reducibility or in-principle predictability and are thus forms of weak emergence.

For Clark, emergence picks out the “distinctive way” in which factors conspire to bring about a property, event or pattern, and it is “linked to the notion of what variables figure in a good explanation of the behavior of a system.” Thus, Clark’s notion of emergence in complex systems theory is explanatory in that it focuses on explanations in terms of collective variables, that is, variables that do not track properties of the components of the system but, instead, focus on higher-level features of complex dynamical systems that reflect the result of the interaction of multiple agents or their interaction with their environment.
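A collective variable of the kind Clark has in mind can be illustrated with a deliberately simple toy model (our construction, not Clark’s own example): agents on a ring repeatedly average their heading with their neighbors’. No agent aims at global alignment, yet alignment emerges from purely local interaction, and the quantity worth explaining, the group’s heading spread, is a property of the ensemble rather than of any one component.

```python
import random

# Toy model of collective self-organization (illustrative only):
# each agent replaces its heading with the average of its own and
# its two neighbors' headings, over and over.
def spread(headings):
    """A collective variable: the gap between the most divergent
    headings in the group -- a feature of the whole ensemble that
    no single agent possesses or tracks."""
    return max(headings) - min(headings)

def step(headings):
    n = len(headings)
    return [(headings[(i - 1) % n] + headings[i] + headings[(i + 1) % n]) / 3
            for i in range(n)]

random.seed(0)
headings = [random.uniform(0, 360) for _ in range(20)]
print(round(spread(headings), 1))  # initial disorder
for _ in range(50):
    headings = step(headings)
print(round(spread(headings), 1))  # much smaller: alignment has emerged
```

A good explanation of the run’s outcome cites the shrinking spread, not the trajectory of any individual heading, which is exactly the explanatory role Clark assigns to collective variables.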

Proponents of weak emergence do not support the strong notion of downward causation that is found in strong emergentist views but, instead, favor one in which the higher-level causal powers of a whole can be explained by rules of interaction of its parts, such as feedback loops. Though this kind of view of emergence is predominant in the sciences, it is not exclusive to them. A form of weak emergence within philosophy that denies strong downward causation can be found in John Searle (1992). Searle allows for the existence of “causally emergent system features” such as liquidity, transparency and consciousness, which are systemic features of a system that cannot be deduced or predicted from knowledge of the causal interactions of lower levels. However, according to Searle, whatever causal effects such features exhibit can be explained by the causal relations of the system’s parts; in the case of consciousness, for example, by the behavior and interaction of neurons.

If we make use, for more precision, of the distinction between ontological and explanatory reduction, we can see that if we understand strongly emergent phenomena as both ontologically and explanatorily irreducible, as Crane (2010) does, then they are also weakly emergent. However, if strongly emergent phenomena are only ontologically irreducible, they may still be, in principle, predictable. For example, even if you deny the identity of heat with mean kinetic energy (perhaps because of multiple realizability), a Laplacean demon could still predict a gas’s heat from the mean kinetic energy of its molecules with the use of “bridge laws” that link the two vocabularies. These bridge laws can be considered to be part of what Crane calls an explanatory reduction. So in such cases, strong emergence does not entail weak emergence. It should also be noted that weak emergence does not entail strong emergence. A phenomenon can be unpredictable yet also ontologically reducible: perhaps, for instance, because systemic properties are subject to indeterministic laws. So a case of weak emergence need not necessarily be a case of strong emergence.

ii. Synchronic and Diachronic Emergence

Another distinction that is made concerning how novelty is understood is the distinction between synchronic and diachronic novelty. The former is novelty exhibited in the properties of a system vis-à-vis the properties of its constituent parts at a particular time; the latter is temporal novelty in the sense that a property or state is novel if it is instantiated for the first time. This distinction leads to a distinction between synchronic and diachronic emergence.

In synchronic emergence, articulated by C. D. Broad and predominant in the philosophy of mind, the higher-level, emergent phenomena are simultaneously present with the lower-level phenomena from which they emerge. Usually this form of emergence is stated in terms of the supervenience of mental phenomena on subvenient/subjacent neural structures, so that mental states or properties co-exist with states or properties at the neural level. Strong ontological emergence is thus usually understood to be synchronic, “vertical”, emergence. In contrast, diachronic emergence is “horizontal” emergence that unfolds over time, in which the structure from which the novel property emerges exists prior to the emergent. This is typical of the weakly emergent states appealed to in discussions of complex systems, evolution, cosmology, artificial life, and so forth. It can be found in Searle (1992), since he views the relation of the emergent to its base as causal, thus excluding synchronic emergence (at least on non-synchronic accounts of causation).

Because diachronic emergence is emergence over time, novelty is understood in terms of unpredictability of states or properties of a system from past states of that system. And because weak emergence is typically defined in terms of unpredictability, it is also usually identified with cases of diachronic emergence. In contrast, in synchronic emergence, which refers to the state of a system at a particular time, novelty revolves around the idea of irreducibility, and thus synchronic emergence is usually identified with strong emergence. However, there are formulations of non-supervenience-based strong emergence that are causal and diachronic, such as O’Connor and Wong’s (2005). Note that synchronic emergence could be the result of diachronic emergence but is not entailed by it since, presumably, if God were to create the world exactly as it is at this moment, synchronically emergent phenomena would exist without being diachronically emergent.

b. Emergence and Supervenience

The British emergentists, and this is especially clear in the writing of C. D. Broad, thought that a necessary feature of emergentism is a relation of the kind we would today call supervenience. Supervenience is a relation of covariation between two sets of properties, subjacent/underlying properties and supervenient properties. Roughly, we say that a set of properties A supervenes on a set of properties B if and only if any two things that differ with respect to A-properties also differ with respect to B-properties. Today supervenience is seen as a prima facie good candidate for a key feature of the relation between emergents and their subjacent base: because of the failure of attempted reductions, especially of the mental to the physical, and because the relation of supervenience per se entails nothing about the specific nature of the properties it relates (for example, whether they are distinct or not), it can account for the distinctness and dependence of emergents while also adding the restriction of synchronicity. Jaegwon Kim (1999), James van Cleve (1990), Timothy O’Connor (1994), Brian McLaughlin (1997), David Chalmers (2006) and Paul Noordhof (2010) all take nomological strong supervenience to be a necessary feature of emergentism. (For present purposes, following Kim, we can define strong supervenience thus: A-properties strongly supervene on B-properties if and only if for any possible worlds w1 and w2 and any individuals x in w1 and y in w2, if x in w1 is B-indiscernible from y in w2, then x in w1 is A-indiscernible from y in w2. Nomological supervenience restricts the range of possible worlds to those that conform to the natural laws.)
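Kim’s definition of strong supervenience quoted above can be written compactly (the notation is introduced here for brevity: x ≈B y abbreviates “x and y are B-indiscernible”):

```latex
% Strong supervenience: A-properties strongly supervene on B-properties iff
\forall w_1 \,\forall w_2 \;\forall x \in w_1 \,\forall y \in w_2
  \;\bigl( x \approx_B y \;\rightarrow\; x \approx_A y \bigr)
```

Nomological supervenience is then obtained by restricting the worlds w1 and w2 to those that conform to the natural laws.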

However, not everyone agrees that the relation of strong supervenience is necessary for strong emergence. Some, like Crane (2001), argue that supervenience is not sufficient for emergence, and other proponents of strong emergence have questioned whether supervenience is even a necessary condition for emergence. For example, O’Connor (2000, 2003, O’Connor & Wong 2005) now supports a form of dynamical emergence which is causal and non-synchronic. A state of an entity is emergent, in this view, if it instantiates non-structural properties as a causal result of that object’s achieving a complex configuration. O’Connor’s view includes a strong notion of downward causation and the denial of causal closure (roughly, the principle that all physical effects are entirely determined by, or have their chances entirely determined by, prior physical events), as well as the possibility that an emergent state can generate another emergent state.

Paul Humphreys (1996, 1997) has also offered an alternative account to supervenience-based emergence according to which emergence of properties is the diachronic result of fusion of lower-level properties, a phenomenon that Humphreys claims is common in the physical realm. That is, properties of the base are fused (thereby ceasing to exist) and give rise to new emergent properties with novel causal powers which are not made up of the old property instances—and, in this sense, the only real phenomenon is the emergent phenomenon. Humphreys offers as a paradigmatic example of such emergence quantum entanglement, in which a system can be in a definite state while its individual parts are not and in which the state of the system determines the states of its parts and not the other way around. It must be noted that Humphreys claims ignorance about whether this is what happens in the case of mental properties. Different formulations of non-supervenience-based emergence can be found in Silberstein and McGeever (1999) who have also argued for ontological emergence in quantum mechanics and, by extension, as a real feature of the natural world, as well as in Bickhard and Campbell’s (2000) “process model” of ontological emergence.

3. Objections to Emergentism

a. The Supervenience Argument

The most commonly cited objection to strong emergence, initially formulated by Pepper (1926) and championed today by Jaegwon Kim (1999, 2005), concerns the novel (and downward) causal powers of emergent properties.

Kim’s formulation is based on three basic physicalist assumptions: (1) the principle of causal closure, which Kim defines as the principle that if a physical event has a cause at t, then it has a physical cause at t; (2) the principle of causal exclusion, according to which if an event e has a sufficient cause c at t, no event at t distinct from c can be a cause of e (unless this is a genuine case of causal over-determination); and (3) supervenience. Kim defines mind/body supervenience as follows: mental properties strongly supervene on physical/biological properties, that is, if any system s instantiates a mental property M at t, there necessarily exists a physical property P such that s instantiates P at t, and necessarily anything instantiating P at any time instantiates M at that time.

The gist of the problem is the following. In order for emergent mental properties to have causal powers (and thus to exist, according to what Kim has dubbed “Alexander’s dictum”), there must be some form of mental causation. However, if this is the case, the principle of causal closure is violated and emergence is in danger of becoming an incoherent position. If mental (and therefore downward) causation is denied, and causal closure thus retained, emergent properties become merely epiphenomenal, and in this case their existence is threatened.

More specifically, the argument is as follows. According to mind-body supervenience, every time a mental property M is instantiated it supervenes on a physical property P. Now suppose M appears to cause another mental property M¹. The question arises whether the cause of M¹ is indeed M or whether it is M¹’s subvenient/subjacent base P¹ (since, according to supervenience, M¹ is instantiated by a physical property P¹). Given causal exclusion, it cannot be both, and so, given the supervenience relation, it seems that M¹ occurs because P¹ occurred. Therefore, Kim argues, it seems that M actually causes M¹ by causing the subjacent P¹, and that mental-to-mental (same level) causation presupposes mental-to-physical (downward) causation. [Another, more direct, way to put this problem is to ask whether the effect of M is really M¹ or M¹’s subjacent base P¹. I chose an alternative formulation in order to make the problem clearer to the non-expert reader.] However, Kim continues, given causal closure, P¹ must have a sufficient physical cause P. But given exclusion again, P¹ cannot have two sufficient causes, M and P, and so P is the real cause of P¹, because if M were the real cause then causal closure would be violated again. Therefore, given supervenience, causal closure and causal exclusion, mental properties are merely epiphenomenal. The tension here for the emergentist, the objection goes, lies in the double requirement of supervenience and downward causation: on the one hand, we have upward determination and the principle of causal closure of the physical domain; on the other, we have causally efficacious emergent phenomena. In other words, Kim claims that what seem to be cases of emergent causation are just epiphenomena because ultimately the only way to instantiate an emergent property is to instantiate its base. So, saying that higher-level properties are causally efficacious renders any form of non-reductive physicalism, under which Kim includes emergentism, at least implausible and at most incoherent.
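The causal structure Kim describes can be pictured schematically (a standard way of diagramming the exclusion argument; vertical arrows are supervenience, horizontal arrows are causation, and the queried arrow is the causal claim the argument eliminates):

```latex
\begin{array}{ccc}
M & \overset{?}{\longrightarrow} & M^{1} \\
\big\uparrow &                  & \big\uparrow \\
P & \longrightarrow             & P^{1}
\end{array}
% Given closure and exclusion, the only surviving causal arrow is
% P -> P^1; M's apparent causation of M^1 collapses into it.
```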

Note that this is an objection leveled against cases of strong emergence because in cases of weak emergence that do not make any claims of ontological novelty the causal inheritance principle is preserved—the emergents’ causal powers are inherited from the powers of their constitutive parts. For example, a flocking pattern of birds may affect the movement of the individual birds in it but that is nothing more than the effect of the aggregate of all the birds that make it up. Also, this argument applies to cases of supervenience-based emergence which retain base properties intact along with emergent properties, but accounts of emergence that are non-synchronic sidestep the problem of downward causation. So, Kim’s objection does not get off the ground as a retort to O’Connor’s dynamical emergence, Bickhard and Campbell’s process model, Silberstein and McGeever’s quantum mechanical emergence or Humphreys’ fusion emergence.

In the cases where this objection applies, there have been different responses. Philosophers who want to retain causal closure while also retaining emergent properties have tried to give modified accounts of strong emergence that deny either downward causation or the requirement that emergent properties have novel causal powers. For example, Shoemaker (2001) believes that what must be denied is not the principle of causal closure but, instead, the claim that emergent properties have novel causal powers (the appearance of which he elsewhere attributes to “micro-latent” powers of lower-level entities). This approach, however, is problematic, since it seems to be a requirement for robust strong emergence that emergent properties are not merely epiphenomenal. Another approach has recently been proposed by Cynthia and Graham Macdonald (2010), who attempt to preserve causal closure and to show that it is compatible with emergence by building a metaphysics in which events can co-instantiate mental and physical properties in a single instance, thus allowing mental properties to have causal effects (a view that Peter Wyss (2010) has correctly pointed out is in some respects reminiscent of Samuel Alexander’s). In this schema, the Macdonalds argue, property instances do not belong to different levels (though properties do), and so the problem of downward causation is resolved because, in effect, there is no downward causation in the sense assumed by Kim’s argument (causal efficacy for emergent and mental properties is preserved, they argue, since if a property has causally efficacious instances, the property itself has causal powers). However, this view will also seem unsatisfactory to the strong emergentist who wants to retain a robust notion of emergent properties and downward causation.

Other philosophers who want to retain strong emergence have opted instead for rejecting causal closure. Such a line has been taken by Crane (2001), Hendry (2010) and Lowe (2000), though Lowe subsequently offered an account of strong emergence compatible with causal closure (Lowe, 2003).

b. Do Cases of Genuine (Strong) Emergence Exist?

Kim’s supervenience argument is meant to question the very possibility of strongly emergent properties. However, even if strong emergence is possible, there is the further question of whether there are any actual cases of strong emergence in the world.

Brian McLaughlin (1992), who grants that the emergence of novel configurational forces is compatible with the laws of physics and that theories of emergence are coherent and consistent, has argued that there is “not a scintilla of evidence” that there are any real cases of strong emergence to be found in the world. This is a commonly cited objection to emergence, readily espoused by reductive physicalists committed to the purely physical nature of all the phenomena that have at different times been called emergent, and also raised by Mark Bedau, who claims that though weak emergence is very common, we have no evidence for cases of strong emergence.

Hempel and Oppenheim (1948) argued that the unpredictability of emergent phenomena is theory-relative (that is, something is emergent only relative to the knowledge available at a given time) and does not reflect an ontological distinction. And Ernest Nagel (1961), agreeing that emergence is theory-relative, argued that it is a doctrine concerning “logical facts about formal relations between statements rather than any experimental or even ‘metaphysical’ facts about some allegedly ‘inherent’ traits of properties of objects.” According to these views, theoretical advance and the accumulation of new knowledge will lead to the re-classification of what are today considered emergent phenomena, as happened with the British emergentists’ examples of life and chemical bonding. However, though these can be construed as viable objections to some forms of weak emergence, they fail to affect strong emergence (which was their target), because strong emergence is concerned with in-principle unpredictability as a result of irreducibility.

Though some share this skepticism, other philosophers believe that strong emergence, while perhaps rare, does exist. Bickhard and Campbell (2000), Silberstein and McGeever (1999) and Humphreys (1997) claim that ontological emergence can be found (at least) in quantum mechanics, an interesting and somewhat ironic proposal, given that it was advances in quantum physics in the early 20th century that were supposed to have struck the death blow to the British emergentist tradition. Predominantly, however, the usual candidates for strongly emergent properties are mental properties (phenomenal and/or intentional), which continue to resist any kind of reduction. Chalmers (2006), because of the explanatory gap, considers consciousness to be the only possible intrinsically strongly emergent phenomenon in nature, while O’Connor (2000) has argued that our experience of free will, which is, in effect, macroscopic control of behavior, seems to be irreducible and hence strongly suggests that human agency may be strongly emergent. (Stephan (2010) also sees free will as a candidate for a strongly emergent property.)

Another line of response is taken by E. J. Lowe (2000) according to whom emergent mental causes could be in principle out of reach of the physiologist, and so it should not come as a surprise that physical science has not discovered them. Lowe argues that, even if we grant that every physical event has a sufficient immediate physical cause, it is plausible that a mental event could have caused the physical event to have that physical cause. That is not to say that the mental event caused the physical event that caused the physical effect; rather, the mental event linked the two physical events so the effect was jointly caused by a mental and a physical event. Such a case, Lowe argues, would be indistinguishable from the point of view of physiological science from a case in which causal closure held.

Following this line of thought, it can be argued that though we do not have actual empirical proof that emergent properties exist, the right attitude is to be open to the possibility of their existence. That is, given that there is no available physiological account of how mental states can cause physical states (or how they can be identical), while we have everyday evidence that they do, as well as a plausible mental (psychological or folk psychological) explanation for it, we have independent grounds to believe that emergent properties could possibly exist.

4. References and Further Reading

  • Alexander, Samuel, Space, Time, and Deity. New York: Dover Publications, 1920.
  • Batterman, Robert W., “Emergence in Physics”. Routledge Encyclopedia of Philosophy Online.
  • Batterman, Robert W., The Devil in the Details: Asymptotic Reasoning in Explanation, Reduction, and Emergence. Oxford Studies in Philosophy of Science. Oxford, UK: Oxford University Press, 2001.
  • Bedau, Mark A. and Humphreys, Paul (eds.), Emergence: Contemporary Readings in Philosophy and Science. London, UK: MIT Press, 2007.
    • A collection of contemporary philosophical and scientific papers on emergence.
  • Bedau, Mark A. “Weak Emergence”, in J. Tomberlin (ed.) Philosophical Perspectives: Mind, Causation and World, vol.11.  Malden, MA: Blackwell, 1997. pp. 375-399.
  • Bedau, Mark A. “Is Weak Emergence Just in the Mind?” Minds and Machines 18, 2008: 443-459.
    • On weak emergence as computational irreducibility and explanatory incompressibility respectively.
  • Bickhard, M. & D.T. Campbell, “Emergence”, in P.B. Andersen, C. Emmerche, N. O. Finnemann & P. V. Christiansen (eds), Downward causation. Aarhus: Aarhus University Press, 2000.
  • Bickhard, M. and D.T. Campbell, “Physicalism, Emergence and Downward Causation” Axiomathes, October 2010.
    • On the “process model” of emergence.
  • Broad, C.D., The Mind and Its Place in Nature. London: Routledge and Kegan Paul, 1925.
    • The classical formulation of British emergentist tradition.
  • Bunge, Mario, “Emergence and the Mind”. Neuroscience 2, 1977: 501-509.
    • On “rational emergence,” a form of weak emergence.
  • Chalmers, David, “Strong and Weak Emergence”.  In P. Clayton and P. Davies, eds, The Re-emergence of Emergence Oxford: Oxford University Press, 2006.
    • On weak and strong emergence.
  • Clark, Andy, Being There: Putting Brain, Body, and World Together Again. Cambridge, MA: MIT Press, 1997.
  • Clark, Andy, Mindware: An Introduction to the Philosophy of Cognitive Science. Oxford and New York: Oxford University Press, 2001.
    • On the four types of weak emergence that Clark identifies in the cognitive sciences.
  • Crane, Tim, “Cosmic Hermeneutics vs. Emergence: The Challenge of the Explanatory Gap” in Emergence in Mind, eds. Cynthia Macdonald and Graham Macdonald. New York: Oxford University Press, 2010.
  • Crane, Tim, “The Significance of Emergence” in B. Loewer and G. Gillett (eds) Physicalism and Its Discontents. Cambridge, UK: Cambridge University Press, 2001.
    • On the relation of emergence to non-reductive physicalism, reduction and the explanatory gap.
  • Hempel, Carl Gustav and Paul Oppenheim (1948), “Studies in the Logic of Explanation”, in Hempel, C. G. Aspects of Scientific Explanation. New York: Free Press, 1965.
    • An exposition of the objection that emergence is only theory relative and not a genuine phenomenon in nature.
  • Humphreys, Paul, “How Properties Emerge.” Philosophy of Science, 1997(a), 64: 1-17.
  • Humphreys, Paul, “Emergence, Not Supervenience.” Philosophy of Science, 1997(b), 64: 337-345.
    • On non-supervenience-based emergence as fusion of properties.
  • Hendry, Robin Findlay, “Emergence vs. Reduction in Chemistry” in Macdonald & Macdonald (2010).
    • Contains an argument against causal closure and for downward causation in chemistry in support of the position that emergentism is at least as supported by empirical evidence as non-reductive physicalism.
  • Kim, Jaegwon, “‘Downward Causation’ in Emergentism and Nonreductive Physicalism”, in Beckermann, Flohr, and Kim (eds), Emergence or Reduction? Essays on the Prospects of Nonreductive Physicalism. Berlin: Walter de Gruyter, 1992.
  • Kim, Jaegwon,“Making Sense of Emergence”. Philosophical Studies, 95, 1999: 3-36.
  • Kim, Jaegwon, Physicalism, or Something Near Enough.  Princeton and Oxford: Princeton University Press, 2005.
    • These works contain analyses of non-reductive physicalism and emergence and are a main source of criticism of these views, including the supervenience argument.
  • Lewes, George Henry. Problems of Life and Mind. Vol 2. London: Kegan Paul, Trench, Trübner, & Co., 1875.
      • Another of the historical texts of the British emergentist tradition in which the term “emergent” is coined.
  • Lowe, E. J., “Causal Closure Principles and Emergentism.” Philosophy, 75 (4), 2000: 571-585.
  • Lowe, E. J., “Physical Causal Closure and the Invisibility of Mental Causation” in Sven Walter and Heinz-Dieter Heckmann (eds.) Physicalism and Mental Causation: The Metaphysics of Mind and Action. UK: Imprint Academic, 2003.
    • For an idea of what it could be like for there to be mental forces in principle out of reach of the physiologist yet also consistent with causal closure.
  • Macdonald, C. and G. Macdonald, eds., Emergence in Mind. New York: Oxford University Press, 2010.
    • A collection of philosophical essays on emergence covering a wide range of issues from explanation and reduction to free will and group agency.
  • McLaughlin, Brian P., “Emergence and Supervenience.” Intellectica, 2, 1997: 25-43.
  • McLaughlin, Brian P., “The Rise and Fall of British Emergentism” in Beckermann, Flohr, and Kim (eds.), Emergence or Reduction? Berlin, Germany: Walter de Gruyter & Co., 1992.
    • The most comprehensive critical historical overview of British emergentism.
  • Mill, J.S., A System of Logic Ratiocinative and Inductive.  London: Longmans, Green and Co., 1930.
    • For Mill’s discussion of homopathic and heteropathic effects and laws that marked the beginning of the British emergentist tradition.
  • Mitchell, Sandra D., Unsimple Truths. Chicago and London: The University of Chicago Press, 2009.
    • A very good account of emergence in science.
  • Morgan, C.L., Emergent Evolution. London: Williams and Norgate, 1923.
  • Nagel, Ernest, The Structure of Science. London: Routledge & Kegan Paul, 1961.
    • On the objection that emergence is only theory relative.
  • Noordhof, Paul, “Emergent Causation and Property Causation” in Emergence in Mind, eds.  Cynthia Macdonald and Graham Macdonald, New York: Oxford University Press, 2010.
  • O’Connor, Timothy, “Causality, Mind and Free Will”. Philosophical Perspectives, 14, 2000: 105-117.
  • O’Connor, Timothy, “Emergent Individuals”. The Philosophical Quarterly, 53, 213, 2003: 540-555.
  • O’Connor, Timothy, “Emergent Properties”. American Philosophical Quarterly, 31, 1994: 91-104.
  • O’Connor, Timothy, & Hong Yu Wong, “The Metaphysics of Emergence”. Noûs 39, 4, 2005: 658–678.
    • In defense of strongly emergent properties.
  • Papineau, David, “Why Supervenience?” Analysis, 50, 2 (1990): 66-71.
  • Pepper, Stephen C., “Emergence”. Journal of Philosophy, 23, 1926: 241-245.
    • The original formulation of the objection against downward causation.
  • Searle, J.R., The Rediscovery of Mind. Cambridge, Mass.: MIT Press, 1992.
    • Contains a philosophical discussion supporting a weak form of causal emergence for consciousness.
  • Shoemaker, S., “Realization and Mental Causation” in Physicalism and Its Discontents, Barry Loewer and Carl Gillett (eds.). Cambridge: Cambridge University Press, 2001: 74-98.
  • Silberstein, Michael and John McGeever, “The Search for Ontological Emergence”. The Philosophical Quarterly, 49, 1999: 182-200.
    • An account of strong emergence based on the relational holism of quantum states.
  • Sperry, R. W., “A Modified Concept of Consciousness”. Psychological Review, 76, 6, 1969: 532-536.
  • Sperry, R.W.  “Mind-Brain Interaction: Mentalism, Yes; Dualism, No”. Neuroscience, 5, 1980: 195-206.
    • An argument for the strong emergence of consciousness involving downward causation from a neuroscientist’s perspective.
  • Stephan, Achim, “Varieties of Emergentism.” Evolution and Cognition, vol. 5, no. 1, 1999: 49-59.
    • On different kinds of emergentism and how they relate.
  • Stephan, Achim, “An Emergentist’s Perspective on the Problem of Free Will” in Macdonald & Macdonald  (2010).
    • On free will as a strongly emergent property.
  • Wimsatt, William C., “Emergence as Non-Aggregativity and the Biases of Reductionisms”. Foundations of Science, 5, 3, 2000: 269-297.
    • On a view of emergence as non-aggregativity that is compatible with reduction.
  • Wyss, Peter, “Identity with a Difference: Comments on Macdonald and Macdonald” in Macdonald & Macdonald (2010).

Author Information

Elly Vintiadis
Email: evintus@gmail.com
Naval Staff and Command College
U. S. A.

The Infinite

Working with the infinite is tricky business. Zeno’s paradoxes first alerted Western philosophers to this in 450 B.C.E. when he argued that a fast runner such as Achilles has an infinite number of places to reach during the pursuit of a slower runner. Since then, there has been a struggle to understand how to use the notion of infinity in a coherent manner. This article concerns the significant and controversial role that the concepts of infinity and the infinite play in the disciplines of philosophy, physical science, and mathematics.
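Zeno’s worry can be sharpened with a piece of arithmetic unavailable to him: the infinitely many sub-distances the runner must traverse can nevertheless have a finite total. For a track of unit length, halving the remaining distance at each stage gives

```latex
\sum_{n=1}^{\infty} \frac{1}{2^{n}}
  = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots
  = \lim_{N \to \infty} \left( 1 - \frac{1}{2^{N}} \right) = 1
```

so an infinite number of places to reach is compatible with a finite distance run, which is part of the modern resolution of the paradox.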

Philosophers want to know whether there is more than one coherent concept of infinity; which entities and properties are infinitely large, infinitely small, infinitely divisible, and infinitely numerous; and what arguments can justify answers one way or the other.

Here are some examples of these four different ways to be infinite. The density of matter at the center of a black hole is infinitely large. An electron is infinitely small. An hour is infinitely divisible. The integers are infinitely numerous. These four claims are ordered from most to least controversial, although all four have been challenged in the philosophical literature.

This article also explores a variety of other questions about the infinite. Is the infinite something indefinite and incomplete, or is it complete and definite? What did Thomas Aquinas mean when he said God is infinitely powerful? Was Gauss, who was one of the greatest mathematicians of all time, correct when he made the controversial remark that scientific theories involve infinities merely as idealizations and merely in order to make for easy applications of those theories, when in fact all physically real entities are finite? How did the invention of set theory change the meaning of the term “infinite”? What did Cantor mean when he said some infinities are smaller than others? Quine said the first three sizes of Cantor’s infinities are the only ones we have reason to believe in. Mathematical Platonists disagree with Quine. Who is correct? We shall see that there are deep connections among all these questions.

Table of Contents

  1. What “Infinity” Means
    1. Actual, Potential, and Transcendental Infinity
    2. The Rise of the Technical Terms
  2. Infinity and the Mind
  3. Infinity in Metaphysics
  4. Infinity in Physical Science
    1. Infinitely Small and Infinitely Divisible
    2. Singularities
    3. Idealization and Approximation
    4. Infinity in Cosmology
  5. Infinity in Mathematics
    1. Infinite Sums
    2. Infinitesimals and Hyperreals
    3. Mathematical Existence
    4. Zermelo-Fraenkel Set Theory
    5. The Axiom of Choice and the Continuum Hypothesis
  6. Infinity in Deductive Logic
    1. Finite and Infinite Axiomatizability
    2. Infinitely Long Formulas
    3. Infinitely Long Proofs
    4. Infinitely Many Truth Values
    5. Infinite Models
    6. Infinity and Truth
  7. Conclusion
  8. References and Further Reading

1. What “Infinity” Means

The term “the infinite” refers to whatever it is that the word “infinity” correctly applies to. For example, the infinite integers exist just in case there is an infinity of integers. We also speak of infinite quantities, but what does it mean to say a quantity is infinite? In 1851, Bernard Bolzano argued in The Paradoxes of the Infinite that, if a quantity is to be infinite, then the measure of that quantity also must be infinite. Bolzano’s point is that we need a clear concept of infinite number in order to have a clear concept of infinite quantity. This idea of Bolzano’s has led to a new way of speaking about infinity, as we shall see.

The term “infinite” can be used for many purposes. The logician Alfred Tarski used it for dramatic purposes when he spoke about trying to contact his wife in Nazi-occupied Poland in the early 1940s. He complained, “We have been sending each other an infinite number of letters. They all disappear somewhere on the way. As far as I know, my wife has received only one letter” (Feferman 2004, p. 137). Although the meaning of a term is intimately tied to its use, we can tell only a very little about the meaning of the term from Tarski’s use of it to exaggerate for dramatic effect.

Looking back over the last 2,500 years of use of the term “infinite,” three distinct senses stand out: actually infinite, potentially infinite, and transcendentally infinite. These will be discussed in more detail below, but briefly, the concept of potential infinity treats infinity as an unbounded or non-terminating process developing over time. By contrast, the concept of actual infinity treats the infinite as timeless and complete. Transcendental infinity is the least precise of the three concepts and is more commonly used in discussions of metaphysics and theology to suggest transcendence of human understanding or human capability.

To give some examples, the set of integers is actually infinite, and so is the number of locations (points of space) between London and Moscow. The maximum length of grammatical sentences in English is potentially infinite, and so is the total amount of memory in a Turing machine, an ideal computer. An omnipotent being’s power is transcendentally infinite.

For purposes of doing mathematics and science, the actual infinite has turned out to be the most useful of the three concepts. Using the idea proposed by Bolzano that was mentioned above, the concept of the actual infinite was precisely defined in 1888 when Richard Dedekind redefined the term “infinity” for use in set theory and Georg Cantor made the infinite, in the sense of infinite set, an object of mathematical study. Before this turning point, the philosophical community generally believed Aristotle’s concept of potential infinity should be the concept used in mathematics and science.

a. Actual, Potential, and Transcendental Infinity

The Ancient Greeks generally conceived of the infinite as formless, characterless, indefinite, indeterminate, chaotic, and unintelligible. The term had negative connotations and was especially vague, having no clear criteria for distinguishing the finite from the infinite. In his treatment of Zeno’s paradoxes about infinite divisibility, Aristotle (384-322 B.C.E.) made a positive step toward clarification by distinguishing two different concepts of infinity: potential infinity and actual infinity. The latter is also called complete infinity and completed infinity. The actual infinite is not a process in time; it is an infinity that exists wholly at one time. By contrast, Aristotle spoke of the potentially infinite as a never-ending process over time, but which is finite at any specific time.

The word “potential” is being used in a technical sense. A potential swimmer can learn to become an actual swimmer, but a potential infinity cannot become an actual infinity. Aristotle argued that all the problems involving reasoning with infinity are really problems of improperly applying the incoherent concept of actual infinity instead of the coherent concept of potential infinity. (See Aristotle’s Physics, Book III, for his account of infinity.)

For its day, this was a successful way of treating Zeno’s Achilles paradox since, if Zeno had confined himself to using only potential infinity, he would not have been able to develop his paradoxical argument. Here is why. Zeno said that to go from the start to the finish line, the runner must reach the place that is halfway-there, then after arriving at this place he still must reach the place that is half of that remaining distance, and after arriving there he again must reach the new place that is now halfway to the goal, and so on. These are too many places to reach because there is no end to these places since for any one there is another. Zeno made the mistake, according to Aristotle, of supposing that this infinite process needs completing when it really doesn’t; the finitely long path from start to finish exists undivided for the runner, and it is Zeno the mathematician who is demanding the completion of such a process. Without that concept of a completed infinite process there is no paradox.
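Zeno’s halving places can be made concrete with a short numerical sketch (the function name below is illustrative, not from any source). After n steps the runner has covered 1/2 + 1/4 + … + 1/2ⁿ of the path: every partial sum falls short of 1, yet the sums converge to 1, the full path length.

```python
# Zeno's halving places: after n steps the runner has covered
# 1/2 + 1/4 + ... + 1/2^n of the path.
def distance_covered(n: int) -> float:
    return sum(1 / 2**k for k in range(1, n + 1))

for n in (1, 4, 10, 50):
    print(n, distance_covered(n))

# Every partial sum is strictly less than 1 ...
assert all(distance_covered(n) < 1 for n in range(1, 40))
# ... yet the sums converge to 1 (here, within floating-point precision).
assert abs(distance_covered(60) - 1) < 1e-12
```

The finite computation only illustrates the trend; the claim that the completed infinite sum equals exactly 1 is the step that requires the concept of actual infinity.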

Although today’s standard treatment of the Achilles paradox disagrees with Aristotle and says Zeno was correct to use the concept of a completed infinity and to imply the runner must go to an actual infinity of places in a finite time, Aristotle had so many other intellectual successes that his ideas about infinity dominated the Western world for the next two thousand years.

Even though Aristotle promoted the belief that “the idea of the actual infinite—of that whose infinitude presents itself all at once—was close to a contradiction in terms…,” (Moore 2001, 40) during those two thousand years, others did not treat it as a contradiction in terms. Archimedes, Duns Scotus, William of Ockham, Gregory of Rimini, and Leibniz made use of it. Archimedes used it, but had doubts about its legitimacy. Leibniz used it but had doubts about whether it was needed.

Here is an example of how Gregory of Rimini argued in the fourteenth century for the coherence of the concept of actual infinity:

If God can endlessly add a cubic foot to a stone—which He can—then He can create an infinitely big stone. For He need only add one cubic foot at some time, another [cubic foot] half an hour later, another a quarter of an hour later than that, and so on ad infinitum. He would then have before Him an infinite stone at the end of the hour. (Moore 2001, 53)

Leibniz envisioned the world as being an actual infinity of mind-like monads, and in (Leibniz 1702) he freely used the concept of being infinitesimally small in his development of the calculus in mathematics.

The term “infinity” that is used in contemporary mathematics and science is based on a technical development of this earlier, informal concept of actual infinity. This technical concept was not created until late in the 19th century.

b. The Rise of the Technical Terms

In the centuries after the decline of ancient Greece, the word “infinite” slowly changed its meaning in Medieval Europe. Theologians promoted the idea that God is infinite because He is limitless, and this at least caused the word “infinity” to lose its negative connotations. By the end of the Medieval Period, the word had come to mean endless, unlimited, and immeasurable–but not necessarily chaotic. The question of its intelligibility and conceivability by humans was disputed.

The technical term “actual infinity” now means something very different. There are actual infinities in the technical, post-1880s sense that are not endless, not unlimited, and not immeasurable. A line segment one meter long is a good example. It is not endless because it is finitely long, and it is not a process because it is timeless. It is not unlimited because it is limited by both zero and one. It is not immeasurable because its length measure is one meter. Nevertheless, the one-meter line is infinite in the technical sense because it has an actual infinity of sub-segments, and it has an actual infinity of distinct points. So, there definitely has been a conceptual revolution.

This can be very shocking to those people who are first introduced to the technical term “actual infinity.” It seems not to be the kind of infinity they are thinking about. The crux of the problem is that these people really are using a different concept of infinity. The sense of infinity in ordinary discourse these days is either the Aristotelian one of potential infinity or the medieval one that requires infinity to be endless, immeasurable, and perhaps to have connotations of perfection or inconceivability. This article uses the name transcendental infinity for the medieval concept although there is no generally accepted name for the concept. A transcendental infinity transcends human limits and detailed knowledge; it might be incapable of being described by a precise theory. It might also be a cluster of concepts rather than a single one.

Those people who are surprised when first introduced to the technical term “actual infinity” are probably thinking of either potential infinity or transcendental infinity, and that is why, in any discussion of infinity, some philosophers will say that an appeal to the technical term “actual infinity” is changing the subject. Another reason why there is opposition to actual infinities is that they have so many counter-intuitive properties. For example, consider a continuous line that has an actual infinity of points. A single point on this line has no next point! Also, a one-dimensional continuous curve can fill a two-dimensional area. Equally counterintuitive is the fact that some actually infinite numbers are smaller than other actually infinite numbers. Looked at more optimistically, though, most other philosophers will say the rise of this technical term is yet another example of how the discovery of a new concept has propelled civilization forward.

Resistance to the claim that there are actual infinities has had two other sources. One is the belief that actual infinities cannot be experienced. The other is the belief that use of the concept of actual infinity leads to paradoxes, such as Zeno’s. The standard solution to Zeno’s Paradoxes makes use of calculus. The birth of the new technical definition of actual infinity is intimately tied to the development of calculus and thus to properly defining the mathematician’s real line, the linear continuum. The set of real numbers in their standard order was given the name “the continuum” or “the linear continuum” because it was believed that the real numbers fill up the entire number line continuously, without leaving gaps. The integers have gaps, and so do the fractions.

Briefly, the argument for actual infinities is that science needs calculus; calculus needs the continuum; the continuum needs a very careful definition; and the best definition requires there to be actual infinities (not merely potential infinities) in the continuum.

Defining the continuum involves defining real numbers because the linear continuum is the intended model of the theory of real numbers just as the plane is the intended model of the theory of ordinary two-dimensional geometry. It was eventually realized by mathematicians that giving a careful definition to the continuum and to real numbers requires formulating their definitions within set theory. As part of that formulation, mathematicians found a good way to define a rational number in the language of set theory; then they defined a real number to be a certain pair of actually infinite sets of rational numbers. The continuum’s eventual definition required it to be an actually infinite collection whose elements are themselves infinite sets. The details are too complex to be presented here, but the curious reader can check any textbook in classical real analysis. The intuitive picture is that any interval or segment of the continuum is a continuum, and any continuum is a very special infinite set of points that are packed so closely together that there are no gaps. A continuum is perfectly smooth. This smoothness is reflected in there being a very great many real numbers between any two real numbers (technically a nondenumerable infinity between them).
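The idea that a real number is a certain actually infinite set of rationals can be sketched computationally. The following is a minimal, hedged illustration of a Dedekind-style cut, not the full textbook construction: a real number is represented by the membership predicate of its lower set of rationals, here the cut for √2, namely {q : q ≤ 0 or q² < 2}. The names `in_sqrt2_cut` and `approximate` are illustrative.

```python
from fractions import Fraction

# Lower set of the Dedekind cut for sqrt(2): an actually infinite set of
# rationals, represented here by its membership predicate.
def in_sqrt2_cut(q: Fraction) -> bool:
    return q <= 0 or q * q < 2

# Bisection recovers arbitrarily good rational approximations to the real
# number that the cut defines.
def approximate(cut, lo=Fraction(0), hi=Fraction(2), steps=30) -> Fraction:
    for _ in range(steps):
        mid = (lo + hi) / 2
        if cut(mid):
            lo = mid   # mid is below the cut point
        else:
            hi = mid   # mid is above the cut point
    return lo

approx = approximate(in_sqrt2_cut)
assert abs(float(approx) ** 2 - 2) < 1e-8
```

Of course, the program only ever inspects finitely many rationals; the point of the set-theoretic definition is that the cut itself, the whole infinite set, is the real number.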

Calculus is the area of mathematics that is more applicable to science than any other area. It can be thought of as a technique for treating a continuous change as being composed of an infinite number of infinitesimal changes. When calculus is applied to physical properties capable of change such as spatial location, ocean salinity, or an electrical circuit’s voltage, these properties are represented with continuous variables that have real numbers for their values. These values are specific real numbers, not ranges of real numbers and not just rational numbers. Achilles’ location along the path to his goal is such a property.
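The idea of treating continuous change as composed of many small changes can be illustrated with a Riemann-sum sketch (the function name and example are illustrative assumptions, not from the source): the distance covered by a runner whose speed at time t is v(t) = 2t over [0, 1], whose exact integral is 1.

```python
# Approximate the integral of v over [a, b] with n small steps,
# sampling the speed at the midpoint of each step.
def riemann_sum(v, a, b, n):
    dt = (b - a) / n
    return sum(v(a + (i + 0.5) * dt) * dt for i in range(n))

v = lambda t: 2 * t
for n in (10, 100, 1000):
    print(n, riemann_sum(v, 0, 1, n))

# The finite sums approach the exact value 1 as the partition is refined.
assert abs(riemann_sum(v, 0, 1, 1000) - 1) < 1e-6
```

The finite sums are approximations; the exact value is the limit as the steps shrink without bound, which is where the concept of infinity enters the calculus.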

It took many centuries to develop the calculus rigorously. A very significant step in this direction occurred in 1888 when Richard Dedekind re-defined the term “infinity” and when Georg Cantor used that definition to create the first set theory, a theory that eventually was developed to the point where it could be used for embedding all classical mathematical theories. What this embedding requires is first defining the terms of any mathematical theory in the language of set theory, then translating the axioms and theorems of the mathematical theory into sentences of set theory, and then showing that these theorems follow logically from the axioms. (The axioms of any theory, such as set theory, are the special sentences of the theory that can always be assumed during the process of deducing its other theorems.) See the example in the Zeno’s Paradoxes article of how Dedekind used set theory and his new idea of “cuts” to define the real numbers in terms of infinite sets of rational numbers. In this way, additional rigor was given to the concepts of mathematics, and more mathematicians were encouraged to accept the notion of actually infinite sets.

The new technical treatment of infinity that originated with Dedekind in 1888 and was adopted by Cantor in his new set theory provided a definition of “infinite set” rather than simply “infinite.” Dedekind says an infinite set is a set that is not finite. The notion of a finite set can be defined in various ways. We might define it numerically as a set having n members, where n is some non-negative integer. Dedekind found an essentially equivalent definition of finite set (assuming the axiom of choice, which will be discussed later), but Dedekind’s definition does not require mentioning numbers:

A (Dedekind) finite set is a set for which there exists no one-to-one correspondence between it and one of its proper subsets.

By placing the fingertips of your left hand on the corresponding fingertips of your right hand, you establish a one-to-one correspondence between the set of fingers on one hand and the set of fingers on the other; in that way, you establish that there is the same number of fingers on each of your hands without needing to count them. More generally, there is a one-to-one correspondence between two sets when each member of one set can be paired off with a unique member of the other set, so that neither set has an unpaired member.

Here is a one-to-one correspondence between the set of natural numbers and its proper subset of even numbers, demonstrating that the set of natural numbers is infinite:

1 2 3 4 …
2 4 6 8 …

Informally expressed, any infinite set can be matched up to a part of itself; so the whole is equivalent to a part. This is a surprising definition because, before this definition was adopted, the idea that actually infinite wholes are equinumerous with some of their parts was taken as clear evidence that the concept of actual infinity is inherently paradoxical. For a systematic presentation of the many alternative ways to successfully define “infinite set” non-numerically, see (Tarski 1924).
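Dedekind’s criterion can be checked mechanically on an initial segment of the natural numbers. This is a minimal sketch, assuming an illustrative pairing n ↔ 2n on the first thousand naturals (a finite stand-in for the actually infinite correspondence):

```python
# The map n -> 2n pairs every natural number with a unique even number,
# matching the naturals one-to-one with a proper subset of themselves.
naturals = range(1, 1001)
evens = [2 * n for n in naturals]

pairing = dict(zip(naturals, evens))

# Each natural is paired with exactly one even number...
assert len(pairing) == len(naturals)
# ...and no even number is paired with two different naturals.
assert len(set(pairing.values())) == len(pairing)
# The evens used here form a proper subset of a larger initial segment.
assert set(evens) < set(range(1, 2001))
```

No finite check can establish the correspondence for all naturals at once; it is the rule n ↔ 2n, applied to the completed infinite set, that makes the whole equinumerous with its part.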

Dedekind’s new definition of “infinite” defines an actually infinite set, not a potentially infinite set, because it appeals to no continuing operation over time. The concept of a potentially infinite set is then given a new technical definition: a potentially infinite set is a growing, finite subset of an actually infinite set. Cantor expressed the point this way:

In order for there to be a variable quantity in some mathematical study, the “domain” of its variability must strictly speaking be known beforehand through a definition. However, this domain cannot itself be something variable…. Thus this “domain” is a definite, actually infinite set of values. Thus each potential infinite…presupposes an actual infinite. (Cantor 1887)

The new idea is that the potentially infinite set presupposes an actually infinite one. If this is correct, then Aristotle’s two notions of the potential infinite and actual infinite have been redefined and clarified.

Two sets are the same if any member of one is a member of the other, and vice versa. Order of the members is irrelevant to the identity of the set, and to the size of the set. Two sets are the same size if there exists a one-to-one correspondence between them. This definition of same size was recommended by both Cantor and Frege. Cantor defined “finite” by saying a set is finite if there is a one-to-one correspondence with the set {1, 2, 3, …, n} for some positive integer n; and he said a set is infinite if it is not finite.

Cardinal numbers are measures of the sizes of sets. There are many definitions of what a cardinal number is, but what is essential for cardinal numbers is that two sets have the same cardinal just in case there is a one-to-one correspondence between them; and set A has a smaller cardinal number than a set B (and so set A has fewer members than B) provided there is a one-to-one correspondence between A and a subset of B, but B is not the same size as A. In this sense, the set of even integers does not have fewer members than the set of all integers, although intuitively you might think it does.

How big is infinity? This question does not make sense for either potential infinity or transcendental infinity, but it does for actual infinity. Finite cardinal numbers such as 0, 1, 2, and 3 are measures of the sizes of finite sets, and transfinite cardinal numbers are measures of the sizes of actually infinite sets. The transfinite cardinals are aleph-null, aleph-one, aleph-two, and so on; we represent them with the numerals ℵ0, ℵ1, ℵ2, …. The smallest infinite size is ℵ0 which is the size of the set of natural numbers, and it is said to be countably infinite (or denumerably infinite or enumerably infinite). The other alephs are measures of the uncountable infinities. However, calling a set of size ℵ0 countably infinite is somewhat misleading since no process of counting is involved. Nobody would have the time to count from 0 to ℵ0.

The set of even integers, the set of natural numbers and the set of rational numbers all can be shown to have the same size, but surprisingly they all are smaller than the set of real numbers. The set of points in the continuum and in any interval of the continuum turns out to be larger than ℵ0, although how much larger is still an open problem, called the continuum problem. A popular but controversial suggestion is that a continuum is of size ℵ1, the next larger size.
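That the rationals have the same size ℵ0 as the naturals can be illustrated with a Cantor-style enumeration. The sketch below (the function name and diagonal cap are illustrative assumptions) walks the diagonals of the grid of numerator/denominator pairs, skipping duplicates such as 2/4 = 1/2, so that every positive rational appears exactly once in a single list; being listable in this way is what “countably infinite” means.

```python
from fractions import Fraction

# Enumerate the first `limit` positive rationals along the diagonals
# numerator + denominator = 2, 3, 4, ..., skipping repeated values.
def positive_rationals(limit: int):
    seen, out = set(), []
    for diagonal in range(2, 100):          # numerator + denominator
        for num in range(1, diagonal):
            q = Fraction(num, diagonal - num)
            if q not in seen:
                seen.add(q)
                out.append(q)
                if len(out) == limit:
                    return out
    return out

first = positive_rationals(10)
print(first)
assert first[0] == Fraction(1, 1)
assert Fraction(1, 2) in first and Fraction(2, 1) in first
assert len(set(first)) == 10                # all distinct
```

By contrast, Cantor’s diagonal argument shows that no such single list can exhaust the real numbers, which is why the continuum is strictly larger than ℵ0.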

When creating set theory, mathematicians did not begin with the belief that there would be so many points between any two points in the continuum nor with the belief that for any infinite cardinal there is a larger cardinal. These were surprising consequences discovered by Cantor. To many philosophers, this surprise is evidence that what is going on is not invention but rather is discovery about mind-independent reality.

The intellectual community has always been wary of actually infinite sets. Before the discovery of how to embed calculus within set theory (a process that is also called giving calculus a basis in set theory), it could have been more easily argued that science does not need actual infinities. The burden of proof has now shifted, and the default position is that actual infinities are indispensable in mathematics and science, and anyone who wants to do without them must show that removing them does not do too much damage and has additional benefits. There are no known successful attempts to reconstruct the theories of mathematical physics without basing them on mathematical objects such as numbers and sets, but for one attempt to do so using second-order logic, see (Field 1980).

Here is why some mathematicians believe the set-theoretic basis is so important:

Just as chemistry was unified and simplified when it was realized that every chemical compound is made of atoms, mathematics was dramatically unified when it was realized that every object of mathematics can be taken to be the same kind of thing [namely, a set]. There are now other ways than set theory to unify mathematics, but before set theory there was no such unifying concept. Indeed, in the Renaissance, mathematicians hesitated to add x² to x³, since the one was an area and the other a volume. Since the advent of set theory, one can correctly say that all mathematicians are exploring the same mental universe. (Rucker 1982, p. 64)

But the significance of this basis can be exaggerated. The existence of the basis does not imply that mathematics is set theory.

Paradoxes soon were revealed within set theory—by Cantor himself and then others—so the quest for a more rigorous definition of the mathematical continuum continued. Cantor’s own paradox surfaced in 1895 when he asked whether the set of all cardinal numbers has a cardinal number. Cantor showed that, if it does, then it doesn’t. Surely the set of all sets would have the greatest cardinal number, but Cantor showed that for any cardinal number there is a greater cardinal number.  [For more details about this and the other paradoxes, see (Suppes 1960).] The most famous paradox of set theory is Russell’s Paradox of 1901. He showed that the set of all sets that are not members of themselves is both a member of itself and not a member of itself. Russell wrote that the paradox “put an end to the logical honeymoon that I had been enjoying.”

These and other paradoxes were eventually resolved satisfactorily by finding revised axioms of set theory that permit the existence of enough well-behaved sets so that set theory is not crippled [that is, made incapable of providing a basis for mathematical theories], and yet do not permit the existence of too many sets, such as the ill-behaved sets of Cantor’s paradox (the set of all cardinals) and Russell’s paradox (the set of all sets that are not members of themselves). Finally, by the mid-20th century it had become clear that, despite the existence of competing set theories, Zermelo-Fraenkel set theory (ZF) was the best, or at least the least radical, way to revise set theory so as to avoid all the known paradoxes and problems while preserving enough of our intuitive ideas about sets that it deserved to be called a set theory. By this time, most mathematicians agreed that the continuum had been given a proper basis in ZF. See (Kleene 1967, pp. 189-191) for comments on this agreement about ZF’s success, for a list of the ZF axioms, and for a detailed explanation of why each axiom deserves to be an axiom.

Because of this success, and because it was clear enough that the concept of infinity used in ZF does not lead to contradictions, and because it seemed so evident how to use the concept in other areas of mathematics and science where the term “infinity” was being used, the definition of the concept of “infinite set” within ZF was claimed by many philosophers to be the paradigm example of how to provide a precise and fruitful definition of a philosophically significant concept. Much less attention was then paid to critics who had complained that we can never use the word “infinity” coherently because infinity is ineffable or inherently paradoxical.

Nevertheless, there was, and still is, serious philosophical opposition to actually infinite sets and to ZF’s treatment of the continuum, and this has spawned the programs of constructivism, intuitionism, finitism, and ultrafinitism, all of whose advocates have philosophical objections to actual infinities. Even though there is much to be said in favor of replacing a murky concept with a clearer, technical concept, there is always the worry that the replacement is a change of subject that has not really solved the problems it was designed for. More discussion of the role of infinity in mathematics and science continues in later sections of this article.

2. Infinity and the Mind

Can humans grasp the concept of the infinite? This seems to be a profound question. Ever since Zeno, intellectuals have realized that careless reasoning about infinity can lead to paradox and perhaps “defeat” the human mind.

Some critics of infinity argue not just that paradox can occur but that paradox is essential to, or inherent in, the use of the concept of infinity, so the infinite is beyond the grasp of the human mind. However, this criticism applies more properly to some forms of transcendental infinity rather than to either actual infinity or potential infinity. This is a consequence of the development of set theory as we shall see in a later section.

A second reason to believe humans cannot grasp infinity is that the concept must contain an infinite number of sub-concepts, which is too many for our finite minds. A counter to this reason is to defend the psychological claim that if a person succeeds in thinking about infinity, it does not follow that the person needs to have an actually infinite number of ideas in mind at one time.

A third reason to believe the concept of infinity is beyond human understanding is that to have the concept one must have some accurate mental picture of infinity. Thomas Hobbes, who believed that all thinking is based on imagination, might remark that nobody could picture an infinite number of grains of sand at once. However, most contemporary philosophers of psychology believe mental pictures are not essential to have a concept. Regarding the concept of dog, you might have a picture of a brown dog in your mind, and I might have a picture of a black dog in mine, but I can still understand you perfectly well when you say dogs frequently chase cats.

The main issue here is whether we can coherently think about infinity to the extent of being said to have the concept. Here is a simple argument that we can: If we understand negation and have the concept of finite, then the concept of infinite is merely the concept of not-finite. A second argument says the apparent consistency of set theory indicates that infinity in the technical sense of actual infinity is well within our grasp. And since potential infinity is definable in terms of actual infinity, it, too, is within our grasp.

Assuming that infinity is within our grasp, what is it that we are grasping? Philosophers disagree on the answer. In 1883, the father of set theory, Georg Cantor, created a formal theory of infinite sets as a way of clarifying the infinite. This was a significant advance, but the notion of set can be puzzling. If you understand that a pencil is on my desk, must you implicitly understand that a set containing a pencil is on my desk? Plus a set containing that set? And another set containing the set containing the set with the pencil, and so forth to infinity?

In regard to mentally grasping an infinite set or any other set, Cantor said:

A set is a Many which allows itself to be thought of as a One.

Notice the dependence of a set upon thought. Cantor eventually clarified what he meant and was clear that he did not want set existence to depend on mental capability. What he really believed is that a set is a collection of well-defined and distinct objects that exist independently of being thought of, but that might be thought of by a powerful enough mind.

3. Infinity in Metaphysics

There is a concept which corrupts and upsets all others. I refer not to Evil, whose limited realm is that of ethics; I refer to the infinite. —Jorge Luis Borges.

Shakespeare declared, “The will is infinite.” Is he correct or just exaggerating? Critics who interpret Shakespeare literally might argue that the will is basically a product of different brain states. Because a person’s brain contains approximately 10²⁷ atoms, it has only a finite number of configurations or states, and so, regardless of whether we interpret Shakespeare’s remark as implying that the will is unbounded (is potentially infinite) or that the will produces an infinite number of brain states (is actually infinite), the will is not infinite. But perhaps Shakespeare was speaking metaphorically and did not intend to be taken literally, or perhaps he meant to use some version of transcendental infinity that makes infinity be somehow beyond human comprehension.

Contemporary Continental philosophers often speak that way. Emmanuel Levinas says the infinite is another name for the Other, for the existence of other conscious beings besides ourselves whom we are ethically responsible for. We “face the infinite” in the sense of facing a practically incomprehensible and unlimited number of possibilities upon encountering another conscious being. (See Levinas 1961.) If we ask what sense of “infinite” is being used by Levinas, it may be yet another concept of infinity, or it may be some kind of transcendental infinity. Another interpretation is that he is exaggerating about the number of possibilities and should say instead that there are too many possibilities to be faced when we encounter another conscious being and that the possibilities are not readily predictable because other conscious beings make free choices, the causes of which often are not known even to the person making the choice.

Leibniz was one of the few persons in earlier centuries who believed in actually infinite sets, but he did not believe in infinite numbers. Cantor did. Referring to his own discovery of the transfinite cardinals ℵ0, ℵ1, ℵ2, …. and their properties, Cantor claimed his work was revealing God’s existence and that these mathematical objects were in the mind of God. He claimed God gave humans the concept of the infinite so that they could reflect on His perfection. Influential German neo-Thomists such as Constantin Gutberlet agreed with Cantor. Some Jesuit math instructors claim that by taking a calculus course and a set theory course and coming to understand infinity, students are getting closer to God. Their critics complain that these mystical ideas about infinity and God are too speculative.

When metaphysicians speak of infinity they use all three concepts: potential infinity, actual infinity, and transcendental infinity. But when they speak about God being infinite, they are usually interested in implying that God is beyond human understanding or that there is a lack of a limit on particular properties of God, such as God’s goodness and knowledge and power.

The connection between infinity and God exists in nearly all of the world’s religions. It is prominent in Hindu, Muslim, Jewish, and Christian literature. For example, in chapter 11 of the Bhagavad Gita of Hindu scripture, Krishna says, “O Lord of the universe, I see You everywhere with infinite form….”

Plato did not envision God (the Demiurge) as infinite because he viewed God as perfect, and he believed anything perfect must be limited and thus not infinite, because the infinite was defined as an unlimited, unbounded, indefinite, unintelligible chaos.

But the meaning of the term “infinite” slowly began to change. Over six hundred years later, the Neo-Platonist philosopher Plotinus was one of the first important Greek philosophers to equate God with the infinite, although he did not do so explicitly. He said instead that any idea abstracted from our finite experience is not applicable to God. He probably believed that if God were finite in some aspect, then there could be something beyond God and therefore God wouldn’t be “the One.” Plotinus was influential in helping remove the negative connotations that had accompanied the concept of the infinite. One difficulty here, though, is that it is unclear whether metaphysicians have discovered that God is identical with the transcendentally infinite or whether they are simply defining “God” to be that way. A more severe criticism is that perhaps they are just defining “infinite” (in the transcendental sense) as whatever God is.

Augustine, who merged Platonic philosophy with the Christian religion, spoke of God “whose understanding is infinite” for “what are we mean wretches that dare presume to limit His knowledge?” Augustine wrote that the reason God can understand the infinite is that “…every infinity is, in a way we cannot express, made finite to God….” [City of God, Book XII, ch. 18] This is an interesting perspective. Medieval philosophers debated whether God could understand infinite concepts other than Himself, not because God had limited understanding, but because there was no such thing as infinity anywhere except in God.

The medieval philosopher Thomas Aquinas, too, said God has infinite knowledge. He definitely did not mean potentially infinite knowledge. The technical definition of actual infinity might be useful here. If God is infinitely knowledgeable, this can be understood perhaps as meaning that God knows the truth values of all declarative sentences and that the set of these sentences is actually infinite.

Aquinas argued in his Summa Theologica that, although God created everything, nothing created by God can be actually infinite. His main reason was that anything created can be counted, yet if an infinity were created, then the count would be infinite, and no infinite numbers exist to do the counting (as Aristotle had also said). This argument was more persuasive in his day than it is now, because in the late 19th century Cantor created (or discovered) the infinite numbers.

René Descartes believed God was actually infinite, and he remarked that the concept of actual infinity is so awesome that no human could have created it or deduced it from other concepts, so any idea of infinity that humans have must have come from God directly. Thus God exists. Descartes is using the concept of infinity to produce a new ontological argument for God’s existence.

David Hume, and many other philosophers, raised the problem that if God has infinite power then there need not be evil in the world, and if God has infinite goodness, then there should not be any evil in the world. This problem is often referred to as “The Problem of Evil” and has been a long-standing point of contention for theologians.

Spinoza and Hegel envisioned God, or the Absolute, pantheistically. If they are correct, then to call God infinite is to call the world itself infinite. Hegel denigrated Aristotle’s advocacy of potential infinity and claimed the world is actually infinite. Traditional Christian, Muslim, and Jewish metaphysicians do not accept the pantheistic notion that God is at one with the world. Instead, they say God transcends the world. Since God is outside space and time, the space and time that he created may or may not be infinite, depending on God’s choice, but surely everything else he created is finite, they say.

The multiverse theories of cosmology in the early 21st century allow there to be an uncountable infinity of universes within a background space whose volume is actually infinite. The universe created by our Big Bang is just one of these many universes. Christian theologians often balk at the notion of God choosing to create this multiverse because of the theory’s implication that, although there are many universes radically different from ours, there is also an actually infinite number of universes just like ours. This implies there is an infinite number of indistinguishable copies of Jesus, each of whom has been crucified on the cross. This removal of the uniqueness of Jesus is apparently a removal of his dignity. Augustine had this worry about uniqueness when considering infinite universes, and he responded that “Christ died once for sinners….”

There are many other entities and properties that some metaphysician or other has claimed are infinite: places, possibilities, propositions, properties, particulars, partial orderings, pi’s decimal expansion, predicates, proofs, Plato’s forms, principles, power sets, probabilities, positions, and possible worlds. That is just for the letter p. Some of these are considered to be abstract objects, objects outside of space and time, and others are considered to be concrete objects, objects within, or part of, space and time.

For helpful surveys of the history of infinity in theology and metaphysics, see (Owen 1967) and (Moore 2001).

4. Infinity in Physical Science

From a metaphysical perspective, the theories of mathematical physics seem to be ontologically committed to objects and their properties. If any of those objects or properties are infinite, then physics is committed to there being infinity within the physical world.

Here are four suggested examples where infinity occurs within physical science. (1) Standard cosmology based on Einstein’s general theory of relativity implies the density of the mass at the center of a simple black hole is infinitely large (even though the black hole’s total mass is finite). (2) The Standard Model of particle physics implies the size of an electron is infinitely small. (3) General relativity implies that every path in space is infinitely divisible. (4) Classical quantum theory implies the values of the kinetic energy of an accelerating, free electron are infinitely numerous. These four kinds of infinities—infinitely large, infinitely small, infinitely divisible, and infinitely numerous—are implied by theory and argumentation, and are not something that could be measured directly.

Objecting to taking scientific theories at face value, the 18th-century British empiricists George Berkeley and David Hume denied the physical reality of even potential infinities on the empiricist grounds that such infinities are not detectable by our sense organs. Most philosophers of the 21st century would say that Berkeley’s and Hume’s empirical standards are too rigid because they are based on the mistaken assumption that our knowledge of reality must be a complex built up from simple impressions gained from our sense organs.

But in the spirit of Berkeley’s and Hume’s empiricism, instrumentalists also challenge any claim that science tells us the truth about physical infinities. The instrumentalists say that all theories of science are merely effective “instruments” designed for explanatory and predictive success. A scientific theory’s claims are neither true nor false. By analogy, a shovel is an effective instrument for digging, but a shovel is neither true nor false. The instrumentalist would say our theories of mathematical physics imply only that reality looks “as if” there are physical infinities. Some realists on this issue respond that to declare it to be merely a useful mathematical fiction that there are physical infinities is just as misleading as to say it is mere fiction that moving planets actually have inertia or petunias actually contain electrons. We have no other tool than theory-building for accessing the existing features of reality that are not directly perceptible. If our best theories—those that have been well tested and are empirically successful and make novel predictions—use theoretical terms that refer to infinities, then infinities must be accepted. See (Leplin 2000) for more details about anti-realist arguments, such as those of instrumentalism and constructive empiricism.

a. Infinitely Small and Infinitely Divisible

Consider the size of electrons and quarks, the two main components of atoms. All scientific experiments so far have been consistent with electrons and quarks having no internal structure (components), as our best scientific theories imply, so the “simple conclusion” is that electrons are infinitely small, or infinitesimal, and zero-dimensional. Is this “simple conclusion” too simple? Some physicists speculate that there are no physical particles this small and that, in each subsequent century, physicists will discover that all the particles of the previous century have a finite size due to some inner structure. However, most physicists withhold judgment on this point about the future of physics.

A second reason to question whether the “simple conclusion” is too simple is that electrons, quarks, and all other elementary particles behave in a quantum mechanical way. They have a wave nature as well as a particle nature, and they have these simultaneously. When probing an electron’s particle nature, it is found to have no limit to how small it can be, but when probing the electron’s wave nature, the electron is found to be spread out through all of space, although it is more likely to be found in some places than in others. Also, quantum theory is about groups of objects, not a single object. The theory does not imply a definite result for a single observation but only for averages over many observations, which is why quantum theory introduces inescapable randomness or unpredictability into claims about single objects and single experimental results. The more accurate theory of quantum electrodynamics (QED), which incorporates special relativity and improves on classical quantum theory for the smallest regions, also implies electrons are infinitesimal particles when viewed as particles, while they are wavelike or spread out when viewed as waves. When considering the electron’s particle nature, QED’s prediction of zero volume has been experimentally verified down to the limits of measurement technology. The measurement process is limited by the fact that light or other electromagnetic radiation must be used to locate the electron, and this light cannot be used to determine the position of the electron more accurately than the distance between the wave crests of the light wave used to bombard the electron. All this is why the “simple conclusion” mentioned at the beginning of this paragraph may be too simple. For more discussion, see the chapter “The Uncertainty Principle” in (Hawking 2001) or (Greene 1999, pp. 121-2).

If a scientific theory implies space is a continuum, with the structure of a mathematical continuum, then if that theory is taken at face value, space is infinitely divisible and composed of infinitely small entities, the so-called points of space. But should it be taken at face value? The mathematician David Hilbert declared in 1925, “A homogeneous continuum which admits of the sort of divisibility needed to realize the infinitely small is nowhere to be found in reality. The infinite divisibility of a continuum is an operation which exists only in thought.” Hilbert said actual, transcendental infinities are real in mathematics, but not in physics. Many physicists agree with Hilbert. Many other physicists and philosophers argue that, although Hilbert is correct that ordinary entities such as strawberries and cream are not continuous, he is ultimately incorrect, for the following reasons.

First, the Standard Model of particles and forces is one of the best tested and most successful theories in all the history of physics. So are the theories of relativity and quantum mechanics. All these theories imply or assume that, using Cantor’s technical sense of actual infinity, there are infinitely many infinitesimal instants in any non-zero duration, and there are infinitely many point places along any spatial path. So, time is a continuum, and space is a continuum.

The second challenge to Hilbert’s position is that quantum theory, in agreement with relativity theory, implies that for any possible kinetic energy of a free electron there is half that energy, insofar as an electron can be said to have a value of energy independent of being measured to have it. Although the energy of an electron bound within an atom is quantized, the energy of an unbound or free electron is not. If it accelerates in its reference frame from zero to nearly the speed of light, its energy changes and takes on all intermediate real-numbered values from its rest energy to its total energy. But mass is just a form of energy, as Einstein showed in his famous equation E = mc², so in this sense mass is a continuum as well as energy.

How about non-classical quantum mechanics, the proposed theories of quantum gravity that are designed to remove the disagreements between quantum mechanics and relativity theory? Do these non-classical theories quantize all these continua we’ve been talking about? One such theory, the theory of loop quantum gravity, implies space consists of discrete units called loops. But string theory, which is the more popular of the theories of quantum gravity in the early 21st century, does not imply space is discontinuous. [See (Greene 2004) for more details.] Speaking about this question of continuity, the theoretical physicist Brian Greene says that, although string theory is developed against a background of continuous spacetime, his own insight is that

[T]he increasingly intense quantum jitters that arise on decreasing scales suggest that the notion of being able to divide distances or durations into ever smaller units likely comes to an end at around the Planck length (10⁻³³ centimeters) and Planck time (10⁻⁴³ seconds). …There is something lurking in the microdepths−something that might be called the bare-bones substrate of spacetime−the entity to which the familiar notion of spacetime alludes. We expect that this ur-ingredient, this most elemental spacetime stuff, does not allow dissection into ever smaller pieces because of the violent fluctuations that would ultimately be encountered…. [If] familiar spacetime is but a large-scale manifestation of some more fundamental entity, what is that entity and what are its essential properties? As of today, no one knows. (Greene 2004, pp. 473, 474, 477)

Disagreeing, the theoretical physicist Roger Penrose speaks about both loop quantum gravity and string theory and says:

…in the early days of quantum mechanics, there was a great hope, not realized by future developments, that quantum theory was leading physics to a picture of the world in which there is actually discreteness at the tiniest levels. In the successful theories of our present day, as things have turned out, we take spacetime as a continuum even when quantum concepts are involved, and ideas that involve small-scale spacetime discreteness must be regarded as ‘unconventional.’ The continuum still features in an essential way even in those theories which attempt to apply the ideas of quantum mechanics to the very structure of space and time…. Thus it appears, for the time being at least, that we need to take the use of the infinite seriously, particularly in its role in the mathematical description of the physical continuum. (Penrose 2005, 363)

b. Singularities

There is a good reason why scientists fear the infinite more than mathematicians do. Scientists have to worry that someday we will have a dangerous encounter with a singularity, with something that is, say, infinitely hot or infinitely dense. For example, we might encounter a singularity by being sucked into a black hole. According to Schwarzschild’s solution to the equations of general relativity, a simple, non-rotating black hole is infinitely dense at its center. For a second example of where there may be singularities, there is good reason to believe that 13.8 billion years ago the entire universe was a singularity with infinite temperature, infinite density, infinitesimal volume, and infinite curvature of spacetime.

Some philosophers will ask: Is it not proper to appeal to our best physical theories in order to learn what is physically possible? Usually, but not in this case, say many scientists, including Albert Einstein. He believed that, if a theory implies that some physical properties might have or, worse yet, do have actually infinite values (the so-called singularities), then this is a sure sign of error in the theory. It’s an error primarily because the theory will be unable to predict the behavior of the infinite entity, and so the theory will fail. For example, even if there were a large, shrinking universe pre-existing the Big Bang, if the Big Bang were considered to be an actual singularity, then knowledge of the state of the universe before the Big Bang could not be used to predict events after the Big Bang, or vice versa. This failure to imply the character of later states of the universe is what Einstein’s collaborator Peter Bergmann meant when he said, “A theory that involves singularities…carries within itself the seeds of its own destruction.” The majority of physicists probably would agree with Einstein and Bergmann about this, but the critics of these scientists say this belief that we need to remove singularities everywhere is merely a hope that has been turned into a metaphysical assumption.

But doesn’t quantum theory also rule out singularities? Yes. Quantum theory allows only arbitrarily large, finite values of properties such as temperature and mass-energy density. So which theory, relativity theory or quantum theory, should we trust to tell us whether the center of a black hole is or isn’t a singularity? The best answer is, “Neither, because we should get our answer from a theory of quantum gravity.” A principal attraction of string theory, a leading proposal for a theory of quantum gravity to replace both relativity theory and quantum theory, is that it eliminates the many singularities that appear in previously accepted physical theories such as relativity theory. In string theory, the electrons and quarks are not point particles but are small, finite loops of fundamental string. That finiteness in the loop is what eliminates the singularities.

Unfortunately, string theory has its own problems with infinity. It implies an infinity of kinds of particles. If a particle is a string, then the energy of the particle should be the energy of its vibrating string. Strings have an infinite number of possible vibrational patterns, each corresponding to a particle that should exist if we take the theory literally. One response that string theorists make to this problem about too many particles is that perhaps the infinity of particles did exist at the time of the Big Bang but they have all since disintegrated into a shower of simpler particles and so do not exist today. Another response favored by string theorists is that perhaps there never was an infinity of particles nor a Big Bang singularity in the first place. Instead, the Big Bang was a Big Bounce or quick expansion from a pre-existing, shrinking universe whose size stopped shrinking when it got below the critical Planck length of about 10⁻³⁵ meters.

c. Idealization and Approximation

Scientific theories use idealization and approximation; they are “lies that help us to see the truth,” to use a phrase from the painter Pablo Picasso (who was speaking about art, not science). In our scientific theories, there are ideal gases, perfectly elliptical orbits, and economic consumers motivated only by profit. Everybody knows these are not intended to be real objects. Yet, it is clear that idealizations and approximations are actually needed in science in order to promote genuine explanation of many phenomena. We need to reduce the noise of the details in order to see what is important. In short, approximations and idealizations can be explanatory. But what about approximations and idealizations that involve the infinite?

Although the terms “idealization” and “approximation” are often used interchangeably, John Norton (Norton 2012) recommends paying more attention to their difference. When there is some aspect of the world, some target system, that we are trying to understand scientifically, approximations should be considered inexact descriptions of the target system, whereas idealizations should be considered new systems, or parts of new systems, that also approximate the target system but that contain reference to some novel object or property. For example, elliptical orbits are approximations to actual orbits of planets, but ideal gases are idealizations because they contain novel objects, such as point-sized gas particles, that are part of a new system that is useful for approximating the target system of actual gases.

Philosophers of science disagree about whether all appeals to infinity can be known a priori to be mere idealizations or approximations. Our theory of the solar system justifies our belief that the Earth is orbited by a moon, not just an approximate moon. The speed of light in a vacuum really is constant, not just approximately constant. Why then should it be assumed, as it often is, that all appeals to infinity in scientific theory are approximations or idealizations? Must the infinity be an artifact of the model rather than a feature of actual physical reality?  Philosophers of science disagree on this issue. See (Mundy, 1990, p. 290).

There is an argument for believing some appeals to infinity definitely are neither approximations nor idealizations. The argument presupposes a realist rather than an antirealist understanding of science, and it begins with a description of the opponents’ position. Carl Friedrich Gauss (1777-1855) was one of the greatest mathematicians of all time. He said scientific theories involve infinities merely as approximations or idealizations and merely in order to make for easy applications of those theories, when in fact all real entities are finite. At the time, nearly everyone would have agreed with Gauss. Roger Penrose argues against Gauss’ position:

Nevertheless, as tried and tested physical theory stands today—as it has for the past 24 centuries—real numbers still form a fundamental ingredient of our understanding of the physical world. (Penrose 2004, 62)

Gauss’s position could be buttressed if there were useful alternatives to our physical theories that do not use infinities. There actually are alternative mathematical theories of analysis that do not use real numbers and do not use infinite sets and do not require the line to be dense. See (Ahmavaara 1965) for an example. Representing the majority position among scientists on this issue, Penrose says, “To my mind, a physical theory which depends fundamentally upon some absurdly enormous…number would be a far more complicated (and improbable) theory than one that is able to depend upon a simple notion of infinity” (Penrose 2005, 359). David Deutsch agrees. He says, “Versions of number theory that confined themselves to ‘small natural numbers’ would have to be so full of arbitrary qualifiers, workarounds and unanswered questions, that they would be very bad explanations until they were generalized to the case that makes sense without such ad-hoc restrictions: the infinite case.” (Deutsch 2011, pp. 118-9) And surely a successful explanation is the surest route to understanding reality.

In opposition to this position of Penrose and Deutsch, and in support of Gauss’ position, the physicist Erwin Schrödinger remarks, “The idea of a continuous range, so familiar to mathematicians in our days, is something quite exorbitant, an enormous extrapolation of what is accessible to us.” Emphasizing this point about being “accessible to us,” some metaphysicians attack the applicability of the mathematical continuum to physical reality on the grounds that a continuous human perception over time is not mathematically continuous. Wesley Salmon responds to this complaint from Schrödinger:

…The perceptual continuum and perceived becoming [that is, the evidence from our sense organs that the world changes from time to time] exhibit a structure radically different from that of the mathematical continuum. Experience does seem, as James and Whitehead emphasize, to have an atomistic character. If physical change could be understood only in terms of the structure of the perceptual continuum, then the mathematical continuum would be incapable of providing an adequate description of physical processes. In particular, if we set the epistemological requirement that physical continuity must be constructed from physical points which are explicitly definable in terms of observables, then it will be impossible to endow the physical continuum with the properties of the mathematical continuum. In our discussion…, we shall see, however, that no such rigid requirement needs to be imposed. (Salmon 1970, 20)

Salmon continues by making the point that calculus provides better explanations of physical change than explanations which accept the “rigid requirement” of understanding physical change in terms of the structure of the perceptual continuum, so he recommends that we apply Ockham’s Razor and eliminate that rigid requirement. But the issue is not settled.

d. Infinity in Cosmology

Let’s review some of the history regarding the volume of spacetime. Aristotle said the past is infinite because for any past time we can imagine an earlier one. It is difficult to make sense of his belief about the past since he means it is potentially infinite. After all, the past has an end, namely the present, so its infinity has been completed and therefore is not a potential infinity. This problem with Aristotle’s reasoning was first raised in the 13th century by Richard Rufus of Cornwall. It was not given the attention it deserved because of the assumption for so many centuries that Aristotle couldn’t have been wrong about time, especially since his position was consistent with Christian, Jewish, and Muslim theology which implies the physical world became coherent or well-formed only a finite time ago (even if past time itself is potentially infinite). However, Aquinas argued against Aristotle’s view that the past is infinite; Aquinas’ grounds were that Holy Scripture implies God created the world (and thus time itself) a finite time ago and that Aristotle was wrong to put so much trust in what we can imagine.

Unlike time, Aristotle claimed space is finite. He said the volume of physical space is finite because it is enclosed within a finite, spherical shell of visible, fixed stars with the Earth at its center. On this topic of space not being infinite, Aristotle’s influence was authoritative to most scholars for the next eighteen hundred years.

The debate about whether the volume of space is infinite was rekindled in Renaissance Europe. The English astronomer and defender of Copernicus, Thomas Digges (1546–1595) was the first scientist to reject the ancient idea of an outer spherical shell and to declare that physical space is actually infinite in volume and filled with stars. The physicist Isaac Newton (1642–1727) at first believed the universe’s material is confined to only a finite region while it is surrounded by infinite empty space, but in 1691 he realized that if there were a finite number of stars in a finite region, then gravity would require all the stars to fall in together at some central point. To avoid this result, he later speculated that the universe contains an infinite number of stars in an infinite volume. We now know that Newton’s speculation about the stability of an infinity of stars in an infinite universe is incorrect. There would still be clumping so long as the universe did not expand. (Hawking 2001, p. 9)

Immanuel Kant (1724–1804) declared that space and time are both potentially infinite in extent because this is imposed by our own minds. Space and time are not features of “things in themselves” but are an aspect of the very form of any possible human experience, he said. We can know a priori even more about space than about time, he believed; and he declared that the geometry of space must be Euclidean. Kant’s approach to space and time as something knowable a priori went out of fashion in the early 20th century. It was undermined in large part by the discovery of non-Euclidean geometries in the 19th century, then by Beltrami’s and Klein’s proofs that these geometries are as logically consistent as Euclidean geometry, and finally by Einstein’s successful application to physical space of non-Euclidean geometry within his general theory of relativity.

The volume of spacetime is finite at present if we can trust the classical Big Bang theory. [But do not think of this finite space as having a boundary beyond which a traveler falls over the edge into nothingness.] Assuming space is all the places that have been created since the Big Bang, then the volume of space is definitely finite at present, though it is huge and growing ever larger over time. Assuming this expansion will never stop, it follows that the volume of spacetime is potentially infinite but not actually infinite. For more discussion of the issue of the volume of spacetime, see (Greene 2011).

Einstein’s theory of relativity implies that all physical objects must travel at less than light speed (in a vacuum). Nevertheless, by exploiting the principles of time dilation and length contraction in his special theory of relativity, the time limits on human exploration of the universe can be removed. Assuming you can travel safely at any speed under light speed, then as your spaceship approaches light speed, your trip’s distance and travel time, as measured on your own clock, become arbitrarily short. In principle, you have time on your own clock to cross the Milky Way galaxy, a trip that takes light itself 100,000 years as measured on an Earth clock.
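The arithmetic behind this claim can be checked directly with the Lorentz factor from special relativity. A minimal sketch in Python; the function name and the sample speeds are illustrative choices, not from the text:

```python
import math

def proper_time_years(distance_ly, v_fraction_of_c):
    """On-board (proper) travel time, in years, to cross a distance
    given in light-years at constant speed v = v_fraction_of_c * c.

    Earth-frame travel time is distance/v; the traveler's clock runs
    slower by the Lorentz factor gamma = 1/sqrt(1 - v^2/c^2).
    """
    earth_frame_years = distance_ly / v_fraction_of_c
    gamma = 1.0 / math.sqrt(1.0 - v_fraction_of_c ** 2)
    return earth_frame_years / gamma

# Crossing the Milky Way: ~100,000 light-years in the Earth frame.
for v in (0.9, 0.99, 0.999999):
    print(v, round(proper_time_years(100_000, v)))
```

As v is pushed toward the speed of light, gamma grows without bound and the on-board time shrinks toward zero, which is exactly the sense in which the paragraph says the time limits on exploration can be removed; at 99.9999% of light speed the crossing takes under 150 years of ship time.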

5. Infinity in Mathematics

The previous sections of this article have introduced the concepts of actual infinity and potential infinity and explored the development of calculus and set theory, but this section probes deeper into the role of infinity in mathematics. Mathematicians always have been aware of the special difficulty in dealing with the concept of infinity in a coherent manner. Intuitively, it seems reasonable that if we have two infinities of things, then we still have an infinity of them. So, we might represent this intuition mathematically by the equation 2·∞ = 1·∞. Dividing both sides by ∞ would then prove that 2 = 1, which is a good sign we were not using infinity in a coherent manner. In recommending how to use the concept of infinity coherently, Bertrand Russell said pejoratively:

The whole difficulty of the subject lies in the necessity of thinking in an unfamiliar way, and in realising that many properties which we have thought inherent in number are in fact peculiar to finite numbers. If this is remembered, the positive theory of infinity…will not be found so difficult as it is to those who cling obstinately to the prejudices instilled by the arithmetic which is learnt in childhood. (Salmon 1970, 58)

That positive theory of infinity that Russell is talking about is set theory, and the new arithmetic is the result of Cantor’s generalizing the notions of order and of size of sets into the infinite, that is, to the infinite ordinals and infinite cardinals. These numbers are also called transfinite ordinals and transfinite cardinals. The following sections will briefly explore set theory and the role of infinity within mathematics. The main idea, though, is that the basic theories of mathematical physics are properly expressed using the differential calculus with real-number variables, and these concepts are well-defined in terms of set theory which, in turn, requires using actual infinities or transfinite infinities of various kinds.
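Russell’s point, that many laws we think inherent in number are in fact peculiar to finite numbers, can be illustrated even with the floating-point infinity built into most programming languages. A hedged sketch in Python; this is the IEEE-754 convention, not Cantor’s transfinite arithmetic, but it shows the same failure of cancellation that blocks the “2 = 1” argument above:

```python
import math

inf = float('inf')   # IEEE-754 positive infinity

# 2*inf and 1*inf compare equal, since both are simply inf:
assert 2 * inf == 1 * inf

# But the step "divide both sides by inf" is blocked: the quotient
# inf/inf is not 2, not 1, but undefined (NaN, "not a number"):
print(inf / inf)          # nan
print(inf - inf)          # nan: subtraction is undefined as well
print(math.isnan(inf / inf))   # True
```

Cancellation and subtraction, valid for every finite number, simply fail once infinity is admitted, which is why a separate positive theory (Cantor’s set theory) was needed.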

a. Infinite Sums

In the 17th century, when Newton and Leibniz invented calculus, they wondered what the value is of this infinite sum:

1/1 + 1/2 + 1/4 + 1/8 + ….

They believed the sum is 2. Knowing about the dangers of talking about infinity, most later mathematicians hoped to find a technique to avoid using the phrase “infinite sum.” Cauchy and Weierstrass eventually provided this technique two centuries later. They removed any mention of “infinite sum” by using the formal idea of a limit. Informally, the Cauchy-Weierstrass idea is that instead of overtly saying the infinite sum x1 + x2 + x3 + … is some number S, as Newton and Leibniz were saying, one should say that the series converges to S just in case the numerical difference between S and any partial sum is as small as one desires, provided that partial sum occurs sufficiently far out in the sequence of partial sums. More formally it is expressed this way:

If an infinite series of real numbers is x1 + x2 + x3 + …, and if the infinite sequence of its partial sums is s1, s2, s3, …, then the series converges to S if and only if for every positive number ε there exists an integer n such that, for all integers k > n, |sk − S| < ε.

This technique of talking about limits was due to Cauchy in 1821 and Weierstrass in the period from 1850 to 1871. The two drawbacks to this technique are that (1) it is unintuitive and more complicated than Newton and Leibniz’s intuitive approach that did mention infinite sums, and (2) it is not needed because infinite sums were eventually legitimized by being given a set-theoretic foundation.
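The epsilon-style definition above can be checked numerically for Newton and Leibniz’s series. A small Python sketch; the tolerance eps and the cutoff of 60 terms are arbitrary choices for illustration:

```python
def partial_sums(n_terms):
    """Partial sums s_k of the series 1/1 + 1/2 + 1/4 + 1/8 + ..."""
    s, term, sums = 0.0, 1.0, []
    for _ in range(n_terms):
        s += term
        sums.append(s)
        term /= 2.0
    return sums

S = 2.0          # the claimed limit
eps = 1e-6       # "every positive number epsilon": pick any tolerance
sums = partial_sums(60)

# Find an index past which every partial sum stays within eps of S,
# as the definition requires:
n = next(k for k, s in enumerate(sums) if abs(s - S) < eps)
assert all(abs(s - S) < eps for s in sums[n:])
print(n, sums[n])
```

After n terms the partial sum differs from 2 by exactly 2^(1−n), so for eps = 10⁻⁶ the definition is satisfied from roughly the 21st term onward; shrinking eps merely pushes the threshold n further out, never past the end.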

b. Infinitesimals and Hyperreals

There has been considerable controversy throughout history about how to understand infinitesimal objects and infinitesimal changes in the properties of objects. Intuitively, an infinitesimal object is as small as you please but not quite nothing. Infinitesimal objects and infinitesimal methods were first used by Archimedes in ancient Greece, but he did not mention them in any publication intended for the public because he did not consider his use of them to be rigorous. Infinitesimals became better known when Leibniz used them in his differential and integral calculus. The differential calculus can be considered to be a technique for treating continuous motion as being composed of an infinite number of infinitesimal steps. The calculus’ use of infinitesimals led to the so-called “golden age of nothing” in which infinitesimals were used freely in mathematics and science. During this period, Leibniz, Euler, and the Bernoullis applied the concept. Euler applied it cavalierly (although his intuition was so good that he rarely if ever made mistakes), but Leibniz and the Bernoullis were concerned with the general question of when we could, and when we could not, consider an infinitesimal to be zero. They were aware of apparent problems with these practices in large part because they had been exposed by Berkeley.

In 1734, George Berkeley attacked the concept of infinitesimal as ill-defined and incoherent because there were no definite rules for when the infinitesimal should be and shouldn’t be considered to be zero. Berkeley, like Leibniz, was thinking of infinitesimals as objects with a constant value–as genuinely infinitesimally small magnitudes–whereas Newton thought of them as variables that could arbitrarily approach zero. Either way, there were coherence problems. The scientists and results-oriented mathematicians of the golden age of nothing had no good answer to the coherence problem. As standards of rigorous reasoning increased over the centuries, mathematicians became more worried about infinitesimals. They were delighted when Cauchy in 1821 and Weierstrass in the period from 1850 to 1875 developed a way to use calculus without infinitesimals, and at this time any appeal to infinitesimals was considered illegitimate, and mathematicians soon stopped using infinitesimals.

Here is how Cauchy and Weierstrass eliminated infinitesimals with their concept of limit. Suppose we have a function f,  and we are interested in the Cartesian graph of the curve y = f(x) at some point a along the x-axis. What is the rate of change of  f at a? This is the slope of the tangent line at a, and it is called the derivative f’ at a. This derivative was defined by Leibniz to be

f’(a) = (f(a + h) − f(a))/h

where h is an infinitesimal. Because of suspicions about infinitesimals, Cauchy and Weierstrass suggested replacing Leibniz’s definition of the derivative with

f’(a) = lim(x → a) (f(x) − f(a))/(x − a)

That is,  f'(a) is the limit, as x approaches a, of the above ratio. The limit idea was rigorously defined using Cauchy’s well-known epsilon and delta method. Soon after the Cauchy-Weierstrass definition of derivative was formulated, mathematicians stopped using infinitesimals.
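The limit definition can be seen at work numerically. In this sketch of mine, for f(x) = x² at a = 3 the ratio (f(x) − f(a))/(x − a) approaches the derivative value 6 as x approaches a, and no infinitesimal is ever invoked, only ordinary real numbers x near a.

```python
# Numerical sketch (my illustration) of the Cauchy-Weierstrass limit
# definition of the derivative: for f(x) = x**2 at a = 3, the ratio
# (f(x) - f(a)) / (x - a) approaches f'(a) = 6 as x approaches a.

def f(x):
    return x * x

a = 3.0
for h in (0.1, 0.01, 0.001, 0.0001):
    x = a + h
    ratio = (f(x) - f(a)) / (x - a)
    print(h, ratio)   # ratios 6.1, 6.01, 6.001, ... up to rounding error
```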

The scientists did not follow the lead of the mathematicians. Despite the lack of a coherent theory of infinitesimals, scientists continued to reason with infinitesimals because infinitesimal methods were so much more intuitively appealing than the mathematicians’ epsilon-delta methods. Although students in calculus classes in the early 21st century are still taught the unintuitive epsilon-delta methods, Abraham Robinson (Robinson 1966) created a rigorous alternative to standard Weierstrassian analysis by using the methods of model theory to define infinitesimals.

Here is Robinson’s idea. Think of the rational numbers in their natural order as being gappy with real numbers filling the gaps between them. Then think of the real numbers as being gappy with hyperreals filling the gaps between them. There is a cloud or region of hyperreals surrounding each real number (that is, surrounding each real number described nonstandardly). To develop these ideas more rigorously, Robinson used this simple definition of an infinitesimal:

h is infinitesimal if and only if 0 < |h| < 1/n, for every positive integer n.

|h| is the absolute value of h.

Robinson did not actually define an infinitesimal as a number on the real line. The infinitesimals were defined on a new number line, the hyperreal line, that contains within it the structure of the standard real numbers from classical analysis. In this sense, the hyperreal line is the extension of the reals to the hyperreals. The development of analysis via infinitesimals creates a nonstandard analysis with a hyperreal line and a set of hyperreal numbers that include real numbers. In this nonstandard analysis, 78+2h is a hyperreal that is infinitesimally close to the real number 78. Sums and products of infinitesimals are infinitesimal.
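Robinson's hyperreals cannot be represented exactly on a computer, but a much simpler system of "dual numbers," in which an infinitesimal h is carried symbolically and the h·h term is discarded, gives a rough feel for arithmetic such as 78 + 2h. This toy model is my own illustration, not Robinson's construction: in the hyperreals a product of two infinitesimals is a smaller nonzero infinitesimal, whereas in this truncated system it collapses to zero.

```python
# Toy model (mine, not Robinson's construction): a "dual number"
# a + b*h, where h is treated as an infinitesimal and the h*h term
# is discarded, mimics arithmetic such as 78 + 2h being
# infinitesimally close to the real number 78.

class Dual:
    def __init__(self, real, infsml=0):
        self.real = real      # standard part
        self.infsml = infsml  # coefficient of the infinitesimal h

    def __add__(self, other):
        return Dual(self.real + other.real, self.infsml + other.infsml)

    def __mul__(self, other):
        # (a + b*h)(c + d*h) = ac + (ad + bc)h, with the h*h term dropped
        return Dual(self.real * other.real,
                    self.real * other.infsml + self.infsml * other.real)

x = Dual(78, 2)      # "78 + 2h"
h = Dual(0, 1)       # a bare infinitesimal
print((x + h).real)  # still 78: an infinitesimal leaves the standard part alone
print((h * h).infsml)  # 0 here; in the true hyperreals it would be a
                       # smaller nonzero infinitesimal
```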

Because of the rigor of the extension, all the arguments for and against Cantor’s infinities apply equally to the infinitesimals. Sentences about the standardly-described reals are true if and only if they are true in this extension to the hyperreals. Nonstandard analysis allows proofs of all the classical theorems of standard analysis, but it very often provides shorter, more direct, and more elegant proofs than those that were originally proved by using standard analysis with epsilons and deltas. Objections by practicing mathematicians to infinitesimals subsided after this was appreciated. With a good definition of “infinitesimal” they could then use it to explain related concepts such as in the sentence, “That curve approaches infinitesimally close to that line.” See (Wolf 2005, chapter 7) for more about infinitesimals and hyperreals.

c. Mathematical Existence

Mathematics is apparently about mathematical objects, so it is apparently about infinitely large objects, infinitely small objects, and infinitely many objects. Mathematicians who are doing mathematics and are not being careful about ontology too easily remark that there are infinite-dimensional spaces, the continuum, continuous functions, an infinity of functions, and this or that infinite structure. Do these infinities really exist? The philosophical literature is filled with arguments pro and con and with fine points about senses of existence.

When axiomatizing geometry, Euclid said that between any two points one could choose to construct a line. Opposed to Euclid’s constructivist stance, many modern axiomatizers take a realist philosophical stance by declaring simply that there exists a line between any two points, so the line pre-exists any construction process. In mathematics, the constructivist will recognize the existence of a mathematical object only if there is at present an algorithm (that is, a step by step “mechanical” procedure operating on symbols that is finitely describable, that requires no ingenuity and that uses only finitely many steps) for constructing or finding such an object. Assertions require proofs. The constructivist believes that to justifiably assert the negation of a sentence S is to prove that the assumption of S leads to a contradiction. So, legitimate mathematical objects must be shown to be constructible in principle by some mental activity and cannot be assumed to pre-exist any such construction process nor to exist simply because their non-existence would be contradictory. A constructivist, unlike a realist, is a kind of conceptualist, one who believes that an unknowable mathematical object is impossible. Most constructivists complain that, although potential infinities can be constructed, actual infinities cannot be.

There are many different schools of constructivism. The first systematic one, and perhaps the most well-known version and most radical version, is due to L.E.J. Brouwer. He is not a finitist,  but his intuitionist school demands that all legitimate mathematics be constructible from a basis of mental processes he called “intuitions.” These intuitions might be more accurately called “clear mental procedures.” If there were no minds capable of having these intuitions, then there would be no mathematical objects just as there would be no songs without ideas in the minds of composers. Numbers are human creations. The number pi is intuitionistically legitimate because we have an algorithm for computing all its decimal digits, but the following number g is not legitimate. It is the number whose nth digit is either 0 or 1, and it is 1 if and only if there are n consecutive 7s in the decimal expansion of pi. No person yet knows how to construct the decimal digits of g. Brouwer argued that the actually infinite set of natural numbers cannot be constructed (using intuitions) and so does not exist. The best we can do is to have a rule for adding more members to a set. So, his concept of an acceptable infinity is closer to that of potential infinity than actual infinity. Hermann Weyl emphasizes the merely potential character of these infinities:

Brouwer made it clear, as I think beyond any doubt, that there is no evidence supporting the belief in the existential character of the totality of all natural numbers…. The sequence of numbers which grows beyond any stage already reached by passing to the next number, is a manifold of possibilities open towards infinity; it remains forever in the status of creation, but is not a closed realm of things existing in themselves. (Weyl is quoted in (Kleene 1967, p. 195))

It is not legitimate for platonic realists, said Brouwer, to bring all the sets into existence at once by declaring they are whatever objects satisfy all the axioms of set theory. Brouwer believed realists accept too many sets because they are too willing to accept sets merely by playing coherently with the finite symbols for them when sets instead should be tied to our experience. For Brouwer, this experience is our experience of time. He believed we should arrive at our concept of the infinite by noticing that our experience of a duration can be divided into parts and then these parts can be further divided, and so on. This infinity is a potential infinity, not an actual infinity. For the intuitionist, there is no determinate, mind-independent mathematical reality that provides the facts to make mathematical sentences true or false. This metaphysical position is reflected in the principles of logic that are acceptable to an intuitionist. For the intuitionist, the sentence “For all x, x has property F” is true only if we have already proved constructively that each x has property F. And it is false only if we have proved that some x does not have property F. Otherwise, it is neither true nor false. The intuitionist does not accept the principle of excluded middle: For any sentence S, either S or the negation of S. Outraged by this intuitionist position, David Hilbert famously responded by saying, “To take the law of the excluded middle away from the mathematician would be like denying the astronomer the telescope or the boxer the use of his fists.” (quoted from Kleene 1967, p. 197) For a presentation of intuitionism with philosophical emphasis, see (Posy 2005) and (Dummett 1977).

Finitists, even those who are not constructivists, also argue that the actually infinite set of natural numbers does not exist. They say there is a finite rule for generating each numeral from the previous one, but the rule does not produce an actual infinity of either numerals or numbers. The ultrafinitist considers the classical finitist to be too liberal because finite numbers such as 2^100 and 2^1000 can never be accessed by a human mind in a reasonable amount of time. Only the numerals or symbols for those numbers can be coherently manipulated. One challenge to ultrafinitists is that they should explain where the cutoff point is between numbers that can be accessed and numbers that cannot be. Ultrafinitists have risen to this challenge. The mathematician Harvey Friedman says:

I raised just this objection [about a cutoff] with the (extreme) ultrafinitist Yessenin-Volpin during a lecture of his. He asked me to be more specific. I then proceeded to start with 2^1 and asked him whether this is “real” or something to that effect. He virtually immediately said yes. Then I asked about 2^2, and he again said yes, but with a perceptible delay. Then 2^3, and yes, but with more delay. This continued for a couple of more times, till it was obvious how he was handling this objection. Sure, he was prepared to always answer yes, but he was going to take 2^100 times as long to answer yes to 2^100 than he would to answering 2^1. There is no way that I could get very far with this. (Elwes 2010, 317)

This battle among competing philosophies of mathematics will not be explored in-depth in this article, but this section will offer a few more points about mathematical existence.

Hilbert argued that “If the arbitrarily given axioms do not contradict one another, then they are true and the things defined by the axioms exist.” But (Chihara 2008, 141) points out that Hilbert seems to be confusing truth with truth in a model. If a set of axioms is consistent, and so is its corresponding axiomatic theory, then the theory defines a class of models, and each axiom is true in any such model, but it does not follow that the axioms are really true. To give a crude, nonmathematical example, consider this set of two axioms {All horses are blue, all cows are green.}. The formal theory using these axioms is consistent and has a model, but it does not follow that either axiom is really true.
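The horse-and-cow example can be made concrete with a tiny interpretation (a sketch of my own): invent a small domain and extensions for the predicates, and both axioms come out true in that model, which establishes consistency without either axiom being really true of the world.

```python
# A tiny sketch (mine) of "truth in a model" for the axiom set
# {All horses are blue, All cows are green}: a made-up domain and an
# interpretation of the predicates in which both axioms hold, even
# though neither axiom is really true.

domain = ["h1", "h2", "c1"]
horse = {"h1", "h2"}
cow = {"c1"}
blue = {"h1", "h2"}
green = {"c1"}

axiom1 = all(x in blue for x in domain if x in horse)   # all horses are blue
axiom2 = all(x in green for x in domain if x in cow)    # all cows are green

print(axiom1 and axiom2)   # True: the theory has a model, so it is consistent
```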

Quine objected to Hilbert’s criterion for existence as being too liberal. Quine’s argument for infinity in mathematics begins by noting that our fundamental scientific theories are our best tools for helping us understand reality and doing ontology. Mathematical theories that imply the existence of some actually infinite sets are indispensable to all these scientific theories, and their referring to these infinities cannot be paraphrased away. All this success is a good reason to believe in some actual infinite sets and to say the sentences of both the mathematical theories and the scientific theories are true or approximately true since their success would otherwise be a miracle. But, he continues, it is no miracle. See (Quine 1960 chapter 7).

Quine believed that infinite sets exist only if they are indispensable in successful applications of mathematics to science; but he believed science so far needs only the first three alephs: ℵ0 for the integers, ℵ1 for the set of point places in space, and ℵ2 for the number of possible lines in space (including lines that are not continuous). The rest of Cantor’s heaven of transfinite numbers is unreal, Quine said, and the mathematics of the extra transfinite numbers is merely “recreational mathematics.” But Quine showed intellectual flexibility by saying that if he were to be convinced more transfinite sets were needed in science, then he’d change his mind about which alephs are real. To briefly summarize Quine’s position, his indispensability argument treats mathematical entities on a par with all other theoretical entities in science and says mathematical statements can be (approximately) true. Quine points out that reference to mathematical entities is vital to science, and there is no way of separating out the evidence for the mathematics from the evidence for the science. This famous indispensability argument has been attacked in many ways. Critics charge, “Quite aside from the intrinsic logical defects of set theory as a deductive theory, this is disturbing because sets are so very different from physical objects as ordinarily conceived, and because the axioms of set theory are so very far removed from any kind of empirical support or empirical testability…. Not even set theory itself can tell us how the existence of a set (e.g. a power set) is empirically manifested.” (Mundy 1990, pp. 289-90). See (Parsons 1980) for more details about Quine’s and other philosophers’ arguments about the existence of mathematical objects.

d. Zermelo-Fraenkel Set Theory

Cantor initially thought of a set as being a collection of objects that can be counted, but this notion eventually gave way to a set being a collection that has a clear membership condition. Over several decades, Cantor’s naive set theory evolved into ZF, Zermelo-Fraenkel set theory, and ZF was accepted by most mid-20th century mathematicians as the correct tool to use for deciding which mathematical objects exist. The acceptance was based on three reasons. (1) ZF is precise and rigorous. (2) ZF is useful for defining or representing other mathematical concepts and methods. Mathematics can be modeled in set theory; it can be given a basis in set theory. (3) No inconsistency has been uncovered despite heavy usage.

Notice that one of the three reasons is not that set theory provides a foundation for mathematics in the sense of justifying the doing of mathematics or in the sense of showing its sentences are certain or necessary. Instead, set theory provides a basis for theories only in the sense that it helps to organize them, to reveal their interrelationships, and to provide a means to precisely define their concepts. The first program for providing this basis began in the late 19th century. Peano had given an axiomatization of the natural numbers. It can be expressed in set theory using standard devices for treating natural numbers and relations and functions and so forth as being sets. (For example, zero is the empty set, and a relation is a set of ordered pairs.) Then came the arithmetization of analysis which involved using set theory to construct from the natural numbers all the negative numbers and the fractions and real numbers and complex numbers. Along with this, the principles of these numbers became sentences of set theory. In this way, the assumptions used in informal reasoning in arithmetic are explicitly stated in the formalism, and proofs in informal arithmetic can be rewritten as formal proofs so that no creativity is required for checking the correctness of the proofs. Once a mathematical theory is given a set theoretic basis in this manner, it follows that if we have any philosophical concerns about the higher level mathematical theory, those concerns will also be concerns about the lower level set theory in the basis.
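The standard devices mentioned in parentheses above can be sketched directly, with Python frozensets standing in for pure sets. This is my illustration of the usual von Neumann encoding of the natural numbers and the Kuratowski ordered pair, not anything specific to the text.

```python
# Sketch (mine) of standard set-theoretic devices: the von Neumann
# naturals (zero is the empty set, n+1 is n together with {n}) and the
# Kuratowski ordered pair, built from Python frozensets as pure sets.

def zero():
    return frozenset()

def successor(n):
    return n | frozenset([n])

def ordered_pair(a, b):
    # Kuratowski: (a, b) = {{a}, {a, b}}
    return frozenset([frozenset([a]), frozenset([a, b])])

two = successor(successor(zero()))
print(len(two))        # the numeral for n has exactly n elements: prints 2
print(zero() in two)   # True: 0 is a member of 2
print(ordered_pair(zero(), two) != ordered_pair(two, zero()))  # order matters
```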

In addition to Dedekind’s definition, there are other acceptable definitions of “infinite set” and “finite set” using set theory. One popular one is to define a finite set as a set onto which a one-to-one function maps the set of all natural numbers that are less than some natural number n. That finite set contains n elements. An infinite set is then defined as one that is not finite. Dedekind, himself, used another definition; he defined an infinite set as one that is not finite, but defined a finite set as any set in which there exists no one-to-one mapping of the set into a proper subset of itself. The philosopher C. S. Peirce suggested essentially the same approach as Dedekind at approximately the same time, but he received little notice from the professional community. For more discussion of the details, see (Wilder 1965, p. 66f, and Suppes 1960, p. 99n).
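Dedekind's criterion can be illustrated concretely (the sketch is mine): the map n → n + 1 is one-to-one and sends the natural numbers into the proper subset {1, 2, 3, …}, witnessing that the naturals are Dedekind-infinite, while for a small finite set a brute-force search confirms that no one-to-one map into a proper subset exists.

```python
from itertools import product, combinations

# Sketch (mine) of Dedekind's criterion for infinite sets.

def shift(n):
    return n + 1   # injective on the naturals; its image omits 0

def dedekind_finite(s):
    """True if no one-to-one map sends s into a proper subset of itself."""
    s = list(s)
    n = len(s)
    for size in range(n):                       # every proper-subset size
        for subset in combinations(s, size):
            for image in product(subset, repeat=n):
                if len(set(image)) == n:        # a one-to-one assignment
                    return False
    return True

sample = list(range(100))
print(len({shift(k) for k in sample}) == len(sample))   # True: one-to-one
print(0 in {shift(k) for k in sample})                  # False: image is proper
print(dedekind_finite({1, 2, 3}))                       # True
```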

Set theory implies quite a bit about infinity. First, infinity in ZF has some very unsurprising features. If a set A is infinite and is the same size as set B, then B also is infinite. If A is infinite and is a subset of B, then B also is infinite. Using the axiom of choice, it follows that a set is infinite just in case for every natural number n, there is some subset whose size is n.

ZF’s axiom of infinity declares that there is at least one infinite set, a so-called inductive set containing zero and the successor of each of its members (such as {0, 1, 2, 3, …}). The power set axiom (which says every set has a power set, namely a set of all its subsets) then generates many more infinite sets of larger cardinality, a surprising result that Cantor first discovered in 1874.
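On finite sets the power set operation can be computed outright, and its 2^n growth is the finite shadow of the cardinality claim just mentioned. The following sketch is mine.

```python
from itertools import combinations

# Sketch (mine): a set of n elements has 2**n subsets, and iterating
# the power set operation drives the size up quickly.

def power_set(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1)
            for c in combinations(s, r)]

s = frozenset({"a", "b", "c"})
p = power_set(s)
print(len(s), len(p), len(power_set(p)))   # 3, then 2**3 = 8, then 2**8 = 256
```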

In ZF, there is no set with maximum cardinality, nor a set of all sets, nor an infinitely descending sequence of sets x0, x1, x2, … in which x1 is in x0, and x2 is in x1, and so forth. There is, however, an infinitely ascending sequence of sets x0, x1, x2, … in which x0 is in x1, and x1 is in x2, and so forth. In ZF, a set exists if it is implied by the axioms; there is no requirement that there be some property P such that the set is the extension of P. That is, there is no requirement that the set be defined as {x| P(x)} for some property P. One especially important feature of ZF is that for any condition or property, there is only one set of objects having that property, but it cannot be assumed that for any property, there is a set of all those objects that have that property. For example, it cannot be assumed that, for the property of being a set, there is a set of all objects having that property.

In ZF, all sets are pure. A set is pure if it is empty or its members are sets, and its members’ members are sets, and so forth. In informal set theory, a set can contain cows and electrons and other non-sets.

In the early years of set theory, the terms “set” and “class” and “collection” were used interchangeably, but in von Neumann–Bernays–Gödel set theory (NBG or VBG) a set is defined to be a class that is an element of some other class. NBG is designed to have proper classes, classes that are not sets, even though they can have members which are sets. The intuitive idea is that a proper class is a collection that is too big to be a set. There can be a proper class of all sets, but neither a set of all sets nor a class of all classes. A nice feature of NBG is that a sentence in the language of ZFC is provable in NBG only if it is provable in ZFC.

Are philosophers justified in saying there is more to know about sets than is contained within ZF set theory? If V is the collection or class of all sets, do mathematicians have any access to V independently of the axioms? This is an open question that arose concerning the axiom of choice and the continuum hypothesis.

e. The Axiom of Choice and the Continuum Hypothesis

Consider whether to believe in the axiom of choice. The axiom of choice is the assertion that, given any collection of non-empty and non-overlapping sets, there exists a ‘choice set’ which is composed of one element chosen from each set in the collection. However, the axiom does not say how to do the choosing. For some sets, there might not be a precise rule of choice. If the collection is infinite and its sets are not well-ordered in any way that has been specified, then there is, in general, no way to define the choice set. The axiom is implicitly used throughout the field of mathematics, and several important theorems cannot be proved without it. Mathematical Platonists tend to like the axiom, but those who want explicit definitions or constructions for sets do not like it. Nor do others who note that mathematics’ most unintuitive theorem, the Banach-Tarski Theorem, requires the axiom of choice. The dispute can get quite intense with advocates of the axiom of choice saying that their opponents are throwing out invaluable mathematics, while these opponents consider themselves to be removing tainted mathematics. See (Wagon 1985) for more on the Banach-Tarski Theorem; see (Wolf 2005, pp. 226-8) for more discussion of which theorems require the axiom.
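The role of a "rule of choice" can be seen in a small sketch of my own: for a finite collection of disjoint non-empty sets, an explicit rule such as "take the smallest element" constructs the choice set outright, which is why the axiom earns its keep only for collections, typically infinite ones, that admit no such rule.

```python
# Sketch (mine): for a finite collection of non-empty, pairwise
# disjoint sets, an explicit rule of choice builds the choice set
# with no appeal to the axiom of choice.

collection = [{3, 1, 4}, {5, 9}, {2, 6}]   # pairwise disjoint, non-empty

# The rule: take the smallest element of each set.
choice_set = {min(s) for s in collection}
print(sorted(choice_set))   # [1, 2, 5]

# The choice set has exactly one member from each set in the collection.
print(all(len(choice_set & s) == 1 for s in collection))   # True
```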

A set is always smaller than its power set. How much bigger is the power set? Cantor’s controversial continuum hypothesis says that the cardinality of the power set of ℵ0 is ℵ1, the next larger cardinal number, and not some higher cardinal. The generalized continuum hypothesis is more general; it says that, given an infinite set of any cardinality, the cardinality of its power set is the next larger cardinal and not some even higher cardinal. Cantor believed the continuum hypothesis, but he was frustrated that he could not prove it. The philosophical issue is whether we should alter the axioms to enable the hypotheses to be proved.

If ZF is formalized as a first-order theory of deductive logic, then both Cantor’s generalized continuum hypothesis and the axiom of choice are consistent with Zermelo-Fraenkel set theory but cannot be proved or disproved from its axioms, assuming that ZF is not inconsistent. In this sense, both the continuum hypothesis and the axiom of choice are independent of ZF. Gödel in 1940 and Cohen in 1964 contributed to the proof of this independence result.

So, how do we decide whether to believe the axiom of choice and continuum hypothesis, and how do we decide whether to add them to the principles of ZF or any other set theory? Most mathematicians do believe the axiom of choice is true, but there is more uncertainty about the continuum hypothesis. The independence does not rule out our someday finding a convincing argument that the hypothesis is true or a convincing argument that it is false, but the argument will need more premises than just the principles of ZF. At this point, the philosophers of mathematics divide into two camps. The realists, who think there is a unique universe of sets to be discovered, believe that if ZF does not fix the truth values of the continuum hypothesis and the axiom of choice, then this is a defect within ZF and we need to explore our intuitions about infinity in order to uncover a missing axiom or two for ZF that will settle the truth values. These persons prefer to think that there is a single system of mathematics to which set theory is providing a foundation, but they would prefer not simply to add the continuum hypothesis itself as an axiom because the hope is to make the axioms “readily believable,” yet it is not clear enough that the axiom itself is readily believable. The second camp of philosophers of mathematics disagree and say the concept of infinite set is so vague that we simply do not have any intuitions that will or should settle the truth values. According to this second camp, there are set theories with and without axioms that fix the truth values of the axiom of choice and the continuum hypothesis, and set theory should no more be a unique theory of sets than Euclidean geometry should be the unique theory of geometry.

Believing that ZFC’s infinities are merely the above-surface part of the great iceberg of infinite sets, many set theorists are actively exploring new axioms that imply the existence of sets that could not be proved to exist within ZFC. So far there is no agreement among researchers about the acceptability of any of the new axioms. See (Wolf 2005, pp. 226-8) and (Rucker 1982, pp. 252-3) for more discussion of the search for these new axioms.

6. Infinity in Deductive Logic

The infinite appears in many interesting ways in formal deductive logic, and this section presents an introduction to a few of those ways. Among all the various kinds of formal deductive logics, first-order logic (the usual predicate logic) stands out as especially important, in part because of the accuracy and detail with which it can mirror mathematical deductions. First-order logic also stands out because it is the strongest logic that has proofs for every one of its infinitely numerous logically true sentences, and that is compact in the sense that if an infinite set of its sentences is inconsistent, then so is some finite subset. But first-order logic has expressive limitations:

[M]any central concepts—such as finitude, countability, minimal closure, wellfoundedness, and well-order—cannot be captured in a first-order language. The Löwenheim-Skolem theorems entail that no infinite structure can be characterized up to isomorphism in a first-order language. …The aforementioned mathematical notions that lack first-order characterizations all have adequate characterizations in second-order languages.
Stewart Shapiro, Handbook of Philosophical Logic, p. 131.

Let’s be clearer about just what first-order logic is. To answer this and other questions, it is helpful to introduce some technical terminology. Here is a chart of what is ahead:

First-order language: a formal language with quantifiers over objects but not over sets of objects.
First-order theory: a set of sentences expressed in a first-order language.
First-order formal system: a first-order theory plus its method for building proofs.
First-order logic: a first-order language with its method for building proofs.

A first-order theory is a set of sentences expressed in a first-order language (which will be defined below). A first-order formal system is a first-order theory plus its deductive structure (method of building proofs). Intuitively and informally, any formal system is a system of symbols that are manipulated by the logician in game-like fashion for the purpose of more deeply understanding the properties of the structure that is represented by the formal system. The symbols denote elements or features of the structure the formal system is being used to represent.

The term “first-order logic” is ambiguous. It can mean a first-order language with its deductive structure, or a first-order language with its semantics, or the academic discipline that studies first-order languages and theories.

Classical first-order logic is classical predicate logic with its core of classical propositional logic. This logic is distinguished by its satisfying certain classically-accepted assumptions: that it has only two truth values (some non-classical logics have an infinite number of truth-values); that every sentence (that is, proposition) gets exactly one of the two truth values; that no sentence can contain an infinite number of symbols; that a valid deduction cannot be made from true sentences to a false one; that deductions cannot be infinitely long; that the domain of an interpretation cannot be empty but can have any infinite cardinality; that an individual constant (name) must name something in the domain; and so forth.

A formal language specifies the language’s vocabulary symbols and its syntax, primarily what counts as being a term or name and what are its well-formed formulas (wffs). A first-order language is a formal language whose symbols are the quantifiers (for example, ∃), connectives (↔), individual constants (a), individual variables (x), predicates or relations (R), and perhaps functions (f) and equality (=). It has a denumerable list of variables. (A set is denumerable or countably infinite if it has size ℵ0.) A first-order language has a finite or countably infinite number of predicate symbols and function symbols, but not zero of both. First-order languages differ from each other only in their predicate symbols or function symbols or constant symbols or in having or not having the equality symbol. See (Wolf 2005, p. 23) for more details. There are denumerably many terms, formulas, and sentences. Also, because there are uncountably many real numbers, a theory of real numbers in a first-order language does not have enough names for all the real numbers.

To carry out proofs or deductions in a first-order language, the language needs to be given a deductive structure. There are several different ways to do this (via axioms, natural deduction, sequent calculus), but the ways are all independent of which first-order language is being used, and they all require specifying rules such as modus ponens for how to deduce wffs from finitely many previous wffs in the deduction.
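The idea of a deductive structure with a rule like modus ponens can be sketched in a few lines. The representation and names below are my own: formulas are strings or nested ("->", A, B) tuples, and the only rule is modus ponens.

```python
# Toy sketch (mine) of a deductive structure whose single rule is
# modus ponens: from A and ("->", A, B), deduce B.

def modus_ponens(premise, conditional):
    """Return B when given A and A -> B; otherwise None."""
    if (isinstance(conditional, tuple) and conditional[0] == "->"
            and conditional[1] == premise):
        return conditional[2]
    return None

def deductive_step(wffs):
    """One round of applying modus ponens to every pair of wffs."""
    new = set(wffs)
    for a in wffs:
        for c in wffs:
            b = modus_ponens(a, c)
            if b is not None:
                new.add(b)
    return new

theory = {"P", ("->", "P", "Q"), ("->", "Q", "R")}
step1 = deductive_step(theory)   # deduces "Q"
step2 = deductive_step(step1)    # then deduces "R"
print("Q" in step1, "R" in step2)   # True True
```

Note that each application of the rule uses only finitely many previous wffs, exactly as the text requires of a deduction.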

To give some semantics or meaning to its symbols, the first-order language needs a definition of valuation and of truth in a valuation and of validity of an argument. In a propositional logic, the valuation assigns to each sentence letter its own single truth value; in predicate logic each term is given its own denotation (its extension), and each predicate is given a set of objects (its extension) in the domain that satisfy the predicate. The valuation rules then determine the truth values of all the wffs. The valuation’s domain is a set containing all the objects that the terms might denote and that the variables range over. The domain may be of any finite or transfinite size, but the variables can range only over objects in this domain, not over sets of those objects.
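A valuation for a toy first-order language can be sketched as follows (the names and the particular extension are my invention): a finite domain, an extension for a one-place predicate F, and evaluation of the universally and existentially quantified sentences over that domain.

```python
# Sketch (mine) of a valuation: a finite domain, an extension for the
# one-place predicate F, and truth values for "for all x, F(x)" and
# "there exists x, F(x)".  The variables range only over objects in
# the domain, never over sets of those objects.

domain = {1, 2, 3, 4}
F_extension = {2, 4}   # the objects in the domain that satisfy F

def forall_F():
    return all(x in F_extension for x in domain)

def exists_F():
    return any(x in F_extension for x in domain)

print(forall_F(), exists_F())   # False True
```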

Tarski, who was influential in giving an appropriate, rigorous definition of a first-order language, was always bothered by the tension between his nominalist view of language as the product of human activity, which is finite, and his view that intellectual progress in logic and mathematics requires treating a formal language as having infinite features such as an infinity of sentences. This article does not explore how this tension can be eased, or whether it should be.

Because a first-order language cannot successfully express sentences that generalize over sets (or properties or classes or relations) of the objects in the domain, it cannot, for example, adequately express Leibniz’s Law that, “If objects a and b are identical, then they have the same properties.” A second-order language can do this. A language is second-order if in addition to quantifiers on variables that range over objects in the domain it also has quantifiers (such as the universal quantifier ∀P) on a second kind of variable P that ranges over properties (or classes or relations) of these objects. Here is one way to express Leibniz’s Law in second-order logic:

(a = b) → ∀P(Pa ↔ Pb)

P is called a predicate variable or property variable. Every valid deduction in first-order logic is also valid in second-order logic. A language is third-order if it has quantifiers on variables that range over properties of properties of objects (or over sets of sets of objects), and so forth. A language is called higher-order if its order is second-order or higher.

The definition of first-order theory given earlier in this section was that it is any set of wffs in a first-order language. A more common definition adds that the set is closed under deduction; this additional requirement implies that every deductive consequence of sentences of the theory is also in the theory. Since the set of deductive consequences is countably infinite, all first-order theories in this sense are countably infinite.

If the language is not explicitly mentioned for a first-order theory, then it is generally assumed that the language is the smallest first-order language that contains all the sentences of the theory. Valuations of the language in which all the sentences of the theory are true are said to be models of the theory.

If the theory is axiomatized, then in addition to the logical axioms there are proper axioms (also called non-logical axioms); these axioms are specific to the theory (and so usually do not hold in other first-order theories). For example, Peano’s axioms when expressed in a first-order language are proper axioms for the formal theory of arithmetic, but they aren’t logical axioms or logical truths. See (Wolf, 2005, pp. 32-3) for specific proper axioms of Peano Arithmetic and for proofs of some of its important theorems.

Besides the above problem about Leibniz’s Law, there is a related problem about infinity that occurs when Peano Arithmetic is expressed as a first-order theory. Gödel’s First Incompleteness Theorem proves that there are some bizarre truths which are independent of first-order Peano Arithmetic (PA), and so cannot be deduced within PA. None of these truths so far are known to lie in mainstream mathematics. But they might. And there is another reason to worry about the limitations of PA. Because the set of sentences of PA is only countable, whereas there are uncountably many sets of numbers in informal arithmetic, it might be that PA is inadequate for expressing and proving some important theorems about sets of numbers. See (Wolf 2005, pp. 33-4, 225).

It seems that all the important theorems of arithmetic and the rest of mathematics can be expressed and proved in another first-order theory, Zermelo-Fraenkel set theory with the axiom of choice (ZFC). Unlike first-order Peano Arithmetic, ZFC needs only a very simple first-order language, one that surprisingly has no undefined predicate symbol, equality symbol, or function symbol other than a single two-place relation symbol intended to represent set membership. The domain is intended to be composed only of sets, but since mathematical objects can be defined to be sets, the domain contains these mathematical objects.

a. Finite and Infinite Axiomatizability

In the process of axiomatizing a theory, any sentence of the theory can be called an axiom. When axiomatizing a theory, there is no problem with having an infinite number of axioms so long as the set of axioms is decidable, that is, so long as there is a finitely long computation or mechanical procedure for deciding, for any sentence, whether it is an axiom.

Logicians are curious as to which formal theories can be finitely axiomatized in a given formal system and which can only be infinitely axiomatized. Group theory is finitely axiomatizable in classical first-order logic, but Peano Arithmetic and ZFC are not. Peano Arithmetic is not finitely axiomatizable because it requires an axiom scheme for induction. An axiom scheme is a countably infinite number of axioms of similar form, and an axiom scheme for induction would be an infinite number of axioms of the form (expressed here informally): “If property P of natural numbers holds for zero, and also holds for n+1 whenever it holds for natural number n, then P holds for all natural numbers.” There needs to be a separate axiom for every property P, and there are countably infinitely many such properties expressible in a first-order language of elementary arithmetic.
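Written formally, the scheme contributes one axiom for each formula φ(x) of the first-order language of arithmetic:

```latex
\bigl[\varphi(0) \land \forall n\,\bigl(\varphi(n) \rightarrow \varphi(n+1)\bigr)\bigr] \rightarrow \forall n\,\varphi(n)
```

Since there are countably many formulas φ(x), the scheme yields countably many axioms rather than a single one.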

Assuming ZF is consistent, ZFC is not finitely axiomatizable in first-order logic, as Richard Montague discovered. Nevertheless, ZFC is a subset of von Neumann–Bernays–Gödel (NBG) set theory, and the latter is finitely axiomatizable, as Paul Bernays discovered. The first-order theory of Euclidean geometry is not finitely axiomatizable, and the second-order logic used in (Field 1980) to reconstruct mathematical physics without quantifying over numbers also is not finitely axiomatizable. See (Mendelson 1997) for more discussion of finite axiomatizability.

b. Infinitely Long Formulas

An infinitary logic is a logic that allows one of classical logic’s necessarily finite features to be infinite. In the languages of classical first-order logic, every formula is required to be only finitely long, but an infinitary logic might relax this. The original, intuitive idea behind requiring finitely long sentences in classical logic was that logic should reflect the finitude of the human mind. But with increasing opposition to psychologism in logic, that is, to making logic somehow dependent on human psychology, researchers began to ignore the finitude restrictions. Löwenheim in about 1915 was perhaps the pioneer here. In 1957, Alfred Tarski and Dana Scott explored permitting the operations of conjunction and disjunction to link infinitely many formulas into an infinitely long formula. Tarski also suggested allowing formulas to have a sequence of quantifiers of any transfinite length. William Hanf proved in 1964 that, unlike classical logics, these infinitary logics fail to be compact. See (Barwise 1975) for more discussion of these developments.

c. Infinitely Long Proofs

Classical formal logic requires any proof to contain a finite number of steps. In the mid-20th century with the disappearance of psychologism in logic, researchers began to investigate logics with infinitely long proofs as an aid to simplifying consistency proofs. See (Barwise 1975).

d. Infinitely Many Truth Values

One reason for permitting an infinite number of truth-values is to represent the idea that truth is a matter of degree. The intuitive idea is that, depending on the temperature, the sentence “This cup of coffee is warm” might be definitely true, less true, even less true, and so forth.

One of the simplest infinite-valued semantics uses a continuum of truth values. Its valuations assign to each basic sentence (a formal sentence that contains no connectives or quantifiers) a truth value that is a specific number in the closed interval of real numbers from 0 to 1. The truth-value of the vague sentence “This water is warm” is understood to be definitely true if it has the truth value 1 and definitely false if it has the truth value 0. To sentences having main connectives, the valuation assigns to the negation ~P of any sentence P the truth value of one minus the truth value assigned to P. It assigns to the conjunction P & Q the minimum of the truth values of P and of Q. It assigns to the disjunction P v Q the maximum of the truth values of P and of Q, and so forth.
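These valuation rules are easy to state directly. In the sketch below, the particular truth values are hypothetical, and the rule for the conditional (the Łukasiewicz conditional) is one common choice that the text above does not fix:

```python
# Degree-of-truth semantics: truth values are real numbers in [0, 1].
def neg(p):     return 1.0 - p                  # ~P: one minus the value of P
def conj(p, q): return min(p, q)                # P & Q: the minimum of the two values
def disj(p, q): return max(p, q)                # P v Q: the maximum of the two values
def cond(p, q): return min(1.0, 1.0 - p + q)    # P -> Q: Lukasiewicz conditional (an assumed choice)

warm = 0.75   # hypothetical degree of truth of "This water is warm"
assert neg(warm) == 0.25
assert conj(warm, 0.5) == 0.5
assert disj(warm, 0.5) == 0.75
```

On the Łukasiewicz rule, a conditional is fully true whenever its consequent is at least as true as its antecedent, and it loses truth gradually as the antecedent outstrips the consequent.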

One advantage of using an infinite-valued semantics is that by permitting modus ponens to produce a conclusion that is slightly less true than either premise, we can create a solution to the paradox of the heap, the sorites paradox. One disadvantage is that there is no well-motivated choice for the specific real number that is the truth value of a vague statement. What is the truth value appropriate to “This water is warm” when the temperature is 100 degrees Fahrenheit and you are interested in cooking pasta in it? Is the truth value 0.635? This latter problem of assigning truth values to specific sentences without being arbitrary has led to the development of fuzzy logics in place of the simpler infinite-valued semantics we have been considering. Lotfi Zadeh suggested that instead of vague sentences having any of a continuum of precise truth values we should make the continuum of truth values themselves imprecise. His suggestion was to assign a sentence a truth value that is a fuzzy set of numerical values, a set for which membership is a matter of degree. For more details, see (Nolt 1997, pp. 420-7).

e. Infinite Models

A countable language is a language with countably many symbols. The Löwenheim-Skolem Theorem says:

If a first-order theory in a countable language has an infinite model, then it has a countably infinite model.

This is a surprising result about infinity. Would you want your theory of real numbers to have a countable model? Strictly speaking, it is a puzzle and not a paradox, because being countably infinite is a property the model has only when viewed from outside the object language, not from within it. The theorem does not imply that, according to first-order theories of the real numbers, there are no more real numbers than there are natural numbers.

The Löwenheim-Skolem Theorem can be extended to say that if a theory in a countable language has a model of some infinite size, then it also has models of any infinite size. This is a limitation on first-order theories; they do not permit having a categorical theory of an infinite structure. A formal theory is said to be categorical if any two models satisfying the theory are isomorphic. The two models are isomorphic if they have the same structure, and they can’t be isomorphic if they have different sizes. So, if you create a first-order theory intended to describe a single infinite structure of a certain size, the theory will end up having, for any infinite size, a model of that size. This frustrates the hopes of anyone who would like to have a first-order theory of arithmetic that has models only of size ℵ0, and to have a first-order theory of real numbers that has models only of size 2^ℵ0. See (Enderton 1972, pp. 142-3) for more discussion of this limitation.

Because of this limitation, many logicians have turned to second-order logics. There are second-order categorical theories for the natural numbers and for the real numbers. Unfortunately, there is no sound and complete deductive structure for any second-order logic having a decidable set of axioms; this is a major negative feature of second-order logics.

To illustrate one more surprise regarding infinity in formal logic, notice that the quantifiers are defined in terms of their domain, the domain of discourse. In a first-order set theory, the expression ∃xPx says there exists some set x in the infinite domain of all the sets such that x has property P. Unfortunately, in ZF there is no set of all sets to serve as this domain. So, it is oddly unclear what the expression ∃xPx means when we intend to use it to speak about sets.

f. Infinity and Truth

According to Alfred Tarski’s Undefinability Theorem, a first-order language rich enough to express its own syntax cannot define a global truth predicate for itself. A global truth predicate is a predicate that is satisfied by all and only the names (via, say, Gödel numbering) of all the true sentences of the formal language. According to Tarski, since no such language has a global truth predicate, the best approach to expressing truth formally within the language is to expand the language into an infinite hierarchy of languages, with each higher language (the metalanguage) containing a truth predicate that can apply to all and only the true sentences of languages lower in the hierarchy. This process is iterated into the transfinite to obtain Tarski’s hierarchy of metalanguages. Some philosophers have suggested that this infinite hierarchy is implicit within natural languages such as English, but other philosophers, including Tarski himself, believe an informal language does not contain within it a formal language.

To handle the concept of truth formally, Saul Kripke rejects the infinite hierarchy of metalanguages in favor of an infinite hierarchy of interpretations (that is, valuations) of a single language, such as a first-order predicate calculus, with enough apparatus to discuss its own syntax. The language’s intended truth predicate T is the only basic (atomic) predicate that is ever partially interpreted at any stage of the hierarchy. At the first step in the hierarchy, all predicates but the single one-place predicate T(x) are interpreted. T(x) is completely uninterpreted at this level. As we go up the hierarchy, the interpretations of the other basic predicates are unchanged, but T is satisfied by the names of sentences that were true at lower levels. For example, at the second level, T is satisfied by the name of the sentence ∀x(Fx v ~Fx). At each step in the hierarchy, more sentences get truth values, but any sentence that has a truth value at one level has that same truth value at all higher levels. T almost becomes a global truth predicate when the inductive interpretation-building reaches the first so-called fixed point level. At this countably infinite level, although T is a truth predicate for all those sentences having one of the two classical truth values, the predicate is not quite satisfied by the names of every true sentence because it is not satisfied by the names of some of the true sentences containing T. At this fixed point level, the Liar sentence (of the Liar Paradox) is still neither true nor false. For this reason, the Liar sentence is said to fall into a “truth gap” in Kripke’s theory of truth. See (Kripke, 1975).

(Yablo 1993) produced a semantic paradox somewhat like the Liar Paradox. Yablo claimed there is no way to coherently assign a truth value to any of the sentences in the countably infinite sequence of sentences of the form, “None of the subsequent sentences are true.” Ask yourself whether the first sentence in the sequence could be true. Notice that no sentence overtly refers to itself. There is controversy in the literature about whether the paradox actually contains a hidden appeal to self-reference, and there has been some investigation of the parallel paradox in which “true” is replaced by “provable.” See (Beall 2001).
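A brute-force check (an illustration, not from Yablo’s paper) shows why the infinity of the sequence matters: every finite truncation of the list is satisfiable, and in exactly one way, so no contradiction arises until all infinitely many sentences are in play.

```python
from itertools import product

def consistent(assignment):
    # Sentence i asserts: "None of the subsequent sentences are true."
    # The assignment is consistent if each sentence's truth value matches
    # what it asserts about the later sentences.
    return all(v == (not any(assignment[i+1:]))
               for i, v in enumerate(assignment))

n = 6  # an arbitrary finite truncation length
models = [a for a in product([False, True], repeat=n) if consistent(a)]
# The unique consistent assignment makes only the last sentence true
# (vacuously, since it has no subsequent sentences).
assert models == [(False,) * (n - 1) + (True,)]
```

In the infinite sequence there is no last, vacuously true sentence to anchor such an assignment, and that is where the paradox bites.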

7. Conclusion

There are many aspects of the infinite that this article does not cover. Here are some of them: an upper limit on the amount of information in the universe, renormalization in quantum field theory, supertasks and infinity machines, categorematic and syncategorematic uses of the word “infinity,” mereology, ordinal and cardinal arithmetic in ZF, the various non-ZF set theories, non-standard solutions to Zeno’s Paradoxes, Cantor’s arguments for the Absolute, Kant’s views on the infinite, quantifiers that assert the existence of uncountably many objects, and the detailed arguments for and against constructivism, intuitionism, and finitism. For more discussion of these latter three programs, see (Maddy 1992).

8. References and Further Reading

  • Ahmavaara, Y. (1965). “The Structure of Space and the Formalism of Relativistic Quantum Theory,” Journal of Mathematical Physics, 6, 87-93.
    • Uses finite arithmetic in mathematical physics, and argues that this is the correct arithmetic for science.
  • Barrow, John D. (2005). The Infinite Book: A Short Guide to the Boundless, Timeless and Endless. Pantheon Books, New York.
    • An informal and easy-to-understand survey of the infinite in philosophy, theology, science, and mathematics. Says which Western philosopher throughout the centuries said what about infinity.
  • Barwise, Jon. (1975) “Infinitary Logics,” in Modern Logic: A Survey, E. Agazzi (ed.), Reidel, Dordrecht, pp. 93-112.
    • An introduction to infinitary logics that emphasizes historical development.
  • Beall, J.C. (2001). “Is Yablo’s Paradox Non-Circular?” Analysis 61, no. 3, pp. 176-87.
    • Discusses the controversy over whether the Yablo Paradox is or isn’t indirectly circular.
  • Cantor, Georg. (1887). “Über die verschiedenen Ansichten in Bezug auf die actualunendlichen Zahlen.” Bihang till Kongl. Svenska Vetenskaps-Akademien Handlingar, Bd. 11 (1886-7), article 19. P. A. Norstedt & Söner: Stockholm.
    • A very early description of set theory and its relationship to old ideas about infinity.
  • Chihara, Charles. (1973). Ontology and the Vicious-Circle Principle. Ithaca: Cornell University Press.
    • Pages 63-65 give Chihara’s reasons for why the Gödel-Cohen independence results are evidence against mathematical Platonism.
  • Chihara, Charles. (2008). “The Existence of Mathematical Objects,” in Proof & Other Dilemmas: Mathematics and Philosophy, Bonnie Gold & Roger A. Simons, eds., The Mathematical Association of America.
    • In chapter 7, Chihara provides a fine survey of the ontological issues in mathematics.
  • Deutsch, David. (2011). The Beginning of Infinity: Explanations that Transform the World. Penguin Books, New York City.
    • Emphasizes the importance of successful explanation in understanding the world, and provides new ideas on the nature and evolution of our knowledge.
  • Descartes, René. (1641). Meditations on First Philosophy.
    • The third meditation says, “But these properties [of God] are so great and excellent, that the more attentively I consider them the less I feel persuaded that the idea I have of them owes its origin to myself alone. And thus it is absolutely necessary to conclude, from all that I have before said, that God exists….”
  • Dummett, Michael. (1977). Elements of Intuitionism. Oxford University Press, Oxford.
    • A philosophically rich presentation of intuitionism in logic and mathematics.
  • Elwes, Richard. (2010). Mathematics 1001: Absolutely Everything That Matters About Mathematics in 1001 Bite-Sized Explanations, Firefly Books, Richmond Hill, Ontario.
    • Contains the quoted debate between Harvey Friedman and a leading ultrafinitist.
  • Enderton, Herbert B. (1972). A Mathematical Introduction to Logic. Academic Press: New York.
    • An introduction to deductive logic that presupposes the mathematical sophistication of an advanced undergraduate mathematics major. The corollary proved on p. 142 says that if a theory in a countable language has a model of some infinite size, then it also has models of any infinite size.
  • Feferman, Anita Burdman, and Solomon Feferman. (2004). Alfred Tarski: Life and Logic, Cambridge University Press, New York.
    • A biography of Alfred Tarski, the 20th century Polish and American logician.
  • Field, Hartry. (1980). Science Without Numbers: A Defense of Nominalism. Princeton: Princeton University Press.
    • Field’s program is to oppose the Quine-Putnam Indispensability argument which apparently implies that mathematical physics requires the existence of mathematical objects such as numbers and sets. Field tries to reformulate scientific theories so, when they are formalized in second-order logic, their quantifiers do not range over abstract mathematical entities. Field’s theory uses quantifiers that range over spacetime points. However, because it uses a second-order logic, the theory is also committed to quantifiers that range over sets of spacetime points, and sets are normally considered to be mathematical objects.
  • Gödel, Kurt. (1947/1983). “What is Cantor’s Continuum Problem?” American Mathematical Monthly 54, 515-525. Revised and reprinted in Philosophy of Mathematics: Selected Readings, Paul Benacerraf and Hilary Putnam (eds.), Prentice-Hall, Inc. Englewood Cliffs, 1964.
    • Gödel argues that the failure of ZF to provide a truth value for Cantor’s continuum hypothesis implies a failure of ZF to correctly describe the Platonic world of sets.
  • Greene, Brian. (2004). The Fabric of the Cosmos. Random House, Inc., New York.
    • Promotes the virtues of string theory.
  • Greene, Brian (1999). The Elegant Universe. Vintage Books, New York.
    • The quantum field theory called quantum electrodynamics (QED) is discussed on pp. 121-2.
  • Greene, Brian. (2011). The Hidden Reality: Parallel Universes and the Deep Laws of the Cosmos. Vintage Books, New York.
    • A popular survey of cosmology with an emphasis on string theory.
  • Hawking, Stephen. (2001). The Illustrated A Brief History of Time: Updated and Expanded Edition. Bantam Dell. New York.
    • Chapter 4 of Brief History contains an elementary and non-mathematical introduction to quantum mechanics and Heisenberg’s uncertainty principle.
  • Hilbert, David. (1925). “On the Infinite,” in Philosophy of Mathematics: Selected Readings, Paul Benacerraf and Hilary Putnam (eds.), Prentice-Hall, Inc. Englewood Cliffs, 1964. 134-151.
    • Hilbert promotes what is now called the Hilbert Program for solving the problem of the infinite by requiring a finite basis for all acceptable assertions about the infinite.
  • Kleene, Stephen Cole. (1967). Mathematical Logic. John Wiley & Sons: New York.
    • An advanced textbook in mathematical logic.
  • Kripke, Saul. (1975). “Outline of a Theory of Truth,” Journal of Philosophy 72, pp. 690–716.
    • Describes how to create a truth predicate within a formal language that avoids assigning a truth value to the Liar Sentence.
  • Leibniz, Gottfried. (1702). “Letter to Varignon, with a note on the ‘Justification of the Infinitesimal Calculus by that of Ordinary Algebra,’” pp. 542-6. In Leibniz Philosophical Papers and Letters, translated by Leroy E. Loemker (ed.). D. Reidel Publishing Company, Dordrecht, 1969.
    • Leibniz defends the actual infinite in calculus.
  • Levinas, Emmanuel. (1961). Totalité et Infini. The Hague: Martinus Nijhoff.
    • In Totality and Infinity, the Continental philosopher Levinas describes infinity in terms of the possibilities a person confronts upon encountering other conscious beings.
  • Maddy, Penelope. (1992). Realism in Mathematics. Oxford: Oxford University Press.
    • A discussion of the varieties of realism in mathematics and the defenses that have been, and could be, offered for them. The book is an extended argument for realism about mathematical objects. She offers a set-theoretic monism in which all physical objects are sets.
  • Maor, E. (1991). To Infinity and Beyond: A Cultural History of the Infinite. Princeton: Princeton University Press.
    • A survey of many of the issues discussed in this encyclopedia article.
  • Mendelson, Elliott. (1997). An Introduction to Mathematical Logic, 4th ed. London: Chapman & Hall.
    • Pp. 225–86 discuss NBG set theory.
  • Mill, John Stuart. (1843). A System of Logic: Ratiocinative and Inductive. Reprinted in J. M. Robson, ed., Collected Works, volumes 7 and 8. Toronto: University of Toronto Press, 1973.
    • Mill argues for empiricism and against accepting the references of theoretical terms in scientific theories if the terms can be justified only by the explanatory success of those theories.
  • Moore, A. W. (2001). The Infinite. Second edition, Routledge, New York.
    • A popular survey of the infinite in metaphysics, mathematics, and science.
  • Mundy, Brent. (1990). “Mathematical Physics and Elementary Logic,” Proceedings of the Biennial Meeting of the Philosophy of Science Association. Vol. 1990, Volume 1. Contributed Papers (1990), pp. 289-301.
    • Discusses the relationships among set theory, logic and physics.
  • Nolt, John. Logics. (1997). Wadsworth Publishing Company, Belmont, California.
    • An undergraduate logic textbook containing in later chapters a brief introduction to non-standard logics such as those with infinite-valued semantics.
  • Norton, John. (2012). “Approximation and Idealization: Why the Difference Matters,” Philosophy of Science, 79, pp. 207-232.
    • Recommends being careful about the distinction between approximation and idealization in science.
  • Owen, H. P. (1967). “Infinity in Theology and Metaphysics.” In Paul Edwards (Ed.) The Encyclopedia of Philosophy, volume 4, pp. 190-3.
    • This survey of the topic is still reliable.
  • Parsons, Charles. (1980). “Quine on the Philosophy of Mathematics.” In L. Hahn and P. Schilpp (Eds.) The Philosophy of W. V. Quine, pp. 396-403. La Salle IL: Open Court.
    • Argues against Quine’s position that whether a mathematical entity exists depends on the indispensability of the mathematical term denoting that entity in a true scientific theory.
  • Penrose, Roger. (2005). The Road to Reality: A Complete Guide to the Laws of the Universe. New York: Alfred A. Knopf.
    • A fascinating book about the relationship between mathematics and physics. Many of its chapters assume sophistication in advanced mathematics.
  • Posy, Carl. (2005). “Intuitionism and Philosophy.” In Stewart Shapiro. Ed. (2005). The Oxford Handbook of Philosophy of Mathematics and Logic. Oxford: Oxford University Press.
    • The history of the intuitionism of Brouwer, Heyting and Dummett. Pages 330-1 explains how Brouwer uses choice sequences to develop “even the infinity needed to produce a continuum” non-empirically.
  • Quine, W. V. (1960). Word and Object. Cambridge: MIT Press.
    • Chapter 7 introduces Quine’s viewpoint that set-theoretic objects exist because they are needed on the basis of our best scientific theories.
  • Quine, W. V. (1986). The Philosophy of W. V. Quine. Editors: Lewis Edwin Hahn and Paul Arthur Schilpp, Open Court, LaSalle, Illinois.
    • Contains the quotation saying infinite sets exist only insofar as they are needed for scientific theory.
  • Robinson, Abraham. (1966). Non-Standard Analysis. Princeton Univ. Press, Princeton.
    • Robinson’s original theory of the infinitesimal and its use in real analysis to replace the Cauchy-Weierstrass methods that use epsilons and deltas.
  • Rucker, Rudy. (1982). Infinity and the Mind: The Science and Philosophy of the Infinite. Birkhäuser: Boston.
    • A survey of set theory with much speculation about its metaphysical implications.
  • Russell, Bertrand. (1914). Our Knowledge of the External World as a Field for Scientific Method in Philosophy. Open Court Publishing Co.: Chicago.
    • Russell champions the use of contemporary real analysis and physics in resolving Zeno’s paradoxes. Chapter 6 is “The Problem of Infinity Considered Historically,” and that chapter is reproduced in (Salmon, 1970).
  • Salmon, Wesley, ed. (1970). Zeno’s Paradoxes. The Bobbs-Merrill Company, Inc., Indianapolis.
    • A collection of the important articles on Zeno’s Paradoxes plus a helpful and easy-to-read preface providing an overview of the issues.
  • Shapiro, Stewart. (2001). “Systems between First-Order and Second-Order Logics,” in D. M. Gabbay and F. Guenthner (eds.), Handbook of Philosophical Logic, 2nd Edition, Volume I, Kluwer Academic Publishers, pp. 131-187.
    • Surveys first-order logics, second-order logics, and systems between them.
  • Smullyan, Raymond. (1967). “Continuum Problem,” in Paul Edwards (ed.), The Encyclopedia of Philosophy, Macmillan Publishing Co. & The Free Press: New York.
    • Discusses the variety of philosophical reactions to the discovery of the independence of the continuum hypotheses from ZF set theory.
  • Suppes, Patrick. (1960). Axiomatic Set Theory. D. Van Nostrand Company, Inc.: Princeton.
    • An undergraduate-level introduction to set theory.
  • Tarski, Alfred. (1924). “Sur les Ensembles Finis,” Fundamenta Mathematicae, Vol. 6, pp. 45-95.
    • Surveys and evaluates alternative definitions of finitude and infinitude proposed by Zermelo, Russell, Sierpinski, Kuratowski, Tarski, and others.
  • Wagon, Stan. (1985). The Banach-Tarski Paradox. Cambridge University Press: Cambridge.
    • The unintuitive Banach-Tarski Theorem says a solid sphere can be decomposed into a finite number of parts and then reassembled into two solid spheres of the same radius as the original sphere. Unfortunately, you cannot double your sphere of solid gold this way.
  • Wilder, Raymond L. (1965) Introduction to the Foundations of Mathematics, 2nd ed., John Wiley & Sons, Inc.: New York.
    • An undergraduate-level introduction to the foundation of mathematics.
  • Wolf, Robert S. (2005). A Tour through Mathematical Logic. The Mathematical Association of America: Washington, D.C.
    • Chapters 2 and 6 describe set theory and its historical development. Both the history of the infinitesimal and the development of Robinson’s nonstandard model of analysis are described clearly on pages 280-316.
  • Yablo, Stephen. (1993). “Paradox without Self-Reference.” Analysis 53: 251-52.
    • Yablo presents a Liar-like paradox involving an infinite sequence of sentences that, the author claims, is “not in any way circular,” unlike with the traditional Liar Paradox.

Author Information

Bradley Dowden
Email: dowden@csus.edu
California State University Sacramento
U. S. A.

Bernard Mandeville (1670—1733)

Bernard Mandeville is primarily remembered for his impact on discussions of morality and economic theory in the early eighteenth century. His most noteworthy and notorious work is The Fable of the Bees, which triggered immense public criticism at the time. He had a particular influence on philosophers of the Scottish Enlightenment, most notably Francis Hutcheson, David Hume, Jean-Jacques Rousseau, and Adam Smith. The Fable’s overall influence on the fields of ethics and economics is, perhaps, one of the greatest and most provocative of all early-eighteenth century English works.

The controversy sparked by the Fable was over Mandeville’s proposal that vices, such as vanity and greed, result in publicly beneficial results. Along the same lines, he proposed that many of the actions commonly thought to be virtuous were, instead, self-interested at their core and therefore vicious. He was a critic of moral systems that claimed humans had natural feelings of benevolence toward one another, and he instead focused attention on self-interested passions like pride and vanity that led to apparent acts of benevolence. This caused his readers to imagine him to be a cruder reincarnation of Thomas Hobbes, particularly as a proponent of egoism. What follows is an overview of Mandeville’s life and influence, paying specific attention to his impact on discussions of morality and economic theory.

Table of Contents

  1. Life
  2. The Fable of the Bees
  3. The Private Vice, Public Benefit Paradox
  4. The Egoist “Culprit”
  5. On Charity
  6. Influence on Economic Theory
  7. References and Further Reading
    1. Works by Mandeville
    2. Secondary Literature

1. Life

Mandeville was born in 1670 to a distinguished family in the Netherlands, either in or near Rotterdam. His father was a physician, as was his great-grandfather, a fact that, no doubt, influenced his own educational path in medicine at the University of Leyden, where he received his M.D. in 1691. He also held a baccalaureate in philosophy, and wrote his dissertation defending the Cartesian doctrine that animal bodies are mere automata because they lack immaterial souls.

Mandeville moved to England some time after the Glorious Revolution of 1688, and it was here he settled permanently, married, and had at least two children. His first published works in English were anonymous pieces in 1703 entitled The Pamphleteers: A Satyr and Some Fables after the Easie and Familiar Method of Monsieur de la Fontaine. In the first, Mandeville defends against those “pamphleteers” who were criticizing both the Glorious Revolution and the late King William III. In Some Fables, he translated twenty-seven of La Fontaine’s Fables, adding two of his own in the same comic style as employed in his later Grumbling Hive.

Although Dr. Mandeville supported his family through his work as a physician, he was also engaged in many literary-political activities. His political interests were not directly obvious until 1714 when he published a piece of political propaganda, The Mischiefs that Ought Justly to be Apprehended from a Whig-Government, which demonstrates his support for the Whig party. Throughout his life, he published numerous smaller works and essays, most of them containing harsh social criticism. Published in 1720, Free Thoughts on Religion, the Church and National Happiness was his final party political tract in which he endorses the advantages of Whig governance as well as advancing a skeptical view of the religious establishment and priestcraft.

Mandeville continued to publish other provocative pieces, for example A Modest Defence of Publick Stews (1724), which contained controversial plans for publicly managed houses of prostitution. In it he argued that the best societal solution was to legalize prostitution and regulate it under strict government supervision. Mandeville’s most notable and notorious work, however, was The Fable of the Bees; it began as an anonymous pamphlet of doggerel verse in 1705, entitled The Grumbling Hive: Or, Knaves Turn’d Honest. More is known of Mandeville’s writings than of his life, and so it is most useful to turn to The Fable for a further examination of his history.

2. The Fable of the Bees

It is rare that a poem finds its way into serious philosophical discussion, as The Grumbling Hive: or, Knaves Turn’d Honest has done. Written in the style of his previous fables, the 433-line poem served as the foundation for Mandeville’s principal work: The Fable of the Bees: or, Private Vices, Publick Benefits. The Fable grew over a period of twenty-four years, eventually reaching its final, sixth edition in 1729. In this work, Mandeville gives his analysis of how private vices result in public benefits such as industry, employment and economic flourishing. His contemporaries read this central analysis as actively promoting vice as the singular explanation and precondition for a thriving economic society, and it was the primary reason for Mandeville’s reputation as a scandalous libertine. This was, however, a misreading of his position. Most of the work he later produced was either an expansion or a defense of the Fable in the light of contemporary opposition.

The Grumbling Hive poem is a short piece, later published as just one section of the larger Fable, which consisted mostly of a series of commentaries upon the 1705 poem. It immediately introduces its reader to a spacious and luxurious hive of bees. This hive was full of vice, “Yet the whole mass a paradise” (The Fable, Vol. I, pg. 31). The society flourished in many ways, but no trade was without dishonesty. Oddly, the worst cheats of the hive were those who complained most about the dishonesty and fraud plaguing their society. Here the poem turns dramatically, as “all the rogues cry’d brazenly, Good gods, had we but honesty!” (The Fable, Vol. I, pg. 33) Jove, the bees’ god, angrily rid the hive of all vice, but the results were catastrophic: the newly virtuous bees were no longer driven to compete with one another. Industry collapsed, and the once flourishing society was destroyed in battle, leaving few bees remaining. These bees, to avoid the vices of ease and extravagance, flew into a hollow tree, contented in their honesty.

The implication of the poem is clear for the beehive, but perhaps not for humanity: it seems paradoxical to suggest that a society is better when it promotes a culture characterized by private vice. However, it is precisely this paradox on which Mandeville draws to make his larger point. The “Moral” at the end of the poem claims, “Fools only strive To make a Great an’ honest Hive.” (The Fable, Vol. I, pg. 36) Mandeville thought the discontent over moral corruptness, or the private vice of society, was either hypocritical or incoherent, as such vice served an indispensable role in the economy by stimulating trade, industry and upward economic improvement, that is, public benefit. The desire to create a purely virtuous society was based on “a vain EUTOPIA seated in the Brain”: fancying that a nation can, with virtues like honesty, attain great wealth and success, when in fact it is the desire to improve one’s material condition in acts of self-indulgence that lies at the heart of economic productivity (The Fable, Vol. I, pg. 36).

The poem’s humorous ending demonstrates that vice can look surprisingly like virtue if implemented correctly. To Mandeville’s readers this was a deeply offensive conclusion to draw, and yet for almost twenty years his work went largely unnoticed. In 1714, Mandeville published The Fable of the Bees, presented as a series of “Remarks” offering an extended commentary upon the original “The Grumbling Hive”, and intended to explain and elucidate the meaning of the earlier poem. But the Fable initially garnered little attention. It was not until the second edition of 1723, featuring a new essay, “An Essay on Charity and Charity-Schools”, that Mandeville gained the notoriety that would make him infamous amongst his contemporaries. The 1723 edition soon prompted public reproach, and was even presented before the Grand Jury of Middlesex and there declared a public nuisance. The presentment of the Jury claimed that the Fable intended to disparage religion and virtue as detrimental to society, and to promote vice as a necessary component of a well-functioning state. Though never censored, the book and its author achieved sudden disrepute, and the Fable found itself the subject of conversation amongst clergymen, journalists, and philosophers.

3. The Private Vice, Public Benefit Paradox

Rather than giving a lengthy argument to prove that private vice can be useful, Mandeville illustrates in the Fable that vice can be disguised, and yet is necessary for the attainment of collective goods, resulting in the paradox of “private vices, public benefits”. For instance, and to take one of Mandeville’s central examples, pride is a vice, and yet without pride there would be no fashion industry, as individuals would lack the motivation to buy new and expensive clothes with which to try to impress their peers. If pride were eradicated tomorrow, the result would be hundreds of bankrupt companies, mass unemployment, the potential collapse of industry, and in turn the devastation of both the economic security and the military power of the British commercial state. Similarly, and on a smaller scale, without thieves there would be no locksmiths, without quarrels over property, no lawyers, and so on.

Crucially, however, Mandeville did not claim a paradox of private vice, public virtue. The “benefits” that arose from individually vicious actions were morally compromised because they were rooted in private self-seeking. This was one of Mandeville’s starkest challenges to his contemporaries, and a point which makes his fundamental philosophical commitments difficult to interpret. It is still disputed what, exactly, Mandeville thought the relation between private vice and public benefit should be: was he merely holding up a mirror to a corrupt society, satirizing those who claimed commercial opulence was straightforwardly compatible with virtue? Or did he seriously believe that modern commercial states should abandon their luxurious comforts for austere self-denial, so as to escape the paradox he alleged? Whatever the case, his notoriety arose from placing the two together, a little too closely for most of his readers’ taste and comfort. Mandeville’s paradox alleged, unapologetically, the tendency of men to hide vices behind socially acceptable forms of behavior, thereby appearing virtuous. On the one hand, Mandeville wished to imply that common sense views are not as reliable as they first appear: what looks like virtuous behavior may in fact be disguised selfishness. On the other, those who preach virtue may turn out to be deluded hypocrites: real virtue would mean the collapse of all the benefits that supervene on private vice. Chief amongst Mandeville’s targets was Anthony Ashley Cooper, Third Earl of Shaftesbury, who claimed that a large-scale flourishing commercial society was compatible with individuals securing virtue by engaging in rational self-restraint whilst enjoying the benefits of economic advancement. For Mandeville, this was preposterous: society could be prosperous and based on private vices, or poor and based on private virtues, but not both.

4. The Egoist “Culprit”

Mandeville’s psychological examination of humankind, often perceived as cynical, is a large part of both his genius and his infamy. In keeping with his profession as a physician, he took on the task of diagnosing society in order to expose what he believed to be the true motives of humankind. Nonetheless, there was a religious component in Mandeville’s thought. His man was necessarily fallen man: capable only of pleasing himself, the individual human being was a postlapsarian creature, irredeemably selfish and greedy for its own private pleasure, at which it always aimed even if it hid such self-seeking behind more respectable facades (The Fable, Vol. I, pg. 348). Mandeville’s examination showed the ways in which people hid their real thoughts and motives behind a mask in order to feign sociability without offending the selfish pride of their peers. Ironically, Mandeville’s own honesty led him into trouble: he boldly claimed that vice was inevitably the foundation of a thriving society, since fallen men could not help acting viciously; whatever displays they affected, at bottom selfishness dictated their actions. All social virtues are evolved from self-love, which is at its core irredeemably vicious. Mandeville also challenged conventional moral terminology by taking a term like “vice” and showing that, despite its negative connotations, it was beneficial to society at large.

In its time, most responses to the Fable were designed as refutations (understandably so, as few desired association with Mandeville’s central thesis), mainly focused on its analysis of the foundations of morality. To many, Mandeville was on a par with Thomas Hobbes in promoting a doctrine of egoism which threatened to render all putative morality a function of morally-compromised selfishness. This accusation comes, in part, from “An Enquiry into the Origin of Moral Virtue” (1723), where Mandeville first proposes his theory of the skillful politician. Whether genuine theory or more of Mandeville’s playful satirizing, the “Enquiry” was a provocative analysis designed to call into question contemporary notions of virtue. According to Mandeville, skillful politicians originally flattered the masses into believing that actions were vicious when done in order to gratify selfish passions, and virtuous when they were performed against the immediate impulse of nature to acquire private pleasure, suppressing this urge temporarily so as not to offend or harm others. But Mandeville’s central contention was that no action was virtuous when inspired by selfish emotions. When men learned to suppress temporarily their urges for pleasure, they did not act from virtue. What they really did was find innovative ways to better secure their private pleasures, by engaging in forms of socially-sanctioned behavior for which they were flattered, thus securing a more advanced form of pleasure than would be had by simply glorying over their peers in immediate displays of selfishness. Because he considered all natural human passions to be selfish, no action could be virtuous if it was done from a natural impulse, which would itself be necessarily selfish. Accordingly, a human could not perform a virtuous act without some form of self-denial.
Skillful politicians invented a sort of quasi-morality by which to control naturally selfish men, but because this involved the redirection of natural passion, not active self-denial, at root it was vice. The upshot of Mandeville’s vision was that, excepting acts of Christian virtue assisted directly by God, all human actions were vicious and thus morally compromised. Unsurprisingly, this view of human nature was thought to be cynical and degrading, which is why he was often categorized with Hobbes, usually by critics of both, as a proponent of a serious egoist system denying the reality of moral distinctions.

Many critical reactions followed Mandeville’s depiction of humankind as selfish and unruly. He was often understood to deny the reality of virtue, morality being merely the invention of skillful politicians in order to tame human passions. As Mandeville’s analysis of human nature developed throughout his life, he placed increasing emphasis on the peculiarity of human passions. His central estimation is that humankind is filled with, and predominantly governed by, the passion of pride, and even when one seems to be acting contrarily, he or she is doing so out of some form of self-interest. He spends a considerable amount of time satirizing “polite” societies whose members imagine their actions to be entirely benevolent. Statements like “Pride and Vanity have built more Hospitals than all the Virtues together” are used to point out the real motives behind seemingly charitable actions (The Fable, Vol. 1, pg. 294). Pride is central to Mandeville’s analysis because it accounts for human actions performed in order to appear selfless and gain public honor, but which can be made into public benefits. It takes the central role in the skillful politician’s plan to socialize humanity through flattery, offering honor as an ever-renewable prize to anyone who would deny his or her immediate self-interest for the sake of another.

For Mandeville, one problem that arose from this account concerned the exact role of skillful politicians in mankind’s societal development. How could it be, if men were only able to please themselves, that some (these skillful politicians) could know enough to control others by instigating a system of social virtues? The second volume of the Fable was written to elucidate difficulties such as these and to explain several things “that were obscure and only hinted at in the First.” (The Fable, Vol. II, pg. vi) To accomplish this task, he fashioned six dialogues between the interlocutors Cleomenes, an advocate for the Fable, and Horatio, described as one who found great delight in Lord Shaftesbury’s writings. These dialogues provided, among other topics, an explanation of how humankind transitioned from its original state of unrestrained self-pleasing into a complex functioning society. Pride was still central to this analysis, but because of the intricacy and confusion behind such a word as pride, Mandeville introduced a helpful distinction between “self-love” and “self-liking”. Self-liking was identified as the cause of pride and shame, accounting for the human need to gain approval from others, whereas self-love referred to the material needs of the body; he asserted that the seeds of politeness were lodged within self-love and self-liking.

In part, this distinction came as a response to Joseph Butler, who claimed that Mandeville’s version of psychological egoism fell apart upon application. Seeking to blunt the consequences of Mandeville’s exposure of the hypocrisy of acting for the public benefit, Butler argued for the compatibility of self-love and benevolence. He did this by making self-love a general rather than a particular passion, and in doing so he made happiness the object of self-love. Happiness, then, would be entirely in the interest of moral subjects. Butler held that self-love was compatible with benevolence because calculating long-term interests led to virtuous action. To Mandeville, however, this avoided the main point by failing to ask the central ethical question: how the distinction between moral and non-moral action can be made if moral acts are indistinguishable from self-interested ones. The second volume of the Fable dismisses many of Butler’s criticisms as ignorant, but Mandeville did realize that his notion of pride needed to be re-conceptualized, because it was a loaded term and yet central to his account. According to Mandeville, Butler’s error, which led him to claim that Mandeville’s system collapsed incoherently, was his failure to recognize that men first had to like themselves, and could only do so through others’ recognition and approbation. Mandeville held that self-love is given to all for self-preservation, but we cannot love what we dislike, and so we must genuinely like our own being. He alleged that nature causes us to value ourselves above our real worth, and so, in order to confirm the good opinions we have of ourselves, we flock together to have these notions affirmed.
He wrote, “an untaught Man would desire every body that came near him, to agree with him in the Opinion of his superiour Worth, and be angry, as far as his Fear would let him, with all that should refuse it: He would be highly delighted with, and love every body, whom he thought to have a good Opinion of him” (The Fable, Vol. II, pg. 138-9). So, he thought that even in an instance where a group of men was fully fed, within less than half an hour self-liking would lead to a desire for superiority in some way, be it through strength, cunning, or some other grander quality.

Mandeville thought introducing the distinction between “self-liking” and “self-love” rectified confusions over the role of pride. Humans have a deeply rooted psychological need for approbation, and this can drive us to ensure we truly possess the qualities we admire in others. In fact, he claimed self-liking is so necessary to beings who indulge it that people can taste no pleasure without it. Mandeville gives an example of the extremity of this need by claiming that self-liking can even drive one to suicide if he or she fails to receive the approbation of others. Still, Mandeville maintains that because our motivation is the pleasure of a good opinion of ourselves along with a good reputation, our achievement of virtuous character traits, even if genuinely desired, is not true virtue. The motivation is selfish and, consequently, not virtuous.

A large part of Mandeville’s later work focused on critiquing theorists like Berkeley, Law, and Shaftesbury. He particularly criticized Shaftesbury, who claimed that human benevolence was natural and that men could act disinterestedly, without regard to pride. Mandeville opposed the search for such an objective standard of morality as being no better than “a Wild-Goose-Chace that is little to be depended on” (The Fable, Vol. I, p. 331). He thought that impressing upon people that they could be virtuous without self-denial would be a “vast inlet to hypocrisy,” deceiving not only everyone else but also themselves (The Fable, Vol. I, p. 331). Mandeville aimed to show that, by their own rigorous and austere standards of morality, his opponents had never performed a virtuous act in their lives; furthermore, if everyone had to live up to these ideals, it would mean the collapse of modern society. Thus, by alleging the difficulty of achieving virtue and the usefulness of vice, his paradox seemed to set a trap. Francis Hutcheson took up this debate in defense of Shaftesbury, seeking an alternate account of human virtue to show how humanity could naturally be virtuous by acting from disinterested benevolence. He found the Fable’s outcome detestable in that it reduced societal virtue to passion and claimed this constituted a comprehensive system of sociability. Hutcheson considered a proper moralist to be one who promoted virtue by demonstrating that it is within one’s own best interest to act virtuously. He argued, by constructing his theory of the moral sense, that virtue was pleasurable and in complete accordance with one’s nature. Still, even with this radical departure from Mandeville’s conclusions, both undoubtedly agreed that reason could not sufficiently supply a standard for action: one must begin with an examination of human nature.

Other philosophers took the Fable in a less outraged and condemnatory fashion than Hutcheson. Rather than insisting, against Mandeville, that human action could be entirely disinterested, Hume agreed with substantial aspects of his basic analysis, but pointed out that if good things result from vice, then there is something deeply incorrect in retaining the terminology of vice at all. Hume considered Hutcheson’s conclusion, that we give our approvals because we are naturally pleased by the actions we find virtuous, to be incorrect. Hume noted, much like Mandeville, that our sense of duty or morality arises only in civilization, and he aligns himself more closely with Mandeville than Hutcheson when accounting for human sociability.

It is, perhaps, through Jean-Jacques Rousseau that Mandeville’s naturalistic account of human sociability found its most important messenger. In 1756, Adam Smith, in his review of Rousseau’s Discourse on the Origins of Inequality, remarked how the second volume of Mandeville’s Fable gave occasion to Rousseau’s system. Rousseau and Mandeville both deny the natural sociability of man and equally stress the gradual evolution of society. For Rousseau, mankind was endowed with pity, or a “natural repugnance at seeing any other sensible being and particularly any of our own species, suffer pain or death” (Discourse on the Origins of Inequality). This pity or compassion plays a large part in modifying amour de soi-même (self-respect) and making it humane. Rousseau saw this passion as natural and acknowledged that Mandeville agreed. What Mandeville failed to see, thought Rousseau, was that from this pity came all of the other societal virtues.

Smith was also influenced by Mandeville, but likewise disagreed with the supposition that people are wholly selfish, and his Theory of Moral Sentiments spends considerable time debunking the positions of Hobbes and Mandeville accordingly. Smith was able to circumvent this purely self-interested account by drawing on the role of sympathy. He supposed the whole account of self-interest found in Hobbes’s and Mandeville’s systems caused such commotion in the world because of misapprehensions about the role of sympathy. Smith determined that an operational system of morals was partly based on its capacity to account for a good theory of fellow feeling. So, for example, Mandeville claimed that one’s motivation to help a beggar on the streets would stem from passions like pity that govern humankind: to walk away from someone in need would raise pity within one’s self in such a way as to cause psychological harm, and therefore any help given would be performed in order to relieve the unease of seeing another in suffering.

Smith also considered Mandeville’s claim that humans only associated with one another to receive pleasure from the esteem they sought. While Smith did not wholly accept this, they both agreed about the enticing nature of public praise and that it can, at times, be a more powerful desire than the accumulation of money. Smith responds directly to Mandeville on this point in the Theory of Moral Sentiments, paying particular attention to Mandeville’s account of the role of pride. Smith rejects Mandeville’s contention that all public spirit and self-sacrifice are merely clever ways to receive the praise of society. He gets around this by drawing a distinction between the desire to become praise-worthy, which is not vice, and the desire for frivolous praise for anything whatsoever. He claims there is a tricky similarity between the two that has been exaggerated by Mandeville, but the distinction is made by separating vanity from the love of true glory. Both are passions, but one is reasonable while the other is ridiculous. Significantly, though, Smith never dismisses the importance of motivation to one’s overall actions, and acknowledges that there are alternate motivations to act which involve both sympathy and self-interest; one may, for example, donate out of some true feeling of sympathy, all the while knowing the move is socially advantageous. Smith gives some praise to Mandeville’s licentious system: even though it was ultimately incorrect, it could not have made so much noise in the world had it not, in some way, bordered upon the truth. Smith noted it was because of Mandeville’s clever, yet misplaced, analysis of human nature that people began to feel the connection between economic activity and human desire.

5. On Charity

In Mandeville’s “Vindication” of the Fable, he proposed that the reason for its sudden popularity may have been his “An Essay on Charity and Charity-Schools” (1723). In this essay Mandeville took his theory from fable to applied social criticism, claiming that charity is often mistaken for pity and compassion. Pity and compassion, as opposed to charity, can be traced back to a desire to think well of one’s self. This “charity”, then, would not be virtuous action but vicious, and therefore worthy of examination. To say Mandeville was unpopular for writing against the formation of charity schools would be an understatement: charity schools were highly regarded and were the most popular form of benevolence in eighteenth-century England. Initiated near the end of the seventeenth century, they were the predominant form of education for the poor. Adopting a charitable air, these schools provided ways to instill virtuous qualities in the minds of poor children. The common attitude toward these children was rather derogatory, often depicting them as “rough” because they came from the ranks of society’s pickpockets, idlers and beggars. The curriculum within charity schools was overtly religious, attempting to instill moral and religious habits so as to turn these children into polite members of society.

Bernard Mandeville opposed the formation of charity schools, and while his disagreement may seem harsh, it is a practical example of the kind of hypocrisy he contested. Mandeville challenged the use of the word “charity” to describe these schools, and claimed that they were formed not out of the virtue of charity, but out of the passion of pity. To him, passions like pity are acted upon to relieve one’s own unease at seeing another in suffering. He explains that, in order for an action to be virtuous, there must not be an impure motive. Acts performed on behalf of friends and family, or done in order to gain honor and public respect, could not be charitable. If charity were reducible to pity, then charity itself would be an undiscriminating universal passion and of no use to society. To him, charity schools were simply clever manifestations of pride. Beginning the essay with his own rigid definition of charity, Mandeville clearly intended to show that these schools were not worthy to be so entitled.

Mandeville argued that pity and compassion were accounted for by human passions, and noted that, though it may seem odd, it is self-love that drives us to relieve these feelings. He drew a sketch of self-love and pity working together with his beggar example. Imagine a beggar on the streets appeals to you by explaining his situation, showing off his wound in need of medical attention, and then imploring you to show virtue for Jesus Christ’s sake by giving him some money. His image raises within you a sense of pity, and you feel compelled to give him money. Mandeville claimed the beggar is a master in this art of capturing pity and makes his marks buy their peace. It is our self-love alone that motivates us to give money to this beggar, which is why the act cannot constitute charity.

The part of the “Essay” that would have been truly offensive to Mandeville’s contemporaries comes when he turns accusations of villainy not on the so-called objects of charity but on people of wealth and education. He attacks those of good reputation and claims that they have earned it only by hiding their private vice behind public benefit. He compared charity schools to the vogue for hooped petticoats, and pointed out that no good reason could be given for either. Moreover, he considered these schools pernicious, as they would weaken the established social hierarchies on which the British state depended. Charity schools were fashionable to support, but beyond this, Mandeville found little reason for their continuation.

Mandeville disagreed with the entire motivation behind charity schools, seeing them as nothing but a system by which the men he most opposed could impart their views onto following generations. Mandeville thought, as was common in his day, that people were born into their life stations and should seek to be content within them. He still considered charity necessary at times, because the helpless should be looked after, but he believed the model of the charity schools would only promote laziness in society. This view becomes less cynical when one considers his support of economic activity as a solution. Mandeville approved of growing industry and saw economic advancement as necessary to advancing civilization, since standards of living were rising: today’s poor were living like yesterday’s rich. He alleged that British prosperity depended, in part, on exploiting the laboring poor, and so it was not economic advancement he challenged, but rather the hypocrisy of individuals who thought that by their public benefit they were advancing society. These citizens were acting out of self-love, not charity, and if this could be realized, then institutions like the charity schools could be given the critical examination Mandeville thought they deserved.

6. Influence on Economic Theory

Mandeville’s defense of luxury stands at the forefront of economic discussions in the eighteenth century. While he charged that a state founded on selfishness is corrupt, he also showed that society must be based upon that selfishness and that no state can be great without embracing luxury. His argument that luxury was harmless to social (if not personal, spiritual) prosperity and necessary for economic flourishing flew in the face of the traditional ascetic moral codes embedded in certain Christian teachings, as well as of earlier republican political theory, which claimed that luxury rendered a population impotent and corrupted individuals, leading to the internal decay of the polity and its vulnerability to external conquest.

Mandeville’s most prevalent influence on economic theory was through Adam Smith. Both of them by and large supported market-based systems of free resource allocation. Mandeville’s commanding point, which could not be ignored by future economists, was that without indulgence there would be little, if any, consumer spending. Mandeville certainly influenced Smith’s economic thought, as Smith picks up the private vice, public benefit paradox in order to claim that one of the original principles in human nature is to barter and trade for private advantage, which then propels commercial society forward, resulting in economic advancement and prosperity. This paradox raised the question of whether self-interested action was vicious, and further proposed that by attending to one’s own needs, one could actually contribute to society in positive ways. In his Wealth of Nations, Smith borrowed largely from Mandeville’s earlier position on the usefulness of self-interested behavior, though he denied the scandalous implications Mandeville drew from it. It has been speculated whether Smith inherited his invisible-hand notion from the paradox Mandeville presented (although the phrase never explicitly appears in Mandeville’s writing), because Smith invokes the invisible hand when he provides an example of unintended public benefit brought about by intending one’s own gain. Mandeville’s influence is also found in the theory of the division of labor, one of the tenets of Smith’s modern economic thought.

Most notably, Mandeville’s work contains the genealogical origins of laissez-faire economic theory, in particular as put forward by Friedrich von Hayek, one of the Fable’s keenest twentieth-century admirers. The similarity lies in Mandeville’s claim that self-seeking individuals will interact in mutually beneficial ways without being coordinated from above, while a natural check on their interactions will result in public benefit as the outcome. Interference with this self-seeking will pervert the balance, as alleged in the conclusion of the Grumbling Hive. Because of this notion of order emerging through voluntary, unregulated activities, Hayek credits Mandeville as one of the first to put forward the concept of “spontaneous order”. Using the same sort of language, Mandeville remarked, “how the short-sighted Wisdom, of perhaps well-meaning People, may rob us of a Felicity, that would flow spontaneously from the Nature of every large Society, if none were to divert or interrupt the Stream” (The Fable, Vol. II, p. 427). Hayek argued that instead of viewing Mandeville solely through the lens of a moral philosopher, we should see him as a great psychologist who may not have contributed much by way of answers, but certainly asked the right questions, using an evolutionary approach to understand society. Hayek even goes so far as to claim that Darwin, in many respects, is the culmination of a development Mandeville started more than any other single person. This approach, rather than assuming society was the product of planning and conscious design by elites, helped spark new empirical explorations. Mandeville saw the sociability of man as arising from two things: the many desires he has, and the opposition met while attempting to satisfy these desires. He brings to the foreground the beneficial effects of luxury, and this was part of what interested John Maynard Keynes.
In his General Theory, Keynes cited Mandeville as a source for his position emphasizing the positive effects of consumption (aggregate demand). This stood in opposition to classical economists, who held up production (aggregate supply) as the motor of economic growth.

While there was no systematic formulation of laissez-faire theory in Mandeville’s writing, it was an important literary source for the doctrine, above all through its analysis of human selfishness and the societal benefits that ironically and unintentionally stem from it. It is precisely through these attempts to reconcile the paradox of private vices, public benefits that we find some of the first leanings toward a modern utilitarian attitude. Accordingly, Mandeville is thought to be one of its most fundamental and early philosophical influences, transmitted in particular by David Hume and Adam Smith to Jeremy Bentham and then John Stuart Mill.

7. References and Further Reading

Bernard Mandeville was an outspoken and controversial author and an equally interesting character. He claimed that he wrote mostly for his own entertainment, but the vast number of essays, poems, and stories he composed should, perhaps, be allowed to speak for themselves. The best modern edition and collection of Mandeville’s work is F.B. Kaye’s The Fable of the Bees. The textual references throughout this article are to Kaye’s Fable as made available through the Online Library of Liberty (1988). The following list of Mandeville’s works is adapted from and indebted to Kaye’s own work on Bernard Mandeville.

a. Works by Mandeville

  • Bernardi a Mandeville de Medicina Oratio Scholastica. Rotterdam: Typis Regneri Leers, 1685.
  • Disputatio Philosophica de Brutorum Operationibus. Leyden: Apud Abrahamum Elzevier, Academiae Typograph, 1689.
  • Disputatio Medica Inauguralis de Chylosi Vitiata. Leyden: Apud Abrahamum Elzevier, Academiae Typograph, 1691.
  • The Pamphleteers: A Satyr. London, 1703.
  • Some Fables after the Easie and Familiar Method of Monsieur de la Fontaine. London, 1703.
  • Aesop Dress’d; or a Collection of Fables Writ in Familiar Verse. By B. Mandeville, M.D. London: Printed for Richard Wellington, 1704.
  • Typhon: or The Wars Between the Gods and Giants; A Burlesque Poem in Imitation of the Comical Mons. Scarron. London: Printed for J. Pero & S. Illidge, and sold by J. Nutt, 1704.
  • The Grumbling Hive: or, Knaves Turn’d Honest. London: Printed for Sam. Ballard and sold by A. Baldwin, 1705.
  • The Virgin Unmask’d: or, Female Dialogues Betwixt an Elderly Maiden Lady, and Her Niece, On Several Diverting Discourses on Love, Marriage, Memoirs, and Morals of the Times. London: Sold by J. Morphew & J. Woodward, 1709.
  • A Treatise of the Hypochondriack and Hysterick Passions, Vulgarly call’d the Hypo in Men and Vapours in Women… By B. de Mandeville, M.D. London: Printed for the author, D. Leach, W. Taylor & J. Woodward, 1711.
  • Wishes to a Godson, with Other Miscellany Poems, By B.M. London: Printed for J. Baker, 1712.
  • The Fable of the Bees: or, Private Vices, Publick Benefits. London: Printed for J. Roberts, 1714.
  • The Mischiefs that Ought Justly to be Apprehended from a Whig-Government. London: Printed for J. Roberts, 1714.
  • Free Thoughts on Religion, the Church and National Happiness, By B.M. London: Sold by T. Jauncy & J. Roberts, 1720.
  • A Modest Defence of Publick Stews… by a Layman. London: Printed by A. Moore, 1724.
  • An Enquiry into the Cause of the Frequent Executions at Tyburn… by B. Mandeville, M.D. London: Sold by J. Roberts, 1725.
  • The Fable of the Bees. Part II. By the Author of the First. London: Sold by J. Roberts, 1729.
  • An Enquiry into the Origin of Honour, and the Usefulness of Christianity in War. By the Author of the Fable of the Bees. London: Printed for J. Brotherton, 1732.
  • A Letter to Dion, Occasion’d by his Book call’d Alciphron or The Minute Philosopher. By the Author of the Fable of the Bees. London: Sold by J. Roberts, 1732.

b. Secondary Literature

  • Cook, H. J. “Bernard Mandeville and the Therapy of ‘The Clever Politician’.” Journal of the History of Ideas 60 (1999): 101-124.
    • On the clever politicians’ manipulation of people’s passions to make politics run smoothly.
  • Goldsmith, M.M. Private Vices, Public Benefits: Bernard Mandeville’s Social and Political Thought. Christchurch, New Zealand: Cybereditions Corporation, 2001.
    • A helpful monograph of Mandeville’s ideas placed in context of eighteenth-century England’s political atmosphere.
  • Hayek, F.A. The Trend of Economic Thinking: Essays on Political Economists and Economic History Volume III. Taylor & Francis e-Library, 2005.
    • See Hayek’s chapter 6 devoted to what he sees as two important Mandevillean contributions to the history of economics.
  • Heath, E. “Mandeville’s Bewitching Engine of Praise.” History of Philosophy Quarterly 15 (1998): 205-226.
    • Offers Mandeville’s account of human nature and how government arises from a state of nature. Also depicts Mandeville as one of the first defenders of commercial modernity.
  • Hont, I. “The early Enlightenment debate on commerce and luxury.” The Cambridge History of Eighteenth-Century Political Thought. 1st ed. Cambridge: Cambridge University Press (2006): 377-418.
    • See especially pages 387-395 for a discussion of Mandeville’s place in the luxury debate.
  • Hundert, E.J. The Enlightenment’s Fable: Bernard Mandeville and the Discovery of Society. Cambridge: Cambridge University Press, 1994.
    • A comprehensive book which examines the strategies of Mandeville’s ideas and sources and how they contributed to his eighteenth-century influence.
  • Hundert, E.J. The Fable of the Bees: And Other Writings. Indianapolis: Hackett Publishing Company, Inc., 1997.
    • An anthology with a wonderful, short introduction to Mandeville.
  • Jones, M.G. The Charity School Movement: A Study of Eighteenth Century Puritanism in Action. Cambridge: Cambridge University Press, 1938.
    • A valuable study, especially helpful in understanding the context of Mandeville’s “An Essay on Charity and Charity-Schools”.
  • Kaye, F.B. Commentary and Introduction to The Fable of the Bees, 2 Volumes. Oxford: Clarendon Press, 1924.
    • The best modern edition and collection of The Fable along with an introduction to Mandeville’s work.
  • Kaye, F.B. “The Writings of Bernard Mandeville: A Bibliographical Survey.” The Journal of English and Germanic Philology 20 (1921): 419-467.
    • A scholarly survey of Mandeville’s writings.
  • Kerkhof, B. “A fatal attraction? Smith’s ‘Theory of moral sentiments’ and Mandeville’s ‘Fable’.” History of Political Thought 16 no. 2 (1995): 219-233.
    • A helpful article on Mandeville’s distinction between “self-liking” and “self-love”.
  • Malcom, J. “One State of Nature: Mandeville and Rousseau.” Journal of the History of Ideas 39 no. 1 (1978): 119-124.
    • A piece exploring the similarities between Mandeville and Rousseau’s state of nature and process of human sociability.
  • Primer, I. (ed.), Mandeville Studies: New Explorations in the Art and Thought of Dr. Bernard Mandeville (1670-1733). The Hague: Nijhoff, 1975.
    • A collection of various articles, including pieces on some of Mandeville’s minor writings and his relation to specific writers, such as: Defoe, Shaftesbury and Voltaire.
  • Primer, I. The Fable of the Bees Or Private Vices, Publick Benefits. New York: Capricorn Books, 1962.
    • A helpful edited edition good for a basic overview of Mandeville’s thought, complete with an introduction.
  • Runciman, D. Political Hypocrisy: The Mask of Power, from Hobbes to Orwell and Beyond. Princeton: Princeton University Press, 2008.
    • In chapter 2 the author proposes two types of hypocrisy present in Mandeville’s analysis and demonstrates how, to Mandeville, certain kinds of hypocrisy are necessary whilst others are detestable.
  • Welchman, J. “Who Rebutted Bernard Mandeville?” History of Philosophy Quarterly 24 No. 1 (2007): 57-74.
    • On Mandeville and some of his moral interlocutors. It presents several attempts to rebut Mandeville made by Hutcheson, Butler, Berkeley, Hume, and Smith.

Author Information

Phyllis Vandenberg
Email: vandenbp@gvsu.edu
Grand Valley State University
U. S. A.

and

Abigail DeHart
Email: dehartab@mail.gvsu.edu
Grand Valley State University
U. S. A.

Jules Lequyer (Lequier) (1814—1862)

Like Kierkegaard, Jules Lequyer (Luh-key-eh) resisted, with every philosophical and literary tool at his disposal, the monistic philosophies that attempt to weave human choice into the seamless cloth of the absolute. Although haunted by the suspicion that freedom is an illusion fostered by an ignorance of the causes working within us, he maintained that in whatever ways we are made—by God, the forces of nature, or the conventions of society—there remain frayed strands in the fabric of human existence where self-making adds to the process. Declaring this freedom “the first truth” required by all genuine inquiry into truth, he also challenged traditional doctrines of divine creativity, eternity, and omniscience and he developed his own alternative based on what he saw as the implications of a true metaphysics of freedom.

Lequyer was a reclusive Breton who died in relative obscurity without having published anything. He held no important academic post and most of his literary and philosophical work remained unfinished. Despite these disadvantages, his influence on philosophy was much greater than the ignorance of his thought and of his name would suggest. Charles Renouvier and William James adopted many of his ideas about the meaning of human freedom, its reality, and how it is known. Echoes of Lequyer’s ideas, and sometimes the very phrases he used, are found in French existentialism and American process philosophy. A man of deep religious conviction but also of increasingly melancholy temperament, Lequyer expressed his philosophy in a variety of literary styles. As a consequence, he has been called “the French Kierkegaard,” although he and his more famous Danish contemporary knew nothing of each other.

Table of Contents

  1. Biography
  2. Philosophy of Freedom
  3. Theological Applications
  4. Philosophical Legacy
  5. Conclusion
  6. References and Further Reading
    1. Primary Sources
    2. English Translations
    3. Secondary Sources in French and English

1. Biography

Joseph-Louis-Jules Lequyer, born January 29, 1814 in the village of Quintin, France, was an only child. His father, Joseph Lequyer (1779-1837), was a respected physician, and his mother, Céleste-Reine-Marie-Eusèbe Digaultray (1772-1844), cared for the poor and sick in the Quintin hospital. The family name was subject to a variety of spellings, most notably, “Lequier” and “Lequyer” (occasionally with an accent aigu over the first e). Lequyer’s birth certificate had “Lequier” but in 1834 his father had the spelling legally fixed as “Lequyer” [Grenier, La Philosophie de Jules Lequier, 257-58]. Lequyer was not consistent in the way he spelled his name and the orthographic confusion persists in the scholarly literature. “Lequyer” is the spelling on the plaque marking his birthplace in Quintin and on his tombstone in Plérin.

Lequyer’s parents relocated from Quintin to the nearby town of St.-Brieuc along the north coast of Brittany where their son was educated at the minor seminary. By the age of thirteen, he excelled in Greek and Latin. A pious Catholic upbringing, combined with his friendship with Louis Épivent (1805-1876), who himself became a cleric, nurtured Lequyer’s interests in philosophy and theology, especially the perennial question of human free will. The family spent vacations just north of St.-Brieuc near Plérin at an isolated cottage known as Plermont (a contraction of “Plérin” and “mont”) within walking distance of the coast. In this rural setting Lequyer spent many happy hours with his closest friend, Mathurin Le Gal La Salle (1814-1904). Another important attachment of his early years was Anne Deszille (1818-1909), also known as “Nanine.” Lequyer never married, although he twice proposed to Deszille (in 1851 and in 1861) and, to his great disappointment, she twice refused.

In 1834 Lequyer entered the École Polytechnique in Paris. The school regimen required students to rise at dawn, eat a meager breakfast, then study scientific subjects—mathematics, physics, and chemistry—until lunchtime. After lunch, there were military exercises, fencing, and horse riding, as well as lessons in dance and music. After supper, students retired to their studies until nightfall. The rigid schedule did not suit Lequyer’s contemplative habits so he was at cross purposes with some of his superiors. His troubles were exacerbated by the unexpected death of his father in 1837. The following year he failed the exam that would have qualified him to become a lieutenant. Viewing an offer to enter the infantry as an insult, he made a dramatic exit. He announced his resignation to the examining officer with these words: “My general, there are two types of justice, mine and yours” [Hémon, 145]. Of some interest is Lequyer’s physical description from his matriculation card: he stood just under five and a half feet, had blond hair, brown eyes, a straight nose, a small mouth, an oval face, a round chin, and scars under his left eye and on the right side of his chin [Brimmer 1975, Appendix III]. The scar on his chin was from a riding accident at the school which, in later years, he covered by wearing a beard.

The course of study in Paris introduced Lequyer to the determinism of Pierre-Simon Laplace (1749-1827). As the school’s military schedule had conflicted with his temperament, so the idea that every event is necessitated by its causes was in tension with his cherished religious ideas, in particular, the conviction of free will. By happy coincidence, he found in his new friend and classmate Charles Renouvier (1815-1903) a sounding board for his quandaries about freedom and necessity. Renouvier saw in Lequyer a strange combination of religious naïveté and philosophical profundity. Indeed, Renouvier never failed to acknowledge Lequyer’s genius and to refer to him—literally, to his dying days—as his “master” on the subject of free will [Derniers entretiens, 64]. Lequyer, chronically unable to complete most of what he wrote, benefited from Renouvier’s industry. Renouvier eventually published a small library of books, in some of which he included excerpts from Lequyer’s writing. Three years after his friend’s death Renouvier published, at his own expense, one hundred and twenty copies of a handsome edition of his selection of Lequyer’s writings, which he distributed free of charge to any interested party.

Upon leaving the École Polytechnique, Lequyer used the inheritance from his father to retire to Plermont where he lived with his mother and the family servant, Marianne Feuillet (probably born in 1792). Lequyer never had a head for finances, so his money was soon exhausted, although there remained properties in St.-Brieuc that his father had owned. In 1843, the three moved to Paris where Lequyer acquired a position teaching French composition to Egyptian nationals at the École Égyptienne. He had the misfortune of teaching at the school during its decline. Nevertheless, he worked to redesign its curriculum after the model of the École Polytechnique, but centered more on literature, poetry, and even opera. Lequyer’s mother died the year following the move to Paris. Worried over the state of her son’s mind, she entrusted him to the care of Feuillet with these words: “Oh, Marianne, keep watch over my poor Jules. He has in his heart a passion which, I greatly fear, will be the cause of his death” [Hémon, 172]. The exact object of his mother’s concern is unknown but in the fullness of time her words became prophetic.

On August 15, 1846, the day of celebration of the Assumption of Mary, Lequyer underwent a mystical experience that was occasioned by his meditations on the Passion of Christ. He wrote down his experience, alternating between French and Latin, which invites a comparison with Pascal’s Memorial. Lequyer’s indignation at those who caused Christ’s suffering is transformed, first, into a profound sense of repentance as he realizes that he too had “added some burden to the cross” by his sins, and, second, into gratitude for the love of God in being forgiven his sins. On August 19th, the religious ecstasy recurred, this time as he took communion at the church of St.-Sulpice. Again, the theme of the suffering of Christ is paramount, but now giving way to a determination to share in those sufferings to such an extent that the Virgin Mary would be unable to distinguish him from her own son. Lequyer’s first biographer, Prosper Hémon (1846-1918), spoke of the philosopher’s “bizarre religiosity” [Hémon, 184], but there can be no question that, despite his shortcomings and misfortunes, his mystical experiences found outlet in acts of devotion and charity for the remainder of his life.

Lequyer returned to Plermont with Feuillet in 1848, after the February revolution in Paris. Full of zeal for a rejuvenated Republic, he announced, with Renouvier’s help, his candidacy for a seat in the parliament of the Côtes-du-Nord as a “Catholic Republican” [Hémon, 188]. His published platform identifies freedom as the basis of rights and duties and it explicitly mentions the freedoms of the press, of association, of education, and of religion [Le Brech, 56-57]. Of note is that Lequyer received a glowing recommendation for political office from one of his former teachers at the École Polytechnique, Barthélémy Saint-Hilaire. However, like many in more rural areas who identified, or seemed to identify, with the Parisian revolutionaries, Lequyer was not elected. He came in twentieth on the list of candidates, receiving far too few votes to be among those who won a seat in the parliament.

After the election, which was in April 1848, Lequyer retired to Plermont and spent his days in study and meditation, which included long walks along the coast; sometimes he would stay out overnight. There was, however, the persistent problem of finances. Hémon reports that Lequyer would throw change wrapped in paper from his second floor study to the occasional beggar that passed by. From March 30, 1850 into 1851, he sold the family property in St.-Brieuc, leaving him only Plermont. When his aunt Digaultray died on March 31, 1850 he was hopeful of an inheritance of 10,000 francs. As luck would have it, the aunt’s will directed that the sum be doubled, but only on the condition that it be used to pay a debt of 20,000 francs that Lequyer owed to his first cousin, Palasme de Champeaux! The cousin died in August of the same year, so the inheritance went to his estate [Hémon, 245].

Lequyer’s letters to Renouvier indicate a heightened level of creativity in which he made major progress on his philosophical work. In a November 1850 letter, he claimed that he was writing “something unheard of,” namely that the first and most certain of truths is the declaration of one’s own freedom. This movement of thought ends with the idea that one is one’s own work, responsible to oneself, and “to God, who created me creator of myself” (Lequyer had written “creature of myself” but later changed it to “creator of myself”) [OC 70, 538]. Philosophical insights, however, were not enough to save Lequyer from the weight of his failed projects and his destitution which, arguably, contributed to a mental breakdown. On February 28, 1851, a neighbor found Lequyer wandering about with an axe with which he intended to cut his own arm; Lequyer was taken to the hospital in St.-Brieuc for observation. The doctors determined that he was a danger to himself and should be transferred to a mental institution. On March 3rd, Le Gal La Salle and the Abbot Cocheril took Lequyer to the asylum near Dinan, using subterfuge to lure him there. On April 12th, with the help of Paul Michelot (1817-1885) and some other friends, Lequyer was taken to Passy, near Paris, to the celebrated hospital-resort of Dr. Esprit Blanche, the well-known physician who specialized in mental disorders.

Lequyer was discharged from Passy on April 29th, improved but not completely recovered, according to the doctors. He returned to Plermont, there to be welcomed by the faithful Feuillet and to renew contact with an elderly neighbor, Madame Agathe Lando de Kervélégan (born 1790). Relations with others, however, were broken or became strained. Never accepting that his confinement was justified, he severed ties with Le Gal La Salle, whom he regarded as the one who had orchestrated it. In the book that he planned, a major section was labeled “Episode: Dinan.” Since the book was never completed, we cannot know Lequyer’s exact thoughts about his two months under medical supervision. That his perceptions were cloudy is indicated by the fact that, only a few months after his confinement, he proposed marriage to Nanine, believing she would accept. Her family, with a view to Lequyer’s mental and financial instability, encouraged her to refuse. This she did in a most forceful way by returning all of his letters and by instructing him to burn her letters to him. This he did, but not before making copies of certain excerpts.

For two years after the events of 1851 Lequyer’s whereabouts are unknown. His letters to Renouvier in the closing months of 1855 indicate that two years earlier he had gone to Besançon as a professor of mathematics at the Collège Saint-François Xavier. By Easter of 1854, however, relations with the head of the college, a Monsieur Besson, had gone sour. The details of the problem are unknown, but it seems that Besson scolded Lequyer for not coming to him to ask for something. According to Lequyer, Besson boasted that men of influence as great as the archbishop, “crawl at my feet” [OC 546]. Lequyer related this conversation to the Cardinal and Besson was demoted. One of Lequyer’s friends, Henri Deville, had written a well-intentioned letter to the Cardinal requesting that he find Lequyer another place in his diocese. The Cardinal, perhaps misinterpreting the request, turned against Lequyer. As a result, Lequyer was entangled in lawsuits with both Besson and the Cardinal over indemnities. Lequyer’s lawyer told him “all was lost” when he decided to act with dignity and not crawl at Besson’s feet [OC 549]. An interesting aspect of Lequyer’s sketchy account is that he says he was inspired by the memory of Dinan, imitating the man he had been there by controlling his anger in spite of the wrongs he perceived to have been done to him. Furthermore, he recognized Deville’s good intentions and, though he thought his intervention inappropriate, did not blame him for it.

By the close of 1855 Lequyer had returned to Plermont, never to leave again. Many of the most touching stories about Lequyer come from the last six years of his life. Though his relations with his friends were often strained, he inspired in them a seemingly unconditional loyalty. It was they, after all, who underwrote the considerable cost of staying at Passy. In his final years, his friends—including Le Gal La Salle, whom he had disowned—came to his aid more than once. For example, Lequyer frequented a restaurant in St.-Brieuc but would order embarrassingly meager portions. When the owner of the establishment told his friends, they instructed him to give Lequyer full meals and they would pay the difference. When the owner wondered whether Lequyer would notice the charity, the reply was, “Non, il est dans le ciel” [Hémon, 205]—his head is in the clouds—an apt metaphor for his impracticality and his philosophical preoccupations.

In 1858, on the recommendation of Madame Lando, Lequyer became the tutor of Jean-Louis Ollivier, the thirteen-year-old son of a customs officer of the same name who admired Lequyer’s rhetorical skills; the father once described Lequyer as “a magician of words” [Hémon, 191]. Lequyer taught young Ollivier but also employed him in transcribing Lequyer’s own writing into a more legible script. Ollivier studied with Lequyer for two years, but at the close of 1860, having passed the exam that allowed him the chance to study to become an administrator of the state, the boy left. A few months earlier (in April) Lequyer had the misfortune of losing a chance to become chief archivist for the Côtes-du-Nord because of a delay in mail service. With this opportunity missed and Ollivier gone, Lequyer was without his student and unemployed. Jean-Louis Le Hesnan, a man of twenty who was too frail to work in the fields, took Ollivier’s place as Lequyer’s secretary. This partnership, however, was not enough to lift the weight of loneliness.

In the year that followed, Lequyer’s condition deteriorated. His neighbors reported that he would lose track of time and come calling at late hours with no explanation. His hair and beard, no longer cared for, grew prematurely white. His gaze took on a lost and vacant stare. Lequyer’s quixotic hopes of marriage to Nanine were rekindled when, on December 28, 1861, her father died—he believed her father was the main obstacle to the marriage. He again proposed marriage; sometime in the first week of February he learned of her refusal, which she made clear was final. Lequyer’s behavior became frenzied and erratic. He was subject to bizarre hallucinations and he spoke of putting an end to his misery. On Tuesday, February 11, 1862, Lequyer went to the beach with Le Hesnan, shed his clothes, threw water on his chest, and jumped into the bay. He swam to the limits of his strength until he was visible only as a dot among the waves, and he cried out. According to Le Hesnan, Lequyer’s last words were not a cry of distress but a farewell to Deszille—“Adieu Nanine” [Hémon, 232]. At nine o’clock in the evening, Lequyer’s body washed ashore. Feuillet, whom Lando described as Lequyer’s “second mother,” was waiting at Plermont to receive the body.

The official police report mentioned Lequyer’s “disturbed spirit” but ruled his death accidental. Nevertheless, a controversy erupted when a newspaper published a poem, “Les Adieux de Jules Lequyer” [The Farewells of Jules Lequyer], which was written in Lequyer’s voice and which suggested that he had committed suicide [Grenier, La Philosophie, 272]. Madame Lando eventually revealed herself as the author of the poem; she explained that she was saying Lequyer’s farewells for him in a way that he would have wished. The most propitious result of the controversy is that Charles Le Maoût, writing for Le Publicateur des Côtes-du-Nord (March 1, 1862), published an article titled “Derniers Moments de Jules Lequyer” [Last Moments of Jules Lequyer]. The article includes reports of Lequyer’s friends and neighbors about his final days, thereby providing insight into the disoriented and melancholy condition into which the philosopher had fallen. In November 1949, Dr. Yves Longuet, a psychiatrist at Nantes, gave his professional opinion from the available evidence. He concluded that Lequyer suffered a “clear cyclothymia,” that is to say, a manic-depressive personality [Grenier 1951, 37].

2. Philosophy of Freedom

Renouvier’s edition of Lequyer’s work, noted above, bore the title La Recherche d’une première vérité [The Search for a First Truth]. The book is divided into three sections. The first, titled Comment trouver, comment chercher une première vérité? [How to Find, How to Search for a First Truth?], is prefaced by a brief quasi-autobiographical meditation, “La Feuille de charmille” [The Hornbeam Leaf]. The second and third sections are, respectively, Probus ou le principe de la science: Dialogue [Probus or the Principle of Knowledge: Dialogue] and Abel et Abel—Esaü et Jacob: Récit biblique [Abel and Abel—Esau and Jacob: Biblical Narrative]. Collections edited by Jean Grenier in 1936 and 1952 brought together most of Lequyer’s extant work, including excerpts from his correspondence. Curiously absent from Grenier’s editions is a meditation on love and the Trinity; longer and shorter versions of this were published in subsequent collections (Abel et Abel 1991, pp. 101-08; La Recherche 1993, pp. 319-22). An unfinished short story from Lequyer’s earlier years titled La Fourche et la quenouille [The Fork and the Distaff] was published in 2010, edited by Goulven Le Brech. Other collections have been published, but these form the corpus of Lequyer’s work.

“The Hornbeam Leaf” is Lequyer’s best known work. It was the one thing he wrote that he considered complete enough to distribute to his friends. It addresses, in the form of a childhood experience, the meaning and reality of freedom. Lequyer intended it to be the introduction to his work. It exhibits the best qualities of Lequyer’s writing in its dramatic setting, its poetic language, and its philosophical originality. Lequyer recalls one of his earliest memories as he played in his father’s garden. He is about to pluck a leaf from a hornbeam when he considers that he is the master of his action. Insignificant as it seems, the decision whether or not to pluck the leaf is in his power. He marvels at the idea that his act will initiate a chain of events that will make the world forever thereafter different than it might have been. As he reaches for the leaf, a bird in the foliage is startled. It takes flight only to be seized by a sparrow hawk. Recovering from the shock of this unintended consequence of his act, the child reflects on whether any other outcome was really possible. Perhaps the decision to reach for the leaf was one in a series of events in which each cause was itself the inevitable effect of a prior cause. Perhaps the belief that he could have chosen otherwise, that the course of events might have been different, is an illusion fostered by an ignorance of the antecedent factors bearing on the decision. The child is mesmerized by the thought that he might be unknowingly tangled in a web of necessity, but he recovers his faith by a triumphant affirmation of his freedom.

Renouvier remarked that “The Hornbeam Leaf” recorded the point of departure of Lequyer’s philosophical effort [OC 3]. More than this, it illustrates the salient characteristics of freedom as Lequyer conceived them. For Lequyer, at a minimum, freedom involves the twin ideas that an agent’s decision is not a mere conduit through which the causal forces of nature operate and that it is itself the initiator of a chain of causes. Prior to the decision, the future opens onto alternate possibilities. The agent’s decision closes some of these possibilities while it opens others. After the decision is made, the feeling persists that one could have decided differently, and that the course of events would have been different because of the decision one might have made. Because the course of events is at least partially determined by the agent’s decision, Lequyer maintains that it creates something that, prior to the decision, existed only as a possibility. If one is free in this sense, then one is part creator of the world, and also of others. The child’s gesture leads to the bird’s death. Lequyer draws the corollary that the smallest of beginnings can have the greatest of effects, unforeseen by the one who initiated the causal chain, a thought that makes even the least of decisions potentially momentous [OC 14, compare OC 201]. This is Lequyer’s version of what Edward Lorenz much later, and in a different context, dubbed “the butterfly effect”: a butterfly flapping its wings in Brazil may set off a tornado in Texas.

For Lequyer, one’s decisions not only create something in the world, they double back on oneself. If one is free then, in some respects, one is self-creative. These ideas are expressed cryptically in Lequyer’s maxim which occurs in the closing pages of How to Find, How to Search for a First Truth?: “TO MAKE, not to become, but to make, and, in making, TO MAKE ONESELF” [OC 71]. When Lequyer denies that making is a form of becoming he is saying that the free act is not a law-like consequence of prior conditions. This is not to say that making or self-making is wholly independent of prior conditions. Lequyer borrows the language of Johann Fichte and speaks of the human person as a “dependent independence” [OC 70; compare OC 441]. Lequyer is clear that one is not responsible for having come to exist nor for all the factors of nature and nurture that brought one to the point of being capable of thinking for oneself and making one’s own decisions. All of these are aspects of one’s dependence and Lequyer often underscores their importance. On the other hand, one’s independence, as fragile and seemingly insignificant as it may be, is the measure of one’s freedom. This freedom, moreover, is the essential factor in one’s self-making. For Lequyer, it makes sense not only to speak of one’s decisions as being expressions of one’s character as so far formed, but also to speak of one’s character as an expression of one’s decisions as so far made.

Lequyer considers the objection that his view of freedom involves “a sort of madness of the will” [OC 54; compare OC 381]; by claiming that the free act, like a roll of dice, could go one way or another, Lequyer seems to imply that freedom is only randomness, a “liberty of indifference” undisciplined by reason. Lequyer replies that arbitrariness is indeed not the idea of freedom, but he claims that it is its foundation. In Lequyer’s view, one is oneself the author of the chance event and that event is one’s very decision. His meaning seems to be that indeterminism—the idea that, in some instances, a single set of causal factors is compatible with more than one possible effect—is a necessary but not a sufficient condition of acts for which we hold a person accountable. In the process of deliberation, motives are noticed and reasons are weighed until one decides for one course of action over another. The will is manifested in the sphere of one’s thought when one causes one idea to prevail over others and one’s hesitation is brought to an end. The act resulting in a decision may be characterized in any number of ways—capricious, selfish, reasonable, moral—but it is in no sense a product of mere brute force. The entire process of deliberation, Lequyer says, is animated by the self-determination of the will. Should an explanation be demanded, appealing to antecedent conditions for exactly why the decision was made one way rather than another, Lequyer replies that the demand is question-begging, for it presupposes determinism [OC 47]. The free act is not a mere link in a causal chain; it is at the origin of such chains. In Lequyer’s words, “To act is to begin” [OC 43].

It is clear that Lequyer did not believe that freedom and determinism can both be true. He acknowledged that we often act, without coercion, in accordance with our desires. Lequyer says that “the inner feeling”—presumably, introspectively discerned—guarantees it [OC 50]. Some philosophers look no further than this for a definition of freedom. For Lequyer, however, this is not enough, for non-human animals often act without constraint [OC 334, 484]. To speak of free will one must also include the idea that one is the ultimate author of one’s decisions. He counsels not to confuse the lack of a feeling of dependence upon causal conditions that would necessitate one’s decision with the feeling of independence of such conditions. The confusion of these ideas, Lequyer claims, leads us to believe that we have more freedom than we actually have. All that we are allowed to say, based on introspection, is that we sometimes do not feel necessitated by past events. An analogous argument for determinism is likewise inconclusive. When we come to believe through a careful examination of a past decision that causes were at work of which we were unaware and which strongly suggest that the decision was inevitable, we are not warranted in generalizing to all of our decisions, supposing that none of them are free [OC 50].

In the dramatic finale of “The Hornbeam Leaf” the child affirms his own freedom. This affirmation is not based on an argument in the sense of inferring a conclusion from premises that are more evident than freedom itself. Lequyer reaches a theoretical impasse—an aporia—on the question of freedom and necessity. Somewhat anticipating Freud, he never tires of emphasizing the depth of our ignorance about the ultimate causes of our decisions. Indeed, the final sentence of How to Find, How to Search for a First Truth? cautions that we never know whether a given act is free [OC 75]. Moreover, he denies that we experience freedom [OC 52; compare OC 349, 353]. He argues that experiencing freedom would require the impossible: living through the same choice twice over and experiencing the decision made first in one way and then in the contrary way. The memory of the first choice—or at least the mere fact of its having taken place—would intrude on the second and thus it would not be the same choice in identical circumstances. Lequyer speaks, rather, of a “presentiment” of freedom, the stubbornly persistent sense that, in a given circumstance, we could have chosen differently [OC 52]. Yet, Lequyer maintains, such is the extent of our ignorance—our lack of self-knowledge—that it is often easier to believe that one is free when one is not than to believe that one is free when one really is [OC 53].

Notwithstanding Lequyer’s many caveats about the limitations on freedom and even on our ability to know whether free will exists, he is above all a champion of human liberty. What remains to be explained is the ground of this affirmation. Despite the fragmentary nature of his literary remains, the general outline of his thinking is clear. How to Find, How to Search for a First Truth? begins as a Cartesian search for an indubitable first truth but it diverges from Descartes’ project in being more than a theoretical exercise. Lequyer speaks of the “formidable difficulty” that stands in the way of inquiry: if one seeks truth without prejudice one runs the risk of changing one’s most cherished convictions [OC 32]. He uses a Pascalian image to illustrate the attempt to seek truth without risk of losing one’s convictions. He says that it would be like walking along a road imagining a precipice on either side; something would be missing from the experience, “the precipice and the vertigo.” Lequyer continues in Pascal’s vein by raising the possibility that honest investigation may not support one’s faith. The heart can place itself above reason but what one most desires is that faith and reason be in harmony [OC 33]. There is, finally, the difficulty that sincere doubt is “both impossible and necessary from different points of view” [OC 30]. It is impossible because doubting what is evident (for example, that there is a world independent of one’s mind) is merely feigned doubt; it is necessary because one cannot assume that what is evident is true (for example, even necessary truths may seem false and people have genuine disagreements about what they firmly believe); otherwise, the search for truth would never begin.

Lequyer’s differences with Descartes are also apparent in his treatment of the skeptical argument from dreaming: because dreams can feel as real as waking life, one cannot be certain that one is awake. Lequyer notes that the search for a first truth requires a sustained effort of concentration in which one actively directs one’s thoughts. In dreams, impressions come pell-mell and one is more a spectator of fantastic worlds than an actor sustaining one’s own thoughts. Lequyer concedes that he cannot be certain that he is awake, but he can be certain that he does not inhabit any ordinary dream. If one sleeps it is one’s thoughts that one doubts; if one is awake, it is one’s memory that one doubts [OC 36]. Lequyer avers that the former is a less feigned doubt than the latter. Pushed further by the radical skeptic to justify one’s belief in the external world, Lequyer prefers the answer of the child: “Just because” [OC 37]. His discussion takes a decidedly existential detour as he reflects on the solitude implicit in the impossibility of directly knowing the thoughts of another. Lequyer’s is not Descartes’ academic worry of how we know that another person is not a mere automaton; it is rather the sense of isolation in contemplating the gulf between two minds even when there is a sincere desire on both parts to communicate [OC 37].

It is Lequyer’s treatment of the cogito (“I think”) that takes one to the heart of his philosophy of freedom. He acknowledges the certainty of Descartes’ “I think therefore I am” but he criticizes his predecessor for leaving the insight obscure and therefore not making proper use of it [OC 329]. The obscurity, Lequyer says, is in the concept of a self-identical thinking substance—sum res cogitans. The cogito is precisely the activity of a thinking subject having itself as an object of thought. In the language of the phenomenologists, Lequyer is puzzled by the intentionality within self-consciousness—the mind representing itself to itself [compare OC 362]. He argues that there is an essentially temporal structure to this relation; the “self” of which one is aware in self-awareness is a previous state of oneself. Lequyer goes so far as to call consciousness “nascent memory” [OC 339-40]. This is a significant departure from Descartes, who does not even include memory in his list of characteristics of thought. Descartes says that by “thought” he means understanding, willing, sensing, feeling, and imagining (abstaining by methodical doubt, to be sure, from any judgment about the reality of the object of one’s thought). The omission of “remembering” is curious; “I (seem to) remember, therefore I am” is an instance of the cogito and memory is not obviously reducible to any of the other characteristics of thought. Although Lequyer does not claim that self-memory is perfect, he maintains that each aspect of self-consciousness—as subject and as object—requires the other. Their unity, he argues, is nothing other than the activity of unifying subject and object. Furthermore, the ongoing sequence of events that is consciousness requires that each emergent “me” becomes an object remembered by a subsequent “me.” “The Hornbeam Leaf” is itself the report of such an act of remembering.

For Lequyer, the analysis of the “I think” reveals a more fundamental fact, to wit, “I make.” The making, moreover, is a self-making, for one is continually unifying the dual and interdependent aspects of oneself as subject and as object [OC 329]. Because this process of self-formation is not deterministic, it is open-ended. Lequyer characterizes the relation of cause and effect in a free act as asymmetrical. He labels the relation from effect (subject) to cause (object) as “the necessary” because the subject would not be what it is apart from the object that it incorporates into self-awareness; however, he labels the relation from cause (object) to effect (subject) as “the possible” in the sense that the object remains what it is independent of the subject incorporating it. Lequyer says that “the effect is the movement by which the cause determines itself” [OC 473]. Lequyer’s asymmetrical view of causation, at least where the free act is concerned, diverges from that of the determinist. In deterministic thinking, necessity flows symmetrically from cause to effect and from effect to cause; “the possible,” for determinism, is only a product of our ignorance of the causal matrix that produces an effect. Lequyer agrees that ignorance is a factor in our talk of possibility. He notes that the hand that opens a letter that contains happy or fatal news still trembles, hoping for the best and fearing the worst, each “possibility” considered, although one knows that one of the imagined outcomes is now impossible [OC 60]. Lequyer’s indeterminism, on the other hand, allows that possibilities outrun necessities, that the future is sometimes open whether or not we are ignorant of causes.

Lequyer writes that “it is an act of freedom which affirms freedom” [OC 67]. As already noted, for Lequyer, free will is not deduced from premises whose truth is more certain than the conclusion. We have also seen that he denies that free will can be known directly in experience [OC 353]. The logical possibility remains—entertained by the child in “The Hornbeam Leaf” and spelled out in greater detail in the fourth part of How to Find, How to Search for a First Truth?—that free will is an illusion, that one’s every thought and act is necessitated by the already completed course of events reaching into the past before one’s birth. Lequyer addresses the impasse between free will and determinism with the following reasoning (Renouvier called this Lequyer’s double dilemma). Either free will or determinism is true, but which one is true is not evident. Lequyer says that one must choose one or the other by means of one or the other. This yields a four-fold array: (1) one chooses freedom freely; (2) one chooses freedom necessarily; (3) one chooses necessity freely; (4) one chooses necessity necessarily [OC 398; compare Renouvier’s summary, OC 64-65]. One’s affirmation should at least be consistent with the truth, which means that the array reduces to the first and last options. Of course, the determinist believes that the second option characterizes the advocate of free will; by parity of reasoning, the free willist believes that the third option characterizes the determinist. Again, there is stalemate.

Inspired by the example of mathematics, Lequyer proposes to break the deadlock by considering “a maximum and a minimum at the same time, the least expense of belief for the greatest result” [OC 64, 368]. He compares the hypotheses of free will and determinism as postulates for how they might make sense of or fail to make sense of human decisions. Lequyer, it should be noted, conceives the non-human world of nature as deterministic, so his discussion of free will is limited to the human realm and, in his theology, to that of the divine [OC 475]. It is in considering the two postulates, according to Lequyer, that the specter of determinism casts its darkest shadow. First, with Kant, Lequyer accepts that free will is a necessary postulate to make sense of the moral life [OC 345; compare OC 484-85]. If no one could have chosen otherwise than they chose, there is no basis for claiming that they should have chosen otherwise; judgments of praise and blame, especially of past actions, are groundless if determinism is true. Second, Lequyer goes beyond Kant by claiming that free will is necessary for making sense of the search for truth [OC 398-400]. Lequyer’s reasoning is not as clear as one would like, but the argument seems to be as follows. The search for truth presupposes that the mind can evaluate the reasons for and against a given proposition. The mechanisms of determinism are not, however, sensitive to reasons; indeed, no remotely plausible deterministic laws have been found or proposed for understanding intellectual inquiry. Renouvier elaborated this point by saying that, as the freedom of indifference involves (as Lequyer says) an active indifference to reasons, so determinism involves a passive indifference to reasons. Thus, determinism, by positing necessity as the explanation for our reasoned judgments, undermines the mind’s sensitivity to reasons and therefore leaves no escape from skepticism.

Lequyer’s reasoning, even if it is sound, does not decide the issue in favor of free will. Nor does Lequyer claim that it does. Determinism may yet be true and, if Lequyer is correct, the consequences are that morality is founded on a fiction and we can have no more trust in our judgments of truth and falsity than we can have in a random assignment of truth values to propositions. In the final analysis, the truth that Lequyer seeks is less a truth that is discovered than it is a truth that is made. The free act affirms itself, but because the act is self-creative, it is also a case of the act creating a new truth, namely, that such and such individual affirmed freedom. If freedom is true, and if Lequyer’s reasoning is correct, then the one who creates this fact has the virtue of being able to live a life consistent with moral ideals and of having some hope of discovering truth.

3. Theological Applications

Renouvier deemphasized the theological dimensions of Lequyer’s thought. He said he was bored by Lequyer’s views on the Trinity. He suggested demythologizing Lequyer’s religious ideas so as to salvage philosophical kernels from the theological husk in which they were encased. Obviously, Lequyer did not agree with this approach. Indeed, he devoted approximately twice as much space in his work to topics in philosophy of religion and Christian theology as he did to strictly non-religious philosophizing. Grenier convincingly argued that Lequyer’s design was a renewal of Christian philosophy [OC 326]. One may, however, sympathize with Renouvier’s concerns, for a few of Lequyer’s ruminations are now dated. He seemed to have no knowledge of the sciences that, in his own day, were revealing the astounding age of the earth and the universe. Adam and Eve were, in his mind, real characters, and he speculated that Christ would return within a few years, given the symmetry between the supposed two-thousand-year interval from the moment of creation until the time of Christ and the nearly two thousand years that had elapsed since Jesus walked the earth [OC 439-40]. Despite these limitations, Lequyer’s treatment of religious themes is not, for the most part, dependent on outdated science. His views prefigure developments in philosophical theology in the century and a half since his death, giving his thought a surprisingly contemporary flavor.

Lequyer’s more explicitly theological works are as notable for their literary qualities as for their philosophical arguments. Probus or the Principle of Knowledge, also known as the Dialogue of the Predestinate and the Reprobate, is a nearly complete work in three parts. The first section is a dialogue between two clerics who have been made privy to the future by means of a tableau that pictures for them the contents of divine foreknowledge. Neither character is named, but one is sincerely faithful while the other exhibits only a superficial piety. They see in the tableau that the hypocritical cleric will repent and enter heaven but the pious cleric will backslide and live with the demons. When “the reprobate” begins to despair, “the predestinate” tries to offer him hope of going to heaven. Hope comes in the form of arguments from medieval theologians that are designed to show the compatibility of God’s foreknowledge and human freedom. In the style of Scholastic quaestiones disputatae, the clerics debate the classical arguments. The pious cleric criticizes each argument and remains unconvinced. In the second part, the impious cleric appeals to the tableau for events occurring twenty years in the future. The pious cleric has become a master in a monastery and, ironically, has become a partisan of the very arguments that he had earlier criticized. In the future scene, the master monitors and eventually enters a Socratic discussion between Probus, a young divine, and Caliste, a child. Probus defends the idea that God faces a partially open future precisely because God is perfect and must know, and therefore be affected by, what the creatures do. The scene closes as the master counters these arguments with the claim that the future is indeterminate for human perception but determinate for God. The final and shortest section returns to the clerics. The reprobate’s closing speech answers through bitter parodies the ideas that he has just heard uttered by his future self, the master. The speech reveals that the clerics are having dreams that will be mostly forgotten when they awake. The drama closes when they wake up, each remembering only the end of his dream: one singing with the angels, the other in agony with the demons. Satan, who appears for the first time, has the final word. He will lie in wait for one of the men to stumble.

The dialogue is operatic in its intricacy and drama; its philosophical argument is complex and rigorous. The intertwining of its literary and philosophical aspects is evident in the final pages when the clerics are made to forget the content of their shared dream. They must forget their dream in order for the revelation of the dream to come to pass without interference from the revelation itself. Likewise, Satan is not privy to the content of the dreams, so he must lie in wait, not knowing whether he will catch his prey. It is clear both from the tone of the dialogue and from other things that Lequyer wrote that the reprobate in the first and third parts and Probus in the second part are his spokespersons. The overall message of the dialogue is that the position on divine knowledge and human freedom that had been mapped out by Church theologians is nightmarish. Reform is in order, both in the meaning of freedom and in how this affects ideas about God. In short, the dialogue is a good example of Lequyer’s attempt to renew Christian philosophy. It should be said, however, that specifically Christian (and Jewish) ideas are used primarily by way of illustration; thus, it is less Christian philosophy than philosophical theology that is under consideration.

Lequyer was conversant with what most of the great theologians said about the foreknowledge puzzle—from Augustine and Boethius to Albert the Great, Thomas Aquinas, and John Duns Scotus. The concluding fragments of How to Find, How to Search for a First Truth? make clear that he rejected the Thomistic claim that the creatures can have no effect on God. The relation from the creatures to God, says Lequyer, is as real as the relation from God to the creatures [OC 73]. This rejection of Thomism follows from his analysis of freedom as a creative act that initiates causal chains. One’s free acts make the world, other persons, and even oneself, different than they otherwise would have been. Lequyer never doubted that God is the author of the universe, but the universe, he emphasized, includes free creatures. Thus, he speaks of “God, who created me creator of myself” [OC 70]. Aquinas explained that, in the proper sense of the word, creativity belongs to God alone; the creatures cannot create. For Lequyer, on the other hand, God has created creatures that are lesser creators. That they are God’s creation entails that they are dependent upon God, but since they are also creative they are in some measure independent of God. Because the acts of a free creature produce novel realities, they also create novel realities for God. In a striking turn of phrase, Lequyer says that the free acts of the creatures “make a spot in the absolute, which destroys the absolute” [OC 74].

Lequyer never doubts the omniscience of God. What is in doubt is what there is for God to know and how God comes by this knowledge. The dominant answers to these questions, expressed most thoroughly by Aquinas, were that God has detailed knowledge of the entire sweep of events in space and time—all that has been, is, and will be—and this knowledge is grounded in the fact that God created the universe. The deity has perfect self-knowledge and, as the cause of the world, knows the world as its effect. God’s creativity, according to the classical theory, has no temporal location, nor is omniscience hampered by time. Divine eternity, in the seminal statement of Boethius, is the whole, complete, simultaneous possession of endless life [compare OC 423]. Lequyer’s theory of free will challenges Aquinas’ view of the mechanics of omniscience. On Lequyer’s view, God cannot know human creative acts by virtue of creating them. To be sure, the ability to perform such acts is granted by God, but the acts themselves are products of the humans that make them and are not God’s doing. These lesser creative acts are the necessary condition of God’s knowledge of them; they create something in God that God could not know apart from their creativity. Their creative choices, moreover, are not re-enactments in time of what God decided for them in eternity, nor do they exist in eternity [OC 212]. It follows that they cannot be present to God in eternity. If it is a question of the free act of a creature, what is present to God is that such and such a person is undecided between courses of action and that both are equally possible. God too faces an open future precisely because more than one future is open to a creature to help create. In Lequyer’s words, “A frightful prodigy: man deliberates, and God waits!” [OC 71].

It is tempting to say that Lequyer offers a view of divine knowledge as limited. Lequyer demurs. As Probus explains, it is no more a limitation on God’s knowledge not to be able to know a future free act than it is a limitation on God’s power not to be able to create a square circle—the one is as impossible as the other [OC 171]. A future free act is, by its nature, indeterminate and must be known as such, even by God. Lequyer counsels that his view of divine knowledge only seems to be a limitation on God because we have an incorrect view of creativity. Prefiguring Henri Bergson, he speaks of the “magic in the view of accomplished deeds” that makes them appear, in retrospect, as though they were going to happen all along [OC 280; compare OC 419]. Lequyer—through Probus—speaks of divine self-limitation, but this is arguably an infelicitous way for him to make his case [OC 171]. It is not as though God could remove blinders or exert a little more power and achieve the knowledge of an as yet to be enacted free decision. Prior to the free decision, there is nothing more to be known than possibilities (and probabilities); by exerting more power, God could deprive the decision of its freedom, but it would, by the nature of the case, no longer be a free decision that God was foreseeing. Lequyer argues, however, that one may freely set in motion a series of events that make it impossible for one’s future self to accomplish some desired end. In that case, it would have been impossible for God to foreknow the original free decision, but God would infallibly know the result once the decision had been made.

Lequyer does not tire of stressing that if God is omniscient, then God must know the extent to which the future is open at any given juncture [OC 205]. Recall that Lequyer is mindful of how easily we fool ourselves into thinking we are free when we are not. We mistake merely imagined possibilities for real possibilities. God is not subject to this limitation. For these reasons, his view of divine creativity and knowledge allows for a significant degree of providential control, although there can be no absolute guarantees that everything God might wish to occur will occur. Risk remains. Lequyer disparages the idea that every detail of the world is willed by God; this view of divine power, he says, yields “imitations of life” that make of the work of God something frivolous [OC 212]. Even if creatures are ignorant of the extent of their freedom, free will is nonetheless real and so the world is no puppet show. When it comes to the question of prophecy, Lequyer emphasizes how often biblical prophecies are warnings rather than predictions. Those involving predictions, especially of free acts (for example, Peter’s denials of Christ and Judas’ betrayal), can be accounted for, he avers, by highlighting human ignorance and pride in comparison with divine knowledge of the extent to which the future is open [compare OC 206-07]. God is able to see into the heart of a person to know perfectly what is still open for the person not to do and what is certain that he or she will do. On Lequyer’s view, a deed for which a person is held accountable must be free in its origin but not necessarily in its consequences. One may freely make decisions that deprive one’s future self of freedom, but this does not relieve the person of moral accountability [OC 211].

A peculiarity of Lequyer’s theory as it appears in Probus is that he denies the law of excluded middle where future contingents are concerned. In this, he follows what he understood (and what some commentators understand) to be Aristotle’s views. Lequyer claims that it is true to say of things past or present that they either are or they are not. On the other hand, for future contingents (like free decisions that might go one way or another), Lequyer says that both are false; where A is a future contingent, both A-will-be and A-will-not-be are false [OC 194]. Doubtless this is the least plausible aspect of Lequyer’s views since abandoning the law of excluded middle is an extremely heavy price to pay for an open future. It is interesting to speculate, however, on what he would have thought of Charles Hartshorne’s view that the contradictory of A-will-be is A-may-not-be and the contradictory of A-will-not-be is A-may-be. This makes A-will-be and A-will-not-be contraries rather than contradictories. As in Aristotle’s square, contraries may both be false; in this way, Lequyer could have secured a doctrine of an open future at no damage to elementary logic. He certainly leaned in this direction in the closing pages of How to Find, How to Search for a First Truth? There, he declares that it is contradictory to say that a thing will be and that it is entirely possible that it may not be [OC 75].

Besides Probus, the curiously titled Abel and Abel—Esau and Jacob: Biblical Narrative is Lequyer’s other major work that addresses specifically religious themes. As the title suggests, it is closely tied to biblical motifs. Although it is yet another exploration of the idea of freedom, the examination of philosophical arguments is replaced by a fiction informed by philosophical ideas. Lequyer imagines an old man of Judea, living a little after the time of Christ, who has quoted St. Paul to his grandson that God preferred Jacob to Esau before their birth (Romans 9.11). The child is astonished and saddened by the statement, because it seems to place God’s goodness in doubt. The old man tells a story to the child that is designed to help explain the enigma. The tale, set some generations after Jacob and Esau, concerns the identical twin sons—identical even in their names, “Abel”—of a widowed patriarch, Aram. Before telling this story, however, he recounts the biblical episode of Abraham’s attempted sacrifice of Isaac (Gen. 22). He explains that he wishes the grandson to be reminded of Isaac under Abraham’s knife when he tells the story of the Abels, saying, “Faith is a victory; for a great victory, there must be a great conflict” [OC 235]. In the epilogue, the wizened grandfather gives what amounts to a Christian midrash on the story of Jacob and Esau with special attention to Jacob’s wrestling with the angel (Gen. 32.24-32). Thus, the story of the Abel twins is intercalated between two biblical stories. The theme uniting the three stories is God’s tests and the possible responses to them.

The Abel twins are as alike as twins could be, sharing thoughts and sometimes even dreams, but always in bonds of love for one another. They are introduced to an apparent injustice that saddens them when two brothers, slaves of their father, commit a theft and Aram pardons one but punishes the other. The seeming unfairness of the slave’s punishment reminds the twins of Esau’s complaint that he had been cheated when his brother Jacob stole their father’s blessing from him (Gen. 27). The Abels come close to passing judgment on their own father for treating the guilty slaves unequally. They resist the thought and then are told by Eliezer, the senior servant in the household, that Aram recognized the slave he condemned as having led his companions into some misdeed prior to having committed the theft. The boys are relieved to hear their father vindicated. His judgment of the slaves only seemed unjust to the twins because they lacked information that their father possessed. The episode of the unequally treated thieves serves as a parable counseling faith in the justice of God even when God seems to act in morally arbitrary ways.

The twins themselves must also face the test of being treated unequally. Aram shows them an elaborately decorated cedar ark. He explains that the day will come when one of the twins will be favored over the other to open the ark and discover inside the name which God reserves for him and his brother. Mysteriously, the name will apply to both of them but it will separate them as well. The dreams of the twins are disturbed by this favor that will separate them. Aram leaves, perhaps never to return again, giving charge of his sons to Eliezer. After a time, Eliezer brings the boys again to the cedar ark and there explains to them the decree of Aram. The favored son will be given a ring to denote that he is the chosen of God. The other son may either submit to his brother or depart from the country with a third of Aram’s inheritance, leaving the other two-thirds of the wealth for the chosen Abel. Their father’s possessions are great, so to receive a third of the inheritance is a significant amount. Nevertheless, the fact remains that the twins, equal in every way, will have been treated unequally by Aram’s decree.

It is not given to the child who is being told the story of the Abel twins (or to the reader) to know the outcome of their trial. Instead, he is told of three mutually exclusive ways in which the story could go, depending on how the brothers respond to their unequal treatment. In the first scenario, the favored Abel succumbs to pride and his brother shows resentment. Calling to mind the name of the first murderer in the Bible, Lequyer writes, “And, behind the sons of Aram, Satan who was promising himself two Cains from these two Abels, was laughing” [OC 265]. In the second scenario, the favored brother refuses the gift out of a generous feeling for his brother. In that case, Lequyer says that the favored Abel can be called “the Invincible.” In the third scenario, the favored brother, in great sorrow for what his brother has not received, accepts the ring while the other Abel, out of love for his twin, rejoices in his brother’s gift and helps him to open the gilded cedar chest. Lequyer says that, in this case, the other Abel can be called “the Victorious.” Lequyer presents the three scenarios in the order in which he believes they ought to be valued, from the least (the first scenario) to the greatest (the third scenario). When the ark is opened the mystery is revealed of the single name that is given to the brothers that nevertheless distinguishes them. Written within are the words: YOUR NAME IS: THAT WHICH YOU WERE IN THE TEST [OC 276]. The test was to see how the twins would respond to the apparent injustice of one being favored over the other. In effect, God’s predestined name for the brothers is like a mathematical variable whose value will be determined by the choices that the brothers make in response to the test.

Lequyer is clear that the lesson of Abel and Abel is not simply that God respects the free will of the twins. One also learns that God’s richer gifts may be more in what is denied than in what is given [OC 271]. Put somewhat differently, the denial of a gift may itself be a gift of an opportunity to exercise one’s freedom in the best possible way. To be sure, the favored Abel has his own opportunities. By accepting the ring, graciously and without pride, he is a noble figure. He is greater still (“the Invincible”) if he refuses the ring out of love for his brother. It is open to the other Abel, however, to win an incomparable victory (signified by the name, “the Victorious”) should his brother accept the ring. He is victorious over the apparent injustice done to him and over the resentment and envy he might have felt. He has been given a great opportunity to exhibit a higher virtue and he has taken it. In Lequyer’s words, “It is sweet to be loved . . . but it is far sweeter to love” [OC 272]; he argues that one can be loved without finding pleasure in it, although this may be a fault, but one cannot love without feeling joy. It should also be noted that by becoming “the Victorious” the other Abel in no way diminishes the virtue or the reward open to his twin. In this way, Lequyer avers, one may go far in vindicating God’s justice as well as God’s magnificence (that is, giving more to a person than is strictly merited by their deeds). This is a long way from a complete theodicy but Lequyer surely meant these reflections to be an important contribution to a renewal of Christian philosophy.

In the epilogue Lequyer reemphasizes the importance of accepting the will of God even when it seems harsh. The grandfather returns to the story of Jacob and Esau whose unequal treatment so saddened the grandson in the first place. According to the grandfather’s imaginative retelling, Jacob was tested by God when he wrestled with the angel. As Jacob anxiously awaits the arrival of Esau who had vowed to kill him (Gen. 27.41), he is filled with terror contemplating “the stubbornness of the Lord’s goodwill” in allowing him to buy Esau’s birthright (Gen. 25.29-33) and to steal Isaac’s blessing [OC 296]. Perhaps he fears that Esau will finally exact God’s judgment against him. A stranger approaches Jacob from the shadows and demands to know whether he will bless the name of God even if God should strike him. Jacob promises to bless God. He is shown several terrifying episodes in his future, from the rape of his daughter Dinah (Gen. 34.1-5) to the presumed death of his son Joseph (Gen. 37.33). In the final vision, a perfectly righteous man he does not recognize suffers an ignominious death on a cross. After each vision, Jacob “wrestles” with the temptation to impiety but instead blesses God’s name. Jacob is thus found worthy of the favors bestowed upon him. As the stranger leaves, Jacob sees his face and recognizes it as the face of the man on the cross. When morning comes, Esau arrives and greets his brother with kisses of fraternal love (Gen. 33.4).

Probus and Abel and Abel address different problems in very different styles. Yet, in some sense they are a diptych, to borrow the apt metaphor of André Clair. Each work deals with a different kind of necessity. The necessity in Probus (also in How to Find, How to Search for a First Truth?) is that of deterministic causes resulting inevitably in certain effects, among them one’s supposedly free decisions. The necessity in Abel and Abel is the inalterability of the past, especially as it pertains to Aram’s decree. The decree sets the conditions of the test but does not determine its outcome. This is very different from the decree of damnation of the unhappy cleric. The tableau of God’s foreknowledge includes every detail of how the cleric will act in the future. In the dialogue, there is no equivalent of the “name” that is written in the cedar ark, no variable whose value can be decided by one’s free choice. Indeed, Probus can be read as an extended reductio against traditional teachings about foreknowledge and predestination. The predestinate fails to console the reprobate. There can be no hope for him, for he knows with certainty that he will be damned. The dialogue, however, offers hope for the reader, the hope of breaking free of a nightmarish theology by rethinking the concepts of freedom and the nature of God along the lines that the character of Probus suggests—after all, Probus is the name of the dialogue. Abel and Abel reinforces the idea that God faces a relatively open future. The story does not tell which of the three options is chosen, nor does it suggest that one of them is predestined to occur.

The story of the Abel twins goes beyond the dialogue, however, by returning to the question raised in How to Find, How to Search for a First Truth? of how self-identity is constructed. Clair argues convincingly that Lequyer means to generalize from the Abel twins to all human beings. The twins represent the fact that one’s self-identity is not merely a question of not being someone else. They are different from each other but neither acquires a new “name”—that is, a distinctive identity—apart from exercising their freedom in response to the test. This is consistent with Lequyer’s theme of the self as a product of self-creative acts, although the self-creativity of the twins most clearly manifests itself in relation to other persons. In Abel and Abel, there is a shift in the question of self-creativity from metaphysics to axiology. The fulfillment of self-creativity, which is to say its highest manifestation, is in love. The “I” of self-creativity becomes inseparable from the “we”. Lequyer appropriates this idea for theology in his reflections on the Trinity. He says that a Divine Love that cannot say “You” to one that is equal to itself would be inconsolable by the eternal absence of its object [Abel et Abel 1991, 101]. If God is love, as Christianity maintains (I John 4.8), then the unity of God requires a plurality within the Godhead.

4. Philosophical Legacy

Renouvier once said that he saved Lequyer’s work from sinking [Esquisse d’une classification systématique, v. 2, 382]. In view of Lequyer’s drowning, it is a fitting if somewhat macabre metaphor. Renouvier often quoted his friend’s work at length in his own books. His edition of The Search for a First Truth, limited though it was to one hundred and twenty copies, ensured that Lequyer’s philosophy was presented in something like a form of which he would have approved. Renouvier included a brief “Editor’s Preface” but his name appears nowhere in the book. In publishing the book, it was his friend’s contribution to philosophy that he intended to preserve and celebrate, not his own. More widely available editions of the book were published in 1924 and 1993. Another indication of Renouvier’s respect is the marker he was instrumental in erecting over Lequyer’s grave. The inscription reads in part, “to the memory of an unhappy friend and a man of great genius.” Throughout his career he called Lequyer his “master” on the subject of free will and he took meticulous care in attributing to Lequyer the ideas that he borrowed from him. In Renouvier’s last conversations, as recorded by his disciple Louis Prat, he quoted Lequyer’s maxim, “TO MAKE . . . and, in making, TO MAKE ONESELF” as a summary of his own philosophy of personalism [Derniers entretiens, 64].

Others did not take as much care as Renouvier in giving Lequyer the credit that he was due. William James learned of Lequyer from reading Renouvier’s works and wrote to him in 1872 inquiring about The Search for a First Truth which he had not been able to locate through a bookstore. Renouvier sent him a copy which he read, at least in part, and which he donated to the Harvard Library. The essential elements of James’s mature views on free will and determinism closely parallel those of Lequyer—freedom is not merely acting in accordance with the will, the impossibility of experiencing freedom, the importance of effort of attention in the phenomenon of will, the reality of chance, the theoretical impasse between freedom and necessity, and the idea that freedom rightly affirms its own reality. James’s Oxford Street/Divinity Avenue thought experiment in his essay “The Dilemma of Determinism” could be interpreted as an application of a similar passage in the third section of How to Find, How to Search for a First Truth? [OC 52]. There are, to be sure, profound differences between James and Lequyer on many subjects, but where it is a question of free will and determinism the similarities are uncanny.

James always credited Renouvier for framing the issue of free will in terms of “the ambiguity of futures,” but it is clear that Renouvier was a conduit for the ideas of Lequyer. This is nowhere more evident than in James’s 1876 review of two books, by Alexander Bain and Renouvier, published in the Nation. He praises Renouvier’s ideas about freedom, but the views he highlights are the very ideas that Renouvier attributed to Lequyer. In one instance, James mistakes a quotation from Lequyer for Renouvier’s own words; the unwary reader, like James, assumes that it is Renouvier speaking. In his personal letters James mentions Lequyer by name, but not in any of his works written for publication. It is clear, however, that he thought highly of him. In The Principles of Psychology (1890), James mentions “a French philosopher of genius” and quotes a phrase from the concluding section of How to Find, How to Search for a First Truth? He cites the same phrase, slightly altered, in Some Problems of Philosophy, again without revealing the name of its author [For references, see Viney 1997/2009].

Another famous philosopher who quoted Lequyer without mentioning his name is Jean-Paul Sartre. Sartre may have learned of Lequyer in 1935 when he sat on the board of editors for the Nouvelle Revue Française. The board was considering whether to publish Grenier’s doctoral thesis, La Philosophie de Jules Lequier. The decision was against publication, but not without Sartre objecting that there was still interest among readers in freedom. In 1944, Sartre responded to critics of existentialism and affirmed as his own the saying, “to make and in making to make oneself and to be nothing except what one has made of oneself.” This is a nearly direct quote from Lequyer. Jean Wahl, who edited a selection of Lequyer’s writings, maintained that Sartre borrowed the principal idea of L’existentialisme est un humanisme (1945) from Lequyer, to wit, that in making our own choices, we are our own creators. Lequyer is not quoted in that presentation. Seven years later, however, in a discussion of Stéphane Mallarmé’s poetry, Sartre again mentions Lequyer’s maxim, placing it in quotation marks but without reference to the name of the Breton. If one may speak of Lequyer’s anonymous influence on James, one may perhaps speak of Lequyer’s anonymous shadow in the work of Sartre [For references see Viney 2010, 13-14].

The irony in Sartre’s quotations of Lequyer’s maxim is that he uses it not only to express a belief in freedom but also to express his atheism. Sartre rejected the idea that God creates creatures in accordance with a detailed conception of what they will be. This is what Sartre would characterize as essence preceding existence. The formula of Sartre’s existentialism is that existence precedes essence. In Sartre’s words, it is not the case that “the individual man is the realization of a certain concept in the divine understanding” [Existentialisme est un humanisme, 28]. Of course, Lequyer agrees, but rather than adopting atheism he opts for revising the concept of God as one capable of creating other, lesser, creators. Grenier outlined Lequyer’s theology in his dissertation (just mentioned) but there is no indication—unless his silence says something—of what Sartre thought of it. Other philosophers, however, did not remain silent on Lequyer’s suggestions for revising traditional ideas about God.

After Renouvier, Grenier, and Wahl, the philosopher who made most explicit use of Lequyer’s ideas and promoted their importance was the American Charles Hartshorne. Hartshorne learned of Lequyer from Wahl in Paris in 1948. By that time, Hartshorne was far along in his career with well-developed views of his own in what is known as process philosophy and theology. Nevertheless, he thereafter consistently promoted Lequyer’s significance as a forerunner of process thought. He often quoted the Lequyerian phrase, “God created me creator of myself” and cited Lequyer as the first philosopher to clearly affirm a bilateral influence between God and the creatures. With Hartshorne, Lequyer ceased being, as in James and Sartre, the anonymously cited philosopher. Hartshorne included the first English language excerpt from Lequyer’s writings in his anthology, edited with William L. Reese, Philosophers Speak of God (1953).

Harvey H. Brimmer II (1934-1990), one of Hartshorne’s students, wrote a dissertation titled Jules Lequier and Process Philosophy (1975), which included as appendices translations of How to Find, How to Search for a First Truth? and Probus. This was the first book-length study of Lequyer in English. Brimmer argued, among other things, that the distinction for which Hartshorne is known between the existence/essence of God and the actuality of God is implicit in Lequyer’s thought. According to this idea, God’s essential nature (including the divine existence) is immutable and necessary, but God is ever open to new experiences as the particular objects of God’s power, knowledge, and goodness, which are contingent, come to be. For example, it is God’s nature to know whatever exists, but the existence of this particular bird singing is contingent, and so God’s knowledge of it is contingent. Brimmer seems to be on firm footing, for Lequyer says both that God is unchanging and that there can be a change in God [OC 74, compare OC 243].

Hartshorne’s admiration for Lequyer introduced, if unintentionally, its own distortion, as though the only things that matter about Lequyer were the ways in which he anticipated process thought. It may be more accurate, for example, to interpret Lequyer as a forerunner of an evangelical “open theism”—at least a Catholic version—than of process philosophy’s version of divine openness. Lequyer and the evangelical open theists affirm, while Hartshorne denies, the divine inspiration of the Bible and the doctrine of creation ex nihilo. We may, nevertheless, accentuate the positive by noting that many of Lequyer’s central ideas are incarnated in each variety of open theism. Also noteworthy is that some of those evangelicals who identify themselves as open theists—William Hasker, Richard Rice, and Gregory Boyd—were influenced to a greater or lesser extent by Hartshorne. That Lequyer is an important, if not the most important, pioneer of an open view of God cannot be doubted. Moreover, the combination of literary imagination and philosophical rigor that he brought to the exploration of an open view of God, especially in Probus and Abel and Abel, is unmatched.

The philosopher to whom Lequyer is most often compared is Kierkegaard. Each philosopher endeavored, in the words of Clair, to “think the singular” [Title of Clair 1993]. They would not allow, after the manner of Hegel, a dialectical aufheben in which, they believed, the individual is swallowed by the absolute [OC 347]. Choice and responsibility are central themes for both philosophers. The same can be said of the subject of faith and the “audacity and passion” (Lequyer) that it requires [OC 501]. Both men blurred the line between literature and philosophy, as often happens in superior spirits. Perhaps the best example of this is that they developed what might be called the art of Christian midrash, amending biblical narratives from their own imaginations to shed new light on the text. As Lequyer said in a Kierkegaardian tone, the Scriptures have “extraordinary silences” [OC 231]. Lequyer’s treatment of the story of Abraham and Isaac bears some similarities with what one finds in Kierkegaard’s Fear and Trembling. Both philosophers warn against reading the story in reverse as though Abraham knew all along that God would not allow Isaac to die. Lequyer says that Abraham faced a terrifying reversal of all things human and divine.

If there is a common idea that unites Lequyer and Kierkegaard it is the revitalization of Christianity. Yet, this commonality begins to dissolve under a multitude of qualifications. Kierkegaard’s criticisms of the established church in Denmark were in the truest spirit of Protestantism. Except for an early period of emotional detachment from the church, Lequyer was loyal to Catholicism. The renewal of Christianity meant something different for each philosopher. Kierkegaard spoke of reintroducing Christianity into Christendom and he maintained that the thought behind his whole work was what it means to become a Christian. A distant analogy in Lequyer’s polemic to what Kierkegaard calls “Christendom” is the reasoning of the doctors of the church. Lequyer says that the reasoning of the doctors never had any power over him, even as a child [OC 13]. Whereas Kierkegaard launched an assault on the idea of identifying an institution with Christianity, Lequyer targets the theologians whose theories he believes undercut belief in the freedom of God and of the creatures. Lequyer’s willingness to engage medieval theology on its own terms, matching argument with argument in an attempt to develop a more adequate, logically consistent, and coherent concept of God, stands in contrast to Kierkegaard’s negative dialectic that leads to faith embracing paradox.

5. Conclusion

Lequyer wrote to Renouvier in 1850 that he was writing “something unheard of” [OC 538]. The way in which his ideas and his words have sometimes been invoked without mention of his name makes this sadly ironic. Too often he has been heard from but without himself being heard of. Until recently, the unavailability of his writings in translation tended to confine detailed knowledge of his work to francophones. To make matters more difficult, as Grenier noted, he is something of an ἅπαξ (hapax)—one of a kind. His philosophy does not readily fit any classification or historical development of ideas. Grenier wryly commented on those eager to classify philosophical schools and movements: “Meteors do not have a right to exist because they enter under no nomenclature” [Grenier 1951, 33]. The same metaphor, used more positively, is invoked by Wahl in his edition of Lequyer’s writings. Lequyer, he remarked, left mostly fragments of philosophy, but he left “brief and vivid trails” in the philosophical firmament.

Lequyer worked outside the philosophical mainstream. Yet, he can be regarded, in the expression of Xavier Tilliette, as a scout or a precursor of such diverse movements as personalism, pragmatism, existentialism, and openness theologies. Of course, it is an honor to be considered in such a light. On the other hand, like a point on the horizon on which lines converge, the distinctiveness and integrity of Lequyer’s own point-of-view is in danger of being lost by such a multitude of comparisons. It does not help matters that Lequyer failed to complete his life’s work. It is often reminiscent of Pascal’s Pensées: nuggets of insight and suggestions for argument are scattered throughout the drafts that he made of his thought. In any event, Goulven Le Brech’s assessment seems secure: “The fragmentary and unfinished work of Jules Lequier is far from having given up all its secrets” [Cahiers Jules Lequier, v. 1, 5].

6. References and Further Reading

  • The abbreviation “OC” refers to Œuvres complètes, Jean Grenier’s edition of Lequyer’s works published in 1952. “Hémon” refers to Prosper Hémon’s biography of Lequyer published in Abel et Abel (1991).
  • The Fonds Jules Lequier [Jules Lequier Archives] are at the University of Rennes. Beginning in 2010, Les amis de Jules Lequier has published annually, under the editorship of Le Brech, Cahiers Jules Lequier [Jules Lequier Notebooks] which includes articles, archival material, and previously published but difficult to find material.

a. Primary Sources

  • Lequier, Jules. 1865. La Recherche d’une première vérité, fragments posthumes [The Search for a First Truth, Posthumous Fragments]. Edited by Charles Renouvier. (Saint-Cloud, Impr. de Mme Vve Belin).
  • Lequier, Jules. 1924. La Recherche d’une première vérité, fragments posthumes, recueillis par Charles Renouvier. Notice biographique, par Ludovic Dugas. Paris: Librairie Armand Colin. Dugas’ 58-page introductory essay, titled “La Vie, l’Œuvre et le Génie de Lequier” [The Life, Work, and Genius of Lequier], draws heavily on Hémon’s biography (see Lequier 1991).
  • Lequier, Jules. 1936. La Liberté [Freedom]. Textes inédits présentes par Jean Grenier. Paris: Librairie Philosophique J. Vrin.
  • Lequier, Jules. 1948. Jules Lequier. Textes présentes par Jean Wahl. Les Classiques de la Liberté. Genève et Paris: Editions des Trois Collines.
  • Lequier, Jules. 1952. Œuvres complètes [Complete Works]. Édition de Jean Grenier. Neuchâtel, Suisse: Éditions de la Baconnière.
  • Lequier, Jules. 1985. Comment trouver, comment chercher une première vérité? Suivi de “Le Murmure de Lequier (vie imaginaire)” par Michel Valensi [How to find, how to search for a first truth? Followed by “The Murmur of Lequier (imaginary life)”]. Préface de Claude Morali. Paris: Éditions de l’éclat.
  • Lequier, Jules. 1991. Abel et Abel, suivi d’une “Notice Biographique de Jules Lequyer” [Abel and Abel followed by “A Biographical Notice of Jules Lequyer”] par Prosper Hémon. Édition de G. Pyguillem. Combas: Éditions de l’Éclat. Hémon’s biography, though incomplete, is the first and most extensively researched biography of the philosopher. It was written at the end of the nineteenth century.
  • Lequier, Jules. 1993. La Recherche d’une première vérité et autres textes, édition établie et présentée par André Clair. Paris: Presses Universitaires de France.
  • Lequier, Jules. 2010. La Fourche et la quenouille [The Fork and the Distaff], préface et notes par Goulven Le Brech. Bédée : Éditions Folle Avoine.

b. English Translations

  • Brimmer, Harvey H. [with Jacqueline Delobel]. 1974. “Jules Lequier’s ‘The Hornbeam Leaf’” Philosophy in Context, 3: 94-100.
  • Brimmer, Harvey H. and Jacqueline Delobel. 1975. Translations of The Problem of Knowledge (which includes “The Hornbeam Leaf”) (pp. 291-354) and Probus, or the Principle of Knowledge (pp. 362-467). The translations are included as an appendix to Brimmer’s Jules Lequier and Process Philosophy (Doctoral Dissertation, Emory University, 1975), Dissertation Abstracts International, 36, 2892A.
  • Hartshorne, Charles and William L. Reese, editors. 1953. Philosophers Speak of God. University of Chicago Press: 227-230. Contains brief selections from Probus.
  • Viney, Donald W. 1998. Translation of Works of Jules Lequyer: The Hornbeam Leaf, The Dialogue of the Predestinate and the Reprobate, Eugene and Theophilus. Foreword by Robert Kane. Lewiston, New York: The Edwin Mellen Press.
  • West, Mark. 1999. Jules Lequyer’s “Abel and Abel” Followed by “Incidents in the Life and Death of Jules Lequyer.” Translation by Mark West; Biography by Donald Wayne Viney. Foreword by William L. Reese. Lewiston, New York: The Edwin Mellen Press.

c. Secondary Sources in French and English

  • Brimmer, Harvey H. 1967. “Lequier (Joseph Louis) Jules.” The Encyclopedia of Philosophy. Edited by Paul Edwards. Volume 4: 438-439. New York: Macmillan.
  • Clair, André. 2000. Métaphysique et existence: essai sur la philosophie de Jules Lequier. Bibliothèque d’histoire de la philosophie, Nouvelle série. Paris: J. Vrin.
  • Grenier, Jean. 1936. La Philosophie de Jules Lequier. Paris: Presses Universitaires de France.
  • Grenier, Jean. 1951. “Un grand philosophe inconnu et méconnu: Jules Lequier” [A great philosopher unknown and unrecognized]. Rencontre, no. 11. Lausanne (novembre): 31-39.
  • Le Brech, Goulven. 2007. Jules Lequier. Rennes : La Part Commune.
  • Pyguillem, Gérard. 1985. “Renouvier et sa publication des fragments posthumes de J. Lequier,” [Renouvier and the publication of the posthumous fragments of J. Lequier]. Archives de Philosophie, 48: 653-668.
  • Séailles, Gabriel. 1898. “Un philosophe inconnu, Jules Lequier.” [An unknown philosopher, Jules Lequier]. Revue Philosophique de la France et de L’Etranger. Tome XLV: 120-150.
  • Tilliette, Xavier. 1964. Jules Lequier ou le tourment de la liberté. [Jules Lequier or the torment of freedom]. Paris: Desclée de Brouwer.
  • Viney, Donald W. 1987. “Faith as a Creative Act: Kierkegaard and Lequier on the Relation of Faith and Reason.” Faith & Creativity: Essays in Honor of Eugene H. Peters. Edited by George Nordgulen and George W. Shields. St. Louis, Missouri: CBP Press: 165-177.
  • Viney, Donald W. 1997. “William James on Free Will: The French Connection.” History of Philosophy Quarterly, 14/1 (October): 29-52. Republished in The Reception of Pragmatism in France & the Rise of Roman Catholic Modernism, 1890-1914, edited by David G. Schultenover, S. J. (Washington, D. C.: The Catholic University of America Press, 2009): 93-121.
  • Viney, Donald W. 1997. “Jules Lequyer and the Openness of God.” Faith and Philosophy, 14/2 (April): 1-24.
  • Viney, Donald W. 1999. “The Nightmare of Necessity: Jules Lequyer’s Dialogue of the Predestinate and the Reprobate.” Journal of the Association of the Interdisciplinary Study of the Arts 5/1 (Autumn): 17-30.
  • Vinson, Alain. 1992. “L’Idée d’éternité chez Jules Lequier.” [The Idea of Eternity According to Jules Lequier]. Les Études Philosophiques, numéro 2 (Avril-Juin) (Philosophie française): 179-193.

Author Information

Donald Wayne Viney
Email: dviney@pittstate.edu
Pittsburg State University
U. S. A.

Metaphor and Phenomenology

 The term “contemporary phenomenology” refers to a wide area of 20th and 21st century philosophy in which the study of the structures of consciousness occupies center stage. Since the appearance of Kant’s Critique of Pure Reason and subsequent developments in phenomenology and hermeneutics after Husserl, it has no longer been possible to view consciousness as a simple scientific object of study. It is, in fact, the precondition for any sort of meaningful experience, even the simple apprehension of objects in the world. While the basic features of phenomenological consciousness – intentionality, self-awareness, embodiment, and so forth—have been the focus of analysis, Continental philosophers such as Paul Ricoeur and Jacques Derrida go further in adding a linguistically creative dimension. They argue that metaphor and symbol act as the primary interpreters of reality, generating richer layers of perception, expression, and meaning in speculative thought. The interplay of metaphor and phenomenology introduces serious challenges and ambiguities within long-standing assumptions in the history of Western philosophy, largely with respect to the strict divide between the literal and figurative modes of reality based in the correspondence theory of truth. Since the end of the 20th century, the role of metaphor in the production of cognitive structures has been taken up and extended in new productive directions, including “naturalized phenomenology” and straightforward cognitive science, notably in the work of G. Lakoff and M. Johnson, M. Turner, D. Zahavi, and S. Gallagher.

Table of Contents

  1. Overview
    1. The Conventional View: Aristotle’s Contribution to Substitution Model
    2. The Philosophical Issues
    3. Nietzsche’s Role in Development of Phenomenological Theories of Metaphor
  2. The Phenomenological Theory in Continental Philosophy
    1. Phenomenological Method: Husserl
    2. Heidegger’s Contribution
  3. Existential Phenomenology: Paul Ricoeur, Hermeneutics, and Metaphor
    1. The Mechanics of Conceptual Blending
    2. The Role of Kant’s Schematism in Conceptual Blending
  4. Jacques Derrida: Metaphor as Metaphysics
    1. The Dispute between Ricoeur and Derrida
  5. Anglo-American Philosophy: Interactionist Theories
  6. Metaphor, Phenomenology, and Cognitive Science
    1. The Embodied Mind
    2. The Literary Mind
  7. Conclusion
  8. References and Further Reading

1. Overview

This article highlights the definitive points in the ongoing philosophical conversation about metaphorical language and its centrality in phenomenology. The phenomenological interpretation of metaphor, at times presented as a critique, is a radical alternative to the conventional analysis of metaphor. The conventional view, largely inherited from Aristotle, is also known as the “substitution model.” In the traditional, or standard, approach, the uses and applications of metaphor (along with other related symbolic phenomena and tropes) have been restricted to the realms of rhetoric and poetics. In this view, metaphor is none other than a kind of categorical mistake, a deviance of sense produced in order to create a lively effect.

While somewhat contested, the standard substitution theory, also referred to as the “similarity theory,” generally defines metaphor as a stylistic literary device involving a deviant and dyadic movement which shifts meaning from one word to another. This view, first and most thoroughly articulated by Aristotle, reinforces the epistemic primacy of the literal, where metaphor can only operate as a secondary device, one which is dependent on the prior level of ordinary descriptive language, where the first-order language in itself contains nothing metaphorical. In most cases, the relation between two orders, literal and figurative, has been interpreted as an implicit simile, which expresses a “this is that” structure. For example, Aristotle mentions, in Poetics: 

When the poet says of Achilles that he “Leapt on the foe as a lion,” this is a simile; when he says of him, “the lion leapt” it is a metaphor—here, since both are courageous, [Homer] has transferred to Achilles the name of “lion.” (1406b 20-3)

In purely conventional terms, poetic language can only be said to refer to itself; that is, it can accomplish imaginative description through metaphorical attribution, but the description does not refer to any reality outside of itself. For the purposes of traditional rhetoric and poetics in the Aristotelian mode, metaphor may serve many purposes; it can be clever, creative, or eloquent, but never true in terms of referring to new propositional content. This is due to the restriction of comparison to substitution, such that the cognitive impact of the metaphoric transfer of meaning is produced by assuming similarities between literal and figurative domains of objects and the descriptive predicates attributed to them.

The phenomenological interpretation of metaphor, however, not only challenges the substitution model, it advances the role of metaphor far beyond the limits of traditional rhetoric. In the Continental philosophical tradition, the most extensive developments of metaphor’s place in phenomenology are found in the work of Martin Heidegger, Paul Ricoeur and Jacques Derrida. They all, in slightly different ways, see figurative language as the primary vehicle for the disclosure and creation of new forms of meaning which emerge from an ontological, rather than purely epistemic or objectifying engagement with the world.

a. The Conventional View: Aristotle’s Contribution to the Substitution Model

Metaphor consists in giving the thing a name that belongs to something else; the transference being either from species to genus, or from genus to species, or from species to species, on the grounds of analogy. (Poetics 1457b 6-9)

 While his philosophical predecessor Plato condemns the use of figurative speech for its role in rhetorike, “the art of persuasion,” Aristotle recognizes its stylistic merits and provides us with the first systematic analysis of metaphor and its place in literature and the mimetic arts. His briefer descriptions of how metaphors are to be used can be found in Rhetoric and Poetics, while his extended analysis of how metaphor operates within the context of language as a whole can be inferred by reading On Interpretation together with Metaphysics. The descriptive use of metaphor can be understood as an extension of its meaning; the term derives from the Greek metaphora, from metaphero, meaning “to transfer or carry over.” Thus, the figurative trope emerges from a movement of substitution, involving the transference of a word to a new sense, one which compares or juxtaposes seemingly unrelated subjects.  For example, in Shakespeare’s Sonnet 73:

In me thou seest the glowing of such fire,
That on the ashes of his youth doth lie…

The narrator directly transfers and applies the “dying ember” image in a new “foreign” sense: his own awareness of his waning youth.

This is Aristotle’s contribution to the standard substitution model of metaphor. It is to be understood as a linguistic device, widely applied but remaining within the confines of rhetoric and poetry. Though it does play a central role in social persuasion, metaphor, restricted by the mechanics of similarity and substitution, does not carry with it any speculative or philosophical importance. Metaphors may point out underlying similarities between objects and their descriptive categories, and may instruct through adding liveliness and elegance to speech, but they do not refer, in the strong sense, to a form of propositional knowledge.

The formal structure of substitution operates in the following manner: the first subject or entity under description in one context is characterized as equivalent in some way to the second entity derived from another context; it is either implied or stated that the first entity “is” the second entity in some way. The metaphorical attribution occurs when certain select properties from the second entity are imposed on the first in order to characterize it in some distinctive way. Metaphor relies on pre-existing categories which classify objects and their properties; these categories guide the ascription of predicates to objects, and since metaphor may entail a kind of violation of this order, it cannot itself refer to a “real” class of existing objects or the relations between them. Similarly, in poetry, metaphor serves not as a foundation for knowledge, but as a tool for mimesis or artistic imitation, representing the actions in epic tragedy or mythos in order to move and instruct the emotions of the audience for the purpose of catharsis.

Aristotle’s theory and its significance for philosophy can only be fully understood in terms of the wider context of denotation and reference which supports the classical realist epistemology. Metaphor is found within his taxonomy of speech forms; additionally, simile is subordinate to metaphor and both are figures of speech falling under the rubric of lexis/diction, which itself is composed of individual linguistic units or noun-names and verbs. Lexis operates within the unity of logos, meaning that the uses of various forms of speech must conform to the overall unity of language and reason, held together by categorical structures of being found in Aristotle’s metaphysics.

As a result of Aristotle’s combined thinking in these works, it turns out that the ostensive function of naming individual objects (“this” name standing for “this object” or property) allows for the clear demarcation between the literal and figurative meanings for names. Thus, the noun-name can work as a signifier of meaning in two domains, the literal and the non-literal. However, there remains an unresolved problem: the categorical nature of the boundary between literal and figurative domains will be a point of contention for many contemporary critiques of the theory coming from phenomenological philosophy.

Furthermore, the denotative theory has served in support of the referential function of language, one which assumes a system of methodological connections between language, sense perceptions, mental states, and the external world. The referential relation between language and its objects serves the correspondence theory of truth, in that the truth-bearing capacity of language corresponds to valid perception and cognition of the external world. The theory assumes that these sets of correspondences allow for the consistent and reliable relation of reference between words, images, and objects.

Aristotle accounts for this kind of correspondence in the following way: the pathemata of sense perception give rise to the psychological states in which object representations are formed. These states are actually likenesses (isomorphisms) of the external objects. Thus, names for things refer to the things themselves, to mental representations of those things, and to class-based meanings.

If, as Aristotle assumes, the meaning of metaphor rests on the level of the noun-name, its distinguishing feature lies in its deviation, a “something which happens” to the noun-name by virtue of a transfer (epiphora) of meaning. Here, Aristotle creates a metaphor (based on physical movement) in order to explain metaphor. The term “phora” refers to a change in location from one place to another, to which is added the prefix “epi”; epiphora then refers to the transfer of the common proper name of a thing to a new, unfamiliar, alien (allotrios) place or object. Furthermore, the transference (or substitution), borrowing as it does the alien name for the thing, does not disrupt the overall unity of meaning or logical order of correspondence within the denotative system; all such movement remains within the classifications of genus and species.

The metaphoric transfer of meaning will become a significant point of debate and speculation in later philosophical discussions. Although Aristotle himself does not explore the latent philosophical questions in his own theory, subsequent philosophers of language have over the years recast these issues, exploring the challenges to meaning, reference, and correspondence that present themselves in the substitution theory. What happens, on these various levels, when we transfer one object or descriptor of a “natural kind” to a foreign object domain? It may be the case that metaphorical transference calls into question the limits of all meaning-bearing categories, and in turn, the manner in which words can be said to “refer” to specific objects and their attributes. By virtue of the epiphoric movement, species and genus attributes of disparate objects fall into relations of kinship, opposition, or deviation among the various ontological categories. These relations allow for the metaphoric novelty which will subsequently fuel the development of alternative theories, those which view metaphor as fundamental to our cognitive or conceptual processes. At this point the analysis of metaphor opens up the philosophical space for further debate and interpretation.

b. The Philosophical Issues

In any theory of metaphor, there are significant philosophical implications for the transfer of meaning from one object-domain or context of associations to another. The metaphor, unlike its sister-trope the analogy, creates a new form of predication, suggesting that one category or class of objects (with certain characteristics) can be projected onto another separate class of entities; this projection may require a blurring of the ontological and epistemological distinctions between the kinds of objects that can be said to exist, either in the mind or in the external world. Returning to the Shakespearean metaphor above, what are the criteria that we use to determine whether a dying ember aptly fits the state of the narrator’s consciousness? What are the perceptual and ontological connections between fire and human existence? The first problem lies in how we are to explain the initial “fit” between any predicate category and its objects. Another problem comes to the forefront when we try to account for how metaphors enable us to think in new ways. If we are to move beyond the standard substitution model, we are compelled to investigate the specific mental operations that enable us to create metaphoric representations; we need to elaborate upon the processes which connect particular external objects (and their properties) given to sensory experience to linguistic signs “referring” to a new kind of object, knowledge context, or domain of experience.

According to the standard model, a metaphor’s ability to signify is restricted by ordinary denotation. The metaphor, understood as a new name, is conceived as a function of individual terms, rather than sentences or wider forms of discourse (narratives, texts). As Continental phenomenology develops in the late 19th and 20th centuries, we are presented with radically alternative theories which obscure strict boundaries between the literal and the figurative, disrupting the connections between perception, language, and thought. Namely, the phenomenological, interactionist, and cognitive treatments of metaphor defend the view that metaphorical language and symbol serve as indirect routes to novel ways of knowing and describing human experience. In their own ways, these theories will call into question the validity and usefulness of correspondence and reference, especially in theoretical disciplines such as philosophy, theology, literature, and science.

Although this article largely focuses on explicating phenomenological theories of metaphor, it should be noted that in all three theories mentioned above, metaphor is displaced from its formerly secondary position in substitution theory to occupying the front and center of our cognitive capabilities. Understood as the product of intentional structures in the mind, metaphor now becomes conceptual, rather than merely ornamental, acting as a conduit through which we take apart and re-assemble the concepts we use to describe the varieties and nuances of experience. These theories all share the assumption that metaphors suggest, posit, or disclose similarities between objects and domains of experience (where there seem to be none), without explicitly recognizing that a comparison is being made between two sometimes very different kinds of things or events. Applied to our original metaphor (“In me thou seest…”), they contend that at times there need not be any explicit similarity between states of awareness or existence and “fire” or “ashes.”

c. Nietzsche’s Role in the Development of Phenomenological Theories of Metaphor

In Nietzsche’s thought we see an early turning away from the substitution theory and its reliance on the correspondence theory of truth, denotation, and reference. His description of metaphor takes us back to its primordial “precognitive” or ontological origins; Nietzsche acts here as a precursor to later developments, yet his analysis in itself offers a compelling account of the power of metaphor. Though his remarks on metaphor are somewhat scattered, they can be found in the early writings of 1872-74, Nachgelassene Fragmente, and “On Truth and Lie in an Extra-Moral Sense” (see W. Kaufmann’s translation in The Portable Nietzsche). Together with the “Rhetorik” lectures, these writings argue for a genealogical explanation of the conceptual, displacing traditional philosophical categories into the metaphorical realm. In doing so, he deconstructs our conventional reliance on the idea that meaningful language must reflect a system of logical correspondences.

With correspondence, we can only assume we are in possession of the truth when our representations or ideas about the world “match up” with external states of affairs. We have already seen how Aristotle’s system of first-order predication supports correspondence, as it is enabled through the denotative ascription of predicates or categorical features to objects. But Nietzsche boldly suggests that we are, from the outset, already in metaphor, and he works from this starting point. The concepts and judgments we use to describe reality do not flatly reflect pre-existing similarities or causal relationships between themselves and our physical intuitions about reality; they are themselves metaphorical constructions; that is, they are creative forms of differentiation emerging out of a deeper undifferentiated primordiality of being. The truth of the world is more closely reflected in the Dionysian level of pure aesthetic immersion into an “undecipherable” innermost essence of things.

Even in his early work, The Birth of Tragedy, Nietzsche rejects the long-held assumption that truth is an ordering of concepts expressed through rigid linguistic categories, putting forth the alternative view which gives primacy to symbol as the purest, most elemental form of representation. That which is and must be expressed is produced organically, out of the flux of nature, yielding a “becoming” rather than a being.

In the Dionysian dithyramb man is incited to the greatest exaltation of all his symbolic faculties; something never before experienced struggles for utterance—the annihilation of the veil of maya, … oneness as the soul of the race and of nature itself. The essence of nature is now to be expressed symbolically; we need a new world of symbols.… (BOT Ch. 2)

Here, following Schopenhauer, he reverses Aristotelian transference of concept-categories from the literal to the figurative, and makes the figurative the original mode for representation of experience. The class terms “species” and “genus”, based in Aristotle and so important in classical and medieval epistemology, only appear to originate and validate themselves in “dialectics and through scientific reflection.” For Nietzsche, the categories hide their real nature, abiding as frozen metaphors which reflect previously experienced levels of natural experience metaphorically represented in our consciousness. They emerge through construction indirectly based in vague images or names for things, willed into being out of the unnamed flowing elements of biological existence. Even Thales the pre-Socratic, we are reminded, in his attempt to give identity to the underlying unity of all things, falls back on a conceptualization of it as water without realizing he is using a metaphor.

Once we construct and begin to apply our concepts, their metaphorical origins are forgotten or concealed from ordinary awareness. This theoretical process is but another attempt to restore “the also-forgotten” original unity of being. The layering of metaphors, the archeological ancestors of concepts, is specifically linked to our immediate experiential capacity to transcend the proper and the individual levels of experience and linguistic signs. We cannot, argues Nietzsche, construct metaphors without breaking out of the confines of singularity, thus we must reject the artificiality of designating separate names for separate things. To assume that an individual name would completely and transparently describe its referent (in perception) is to also assume that language and external experience mirror one another in some perfect way. It is rather the case that language transfers meaning from place to place. The terms metapherein and Übertragung are equivalently applied here; if external experience is in constant flux, it is not possible to reduplicate exact and individual meanings. To re-describe things through metaphor is to “leave out” and “carry-over” meaning, to undergo a kind of dispossession of self, thing, place, and time and an overcoming of both individualisms and dualities. Thus the meaningful expression of the real is seen and experienced most directly in the endlessly creative activity of art and music, rather than philosophy.

2. The Phenomenological Theory in Continental Philosophy

Versions of Nietzsche’s “metaphorization” of thought will reappear in the Continental philosophers described below; those who owe their phenomenological attitudes to Husserl, but disagree with his transcendental idealization of meaning, one which demands that we somehow separate the world of experience from the essential meanings of objects in that world. Taken together, these philosophers call into question the position that truth entails a relationship of correspondence between dual aspects of reality, one internal to our minds and the other external. We consider Heidegger, Ricoeur, and Derrida as the primary examples. For Heidegger, metaphoric language signals a totality or field of significance in which being discloses or reveals itself. Ricoeur’s work, in turn, builds upon aspects of Heidegger’s ontological hermeneutics, explicating how it is the case that metaphors drive speculative reflection. In Ricoeur’s model, the literal level is subverted, and metaphoric language and symbols containing “semantic kernels” create structures of double reference in all figurative forms of discourse. These structures point beyond themselves in symbols and texts, serving as mediums which reveal new worlds of meaning and existential possibilities.

French philosopher Jacques Derrida, on the other hand, reiterates the Nietzschean position; metaphor does not subvert metaphysics, but rather is itself the hidden source of all conceptual structures.

a. Phenomenological Method: Husserl

Edmund Husserl’s phenomenological method laid the groundwork, in the early 20th century, for what would eventually take shape in the phenomenological philosophies of Martin Heidegger, Maurice Merleau-Ponty, and Jean-Paul Sartre. Husserl’s early work provides the foundation for exploring how the modes of presentation to consciousness convey the actual meaningful contents of experience. Here he means to address the distinction made earlier by Kant between the phenomenal appearances of the real (to consciousness) and the noumenal reality of the things-in-themselves. Husserl, broadly speaking, seeks to resolve not only what some see as a problematic dualism in Kant, but also some philosophical problems that accompany Hegel’s constructivist phenomenology.

Taken in its entirety, Husserl’s project demonstrates a major shift in the 20th century phenomenology, seeking a rigorous method for the description and analysis of consciousness and the contents given to it. He intends his method to be the scientific grounding for philosophy; it is to be a critique of psychologism and a return to a universal knowledge of “the things themselves,” those intelligible objects apprehended by and given to consciousness.

In applying this method we seek, Husserl argues, a scientific foundation for universally objective knowledge, adhering to the “pure description” of phenomena given to consciousness through the perception of objects. If those objects are knowable, it is because they are immediate in conscious experience. It is through the thorough description of these objects as they appear to us in terms of color, shape, and so forth, that we apprehend that which is essential – what we call “essences” or meanings. Here, the act of description is a method for avoiding a metaphysical trap: that of imposing these essences or object meanings onto the contents of mental experience. Noesis, for Husserl, achieves its aim by including within itself (giving an account of) the role that context or horizon plays in delineating possible objects for experience. This will have important implications for later phenomenological theories of metaphor, in that metaphors may be said to intend new figurative contexts in which being appears to us in new ways.

In Ideen (30), Husserl explains how such a horizon or domain of experience presents a set of criteria for us to apply. We choose and identify an object as a single member of a class of objects, and so these regions of subjective experience, also called regions of phenomena, circumscribe certain totalities or generic unities to which concrete items belong. In order to understand the phenomenological approach to meaning-making, it is first necessary to clarify what we mean by “phenomenological description,” as it is described in Logical Investigations. Drawing upon the work of Brentano and Meinong, Husserl develops a set of necessary structural relations between the knower (ego), the objects of experience, and the horizon within which those objects are given. The relation is characterized in an axiomatic manner as intentionality, where the subjective consciousness and its objects are correlates brought together in a psychological act. Subjectivity contributes to and makes possible cognition; specifically, it must be the case that perception and cognition are always about something given in the stream of consciousness; they are only possible because consciousness intends or refers to these immanent objects. As we shall presently see, the intentional nature of consciousness applies to Ricoeur’s hermeneutics of the understanding, bestowing on metaphor a special ability to expand (to the point of nearly undermining) the structure of reference, extending it in a non-literal sense to an existential state.

Husserl’s stage-like development of phenomenology unveils the structure of intentionality as derived from the careful description of certain mental acts. Communicable linguistic expressions, such as names and sentences, exist only in so far as they exhibit intentional meanings for speakers. Written or spoken expressions only carry references to objects because they have meanings for speakers and knowers. If we examine all of our mental perceptions, we find it impossible to think without intending an object of some sort. Both Continental and Anglo-American thinkers agree that metaphor holds the key to understanding these processes, as it re-organizes our senses of perception, temporality, and relation of subject to object, referring to these as subjects of existential concern and possibility.

b. Heidegger’s Contribution

Heidegger, building upon the phenomenological thematic, asserts that philosophical analysis should keep to careful description of the human encounter with the world, revealing the modes in which being is existentially or relationally given. This signals both a nod to and departure from Husserl, leading to a rethinking of phenomenology which replaces the theoretical apprehension of meaning with an “uncovering” of being as it is lived out in experiential contexts or horizons. Later, Ricoeur will draw on Heidegger’s “existentialized” intentionality as he characterizes the referential power of metaphors to signal those meanings waiting to be “uncovered” by Dasein’s (human as being-there) experience of itself – in relation to others, and to alternate worlds of possibility.

As his student, Heidegger owes to Husserl the phenomenological intent to capture “the things themselves” (die Sachen selbst); however, the Heideggerian project outlined in Being and Time rejects the attempt to establish phenomenology as a science of the structures of consciousness and reforms it in ontologically disclosive or manifestational terms. Heidegger’s strong attraction to the hermeneutic tradition originates in part in his dialogue with Wilhelm Dilthey, the 19th century thinker who stressed the importance of historical consciousness in guiding the work of the social sciences and hermeneutics, directed toward the understanding of primordial experience. Dilthey’s influence on Heidegger and Ricoeur (as well as Gadamer) is evident, in that all recognize the historical life of humans as apprehended in the study of the text (a form of spirit), particularly those texts containing metaphors and narratives conveying a lived, concrete experience of religious life.

Heidegger rejects the notion that the structures of consciousness are internally maintained as transcendentally subjective and also directed towards their transcendental object. Phenomenology must now be tied to the problems of human existence, and must then direct itself immediately towards the lived world and allow this “beholding” of the world to guide the work of “its own uncovering.”

Heidegger argues for a return to the original Greek definitions of the terms phainomenon (derived from phainesthai, or “that which shows itself”) and logos. Heidegger adopts these terms for his own purposes, utilizing them to underscore ontological disclosure or presence: those beings showing themselves or letting themselves be “seen-as.” The pursuit of aletheia (“truth as recovering of the forgotten aspects of being”) is now fulfilled through adherence to a method of self-interpretation achieved from the standpoint of Dasein’s (humanity’s) subjectivity, which has come to replace the transcendental ego of Kant and Husserl.

The turn to language, in this case, must be more than simple communication between persons; it is a primordial feature of subjectivity. Language is to be the interpretive medium of the understanding through which all forms of being present themselves to subjective apprehension. In this way, Heidegger replaces the transcendental version of phenomenology with the disclosive, where the structure of interpretation provides further insight into his ontological purposes of the understanding.

3. Existential Phenomenology: Paul Ricoeur, Hermeneutics, and Metaphor

The linguistic turn in phenomenology has been most directly applied to metaphor in the works of Paul Ricoeur, who revisits Husserlian and Heideggerian themes in his extensive treatment of metaphor. He extends his analysis of metaphor into a fully developed discursive theory of symbol, focusing on those found in religious texts and sacred narratives. His own views follow from what he thinks are overly limited structuralist theories of symbol, which, in essence, do not provide a theory of linguistic reference useful for his own hermeneutic project. For Ricoeur, a proper theory of metaphor understands it to be “a re-appropriation of our effort to exist,” echoing Nietzsche’s call to go back to the primordiality of being. Metaphor must then include the notion that such language is expressive and constitutive of the being of those who embark on philosophical reflection.

Much of Ricoeur’s thought can be characterized by his well-known statement “the symbol gives rise to thought.” Ricoeur shares Heidegger’s and Husserl’s assumptions: we reflectively apprehend or grasp the structures of human experience as they are presented to temporalized subjective consciousness. While the “pure” phenomenology of Husserl seeks a transparent description of experience as it is lived out in phases or moments, Ricoeur, also following Nietzsche, centers the creation of meaning in the existential context. The noetic act originates in the encounter with a living text, constituting “a horizon of possibilities” for the meaning of existence, thus abandoning the search for essences internal to the objects we experience in the world.

His foundational work in The Symbolism of Evil and The Rule of Metaphor places the route to human understanding concretely, via symbolic expressions which allow for the phenomenological constitution, reflection, and re-appropriation of experience. These processes are enabled by the structure of “seeing-as,” adding to Heidegger’s insight with the metaphoric acting as a “refiguring” of that which is given to consciousness. At various points he enters into conversation with Max Black and Nelson Goodman, among others, who also recognize the cognitive contributions to science and art found in the models and metaphors. In Ricoeur’s case, sacred metaphors display the same second-order functions shared by those in the arts and sciences, but with a distinctively ontological emphasis: “the interpretation of symbols is worthy of being called a hermeneutics only insofar as it is a part of self-understanding and of the understanding of being” (COI 30).

In The Rule of Metaphor, Ricoeur, departing from Aristotle, locates the signifying power of metaphor primarily at the level of the sentence, not individual terms. Metaphor is to be understood as a discursive linguistic act which achieves its purpose through extended predication rather than simple substitution of names. Ricoeur, like so many language philosophers, argues that Aristotelian substitution is incomplete; it does not go far enough in accounting for the semantic, syntactic, logical, and ontological issues that accompany the creation of a metaphor. The standard substitution model cannot do justice to the potential for metaphor to create meaning by working in tandem with propositional thought-structures (sentences). To these ends, Ricoeur’s study in The Rule of Metaphor replaces substitution and strict denotative theories with a theory of language that works through a structure of double reference.

Taking his lead while diverging from Aristotle, Ricoeur reads the metaphorical transfer of a name as a kind of “category mistake” which produces an imaginative construction about the new way objects may be related to one another. He expands this dynamic of “meaning transfer” on to the level of the sentence, then text, enabling the production of a second-order discursive level of thinking whereby all forms of symbolic language become phenomenological disclosures of being.

The discussion begins with the linguistic movement of epiphora (the transfer of names-predicates) taken from an example in Poetics. A central dynamic exists in transposing one term, with its set of meaning-associations, onto another. Citing Aristotle’s own example of “sowing around a god-created flame”:

If A = light of the sun, B = action of the sun, C = grain, and D = sowing, then

B is to A, as D is to C

We see that the action of the sun is to light as sowing is to grain; however, B is a vague action term (the sun’s action) which is both missing and implied. Ricoeur calls this a “nameless act” which establishes a similar relation to its object, sunlight, as sowing does to the grain. In this act the phenomenological space for the creation of new meaning is opened up, precisely because we cannot find a conventional word to take the place of the metaphorical word. The nameless act implies that the transfer of an alien name entails more than a simple substitution of concepts, and is therefore said to be logically disruptive.

a. The Mechanics of Conceptual Blending

The “nameless act” entails a kind of “cognitive leap”: since there is no conventional term for B, the act does not involve substituting a decorative term in its place. Rather, a new meaning-association has been created through the semantic gap between the objects. The absence of the original literal term, the “semantic void,” cannot be filled without the creation of a metaphor which signals the larger discursive context of the sentence and, eventually, the text. If, as above, the transfer of predicates (the sowing of grain as the casting of flames) challenges the “rules” of meaning dictated by ostensive theory, we are forced to make a new connection where there was none, between the conventional and metaphorical names for the object. For Ricoeur, the figurative (sowing around a flame) acts as a hermeneutic medium in that it negates and displaces the original term, signifying a “new kind of object” which is in fact a new form (logos) of being. The metaphorical statement allows us to say that an object is and is not what we usually call it. The sense-based aspect is then “divorced” from predication and, subsequently, logos is emptied of its objective meaning; the new object may be meaningful but not clear under the conditions of strict denotation or natural knowledge.

We note that the “new object” (theoretically speaking) has more than figurative existence; the newly formed subject-predicate relation places the copula at the center of the name-object (ROM 18). Ricoeur’s objective is to create a dialectically driven process which produces a new “object-domain” or category of being. Following the movement of the Hegelian Aufhebung (through the aforementioned negation and displacement), the new name opens up a new field of meaning to be re-appropriated into our reflective consciousness. This is how Ricoeur deconstructs first-order reference in order to develop an ontology of sacred language based on second-order reference.

We are led to the view that myths are modes of discourse whose meanings are phenomenological spaces of openness, creating a nearly infinite range of interpretations. Thus we see how metaphor enables being, as Aristotle notes, to “be said in many ways.”

Ricoeur argues that second-order discursivity “violates” the pre-existing first order of genus and species, in turn causing a kind of upheaval among the further relations and rules set by the categories: namely subordination, coordination, and proportionality or equality among object properties. Something of a unity of being remains; yet for Ricoeur this non-generic unity, or enchainement, corresponds to a single generic context referring to “Being,” restricting the senses or applications of transferred predicates in the metaphoric context.

b. The Role of Kant’s Schematism in Conceptual Blending

The notion of a “non-generic unity” raises, perhaps, more philosophical problems than it answers. How are we to explain the mechanics which blend descriptors from one object domain and its sets of perceptions, to a domain of foreign objects? Ricoeur addresses the epistemic issues surrounding the transfer of names from one category to another in spatiotemporal experience by importing Kant’s theory of object construction, found in the Critique of Pure Reason. In the “Transcendental Schematism”, Kant establishes the objective validity of the conceptual categories we use to synthesize the contents of experience. In this section, Kant elevates the Aristotelian categories from grammatical principles to formal structures intrinsic to reason. Here, he identifies an essential problem for knowledge: how are we to conceive a relationship between these pure concept-categories of the understanding and the sensible objects given to us in space and time? With the introduction of the schematism, Kant seeks a resolution to the various issues inherent to the construction of mental representations (a position shared by contemporary cognitive scientists; see below). For Ricoeur, this serves to answer the problem of how metaphoric representations of reality can actually “refer” to reality (even if only at the existential level of experience).

Kant states that “the Schematism” is a “sensible condition under which alone pure concepts of the understanding can be employed” (CPR/A 136). Though the doctrine is notoriously confusing due to its circular nature, the schemata are meant as a distinctive set of mediating representations, rules, or operators in the mind which themselves display the universal and necessary characteristics of sensible objects; these characteristics are in turn synthesized and unified by the activity of the transcendental imagination.

In plainer terms, the schematic function is used by the imagination to guide it in the construction of images. It does not seem to be any kind of picture of an object, but rather the “form” or “listing” of how we produce the picture. For Ricoeur, the schematism lends the structural support for assigning an actual truth-value or cognitive contribution to the semantic innovation produced by metaphor. The construction of new meaning via new forms of predication entails a re-organization and re-interpretation of pre-existing forms, and the operations of the productive imagination enable the entire process.

In Figuring the Sacred, for example, Ricoeur, responding to his contemporary Mircea Eliade (The Sacred and the Profane), moves metaphor beyond the natural “boundedness” of myths and symbols. While these manifest meaning, they are still constrained in that they must mirror the natural cosmic order of things. Metaphor, on the other hand, occupies the center of a “hermeneutic of proclamation”; it has the power to proclaim because it is a “free invention of discourse.” Ricoeur specifically explicates biblical parables, proverbs, and eschatological statements as extended metaphorical processes. Thus, “The Kingdom of God will not come with signs that you can observe. Do not say, ‘It is here; it is there.’ Behold the kingdom of God is among you” (Luke 17:20-21). This saying creates meaning by breaking down the ordinary or familiar temporal frameworks we apply to the interpretation of signs (of the kingdom). The quest for signs is, according to Ricoeur, “overthrown” for the sake of “a completely new existential signification” (FS 59).

This discussion follows from the earlier work in The Rule of Metaphor, where the mechanics of representation behind this linguistic act of “re-description” are further developed. The act points us towards a novel ontological domain of human possibility, enabled through new cognitive content. The linguistic act of creating a metaphor in essence becomes a hermeneutic act directed towards a gap which must be bridged, that between the abstract (considerations of reflection) understanding (Verstehen) and the finite living out of life. In this way Ricoeur’s theory, often contrasted with that of Derrida, takes metaphor beyond the mechanics of substitution.

4. Jacques Derrida: Metaphor as Metaphysics

In general, Derrida’s deconstructive philosophy can be read as a radically alternative way of reading philosophical texts and arguments, viewing them in a novel way through the lens of a rhetorical methodology. This will amount to the taking apart of established ways in which philosophers define perception, concept formation, meaning, and reference.

Derrida, from the outset, calls into question the assumption that the formation of concepts (logos) somehow escapes the primordiality of language and the fundamentally metaphorical-mythical nature of philosophical discourse. In a move which goes much further than Ricoeur, Derrida argues for what Giuseppe Stellardi so aptly calls the “reverse metaphorization of concepts.” The reversal is such that there can be no final separation between the linguistic-metaphorical and the philosophical realms. These domains are co-constitutive of one another, in the sense that neither can be fully theorized or made to explain the meaning of the other transparently. The result is that language acquires a certain obscurity, ascendancy, and autonomy. It will permanently elude our attempts to fix its meaning-making activity in foundational terms which necessitate a transcendent or externalized (to language) unified being.

Derrida’s “White Mythology” offers a penetrating critique of the common paradigm concerning the nature of concepts, posing the question: “Is there metaphor in the text of philosophy, and if so, how?” Here, the history of philosophy is characterized as an economy, a kind of “usury” where meaning and valuation are understood as metaphorical processes involving “gain and loss.” The process is represented through Derrida’s well-known image of the coin:

I was thinking how the Metaphysicians, when they make a language for themselves, are like … knife-grinders, who instead of knives and scissors, should put medals and coins to the grindstone to efface … the value… When they have worked away till nothing is visible in these crown pieces, neither King Edward, the Emperor William, nor the Republic, they say: ‘These pieces have nothing either English, German, or French about them; we have freed them from all limits of time and space; they are not worth five shillings any more; they are of inestimable value, and their exchange value is extended indefinitely.’ (WM 210)

The “usury” of the sign (the coin) signifies the passage from the physical to the metaphysical. Abstractions now become “worn out” metaphors; they seem like defaced coins, their original, finite values now replaced by a vague or rough idea of the meaning-images that may have been present in the originals.

Such is the movement which simultaneously creates and masks the construction of concepts. Concepts, whose real origins have been forgotten, now only yield an empty sort of philosophical promise – that of “the absolute”, the universalized, unlimited “surplus value” achieved by the eradication of the sensory or momentarily given. Derrida reads this process along a negative Hegelian line: the metaphysicians are most attracted to “concepts in the negative, ab-solute, in-finite, non-Being” (WM 121). That is, their love of the most abstract concept, made that way “by long and universal use”, reveals a preference for the construction of a metaphysics of Being. This is made possible via the movement of the Hegelian Aufhebung. The German term refers to a dynamic of sublation where the dialectical, progressive movement of consciousness overcomes and subsumes the particular, concrete singularities of experience through successive moments of cognition. Derrida levels a strong criticism against Hegel’s attempts to overcome difference, arguing that consciousness as understood by Hegel takes on the quality of building an oppressive sort of narrative, subsuming the particular and the momentary under an artificial theoretical gaze. Derrida prefers giving theoretical privilege to the negative; that is, to the systematic negation of all finite determinations of meaning derived from particular aspects of particular beings.

Echoing Heidegger, Derrida conceives of metaphysical constructs as indicative of the Western “logocentric epoch” in philosophy. They depend for their existence on the machinery of binary logic. They remain static due to our adherence to the meaning of ousia (essence), the definition of being based on self-identical substance, which can only be predicated or expressed in either/or terms. Reference to being, in this case, is constrained within the field of the proper and univocal. Both Heidegger and Derrida, and to some degree Ricoeur, seek to free reference from these constraints. Unlike Heidegger, however, Derrida does not work from the assumption that being indicates some unified primordial reality.

For Derrida, there lies hidden within the merely apparent logical unity (with its attendant binary oppositions) or logocentricity of consciousness a white mythology, masking the primitive plurivocity of being which eludes all attempts to name it. Here we find traces of lost meanings, reminiscent of the lost inscriptions on coins. These are “philosophemes,” words, tropes or modes of figuration which do not express ideas or abstract representations of things (grounded in categories), but rather invoke a radically plurivocal notion of meaning. Having thus dismantled the logic of either/or with différance (difference), Derrida gives priority to ambiguity, in “both/and” and “neither/nor” modes of thought and expression. Meaning must then be constituted of and by difference, rather than identity, for difference subverts all preconceived theoretical or ontological structures. It is articulated in the context of all linguistic relations and involves ongoing displacement of a final idealized and unified form of meaning; such displacement reveals, through hints and traces, the reality and experience of a disruptive alterity in meaning and being. Alterity is “always already there” by virtue of the presence of the Other.

With the introduction of “the white mythology,” Derrida’s alignment with Nietzsche places him in strong opposition to traditional Western theoria. Forms of abstract ideation and theoretical systems representing the oppressive consciousness of the “white man,” built in the name of reason/logos, are in themselves collections of analogies, existing as colorless dead metaphors whose primitive origins lie in the figurative realms of myth, symbol, and fable.

Derrida’s project, resulting as it does in the deconstruction of metaphysics, runs counter to Ricoeur’s tensive theory. In contrast to Heidegger’s restrained criticism, Derrida’s deconstruction appears to Ricoeur “unbounded.” That is, Ricoeur still assumes a distinction between the speculative and the poetic, where the poetic “drives the speculative” to explicate a surplus of meaning. This surplus, or plurivocity, is problematic from Derrida’s standpoint. The latter argues that the theory remains logocentric in that it remains true to the binary mode of identity and difference which underlies metaphysical distinctions such as “being and non-being.” For Ricoeur, metaphors create a new space for meaning based on the tension between that which “is” (can be properly predicated of an object) and that which “is not” (cannot be predicated of an object). Derrida begs to differ: in the final analysis, there can be no such separation, no systematic philosophical theory or set of conceptual structures through which we subsume and “explain” the cognitive or existential value of metaphor.

Derrida’s reverse metaphorization of concepts does not support a plurivocal characterization of meaning and being; nor does it posit a wider referential field. For Derrida, metaphors and concepts remain in a complex, always ambiguous relation to one another. Thus he seems to do away with “reference,” or the distinction between signifier and signified, moving even beyond polysemy (the many potential meanings that words carry). The point here is to preserve the flux of sense and the ongoing dissemination of meaning and otherness.

a. The Dispute between Ricoeur and Derrida

The dispute between Ricoeur and Derrida regarding the referential power of metaphor lies in where they position themselves with regard to Aristotle. Ricoeur’s position, in giving priority to the noun-phrase instead of the singular name, challenges Aristotle while still appealing to the original taxonomy (categories) of being based on an architectonic system of predication. For Ricoeur, metaphoric signification mimics the fundamentally equivocal nature of being—we cannot escape the ontological implications of Aristotle’s statement: being can be “said in many ways.” Nevertheless, Ricoeur maintains the distinction between mythos and logos, for we need the tools provided by speculative discourse to explain the polysemic value of metaphors.

Derrida’s deconstruction reaches back to dismantle Aristotle’s theory, rooted as it is in the ontology of the proper name/noun (onoma) which signifies a thing as self-identical being (homousion). This, states Derrida, “reassembles and reflects the culture of the West; the white man takes his own mythology, Indo-European mythology, his own logos, that is, the mythos of his idiom, for the universal form of that he must still wish to call Reason” (WM 213).

The original theory makes metaphor yet another link in the logocentric chain—a form of metaphysical oppression. If the value of metaphor is restricted to the transference of names, then metaphor entails a loss or negation of the literal. It remains confined within a notion of discourse which upholds the traditional formulations of representation and reference in terms of the mimetic and the “proper,” formulations based, in turn, on a theory of perception (and an attendant metaphysics) that gives priority to resemblance, identity, or what we can call “the law of the same.”

5. Anglo-American Philosophy: Interactionist Theories

Contemporary phenomenological theories of metaphor directly challenge the straightforward theory of reference, replacing the ordinary propositional truth based on denotation with a theory of language which designates and discloses its referents. These interactionist theories carry certain Neo-Kantian features, particularly in the work of the analytic philosophers Nelson Goodman and Max Black. They posit the view that metaphors can reorganize the connections we make between our perceptions of the world. Their theories reflect certain phenomenological assumptions about the ways in which figurative language expands the referential field, allowing for the creation of novel meanings and creating new possibilities for constructing models of reality; in moving between the realms of art and science, metaphors have an interdisciplinary utility. Both Goodman and Black continue to challenge the traditional theory of linguistic reference, offering instead the argument that reference is enabled by the manipulation of predicates in figurative modes of thinking through language.


6. Metaphor, Phenomenology, and Cognitive Science

Recent studies underscore the connections between metaphors, mapping, and the schematizing aspects of cognitive organization in mental life. Husserl’s approach to cognition took an anti-naturalist stance: he opposed defining consciousness as an objective entity, an approach he considered unsuited to studying the workings of subjective consciousness. His phenomenological stance instead gave priority to subjectivity, since it constitutes the necessary set of pre-conditions for knowing anything at all as an object or a meaning. Recently, this trend has been renewed, and phenomenology has made productive inroads into the examination of connectionist and embodied approaches to perception, cognition, and other dynamic and adaptive (biological) systems.

Zahavi and Thompson, for example, see strong links between Husserlian phenomenology and philosophy of mind with respect to the phenomena of consciousness, where the constitutive nature of subjective consciousness is clarified specifically in terms of the forms and relations of different kinds of intentional mental states. These involve the unity of temporal experience, the structural relations between intentional mental acts and their objects, and the inherently embodied nature of cognition. Those who study the embodied mind do not all operate in agreement with traditional phenomenological assumptions and methods. Nevertheless, some “naturalized” versions in the field of consciousness studies are now gaining ground, offering viable solutions to the kind of problematic Cartesian dualistic metaphysics that Husserl’s phenomenology suggests.

a. The Embodied Mind

In recent years, the expanding field of cognitive science has explored the role of metaphor in the formation of consciousness (cognition and perception). In a general sense, it appears that contemporary cognitivist, constructivist, and systems (as in self-organizing) approaches to the study of mind incorporate metaphor as a tool for developing an anti-metaphysical, anti-positivist theory of mind, in an attempt to reject any residual Cartesian and Kantian psychologies. The cognitive theories, however, remain partially in debt to Kantian schematism and its role in cognition.

There is, furthermore, in these theories an overturning of any remaining structuralist suppositions (that language and meaning might be based on autonomous configurations of syntactic elements). Many cognitive scientists, in disagreement with Chomsky’s generative grammar, study meaning as a form of cognition that is activated in the context of use. Lakoff and Johnson, in Philosophy in the Flesh, find a great deal of empirical evidence for the ways in which metaphors shape our ordinary experience, exploring the largely unconscious perceptual and linguistic processes that allow us to understand one idea or domain of experience, whether conceptual or physical, in terms of a “foreign” domain. The research follows the work of Srini Narayanan and Eleanor Rosch, cognitive scientists who also examine schemas and metaphors as key to embodied theories of cognition. Such theories generally trace the connective interplay between our neuronal makeup, our physical interactions with the environment, and our own private and social human purposes.

In a limited sense, the stress on the embodied nature of cognition aligns with the phenomenological position. Perceptual systems, built in physical response to determinate spatio-temporal and linguistic contexts, become phenomenological “spaces” shaped through language use. Yet these researchers largely take issue with Continental phenomenology and traditional philosophy in a dramatic and far-reaching way, objecting to the claim that the phenomenological method of introspection allows us to survey and describe all available fields of consciousness in the observing subject. If it is the case that we cannot fully access the far reaches of hidden cognitive processes, then much of the metaphorical mapping which underlies cognition takes place at an unconscious level, sometimes referred to as “the cognitive unconscious” (PIF 12-15).

Other philosophers of mind, including Stefano Arduini and Antonio Damasio, work along similar lines in cognitive linguistics, cognitive science, neuroscience, and artificial intelligence. Their work investigates the ways in which metaphors ground various first- and second-order cognitive and emotional operations and functions. Their conclusions share insights with the Continental studies conceiving of metaphor as a “refiguring” of experience. There is then some potential for overlap with this cognitive-conceptual version of metaphor, where metaphors and schemata embody emergent transformative categories enabling the creation of new fields of cognition and meaning.

Arduini, in his work, has explored what he calls the “anthropological ability” to build up representations of the world. Here rhetorical figures are realized on the basis of conceptual domains which create the borders of experience. We have access to a kind of reality that would otherwise be indeterminate, for human beings have the ability to conceptualize the world in imaginative terms through myth, symbol, the unconscious, or any expressive sign. For Arduini, figurative activity does not depict the given world, but enables the construction of the world-images we employ in reality. To be figuratively competent is to use the imagination as a tool which puts patterns together in inventive mental processes. Arduini thus seems to recall Nietzsche: anthropologically speaking, humans are always engaging in some form of figuration or form of language, which allows for “cognitive competence” in that it chooses among particular forms which serve to define the surrounding contexts or environments. Again, metaphor is foundational to the apprehension of reality; it is part of the pre-reflective or primordial apparatus of experience, perception, and first- through second-order thought, informing an entire theoretical approach as well as disciplines such as evolutionary anthropology (see Tooby and Cosmides).

b. The Literary Mind

The work of Gilles Fauconnier and Mark Turner extends that of Lakoff and Johnson outlined above. For Fauconnier, the task of language is to construct, and for the linguist and cognitive scientist it is “a window into the mind.” Working both independently and in collaboration, Fauconnier and Turner develop a theory of conceptual blending in which metaphorical forms take center stage. Basically, the theory of conceptual blending follows from Lakoff and Johnson’s work on the “mapping,” or projective, qualities of our cognitive faculties. For example, if we return to the Shakespearean line “in me thou seest the glowing of such fire,” the source is fire, whose sets of associations are projected onto the target – in this case the waning aspect of the narrator. Their research shows that large numbers of such cross-domain mappings are expressed as conceptual structures which have propositional content: for example, “life is fire; loss is extinction of fire.” There exist several categories of mappings across different conceptual domains, including spatio-temporal orientation, movement, and containment. For example: “time flies” or “this relationship is smothering.”
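A cross-domain mapping of this sort can be sketched as a simple data structure. This is a toy illustration under our own assumptions (the `project` function and the feature lists are hypothetical), not Lakoff, Johnson, Fauconnier, or Turner’s actual formalism:

```python
# Toy sketch of a cross-domain mapping: associations of the source domain
# (fire) are projected onto the target domain (life), yielding propositional
# content such as "loss is extinction of fire".
source = {"domain": "fire", "relations": {"glowing": "vitality", "extinction": "loss"}}
target_domain = "life"

def project(src, target):
    """Project each source relation onto the target domain."""
    return [f"{target}: {concept} is {feature} of {src['domain']}"
            for feature, concept in src["relations"].items()]

for mapping in project(source, target_domain):
    print(mapping)
```

The sketch only makes the directionality explicit: the pattern we already understand (fire) lends its relational structure to the domain we seek to understand (life).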

Turner’s work in The Literary Mind takes a slightly different route, portraying these cognitive mechanisms as forms of “storytelling.” This may seem counterintuitive to the ordinary observer, but Turner gives ample evidence for the mind’s ability to do much of its everyday work through various forms of narrative projection (LM 6-9). It is not too far a reach from this version of narrative connection back to the hermeneutic and cognitive-conceptual uses of metaphor outlined earlier. If we understand parables to be essentially forms of extended metaphor, we can clearly see the various ways in which they contribute to the making of intelligible experience.

The study of these mental models sheds light on the phenomenological and hermeneutic aspects of reality-construction. If these heuristic models are necessary to cognitive functioning, it is because they allow us to represent higher-order aspects of reality which involve expressions of human agency, intentionality, and motivation. Though we may be largely unaware of these patterns, they are based on our ability to think in metaphor, and they work continuously to enable the structuring of intentional experience – which cannot always be adequately represented by straightforward first-order physical description. Fauconnier states:

We see their status as inventions by contrasting them with alternative representations of the world. When we watch someone sitting down in a chair, we see what physics cannot recognize: an animate agent performing an intentional act. (MTL 19-20)

Turner, along with Fauconnier and Lakoff, connects parabolic thought with the image-schematic or mapping between different domains of encounter with our environments. Fauconnier’s work, correlating here with Turner’s, moves between cognitive-scientific and phenomenological considerations; both depict mapping as a constrained form of projection, a complex mental manipulation which moves across mental structures which correspond to various phenomenological spaces of thought, action, and communication.

Metaphorical mapping allows the mind to cross and conflate several domains of experience. This cross-referencing, reminiscent of Black’s interactionist dynamics, amounts to a form of induction resulting from the projection of relations from a source structure, a pattern we already understand, onto a target structure, that which we seek to understand.

Mapping as a form of metaphoric construction leads to other forms of blending, conceptual integration, and novel category formation. We can, along with Fauconnier and the rest, describe this emergent evolution of linguistic meaning in dialectical terms, arguing that it is possible to mesh together two images of a virus (biological and computational) into a third, blended idea that integrates and expands the meaning of the first two (MTL 22). Philosophically speaking, we seem to have come full circle back to the Hegelian theme which runs through the phenomenological analysis of metaphor as a re-mapping of mind and reality.
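Under the toy assumption that each input space can be caricatured as a flat set of features (our illustration only; Fauconnier’s model involves structured mental spaces, not sets), the virus example might be sketched like this:

```python
# Minimal sketch of conceptual integration ("blending"): two input spaces
# share a generic structure, and the blend integrates selected features of
# each into an emergent third structure.
bio_virus = {"replicates", "infects a host", "mutates", "requires a cell"}
computer_virus = {"replicates", "infects a host", "spreads by email", "executes code"}

# Generic space: what both inputs share.
generic_space = bio_virus & computer_virus

# Blend: the shared structure plus selectively projected features of each input.
blend = generic_space | {"mutates", "executes code"}

print(sorted(generic_space))
print(sorted(blend))
```

Even this caricature shows the dialectical shape of the process: the blend is neither input, yet it preserves and expands the meaning of both.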

7. Conclusion

Continental theories of metaphor have extrapolated and developed variations on the theme expressed in Nietzsche’s famous pronouncement that truth is “a mobile army of metaphors.” The notion that metaphorical language is somehow ontologically and epistemologically prior to ordinary propositional language has since been voiced by Heidegger, Ricoeur, and Derrida. For these thinkers metaphor serves as a foundational heuristic structure, one which is primarily designed to subvert ordinary reference and in some way dismantle the truth-bearing claims of first-order propositional language. Martin Heidegger’s existential phenomenology does away with the assumption that true or meaningful intentional statements reflect epistemic judgments about the world; that is, they do not derive referential efficacy through the assumed correspondence between an internal idea and an external object. While there may be a kind of agreement between our notions of things and the world in which we find those things, it is still a derivative agreement emerging from a deeper ontologically determined set of relations between things-in-the-world, given or presented to us as inherently linked together in particular historical, linguistic, or cultural contexts.

The role of metaphor in perception and cognition also dominates the work of contemporary cognitive scientists, linguists, and those working in the related fields of evolutionary anthropology and computational theory. While the latter may not be directly associated with Continental phenomenology, aspects of their work support an “anti-metaphysical” position and draw upon common phenomenological themes which stress the embodied, linguistic, contextual, and symbolic nature of knowledge. Thinkers and researchers in this camp argue that metaphoric schemas are integral to human reasoning and action, in that they allow us to develop our cognitive and heuristic capacities beyond simple and direct first order experience.

8. References and Further Reading

  • Aristotle. Categories and De Interpretatione. J.C. Ackrill, trans. Oxford, Clarendon, 1963. (CDI)
  • Aristotle. Peri Hermeneias. Hans Arens, trans. Philadelphia, Benjamins, 1984. (PH)
  • Arduini, Stefano (ed.). Metaphors. Edizioni di Storia e Letteratura.
  • Barber, A. and Stainton, R. The Concise Encyclopedia of Language and Linguistics. Oxford, Elsevier Ltd., 2010.
  • Black, Max. Models and Metaphors. Ithaca, Cornell, 1962. (MAM)
  • Brentano, Franz C. On the Several Senses of Being in Aristotle. Berkeley, UC Press, 1975.
  • Cazeaux, Clive. Metaphor and Continental Philosophy, from Kant to Derrida. London, Routledge, 2007.
  • Cooper, David E. Metaphor. London, Oxford, 1986.
  • Derrida, Jacques. “White Mythology, Metaphor in the Text of Philosophy” in Margins of Philosophy, trans. A. Bass, Chicago, University of Chicago Press, 1982. (WM)
  • Fauconnier, Gilles. Mappings in Thought and Language. Cambridge, Cambridge University, 1997. (MTL)
  • Gallagher, Shaun. “Phenomenology and Non-reductionist Cognitive Science” in Handbook of Phenomenology and Cognitive Science, ed. by Shaun Gallagher and Daniel Schmicking. Springer, New York, 2010.
  • Goodman, Nelson. Languages of Art. New York, Bobs-Merrill, 1968.
  • Hinman, Lawrence. “Nietzsche, Metaphor, and Truth” in Philosophy and Phenomenological Research, Vol. 43, #2, 1984.
  • Harnad, Stevan. “Category Induction and Representation” in Categorical Perception: The Groundwork of Cognition. New York, Cambridge, 1987.
  • Heidegger, Martin. Being and Time. John Macquarrie and E. Robinson, trans. New York, Harper and Row, 1962. (BT)
  • Heidegger, Martin. The Basic Problems of Phenomenology, trans. A. Hofstadter, Bloomington, Indiana University Press, 1982.
  • Huemer, Wolfgang. The Constitution of Consciousness: A Study in Analytic Phenomenology. Routledge, 2005.
  • Johnson, Mark. “Metaphor and Cognition” in Handbook of Phenomenology and Cognitive Science, ed. by Shaun Gallagher and Daniel Schmicking. Springer, New York, 2010.
  • Joy, Morny. “Derrida and Ricoeur: A Case of Mistaken Identity” in The Journal of Religion. Vol. 68, #04, University of Chicago Press, 1988.
  • Kant, Immanuel. The Critique of Pure Reason. Trans. N. K. Smith, New York, 1958. (CPR)
  • Kofman, Sarah. Nietzsche and Metaphor. Trans. D. Large, Stanford, 1993.
  • Lakoff, George and Johnson, Mark. Philosophy in the Flesh: The Embodied Mind and Its Challenge to Western Thought. New York, Perseus-Basic, 1999.
  • Malabou, Catharine. The Future of Hegel: Plasticity, Temporality, and the Dialectic. Trans. Lisabeth During, New York, Routledge, 2005.
  • Nietzsche, F. The Birth of Tragedy and the Case of Wagner. Trans. W. Kaufmann. New York, Vintage, 1967. (BOT)
  • Lawlor, Leonard. Imagination and Chance: The Difference Between the Thought of Ricoeur and Derrida. Albany, SUNY Press, 1992.
  • Mohanty, J.N. and McKenna, W.R. Husserl’s Phenomenology: A Textbook. Washington, DC, University Press, 1989.
  • Rajan, Tilottama. Deconstruction and the Remainders of Phenomenology: Sartre, Derrida, Foucault, Baudrillard. Stanford, Stanford University Press, 2002.
  • Ricoeur, Paul. Figuring the Sacred, trans. D. Pellauer, Minneapolis, Fortress, 1995. (FS)
  • Ricoeur, Paul. The Rule of Metaphor. Toronto, University of Toronto, 1993. (ROM)
  • Schrift, Alan D. and Lawlor, Leonard (eds.). The History of Continental Philosophy, Vol. 4: Phenomenology: Responses and Developments. Chicago, University of Chicago Press, 2010.
  • Stellardi, Giuseppe. Heidegger and Derrida on Philosophy and Metaphor: Imperfect Thought. New York, Humanity-Prometheus, 2000.
  • Tooby, J. and L. Cosmides (ed. with J.Barkow). The Adapted Mind: Evolutionary Psychology and The Generation of Culture. Oxford, 1992.
  • Turner, Mark. The Literary Mind: The Origins of Thought and Language. Oxford, 1996. (LM)
  • Woodruff-Smith, David and McIntyre, Ronald. Husserl and Intentionality: A Study of Mind, Meaning, and Language. Boston, 1982.
  • Zahavi, Dan. “Naturalized Phenomenology” in Handbook of Phenomenology and Cognitive Science, ed. by Shaun Gallagher and Daniel Schmicking. Springer, New York, 2010.


Author Information

S. Theodorou
Email: stheodorou@immaculata.edu
Immaculata University
U. S. A.

Scientific Change

How do scientific theories, concepts and methods change over time? Answers to this question have historical parts and philosophical parts. There can be descriptive accounts of the recorded differences over time of particular theories, concepts, and methods—what might be called the shape of scientific change. Many stories of scientific change attempt to give more than statements of what, where and when change took place. Why this change then, and toward what end? By what processes did these changes take place? What is the nature of scientific change?

This article gives a brief overview of the most influential views on the shape and nature of change in science. Important thematic questions are: How gradual or rapid is scientific change? Is science really revolutionary? How radical is the change? Are periods in science incommensurable, or is there continuity between the earliest and latest scientific ideas? Is science getting closer to some final form, or merely moving away from a contingent, non-determining past? What role do the factors of community, society, gender, or technology play in facilitating or mitigating scientific change? The most important modern development in the topic is that none of these questions has the same answer for all sciences. When we speak of scientific change, it should be recognized that anything substantial can be said only at a fairly contextualized level of description of the practices of scientists at rather specific times and places.

Nonetheless, scientific change is connected with many other key issues in philosophy of science and broader epistemology, such as realism, rationality and relativism. The present article does not attempt to address them all. Higher-order debates regarding the methods of historiography or the epistemology of science, or the disciplinary differences between History and Philosophy, while important and interesting, represent an iteration of reflection on top of scientific change itself, and so go beyond the article’s scope.

Table of Contents

  1. If Science Changes, What is Science?
  2. History of Science and Scientific Change
  3. Philosophical Views on Change and Progress in Science
    1. Kuhn, Paradigms and Revolutions
      1. Key Concepts in Kuhn’s Account of Scientific Change
      2. Incommensurability as the Result of Radical Scientific Change
    2. Lakatos and Progressing and Degenerating Research Programs
    3. Laudan and Research Traditions
  4. The Social Processes of Change
    1. Fleck
    2. Hull’s Evolutionary Account of Scientific Change
  5. Cognitive Views on Scientific Change
    1. Cognitive History of Science
    2. Scientific Change and Science Education
  6. Further Reading and References
    1. Primary Sources
    2. Secondary Sources
      1. Concepts, Cognition and Change
      2. Feminist, Situated and Social Approaches
      3. The Scientific Revolution

1. If Science Changes, What is Science?

We begin with some organizing remarks. It is interesting to note at the outset the reflexive nature of the topic of scientific change. A main concern of science is understanding physical change, whether it be motions, growth, cause and effect, the creation of the universe or the evolution of species. Scientific views of change have influenced philosophical views of change and of identity, particularly among philosophers impressed by science’s success at predicting and controlling change. These philosophical views are then reflected back, through the history and philosophy of science, as images of how science itself changes, of how its theories are created, evolve and die. Models of change from science—evolutionary, mechanical, revolutionary—often serve as models of change in science.

This makes it difficult to disentangle the actual history of science from our philosophical expectations about it. And the historiography and the philosophy of science do not always live together comfortably. Historians balk at the evaluative, forward-looking, and often necessitarian claims of standard philosophical reconstructions of scientific events. Philosophers, for their part, have argued that details of the history of science matter little to a proper theory of scientific change, and that a distinction can and should be made between how scientific ideas are discovered and how they are justified. On this view, beneath the wide-ranging, messy, and contingent happenings which led to our current scientific outlook, there lies a progressive, systematically evolving activity waiting to be rationally reconstructed.

Clearly, to tell any story of ‘science changing’ means looking beneath the surface of those changes in order to find something that remains constant, the thing which remains science. Conversely, what one takes to be the demarcating criteria of science will largely dictate how one talks about its changes. What part of human history is to be identified with science? Where does science start and where does it end? The breadth of science has a dimension across concurrent events as well as across the past and future. That is, it has both synchronic (at a time) and diachronic (over time) dimensions. Science will consist of a range of contemporary events which need to be demarcated. But likewise, science has a temporal breadth: a beginning, or possibly several beginnings, and possibly several ends.

The synchronic dimension of science provides one way of distinguishing views of scientific change. On one hand, there are logical or rationalistic views according to which scientific activity can be reduced to a collection of objective, rational decisions of a number of individual scientists. On such views, the most significant changes in science can each be described through the logically reconstructable actions and words of one historical figure, or at most a very few. According to many of the more recent views, however, an adequate picture of science cannot be formed with anything less than the full context of social and political structures: the personal, institutional, and cultural relations of which scientists are a part. We look at some of these broader sociological views in the section on the social processes of change.

Historians and philosophers of science have also wanted to “broaden” science diachronically, to historicize its content, such that the justifications of science, or even its meanings, cannot be divorced from their past. We will begin with the most influential figure for history and philosophy of science in North America in the last half-century: Thomas Kuhn. Kuhn’s work in the middle of the last century was primarily a reaction to the then prevalent, rationalistic and a-historical view described in the previous paragraph. Along with Kuhn, we describe the closely related views of Imre Lakatos and Larry Laudan. For an introduction to the most influential philosophical accounts of the diachronic development of science, see Losee 2004.

When Kuhn and the others introduced their new views on the development of science into Anglo-Saxon philosophy of science, history and sociology were already an important part of the landscape of Continental history and philosophy of science. A discussion of these views can be found as part of the sociology of science section as well. The article concludes with more recent naturalized approaches to scientific change, which turn to cognitive science for accounts of scientific understanding and of how that understanding is formed and changed, as well as with suggestions for further reading.

The history of science itself, at least in a form recognizable to us, is a twentieth-century phenomenon. Although a matter of debate, the canonical view of the history of scientific change is that its seminal event is the one tellingly labeled the Scientific Revolution, usually dated to the sixteenth and seventeenth centuries. The first historiographies of science—as much construction of the revolution as they were documentation—were not far behind, coming in the eighteenth and nineteenth centuries. Professionalization of the history of science, characterized by reflections on the telling of the history of science, followed later. We begin our story there.

2. History of Science and Scientific Change

As history of science professionalized, becoming a separate academic discipline in the twentieth century, scientific change was seen early on as an important theme within the discipline. Admittedly, the idea of radical change was not a key notion for early practitioners of the field such as George Sarton (1884-1956), the father of history of science in the United States, but with the work of historians of science such as Alexandre Koyré (1892-1964), Herbert Butterfield (1900-1979) and A. Rupert Hall (1920-2009), radical conceptual transformations came to play a much more important role.

One of the early outcomes of this interest in change was the volume Scientific Change (Crombie, 1963) in which historians of science covering the span of science from the physical to the biological sciences, and the span of history from antiquity to modern science, all investigated the conditions for scientific change by examining cases from a multitude of periods, societies, and scientific disciplines. The introduction to Crombie’s volume presented a large number of questions regarding scientific change that remained key issues in both history and philosophy of science for several decades:

What were the essential changes in scientific thought and how were they brought about? What was the part played in the initiation of change by mutations in fundamental ideas leading to new questions being asked, new problems being seen, new criteria of satisfactory explanation replacing the old? What was the part played by new technical inventions in mathematics and experimental apparatus; by developments in pure mathematics; by the refinements of measurement; by the transference of ideas, methods and information from one field of study to another? What significance can be given to the description and use of scientific methods and concepts in advance of scientific achievement? How have methods and concepts of explanation differed in different sciences? How has language changed in changing scientific contexts? What parts have chance and personal idiosyncrasy played in discovery? How have scientific changes been located in the context of general ideas and intellectual motives, and to what extent have extra-scientific beliefs given theories their power to convince? … How have scientific and technical changes been located in the social context of motives and opportunities? What value has been put on scientific activity by society at large, by the needs of industry, commerce, war, medicine and the arts, by governmental and private investment, by religion, by different states and social systems? To what external social, economic and political pressures have science, technology and medicine been exposed? Are money and opportunity all that is needed to create scientific and technical progress in modern society? (Crombie, 1963, p. 10)

Of particular interest among historians of science have been the changes associated with scientific revolutions, and especially the period often referred to as the Scientific Revolution, seen as the sum of achievements in science from Copernicus to Newton (Cohen 1985; Hall 1954; Koyré 1965). The word ‘revolution’ began to be applied in the eighteenth century to the developments in astronomy and physics, and was later extended to the change in chemical theory that emerged with the work of Lavoisier in the 1770s and to the change in biology initiated by Darwin’s work in the mid-nineteenth century. These were fundamental changes that overturned not only the reigning theories but also carried with them significant consequences outside their respective scientific disciplines. In most of the early work in history of science, scientific change in the form of scientific revolutions was something which happened only rarely. This view was changed by the historian and philosopher of science Thomas S. Kuhn, whose 1962 monograph The Structure of Scientific Revolutions (1970) came to influence philosophy of science for decades. In this monograph Kuhn argued, on the basis of historical case studies, for a change in the philosophical conceptions of science and its development. The notion of revolutions that he used in Structure included not only fundamental changes of theory that had a significant influence on the overall world view of both scientists and non-scientists, but also changes of theory whose consequences remained solely within the scientific discipline in which the change had taken place. This considerably widened the notion of scientific revolutions compared to that of earlier historians and initiated discussions among both historians and philosophers on the balance between continuity and change in the development of science.

3. Philosophical Views on Change and Progress in Science

In the British and North American schools of philosophy of science, scientific change did not become a major topic until the 1960s, when historically inclined philosophers of science, including Thomas S. Kuhn (1922-1996), Paul K. Feyerabend (1924-1994), N. Russell Hanson (1924-1967), Michael Polanyi (1891-1976), Stephen Toulmin (1922-2009) and Mary Hesse (*1924) started questioning the assumptions of logical positivism, arguing that philosophy of science should be concerned with the historical structure of science rather than with an ahistorical logical structure which they found to be a chimera. The occupation with history led naturally to a focus on how science develops, including whether science progresses incrementally or through changes which represent some kind of discontinuity.

Similar questions had also been discussed among Continental scholars. The development of the theory of relativity and of quantum mechanics in the beginning of the twentieth century suggested that empirical science could overturn deeply held intuitions and introduce counter-intuitive new concepts and ideas; and several European philosophers, among them the German neo-Kantian philosopher Ernst Cassirer (1874-1945), directed their work towards rejecting Kant’s absolute categories in favor of categories that may change over time. In France, the historian and philosopher of science Gaston Bachelard (1884-1962) also noted that what Kant had taken to be absolute preconditions for knowledge had turned out to be wrong in the light of modern physics. On Bachelard’s view, what had seemed to be absolute preconditions for knowledge were instead merely contingent conditions. These conditions were still required for scientific reasoning and therefore, Bachelard concluded, a full account of scientific reasoning could only be derived from reflections upon its historical conditions and development. Based on this analysis of the historical development of science, Bachelard advanced a model of scientific change according to which the conceptions of nature are from time to time replaced by radically new conceptions – what Bachelard called epistemological breaks.

Bachelard’s view was later developed and modified by the historian and philosopher of science, and student of Bachelard, Georges Canguilhem (1904-1995) and by the philosopher and social historian, and student of Canguilhem, Michel Foucault (1926-1984). Beyond the teacher-student connections, there are other commonalities which unify this tradition. In North America and England, among those who wanted to make philosophy more like science, or to import into philosophical practice lessons from the success of science, the exemplar was almost always physics. The most striking and profound advances in science seemed to be, after all, in physics, namely the quantum and relativity revolutions. But on the Continent, the model sciences were just as often linguistics or sociology, biology or anthropology, and not limited to those. Canguilhem’s interest in changing notions of the normal versus the pathological, for example, coming from an interest in medicine, typified the more human-centered theorizing of the tradition. What we as humans know, how we know it, and how we successfully achieve our aims, are the guiding questions, not how to escape our human condition or situatedness.

Foucault described his project as an archaeology of the history of human thought and its conditions. He compared his project to Kant’s critique of reason, but with the difference that Foucault’s interest was in a historical a priori; that is, in what seem to be, for a given period, the necessary conditions governing reason, and in how these constraints have a contingent historical origin. Hence, in his analysis of the development of the human sciences from the Renaissance to the present, Foucault described various so-called epistemes that determined the conditions for all knowledge of their time, and he argued that the transition from one episteme to the next happens as a break that entails radical changes in the conception of knowledge. Michael Friedman’s work on the relativized and dynamic a priori can be seen as a continuation of this thread (Friedman 2001). For a detailed account of the work of Bachelard, Canguilhem and Foucault, see Gutting (1989).

With the advent of Kuhn’s Structure, “non-Continental” philosophy of science also started focusing in its own way on the historical development of science, often apparently unaware of the earlier tradition, and in the decades to follow alternative models were developed to describe how theories supersede their predecessors, and whether progress in science is gradual and incremental or discontinuous. Among the key contributions to this discussion, besides Kuhn’s famous paradigm-shift model, were Imre Lakatos’ (1922-1974) model of progressing and degenerating research programs and Larry Laudan’s (*1941) model of successive research traditions.

a. Kuhn, Paradigms and Revolutions

One of the key contributions that provoked interest in scientific change among philosophers of science was Thomas S. Kuhn’s seminal monograph The Structure of Scientific Revolutions from 1962. The aim of this monograph was to question the view that science is cumulative and progressive, and Kuhn opened with: “History, if viewed as a repository for more than anecdote or chronology, could produce a decisive transformation in the image of science by which we are now possessed” (p. 1). History was expected to do more than just chronicle the successive increments of, or impediments to, our progress towards the present. Instead, historians and philosophers should focus on the historical integrity of science at a particular time in its development, and should analyze science as it developed. Instead of describing a cumulative, teleological development toward the present, history of science should see science as developing from a given point in history. Kuhn expected a new image of science would emerge from this diachronic historiography. In the rest of Structure he used historical examples to question the view of science as a cumulative development in which scientists gradually add new pieces to the ever-growing aggregate of scientific knowledge, and instead he described how science develops through successive periods of tradition-preserving normal science and tradition-shattering revolutions. For introductions to Kuhn’s philosophy of science, see for example Andersen 2001, Bird 2000, and Hoyningen-Huene 1993.

i. Key Concepts in Kuhn’s Account of Scientific Change

On Kuhn’s model, science proceeds in key phases. The predominant phase is normal science which, while progressing successfully in its aims, inherently generates what Kuhn calls anomalies. In brief, anomalies lead to crisis and extraordinary science, followed by revolution, and finally a new phase of normal science.

Normal science is characterized by a consensus which exists throughout the scientific community as to (a) the concepts used in communication among scientists, (b) the problems which can meaningfully be formulated as relevant research problems, and (c) a set of exemplary problem solutions that serve as models in solving new problems. Kuhn first introduced the notion of a ‘paradigm’ to denote these shared communal aspects, together with the tools used by that community for solving its research problems. Because so much was apparently captured by the term ‘paradigm’, Kuhn was criticized for using it in ambiguous ways (see especially Masterman 1970). He later offered the alternative notion of a ‘disciplinary matrix’, covering (a) symbolic generalizations, or laws in their most fundamental forms, (b) beliefs about which objects and phenomena exist in the world, (c) values by which the quality of research can be evaluated, and (d) exemplary problems and problem situations. In normal science, scientists draw on the tools provided by the disciplinary matrix, and they expect the solutions of new problems to be in consonance with the descriptions and solutions of the problems that they have previously examined. But sometimes these expectations are violated. Problems may turn out not to be solvable in an acceptable way, and then they instead represent anomalies for the reigning theories.

Not all anomalies are equally severe. Some discrepancy can always be found between theoretical predictions and experimental findings, and this does not necessarily challenge the foundations of normal science. Hence, some anomalies can be neglected, at least for some time. Others may find a solution within the reigning theoretical framework. Only a small number will be so severe and so persistent that they suggest the tools provided by the accepted theories must be given up, or at least seriously modified. Science has then entered the crisis phase of Kuhn’s model. Even in crisis, revolution may not be immediately forthcoming. Scientists may “agree” that no solution is likely to be found in the present state of their field and simply set the problems aside for future scientists to solve with more developed tools, while they return to normal science in its present form. More often, though, when the crisis has become severe enough to call the foundations into question and the anomalies can be solved by a new theory, that theory gradually gains acceptance until eventually a new consensus is established among the members of the scientific community regarding the new theory. Only in this case has a scientific revolution occurred.

Importantly though, even severe anomalies are not simply falsifying instances. Severe anomalies cause scientists to question the accepted theories, but the anomalies do not lead the scientists to abandon the paradigm without an alternative to replace it. This raises a crucial question regarding scientific change on Kuhn’s model: where do new theories come from? Kuhn said little about this creative aspect of scientific change, a topic that later became central to cognitively inclined philosophers of science working on scientific change (see the section on Cognitive Views below). Kuhn described merely how severe anomalies would become the fixation point for further research, while attempts to solve them might gradually diverge more and more from the solution hitherto accepted as exemplary, until, in the course of this development, embryonic forms of alternative theories were born.

ii. Incommensurability as the Result of Radical Scientific Change

For Kuhn the relation between normal science traditions separated by a scientific revolution cannot be described as incorporation of one into the other, or as incremental growth. To describe the relation, Kuhn adopted the term ‘incommensurability’ from mathematics, claiming that the new normal-scientific tradition which emerges from a scientific revolution is not only incompatible but often actually incommensurable with that which has gone before.

Kuhn’s notion of incommensurability covered three different aspects of the relation between the pre- and post-revolutionary normal science traditions: (1) a change in the set of scientific problems and the way in which they are attacked, (2) conceptual changes, and (3) a change, in some sense, in the world of the scientists’ research. This latter, “world-changing” aspect is the most fundamental aspect of incommensurability. However, it is a matter of great debate exactly how strongly we should take Kuhn’s meaning, for instance when he stated that “though the world does not change with a change of paradigm, the scientist afterwards works in a different world” (p. 121). To make sense of these claims it is necessary to distinguish between two different senses of the term ‘world’: the world as the independent object which scientists investigate and the world as the perceived world in which scientists practice their trade.

In Structure, Kuhn argued for incommensurability in perceptual terms. Drawing on results from psychological experiments showing that subjects’ perceptions of various objects were dependent on their training and experience, Kuhn suspected that something like a paradigm was prerequisite to perception itself and that, therefore, different normal science traditions would cause scientists to perceive differently. But when it comes to visual gestalt-switch images, one has recourse to the actual lines drawn on the paper. Contrary to this possibility of employing an ‘external standard’, Kuhn claimed that scientists can have no recourse above or beyond what they see with their eyes and instruments. For Kuhn, the change in perception cannot be reduced to a change in the interpretation of stable data, simply because stable data do not exist. Kuhn thus strongly attacked the idea of a neutral observation-language; an attack similarly launched by other scholars during the late 1950s and early 1960s, most notably Hanson (Hanson 1958).

These aspects of incommensurability have important consequences for the communication between proponents of competing normal science traditions and for the choice between such traditions. Recognizing different problems and adopting different standards and concepts, scientists may talk past each other when debating the relative merits of their respective paradigms. But if they do not agree on the list of problems that must be solved or on what constitutes an acceptable solution, there can be no point-by-point comparison of competing theories. Instead, Kuhn claimed that the role of paradigms in theory choice was necessarily circular in the sense that the proponents of each would use their own paradigm to argue in that paradigm’s defense. Paradigm choice is a conversion that cannot be forced by logic and neutral experience.

This view has led many critics of Kuhn to the misunderstanding that he saw paradigm choice as devoid of rational elements. However, Kuhn did emphasize that although paradigm choice cannot be justified by proof, this does not mean that arguments are not relevant or that scientists are not rationally persuaded to change their minds. On the contrary, Kuhn argued that “[i]ndividual scientists embrace a new paradigm for all sorts of reasons and usually for several at once” (Kuhn 1996, p. 152). According to Kuhn, such arguments are, first of all, about whether the new paradigm can solve the problems that have led the old paradigm to a crisis, whether it displays a quantitative precision strikingly better than its older competitor, and whether in the new paradigm or with the new theory there are predictions of phenomena that had been entirely unsuspected while the old one prevailed. Aesthetic arguments, based for example on simplicity, may enter as well.

Another common misunderstanding of Kuhn’s notion of incommensurability is that it implies a total discontinuity between the normal science traditions separated by a scientific revolution. Kuhn emphasized, rather, that a new paradigm often incorporates much of the vocabulary and apparatus, both conceptual and manipulative, of its predecessor. Paradigm shifts may be “non-cumulative developmental episodes …,” but the former paradigm can be replaced “… in whole or in part …” (Ibid., p. 92). In this way, parts of the achievements of a normal science tradition will turn out to be permanent, even across a revolution. “[P]ostrevolutionary science invariably includes many of the same manipulations, performed with the same instruments and described in the same terms …” (Ibid., pp. 129-130). Incommensurability is a relation that holds only between minor parts of the object domains of two competing theories.

b. Lakatos and Progressing and Degenerating Research Programs

Lakatos agreed with Kuhn’s insistence on the tenacity of some scientific theories and the rejection of naïve falsification, but he was opposed to Kuhn’s account of the process of change, which he saw as “a matter for mob psychology” (Lakatos, 1970, p. 178). Lakatos therefore sought to improve upon Kuhn’s account by providing a more satisfactory methodology of scientific change, along with a meta-methodological justification of the rationality of that method, both of which were seen to be either lacking or significantly undeveloped in Kuhn’s early writings. On Lakatos’ account, a scientific research program consists of a central core that is taken to be inviolable by scientists working within the research program, and a collection of auxiliary hypotheses that are continuously developing as the core is applied. In this way, the methodological rules of a research program divide into two different kinds: a negative heuristic that tells the scientists which paths of research to avoid, and a positive heuristic that tells the scientists which paths to pursue. On this view, all tests are necessarily directed at the auxiliary hypotheses which come to form a protective belt around the hard core of the research program.

Lakatos aims to reconstruct changes in science as occurring within research programs. A research program is constituted by the series of theories resulting from adjustments to the protective belt but all of which share a hard core. As adjustments are made in response to problems, new problems arise, and over a series of theories there will be a collective problem-shift. Any series of theories is theoretically progressive, or constitutes a theoretically progressive problem-shift, if and only if there is at least one theory in the series which has some excess empirical content over its predecessor. If this excess empirical content is also corroborated, the series of theories is empirically progressive. A problem-shift is progressive, then, if it is both theoretically and empirically progressive; otherwise it is degenerating. A research program is successful if it leads to progressive problem-shifts and unsuccessful if it leads to degenerating problem-shifts. The further aim of Lakatos’ account, in other words, is to discover, through reconstruction in terms of research programs, where progress is made in scientific change.
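Lakatos’ definitions can be rendered as a small illustrative sketch (our gloss, not Lakatos’ own formalism). Here each theory in a series is represented, hypothetically, by the set of facts it predicts and the subset of those predictions that have been corroborated by tests:

```python
def classify_problem_shift(series):
    """Classify a series of theories per Lakatos' criteria (toy rendering).

    series: list of dicts, each with keys
      'content'      -- set of predictions the theory makes
      'corroborated' -- subset of those predictions that passed tests
    """
    # Theoretically progressive: some theory predicts something
    # its predecessor did not (excess empirical content).
    theoretically = any(
        later["content"] - earlier["content"]
        for earlier, later in zip(series, series[1:])
    )
    # Empirically progressive: some of that excess content is corroborated.
    empirically = any(
        (later["content"] - earlier["content"]) & later["corroborated"]
        for earlier, later in zip(series, series[1:])
    )
    return "progressive" if (theoretically and empirically) else "degenerating"
```

On this toy rendering, a series that at some step adds a corroborated novel prediction counts as progressive; a series that merely readjusts without new content, or whose novel predictions all fail, counts as degenerating.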

The rationally reconstructive aspect of Lakatos’ account is the target of criticism. The notion of empirical content, for instance, carries a heavy burden in the account. In order to assess the progressiveness of a program, one would seem to need a measure of the empirical content of theories in order to judge when there is excess content. Without some such measure, however, Lakatos’ methodology is dangerously close to being vacuous or ad hoc.

We can instead take the increase in empirical content to be a meta-methodological principle, one which dictates an aim for scientists (that is, to increase empirical knowledge), while cashing this out at the methodological level by identifying progress in research programs with making novel predictions. The importance of novel predictions, in other words, can be justified by their leading to an increase in the empirical content of the theories of a research program. A problem-shift which results in novel predictions can be taken to entail an increase in empirical content. It remains a worry, however, whether such an inference is warranted, since it seems to simply assume novelty and cumulativity go together unproblematically. That they might not was precisely Kuhn’s point.

A second objection is that Lakatos’ reconstruction of scientific change through appeal to a unified method runs counter to the prevailing attitude among philosophers of science from the second half of the twentieth century on, according to which there is no unified method for all of science. At best, anything they all have in common methodologically will be so general as to be unhelpful or uninteresting.

At any rate, Lakatos does offer us a positive heuristic for the description and even explanation of scientific change. For him, change in science is a difficult and delicate thing, requiring balance and persistence. “Purely negative, destructive criticism, like ‘refutation’ or demonstration of an inconsistency does not eliminate a program. Criticism of a program is a long and often frustrating process and one must treat budding programs leniently. One may, of course, show up [criticize] the degeneration of a research program, but it is only constructive criticism which, with the help of rival research programs, can achieve real successes; and dramatic spectacular results become visible only with hindsight and rational reconstruction” (Lakatos, 1970, p. 179).

c. Laudan and Research Traditions

In his Progress and Its Problems: Towards a Theory of Scientific Growth (1977), Laudan defined a research tradition as a set of general assumptions about the entities and processes in a given domain and about the appropriate methods to be used for investigating the problems and constructing the theories in that domain. Such research traditions should be seen as historical entities created and articulated within a particular intellectual environment, and as historical entities they would “wax and wane” (p. 95). On Laudan’s view, it is important to consider scientific change both as changes that may appear within a research tradition and as changes of the research tradition itself.

The key engine driving scientific change for Laudan is problem solving. Changes within a research tradition may be minor modifications of subordinate, specific theories, such as modifications of boundary conditions, revisions of constants, refinements of terminology, or expansion of a theory’s classificatory network to encompass new discoveries. Such changes solve empirical problems, essentially those problems Kuhn conceives of as anomalies. But, contrary to Kuhn’s normal science and to Lakatos’ research programs, Laudan held that changes within a research tradition might also involve changes to its most basic core elements. Severe anomalies which are not solvable merely by modification of specific theories within the tradition may be seen as symptoms of a deeper conceptual problem. In such cases scientists may instead explore what sorts of (minimal) adjustments could be made in the deep-level methodology or ontology of that research tradition (p. 98). When Laudan looked at the history of science, he saw Aristotelians who had abandoned the Aristotelian doctrine that motion in a void is impossible, and Newtonians who had abandoned the Newtonian demand that all matter has inertial mass, and he saw no reason to claim that they were no longer working within those research traditions.

Solutions to conceptual problems may even result in a theory with less empirical support and still count as progress, since it is overall problem-solving effectiveness (not all problems are empirical ones) which is the measure of success of a research tradition (Laudan 1996). Most importantly for Laudan, if there are what can be called revolutions in science, they reflect different kinds of problems, not a different sort of activity. David Pearce calls this Laudan’s methodological monism (see Pearce 1984). For Kuhn and Lakatos, identification of a research tradition (or program or paradigm) could be made at the level of specific invariant, non-rejectable elements. For Laudan, there is no such class of sacrosanct elements within a research tradition—everything is open to change over time. For example, while absolute time and space were seen as part of the unrejectable core of Newtonian physics in the eighteenth century, they were no longer seen as such a century later. This leaves a dilemma for Laudan’s view. If research traditions undergo deep-level transformations of their problem-solving apparatus, this would seem to constitute a significant change to the problem-solving activity, one that may warrant considering the change the basis of a new research tradition. On the other hand, if the activity of problem solving is strong enough to provide the identity conditions of a tradition across changes, consistency might force us to identify all problem-solving activity as part of one research tradition, blurring distinctions between science and non-science. Distinguishing between a change within a research tradition and the replacement of a research tradition with another seems both arbitrary and open-ended. One way of solving this problem is by turning from purely internal characteristics of science to external factors of social and historical context.

4. The Social Processes of Change

Science is not just a body of facts or sets of sentences. However one characterizes its content, that content must be embodied in institutions and practices composed of scientists themselves. An important question with respect to scientific change, then, is how “science” is constructed out of scientists, and which unit of analysis, the individual scientist or the community, is the proper one for understanding the dynamics of scientific change. Popper’s falsificationism was very much a matter of personal responsibility and reflection. Kuhn, on the other hand, saw scientific change as a change of community and generations. While Structure may have been largely responsible for making North American philosophers aware of the importance of historical and social context in shaping scientific change, Kuhn was certainly not the first to theorize about it. Kuhn himself recognized anticipations of his views in the earlier work of Ludwik Fleck (see for example Brorson and Andersen 2001, Babich 2003 and Mössner 2011 for comparisons between the views of Kuhn and Fleck).

a. Fleck

As early as the mid-1930s, Ludwik Fleck (1896-1961) gave an account of how thoughts and ideas change through their circulation within the social strata of a thought-collective (Denkkollektiv) and how this thought-traffic contributes to the process of verification. Drawing on a case study from medicine on the development of a diagnostic test for syphilis, Fleck argued in his 1935 monograph Genesis and Development of a Scientific Fact that a thought-collective is a functional unit in which people who interact intellectually are tied together through a particular ‘thought-style’ that imposes narrow constraints upon the thinking of the individual. The thought-style is dogmatically transmitted from one generation to the next, by initiation, training, education or other devices whose aim is introduction into the collective. Most people participate in numerous thought-collectives, and any individual therefore possesses several overlapping thought-styles and may become a carrier of influence between the various thought-collectives in which they participate. This traffic of thoughts outside the collective is linked to the most far-reaching alterations in thought-content. The ensuing modification and assimilation according to the foreign thought-style is a significant source of divergent thinking. According to Fleck, any circulation of thoughts therefore also causes transformation of the circulated thought.

In Kuhn’s Structure, the distinction between the individual scientist and the community as the agent of change was not quite clear, and Kuhn later regretted having used the notion of a gestalt switch to characterize changes in a community because “communities do not have experiences, much less gestalt switches.” Consequently, he realized that “to speak, as I repeatedly have, of a community’s undergoing a gestalt switch is to compress an extended process of change into an instant, leaving no room for the microprocesses by which the change is achieved” (Kuhn 1989, p. 50). Fleck, by contrast, rather than relying on an unexamined notion of communal change, made the process by which the individual interacts with the collective central to his account of scientific development and the joint construction of scientific thought. What the accounts have in common is the view that the social plays a role in scientific change through the social shaping of science content. It is not a relation between scientist and physical world that is constitutive of scientific knowledge, but a relation between scientists and the discipline to which they belong. That relation can be restrictive of change in science; it can also provide the dynamics for change.

b. Hull’s Evolutionary Account of Scientific Change

Several philosophers of science have held the view that the dynamics of scientific change can be seen as an evolutionary process in which some kind of selection plays a central role. One of the most detailed evolutionary accounts of scientific change has been provided by David Hull (1935-2010). On Hull’s account of scientific change, the development of science is a function of the interplay between cooperation and competition for credit among scientists. Hence, selection in the form of citations plays a central role in this account.

The basic structure of Hull’s account is that, for the conceptual elements of science (problems and their solutions, accumulated data, but also beliefs about the goals of science, proper ways to realize those goals, and so forth) to survive, they must be transmitted more or less intact through history. That is, they must be seen as replicators that pass on their structure in successive replication. Conceptual replication is thus a matter of information being transmitted largely intact by different vehicles. These vehicles of transmission may be media such as books or journals, but also scientists themselves. Whereas books and journals are passive vehicles, scientists are active in testing and changing the transmitted ideas. They are therefore not only vehicles of transmission but also interactors, interacting with their environment in a way that causes replication to be differential and hence enables scientific change.

Hull did not elaborate much on the inner structure of differential replication, apart from arguing that the underdetermination of theory by observation made it possible. Instead, the focus of his account is on the selection mechanism that can cause some lineages of scientific ideas to cease and others to continue. First, scientists tend to behave in ways that increase their conceptual fitness. Scientists want their work to be accepted, which requires that they gain support from other scientists. One kind of support is to show that their work rests on preceding research. But that is at the same time a decrease in originality. There is a trade-off between credit and support. Scientists whose support is worth having are likely to be cited more frequently.

Second, this social process is highly structured. Scientists tend to organize into tightly knit research groups in order to develop and disseminate a particular set of views. Few scientists have all the skills and knowledge necessary to solve the problems that they confront; they therefore tend to form research groups of varying degrees of cohesiveness. Cooperating scientists may often share ideas that are identical in descent, and transmission of their contributions can be viewed as similar to kin selection. In the wider scientific community, scientists may form a deme in the sense that they use the ideas of each other much more frequently than the ideas of scientists outside the community.

Initially, criticism and evaluation come from within a research group. Scientists expose their work to severe tests prior to publication, but some things are taken so much for granted that it never occurs to them to question them. After publication, evaluation shifts to scientists outside the group, especially opponents who are likely to have different (though equally unnoticed) presuppositions. The self-correction of science depends on other scientists having different perspectives and different career interests; scientists’ career interests are not damaged by refuting the views of their opponents.

5. Cognitive Views on Scientific Change

Scientific change received new interest during the 1980s and 1990s with the emergence of cognitive science, a field that draws on cognitive psychology, cognitive anthropology, linguistics, philosophy, artificial intelligence and neuroscience. Historians and philosophers of science adapted results from this interdisciplinary work to develop new approaches to their field. Among these approaches are Paul Churchland’s (b. 1942) neurocomputational perspective (Churchland, 1989; Churchland, 1992), Ronald Giere’s (b. 1938) work on cognitive models of science (Giere, 1988), Nancy Nersessian’s (b. 1947) cognitive history of science (Nersessian, 1984; 1992; 1995a; 1995b), and Paul Thagard’s (b. 1950) computational philosophy of science (Thagard, 1988; Thagard, 1992). Rather than explaining scientific change in terms of a priori principles, these approaches aim to be naturalistic, drawing on cognitive science for insights into how humans generally construct and develop conceptual systems and applying those insights in analyses of scientific change as conceptual change. (For an overview of research on conceptual change, see Vosniadou, 2008.)

a. Cognitive History of Science

Much of the early work on conceptual change emphasized the discontinuous character of major changes by using metaphors like ‘gestalt switch’, indicating that such major changes happen all at once. This idea had originally been introduced by Kuhn, but in his later writings he admitted that his use of the gestalt switch metaphor had its origin in his experience as a historian working backwards in time and that, consequently, it was not necessarily suitable for describing the experience of the scientists taking part in scientific development. Instead of dramatic gestalt shifts, it is equally plausible that for the historical actors there exist micro-processes in their conceptual development. The development of science may happen stepwise with minor changes and yet still sum up over time to something that appears revolutionary to the historian looking backward and comparing the original conceptual structures to the end product of subsequent changes. Kuhn realized this, but also saw that his own work did not offer any details on how such micro-processes would work, though it did leave room for their exploration (Kuhn 1989).

Exploration of conceptual microstructures has been one of the main issues within the cognitive history and philosophy of science. Historical case studies of conceptual change have been carried out by many scholars, including Nersessian, Thagard, and the Andersen-Barker-Chen group (see for example Nersessian, 1984; Thagard, 1992; Andersen, Barker, and Chen, 2006).

Some of the early work in cognitive history and philosophy of science focused on mapping conceptual structures at different stages during scientific change (see for example Thagard, 1990; Thagard and Nowak, 1990; Nersessian and Resnick, 1989) and on developing typologies of conceptual change according to their degrees of severity (Thagard, 1992). These approaches are useful for comparing different stages of scientific change and for discussing issues such as incommensurability. However, they do not provide much detail on the creative process through which changes are produced.

Other lines of research have focused on the reasoning processes used in creating new concepts during scientific change. One of the early contributors to this line of work was Shapere, who argued that, as concepts evolve, chains of reasoning connect the successive versions of a concept. These chains of reasoning also establish continuity in scientific change, and this continuity can only be fully understood through analysis of the reasons that motivated each step in the chain of changes (Shapere 1987a; 1987b). Over the last two decades, this approach has been extended and substantiated by Nersessian (2008a; 2008b), whose work has focused on the nature of the practices employed by scientists in creating, communicating and replacing scientific representations within a given scientific domain. She argues that conceptual change is a problem-solving process. Model-based reasoning processes in particular are used to facilitate and constrain the abstraction and integration of information from multiple sources during this process.

b. Scientific Change and Science Education

Aiming at insights into general mechanisms of conceptual development, some of the cognitive approaches have been directed toward investigating not only the development of science but also how sciences are learned. During the 1980s and early 1990s, several scholars argued that conceptual divides of the same kind described by Kuhn’s incommensurability thesis might exist in science education between teacher and student. Science teaching should therefore address these misconceptions in an attempt to facilitate conceptual change in students. Part of this research incorporated the (controversial) thesis that the development of ideas in students mirrors the development of ideas in the history of science—that cognitive ontogeny recapitulates scientific phylogeny. For the field of mechanics in particular, research was done to show that children’s naïve beliefs parallel early scientific beliefs, such as impetus theories (Champagne, Klopfer, and Anderson, 1980; Clement, 1983; McClosky, 1983). However, most research went beyond the search for analogies between students’ naïve views and historically held beliefs. Instead, researchers investigated the cognitive processes employed by scientists in constructing scientific concepts and theories more generally, through the available historical records, focusing on the kinds of reasoning strategies communicated in those records (see Nersessian, 1992; Nersessian, 1995a). This work still assumed that the cognitive activities of scientists in constructing new scientific concepts were relevant to learning, but it marked a return to a view of the history of science as a repository of case studies demonstrating how scientific concepts are constructed and changed. In assuming a conceptual continuity between scientific understanding “then and now,” the cognitive approach had moved away from the Kuhnian emphasis on incommensurability and gestalt-shift conceptual change.

6. Further Reading and References

It is impossible to disentangle entirely the history and philosophy of scientific change from a great number of other issues and disciplines. We have not, for instance, addressed the epistemology of science or the role of experiments (and thought experiments) in science. The questions of whether science, or knowledge in general, is approaching truth, tracking truth, or approximating to truth are taken up in epistemology, and for those issues one should consult the relevant references. Whether science progresses (and not just changes) is a question that supports its own literature as well. Many iterations of interpretation, criticism and reply have been given to the challenges of incommensurability, non-cumulativity, and irrationality of science. Beliefs in scientific progress founded on a naïve realism, according to which science is getting ever closer to a literally true picture of the world, have been soundly criticized. A simple version of the criticism is the pessimistic meta-induction: every scientific image of reality in the past has been proven wrong, therefore all future scientific images will be wrong (see Putnam 1978; Laudan 1984). In response to such challenges to realism, much attention has been paid to structural realism, the attempt to identify some underlying mathematical structure which is preserved even across major theory changes. On this view, past theories were not entirely wrong, and not entirely discarded, because they had some of the structure correct, albeit wrongly interpreted or embedded in a mistaken ontology or broader world view that has since been abandoned.
On the question of the unity of science, on whether the methods of science are universal or plural, and whether they are rational, see the references given for Cartwright (2007), Feyerabend (1974), Mitchell (2000; 2003), and Kellert et al. (2006). For feminist criticisms of, and alternatives to, traditional philosophy and history of science the interested reader should consult Longino (1990; 2002), Garry and Pearsall (1996), Keller and Longino (1996), and Ruetsche (2004). Clough (2004) puts forward a program combining feminism and naturalism. Among twenty-first century approaches to the historicity of science are Friedman’s dynamic a priori approach (Friedman 2001), the evolving subject-object relation of McGuire and Tuchanska (2000), and the complementary science of Hasok Chang (2004).

Finally, on the topic of the Scientific Revolution, there are the standard works of Cohen (1985), Hall (1954) and Koyré (1965); for subsequent discussion of the appropriateness of revolution as a metaphor in the historiography of science, we recommend the collection Rethinking the Scientific Revolution, edited by Osler (2000).

a. Primary Sources

  • Crombie, A. C. (1963). Scientific Change: Historical studies in the intellectual, social and technical conditions for scientific discovery and technical invention, from antiquity to the present. London: Heinemann.
  • Feyerabend, P. (1974) Against Method. London: New Left Books.
  • Feyerabend, P. (1987) Farewell to Reason. London: Verso.
  • Fleck, L. (1979). Genesis and Development of a Scientific Fact. T.J. Trenn and R.K. Merton, eds., foreword by Thomas Kuhn. Chicago: University of Chicago Press.
  • Hull, D.L. (1988). Science as a Process: An Evolutionary Account of the Social and Conceptual Development of Science. Chicago: The University of Chicago Press.
  • Kuhn, T. S. (1970). The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
  • Kuhn, T. S. (1989). Speaker’s Reply. In S. Allén, ed., Possible Worlds in Humanities, Arts, and Sciences. Berlin: de Gruyter. 49-51.
  • Lakatos, I. (1970). Falsification and the Methodology of Scientific Research Programs. In I. Lakatos and A. Musgrave, eds., Criticism and the Growth of Knowledge. Cambridge: Cambridge University Press. 91-196.
  • Laudan, L. (1977). Progress and Its Problems. Towards a Theory of Scientific Growth. Berkeley: University of California Press.
  • Laudan, L. (1996). Beyond Positivism and Relativism: Theory, Method, and Evidence. Boulder: Westview Press.
  • Toulmin, S. (1972). Human Understanding: The Collective Use and Evolution of Concepts. Princeton: Princeton University Press.

b. Secondary Sources

  • Andersen, H. (2001). On Kuhn. Belmont, CA: Wadsworth.
  • Babich, B. E. (2003). From Fleck’s Denkstil to Kuhn’s paradigm: conceptual schemes and incommensurability. International Studies in the Philosophy of Science 17: 75-92.
  • Bird, A. (2000). Thomas Kuhn. Chesham: Acumen.
  • Brorson, S. and H. Andersen (2001). Stabilizing and changing phenomenal worlds: Ludwik Fleck and Thomas Kuhn on scientific literature. Journal for General Philosophy of Science 32: 109-129.
  • Cartwright, Nancy (2007). Hunting Causes and Using Them. Cambridge: Cambridge University Press.
  • Chang, H. (2004). Inventing Temperature: Measurement and Scientific Progress. Oxford: Oxford University Press.
  • Clough, S. (2004). Having It All: Naturalized Normativity in Feminist Science Studies. Hypatia 19(1): 102-118.
  • Feyerabend, P. K. (1981). Explanation, reduction and empiricism. In Realism, Rationalism and Scientific Method: Philosophical Papers. Volume 1. Cambridge: Cambridge University Press. 44-96.
  • Friedman, M. (2001). Dynamics of Reason. Stanford: CSLI Publications.
  • Gutting, G. (1989). Michel Foucault’s Archaeology of Scientific Reason. Cambridge: Cambridge University Press.
  • Gutting, G. (2005). Continental Philosophy of Science. Oxford: Blackwell.
  • Hall, A.R. (1954). The Scientific Revolution 1500-1800. Boston: Beacon Press.
  • Hoyningen-Huene, P. (1993). Reconstructing Scientific Revolutions. Chicago: University of Chicago Press.
  • Losee, J. (2004). Theories of Scientific Progress. London: Routledge.
  • McGuire, J. E. and Tuchanska, B. (2000). Science Unfettered. Athens: Ohio University Press.
  • Mössner, N. (2011). Thought styles and paradigms – a comparative study of Ludwik Fleck and Thomas S. Kuhn. Studies in History and Philosophy of Science 42: 362-371.

i. Concepts, Cognition and Change

  • Andersen, H., Barker, P., and Chen, X. (2006). The Cognitive Structure of Scientific Revolutions. Cambridge: Cambridge University Press.
  • Champagne, A. B., Klopfer, L. E., and Anderson, J. (1980). Factors Influencing Learning of Classical Mechanics. American Journal of Physics, 48, 1074-1079.
  • Churchland, P. M. (1989). A Neurocomputational Perspective. The Nature of Mind and the Structure of Science. Cambridge, MA: MIT Press.
  • Churchland, P. M. (1992). A deeper unity: Some Feyerabendian themes in neurocomputational form. In R. N. Giere, ed., Cognitive models of science. Minnesota studies in the philosophy of science. Minneapolis: University of Minnesota Press. 341-363.
  • Clement, J. (1983). A Conceptual Model Discussed by Galileo and Used Intuitively by Physics Students. In D. Gentner and A. L. Stevens, eds. Mental Models. Hillsdale: Lawrence Earlbaum Associates. 325-340.
  • Giere, R. N. (1988). Explaining Science: A Cognitive Approach. Chicago: University of Chicago Press.
  • Hanson, N.R.(1958). Patterns of Discovery: An Inquiry into the Conceptual Foundations of Science. Cambridge: Cambridge University Press.
  • McClosky, M. (1983). Naive Theories of Motion. In D. Gentner and A. L. Stevens (Eds.), Mental Models. Hillsdale: Lawrence Erlbaum Associates. 75-98.
  • Nersessian, N. J. (1984). Faraday to Einstein: Constructing Meaning in Scientific Theories. Dordrecht: Martinus Nijhoff.
  • Nersessian, N. J. (1992). Constructing and Instructing: The Role of “Abstraction Techniques” in Creating and Learning Physics. In R.A. Duschl and R. J. Hamilton, eds. Philosophy of Science, Cognition, Psychology and Educational Theory and Practice. Albany: SUNY Press. 48-53.
  • Nersessian, N. J. (1992). How Do Scientists Think? Capturing the Dynamics of Conceptual Change in Science. In R. N. Giere, ed. Cognitive Models of Science. Minneapolis: University of Minnesota Press. 3-44.
  • Nersessian, N. J. (1995a). Should Physicists Preach What They Practice? Constructive Modeling in Doing and Learning Physics. Science and Education, 4. 203-226.
  • Nersessian, N. J. (1995b). Opening the Black Box: Cognitive Science and History of Science. Osiris, 10. 194-211.
  • Nersessian, N. J. (2008a). Creating Scientific Concepts. Cambridge MA: MIT Press.
  • Nersessian, N. J. (2008b). Mental Modelling in Conceptual Change. In S.Vosniadou, ed. International Handbook of Research on Conceptual Change. New York: Routledge. 391-416.
  • Nersessian, N., ed. (1987). The Process of Science. Dordrecht: Kluwer Academic Publishers.
  • Nersessian, N. J. and Resnick, L. B. (1989). Comparing Historical and Intuitive Explanations of Motion: Does “Naive Physics” Have a Structure. Proceedings of the Cognitive Science Society, 11. 412-420.
  • Shapere, D. (1987a). Method in the Philosophy of Science and Epistemology: How to Inquire about Inquiry and Knowledge. In N. Nersessian, ed. The Process of Science. Dordrecht: Kluwer Academic Publishers.
  • Shapere, D. (1987b). External and Internal Factors in the Development of Science. Science and Technology Studies, 1. 1-9.
  • Thagard, P. (1990). The Conceptual Structure of the Chemical Revolution. Philosophy of Science 57, 183-209.
  • Thagard, P. (1992). Conceptual Revolutions. Princeton: Princeton University Press.
  • Thagard, P. and Nowak, G. (1990). The Conceptual Structure of the Geological Revolution. In J. Shrager and P. Langley, eds. Computational Models of Scientific Discovery and Theory Formation. San Mateo: Morgan Kaufmann. 27-72.
  • Thagard, P. (1988). Computational Philosophy of Science. Cambridge: MIT Press.
  • Vosniadou, S., ed. (2008). International Handbook of Research on Conceptual Change. New York: Routledge.

ii. Feminist, Situated and Social Approaches

  • Garry, Ann and Marilyn Pearsall, eds. (1996). Women, Knowledge and Reality: Explorations in Feminist Epistemology. New York: Routledge.
  • Goldman, Alvin. (1999). Knowledge in a Social World. New York: Oxford University Press.
  • Hacking, Ian. (1999). The Social Construction of What? Cambridge: Harvard University Press.
  • Keller, Evelyn Fox and Helen Longino, eds. (1996). Feminism and Science. Oxford: Oxford University Press.
  • Kellert, Stephen H., Helen E. Longino, and C. Kenneth Waters, eds. (2006). Scientific Pluralism. Minnesota Studies in the Philosophy of Science, Volume 19. Minneapolis: University of Minnesota Press.
  • Longino, H. E. (2002). The Fate of Knowledge. Princeton: Princeton University Press.
  • Longino, H. E. (1990). Science as Social Knowledge: Values and Objectivity in Scientific Inquiry. Princeton, NJ: Princeton University Press.
  • McMullin, Ernan, ed. (1992). Social Dimensions of Scientific Knowledge. South Bend: Notre Dame University Press.
  • Ruetsche, Laura. (2004). “Virtue and Contingent History: Possibilities for Feminist Epistemology.” Hypatia, 19.1: 73–101.
  • Solomon, Miriam. (2001). Social Empiricism. Cambridge: Massachusetts Institute of Technology Press.

iii. The Scientific Revolution

  • Cohen, I. B., (1985). Revolution in Science, Cambridge: Harvard University Press.
  • Koyré, A. (1965). Newtonian Studies. Chicago: The University of Chicago Press.
  • Osler, Margaret (2000). Rethinking the Scientific Revolution. Cambridge: Cambridge University Press.

 

Author Information

Hanne Andersen
Email: hanne.andersen@ivs.au.dk
University of Aarhus
Denmark

and

Brian Hepburn
Email: bhepburn@ivs.au.dk
University of Aarhus
Denmark

Infinitism in Epistemology

This article provides an overview of infinitism in epistemology. Infinitism is a family of views in epistemology about the structure of knowledge and epistemic justification. It contrasts naturally with coherentism and foundationalism. All three views agree that knowledge or justification requires an appropriately structured chain of reasons. What form may such a chain take? Foundationalists opt for non-repeating finite chains. Coherentists (at least linear coherentists) opt for repeating finite chains. Infinitists opt for non-repeating infinite chains. Appreciable interest in infinitism as a genuine competitor to coherentism and foundationalism developed only in the early twenty-first century.

The article introduces infinitism by explaining its intuitive motivations and the context in which they arise. Next it discusses the history of infinitism, which is mostly one of neglect, punctuated by brief moments of hostile dismissal. Then there is a survey of arguments for and against infinitism.

For the most part, philosophers have assumed that knowledge requires justified belief. That is, for some proposition (statement, claim or sentence) P, if you know that P, then you have a justified belief that P. Knowledge that P thus inherits its structure from the structure of the constituent justified belief that P. If the justified belief is inferential, then so is the knowledge. If the justified belief is “basic,” then so is the knowledge. These assumptions are taken for granted in the present article.

Table of Contents

  1. Introduction
  2. Historical Discussion of Infinitism
  3. Contemporary Arguments for Infinitism
    1. The Features Argument
    2. Regress Arguments
      1. The Enhancement Argument
      2. The Interrogation Argument
    3. The Proceduralist Argument
  4. Common Objections to Infinitism
    1. The Finite Mind Objection
    2. The Proof of Concept Objection
    3. The AC/DC Objection
    4. The Unexplained Origin Objection
    5. The Misdescription Objection
  5. References and Further Reading
    1. References
    2. Further Reading

1. Introduction

We often provide reasons for the things we believe in order to justify holding the beliefs. But what about the reasons? Do we need reasons for holding those reasons? And if so, do we need reasons for holding those reasons that were offered as reasons for our beliefs? We’re left to wonder:

Does this regress ever end?

Infinitism is designed to answer that question. Given that one of the goals of reasoning is to enhance the justification of a belief, Q, infinitism holds that there are two necessary (but not jointly sufficient) conditions for a reason in a chain to be capable of enhancing the justification of Q:

  1. No reason can be Q itself, or equivalent to a conjunction containing Q as a conjunct. That is, circular reasoning is excluded.
  2. No reason is sufficiently justified in the absence of a further reason. That is, there are no foundational reasons.

If both (1) and (2) are true, then the chain of reasons for any belief is potentially infinite, that is, potentially unlimited.

The reason for accepting (1), and thereby rejecting circular reasoning as probative, that is, as tending to prove, is that reasoning ought to be able to improve the justificatory status of a belief. But if the propositional content of a belief is offered as a reason for holding the belief, then no additional justification could arise. Put more bluntly, circular reasoning begs the question by positing the very propositional content of the belief whose justificatory status the reasoning is designed to enhance.

Condition (1) is generally accepted, although some coherentists seem to condone the sort of circular reasoning that it proscribes (for example, Lehrer 1997). However, these coherentists might not actually be denying (1). Rather, they might instead be claiming that it is epistemically permissible to offer a deliverance of a cognitive faculty as a reason for believing that the faculty produces justified beliefs. On this alternative reading, these coherentists don’t deny (1), because (1) concerns the structure, not the source, of probative reasons. For example, suppose you employ beliefs produced by perception as reasons for believing that perception is reliable. This need not involve employing the proposition “perception is reliable” as one of the reasons.

Condition (2) is much more controversial. Indeed, denying (2) is a component of the dominant view in epistemology: foundationalism. Many foundationalists claim that there are beliefs, so-called “basic beliefs” or “foundational beliefs,” which do not require further reasons in order to function effectively as reasons for “non-basic” or “non-foundational” beliefs. Basic beliefs are taken to be sufficiently justified to serve as, at least, prima facie reasons for further beliefs in virtue of possessing some property that doesn’t arise from, or depend on, being supported by further reasons. For example, the relevant foundationalist property could be that the belief merely reports the contents of sensations or memories; or it could be that the belief is produced by a reliable cognitive faculty. The general foundationalist picture of epistemic justification is that foundational beliefs are justified to such an extent that they can be used as reasons for further beliefs, and that no reasons for the foundational beliefs are needed in order for the foundational beliefs to be justified.

Infinitists accept (2) and so deny that there are foundational beliefs of the sort that foundationalists champion. The motivation for accepting (2) is the specter of arbitrariness. Infinitists grant that in fact every actually cited chain of reasons ends; but infinitists deny that there is any reason which is immune to further legitimate challenge. And once a reason is challenged, then on pain of arbitrariness, a further reason must be produced in order for the challenged reason to serve as a good reason for a belief.

In addition to denying the existence of so-called basic beliefs, infinitism takes reasoning to be a process that generates an important type of justification — call it “reason-enhanced justification.” In opposition to foundationalism, reasoning is not depicted as merely a tool for transferring justification from the reasons to the beliefs. Instead, a belief’s justification is enhanced when sufficiently good reasons are offered on its behalf. Such enhancement can occur even when the reasons offered have not yet been reason-enhanced themselves. That is, citing R as a reason for Q can make one’s belief that Q reason-enhanced, even though R, itself, might not yet have been reason-enhanced.

As mentioned above, infinitists reject the form of coherentism – sometimes called “linear coherentism” – that endorses question-begging, circular reasoning. But by allowing that reasoning can generate epistemic justification, infinitists partly align themselves with another, more common form of coherentism – often called “holistic coherentism.” Holistic coherentism also accepts that reasoning can generate reason-enhanced justification (see BonJour 1985, Kvanvig 2007). As the name “holistic coherentism” indicates, epistemic justification is taken to be a property of entire sets of beliefs, rather than a property of individual beliefs. Holistic coherentism holds that individual beliefs are justified only in virtue of their membership in a coherent set of beliefs. On this view, justification does not transfer from one belief to another, as foundationalists or linear coherentists would claim; rather, the inferential relationships among beliefs in a set of propositions generate a justified set of beliefs, and individual beliefs are justified merely in virtue of being members of such a set. Sosa (1991, chapter 9) raises serious questions about whether holistic coherentism is ultimately just a disguised version of foundationalism; if Sosa is correct, then some of the objections to foundationalism would apply to holistic coherentism as well.

The argument pattern for infinitism employs the epistemic regress argument and, thus, infinitists defend their view in a manner similar to the way in which foundationalism and coherentism have been defended. This is the pattern:

  1. There are three possible, non-skeptical solutions to the regress problem: foundationalism, coherentism and infinitism.
  2. There are insurmountable difficulties with two of the solutions (in this case, foundationalism and coherentism).
  3. The third view (in this case, infinitism) faces no insurmountable difficulties.
  4. Therefore, the third view (in this case, infinitism) is the best non-skeptical solution to the regress problem.

2. Historical Discussion of Infinitism

The term ‘epistemic infinitism’ was used by Paul Moser in 1984, and the phrase “infinitist’s claim” was used by John Post in 1987. Both philosophers rejected infinitism.

Infinitism was well known by the time of Aristotle – and he rejected the view. The empiricist and rationalist philosophers of the 17th and 18th centuries rejected the view. Contemporary foundationalists and coherentists reject the view.

Indeed, it is fair to say that the history of infinitism is primarily a tale of neglect or rejection, with the possible exception of Charles Peirce (Aikin 2011, pp. 80–90; see also “Some Questions Concerning Certain Faculties Claimed for Man” in Peirce 1965, v. 5, bk. 2, pp. 135–155, esp. pp. 152–3). Some have questioned whether Peirce was defending infinitism (BonJour 1985, p. 232, n. 10; Klein 1999, pp. 320–1, n. 32). There has been some recent interest in infinitism, beginning when Peter Klein published the first in a series of articles defending the view (Klein 1998). But in the early 21st century it clearly remains a distinctly minority view about the structure of reasons.

Ever since Aristotle proposed objections to infinitism and defended foundationalism, various forms of foundationalism have dominated Western epistemology. For example, consider the epistemologies of the seventeenth and eighteenth centuries; this is the formative period in which modern philosophy shaped the issues addressed by contemporary epistemologists. Both the empiricists and rationalists were foundationalists, although they clearly disagreed about the nature of foundational reasons.

Consider this passage from Descartes’s Meditation One, where he explains his method of radical doubt:

But in as much as reason already persuades me that I ought no less carefully to withhold my assent from matters which are not entirely certain and indubitable than from those which appear to me manifestly to be false, if I am able to find in each one some reason to doubt, this will suffice to justify rejecting the whole. And for that end it will not be requisite that I should examine each in particular, which would be an endless undertaking; for owing to the fact that the destruction of the foundations of necessity brings with it the downfall of the rest of the edifice, I shall only in the first place attack those principles upon which all my former opinions rest. (Descartes 1955 [1641], p. 145)

After producing a “powerful” reason for doubting all of his former beliefs based on his senses, Descartes begins his search anew for a foundational belief that is beyond all doubt and writes in Meditation Two:

Archimedes, in order that he might draw the terrestrial globe out of its place, and transport it elsewhere demanded only that one point should be fixed and unmovable; in the same way I shall have the right to conceive high hopes if I am happy enough to discover one thing only which is certain. (Descartes 1955 [1641], p. 149)

He then happily produces what he takes—at least at that point in the Meditations – to be that one, foundational proposition:

So that after having reflected well and carefully examined all things we must come to the definite conclusion that this proposition: I am, I exist, is necessarily true each time I pronounce it, or that I mentally conceive it. (Descartes 1955 [1641], p. 150)

Regardless of the success or failure of his arguments, the point here is that Descartes clearly takes it as given that both he and the empiricist, his intended foil, will accept that knowledge is foundational and that the first tasks are to identify the foundational proposition(s) and to uncover the correct account of the nature of the foundational proposition(s). Once that is accomplished, the second task is to move beyond it (or them) to other beliefs by means of truth-preserving inferences. The Meditations presupposes a foundationalist model of reasons without any hint of argument for foundationalism.

Now consider this passage from Hume:

In a word, if we proceed not upon some fact present to the memory or senses, our reasonings would be merely hypothetical; and however the particular links might be connected with each other the whole chain of inferences would have nothing to support it, nor could we ever, by its means arrive at the knowledge of any real existence. If I ask you why you believe a particular matter of fact which you relate, you must tell me some reason; and this reason will be some other fact connected with it. But as you cannot proceed after this manner in infinitum, you must at last terminate with some fact which is present to your memory or senses or must allow that your belief is entirely without foundation. (Hume 1955 [1748], pp. 59–60)

Setting aside an evaluation of the steps in Hume’s argument for foundationalism, notice that he too simply discards infinitism with the stroke of a pen: “But as you cannot proceed after this manner in infinitum …”. To Hume, infinitism seemed so obviously mistaken that no argument against it was needed.

So why did infinitism come to be so easily and so often rejected?

The short answer is: Aristotle. His arguments against infinitism and for foundationalism were so seemingly powerful that nothing else needed to be said. We can divide Aristotle’s objections to infinitism into three types. Each pertains to the infinitist solution to the regress problem.

  • Misdescription Objection: Infinitism does not correctly describe our epistemic practices; but foundationalism does.
  • Finite Mind Objection: Our finite minds are not capable of producing or grasping an infinite set of reasons.
  • Unexplained Origin Objection: Infinitism does not provide a good account of how justification is generated and transferred by good reasoning; but foundationalism does.

We will return to Aristotle’s objections below, in section 4.

3. Contemporary Arguments for Infinitism

There are three main contemporary arguments for infinitism.

a. The Features Argument

Infinitism has been defended on the grounds that it alone can explain two of epistemic justification’s crucial features: it comes in degrees, and it can be complete (Fantl 2003). This argument concerns propositional justification, rather than doxastic justification. Propositional justification is a matter of having good reasons; doxastic justification is typically thought to be a matter of properly believing based on those reasons.

For purposes of this argument, understand infinitism as the view that a proposition Q is justified for you just in case there is available to you an infinite series of non-repeating reasons that favors believing Q. And understand foundationalism as the view that Q is justified for you just in case you have a series of non-repeating reasons that favors believing Q, terminating in a properly basic foundational reason “that needs no further reason.” And further suppose that infinitism and foundationalism are the only relevant non-skeptical alternatives for a theory of epistemic justification, so that if skepticism about justification is false, then either infinitism or foundationalism is true.

The features argument is based on two features of justification. First, justification comes in degrees. We can be more or less justified in believing some claim. An adequate theory of justification must respect this, and explain why justification comes in degrees. Call this the degree requirement on an acceptable theory of justification. Second, it’s implausible to identify adequate justification with complete justification. Adequate justification is the minimal degree of justification required for knowledge. Complete justification is maximal justification, beyond which justification cannot be increased or strengthened. An adequate theory of justification should explain how justification could be complete. Call this the completeness requirement on an acceptable theory of justification.

Infinitism satisfies the degree requirement by pointing out that length comes in degrees, which justification may mirror. Other things being equal, the longer the series of reasons you have for believing Q, the better justified Q is for you (as long as the shorter set is a proper subset of the longer set). Infinitism can satisfy the completeness requirement by offering an account of complete justification: Q is completely justified for you just in case you have an infinite array of adequate reasons (Fantl 2003: 558). To have an infinite array of reasons favoring Q, you must have available a further infinite series of reasons for each potential challenge to Q, to any of the infinite reasons in the chain supporting Q, or to any of the inferences involved in traversing any link in the chain. In short, it requires having an infinite number of infinite chains.

Can foundationalism meet the degree and completeness requirements? To assess this, we need first to explain how foundationalists understand foundational reasons. Traditional foundationalists contend that foundational reasons are self-justifying, because their mere truth suffices to justify them. The claims “I am thinking” and “There is at least one proposition that is neither true nor false” are plausible candidates for self-justifying reasons. Metajustificatory foundationalists deny that the mere truth of a foundational reason ensures its foundational status. Instead, they say, foundational reasons must have some other property, call it ‘F’. Metajustificatory foundationalists disagree among themselves over what F is. Some say it is reliability, others say it is coherence, and yet others say it is clear and distinct perception or social approval. The important point to recognize is that metajustificatory foundationalism can’t “require that a believer have access to the metajustificatory feature as a reason for the foundational reason,” because that would undermine its putative status as foundational (Fantl 2003: 541). It would effectively require a further reason for that which supposedly stood in no need of it.

Having divided all foundationalists into two jointly exhaustive and mutually exclusive groups, the argument against foundationalism goes like this:

  1. All foundationalist theories are either traditional or metajustificatory. (Premise)
  2. Traditional foundationalism can’t satisfy the degree requirement. (Premise)
  3. Metajustificatory foundationalism can’t satisfy the completeness requirement. (Premise)
  4. So no foundationalist theory can satisfy both the degree and completeness requirements (From 1–3)
  5. An adequate theory of justification must satisfy both the degree and completeness requirements. (Premise)
  6. So no foundationalist theory of justification is adequate. (From 4–5)

The argument is valid. Line 1 is trivially true, given the way the categories are defined. Line 2 is supported on the grounds that all self-justifying reasons are by definition true, and their truth justifies them. But truth doesn’t come in degrees. So traditional foundationalism lacks the resources to satisfy the degree requirement. Truth isn’t flexible enough.

Line 3 is supported on the grounds that the foundationalist will have to analyze complete justification along these lines:

Q is completely justified for you iff you have a non-repeating series of reasons for Q, ultimately founded on a reason that exemplifies the metajustificatory feature [F] to the highest possible degree. (Fantl 2003: 546)

But any such proposal must fail for a simple reason: no matter what F is, if you gain a reason to think that the foundational reason completely exemplifies F, and that exemplifying F is epistemically important, then Q will thereby become better justified for you. To see why, for the sake of argument suppose that we accept a reliabilist version of metajustificatory foundationalism, according to which Q is completely justified for you if and only if you have a non-repeating series of reasons for Q, ultimately founded on a perfectly reliable reason. Now if you gain a reason to believe that the reason is perfectly reliable, then Q will thereby become better justified for you. But then metajustificatory foundationalism hasn’t satisfied the completeness requirement after all, because it will be possible for you to increase your justification for Q beyond what the maximal exemplification of F would allow. But this violates the definition of complete justification. So metajustificatory foundationalism can’t meet the completeness requirement.

In response, foundationalists have pointed out that the reasoning in support of line 2 of the argument is undermined to the extent that a degree-theoretic conception of truth is plausible — that is, to the extent it’s plausible that truth comes in degrees. Foundationalists have also responded that the supporting reasoning for line 3 overlooks the possibility of adequate justification being over-determined. The more reasons you have that independently adequately justify Q for you, the better justified Q is for you. A natural foundationalist proposal, then, is that Q is completely justified for you if and only if it is infinitely over-determined that Q is adequately justified for you (Turri 2010).

b. Regress Arguments

There are at least two regress arguments for infinitism: the enhancement argument and the interrogation argument. Each concerns a very specific epistemic status closely connected to reasons and reasoning. Neither purports to establish that infinitism is true about all interesting epistemic statuses. Although infinitists take skepticism seriously, for the purposes of these two arguments, we’ll simply assume that skepticism is false.

i. The Enhancement Argument

The enhancement argument begins by asking a question (Klein 2005): What sort of reasoning could enhance the justification of a non-evident proposition, in a context where its truth has been legitimately questioned? What structural form would the reasons offered in the course of such reasoning take? We can divide all answers to that question into three groups. Enhancement coherentists answer that some repeating chains could enhance justification; enhancement foundationalists answer that no repeating chain could enhance justification, but some finite and non-repeating chains could; enhancement infinitists answer that no repeating or finite chain could enhance justification, but some infinite and non-repeating chains could.

The enhancement argument for infinitism is that neither coherentism nor foundationalism provides a satisfactory answer to the question posed, whereas infinitism does. Given that these three answers exhaust the (non-skeptical) alternatives, it follows that infinitism is the only satisfactory account of the epistemic status in question, which for convenience we can call rational enhancement of justification.

The objection to enhancement coherentism is that repeating chains are objectionably question-begging and so can’t rationally enhance justification. If Corrie believes Q, and someone asks her, “Why believe Q?”, and she responds by citing a chain of reasoning that relies on Q itself, then in that context she has clearly done nothing to rationally enhance her justification for Q. Her response simply presupposes the claim in question, so how could it rationally enhance her justification?

Enhancement foundationalists claim that some reasons are special: the foundational enhancers. Foundational enhancers can rationally enhance the justification for believing other things, even though they are not rationally supported by further reasons in turn. This is why some finite chains can rationally enhance justification: a foundational enhancer appropriately terminates the affair.

The objection to enhancement foundationalism is that all finite chains are objectionably arbitrary at their terminus. Suppose that Fontana believes A, and someone asks him, “Why believe A?”, and he responds by citing some reason B. But B is not a foundational enhancer, and Fontana is in turn asked, “Why believe B?” This continues until Fontana reaches the point where he cites a reason that, according to him, is a foundational enhancer. Let Z be this purported foundational enhancer. Fontana’s interlocutor presses further, “Why think that foundational enhancers are likely to be true?” In response to this last question, Fontana has three options: affirm, deny, or withhold. If he denies, then using Z as a reason is arbitrary and the reasoning can’t rationally enhance A for him. If he withholds, then, from his own point of view, he should not use Z as the basis for further beliefs: if Z is not good enough to affirm in and of itself, then it isn’t proper to use Z as a basis for affirming something else. If he affirms, then there is no immediate problem, but only because the reasoning has continued, and what was supposed to be a foundational enhancer turned out not to be one.

Enhancement infinitism avoids the problems faced by coherentism and foundationalism. It endorses neither circular reasoning nor arbitrary endpoints.

The enhancement argument for infinitism can be understood as follows:

  1. If skepticism about rational enhancement is false, then either coherentism, foundationalism or infinitism is the correct theory of rational enhancement. (Premise)
  2. Skepticism about rational enhancement is false. (Premise)
  3. Coherentism isn’t the correct theory. (Premise)
  4. Foundationalism isn’t the correct theory. (Premise)
  5. So infinitism is the correct theory of rational enhancement. (From 1–4)

Line 1 is true because the way that coherentism, foundationalism and infinitism are characterized exhausts logical space. Every rationally enhancing chain is either circular or not. If it is circular, then it’s a coherentist chain; if it isn’t, then it is either finite or infinite. If it is finite, then it is a foundationalist chain; if it is infinite, then it is an infinitist chain. Line 2 is assumed without defense in the present context, as mentioned above. Lines 3 and 4 are defended on grounds already explained: line 3 on the grounds that circular reasoning can’t rationally enhance justification, and line 4 on the grounds that arbitrary reasoning can’t do so either.

ii. The Interrogation Argument

The interrogation argument concerns “the most highly prized form of true belief” (Plato, Meno, 98a), which is the sort of knowledge that human adults take themselves to be capable of and sometimes even attain (Klein 2011). More specifically, the interrogation argument concerns one of the essential requirements of this sort of knowledge, namely, full justification.

A key idea in the infinitist’s discussion here is that distinctively human knowledge is distinguished by the importance of reasoning in attaining full justification: we make our beliefs fully justified by reasoning in support of them. The reasoning is partly constitutive of full justification, and so is essential to it. A mechanical calculator might know that 2+2=4, and a greyhound dog might know that his master is calling, but neither the calculator nor the greyhound reasons in support of their knowledge. Their knowledge is merely mechanical or brute. Adult humans are capable of such unreasoned knowledge, but we are also capable of a superior sort of knowledge involving full justification, due to the value added by reasoning.

The interrogation argument is motivated by a specific version of the regress problem, which emerges from an imagined interrogation. Suppose you believe that Q. Then someone asks you a legitimate question concerning the basis of your belief that Q. You respond by citing reason R1. You are then legitimately asked about your basis for believing R1. You cite reason R2. Then you are legitimately asked about your basis for believing R2. A pattern is emerging. How, if at all, can the reasoning resolve itself such that you’re fully justified in believing Q? Either the process goes on indefinitely, which suggests that the reasoning you engage in is fruitless because another reason is always needed; or some reason is repeated in the process, which means that you reasoned circularly and thus fruitlessly; or at some point the reasoning ends because the last reason cited isn’t supported by any other reason, which suggests that the reasoning is fruitless because it ends arbitrarily. No matter how the reasoning resolves itself, it seems, you’re no better off for having engaged in it. Thus, it can seem doubtful that any reasoning will result in a fully justified belief.

This is essentially the argument given by Sextus Empiricus (1976, lines 164-170, p. 95) to motivate a version of Pyrrhonian Skepticism. What are we to make of this problem? The infinitist agrees that circular reasoning is fruitless, and that finite reasoning ends arbitrarily and so is fruitless too. However, the infinitist disagrees with the claim that reasoning that goes on indefinitely must be fruitless. Every belief is potentially susceptible to legitimate questioning, and interrogation can, in principle, go on indefinitely. You need to be able to answer legitimate questions, and so you need available to you an indefinite number of answers. Each answer is a further reason. So, far from seeming fruitless, potentially indefinitely long reasoning seems to be exactly what is needed for the reasoning to be epistemically effective and result in full justification.

The interrogation argument for infinitism can be summarized like so:

  1. Adult human knowledge requires full justification. (Premise)
  2. Full justification requires proper reasoning. (Premise)
  3. Proper reasoning requires that there be available an infinite and non-repeating series of reasons. (Premise)
  4. So adult human knowledge requires that there be available an infinite and non-repeating series of reasons. (From 1–3)

Lines 1 and 2 can be understood as stipulating the epistemic status that the infinitist is interested in, as explained above. Line 3 is defended on the grounds that (a) circular reasoning is illegitimate, and (b) finite chains won’t suffice because every reason offered is potentially susceptible to legitimate interrogation, and full justification requires that an answer to every legitimate question be at least available to you. Foundationalists point to beliefs with an allegedly special foundational property F, which, it is claimed, suits them to put a definitive end to legitimate questioning. But, the infinitist responds, foundationalists always pick properties that they think are truth-conducive, and it is always, potentially at least, legitimate to ask, “Why think that reasons with the property F are truth-conducive?” Once this legitimate question is raised, the foundationalist must abandon the supposed foundational citadel, in search of further reasons. But this looks suspiciously like infinitism in disguise.

c. The Proceduralist Argument

The proceduralist argument for infinitism pertains to knowledge. It begins from the premise that knowledge is a “reflective success” (Aikin 2009). Reflective success requires succeeding through proper procedure. Proper procedure requires thinking carefully. Moreover, we can make our careful thinking explicit. To make our careful thinking explicit is to state our reasons. And for a reason to legitimately figure into our careful thinking, we must have a reason for thinking that it is true in turn.

We can encapsulate the proceduralist argument for infinitism like so:

  1. Knowledge is a reflective success. (Premise)
  2. Reflective success requires careful thinking. (Premise)
  3. Careful thinking requires the availability of an infinite series of reasons. (Premise)
  4. So knowledge requires the availability of an infinite series of reasons. (From 1–3)

Lines 1 and 2 can be understood as characterizing the sort of knowledge that the infinitist is interested in. (Aikin 2005 and 2009 strongly suggest that this is knowledge ordinarily understood, though the matter is not entirely clear.) Line 3 is defended by appeal to a guiding intuition, namely, that if you know, then you can properly answer all questions about your belief and your reasons. But in principle there are an infinite number of questions about your belief and your reasons. And no proper answer will implicate you in question-begging circularity. So, in principle you need an infinite number of answers (Aikin 2009: 57–8). If there were a proper stopping point in the regress of reasons, then beliefs at the terminus would not be susceptible to legitimate challenges from those who disagree. Your opponents would be simply mistaken for challenging you at this point. But it doesn’t seem like there even is a point where your opponents must be simply mistaken for challenging you.

What about the examples featured prominently by foundationalists? For example, what about your belief that 2+2=4, or that you have a headache (when you do have one)? It can easily seem implausible that a challenge to these beliefs must be legitimate. It can easily seem that someone who questioned you on these matters would be simply mistaken. The infinitist disagrees. We should always be able to offer reasons. At the very least, careful thinking requires us to have an answer to the question, “Are our concepts of a headache or addition fit for detecting the truth in such matters?” Even if we think there are good answers to such questions, the infinitist claims, the important point is that we need those answers in order to think carefully and, in turn, gain knowledge.

Infinitism can appear counterintuitive because, as a matter of fact, we never answer very many questions about any of our beliefs, but we ascribe knowledge to people all the time. But this is an illusion, the infinitist claims, because we often carelessly attribute knowledge, or attribute knowledge for practical reasons that aren’t sensitive to the attribution’s literal truth.

4. Common Objections to Infinitism

a. The Finite Mind Objection

For most cases of effective reasoning, justified belief or knowledge, infinitism requires more of us than we can muster. We have finite lives and finite minds. Given the way that we are actually constituted, we cannot produce an infinite series of reasons. So skepticism is the immediate consequence of any version of infinitism that requires us to produce an infinite series of reasons (Fumerton 1995; compare BonJour 1976: 298, 310 n. 22).

In a remark in the Posterior Analytics reflecting his general worries about regresses, Aristotle gives a reason for rejecting infinitism: “one cannot traverse an infinite series.” But if one cannot traverse an infinite series of reasons, then if infinitism is the correct account of justification, then skepticism is the correct view. We cannot traverse an infinite series of reasons because we have finite minds. It is useful to quote the passage in full, because it is also a famous passage advocating a regress argument for foundationalism.

Aristotle expresses dissatisfaction with both infinitism and question-begging coherentism, and so opts for foundationalism. He writes:

Some hold that, owing to the necessity of knowing primary premisses, there is no scientific knowledge. Others think there is, but that all truths are demonstrable. Neither doctrine is either true or a necessary deduction from the premisses. The first school, assuming that there is no way of knowing other than by demonstration, maintain that an infinite regress is involved, on the ground that if behind the prior stands no primary, we could not know the posterior through the prior (wherein they are right, for one cannot traverse an infinite series [emphasis added]); if on the other hand – they say – the series terminates and there are primary premisses, yet these are unknowable because incapable of demonstration, which according to them is the only form of knowledge.

And since thus [sic] one cannot know the primary premisses, knowledge of the conclusions which follow from them is not pure scientific knowledge nor properly knowing at all, but rests on the mere supposition that the premisses are true. The other party agree with them as regards to knowing, holding that it is possible only by demonstration, but they see no difficulty in holding that all truths are demonstrated on the ground that demonstration may be circular or reciprocal. (72b5–18)

Aristotle here focuses on “scientific knowledge” and syllogistic “demonstration.” But his remarks are no less plausible when taken to apply to all knowledge and reasoning. Aristotle himself hints at this with his comment about “knowing at all.”

The spirit of Aristotle’s original finite-mind objection is alive and well in contemporary epistemology. Here is a representative example:

The [proposed] regress of justification of S’s belief that p would certainly require that he hold an infinite number of beliefs. This is psychologically, if not logically, impossible. If a man can believe an infinite number of things, then there seems to be no reason why he cannot know an infinite number of things. Both possibilities contradict the common intuition that the human mind is finite. Only God could entertain an infinite number of beliefs. But surely God is not the only justified believer. (Williams 1981, p. 85)

But infinitists have been careful not to claim that we must actually produce an infinite series of reasons. Rather, they typically say that we must have an appropriately structured, infinite set of reasons available to us. One might worry, however, that even this milder requirement is too demanding: it is not clear that we could so much as understand an infinite series of reasons, and being able to understand a series of reasons is required, at least on some senses of “available,” for that series to be available to us as reasons. So even the milder infinitist requirement might lead to skepticism.

b. The Proof of Concept Objection

Contrary to what was suggested at the end of the previous objection, it seems that we could understand an infinite series, provided that each element in the series was simple enough. And it doesn’t seem impossible for a justificatory chain to include only simple enough elements.

Grant that it’s possible that every element of an infinite series could be comprehensible to us. But what evidence is there that there actually are such series? And what evidence is there that, for at least most of the things that we justifiably believe (or most of the things we know, or most of the acceptable reasoning we engage in), there is a properly structured infinite series available to us? Unless infinitists can convincingly respond to these questions — unless they can offer a proof of concept — then it seems likely that infinitism leads to skepticism.

The objection can be made more pointed by pairing it with the finite mind objection. To handle the finite mind objection, infinitists deny that you need to actually produce the infinite series of reasons in order for your belief to be justified. Just having the reasons available, and producing enough of them to satisfy contextual demands, suffices to justify your belief. But since contextual demands are never so stringent as to demand more than, say, ten reasons, we’re left with no actual example of a chain that seems a promising candidate for an infinite series (Wright 2011: section 3).

At least one example has been given of a readily available infinite chain of reasons, but ironically it is one compatible with foundationalism, offered by a foundationalist in response to infinitism (Turri 2009). (Peijnenburg and Atkinson 2011 sketch some formal possibilities and provide an analogy with heritable traits.)

c. The AC/DC Objection

For any proposition we might believe, both it and its denial can be supported by similar, appropriately structured infinite chains of reasons (Post 1980: 32–7; Aikin 2005: 198–9; Aikin 2008: 182–3). Importantly, neither chain of reasons is, in any meaningful sense, more available to us than the other. To appreciate the point, suppose you are inquiring into whether P. An infinite affirming chain could be constructed like so:

Affirmation chain (AC)

Q & (Q → P)

R & (R → (Q & (Q → P)))

S & (S → (R & (R → (Q & (Q → P)))))

whereas an infinite denial chain could be constructed like so:

Denial chain (DC)

Q & (Q → ~P)

R & (R → (Q & (Q → ~P)))

S & (S → (R & (R → (Q & (Q → ~P)))))

It is an equally long way to the top of each chain, but which is, so to speak, the road to epistemic heaven, and which the road to hell? Having one such chain available to you isn’t a problem, but having both available is a touch too much (at least in non-paradoxical cases), and infinitism lacks the resources to eliminate one.
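The parallel structure of the two chains can be made explicit. As a purely illustrative sketch (the function name and formula strings here are our own devices, not drawn from the literature), the following builds the first few links of either chain from a target proposition:

```python
# Illustrative sketch: the AC and DC chains share one recursive structure;
# only the target proposition ("P" versus "~P") differs.

def build_chain(target, supports, n):
    """Build the first n links of a support chain for `target`.

    Each link has the form "X & (X -> <previous link or target>)",
    mirroring the AC/DC construction in the text.
    """
    links = []
    supported = target           # the proposition the next link must support
    for reason in supports[:n]:
        link = f"{reason} & ({reason} -> {supported})"
        links.append(link)
        supported = f"({link})"  # the new link itself needs support next
    return links

# The affirmation chain supports P; the denial chain supports ~P.
ac = build_chain("P", ["Q", "R", "S"], 3)
dc = build_chain("~P", ["Q", "R", "S"], 3)
# ac[0] == "Q & (Q -> P)"; dc[0] == "Q & (Q -> ~P)"
```

The single parameter that distinguishes the two chains is the target fed in at the start, which is the heart of the objection: nothing in the construction itself gives grounds for preferring one chain to the other.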

A further worry is that if infinitists embrace additional resources to eliminate one of these chains, those very same resources could in turn form the basis of a satisfactory finitist epistemology (Cling 2004: section 5). Aikin 2008 defends a version of infinitism, “impure infinitism,” intended to address this problem by incorporating elements of foundationalism; and Klein has argued that specifying the conditions for the availability of reasons will eliminate the possibility of both chains being available in non-paradoxical cases.

d. The Unexplained Origin Objection

Aristotle begins the Posterior Analytics with this statement: “All instruction given or received by way of argument proceeds from pre-existent knowledge.” And later in the Posterior Analytics, after having rejected both infinitism and question-begging coherentism as capable of producing knowledge, he writes:

Our own doctrine is that not all knowledge is demonstrative; on the contrary, knowledge of the immediate premisses is independent of demonstration. (The necessity of this is obvious; for since we must know the prior premisses from which the demonstration is drawn, and since the regress must end in immediate truths, those truths must be indemonstrable.) Such, then, is our doctrine, and in addition we maintain that besides scientific knowledge there is an originative source which enables us to recognize the definitions [that is, the first principles of a science]. (72b18–24)

What is this “originative source” and how does it produce knowledge not based on reasoning? The answer is a proto-reliabilist one that relies on humans having a “capacity of some sort” (99b33) that produces immediate (non-inferential) knowledge. Although most contemporary reliabilists will not take the foundational propositions employed in demonstration to be the first principles of a science, they will take foundational beliefs to result from the operation of some capacities humans possess that do not employ conscious reasoning (Goldman 2008).

Here is Aristotle’s account of the “originative source” of justified beliefs:

But though sense-perception is innate in all animals, in some perception comes to persist, in others it does not. So animals in which this persistence does not come to be have either no knowledge at all outside of the act of perceiving, or no knowledge of objects of which no impression persists; animals in which it does come into being have perception and can continue to retain the sense-impression in the soul; and when such persistence is frequently repeated a further distinction at once arises between those which out of persistence of such sense impressions develop a power of systematizing them and those which do not. So out of sense perception comes to be what we call memory, and out of frequently repeated memories of the same thing develops experience; for a number of memories constitute a single experience. From experience … originate the skill of the craftsman and the knowledge of the man of science. (99b36–100a5)

Thus, Aristotle holds that foundationalism can explain how justification can arise in basic beliefs and how it is transmitted through reasoning to non-foundational beliefs. This, he claims, contrasts with infinitism and question-begging coherentism, which have no way of explaining how justification arises. He seems to assume that reasoning cannot originate justification, but can merely transmit it. If each belief were to depend on another for its justification, then there would be no originative source, or starting point, that generates the justification in the first place.

Writing in the second century AD, Sextus Empiricus wondered how we might show that believing a proposition is better justified than the alternatives of either disbelieving it or suspending judgment. He employed the “unexplained origin objection” to reject an infinitist attempt to show how believing could be better justified. He argues that infinitism must lead to suspension of judgment.

The Mode based upon regress ad infinitum is that whereby we assert that the thing adduced as a proof of the matter proposed needs a further proof, and this another again, and so on ad infinitum, so that the consequence is suspension [of judgment], as we possess no starting-point for our argument. (1976, I: 164–9)

The unexplained origin objection remains popular today. Carl Ginet, a contemporary foundationalist, puts it this way:

A more important, deeper problem for infinitism is this: Inference cannot originate justification, it can only transfer it from premises to conclusion. And so it cannot be that, if there actually occurs justification, it is all inferential. (Ginet 2005, p. 148)

Jonathan Dancy, another contemporary foundationalist, makes a similar point:

Suppose that all justification is inferential. When we justify belief A by appeal to belief B and C, we have not yet shown A to be justified. We have only shown that it is justified if B and C are. Justification by inference is conditional justification only; A’s justification is conditional upon the justification of B and C. But if all justification is conditional in this sense, then nothing can be shown to be actually non-conditionally justified. (Dancy 1985, p. 55)

e. The Misdescription Objection

In the Metaphysics, Aristotle writes:

There are … some who raise a difficulty by asking, who is to be the judge of the healthy man, and in general who is likely to judge rightly on each class of questions. But such inquiries are like puzzling over the question whether we are now asleep or awake. And all such questions have the same meaning. These people demand that a reason shall be given for everything; for they seek a starting point, and they seek to get this by demonstration, while it is obvious from their actions that they have no such conviction. But their mistake is what we have stated it to be; they seek a reason for things for which no reason can be given; for the starting point of demonstration is not demonstration. (1011a2–14)

The point of this objection is that, assuming that skepticism is false, infinitism badly misdescribes the structure of reasons supporting our beliefs, as revealed by or expressed in our actual deliberative practices. Our actual practices do not display what infinitism would predict (again, assuming that skepticism is false).

Of the three objections to infinitism presented by Aristotle, this one has gained the least traction in contemporary epistemology. This might be because it rests on two easily challenged assumptions: (i) a theory of justification can be tested by determining whether our actual deliberations meet its demands; (ii) our actual deliberations meet foundationalism’s demands. Regarding (i), can we test an ethical theory by determining whether our actual behavior meets its demands? (Let us hope not!) If not, then why should we accept (i)? Regarding (ii), would a foundationalist accept the following as a foundational proposition: “The train schedule says so”? Such claims often end deliberation about when the next train departs. But it’s not the sort of proposition that foundationalists have taken to be basic.

5. References and Further Reading

a. References

  • Aikin, S., 2005, “Who Is Afraid of Epistemology’s Regress Problem,” Philosophical Studies 126: 191–217.
  • Aikin, S., 2008, “Meta-epistemology and the Varieties of Epistemic Infinitism,” Synthese 163: 175–185.
  • Aikin, S., 2009, “Don’t Fear the Regress: Cognitive Values and Epistemic Infinitism,” Think Autumn 2009: 55–61.
  • Aikin, S., 2011, Epistemology and the Regress Problem, Routledge.
  • Aristotle, Metaphysics.
  • Aristotle, Posterior Analytics.
  • BonJour, L., 1976, “The Coherence Theory of Empirical Knowledge,” Philosophical Studies 30: 281–312.
  • Cling, A., 2004, “The Trouble with Infinitism,” Synthese 138: 101–123.
  • Dancy, J., 1985, Introduction to Contemporary Epistemology, Blackwell.
  • Descartes, R., 1955 [1641], Meditations on First Philosophy, in Philosophical Works of Descartes, trans. and ed. by E.S. Haldane and G.R.T. Ross, v. 1, Dover.
  • Fantl, J., 2003, “Modest Infinitism,” Canadian Journal of Philosophy 33: 537–62.
  • Fumerton, R., 1995, Metaepistemology and Skepticism, Rowman & Littlefield.
  • Ginet, C., 2005, “Infinitism is Not the Solution to the Regress Problem,” Contemporary Debates in Epistemology, ed. M. Steup and E. Sosa, Blackwell.
  • Goldman, A., 2008, “Immediate Justification and Process Reliabilism,” Epistemology: New Essays, ed. Q. Smith, Oxford University Press.
  • Hume, D., 1955 [1748], An Inquiry Concerning Human Understanding, ed Charles Hendel, Bobbs-Merrill Company.
  • Klein, P., 1998, “Foundationalism and the Infinite Regress of Reasons,” Philosophy and Phenomenological Research 58: 919–925.
  • Klein, P., 1999, “Human Knowledge and the Infinite Regress of Reasons,” J. Tomberlin, ed., Philosophical Perspectives 13: 297–325.
  • Klein, P., 2005, “Infinitism is the Solution to the Regress Problem,” in M. Steup and E. Sosa, eds., Contemporary Debates in Epistemology, Blackwell.
  • Klein, P., 2012, “Infinitism and the Epistemic Regress Problem,” in S. Toldsdorf, ed., Conceptions of Knowledge, de Gruyter.
  • Lehrer, K., 1997, Self-Trust, Oxford University Press.
  • Moser, P., 1984, “A Defense of Epistemic Intuitionism,” Metaphilosophy 15: 196–209.
  • Peijnenburg, J. and D. Atkinson, 2011, “Grounds and Limits: Reichenbach and Foundationalist Epistemology,” Synthese 181: 113–124.
  • Peirce, C.S., 1965, Collected papers of Charles Sanders Peirce, ed. Charles Hartshorne and Paul Weiss, Harvard University Press.
  • Plato, Meno.
  • Post, J., 1980, “Infinite Regresses of Justification and of Explanation,” Philosophical Studies 38: 31–52.
  • Post, J., 1984, The Faces of Existence, Cornell University Press.
  • Sextus Empiricus, 1976, Outlines of Pyrrhonism, Harvard University Press.
  • Sosa, E., 1991, Knowledge in Perspective, Cambridge University Press.
  • Turri, J., 2009, “On the Regress Argument for Infinitism,” Synthese 166: 157–163.
  • Turri, J., 2010, “Foundationalism for Modest Infinitists,” Canadian Journal of Philosophy 40: 275–284.
  • Wright, S., 2011, “Does Klein’s Infinitism Offer a Response to Agrippa’s Trilemma?” Synthese, DOI 10.1007/s11229-011-9884-x.

b. Further Reading

  • Atkinson, D. and J. Peijnenburg, 2009, “Justification by an Infinity of Conditional Probabilities,” Notre Dame Journal of Formal Logic 50: 183–93.
  • Coffman, E.J. and Howard-Snyder, D. 2006, “Three Arguments Against Foundationalism: Arbitrariness, Epistemic Regress, and Existential Support,” Canadian Journal of Philosophy 36.4: 535–564.
  • Klein, P., 2000, “The Failures of Dogmatism and a New Pyrrhonism,” Acta Analytica 15: 7-24.
  • Klein, P., 2003a, “How a Pyrrhonian Skeptic Might Respond to Academic Skepticism,” S. Luper, ed., The Skeptics: Contemporary Essays, Ashgate Press.
  • Klein, P., 2003b, “When Infinite Regresses Are Not Vicious,” Philosophy and Phenomenological Research 66: 718-729.
  • Klein, P., 2004a, “There is No Good Reason to be an Academic Skeptic,” S. Luper, ed., Essential Knowledge, Longman Publishers.
  • Klein, P., 2004b, “What IS Wrong with Foundationalism is that it Cannot Solve the Epistemic Regress Problem,” Philosophy and Phenomenological Research 68: 166-171.
  • Klein, P., 2005b, “Infinitism’s Take on Justification, Knowledge, Certainty and Skepticism,” Veritas 50: 153-172.
  • Klein, P., 2007a, “Human Knowledge and the Infinite Progress of Reasoning,” Philosophical Studies 134: 1-17.
  • Klein, P., 2007b, “How to be an Infinitist about Doxastic Justification,” Philosophical Studies 134: 25-29.
  • Klein, P., 2008, “Contemporary Responses to Agrippa’s Trilemma,” J. Greco, ed., The Oxford Handbook of Skepticism, Oxford University Press.
  • Klein, P., 2011, “Infinitism,” S. Bernecker and D. Pritchard, eds., Routledge Companion to Epistemology, Routledge.
  • Peijnenburg, J., 2007, “Infinitism Regained,” Mind 116: 597–602.
  • Peijnenburg, J., 2010, “Ineffectual Foundations: Reply to Gwiazda,” Mind 119: 1125–1133.
  • Peijnenburg, J. and D. Atkinson, 2008, “Probabilistic Justification and the Regress Problem,” Studia Logica 89: 333–41.
  • Podlaskowski, A.C. and J.A. Smith, 2011, “Infinitism and Epistemic Normativity,” Synthese 178: 515–27.
  • Turri, J., 2009, “An Infinitist Account of Doxastic Justification,” Dialectica 63: 209–18.
  • Turri, J., 2012, “Infinitism, Finitude and Normativity,” Philosophical Studies, DOI: 10.1007/s11098-011-9846-7.


Author Information

Peter D. Klein
Email: pdklein@rci.rutgers.edu
Rutgers University, New Brunswick
U. S. A.

and

John Turri
Email: John.turri@gmail.com
University of Waterloo
Canada

Applied Ethics

Under what conditions is an abortion morally permissible?  Does a citizen have a moral obligation to actively participate (perhaps by voting) in the democratic process of one’s nation (assuming one is living in a democracy)?  What obligations, if any, does one have to the global poor?  Under what conditions is female genital excision morally permissible?  If there are conditions under which it is morally wrong, what measures, if any, should be taken against the practice?  These are just some of the thousands of questions that applied ethicists consider. Applied ethics is often referred to as a component study of the wider sub-discipline of ethics within the discipline of philosophy. This does not mean that only philosophers are applied ethicists, or that fruitful applied ethics is only done within academic philosophy departments. In fact, there are those who believe that a more informed approach is best gotten outside of the academy, or at least certainly outside of philosophy. This article, though, will mostly focus on how applied ethics is approached by trained academic philosophers, or by those trained in very closely related disciplines.

This article first locates applied ethics as distinct from, but nevertheless related to, two other branches of ethics. Since the content of what is studied by applied ethicists is so varied, and since working knowledge of the field requires considerable empirical knowledge, and since historically the pursuit of applied ethics has been done by looking at different kinds of human practices, it only makes sense that there will be many different kinds of applied ethical research, such that an expert working in one kind will not have much to say in another. For example, business ethics is a field of applied ethics, and so too is bioethics. There are plenty of experts in one field who have nothing to say in the other. This article discusses each field, highlighting just some of the many issues that fall within each. Throughout the presentation of the different areas of applied ethics, some methodological issues continue to come up. Additionally, the other two branches of ethics are consulted in dealing with many of the issues of almost all the different fields. So, what may be a methodological worry for a business ethics issue may also be a worry for bioethical issues.

One particular kind of applied ethics that raises distinct concerns is bioethics. Whereas with other kinds of applied ethics it is usually implicit that the issue involves those who we already know to have moral standing, bioethical issues, such as abortion, often involve beings whose moral standing is much more contentious. Our treatment of non-human animals is another area of bioethical research that often hinges on what moral standing these animals have. As such, it is important that this article devote a section to the issues that arise concerning moral standing and personhood.

This article ends with a discussion of the role of moral psychology in applied ethics, and in particular how applied ethicists might appropriate social psychological knowledge for the purpose of understanding the role of emotion in the formation of moral judgments. Additionally, to what extent is it important to understand the role of culture in not only what is valued but in how practices are to be morally evaluated?

Table of Contents

  1. Applied Ethics as Distinct from Normative Ethics and Metaethics
  2. Business Ethics
    1. Corporate Social Responsibility
    2. Corporations and Moral Agency
    3. Deception in Business
    4. Multinational Enterprises
  3. Bioethics
    1. Beginning of Life Issues, including Abortion
    2. End of Life Issues
    3. Research, Patients, Populations, and Access
  4. Moral Standing and Personhood
    1. Theories of Moral Standing and Personhood
    2. The Moral Status of Non-Human Animals
  5. Professional Ethics
    1. What is a Profession?
    2. Engineering Ethics
  6. Social Ethics, Distributive Justice, and Environmental Ethics
    1. Social Ethics
    2. Distributive Justice, and Famine Relief
    3. Environmental Ethics
  7. Theory and Application
  8. References and Further Reading

1. Applied Ethics as Distinct from Normative Ethics and Metaethics

One way of categorizing the field of ethics (as a study of morality) is by distinguishing between its three branches, one of them being applied ethics. By contrasting applied ethics with the other branches, one can get a better understanding of what exactly applied ethics is about. The three branches are metaethics, normative ethics (sometimes referred to as ethical theory), and applied ethics. Metaethics deals with whether morality exists. Normative ethics, usually assuming an affirmative answer to the existence question, deals with the reasoned construction of moral principles, and at its highest level, determines what the fundamental principle of morality is. Applied ethics, also usually assuming an affirmative answer to the existence question, addresses the moral permissibility of specific actions and practices.

Although there are many avenues of research in metaethics, one main avenue starts with the question of whether or not moral judgments are truth-apt. The following will illuminate this question. Consider the following claims:  ‘2+2=4’, ‘The volume of an organic cell expands at a greater rate than its surface area’, ‘AB=BA, for all A,B matrices’, and ‘Joel enjoys white wine.’  All of these claims are either true or false; the first two are true, the latter two are false, and there are ways in which to determine the truth or falsity of them. But how about the claim ‘Natalie’s torturing of Nate’s dog for the mere fun of it is morally wrong’?  A large proportion of people, and perhaps cross-culturally, will say that this claim is true (and hence truth-apt). But it’s not quite as obvious how this claim is truth-apt in the way that the other claims are truth-apt. There are axioms and observations (sometimes through scientific instruments) which support the truth-aptness of the claims above, but it’s not so clear that truth-aptness is gotten through these means with respect to the torturing judgment. So, it is the branch of metaethics that deals with this question, and not applied ethics.

Normative ethics is concerned with principles of morality. This branch itself can be divided into various sub-branches (and in various ways):  consequentialist theories, deontological theories, and virtue-based theories. A consequentialist theory says that an action is morally permissible if and only if it maximizes overall goodness (relative to its alternatives). Consequentialist theories are specified according to what they take to be (intrinsically) good. For example, classical utilitarians considered intrinsic goodness to be happiness/pleasure. Modern utilitarians, on the other hand, define goodness in terms of things like preference-satisfaction, or even well-being. Other kinds of consequentialists will consider less subjective criteria for goodness. But, setting aside the issue of what constitutes goodness, there is a rhetorical argument supporting consequentialist theories:  How could it ever be wrong to do what’s best overall?  (I take this straight from Robert N. Johnson.)  Although intuitively the answer is that it couldn’t be wrong to do what’s best overall, there is a plenitude of purported counterexamples to consequentialism on this point – on what might be called “the maximizing component” of consequentialism. For example, consider the Transplant Problem, in which the only way to save five dying people is by killing one person for organ transplantation to the five. Such counterexamples draw upon another kind of normative/ethical theory – namely, deontological theory. Such theories either place rights or duties as fundamental to morality. The idea is that there are certain constraints placed against persons/agents in maximizing overall goodness. One is not morally permitted to save five lives by cutting up another person for organ transplantation because the one person has a right against any person to be treated in this way.
Similarly, there is a duty for all people to make sure that they do not treat others in a way that merely makes them a means to the end of maximizing overall goodness, whatever that may be. Finally, we have virtue theories. Such theories are motivated by the idea that what’s fundamental to morality is not what one ought to do, but rather what one ought to be. But given that we live in a world of action, of doing, the question of what one ought to do creeps up. Therefore, according to such theories, what one ought to do is what the ideally virtuous person would do. What should I do?  Well, suppose I’ve become the kind of person I want to be. Then whatever I do from there is what I should do now. This theory is initially appealing, but nevertheless, there are lots of problems with it, and we cannot get into them in an article like this.

Applied ethics, unlike the other two branches, deals with questions that started this article – for example, under what conditions is an abortion morally permissible?  And, what obligations, if any, do we have toward the world’s global poor?  Notice the specificity compared to the other two branches. Already, though, one might wonder whether the way to handle these applied problems is by applying one of the branches. So, if it’s the case that morality doesn’t exist (or: moral judgments are not truth-apt), then we can just say that any claims about the permissibility of abortion or global duties to the poor are not true (in virtue of not being truth-apt), and there is therefore no problem; applied ethics is finished. It’s absolutely crucial that we are able to show that morality exists (that moral judgments are truth-apt) in order for applied ethics to get off the ground.

Actually, this may be wrong. It might be the case that even if we are in error about morality existing, we can nevertheless give reasons which support our illusions in specified cases. More concretely, suppose there really is no truth of the matter about the moral permissibility of abortion; that does not stop us from considering whether we should have legislation that places constraints on it. Perhaps there are other reasons which would support answers to this issue. The pursuit and discussion of these (purported) reasons would be an exercise in applied ethics. Similarly, suppose that there is no such thing as a fundamental principle of morality; this does not exclude, for one thing, the possibility of actions and practices being morally permissible or impermissible/wrong. Furthermore, suppose we go with the idea that there is a finite list of principles that comprise a theory (with no principle being fundamental). There are those who think that we can determine, and explain, the rightness/wrongness of actions and practices without this list of non-fundamental principles. (We’ll look at this later in this article.)  If this is the case, then we can do applied ethics without an explicit appeal to normative ethics.

In summary, we should consider whether or not the three branches are as distinct as we might think they are. Of course, the principal questions of each are distinct, and as such, each branch is in fact distinct. But it appears that in doing applied ethics one must (or, less strongly, may) venture into the other two branches. Suppose that one wants to come to the conclusion that our current treatment of non-human animals – more specifically, our treatment of chickens in their mass production in chicken warehouses – is morally impermissible. Then, if one stayed away from consequentialist theories, one would have either a deontological or virtue-based theory with which to approach this issue. Supposing one dismissed virtue theory (on normative ethical grounds), one would then approach the issue from deontology. Suppose further that one chose a rights-based theory. Then one would have to defend the existence of rights, or at least appeal to a defense of rights found within the literature. What reasons do we have to think that rights exist?  This looks like a metaethical question. As such, even before being able to address the issue of whether we’re doing right by chickens in our mass slaughtering of them, we have to do some normative ethics and metaethics. Yes, the three branches are distinct, but they are also related.

2. Business Ethics

Some people might think that business ethics is an oxymoron. How can business, with all of its shady dealings, be ethical?  This is a view that can be taken even by well-educated people. But in the end, such a position is incorrect. Ethics is a study of morality, and business practices are fundamental to human existence, dating back at least to agrarian society, if not to pre-agrarian existence. Business ethics, then, is a study of the moral issues that arise when human beings exchange goods and services, where such exchanges are fundamental to our daily existence. Not only is business ethics not oxymoronic; it is important.

a. Corporate Social Responsibility

One important issue concerns the social responsibility of corporate executives, in particular those taking on the role of CEO. In an important sense, it is stockholders, and not corporate executives (via their role as executives), who own a corporation. As such, a CEO is an employee, not an owner, of a corporation. And who is their employer?  The stockholders. To whom are the CEO and other executives directly accountable?  The board of directors, representing the stockholders. As such, there is the view, taken by so-called stockholder theorists, that the sole responsibility of a CEO is to do what the stockholders demand (as expressed by the collective decision of the board of directors), and usually that demand is to maximize profits. Therefore, according to stockholder theory, the sole responsibility of the CEO is to maximize profit through their business abilities and knowledge (Friedman, 1967).

The contesting viewpoint is stakeholder theory. Stakeholders include not just stockholders but also employees, consumers, and communities. In other words, anyone who has a stake in the operations of a corporation is a stakeholder of that corporation. According to stakeholder theory, a corporate executive has moral responsibilities to all stakeholders. Thus, although some corporate ventures and actions might maximize profit, they may conflict with the demands of employees, consumers, or communities. Stakeholder theory very nicely accounts for what some might consider a pre-theoretical commitment – namely, that an action should be assessed in terms of how it affects everyone affected by it, not just a select group based on something morally arbitrary. Stakeholder theorists can claim that the stakeholders are everyone affected by a business’s decision, not just the stockholders; to consider only stockholders is to focus on a select group on morally arbitrary grounds.

There are at least two problems for stakeholder theory worth discussing. First, as mentioned above, there are conflicts between stockholders and the rest of the stakeholders. A stakeholder account has to handle such conflicts, and there are various ways of doing so. For example, some theorists take a Rawlsian approach, by which corporate decisions are to be made in accordance with what will benefit the least well-off (Freeman, 2008). Another kind of Rawlsian approach is to endorse the use of the veil of ignorance without appeal to the Difference Principle, whereby it might turn out that what is morally correct is actually more in line with the stockholders (Dittmer, 2010). Additionally, there are other decision-making principles to which one could appeal in order to resolve conflict. Such stakeholder theories will then be assessed according to the plausibility of their decision-making principles (for resolving conflict) and their ability to achieve intuitive results in particular cases.
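One standard way of reading the first Rawlsian approach above is as a maximin decision rule: among the available corporate options, pick the one whose worst-off stakeholder group fares best. The sketch below is a minimal illustration of that rule, not Freeman's own formulation; the option names and payoff numbers are hypothetical.

```python
# Maximin (Rawlsian) choice among corporate options: select the option
# that maximizes the welfare of its least well-off stakeholder group.
# Payoff numbers are hypothetical, for illustration only.

def maximin_choice(options):
    """options: dict mapping option name -> dict of stakeholder payoffs."""
    return max(options, key=lambda o: min(options[o].values()))

options = {
    "offshore production":   {"stockholders": 9, "employees": 1, "community": 3},
    "keep production local": {"stockholders": 4, "employees": 5, "community": 5},
}

print(maximin_choice(options))  # keep production local
```

Note the contrast with a pure profit-maximizing (stockholder) rule, which would pick "offshore production" here since it pays stockholders best; the two theories come apart exactly when the profitable option leaves some stakeholder group worst off.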

Another challenge for some stakeholder theories is to make metaphysical sense of such entities as communities, as well as to make sense of potentially affecting a group of people. If a corporate decision is criticized in terms of its affecting a community, then we should keep in mind what is meant by community. It is not as if there is an actual person that is a community. As such, it is hard to understand how a community can be morally wronged in the way a person can be wronged. Furthermore, if the decisions of a corporate executive are to be measured according to stakeholder theory, then we need to be clearer about who counts as a stakeholder. There are plenty of products and services that could potentially affect a number of people that we might not initially consider. Should such potentially affected people be counted as stakeholders?  This is a question for stakeholder theorists to consider. Stockholder theorists could even use this question as a rhetorical push for their own theory.

b. Corporations and Moral Agency

In the media, corporations are portrayed as moral agents: “Microsoft unveiled their latest software,” “Ford morally blundered with their decision not to refit their Pinto with the rubber bladder design,” and “Apple has made strides to be the company to emulate” are the types of comments heard on a regular basis. Independently of whether or not these claims are true, each of these statements relies on there being such a thing as corporate agency. More specifically, given that intuitively corporations do things that result in morally good and bad outcomes, it makes sense to ask whether such corporations are the kind of entities that can be moral agents. For instance, take an individual human being of normal intelligence. Many of us are comfortable with judging her actions as morally right or wrong, and also with the idea that she is a moral agent, eligible for moral evaluation. The question relative to business ethics is:  Are corporations moral agents?  Are they the kind of thing capable of being evaluated in such a way as to determine whether they are morally good or bad?

There are those who think so. Peter French has argued that corporations are moral agents. It is not just that we can evaluate such entities as shorthand for the major players involved in corporate practices and policies; instead, there is a thing over and above the major players – the corporation – and it is this thing that can be morally evaluated. French postulates what’s called a “Corporate Internal Decision Structure” (CID structure), whereby we can understand a corporation, over and above its major players, as a moral agent. French astutely observes that any being that is a moral agent has to be capable of intentionality – that is, the being has to have intentions. It is through the CID structure that we can make sense of a corporation as having intentions, and as such as being a moral agent (French, 1977). One intuitive idea driving CID structures as supporting the intentionality of corporations is that there are rules and regulations within a corporation that drive it to make decisions that no one individual within it can make. Certain decisions might require either majority or unanimous approval of all individuals recognized in the decision-making process. Those decisions are then a result of the rules regulating what is required for a decision, and not the particular go-ahead of any individual. As such, we have intentionality independent of any particular human agent.

But there are those who oppose this idea of corporate moral agency, and for various reasons. In being a moral agent, it is usually granted that one then gets to have certain rights. (Notice here a metaethical and normative ethical issue concerning the status of rights and whether or not to think of morality in terms of rights respected and violated.)  If corporations are moral agents with rights, then this might allow for too much moral respect for corporations. That is, corporations would be entities whose rights would have to be respected, insofar as we’re concerned with following the standard thought of what moral agency entails – namely, having both obligations and rights.

But there are also more metaphysical reasons supporting the idea that corporations are not moral agents. For example, John Danley gives various reasons, many of them metaphysical in nature, against the idea that corporations are moral agents (Danley, 1980). Danley agrees with French that intention is a necessary condition for moral agency. But is it a sufficient condition?  French sympathizers might reply that even if it is not a sufficient condition in general, its presence in the case of corporations gives reason to believe that it suffices there. Danley can be interpreted as responding to this argument. He gives various considerations under which theoretically defined intentional corporations are nevertheless not moral agents. In particular, such corporations fail to meet some other conditions intuitively present in other moral agents, namely most human beings. Danley writes, “The corporation cannot be kicked, whipped, imprisoned, or hanged by the neck until dead. Only individuals of the corporation can be punished” (Danley, 1980). Danley then considers financial punishments. But then he reminds us that it is individuals who have to pay the costs. It could be the actual culprits, the major players. Or it could be the stockholders, through loss of profits or perhaps the downfall of the company. Furthermore, it could be employees losing their jobs; so innocents may be affected.

In the literature, French does reply to Danley, as well as to the worries of others. Certainly, there is room for disagreement and discussion. Hopefully, it can be seen that this is an important issue, and one with room for argumentative maneuver.

c. Deception in Business

Deception is usually considered a bad thing – in particular, something that is morally bad: whenever one is being deceptive, one is doing something morally wrong. But this piece of conventional wisdom can be questioned. In fact, it is questioned by Albert Carr in his famous piece “Is Business Bluffing Ethical?” (Carr, 1968). There are at least three arguments one can take from this piece, and in this section we will explore them.

The most obvious argument is his Poker Analogy Argument. It goes something like this:  (1) Deception in poker is morally permissible, perhaps even morally required. (2) Business is like poker. (3) Therefore, deception in business is morally permissible. Now, obviously, this argument is overly simplified, and certain modifications should be made. In poker, there are certain things that are not allowed; you could be in serious trouble if it were found out that you were doing them. So, for example, sliding winning cards into the mix would not be tolerated. As such, we can grant that such sliding would not be morally permissible. Similarly, any kind of business practice that would count as sliding according to Carr’s analogy would also not be permissible.

But there are some obviously permitted kinds of deception involved in poker, even if they are disliked by the losing parties. Similarly, there will be deceptive practices in business that, although disliked, will be permitted. Here is one objection, though. Whereas the loser to deception in poker is just another player, the losers to deception in business are a wide group of people. Whether we go with stockholder theory or stakeholder theory, we are going to have losers/victims who had nothing to do with the poker-like deceptive play of the corporate executives. Employees, for example, could lose their jobs because of the deception of corporate executives at competing companies, or the bad deception of their home companies. Here is a response, though:  when one becomes involved in corporate culture – as an employee, for example – one takes on the gamble that the corporate executives take on. There are other ways to respond to this charge as well.

The second reason one might side with Carr’s deception thesis is based on a metaethical position. One might hold that moral judgments are truth-apt, but that they are uniformly false. So, we might think that a certain action is morally wrong when in fact there is no such thing as moral wrongness; when we make claims condemning a moral practice, we are saying something false. As such, condemning deception in business is really just saying something false, as all moral judgments are false. The way to reply to this worry is through a metaethical route, where one argues against such a theory, which is called Error Theory.

The third reason one might side with Carr is via what appears to be a discussion, on his part, of the difference between ordinary morality and business morality. Yes, in ordinary morality, deception is not morally permissible; but within business morality, it is not only permissible but also required. We are misled in judging business practices by the standards of ordinary morality, and so deception in business is in fact morally permissible. One response is this:  following Carr’s lead, one is to divide one’s life into two significant components, spending one’s professional life in a way that involves deception, but then spending the rest of one’s life – day by day, with family and friends, outside of work – in a way that is not deceptive. This kind of self looks very much like a divided self, a self that is conflicted and perhaps tyrannical.

d. Multinational Enterprises

Business is now done globally. This does not mean merely the trivial point that goods and services are exchanged between nations. Rather, it means that goods and services are often produced in nations (often underdeveloped ones) that take no part in the exchange of those goods and services between the nations that consume them.

There are various ways to define multinational enterprises (MNEs). Let us consider this definition, though: an MNE is a company that produces at least some of its goods or services in a nation that is distinct from (i) where it is located and (ii) its consumer base. Nike would be a good example of an MNE. The existence of MNEs is motivated by the fact that in other nations, an MNE could produce more at lower cost, usually because in such nations wage laws are either absent or such that paying employees there costs much less than in the host nation. As a hypothetical example, a company could either pay 2000 employees $12/hr for production of its goods in its own country, or it could pay 4000 employees $1.20/hr in a foreign country. The cheaper alternative is employment in the foreign country. Suppose an MNE goes this route. What could morally defend such a position?
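The cost comparison in this hypothetical can be made explicit. The figures below are the article's own illustrative numbers, not real wage data; the arithmetic is done in cents to keep it exact.

```python
# Hourly labor cost under the article's hypothetical numbers,
# computed in cents so the arithmetic stays exact.
domestic_cents = 2000 * 1200   # 2000 employees at $12.00/hr
foreign_cents = 4000 * 120     # 4000 employees at $1.20/hr

print(f"domestic: ${domestic_cents / 100:,.2f}/hr")   # $24,000.00/hr
print(f"foreign:  ${foreign_cents / 100:,.2f}/hr")    # $4,800.00/hr
print(f"the foreign option costs {domestic_cents // foreign_cents}x less")
```

Even with twice the workforce, the foreign option costs one fifth as much per hour, which is why the wage differential, and not the headcount, drives the decision in the example.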

One way to defend the MNE route is by citing empirical facts concerning the average wages of the producing nation. If, for example, the average wage is $0.80/hr, then one could say that such jobs are justified in virtue of providing opportunities to make higher wages than otherwise. To be concrete: $1.20 is more than $0.80, and so such jobs are justified.

There are at least two ways to respond. First, one might cite the wrongness of relocating jobs from the host nation to the other nation. This is a tempting response, except that it does not do well in answering a pre-theoretical commitment concerning fairness:  why should those in a nation receiving $12/hr be privileged over those in a nation receiving $1.20/hr?  Why do the $12/hr people count more than the $1.20/hr people?  Notice that utilitarian responses will have to deal with how the world could be made better (and not necessarily morally better). Second, one might take the route of Richard Miller. He proposes that the $1.20/hr people are being exploited, and not because they are doing worse than they otherwise would. He agrees that they are doing better than they otherwise would ($1.20/hr is better than $0.80/hr). It’s just that the cheapness of their labor is determined according to what they would get otherwise. They should not be offered such wages because doing so exploits their vulnerability of already having to work for unjust compensation; being offered a better wage than the wage they would get under unjust conditions does not mean that the better wage is just (Miller, 2010).

3. Bioethics

Bioethics is a very exciting field of study, filled with issues concerning the most basic interests of human beings and their close relatives. In some sense, the term bioethics is a bit ridiculous, as almost anything of ethical concern is biological, and certainly anything that is sentient is of ethical concern. (Note that with silicon-based sentient beings, what I say here is controversial, and perhaps false.)  Bioethics, then, should be understood as a study of morality as it concerns issues dealing with the biological facts about ourselves and our close relatives – for example, almost any non-human animal that is sentient. This part of the article is divided into three sections: beginning of life issues, including abortion; end of life issues, for example euthanasia; and finally, ethical concerns arising in medical research, as well as the availability of medical care.

a. Beginning of Life Issues, including Abortion

All of the beginning of life issues are contentious. There are four for us to consider:  abortion, stem-cell procurement and research, cloning, and future generations. Each of these big issues (they could be considered research fields in themselves) is related to the others.

Let us start with abortion. Instead of asking ‘Is abortion morally permissible?’, a better question is ‘Under what conditions is an abortion morally permissible?’. In looking at the conditions surrounding a particular abortion, we are able to get a better understanding of all of the possibly morally relevant considerations in determining permissibility/impermissibility. Now, this does not exclude the possibility of a position on which all abortions are morally wrong. It’s just that we have to start with the conditions, and then proceed from there.

Until just 40 or so years ago, the conventional wisdom, at least as displayed in the academic literature, was that so long as a fetus is a person (or counts morally), it would be morally wrong to abort it. Judith Thomson challenged the received wisdom by positing a number of cases that would show, at least as she argued, that even if a fetus is a person, with all of the rights we would confer on any other person, it would still be permissible to abort under certain conditions (Thomson, 1971). So, for example, with her Violinist Case, it’s permissible for a pregnant woman to abort a fetus in circumstances where she was raped, even granting that the aborted fetus is a full-fledged person. Three remarks should be made here. First, there are those who have questioned whether her case actually establishes this very important conclusion. Second, it should be recognized that it’s not completely clear just which points Thomson is making with her Violinist Case. Is she saying something fundamentally about the morality of abortion?  Or is she saying something fundamentally about the nature and structure of moral rights?  Or both?  Minimally, we should be sensitive to the fact that Thomson is saying something important, even if false, about the nature of moral rights. Third, and this is very important, Thomson’s Violinist Case, if successful, shows only the permissibility of abortion in cases where the pregnant woman was raped – where conception occurred due to non-consensual sex. But what about consensual sex?

Thomson does have a way to answer this question. She continues in her essay with another case, called Peopleseeds (Thomson, 1971). Imagine a woman (or, perhaps, a man) who enjoys her days off in her house with the windows open. It just so happens that she lives in a world in which there are things called peopleseeds, such that if they make their way into a house’s carpet, they will take root and eventually develop, unless uprooted, into full-fledged people (perhaps only human infants). Knowing this, she takes precautions and places mesh screens in her windows. Nevertheless, there are risks: it is possible, and has been documented, that seeds come through such screens. Because she enjoys Saturdays with her windows open, she leaves a window open, thereby eventually allowing a seed to take root, and there she has a problematic person growing. She then decides to uproot the seed, thereby killing the peopleseed. Has she done anything wrong?  Intuitively, the answer is no. Therefore, even in cases of pregnancy due to consensual sex, and even granting that the fetus is a person, it is morally permissible to abort it. It is interesting, though, that very little has been said in the literature in response to this case; or at least, very little has caught on in such a way as to be reflected in more basic bioethics texts. One way to question Thomson on this case is by noting that she is having us consult our intuitions about a world whose biological laws are different from ours; it is just not the case that we live in a world (universe) where this kind of fetal development can happen. Perhaps in a world in which this can occur, it would be considered morally wrong by the inhabitants of that world to kill such peopleseed fetuses. Or maybe not. It is, minimally, hard to know.

Thomson’s essay is revolutionary, groundbreaking, more than important, and perhaps even “true.”  What is so important about it is the idea of arguing for the permissibility of abortion even with fetuses being considered persons, just like us. There are others who significantly expand on her approach. Frances Kamm, for example, does so in her Creation and Abortion, a sophisticated deontological approach to abortion. Kamm notices certain problems with Thomson’s argument, but then offers various reasons which would support the permissibility of aborting. She takes into consideration such things as third-party intervention and morally responsible creation (Kamm, 1992).

Note that I have mentioned Kamm’s deontological approach, in which the rights and duties of those involved matter. Note also that with a utilitarian approach, such things as rights and duties are going to be missing, and if they are there, it is only in terms of understanding what will maximize overall goodness/utility. According to utilitarianism, abortion is going to be settled according to whether policies for or against it maximize overall goodness/utility. There is a third approach, though, drawing from the third major kind of ethical theory, namely virtue theory. In general, virtue theory says that an action is morally permissible if and only if it is what an ideally virtuous person would do – a theory that sounds very intuitive. Rosalind Hursthouse argues that it is through virtue theory that we can best understand the issues surrounding abortion. She, I think controversially, asks questions about the personal state in which a woman becomes pregnant. It is from her becoming-pregnant state that we are to understand whether her possible abortion is morally permissible. Perhaps a more generous reading of Hursthouse is that we need to understand where a woman is in her life in order to best evaluate whether an abortion is morally appropriate for her (Hursthouse, 1991).

There are, of course, downright naysayers to abortion. Almost all take the position that all fetuses are persons, and that aborting a fetus is thereby tantamount to (wrongful) murder. Any successful such position should take on Thomson’s essay. Some, though, might bypass her thoughts and just say that abortion is the killing of an innocent person, and that any killing of an innocent person is morally wrong.

Let’s end, though, with a discussion of an approach against abortion that allows the fetus not to be a person, and not to have any (supposed) moral standing. This is clever: whereas Thomson’s argument attempts to show that aborting a person is permissible, this approach attempts to show that aborting a non-person is impermissible. We see very quickly, though, that this argument is different from the potentiality argument against abortion. The potentiality argument says that some x is a potential person, and that aborting it is therefore wrong because had x not been aborted, it would eventually have been a person. This argument, on the other hand, does not appeal to potentiality, and furthermore does not assume that the fetus is a person. Don Marquis argues that aborting a fetus is wrong on the very grounds that explain the wrongness of any killing of people. Namely, what is wrong with killing a person?  It is that in killing a person, the person is deprived of a future life. A future life contains quite a few things, including, in general, joy and suffering. In killing a fetus via abortion one is depriving it of a future life, even if it is not a person. Its future life is just like ours; it contains joy and suffering. By killing it, you are depriving it of the same things we are deprived of if we are killed. The same explanation of why it’s wrong to kill us applies to fetuses; therefore, it’s wrong to abort in nearly all cases, with only some exceptions (Marquis, 1989).

Another beginning of life issue is stem cell research. Stem cell research is important because it provides avenues for the development of organs and tissues that can be used to replace diseased ones in those suffering from certain medical conditions; in theory, an entire cardiac system could be generated through stem cells, given all of the research on stem cells required to eventually produce successful organ systems. There are various routes by which stem cell lines can be procured, and this is where things get controversial. First, though, how are stem cells produced, in the abstract?  Answering this question first requires specifying what is meant by stem cells. Stem cells are undifferentiated cells, ones that are pluripotent – or, more colloquially, ones that can divide and eventually become any of a number of different kinds of cells – for example, blood cells, nerve cells, and cells specific to kinds of tissues, for example muscle, heart, stomach, intestine, prostate, and so forth. A differentiated, non-pluripotent cell is no good for producing pluripotent cells; such a cell is not a candidate for stem cell lines.

And so, how are stem cells produced, abstractly?  Stem cells, given that they must come from a human clump of matter of the right sort, are extracted from an embryo – a cluster of cells of both the differentiated and undifferentiated (stem cell) sort. The undifferentiated, pluripotent cells are extracted from the embryo in order to then be specialized into a number of different kinds of cells – for example, cells developing into cardiac tissue. Such extraction amounts to the destruction of the human clump of matter – that is, the destruction of the human embryo – and some claim that this is tantamount to murder. More mildly, one could condemn such stem cell procurement as an unjustified killing of something that morally counts. Now, it is important to note that opponents of stem cell line procurement, in the way characterized, will note that there are alternative ways to get stem cell lines. They will point out that we can get stem cells from already existing adult cells, which are differentiated and non-pluripotent. There are techniques that can then “de-specialize” them back into a pluripotent, undifferentiated state, without having to destroy an embryo for the procurement of stem cells; basically, we can get the stem cells without having to kill something – an embryo – that counts morally.

There are some very good responses to opponents of stem cell procurement in the typical (embryo-destroying) manner. Typically, defenders will appeal to the idea that such destruction is merely a destruction of something that doesn’t morally count. The idea is that embryos, at least of the kind that are used and destroyed in getting stem cells, are not the kind of thing that morally counts; such embryos are very early stage embryos, comparable to the kinds of embryos one would find in the early stages of the first trimester of a natural pregnancy.

There are other considerations that proponents of typical stem cell procurement will appeal to. For example, they might respond to certain slippery slope arguments against (typical) stem cell procurement (Holm, 2007). The main kind of slippery slope argument against stem cell research is that if we allow such procurement and research, then this leaves open the door to the practice of cloning full-scale human beings. A rather reasonable way of responding to this worry is twofold. If the cloning of full-scale human beings is not problematic, then this is not a genuine slippery slope, as, in the words of one author, “there is no slope in the first place” (Holm, 2007): if, all other things equal, human cloning is not morally problematic, then there is no moral worry about stem cell procurement bringing it about. But suppose that human cloning (on a full scale) is morally problematic. Then proponents of stem cell procurement will need to give reasons why stem cell procurement and research won’t cause or lead to human cloning, and there are plausible, but still controversial, reasons that can be given in support of this defense. To summarize: there is a slope, but it is not slippery (Holm, 2007).

A third beginning of life issue, which follows quite naturally from the previous discussion, is that of human cloning. Those who argue that human cloning is wrong do so for various reasons. One could first appeal to repugnance: it is repugnant to create human beings in this way. One way to respond is to note that the practice certainly would seem different, at least for a period of time, but that such difference, and the feeling of repugnance it may produce, is by itself no reason to think the practice of human cloning morally wrong. Furthermore, one might say that with any kind of moral progress, feelings of repugnance in some of the population do occur, but such repugnance is just an effect of moral change; if the moral change is genuine progress, then the repugnance is merely a reaction to a change that is actually morally good.

Another way in which cloning may be criticized is that it could lead to a Brave New World scenario: by cloning, we would be controlling people’s destinies in such a way that the result is dystopian. The best response is that this worry relies on a kind of genetic reductionism which is false. Are we merely the product of our genetic composition?  No. There are plenty of early childhood factors, as well as broader cultural and social factors, that explain the kind of people we are by the time we are adults. Of course, a Brave New World scenario is possible, but its possibility is best understood in terms of all the cultural and social factors that would have to be present to produce the complacent, unreflective people depicted in the novel; they are not born that way – they are socialized that way. The mere genetic replication of people through cloning should be less of a worry, given that there are so many other, social factors relevant to explaining adult behavior.

A third way to criticize human cloning is that it closes the open future of the resulting clone. By cloning a person, P1, we create P2. Given that P1 has lived, say, 52 years, P2 would then seem to have knowledge of what her life will be like over her next 52 years. Suppose the 52-year-old writes a very self-honest autobiography; P2 could then read how her life will unfold. Once again, this objection to cloning relies on an implausible way of looking at the narrative of a human life: it requires a very strong kind of genetic reductionism, and it flies in the face of the results of twin studies. (Note that a human clone is, biologically, a delayed human twin.)  So, the response to the open future objection can be summed up as follows: a human clone’s future would be closed only in whatever sense anyone else’s future is closed, since predicting it would require extensive knowledge of the social, cultural, and economic circumstances of her future life. Given that these things are highly unpredictable, for human clones as for everyone else, it is safe to say that human clones will not know how their lives will unfold; they, just like anyone else, have an open future.

b. End of Life Issues

This section is primarily devoted to issues concerning euthanasia and physician-assisted suicide. There are of course other issues relevant to the end of life – for example, issues surrounding consent, often examined through the status of such things as advance directives, living wills, and DNR orders – but for reasons of space, we will look only at euthanasia and physician-assisted suicide. It is very important to be clear about what is meant by euthanasia and suicide and their various kinds. First, we can think of euthanasia as the intentional killing of another person, where the intention is to benefit that person by ending their life, and where ending their life does in fact benefit them (McMahan, 2002). Furthermore, we can distinguish between voluntary, involuntary, and non-voluntary euthanasia. Voluntary euthanasia is where the person killed consents to it. Involuntary euthanasia is where the person actively expresses that they do not consent, or where consent was possible but they were not asked. Non-voluntary euthanasia is where consent is not possible – for example, where the person is in a vegetative state. Another distinction is between active and passive euthanasia. Active euthanasia involves doing something to the person that ends their life – for example, shooting them, or injecting them with a lethal dose. Passive euthanasia involves withholding assistance or treatment that the person needs in order to live. Here is an example that should illustrate the difference: smothering a person with a pillow would be active euthanasia, even though it technically denies them something they need to live – namely, oxygen – whereas withdrawing a breathing device, by unplugging the person from it, would be passive.

Suicide is the act of a person taking their own life. We usually speak and think of suicide as non-assisted. But suppose you have a friend who wants to end her own life but lacks the financial and technical means to do it in a way she believes would be as painless and successful as possible. If you give her money and show her how to end her life in this way, then you have assisted her in her suicide. Physicians are well placed to assist others in ending their lives. Already, one can see how the distinction between physician-assisted suicide and voluntary active euthanasia can become rather blurred. (Imagine a terminally ill person whose condition is so extreme and debilitating that the only part she can take in ending her life is pressing a button that injects a lethal dose, where the entire killing device has been designed and constructed by a physician. Is this assisted suicide or euthanasia?)

Although, as far as I know, no surveys have been done to support the following claim, one might find it plausible: involuntary active euthanasia is the most difficult to justify, followed by non-voluntary active euthanasia, and then voluntary active euthanasia; the passive forms then follow in the same order – involuntary, non-voluntary, and then voluntary passive euthanasia – from most difficult to least difficult to justify. It is difficult to say where physician-assisted suicide and non-assisted suicide would fit in, but it is plausible that non-assisted suicide would be the easiest to justify – trivially so, if the issue is framed in terms of what a third party may permissibly do.

It appears, then, that it is at least more difficult to justify active euthanasia than passive. Some authors, however, have contested this. James Rachels gives various reasons, but perhaps the best two are as follows. First, in some cases active euthanasia is more humane than passive. For example, if the only way to end the life of a terminally ill person is by withholding life-supporting measures, perhaps by unplugging them from a feeding tube, where it will take weeks, if not months, for them to die, then this seems less humane, and perhaps outright cruel, in comparison to simply injecting them with a lethal dose. Second, Rachels takes the distinction between active and passive euthanasia to be based on the distinction between killing and letting die (Rachels, 1975). This way of basing the distinction might be placed under scrutiny – recall that we earlier drew the distinction in terms of actively doing something that ends a life versus withholding life-assisting measures, rather than in terms of killing someone versus merely letting them die. But suppose we follow Rachels in letting the killing versus letting die distinction ground the distinction between active and passive euthanasia. Then consider Rachels’ example, which challenges the moral weight of the distinction between killing and letting die. Case 1: a husband decides to kill his wife, and does so by placing a lethal poison in her red wine. Case 2: a husband decides to kill his wife, and as he is walking into the bathroom to hand her the lethally dosed glass of wine, he notices her drowning in the bathtub, and does nothing. In case 1, the husband kills his wife; in case 2, he merely lets her die. Does this mean that what he has done in case 2 is any less morally wrong?  Perhaps we might even think that in case 2 the husband is more morally sinister.

Although it appears difficult to justify, there are proponents of voluntary active euthanasia. McMahan is one such proponent, and he gives a rather sophisticated, incremental argument for its permissibility. The argument starts from the premise that rational suicide is permissible, where rational suicide is ending one’s life when one believes one’s life is not worth living, and it is in fact the case that one’s life is not worth living. McMahan then takes the next “increment” and discusses conditions under which it would be permissible for a physician to aid someone in their rational suicide, perhaps by assisting in the removal of their life support system; here, physician-assisted passive suicide is permissible. But then why would assisted passive suicide be permissible while assisted active suicide is impermissible?  As McMahan argues, there is no overriding reason why this should be so. In fact, there are good reasons to think assisted active suicide is permissible. First, people often commit suicide actively, not passively, because they want to exercise control over how their lives end. Second, because one does not want to risk a failed suicide attempt, which could result in pain, humiliation, and disfigurement, one might best meet one’s goal of death with the assistance of another, in particular a physician. Finally, with physician-assisted active suicide being permissible, McMahan takes the next step to the permissibility of voluntary active euthanasia. Suppose it is permissible for a physician to design and construct an entire system in which the person ending her life need only press a button. If instead the physician presses the button, then this is no longer assisted suicide but active euthanasia. As McMahan urges, how can it be morally relevant who presses the button, so long as consent and intention are the same?
Additionally, McMahan points out that some people will be so disabled by terminal illness that they cannot press the button. Because they cannot physically end their lives by physician-assisted active suicide, their only remaining option would be deemed impermissible if voluntary active euthanasia were impermissible, even though those who could end their own lives would still have a “permissible option” available to them. So, on grounds of something like fairness, there is a further consideration speaking for the permissibility of voluntary active euthanasia, so long as physician-assisted active suicide is permissible (McMahan, 2002, 458-460).

c. Research, Patients, Populations, and Access

Access to, and quality of, health care is a very real concern. A good health care system is based on a number of things, one being medicine and delivery systems grounded in research. But research requires, at least to some extent, the use of human subjects, and here ethical concerns clearly arise. Furthermore, certain populations may be more vulnerable to risky research than others, which raises another category of moral concern. There is also a basic question concerning how to finance such health care systems; this concern is addressed in the sixth main section of this article, on social ethics and issues of justice.

First, let us start with randomized clinical trials (RCTs). In an RCT, participants do not know whether they are receiving the promising (but not yet certified) treatment for their condition. Informed consent is usually obtained and assumed in addressing the ethicality of RCTs. Notice, though, that if the promising treatment is life-saving and the standard treatment received by the control group is inadequate, then there is a basis for criticizing RCTs. The idea is that those in the control group could have been given the experimental, promising, and possibly successful treatment, thereby most likely treating their condition and, in the case of terminal diseases, saving their lives. Opponents can characterize RCTs in such cases as arbitrarily condemning members of the control group to death, since those in the experimental group have a much higher likelihood of living and being treated. Proponents of RCTs have at least two responses. They could first appeal to the modified kind of RCT designed by Zelen, in which those in the control group know they are in that group and can opt out accordingly. A second, more direct, response is to acknowledge the apparent unfairness of RCTs but argue that scientifically valid results require them; given that such results have large social benefits, the practice is justified. Furthermore, those in control groups are not made worse off than they would otherwise be: if the only way even to have access to such promising, experimental treatments is through an RCT, then those assigned to control groups have not been made worse off – they have not been harmed (for interesting discussions, see Hellman and Hellman, 1991, and Marquis, 1999).

Another case, affecting large numbers of people, is this: certain medications can be tested on one population and yet benefit those outside the population used for testing. Take, for example, a medication that can prevent HIV transmission from mothers to fetuses. This medication needs to be tested. If a pharmaceutical company goes to an underdeveloped country in Africa to test it, what obligations does the company have to the study participants, and to the country’s population at large, once the medication is made available in developed nations like the U.S.?  And if making the medication available in the research country is not feasible, is it permissible to conduct the study there in the first place?  These are just some of the questions that arise in the production of pharmaceutical and medical services in a global context. (See Glantz, et al., 1998 and Brody, 2002.)

4. Moral Standing and Personhood

a. Theories of Moral Standing and Personhood

Take two things, a rock and a human being. What is it about each such that it is morally permissible to destroy the rock in the process of procuring minerals but not permissible to destroy a human being in the process of procuring an organ for transplantation?  This question raises the issue of moral standing, and to answer it is to give a theory of moral standing/personhood. First, some technical points. Any given entity/being has a moral status. Beings that cannot be morally wronged have the moral status of having no (that is, zero) moral standing. Beings that can be morally wronged have some moral standing. And beings with the fullest moral standing are persons. Intuitively, most, if not all, human beings are persons. And intuitively, members of an alien species with intelligence as great as ours would be persons. This leaves open the possibility that certain beings we do not currently know to exist could have greater moral standing than persons. For example, if there were a god, it seems that such a being would have greater moral standing than us, than persons; this would have us reexamine the idea that persons have the fullest moral standing. Perhaps we could say that a god or gods were super-persons, with super moral standing.

Why is the question of moral standing important?  Primarily, it is important in the case of non-human animals and in the case of fetuses. In this article, we will focus directly only on non-human animals. But before considering animals, let us look at some theories of what constitutes moral standing for a being. A first attempt is the idea that being a human being is necessary and sufficient for having moral standing. Notice that this theory rightly excludes rocks, but it also excludes all non-human animals, including, for example, primates like chimps and bonobos. The next theory motivated by this problem is: a being/entity has moral standing (morally counts/can be morally wronged) if and only if it is living. But according to this theory, things like plants and viruses can be morally wronged. A virus would then have to be considered in our moral deliberations when deciding whether to treat a disease, since viral entities would have moral standing; this is counterintuitive, and indicates that the theory is too inclusive. So, another theory, one that excludes plants, viruses, and bacteria, is the rationality theory: those who morally count are those who possess rationality. But there are problems here too. Does a mouse possess rationality?  And even if one is comfortable with mice lacking rationality, and thereby not counting morally, one might then have a problem with certain human beings who lack genuinely rational capacities. Another option is the soul theory: what morally counts is what has a soul; certain human beings might lack rationality, but they at least have a soul. What is problematic with this theory of moral standing is that it posits an untestable, unobservable entity – namely, a soul. What prohibits a virus, or even a rock, from having a soul?  Notice that this objection to the soul theory does not deny the existence of souls. Rather, the objection is that the theory posits the existence of an entity that is not observable and for whose existence there can be no test.

Another theory, though not unanimously accepted, is the sentience theory of moral standing. According to this theory, what gives something moral standing is sentience – that is, having experiences, and more specifically experiences of pain and pleasure. On this theory, rocks and plants do not have moral standing; mice and men do. One problem, though, is that many of us think there is a moral difference between mice and men. This theory offers no way to explain how, although mice have moral standing, human beings are persons (Andrews, 1996). It appears that to do so one would have to appeal to rationality/intelligence, but, as discussed, there are problems with this. Finally, there is another theory, intimately tied to sentience theory. We can safely say that most beings who experience pain and pleasure have an interest in the kinds of experiences that they have. There is, however, the possibility of beings who experience pain and pleasure but do not care about their experiences. So what should we say about those who do care about their experiences?  Perhaps it is not their experiences that matter, but the fact that they care about them. In that case, what matters morally is their caring about their experiences. As such, we may call this new theory the “interest theory”: a being/entity has moral standing if and only if it has interests (in virtue of caring about the experiences it has).

b. The Moral Status of Non-Human Animals

How, though, are non-human animals treated in the literature?  Are they considered to have moral standing?  Peter Singer was probably among the first in the academic literature to advocate for animals as having moral standing. Very importantly, he documented how contemporary agricultural practices treated animals, from chimps to cows to chickens (Singer, 1975). The findings were astonishing: many people would find the conditions under which these animals are kept despicable and morally wrong. A question arises, though, concerning the basis for moral condemnation of such treatment. Singer, being a utilitarian, could be characterized as saying that treating animals in the documented ways does not maximize overall goodness/utility. It appears, though, that he appeals to another principle, which can be called the principle of equitable treatment: it is morally permissible to treat two different beings differently only if there is some moral difference between them which justifies the differential treatment (Singer, 1975). So, is there a moral difference between human beings and cows such that killing human beings for food is wrong but killing cows is not?  According to Singer, there is not. However, we could imagine such a difference, and perhaps there is one.

Another theorist writing in favor of non-human animals is Tom Regan. He argues that non-human animals, at least of a certain kind, have moral rights just as human animals do. As such, there are no utilitarian grounds that could justify using non-human animals differently from human animals. To be more careful, though, we can imagine a situation in which a certain treatment violates a human’s rights while the same treatment does not violate a non-human’s rights; Regan allows this possibility (Regan, 1983). This does not change the fact that non-humans and humans equally have rights; it just means that the content of those rights depends on the nature of the beings involved. Finally, we should note that there are certain rights theorists who, in virtue of their adherence to rights theory, hold that non-human animals do not have rights. As such, on their view, animals do not have moral standing, or at least not a robust enough moral standing that we should consider them in our moral deliberations as beings that morally count (Cohen, 1986).

5. Professional Ethics

a. What is a Profession?

Certain occupations, like law, medicine, and engineering, are considered professions. Other things, like unskilled labor and art, are not. There are various ways to try to understand what constitutes a profession. For the purposes of this article, there will be no discussion of proposed necessary and jointly sufficient conditions for something constituting a profession. With that said, some proposed general characteristics will be discussed, in terms of a controversial case: journalism. Is journalism a profession?  Generally, professions such as law, medicine, and engineering enjoy certain financial benefits. As such, we can see that there may be a financial motivation on the part of some journalists to consider journalism a profession. Additionally, one can be insulated from criticism by being part of a profession; one can appeal to a kind of professional authority against the layperson (or anyone outside the profession) (Merrill, 1974). One could point out, though, that just because some group desires to be some x does not mean that they are x (a basic philosophical point). One response is that law, medicine, and engineering carry a certain esteem; if journalism could garner the same esteem, then perhaps it could be regarded as a profession.

But as Merrill points out, journalism seems to lack certain important characteristics shared by the professions. For the professional exemplars already mentioned, one usually has to pass a series of professional exams, which test a number of things, including the jargon of the profession. Usually, one is educated specifically for the profession, often with a terminal degree for it. Although there are journalism schools, entry into the practice of journalism requires neither education at a journalism school nor anything like the examinations involved in, say, the law. Furthermore, professions usually have a codified set of principles or rules, even if rather vague and ambiguous, that apply to their practitioners. Perhaps journalists can appeal to such mottos as: tell the truth, cite your sources, protect your sources, and be objective. But in addition to the near-emptiness of these mottos, there is the problem that, once interpreted, there is plenty of disagreement about whether they are valid principles in the first place. For example, if one insists on a literal appeal to truth-telling, how are we to think of the gonzo journalism of Hunter Thompson?  Or, in documentary making, there are some who believe that the documentarian should stay objective by not placing themselves in the documentary and by not assisting subjects. Notice here that even if journalism is not a profession, there are still ethical issues involved, ones that journalists should be mindful of. Even if journalism cannot be codified and organized into something that counts as a profession, this does not mean there are no important ethical issues involved in doing the work. This should be no surprise, as ethical issues are abundant in life and work.

b. Engineering Ethics

In this section, we will discuss engineering ethics for two purposes. One purpose is to use engineering ethics as a case study in professional ethics. More importantly, the second purpose is to give the reader some idea of some of the ethical issues involved in engineering as a practice.

One way to approach engineering ethics is by first thinking of it as a profession, and then, given its features as a profession, examining ethical issues according to those features. So, for example, given that professions usually have a codified set of principles or rules for their professionals, one could try to articulate, expand, and flesh out such principles. Another way to approach engineering ethics is by starting with particular cases, usually of the historical as opposed to the hypothetical kind, and then drawing out any moral lessons and perhaps principles from there. Accordingly, one would start with such cases as the Hyatt Regency walkway collapse, the Challenger space shuttle accident, and the Chernobyl and Bhopal plant accidents, just to name a few (Martin and Schinzinger, 2005).

The Challenger space shuttle accident raises a number of ethical issues, but one worth discussing is the dual role of engineer/manager. When one is both an engineer and in upper- or middle-level management, and when one has the responsibility as an engineer to report safety problems with a design but also, as a manager, faces pressure to complete the project, (i) does one role trump the other in determining appropriate courses of action, and if so, which one?; (ii) or are the two reconcilable in such a way that there really is no conflict?; (iii) or are the two irreconcilable, such that assigning people to an engineer/manager role will inevitably lead to moral problems?

One philosophically interesting issue raised by engineering is the assessment of safety and risk. What constitutes something being safe?  And what constitutes something being a risk?  Tversky and Kahneman (1981) famously showed that in certain cases of risk assessment, most people will prefer one option over another even when the expected values of the two options are identical. What could explain this?  One explanation appeals to the idea that people are able to think appropriately about risk in a way that is not capturable by standard risk-cost-benefit analyses. Another explanation is that most people are in error, basing one preference over another on an illusion concerning risk. On either interpretation, determining risk is important, and understanding risk is in turn important in determining the safety of a product or design option. It is of great ethical concern that engineers be concerned with producing safe products, and thereby with properly identifying and assessing the risks of such products.
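The point that two options can differ in how they are framed while having identical expected values can be made concrete with Tversky and Kahneman’s well-known “Asian disease” problem, in which a certain outcome (200 of 600 lives saved) is pitted against a gamble (a one-third chance of saving all 600). The short sketch below simply computes the expected values; the function and variable names are illustrative, not drawn from the article.

```python
from fractions import Fraction  # exact arithmetic, avoiding float rounding

def expected_value(outcomes):
    """Expected value of a lottery given as (probability, value) pairs."""
    return sum(p * v for p, v in outcomes)

# Program A: 200 of 600 people are saved with certainty.
program_a = [(Fraction(1), 200)]
# Program B: a 1/3 chance all 600 are saved, a 2/3 chance none are saved.
program_b = [(Fraction(1, 3), 600), (Fraction(2, 3), 0)]

# Both lotteries have the same expected number of lives saved (200),
# yet most respondents prefer A when outcomes are framed as gains.
print(expected_value(program_a) == expected_value(program_b))  # True
```

On a standard risk-cost-benefit analysis, then, the two programs are equivalent; the observed preference for one over the other is what the competing explanations in the paragraph above are trying to account for.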

There are also concerns about what kinds of projects engineers should participate in. Should they participate in the development of weaponry?  If so, what kind of weapons production is morally permissible?  Furthermore, to what extent should engineers be concerned with the environment in proposing products and their designs?  Should engineers as professionals work only on products demanded by the market?  If there are competing claims on a service or product that cannot be explained in terms of market demand, to what extent do engineers have a responsibility to their corporate employers when those employers require designs for things that run counter to what is demanded by those “outside of” the market?  Let us be concrete with an admittedly hypothetical example. Suppose there is a corporation called GlobalCyber Initiatives, with the motto: making the world globally connected from the ground up. And suppose your company has a contract in a country with limited cell towers. Wealthy business owners of that country report that their middle-level managers would like processing upgrades to their hand-held devices so that they can access more quickly the cell towers (which are conveniently placed next to factories). Your company could provide that upgrade. But you, as lead in R&D, have instead been working on upgrades to PCs, so that these PCs can be used in remote, rural areas with little or no access to cell towers. With your upgrade, PCs could be sold to the country in question for use in local libraries. The contract with the business owners would be slightly more lucrative, but a contract with that country’s government, which is willing to participate, would do much more good for the country, both overall and specifically for the very many people throughout its rural areas. What should you do as lead of R&D?  How far should you be concerned?  How far should you push to make the government contract come about?  Or should you not be concerned at all?

These questions are meant to highlight how engineering ethics, thought of merely as an ethic of how to be a good employee, is perhaps too limiting, and how engineering as a profession might have a responsibility to grapple with what its purposes, as a profession, are supposed to be. This, in turn, highlights how framing the purposes of a profession is inherently ethical, insofar as professions are to be responsive to the values of those they serve.

6. Social Ethics, Distributive Justice, and Environmental Ethics

This section is an oddity, but given space limitations, it is the best way to structure an article like this. First, take something like “social ethics”. In one sense, all ethics is social, as it deals with human beings and other social creatures. Nevertheless, some people think that certain moral issues apply only to our private lives, behind closed doors. For example, is masturbation morally wrong?  Or is homosexual sex morally wrong?  One way such questions are viewed is that they are not simply private questions but inherently social ones. With homosexual sex, for example, since sex is in some ways a public phenomenon, and since the expression of sexual orientation is certainly public, there is a way of understanding even this issue as public and therefore social. Perhaps the main point to emphasize is that by “social” I mean those issues that obviously need to be understood in a public, social way, and which cannot easily be subsumed under one of the other sub-disciplines discussed above.

Another reason this section is an oddity is that the topic of distributive justice is often thought of as falling properly within the discipline of political philosophy, not applied ethics. One of various reasons for including a section on it is that distributive justice is often discussed, directly and indirectly, in business ethics courses, as well as in courses on the allocation of health care resources (which may be part of a bioethics course). Another reason for inclusion is that famine relief is an applied ethical topic, and distributive justice in a global context obviously relates to famine relief. Finally, this section is an oddity because environmental ethics gets only a subsection of this encyclopedia article and not an entire section, unlike equally important fields such as bioethics or business ethics. The justification for this is (i) space limitations and (ii) the fact that various important moral considerations involving the environment are discussed within the contexts of bioethics, business ethics, and moral standing.

a. Social Ethics

To start with, two topics that fall within social ethics and are (compared to earlier times) not as controversial are affirmative action and smoking bans. The discussions of these topics are rich in such moral notions as fairness, benefits, appropriation of scarce resources, liberty, property rights, paternalism, and consent.

Other issues have to do with understanding the still very real gender disparities in wealth, social roles, and employment opportunities. How are these disparities and differences to be understood?  And given that these disparities are not morally justified, there are further questions about how to address and eliminate them in a way that is sensitive to a full range of moral considerations. Furthermore, important work can be done on how transgender people can be fully included in the modern life of working in corporations, government, education, and industry, all in a way that respects their personhood.

b. Distributive Justice, and Famine Relief

The term distributive justice is misleading insofar as justice is usually thought of in terms of punitive justice. Punitive justice deals with determining the guilt or innocence of defendants, as well as just punishments for those found guilty of crimes. Distributive justice, on the other hand, deals with something related yet much different. Take a society, or group of societies, and consider a limited number of resources, goods, and services. The question arises of how those resources, goods, and services should be distributed across the individuals of such societies. Furthermore, there is the question of what kind of organization, or centralizing power, should be set up to deal with the distribution of such goods (short for goods, resources, and services); let us call such organizations that centralize power governments.

In this subsection, we will examine some very simplified answers to the question of the distribution of goods, and subsequent questions of government. We will first cover a rather generic list of positions on distributive justice and government, and then proceed to a discussion of distributive justice and famine relief. Finally, we will discuss a number of more contemporary approaches to distributive justice, leaving open how each of these approaches would handle the issue of famine relief.

Anarchism is the position that no such government is justified. As such, there is no centralizing power that distributes goods. Libertarianism is the position that government is justified insofar as it is a centralizing power used to levy taxes for the purpose of enforcing persons’ property rights. This kind of theory of distributive justice emphasizes a minimal form of government for the purpose of protecting and enforcing the rights of individuals to their property. Any theory that advocates a further kind of government, for purposes other than the enforcement of property rights, might be called socialist, but to be more informative, it will help to distinguish between at least three theories of distributive justice that might be called socialist. First, there are those who care about equality. Egalitarian theories emphasize that government exists to use taxation to redistribute wealth so as to make people as equal as possible in terms of their well-being. Bare-minimum theories instead specify some bare minimum needed for any citizen/individual to do minimally well (perhaps to have a life worth living). Government is then to specify policies, usually through taxation, to make sure that the bare minimum is met for all. Finally, we have meritocracy theories, and in theory, these may not count as socialist. The reason is that we could imagine a society in which there are people who do not merit the help that would be given to them through redistributive taxation. In another sense, however, such theories are socialist, in that we can easily imagine societies in which there are people who merit a certain amount of goods and yet do not have them, and such people, according to the theory of merit, would be entitled to those goods through taxation on others.

The debate concerning theories of distributive justice easily runs to tens of thousands of pages.  Instead of going into those debates, we should, for the purposes of applied ethics, go on to how distributive justice applies to famine relief, which is clearly a topic within applied ethics. Peter Singer takes a position on famine relief according to which those in developed nations are morally required to assist those experiencing famine (usually in underdeveloped nations) (Singer, 1999). If we take such theories of distributive justice as applying across borders, then it is rather apparent that Singer rejects the libertarian paradigm, whereby taxation is not justified for anything other than the protection of property rights. Singer instead is a utilitarian, and his justification has to do with producing overall goodness. Libertarians, on the other hand, will allow for the justice of actions and policies that do not produce the most overall goodness. It is not quite clear what socialist position Singer takes, but no matter; it is obvious that he argues from a perspective that is not libertarian. In fact, he uses an example from Peter Unger to make his point, which is obviously not libertarian. The example (modified):  Imagine someone who has invested some of her wealth in some object (a car, for example) that is then the only thing that can prevent some innocent person from dying; the object will be destroyed in saving that person’s life. Suppose that the owner decides not to allow her object to be destroyed, thereby allowing the other (innocent) person to die. Has the object (car) owner done something wrong?  Intuitively, yes. Well, as Singer points out, so has anyone in the developed world, with enough money, in not giving to those experiencing famine; they have let those suffering people die. One response to Singer is libertarian, Jan Narveson being an exemplar here (Narveson, 1993). Here, we have to draw a distinction between charity and justice. 
According to Narveson, it would be charitable (and a morally good thing) for one to give up some of one’s wealth, or the saving object, but doing so is not required by justice. Libertarians in general have even more sophisticated responses to Singer, but those will not concern us here; the point is that there can be disagreement on something as important as famine relief, based on differences in political principles, or theories of distributive justice.

As discussed earlier in this subsection, libertarian theories were contrasted with socialist positions, where “socialist” is not to be confused with how it is used in the rhetoric of most media. The earliest of the influential socialist theories was proposed by John Rawls (Rawls, 1971).  Rawls is more properly an egalitarian theorist, who allows for inequalities just insofar as they improve the position of the least advantaged in the best possible way, and in a way that does not compromise basic civil liberties. There have been reactions to his views, though. For example, his Harvard colleague, Robert Nozick, takes a libertarian perspective, arguing that the kinds of distributive policies endorsed by Rawls infringe on the basic rights (and entitlements) of persons – basically, equality, as Rawls envisions it, encroaches on liberty (Nozick, 1974). On the other end of the spectrum, there are those like Kai Nielsen who argue that Rawls does not go far enough. Basically, the equality Rawls argues for, according to Nielsen, will still allow for too much inequality, where many perhaps will be left without the basic things needed to be treated equally and to have basic equal opportunities. For other post-Rawlsian critiques and general theories, consult the works of Michael Sandel, Martha Nussbaum (a student of Rawls), Thomas Pogge (a student of Rawls), and Michael Boylan. 

c. Environmental Ethics

This subsection will be very brief, as some of the issues have already been discussed. Some things, however, should be said about how environmental ethics can be understood in a way that is foundational, independent of business ethics, bioethics, and engineering ethics.

First of all, there is the question of what status the environment has independently of human beings. Does the environment have value if human beings do not exist, and never would exist?   There are actually some who answer yes, and not just because there would be other sentient beings. Suppose, then, that we have an environment with no sentient beings, one that will never progress into having sentient beings. Does such an environment still matter?  Yes, according to some. And even among those who hold that the environment matters only in the context of either actual or potential sentient beings, there are some who defend its importance without thinking that what primarily matters is the sentient beings themselves.

Another way to categorize positions concerning the status of the environment is by differentiating those who advocate anthropocentrism from those who advocate a non-anthropocentric position. This debate is not merely semantic, nor merely academic, nor trivial. It is a question of value, and of the role of human beings in helping or destroying things of (perhaps) value, independently of whether human beings themselves have value. To be more concrete, suppose that the environment of the Earth had intrinsic value, value independent of human beings. Suppose then that human beings, as a collective, destroyed not only themselves but the Earth. Then, almost by definition, they would have destroyed something of intrinsic value. Those who care about things with value, especially intrinsic value, should be rather concerned about this possibility. (Here, consult: Keller, 2010; Elliot, 1996; Rolston, 2012; Callicott, 1994.)

Many moral issues concerning the environment, though, can be seriously considered under either of the two above options – that is, whether or not the environment (under which humans exist) matters if human beings do not exist. And even if one does not endorse either of the two above options, it is hard to deny that the environment morally matters in a serious way. Perhaps the way to appreciate this importance is through the study of how business and engineering affect the environment.

7. Theory and Application

One might still worry about the status of applied ethics on the grounds that it is not quite clear what the methodology or formula is for determining the permissibility of any given action or practice. Such a worry is indeed justified. Skepticism here is warranted because there are multiple approaches to determining the permissibility of actions and practices.

One such approach is very much top-down. The approach starts with a normative theory, where actions are determined by a single principle dictating the permissibility/impermissibility (rightness/wrongness) of actions/practices. The idea is that you start with something like utilitarianism (permissible just in case it maximizes overall goodness), Kantianism (permissible just in case it does not violate imperatives of rationality or respecting persons), or virtue theory (permissible just in case it abides with what the ideally virtuous person would do). From there, you get results of permissibility or impermissibility (rightness/wrongness).

Although each of these theories has important things to say about applied ethical issues, one might complain about them for various reasons. Take utilitarianism, for example. As a theory, it implies that certain things are morally required which many take to be wrong, or at least not required (for example, lynching an innocent person to please a mob, or spending the ten years after medical school in a third-world country). There are problems for the other two main kinds of theories as well, such that one might be skeptical about a top-down approach that uses such theories to settle applied ethical cases.

Another approach is to use a pluralist kind of ethical theory. Such a pluralist theory comprises various moral principles. Each of the principles might be justified by utilitarian, Kantian, or virtue theories. Or they may not. The idea here is that there are multiple principles to draw from to determine the rightness/wrongness of any given action or practice within the applied ethical world. Such an approach sounds more than reasonable until a third approach is considered, which will be discussed below.

What if, though, some moral feature of a purported moral principle worked in such a way that it counted for the permissibility of an action in one case, case 1, but counted against the permissibility of the same action in another case, case 2?  What should we say here?  An example would be helpful. Suppose that Jon has to hit Candy to get candy, and suppose that this counts as a morally good thing. Then the very same act of Jon hitting Candy to get candy, in a different context, could be a morally bad thing. This example is supposed to highlight the third theoretical possibility, that of moral particularism (Dancy, 1993).

To sum things up for applied ethics, it very much matters what theoretical approach one takes. Does one take the top-down approach of going with a normative/ethical theory to apply to specific actions/practices?  Or does one go with a pluralist approach?  Or does one go with a particularistic approach that requires, essentially, examining things case by case?

Finally, some things concerning moral psychology should be discussed. Moral psychology deals with understanding how we should evaluate the actual moral judgments of actual moral agents, in light of the very real contexts under which they are made. Additionally, moral psychology tries to understand the limits of the actions of human beings in relation to their environment, the context under which they act and live. (Notice that according to this definition, the multicultural relativity of practices and actions has to be accounted for, as differences in actions/practices might be due to differences in environments.)  Experiments from social psychology confirm the idea that how people behave is determined by their environment; for example, we have the Milgram experiment and the Stanford prison experiment. We might not expect people to act in such gruesome ways, but according to such experiments, if you place them in certain conditions, those conditions will provoke ugly responses. Two reasons these findings are important for applied ethics are:  (i) if you place persons in these conditions, you get non-ideal moral results, and (ii) our judgments about what to morally avoid or prevent are misguided if we do not keep in mind the findings of such experiments. If we kept in mind the fragility of human behavior relative to conditions and environments, we might get closer to eradicating such conditions and environments, and the subsequent bad results.

8. References and Further Reading

  • Allhoff, Fritz, and Vaidya, Anand J. “Business in Ethical Focus”. (2008), Broadview.
  • Andrews, Kristen. “The First Step in the Case for Great Ape Equality: The Argument for Other Minds.”  (1996), Etica & Animali.
  • Beauchamp, Tom, and Bowie, Norman. “Ethical Theory and Business.”  (1983), Prentice-Hall.
  • Boylan, Michael. “A Just Society.” (2004), Lanham, MD: Rowman & Littlefield.
  • Boylan, Michael. “Morality and Global Justice: Justifications and Applications.”  (2011), Westview.
  • Boylan, Michael   “Morality and Global Justice:  Reader.”  (2011), Westview.
  • Brody, Baruch. “Ethical Issues in Clinical Trials in Developing Countries.”  (2002), Statistics in Medicine (vol. 2), John Wiley & Sons.
  • Callahan, Joan. “Ethical Issues in Professional Life.”  (1988), Oxford.
  • Callicott, J. Baird. “Earth’s Insights.”  (1994), University of California Press.
  • Carr, Albert Z. “Is Business Bluffing Ethical?” (1968), Harvard Business Review.
  • Chadwick, Ruth; Kuhse, Helga; Landman, Willem; Schuklenk, Udo; Singer, Peter. “The Bioethics Reader: Editor’s Choice.”  (2007), Blackwell.
  • Cohen, Carl. “The Case for the Use of Animals in Biomedical Research.”  (1986), New England Journal of Medicine.
  • Danley, John. “Corporate Moral Agency: The Case for Anthropological Bigotry”. (1980), Action and Responsibility: Bowling Green Studies in Applied Philosophy, vol. 2.
  • Elliot, Robert. “Environmental Ethics.”  (1996), Oxford.
  • Freeman, R. Edward. “A Stakeholder Theory of the Modern Corporation.”  (1994)
  • French, Peter. “The Corporation as a Moral Person.”  (1979), American Philosophical Quarterly.
  • Friedman, Milton. “The Social Responsibility of Business Is to Increase Its Profits.”  (1970), New York Times Magazine.
  • Glantz, Leonard; Annas, George J.; Grodin, Michael A.; Mariner, Wendy K. “Research in Developing Countries: Taking Benefit Seriously.”  (1998), Hastings Center Report.
  • Hellman, Samuel; Hellman, Deborah S. “Of Mice But Not Men:  Problems of the Randomized Clinical Trial.”  (1991), The New England Journal of Medicine.
  • Holm, Soren. “Going to the Roots of the Stem Cell Controversy.”  In The Bioethics Reader, Chadwick, et al. (2007), Blackwell.
  • Hursthouse, Rosalind. “Virtue Theory and Abortion.”  (1991), Philosophy & Public Affairs.
  • Kamm, Frances M. “Creation and Abortion.” (1996), Oxford.
  • LaFollette, Hugh. “The Oxford Handbook of Practical Ethics.” (2003), Oxford.
  • Keller, David R. “Environmental Ethics: The Big Questions.”  (2010), Wiley-Blackwell.
  • Mappes, Thomas, and DeGrazia, David. “Biomedical Ethics.”  6th ed. (2006), McGraw-Hill.
  • Marquis, Don. “How to Resolve an Ethical Dilemma Concerning Randomized Clinical Trials”. (1999), New England Journal of Medicine.
  • Martin, Mike W; Schinzinger, Roland. “Ethics in Engineering.”  4th ed. (2005), McGraw-Hill.
  • McMahan, Jeff. “The Ethics of Killing.”  (2002), Oxford.
  • Merrill, John C. “The Imperative of Freedom: A Philosophy of Journalistic Autonomy.” (1974) Freedom House.
  • Narveson, Jan. “Moral Matters.”  (1993), Broadview Press.
  • Nielsen, Kai. “Radical Egalitarian Justice: Justice as Equality.”  Social Theory and Practice, (1979).
  • Nozick, Robert. “Anarchy, State, and Utopia.”  (1974), Basic Books.
  • Nussbaum, Martha C. “Sex and Social Justice.”  (1999), New York: Oxford University Press.
  • Pogge, Thomas W. “An Egalitarian Law of Peoples.” Philosophy and Public Affairs, (1994)
  • Pojman, Louis P; Pojman, Paul. “Environmental Ethics.”  6th ed. (2011), Cengage.
  • Prinz, Jesse. “The Emotional Construction of Morals.”  (2007), Oxford.
  • Rachels, James. “Ethical Theory 1: The Question of Objectivity.”  (1998), Oxford.
  • Rachels, James. “Ethical Theory 2: Theories about How We Should Live.”  (1998), Oxford.
  • Rachels, James. “The Elements of Moral Philosophy.”  McGraw Hill.
  • Rachels, James. “The Right Thing to Do.”  McGraw Hill.
  • Rachels, James. “Active and Passive Euthanasia.”  (1975), New England Journal of Medicine.
  • Rawls, John. “A Theory of Justice.”  (1971), Harvard.
  • Rolston III, Holmes. “A New Environmental Ethics.”  (2012), Routledge.
  • Sandel, Michael J. “Liberalism and the Limits of Justice.”  (1982), New York: Cambridge University Press.
  • Singer, Peter. “Practical Ethics.”  (1979), Oxford.
  • Singer, Peter. “Animal Liberation.”  (1975), Oxford.
  • Shaw, William H. “Business Ethics: A Textbook with Cases.”  (2011), Wadsworth.
  • Thomson, Judith Jarvis. “A Defense of Abortion.” (1971), Philosophy & Public Affairs.
  • Unger, Peter. “Living High and Letting Die.”  (1996), Oxford.


Author Information

Joel Dittmer
Email: joel.p.dittmer@gmail.com
U. S. A.

The Evidential Problem of Evil

The evidential problem of evil is the problem of determining whether and, if so, to what extent the existence of evil (or certain instances, kinds, quantities, or distributions of evil) constitutes evidence against the existence of God, that is to say, a being perfect in power, knowledge and goodness. Evidential arguments from evil attempt to show that, once we put aside any evidence there might be in support of the existence of God, it becomes unlikely, if not highly unlikely, that the world was created and is governed by an omnipotent, omniscient, and wholly good being. Such arguments are not to be confused with logical arguments from evil, which have the more ambitious aim of showing that, in a world in which there is evil, it is logically impossible—and not just unlikely—that God exists.

This entry begins by clarifying some important concepts and distinctions associated with the problem of evil, before providing an outline of one of the more forceful and influential evidential arguments developed in contemporary times, namely, the evidential argument advanced by William Rowe. Rowe’s argument has occasioned a range of responses from theists, including the so-called “skeptical theist” critique (according to which God’s ways are too mysterious for us to comprehend) and the construction of various theodicies, that is, explanations as to why God permits evil. These and other responses to the evidential problem of evil are here surveyed and assessed.

Table of Contents

  1. Background to the Problem of Evil
    1. Orthodox Theism
    2. Good and Evil
    3. Versions of the Problem of Evil
  2. William Rowe’s Evidential Argument from Evil
    1. An Outline of Rowe’s Evidential Argument
    2. The Theological Premise
    3. The Factual Premise
      1. Rowe’s Case in Support of the Factual Premise
      2. The Inference from P to Q
  3. The Skeptical Theist Response
    1. Wykstra’s CORNEA Critique
    2. Wykstra’s Parent Analogy
    3. Alston’s Analogies
  4. Building a Theodicy, or Casting Light on the Ways of God
    1. What is a Theodicy?
    2. Distinguishing a “Theodicy” from a “Defence”
    3. Sketch of a Theodicy
  5. Further Responses to the Evidential Problem of Evil
  6. Conclusion
  7. References and Further Reading

1. Background to the Problem of Evil

Before delving into the deep and often murky waters of the problem of evil, it will be helpful to provide some philosophical background to this venerable subject. The first and perhaps most important step of this stage-setting process will be to identify and clarify the conception of God that is normally presupposed in contemporary debates (at least within the Anglo-American analytic tradition) on the problem of evil. The next step will involve providing an outline of some important concepts and distinctions, in particular the age-old distinction between “good” and “evil,” and the more recent distinction between the logical problem of evil and the evidential problem of evil.

a. Orthodox Theism

The predominant conception of God within the western world, and hence the kind of deity that is normally the subject of debate in discussions on the problem of evil in most western philosophical circles, is the God of “orthodox theism.” According to orthodox theism, there exists just one God, this God being a person or person-like. The operative notion, however, behind this form of theism is that God is perfect, where to be perfect is to be the greatest being possible or, to borrow Anselm’s well-known phrase, the being than which none greater can be conceived. (Such a conception of God forms the starting-point in what has come to be known as “perfect being theology”; see Morris 1987, 1991, and Rogers 2000). On this view, God, as an absolutely perfect being, must possess the following perfections or great-making qualities:

  1. omnipotence: This refers to God’s ability to bring about any state of affairs that is logically possible in itself as well as logically consistent with his other essential attributes.
  2. omniscience: God is omniscient in that he knows all truths or knows all that is logically possible to know.
  3. perfect goodness: God is the source of moral norms (as in divine command ethics) or always acts in complete accordance with moral norms.
  4. aseity: God has aseity (literally, being from oneself, a se esse) – that is to say, he is self-existent or ontologically independent, for he does not depend either for his existence or for his characteristics on anything outside himself.
  5. incorporeality: God has no body; he is a non-physical spirit but is capable of affecting physical things.
  6. eternity: Traditionally, God is thought to be eternal in an atemporal sense—that is, God is timeless or exists outside of time (a view upheld by Augustine, Boethius, and Aquinas). On an alternative view, God’s eternality is held to be temporal in nature, so that God is everlasting or exists in time, having infinite temporal duration in both of the two temporal directions.
  7. omnipresence: God is wholly present in all space and time. This is often interpreted metaphorically to mean that God can bring about an event immediately at any place and time, and knows what is happening at every place and time in the same immediate manner.
  8. perfectly free: God is absolutely free either in the sense that nothing outside him can determine him to perform a particular action, or in the sense that it is always within his power not to do what he does.
  9. alone worthy of worship and unconditional commitment: God, being the greatest being possible, is the only being fit to be worshipped and the only being to whom one may commit one’s life without reservation.

The God of traditional theism is also typically accorded a further attribute, one that he is thought to possess only contingently:

  1. creator and sustainer of the world: God brought the (physical and non-physical) world into existence, and also keeps the world and every object within it in existence. Thus, no created thing could exist at a given moment unless it were at that moment held in existence by God. Further, no created thing could have the causal powers and liabilities it has at a given moment unless it were at that moment supplied with those powers and liabilities by God.

According to orthodox theism, God was free not to create a world. In other words, there is at least one possible world in which God creates nothing at all. But then God is a creator only contingently, not necessarily. (For a more comprehensive account of the properties of the God of orthodox theism, see Swinburne 1977, Quinn & Taliaferro 1997: 223-319, and Hoffman & Rosenkrantz 2002.)

b. Good and Evil

Clarifying the underlying conception of God is but the first step in clarifying the nature of the problem of evil. To arrive at a more complete understanding of this vexing problem, it is necessary to unpack further some of its philosophical baggage. I turn, therefore, to some important concepts and distinctions associated with the problem of evil, beginning with the ideas of “good” and “evil.”

The terms “good” and “evil” are, if nothing else, notoriously difficult to define. Some account, however, can be given of these terms as they are employed in discussions of the problem of evil. Beginning with the notion of evil, this is normally given a very wide extension so as to cover everything that is negative and destructive in life. The ambit of evil will therefore include such categories as the bad, the unjust, the immoral, and the painful. An analysis of evil in this broad sense may proceed as follows:

An event may be categorized as evil if it involves any of the following:

  1. some harm (whether it be minor or great) being done to the physical and/or psychological well-being of a sentient creature;
  2. the unjust treatment of some sentient creature;
  3. loss of opportunity resulting from premature death;
  4. anything that prevents an individual from leading a fulfilling and virtuous life;
  5. a person doing that which is morally wrong;
  6. the “privation of good.”

Condition (a) captures what normally falls under the rubric of pain as a physical state (for example, the sensation you feel when you have a toothache or broken jaw) and suffering as a mental state in which we wish that our situation were otherwise (for example, the experience of anxiety or despair). Condition (b) introduces the notion of injustice, so that the prosperity of the wicked, the demise of the virtuous, and the denial of voting rights or employment opportunities to women and blacks would count as evils. The third condition is intended to cover cases of untimely death, that is to say, death not brought about by the ageing process alone. Death of this kind may result in loss of opportunity either in the sense that one is unable to fulfill one’s potential, dreams or goals, or merely in the sense that one is prevented from living out the full term of one’s natural life. This is partly why we consider it a great evil if an infant were killed on impact with a train at full speed, even if the infant experienced no pain or suffering in the process. Condition (d) classifies as evil anything that inhibits one from leading a life that is both fulfilling and virtuous – poverty and prostitution would be cases in point. Condition (e) relates evil to immoral choices or acts. And the final condition expresses the idea, prominent in Augustine and Aquinas, that evil is not a substance or entity in its own right, but a privatio boni: the absence or lack of some good power or quality which a thing by its nature ought to possess.

Paralleling the above analysis of evil, the following account of “good” may be offered:

An event may be categorized as good if it involves any of the following:

  1. some improvement (whether it be minor or great) in the physical and/or psychological well-being of a sentient creature;
  2. the just treatment of some sentient creature;
  3. anything that advances the degree of fulfillment and virtue in an individual’s life;
  4. a person doing that which is morally right;
  5. the optimal functioning of some person or thing, so that it does not lack the full measure of being and goodness that ought to belong to it.

Turning to the many varieties of evil, the following have become standard in the literature:

Moral evil. This is evil that results from the misuse of free will on the part of some moral agent in such a way that the agent thereby becomes morally blameworthy for the resultant evil. Moral evil therefore includes specific acts of intentional wrongdoing such as lying and murdering, as well as defects in character such as dishonesty and greed.

Natural evil. In contrast to moral evil, natural evil is evil that results from the operation of natural processes, in which case no human being can be held morally accountable for the resultant evil. Classic examples of natural evil are natural disasters such as cyclones and earthquakes that result in enormous suffering and loss of life, illnesses such as leukemia and Alzheimer’s, and disabilities such as blindness and deafness.

An important qualification, however, must be made at this point. A great deal of what normally passes as natural evil is brought about by human wrongdoing or negligence. For example, lung cancer may be caused by heavy smoking; the loss of life occasioned by some earthquakes may be largely due to irresponsible city planners siting buildings on faults that will ultimately heave and split; and some droughts and floods might have been prevented but for the careless way we have treated our planet. As it is the misuse of free will that has caused these evils or contributed to their occurrence, it seems best to regard them as moral evils and not natural evils. In the present work, therefore, a natural evil will be defined as an evil resulting solely or chiefly from the operation of the laws of nature. Alternatively, and perhaps more precisely, an evil will be deemed a natural evil only if no non-divine agent can be held morally responsible for its occurrence. Thus, a flood caused by human pollution of the environment will be categorized as a natural evil as long as the agents involved could not be held morally responsible for the resultant evil, which would be the case if, for instance, they could not reasonably be expected to have foreseen the consequences of their behavior.

A further category of evil that has recently played an important role in discussions on the problem of evil is horrendous evil. This may be defined, following Marilyn Adams (1999: 26), as evil “the participation in which (that is, the doing or suffering of which) constitutes prima facie reason to doubt whether the participant’s life could (given their inclusion in it) be a great good to him/her on the whole.” As examples of such evil, Adams lists “the rape of a woman and axing off of her arms, psycho-physical torture whose ultimate goal is the disintegration of personality, betrayal of one’s deepest loyalties, child abuse of the sort described by Ivan Karamazov, child pornography, parental incest, slow death by starvation, the explosion of nuclear bombs over populated areas” (p.26).

A horrendous evil, it may be noted, may be either a moral evil (for example, the Holocaust of 1939-45) or a natural evil (for example, the Lisbon earthquake of 1755). It is also important to note that it is the notion of a “horrendous moral evil” that comports with the current, everyday use of “evil” by English speakers. When we ordinarily employ the word “evil” today we do not intend to pick out something that is merely bad or very wrong (for example, a burglary), nor do we intend to refer to the death and destruction brought about by purely natural processes (we do not, for example, think of the 2004 Asian tsunami disaster as something that was “evil”). Instead, the word “evil” is reserved in common usage for events and people that have an especially horrific moral quality or character.

Clearly, the problem of evil is at its most difficult when stated in terms of horrendous evil (whether of the moral or natural variety), and as will be seen in Section II below, this is precisely how William Rowe’s statement of the evidential problem of evil is formulated.

Finally, these notions of good and evil indicate that the problem of evil is intimately tied to ethics. One’s underlying ethical theory may have a bearing on one’s approach to the problem of evil in at least two ways.

Firstly, one who accepts either a divine command theory of ethics or non-realism in ethics is in no position to raise the problem of evil, that is, to offer the existence of evil as at least a prima facie good reason for rejecting theism. This is because a divine command theory, in taking morality to be dependent upon the will of God, already assumes the truth of that which is in dispute, namely, the existence of God (see Brown 1967). On the other hand, non-realist ethical theories, such as moral subjectivism and error-theories of ethics, hold that there are no objectively true moral judgments. But then a non-theist who also happens to be a non-realist in ethics cannot help herself to some of the central premises found in evidential arguments from evil (such as “If there were a perfectly good God, he would want a world with no horrific evil in it”), as these purport to be objectively true moral judgments (see Nelson 1991). This is not to say, however, that atheologians such as David Hume, Bertrand Russell and J.L. Mackie, each of whom supported non-realism in ethics, were contradicting their own meta-ethics when raising arguments from evil – at least if their aim was only to show up a contradiction in the theist’s set of beliefs.

Secondly, the particular normative ethical theory one adopts (for example, consequentialism, deontology, virtue ethics) may influence the way in which one formulates or responds to an argument from evil. Indeed, some have gone so far as to claim that evidential arguments from evil usually presuppose the truth of consequentialism (see, for example, Reitan 2000). Even if this is not so, it seems that the adoption of a particular theory in normative ethics may render the problem of evil easier or harder, or at least delimit the range of solutions available. (For an excellent account of the difficulties faced by theists in relation to the problem of evil when the ethical framework is restricted to deontology, see McNaughton 1994.)

c. Versions of the Problem of Evil

The problem of evil may be described as the problem of reconciling belief in God with the existence of evil. But the problem of evil, like evil itself, has many faces. It may, for example, be expressed either as an experiential problem or as a theoretical problem. In the former case, the problem is the difficulty of adopting or maintaining an attitude of love and trust toward God when confronted by evil that is deeply perplexing and disturbing. Alvin Plantinga (1977: 63-64) provides an eloquent account of this problem:

The theist may find a religious problem in evil; in the presence of his own suffering or that of someone near to him he may find it difficult to maintain what he takes to be the proper attitude towards God. Faced with great personal suffering or misfortune, he may be tempted to rebel against God, to shake his fist in God’s face, or even to give up belief in God altogether… Such a problem calls, not for philosophical enlightenment, but for pastoral care. (emphasis in the original)

By contrast, the theoretical problem of evil is the purely “intellectual” matter of determining what impact, if any, the existence of evil has on the truth-value or the epistemic status of theistic belief. To be sure, these two problems are interconnected – theoretical considerations, for example, may color one’s actual experience of evil, as happens when suffering that is better comprehended becomes easier to bear. In this article, however, the focus will be exclusively on the theoretical dimension. This aspect of the problem of evil comes in two broad varieties: the logical problem and the evidential problem.

The logical version of the problem of evil (also known as the a priori version and the deductive version) is the problem of removing an alleged logical inconsistency between certain claims about God and certain claims about evil. J.L. Mackie (1955: 200) provides a succinct statement of this problem:

In its simplest form the problem is this: God is omnipotent; God is wholly good; and yet evil exists. There seems to be some contradiction between these three propositions, so that if any two of them were true the third would be false. But at the same time all three are essential parts of most theological positions: the theologian, it seems, at once must adhere and cannot consistently adhere to all three. (emphases in the original)

In a similar vein, H.J. McCloskey (1960: 97) frames the problem of evil as follows:

Evil is a problem for the theist in that a contradiction is involved in the fact of evil, on the one hand, and the belief in the omnipotence and perfection of God on the other. (emphasis mine)

Atheologians like Mackie and McCloskey, in maintaining that the logical problem of evil provides conclusive evidence against theism, are claiming that theists are committed to an internally inconsistent set of beliefs and hence that theism is necessarily false. More precisely, it is claimed that theists commonly accept the following propositions:

  11. God exists
  12. God is omnipotent
  13. God is omniscient
  14. God is perfectly good
  15. Evil exists.

Propositions (11)-(14) form an essential part of the orthodox conception of God, as this has been explicated in Section 1 above. But theists typically believe that the world contains evil. The charge, then, is that this commitment to (15) is somehow incompatible with the theist’s commitment to (11)-(14). Of course, (15) can be specified in a number of ways – for example, (15) may refer to the existence of any evil at all, or a certain amount of evil, or particular kinds of evil, or some perplexing distributions of evil. In each case, a different version of the logical problem of evil, and hence a distinct charge of logical incompatibility, will be generated.

The alleged incompatibility, however, is not obvious or explicit. Rather, the claim is that propositions (11)-(15) are implicitly contradictory, where a set S of propositions is implicitly contradictory if there is a necessary proposition p such that the conjunction of p with S constitutes a formally contradictory set. Those who advance logical arguments from evil must therefore add one or more necessary truths to the above set of five propositions in order to generate the fatal contradiction. By way of illustration, consider the following additional propositions that may be offered:

  16. A perfectly good being would want to prevent all evils.
  17. An omniscient being knows every way in which evils can come into existence.
  18. An omnipotent being who knows every way in which an evil can come into existence has the power to prevent that evil from coming into existence.
  19. A being who knows every way in which an evil can come into existence, who is able to prevent that evil from coming into existence, and who wants to do so, would prevent the existence of that evil.

From this set of auxiliary propositions, it clearly follows that

  20. If there exists an omnipotent, omniscient, and perfectly good being, then no evil exists.

It is not difficult to see how the addition of (16)-(20) to (11)-(15) will yield an explicit contradiction, namely,

  21. Evil exists and evil does not exist.

If such an argument is sound, theism would not merely lack evidential support; it would be, as Mackie (1955: 200) puts it, “positively irrational.” For more discussion, see the article The Logical Problem of Evil.
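The deduction just described can be set out schematically. The abbreviations G (“an omnipotent, omniscient, perfectly good being exists,” that is, the conjunction of (11)–(14)) and E (“evil exists,” that is, (15)) are introduced here for compactness and are not part of the original argument:

```latex
% Step 1: the auxiliary premises (16)-(19) jointly entail the conditional (20).
\[
(16),\,(17),\,(18),\,(19) \;\vdash\; G \rightarrow \neg E \tag{20}
\]
% Step 2: the theistic commitments (11)-(14) supply G, so modus ponens gives:
\[
G,\; G \rightarrow \neg E \;\vdash\; \neg E
\]
% Step 3: conjoining this with (15), i.e. E, yields the explicit contradiction (21).
\[
E,\; \neg E \;\vdash\; E \wedge \neg E \tag{21}
\]
```

This is only a compressed sketch of the argument’s logical skeleton; the philosophical work, of course, lies in defending the auxiliary premises as necessary truths.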

The subject of this article, however, is the evidential version of the problem of evil (also called the a posteriori version and the inductive version), which seeks to show that the existence of evil, although logically consistent with the existence of God, counts against the truth of theism. As with the logical problem, evidential formulations may be based on the sheer existence of evil, or certain instances, types, amounts, or distributions of evil. Evidential arguments from evil may also be classified according to whether they employ (i) a direct inductive approach, which aims at showing that evil counts against theism, but without comparing theism to some alternative hypothesis; or (ii) an indirect inductive approach, which attempts to show that some significant set of facts about evil counts against theism, and it does this by identifying an alternative hypothesis that explains these facts far more adequately than the theistic hypothesis. The former strategy, as will be seen in Section II, is employed by William Rowe, while the latter strategy is exemplified best in Paul Draper’s 1989 paper, “Pain and Pleasure: An Evidential Problem for Theists”. (A useful taxonomy of evidential arguments from evil can be found in Russell 1996: 194 and Peterson 1998: 23-27, 69-72.)

Evidential arguments purport to show that evil counts against theism in the sense that the existence of evil lowers the probability that God exists. The strategy here is to begin by putting aside any positive evidence we might think there is in support of theism (for example, the fine-tuning argument) as well as any negative evidence we might think there is against theism (that is, any negative evidence other than the evidence of evil). We therefore begin with a “level playing field” by setting the probability of God’s existing at 0.5 and the probability of God’s not existing at 0.5 (compare Rowe 1996: 265-66; it is worth noting, however, that this “level playing field” assumption is not entirely uncontroversial: see, for example, the objections raised by Jordan 2001 and Otte 2002: 167-68). The aim is to then determine what happens to the probability value of “God exists” once we consider the evidence generated by our observations of the various evils in our world. The central question, therefore, is: Grounds for belief in God aside, does evil render the truth of atheism more likely than the truth of theism? (A recent debate on the evidential problem of evil was couched in such terms: see Rowe 2001a: 124-25.) Proponents of evidential arguments are therefore not claiming that, even if we take into account any positive reasons there are in support of theism, the evidence of evil still manages to lower the probability of God’s existence. They are only making the weaker claim that, if we temporarily set aside such positive reasons, then it can be shown that the evils that occur in our world push the probability of God’s existence significantly downward.

But if evil counts against theism by driving down the probability value of “God exists” then evil constitutes evidence against the existence of God. Evidential arguments, therefore, claim that there are certain facts about evil that cannot be adequately explained on a theistic account of the world. Theism is thus treated as a large-scale hypothesis or explanatory theory which aims to make sense of some pertinent facts, and to the extent that it fails to do so it is disconfirmed.
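The talk of “lowering the probability” of theism can be made precise with Bayes’ theorem. Writing G for “God exists” and E for the evidence of evil (shorthand introduced here, not notation used in the literature being discussed):

```latex
% With the level playing field, Pr(G) = Pr(not-G) = 0.5, Bayes' theorem gives:
\[
\Pr(G \mid E) \;=\; \frac{\Pr(E \mid G)\,\Pr(G)}{\Pr(E \mid G)\,\Pr(G) + \Pr(E \mid \neg G)\,\Pr(\neg G)}
\]
% Since the priors are equal, they cancel, and the posterior falls below one half
% just in case evil is more to be expected on atheism than on theism:
\[
\Pr(G \mid E) < \tfrac{1}{2} \quad \text{if and only if} \quad \Pr(E \mid G) < \Pr(E \mid \neg G)
\]
```

On this construal, the evidential arguer’s burden is to show that the observed facts about evil are more to be expected on the denial of theism than on theism itself.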

In evidential arguments, however, the evidence only probabilifies its conclusion, rather than conclusively verifying it. The probabilistic nature of such arguments manifests itself in the form of a premise to the effect that “It is probably the case that some instance (or type, or amount, or pattern) of evil E is gratuitous.” This probability judgment usually rests on the claim that, even after careful reflection, we can see no good reason for God’s permission of E. The inference from this claim to the judgment that there exists gratuitous evil is inductive in nature, and it is this inductive step that sets the evidential argument apart from the logical argument.

2. William Rowe’s Evidential Argument from Evil

Evidential arguments from evil seek to show that the presence of evil in the world inductively supports or makes likely the claim that God (or, more precisely, the God of orthodox theism) does not exist. A variety of evidential arguments have been formulated in recent years, but here I will concentrate on one very influential formulation, namely, that provided by William Rowe. Rowe’s version of the evidential argument has received much attention since its formal inception in 1978, for it is often considered to be the most cogent presentation of the evidential problem of evil. James Sennett (1993: 220), for example, views Rowe’s argument as “the clearest, most easily understood, and most intuitively appealing of those available.” Terry Christlieb (1992: 47), likewise, thinks of Rowe’s argument as “the strongest sort of evidential argument, the sort that has the best chance of success.” It is important to note, however, that Rowe’s thinking on the evidential problem of evil has developed in significant ways since his earliest writings on the subject, and two (if not three) distinct evidential arguments can be identified in his work. Here I will only discuss that version of Rowe’s argument that received its first full-length formulation in Rowe (1978) and, most famously, in Rowe (1979), and was successively refined in the light of criticisms in Rowe (1986), (1988), (1991), and (1995), before being abandoned in favour of a quite different evidential argument in Rowe (1996).

a. An Outline of Rowe’s Argument

In presenting his evidential argument from evil in his justly celebrated 1979 paper, “The Problem of Evil and Some Varieties of Atheism”, Rowe thinks it best to focus on a particular kind of evil that is found in our world in abundance. He therefore selects “intense human and animal suffering”: it occurs on a daily basis, it is found in great plenitude in our world, and it is a clear case of evil. More precisely, it is a case of intrinsic evil: it is bad in and of itself, even though it sometimes is part of, or leads to, some good state of affairs (Rowe 1979: 335). Rowe then proceeds to state his argument for atheism as follows:

  1. There exist instances of intense suffering which an omnipotent, omniscient being could have prevented without thereby losing some greater good or permitting some evil equally bad or worse.
  2. An omniscient, wholly good being would prevent the occurrence of any intense suffering it could, unless it could not do so without thereby losing some greater good or permitting some evil equally bad or worse.
  3. (Therefore) There does not exist an omnipotent, omniscient, wholly good being. (Rowe 1979: 336)

This argument, as Rowe points out, is clearly valid, and so if there are rational grounds for accepting its premises, to that extent there are rational grounds for accepting the conclusion, that is to say, atheism.

b. The Theological Premise

The second premise is sometimes called “the theological premise” as it expresses a belief about what God as a perfectly good being would do under certain circumstances. In particular, this premise states that if such a being knew of some intense suffering that was about to take place and was in a position to prevent its occurrence, then it would prevent it unless it could not do so without thereby losing some greater good or permitting some evil equally bad or worse. Put otherwise, an omnipotent, omniscient, wholly good God would not permit any gratuitous evil, evil that is (roughly speaking) avoidable, pointless, or unnecessary with respect to the fulfillment of God’s purposes.

Rowe takes the theological premise to be the least controversial aspect of his argument. And the consensus seems to be that Rowe is right – the theological premise, or a version thereof that is immune from some minor infelicities in the original formulation, is usually thought to be indisputable, self-evident, necessarily true, or something of that ilk. The intuition here, as the Howard-Snyders (1999: 115) explain, is that “on the face of it, the idea that God may well permit gratuitous evil is absurd. After all, if God can get what He wants without permitting some particular horror (or anything comparably bad), why on earth would He permit it?”

An increasing number of theists, however, are beginning to question Rowe’s theological premise. This way of responding to the evidential problem of evil has been described by Rowe as “radical, if not revolutionary” (1991: 79), but it is viewed by many theists as the only way to remain faithful to the common human experience of evil, according to which utterly gratuitous evil not only exists but is abundant. In particular, some members of the currently popular movement known as open theism have rallied behind the idea that the theistic worldview is not only compatible with, but requires or demands, the possibility that there is gratuitous evil (for the movement’s “manifesto,” see Pinnock et al. 1994; see also Sanders 1998, Boyd 2000, and Hasker 2004).

Although open theists accept the orthodox conception of God, as delineated in Section 1.a above, they offer a distinct account of some of the properties that are constitutive of the orthodox God. Most importantly, open theists interpret God’s omniscience in such a way that it does not include either foreknowledge (or, more specifically, knowledge of what free agents other than God will do) or middle knowledge (that is, knowledge of what every possible free creature would freely choose to do in any possible situation in which that creature might find itself). This view is usually contrasted with two other forms of orthodox theism: Molinism (named after the sixteenth-century Jesuit theologian Luis de Molina, who developed the theory of middle knowledge), according to which divine omniscience encompasses both foreknowledge and middle knowledge; and Calvinism or theological determinism, according to which God determines or predestines all that happens, thus leaving us with either no morally relevant free will at all (hard determinism) or free will of the compatibilist sort only (soft determinism).

It is often thought that the Molinist and Calvinist grant God greater providential control over the world than does the open theist. For according to the latter but not the former, the future is to some degree open-ended in that not even God can know exactly how it will turn out, given that he has created a world in which there are agents with libertarian free will and, perhaps, indeterminate natural processes. God therefore runs the risk that his creation will come to be infested with gratuitous evils, that is to say, evils he has not intended, decreed, planned for, or even permitted for the sake of some greater good. Open theists, however, argue that this risk is kept in check by God’s adoption of various general strategies by which he governs the world. God may, for example, set out to create a world in which there are creatures who have the opportunity to freely choose their destiny, but he would then ensure that adequate recompense is offered (perhaps in an afterlife) to those whose lives are ruined (through no fault of their own) by the misuse of others’ freedom (for example, a child that is raped and murdered). Nevertheless, in creating creatures with (libertarian) free will and by infusing the natural order with a degree of indeterminacy, God relinquishes exhaustive knowledge and complete control of all history. The open theist therefore encourages the rejection of what has been called “meticulous providence” (Peterson 1982: chs 4 & 5) or “the blueprint worldview” (Boyd 2003: ch.2), the view that the world was created according to a detailed divine blueprint which assigns a specific divine reason for every occurrence in history. In place of this view, the open theist presents us with a God who is a risk-taker, a God who gives up meticulous control of everything that happens, thus opening himself up to the genuine possibility of failure and disappointment – that is to say, to the possibility of gratuitous evil.

Open theism has sparked much heated debate and has been attacked from many quarters. Considered, however, as a response to Rowe’s theological premise, open theism’s prospects seem dim. The problem here, as critics have frequently pointed out, is that the open view of God tends to diminish one’s confidence in God’s ability to ensure that his purposes for an individual’s life, or for world history, will be accomplished (see, for example, Ware 2000, Ascol 2001: 176-80). The worry is that if, as open theists claim, God does not exercise sovereign control over the world and the direction of human history is open-ended, then it seems that the world is left to the mercy of Tyche or Fortuna, and we are therefore left with no assurance that God’s plan for the world and for us will succeed. Consider, for example, Eleonore Stump’s rhetorical questions, put in response to the idea of a “God of chance” advocated in van Inwagen (1988): “Could one trust such a God with one’s child, one’s life? Could one say, as the Psalmist does, ‘I will both lay me down in peace and sleep, for thou, Lord, only makest me dwell in safety’?” (1997: 466, quoting from Psalm 4:8). The answer may in large part depend on the degree to which the world is thought to be imbued with indeterminacy or chance.

If, for example, the open theist view introduces a high level of chance into God’s creation, this would raise the suspicion that the open view reflects an excessively deistic conception of God’s relation to the world. Deism is popularly thought of as the view that a supreme being created the world but then, like an absentee landlord, left it to run on its own accord. Deists, therefore, are often accused of postulating a remote and indifferent God, one who does not exercise providential care over his creation. Such a deity, it might be objected, resembles the open theist’s God of chance. The objection, in other words, is that open theists postulate a dark and risky universe subject to the forces of blind chance, and that it is difficult to imagine a personal God—that is, a God who seeks to be personally related to us and hence wants us to develop attitudes of love and trust towards him—providing us with such a habitat. To paraphrase Einstein, God does not play dice with our lives.

This, however, need not mean that God does not play dice at all. It is not impossible, in other words, to accommodate chance within a theistic world-view. To see this, consider a particular instance of moral evil: the rape and murder of a little girl. It seems plausible that no explanation is available as to why God would permit this specific evil (or, more precisely, why God would permit this girl to suffer then and there and in that way), since any such explanation that is offered will inevitably recapitulate the explanation offered for at least one of the major evil-kinds that subsumes the particular evil in question (for example, the class of moral evils). It is therefore unreasonable to request a reason (even a possible reason) for God’s permission of a particular event that is specific to this event and that goes beyond some general policy or plan God might have for permitting events of that kind. If this is correct, then there is room for theists to accept the view that at least some evils are chancy or gratuitous in the sense that there is no specific reason as to why these evils are permitted by God. However, this kind of commitment to gratuitous evil is entirely innocuous for proponents of Rowe’s theological premise. For one can simply modify this premise so that it ranges either over particular instances of evil or (to accommodate cases where particular evils admit of no divine justification) over broadly defined evils or evil-kinds under which the relevant particular evils can be subsumed. And so a world created by God may be replete with gratuitous evil, as open theists imagine, but that need not present a problem for Rowe.

(For a different line of argument in support of the compatibility of theism and gratuitous evil, see Hasker (2004: chs 4 & 5), who argues that the consequences for morality would be disastrous if we took Rowe’s theological premise to be true. For criticisms of this view, see Rowe (1991: 79-86), Chrzan (1994), O’Connor (1998: 53-70), and Daniel and Frances Howard-Snyder (1999: 119-27).)

c. The Factual Premise

Criticisms of Rowe’s argument tend to focus on its first premise, sometimes dubbed “the factual premise,” as it purports to state a fact about the world. Briefly put, the fact in question is that there exist instances of intense suffering which are gratuitous or pointless. As indicated above, an instance of suffering is gratuitous, according to Rowe, if an omnipotent, omniscient being could have prevented it without thereby losing some greater good or permitting some evil equally bad or worse. A gratuitous evil, in this sense, is a state of affairs that is not (logically) necessary to the attainment of a greater good or to the prevention of an evil at least as bad.

i. Rowe’s Case in Support of the Factual Premise

Rowe builds his case in support of the factual premise by appealing to particular instances of human and animal suffering, such as the following:

E1: the case of Bambi
“In some distant forest lightning strikes a dead tree, resulting in a forest fire. In the fire a fawn is trapped, horribly burned, and lies in terrible agony for several days before death relieves its suffering” (Rowe 1979: 337).

Although this is presented as a hypothetical event, Rowe takes it to be “a familiar sort of tragedy, played not infrequently on the stage of nature” (1988: 119).

E2: the case of Sue
This is an actual event in which a five-year-old girl in Flint, Michigan, was severely beaten, raped and then strangled to death early on New Year’s Day in 1986. The case was introduced by Bruce Russell (1989: 123), whose account of it, drawn from a report in the Detroit Free Press of January 3, 1986, runs as follows:

The girl’s mother was living with her boyfriend, another man who was unemployed, her two children, and her 9-month old infant fathered by the boyfriend. On New Year’s Eve all three adults were drinking at a bar near the woman’s home. The boyfriend had been taking drugs and drinking heavily. He was asked to leave the bar at 8:00 p.m. After several reappearances he finally stayed away for good at about 9:30 p.m. The woman and the unemployed man remained at the bar until 2:00 a.m. at which time the woman went home and the man to a party at a neighbor’s home. Perhaps out of jealousy, the boyfriend attacked the woman when she walked into the house. Her brother was there and broke up the fight by hitting the boyfriend who was passed out and slumped over a table when the brother left. Later the boyfriend attacked the woman again, and this time she knocked him unconscious. After checking the children, she went to bed. Later the woman’s 5-year old girl went downstairs to go to the bathroom. The unemployed man returned from the party at 3:45 a.m. and found the 5-year old dead. She had been raped, severely beaten over most of her body and strangled to death by the boyfriend.

Following Rowe (1988: 120), the case of the fawn will be referred to as “E1”, and the case of the little girl as “E2”. Further, following William Alston’s (1991: 32) practice, the fawn will be named “Bambi” and the little girl “Sue”.

Rowe (1996: 264) states that, in choosing to focus on E1 and E2, he is “trying to pose a serious difficulty for the theist by picking a difficult case of natural evil, E1 (Bambi), and a difficult case of moral evil, E2 (Sue).” Rowe, then, is attempting to state the evidential argument in the strongest possible terms. As one commentator has put it, “if these cases of evil [E1 and E2] are not evidence against theism, then none are” (Christlieb 1992: 47). However, Rowe’s almost exclusive preoccupation with these two instances of suffering must be placed within the context of his belief (as expressed in, for example, 1979: 337-38) that even if we discovered that God could not have eliminated E1 and E2 without thereby losing some greater good or permitting some evil equally bad or worse, it would still be unreasonable to believe this of all cases of horrendous evil occurring daily in our world. E1 and E2 are thus best viewed as representative of a particular class of evil which poses a specific problem for theistic belief. This problem is expressed by Rowe in the following way:

(P) No good state of affairs we know of is such that an omnipotent, omniscient being’s obtaining it would morally justify that being’s permitting E1 or E2. Therefore,

(Q) It is likely that no good state of affairs is such that an omnipotent, omniscient being’s obtaining it would morally justify that being in permitting E1 or E2.

P states that no good we know of justifies God in permitting E1 and E2. From this it is inferred that Q is likely to be true, or that probably there are no goods which justify God in permitting E1 and E2. Q, of course, corresponds to the factual premise of Rowe’s argument. Thus, Rowe attempts to establish the truth of the factual premise by appealing to P.

ii. The Inference from P to Q

At least one question to be addressed when considering this inference is: What exactly do P and Q assert? Beginning with P, the central notion here is “a good state of affairs we know of.” But what is it to know of a good state of affairs? According to Rowe (1988: 123), to know of a good state of affairs is to (a) conceive of that state of affairs, and (b) recognize that it is intrinsically good (examples of states that are intrinsically good include pleasure, happiness, love, and the exercise of virtue). Rowe (1996: 264) therefore instructs us to not limit the set of goods we know of to goods that we know have occurred in the past or to goods that we know will occur in the future. The set of goods we know of must also include goods that we have some grasp of, even if we do not know whether they have occurred or ever will occur. For example, such a good, in the case of Sue, may consist of the experience of eternal bliss in the hereafter. Even though we lack a clear grasp of what this good involves, and even though we cannot be sure that such a good will ever obtain, we do well to include this good amongst the goods we know of. A good that we know of, however, cannot justify God in permitting E1 or E2 unless that good is actualized at some time.

On what grounds does Rowe think that P is true? Rowe (1988: 120) states that “we have good reason to believe that no good state of affairs we know of would justify an omnipotent, omniscient being in permitting either E1 or E2” (emphasis his). The good reason in question consists of the fact that, upon reflection, the good states of affairs we know of meet one or both of the following conditions: either an omnipotent being could obtain them without having to permit E1 or E2, or obtaining them would not morally justify that being in permitting E1 or E2 (Rowe 1988: 121, 123; 1991: 72).

This brings us, finally, to Rowe’s inference from P to Q. This is, of course, an inductive inference. Rowe does not claim to know or be able to prove that cases of intense suffering such as the fawn’s are indeed pointless. For as he acknowledges, it is quite possible that there is some familiar good outweighing the fawn’s suffering and which is connected to that suffering in a way unbeknown to us. Or there may be goods we are not aware of, to which the fawn’s suffering is intimately connected. But although we do not know or cannot establish the truth of Q, we do possess rational grounds for accepting Q, and these grounds consist of the considerations adumbrated in P. Thus, the truth of P is taken to provide strong evidence for the truth of Q (Rowe 1979: 337).

3. The Skeptical Theist Response

Theism, particularly as expressed within the Judeo-Christian and Islamic religions, has always emphasized the inscrutability of the ways of God. In Romans 11:33-34, for example, the apostle Paul exclaims: “Oh, the depth of the riches of the wisdom and knowledge of God! How unsearchable his judgments, and his paths beyond tracing out! Who has known the mind of the Lord?” (NIV). This emphasis on mystery and the epistemic distance between God and human persons is a characteristic tenet of traditional forms of theism. It is in the context of this tradition that Stephen Wykstra developed his well-known CORNEA critique of Rowe’s evidential argument. The heart of Wykstra’s critique is that, given our cognitive limitations, we are in no position to judge as improbable the statement that there are goods beyond our ken secured by God’s permission of many of the evils we find in the world. This position – sometimes labelled “skeptical theism” or “defensive skepticism” – has generated a great deal of discussion, leading some to conclude that “the inductive argument from evil is in no better shape than its late lamented deductive cousin” (Alston 1991: 61). In this section, I will review the challenge posed by this theistic form of skepticism, beginning with the critique advanced by Wykstra.

a. Wykstra’s CORNEA Critique

In an influential paper entitled, “The Humean Obstacle to Evidential Arguments from Evil,” Stephen Wykstra raised a formidable objection to Rowe’s inference from P to Q. Wykstra’s first step was to draw attention to the following epistemic principle, which he dubbed “CORNEA” (short for “Condition Of ReasoNable Epistemic Access”):

(C) On the basis of cognized situation s, human H is entitled to claim “It appears that p” only if it is reasonable for H to believe that, given her cognitive faculties and the use she has made of them, if p were not the case, s would likely be different than it is in some way discernible by her. (Wykstra 1984: 85)

The point behind CORNEA may be easier to grasp if (C) is simplified along the following lines:

(C*) H is entitled to infer “There is no x” from “So far as I can tell, there is no x” only if:

It is reasonable for H to believe that if there were an x, it is likely that she would perceive (or find, grasp, comprehend, conceive) it.

Adopting terminology introduced by Wykstra (1996), the inference from “So far as I can tell, there is no x” to “There is no x” may be called a “noseeum inference”: we no see ’um, so they ain’t there! Further, the condition stated in (C*) may be called “the noseeum assumption,” as anyone who employs a noseeum inference and is justified in doing so would be committed to this assumption.

C*, or at least something quite like it, appears unobjectionable. If, for instance, I am looking through the window of my twentieth-floor office to the garden below and I fail to see any caterpillars on the flowers, that would hardly entitle me to infer that there are in fact no caterpillars there. Likewise, if a beginner were watching Kasparov play Deep Blue, it would be unreasonable for her to infer “I can’t see any way for Deep Blue to get out of check; so, there is none.” Both inferences are illegitimate for the same reason: the person making the inference does not have what it takes to discern the sorts of things in question. It is this point that C* intends to capture by insisting that a noseeum inference is permissible only if it is likely that one would detect or discern the item in question if it existed.

But how does the foregoing relate to Rowe’s evidential argument? Notice, to begin with, that Rowe’s inference from P to Q is a noseeum inference. Rowe claims in P that, so far as we can see, no goods justify God’s permission of E1 and E2, and from this he infers that no goods whatever justify God’s permission of these evils. According to Wykstra, however, Rowe is entitled to make this noseeum inference only if he is entitled to make the following noseeum assumption:

If there are goods justifying God’s permission of horrendous evil, it is likely that we would discern or be cognizant of such goods.

Call this Rowe’s Noseeum Assumption, or RNA for short. The key issue, then, is whether we should accept RNA. Many theists, led by Stephen Wykstra, have claimed that RNA is false (or that we ought to suspend judgement about its truth). They argue that the great gulf between our limited cognitive abilities and the infinite wisdom of God prevents us (at least in many cases) from discerning God’s reasons for permitting evil. On this view, even if there are goods secured by God’s permission of evil, it is likely that these goods would be beyond our ken. Alvin Plantinga (1974: 10) sums up this position well with his rhetorical question: “Why suppose that if God does have a reason for permitting evil, the theist would be the first to know?” (emphasis his). Since theists such as Wykstra and Plantinga challenge Rowe’s argument (and evidential arguments in general) by focusing on the limits of human knowledge, they have become known as skeptical theists.

I will now turn to some considerations that have been offered by skeptical theists against RNA.

b. Wykstra’s Parent Analogy

Skeptical theists have drawn various analogies in an attempt to highlight the implausibility of RNA. The most common analogy, and the one favoured by Wykstra, involves a comparison between the vision and wisdom of an omniscient being such as God and the cognitive capacities of members of the human species. Clearly, the gap between God’s intellect and ours is immense, and Wykstra (1984: 87-91) compares it to the gap between the cognitive abilities of a parent and her one-month-old infant. But if this is the case, then even if there were outweighing goods connected in the requisite way to the instances of suffering appealed to by Rowe, that we should discern most of these goods is just as likely as that a one-month-old infant should discern most of her parents’ purposes for those pains they allow her to suffer – that is to say, it is not likely at all. Assuming that CORNEA is correct, Rowe would not then be entitled to claim, for any given instance of apparently pointless suffering, that it is indeed pointless. For as the above comparison between God’s intellect and the human mind indicates, even if there were outweighing goods served by certain instances of suffering, such goods would be beyond our ken. What Rowe has failed to see, according to Wykstra, is that “if we think carefully about the sort of being theism proposes for our belief, it is entirely expectable – given what we know of our cognitive limits – that the goods by virtue of which this Being allows known suffering should very often be beyond our ken” (1984: 91).

c. Alston’s Analogies

Rowe, like many others, has responded to Wykstra’s Parent Analogy by identifying a number of relevant disanalogies between a one-month-old infant and our predicament as adult human beings (see Rowe 1996: 275). There are, however, various other analogies that skeptical theists have employed in order to cast doubt on RNA. Here I will briefly consider a series of analogies that were first formulated by Alston (1996).

Like Wykstra, Alston (1996: 317) aims to highlight “the absurdity of the claim” that the fact that we cannot see what justifying reason an omniscient, omnipotent being might have for doing something provides strong support for the supposition that no such reason is available to that being. Alston, however, chooses to steer clear of the parent-child analogy employed by Wykstra, for he concedes that this contains loopholes that can be exploited in the ways suggested by Rowe.

Alston’s analogies fall into two groups, the first of which attempts to show that the insights attainable by finite, fallible human beings are not an adequate indication of what is available by way of reasons to an omniscient, omnipotent being. Suppose I am a first-year university physics student and I am faced with a theory of quantum phenomena, but I struggle to see why the author of the theory draws the conclusions she draws. Does that entitle me to suppose that she has no sufficient reason for her conclusions? Clearly not, for my inability to discern her reasons is only to be expected given my lack of expertise in the subject. Similarly, given my lack of training in painting, I fail to see why Picasso arranged the figures in Guernica as he did. But that does not entitle me to infer that he had no sufficient reason for doing so. Again, being a beginner in chess, I fail to see any reason why Kasparov made the move he did, but I would be foolish to conclude that he had no good reason to do so.

Alston applies the foregoing to the noseeum inference from “We cannot see any sufficient reason for God to permit E1 and E2” to “God has no sufficient reason to do so.” In this case, as in the above examples, we are in no position to draw such a conclusion for we lack any reason to suppose that we have a sufficient grasp of the range of possible reasons open to the other party. Our grasp of the reasons God might have for his actions is thus comparable to the grasp of the neophyte in the other cases. Indeed, Alston holds that “the extent to which God can envisage reasons for permitting a given state of affairs exceeds our ability to do so by at least as much as Einstein’s ability to discern the reason for a physical theory exceeds the ability of one ignorant of physics” (1996: 318, emphasis his).

Alston’s second group of analogies seeks to show that, in looking for the reasons God might have for certain acts or omissions, we are in effect trying to determine whether there is a so-and-so in a territory the extent and composition of which is largely unknown to us (or, at least, it is a territory such that we have no way of knowing the extent to which its constituents are unknown to us). Alston thus states that Rowe’s noseeum inference

…is like going from “We haven’t found any signs of life elsewhere in the universe” to “There isn’t life elsewhere in the universe.” It is like someone who is culturally and geographically isolated going from “As far as I have been able to tell, there is nothing on earth beyond this forest” to “There is nothing on earth beyond this forest.” Or, to get a bit more sophisticated, it is like someone who reasons “We are unable to discern anything beyond the temporal bounds of our universe,” where those bounds are the big bang and the final collapse, to “There is nothing beyond the temporal bounds of our universe.” (1996: 318)

Just as we lack a map of the relevant “territory” in these cases, we also lack a reliable internal map of the diversity of considerations that are available to an omniscient being in permitting instances of suffering. But given our ignorance of the extent, variety, or constitution of the terra incognita, it is surely the better part of wisdom to refrain from drawing any hasty conclusions regarding the nature of this territory.

Although such analogies may not be open to the same criticisms levelled against the analogies put forward by Wykstra, they are in the end no more successful than Wykstra’s analogies. Beginning with Alston’s first group of analogies, where a noseeum inference is unwarranted due to a lack of expertise, there is typically no expectation on the part of the neophyte that the reasons held by the other party (for example, the physicist’s reasons for drawing conclusion x, Kasparov’s reasons for making move x in a chess game) would be discernible to her. If you have just begun to study physics, you would not expect to understand Einstein’s reasons for advancing the special theory of relativity. However, if your five-year-old daughter suffered the fate of Sue as depicted in E2, would you not expect a perfectly loving being to reveal to you his reasons for allowing this to happen, or at least to comfort you by providing you with special assurances that there is a reason why this terrible evil could not have been prevented? Rowe makes this point quite well:

Being finite beings we can’t expect to know all the goods God would know, any more than an amateur at chess should expect to know all the reasons for a particular move that Kasparov makes in a game. But, unlike Kasparov who in a chess match has a good reason not to tell us how a particular move fits into his plan to win the game, God, if he exists, isn’t playing chess with our lives. In fact, since understanding the goods for the sake of which he permits terrible evils to befall us would itself enable us to better bear our suffering, God has a strong reason to help us understand those goods and how they require his permission of the terrible evils that befall us. (2001b: 157)

There appears, then, to be an obligation on the part of a perfect being to not keep his intentions entirely hidden from us. Such an obligation, however, does not attach to a gifted chess player or physicist – Kasparov cannot be expected to reveal his game plan, while a physics professor cannot be expected to make her mathematical demonstration in support of quantum theory comprehensible to a high school physics student.

Similarly with Alston’s second set of analogies, where our inability to map the territory within which to look for x is taken to preclude us from inferring from our inability to find x that there is no x. This may be applicable to cases like the isolated tribesman’s search for life outside his forest or our search for extraterrestrial life, for in such scenarios there is no prior expectation that the objects of our search are of such a nature that, if they exist, they would make themselves manifest to us. However, in our search for God’s reasons we are toiling in a unique territory, one inhabited by a perfectly loving being who, as such, would be expected to make at least his presence, if not also his reasons for permitting evil, (more) transparent to us. This difference in prior expectations uncovers an important disanalogy between the cases Alston considers and cases involving our attempt to discern God’s intentions. Alston’s analogies, therefore, not only fail to advance the case against RNA but also suggest a line of thought in support of RNA. (For further discussion on RNA and divine hiddenness, see Trakakis (2003); see also Howard-Snyder & Moser (2002).)

4. Building a Theodicy, or Casting Light on the Ways of God

Most critics of Rowe’s evidential argument have thought that the problem with the argument lies with its factual premise. But what, exactly, is wrong with this premise? According to one popular line of thought, the factual premise can be shown to be false by identifying goods that we know of that would justify God in permitting evil. To do this is to develop a theodicy.

a. What Is a Theodicy?

The primary aim of the project of theodicy may be characterized in John Milton’s celebrated words as the attempt to “justify the ways of God to men.” That is to say, a theodicy aims to vindicate the justice or goodness of God in the face of the evil found in the world, and this it attempts to do by offering a reasonable explanation as to why God allows evil to abound in his creation.

A theodicy may be thought of as a story told by the theist explaining why God permits evil. Such a story, however, must be plausible or reasonable in the sense that it conforms to all of the following:

  1. commonsensical views about the world (for example, that there exist other people, that there exists a mind-independent world, that much evil exists);
  2. widely accepted scientific and historical views (for example, evolutionary theory); and
  3. intuitively plausible moral principles (for example, generally, punishment should not be significantly disproportional to the offence committed).

Judged by these criteria, the story of the Fall (understood in a literalist fashion) could not be offered as a theodicy. For given the doubtful historicity of Adam and Eve, and given the problem of harmonizing the Fall with evolutionary theory, such an account of the origin of evil cannot reasonably be held to be plausible. A similar point could be made about stories that attempt to explain evil as the work of Satan and his cohorts.

b. Distinguishing a “Theodicy” from a “Defence”

An important distinction is often made between a defence and a theodicy. A theodicy is intended to be a plausible or reasonable explanation as to why God permits evil. A defence, by contrast, is only intended as a possible explanation as to why God permits evil. A theodicy, moreover, is offered as a solution to the evidential problem of evil, whereas a defence is offered as a solution to the logical problem of evil. Here is an example of a defence, which may clarify this distinction:

It will be recalled that, according to Mackie, it is logically impossible for the following two propositions to be jointly true:

  1. God is omnipotent, omniscient, and perfectly good,
  2. Evil exists.

Now, consider the following proposition:

  3. Every person goes wrong in every possible world.

In other words, every free person created by God would misuse their free will on at least one occasion, no matter which world (or what circumstances) they were placed in. This may be highly implausible, or even downright false – but it is, at least, logically possible. And if (3) is possible, then so is the following proposition:

  4. It was not within God’s power to create a world containing moral good but no moral evil.

In other words, it is possible that any world created by God that contains some moral good will also contain some moral evil. Therefore, it is possible for both (1) and (2) to be jointly true, at least when (2) is said to refer to “moral evil.” But what about “natural evil”? Well, consider the following proposition:

  5. All so-called “natural evil” is brought about by the devious activities of Satan and his cohorts.

In other words, what we call “natural evil” is actually “moral evil” since it results from the misuse of someone’s free will (in this case, the free will of some evil demon). Again, this may be highly implausible, or even downright false – but it is, at least, possibly true.

In sum, Mackie was wrong to think that it is logically impossible for both (1) and (2) to be true. For if you conjoin (4) and (5) to (1) and (2), it becomes clear that it is possible that any world created by God would have some evil in it. (This, of course, is the famous free will defence put forward in Plantinga 1974: ch.9). Notice that the central claims of this defence – namely, (3), (4), and (5) – are only held to be possibly true. That’s what makes this a defence. One could not get away with this in a theodicy, for a theodicy must be more than merely possibly true.

c. Sketch of a Theodicy

What kind of theodicy, then, can be developed in response to Rowe’s evidential argument? Are there any goods we know of that would justify God in permitting evils like E1 and E2? Here I will outline a proposal consisting of three themes that have figured prominently in the recent literature on the project of theodicy.

(1) Soul-making. Inspired by the thought of the early Church Father, Irenaeus of Lyon (c.130-c.202 CE), John Hick has put forward in a number of writings, but above all in his 1966 classic Evil and the God of Love, a theodicy that appeals to the good of soul-making (see also Hick 1968, 1977, 1981, 1990). According to Hick, the divine intention in relation to humankind is to bring forth perfect finite personal beings by means of a “vale of soul-making” in which humans may transcend their natural self-centredness by freely developing the most desirable qualities of moral character and entering into a personal relationship with their Maker. Any world, however, that makes possible such personal growth cannot be a hedonistic paradise whose inhabitants experience a maximum of pleasure and a minimum of pain. Rather, an environment that is able to produce the finest characteristics of human personality – particularly the capacity to love – must be one in which “there are obstacles to be overcome, tasks to be performed, goals to be achieved, setbacks to be endured, problems to be solved, dangers to be met” (Hick 1966: 362). A soul-making environment must, in other words, share a good deal in common with our world, for only a world containing great dangers and risks, as well as the genuine possibility of failure and tragedy, can provide opportunities for the development of virtue and character. A necessary condition, however, for this developmental process to take place is that humanity be situated at an “epistemic distance” from God. On Hick’s view, in other words, if we were initially created in the direct presence of God we could not freely come to love and worship God. So as to preserve our freedom in relation to God, the world must be created religiously ambiguous or must appear, to some extent at least, as if there were no God. And evil, of course, plays an important role in creating the desired epistemic distance.

(2) Free will. The appeal to human freedom, in one guise or another, constitutes an enduring theme in the history of theodicy. Typically, the kind of freedom that is invoked by the theodicist is the libertarian sort, according to which I am free with respect to a particular action at time t only if the action is not determined by all that happened or obtained before t and all the causal laws there are in such a way that the conjunction of the two (the past and the laws) logically entails that I perform the action in question. My mowing the lawn, for instance, constitutes a voluntary action only if, the state of the universe (including my beliefs and desires) and laws of nature being just as they were immediately preceding my decision to mow the lawn, I could have chosen or acted otherwise than I in fact did. In this sense, the acts I perform freely are genuinely “up to me” – they are not determined by anything external to my will, whether these be causal laws or even God. And so it is not open to God to cause or determine just what actions I will perform, for if he does so those actions could not be free. Freedom and determinism are incompatible.

The theodicist, however, is not so much interested in libertarian freedom as in libertarian freedom of the morally relevant kind, where this consists of the freedom to choose between good and evil courses of action. The theodicist’s freedom, moreover, is intended to be morally significant, not only providing one with the capacity to bring about good and evil, but also making possible a range of actions that vary enormously in moral worth, from great and noble deeds to horrific evils.

Armed therefore with such a conception of freedom, the free will theodicist proceeds to explain the existence of moral evil as a consequence of the misuse of our freedom. This, however, means that responsibility for the existence of moral evil lies with us, not with God. Of course, God is responsible for creating the conditions under which moral evil could come into existence. But it was not inevitable that human beings, if placed in those conditions, would go wrong. It was not necessary, in other words, that humans would misuse their free will, although this always was a possibility and hence a risk inherent in God’s creation of free creatures. The free will theodicist adds, however, that the value of free will (and the goods it makes possible) is so great as to outweigh the risk that it may be misused in various ways.

(3) Heavenly bliss. Theodicists sometimes draw on the notion of a heavenly afterlife to show that evil, particularly horrendous evil, only finds its ultimate justification or redemption in the life to come. Accounts of heaven, even within the Christian tradition, vary widely. But one common feature in these accounts that is relevant to the theodicist’s task is the experience of complete felicity for eternity brought about by intimate and loving communion with God. This good, as we saw, plays an important role in Hick’s theodicy, and it also finds a central place in Marilyn Adams’ account of horrendous evil.

Adams (1986: 262-63, 1999: 162-63) notes that, on the Christian world-view, the direct experience of “face-to-face” intimacy with God is not only the highest good we can aspire to enjoy, but is also an incommensurable good – more precisely, it is incommensurable with respect to any merely temporal evils or goods. As the apostle Paul put it, “our present sufferings are not worth comparing with the glory that will be revealed in us” (Rom 8:18, NIV; compare 2 Cor 4:17). This glorification to be experienced in heaven, according to Adams, vindicates God’s justice and love toward his creatures. For the experience of the beatific vision outweighs any evil, even evil of the horrendous variety, that someone may suffer, thus ensuring a balance of good over evil in the sufferer’s life that is overwhelmingly favourable. But as Adams points out, “strictly speaking, there will be no balance to be struck” (1986: 263, emphasis hers), since the good of the vision of God is incommensurable with respect to one’s participation in any temporal or created evils. And so an everlasting, post-mortem beatific vision of God would provide anyone who experienced it with good reason for considering their life – in spite of any horrors it may have contained – as a great good, thus removing any grounds of complaint against God.

Bringing these three themes together, a theodicy can be developed with the aim of explaining and justifying God’s permission of evil, even evil of the horrendous variety. To illustrate how this may be done, I will concentrate on Rowe’s E2 and the Holocaust, two clear instances of horrendous moral evil.

Notice that these two evils clearly involve a serious misuse of free will on the part of the perpetrators. We could, therefore, begin by postulating God’s endowment of humans with morally significant free will as the first good that is served by these evils. That is to say, God could not prevent the terrible suffering and death endured by Sue and the millions of Holocaust victims while at the same time creating us without morally significant freedom – the freedom to do both great evil and great good. In addition, these evils may provide an opportunity for soul-making – in many cases, however, the potential for soul-making would not extend to the victim but only to those who cause or witness the suffering. The phenomenon of “jailhouse conversions,” for example, testifies to the fact that even horrendous evil may occasion the moral transformation of the perpetrator. Finally, to adequately compensate the victims of these evils we may introduce the doctrine of heaven. Postmortem, the victims are ushered into a relation of beatific intimacy with God, an incommensurable good that “redeems” their past participation in horrors. For the beatific vision in the afterlife not only restores value and meaning to the victim’s life, but also provides them with the opportunity to endorse their life (taken as a whole) as worthwhile.

Does this theodicy succeed in exonerating God? Various objections could, of course, be raised against such a theodicy. One could, for example, question the intelligibility or empirical adequacy of the underlying libertarian notion of free will (see, for example, Pereboom 2001: 38-88). Or one might follow Tooley (1980: 373-75) and Rowe (1996: 279-81, 2001a: 135-36) in thinking that, just as we have a duty to curtail another person’s exercise of free will when we know that they will use their free will to inflict considerable suffering on an innocent (or undeserving) person, so too does God have a duty of this sort. On this view, a perfectly good God would have intervened to prevent us from misusing our freedom to the extent that moral evil, particularly moral evil of the horrific kind, would either not occur at all or occur far less frequently. Finally, how can the above theodicy be extended to account for natural evil? Various proposals have been offered here, the most prominent of which are: Hick’s view that natural evil plays an essential part in the “soul-making” process; Swinburne’s “free will theodicy for natural evil” – the idea, roughly put, is that free will cannot be had without the knowledge of how to bring about evil (or prevent its occurrence), and since this knowledge of how to cause evil can only be had through prior experience with natural evil, it follows that the existence of natural evil is a logically necessary condition for the exercise of free will (see Swinburne 1978, 1987: 149-67, 1991: 202-214, 1998: 176-92); and “natural law theodicies,” such as that developed by Reichenbach (1976, 1982: 101-118), according to which the natural evils that befall humans and animals are the unavoidable by-products of the outworking of the natural laws governing God’s creation.

5. Further Responses to the Evidential Problem of Evil

Let’s suppose that Rowe’s evidential argument from evil succeeds in providing strong evidence in support of the claim that there does not exist an omnipotent, omniscient, wholly good being. What follows from this? In particular, would a theist who finds it impossible to fault Rowe’s argument be obliged to give up her theism? Not necessarily, for at least two further options would be available to such a theist.

Firstly, the theist may agree that Rowe’s argument provides some evidence against theism, but she may go on to argue that there is independent evidence in support of theism which outweighs the evidence against theism. In fact, if the theist thinks that the evidence in support of theism is quite strong, she may employ what Rowe (1979: 339) calls “the G.E. Moore shift” (compare Moore 1953: ch.6). This involves turning the opponent’s argument on its head, so that one begins by denying the very conclusion of the opponent’s argument. The theist’s counter-argument would then proceed as follows:

(not-3) There exists an omnipotent, omniscient, wholly good being.
(2) An omniscient, wholly good being would prevent the occurrence of any intense suffering it could, unless it could not do so without thereby losing some greater good or permitting some evil equally bad or worse.
(not-1) (Therefore) It is not the case that there exist instances of horrendous evil which an omnipotent, omniscient being could have prevented without thereby losing some greater good or permitting some evil equally bad or worse.

Although this strategy has been welcomed by many theists as an appropriate way of responding to evidential arguments from evil (for example, Mavrodes 1970: 95-97, Evans 1982: 138-39, Davis 1987: 86-87, Basinger 1996: 100-103) – indeed, it is considered by Rowe to be “the theist’s best response” (1979: 339) – it is deeply problematic in a way that is often overlooked. The G.E. Moore shift, when employed by the theist, will be effective only if the grounds for accepting not-(3) [the non-existence of gratuitous evil, via the existence of the theistic God] are more compelling than the grounds for accepting (1) [the existence of gratuitous evil]. The problem here is that the kind of evidence that is typically invoked by theists in order to substantiate the existence of God – for example, the cosmological and design arguments, appeals to religious experience – does not even aim to establish the existence of a perfectly good being, or else, if it does have such an aim, it faces formidable difficulties in fulfilling it. But if this is so, then the theist may well be unable to offer any evidence at all in support of not-(3), or at least any evidence of a sufficiently strong or cogent nature. The G.E. Moore shift, therefore, is not as straightforward a strategy as it initially seems.

Secondly, the theist who accepts Rowe’s argument may claim that Rowe has only shown that one particular version of theism – rather than every version of theism – needs to be rejected. A process theist, for example, may agree with Rowe that there is no omnipotent being, but would add that God, properly understood, is not omnipotent, or that God’s power is not as unlimited as is usually thought (see, for example, Griffin 1976, 1991). An even more radical approach would be to posit a “dark side” in God and thus deny that God is perfectly good. Theists who adopt this approach (for example, Blumenthal 1993, Roth 2001) would also have no qualms with the conclusion of Rowe’s argument.

There are at least two problems with this second strategy. Firstly, Rowe’s argument is only concerned with the God of orthodox theism as described in Section 1.a above, not the God of some other version of theism. And so objections drawn from non-orthodox forms of theism fail to engage with Rowe’s argument (although such objections may be useful in getting us to reconsider the traditional understanding of God). A second problem concerns the worship-worthiness of the sort of deity being proposed. For example, would someone who is not wholly good, and who is capable of evil, be fit to be the object of our worship, total devotion and unconditional commitment? Similarly, why place complete trust in a God who is not all-powerful and hence not in full control of the world? (To be sure, even orthodox theists will place limits on God’s power, and such limits on divine power may go some way towards explaining the presence of evil in the world. But if God’s power, or lack thereof, is offered as the solution to the problem of evil – so that the reason why God allows evil is that he does not have the power to prevent it from coming into being – then we are faced with a highly impotent God who, insofar as he is aware of the limitations in his power, may be considered reckless for proceeding with creation.)

6. Conclusion

Evidential arguments from evil, such as those developed by William Rowe, purport to show that, grounds for belief in God aside, the existence of evil renders atheism more reasonable than theism. What verdict, then, can be reached regarding such arguments? A brief answer to this question may be provided by way of an overview of the foregoing investigation.

Firstly, as was argued in Section 2, the “open theist” response to Rowe’s theological premise either runs the risk of diminishing confidence in God or else is entirely compatible with the theological premise. Secondly, the “sceptical theist” objection to Rowe’s inference from inscrutable evil to pointless evil was examined in Section 3 and was found to be inadequately supported. Thirdly, various theodical options were canvassed in Section 4 as a possible way of refuting Rowe’s factual premise, and it was found that a theodicy that appeals to the goods of free will, soul-making, and a heavenly afterlife may go some way in accounting for the existence of moral evil. Such a theodicy, however, raises many further questions relating to the existence of natural evil and the existence of so much horrendous moral evil. And finally, as argued in Section 5, the strategy of resorting to the “G.E. Moore shift” faces the daunting task of furnishing evidence in support of the existence of a perfect being; while resorting to a non-orthodox conception of God dissolves the problem of evil at the cost of corroding religiously significant attitudes and practices such as the love and worship of God.

On the basis of these results it can be seen that Rowe’s argument has a strongly resilient character, successfully withstanding many of the objections raised against it. Much more, of course, can be said both in support of and against Rowe’s case for atheism. Although it might therefore be premature to declare any one side to the debate victorious, it can be concluded that, at the very least, Rowe’s evidential argument is not as easy to refute as is often presumed.

7. References and Further Reading

  • Adams, Marilyn McCord. 1996. “Redemptive Suffering: A Christian Solution to the Problem of Evil,” in Robert Audi and William J. Wainwright (eds), Rationality, Religious Belief, and Moral Commitment. Ithaca, NY: Cornell University Press, pp.248-67.
  • Adams, Marilyn McCord. 1999. Horrendous Evils and the Goodness of God. Melbourne: Melbourne University Press.
  • Alston, William P. 1991. “The Inductive Argument from Evil and the Human Cognitive Condition,” Philosophical Perspectives 5: 29-67.
  • Alston, William P. 1996. “Some (Temporarily) Final Thoughts on the Evidential Arguments from Evil,” in Daniel Howard-Snyder (ed.), The Evidential Argument from Evil. Bloomington, IN: Indiana University Press, pp.311-32.
  • Ascol, Thomas K. 2001. “Pastoral Implications of Open Theism,” in Douglas Wilson (ed.), Bound Only Once: The Failure of Open Theism. Moscow, ID: Canon Press, pp.173-90.
  • Basinger, David. 1996. The Case for Freewill Theism: A Philosophical Assessment. Downers Grove, IL: InterVarsity Press.
  • Blumenthal, David R. 1993. Facing the Abusing God: A Theology of Protest. Louisville, KY: Westminster John Knox Press.
  • Boyd, Gregory A. 2000. God of the Possible: A Biblical Introduction to the Open View of God. Grand Rapids, MI: Baker Books.
  • Boyd, Gregory A. 2003. Is God to Blame? Moving Beyond Pat Answers to the Problem of Evil. Downers Grove, IL: InterVarsity Press.
  • Brown, Patterson. 1967. “God and the Good,” Religious Studies 2: 269-76.
  • Christlieb, Terry. 1992. “Which Theisms Face an Evidential Problem of Evil?” Faith and Philosophy 9: 45-64.
  • Chrzan, Keith. 1994. “Necessary Gratuitous Evil: An Oxymoron Revisited,” Faith and Philosophy 11: 134-37.
  • Davis, Stephen T. 1987. “What Good Are Theistic Proofs?” in Louis P. Pojman (ed.), Philosophy of Religion: An Anthology. Belmont, CA: Wadsworth, pp.80-88.
  • Draper, Paul. 1989. “Pain and Pleasure: An Evidential Problem for Theists,” Nous 23: 331-50.
  • Evans, C. Stephen. 1982. Philosophy of Religion: Thinking about Faith. Downers Grove, IL: InterVarsity Press.
  • Griffin, David Ray. 1976. God, Power, and Evil: A Process Theodicy. Philadelphia, PA: Westminster Press.
  • Griffin, David Ray. 1991. Evil Revisited: Responses and Reconsiderations. Albany, NY: State University of New York Press.
  • Hasker, William. 2004. Providence, Evil and the Openness of God. London: Routledge.
  • Hick, John. 1966. Evil and the God of Love, first edition. London: Macmillan.
  • Hick, John. 1968. “God, Evil and Mystery,” Religious Studies 3: 539-46.
  • Hick, John. 1977. Evil and the God of Love, second edition. New York: HarperCollins.
  • Hick, John. 1981. “An Irenaean Theodicy” and “Response to Critiques,” in Stephen T. Davis (ed.), Encountering Evil: Live Options in Theodicy, first edition. Edinburgh: T & T Clark, pp.39-52, 63-68.
  • Hick, John. 1990. Philosophy of Religion, fourth edition. Englewood Cliffs, NJ: Prentice-Hall.
  • Hoffman, Joshua, and Gary S. Rosenkrantz. 2002. The Divine Attributes. Oxford: Blackwell.
  • Howard-Snyder, Daniel, and Frances Howard-Snyder. 1999. “Is Theism Compatible with Gratuitous Evil?” American Philosophical Quarterly 36: 115-29.
  • Howard-Snyder, Daniel, and Paul K. Moser (eds). 2002. Divine Hiddenness: New Essays. Cambridge: Cambridge University Press.
  • Jordan, Jeff. 2001. “Blocking Rowe’s New Evidential Argument from Evil,” Religious Studies 37: 435-49.
  • Mackie, J.L. 1955. “Evil and Omnipotence,” Mind 64: 200-212.
  • Mavrodes, George I. 1970. Belief in God: A Study in the Epistemology of Religion. New York: Random House.
  • McCloskey, H.J. 1960. “God and Evil,” Philosophical Quarterly 10: 97-114.
  • McNaughton, David. 1994. “The Problem of Evil: A Deontological Perspective,” in Alan G. Padgett (ed.), Reason and the Christian Religion: Essays in Honour of Richard Swinburne. Oxford: Clarendon Press, pp.329-51.
  • Moore, G.E. 1953. Some Main Problems of Philosophy. London: George Allen & Unwin.
  • Morris, Thomas V. 1987. Anselmian Explorations: Essays in Philosophical Theology. Notre Dame, IN: University of Notre Dame Press.
  • Morris, Thomas V. 1991. Our Idea of God: An Introduction to Philosophical Theology. Downers Grove, IL: InterVarsity Press.
  • Nelson, Mark T. 1991. “Naturalistic Ethics and the Argument from Evil,” Faith and Philosophy 8: 368-79.
  • O’Connor, David. 1998. God and Inscrutable Evil: In Defense of Theism and Atheism. Lanham, MD: Rowman & Littlefield.
  • Otte, Richard. 2002. “Rowe’s Probabilistic Argument from Evil,” Faith and Philosophy 19: 147-71.
  • Pereboom, Derk. 2001. Living Without Free Will. Cambridge: Cambridge University Press.
  • Peterson, Michael L. 1982. Evil and the Christian God. Grand Rapids, MI: Baker Book House.
  • Peterson, Michael L. 1998. God and Evil: An Introduction to the Issues. Boulder, CO: Westview Press.
  • Pinnock, Clark H., Richard Rice, John Sanders, William Hasker, and David Basinger. 1994. The Openness of God: A Biblical Challenge to the Traditional Understanding of God. Downers Grove, IL: InterVarsity Press.
  • Plantinga, Alvin. 1974. The Nature of Necessity. Oxford: Clarendon Press.
  • Plantinga, Alvin. 1977. God, Freedom, and Evil. Grand Rapids, MI: Eerdmans.
  • Quinn, Philip L., and Charles Taliaferro (eds). 1997. A Companion to Philosophy of Religion. Cambridge, MA: Blackwell.
  • Reichenbach, Bruce R. 1976. “Natural Evils and Natural Law: A Theodicy for Natural Evils,” International Philosophical Quarterly 16: 179-96.
  • Reichenbach, Bruce R. 1982. Evil and a Good God. New York: Fordham University Press.
  • Reitan, Eric. 2000. “Does the Argument from Evil Assume a Consequentialist Morality?” Faith and Philosophy 17: 306-19.
  • Rogers, Katherin A. 2000. Perfect Being Theology. Edinburgh: Edinburgh University Press.
  • Roth, John K. 2001. “A Theodicy of Protest”, in Stephen T. Davis (ed.), Encountering Evil: Live Options in Theodicy, second edition. Louisville, KY: Westminster John Knox Press, pp.1-20.
  • Rowe, William L. 1978. Philosophy of Religion: An Introduction, first edition. Encino, CA: Dickenson Publishing Company.
  • Rowe, William L. 1979. “The Problem of Evil and Some Varieties of Atheism,” American Philosophical Quarterly 16: 335-41.
  • Rowe, William L. 1986. “The Empirical Argument from Evil,” in Audi and Wainwright (eds), Rationality, Religious Belief, and Moral Commitment, pp.227-47.
  • Rowe, William L. 1988. “Evil and Theodicy,” Philosophical Topics 16: 119-32.
  • Rowe, William L. 1991. “Ruminations about Evil,” Philosophical Perspectives 5: 69-88.
  • Rowe, William L. 1995. “William Alston on the Problem of Evil,” in Thomas D. Senor (ed.), The Rationality of Belief and the Plurality of Faith: Essays in Honor of William P. Alston. Ithaca, NY: Cornell University Press, pp.71-93.
  • Rowe, William L. 1996. “The Evidential Argument from Evil: A Second Look,” in Daniel Howard-Snyder (ed.), The Evidential Argument from Evil, pp.262-85.
  • Rowe, William L. 2001a. “Grounds for Belief Aside, Does Evil Make Atheism More Reasonable than Theism?” in William Rowe (ed.), God and the Problem of Evil. Malden, MA: Blackwell, pp.124-37.
  • Rowe, William L. 2001b. “Reply to Howard-Snyder and Bergmann,” in Rowe (ed.), God and the Problem of Evil, pp.155-58.
  • Russell, Bruce. 1989. “The Persistent Problem of Evil,” Faith and Philosophy 6: 121-39.
  • Russell, Bruce. 1996. “Defenseless,” in Daniel Howard-Snyder (ed.), The Evidential Argument from Evil, pp.193-205.
  • Sanders, John. 1998. The God Who Risks: A Theology of Providence. Downers Grove, IL: InterVarsity Press.
  • Sennett, James F. 1993. “The Inscrutable Evil Defense Against the Inductive Argument from Evil,” Faith and Philosophy 10: 220-29.
  • Stump, Eleonore. 1997. “Review of Peter van Inwagen, God, Knowledge, and Mystery,” Philosophical Review 106: 464-67.
  • Swinburne, Richard. 1977. The Coherence of Theism. Oxford: Clarendon Press.
  • Swinburne, Richard. 1978. “Natural Evil,” American Philosophical Quarterly 15: 295-301.
  • Swinburne, Richard. 1987. “Knowledge from Experience, and the Problem of Evil,” in William J. Abraham and Steven W. Holtzer (eds), The Rationality of Religious Belief: Essays in Honour of Basil Mitchell. Oxford: Clarendon Press, pp.141-67.
  • Swinburne, Richard. 1991. The Existence of God, revised edition. Oxford: Clarendon Press.
  • Swinburne, Richard. 1998. Providence and the Problem of Evil. Oxford: Clarendon Press.
  • Tooley, Michael. 1980. “Alvin Plantinga and the Argument from Evil,” Australasian Journal of Philosophy 58: 360-76.
  • Trakakis, Nick. 2003. “What No Eye Has Seen: The Skeptical Theist Response to Rowe’s Evidential Argument from Evil,” Philo 6: 263-79.
  • Van Inwagen, Peter. 1988. “The Place of Chance in a World Sustained by God,” in Thomas V. Morris (ed.), Divine and Human Action. Ithaca, NY: Cornell University Press, pp.211-35.
  • Ware, Bruce. 2000. God’s Lesser Glory: The Diminished God of Open Theism. Wheaton, IL: Crossways Books.
  • Wykstra, Stephen J. 1984. “The Humean Obstacle to Evidential Arguments from Suffering: On Avoiding the Evils of ‘Appearance’,” International Journal for Philosophy of Religion 16: 73-93.
  • Wykstra, Stephen J. 1996. “Rowe’s Noseeum Arguments from Evil,” in Daniel Howard-Snyder (ed.), The Evidential Argument from Evil, pp.126-50.

Author Information

Nick Trakakis
Email: Nick.Trakakis@acu.edu.au
Australian Catholic University
Australia

Apology

An apology is the act of declaring one’s regret, remorse, or sorrow for having insulted, failed, injured, harmed or wronged another. Some apologies are interpersonal (between individuals, that is, between friends, family members, colleagues, lovers, neighbours, or strangers). Other apologies are collective (by one group to another group or by a group to an individual). More generally, apologies can be offered “one to one,” “one to many,” “many to one,” or “many to many.”

While the practice of apologizing is nothing new, the end of the twentieth century and the beginning of the twenty-first witnessed a sharp rise in the number of public and political apologies, so much so that some scholars believe we are living in an “age of apology” (Gibney et al. 2006) or within a “culture of apology” (Mills 2001). A gesture formerly considered a sign of weakness has grown to represent moral strength and a crucial step towards potential reconciliation. Individuals, and more often states, churches, the judiciary, the medical profession and universities, publicly issue apologies to those they have wronged in the past. Wrongs ranging from personal betrayals and insults all the way to enslavement, violations of medical ethics, land displacement, violations of treaties or international law, systemic discrimination, wartime casualties, cultural disruptions, or political seizures constitute reasons for public expressions of regret.

What apologies are, and which goals they can promote, are objects of inquiry for a number of academic disciplines in the social sciences and humanities, including philosophy, political science, theology, psychology, history and sociology. Authors have been preoccupied by an array of questions: What are the validity conditions for an apology? Are these the same for interpersonal and collective apologies? And what purposes do apologies serve in human societies?

Table of Contents

  1. Interpersonal Apologies (“One to One”)
    1. Types
    2. Validity Conditions
    3. The Goals of the Interpersonal Apology
  2. The “One to Many” Apology
    1. Types
    2. Validity Conditions
    3. The Goals of the “One to Many” Apology
  3. Collective Apology (“Many to Many” or “Many to One”)
    1. Types
    2. Validity Conditions
    3. The Goals of the Collective Apology
    4. Scepticism about Collective Apologies
  4. Theatricality and Non-verbal Apologies
  5. Intercultural Apologies
  6. References and Further Reading

1. Interpersonal Apologies (“One to One”)

a. Types

In interpersonal apologies, an individual acknowledges and promises to redress offences committed against another individual. Such an apology can be performed in private (for instance, when one family member apologizes to another within the walls of their common abode) or in public (when individuals with public profiles apologize to their spouses, friends or colleagues for their blunders in a highly mediated fashion). Although, in a broad sense, everything is political, interpersonal apologies can be political in the stricter sense when the offender and the offended are politicians, public officials or representatives of political organizations. Clear examples of interpersonal political apologies are Senator Fred Thompson’s apology to Bill Clinton for insinuating that the latter had been involved in corruption or the apology by Republican House Majority Leader Dick Armey for referring to Representative Barney Frank, a Democrat representing Massachusetts, as “Barney Fag.”

b. Validity Conditions

In order to count as valid, an apology must meet a number of conditions. While there is great variation among authors on the number and exact role that different elements play within an apology, there is a growing consensus that an authentic apology implies: an acknowledgement that the incident in question did in fact occur and that it was inappropriate; a recognition of responsibility for the act; the expression of an attitude of regret and a feeling of remorse; and the declaration of an intention to refrain from similar acts in the future.

Authors dealing with the interpersonal apology position themselves on a continuum, ranging from rather lax to very stringent requirements that an apology must meet in order to be valid. Nick Smith provides us with the theoretically most systematic and normatively strictest account of the interpersonal apology, listing no fewer than twelve conditions for what he calls a valid “categorical” apology: a corroborated factual record; the acceptance of blame (to be distinguished from expressions of sympathy, as in “I am sorry for your loss”); having standing (only those causally responsible for the offence can apologize); identification of each harm separately; identification of the moral principles underlying each harm; endorsement of the moral principles underlying each harm; recognition of the victim as a moral interlocutor; categorical regret (recognition of the fact that one’s act constitutes a moral failure); the performance of the apology; reform and redress (post-apology); sincere intentions (lying when apologizing would only double the insult to the victim); and some expression of emotion (sorrow, guilt, empathy, sympathy) (Smith 2008). To the extent that an interpersonal apology fails on any of these criteria, it fails to achieve the status of a proper apology.

Whether one adopts a laxer or a stricter understanding of the validity conditions for the interpersonal apology, it is the offended individual who has the standing to accept or reject the apology.

c. The Goals of the Interpersonal Apology

Normatively, interpersonal apologies are meant to recognize the equal moral worth of the victim. While the offence cannot be undone, the act of acknowledging it recognizes the offended as an equal moral agent. Psychologically, an apology aims to meet the victim’s need for recognition, thus restoring her self-respect (Lazare 2004). Diminishing her desire for revenge, healing humiliations, and facilitating reconciliation are hoped-for, but empirically contingent, effects of the apology. A cathartic effect on the guilty conscience of the offender is one other psychologically desirable consequence of a successful apology.

If the apology is accepted and the offender is forgiven, the moral status quo ante (of equal moral worth of the offending and the offended parties) will be restored. However, forgiveness follows the apology only when the victim undergoes a deep psychological change: when she gives up her moral anger and the desire for revenge. Forgiveness should not be confused with forgetting, which is involuntary and does not presuppose a “change of heart.” While possible, forgiveness is neither necessary nor a right that the offender can claim once she has apologized and shown remorse. Forgiveness remains the privilege of the offended. In addition, and contrary to some religious traditions, philosophers have usually argued that forgiveness should not be understood as the victim’s duty, nor should it be conceived of as a test of her good character.

2. The “One to Many” Apology

a. Types

The “one to many” apology can be either private or public, and can be political or non-political. For example, when one individual apologizes privately to her family, group of friends, neighbours, or colleagues for an insult or any other moral failure, we are talking about a non-political “one to many” apology. Public figures sometimes choose to communicate their regret via mass media, and then the apology is public and non-political. For example, actress Morgan James apologized to the cast and crew of the Sondheim musical “Into the Woods” for disproportionately criticizing the New York production in language that was too strong. By contrast, when a politician or official apologizes to her party, her voters or the nation for a wrong, we are dealing with a political public “one to many” apology. Kaing Guek Eav (a.k.a. “Duch”) apologizing to the Cambodian people for his actions in the S21 prison and Richard Nixon apologizing to his supporters and voters for the Watergate scandal are just two among many examples of “one to many” public political apologies.

b. Validity Conditions

When an individual apologizes to her family, to her group of friends, or to the nation, we apply the same standards of validity that we apply to interpersonal apologies. Minimally, an apology by one to the many must include an acknowledgement that a wrong has been committed, acceptance of responsibility, a promise of forbearance, expression of regret or remorse and an offer of repair. She who has committed the wrong has the proper standing to apologize.

Things get complicated when we consider who accepts the apology. The size of the group is an important variable. A family or a group of friends can come together and decide what to do in response to the apology. A corporation or a village can organize a consultative process and determine how to react. In fact, under the banner of “restorative justice,” an entire literature addresses the ways in which communities can heal broken relations and re-integrate those among their members who have gone astray (Braithwaite 1989). But how do large, unorganized groups, such as nations, accept an apology? Many critics of restorative justice have pointed out that such a conception of justice does not make much sense outside small, closely knit communities. Can there ever be consensus about how to deal with officials’ expressions of regret within the large, pluralistic publics of today’s societies? Elections and opinion polls are probably the only – imperfect – mechanisms for gaining insight into whether an apology has or has not been accepted by the members of the polity. While a great deal of attention has been paid to the normative prerequisites of a valid apology, there are no systematic studies of the effects such apologies have on the public culture of the societies in which they are offered. This is an important lacuna in great need of remedy.

c. The Goals of the “One to Many” Apology

The purposes of the non-political “one to many” apology overlap with those of the interpersonal acts of contrition: recognizing the victims as moral interlocutors and communicating the fact that the offender understands and regrets the violation of their legitimate moral expectations, thus making a first step towards a desired reconciliation.

Besides the acknowledgement and recognition functions of the political variety of the “one to many” apology, such acts also seek to satisfy the publicity requirement and set the record straight, re-affirm the principles the community abides by and, in giving an account of one’s personal failures as a politician or representative, individualize guilt. Strategically, such acts may be employed to minimize political losses, save one’s political career and, if that is not possible, insulate one’s office or party from the negative consequences of a particular person’s misdeeds. They may also be used to increase the chances of a pardon in case the misdeeds are of a criminal nature.

3. Collective Apology (“Many to Many” or “Many to One”)

a. Types

Collective apologies take two forms: by “many to many” or by “many to one”. In the case of “many to many” one group apologizes to another group. For instance, the French railway company SNCF apologized for transporting Jews to the extermination camps during the Nazi occupation and the Vatican apologized to women for the violations of their rights and historical denigration at the hands of the Catholic Church. In the case of “many to one” a group apologizes to an individual. Clear examples are the apology by the Canadian government to Maher Arar for the ordeal he suffered as a result of his rendition to Syria or corporate apologies to individual clients for faulty services or goods.

In the scholarship on collective apologies, the state has received most of the attention, as both perpetrator and apologizer. In addressing the issue of state apologies, we can speak of three contexts where such acts are considered appropriate: domestic, international and postcolonial. In the domestic realm, political apologies address injustice committed against citizens under the aegis of the state. Canada’s apology and compensation to Canadians of Chinese origin for the infamous “Chinese Head Tax” law and the United States’ apology and compensation to American citizens of Japanese descent for their internment during World War II are relevant examples. In the international realm, political apologies are important diplomatic tools and usually, though not exclusively, address injustice committed during wartime. In this category, we could discuss Japan’s “sorry” for the abuse of Korean and Chinese “comfort women” and Belgium’s expression of regret for not having intervened to prevent the genocide in Rwanda. Finally, one can identify postimperial and postcolonial relations as a context somewhere between the domestic and the international realm. Australia’s and Canada’s apologies to their Aboriginal communities for forced assimilation policies, Queen Elizabeth’s declaration of sorrow for Britain’s treatment of New Zealand’s Maori communities, and Guatemala’s apology to a victimized Mayan community constitute important illustrations.

b. Validity Conditions

When applied to collective apologies for harms and wrongs featuring multiple perpetrators – oftentimes committed a long time ago – many of Smith’s criteria for a categorical “sorry” do not hold. Consequently, those who measure collective apologies against the standards for interpersonal apologies argue against the very idea of collective apologies, and especially against the idea of collective apologies for injustices that took place in the distant past.

First, adequately isolating each and every offence inflicted upon the victim(s) can be a daunting task when dealing with multiple perpetrators. Second, what do we mean by collective responsibility? In what way can we plausibly speak of collective – as opposed to individual – acts? Third, who has the proper standing to apologize for something that the collective has supposedly perpetrated: the upper echelons of the chain of command or the direct perpetrators? What about those members of the group who had not been involved in the violations? Fourth, can groups express remorse and regret? How can we measure their sincerity and commitment to transformation and redress in the absence of these emotions? Fifth, matters are further complicated by the fact that there is often no consensus behind a collective’s decision to apologize.

Most of the time, some members of the community reject the idea of apologizing for a past wrong. They see public contrition as a threat to the self-image of the group and as an unnecessary tainting of its history. All recent examples of collective apologies have turned out to be controversial and antagonizing, so much so that some scholars have argued that the lack of consensus constitutes an insuperable obstacle to collective apologies. Last but not least, who should accept these collective apologies? The answer appears to be clear in the case of a “many to one” apology. But what about a “many to many” scenario? The direct victims? What about their families? And what if the members of the group that the apology addresses cannot agree on whether to accept the apology or not?

All these problems are amplified when the direct perpetrators and victims no longer exist. In such cases, there is no identity between perpetrator and apologizer or between the victim and the addressee of the apology. What is more, the potential apologizers and addressees of the apology often owe their very existence to the fact that the injustices were committed in the past, as is the case, for example, with almost everyone in the Americas or Australia today: without the injustices committed against the First Nations and without the slave trade, the demographics of the continents would look different in the twenty-first century. For them to apologize sincerely, that is, to express regret for the very events that made their existence possible, would be impossible.

One way of circumventing the identity problem is to argue that, even if they are not the direct victims, the descendants of victims suffer today from the repercussions of past violations. For instance, one might argue that African Americans experience today the socio-economic repercussions of a history of discrimination and oppression that goes back to the slave trade. Consequently, they are owed an apology. White Americans, by contrast, have been the beneficiaries of the same violations, even if they are not the direct perpetrators thereof. As involuntary beneficiaries of violence they might express regret for the fact that they owe their existence to injustices committed by their ancestors.

Yet the problems do not stop here. Immigration adds to the complexity of the identity problem: should recent immigrants apologize, given that they have not even benefited from the past injustices and do not owe their existence to the perpetrators of those injustices?

Another way of dealing with the question of the validity of collective apologies is to give up the interpersonal model and think of them as a rather distinct category, whose purposes and functions differ from those of interpersonal apologies. Thus, scholars have argued that it is normatively sound to ascribe responsibility to collectives or institutions as continuous in time and as transcending the particular individuals constituting them at a certain moment. In addition, collectives are responsible for reproducing the culture that made it possible for atrocities to go on uncontested. Therefore, collective responsibility requires that groups’ representatives acknowledge the fact that an injustice has been committed, mark discontinuity with the discriminatory practices of the past, and commit themselves to non-repetition and redress.

Collective responsibility must be conceptually distinguished from collective guilt, a philosophically more problematic notion. For example, a present government that has not committed any wrongs can still take responsibility by acknowledging that wrongs have been committed against a certain group or person in the past, that it was “our culture” that enabled the abuses, that the abuses have repercussions in the present, and that they will not be allowed to happen again. A pledge to revise the very foundations on which the relations between various groups are established within the polity, together with material compensation for the losses incurred by the victims, gives concreteness to the apology. In this sense, it can be safely said that collective apologies have both a symbolic function (recognition of the offended group as worthy of respect) and a utility function (the apology might bring about reparations to the victims and might lead to better inter-group relations).

If the issue of collective responsibility is addressed in this way, we then need to turn to the question of who has standing to apologize for the collective. Unlike interpersonal apologies—where the offender has to apologize to the offended—collective apologies depend on representation, or, in other words, they are done by proxy. If we understand collective apologies as symbolic acts and if we agree that collectives can take responsibility for past wrongs even if their current members did not commit any of the past offences, then a legitimate representative – perceived by the collective as having the authority to speak for the collective – has the standing to apologize.

Naturally, the affective dimension of the collective apology becomes less significant if we give up the interpersonal model. The representatives offering the apology might experience contrition, remorse and regret, but their emotional response is not a necessary condition of an authentic apology by collective agents such as churches, professions, or the state. The “sincerity” of collective apologies should be measured not in affective units but in terms of what follows from the act. Changes in the norms and practices of the collective, reparations, compensation, or memorialization projects give concreteness to the symbolic act of apologizing.

Last but not least, to whom is the apology addressed? Theorists who do not take the interpersonal “sorry” as a template for the collective apology argue that they are addressed to a number of audiences. First, apologies are directed towards victims and their families and their descendants. Secondly, they are addressed to the general public, with a view to communicating that what happened in the past is in great tension with the moral principles the group subscribes to and that such abuses will not be tolerated ever again. Lastly, the international society – or more abstractly humanity as a whole – is the indirect audience of a collective apology.

c. The Goals of the Collective Apology

If we agree that we can speak meaningfully about public expressions of regret by institutions, then we must also accept that they do not serve the same purposes as interpersonal apologies. Such acts aim to restore diplomatic relations, restore the dignity of insulted groups, extend the boundaries of the political community by including the formerly disenfranchised, re-establish equality among groups, recognize suffering, and stimulate reflection and change in a discriminatory public culture. They can also mark a (re-)affirmation of the fundamental moral principles of the community, promote national reconciliation, strengthen principles of transnational cooperation, contribute to the improvement of international law and diplomatic relations, make a relationship possible by creating a less hostile environment for special groups, and mark a society’s affirmation of a set of virtues in contradistinction to a past of exclusion.

Theological approaches to the functions that collective apologies can perform add to the scholarly reflection about these political practices. In her path-breaking book on the religious dimensions of collective apologies, Celermajer uses insights from the Jewish and the Christian notions and institutions of repentance in order to support an account of collective apologies as speech acts through which “we as a community” ritually express shame for our past, appraise the impact of the past on the present and the future, and make a commitment to change who “we” are by bridging the gap between our ideals and our practices (Celermajer 2009). Other scholars have made reference to the Christian notion of covenant so as to theorise apologies as “embracing” acts and as mechanisms of possible reconciliation. Contributions by theologians thus illuminate one more normative source for the multi-faceted practice of apology: religious traditions.

d. Scepticism about Collective Apologies

While many scholars see public apologies as creating a space of communal reflection and restoration, there are strong sceptical positions that see such official acts as nothing but a “smoke screen” meant to hide the intention to avoid responsibility or to further projects of assimilation and discrimination. On the basis of normative inconsistencies associated with current practices of apologies, realist scholars have objected that apologies are a form of “sentimental politics” that serves as a “seductive, feel-good strategy contrived and promoted by governments” to compensate for the lack of redistributive measures. On this view, apologies allow political elites to claim the moral high ground against those who came before them – unfairly applying current standards to the past, thus committing the sin of presentism – and to capitalize electorally.

Defenders of the value of collective apology respond that the presence of strategic reasons does not necessarily doom such practices to irrelevance. True, unless coupled with compensatory schemes and a renunciation of oppressive practices, such declarations of sorrow are signs of hypocritical and meaningless righteousness, far from appropriately addressing the atrocities for which they are issued. Compensation without an apology is also insufficient, as it cannot symbolically affirm the value of the victims. In addition, it might send the wrong signal – that of trying to “buy” the victim’s forgiveness, thus doubling the insult. To the extent that they live up to the tasks they set themselves, i.e. to the extent that they take concrete steps to address injustice symbolically and materially, apologies are “sincere”.

A different kind of criticism comes from conservative commentators who tend to be averse to the idea of apologizing for a past of state-sponsored violence. The fear that discussing the past might damage the community’s self-image pervades many democratic societies with a history of injustice. Turkey’s refusal to acknowledge the Armenian genocide and the US’s problematic relationship with its long history of racial discrimination are two notorious examples where a discomfort with the past prevents sincere processes of national reckoning.

In response to this line of critique, one can argue that democratic elites can employ two strategies: encourage everyone to participate in a political ritual of contrition and assume the unsavoury past or invite resistant groups to conceive of honesty about the past as an act of courage, not an injustice. A rhetorically powerful appeal to positive feelings of courage, rather than shame, to pride, rather than repentance, could persuade citizens to see the apology as a sign of strength, and not one of weakness.

4. Theatricality and Non-verbal Apologies

The theatrical or ritualistic dimension of the collective apology cannot be omitted from any comprehensive discussion of the practice. While public interpersonal apologies by celebrities can be analysed in terms of their theatrical aspects – just think of Arnold Schwarzenegger or Tiger Woods publicly apologizing to their spouses – it is usually collective political apologies that make a more interesting object for this type of inquiry.

Rhetoricians have pointed to the need for the apologizer to establish a special relation between herself and the audience. She should be able to give meaningful expression to common sentiment and avoid being perceived as out of touch with the public. Timing, the rhetorical register used, the tone, the educational and memorialization projects that precede the apology, and the theatrical props used should enter the consideration of those who want their apology to resonate with the wider public. Thinking of the apology in terms of theatre allows us to grasp not only the validity and power of the performance by the apologizer but also the choice that the spectator has to either accept or reject the authority of the apologizer.

While apologies have been mostly studied as verbal (oral or written) acts, some scholars have recently turned their attention to the non-verbal dimension of the practice. Willy Brandt’s kneeling in front of the monument dedicated to the Warsaw Ghetto uprising in 1970 or Pope John Paul II leaning against the Western Wall and slipping a piece of paper containing a prayer into its crevices have been interpreted as acts of apology, regret and sorrow for the suffering of the Jews at the hands of Nazi Germany and the Catholic Church, respectively. Looking into gestures, bodily posture, location and emotional expressions allows us to understand the complexity of factors that enter into an apology that resonates with its audiences, thus adding richness to any analysis of such practices.

5. Intercultural Apologies

The phenomenon of intercultural apologies – interpersonal and collective apologies between individuals with different cultural backgrounds – has been made the object of numerous empirical studies. Such studies usually compare “Western” (mostly American) and “Eastern” (mostly East-Asian) understandings of the apology.

While apologies do cut across cultures, sociologists, social psychologists and students of intercultural communication tell us that there is variation in the type and number of validity conditions, in the nature of acts that should give occasion to an apology, in the strength of the motivation to apologize, in the purposes that apologies are meant to serve, and in the form and style that the practice adopts. For instance, Western individuals and institutions are supposedly less willing to apologize and more likely to focus on the mens rea (the intention behind the offence) and on the justification of the offence, while Asian individuals and institutions are more willing to apologize unconditionally, more likely to focus on the consequences of the offence, and more likely to see it within its broader context.

Such variation might tempt the observer to essentialize cultures, reify the differences, and deny the possibility of meaningful apologies between members of different cultural groups. The more difficult – yet more productive – alternative is to resist the temptation of going down the path of incommensurability and to try and valorise the reconciliation potential such acts may bring about. A willingness to see the similarities beyond the differences, to adjust one’s expectations so as to accommodate the expectations of the other and to learn transculturally may pave the way to conflict resolution, be it between persons or collectives.

6. References and Further Reading

  • Andrieu, Kora. “‘Sorry for the Genocide’: How Public Apologies Can Help Promote National Reconciliation.” Millennium: Journal of International Studies 38, no. 1 (2009): 3-23.
    • Uses Habermas’s notion of discursive solidarity to show how apologies can help rebuild political trust.
  • Barkan, Elazar and Alexander Karn (eds.). Taking Wrongs Seriously: Apologies and Reconciliation. Stanford, CA: Stanford University Press, 2006.
    • Analyzes, in a comparative and interdisciplinary framework, the role and function—as well as the limitations—of apology in promoting dialogue, tolerance, and cooperation between groups confronting one another over past injustices.
  • Barnlund, D. and M. Yoshioka. “Apologies: Japanese and American Styles.” International Journal of Intercultural Relations 14 (1990): 193-205.
    • A comparative contribution to the intercultural study of apologies, focusing on American and Japanese practices of apology.
  • Bilder, Richard B. “The Role of Apology in International Law and Diplomacy.” Virginia Journal of International Law 46, no. 3 (2006): 437-473.
    • Written from the perspective of International Relations, with a focus on the diplomatic use of apologies.
  • Borneman, John. “Public Apologies as Performative Redress.” SAIS Review 25, no. 2, (2005): 59-60.
    • Explores the role of apologies in conflict resolution processes.
  • Braithwaite, John. Crime, Shame and Reintegration. Cambridge: Cambridge University Press, 1989.
    • Foundational text for restorative justice in criminal law.
  • Celermajer, Danielle. The Sins of the Nation and the Ritual of Apology. Cambridge: Cambridge University Press, 2009.
    • Explores political apologies by mobilising religious tropes from Judaism and Christianity and speech-act theory.
  • Cunningham, Michael. “Saying Sorry: The Politics of Apology.” The Political Quarterly 70, no. 3 (1999): 285-293.
    • One of the first and clearest articles to theorise the morphology of apologies.
  • De Gruchy, John W. Reconciliation: Restoring Justice. Minneapolis, MN: Fortress, 2002.
    • An essentially theological account of processes of political reconciliation.
  • Engerman, Stanley L. “Apologies, Regrets, and Reparations.” European Review 17, no. 3-4 (2009): 593-610.
    • A historical examination of the development of the practice of apologies in the last few decades.
  • Gibney, Mark, Rhoda Howard-Hassmann, Jean-Marc Coicaud and Niklaus Steiner (eds.). The Age of Apology: The West Faces its Own Past. Tokyo: United Nations University Press, 2006.
    • Extensive collection of essays addressing the normative issues associated with the practice of collective apologies, with a focus on state apologies.
  • Gibney, Mark and Erik Roxtrom. “The Status of State Apologies.” Human Rights Quarterly 23, no. 4 (2001): 911-939.
    • Analyses the importance of transnational apologies in the development of human rights standards.
  • Gill, Kathleen. “The Moral Functions of an Apology.” The Philosophical Forum 31, no. 1 (2000): 11-27.
    • Analyses the moral aspects of apologizing, considering its impact on those who offer the apology, the recipient of the apology, and relevant communities.
  • Govier, Trudy. Taking Wrongs Seriously. Amherst, NY: Humanity Books, 2006.
    • One of the most important contributions to philosophical reflections on political reconciliation and its challenges.
  • Govier, Trudy and Wilhelm Verwoerd. “The Promise and Pitfalls of Apology.” Journal of Social Philosophy 33, no. 1 (2002): 67-82.
    • Uses the South African experiment in truth and reconciliation to examine the pitfalls of apologies.
  • Griswold, Charles L. Forgiveness: A Philosophical Exploration. New York: Cambridge University Press, 2007.
    • A philosophical exploration of what is involved in processes of forgiveness.
  • Horelt, Michel-André. “Performing Reconciliation: A Performance Approach to the Analysis of Political Apologies.” In Nicola Palmer, Danielle Granville and Phil Clark (eds.), Critical Perspectives on Transitional Justice. Cambridge: Intersentia, 2011: 347-369.
    • One of the very few contributions exploring the role of non-verbal elements in apologies.
  • Howard-Hassmann, Rhoda. “Official Apologies.” Transitional Justice Review 1, no. 1 (2012): 31-53.
    • Unpacks the various moral, sociological and political dimensions of collective apologies.
  • Kampf, Zohar. “Public (Non–)Apologies: The Discourse of Minimizing Responsibility.” Journal of Pragmatics 41 (2009): 2257–2270.
    • Analyses a number of apologies to highlight the strategies that officials adopt in trying to minimize their responsibility for the wrongs they are apologising for.
  • La Caze, Marguerite. “The Asymmetry between Apology and Forgiveness.” Contemporary Political Theory 5 (2006): 447-468.
    • Discusses the idea that apologizing does not trigger a duty of forgiveness for the addressee of the apology.
  • Lazare, Aaron. On Apology. Oxford: Oxford University Press, 2004.
    • A psychological account of the processes involved in giving and receiving an apology.
  • Mihai, Mihaela. “When the State Says ‘Sorry’: State Apologies as Exemplary Political Judgments,” Journal of Political Philosophy 21, no. 2 (2013): 200-220.
    • Looks at how collective apologies could mobilize support from a public that opposes the idea of an apology for past injustice.
  • Mills, Nicolaus. “The New Culture of Apology.” Dissent 48, no. 4 (2001): 113-116.
    • Discusses the growing number of public apologies and the functions they are meant to perform.
  • Negash, Girma. Apologia Politica: States & their Apologies by Proxy. Lanham, MD: Lexington Books, 2006.
    • Examines public apology as ethical and public discourse, recommends criteria for the apology process, analyzes historical and contemporary cases, and formulates a guide to ethical conduct in public apologies.
  • Nobles, Melissa. The Politics of Official Apologies. New York: Cambridge University Press, 2008.
    • A citizenship-based justification for the practice of collective apologies for past injustice.
  • Public Apology Database, published by the CCM Lab at the University of Waterloo.
    • A comprehensive, structured database of apologies.
  • Schmidt, Theron. “‘We Say Sorry’: Apology, the Law and Theatricality.” Law Text Culture 14, no. 1 (2010): 55-78.
    • Explores the theatricality at work in three examples of publicly performed discourse.
  • Smith, Nick. I Was Wrong: The Meanings of Apologies. Cambridge: Cambridge University Press, 2008.
    • A philosophical account of the conditions of validity for interpersonal and collective apologies.
  • Sugimoto, N. “A Japan-U.S. Comparison of Apology Styles.” Communication Research 24 (1997): 349-370.
    • Important contribution for the intercultural study of apologies.
  • Tavuchis, Nicholas. Mea Culpa: A Sociology of Apology and Reconciliation. Stanford, CA: Stanford University Press, 1991.
    • A sociological take on apologies and one of the first books published on the topic.
  • Thaler, Mathias. “Just Pretending: Political Apologies for Historical Injustice and Vice’s Tribute to Virtue.” Critical Review of International Social and Political Philosophy 15, no. 3 (2012): 259–278.
    • Examines the sincerity condition of collective apologies and argues for a purely consequentialist view of such acts.
  • Thompson, Janna. “The Apology Paradox.” The Philosophical Quarterly 50, no. 201 (2000): 470-475.
    • Examines the non-identity problem in apologies for historical injustices.
  • Villadsen, Lisa Storm. “Speaking on Behalf of Others: Rhetorical Agency and Epideictic Functions of Official Apologies.” Rhetoric Society Quarterly 38, no. 1 (2008): 25-45.
    • Looks at apologies through the lens of rhetoric.

Author Information

Mihaela Mihai
Email: mihaela.mihai@york.ac.uk
University of York
United Kingdom

Philosophy through Film

This article introduces the main perspectives concerning philosophy through film. Film is understood not so much as an object of philosophical reflection but as a medium for engaging in philosophy. Contributions to the area have flourished since the beginning of the 21st century, along with debates over the extent to which film can really be understood to be “doing” philosophy, as opposed to merely serving as a source of illustration or example for philosophical reflection. A number of objections have their origins in perceived similarities between the cinema and Plato’s cave; other objections have their origins in more general Platonic criticisms of fictive art’s capacity to reveal truth. Against these objections are some surprisingly bold views of film’s capacity to do philosophy, to the effect that much of what can be done in the verbal medium can also be done in the cinematic one; or that there is a distinctive kind of cinematic thinking that resists paraphrasing in traditional philosophical terms. There are also more moderate views, to the effect that film can be seen as engaging in certain recognizably philosophical activities, such as the thought experiment; or that they are able to present certain kinds of philosophical material better than standard philosophical genres. This article considers these views for and against the idea of philosophy through film. It also considers the “imposition” objection—that while film may serve to provide useful illustration, any philosophizing is in fact being done by the philosopher using the film.

Table of Contents

  1. What is Philosophy through Film?
  2. Platonic Objections
    a. Cinema as Plato’s Caves
    b. Wartenberg and Thought Experiments
  3. Objections: Smith, Russell
  4. Objecting to the ‘Bold Thesis’
    a. Livingston
    b. Mulhall
    c. Sinnerbrink
    d. Cox and Levine
  5. The Imposition Objection
  6. References and Further Reading

1. What is Philosophy through Film?

This article introduces the main perspectives concerning the idea of doing philosophy through film. By film here is meant, primarily, narrative fiction film. The idea of doing, or at least engaging in some way with, philosophy through film can mean at least two things. Firstly, it may mean using film as a resource, a source of example and illustration, in order to illuminate philosophical positions, ideas and questions. Secondly, it may mean that film itself is to be understood as a medium for philosophising—doing philosophy in film or philosophy as film. The latter implies a more robust engagement of film with philosophy. The extent to which a film can be philosophical or contribute to philosophical knowledge has itself been a matter of some debate. However, what is broadly accepted is that many films ‘resonate in fruitful ways with traditional and contemporary philosophical issues’ (Livingston and Plantinga 2009: xi).

Consideration of film in its philosophical significance, and of philosophical issues through film, can be distinguished from more traditional philosophy of film, though in practice the two activities overlap. Both are subfields in the area of philosophical aesthetics. Philosophy of film traditionally concerns itself with the reflective study of the nature of film, aiming to spell out what film is, whether it is an art, how it differs from other arts, and so on. It is philosophy about film. Contrasted with this is the idea of film serving as a resource, means or medium for the illumination and exploration of philosophical ideas and questions. This is philosophy through film. Historically, philosophy through film is of a more recent vintage than philosophy of film, which enjoyed significant development in the 1980s. Philosophy through film has flourished mostly since 2000, although there were a number of important forerunners who promoted the idea that film can contribute to philosophy, including Cavell (1979), Jarvie (1987), Kupfer (1999) and Freeland (2000).

Since the turn of the century a significant amount of literature has emerged, devoted to the exploration of philosophical themes and questions through narrative films or genres of narrative film. The literature in this area includes more or less popular explorations of the philosophical dimensions of particular films and genres, and of the work of specific directors or writers (for example, Irwin 2002, Abrams 2007, Sanders 2007, Eaton 2008, LaRocca 2011). Along with them there are more pedagogically-oriented introductions to philosophy through film (for example, Litch 2002, Rowlands 2004, Falzon 2007, Cox and Levine 2012). There are also the more theoretical discussions defending the idea of philosophy through film (for example, Mulhall 2002, Wartenberg 2007, Sinnerbrink 2011), or criticising it (for example, Russell 2000, Smith 2006, Livingston 2009).

This article will discuss a range of philosophical positions that have emerged both for and against the idea of philosophy through film. It will proceed by considering a number of objections to the idea.

2. Platonic Objections

The idea of philosophy through film most clearly contrasts with the view that film has nothing to do with philosophy; that film is in effect philosophy’s ‘bad other’, containing all that is foreign or dubious from philosophy’s point of view. Along these lines it might be argued that philosophy is the realm of reflection and debate, whereas film is restricted to experience and action; that philosophy is concerned with reality and truth, as opposed to film which is the realm of mere illusion, appearance, unreal images; or that philosophy deals with universal questions and is a serious business, whereas film is confined to particular narratives, and is designed only for entertainment and distraction. Implicit in the distinctions invoked here is a certain conception of philosophy. As might be expected, considering film to be outside the realm of the philosophical depends on having a certain view of what constitutes philosophy. In general, consideration of the question of philosophy through film inevitably raises meta-philosophical questions about the nature of philosophy itself.

a. Cinema as Plato’s Caves

The dismissal of film as inherently non-philosophical is at the heart of what can be called the ‘Platonic’ objection to philosophy through film. Many of the oppositions invoked in distinguishing the philosophical from the non-philosophical – between reflection and experience, reality and illusion, universality and particularity – find articulation in Plato. They can be discerned in Plato’s Allegory of the Cave, in The Republic (514a–520a). The famous story of prisoners mistaking for reality what are in fact mere shadows projected on the cave wall carries the message that visual images and representations are inadequate as a source of knowledge; and more broadly, that philosophical enlightenment requires thinking and critical reflection, rather than reliance on the way things appear to us. There is also an implicit rejection, developed elsewhere by Plato, of the fictive arts, which trade in unreal representations and foster illusion.

Plato’s cave story has a particular resonance for film. The scenario proposed in the cave story is itself uncannily evocative of the modern cinema (within film theory, see for example Baudry 1976), and has been seen as haunting theoretical reflection on film (Stam 2009: 10). On this model, film is a realm of seductive illusion, all too readily confused with reality by the captive audience. This view is evident, amongst other places, in the Marxist or psychoanalytical semiotic theorising that was prominent in the 1960s and 1970s, which holds that narrative films are forms of bourgeois illusionism. Their apparently realistic content is in fact determined by the dominant ideology of the time, and they imprison their audience by inculcating conformist thinking that makes people accepting of their social and political circumstances (Wilson 1986: 12-13; Stam 2009: 138-9). Philosophical insight, it would appear, requires that one look elsewhere.

Against this view, which in its monolithic dismissal of film pays scant attention to what is going on in particular films themselves, is the contention that films can do more than simply echo dominant ideologies that distort and obscure social reality. Careful examination of individual films shows how prevailing ways of thinking, social practices and institutions, can not only be illuminated but also challenged within a cinematic framework, through playfulness, irony, even downright subversion (Wilson 1986: 13; Stam 2009: 139). For example, Ridley Scott’s feminist road movie Thelma and Louise (1991) highlights and mocks various aspects of male power. This implies that film is capable of adopting some kind of reflective attitude towards what it presents, and that it can present narrative scenarios through which such reflective activity can be pursued. If this is so, Plato’s cave scenario is no longer an adequate model for film. Indeed, the very philosophical scenario that encourages one to think critically about what one experiences, to think philosophically, itself appears in various forms in film, with similar effects. For example, Bernardo Bertolucci’s film The Conformist (1970) uses Plato’s cave image to draw attention to the central character’s imprisonment in the delusions of Fascist ideology. As such, the critical reflection on what is given to us, arguably a necessary characteristic of philosophy, is not foreign to film.

b. Wartenberg and Thought Experiments

A second, though once again ultimately Platonic, objection to the idea of philosophy through film is that fiction film only deals in specific narratives, images, and scenarios, whereas philosophy concerns itself with universal truths. How can film, mired in the particular, have anything to do with philosophy? Thomas Wartenberg has developed a conception of philosophy through film that turns this objection on its head. He points out that fictional narratives can be found readily enough in philosophy itself, in the form of imaginary scenarios, hypothetical situations, in short, thought experiments. More than mere ornamentation, thought experiments play a role in arguments, initiating philosophical reflection, raising general questions, questioning existing views by posing counter-examples, exploring what is essential in a concept, confirming a theory or helping build one, and so forth. In other words, they are modes of reflection, allowing general points to be made through particular stories (see Wartenberg 2007, 24, 56-65).

Plato’s cave story is itself such a thought experiment, a narrative embodying a memorable image or scenario, designed to raise general questions about the role of sense experience, the nature of knowledge, the character of philosophical enlightenment itself. Ironically enough, Plato resorts to a narrative, embodying a memorable image, in order to argue that images have no place in philosophical discourse; and indeed, he makes use of a narrative that helps to establish a larger narrative in which philosophy itself is seen as arising through the rejection of narrative in favour of a rational discourse devoted to universal truths (Wartenberg 2007: 21; see also Derrida 1993). As it happens, philosophical discourse is filled with such narrative thought experiments: from Plato’s Ring of Gyges to Descartes’ dream or evil demon hypotheses, from Locke’s prince and the cobbler to Nozick’s experience machine. At the very least, one can say that concrete narratives and scenarios in the form of thought experiments are not foreign to philosophy; and equally, there is no reason why fiction film, which deals with such narratives, should not be able to pursue general points, or raise philosophical questions, through them. Wartenberg argues that it makes sense to think of some fiction films as working in the ways that thought experiments do, and that to that extent they may be seen as doing philosophy (Wartenberg 2007: 67). For example, Michel Gondry’s 2004 film Eternal Sunshine of the Spotless Mind may be interpreted as a thought experiment that provides a counterexample to the ethical theory of utilitarianism (Wartenberg 2007: 76-93; see also Grau 2006).

It might be added that even within philosophical discourse, many of these experiments have a dramatic and decidedly cinematic quality about them; and film-makers have not been slow in translating them into visual form. A notable case is the cinematic appropriation, in various forms, of skeptical thought experiments involving the possibility of global illusion, such as Descartes’ dream hypothesis or evil demon thought experiment, most famously in the Wachowski Brothers’ much discussed 1999 film The Matrix. The capacity of films to explicitly appropriate philosophical material also suggests a further, and comparatively straightforward, way in which one might speak of philosophy through film: the film may be about philosophy in various ways. It may be about a philosopher who in the course of the film expounds their views, as in Derek Jarman’s Wittgenstein (1993); it may have characters who articulate or discuss philosophical positions, as in Eric Rohmer’s Ma Nuit Chez Maud (1969) or Richard Linklater’s Waking Life (2001); or it may be an adaptation of a philosophically interesting text, as in Jean-Jacques Annaud’s The Name of the Rose (1986). However, this in itself only amounts to a recording of particular views and positions, which find expression in the film. It needs to be distinguished from the idea of film as a medium for philosophising, of film as philosophy, of doing philosophy in film.

3. Objections: Smith, Russell

A third objection to the idea of philosophy through film can be found in Murray Smith, who argues that film may appropriate or employ philosophically interesting scenarios, but can never really engage with philosophy because it ultimately has very different structuring interests. Whereas philosophy presents thought experiments as narratives in which philosophical concerns (for example, epistemological issues) are at the forefront, in film it is artistic concerns—dramatic or comic ones—that dominate and trump any philosophical concerns (see Smith 2006). In this way film becomes once more the ‘bad other’ of philosophy. Against this it may be argued that it is artificial to acknowledge similarities between film and philosophical texts ‘only to insist that these similarities are subordinate to important differences between them’ (Wartenberg 2007: 16-17). In addition, it may of course be acknowledged that films do much more than simply engage with philosophical questions and concerns, but this does not in itself preclude the possibility that, amongst other things, they might do precisely that.

Smith also supplements his argument by reinstating Plato’s denial of the capacity of fictive works to reveal truth. Smith claims that works of art like films are inherently ambiguous, and cannot present the sort of precision that is necessary for articulating and defending philosophical claims (Smith 2006; Wartenberg 2007: 17). A similar claim is made by Bruce Russell, for whom narrative films lack the explicitness necessary for philosophy: they are ambiguous to the extent that it is not true that some particular argument is to be found in them (Russell 2000; Wartenberg 2007: 19). Against this it can be argued that philosophical claims and arguments attributed to films may not always have been formulated by philosophical interpreters as precisely as they should be, but ‘this does not establish the claim that the films themselves are inherently ambiguous and their arguments incapable of precise formulation’ (Wartenberg 2007: 20).

Here, once again, an a priori dismissal of the idea that film can be philosophical shows itself to be informed by Platonic considerations, in this case Plato’s dismissal of the artists from the city of philosophy. By the same token the notion of philosophy through film amounts to a repudiation of the Platonic prejudice against art, of the view, at least implicit in the cave scenario, that the fictive arts trade in unreal representations and foster illusion, and cannot be philosophically enlightening (see Wartenberg 2007: 15, 17; Sinnerbrink 2011: 4-5). Philosophy through film may not amount to an entirely anti-Platonic enterprise, however. As with Plato’s repudiation of image and narrative which he nonetheless makes use of in his own work, Plato’s dismissal of the artists is paradoxical, given that, as Iris Murdoch notes, Plato himself is a great artist (Murdoch 1977: 87).

4. Objecting to the ‘Bold Thesis’

a. Livingston

Paisley Livingston argues that film may certainly be used for philosophically interesting purposes, as a resource for philosophy. It can serve to illustrate views about scepticism, wisdom and so on, and even give expression to philosophically informed positions and perspectives. But to do more, to be said to be philosophising, film would have to ‘make independent, innovative and significant contributions to philosophy by means unique to the cinematic medium…where such contributions are independent in the sense that they are inherent in the film, and not based on verbally articulated philosophising’ (Livingston 2008: 12). This is what Livingston calls the ‘bold thesis’ of film as philosophy: the view that films engage in creative philosophical thinking and the formation of new philosophical concepts. Livingston rejects this bold thesis because either the philosophical content of a film can be paraphrased verbally, in which case it has no special connection with film, or it cannot be paraphrased, in which case one may wonder whether this supposed content exists at all.

b. Mulhall

Against this kind of view, a number of positions have been formulated to the effect that film is more than merely a resource for philosophy, a useful way of illustrating philosophical positions and themes. At one extreme, there is Stephen Mulhall’s classic formulation of film as philosophy, in the introduction to his book on the Alien films:

I do not look to these films as handy or popular illustrations of views and arguments properly developed by philosophers; I see them rather as themselves reflecting on and evaluating such views and arguments, as thinking seriously and systematically about them in just the same ways that philosophers do. Such films are not philosophy’s raw material, are not a source for its ornamentation; they are philosophical exercises, philosophy in action – film as philosophising (Mulhall 2002: 2).

This formulation implies that there can be a cinematic performance of philosophy—that what is done in the verbal medium of philosophy can also be done in the cinematic medium. However, by the same token it might be argued that on this view, everything done cinematically could also be done in purely verbal philosophical form; it could be entirely paraphrased, or re-expressed in verbal terms. The problem with this conception of philosophy through film is that film here may no longer be philosophy’s ‘bad other’, but in so far as it can be said to philosophise, it seems to have been effectively reduced to philosophy. On this conception, film as a philosophical exercise is too similar to philosophy to be doing anything distinctively cinematic; which also means that it presumably could be dispensed with in favour of purely verbal philosophical argument without any loss. The cinematic setting for this reflective activity is purely accidental. As such, philosophy through film simply turns out to be more philosophy.

c. Sinnerbrink

At the other extreme, Robert Sinnerbrink argues that films are able to engage in a distinctively cinematic kind of thinking that resists philosophical translation or paraphrase. Such films cannot be reduced to a philosophical thesis or theoretical problem, or translated into a ready-made position or argument; they confront existing categories and open up new ways of thinking (Sinnerbrink 2011: 10). By concretely showing rather than arguing, these films question aspects of our practices or normative frameworks, challenge established ways of seeing, disclose new aspects of experience, and open up new paths of thinking (Sinnerbrink 2011: 141-142). Thus, for example, rather than exploring scenarios of global illusion that draw attention to a distinction between appearance and underlying reality, a film like David Lynch’s Mulholland Drive (2001) can be seen as bringing into question the very distinction between the real and the illusory, in order to explore an indeterminate zone between fantasy and reality (see Sinnerbrink 2005).

By the same token, however, it might be argued that the more distinctively cinematic the reflection, that is, the more it resists translation into a philosophical thesis or form, the less recognisably philosophical it is going to be. In other words, the less it will constitute philosophy through film. Whereas in Mulhall’s formulation, filmic reflection was too similar to philosophy to do anything distinctively cinematic, now filmic reflection is too different from philosophy to do anything recognisably philosophical. In a sense this view does not oppose Livingston’s rejection of the bold thesis regarding philosophy through film—not because there cannot be a uniquely cinematic form of reflection, but because to the extent that there is, it is no longer philosophical. In this form, film would be philosophy’s ‘good other’. It would embody a form of thinking or of reflection—something other than philosophy. Film, so understood, would neither be a mere resource or instrument for philosophy, nor a form of reflection that could be translated into existing philosophical categories. Rather, it would be an autonomous mode of reflection that transcends philosophy. Here, one would be escaping from philosophy through film; though it might also be argued that this represents the transformation of philosophical reflection into something new.

d. Cox and Levine

Finally, a ‘moderate’ position, standing effectively between those of Mulhall and Sinnerbrink, which acknowledges both the philosophical and the necessarily cinematic character of cinematic thinking, is argued for by Damien Cox and Michael Levine. This is the view that films are capable of performing philosophical activities that can also be pursued in standard verbal philosophical form, but they can sometimes do some things better than written texts can. That is, precisely as films, they are capable of presenting certain kinds of philosophical material better than standard philosophical genres. This dovetails with the view developed by Wartenberg, discussed earlier, that film is able to explore philosophical issues and questions through concrete thought experiments. What Cox and Levine argue is that cinematic thought experiments can be presented with greater richness, nuance and perspective than may be found within the genres of professional philosophy (the book, the journal article), where these experiments are typically abstract, thin and context-free (Cox and Levine 2012: 10-12).

This position joins with those critiques of philosophy to the effect that the peculiar abstractness of standard philosophical genres can be a source of distortion, and which argue that some areas of philosophy, like ethics, are sometimes more at home in literature and the arts (Cox and Levine 2012: 11-12; see Murdoch 1970, 1977; Nussbaum 1990). Clearly evident once again, in this affirmation of art in general and film in particular, is the rejection of the Platonic view that the fictive arts, trading in unreal representations, only foster illusion, and have nothing to contribute to philosophy.

5. The Imposition Objection

The idea of philosophy through film, then, can cover a number of things, including: using film as a resource, providing useful examples and illustrations of philosophical ideas; and the idea of film as philosophising, at least in the sense of evoking or enacting philosophically interesting thought experiments that remind us of various aspects of our concrete experience of life. In addition, it can cover the idea of film as being explicitly about philosophers or philosophy, giving explicit expression to philosophical positions and perspectives.

This brings us to one more possible objection to the idea of philosophy through film, in particular film as doing philosophy, namely, what has been termed by Wartenberg the imposition objection. This is the claim that film may certainly be used for philosophically interesting purposes, such as providing useful illustrations, but that any philosophising is done by the philosopher using the film, rather than by the film itself. On this view, to think that the film itself is doing anything philosophical is a mistake; any philosophical significance one imagines to be discernible in the film itself is in fact only a projection of the philosopher’s views onto the film. A film may raise interesting questions, but only a philosophical interpreter can organize those into a coherent position or argument (Wartenberg 2007: 25). This view in effect requires that film be once again viewed as the ‘bad other’ of philosophy, with any philosophical significance necessarily being imposed from outside. At best, we are back with the view that film may serve as a useful resource for philosophy, a source of example and illustration, but that it cannot be said in any way to do philosophy.

Obviously, to the extent that film can be said to do philosophy, the imposition view has to be rejected. A general point to make here is that any engagement with film requires interpretation on the part of the onlooker, who is, after all, at the most basic level, required to organise and make sense of a multiplicity of images in a certain way. Engagement with film in its philosophical aspects is no different; it requires a certain reading or interpretation of the film on the part of the onlooker. At the very least, engaging with the philosophical dimension of a film involves singling out a particular aspect of that film and ignoring others, since there is always much more going on in a film than a concern with philosophically relevant matters. As far as responding to the imposition objection is concerned, the important issue is not whether there is interpretation but whether or not the interpretation, the philosophical reading, is imposed on the film; that is, whether it is appropriate or inappropriate to the film. With regard to what renders an interpretation appropriate, both author-centred and audience-centred positions have been argued for in the literature. These alternatives recall broader debates in aesthetics and literary theory over how far a work of art is to be judged by reference to the purposes of its creator, and to what extent a text may yield more than the creator of the work could have conceived.

On the one hand there is the ‘author-centred’ view argued for by Wartenberg, that a philosophical interpretation of a film may be considered appropriate because it is grounded in the intentions of the authors of the film (Wartenberg 2007: 26). That is, it is necessary that the film’s creators intend to present the views or pose the questions that are being attributed to the film. As a result there are constraints as to what interpretations may be made. Specifically, one should never attribute to the film a meaning that could not be intended by the creator of that work. For a philosophical film interpretation to be plausible, it needs to posit a meaning the filmmaker at least could have intended. To that extent, the meaning discerned is not simply imposed or projected onto the film, but rather is inherent in the film, as part of its creators’ intention. Certainly, the filmmaker may not have an explicit conception of the philosophical views that are embedded in the film; but it is enough that they have some conception of the relevant idea, and of what might be questionable about it (Wartenberg 2007: 26, 91). A problem with the ‘author-centred’ view is that it may be difficult or impossible to establish what the director’s or writer’s intention might have been; moreover, since films are collaborative endeavours it may not be possible even to identify a particular author.

On the other hand, there is the view that the philosophical views and questions presented in films can be assessed independently of authorial intention. For Cox and Levine, ‘[A]n interesting aspect of film, like other forms of narrative art (such as novels) is that it often lets us see and surmise a great deal more than its creators intended’ (Cox and Levine 2012: 13). In this spirit they argue that philosophical views may be embedded in a film without it being the director’s or writer’s intention that the view be manifest, or without them even being aware that it is a view they hold. That is, ‘it is often possible to distinguish authorial intention from what is revealed in film narrative, visual effect, or performance’ (Cox and Levine 2012: 14-15). Here, the emphasis changes from a concern with the author’s intentions to how the film is received—an audience-centred interpretation. Films have ‘lives and meanings of their own,’ and these will vary over time and are relative to a degree to particular audiences (Cox and Levine 2012: 15).

What these views have in common is that they acknowledge that engagement with a film involves some interpretation on the part of the onlooker; but also that this interpretation is not arbitrary, such that one could read any kind of significance whatsoever into the film. That would indeed imply that any philosophical significance is simply being imposed on the film from outside. However, even an audience-centred interpretation is subject to constraints arising from the film itself. Not any kind of interpretation is appropriate; some interpretations work better than others. This is so even if one views the film merely as providing a handy illustration of a pre-existing philosophical position or issue, where the film is being used as a resource for philosophical purposes, and the concern is primarily to illuminate certain philosophical ideas. Even here the interpretation is not arbitrary. There are better or worse cinematic illustrations that one can appeal to; more or less effective cinematic resources one can make use of. Some films lend themselves more readily to the task than others; and to the extent that they do so, being able to make use of a film in this way also says something about the film itself, reflecting the extent to which it engages with philosophical ideas. For example, one sees philosophical ideas like Locke’s memory-based conception of personal identity concretely illustrated in a film like Paul Verhoeven’s Total Recall (1990); the film also invites this interpretation, and thereby reveals itself to have strong philosophical resonances.

Furthermore, if film even as illustration can be said to be engaging with philosophy, it seems to follow that a sharp distinction cannot really be drawn between film as illustration for philosophy, and as doing philosophy. It would instead be a matter of degrees of engagement, of film being more or less engaged with philosophical positions, issues and questions. By the same token, if film does indeed have a capacity to do philosophy, in particular by presenting certain philosophical material with greater richness, nuance and perspective than is possible within the genres of professional philosophy, we would expect something of this to also be present in film as mere illustration. Certainly, Wartenberg has argued to this effect, that films which illustrate previously articulated philosophical positions can, despite their status as illustrations, deepen understanding of those positions by providing a concrete version of an abstract theory, and to that extent they can be said to embody philosophical thinking. For example, Charlie Chaplin’s Modern Times (1936) can be seen as offering a concrete illustration of Marx’s conception of the alienating mechanisation of human beings under the factory system, a concrete representation of an abstract account that clarifies and extends it, and shows its human significance (Wartenberg 2007: 32, 44-54).

Finally, if film even as illustration engages with philosophy, and there is no sharp distinction between film as illustration and film as doing philosophy, it will no longer be possible to accept the former while rejecting the latter. The imposition objection allows that film may be used for illustration, but rejects the idea of film as philosophising, dismissing this as no more than a projection on the part of the onlooker. Without a sharp distinction between the two, however, that combination cannot be sustained. Or to put it in more positive terms, accepting the former implicates one in the latter, because far from being mere illustration, film as illustration can already be said, to an extent, to be doing philosophy.

6. References and Further Reading

  • Abrams, Jerold J. 2007. The Philosophy of Stanley Kubrick. Lexington: The University Press of Kentucky.
  • Baudry, Jean-Louis. 1976. ‘The Apparatus’, Camera Obscura. Fall 1976 1(11), 104-12.
  • Cavell, Stanley. 1979. The World Viewed: Reflections on the Ontology of Film, enlarged edition. Cambridge, MA: Harvard University Press.
  • Cox, Damien and Michael P. Levine. 2012. Thinking Through Film: Doing Philosophy, Watching Movies. Malden: Wiley-Blackwell.
  • Derrida, Jacques. 1993. ‘Circumfession’, in J. Derrida and G. Bennington, Jacques Derrida (pp. 3–315). Chicago, IL: University of Chicago Press.
  • Eaton, A.W. (ed.) 2008. Talk to Her. London/New York: Routledge.
  • Falzon, Christopher. 2007. Philosophy Goes to the Movies: An Introduction to Philosophy. London/New York: Routledge.
  • Freeland, Cynthia. 2000. The Naked and the Undead: Evil and the Appeal of Horror, Boulder: Westview Press.
  • Grau, Christopher. 2006. ‘Eternal Sunshine of the Spotless Mind and the Morality of Memory’, Journal of Aesthetics and Art Criticism 64 (1):119–133.
  • Irwin, William (ed.) 2002. The Matrix and Philosophy: Welcome to the Desert of the Real. Chicago: Open Court.
  • Jarvie, Ian. 1987. Philosophy of the Film: Epistemology, Ontology, Aesthetics. New York/London: Routledge & Kegan Paul.
  • Kupfer, Joseph H. 1999. Visions of Virtue in Popular Film. Boulder: Westview Press.
  • LaRocca, David (ed.) 2011. The Philosophy of Charlie Kaufman. Lexington: The University Press of Kentucky.
  • Litch, Mary. 2002. Philosophy through Film. New York: Routledge.
  • Livingston, Paisley. 2006. ‘Theses on Cinema as Philosophy’, in Murray Smith and Thomas E. Wartenberg (eds), Thinking Through Cinema: Film as Philosophy. Oxford: Blackwell.
  • Livingston, Paisley and Carl Plantinga (eds). 2009. ‘Preface’ to The Routledge Companion to Philosophy and Film. London/New York: Routledge.
  • Mulhall, Stephen. 2002. On Film. London: Routledge.
  • Murdoch, Iris. 1970. The Sovereignty of Good. Boston: Routledge & Kegan Paul.
  • Murdoch, Iris. 1977. The Fire and the Sun: Why Plato Banished the Artists. Oxford: Oxford University Press.
  • Nussbaum, Martha. 1990. Love’s Knowledge. Oxford: Oxford University Press.
  • Read, Rupert and Jerry Goodenough (eds) 2005. Film as Philosophy: Essays on Cinema after Wittgenstein and Cavell. London: Palgrave Macmillan.
  • Rowlands, Mark. 2004. The Philosopher at the End of the Universe. New York: Thomas Dunne Books.
  • Russell, Bruce. 2000. ‘The Philosophical Limits of Film’, Film and Philosophy. Special Issue on Woody Allen, 163-67.
  • Sanders, Steven M. 2007. The Philosophy of Science Fiction Film. Lexington: The University Press of Kentucky.
  • Sinnerbrink, Robert. 2005. ‘Cinematic Ideas: David Lynch’s Mulholland Drive’, Film-Philosophy 9 (34).
  • Sinnerbrink, Robert. 2011. New Philosophies of Film: Thinking Images. London and New York: Continuum.
  • Smith, Murray. 2006. ‘Film Art, Argument and Ambiguity’, in Murray Smith and Thomas E. Wartenberg (eds), Thinking Through Cinema: Film as Philosophy. Oxford: Blackwell.
  • Stam, Robert. 2000. Film Theory: An Introduction. Malden: Blackwell.
  • Wartenberg, Thomas. 2007. Thinking on Screen: Film as Philosophy. New York/London: Routledge.
  • Wilson, George. 1986. Narration in Light: Studies in Cinematic Point of View. Baltimore/London: The Johns Hopkins University Press.

Author Information

Christopher Falzon
Email: Chris.Falzon@newcastle.edu.au
University of Newcastle
Australia

Aesthetic Formalism

Formalism in aesthetics has traditionally been taken to refer to the view in the philosophy of art that the properties in virtue of which an artwork is an artwork—and in virtue of which its value is determined—are formal in the sense of being accessible by direct sensation (typically sight or hearing) alone.

While such Formalist intuitions have a long history, prominent anti-Formalist arguments towards the end of the twentieth century (for example, from Arthur Danto and Kendall Walton, according to which none of the aesthetic properties of a work of art are purely formal) have been taken by many to be decisive. Yet in the early twenty-first century there has been a renewed interest in and defense of Formalism. Contemporary discussion has revealed both “extreme” and more “moderate” positions, but the most notable departure from traditional accounts is the move from Artistic to Aesthetic Formalism.

One might more accurately summarize contemporary Formalist thinking by noting the complaint that prominent anti-Formalist arguments fail to accommodate an important aspect of our aesthetic lives, namely those judgements and experiences (in relation to art, but also beyond the art-world) which should legitimately be referred to as “aesthetic” and which are accessible by direct sensation, proceeding independently of one’s knowledge or appreciation of a thing’s function, history, or context.

The presentation below is divided into five parts. Part 1 offers a historical overview. It considers some prominent antecedents to Formalist thinking in the nineteenth century and reviews twentieth century reception (including the anti-Formalist arguments that emerged in the latter part of this period), before closing with a brief outline of the main components of the twenty-first century Formalist revival. Part 2 returns to the early part of the twentieth century for a more in-depth exploration of one influential characterisation and defense of Artistic Formalism, developed by the art critic Clive Bell in his book Art (1913). Critical reception of Bell’s Formalism has been largely unsympathetic, and some of the more prominent concerns with this view will be discussed here, before turning—in Part 3—to the Moderate Aesthetic Formalism developed in the early part of the twenty-first century by Nick Zangwill in his The Metaphysics of Beauty (2001). Part 4 considers the application of Formalist thinking beyond the art world by examining Zangwill’s responses to anti-Formalist arguments regarding the aesthetic appreciation of nature. The presentation closes with a brief conclusion (Part 5), together with references and suggested further reading.

Table of Contents

  1. A Brief History of Formalism
    1. Nineteenth Century Antecedents
    2. Twentieth Century Reception
  2. Clive Bell’s Artistic Formalism
    1. Clive Bell and ‘Significant Form’
    2. The Pursuit of Lasting Values
    3. Aesthetic versus Non-Aesthetic Appreciation
    4. Conclusions: From Artistic to (Moderate) Aesthetic Formalism
  3. Nick Zangwill’s Moderate Aesthetic Formalism
    1. Extreme Formalism, Moderate Formalism, Anti-Formalism
    2. Responding to Kendall Walton’s Anti-Formalism
    3. Kant’s Formalism
  4. From Art to the Aesthetic Appreciation of Nature
    1. Anti-Formalism and Nature
    2. Formalism and Nature
  5. Conclusions
  6. References and Further Reading

1. A Brief History of Formalism

a. Nineteenth Century Antecedents

When A. G. Baumgarten introduced the term “aesthetic” into the philosophy of art, it seemed to be taken up with the aim of recognising, as well as unifying, certain practices, and perhaps even the concept of beauty itself. It is of note that the phrase l’art pour l’art seemed to gain significance at roughly the same time that the term aesthetic came into wider use.

Much has been done in recognition of the emergence and consolidation of the l’art pour l’art movement which, as well as denoting a self-conscious rebellion against Victorian moralism, has been variously associated with bohemianism and Romanticism, and which encapsulates what is, for some, the central position on art for the main part of the nineteenth century. First appearing in Benjamin Constant’s Journal intime as early as 1804, in a description of Schiller’s aesthetics, the initial statement, “L’art pour l’art without purpose, for all purpose perverts art”, has been taken not only as a synonym for the disinterestedness reminiscent of Immanuel Kant’s aesthetics but as a modus operandi in its own right: an evaluative framework and corresponding practice for those wishing to produce art and, in so doing, to define the boundaries of artistic procedure.

These two interpretations are related insofar as it is suggested that this consolidated school of thought takes its initial airings from a superficial misreading of Kant’s Critique of Judgement (a connection we will return to in Part 3). Kant’s Critique was not translated into French until 1846, long after a number of allusions that presuppose familiarity with, and certainly a derivation from, Kant’s work. John Wilcox (1953) describes how early proponents, such as Victor Cousin, spoke and wrote of Kant’s work at second hand, or espoused positions whose Kantian credentials turn out to be somewhat undeserved. The result was that anyone interested in the arts in the early part of the nineteenth century would be exposed to a new aesthetic doctrine whose currency involved variations on terms including aesthetic, disinterest, free, beauty, form and sublime.

By the 1830s, a new school of aesthetics had thus moved from the diluted Kantian notion of artistic genius giving form to the formless, presented in Schiller’s aesthetics, via the notion of beauty as disinterested sensual pleasure, found in Cousin and his followers, towards an understanding of a disinterested emotion which constitutes the apprehension of beauty. All or any of these could be referred to by the expression l’art pour l’art; all became increasingly associated with the term aesthetic.

Notable adopters of, and thus figures identified with, what may legitimately be referred to as this “school of thought” included Victor Hugo, whose preface to Cromwell, in 1827, went on to constitute a manifesto for the French Romantic movement and certainly gave support to the intuitions at issue. Théophile Gautier, recognising a theme in Hugo, promoted a pure art-form less constrained by religious, social or political authority. In the preface to his Premières poésies (1832) he writes: “What [end] does this [book] serve? – it serves by being beautiful… In general, as soon as something becomes useful it ceases to be beautiful”. This conflict between social usefulness and pure art also gained, on the side of the latter, an association with Walter Pater, whose influence on the English Aesthetic movement blossomed during the 1880s, when the adoption of sentimental archaism as the ideal of beauty was carried to extravagant lengths. Here associations were forged with the likes of Oscar Wilde and Arthur Symons, further securing (though not necessarily promoting) a connection with aestheticism in general. Such recognition would see the influence of l’art pour l’art stretch well beyond the second half of the nineteenth century.

As should be clear from this brief outline it is not at all easy, nor would it be appropriate, to suggest the emergence of a strictly unified school of thought. There are at least two strands that can be separated in what has been stated so far. At one extreme we can identify claims like the following from the preface of Wilde’s The Picture of Dorian Gray: “There is no such thing as a moral or an immoral book. Books are well written or badly written.” Here the emphasis is initially on the separation of the value of art from social or moral aims and values. The sentiment is clearly reminiscent of Gautier’s claim: “Only those things that are altogether useless can be truly beautiful; anything that is useful is ugly; for it is the expression of some need…”. Yet for Wilde, and many others, the claim was taken more specifically to legitimise the production and value of amoral, or at least morally controversial, works.

In a slightly different direction (although recognisably close to the above), one might cite James Whistler: “Art should be independent of all claptrap—should stand alone […] and appeal to the artistic sense of eye or ear, without confounding this with emotions entirely foreign to it, as devotion, pity, love, patriotism and the like.”

While the second half of this statement seems merely to echo the sentiments expressed by Wilde in the same year, there is, in the first half, recognition of the contention Whistler was later to voice with regard to his painting; one that expressed a focus, foremost, on the arrangement of line, form and colour in the work. Here we see an element of l’art pour l’art that anticipated the twentieth-century emphasis on formal features, holding that artworks contain all the requisite value inherently; they do not need to borrow significance from biographical, historical, psychological or sociological sources. This line of thought was pursued, and can be identified, in Eduard Hanslick’s The Beautiful in Music (1891), Clive Bell’s Art (1913), and Roger Fry’s Vision and Design (1920), whose ruminations are taken to have given justification to various art movements, from abstract, non-representational art through Dada, Surrealism, and Cubism.

While marked here as two separable strands, a common contention can be seen to run through the above intuitions; one which sets out from, and preserves, something of the aesthetic concept of disinterestedness, which Kant expressed as purposiveness without purpose. L’art pour l’art can be seen to encapsulate a movement that swept through Paris and England in the form of the new Aesthetic (merging along the way with the Romantic movement and bohemianism), but also the central doctrine that formed not only the movement itself but a well-established tradition in the history of aesthetics. L’art pour l’art captures not just a movement but an aesthetic theory; one that was adopted and defended by both critics and artists as they shaped art history itself.

b. Twentieth Century Reception

Towards the end of the twentieth century Leonard Meyer (in Dutton, 1983) characterised the intuition that we should judge works of art on the basis of their intrinsic formal qualities alone as a “common contention” according to which the work of art is said to have its complete meaning “within itself”. On this view, cultural and stylistic history, and the genesis of the artwork itself do not enhance true understanding. Meyer even suggests that the separation of the aesthetic from religion, politics, science and so forth, was anticipated (although not clearly distinguished) in Greek thought. It has long been recognised that aesthetic behaviour is different from ordinary behaviour; however, Meyer goes on to argue that this distinction has been taken too far. Citing the Artistic Formalism associated with Clive Bell (see Part 2), he concludes that in actual practice we do not judge works of art in terms of their intrinsic formal qualities alone.

Artistic Formalism and its close relatives have, indeed, met with serious (and potentially disabling) opposition of the kind found in Meyer. Gregory Currie (1989) and David Davies (2004) both illustrate a similar disparity between our actual critical and appreciative practices and what is (in the end) suggested to be merely a pre-theoretical intuition. Making such a point in his An Ontology of Art, Currie draws together a number of familiar and related aesthetic stances under the term “Aesthetic Empiricism”, according to which

[T]he boundaries of the aesthetic are set by the boundaries of vision, hearing or verbal understanding, depending on which art form is in question. (Currie, 1989, p.18)

Currie asserts that empiricism finds its natural expression in aesthetics in the view that a work—a painting, for instance—is a “sensory surface”. Such a view was, according to Currie, supposed by David Prall when he said that “Cotton will suffice aesthetically for snow, provided that at our distance from it it appears snowy”. It is the assumption we recover from Monroe Beardsley (1958) in the view that the limits of musical appreciation are the limits of what can be heard in a work. Currie also recognises a comparable commitment concerning literature in Wimsatt and Beardsley’s The Intentional Fallacy (1946).  We can add to Currie’s list Clive Bell’s claim that

To appreciate a work of art we need bring with us nothing from life, no knowledge of its ideas and affairs, no familiarity with its emotions… we need bring with us nothing but a sense of form and colour and a knowledge of three-dimensional space.

Alfred Lessing, in his “What is Wrong with Forgery?” (in Dutton, 1983), argues that on the assumption that an artwork is a “sensory surface” it does seem a natural extension to claim that what is aesthetically valuable in a painting is a function solely of how it looks. This “surface” terminology, again, relates back to Prall who characterised the aesthetic in terms of an exclusive interest in the “surface” of things, or the thing as seen, heard, felt, immediately experienced. It echoes Fry’s claim that aesthetic interest is constituted only by an awareness of “order and variety in the sensuous plane”. However, like Kendall Walton (1970) and Arthur Danto (1981) before him, Currie’s conclusion is that this common and influential view is nonetheless false.

Walton’s anti-formalism is presented in his essay “Categories of Art” in which he first argues that the aesthetic properties one perceives an artwork as having will depend on which category one perceives the work as belonging to (for example, objects protruding from a canvas seen under the category of “painting”—rather than under the category of “collage”—may appear contrary to expectation and thus surprising, disturbing, or incongruous). Secondly, Walton argues that the aesthetic properties an artwork actually has are those it is perceived as having when seen under the category to which it actually belongs. Determination of “correct” categories requires appeal to such things as artistic intentions, and as knowledge concerning these requires more than a sense of form, color, and knowledge of three-dimensional space, it follows that Artistic Formalism must be false (see Part 3 for a more in-depth discussion of Walton’s anti-formalist arguments).

Similarly, Danto’s examples—which include artworks such as Marcel Duchamp’s “Readymades”, Andy Warhol’s Brillo Boxes, and Danto’s hypothetical set of indiscernible red squares that constitute distinct artworks with distinct aesthetic properties (indeed, two of which are not artworks at all but “mere things”)—are generally taken to provide insurmountable difficulties for traditional Artistic Formalism. Danto argues that, for most artworks, it is possible to imagine two objects that are formally or perceptually indistinguishable but differ in artistic value, or of which one is perhaps not an artwork at all.

Despite the prominence of these anti-formalist arguments, there has been some notable resistance from the Formalist camp. In 1983 Denis Dutton published a collection of articles on forgery and the philosophy of art under the title The Forger’s Art. Here, in an article written for the collection, Jack Meiland argues that the value of originality in art is not an aesthetic value. Criticising the position (above) held by Leonard Meyer, who defends the value of originality in artworks, Meiland asks whether the original Rembrandt has greater aesthetic value than a copy. He refers to “the appearance theory of aesthetic value”, according to which aesthetic value is independent of the non-visual properties of the work of art, such as its historical properties. On this view, Meiland argues, the copy, being visually indistinguishable from the original, is equal in aesthetic value. Indeed, he points to an arguable equivocation in the sense of the word “original” or “originality”. The originality of the work will be preserved in the copy; it is rather the level of creativity that may be surrendered. We might indeed take the latter to devalue the copied work, but Meiland argues that while originality is a feature of a work, creativity is a feature applicable to the artist (or, in this case, a feature lacking in the copyist) and therefore cannot affect the aesthetic quality of the work. Thus we cannot infer from the lack of creativity on the part of the copyist that the work itself lacks originality.

This distinction between “artistic” and “aesthetic” value marks the transition from Artistic to Aesthetic Formalism. Danto, for example, actually endorsed a version of the latter in maintaining that, while indistinguishable objects may differ in terms of their artistic value or art-status, in being perceptually indiscernible two objects would be aesthetically indiscernible also. Hence, in its strongest formulation, Aesthetic Formalism distinguishes aesthetic from non-aesthetic value whilst maintaining that the former is restricted to those values that can be detected merely by attending to what can be seen, heard, or immediately experienced. Values not discerned in this way may be important, but should not be thought of as (purely) “aesthetic” values.

Nick Zangwill (2001) has developed a more moderate Aesthetic Formalism, drawing on the Kantian distinction between free (formal) and dependent (non-formal) beauty. In relation to the value of art, Zangwill accepts that extreme formalism (according to which all the aesthetic properties of a work of art are formal) is false. But so too, he argues, are strongly anti-Formalist positions such as those attributable to Walton, Danto, and Currie (according to which none of the aesthetic properties of a work of art are purely formal). Whilst conceding that the restrictions imposed by Formalism on those features of an artwork available for consideration are insufficient to deliver some aesthetic judgements that are taken to be central to the discourse, Zangwill maintains that there is nonetheless an “important truth” in formalism: many artworks have a mix of formal and non-formal aesthetic properties, and at least some artworks have only formal aesthetic properties. Moreover, this insight from the Aesthetic Formalist is not restricted to the art world. Many non-art objects also have important formal aesthetic properties. Zangwill even goes so far as to endorse extreme Aesthetic Formalism about inorganic natural items (such as rocks and sunsets).

2. Clive Bell’s Artistic Formalism

In Part 1 we noted the translation of the L’art pour l’art stance onto pictorial art with reference to Whistler’s appeal to “the artistic sense of eye or ear”. Many of the accounts referred to above focus on pictorial artworks and the specific response that these can elicit. Here in particular it might be thought that Bell’s Artistic Formalism offers a position that theoretically consolidates the attitudes described.

Formalism of this kind has received largely unsympathetic treatment for its estimation that perceptual experience of line and colour is uniquely and properly the domain of the aesthetic. Yet there is some intuitive plausibility to elements of the view Bell describes which have been preserved in subsequent attempts to re-invigorate an interest in the application of formalism to aesthetics (see Part 3). In this section we consider Bell’s initial formulation, identifying (along the way) those themes that re-emerge in contemporary discussion.

a. Clive Bell and ‘Significant Form’

The claim under consideration is that in pictorial art (if we may narrow the scope for the purposes of this discussion) a work’s value is a function of its beauty and beauty is to be found in the formal qualities and arrangement of paint on canvas. Nothing more is required to judge the value of a work. Here is Bell:

What quality is shared by all objects that provoke our aesthetic emotions? What quality is common to Sta. Sophia and the windows at Chartres, Mexican sculpture, a Persian bowl, Chinese carpets, Giotto’s frescoes at Padua, and the masterpieces of Poussin, Piero della Francesca, and Cezanne? Only one answer seems possible – significant form. In each, lines and colours combined in a particular way, certain forms and relations of forms, stir our aesthetic emotions. These relations and combinations of lines and colours, these aesthetically moving forms, I call “Significant Form”; and “Significant Form” is the one quality common to all works of visual art. (1913, p.5)

These lines have been taken to summarise Bell’s account, yet alone they explain very little. One requires a clear articulation of what “aesthetic emotions” are, and what it is to have them stirred. Also it seems crucial to note that for Bell we have no other means of recognising a work of art than our feeling for it. The subjectivity of such a claim is, for Bell, to be maintained in any system of aesthetics. Furthermore it is the exercise of bringing the viewer to feel the aesthetic emotion (combined with an attempt to account for the degree of aesthetic emotion experienced) that constitutes the function of criticism. “…[I]t is useless for a critic to tell me that something is a work of art; he must make me feel it for myself. This he can do only by making me see; he must get at my emotions through my eyes.” Without such an emotional attachment the subject will be in no position to legitimately attribute to the object the status of artwork.

Unlike the proponents of the previous century, Bell is not so much (initially) claiming an ought but an is. Significant form must be the measure of artistic value, as it is the only thing that all those works we have valued through the ages have in common. If a work is unable to engage our feelings it fails; it is not art. If it engages our feelings, but feelings that are sociologically contingent (for example, certain moral sensibilities that might be diminished or lost over time), it is not engaging aesthetic sensibilities and, inasmuch, is not art. Thus if a work is unable to stir the viewer in this precise and uncontaminated way (in virtue of its formal qualities alone), it will be impossible to ascribe to the object the status of artwork.

We are, then, to understand that certain forms—lines and colours in particular combinations—are de facto producers of some kind of aesthetic emotion. They are in this sense “significant” in a manner that other forms are not. Although certain forms may interest us, amuse us, or capture our attention, unless it excites aesthetic rapture the object under scrutiny will not be a work of art. Bell tells us that art can transport us

[F]rom the world of man’s activity to a world of aesthetic exaltation. For a moment we are shut off from human interests; our anticipations and memories are arrested; we are lifted above the stream of life. The pure mathematician rapt in his studies knows a state of mind which I take to be similar if not identical.

Thus the significance in question is a significance unrelated to the significance of life. “In this [the aesthetic] world the emotions of life find no place. It is a world with emotions of its own.” Bell writes that before feeling an aesthetic emotion one perceives the rightness and necessity of the combination of form at issue; he even considers whether it is this, rather than the form itself, that provokes the emotion in question. Bell’s position appears to echo G. E. Moore’s intuitionism in the sense that one merely contemplates the object and recognises the significant form that constitutes its goodness.

But the spectator is not required to know anything more than that significant form is exhibited. Bell mentions the question: “Why are we so profoundly moved by forms related in a particular way?” yet dismisses the matter as extremely interesting but irrelevant to aesthetics. Bell’s view is that for “pure aesthetics” we need only consider our emotion and its object—we do not need to “pry behind the object into the state of mind of him who made it.” For pure aesthetics, then, it need only be agreed that certain forms do move us in certain ways, it being the business of an artist to arrange forms such that they so move us.

b. The Pursuit of Lasting Values

Central to Bell’s account was a contention that the response elicited in the apprehension of significant form is incomparable with the emotional responses of the rest of experience. The world of human interests and emotions does, of course, temper a great deal of our interactions with valuable objects; these interactions can be enjoyable and beneficial, but they constitute impure appreciation. The viewer with such interests will miss the full significance available; he or she will not get the best that art can give. Bell is scathing of the mistaken significance that can be attributed to representational content; this too signifies impure appreciation. He suggests that those artists “too feeble to create forms that provoke more than a little aesthetic emotion will try to eke that little out by suggesting the emotions of life”. Such interests betray a propensity in artists and viewers merely to bring to art, and take away from it, nothing more than the ideas and associations of their own age or experience. Such prima facie significance is the significance of a defective sensibility. As it depends only on what one can bring to the object, nothing new is added to one’s life in its apprehension. For Bell, then, significant form is able to carry the viewer out of life and into ecstasy. The true artist is capable of feeling such emotion, which can be expressed only in form; it is this that the subject apprehends in the true artwork.

Much visual art is concerned with the physical world; whatever the emotion the artist expresses may be, it seemingly comes through the contemplation of the familiar. Bell is careful to state, therefore, that this concern for the physical world can be (or should be) nothing over and above a concern for the means to the inspired emotional state. Any other concerns, such as practical utility, are to be ignored by art. With this claim Bell meant to set apart the use of artworks for documentary, educational, or historical purposes. Such attentions lead to a loss of the emotional response that allows one to get to the thing in itself; they are interests that come between things and our emotional reaction to them. Here Bell is dismissive of the practice of intellectually carving up our environment into practically identified individuations. Such a practice is superficial in requiring our contemplation of an object only to the extent to which it is to be utilised. It marks a habit of recognising the label and overlooking the thing, and is indicative of a visual shallowness that prohibits the majority of us from seeing “emotionally” and from grasping the significance of form.

Bell holds that the discerning viewer is concerned only with line and colour, their relations and qualities, the apprehension of which (in significant form) can allow the viewer an emotion more powerful, profound, and genuinely significant than can be afforded by any description of facts or ideas. Thus, for Bell:

Great art remains stable and unobscure because the feelings that it awakens are independent of time and place, because its kingdom is not of this world. To those who have and hold a sense of the significance of form what does it matter whether the forms that move them were created in Paris the day before yesterday or in Babylon fifty centuries ago. The forms of art are inexhaustible; but all lead by the same road of aesthetic emotion to the same world of aesthetic ecstasy. (1913, p.16)

What Bell seems to be pushing for is a significance that will not be contingent on the peculiarities of one age or inclination, and it is certainly interesting to see what a pursuit of this characteristic can yield. However, it is unclear why one may only reach this kind of significance by looking to emotions that are (in some sense) out of this world. Some have criticised Bell for his insistence that aesthetic emotion could be a response wholly separate from the rest of a person’s emotional character. Thomas McLaughlin (1977) claims that there could not be a pure aesthetic emotion in Bell’s sense, arguing that the aesthetic responses of a spectator are influenced by her normal emotional patterns. On this view the spectator’s emotions, including moral reactions, are brought directly into play under the control of the artist’s technique. It is difficult to deny that the significance, provocativeness and interest of many works of art do indeed require spectators to bring with them their worldly experiences and sensibilities. John Carey (2005) is equally scathing about Bell’s appeal to the peculiar emotion provided by works of art. He is particularly critical of Bell’s contention that the same emotion could be transmitted between discrete historical periods (or between artist and latter-day spectator). On the one hand, Bell could not possibly know that he is experiencing the same emotion as the Chaldean four thousand years earlier; more importantly, to experience the same emotion one would have to share the same unconscious, to have undergone the same education, and to have been shaped by the same emotional experiences.

It is important to note that such objections are not entirely decisive. Provocativeness, and indeed any interests of this kind, are presumably ephemeral qualities of a work; these are exactly the kinds of transitory evaluations that Bell was keen to sidestep in characterising true works and the properties of lasting value. The same can be said for all those qualities that are only found in a work in virtue of the spectator’s peculiar education and emotional experience. Bell does acknowledge such significances but does not give them the importance that he gives to formal significance. It is when we strip away the interests, educations, and provocations of a particular age that we get to those works that exhibit lasting worth. Having said that, there is no discernible argument in support of the claim that the lasting worth Bell attempts to isolate should be taken to be more valuable, more (or genuinely) significant, than the kinds of ephemeral values he dismisses. Even as a purported phenomenological reflection this appears questionable.

In discussing the criticism Bell’s account has received, it is important not to run together two distinct questions. On the one hand there is the question of whether there exists some emotion peculiar to the aesthetic; one that is “otherworldly” in the sense that it is not to be confused with those responses that temper the rest of our lives. The affirmation of this is certainly implicated in Bell’s account and is rightly met with some consternation. But what is liable to become obscured is that the suggestion of such an inert aesthetic emotion was part of Bell’s solution to the more interesting question with which his earlier writing was concerned: whether one might isolate a particular reaction to certain (aesthetic) objects that is sufficiently independent of time, place and enculturation that one might expect it to be exhibited in subjects irrespective of their historical and social circumstance.

One response to this question is indeed to posit an emotional response unlike all those responses that are taken to be changeable and contingent on time, culture and so forth. Looking at the changeable interests of the art-world over time, one might well see that an interest in representation or subject matter betrays the spectator’s allegiance to “the gross herd” (as Bell puts it) of some era. But this response seems unsatisfactory. As we have seen, McLaughlin and Carey are sceptical of the kind of inert emotion Bell stipulates. Bell’s response to such criticisms is to claim that those unable to accept the postulation are simply ignorant of the emotion he describes. While this is philosophically unsatisfactory, the issue is potentially moot. Still, it might be thought that there are other ways in which one might characterise lasting value so as to capture the kind of quality Bell pursued whilst dismissing the more ephemeral significances that affect a particular time.

Regarding the second question, it is tempting to see something more worthwhile in Bell’s enterprise. There is at least some prima facie attraction to Bell’s response: assuming that one is trying to distinguish art from non-art, if one hopes to capture something stable and unobscure in drawing together all those things taken to be art, one might indeed look to the formal properties of works, and one will (presumably) only include those works from any time that do move us in the relevant respect. What is lacking in Bell’s account is some defense of the claims, first, that those things that move Bell are the domain of true value and, second, that we should be identifying something stable and unobscure. Why should we expect to identify objects of antiquity as valuable artworks on the basis of their stirring our modern dispositions (excepting the claim—Bell’s claim—that such dispositions are not modern at all but timeless)? Granted, there are some grounds for pursuing the kind of account Bell offers, particularly if one is interested in capturing those values that stand the test of time. However, Bell appears to motivate such a pursuit by making a qualitative claim that such values are in some way more significant, more valuable, than those he rejects. And it is difficult to isolate any argument for such a claim.

c. Aesthetic versus Non-Aesthetic Appreciation

The central line of Bell’s account that appears difficult to accept is this: while one might be able to isolate a specifically perceptual response to artworks, one could only equate this response with all that is valuable in art if one were able to establish the centrality of this response to the exclusion of others. This presentation will not address (as some critics do) the question of whether such a purely aesthetic response can be identified, though this must be addressed if anything close to Bell’s account is to be pursued. For the time being, all one need acknowledge is that the mere existence of this response is not enough to legitimise the work Bell expected it to do. A further argument is required to justify a thesis that puts formal features (or our responses to these) at centre stage.

Yet aside from this aim there are some valuable mechanisms at work in Bell’s theory. As a corollary of his general stance, Bell mentions that to understand art we do not need to know anything about art-history. It may be that from works of art we can draw inferences as to the sort of people who made them; but an intimate understanding of an artist will not tell us whether his pictures are any good. This point again relates to Bell’s contention that pure aesthetics is concerned only with the question of whether or not objects have a specific emotional significance to us. Other questions, he believes, are not questions for aesthetics:

To appreciate a man’s art I need know nothing whatever about the artist; I can say whether this picture is better than that without the help of history, but if I am trying to account for the deterioration of his art, I shall be helped by knowing that he has been seriously ill… To mark the deterioration was to make a pure, aesthetic judgement: to account for it was to become an historian. (1913, pp.44-5, emphasis added)

The above passage illustrates an element of Bell’s account that some subsequent thinkers have been keen to preserve. Bell holds that attributing value to a work purely on the basis of the position it holds within an art-historical tradition (because it is by Picasso, or marks the advent of Cubism) is not a pursuit of aesthetics. Although certain features and relations may be interesting historically, aesthetically they can be of no consequence. Indeed, valuing an object because it is old, interesting, rare, or precious can cloud one’s aesthetic sensibility and puts one at a disadvantage compared to the viewer who knows and cares nothing of the object under consideration. Representation, too, has nothing to do with art’s value according to Bell. Thus while representative forms play a part in many works of art, we should treat them, so far as our aesthetic interest goes, as if they do not represent anything.

It is fairly well acknowledged that Bell had a non-philosophical agenda for these kinds of claims. It is easy to see in Bell a defense of the value of abstract art over other art forms and this was indeed his intention. The extent to which Renaissance art can be considered great, for example, has nothing to do with representational accuracy but must be considered only in light of the formal qualities exhibited. In this manner many of the values formerly identified in artworks, and indeed movements, would have to be dismissed as deviations from the sole interest of the aesthetic: the pursuit of significant form.

There is a sense in which we should not underplay the role of the critic or philosopher, who should be capable of challenging our accepted practices and of refining or cultivating our tastes. To this end Bell’s claims are not out of place. However, while there is some tendency to reflect upon the purely formal qualities of a work of art rather than artistic technique or various associations, and while there is a sense in which many artists attempt to depict something beyond the evident (utility-driven) perceptual shallowness that can dictate our perceptual dealings, it remains obscure why this should be our only interest. Unfortunately, the exclusionary nature of Bell’s account seems only to be concerned with the aesthetic narrowly conceived, excluding any possibility of the development, or importance, of other values and interests, both as things stand and in future artistic development. Given Bell’s qualitative claim concerning the superior value of significant form, this appears more and more troubling with the increasing volume of works (and indeed values) that would have to be ignored under his formulation.

As a case in point (perhaps a contentious one but there are any number of related examples), consider Duchamp’s Fountain (1917). In line with much of the criticism referred to in Part 1, the problem is that because Bell identifies aesthetic value (as he construes it) with “art-hood” itself, Artistic Formalism has nothing to say about a urinal that purports to be anti-aesthetic and yet art. Increasingly, artworks are recognised as such and valued for reasons other than the presence (or precisely because of their lack) of aesthetic properties, or exhibited beauty. The practice continues, the works are criticised and valued, and formalists of this kind can do very little but stamp their feet. The death of Artistic Formalism is apparently heralded by the departure of practice from theory.

d. Conclusions: From Artistic to (Moderate) Aesthetic Formalism

So what are we to take from Bell’s account? His claim that our interactions with certain artworks yield an emotion peculiar to the aesthetic, one not experienced in our everyday emotional lives, is rightly met with consternation. It is unclear why we should recognise such a reaction to be of a different kind (let alone a more valuable kind) to those experienced in other contexts, such as to discount many of our reactions to ostensible aesthetic objects as genuine aesthetic responses. Few are prompted by Bell’s account to accept this determination of the aesthetic, nor does it seem to capture satisfactorily all that we should want to in this area. However, Bell’s aim in producing this theory was (ostensibly) to capture something common to aesthetic objects. In appealing to a timeless emotion that will not be subject to the contingencies of any specific era, Bell seemingly hoped to account for the enduring values of works throughout time. It is easy enough to recognise this need and the place Bell’s theory is supposed to hold in satisfying what does appear to be a sensible requirement. It is less clear that this path, if adequately pursued, should be found to be fruitless.

That we should define the realm of the aesthetic in virtue of those works that stand the test of time has been intuitive to some; how else are we to draw together all those objects worthy of theoretical inclusion whilst characterising and discounting failed works, impostors, and anomalies? Yet there is something disconcerting about this procedure. That we should ascribe the label “art” or even “aesthetic” to a conjunction of objects that have, over time, continued to impress on us some valuable property, seems to invite a potentially worrying commitment to relativity. The preceding discussion has given some voice to a familiar enough contention that by indexing value to our current sensibility we stand to dismiss things that might have been legitimately valued in the past. Bell’s willingness to acknowledge, even rally for, the importance of abstract art leads him to a theory that identifies the value of works throughout history only on the basis of their displaying qualities (significant form) that he took to be important. The cost (although for Bell this is no cost) of such a theory is that things like representational dexterity (a staple of the Renaissance) must be struck from the list of aesthetically valuable properties, just as the pursuit of such a quality by artists must be characterised as misguided.

The concern shared by those who criticise Bell seems to stem from an outlook according to which any proposed theory should be able to capture and accommodate the moving trends, interests and evaluations that constitute art history and drive the very development of artistic creation. This is what one expects an art theory to be able to do. This is where Artistic Formalism fails, as art-practice and art theory diverge. Formalism, as a theory of art, is ill-suited to make ontological distinctions between genuine- and non-art. A theory whose currency is perceptually available value will be ill-equipped to officiate over a practice that is governed by, amongst other things, institutional considerations; in fact a practice that is able to develop precisely by identifying recognised values and then subverting them. For these reasons it seems obvious that Formalism is not a bad theory of art but is no theory of art at all.

This understood, one can begin to see those elements of Bell’s Formalism that may be worth salvaging and those that must be rejected. For instance, Bell ascribes a particular domain to aesthetic judgements, reactions, and evaluations such as to distinguish a number of other pronouncements that can also be made in reference to the object in question (some, perhaps, deserve to be labelled “aesthetic” but some—arguably—do not). Bell can say of Picasso’s Guernica (1937) that the way it represents and expresses various things about the Spanish Civil War might well be politically and historically interesting (and valuable)—and might lead to the ascription of various properties to the work (being moving, or harsh). Likewise, the fact that it is by Picasso (or is a genuine Picasso rather than a forgery) will be of interest to some and might also lead to the ascription of certain properties. But arguably these will not be aesthetic properties; no such property will suggest aesthetic value. Conversely, the fact that a particular object is a fake is often thought to devalue the work; for many it may even take away the status of work-hood. But for Bell if the object were genuinely indistinguishable from the original, then it will be capable of displaying the same formal relations and will thus exhibit equal aesthetic value. It is this identification of aesthetic value with formal properties of the work that appears—for some—to continue to hold some plausibility.

However, there have been few (if any) sympathisers towards Bell’s insistence that only if something displayed value in virtue of its formal features would it count as art, or as valuable in an aesthetic sense. A more moderate position would be to ascribe a particular domain to formal aesthetic judgements, reactions and evaluations, while distinguishing these from both non-formal aesthetic judgements, and non-aesthetic (for example, artistic, political, historical) judgements. On this kind of approach, Bell’s mistake was two-fold: he ran into difficulties when he (1) attempted to tie Formalism to the nature of art itself, and (2) restricted the aesthetic exclusively to a formal conception of beauty.

By construing formalism as an aesthetic theory (as an account of what constitutes aesthetic value) or as part of an aesthetic theory (as an account of one kind of aesthetic value), whilst at the same time admitting that there are other values to be had (both aesthetic and non-aesthetic), the Formalist needn’t go so far as to ordain the priority or importance of this specific value in the various practices in which it features. In this way, one can anticipate the stance of the Moderate Formalist who asserts (in terms reminiscent of Kant’s account) there to be two kinds of beauty: formal beauty, and non-formal beauty. Formal beauty is an aesthetic property that is entirely determined by “narrow” non-aesthetic properties (these include sensory and non-relational physical properties such as the lines and colours on the surface of a painting).  Non-formal beauty is determined by “broad” non-aesthetic properties (which covers anything else, including appeals to the content-related aspects that would be required to ascertain the aptness or suitability of certain features for the intended end of the painting, or the accuracy of a representational portrait, or the category to which an artwork belongs).

While these notions require much clarification (see Part 3), a useful way to express the aspirations of this account would be to note that the Moderate Formalist claims that their metaphysical stance generates the only theory capable of accommodating the aesthetic properties of all works of art. Unlike Bell’s “extreme Formalism”, which maintains all aesthetic properties to be narrowly determined by sensory and intrinsic physical properties; and unlike “anti-Formalism”, according to which all aesthetic properties are at least partly determined by broad non-aesthetic properties such as the artist’s intentions, or the artwork’s history of production; the Moderate Formalist insists that, in the context of the philosophy of art, many artworks have a mix of formal and non-formal aesthetic properties; that others have only non-formal aesthetic properties; and that at least some artworks have only formal aesthetic properties.

3. Nick Zangwill’s Moderate Aesthetic Formalism

The issue of formalism is introduced on the assumption that aesthetic properties are determined by certain non-aesthetic properties; versions of formalism differ primarily in their answers to the question of which non-aesthetic properties are of interest. This part of the presentation briefly outlines the central characterisations of “form” (and their differences) that will be pertinent to an understanding of twenty-first century discussions of Formalism. For present purposes, and in light of the previous discussion, it will be satisfactory to focus on formal characterisations of artworks and, more specifically, visual art.

a. Extreme Formalism, Moderate Formalism, Anti-Formalism

Nick Zangwill recognises that arrangements of lines, shapes, and colours (he includes “shininess” and “glossiness” as colour properties) are typically taken as formal properties, contrasting these with non-formal properties which are determined, in part, by the history of production or context of creation for the artwork. In capturing this divide, he writes:

The most straightforward account would be to say that formal properties are those aesthetic properties that are determined solely by sensory or physical properties—so long as the physical properties in question are not relations to other things or other times. This would capture the intuitive idea that formal properties are those aesthetic properties that are directly perceivable or that are determined by properties that are directly perceivable. (2001, p.56)

Noting that this will not accommodate the claims of some philosophers that aesthetic properties are “dispositions to provoke responses in human beings”, Zangwill stipulates the word “narrow” to include sensory properties, non-relational physical properties, and dispositions to provoke responses that might be thought part-constitutive of aesthetic properties; the word “broad” covers anything else (such as the extrinsic property of the history of production of a work). We can then appeal to a basic distinction: “Formal properties are entirely determined by narrow nonaesthetic properties, whereas nonformal aesthetic properties are partly determined by broad nonaesthetic properties.” (2001, p.56)

On this basis, Zangwill identifies Extreme Formalism as the view that all aesthetic properties of an artwork are formal (and narrowly determined), and Anti-Formalism as the view that no aesthetic properties of an artwork are formal (all are broadly determined by history of production as well as narrow non-aesthetic properties). His own view is a Moderate Formalism, holding that some aesthetic properties of an artwork are formal, others are not. He motivates this view via a number of strategies but in light of earlier parts of this discussion it will be appropriate to focus on Zangwill’s responses to those arguments put forward by the anti-formalist.

b. Responding to Kendall Walton’s Anti-Formalism

Part 1 briefly considered Kendall Walton’s influential position according to which in order to make any aesthetic judgement regarding a work of art one must see it under an art-historical category. This claim was made in response to various attempts to “purge from criticism of works of art supposedly extraneous excursions into matters not (or not “directly”) available to inspection of the works, and to focus attention on the works themselves” (See, for example, the discussion of Clive Bell in Part 2). In motivating this view Walton offers what he supposes to be various “intuition pumps” that should lead to the acceptance of his proposal.

In defense of a moderate formalist view Nick Zangwill has asserted that Walton’s thesis is at best only partly accurate. For Zangwill, there is a large and significant class of works of art and aesthetic properties of works of art that are purely formal; in Walton’s terms the aesthetic properties of these objects emerge from the “configuration of colours and shapes on a painting” alone. This would suggest a narrower determination of those features of a work “available to inspection” than Walton defends in his claim that the history of production (a non-formal feature) of a work partly determines its aesthetic properties by determining the category to which the work belongs and under which it must be perceived. Zangwill wants to resist Walton’s claim that all or most works and values are category-dependent, aiming to vindicate the disputed negative thesis that “the application of aesthetic concepts to a work of art can leave out of consideration facts about its origin”. Zangwill is keen to point out that a number of the intuition pumps Walton utilises are less decisive than has commonly been accepted.

Regarding representational properties, for example, Walton asks us to consider a marble bust of a Roman emperor which seems to us to resemble a man with, say, an aquiline nose, a wrinkled brow, and an expression of grim determination, and which we take to represent a man with, or as having, those characteristics. The question is why we don’t say that it resembles or represents a motionless man, of uniform (marble) colour, who is severed at the chest. We are interested in representation, and it seems the object is in more respects similar to the latter description than the former. Walton is able to account for the fact that we are not struck by the similarity in the latter sense as we are by the former by appeal to his distinction between standard, contra-standard and variable properties:

The bust’s uniform color, motionlessness, and abrupt ending at the chest are standard properties relative to the category of busts, and since we see it as a bust they are standard for us. […] A cubist work might look like a person with a cubical head to someone not familiar with the cubist style. But the standardness of such cubical shapes for people who see it as a cubist work prevents them from making that comparison. (1970, p.345)

His central claim is that what we take a work to represent (or even resemble) depends only on the variable properties, and not those that are standard, for the category under which we perceive it. It seems fairly obvious that this account must be right. Zangwill agrees and is hence led to accept that in the case of representational qualities there is nothing in the objects themselves that could tell the viewer which of the opposing descriptions is appropriate. For this, one must look elsewhere to such things as the history of production or the conventionally accepted practices according to which the object’s intentional content may be derived.

Zangwill argues that while representational properties might not be aesthetic properties (indeed they are possessed by ostensibly non-aesthetic, non-art items such as maps, blueprints, and road signs) they do appear to be among the base (non-aesthetic) properties that determine aesthetic properties. Given that representational properties of a work are, in part, determined by the history of production, and assuming that some aesthetic properties of representational works are partly determined by what they represent, Zangwill concludes some aesthetic properties to be non-formal. This is no problem for the Moderate Formalist of course; Walton’s intuition pump does not lead to an anti-formalist argument for it seems equally clear that only a subclass of artworks are representational works. Many works have no representational properties at all and are thus unaffected by the insistence that representational properties can only be successfully identified via the presence of art-historical or categorical information. Given that Zangwill accepts Walton’s claim in respect only to a subclass of aesthetic objects, Moderate Formalism remains undisturbed.

However, Walton offers other arguments that might be thought to have a more general application and thus forestall this method of “tactical retreat” on the part of the would-be Moderate Formalist. The claim that Walton seems to hold for all artworks (rather than just a subclass) is that the art-historical category into which an artwork falls is aesthetically relevant because one’s belief that a work falls under a particular category affects one’s perception of it—one experiences the work differently when one experiences it under a category. Crucially, understanding a work’s category is a matter of understanding the degrees to which its features are standard, contra-standard and variable with respect to that category. Here is Walton’s most well-known example:

Imagine a society which does not have an established medium of painting, but does produce a kind of work called guernicas. Guernicas are like versions of Picasso’s “Guernica” done in various bas-relief dimensions. All of them are surfaces with the colours and shapes of Picasso’s “Guernica,” but the surfaces are moulded to protrude from the wall like relief maps of different kinds of terrain. […] Picasso’s “Guernica” would be counted as a guernica in this society – a perfectly flat one – rather than as a painting. Its flatness is variable and the figures on its surface are standard relative to the category of guernicas. […] This would make for a profound difference between our reaction to “Guernica” and theirs. (1970, p.347)

When we consider (as a slight amendment to Walton’s example) a guernica in this society that is physically indistinguishable from Picasso’s painting, we should become aware of the different aesthetic responses experienced by members of their society compared to ours. Walton notes that it seems violent, dynamic, vital, disturbing to us, but imagines it would strike them as cold, stark, lifeless, restful, or perhaps bland, dull, boring—but in any case not violent, dynamic, and vital. His point is that the object is only violent and disturbing as a painting, but dull, stark, and so forth as a guernica, hence the thought experiment is supposed to prompt us to agree that aesthetic properties are dependent on (or relative to) the art-historical categories under which the observer subsumes the object in question. Through this example Walton argues that we do not simply judge that an artwork is dynamic and a painting. The only sense in which it is appropriate to claim that Guernica is dynamic is in claiming that it is dynamic as a painting, or for people who see it as a painting.

This analysis has been variously accepted in the literature; it is particularly interesting, therefore, to recognise Zangwill’s initial suspicion of Walton’s account. He notes that a plausible block to this intuition comes in the observation that it becomes very difficult to make aesthetic judgements about whole categories or comparisons of items across categories. Zangwill anticipates that Walton might respond with the claim that we simply widen the categories utilised in our judgements. For example, when we say that Minoan art is (in general) more dynamic than Mycenaean art, what we are saying is that this is how it is when we consider both sorts of works as belonging to the class of “prehistoric Greek art”. He continues:

But why should we believe this story? It does not describe a psychological process that we are aware of when we make cross-category judgements. The insistence that we are subconsciously operating with some more embracing category, even though we are not aware of it, seems to be an artefact of the anti-formalist theory that there is no independent reason to believe. If aesthetic judgements are category-dependent, we would expect speakers and thinkers to be aware of it. But phenomenological reflection does not support the category-dependent view. (2001, pp. 92-3)

In these cases, according to Zangwill, support does not appear to be sourced either from phenomenology or from our inferential behaviour. Instead he argues that we can offer an alternative account of what is going on when we say something is “elegant for a C” or “an elegant C”. This involves the claim that questions of goodness and elegance are matters of degree. We often make ascriptions that refer to a comparison class because this is a quicker and easier way of communicating questions of degree. But the formalist will say that the precise degree of some C-thing’s elegance does not involve the elegance of other existing C-things. And being a matter of degree is quite different from being category-dependent. So Zangwill’s claim is that it is pragmatically convenient, but far from essential, that one make reference to a category-class in offering an aesthetic judgement. We are able to make category-neutral aesthetic judgements, and crucially for Zangwill, such judgements are fundamental: category-dependent judgements are only possible because of category-neutral ones. The formalist will hold that without the ability to make category-neutral judgements we would have no basis for comparisons; Walton has not shown that this is not the case.

In this way Zangwill asserts that we can understand that it is appropriate to say that the flat guernica is “lifeless” because it is less lively than most guernicas—but this selection of objects is a particularly lively one. Picasso’s Guernica is appropriately thought of as “vital” because it is more so than most paintings; considered as a class these are not particularly lively. But in fact the painting and the guernica might be equally lively, indeed equivalent in respect of their other aesthetic properties—they only appear to differ in respect of the comparative judgements in which they have been embedded. It is for this reason that Zangwill concludes that we can refuse to have our intuitions “pumped” in the direction Walton intends. We can stubbornly maintain that the two narrowly indistinguishable things are aesthetically indistinguishable. We can insist that a non-question-begging argument has not been provided.

On this view, one can allow that reference to art-historical categories is a convenient way of classifying art, artists, and art movements, but the fact that this convenience has been widely utilised need not be telling against alternative accounts of aesthetic value.

c. Kant’s Formalism

Zangwill’s own distinction between formal and non-formal properties is derived (broadly) from Immanuel Kant’s distinction between free and dependent beauty. Indeed, Zangwill has asserted that “Kant was also a moderate formalist, who opposed extreme formalism when he distinguished free and dependent beauty in §16 of the Critique of Judgement” (2005, p.186). In the section in question Kant writes:

There are two kinds of beauty; free beauty (pulchritudo vaga), or beauty which is merely dependent (pulchritudo adhaerens). The first presupposes no concept of what the object should be; the second does presuppose such a concept and, with it, an answering perfection of the object.

On the side of free beauty Kant lists primarily natural objects such as flowers, some birds, and crustacea, but adds wallpaper patterns and musical fantasias; examples of dependent beauties include the beauty of a building such as a church, palace, or summer-house. Zangwill maintains that dependent beauty holds the key to understanding the non-formal aesthetic properties of art—without this notion it will be impossible to understand the aesthetic importance of pictorial representation, or indeed any of the art-forms he analyses. A work that is intended to be a representation of a certain sort—if that intention is successfully realised—will fulfil the representational function the artist intended, and may (it is claimed) do so beautifully. In other words, some works have non-formal aesthetic properties because of (or in virtue of) the way they embody some historically given non-aesthetic function.

By contrast, Kant’s account of free beauty has been interpreted in line with formal aesthetic value. At §16 and §17, Kant appears to place constraints on the kinds of objects that can exemplify pure (that is, formal) beauty, suggesting that nature, rather than art, provides the proper objects of (pure) aesthetic judgement and that to the extent that artworks can be (pure) objects of tastes they must be abstract, non-representational, works. If this is a consequence of Kant’s account, the strong Formalist position derived from judgements of pure beauty would presumably have to be restricted in application to judgements of abstract art and, perhaps in quotidian cases, the objects of nature. However, several commentators (for example, Crawford (1974) and Guyer (1997)) have maintained that Kant’s distinction between free and dependent beauty does not entail the classification of art (even representational art) as merely dependently beautiful. Crawford, for example, takes the distinction between free and dependent beauty to turn on the power of the judger to abstract towards a disinterested position; this is because he takes Kant’s distinction to be between kinds of judgement and not between kinds of object.

This is not the place for a detailed exegesis of Kant’s aesthetics, but it is pertinent to at least note the suggestion that it is nature (rather than art) that provides the paradigm objects of formal aesthetic judgement. In the next part of this presentation we will explore this possibility, further considering Zangwill’s moderate, and more extreme Formalist conclusions in the domain of nature appreciation.

4. From Art to the Aesthetic Appreciation of Nature

Allen Carlson is well known for his contribution to the area broadly known as “environmental aesthetics”, perhaps most notably for his discussion of the aesthetic appreciation of nature (2000). When discussing the value of art, Carlson seems to adopt a recognisably moderate formalist position, acknowledging that where formalists like Bell went wrong was in presupposing formalism to be the only valid way to appreciate visual artworks (pace Part 2), while also suggesting that a “proper perspective” on the application of formalism should have revealed it to be one among many “orientations” deserving recognition in art appreciation (pace Part 3). However, when turning to the appreciation of the natural environment Carlson adopts and defends a strongly anti-formalist position, occupying a stance that has been referred to as “cognitive naturalism”. This part of the presentation briefly discusses Carlson’s rejection of formalism before presenting some moderate, and stronger, formalist replies in this domain.

a. Anti-Formalism and Nature

Carlson has characterised contemporary debates in the aesthetics of nature as attempting to distance nature appreciation from theories of the appreciation of art. Contemporary discussion introduces different models for the appreciation of nature in place of the inadequate attempts to apply artistic norms to an environmental domain. For example, in his influential “Appreciation and the Natural Environment” (1979) he had disputed both “object” and “landscape” models of nature appreciation (which might be thought attractive to the Moderate Formalist), favouring the “natural environmental” model (which stands in opposition to the other two). Carlson acknowledged that the “object” model has some utility in the art-world regarding the appreciation of non-representational sculpture (he takes Brancusi’s Bird in Space (1919) as an example). Such sculpture can have significant (formal) aesthetic properties yet no representational connections to the rest of reality or relational connections with its immediate surroundings. Indeed, he acknowledges that the formalist intuitions discussed earlier have remained prevalent in the domain of nature appreciation, meeting significant and sustained opposition only in the domain of art criticism.

When it comes to nature-appreciation, formalism has remained relatively uncontested and popular, emerging as an assumption in many theoretical discussions. However, Carlson’s conclusion on the “object” and “landscape” models is that the former rips natural objects from their larger environments while the latter frames and flattens them into scenery. In focussing mainly on formal properties, both models neglect much of our normal experience and understanding of nature. The “object” model is inappropriate as it cannot recognise the organic unity between natural objects and their environment of creation or display; such environments are, Carlson believes, aesthetically relevant. This model thus imposes limitations on our appreciation of natural objects as a result of the removal of the object from its surroundings (which this model requires in order to address the questions of what and how to appreciate). For Carlson, the natural environment cannot be broken down into discrete parts, divorced from their former environmental relations, any more than it can be reduced to a static, two-dimensional scene (as in the “landscape” model). Instead he holds that the natural environment must be appreciated for what it is: both nature and an environment. On this view natural objects possess an organic unity with their environment of creation: they are a part of and have developed out of the elements of their environments by means of the forces at work within those environments. Thus some understanding of the environments of creation is relevant to the aesthetic appreciation of natural objects.

The assumption implicit in the above rejection of Formalism is familiar from the objections (specifically regarding Walton) from Part 3. It is the suggestion that the appropriate way to appreciate some target object is via recourse to the kind of thing it is; taking the target for something it is not does not constitute appropriate aesthetic appreciation of that thing. Nature is natural so cannot be treated as “readymade” art. Carlson holds that the target for the appreciation of nature is also an environment, entailing that the appropriate mode of appreciation is active, involved appreciation. It is the appreciation of a judge who is in the environment, being part of and reacting to it, rather than merely being an external onlooker upon a two-dimensional scene. It is this view that leads to his strong anti-formalist suggestion that the natural environment as such does not possess formal qualities. For example, responding to the “landscape” model Carlson suggests that the natural environment itself only appears to have formal qualities when a person somehow imposes a frame upon it and thus formally composes the resultant view. In such a case it is the framed view that has the qualities, but these will vary depending upon the frame and the viewer’s position. As a consequence Carlson takes the formal features of nature, such as they are, to be (nearly) infinitely realisable; insofar as the natural environment has formal qualities, they have an indeterminateness, making them both difficult to appreciate, and of little significance in the appreciation of nature.

Put simply, the natural environment is not an object, nor is it a static two-dimensional “picture”, thus it cannot be appreciated in ways satisfactory for objects or pictures; furthermore, the rival models discussed do not reveal significant or sufficiently determinate appreciative features. In rejecting these views Carlson has been concerned with the questions of what and how we should appreciate; his answer involves the necessary acknowledgement that we are appreciating x qua x, where some further conditions will be specifiable in relation to the nature of the x in question. It is in relation to this point that Carlson’s anti-formalist “cognitive naturalism” presents itself.

In this respect his stance on nature appreciation differs from that of Walton, who did not extend his philosophical claims to aesthetic judgements about nature (Walton lists clouds, mountains, sunsets), believing that these judgements, unlike judgements of art, are best understood in terms of a category-relative interpretation. By contrast, Carlson can be understood as attempting to extend Walton’s category-dependent account of art-appreciation to the appreciation of nature. On this view we do not need to treat nature as we treat those artworks about whose origins we know nothing, because it is not the case that we know nothing of nature:

In general we do not produce, but rather discover, natural objects and aspects of nature. Why should we therefore not discover the correct categories for their perception? We discover whales and later discover that, in spite of somewhat misleading perceptual properties, they are in fact mammals and not fish. (Carlson, 2000, p.64)

By discovering the correct categories to which objects or environments belong, we can know which is the correct judgement to make (the whale is not a lumbering and inelegant fish). It is in virtue of this that Carlson claims our aesthetic judgements about nature sustain responsible criticism in the way Walton characterises the appreciation of art. It is for this reason that Carlson concludes that for the aesthetic appreciation of nature, something like the knowledge and experience of the naturalist or ecologist is essential. This knowledge gives us the appropriate foci of aesthetic significance and the appropriate boundaries of the setting so that our experience becomes one of aesthetic appreciation. He concludes that the absence of such knowledge, or any failure to perceive nature under the correct categories, leads to aesthetic omission and, indeed, deception.

b. Formalism and Nature

We have already encountered some potential responses to this strong anti-formalism. The moderate formalist may attempt to deploy a version of the aesthetic/non-aesthetic distinction so as to deny that the naturalist and ecologist are any better equipped than the rest of us to aesthetically appreciate nature. They are, of course, better equipped to understand nature, and to evaluate (in what we might call a “non-aesthetic” sense) the objects and environments therein. This type of response claims that the ecologist can judge (say) the perfectly self-contained and undisturbed ecosystem, and can indeed respond favourably to her knowledge of the rarity of such a find. Such things are valuable in that they are of natural-historical interest, and they are doubtless of significance to natural historians. The naturalist will know that the whale is not “lumbering” compared to most fish (and will not draw this comparison), and will see it as “whale-like”, “graceful”, perhaps particularly “sprightly” compared to most whales. One need not deny that such comparative, cognitive judgements can feel a particular way, or that such judgements are a significant part of the appreciation of nature; but it may be possible to deny that these (or only these) judgements deserve to be called aesthetic.

However, Carlson’s objection is not to the existence of formal value, but to the appropriateness of consideration of such value. Our knowledge of an environment is supposed to allow us to select certain foci of aesthetic significance and abstract from, or exclude, others such as to characterise different kinds of appropriate experience:

…we must survey a prairie environment, looking at the subtle contours of the land, feeling the wind blowing across the open space, and smelling the mix of prairie grasses and flowers. But such an act of aspection has little place in a dense forest environment. Here we must examine and scrutinise, inspecting the detail of the forest floor, listening carefully for the sounds of birds and smelling carefully for the scent of spruce and pine. (Carlson, 2000, p.64)

Clearly knowledge of the terrain and environment that is targeted in each of these cases might lead the subject to be particularly attentive to signs of certain expected elements; however, there are two concerns that are worth highlighting in closing.

Firstly, it is unclear why one’s knowledge of the expected richness or desolation of some particular landscape should put one in a position to assume of (say) the prairie environment that no detailed local scrutiny could yield the kind of interest or appreciation (both formal and non-formal) that might be found in other environments. It is unclear whether Carlson could allow that such acts might yield appreciation but must maintain that they would not yield instances of aesthetic appreciation of that environment, or whether he is denying the availability of such unpredicted values; in either case the point seems questionable. Perhaps the suspicion is one that comes from proportioning one’s expectation to one’s analysis of the proposed target. The first concern is thus that knowledge (even accurate knowledge) can be as potentially blinding as it is potentially enlightening.

The second concern is related to the first, but poses more of a direct problem for Carlson. His objection to the “object” and “landscape” models concerns their propensity to limit the potentiality for aesthetic judgement by taking the target to be something other than it truly is. Part of the problem described above relates to worries regarding the reduction of environments to general categories such as “prairie landscape”, “dense forest”, or “pastoral environment”, such that one enlists expectations of those attentions that will and will not be rewarded, and limits one’s interaction accordingly. While it might be true that some understanding of the kind of environment we are approaching will suggest certain values to expect, as well as indicating the act of aspection appropriate for delivering just these, the worry is that this account may be unduly limiting because levels of appreciation are unlikely to exceed the estimations of the theory and the acts of engagement and interaction these provoke. In nature more than anywhere else this seems to fail to do justice to the intuition that the target really is (amongst other things) a rich, unconstrained sensory manifold. To briefly illustrate the point with a final example, Zangwill (2001, pp.116-8) considers cases (which he doesn’t think Carlson can account for) such as the unexpected or incongruous beauty of the polar bear swimming underwater. Not only is this “the last thing we expected”, but our surprise shows that

…it is not a beauty that we took to be dependent in some way upon our grasp of its polar-bearness. We didn’t find it elegant as a polar bear. It is a category-free beauty. The underwater polar bear is a beautiful thing in beautiful motion…

The suggestion here is that to “do justice to” and thus fully appreciate the target one must be receptive not simply to the fact that it is nature, or that it is an environment, but that it is, first and foremost, the individual environment that it (and not our understanding of it) reveals itself to be. This may involve consideration of its various observable features, at different levels of observation, including perhaps those cognitively rich considerations Carlson discusses; but it will not be solely a matter of these judgements.  According to the (Moderate) Formalist, the “true reality” of things is more than Carlson’s account seems capable of capturing, for while a natural environment is not in fact a static two-dimensional scene, it may well in fact possess (amongst other things) a particular appearance for us, and that appearance may be aesthetically valuable. The Moderate Formalist can accommodate that value without thereby omitting acknowledgement of other kinds of values, including those Carlson defends.

Finally, it should be noted that when it comes to inorganic nature, Zangwill has argued for a stronger formalist position (much closer to Bell’s view about visual art). The basic argument for this conclusion is that even if a case can be made for claiming that much of organic nature should be understood and appreciated via reference to some kind of “history of production” (typically in terms of biological functions, usually thought to depend on evolutionary history), inorganic or non-biological nature (rivers, rocks, sunsets, the rings of Saturn) does not have functions and therefore cannot have aesthetic properties that depend on functions. Nor should we aesthetically appreciate inorganic things in the light of functions they do not have.

5. Conclusions

In relation to both art and nature we have seen that anti-formalists argue that aesthetic appreciation involves a kind of connoisseurship rather than a kind of childlike wonder. Bell’s extreme (artistic) formalism appeared to recommend a rather restricted conception of the art-connoisseur. Walton’s and Carlson’s anti-formalism (in relation to art and nature respectively) both called for the expertise and knowledge base required to identify and apply the “correct” category under which an item of appreciation must be subsumed. Yet the plausibility of challenges to these stances (both the strong formalism of Bell and the strong anti-formalism of Walton and Carlson) appears to be grounded in more moderate, tolerant proposals. Zangwill, for example, defends his moderate formalism as “a plea for open-mindedness” under the auspices of attempts to recover some of our aesthetic innocence.

This presentation began with an historical overview intended to help situate (though not necessarily motivate or defend) the intuition that there is some important sense in which aesthetic qualities pertain to the appearance of things. Anti-formalists point out that beauty, ugliness, and other aesthetic qualities often (or always) pertain to appearances as informed by our beliefs and understanding about the reality of things. Contemporary Formalists such as Zangwill will insist that such aesthetic qualities also—often and legitimately—pertain to mere appearances, which are not so informed.

On this more moderate approach, the aesthetic responses of the connoisseur, the art-historian, the ecologist can be acknowledged while nonetheless insisting that the sophisticated aesthetic sensibility has humble roots and we should not forget them. Formal aesthetic appreciation may be more “raw, naïve, and uncultivated” (Zangwill, 2005, p.186), but arguably it has its place.

6. References and Further Reading

  • Bell, Clive (1913) Art Boston, Massachusetts.
    • An important presentation and defense of Artistic Formalism in the Philosophy of Art.
  • Budd, Malcolm (1996) ‘The Aesthetic Appreciation of Nature’ British Journal of Aesthetics 36, pp.207-222
    • For some important challenges to the anti-formalist views put forward by Carlson (2000); in particular, Budd supports the intuition that part of the value of nature relates to its boundless, unconstrained, and variable potential for appreciation.
  • Carey, John (2005) What Good are the Arts? London: Faber and Faber
    • While pertinent, Carey’s discussion should be treated with some caution as, unlike McLaughlin (1977), he writes with a tone that seeks to trivialise Bell’s position and, at times, apparently misses the acuity with which Bell presented his formulation.
  • Carlson, Allen, (1979) ‘Appreciation and the Natural Environment’ The Journal of Aesthetics and Art Criticism 37, pp. 267-275
    • An early indication and defense of Carlson’s “Natural environmental” model of appreciation (compare Carlson (2000)).
  • Carlson, Allen (2000) Aesthetics and the Environment, Art and Architecture London and New York: Routledge
    • An influential book in which Carlson draws together much of his work over the previous two decades.
  • Crawford, Donald (1974) Kant’s Aesthetic Theory Madison: Wisconsin University Press
    • A scholarly and influential engagement with Kant’s Critique of Judgement (see also Guyer (1997)).
  • Currie, Gregory (1989) An Ontology of Art Basingstoke: Macmillan
    • See Chapter 3 especially for a discussion of “Aesthetic Empiricism” in connection with the anti-formalist arguments discussed in Part 1.
  • Danto, Arthur (1981) The Transfiguration of the Commonplace Cambridge, MA.: Harvard University Press
    • For Danto, Warhol’s Brillo Box exhibition (1964) marked a watershed in the history of aesthetics, rendering almost worthless everything written by philosophers on art. Warhol’s sculptures were indistinguishable from ordinary Brillo boxes, putatively showing that an artwork needn’t possess some special perceptually discernible quality in order to be afforded art-status.
  • Davies, David (2004) Art as Performance Oxford: Blackwell
    • For a discussion of “Aesthetic Empiricism” and some development/departure from Currie’s ontology of art (1989).
  • Dowling, Christopher (2010) ‘Zangwill, Moderate Formalism, and another look at Kant’s Aesthetic’, Kantian Review 15, pp.90-117
    • Explores the formalist interpretation of Kant’s aesthetics in connection with Zangwill’s moderate formalism.
  • Dutton, Dennis (1983) [ed.] The Forger’s Art Berkeley and Los Angeles, CA.: University of California Press
    • An important collection of articles, including Leonard Meyer’s ‘Forgery and The Anthropology of Art’ (discussed in Part 1); See also, articles in this collection by Alfred Lessing and Jack Meiland.
  • Fry, Roger ([1920] 1956) Vision and Design Cleveland and New York: World Publishing
    • Fry’s collection of essays gives some insight into the changing ideas on European Post Impressionism around the turn of the century. The development of Fry’s aesthetic theory (in relation to significant form) is also discussed by Bell (1913).
  • Guyer, Paul (1997) Kant and the Claims of Taste [2nd Edition] Cambridge: Cambridge University Press
    • A clear and comprehensive critical exploration of Kant’s Critique of Judgement.
  • Hanslick, Eduard (1957) The Beautiful in Music [trans. Gustav Cohen] Indianapolis and New York: Bobbs-Merrill Co
    • In this influential text by “the father of modern musical criticism”, Hanslick arguably lays out much of the groundwork for musical formalism.
  • Kant, Immanuel (1952) Critique of Judgement, [trans. Meredith] Oxford: Clarendon Press
  • McLaughlin, Thomas (1977) ‘Clive Bell’s Aesthetic: Tradition and Significant Form’, Journal of Aesthetics and Art Criticism 35, pp.435-7
    • A critical discussion of Bell’s Formalism.
  • Parsons, Glenn (2004) ‘Natural Functions and the Aesthetic Appreciation of Inorganic Nature’, British Journal of Aesthetics 44, pp.44-56
    • Parsons presents some challenges to Zangwill’s extreme formalism about inorganic nature (for a reply to Parsons see Zangwill, 2005).
  • Walton, Kendall (1970) ‘Categories of Art’, The Philosophical Review 79, pp.334-367
    • A very influential contribution to analytic aesthetics in support of the anti-formalist stance regarding the appreciation of an artwork’s aesthetic properties.
  • Wilcox, John (1953) ‘The Beginnings of L’art pour L’art’, Journal of Aesthetics and Art Criticism 11, pp.360-377
    • An instructive article that has informed much of Part 1 of this presentation, and in which you will find references to many of the nineteenth century texts cited here.
  • Wimsatt, W. and Beardsley, M. (1946) ‘The Intentional Fallacy’, Sewanee Review 54, pp.468-88
    • The authors argue against the view that an artwork’s meaning is determined by reference to the artist’s intentions.
  • Zangwill, Nick (2001) The Metaphysics of Beauty, Ithaca, N.Y.: Cornell University Press
    • A thorough exploration and defense of formalist intuitions; the book includes re-printed versions of many of Zangwill’s important contributions (for example, his ‘Feasible Aesthetic Formalism’, Noûs 33 (1999)) from earlier years.
  • Zangwill, Nick (2005) ‘In Defence of Extreme Formalism about Inorganic Nature: Reply to Parsons’, British Journal of Aesthetics, 45, pp.185-191


Author Information

Christopher Dowling
Email: c.dowling@open.ac.uk
The Open University
United Kingdom

Willard Van Orman Quine: The Analytic/Synthetic Distinction

Willard Van Orman Quine was one of the most well-known American “analytic” philosophers of the twentieth century. He made significant contributions to many areas of philosophy, including philosophy of language, logic, epistemology, philosophy of science, and philosophy of mind/psychology (behaviorism). However, he is best known for his rejection of the analytic/synthetic distinction. Technically, this is the distinction between statements true in virtue of the meanings of their terms (like “a bachelor is an unmarried man”) and statements whose truth is a function not simply of the meanings of terms, but of the way the world is (such as “That bachelor is wearing a grey suit”). Although a contentious thesis, analyticity has been a popular explanation, especially among empiricists, both for the necessity of necessary truths and for the a priori knowability of some truths. Thus, in some contexts “analytic truth,” “necessary truth,” and “a priori truth” have been used interchangeably, and the analytic/synthetic distinction has been treated as equivalent to the distinctions between necessary and contingent truths, and between a priori and a posteriori (or empirical) truths. Empirical truths can be known only by empirical verification, rather than by “unpacking” the meanings of the terms involved, and are usually thought to be contingent.

Quine wrestled with the analytic/synthetic distinction for years, but he did not make his thoughts public until 1950, when he delivered his paper “Two Dogmas of Empiricism” at a meeting of the American Philosophical Association. In this paper, Quine argues that all attempts to define and understand analyticity are circular. Therefore, the notion of analyticity should be rejected (along with, of course, the spurious distinction in which it features). This rejection has inspired debates and discussions for decades. For one thing, if Quine is right, and there are no truly necessary truths (that is, analytic truths), metaphysics, which traffics in such truths, is effectively dead.

Quine is generally classified as an analytic philosopher (where this sense of “analytic” has little to do with the analytic/synthetic distinction) because of the attention he pays to language and logic. He also employed a “naturalistic” method, which generally speaking, is an empirical, scientific method. A metaphysical approach was not an option for Quine, primarily because of his thoughts on analyticity.

Both because of the influence of “Two Dogmas of Empiricism” in analytic circles, and because its perspective on analyticity is foundational to every other aspect of Quine’s thought—to his philosophies of language and logic, to his naturalistic epistemology, and to his anti-metaphysical stance—a survey of Quine’s thought on analyticity is perhaps the best introduction to his thought as a whole.

Table of Contents

  1. Life and Influences
  2. The Analytic/Synthetic Distinction: A Focus on Analyticity
  3. Metaphysics: Some Implications
  4. References and Further Reading

1. Life and Influences

Willard Van Orman Quine was born on June 25, 1908, in Akron, Ohio. As a teenager, he was an avid stamp collector and a budding cartographer. One of his first publications was a free-hand map of the Portage Lakes of Ohio, which he sold for pennies to lakefront stores. When he was sixteen, Quine wrote the first edition of O.K. Stamp News, which was distributed to stamp collectors and dealers. Quine went on to write and distribute six more editions of the philatelic newspaper before moving on to new interests. One of these interests included active participation in the lighthearted “Greeter Club,” where members were required to call each other by their middle names (and engage in other sorts of playful word games). At this point, Quine became known as “Van” to his friends, a moniker that stuck with him for the rest of his life.

Quine received his undergraduate degree from Oberlin College, in Oberlin, Ohio. He majored in mathematics with honors in mathematical philosophy and mathematical logic. During his college years, along with cultivating his interest in mathematics, mathematical logic, linguistics and philosophy, Quine began his secondary career as an intrepid traveler. He hitchhiked, at the least, to Virginia, Kentucky, Canada and Michigan. In 1928, he made his way to Denver and back with a few friends, hopping freight trains, hitchhiking, and riding on running boards. Lodging often included jail houses (where one could sleep for free in relative safety), park benches and the ground. At the end of his junior year, he traveled to Europe. Quine writes in his autobiography, The Time of My Life: “An interest in foreign languages, like an interest in stamps, accorded with my taste for geography. Grammar, moreover, appeals to the same sense that is gratified by mathematics, or by the structure of boundaries and road networks” (Quine, 1987: 38).

During his junior year at Oberlin, Quine became immersed in Whitehead and Russell’s Principia Mathematica. Whitehead was a member of Harvard’s Philosophy Department, so Quine decided to apply to their doctoral program in philosophy. Graduating from Oberlin with an A- average, he was accepted, and received his Ph.D. in two years, at age twenty-three. Broadly speaking, his thesis sought to extensionalize the properties that populated the Principia. For our purposes, we may understand an extensional definition as a set of particular things. For instance, the extensional definition of a cat would consist of the set of all cats, and the extensional definition of the property orange would consist of the set of all orange things (which could include things that are only partly orange). In Quine’s dissertation, we see his first concerted effort to do away with definitions that are not extensional, that is, with intensional definitions. Intensional definitions are, broadly speaking, generalizations, where particular things (for example, particular cats) are not employed in the definition. For instance, the intensional definition of a cat might be something like “four-legged feline mammal,” and likewise, the intensional definition of the property “orange” might be “a color that is the combination of yellow and red.” To some degree, Quine’s distaste for intensional definitions is rooted in Berkeley and Hume (see also the article on “The Classical Theory of Concepts,” section 2). Generally speaking, Quine thought that intensional definitions, and likewise “meanings,” were vague, mentalistic entities, out of touch with the concrete particulars that comprise extensional definitions.
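
The extensional/intensional contrast can be loosely modelled in code. In this sketch (the particular animals, objects, and predicates are invented for illustration, not drawn from Quine), an extensional definition is an explicit set of particulars, while an intensional definition is a general rule that happens to pick out the same set:

```python
# Hedged illustration only: all names and examples here are invented.

# An extensional definition: the set of particular orange things.
orange_things = {"Hercules the cat", "a traffic cone", "a pumpkin"}

# Background facts about the particulars in our toy world.
colour_of = {
    "Hercules the cat": "orange",
    "a traffic cone": "orange",
    "a pumpkin": "orange",
    "a blueberry": "blue",
}

# An intensional definition: a general rule mentioning no particular thing.
def is_orange(thing):
    return colour_of[thing] == "orange"

# The rule generates the very same set; Quine's complaint is that the
# rule (the "meaning") is a murkier entity than the set it generates.
generated = {t for t in colour_of if is_orange(t)}
assert generated == orange_things
```

The set is a concrete collection of particulars; the predicate is a generalization. Quine’s worry, on this analogy, is about the status of the latter kind of entity.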

Quine’s use/mention distinction also first saw the light of day in his dissertation. This distinction underlines the difference between objects and names of objects. For instance, an actual cat, say Hercules the cat, is to be distinguished from the name ‘Hercules’. Quine uses single quotation marks to denote a name. Thus, Hercules the cat is orange and white, but the name ‘Hercules’ is not. Rather, the name ‘Hercules’ has other properties, for example, it begins with the letter ‘H’.  An actual object (for example, Hercules the cat) is mentioned, and we use a name (that is, ‘Hercules’) to do so.
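
The use/mention distinction has a rough analogue in programming, in the difference between an object and the string that names it. The following sketch (the class and attribute names are invented for the illustration) makes the point that the cat and its name have different properties:

```python
# Hedged illustration only: an analogy, not Quine's own formulation.

class Cat:
    def __init__(self, colouring):
        self.colouring = colouring

hercules = Cat(colouring="orange and white")  # the cat itself
name = "Hercules"                             # the *name* of the cat

# Properties of the object belong to the cat, not to its name:
assert hercules.colouring == "orange and white"

# Properties of the name belong to the word, not to the cat:
assert name.startswith("H")
assert len(name) == 8
```

The quotation marks play the same role as Quine’s single quotes: they turn talk about the cat into talk about the word.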

After completing his dissertation in 1932, Quine was awarded a Sheldon Traveling Fellowship by Harvard. Taking the advice of Herbert Feigl and John Cooley, Quine, along with his first wife Naomi Clayton, set off for Europe to study with Rudolf Carnap. This would prove to be a momentous trip; Carnap had a singular and lasting influence on Quine. For although he initially agreed with much of what Carnap had to say, a number of Quine’s most distinctive ideas emerged as a result of rejecting some of Carnap’s more fundamental positions. Of particular importance is Quine’s rejection of the distinction between analytic and synthetic statements. Although the analytic/synthetic distinction had been a staple of the empiricist tradition since at least Hume, Quine was especially concerned with Carnap’s formulation of it.

2. The Analytic/Synthetic Distinction: A Focus on Analyticity

As Quine understood it (via Carnap), analytic truths are true as a result of their meaning. For instance, the statement “All bachelors are unmarried men” is true because being a “bachelor” means being an “unmarried man.” “Synthetic” claims on the other hand, are not true merely because of the meaning of the words used in the statement. Rather, the truth of these statements turns on facts. For instance, the claim “David is a bachelor” is only true if, in fact, David is a bachelor.

But before rejecting the analytic/synthetic distinction, Quine delivered three papers on Carnap to Harvard’s Society of Fellows in 1934, where he seemed to defend the distinction. They were titled: “The a priori,” “Syntax,” and “Philosophy as Syntax.” Eager to get Carnap’s work on the English-speaking philosophical stage, Quine delivered these lectures not only to give a clear and careful exposition of Carnap’s new book, The Logical Syntax of Language, but to help convince Harvard that it needed Carnap. (Carnap served as a visiting Professor at Harvard in 1940-41, but otherwise remained at the University of Chicago from 1936-1952.) Not only was Quine reading Carnap’s work at this time, but Carnap was reading Quine’s recent book, A System of Logistic, the published rewrite of his dissertation (Creath 1990: 149-160, #15-17). Quine’s respect for Carnap at this time is indisputable; a rapport had grown between the two such that they could easily exchange ideas and, for the most part, understand each other. Yet some sixty years after he gave the 1934 Harvard lectures, Quine confesses that they were rather servile reconstructions of Carnap’s thoughts, or as Quine puts it, they were “abjectly sequacious” (Quine, 1991: 266). For as early as 1933—a year before the Harvard lectures were delivered—Quine was having serious doubts about the analytic/synthetic distinction, doubts that he privately expressed to Carnap. For instance, on March 31, 1933, Carnap observes that “He [Quine] says after some reading of my “Syntax” MS [that is, the manuscript of Carnap’s The Logical Syntax of Language]: Is there a difference between logical axioms and empirical sentences? He thinks not. Perhaps I seek a distinction just for its utility, but it seems he is right: gradual difference: they are sentences we want to hold fast” (Quine, 1991: 267).
Here, in the course of reading a draft of Carnap’s book in 1933, Quine questions the distinction between “logical axioms” and “empirical sentences,” where, generally speaking, logical axioms are paradigm cases of “analytic” propositions.

Almost twenty years later, Quine defended a variant of this position in his now famous paper, “Two Dogmas of Empiricism.” Here, he claims that there is no sharp distinction between claims that are true in virtue of their meaning (analytic claims) and empirical claims (claims that may be verified by facts). In December 1950, Quine presented “Two Dogmas of Empiricism” to the philosophers gathered at the annual meeting of the American Philosophical Association (APA). This marked his public rejection of the analytic/synthetic distinction. But as we saw above, Quine had been brooding over the matter since at least 1933. His qualms about this distinction surfaced not only in his discussions and correspondence with Carnap but also in conversation with other prominent philosophers and logicians, for example, Alfred Tarski, Nelson Goodman, and Morton White (Quine, 1987: 226). Stimulated by these conversations, White wrote his often overlooked paper, “The Analytic and the Synthetic: An Untenable Dualism,” which was published before Quine presented “Two Dogmas of Empiricism” at the 1950 APA meeting (Quine footnotes this paper at the end of the published version of “Two Dogmas of Empiricism”).

Quine begins “Two Dogmas of Empiricism” by defining an analytic proposition as one that is “true by virtue of meanings” (Quine, 1980: 21). The problem with this characterization, he explains, is that the nature of meanings is obscure; Quine reminds the reader of what he takes to be one of the biggest mistakes in semantics: meaning is not to be confused with naming. For instance, the phrase “the Morning Star” has a different meaning than the phrase “the Evening Star,” but both name the same object (the planet Venus), and thus, both have the same reference. Similarly, Quine explains that one mustn’t confuse the meaning, that is, the “intension,” of a general term with its extension, that is, the class of particular things to which the term applies. For instance, he points out, the two general terms “creature with a heart” and “creature with a kidney” both have the same extension because every creature with a heart has a kidney, and vice versa. But these two terms clearly don’t have the same meaning. Thus, for Quine there is a clear distinction between intensions and extensions, which reflects an equally clear distinction between meanings and references.

Quine then briefly explains the notion of what a word might mean, as opposed to what essential qualities an object denoted by that word might be said to have. For instance, the object man might be said to be essentially rational, while being, say, “two-legged” is an accidental property—there are plenty of humans who only have one leg, or who have no legs at all. As a result, the word ‘man’ must mean, at least “a rational being,” but it does not necessarily mean “two-legged.” Thus, Quine concludes, there seems to be some kind of parallel between the essential properties of an object and the meaning of the word that denotes that object. Or as Quine puts it: “meaning is what essence becomes when it is divorced from the object of reference and wedded to the word” (Quine, 1980: 22).

But does this imply that meanings are some kind of “object”? This can’t be the case, Quine concludes, because he has just shown that meanings must not be confused with objects, that is, meanings must not be confused with references (for example, we must not confuse the meaning of the phrase “morning star” with what the phrase names, that is, the object Venus). Instead, it seems we should focus on grasping what is happening when two words are “synonymous,” that is, what seems to be happening when two words are “analytically” related. For instance, the two words ‘bachelor’ and ‘unmarried’ seem to be synonymous, and thus, the proposition “All bachelors are unmarried men” seems to be an analytic statement. And thus Quine writes: “The problem of analyticity confronts us anew” (Quine, 1980: 22).

To tackle the notion of analyticity, Quine makes a distinction between two kinds of analytic claims: those comprised of logical truths and those comprised of synonymous terms. Logical truths, Quine explains, are any statements that remain true no matter how we interpret the non-logical particles in the statement. Logical particles are logical operators, for example, not, if-then, or, all, no, some, and so forth. For instance, the statement “No not-X is X” will remain true no matter how we interpret X, for example, “No not-cat is a cat,” or “No not-bicycle is a bicycle.” An example of the second kind of analytic statement would be, Quine explains, “No bachelor is married,” where the meaning of the word ‘bachelor’ is synonymous with the meaning of the words ‘unmarried man.’ However, we can make this kind of analytic claim into a logical truth (as defined above) by replacing ‘bachelor’ with its synonym, that is, ‘unmarried man,’ to get “No unmarried man is married,” which is an instance of “No not-X is X.” However, to make this substitution, we had to have some idea of what is meant by ‘synonymy,’ and this is problematic; so the notion of synonymy is the focus of the remainder of his discussion of analyticity.
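
The first kind of analyticity can be given a small brute-force illustration. The sketch below (the domain and the interpretations of X are my own invented examples) checks that the schema “No not-X is X” stays true under every interpretation of X we try, because nothing in a domain satisfies both a predicate and its negation:

```python
# Hedged sketch, not from Quine: checking a logical-truth schema by
# substituting arbitrary (invented) interpretations for X.

domain = ["cat", "bicycle", "whale", "prairie"]

interpretations = {
    "X = cat":          lambda x: x == "cat",
    "X = long word":    lambda x: len(x) > 5,
    "X = contains 'a'": lambda x: "a" in x,
}

for label, is_X in interpretations.items():
    # "No not-X is X": no member of the domain is both not-X and X.
    counterexamples = [x for x in domain if (not is_X(x)) and is_X(x)]
    assert counterexamples == [], label
```

The truth of the schema never depends on which predicate is substituted, only on the logical particles “no” and “not”; that is what makes it a logical truth in Quine’s sense.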

Quine suggests that one might, as is often done, appeal to definitions to explain the notion of synonymy. For instance, we might say that “unmarried man” is the definition of “bachelor,” and thus that synonymy turns on definitions. However, Quine argues, in order to define ‘bachelor’ as ‘unmarried man,’ the definer must possess some notion of synonymy to begin with. In fact, Quine writes, the only kind of definition that does not presuppose the notion of synonymy is the act of introducing an abbreviation purely by convention. For instance, let’s say that I create a new word, ‘Archon.’ I can arbitrarily say that its abbreviation is “Ba2.” In doing so, I did not have to presuppose that these two notions are “synonymous”; I merely abbreviated ‘Archon’ by convention, by stipulation. However, when I normally attempt to define a notion, for example, “bachelor,” I must think to myself something like: “Well, what does it mean to be a bachelor; in particular, what words have the same meaning as the word ‘bachelor’?” That is, what meanings are synonymous with the meaning of the word ‘bachelor’? And thus, Quine complains: “would that all species of synonymy were as intelligible [as those created purely by convention]. For the rest, definition rests on synonymy rather than explaining it” (Quine, 1980: 26).

Perhaps then, Quine suggests, one could define synonymy in terms of interchangeability. Two words are synonymous if they are “[interchangeable] in all contexts without change of truth value” (Quine, 1980: 27). However, this is problematic as well. Consider, Quine explains, the sentence “‘Bachelor’ has less than ten letters.” If we simply exchange the word ‘bachelor’ with the phrase ‘unmarried man,’ we have created a false statement: “‘Unmarried man’ has less than ten letters.” We also have problems if we try to replace ‘bachelor’ with ‘unmarried man’ in phrases like ‘bachelor of arts’ or ‘bachelor’s buttons.’ But, Quine explains, we can get around the latter problem if we treat ‘bachelor of arts’ as a complete word, so that the ‘bachelor’ part of it is merely a word-fragment, which cannot be interchanged with the complete word ‘bachelor.’

Setting aside this quick fix, what we are really after is “cognitive synonymy,” which is to be distinguished from the word-play discussed above. Quine is not quite sure what cognitive synonymy entails, but, he reminds us, we do know that “the sort of synonymy needed … [is] merely such that any analytic statement [can] be turned into a logical truth by putting synonyms for synonyms” (Quine, 1980: 22). Recall our example regarding “No not-X is X.” There we saw that “No bachelor is married” is in fact an instance of that logical truth, namely “No unmarried man is a married man,” if, in fact, ‘bachelor’ and ‘unmarried man’ are synonymous. The kind of synonymy we are looking for is this kind; this is “cognitive synonymy.” However, in order to explain cognitive synonymy—as we just did—we had to assume that we knew what analyticity meant. In particular, we had to assume the meanings of the two kinds of analyticity explained above, that is, analyticity qua logical truths and analyticity qua synonymy. And thus, Quine writes: “What we need is an account of cognitive synonymy not presupposing analyticity” (Quine, 1980: 29). So, the question is: can we give an account of cognitive synonymy by appealing to interchangeability (recall that this is the task at hand) without presupposing any definition of analyticity?

Initially it seems that one could, if our language contained the word “necessarily.” However, it turns out that the kind of necessity we have to assume may only apply to analytic statements, and thus we once again presuppose a notion of analyticity to explain cognitive synonymy. Generally speaking, this plays out as follows. Assume that (1) “Necessarily all and only bachelors are bachelors” (Quine, 1980: 29). That is, the word ‘necessarily’ implies that this claim is logically true, and thus analytic. If we then assume that ‘bachelor’ and ‘unmarried man’ are interchangeable (in regard to meaning, not letters), then (2) “Necessarily all and only bachelors are unmarried men” is true, where, once again, the word ‘necessarily’ seems to make the claim logically, that is, analytically, true. And thus, once again, we needed to presuppose a notion of analyticity to define cognitive synonymy; “To suppose that [‘necessary’ makes sense] is to suppose that we have already made satisfactory sense of ‘analytic’” (Quine, 1980: 30).

Quine’s final proposal to define analyticity without an appeal to meaning—and thus without appeal to synonymy or definitions—is as follows: we can try to assimilate a natural language to a formal language by appealing to the semantical rules developed by Carnap (see the article on Rudolf Carnap, sections 3-6). However, Quine finds the same kind of circularity here that he has found elsewhere. To show why, Quine reconstructs a general Carnapian paradigm regarding artificial languages and semantical rules that, broadly speaking, proceeds as follows:

[1] Assume there is an artificial language L0. Its semantical rules explicitly specify which statements are analytic in L0.

[2] A problem immediately surfaces: to extensionally define what is analytic in L0, the intensional meaning of ‘analytic’ is presupposed in the rules, simply because “the rules contain the word ‘analytic’ which we don’t understand!” (Quine, 1980: 33). Although we have an extensional definition of ‘analytic,’ we do not have an intensional definition; that is, we do not understand what analyticity means, regardless of whether we have a list of particular expressions that are allegedly analytic. For instance, if I asked you to compile a list of things that are “smargon,” and you did, but you had no idea what the word ‘smargon’ means, you’d be in trouble—how could you even compile your list without knowing what ‘smargon’ means?

[3] Perhaps, though, one could understand the term ‘analytic for L0’ simply as a convention, calling it ‘K’ so that it looks as if the intensional meaning of the word ‘analytic’—that is, a well-defined intensional account of analyticity—is not at work anywhere. But, Quine asks, why the specific class K, and not some other arbitrary class, for example, L-Z? (Quine, 1980: 33) For instance, let’s say that I wanted to arbitrarily give a list of all things that are smargon, but I don’t know what the word ‘smargon’ means. So I create a list of things that just so happen to be green. But why did I pick just green things? Why not orange things, or things that had no particular color at all?

[4] Suppose instead, then, that there is a kind of semantical rule that does not specify which statements are analytic, but simply which are true—not all truths, just a certain set of truths. One may then define “analytic truths” as those that belong to this set: “A statement is analytic if it is (not merely true but) true according to the semantical rule” (Quine, 1980: 34). However, generally speaking, the same problem surfaces for the phrase ‘semantical rule’—how does the rule specify which statements are to be included in the class of truths without in some sense presupposing the intensional meaning of the word ‘analytic’? The circle is pervasive, and so: “we might just stop tugging at our bootstraps altogether” (Quine, 1980: 36).

And thus, in 1950, after nearly twenty years of intermittent discussion of the matter with Carnap and others, Quine is confident that the notion of “analyticity” should, perhaps once and for all, be rejected.

3. Metaphysics: Some Implications

Quine’s rejection of analyticity has many implications, particularly for the field of metaphysics. According to the main camp of metaphysicians, metaphysics, generally speaking, employs a method whereby deductive logical laws are applied to a set of axioms that are necessarily true. The propositions produced by such a method are, as a result, necessarily true as well (think, for instance, of Descartes’ method or Leibniz’s). For the most part, these truths, the axioms they are derived from, and the logical laws that are used to derive them are thought to reflect the necessary and eternal nature of the universe.

However, if there are, as Quine claims, no such things as necessary truths, that is, analytic truths, then this main camp of metaphysics is essentially eviscerated. It is, at best, a field where clever people play still cleverer games, manipulating allegedly “necessary” truths with allegedly “necessary” laws. This attack on metaphysics by Quine has spawned new camps of metaphysics which do not rely in this way on deductive methods.

What method then, did Quine use? The empirical method. In this respect, Quine was a scientific philosopher, that is, what is often called a naturalistic philosopher.  Like Hume, he believed that philosophical conclusions were not necessarily true—they did not reflect or capture the essential nature of humanity, let alone the nature of the universe. Rather, they were testable, and potentially could be rejected.

4. References and Further Reading

  • Carnap, R. The Logical Structure of the World; Pseudoproblems in Philosophy. Second edition. Translated by R.A. George. Berkeley and Los Angeles: University of California Press, 1967 [original publication date 1928].
  • Carnap, R. The Logical Syntax of Language. Translated by A. Smeaton. London: Routledge, 1959.
  • Carnap, R. Meaning and Necessity. Second edition with supplements. Chicago: University of Chicago Press, 1956 [original publication date 1947].
  • Creath, R. Dear Carnap, Dear Van: The Quine-Carnap Correspondence and Related Work. Berkeley: University of California Press, 1990.
  • Davidson, D. and Hintikka, J., eds. Words and Objections: Essays on the Work of W.V. Quine. Synthese Library 21. Revised ed. Dordrecht: D. Reidel Publishing Company, 1969.
  • Dreben, B. “In Mediis Rebus.” Inquiry, 37 (1995): pp. 441-447.
  • Dreben, B. “Putnam, Quine—and the Facts.” Philosophical Topics, 20 (1992): pp. 293-315.
  • Fara, R., director. “In Conversation: W.V. Quine.” A Philosophy International Production. London: Philosophy International, London School of Economics, 1994: tapes 1-8.
  • Fogelin, R. “Aspects of Quine’s Naturalized Epistemology.” The Cambridge Companion to Quine, ed. R.F. Gibson. Cambridge: Cambridge University Press, 2004.
  • Gibson, R.F., ed. The Cambridge Companion to Quine. Cambridge: Cambridge University Press, 2004.
  • Hintikka, J. “Three Dogmas of Quine’s Empiricism.” Revue Internationale de Philosophie, 51 (1997): pp. 457-477.
  • Hookway, C. Quine: Language, Experience and Reality. Cambridge: Polity Press, 1988.
  • Leonardi, P. and Santambrogio, M., eds. On Quine: New Essays. Cambridge: Cambridge University Press, 1995.
  • Pakaluk, M. “Quine’s 1946 Lectures on Hume.” Journal of the History of Philosophy, 27 (1989): pp. 445-459.
  • Quine, W.V. A System of Logistic. Cambridge, MA: Harvard University Press, 1934.
  • Quine, W.V. “Carnap.” Yale Review, 76 (1987): pp. 226-230.
  • Quine, W.V. From Stimulus to Science. Cambridge, MA: Harvard University Press, 1995.
  • Quine, W.V. “In Praise of Observation Sentences.” Journal of Philosophy, XC, #3 (1993): pp. 107-116.
  • Quine, W.V. Methods of Logic. Fourth edition. Cambridge, MA: Harvard University Press, 1982 [original edition 1950].
  • Quine, W.V. “On Empirically Equivalent Systems of the World.” Erkenntnis, 9, 3 (1975): pp. 313-328.
  • Quine, W.V. “On Philosophers’ Concern with Language” (editor’s title: “Words are all we have to go on”). Times Literary Supplement, July 3, 1992: p. 8.
  • Quine, W.V. Ontological Relativity and Other Essays. New York: Columbia University Press, 1969.
  • Quine, W.V. Pursuit of Truth. Revised ed. Cambridge, MA: Harvard University Press, 1992.
  • Quine, W.V. Quiddities: An Intermittently Philosophical Dictionary. Cambridge, MA: Harvard University Press, 1987.
  • Quine, W.V. “Review of Carnap.” Philosophical Review, 44 (1935): pp. 394-397.
  • Quine, W.V. The Roots of Reference. La Salle, Illinois: Open Court, 1974.
  • Quine, W.V. “States of Mind.” Journal of Philosophy, 82 (1985): pp. 5-8.
  • Quine, W.V. The Time of My Life: An Autobiography. Cambridge, MA: Harvard University Press, 1987.
  • Quine, W.V. Theories and Things. Cambridge, MA: Harvard University Press, 1981.
  • Quine, W.V. “Two Dogmas in Retrospect.” Canadian Journal of Philosophy, 21, #3 (1991): pp. 265-274.
  • Quine, W.V. “Two Dogmas of Empiricism.” In From a Logical Point of View: 9 Logico-Philosophical Essays. 2nd ed., revised. Cambridge, MA: Harvard University Press, 1980 [original publication date 1953].
  • Quine, W.V. “Two Dogmas of Empiricism.” The Philosophical Review, 60 (1951): pp. 20-43.
  • Quine, W.V. The Ways of Paradox and Other Essays. Enlarged edition. Cambridge, MA: Harvard University Press, 1976.
  • Quine, W.V. Word and Object. Cambridge, MA: MIT Press, 1960.
  • Richardson, A.W. Carnap’s Construction of the World: The Aufbau and the Emergence of Logical Empiricism. Cambridge: Cambridge University Press, 1998.
  • Rocknak, S. “Understanding Quine in Terms of the Aufbau: Another Look at Naturalized Epistemology.” In Beyond Description: Naturalism and Normativity, eds. Marcin Milkowski and Konrad Talmont-Kaminski. Texts in Philosophy, Vol. 13. College Publications, 2010: pp. 195-210.
  • Romanos, G.P. Quine and Analytic Philosophy. Cambridge: MIT Press, 1983.
  • Russell, B. Our Knowledge of the External World. London: George Allen and Unwin Ltd., 1961.
  • Schilpp, P.A., ed. The Philosophy of Rudolf Carnap: Library of Living Philosophers, XI. La Salle, Illinois: Open Court, 1963.
  • Schilpp, P.A. and Hahn, L.E., eds. The Philosophy of W.V. Quine: Library of Living Philosophers, XVIII. La Salle, Illinois: Open Court, 1986.
  • Schilpp, P.A. and Hahn, L.E., eds. The Philosophy of W.V. Quine: Library of Living Philosophers, XVIII. Expanded edition. La Salle, Illinois: Open Court, 1998.
  • Shahan, R.W. and Swoyer, C., eds. Essays on the Philosophy of W.V. Quine. Norman: University of Oklahoma Press, 1979.
  • Tooke, J.H. The Diversions of Purley, Vol. 1. London, 1786; Boston, 1806.
  • White, M. “The Analytic and the Synthetic: An Untenable Dualism.” In John Dewey: Philosopher of Science and Freedom, ed. S. Hook. New York: Dial, 1950.
  • Whitehead, A.N. and Russell, B. Principia Mathematica. Cambridge: Cambridge University Press, 1910.


Author Information

Stefanie Rocknak
Email: rocknaks@hartwick.edu
Hartwick College
U. S. A.

Philosophy of Dreaming

According to Owen Flanagan (2000), there are four major philosophical questions about dreaming:

1. How can I be sure I am not always dreaming?

2. Can I be immoral in dreams?

3. Are dreams conscious experiences that occur during sleep?

4. Does dreaming have an evolutionary function?

These interrelated questions cover philosophical domains as diverse as metaphysics, epistemology, ethics, scientific methodology, and the philosophy of biology, mind and language. This article covers the four questions and also looks at some newly emerging philosophical questions about dreams:

5. Is dreaming an ideal scientific model for consciousness research?

6. Is dreaming an instance of hallucinating or imagining?

Section 1 introduces the traditional philosophical question that Descartes asked himself, a question which has motivated scepticism about the external world: how can I be sure I am not always dreaming, or dreaming right now? Philosophers have typically looked for features that distinguish dreams from waking life, and one key debate centres on whether it is possible to feel pain in a dream.

Section 2 surveys the ethics of dreaming. The classical view of Augustine is contrasted with more abstract ethical positions, namely, those of the Deontologist, the Consequentialist and the Virtue Ethicist. The notion of lucid dreaming is examined here in light of the question of responsibility during dreaming and how we treat other dream characters.

Section 3 covers the various positions, objections and replies regarding question 3: the debate about whether dreaming is, or is not, a conscious state. The challenges from Malcolm and Dennett are covered. These challenges question the authority of the common-sense view of dreaming as a consciously experienced state. Malcolm argues that the concept of dreaming is incoherent, while Dennett puts forward a theory of dreaming without appealing to consciousness.

Section 4 covers the evolutionary debate, where empirical work ultimately leaves us uncertain of the extent to which natural selection has shaped dreaming, if at all. Early approaches by Freud and Jung are reviewed, followed by approaches by Flanagan and Revonsuo. Though Freud, Jung and Revonsuo have argued that dreaming is functional, Flanagan represents a view shared by many neuroscientists that dreaming has no function at all.

Section 5 looks at questions 5 and 6. Question 5 concerns the cutting-edge issue of precisely how dreaming should be integrated into the research program of consciousness. Should dreaming be taken as a scientific model of consciousness? Might dreaming play another role, such as a contrast analysis with other mental states? Question 6, which concerns the exact qualitative nature of dreaming, has a longer history, though it too is receiving contemporary attention. The section outlines reasons favouring the orthodox view of psychology, that dream imagery is perceptual (hallucinatory), and reasons favouring the philosophical challenge to that orthodoxy, that dreams are ultimately imaginative in nature.

Table of Contents

  1. Dreaming in Epistemology
    1. Descartes’ Dream Argument
    2. Objections and Replies
  2. The Ethics of Dreaming
    1. Saint Augustine on the Morality of Dreaming
    2. Consequentialist vs. Deontological Positions on Dreaming
    3. Virtue Ethics of Dreaming
  3. Are Dreams Consciously Experienced?
    1. The Received View of Dreaming
    2. Malcolm’s Challenge to the Received View
      1. The Impossibility of Verifying Dream Reports
      2. The Conflicting Definitions of “Sleep” and “Dreaming”
      3. The Impossibility of Communicating or Making Judgments during Sleep
      4. Ramifications (contra Descartes)
    3. Possible Objections to Malcolm
      1. Putnam on the Conceptual Analysis of Dreaming
      2. Distinguishing “State” and “Creature” Consciousness
    4. Dennett’s Challenge to the Received View
      1. A New Model of Dreaming: Uploading Unconscious Content
      2. Accounting for New Data on Dreams: “Precognitive” Dreams
    5. Possible Objections to Dennett
      1. Lucid Dreaming
      2. Alternative Explanations for “Precognitive” Dreams
  4. The Function of Dreaming
    1. Early Approaches
      1. Freud: Psychoanalysis
      2. Jung: Analytic Psychology
    2. Contemporary Approaches
      1. Pluralism
      2. Adaptationism
  5. Dreaming in Contemporary Philosophy of Mind and Consciousness
    1. Should Dreaming Be a Scientific Model?
      1. Dreaming as a Model of Consciousness
      2. Dreaming as a Contrast Case for Waking Consciousness
    2. Is Dreaming an Instance of Images or Percepts?
      1. Dreaming as Hallucination
      2. Dreaming as Imagination
  6. References and Further Reading

1. Dreaming in Epistemology

a. Descartes’ Dream Argument

Descartes strove for certainty in the beliefs we hold. In his Meditations on First Philosophy he wanted to find out what we can believe with certainty and thereby claim as knowledge. He begins by stating that he is certain of being seated by the fire in front of him. He then dismisses the idea that this belief could be certain because he has been deceived before in dreams in which he was similarly convinced that he was seated by a fire, only to wake and discover that he was merely dreaming. “How can I know that I am not now dreaming?” is the resulting famous question Descartes asked himself. Though Descartes was not the first to ask this question (see Zhuangzi’s eponymous work, Plato’s Theaetetus and Aristotle’s Metaphysics), he was the first philosopher to doggedly pursue and try to answer it. In answering it, Descartes concludes that, given the sensory deception of dreams, we cannot trust our senses in waking life (without invoking a benevolent God who would surely not deceive us).

The phenomenon of dreaming is used as key evidence for the sceptical hypothesis that everything we currently believe to be true could be false and generated by a dream. Descartes holds the common-sense view that dreams, which regularly occur in all people, are a sequence of experiences often similar to those we have in waking life (this has come to be labelled as the “received view” of dreaming). A dream makes it feel as though the dreamer is carrying out actions in waking life, for during a dream we do not realize that it is a dream we are experiencing. Descartes claims that the experience of a dream could in principle be indistinguishable from waking life – whatever apparent subjective differences there are between waking life and dreaming, they are insufficient differences to gain certainty that I am not now dreaming. Descartes is left unsure that the objects in front of him are real – whether he is dreaming of their existence or whether they really are there. Dreaming was the first source for motivating Descartes’ method of doubt which came to threaten perceptual and introspective knowledge. In this method, he would use any means to subject a statement or allegedly true belief to the most critical scrutiny.

Descartes’ dream argument began with the claim that dreams and waking life can have the same content. There is, Descartes alleges, a sufficient similarity between the two experiences for dreamers to be routinely deceived into believing that they are having waking experiences while they are actually asleep and dreaming. The dream argument has similarities to his later evil demon argument. According to this later argument, I cannot be sure of anything I believe, for I may just be being deceived by a malevolent demon. Both arguments have the same structure: nothing can rule out my being duped into believing I am having experience X when I am really in state Y, hence I cannot have knowledge Z about my current state. Even if individuals happen to be right in their belief that they are not being deceived by an evil demon, and even if they really are having a waking experience, they are left unable to distinguish reality from their dream experiences in order to gain certainty in their belief that they are not now dreaming.

b. Objections and Replies

Since the Meditations on First Philosophy was published, Descartes’ argument has drawn many replies. One main target is the claim that there are no certain marks to distinguish waking consciousness from dreaming. Hobbes believed that an absence of the absurd in waking life was a key difference (Hobbes, 1651: Part 1, Chapter 2). Though sleeping individuals are too wrapped up in the absurdity of their dreams to be able to distinguish their states, an individual who is awake can tell, simply because the absurdity is no longer there during wakefulness. Locke compared real pain to dream pain. He asks Descartes to consider the difference between dreaming of being in the fire and actually being in the fire (Locke, 1690: Book 4, Chapter 2, § 2). Locke’s claim is that we cannot have physical pain in dreams as we do in waking life. His claim, if true, undermines Descartes’ premise that there are no certain marks to distinguish dreaming from waking life, such that we could never be sure which of the two states we are in.

Descartes thought that dreams are protean (Hill, 2004b). By “protean,” Hill means that dream experience can replicate the panoply of any possible waking-life experience; to put it negatively, there is no experience in waking life that could not be realistically simulated (and thereby be phenomenally indistinguishable) in dreams. This protean claim was necessary for Descartes to mount his sceptical argument about the external world. After all, if there were even one experience during waking life which simply could not occur during dreaming, then, in that moment at least, we could be sure we are awake and in contact with the external world, rather than dreaming. Locke alleged that he had found a gap in this protean claim: we do not and cannot feel pain in dreams. The notion of pain occurring in a dream has now been put to the test in a number of scientific studies, through quantitative analysis of the content of dream diaries in the case of ordinary dreams, and also by participating lucid dreamers. The conclusion reached independently by these various studies is that sharply localized pains can occur in dreams, though they are rare (Zadra and others, 1998; LaBerge & DeGracia, 2000). According to the empirical work, then, Locke is wrong about his claim, though he might still query whether really agonizing and ongoing pain (as in his original example of being in a fire) might not be possible in dreams. The empirical work supports Descartes’ conviction that dreams can recapitulate any waking state, meaning that there is no essential difference between waking and dreaming, thereby ruling out certainty that this is not now a dream.

Another common attempt to distinguish waking life from dreaming is the “principle of coherence” (Malcolm, 1959: Chapter 17). We are awake, and not asleep dreaming, if we can connect our current experiences to the overall course of our lives. Essentially, using the principle of coherence, we can think more critically in waking life. Hobbes seems to adhere to something like the principle of coherence in his appeal to absurdity as a key feature of dreams. Though dreams do tend to involve a lack of critical thinking, it still seems possible to have a dream that connects to the overall course of our lives. It is generally accepted that there is no certain way to distinguish dreaming from waking life, though the claim that this ought to undermine our knowledge in any way is controversial.

For an alternative response to Descartes’ sceptical dream argument, see Sosa (2007), who says that “in dreaming we do not really believe; we only make-believe.” He argues that in dreaming we only ever imagine scenarios, which never involve deceptive beliefs, and so we have no reason to feel that our ordinary waking-life beliefs can be undermined. Descartes relied on a notion of belief that was the same in both dreaming and waking life. Of course, if I have never believed, in sleep, that I was seated by the fire when I was actually asleep in bed, then none of my dreams challenge the perceptual and introspective beliefs I have during waking life. Ichikawa (2008) agrees with Sosa that in dreams we imagine scenarios (rather than believe we are engaged in scenarios as though awake), but he argues, in contrast to Sosa, that this does not avoid scepticism about the external world. Even if dreams trade in imaginings rather than beliefs, they still create experiences subjectively indistinguishable from waking experience. Due to this similarity, it would be “epistemically irresponsible” to believe that we are having waking experiences merely on the grounds that dreams involve imaginings rather than beliefs: I still cannot really tell the difference between the experiences. The new worry is whether what I take to be a belief in waking life is really a belief, rather than an imagining during dreaming, and so scepticism is not avoided, Ichikawa claims.

2. The Ethics of Dreaming

Since the late twentieth century, discussion of the moral and criminal responsibility of dreaming has centred on sleepwalking, in cases where sleepwalkers have harmed others. The assessment has typically been carried out in practical, rather than theoretical, settings, for example law courts. Setting aside the notion of sleepwalking, philosophers are more concerned with the phenomenology of ordinary dreams. Do the notions of right and wrong apply to dreams themselves, as well as to actions done by sleepwalkers?

a. Saint Augustine on the Morality of Dreaming

Saint Augustine, seeking to live a morally perfect life, was worried about some of the actions he carried out in dreams. For somebody who devoted his life to celibacy, his sexual dreams of fornication worried him. In his Confessions (Book X, Chapter 30), he writes to God. He talks of his success in quelling sexual thoughts and earlier habits from his life before his religious conversion. But he declares that in dreams he seems to have little control over committing the acts that he avoids during the waking day. He rhetorically asks “am I not myself during sleep?”, believing that it really is him who is the central character of his dreams. In trying to solve the problem, Augustine appeals to the apparent experiential difference between waking and dreaming life. He draws a crucial distinction between “happenings” and “actions.” Dreams fall into the former category. Augustine was not carrying out actions but was rather undergoing an experience which happened to him without choice on his part. By effectively removing agency from dreaming, we cannot be responsible for what happens in our dreams. As a result, the notion of sin or moral responsibility cannot be applied to our dreams (Flanagan, 2000: p. 18; pp. 179-183). According to Augustine, only actions are morally evaluable. He is thus committed to the claim that all events that occur in dreams are non-actions. That claim is brought into question by lucid dreams, which seem to involve genuine actions and decision-making processes whereby dreaming individuals can control, affect and alter the course of the dream. The success of Augustine’s argument hinges on there being no actions in dreams, and lucid dreaming is therefore evidence against this premise. Augustine’s argument that moral notions never apply to dreams thus fails, because dreams can involve actions rather than mere happenings.
In the next section we will see what the two main ethical positions might say on the issue of right and wrong in dreams.

b. Consequentialist vs. Deontological Positions on Dreaming

Dreaming is an instance of a more general concern about a subset of thoughts – fantasies – that occur, potentially without affecting behaviour. We seem to carry out actions during dreams in simulated realities involving other characters. So perhaps we ought to consider whether we are morally responsible for actions in dreams. More generally, are we morally obliged not to entertain certain thoughts, even if these thoughts do not affect our later actions and do not harm others? The same issue might be pressed with the use of violent video games, though here the link to later behaviour is more controversial. Some people enjoy playing violent video games, and the more graphic the better. Is that unethical in and of itself? Why should we excuse people’s thoughts when, if they were carried out as actual actions, they would be grossly wrong? Dreaming is perhaps a special instance because in ordinary dreams we believe we are carrying out actions in real life. What might the two main moral theories say about the issue, with the assumption in place that what we do in dreams does not affect our behaviour in waking life?

Consequentialism is a broad family of ethical doctrines that assesses an action in terms of its consequences. There are two separate issues here – one ethical, one empirical. The empirical question asks whether dreams, fantasies and video games really are without behavioural consequence towards others. To be clear, the Consequentialist is not arguing that dreams do not have any consequences, only that if they really have no consequences then they are not morally evaluable, or should be deemed neutral. Consequentialist theories may well argue that, provided dreams really do not affect my later behaviour, it is not morally wrong to “harm” other dream characters, even in lucid dreaming. The more liberal Consequentialists might even see value in these instances of free thought. That is, there might be some intrinsic good in allowing such freedom of the mind, though it is not a value that can outweigh actual harm to others, so the Consequentialists might claim. If having such lucid dreams makes me nicer to people in waking life, then the Consequentialist will actually endorse such activity during sleep.

Consequentialists can grant this argument even though dream content has an intentional relation to other people; namely, dreams can often have singular content. Singular content, or singular thought, is to be contrasted with general content (the notion of singular thought is somewhat complex; readers should consult Jeshion, 2010). If I simply form a mental representation of a blond Hollywood actor, the features of the representation might be too vague to pick out any particular individual. My representation could equally be fulfilled by Brad Pitt, Steve McQueen, a fictional movie star or countless other individuals. If I deliberately think of Brad Pitt, or if the images come to me in enough detail, then my thought does not have general content but is about that particular individual. Dreams are not always about people with general features (though they can be), but are rather often about people the sleeping individual is actually acquainted with – particular people from that individual’s own life – family, friends, and so forth.

Deontological theories, in stark contrast to Consequentialist theories, hold that we have obligations to act and think, or not to act and think, in certain ways regardless of the effects on other people. According to Deontological moral theories, I have a duty never to entertain certain thoughts because doing so is wrong in itself. Deontological theories see individuals as more important than mere consequences of action. Individuals are “ends-in-themselves” and not means to a desirable state of affairs. Since dreams are often actually about real people, I am not treating an individual as an end-in-itself if I choose to harm their “dream representative”. The basic Deontological maxim to treat someone as an end rather than as a means to my entertainment can apply to dreams.

As the debate between Deontologists and Consequentialists plays out, more nuanced positions will reveal themselves, and perhaps there is room for agreement between the two camps. Maybe I can carry out otherwise immoral acts on dream characters with general features, where these characters do not represent any particular individuals of the waking world. Some Deontologists might still be unhappy, noting that in dreams one crucial element of singular content remains – we represent ourselves in dreams. The arch-Deontologist Kant would argue that by carrying out such acts one is not treating oneself as an end-in-itself but as a means to other ends; namely, there is something inherently wrong about even pretending to carry out an immoral action, because in doing so we depersonalize ourselves. Other Deontologists might want to distinguish fantasies from dreams. Fantasies are actions: I sit down and decide to indulge my daydreams. Dreams, by contrast, might be more passive, and therefore might fall on the “happenings” side of the Augustinian distinction between actions and happenings. On this view, I am not using someone as a means to an end if I am just passively dreaming, whereas I am if I start actively thinking about that individual. So maybe the Deontological case only applies to lucid dreaming, where Augustine’s distinction would still be at work. This might exempt a large number of dreams from being wicked, but not all of them.

c. Virtue Ethics of Dreaming

Deontology and Consequentialism are the two main moral positions. The third is Virtue Ethics, which emphasizes the role of character. This approach goes beyond assessing individual actions as right or wrong, avoiding harm and maximizing pleasure, and instead considers an individual’s life overall – how to make it a good one and develop that individual’s character. Where might dreaming fit in with this third moral position? Virtue Ethics takes the question “what is the right action?” and turns it into the broader question: “how should I live?” The question “can we have immoral dreams?” accordingly needs to be opened up to: “what can I get out of dreaming to help me acquire virtuousness?”

The Virtue Ethics of dreaming might be pursued in a Freudian or Jungian vein. Dreams arguably put us in touch with our unconscious and indirectly tell us about our motives and habits in life:

“[I]t is in the world of dreaming that the unconscious is working out its powerful dynamics. It is there that the great forces do battle or combine to produce the attitudes, ideals, beliefs, and compulsions that motivate most of our behavior. Once we become sensitive to dreams, we discover that every dynamic in a dream is manifesting itself in some way in our practical lives—in our actions, relationships, decisions, automatic routines, urges, and feelings.” (Johnson, 2009: p.19)

Similarly:

“Studying our own dreams can be valuable in all sorts of ways. They can reveal our inner motivations and hopes, help us face our fears, encourage growing awareness and even be a source of creativity and insight.” (Blackmore, 2004: p.338)

In order to achieve happiness and fulfilment and to develop virtuousness, we owe it to ourselves to recall and pay attention to our dreams. However, this line of argument relies on the claim that dreams really do function in the way Freud or Jung thought they do, which is controversial: dream analysis of any kind lacks scientific status and is more of an art. But then social dynamics and the development of character are more of an art than a science. Virtue Ethics is perhaps the opposite side of the coin to psychotherapy: the former focuses on positive improvement of character, whereas the latter focuses on avoiding negative setbacks in mind and behaviour. Whether psychotherapy should be used more for positive improvement of character is a question approached in the philosophy of medicine. These considerations touch on a further question of whether dreams should be used in therapy.

Certain changes people make in waking life do eventually “show up” in dreams. Dreams, as unconsciously instantiated, capture patterns of thought from waking life. New modes of thinking can be introduced, and this is the process by which people learn to lucid dream. By periodically asking oneself during the day whether one is awake, every day for some period of time, this pattern of thinking eventually occurs in dreams. By constantly asking “am I awake?” during the day, one becomes more likely to ask the question in a dream, realize that one is not awake, and answer in the negative (Blackmore, 1991). Given the possibility that dreams capture waking life thinking and the notion that one can learn to lucid dream, one may ask whether Augustine really tried his hardest at stopping the dreams that troubled him, and whether he was really as successful at quelling sexual urges in waking life as he thought he was.

Ordinary dreams are commonly thought not to involve choices and corresponding agency. Lucid dreaming invokes our ability to make choices, often to the same extent as in waking life. Lucid dreaming represents an example of being able to live and act in a virtual reality, and is especially timely due to the rise in the number of lucid dreamers (popular manuals on how to lucid dream are sold and actively endorsed by some leading psychologists; see LaBerge & Rheingold, 1990; Love, 2013) and the increase of virtual realities on computers. Whereas Deontologists and Consequentialists are likely to be more interested in the content of dreams, the Virtue Ethicist will likely be more interested in dreaming as an overall activity and how it fits in with one’s life. Stephen LaBerge is perhaps an implicit Virtue Ethicist of dreaming. Though humans are thought of as moral agents, we spend a third of our lives asleep, and 11% of our mental experiences are dreams (Love, 2013: p.2). The dreams we experience during sleep are mostly non-agentic, and this amounts to a significant unfulfilled portion of our lives. LaBerge argues that by not cultivating lucid dreams, we miss out on opportunities to explore our own minds and ultimately enrich our waking lives (LaBerge & Rheingold, 1990: p.9). Arguably then, the fulfilled virtuous person will try to develop the skill of lucid dreaming. One could object that the dreamer should just get on with life in the real world. After all, learning to lucid dream takes most people time and practice, requiring the individual to think about their dreams for periods of their waking life; they could instead be spending that time doing voluntary work for charity. In reply, the Virtue Ethicist can point to the parallel case of meditation: the meditator is calmer in situations that might threaten their morality and is working on longer-term habits.
Similarly, the lucid dreamer is achieving fulfilment and nurturing important long-term traits and habits. By gaining control of dreams, there is the opportunity to examine one’s relationships with people by representing them in dreams. Lucid dreams might aid an individual in carrying out a difficult task in real life by allowing them to practice it in life-like settings (going beyond merely imagining the scenario in waking life). Lucid dreams may then play a role in developing traits that people otherwise would not develop, and act as an outlet for encouraging the “thick moral concepts” in oneself – courage, bravery, wisdom, and so forth. In this way lucid dreaming can be seen as a means to the end of virtuousness, or as a supplementary virtue. Human experience can be taken as any area in which a choice is required; at the very least, then, lucid dreaming signifies an expansion of agency.

3. Are Dreams Consciously Experienced?

a. The Received View of Dreaming

There is an implicit, unquestioned commitment in both Descartes’ dream argument and Augustine’s argument on the morality of dreaming. This is the received view, the platitudinous claim that a dream is a sequence of experiences that occur during sleep. The received view typically adheres to a number of further claims: that dreams play out approximately in real time and do not happen in a flash; that when a dream is successfully remembered, the content of the dream is to a large extent what is remembered after waking (see Fig. 1 below); and that an individual’s dream report is taken as excellent evidence for that dream having taken place.

Fig 1

The received view is committed to the claim that we do not wake up with misleading memories. Any failure of memory is one of omission rather than error. I might wake unable to recall the exact details of certain parts of a dream, but I will not wake up and believe I had a dream involving contents which did not occur (I might recall details A – G with D missing, but I will not wake and recall content X, Y, Z). The received view is not committed to a claim about exactly how long a dream takes to experience relative to how long it takes to remember, but dreams cannot occur instantaneously during sleep: the received view is committed to the claim that dreams are extended in time during sleep. The content does not necessarily have to occur just before waking; another graph detailing possible experience and later recollection of dream content on the received view might show A – G occurring much earlier in sleep (with the recalled A* – G* in the same place). A – G can represent any dream that people ordinarily recall.

We can appear to carry out a range of actions in our dreams quite similar to those of waking life. On the received view, everything that we can do in waking life, we can also do in dreams, and the same mental states can occur in dreams just as they do in waking life. We can believe, judge, reason and converse with what we take to be other individuals in our dreams. Since we can be frightened in a dream, we can be frightened during sleep.

The received view is attested to by reports of dreams from ordinary people in laboratory and everyday settings. Dreamers portray their dreams as mental experiences that occurred during sleep. The received view is hence part of folk psychology, the term used to denote the beliefs that ordinary people hold on matters concerning psychology, such as the nature of mental states. When it comes to dreaming, the consensus – folk psychology, scientific psychology and philosophy – agrees that dreams are experiences that occur during sleep.

b. Malcolm’s Challenge to the Received View

Malcolm stands in opposition to the received view – the implicit set of claims about dreams that Descartes, Augustine and the majority of philosophers, psychologists and ordinary people are committed to. It is worth separating Malcolm’s challenge to the received view into three arguments: #1 dream reports are unverifiable; #2 sleep and dreaming have conflicting definitions; #3 communication and judgements cannot occur during sleep.

i. The Impossibility of Verifying Dream Reports

According to Malcolm’s first argument, we should not simply take dream reports at face value, as the received view has done; dream reports are insufficient grounds for the metaphysical claim that dreaming is consciously experienced during sleep. When we use introspection after sleeping to examine our episodic memories of dreams and put our dream report into words, these are not the dream experiences themselves. Malcolm adds that there is no other way to check the received view’s primary claim that dreams are consciously experienced during sleep. Importantly, Malcolm states that the sole criterion we have for establishing that one has had a dream is that one awakes with the impression of having dreamt (that is, an apparent memory) and that one then goes on to report the dream. Waking with the impression does not entail that there was a conscious experience during sleep that actually corresponds to the report. Malcolm views dream reports as inherently first personal and repeatedly claims that the verbal report of a dream is the only criterion for believing that a dream took place. He adds that dreams cannot be checked in any other way without either showing that the individual is not fully asleep or invoking a new conception of dreaming that relies on behavioural criteria such as patterns of physiology or movement during sleep. Behavioural criteria, too, are insufficient to confirm that an individual is consciously experiencing their dreams, according to Malcolm; the best we can get from them are probabilistic indications of consciousness that will never be decisive. If scientists try to show that one is dreaming during sleep, then, Malcolm alleges, those scientists have invoked a new conception of dreaming that does not resemble the old one. He believes that where scientists appeal to behavioural criteria they are no longer inquiring into dreaming, because the real conception of dreaming has only ever relied on dream reports.
A dream is logically inseparable from the dream report and it cannot be assumed that the report refers to an experience during sleep. Malcolm thereby undermines the received view’s claim that “I dreamed that I was flying” entails that I had an experience during sleep in which I believed I was flying. Hence there is no way of conclusively confirming the idea that dreaming occurs during sleep at all.

Malcolm’s claim that the received view is unverifiable is inspired by a remark of Wittgenstein’s, who alludes to the possibility that there is no way of finding out whether the memory of a dream corresponds to the dream as it actually occurred (Wittgenstein, 1953: part 2, § vii; p.415). Wittgenstein asks what we should do about a man who has an especially bad memory: how can we trust his reports of dreams? The received view is committed to the crucial premise that when we recall dreams we recall the same content as the earlier experience. But Wittgenstein’s scenario establishes the possibility that an individual could recall content that did not occur. The question then arises as to why we should believe that somebody with even a good day-to-day memory is in any better position, after waking, to remember earlier conscious experiences during sleep.

In drawing attention to empirical work on dreams, Malcolm says that psychologists have come to be uncertain whether dreams occur during sleep or during the moment of waking up. The point for Malcolm is that it is “impossible to decide” between the two (Malcolm, 1956: p.29). Hence the question “when, in his sleep, did he dream?” is senseless. There is also a lack of any criterion for the duration of dreams, that is, for how long they last in real time. Malcolm states that the concept of the time at which a dream occurs, and of how long a dream might last, has no application in ordinary conversation about dreams. “In this sense, a dream is not an ‘occurrence’ and, therefore, not an occurrence during sleep” (Malcolm: 1956, p.30). Malcolm’s epistemic claim has a metaphysical result, namely, that dreaming does not take place in time or space. The combination of the waking impression and the use of language has misled us into believing that dreams occur during sleep; “dreams” do not refer to anything over and above the waking report, according to Malcolm. This is why Malcolm thinks that the notion of “dreaming” is an exemplar of Wittgenstein’s idea of prejudices “produced by ‘grammatical illusions’” (Malcolm, 1959: p.75).

ii. The Conflicting Definitions of “Sleep” and “Dreaming”

In his second argument, Malcolm accuses the received view of contradicting itself, so that the claim that dreams consciously occur during sleep is incoherent. Sleep is supposed to entail a lack of experiential content, or at least an absence of intended behaviour, whereas dreaming is said to involve conscious experience. Experience implies consciousness; sleep implies a lack of consciousness; therefore the claim that dreams occur during sleep implies both consciousness and a lack of consciousness. So the received view results in a contradiction. This alleged contradiction supports Malcolm’s first argument that dreams are unverifiable: any attempt to verify a dream report will just show that the individual was not asleep, and so there is no way to verify that dreams could possibly occur during sleep. One might object to Malcolm that the content of a dream report could coincide well with a publicly verifiable event, such as the occurrence of thunder while the individual slept and thunder in the reported dream later. Malcolm claims that in this instance the individual could not be sound asleep if they are aware of their environment in any way. He alleges that instances such as nightmares and sleepwalking also invoke new conceptions of sleep and dreaming. By “sleep” Malcolm thinks that people have meant sound sleep as the paradigmatic example, namely, sleeping whilst showing no awareness of the outside environment and no behaviour.

iii. The Impossibility of Communicating or Making Judgments during Sleep

Malcolm takes communication as a crucial way of verifying that a mental state could be experienced. His third argument rules out the possibility of individuals communicating or making judgements during sleep, essentially closing off dreams as things we can know anything about. This third argument supports the first argument that dreams are unverifiable, and anticipates a counter-claim that individuals might be able to report a dream as it occurs, thereby verifying it as a conscious experience. Malcolm claims that a person cannot say, and be aware of saying, the statement “I am asleep” without it being false. It is true that somebody could talk in his sleep and incidentally say “I am asleep”, but he could not assert that he is asleep. If he is actually asleep then he is not aware of saying the statement (and so it is not an assertion), whilst if he is aware of saying the statement then he is not asleep. Since a sleeping individual cannot meaningfully assert that he is asleep, Malcolm concludes that communication between a sleeping individual and individuals who are awake is logically impossible.

Dream reports, being inherently first personal and retrospective, or so Malcolm alleges, fail the Wittgensteinian criterion that experiences be potentially verifiable. Malcolm alleges that there could be no intelligible mental state occurring during sleep; any talk about mental states that could occur during sleep is meaningless. Malcolm assumes the Wittgensteinian point that talk about experiences gains meaning in virtue of its communicability, and communicability is necessary for the meaningfulness of folk psychological terms. Malcolm appeals to the private language argument to rebut the idea that there could be a mental state which only one individual could privately experience and understand (for more on the private language argument, see Candlish & Wrisley, 2012).

The claim that there is a lack of possible communicability in sleep is key for Malcolm to cash out the further claim that one cannot make judgements during sleep. He does not believe that one could judge what one cannot communicate. According to Malcolm, since people cannot communicate during sleep, they cannot make judgements during sleep. He further adds that being unable to judge that one is asleep underlies the impossibility of having any mental experience during sleep, for we could never observe an individual judge that he was asleep. This point relies on Malcolm’s second argument that the definitions of sleep and dreaming are in contradiction: there is nothing an individual could do to demonstrate he was making a judgement that did not also simultaneously show that he was awake. Of course, it seems possible that we could have an inner experience that we did not communicate to others. Malcolm points out that individuals in everyday waking instances could at least have communicated their experiences. There is no possible world, though, in which a sleeping individual could communicate his experience to us – so one cannot judge that one is asleep and dreaming. If Malcolm’s argument about the impossibility of making judgements in sleep works, then his attack is detrimental to the received view’s premise that in sleep we can judge, reason, and so forth.

iv. Ramifications (contra Descartes)

Malcolm thinks that his challenge to the received view, if successful, undercuts Cartesian scepticism. Descartes’ scepticism got off the ground when he raised the following issue: due to the similarity of dreams and waking experiences, my apparent waking experience might be a dream now, and much of what I took to be knowledge is potentially untrue. A key premise for Descartes is that a dream is a sequence of experiences, of the very same kind we can have whilst awake. This premise is undermined if dreams are not experiences at all. If the received view is unintelligible, then Descartes cannot coherently compare waking life experiences to dreams: “if one cannot have thoughts while sound asleep, one cannot be deceived while sound asleep” (Malcolm: 1956, p.22). Descartes, championing the received view, failed to notice the incoherence in the notion that we can be asleep and aware of anything. Whenever we are aware of anything, whether it be the fire in front of us or otherwise, this is firm evidence that we are awake and that the world presented to us is as it really is.

c. Possible Objections to Malcolm

i. Putnam on the Conceptual Analysis of Dreaming

Part of Malcolm’s challenge to empirical work was his claim that the researchers have invoked new conceptions of sleep and dreaming (without realizing it) because of their new method of attempted verification. According to Malcolm’s charge, researchers are not really looking into dreaming as the received view understands the concept. This was crucial to his attempt to undermine all empirical work on dreaming. Instead of relying on an individual’s waking report, scientists may now try to infer from rapid eye movements or other physiological criteria that the individual is asleep and dreaming. For Malcolm, these scientists are working from a new conception of “sleep” and “dreaming” which only resembles the old one. Putnam objects to Malcolm’s claim, stating that science updates our concepts rather than replacing them: the received view seeks confirmation in empirical work. In general, concepts are always being updated by new empirical knowledge. Putnam cites the example of Multiple Sclerosis (MS), a disease which is very difficult to diagnose because its symptoms resemble those of other neurological diseases and not all of the symptoms are usually present. Furthermore, some neurologists have come to believe that MS is caused by a certain virus. Suppose a patient has a paradigmatic case of MS. Saying that the virus is the cause of the disease changes the concept, because it involves new knowledge. On Malcolm’s general account, this would be a new understanding with a new concept, and so the scientists would not be talking about MS at all (Putnam, 1962: p.219). Putnam believes that we should reject Malcolm’s view that future scientists are talking about a different disease. Analogously, we are still talking about the same thing when we talk about new ways of verifying the existence of dreams.
If Putnam’s attack is successful, then the work that scientists are doing on dreaming is about dreaming as the received view understands the concept, namely, conscious experiences that occur during sleep. If Putnam is right that scientists are not invoking a new conception of sleep and dreaming, then we have other ways to verify our understanding of dreaming, and the received view is continuous with empirical work.

ii. Distinguishing “State” and “Creature” Consciousness

David Rosenthal develops some conceptual vocabulary (Rosenthal: 2002, p.406) which arguably exposes a flaw in Malcolm’s reasoning. “Creature consciousness” is what an individual or animal displays when awake and responsive to external stimuli. “Creature unconsciousness” is what the individual or animal displays when unresponsive to external stimuli. “State consciousness,” on the other hand, refers to the mental state that occurs when one has an experience, which may be either internally or externally driven: I may have a perception of my environment, or an imaginative idea without perceptual input. Malcolm evidently thinks that any form of state consciousness requires some degree of creature consciousness. But such a belief begs the question, a Rosenthalian opponent of Malcolm might argue. It does not seem conceptually confused to believe that one can be responsive to internal stimuli (hence state conscious) without being responsive to external stimuli (hence creature unconscious). If by “sleep” all we have meant is creature unconsciousness, then there is no reason to believe that an individual cannot have state consciousness at the same time. An individual can be creature unconscious whilst having state consciousness; that is to say, an individual can be asleep and dreaming.

There are various reasons to believe that creature consciousness and state consciousness can come apart (and that state consciousness can plausibly occur without creature consciousness): the mental experience of dreaming can be gripping enough, and the individual’s critical reasoning poor enough, for him to be deceived into believing his dream is reality; most movement in sleep is not a response to outside stimuli at all but rather a response to internal phenomenology; and the sleeping individual is never directly aware of his own body during sleep. Recall that Malcolm thought that sleep scientists cannot correlate movement in sleep with a later dream report, because such movement detracted from the individual’s being fully asleep. If creature unconsciousness and state consciousness can coexist, Malcolm is arguably wrong to think that an individual moving in sleep detracts from their being fully asleep. This may block Malcolm’s appeal to sound sleep as the paradigmatic example of sleep. With the Rosenthalian distinction, we have reason to believe that even if an individual moves around in sleep, they are just as asleep as a sleeping individual lying completely still. The distinction may also count against Malcolm’s third argument against the possibility of communication in sleep. Of course, if creature consciousness is a necessary condition for communication, then this distinction is not enough to undermine Malcolm’s third argument that communication cannot occur during sleep. A view on which state consciousness alone suffices for communication, on the other hand, will survive Malcolm’s third argument.

The apparent contradiction between sleep and dreaming that Malcolm claims exists will be avoided if the kind of consciousness implied by sleep is different from the kind Malcolm thinks is implied. The distinction might allow us to conclude that corroboration between a waking report and a publicly verifiable sound, for example, can demonstrate that an individual is dreaming and yet asleep. Some dream content, as reported afterwards, seems to incorporate external stimuli that occurred at the same time as the dream. Malcolm calls this faint perception of the environment (Malcolm, 1956: p.22) and says that it detracts from an individual’s being fully asleep. Perhaps an objector to Malcolm can make a further, albeit controversial, claim within the Rosenthalian framework to account for such dreams. For example, if there is thunder outside while an individual is asleep, he might dream of being struck by Thor’s hammer. His experience of the thunder is not the same sort of experience he would have had if he were awake during the thunder; the possible qualia are different. So Malcolm may be wrong in alleging that an individual is faintly aware of the outside environment when corroboration is attempted between a report and a verifiable sound. Malcolm argued that such dreams are examples of individuals who are not fully asleep. But we can now see, within the Rosenthalian framework, how an individual could be creature unconscious (or simply “asleep” on the received view) and be taking in external stimuli unconsciously, whilst having state consciousness that is not directly responsive to the external environment, because he is not even faintly conscious of the external world.

See Nagel (1959), Yost (1959), Ayer (1960; 1961), Pears (1961), Kramer (1962) and Chappell (1963), for other replies to Malcolm.

d. Dennett’s Challenge to the Received View

i. A New Model of Dreaming: Uploading Unconscious Content

Dennett begins his attack on the received view of dreaming (the set of claims about dreams being consciously experienced during sleep) by questioning its authority. He does this by proposing a new model of dreaming; he is flexible in his approach and considers variations of his model. The crucial difference between his theory and the received view is that, on what we might call Dennett’s uploading of unconscious content model of dreaming, consciousness is not present during sleep. Dennett does not say much about how this processing of unconscious material works, only that different memories are uploaded and woven together to create new content that will be recalled upon waking as though it had been experienced during sleep, although it never was. Dennett is not repeating Malcolm’s first argument that dreaming is unverifiable. On the contrary, he believes that the issue will be settled empirically, though he claims that there is nothing to favour the received view’s claim that dreams involve conscious experiences.

On the received view, the memory of an earlier dream is caused by the earlier dream experience, and recall is the second time the content is experienced. On Dennett’s model, dream recall is the first time the content is experienced. Why believe that dreaming involves a lack of consciousness during sleep? One might cite evidence that the directions of rapid eye movements during sleep have been well correlated with the reports of dream content: an individual with predominantly horizontal eye movements might wake up and report that they were watching a tennis match in their dream. Dennett accommodates these (at the time) unconfirmed findings by arguing that even if eye movement matches perfectly with the reported content, it may be that the unconscious is uploading memories and readying the content that will be experienced in the form of a false memory; the memory loading process is not conscious at the time of occurrence. Such findings would bring us almost back to the received view – that the content of the dream does occur during sleep – for it may be that the unconscious content is uploaded sequentially, in the same order the received view believes it is experienced. But we still lack proof that the individual is aware of the content of the dream during sleep. That is to say, the individual may not be having a conscious experience, even though the brain process involves the scenario which will later be consciously experienced as though it had occurred consciously during sleep. Movement and apparent emotion in sleep can be accounted for too: a person twitches in their sleep as a memory with content involving a frightening scenario is uploaded and interwoven into a nightmarish narrative. It does not follow that the individual is conscious of this content being uploaded. This account is even plausible on an evolutionary account of sleep: the mind needs time to be unconscious, and the brain and body need to recalibrate.
Thus, during sleep, the body is like a puppet, its strings pulled by the memory-loading process – although individuals show outward signs of emotion and bodily movement, there is nothing going on inside. Sometimes, though, what is remembered is the content being prepared, just as on the received view, only the individual is not aware of the content during sleep – this is why there can be matches between reported dream content and the direction of eye movement. Both sides of the debate agree that when dream content is being prepared some parts of the body move about as though it were a conscious experience; Dennett simply denies that consciousness is present at the time, whereas the received view holds that it is.

Dennett also considers possibilities where the content of dream recall does not match the content that is uploaded. The content uploaded during sleep might involve, say, window shopping in a local mall, yet the content recalled upon waking might involve flying over Paris. Having outlined the two theories – the received view and his own unconscious alternative – Dennett is merely making a sceptical point that the data of dream reports alone will not decide between them. What, then, is there to choose between them? Dennett believes that there is further evidence, of a specific type of dream report, that might decide the issue in favour of his own model.

ii. Accounting for New Data on Dreams: “Precognitive” Dreams

Any scientific theory must be able to account for all of the data. Dennett believes that there exist certain dream reports which the received view has failed to acknowledge and cannot account for. There is anecdotal evidence suggesting that dreams are concocted at the moment of waking, rather than experienced during sleep, which poses a direct challenge to the received view. The most well-known anecdotal example was noted by the French physician Alfred Maury, who dreamt for some time of taking part in the French Revolution, before being forcibly taken to the guillotine. As his head was about to be cut off in the dream, he woke up with the headboard falling on his neck (Freud, 1900: Chapter 1; Blackmore, 2005: p. 103). This type of anecdote is well documented in films – one dreams of taking a long romantic vacation with a significant other and is about to kiss them, only to wake up with the dog licking one’s face. In Dennett’s own anecdotal example, he recalls dreaming for some time of looking for his neighbour’s goat. Eventually, the goat started bleating at the same time as his alarm clock went off, which then woke him up (Dennett, 1976: p. 157). The received view is committed to the claim that dreams, that is, conscious experiences, occur whilst an individual is asleep. The individual then awakes with the preserved memory of content from the dream. But the anecdotes pose a potentially fatal problem for the received view because the entire content of the dream seems to be caused by the stimulus that woke the individual up. The anecdotes make dreams look more like spontaneous imaginings on waking than the real-time conscious experiences of the received view. Dennett argues that precognition is the only defense the received view can take against this implication. Given the paranormal connotations, this defense is untenable (Dennett, 1976: p. 158).
Dennett provides a number of other anecdotal examples which imply that the narrative of a dream is triggered retrospectively, after waking. The content of the dream thematically and logically leads up to an end point which is too similar to the waking stimulus to be a coincidence. The difficulty for the received view is to explain how the content could be working, in real time, towards an ending that coincides with the sound (or equivalent experience) of something in the outside environment. Figures 2 and 3 depict the attempts to explain the new data on the received view and on Dennett’s model.

 

Fig 2

Fig 3

If Dennett is right that the received view can only explain the anecdotes by appeal to precognition, then we would do well to adopt a more plausible account of dreaming; any attempt to suggest that individuals have premonitions about the near future from dreams would have little credibility. Dennett’s alternative unconscious uploading account might also allow for the retro-selection of appropriate content at the moment of waking. This theory allows for two ways of dreaming to regularly occur, both without conscious experience during sleep. The first way: dreams play out similarly to the received view, only individuals lack consciousness of the content during sleep. Specifically, Dennett argues that during sleep different memories are uploaded by the unconscious and woven together to create the dream content that will eventually be experienced when the individual wakes. During sleep the content of the dream is gathered together by the brain, without conscious awareness, like recording a programme at night which is consciously played for the first time upon waking. The second way: when one is woken up dramatically, the brain selects material (relevant to the nature of the waking stimulus) at the moment of waking, which is threaded together as a new story, causing the individual to have a “hallucination of recollection” (Dennett, 1976: p. 169). It might be that the unconscious was preparing entirely different content during sleep, which was set to be recalled, but that content is overwritten due to the dramatic interruption.

On Dennett’s unconscious uploading/retro-selection theory, “it is not like anything to dream, although it is like something to have dreamed” (Dennett, 1976: p. 161). Consciousness is invoked on waking, as one apparently remembers an event which had never consciously occurred before. On the above proposal, Dennett was not experiencing A – G, nor was the content for that dream even being prepared (though it might have been material prepared at an earlier date and selected now). Dennett also alludes to a library in the brain of undreamed dreams with various endings that are selected in the moments of waking to fit the narrative connotations of the stimulus that wakes the individual (Dennett, 1976: p. 158). His account is open to variations of his model. Perhaps different endings might be selected: instead of a goat, other content may have competed for uploading through association – it may have been a dream about going to the barber’s and getting a haircut, with the buzzing of the clippers coinciding with the alarm clock, or about boarding a spaceship before take-off, had the sound of the alarm clock been more readily associated with these themes. Alternatively, the same ending might get fixed to alternative story-lines leading up to it: Dennett could have had a dream about going to his local job centre and being employed as a farmer, rather than searching for his neighbour’s goat, with the same ending of finding a bleating goat staying in place.

As Dennett notes, if any of these possibilities turns out to be true then the received view is false, and so they are serious rivals indeed. For all the evidence available (in 1976), Dennett believes his unconscious uploading model is better placed than the received view to explain the data, because the anecdotes suggest that sometimes the conscious experience can only occur after sleep – an alien idea to the received view. Moreover, the received view should have no immediate advantage over the other models. Dennett separates the memory from the experience: the memory of a dream is first experienced consciously when it is recalled. The result is the same as Malcolm’s – the received view is epistemologically and metaphysically flawed. If there is nothing it is like to have a dream during sleep, then the recall of dreams does not refer to any experience that occurred during sleep.

e. Possible Objections to Dennett

i. Lucid Dreaming

It might seem that lucid dreaming is an immediate objection to Malcolm’s and Dennett’s arguments against the received view. Lucid dreaming occurs when an individual is aware during a dream that it is a dream. Lucid dreaming therefore seems to be an example of experiencing a dream whilst one is asleep, in which case dreams must be experiences that occur during sleep. In replying to this objection, Dennett argues that lucid dreaming does not really occur. For the waking impression might contain “the literary conceit of a dream within a dream” (Dennett, 1976: p. 161). The recalled dream might just have content in which it seems as though the individual is aware they are having a dream; even this content might have been uploaded unconsciously during sleep. It might now seem that we have no obvious way of testing whether an individual who reports having had a lucid dream was aware of the dream at the time. However, Dennett’s reply is undermined by a series of experiments carried out by Stephen LaBerge. One might think that if lucid dreaming really occurs, and the direction of rapid eye movement is linked to the content of the dream as it occurs, then one could pre-arrange some way of looking around in a lucid dream (one that stands out from random eye movement) in order to test both theories. LaBerge carried out just this experiment, with positive results (LaBerge, 1990). He asked individuals who could lucid dream to communicate with him voluntarily from within their lucid dreams by making pre-arranged and agreed eye movements, exploiting the match between dream content and the direction of eye movement, and thereby challenging Dennett’s claim that individuals are not conscious during sleep. A pattern of intended and previously arranged eye movement, for example left-right-left-right, can be a sleeping individual’s way of expressing the fact that they are having a dream and are aware that they are having a dream (see figure below).

 

Fig 4

In figure 4 above, the graph correlates with the reported dream content. The participant wakes and gives a dream report which matches the eye movements. The participant claims to have made five eye signals as agreed; the written report confirms the eye movements, and the participant is not shown any of the physiological data until his report is given. He claims to have gained lucidity with the first eye movement (1, LRLR). He then flew about in his dream until 2, where he signalled that he thought he had awakened (2, LRx4). This was a false awakening, which he then realized, going on to signal lucidity once again at 3. Here he made too many eye signals and went on to correct this, which corroborates the participant’s report. The participant accurately signals in the fifth set of eye movements that he is awake, which coincides with the other physiological data indicating that the participant is awake (participants were instructed to keep their eyes closed when they thought they had awoken).

The important implication of LaBerge’s experiment for Malcolm’s and Dennett’s arguments is that they can no longer dismiss the results of content matching with eye movement as merely “apparent” or “occasional” (as they took the content-relative-to-eye-movement claim to be at the time they wrote). Predictability is a hallmark of good science, and these findings indicate that sleep science can achieve such status. The content-relativity thesis is confirmed by the findings because LaBerge’s result is exactly what that thesis predicts. The opposing thesis – that eye movement is not relative to content – cannot explain why the predicted result occurred. That is, if eye movement can match the content of the dream (and lucid dreaming really can occur, contra Dennett), then when sleep scientists ask participants to do something in their dream (upon onset of lucidity) that would show that they are aware, this is exactly the result we would expect.

The received view gains further confirmation because, in order to make sense of the communication as communication, one has to accept that the direction of eye movement was being voluntarily expressed. If one is convinced by LaBerge’s results, then the notion of communicating during sleep undercuts Malcolm’s privileging of the dream report and his claim that individuals could never communicate during sleep. We can empirically demonstrate, then, that the waking impression and the dream itself are logically separable. We need an account of why a sleeping individual would exhibit such systematic eye movements in LaBerge’s experiments. One reasonable conclusion is that the participant is mentally alert (has state consciousness) even though they are not awake (are not displaying creature consciousness). The most important implication of LaBerge’s study is that communication can arguably occur from within the dream without waking the individual up in any way. This supports the claim of the received view that we can be asleep and yet have a sequence of conscious experiences at the same time. The content of the dream occurs during sleep because the content is matched to the expected, systematic eye movement beyond coincidence; the systematic eye movement occurs exactly as predicted on the received view. Although Dennett could account for content matched to eye movement, he could not account for what seems like voluntary communication, which requires that an individual is conscious. This arguably leaves us in a position to treat dreaming experience as being as verifiable as any other waking state.

The “content relative to eye movement” thesis can be accepted uncontroversially. Much dream content really does occur during sleep, and this can be taken advantage of by participants communicating and thereby demonstrating conscious awareness. If LaBerge’s demonstration of participants communicating with sleep scientists convinces, then the received view is right to think that the content of dreams does occur during sleep and that the dream content during sleep matches the memory of dream content. The findings are significant insofar as the participants influenced the dream content, and thereby influenced eye movement, through an act of agency within the dream.

The most important, though also most controversial, potential implication of LaBerge’s findings is that communication can occur from within the dream without the individual waking up in any way, thereby confirming the received view. The content of the dream occurs during sleep, because the study confirms that eye movement during sleep really does match up with the content of the dream as reported after sleep. LaBerge chose agreed eye movements rather than speech because in this way the individual would remain asleep. The experiment also challenges Malcolm’s claim that dreams cannot be communicated and are therefore logically unverifiable.

There is a possible objection to the received view’s drawing on the results of LaBerge’s experiments, which can be raised in line with Dennett’s unconscious memory-loading process. The objector might argue that the unconscious takes care of the pending task of looking around in the dream in the pre-arranged manner. If, according to Dennett’s retro-selection model, the unconscious uploads memories from the day, then one of the memories it might upload could be of the participant discussing with LaBerge the specific eye movements to be made. So the unconscious might carry out the eye movements, misleading scientists into believing that the sleeping individual is conscious. But once we start to credit the unconscious with being able to negotiate with waking memories during sleep and to make judgements, we either have to change our picture of the unconscious or conclude that these individuals are consciously aware during sleep.

LaBerge carried out a further experiment in which the timing of dreams was measured from within the dream. The experiment again used lucid dreamers, who made the agreed signal, counted to 10 in their dream, and then made the eye signal again. They were also asked to estimate the passing of 10 seconds without counting (see fig. 5 below). LaBerge concluded from these experiments that experience in dreams happens at roughly the same rate as in waking life. This second experiment further blunts the Dennettian objection from unconsciously uploading the task, for it seems to require agency (rather than unconscious processing) to judge how much time is passing before carrying out an action.

Fig 5

There is still room for scepticism towards dreams being consciously experienced during sleep, for the sceptic could bite the bullet and say that lucid dreams are a special, anomalous case that does not apply to ordinary dreaming. Indeed, there is evidence that different parts of the brain are accessed, or more strongly activated, during lucid dreaming (prefrontal brain regions, the precuneus and frontopolar regions). So there remains the possibility that lucid dreaming is an example of consciously “waking up within a dream” whilst ordinary dreams are taken care of entirely by the unconscious (thus there is not anything-it’s-like to have an ordinary dream occur during sleep). It might be useful to look at one report of the memory of a lucid dream:

In a dangerous part of San Francisco, for some reason I start crawling on the sidewalk. I start to reflect: this is strange; why can’t I walk? Can other people walk upright here? Is it just me who has to crawl? I see a man in a suit walking under the streetlight. Now my curiosity is replaced by fear. I think, crawling around like this may be interesting but it is not safe. Then I think, I never do this – I always walk around San Francisco upright! This only happens in dreams. Finally, it dawns on me: I must be dreaming! (LaBerge & Rheingold, 1990: p. 35)

Notice that this is not just the memory-report of a lucid dream; rather, it is the memory-report of an ordinary dream turning into a lucid dream. The report demonstrates an epistemic transition within the dream. With this in mind, it is difficult to maintain that lucid dreams are anomalous, or that lucidity is what first brings about conscious experience, for there is a gradual process of realization. We might thus also want to accept that the preceding ordinary dream is conscious, because whenever individuals gain lucidity in their dreams there is a prior process of gradual realization present in the dream report. When an individual acquires lucidity in a dream, they are arguably already conscious; they simply begin to think more critically, as the above example demonstrates. Different parts of the brain are activated during lucidity, but these areas do not implicate consciousness itself; rather, they are better correlated with the individual thinking more critically.

The gradual transition from ordinary dream to lucid dream can be more fully emphasized. It is possible to question oneself in a dream whilst failing to provide the right answer: one might disbelieve that one is awake, ask another dream character to hit one, apparently feel pain, and so conclude that one is awake. It is important here to highlight the existence of partial-lucid dreams in order to show that dreams often involve irrationality and that there are fine-grained transitions to the fuller lucidity of critical thinking in dreams. A lucid dream, unlike ordinary dreams, is defined strictly in terms of the advanced epistemic status of the dreamer – the individual is having a lucid dream if they are aware that they are dreaming (Green, 1968: p. 15). In the weak sense, lucid dreaming is defined simply as awareness that one is dreaming; defined more strongly, lucid dreams involve controllability and a level of clarity akin to waking life. As stated, individuals can come close to realizing they are dreaming and miss out – they might dream about the nature of dreaming, or of lucid dreaming, without realizing that they are currently in a dream themselves. The norms of logical inference do not apply to ordinary dreams. One might look around in one’s dream and think “I must remember the people in this dream for my dream journal” without inferring that one is in a dream and thereby acquiring lucidity. In another dream, an individual might intend to tell another person in real life, who is featured in the dream, something they have learned just as soon as they wake up. Or the dreamer could be accused by dream character Y of carrying out an action A, and think “I must wake up to tell Y that I did not carry out this action.” Implicitly, the dreamer knows it is a dream – otherwise the belief that they need to wake up would make no sense – but they have not realized that the real Y will not share the beliefs of their dream representative.
The dream just described involves awareness of the dream state without controllability over the dream (for the dreamer still goes along with the content as though it were real). There is another type of common, partial-lucid dream in which people wake up from a dream, are able to return to it upon sleeping, and change the course of the dream. This type of dream seems to involve controllability without awareness – the dreamer treats the dream as real, not as an acknowledged dream, yet is able to exert much more control over the content than usual. Lucid dreamers used in experimental settings are much more experienced and have lucid dreams in the strongest sense: they are aware they are dreaming (and can maintain this awareness for a significant duration of time), have control over the content, and have a level of clarity of thinking akin to waking life.

There are a number of further implications of LaBerge’s findings for various philosophical claims. The existence of communication during lucid dreaming challenges Augustine’s idea that dreams are not actions. If we can attain during lucid dreaming the same level of agency we have in waking life, then it might be the case that even ordinary dreams carry some, albeit reduced, form of agency. We might want to believe this further claim if we accept that agency is not suddenly invoked during lucidity, but is rather enhanced. The findings also open up the possibility of testing the claim that somebody who is dreaming cannot tell that they are not awake, further undermining Descartes’ claim that dreaming and waking experiences are inherently indistinguishable. Previously, philosophers who wanted to resist the sceptical argument would say that someone who is dreaming might be unable to distinguish the state they were in, but not that someone awake is unable to distinguish – for example, Locke’s point that if we are in pain then we must be awake. Had Descartes been a lucid dreamer, then when he was seated by the fire his phrase might have come out as “I am now seated by the fire but I have also been deceived in dreams into believing I have been seated by the fire … though on occasion I have realized that I was just dreaming when apparently seated by the fire and so was not deceived at all!” It is surely a contingent fact that people rarely have lucid dreams. If people lucidly dreamt much more often, then Descartes’ sceptical dream argument would have had little to motivate it. Work on communicative lucid dreaming might also open up the possibility of testing for further phenomenally distinguishing features between the two states, via individuals communicating statements about those features.

ii. Alternative Explanations for “Precognitive” Dreams

Dennett cited an interesting type of dream report in which the ending of the dream is strongly implied by the stimulus of awakening. For example, a dream of getting married ends with the sound of church bells ringing, which coincides with the sound of the alarm clock. These seemingly precognitive dreams are sometimes referred to in the literature as “the anecdotes” because of their generally non-experimental form. Though they are remote from scientific investigation, the mere existence of the anecdotes causes trouble for the received view and requires explanation; they provide extra evidence for Dennett’s proposed paradigm shift in our understanding of dreaming. On the other hand, if there is enough evidence to claim that dreams are consciously experienced during sleep, then the anecdotal data will not be a powerful enough counterexample and will not warrant a paradigm shift in our thinking about dreams. The biggest challenge the anecdotes represent is the suggestion that, on occasion, the memory can significantly deviate from the actual experience. On this view, false memories overriding the actual content of the dream do occur, but such experiences are the exception rather than the rule.

Though LaBerge’s experiments suggest that the content of dreams consciously occurs during sleep, the findings on their own are insufficient to draw the conclusion that all dreams occur during sleep; the existence of the anecdotes blocks one from drawing that conclusion. But these anecdotes can be explained on the received view. It is already known that the human species has specific bodily rhythms for sleep. Further, it is a noted phenomenon that people often wake up at the same time even when their alarm clocks are off. Dennett himself says that he had got out his old alarm clock, which he had not used in months, and set the alarm himself (Dennett, 1976: p. 157) – presumably for around the time he usually gets up anyway. If the subconscious can be credited with either creating a dream world (on the received view) or a dream memory (on the Dennettian view), and the personal body clock works with some degree of automaticity during sleep, one may well ask why the dream’s anticipation (and symbolic representation) of the alarm need be precognitive in a paranormal sense. Had Dennett woken up earlier, he might have lain in bed realizing that his alarm clock was about to go off, which is not considered an act of precognition. Had he thought this during sleep, the received view would expect it to be covered symbolically via associative imagery. Thus, perhaps Dennett is not being impartial in his treatment of dreams, and his argument begs the question, since he deems the received view’s version of dreaming inferior to his own theory only by assuming that thought in dreaming is completely oblivious to the banalities of the future.

Arguably, the other anecdotes can be explained away. Recall the most famous anecdote, in which Maury was dragged to the guillotine and the headboard fell on his neck, waking him up (Freud discusses the case of Maury in the first chapter of his Interpretation of Dreams). Maury may have had some ongoing, subconscious awareness of the wobbliness of his headboard before it fell – an awareness left out of the overall account. Presumably the headboard did not fall without movement on his part, possibly in resistance to being beheaded; Maury’s could thus be a type of dreaming involving self-fulfilling prophecy. Dennett’s account agrees that any unconscious awareness of the outside environment is represented differently in the dream. Maury would have had enough time to form the simple association of his headboard with a guillotine from the French Revolution. If he had some awareness of the looseness of his headboard, then the thought “if this headboard falls on me it will be like being beheaded” could be appropriately synchronized into the dream.

Though it is possible to provide alternative explanations for some of the anecdotes, it is worth dividing Dennett’s anecdotes into hard and soft anecdotes. A hard anecdote is outlined by Dennett: a car backfires outside and an individual wakes up with the memory of a dream logically leading up to being shot. The soft anecdotes – those already mentioned, such as Maury’s and Dennett’s own examples – can at least be alternatively explained. The hard anecdotes, on the other hand, cannot simply be explained by appeal to body clocks and anticipation during sleep, for they reveal that the dreamer can have no idea what will wake them up, if anything (maybe the alarm will not actually go off; the car backfiring could occur at any point). The soft anecdotes might involve outside stimuli being unconsciously incorporated into the dream. Notably, the only hard anecdote mentioned in Dennett’s paper is made up by Dennett himself to outline his theory, rather than representing a genuine example of a dream. Notice, too, that a hard anecdote could favour the received view: if experimenters switched off an alarm that a participant had set, and that individual woke with a dream which seemed to anticipate the alarm going off at the set time (for example, had Dennett’s goat dream occurred despite his alarm not going off), then we might do best to conclude that the dream consciously occurred during sleep in a sequential order leading up to the awakening. Dennett is right that the issue is worth empirically investigating. If a survey found that hard anecdote-like dreams (such as waking to a car backfiring with a thematically similar dream) occur often, then the received view is either disconfirmed or must find a way to make room for such dreams.

Dennett might reply that he does not need to wait for the empirical evidence to reveal whether the hard anecdotes are common. He might simply argue that the alternative explanations of the soft anecdotes are not credible. Yet even if those explanations are not credible (as a Dennettian might object), they at least draw attention to the extraneous variables in each of the anecdotes. Dennett’s sole reason for preferring his retro-selection theory over the received view is that it can account for the anecdotal data. But Dennett uses his own alarm clock, which he himself set the night before; Maury uses his own bed; and Dennett of course admits that the evidence is anecdotal and not experimental. In one of the few actual experiments carried out, water was slowly dripped onto a sleeping participant’s back to wake them up, but this was not timed and is akin to a soft anecdote. More empirical work needs to be done to clarify the issue and for the debate to move forward.

The anecdotes are a specific subclass of prophetic dreams, and it is worth noting a related issue about dreams which are alleged to have a more strongly prophetic nature. From the subjective point of view, a dream about a plane crash the night before (or the morning of) September 11th, 2001 would have seemed premonitory. Or one might dream of a relative or celebrity dying and then wake to find that this has actually happened in real life. But the probability of having a dream loosely (or even exactly) about an unforeseen future event is increased by the fact that a single individual has many dreams over a lifetime – a lifetime to generate one or two coincidences. The dreams which are not premonitory are under-reported and the apparently prophetic dreams are over-reported. Big events are witnessed and acknowledged by large numbers of people, which increases the probability that someone will have a dream with similar content (Dawkins, 1998: p. 158). No doubt when the Twin Towers were attacked some people just so happened to have dreams about plane crashes, and a few might even have had dreams more or less related to planes crashing into towers, or even into the Twin Towers. Billions of dreams would have been generated the night before, as they are every night around the world, and so some would invariably have “hit the mark.” Dennett’s anecdotes are somewhat different, but they may have the same problem of being over-reported in this way. At the same time, they may also suffer from being under-reported, because they are not always obviously related to the stimulus of awakening, particularly where someone pays attention to the content of the dream and forgets how they awakened. Hence the extent to which we experience anecdote-like dreams is currently uncertain, though they can be explained on the received view.
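This over-reporting argument is, at bottom, a point about base rates, and a back-of-envelope calculation makes it vivid. The per-night probability and the population size below are illustrative assumptions invented for the sketch, not figures from the dreaming literature:

```python
# Illustrative base-rate sketch: even a tiny nightly chance of a coincidental
# "prophetic" match compounds over a lifetime and across a population.
# (All numbers are assumptions for illustration.)

def prob_at_least_one(p_per_night: float, nights: int) -> float:
    """P(at least one coincidental match) = 1 - (1 - p)^n."""
    return 1 - (1 - p_per_night) ** nights

p = 1e-4                     # assumed chance a night's dream loosely matches the next day's events
lifetime_nights = 365 * 70   # roughly seventy years of dreaming

lifetime = prob_at_least_one(p, lifetime_nights)
print(f"Lifetime chance of at least one 'prophetic' dream: {lifetime:.1%}")

# For a big public event, with millions of people dreaming the same night,
# the *expected* number of coincidental matches is large.
dreamers = 10_000_000
print(f"Expected coincidental matches in one night: {dreamers * p:,.0f}")
```

Under these assumed numbers, most people would have at least one apparently prophetic dream in a lifetime, and thousands of people would have one on any given night, exactly as the Dawkins-style argument in the text predicts.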

4. The Function of Dreaming

The function of dreaming – exactly why we dream and what purpose it could fulfil in helping us to survive and reproduce – is explained in evolutionary terms of natural selection. Natural selection is best understood as operating through three principles: Variation, Heredity and Selection. A group of living creatures within a species vary from one another in their traits, their traits are passed on to their own offspring, and there is competition for survival and reproduction amongst all of these creatures. The result is that those creatures with traits better suited to surviving and reproducing will be precisely the creatures that survive, reproduce and pass those successful traits on. Most traits of any organism are implicated in helping it to survive and thereby typically serve some purpose. Although the question of what dreaming might do for us remains a mystery to this day, there has never been a shortage of proposed theories. The first sustained attempt to account for why we dream comes from Freud (1900), whose theory was countered by his friend and later adversary Carl Jung. An outline of these two early approaches is followed by a leading theory in the philosophical and neurobiological literature: that dreaming is an evolutionary by-product. On this view, dreaming has no function of its own but comes as a side effect of other useful traits, namely cognition and sleep. A contemporary theory opposing this view holds, by contrast, that dreaming is a highly advantageous state in which the content of the dream aids an organism in later survival-enhancing waking behaviour by rehearsing the perception and avoidance of threat.
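
The interplay of the three principles can be illustrated with a toy simulation. This is only a sketch; the population size, survival odds and mutation size are arbitrary choices, not empirical figures.

```python
import random

# Toy model of natural selection: Variation, Heredity, Selection.
# All numbers here are arbitrary, chosen purely for illustration.
random.seed(0)

POP = 200
population = [random.uniform(0.0, 1.0) for _ in range(POP)]  # Variation in a trait

for generation in range(30):
    # Selection: a higher trait value means a better chance of surviving to breed.
    survivors = [t for t in population if random.random() < 0.2 + 0.8 * t]
    # Heredity: offspring inherit a parent's trait value, with slight variation.
    population = [
        min(1.0, max(0.0, random.choice(survivors) + random.gauss(0, 0.02)))
        for _ in range(POP)
    ]

mean_trait = sum(population) / POP
print(f"Mean trait after 30 generations: {mean_trait:.2f}")  # rises well above the initial ~0.5
```

Because individuals vary, the trait is heritable, and the trait biases survival, the population mean drifts upward generation by generation – the same schema the functional theories below apply to dreaming.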

a. Early Approaches

i. Freud: Psychoanalysis

The psychoanalytic approach, initiated by Freud, gives high esteem to dreams as a key source of insight into, and the main method of opening up, the unconscious (Freud, 1900: §VII, E, p.381). Psychoanalysis is a type of therapy aimed at helping people overcome mental problems, and therapy often involves in-depth analyses of patients’ dreams. In order to analyse dreams, though, Freudian psychoanalysis is committed to the assumption that dreams fulfil a certain function, and Freud explicitly put forward a theory of what that function is. He believed that dreams have not been culturally generated by humans, as Malcolm thought, but are rather a mental activity that our ancestors also experienced. During sleep, the mind is disconnected from the external world but remains instinctual. The psychoanalytic take on dreaming is better understood in terms of Freud’s overall picture of the mind, which he split into id, ego and super-ego. The id is an entirely unconscious part of the mind, something we cannot gain control of but can only systematically suppress. It is present at birth, does not understand the differences between opposites and seeks to satisfy its continually generated libidinal instinctual impulses (Storr, 1989: p.61). The id precedes any sense of self, is chaotic, aggressive and unorganised, produces no collective will and is incapable of making value judgements. As the child develops and grows up, his instinctual needs as a baby are repressed and covered up (Storr, 1989: p.63). If humans were guided only by the id they would behave like babies, trying to fulfil their every need instantly, without being able to wait; the id has no conception of time. It is the id that will get an individual into social trouble. The super-ego is the opposing counterbalance to the id, containing all of our social norms such as morality.
Though we grow up and develop egos and super-egos, the id constantly generates new desires that pressure us in our overall psychologies and social relations. The work of repression is constant for as long as we are alive. The super-ego is mostly unconscious, and in the case of dreams it is the super-ego that carries out the work of censorship. The ego was initially conceived by Freud as our sense of self; he later thought of it more in terms of planning, delayed gratification and other types of thinking that have developed late in evolution. The ego is mostly conscious and has to balance the struggle between the id and the super-ego while also navigating the external and internal worlds. This means that we experience a continual power struggle between the super-ego and the id. The ego has to meet a quota of fulfilling some of the id’s desires, but only where this will marginally affect the individual.

Freud’s first topographical division of the mind consisted of the conscious, pre-conscious and unconscious (the pre-conscious is that which is not quite conscious but could quite easily be brought into conscious awareness). His second division of the mind into id, ego and super-ego explains how consciousness can become complicated when the two topographies are superimposed upon one another. Mental content from the unconscious constantly struggles to become conscious. Dreams are an example of unconscious content coming up to the conscious and being partially confronted in awareness, although the content is distorted. Dreaming is an opportunity for the desires of the id to be satisfied without causing the individual too much trouble. We cannot experience the desires of the id in naked form, for they would be too disturbing. The super-ego finds various ways of censoring a dream and fulfilling the desires latently. Our conscious attention is effectively misdirected. What we recall of dreams is further distorted as repression and censorship continues into attempted recollections of the dream. The resulting dream is always a compromise between disguise and the direct expression of the id’s desires. The censorship is carried out in various ways (Hopkins, 1991: pp. 112-113). Images which are associated with each other are condensed (two people might become one, characters may morph based on fleeting similarities) and emotions are displaced onto other objects. Dreams are woven together in a story-like element to further absorb the dreamer in the manifest content.

It is better that these wishes come disguised in apparently nonsensical stories in order to stop the dreamer from awakening in horror (Flanagan: 2000, p.43). The content of the dream, even with some censorship in place, still might shock an individual on waking reflection and is therefore further distorted by the time it reaches memory, for the censor is still at work. Malcolm positively cited a psychoanalyst who had claimed that the psychoanalyst is really interested in what the patient thought the dream was about (that is, the memory of the dream) rather than the actual experience (Malcolm, 1959: pp. 121-123, Appendix).

Though Freud was not an evolutionary biologist, his theory of dreams can easily be recast in evolutionary terms of natural selection. Freud thought that we dream in the way we do because it aids individuals in surviving, and so the trait is passed on and positively selected. Individuals who dreamt in a significantly different way would not have survived and reproduced, and their way of dreaming would have died with them. But how does dreaming help individuals to survive? Freud postulates that dreaming simultaneously serves two functions. The primary function at the psychological level is wish fulfilment (Storr, 1989: p.44). During the day humans have far more desires than they could possibly satisfy – the desires continually generated by the id. Desire often provides the impetus for action, and if acted upon, some of these desires might get the individual killed or into social trouble, such as isolation or ostracism, potentially resulting in not reproducing. Since wish fulfilment during sleep is a mechanism that could keep these desires in check, the mechanism would be selected for. Freud claims that, having had the multifarious desires of the id fulfilled in sleep, the individual can keep them suppressed during the following days. The individual no longer needs to carry the action out in waking life, thereby potentially avoiding being killed or having fitness levels severely reduced. This provides a reason why we would need to sleep with conscious imagery. Freud only applied his theory to humans, but it could be extrapolated to other animals that also have desires – for example, mammals and other vertebrates – though the desires of most members of the animal kingdom will surely be less complex than those of humans.

The primary function at the physiological level is to keep an individual asleep while satisfying unconscious desires (Freud, 1900: §V, C, p.234). It is also clear that keeping an individual asleep, and stopping him or her from actually carrying out the desires during sleep, is beneficial to survival. So though the individual has their desires satisfied during sleep, this is done in a disguised manner. Here, Freud separates the dream into manifest and latent content. The manifest content is the actual content that is experienced and recalled at the surface level (a dream of travelling on a train as it goes through a tunnel). The latent content is the underlying desire that is being fulfilled by the manifest content (a desire for sexual intercourse with somebody that will land the dreaming individual in trouble). The individual is kept asleep by the unconscious disguising the wishes.

What about dreams where the manifest content seems emotionally painful or distressing – the anxiety dream being a good example – rather than wish-fulfilling? How can our desires be satisfied when our experiences do not seem to involve what we wanted? Freud suggests that these can be cases where the dream fails to properly disguise the content; indeed, such dreams usually wake the individual up. Freud was aware of an early study suggesting that dream content is biased towards the negative (Freud, 1900: pp. 193-194) – a finding which has subsequently been confirmed. Freud’s distinction between manifest and latent content explains away this objection. The underlying, latent content carries the wish, and dreams are distorted by a psychological censor because, if they are too disturbing, they will wake the individual up and the desire will remain unsuppressed. Perhaps dreams operate with an emotion which is the opposite of gratification precisely to distract the sleeping individual from the true nature of the desire. Wish fulfilment might be carried out in the form of an anxiety dream where the desire is especially disturbing if realized. Other anxiety dreams can actually fulfil wishes through displacement of the emotions. Freud uses an example of one of his own dreams (the dream of Irma’s injection) in which a patient is infected, but which fulfils the wish of alleviating the guilt he felt at being unable to cure her. The dream allowed the blame to be lifted away from himself and projected onto a fellow doctor who was responsible for the injection. Despite the displeasure of the dream, the wish for the alleviation of guilt was fulfilled.

The dream for Freudians relies on a distinction between indicative and imperative content. An indicative state of mind is where my representational systems show the world to be a certain way. The exemplar instance of indicative representations is belief. I might accurately perceive and believe that it is raining and I thus have an indicative representational mental state. Imperative states of mind, on the other hand, are ones in which I desire the world to be a certain way – a way that is different to the way it currently is. I might desire that it will snow. A dream is an instance of an indicative representation replacing an imperative one in order to suppress a desire. We see something in a dream and believe it is there in front of us. This is what we want and what we desired, so once we believe we have it, the desire is vanquished, making it unlikely we will try to satisfy the desire in waking life.

ii. Jung: Analytic Psychology

Trying to provide even a simple exposition of Freud’s or Jung’s theories of dreams will not please all scholars and practitioners who adhere to the thinkers’ traditions. Drawing out the differences or similarities between their views is an exegetical task, and the resulting statements about their theories are thus always debatable. Many working in the tradition of psychoanalysis or analytical psychology opt for a synthesis of the two views. At the risk of caricaturing Jung’s theory of dreams, the differences between the two views will be emphasized here in order to contrast Jung’s account with Freud’s.

Like Freud, Jung also believed that dream analysis is a major way of gaining knowledge about the unconscious – a deep facet of the mind that has been present in our ancestors throughout evolutionary history. But Jung’s evolutionary story of dreaming differs from Freud’s. Whereas Freud understood dreams as using memory from the days preceding the dream (particularly the “day residue” of the day immediately leading up to it) and earlier childhood experiences, Jung thought the dream also worked with more distant material: the collective unconscious. The collective unconscious is where the ancestral memories of the species are stored, and it is common to all people. Whereas some philosophers, such as Locke, had believed the mind to be a blank slate, Jung believed that the collective unconscious underlies the psychology of all humans and is even identical across some different species. The collective unconscious is especially pronounced in dreaming, where universal symbols, known as the archetypes, are processed. The distinction between signs and symbols is an important one (Jung, 1968: p.3). Signs refer to what is already known, whereas symbols contain a multiplicity of meanings (Mathers, 2001: p.116) – more, indeed, than can ever be captured, thereby always leaving an unknowable aspect, always requiring further work to think about the dream, and hence hinting at future perspectives one may take toward oneself and one’s dreams (Mathers, 2001: p.116).

Jung saw dreaming as playing an integral role in the overall architecture of the mind and as conducive to surviving in the world, for during the process of dreaming the day’s experiences are connected to previous experiences and to our whole personal history. Whereas Freud thought that the adaptive advantage of dreams was to distract us and thereby keep us asleep, Jung thought the reverse: we need to sleep in order to dream, and dreaming serves multiple functions. What does dreaming do for us, according to Jung? Dreams compensate for imbalances in the conscious attitudes of the dreamer. Psychology depends upon the interplay of opposites (Jung, 1968: p.47). By dreaming of possibilities opposite to waking life – such as a logical individual (with a strong thinking function) having dreams which are much more feeling-based – the balance is restored (Stevens, 1994: p.106). “Dreams perform some homeostatic or self-regulatory function … they obey the biological imperative of adaptation in the interests of personal adjustment, growth, and survival” (Stevens, 1994: p.104). They play a role in keeping the individual appropriately adapted to their social setting. Dreaming also carries out a more general type of compensation concerning the psychology of gender. The psyche is essentially androgynous, and so dreams provide an opportunity to balance the overall self with those aspects of the opposite gender that make up part of an individual’s personality and mind. Hence dreams are not the mere “guardians of sleep” as Freud thought, but rather a necessary part of maintaining psychological well-being and development. When the conscious is not put in touch with the unconscious, homeostasis is lost and psychological disturbance will result (Jung, 1968: p.37). Dreams serve up unconscious content that has been repressed, ignored or undervalued in waking life.
Although he emphasized an objective element to dreaming (that the unconscious often makes use of universal and culturally shared symbols), Jung was opposed to the possibility of a fixed dream dictionary because the meaning of symbols will change depending on the dreamer and over time as they associate images with different meanings.

Jung agrees with Freud that there is a wealth of symbols and allegorical imagery that can stand in for the sexual act, from breaking down a door to placing a sword in a sheath. In the course of his analysis, Freud would gradually move from the manifest content to the latent content, though he did encourage the dreamer to give their own interpretations through free association. Jung believed that the unconscious choice of symbol itself is just as important and can tell us something about that individual (Jung, 1968: p.14). Alternatively, apparent phallic symbols might symbolize other notions – a key in a lock might symbolize hope or security, rather than anything sexual. The dream imagery, what Freud called the manifest content, is what will reveal the meaning of the dream. Where Freud had asked “What is the dream trying to hide?”, Jung asks “What is the dream trying to express or communicate?”, because dreams occur without a censor: they are “undistorted and purposeful” (Whitmont & Perera, 1989: p.1).

Another function of dreaming that distinguishes Jung’s account from Freud’s is that dreams provide images of the possibilities the future may have in store for the sleeping individual. Not that dreams are precognitive in a paranormal sense, Jung emphasized, but the unconscious clearly does entertain counterfactual situations during sleep. The possibilities entertained are usually general aspects of human character that are common to us all (Johnson, 2009: p.46). Dreams sometimes warn us of dangerous upcoming events, but all the same, they do not always do so (Jung, 1968: p.36). The connection of present events to past experience is where dreams are especially functional (Mathers, 2001: p.126). Freud’s and Jung’s theories clearly overlap here. But whereas Freud assessed dreams in terms of looking back into the past, especially the antagonisms of childhood experience, dreaming for Jung importantly also lays out the possibilities of the future. This clearly has survival value, since it is in the future that the individual will eventually develop and try to survive and reproduce. Dreaming for Jung is like “dreaming” in the other sense – that of aspiring, wishing and hoping. Dreams point towards our future development and individuation. The notion of personal development brought about by dreaming might depend upon regularly recalling a dream, which is something that Freud’s claim of wish fulfilment need not be committed to, and it is not clear how this deals with animals’ dreaming.

Dreams are a special instance of the collective unconscious at work, where we can trace much more ancient symbolism. The collective unconscious gives rise to the archetypes: “the universal patterns or tendencies in the human unconscious that find their way into our individual psyches and form us. They are actually the psychological building blocks of energy that combine together to create the individual psyche” (Johnson, 2009: p.46). Dreams demonstrate how the unconscious processes thought in “the language of symbolism” (Johnson, 2009: p.4). The most important function of dreams is teaching us how to think symbolically and deal with communication from the unconscious. Dreams always have a multiplicity of meanings; they can be re-interpreted and new meanings discovered. Jung believed that “meaning making enhances survival” (Mathers, 2001: p.117). According to Flanagan, it is much harder to construe Jung’s theory of dreams than Freud’s as adhering to the basic tenets of evolutionary biology, for two fundamental reasons. Firstly, why would the collective unconscious’s expression of symbols relevant to the species bring any gains in terms of reproductive success? Secondly, Jung’s theory is more often interpreted in Lamarckian, not Darwinian, terms, which leaves it out in the cold as regards accepted evolutionary biology (Flanagan, 2000: p.44). According to Lamarck’s conception of evolution, traits developed during a lifetime can subsequently be passed on to the next generation. Hence Lamarck believed that the giraffe got its long neck because one stretched its neck during its lifetime to reach high-up leaves, making the neck longer – a trait which was then passed on to its offspring. According to the widely favoured Darwinian view of evolution, those giraffes that just so happened to have longer necks would have had access to the source of food over those with shorter necks, and so would have been selected.
Jung’s theory is interpreted as Lamarckian rather than Darwinian because symbols learned during an individual’s particular historical period can be genetically encoded and passed on (Flanagan, 2000: p.64). Perhaps one could reply in Jungian-Darwinian terms that those individuals who just so happened to be born with a brain receptive to certain symbols having certain meanings (rather than having to learn them) survived over those individuals who were not. But then Flanagan’s first reason remains as an objection – what advantage would this have in terms of surviving? Others have argued that the collective unconscious is actually much more scientifically credible than critics have taken it to be. On this line of argument, the collective unconscious essentially coincides with current views about innate behaviours in fields such as socio-biology and ethology (Stevens, 1994: p.51). Appropriate environmental (or cognitive) stimuli trigger inherited patterns of behaviour or thought, such as hunting, fighting and mothering. In humans, there are, for example, the universal expressions of emotion (anger, disgust, fear, happiness, sadness, surprise). Humans and other mammals have an inbuilt fear of snakes that we do not need to learn. These patterns can be found in dreams. The dream images are generated by homologous neural structures that are shared amongst animals, not merely passed on to the next generation as images. The Jungian can dodge the accusation of Lamarckism by arguing that dreams involve inherited patterns of behaviour based on epigenetic rules (the genetically inherited rules upon which development then proceeds for an individual) and by arguing that epigenetics need not endorse Lamarckism (epigenetics is a hot topic in the philosophy of biology; for a cogent introduction, readers can consult Jablonka & Lamb, 2005).
Alternatively, in light of epigenetics, the Jungian can defend a Lamarckist view of evolution – a re-emerging but still controversial position.
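
The contrast between the two modes of inheritance can be sketched as a toy simulation. All the numbers here (neck lengths, population size, generations) are invented for illustration: under the Lamarckian rule an individual’s acquired stretching is inherited directly, while under the Darwinian rule nothing acquired is inherited and the mean only shifts because longer-necked individuals out-reproduce shorter-necked ones.

```python
import random
from statistics import mean

# Toy contrast between Lamarckian and Darwinian accounts of the giraffe's neck.
# All numbers are invented purely for illustration.
random.seed(1)

def lamarckian_step(necks, stretch=0.05):
    """Acquired change: each giraffe stretches its neck during its lifetime
    and passes the lengthened neck directly to its offspring."""
    return [n + stretch for n in necks]

def darwinian_step(necks):
    """Selection on inborn variation: no individual's neck changes, but
    longer-necked giraffes are likelier to survive and reproduce."""
    survivors = sorted(necks)[len(necks) // 2:]   # the longer-necked half reach the leaves
    return [random.choice(survivors) + random.gauss(0, 0.01) for _ in necks]

start = [random.uniform(1.5, 2.0) for _ in range(100)]  # neck lengths in metres
lam, dar = list(start), list(start)
for _ in range(20):
    lam = lamarckian_step(lam)
    dar = darwinian_step(dar)

print(f"start {mean(start):.2f} m | Lamarckian {mean(lam):.2f} m | Darwinian {mean(dar):.2f} m")
```

Both rules lengthen the average neck, which is precisely why the two accounts can be hard to tell apart from outcomes alone; they differ in the mechanism, and it is the mechanism that the Lamarckism objection to Jung targets.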

To rehearse the differences: Freud believed that dreams are the result of a power struggle between the id’s constantly generated desires and the super-ego’s censorship of the exact nature of those desires; various techniques are employed by the censor, including emotional displacement, weaving together a narrative for the dreamer to attend to, and reworking the latent desires into the manifest content we experience and eventually, occasionally, recollect; dreams are a distraction to keep us asleep; and dreaming is a mechanism that has been selected to provide social stability amongst individuals. Dreams, for Freud, essentially deal with one’s relationship to one’s past. Jung, by contrast, thought dreams point towards the future development of the individual. The experiences which process symbols shared amongst the species are a form of compensation that keeps the individual in psychological homeostasis. Dreams do not especially deal with sexuality, but represent a more general attempt by the individual to understand themselves and the world they exist in.

b. Contemporary Approaches

i. Pluralism

Flanagan represents a nuanced position on which dreaming has no fitness-enhancing effects on an organism that dreams, but neither does it detract from that organism’s fitness. Dreams are by-products of sleep. The notion of an evolutionary by-product, or spandrel (see figures 6 and 7), was first introduced by the evolutionary Pluralists Gould and Lewontin in “The Spandrels of San Marco and the Panglossian Paradigm”, where the authors borrowed the term from the field of architecture. Evolutionary Pluralism claims that traits in the natural world are not always the result of natural selection but can arise for a plurality of other reasons. The debate between Adaptationists and Pluralists centres on the pervasiveness of natural selection in shaping traits. Pluralists look for factors other than natural selection in the shaping of a trait, such as genetic drift and structural constraints on development. One important example is the spandrel: some traits might be necessary by-products of the design of the overall organism.

 

Fig 6: An architectural example of spandrels – roughly triangular spaces between arches.

Fig 7: A second architectural example of spandrels.
The two separate architectural examples, figures 6 and 7, display spandrels as roughly triangular spaces. The point Gould and Lewontin make by introducing the notion of a spandrel is that some aspects of the design of an object inevitably come as side effects. Perhaps the architect only wanted archways and a dome; in producing such a work of architecture, spandrels cannot be avoided. If architecture can tell us something about metaphysical truths, then might dreams be such spandrels, lodged in between thought and sleep? Flanagan, an evolutionary Pluralist, argues that there is no fitness-enhancing function of dreaming. Though, as a matter of structure, we cannot avoid dreaming, for dreams sit between the functioning of the mind and sleep, like spandrels between two architecturally desired pillars in an archway. Hence Flanagan’s argument is that, in evolutionary biological terms, dreaming is not for anything – it just comes as a side effect of the general architecture of the mind. He states more strongly that “so long as a spandrel does not come to detract from fitness, it can sit there forever as a side effect or free rider without acquiring any use whatsoever” (Flanagan, 2000: p.108). Dreaming serves the function neither of wish fulfilment nor of psychological homeostasis. Flanagan emphasizes the disorganized nature of dreaming, something to which any individual’s experience of dreams clearly attests. To be sure, wish fulfilment and apparent psychological compensation sometimes appear in dreams, but this is because we are just thinking during sleep, and so a myriad of human cognition will take place. Another reason put forward by Flanagan for the view that dreams are spandrels is that, unlike for sleep, there is nothing close to an ideal adaptation explanation for dreaming (Flanagan, 2000: p.113).
The ideal explanation lays out what evidence there is that selection has operated on the trait, gives proof that the trait is heritable, provides an explanation of why some types of the trait are better adapted than others in different environments and amongst other species and also offers evidence of how later versions of the trait are derived from earlier ones. The spandrel thesis about dreaming is more plausible, then, because trying to argue that dreams have function ends up as nothing more than a “just-so” story, the result of speculation and guess work that cannot meet the standards of an ideal adaptation explanation.

ii. Adaptationism

Antti Revonsuo stands in opposition to Flanagan by arguing that dreaming is an adaptation. He also stands in opposition to Freud and Jung, in that for him the function of dreams is neither to deliver wish fulfilment whilst keeping the individual asleep nor to connect an individual to the symbolism of the collective unconscious. Revonsuo strives to deliver an account that meets the stringent criteria of a scientific explanation of dreaming as an adaptation. Though there have been many attempts to explain dreaming as an adaptation, few come close to the ideal adaptation explanation, and those functional theories favoured by neurocognitive scientists (for example, that dreams consolidate memories or discard useless information) cannot clearly distinguish their theory of the function of dreams from the function of sleep – that is, from the spandrel thesis. The Threat Simulation Theory can be clearly so distinguished. According to Revonsuo, the actual content of dreams is helpful to the survival of an organism because dreaming enhances waking-life behaviours such as perceiving and avoiding threat. Revonsuo’s Threat Simulation Theory presents dreams as specializing in the recreation of life-like threatening scenarios. His six claims are as follows (Revonsuo, 2000):

Claim 1: Dream experience is more an organized state of mind than a disorganized one;

Claim 2: Dreaming is tailored to and biased toward simulating threatening events of all kinds found in waking life;

Claim 3: Genuine threats experienced in waking life have a profound effect on subsequent dreaming;

Claim 4: Dreams provide realistic simulacra of waking-life threatening scenarios and of waking consciousness generally;

Claim 5: Simulation of perceptual and motor activities leads to enhanced performance even when the rehearsal is not recalled later;

Claim 6: Dreaming has been selected for.

The Threat Simulation Theory is committed to a certain conception of dreams as a realistic and organized state of consciousness, as implied by Claim 1. This claim motivates the challenge to any spandrel thesis of dreams by asking why dreams would show the level of organization that they do – namely, the construction and engagement of virtual realities – if they were just the “mental noise” a spandrel thesis might imply or be committed to. Dreaming is allegedly similar to waking life and is indeed experienced as waking life, at least at the time of the dream (Valli & Revonsuo, 2009: p.18). These features of dreams are essential in setting in motion the same sort of reaction to threat that will occur during waking life and aid survival. Hence the dream self should be able to react to the dream situation with reasonable courses of action to combat the perceived threat, in ways that would also be appropriate in real life. Revonsuo appeals to phenomenological data in which a significant proportion of dreams involve situations where the dreamer comes under attack. We do indeed generally have more negative emotions during an REM-related dream. This claim is well supported by experiential examples of dreaming, where anxiety is the most frequent emotion in dreaming, joy/elation second and anger third (Hobson, 1994: p.157), making two thirds of dream emotion negative. That dreams process negative emotions likely occurs because the amygdala is highly activated. The amygdala is also the key brain structure implicated in the “fight or flight” sympathetic nervous system response to especially intense, life-threatening situations. During waking hours this part of the brain handles unpleasant emotions like anxiety, intense fear or anger. All of this is well explained by the Threat Simulation Theory.
We experience more threats in dreams (and especially demanding ones) than in waking life because the simulation is selected to be especially difficult, leaving the individual bestowed with a surplus of successful threat-avoidance strategies, coping skills and abilities to anticipate, detect and out-manoeuvre the subtleties of certain threats.

Though Revonsuo claims that dreams specialize in the full panoply of threatening scenarios that affect overall survival, the most obvious example is the “fight or flight” response. All instances of stress in waking life occur when an individual feels threatened, and this will feed back into the system of acknowledging what is dangerous and what is not. In waking life, the fight or flight response essentially involves making a snap decision in a life-or-death situation to fight a predatory enemy or flee from the scene. The activation of the sympathetic nervous system – the stress response implicated in the fight or flight response – is an involuntary, unconsciously initiated process. One might object to Claims 1 and 4 that many dreams simply do not seem to be realistic representations of threatening scenarios such as those requiring the fight or flight response. One study found, for example, that many recurrent dreams are unrealistic (Zadra et al, 2006). Revonsuo has some room for manoeuvre here, for he believes that dreams may no longer be adaptive owing to dramatic environmental changes: the possible adaptation of human dream consciousness took place when humans were hunter-gatherers in the Pleistocene environment, over a period of hundreds of thousands of years, so dream content may no longer seem to have the same function, given that we live in a radically different environment. Dreams are then comparable to the human appendix – useful and adaptive in times when our diet was radically different, but now an essentially redundant and, occasionally, maladaptive vestigial trait.

Dreams may also have become more lax in representing threatening content in the Western world, since life-and-death situations are nowhere near as common as in the evolutionary past. This is because a lack of exposure to threat in waking life will not activate the threat simulation system of dreaming as it did in earlier times. Revonsuo uses evidence of ordinary dream reports from the population, but he also cites cases of psychopathology such as the dreams of individuals with post-traumatic stress disorder (PTSD), where, crucially, traumatic and threatening experiences dramatically affect what they dream about. Thus the relationship between dreaming and experiencing threats in waking life is bi-directional: dreams try to anticipate the possible threats of waking life and to improve the speed of perceiving, and ways of reacting to, them. At the same time, any threats actually experienced in waking life will alter the course of later dreaming to re-simulate those perceived threats. This feedback element dovetails with Claim 4, for an accurate picture is built up as information is gathered from the real world.

Perhaps the updating element of the Threat Simulation Theory (Claim 4 – that individuals learn what is threatening and this gets passed on to offspring) suffers from the same accusation of Lamarckism that faces Jung’s theory. Revonsuo believes that dreaming during sleep allows an individual repetitively to rehearse the neurocognitive mechanisms that are indispensable to waking-life threat perception and avoidance. Flanagan asks why behaviour that is instinctual would need to be repetitively rehearsed; Revonsuo’s answer amounts to an explanation of how instinct is actually preserved in animals. Not all of our dreams are threatening, but this fact arguably helps the Threat Simulation Theory by showing that there was variation in the trait and that the threatening type came to dominate. The neural mechanisms, or hardware, underlying the ability to dream were transferred genetically and became common in the population. Individuals with an insufficient number of threatening dreams were not able to survive to pass on the trait, because they were left ill-prepared for the trials and tribulations of the real world’s evolutionary environment. One might still object that survival – the avoidance of threat – is only one half of propagating a trait. Individuals also have to reproduce for the trait to be passed on, so dreams ought also to specialize in enhancing behaviours that help individuals to find mates. Animals and humans have many and varied courting rituals requiring complicated behaviours. If dreaming can, and does, make any difference to behaviour in waking life, as Revonsuo must claim, then why do mate-selection behaviours not make up a larger proportion of dream content? In humans, no more than 6% of adult dreams contain direct sexual themes (Flanagan, 2000: p.149). There is also a slight gender difference, in that males tend to dream more of male characters than of female characters.
Given that many threats would have come from same-sex individuals within one’s own species, this fares well for the claim that dreams are for threat perception and rehearsal, but not for courtship rituals that could help pass on the trait. There are ways around this objection, however. Threatening and sexual encounters are such polar opposites that different parts of the nervous system deal with them – the sympathetic and the parasympathetic, respectively – and dramatically alternating between the two could disrupt sleep. On this view, simply surviving is prioritized over reproducing.

5. Dreaming in Contemporary Philosophy of Mind and Consciousness

a. Should Dreaming Be a Scientific Model?

Visual awareness has been used as the model system in consciousness research: it is easy to manipulate and investigate, and since humans are predominantly visual creatures, visual awareness is an excellent paradigm of conscious experience, at least for humans. A good example of the virtues of using paradigm cases, from biological science, is Drosophila melanogaster (the common fruit fly). This organism is often used in experiments and routinely cited in papers detailing experiments or suggesting further research. A scientific model is a practical necessity, and the fruit fly has such a fast reproduction rate that it allows geneticists to see the effects of genes over many generations, effects they could not observe within their own lifetimes in, say, humans. The fruit fly also has enough genetic similarity to humans that the findings can be extrapolated. Hence a model system has something special about it – some ideal set of features for empirical investigation. Model systems are good because they ideally also display the phenomena being investigated in an “exceptionally prominent form” (Revonsuo, 2006: pp. 73-74).

i. Dreaming as a Model of Consciousness

Revonsuo (2006) argues that dreaming should also have a place alongside visual awareness, as a special instance of consciousness and therefore a worthy model to be studied. He argues that the dreaming brain also captures consciousness in a “theoretically interesting form” (Revonsuo, 2006: p.73). The claim is that dreaming is an unusually rare example of “pure” consciousness, devoid as it is of ongoing perceptual input, and therefore might deserve special status in scientific investigation. The system is entirely dependent on internal resources and is isolated from the external world (Metzinger, 2003: p.255). Representational content changes much more quickly than in the waking state, which is why Metzinger claims that dreams are more dynamic (Metzinger, 2003: p.255). Whereas Malcolm and Dennett had argued that dreams are not even plausibly conscious states, Revonsuo and Metzinger argue that dreams may reveal the very essence of consciousness because of the conditions under which dream consciousness takes place. Crucially, there is a blockade of sensory input. Only very rarely will sensory input contribute to information processing during dreaming (Metzinger, 2003: p.257), for example in dreams where the sound of the alarm clock is interwoven into a narrative involving wedding bells. This means that most other dreams are “epistemically empty” with regard to the external environment. At no other time is consciousness so completely out of synch with the environment and, so to speak, left to its own devices. Dreaming is especially interesting, and fundamentally similar to waking consciousness, because it entails consciousness of a world which we take to be the real one, just as we do during waking consciousness. “Only rarely do we realize during the dream that it is only a dream. The estimated frequency of lucid dreams varies in different studies.
Approximately 1 to 10% of dream reports include the lucidity and about 60% of the population have experienced a lucid dream at least once in their lifetime (Farthing, 1992; Snyder & Gackenbach, 1988). Thus, as many as 90 to 99% of dreams are completely non-lucid, so that the dream world is taken for real by the dreamer” (Revonsuo, 2006: p.83).

Revonsuo does not so much argue for the displacement of visual awareness as a model system as argue that dreaming is another exemplary token of consciousness. Dreaming, unlike visual awareness, is untainted both by the external world and by behavioural activity (Revonsuo, 2006: p.75). Dreams also reveal the especially subjective nature of consciousness: the creation of a “world-for-me”.

Another reason that might motivate modelling dreaming is that it might turn out to be a good case for examining the problem of localization in consciousness research. During dreaming, the phenomenology is demonstrably not ontologically dependent on any process that is absent during dreaming: any parts of the brain not active in dreaming can be ruled out as unnecessary for phenomenal consciousness (Revonsuo, 2006: p.87).

During dreams there is an output blockade (Revonsuo, 2006: p.87; Metzinger, 2003: p.257), making dreaming an especially pure and isolated system. Malcolm had argued that dreaming was worthy of no further empirical work because the notion was simply incoherent, and Dennett was sceptical that dreams would turn out even to involve consciousness. The radical proposal now is that dreaming ought to be championed as an example of conscious experience, a mascot for scientific investigation in consciousness studies. It is alleged that dreams can recapitulate any experience from waking life, and for this reason Revonsuo concludes that the same physical or neural realization of consciousness is instantiated in both dreaming and waking experience (Revonsuo, 2006: p.86).

ii. Dreaming as a Contrast Case for Waking Consciousness

Windt & Noreika (2011) first lay out the many alleged problems with taking dreaming as a scientific model in consciousness research, and then propose positive suggestions for the role dreaming can play in consciousness studies. They reject dreaming as a model system but suggest it will work better as a contrast system to wakefulness. The first major problem with using dreaming as a model of consciousness is that, whilst in biology everybody knows what the fruit fly is, there are a number of different conceptions of, and debates surrounding, the key features of dreaming. Hence there is no accepted definition of dreaming (Windt, 2010: p.296). Taking dreaming as a model system of course depends upon exactly how dreaming is characterized. Revonsuo simply assumes his conception of dreaming is correct: he believes that dreaming can be a model of waking consciousness because dreams can be identical replicas of waking consciousness involving all possible experiences. Windt & Noreika believe that dreams tend to differ from waking life in important ways.

There are further problems with modelling dreams. Collecting dream reports in the laboratory might yield about five reported dreams a night, which is not at all practical compared to visual awareness, where hundreds of reportable experiences can occur in minutes without interference (Windt & Noreika, 2011: p.1099). Scientists do not even work directly with dreams themselves, but rather with descriptions of dreams. There is also the added possibility of narrative fabrication (the difference between dream experience and dream report). During the dream itself we lack the clarity of awareness of waking life and have a poor memory of the ongoing dream experience. Lucid dreaming is an exception to this rule, but it differs from ordinary dreaming and has not been proposed as a model. The laboratory is needed to control and measure the phenomenon properly, but this in itself can influence the dream content and report, introducing the observer effect and interviewer biases.

It is not clear that these are insurmountable methodological problems. It might be very difficult to investigate dreams, but this comes with the territory of trying to investigate isolated consciousness. Revonsuo is also not suggesting removing visual awareness as the paradigm model, only that dreaming ought to stand alongside it. The fact remains, however, that despite suggestions from Revonsuo and others, dreaming has simply not yet been used as a model (Windt & Noreika, 2011: p.1091), regardless of the theoretical incentives.

The problems with the modelling approach point us toward the more modest contrast-analysis approach. Windt & Noreika argue that the contrast analysis of dreaming with waking states should at least be the first step in scientific investigation, even if we wanted to establish what ought to be a model in consciousness research (Windt & Noreika, 2011: p.1102). What are the reasons for endorsing the positive proposal of using dreams in a contrast analysis with waking life? Though waking consciousness is the default mode through which individuals experience the world, dreaming is the second global state of consciousness. As displayed in reports, dreams involve features which are markedly different from those of waking life – bizarreness and confabulation being hallmarks of dreaming but not of waking consciousness. The two states of waking and dreaming are mediated by radically different neurochemical systems: during wakefulness the aminergic system is predominantly in control, while during dreaming the cholinergic system takes over (Hobson, 1994: pp. 14-15; Hobson, 2005: p.143). This fundamental difference offers an opportunity to examine the neurology and neurochemistry that underpin both states of consciousness. The contrast analysis does not ignore dreaming, but proposes a more modest approach. With research divided between waking consciousness, dreaming and a comparison of the two states, this more practical approach will yield better results, so Windt and Noreika argue. By using the proposed method, we can see how consciousness works both with and without environmental input. Both, after all, are genuine examples of consciousness, and surely equally important, rather than one deserving privilege over the other. This approach also means that the outcome will be mutually informative as regards the two types of consciousness, with insights gained in both directions.
Dreaming offers an important example of consciousness operating with radically changed neural processing, to be compared with waking consciousness (Windt & Noreika, 2011: p.1101). With the contrastive analysis there is the prospect of comparing dream consciousness to both pathological and non-pathological waking states, and thereby the promise of better understanding how waking consciousness works and how it can malfunction. We spend about a tenth of our conscious lives dreaming, and yet it is one of the most difficult mental states to investigate scientifically. The contrast analysis is put forward as a possible solution to the problem of how to integrate dreams into consciousness studies. Windt and Noreika add the further proposal that dreams can be more specifically contrasted with pathological, non-pathological and altered states of consciousness. Unlike the modelling option, the contrast analysis is how dreams have hitherto been investigated, and so it has already proven a viable option. This should not definitively preclude the modelling option, however. Perhaps modelling dreams really would be ideal simply because dreaming involves isolated consciousness, and the practicality concerns may be overcome in the future.

It remains the case that “one of the central desiderata in the field of empirical dream research is a commonly accepted definition of dreaming” (Windt, 2010: p.296). There are also examples of consciousness at the periphery of sleep that make it difficult to delineate the boundaries of what does and does not count as a dream (Mavromatis, 1987: p.3): ‘hypnagogic’ experiences are the thoughts, images or quasi-hallucinations that occur prior to and during sleep onset, whilst ‘hypnopompic’ experiences are those that occur during or just after waking; these states are now known collectively as hypnagogia. This is a problem that the modelling and the contrast-analysis approaches must both confront.

b. Is Dreaming an Instance of Images or Percepts?

Some believe that dreaming involves mental imagery of both a hallucinatory and an imagistic nature (Symons, 1993: p.185; Seligman & Yellen, 1987). However, other philosophers, such as Colin McGinn, believe that dreams should be thought of only in terms of images (the imagination) or percepts (perceptual experience). It is better not to inflate our ontology by invoking a third category on which dreams are sui generis (of their own kind) if we do not need to. There are reasons not to believe that dreams are a unique mental state. It would be strange, McGinn claims, if “the faculties recruited in dreaming were not already exploited during waking life” (McGinn, 2004: p.75). So he believes that dreaming is an instance of a psychological state we are already familiar with from waking life: perception (hallucination) or imagination. There is a separate epistemic question of whether dreams involve beliefs or imaginings (Ichikawa, 2009). This debate comes apart from the psychological one as to whether the phenomenology of dreaming is perceptual or imagistic, because all four possible combinations can be held. A theorist might claim that dreams are hallucinations which involve belief; another might claim that dreams involve hallucinations which do not generate belief, and that we rather entertain our dream hallucinations as imagined possibilities. Alternatively, dreams might involve the psychological state of imagination and yet we happen to believe in our imaginings as though they were real; finally, dreams might involve imaginings which we recognize as such, imaginings and not beliefs. Though the two debates come apart, in waking life the psychological state of imagining is usually accompanied by the propositional attitude of imagining that something is the case, rather than believing that the (psychologically imagined) something is the case.
Perception, or hallucination, usually triggers the belief that what is perceived is the case, rather than merely the imagining that it is the case. So if dreams are psychologically defined in terms of percepts, we can expect that dreams likely also involve belief, because percepts usually trigger belief. This is not always so, however: a schizophrenic might come to realize that his hallucinations do not really express a way the world is, and so no longer believe what he perceives. If dreams are psychologically imaginings, then we would usually expect them to be recognized as not being real, and therefore to trigger the propositional attitude of imagining, rather than believing, that something is the case. There are more nuanced views than the four possible combinations. McGinn claims that dreaming involves the imagination and quasi-belief: when we dream we are immersed in the fictional plot, as we are in creative writing or film.

i. Dreaming as Hallucination

It is worth noting that the psychological literature assumes that dreams are hallucinations that occur during sleep; it is the typically unquestioned psychological orthodoxy that dreams are perceptual/hallucinatory experiences. Evidence for dreams as percepts can be cited from neuroscience:

The offline world simulation engages the same brain mechanisms as perceptual consciousness and seems real to us because we are unaware that it is nothing but a hallucination (except very rarely in lucid dreams). (Valli & Revonsuo, 2009: p.19)

In REM sleep, when we are lying still due to muscle paralysis, the motor programs of the brain are nonetheless active. Phenomenologically, our dream selves are also highly active during dreams: as the content of a dream reveals, we are always on the move. Apart from the bodily paralysis, the body physiologically acts as though it perceives a real world, continually reacting to events in that apparently real world. It is known that individuals will carry out their dream actions if the nerve cells that suppress movement are surgically removed or have deteriorated with age, as demonstrated in people with REM Sleep Behavior Disorder. This suggests that dreaming involves the ordinary notion of belief, because it is tied to action in the usual way, and it is only an additional action-suppressing part of the brain that prevents these actions from being carried out. We know that percepts ordinarily trigger belief and corresponding action, whereas the imagination does not.

The claim that dreams are hallucinations can find support in the further claim that dreaming replicates waking consciousness. Many philosophers and psychologists note the realistic and organized nature of dreams, which has been couched in terms of a virtual reality involving a realistic representation of the bodily self which we can feel. Consider false awakenings, where an individual believes they have woken up in the very place they went to sleep, yet they are actually still asleep and dreaming. False awakenings can arguably be used in support of the view that dreaming is hallucinatory, because such dreams give a realistic depiction of one’s surroundings. Dreams fit the philosophical concept of hallucination as an experience intrinsically similar to legitimate perceptual states, with the difference that the apparent stimulus being perceived is non-existent (Windt, 2010: pp. 298-299). It is the job of perceptual states to display the self in a world.

Empirical evidence suggests that pain can be experienced in dreams; pain is perceptual in nature, and the imagination arguably cannot replicate it. So dreams must be hallucinatory, according to this line of reasoning. It is not clear, though, whether this rules out dreams being mainly imaginative with occasional perceptual elements introduced. Pain is, after all, a rarity in dreaming.

We can plausibly speculate that human ancestors fell prey to a major metaphysical confusion and thought that their dreams involved genuine experiences, including visitations from the dead and entry into a different realm. Though few believe this today, we can sympathize with our ancestors’ mistake, which is echoed in the occasional everyday hesitation about whether to categorize a memory of an experience as a dream or a waking event. We usually decide not by introspection, but by logical discrepancies between the memory and other factors. These considerations suggest that we cannot always easily distinguish dream experience from waking experience, arguably because dreams are another instance of percepts.

Finally, we seem to have real emotions during dreams, which are the natural reaction to our perceptions. According to the percept view of dreams, we dream that we are carrying out actions in an environment, but our accompanying emotions are not dreamed: they play out alongside the rest of the dream content. The intensity of the emotions actually felt is what the percept theorist will take as support for the content of the dream being not merely imagined but the natural response to realistic, perceptual-like experience.

ii. Dreaming as Imagination

A number of philosophers believe that dreaming is just the imagination at work during sleep (Ichikawa, 2008; Sosa, 2007; McGinn, 2004, 2005): any conscious experiences during sleep are imagistic rather than perceptual. McGinn puts forward some reasons in favour of believing that dreams are imaginings. Firstly, he introduces the Observational Attitude: if we are perceiving (or hallucinating), say, two individuals having a conversation, then we might need to strain our senses to hear or see what they are discussing. During dreams, of course, the body is completely relaxed and the sleeping individual shows no interest in his or her surroundings. When imagining in waking life, I try to minimize my sensory awareness of the surrounding environment in order to get a better and more vivid picture of what I am imagining; for example, if I want to imagine a new musical tune, I do best to switch the radio off or cover my ears. Dreaming is the natural instance of shutting out all sensory awareness of the outside world, arguably so as to engage the imagination entirely. This suggests that the dreamer is hearing with the mind’s ear and seeing with the mind’s eye; they are entertaining images, not percepts. Secondly, McGinn claims that percepts and images can coexist in the mind in waking life: we can perceive at the same time as imagining. The novel presupposes that the reader can perceive the text at the same time as imagining its content. It ought to be possible, McGinn argues, that if dreams are hallucinatory, we should be able to imagine at the same time: if I am surfing in a dream, I should be able to imagine the Eiffel Tower simultaneously. In dreams we cannot do this; there is just the dream, McGinn claims, with no further ability to imagine other content at the same time. Related to the Observational Attitude is the notion of Recognition in dreams.
In dreams we seem already to know who all of the characters are, without making any effort to find out (without using any of our senses). This might suggest that in dreams we are partly in control of the content (even if we fail to realize it), because we allegedly summon up the characters that we want to. We recognize who dream characters are, such as relatives, even when they look drastically different. It is not clear, on the other hand, that we really are in control of other dream characters or that we accurately recognize them. Gerrans (2012), for example, has claimed that the same mechanism of misidentification is present in dreams as in certain delusional states, where the feeling of familiarity with a person is over-active and aimed at the wrong individuals (as in Fregoli delusion). On this view, we do not accurately bring to mind certain dream characters; we try to identify them and make mistakes. This would be an alternative way of accommodating the evidence.

In the 1940s and 1950s, a survey found that a majority of Americans thought that their own dreams, and dreams more generally, occurred in black and white; crucially, people thought both before and after this period that dreams occur in colour. The period coincided with the advent of black and white television. According to Schwitzgebel, the most reasonable conclusion to draw is that dreams are more like imagining written fiction, sketchy and without any colour; I can read a novel without summoning any definite imagery to mind (Schwitzgebel, 2002: p.656). Perhaps this is akin to reading a novel quickly and forming vague and indefinite imagery. Ichikawa also argues that dreaming involves only images and is indeterminate with respect to colour, because images can be indeterminate in colour, whereas a hallucination would require some determinacy of colour, whether black and white or full colour. Crucially, even outside of Schwitzgebel’s findings, people have raised the question of whether we dream in colour or in black and white, and something needs to explain the very possibility of such a dispute; “the imagination model may provide the best explanation for disagreement about colour sensation in dreams” (Ichikawa, 2009: p.109). McGinn also believes that dreams can be indeterminate in colour, and that they can be coloured in during the waking report, which can be affected by the current media.

6. References and Further Reading

  • Adams, R. (1985) “Involuntary Sins,” The Philosophical Review, Vol. 94, No. 1 (Jan., 1985), pp. 3 – 31.
  • Antrobus, J. S., Antrobus, J. S. & Fisher, C. (1965) “Discrimination of Dreaming and Nondreaming Sleep,” Archives of General Psychiatry 12: pp. 395 – 401.
  • Antrobus, J. (2000) “How Does the Dreaming Brain Explain the Dreaming Mind?” Behavioral and Brain Sciences, 23 (6): pp. 904 – 907.
  • Arkin, A., Antrobus, J. & Ellman, S. (1978) The Mind in Sleep: Psychology and Physiology New Jersey: Lawrence Erlbaum.
  • St. Augustine (398) Confessions in Great Books of the Western World 16 translated by R. S. Pine-Coffin, Chicago: Britannica, 1994.
    • In this historical canon of philosophy, Augustine discusses and comes to the conclusion that we are not immoral in our dreams, though it appears as though we sin.
  • Ayer, A. (1960) “Professor Malcolm on Dreams,” The Journal of Philosophy, Vol. 57, No. 16 (Aug. 4, 1960), pp. 517 – 535.
    • Ayer’s first reply to Malcolm’s thesis against the received view – part of a heated back and forth exchange that borders on the ad hominem.
  • Ayer, A. (1961) “Rejoinder to Professor Malcolm,” The Journal of Philosophy, Vol. 58, No. 11 (May 25, 1961), pp. 297 – 299.
  • Baghdoyan, H. A., Rodrigo-Angulo, M. L., McCarley, R. W. & Hobson, J. A. (1987) “A Neuroanatomical Gradient in the Pontine Tegmentum for the Cholinoceptive Induction of Desynchronized Sleep Signs,” Brain Research 414: pp. 245 – 61.
  • Bakeland, F. (1971) “Effects of Pre-sleep Procedures and Cognitive Style on Dream Content,” Perceptual and Motor Skills 32: pp. 63 – 69.
  • Bakeland, F., Resch, R. & Katz, D. D. (1968) “Pre-sleep Mentation and Dream Reports,” Archives of General Psychiatry 19: pp. 300 – 11.
  • Ballantyne, N. & Evans, E. (2010) “Sosa’s Dream,” Philos Stud (2010) 148: pp. 249 – 252.
  • Barrett, D. (1992) “Just How Lucid are Lucid Dreams?” Dreaming 2: pp. 221 – 28.
  • Berger, R. J. (1967) “When is a Dream is a Dream is a Dream?” Experimental Neurology (Supplement) 4: pp. 15 – 27.
  • Blackmore, S. (1991) “Lucid Dreaming: Awake in your Sleep?” Skeptical Inquirer, 15, pp. 362 – 370.
  • Blackmore, S. (2004) Consciousness: An Introduction Oxford: OUP.
  • Blackmore, S. (2005) Consciousness: A Very Short Introduction Oxford: OUP.
    • Chapter 7, “Altered States of Consciousness,” looks at dreams and sleep and considers a “retro-selection theory of dreams” in reference to Dennett’s model of unconsciously dreaming.
  • Blaustein, J., (Eds.) Handbook of Neurochemistry and Molecular Neurobiology, 3rd Edition: Behavioral Neurochemistry and Neuroendocrinology New York: Springer Science, 2007.
    • Contains useful, but advanced, information on the neurochemistry involved during processes of sleep and waking.
  • Brown, J. (2009) “Sosa on Skepticism,” Philosophical Studies.
  • Canfield, J. (1961) “Judgements in Sleep,” The Philosophical Review, Vol. 70, No. 2 (Apr., 1961), pp. 224 – 230.
  • Candlish, S. & Wrisley, G. (2012) “Private Language,” Stanford Encyclopedia of Philosophy.
  • Chappell, V. (1963) “The Concept of Dreaming,” Philosophical Quarterly, 13 (July): pp. 193 – 213.
  • Child, W. (2007) “Dreaming, Calculating, Thinking: Wittgenstein and Anti-Realism about the Past” The Philosophical Quarterly, Vol. 57, No. 227 (Apr., 2007), pp. 252-272.
  • Child, W. (2009) “Wittgenstein, Dreaming and Anti-Realism: A Reply to Richard Scheer,” Philosophical Investigations 32:4 October 2009, pp. 229 – 337.
  • Cicogna, P. & Bosinelli, M. (2001) “Consciousness during Dreams,” Consciousness and Cognition 10, pp. 26 – 41.
  • Cioffi, F. (2009) “Making the Unconscious Conscious: Wittgenstein versus Freud,” Philosophia, 37: pp. 565 – 588.
  • Cohen, D., “Sweet Dreams,” New Scientist, Issue 2390, p. 50.
  • Dawkins, R. (1998) Unweaving the Rainbow: Science, Delusion and the Appetite for Wonder London: Penguin, p. 158.
    • Dawkins briefly considers the phenomenon of purported dream prophecy and the relationship between coincidence and reporting, something he also discusses in The Magic of Reality (2011).
  • Dennett, D. (1976) “Are Dreams Experiences?” The Philosophical Review, Vol. 85, No. 2 (Apr., 1976), pp. 151 – 171.
    • This is the seminal paper by Dennett in which he advances the claim that dreams might not be experiences that occur during sleep.
  • Dennett, D. (1979) “The Onus Re Experiences: A Reply to Emmett,” Philosophical Studies 35: pp. 315 – 318.
  • Dennett, D. (1991) Consciousness Explained London: Penguin.
  • Descartes, R. (1641) Meditations on First Philosophy in Great Books of the Western World 28 Chicago: Britannica, 1994.
  • Domhoff, G. & Kamiya, J. (1964) “Problems in Dream Content Study with Objective Indicators: I. A Comparison of Home and Laboratory Dream Reports,” Archives of General Psychiatry 11: pp. 519 – 24.
  • Doricchi, F. et al (2007) “The ‘Ways’ We Look at Dreams: Evidence from Unilateral Spatial Neglect (with an evolutionary account of dream bizarreness)” Exp Brain Res (2007) 178: pp. 450 – 461.
  • Driver, J. (2007) “Dream Immorality,” Philosophy, Vol. 82, No. 319 (Jan., 2007), pp. 5 – 22.
  • Emmett, K. (1978) “Oneiric Experiences,” Philosophical Studies, 34 (1978): pp. 445 – 450.
  • Empson, J. (1989) Sleep and Dreaming Hempstead: Harvester Wheatsheaf.
  • Farthing, W. (1992) The Psychology of Consciousness New York: Prentice Hall.
  • Flanagan, O. (1995) “Deconstructing Dreams: The Spandrels of Sleep,” Journal of Philosophy, 92, no. 1 (1995): pp. 5 – 27.
  • Flanagan, O. (2000) Dreaming Souls: Sleep, Dreams and the Evolution of the Conscious Mind  Oxford: OUP.
    • This rare book-length treatment of philosophical issues about dreaming is also devoted to arguing that dreams are by-products of sleep.
  • Flowers, L. & Delaney, G. (2003) “Dream Therapy,” in Encyclopedia of Neurological Sciences, Elsevier Science USA: pp. 40 – 44.
  • Foulkes, D. (1999) Children’s Dreaming and the Development of Consciousness Harvard.
  • Freud, S. (1900) The Interpretation of Dreams in Great Books of the Western World 54 Chicago: Britannica, 1994.
  • Gendler, T. (2011) “Imagination,” Stanford Encyclopedia of Philosophy.
  • Gerrans, P. (2012) “Dream Experience and a Revisionist Account of Delusions of Misidentification,” Consciousness and Cognition, 21 (2012) pp. 217 – 227.
    • This paper is primarily about delusions that involve misidentifying individuals, such as Fregoli delusion, and alleges that dreams contain similar delusional features.
  • Ghosh, P. (2010) “Dream Recording Device ‘Possible’ Researcher Claims,” BBC News.
  • Gould, S. & Lewontin, R. (1979) “The Spandrels of San Marco and the Panglossian Paradigm: a Critique of the Adaptationist Programme” Proceedings of the Royal Society of London, B 205, pp. 581 – 598.
  • Green, C. (1968) Lucid Dreams Oxford: Institute of Psychophysical Research.
  • Green, C. & McCreery, C. (1994) Lucid Dreaming: The Paradox of Consciousness During Sleep Philadelphia: Routledge, 2001.
  • Greenberg, M. & Farah, M. (1986) “The Laterality of Dreaming,” Brain and Cognition, 5, pp. 307 – 321.
  • Gunderson K (2000) “The Dramaturgy of Dreams in Pleistocene Minds and Our Own,” Behavioral and Brain Sciences 23(6), pp. 946 – 947.
  • Hacking, I. (1975) “Norman Malcolm’s Dreams,” in Why Does Language Matter to Philosophy Cambridge: Cambridge University Press.
  • Hacking, I. (2001) “Dreams in Place,” The Journal of Aesthetics and Art Criticism, Vol. 59, No. 3 (Summer, 2001): pp. 245 – 260.
  • Hill, J. (2004a) “The Philosophy of Sleep: The Views of Descartes, Locke and Leibniz,” Richmond Journal of Philosophy, Spring.
  • Hill, J. (2004b) “Descartes’ Dreaming Argument and Why We Might Be Skeptical of It,” Richmond Journal of Philosophy, Winter.
  • Hill, J. (2006) “Meditating with Descartes,” Richmond Journal of Philosophy, Spring.
  • Hobbes, T. (1651) Leviathan in Great Books of the Western World 21 Chicago: Britannica, 1994.
  • Hobson, J. (1988) The Dreaming Brain Basic Books.
    • J. Allan Hobson is an authoritative source on the psychology of sleep and dreaming.
  • Hobson, J. (1989) Sleep USA: Scientific American Library Press, 1995.
    • A rich textbook on the history of the science of sleep.
  • Hobson, J. (1994) The Chemistry of Conscious States: How the Brain Changes Its Mind New York: Little, Brown & Company.
  • Hobson, J. (2001) Dream Drugstore: Chemically Altered States of Consciousness Cambridge, MA: MIT Press.
  • Hobson, J. (2005) Dreaming: A Very Short Introduction Oxford: OUP.
  • Hobson, J. (2009) “The Neurobiology of Consciousness: Lucid Dreaming Wakes Up,” International Journal of Dream Research, Volume 2, No. 2 (2009)
  • Hobson, J. (2012) Dream Life: An Experimental Memoir Cambridge, MA: MIT Press.
  • Hodges, M. & Carter, W. (1969) “Nelson on Dreaming a Pain,” Philosophical Studies, 20 (April): pp. 43 – 46.
  • Hopkins, J. (1991) “The Interpretation of Dreams,” In Neu, J. (ed.), The Cambridge Companion to Freud Cambridge: Cambridge University Press.
  • Horton, C. et al (2009) “The Self and Dreams During a Period of Transition,” Consciousness and Cognition, 18 (2009) pp. 710 – 717.
  • Hunter, J. (1971) “Some Questions about Dreaming,” Mind, 80 (January): pp. 70 – 92.
  • Ichikawa, J. (2008) “Skepticism and the Imagination Model of Dreaming” The Philosophical Quarterly, Vol. 58, No. 232.
  • Ichikawa, J. (2009) “Dreaming and Imagination” Mind & Language, Vol. 24 No. 1 February 2009, pp. 103 – 121.
  • Jablonka, E. & Lamb, M. (2005) Evolution in Four Dimensions: Genetic, Epigenetic, Behavioral, and Symbolic Variation in the History of Life MIT.
    • This is the go-to introduction for epigenetics (though the content of the book does not include the subject of dreaming).
  • Jeshion, R. (2010) New Essays On Singular Thought Oxford: OUP.
    • This is a good guide to the notion of singular thought (though the content of the book does not include the subject of dreaming).
  • Johnson, R. (2009) Inner Work: Using Dreams and Active Imagination for Personal Growth Harper & Row.
    • A manual for applying the principles of Jungian analysis, working specifically with the imagination and dreaming.
  • Jouvet, M. (1993) The Paradox of Sleep Cambridge, MA: MIT Press, 1999.
  • Jung, C. (ed.) (1968) Man and His Symbols USA: Dell Publishing.
  • Jung, C. (1985) Dreams Ark: London.
    • A collection of Jung’s papers on dreams.
  • Kahan, T. & LaBerge, S. (2011) “Dreaming and Waking: Similarities and Differences Revisited,” Consciousness and Cognition, 20, pp. 494–514.
  • Kamitani, Y. et al (2008) “Visual Image Reconstruction from Human Brain Activity using a Combination of Multiscale Local Image Decoders,” Neuron 60, 915–929, Dec. 11.
    • In this controversial article Kamitani and colleagues outline some research into decoding (purely from brain scanning) what individuals can consciously see. The speculative idea is that eventually the method discussed might be applied to dreaming and scientists will thereby be able to “record” dreams.
  • Knoth, I. & Schredl, M. (2011) “Physical Pain, Mental Pain and Malaise in Dreams,” International Journal of Dream Research, Volume 4, No. 1.
  • Kramer, M. (1962) “Malcolm on Dreaming,” Mind, New Series, Vol. 71, No. 281 Jan.
  • LaBerge, S. & Rheingold, H. (1990) Exploring the World of Lucid Dreaming New York: Ballantine Books.
  • LaBerge, S. (1990) “Lucid Dreaming: Psychophysiological Studies of Consciousness During REM Sleep,” in Bootzin, R., Kihlstrom, J., & Schacter, D. (Eds.) Sleep and Cognition  (pp. 109 – 126) Washington, D.C.: American Psychological Association.
  • LaBerge, S. (2000) “Lucid Dreaming: Evidence that REM Sleep Can Support Unimpaired Cognitive Function and a Methodology for Studying the Psychophysiology of Dreaming,” The Lucidity Institute Website.
  • LaBerge, S. & DeGracia, D.J. (2000) “Varieties of Lucid Dreaming Experience,” in Kunzendorf, R.G. & Wallace, B. (Eds.), Individual Differences in Conscious Experience (pp. 269 – 307) Amsterdam: John Benjamins.
  • Lakoff, G. (1993) “How Metaphor Structures Dreams: The Theory of Conceptual Metaphor Applied to Dream Analysis,” Dreaming, 3 (2): pp. 77 – 98.
  • Lansky, M. (ed.) (1992) Essential Papers on Dreams New York: New York University Press.
  • Lavie, P. (1996) The Enchanted World of Sleep Yale.
  • Llinas, R. & Ribary, U. (1993) “Coherent 40-Hz Oscillation Characterizes Dream State in Humans,” Proceedings of the National Academy of Sciences USA 90: pp. 2078 – 81.
  • Locke, J. (1690) An Essay Concerning Human Understanding in Great Books of the Western World 33 Chicago: Britannica, 1994.
  • Lockley, S. & Foster, R. (2012) Sleep: A Very Short Introduction Oxford: OUP.
  • Love, D. (2013) Are You Dreaming? Exploring Lucid Dreams: A Comprehensive Guide Enchanted Loom.
  • Lynch, G., Colgin, L. & Palmer, L. (2000) “Spandrels of the Night?” Behavioral and Brain Sciences, 23 (6), pp. 966 – 967.
  • MacDonald, M. (1953) “Sleeping and Waking,” Mind, New Series, Vol. 62, No. 246 (Apr., 1953), pp. 202 – 215.
  • Malcolm, N. (1956) “Dreaming and Skepticism,” The Philosophical Review, Vol. 65, No. 1 (Jan., 1956), pp. 14 – 37.
    • Malcolm’s first articulation of the argument that the received view’s concept of “dreaming” is flawed.
  • Malcolm, N. (1959) Dreaming London: Routledge & Kegan Paul, 2nd Impression 1962.
    • Malcolm’s controversial book length treatment of the topic.
  • Malcolm, N. (1961) “Professor Ayer on Dreaming,” The Journal of Philosophy, Vol.58, No.11 (May), pp. 297 – 299.
  • Mann, W. (1983) “Dreams of Immorality,” Philosophy, Vol. 58, No. 225 (Jul., 1983), pp. 378 – 385.
  • Mathers, D. (2001) An Introduction to Meaning and Purpose in Analytical Psychology East Sussex: Routledge.
    • Contains a chapter on dreams and meaning from the perspective of analytical psychology.
  • Matthews, G. (1981) “On Being Immoral in a Dream,” Philosophy, Vol. 56, No. 215 (Jan., 1981), pp. 47 – 54.
  • Mavromatis, A. (1987) Hypnagogia: The Unique State of Consciousness Between Wakefulness and Sleep London: Routledge & Kegan Paul.
  • McFee, G. (1994) “The Surface Grammar of Dreaming,” Proceedings of the Aristotelian Society, New Series, Vol. 94 (1994), pp. 95 – 115.
  • McGinn, C. (2004) Mindsight: Image, Dream, Meaning USA: Harvard University Press, 2004.
    • The argument that dreaming is a type of imagining.
  • McGinn, C. (2005) The Power of Movies: How Screen and Mind Interact Toronto: Vintage Books, 2007.
    • McGinn argues that dreaming preconditions humans to be susceptible to film (including emotional reaction) because dreaming makes use of the imagination via fictional immersion – this is why we can easily become absorbed in stories and appreciate works of art.
  • Metzinger T (2003) Being No-One. MIT Press.
  • Metzinger, T. (2009) The Ego Tunnel: The Science of the Mind and the Myth of the Self New York: Basic Books.
    • Contains substantial sections on dreaming.
  • Möller, H. (1999) “Zhuangzi’s ‘Dream of the Butterfly’: A Daoist Interpretation,” Philosophy East and West, Vol. 49, No. 4 (Oct., 1999), pp. 439 – 450.
  • Nagel, T. (1959) “Dreaming,” Analysis, Vol. 19, No. 5 (Apr., 1959), pp. 112 – 116.
  • Nelson, J. (1966) “Can One Tell That He Is Awake By Pinching Himself?” Philosophical Studies, Vol. XVII.
  • Newman, L. (2010) “Descartes’ Epistemology,” Stanford Encyclopedia of Philosophy.
  • Nielsen, T. (2012) “Variations in Dream Recall Frequency and Dream Theme Diversity by Age and Sex,” Volume 3, Article 106, July 2012.
  • Nielsen, T. & Levin, R. (2007) “Nightmares: A New Neurocognitive Model,” Sleep Medicine Reviews (2007) 11: pp. 295 – 310.
  • Nir, Y. & Tononi, G. (2009) “Dreaming and the Brain: from Phenomenology to Neurophysiology,” Trends in Cognitive Sciences, Vol.14, No.2.
  • O’Shaughnessy, B. (2002) “Dreaming,” Inquiry, 45, pp. 399 – 432.
  • Occhionero, M. & Cicogna, P. (2011) “Autoscopic Phenomena and One’s Own Body Representation in Dreams,” Consciousness and Cognition, 20 (2011) pp. 1009 – 1015.
  • Oswald, I. & Evans, J. (1985) “On Serious Violence During Sleep-walking,” British Journal of Psychiatry, 147: pp. 688 – 91.
  • Pears, D. (1961) “Professor Norman Malcolm: Dreaming,” Mind, New Series, Vol. 70, No. 278 (Apr., 1961), pp. 145 -163.
  • Pesant, N. & Zadra, A. (2004) “Working with Dreams in Therapy: What do We Know and What Should we Do?” Clinical Psychology Review, 24: pp. 489–512
  • Putnam, H. (1962) “Dreaming and ‘Depth Grammar’,” in Butler (Eds.) Analytical Philosophy Oxford: Basil & Blackwell, 1962.
  • Rachlin, H. (1985) “Pain and Behavior,” Behavioral and Brain Sciences, 8: 1, pp. 43 – 83.
  • Revonsuo, A. (1995) “Consciousness, Dreams, and Virtual Realities,” Philosophical Psychology 8: pp. 35 – 58.
  • Revonsuo, A. (2000) “The Reinterpretation of Dreams: An Evolutionary Hypothesis of the Function of Dreaming,” Behavioral and Brain Sciences 23, pp. 793 – 1121.
    • An exposition and defence of the theory that dreaming is a rehearsal of threat avoidance and so confers adaptive value on creatures that dream, contra Flanagan.
  • Revonsuo A (2006) Inner Presence MIT Press.
  • Rosenthal, D. (2002) “Explaining Consciousness,” in Chalmers, D. (Ed.) Philosophy of Mind: Classical and Contemporary Readings, OUP.
    • In this paper, Rosenthal asks in passing whether dreams are conscious experiences that occur during sleep, before he goes on to highlight his higher-order-thought theory of consciousness. In this paper, he introduces important terminology in the philosophy of mind (creature consciousness and state consciousness).
  • Salzarulo, P. & Cipolli, C. (1974) “Spontaneously Recalled Verbal Material and its Linguistic Organization in Relation to Different Stages of Sleep,” Biological Psychology 2: pp. 47 – 57.
  • Schultz, H. & Salzarulo, P. (1993) “Methodological Approaches to Problems in Experimental Dream Research: An Introduction,” Journal of Sleep Research 2: pp. 1 – 3.
  • Schroeder, S. (1997) “The Concept of Dreaming: On Three Theses by Malcolm,” Philosophical Investigations 20:1 January 1997, pp. 15 – 38.
  • Schwitzgebel, E. (2002) “Why Did We Think We Dreamed in Black and White?” Stud. Hist. Phil. Sci. 33 (2002) pp. 649 – 660.
    • This is a more specific case of Schwitzgebel’s general attack on the reliability of introspection. This paper is particularly interesting in examining the relationship between dream experience itself and episodically remembering those experiences. Schwitzgebel couches his account of dreams as a type of imagination.
  • Schwitzgebel, E. (2003) “Do People Still Report Dreaming in Black and White? An Attempt to Replicate a Questionnaire from 1942,” Perceptual and Motor Skills, 96: pp. 25 – 29.
  • Schwitzgebel, E. Huang, C. & Zhou, Y. (2006) “Do We Dream In Color? Cultural Variations and Skepticism,” Dreaming, Vol. 16, No. 1: pp. 36 – 42.
  • Seligman, M. & Yellen, A. (1987) “What is a Dream?” Behav. Res. Ther. Vol. 25, No. 1, pp. 1 – 24.
  • Shaffer, J. (1984) “Dreaming,” American Philosophical Quarterly, Vol. 21, No. 2 (Apr., 1984), pp. 135 – 146.
  • Sharma, K. (2001) “Dreamless Sleep and Some Related Philosophical Issues,” Philosophy East and West, Vol. 51, No. 2 (Apr., 2001), pp. 210 – 231.
  • Schredl, M. & Erlacher, D. (2004) “Lucid Dreaming Frequency and Personality,” Personality and Individual Differences, 37 (2004) pp. 1463 – 1473.
  • Snyder, T. & Gackenbach, J. (1988) “Individual Differences Associated with Lucid Dreaming” in Gackenbach, J. & LaBerge, S. (Eds.) Conscious Mind, Sleeping Brain (pp. 221-259) New York: Plenum.
  • Solms, M. (1997) The Neuropsychology of Dreams Mahwah: Lawrence Erlbaum Associates.
    • This book brings together research on the counter-intuitive idea of abnormalities in dreaming. Some individuals with brain lesions have reported significant changes in the phenomenology of their dreams.
  • Sosa, E. (2007) A Virtue Epistemology: Apt Belief and Reflective Knowledge Oxford: OUP.
    • Chapter One contains Sosa’s brief sketch of the imagination model of dreams, with the consequent attack on Descartes’ dream argument.
  • Squires, R. (1995) “Dream Time,” Proceedings of the Aristotelian Society, New Series, Vol. 95 (1995), pp. 83 – 91.
  • Stevens, A. (1990) On Jung London: Routledge.
  • Stevens, A. (1994) Jung: A Very Short Introduction New York: OUP
  • Stewart, C. (2002) “Erotic Dreams and Nightmares from Antiquity to the Present,” The Journal of the Royal Anthropological Institute, Vol. 8, No. 2 (Jun., 2002), pp. 279 – 309.
  • Stickgold, R., Rittenhouse, C. & Hobson, J. A. (1994) “Dream splicing: A new technique for assessing thematic coherence in subjective reports of mental activity” Consciousness and Cognition 3: pp. 114 – 28.
  • Storr, A. (1989) Freud: A Very Short Introduction Oxford: OUP, 2001.
  • Sutton, J. (2009) “Dreaming” In Calvo, P., & Symons, J. (Eds.), Routledge Companion to the Philosophy of Psychology, pp. 522 – 542.
  • Symons, D. (1993) “The Stuff That Dreams Aren’t Made Of: Why Wake-State and Dream- State Sensory Experiences Differ,” Cognition, 47 (1993), pp. 181 – 217.
  • Valli, K & Revonsuo, A. (2009) “The Threat Simulation Theory in the Light of Recent Empirical Evidence—A Review,” The American Journal of Psychology, 122: pp. 17 – 38.
  • Valli, K. (2011) “Dreaming in the Multilevel Framework,” Consciousness and Cognition, 20 (2011): pp. 1084 – 1090.
  • Whitmont, E. & Perera, S. (1989) Dreams, A Portal to the Source London: Routledge, 1999.
  • Windt, J. M. & Metzinger, T. (2007) “The Philosophy of Dreaming and Self-consciousness: What Happens to the Experiential Subject During the Dream State?” In Barrett, D. & McNamara, P. (Eds.), The New Science of Dreaming, Vol 3: Cultural and Theoretical Perspectives, 193–247. Westport, CT and London: Praeger Perspectives/Greenwood Press.
  • Windt, J.M. (2010) “The Immersive Spatiotemporal Hallucination Model of Dreaming,” Phenomenology and the Cognitive Sciences, 9, pp. 295 – 316.
  • Windt, J. M., & Noreika, V. (2011). “How to Integrate Dreaming into a General Theory of Consciousness—A Critical Review of Existing Positions and Suggestions for Future Research,” Consciousness and Cognition, 20(4), pp. 1091 – 1107.
  • Wittgenstein, L. (1953) Philosophical Investigations, in Great Books of the Western World 55 USA: Britannica, 1994.
  • Wollheim, R. (1991) Freud 2nd Edition London: Fontana.
  • Wollheim, R. & Hopkins, J. (1982) Philosophical Essays on Freud Cambridge: Cambridge University Press.
  • Wolman, R. & Kozmova, M. (2007) “Last Night I Had the Strangest Dream: Varieties of Rational Thought Processes in Dream Reports,” Consciousness and Cognition, 16 (2007): pp. 838 – 849.
  • Yost, R. & Kalish, D. (1955) “Miss MacDonald on Sleeping and Waking,” The Philosophical Quarterly, Vol. 5, No. 19 (Apr., 1955), pp. 109 – 124.
  • Yost, R. (1959) “Professor Malcolm on Dreaming and Skepticism-I,” The Philosophical Quarterly, Vol. 9, No. 35 (Apr.), pp. 142 – 151.
  • Zadra, A. (1998) “The Nature and Prevalence of Pain in Dreams,” Pain Research and Management, 3, pp. 155 – 161.
  • Zadra, A. et al (2006) “Evolutionary Function of Dreams: A Test of the Threat Simulation Theory in Recurrent Dreams,” Consciousness and Cognition, 15 (2006) pp. 450 – 463.


Author Information

Ben Springett
Email: bs1844@my.bristol.ac.uk
University of Bristol
United Kingdom

Mou Zongsan (Mou Tsung-san) (1909—1995)

Mou Zongsan (Mou Tsung-san) was a sophisticated and systematic modern Chinese philosopher. A Chinese nationalist, he aimed to reinvigorate traditional Chinese philosophy through an encounter with Western (and especially German) philosophy and to restore it to a position of prestige in the world. In particular, he engaged closely with Immanuel Kant’s three Critiques and attempted to show, pace Kant, that human beings possess intellectual intuition, a supra-sensible mode of knowledge that Kant reserved to God alone. He assimilated this notion of intellectual intuition to ideas found in Confucianism, Buddhism, and Daoism and attempted to expand it into a metaphysical system that would establish the objectivity of moral values and the possibility of sagehood.

Mou’s Collected Works run to thirty-three volumes and extend to history of philosophy, logic, epistemology, ontology, metaethics, philosophy of history, and political philosophy. His corpus is an unusual hybrid in that, although its main aim is to erect a metaphysical system, many of the books in which Mou pursues that end consist largely of cultural criticism or histories of Confucian, Buddhist, or Daoist philosophy, in which Mou explains his own views through exegesis of other thinkers’ works, in a terminology appropriated from Kant, Tiantai Buddhism, and Neo-Confucianism.

Table of Contents

  1. Biography
  2. Cultural Thought
    1. Chinese Philosophy and the Chinese Nation
    2. Development of Chinese Philosophy
    3. History of Chinese Philosophy: Confucianism, Buddhism, and Daoism
  3. Metaphysical Thought
    1. Intellectual Intuition and Things-in-Themselves
    2. Two-Level Ontology
    3. Perfect Teaching
  4. Criticisms
  5. Influence
  6. References and Further Reading

1. Biography

Mou was born in 1909, at the very end of China’s imperial era, into the family of a rural innkeeper who admired Chinese classical learning, which the young Mou came to share. At just that time, however, traditional Chinese learning was being denigrated by some of the intellectual elite, who searched frantically for something to replace it due to fears that their own tradition was dangerously impotent against modern nation-states armed with Western science, technology, bureaucracy, and finance.

In 1929, Mou enrolled in Peking (Beijing) University’s department of philosophy. He embarked on a deep study of Russell and Whitehead’s Principia Mathematica (then a subject of great interest in Chinese philosophical circles) and also of Whitehead’s Process and Reality. His interest in Whiteheadian process thought also led him to read voraciously in the pre-modern literature on the Yijing (I Ching or Book of Changes), and his classical tastes made him an oddball at Peking University, which was vigorously modernist and anti-traditional. Mou would have been a lonely figure there had he not met Xiong Shili in his junior year. Xiong was just making his name as a nationally known apologist for traditional Chinese philosophy and mentored Mou for years afterward. The two men remained close until Mou later left the mainland. Mou graduated from Peking University in 1933 and moved around the country unhappily from one short-term teaching job to another, owing to frequent workplace personality conflicts and to fighting between the Chinese government and both Japanese and Communist forces. Despite these peregrinations, Mou wrote copiously on logic and epistemology and also on the Yijing. Mou was a vociferous opponent of the Communists, and when they took over China in 1949 he moved to Taiwan and spent the next decade teaching and writing in a philosophical vein about the history and future of Chinese political thought and culture. In 1960, once again unhappy with his colleagues, Mou was invited by his friend Tang Junyi, also a student of Xiong, to leave Taiwan for academic employment in Hong Kong. There, Mou’s work took a decisive turn and entered what he later considered its mature stage, yielding the books which established him as a key figure in modern Chinese philosophy.
During the next thirty-five years he published seven major monographs (one running to three volumes), together with translations of all three of Immanuel Kant’s Critiques and many more volumes’ worth of articles, lectures, and occasional writings. Mou officially retired from the Chinese University of Hong Kong in 1974 but continued to teach and lecture in Hong Kong and Taiwan until his death in 1995.

We can divide Mou’s expansive thought into two closely related parts which we might call his “cultural” thought, about the history and destiny of Chinese culture, and his “metaphysical” thought, concerning the problems and doctrines of Chinese philosophy, which Mou thought of as the essence of Chinese culture.

2. Cultural Thought

a. Chinese Philosophy and the Chinese Nation

Like most intellectuals of his generation, Mou was a nationalist and saw his work as a way to strengthen the Chinese nation and return it to a place of greatness in the world.

Influenced by Hegel, Mou thought of the history and culture of the Chinese nation as an organic whole, with a natural and knowable course of development. He thought that China’s political destiny depended ultimately on its philosophy, and in turn he blamed China’s conquest by the Manchus in 1644, and again by the Communists in 1949, on its loss of focus on the Neo-Confucianism of the Song through Ming dynasties (960-1644). He hoped his own work would help to re-kindle China’s commitment to Confucianism and thereby help defeat Communism.

Along with many of his contemporaries, Mou was interested in why China never gave rise to its own Scientific Revolution like that in the West, and also in articulating the respects in which China’s intellectual history outstripped the West. Mou phrased his answer to these two questions in terms of “inner sagehood” (neisheng) and “outer kingship” (waiwang). By “inner sagehood,” Mou meant the cultivation of moral conduct and outlook—at which he thought Confucianism was unequalled in the world—and he used “outer kingship” to encompass political governance along with other ingredients in the welfare of society, such as a productive economy and scientific and technological know-how. Mou thought that China’s classical tradition was historically weak in the theory and practice of “outer kingship,” and he believed that Chinese culture would have to transform itself thoroughly by discovering in its indigenous learning the resources with which to develop traditions of science and democracy.

For the last thirty years of his life, all of Mou’s writings were part of a conversation with Immanuel Kant, whom he considered the greatest Western philosopher and the one most useful for exploring Confucian “moral metaphysics.” Throughout that half of his career, Mou’s books quoted Kant extensively (sometimes for pages at a time) and appropriated Kantian terms into the service of his reconstructed Confucianism. Scholars agree widely that Mou altered the meanings of these terms significantly when he moved them from Kant’s system into his own, but opinions vary about just how conscious Mou was that he had done so.

b. Development of Chinese Philosophy

As an apologist for Chinese philosophy, Mou was anxious to show that its genius extended across almost all of its areas and epochs. To this end, he wrote extensively about the history of Chinese philosophy, highlighting the important contributions of Daoist and Buddhist philosophy and their harmonious interaction with Confucian philosophy in the dialectical unfolding and refinement of Chinese philosophy in each age.

Before China was united under the Qin dynasty (221-206 BCE), Mou believed, Chinese culture gave definitive form to the ancient philosophical inheritance of the late Zhou dynasty (771-221 BCE), culminating in the teachings of Confucius and Mencius. These were still epigrammatic rather than systematic, but Mou thought that they already contained the essence of later Chinese philosophy in germinal form. The next great phase of development came in the Wei-Jin period (265-420 CE) with the assimilation of “Neo-Daoism” or xuanxue (literally “dark” or “mysterious learning”), whence came the first formal articulation of the “perfect teaching” (concerning which see below). Shortly afterward, Mou taught, Chinese culture experienced its first great challenge from abroad, in the form of Buddhist philosophy. In the Sui and Tang dynasties (589-907) it was the task of Chinese philosophy to “digest” or absorb Buddhist philosophy into itself. In the process, it gave birth to indigenous Chinese schools of Buddhist philosophy (most notably Huayan and Tiantai), which Mou believed advanced beyond the Indian schools because they agreed with essential tenets of native Chinese philosophy, such as the teaching that all people are endowed with an innately sagely nature.

On Mou’s account, in the Song through Ming dynasties (960-1644) Confucianism underwent a second great phase of development and reasserted itself as the true and proper leader of Chinese culture and philosophy. However, at the end of the Ming, there occurred what Mou taught was an alien irruption into Chinese history by the Manchus, who thwarted the “natural” course of China’s cultural unfolding. In the thinking of late Ming philosophers such as Huang Zongxi (1610-1695), Gu Yanwu (1613-1682), and Wang Fuzhi (1619-1692), Mou thought that China had been poised to give rise to its own new forms of “outer kingship” which would have eventuated in an indigenous birth of science and democracy in China, enabling China to compete with the modern West. Chinese culture, however, was diverted from that healthy course of development by the intrusion of the Manchus. Mou believed that it was this diversion which made China vulnerable to the Communist takeover of the 20th century by alienating it from its own philosophical tradition.

Mou preached that the mission of modern Chinese philosophy was to achieve a mutually beneficial conciliation with Western philosophy. Inspired by the West’s example, China would appropriate science and democracy into its native tradition, and the West in turn would benefit from China’s unparalleled expertise in “inner sagehood,” typified especially by its “perfect teaching.”

However, throughout Mou’s lifetime, he remained unimpressed with the actual state of contemporary Chinese philosophy. He seldom rated any contemporary Chinese thinkers as worthy of the name “philosopher,” and he mentioned Chinese Marxist thought only rarely and only as a force entirely antipathetic to true Chinese philosophy.

c. History of Chinese Philosophy: Confucianism, Buddhism, and Daoism

In his historical writings on Confucianism, Mou is most famous for his thesis of the “three lineages” (san xi). Whereas Neo-Confucians were traditionally grouped into a “School of Principle” represented chiefly by Zhu Xi (1130-1200) and a “School of Mind” associated with Lu Xiangshan (1139-1192) and Wang Yangming (1472-1529), Mou also recognized a third lineage, exemplified by such lesser-known figures as Hu Hong (Hu Wufeng) (1105-1161) and Liu Jishan (Liu Zongzhou) (1578-1645).

Mou judged this third lineage to be the true representative of Confucian orthodoxy. He criticized Zhu Xi, conventionally regarded as the authoritative synthesizer of Neo-Confucian doctrine, as a usurper who, despite good intentions, depicted heavenly principle (tianli) in an excessively transcendent way that was foreign to the ancient Confucian message. On Mou’s view, that message is one of paradoxical “immanent transcendence” (neizai chaoyue), in which heavenly principle and human nature are only lexically distinct from each other, not substantially separate. It was because Mou believed that the Hu-Liu lineage of Neo-Confucianism expressed this paradoxical relationship most accurately and artfully that he ranked it the highest or “perfect” (yuan) expression of Confucian philosophy. (See “Perfect Teaching” below.)

Because Mou wanted to revalorize the whole Chinese philosophical tradition, and not just its Confucian wing, he also wrote extensively on Chinese Buddhist philosophy. He maintained that Indian Buddhist philosophy had remained limited and flawed until it migrated to China, where it was leavened with what Mou saw as core principles of indigenous Chinese philosophy, such as a belief in the basic goodness of both human nature and the world. From Chinese Buddhist thought he adopted methodological ideas that he later applied to his own system. One of these was “doctrinal classification” (panjiao), a doxographic technique of reading competing philosophical systems as forming a dialectical progression of closer and closer approximations to a “perfect teaching” (yuanjiao), rather than as mutually incompatible contenders. Much as Mou discovered what he thought of as the highest expression of Confucian doctrine in the largely forgotten thinkers of the Hu-Liu lineage, he found what he thought of as their formal analog and philosophical precursor in the relatively obscure Tiantai school of Buddhism and its thesis of the identity of enlightenment and delusion.

Mou wrote far less about Daoism than about Confucianism or Buddhism, but at least in principle he regarded it too as an indispensable part of the Chinese philosophical heritage. Mou focused most on the “Inner Chapters” (neipian) of the Zhuangzi, especially the “Wandering Beyond” (Xiaoyao you) and the “Discussion on Smoothing Things Out” (Qi wu lun), and on the writings of the Wei-Jin commentators Guo Xiang (c. 252-312 CE) and Wang Bi (226-249 CE). Mou saw the Wei-Jin idea of “root and traces” (ji ben) in particular as an early forerunner of Tiantai Buddhist thinking central to its concept of the “perfect teaching.”

3. Metaphysical Thought

In his metaphysical writings, Mou was mainly interested in how moral value is able to exist and how people are able to know it. Mou hoped to show that humans can directly know moral value and indeed that such knowledge amounts to knowledge par excellence. In an inversion of one of Kant’s terms, he called this project “moral metaphysics” (daode de xingshangxue), meaning a metaphysics in which moral value is ontically primary. That is, a moral metaphysics considers that the central ontological fact is that moral value exists and is known or “intuited” by us more directly than anything else. Mou believed that Chinese philosophy alone has generated the insights necessary for constructing such a moral metaphysics. Kant (who represented for Mou the summit of Western philosophy) did not understand moral knowing, on Mou’s account, because he was fixated on theoretical and speculative knowledge. The transcendentalism of the Critique of Pure Reason, which supposes that we know a thing not directly but only through the distorting lenses of our mental apparatus, struck Mou as masterful in that context; but Kant wrong-headedly applied it to moral matters, where it is completely out of place.

a. Intellectual Intuition and Things-in-Themselves

For many, the most striking thing about Mou’s philosophy (and the hardest to accept) is his conviction that human beings possess “intellectual intuition” (zhi de zhijue), a direct knowledge of reality without resort to the senses and without overlaying such sensory forms and cognitive categories as time, space, number, and cause and effect.

As with most of the terms he borrowed from Kant, Mou attached a very different meaning to ‘intellectual intuition.’ For Kant, intellectual intuition was a capacity belonging to God alone. Mou, however, thought this was Kant’s greatest mistake and concluded that one of the great contributions of Chinese philosophy to the world was its unanimous belief that humans have intellectual intuition. In the context of Chinese philosophy he took ‘intellectual intuition’ as an umbrella term for the various Confucian, Buddhist, and Daoist concepts of a supra-mundane sort of knowing that, when perfected, makes its possessor a sage or a buddha.

On Mou’s analysis, though Confucian, Buddhist, and Daoist philosophers call intellectual intuition by different names and theorize it differently, they agree that it is available to everyone, that it transcends subject-object duality, and that it is higher than, and prior to, the dualistic knowledge which comes through the “sensible intuition” of seeing and hearing, often called “empirical” (jingyan) or “grasping” (zhi) knowledge in Mou’s terminology. However, Mou taught that the Confucian understanding of intellectual intuition (referred to by a variety of names such as ren, “benevolence,” or liangzhi, “innate moral knowing”) is superior to the Buddhist and Daoist conceptions because it recognizes that intellectual intuition is essentially moral and creative.

Mou thought that people regularly manifest intellectual intuition in everyday life in the form of morally correct impulses and behaviors. To use the classic Mencian example, if we see a child about to fall down a well, we immediately feel alarm. For Mou, this sudden upsurge of concern is an occurrence of intellectual intuition, spontaneous and uncaused. Furthermore, Mou endorsed what he saw as the Confucian doctrine that it is this essentially moral intellectual intuition that “creates” or “gives birth to the ten thousand things” (chuangsheng wanwu) by conferring on them moral value.

b. Two-Level Ontology

Through our capacity for intellectual intuition, Mou taught, human beings are “finite yet infinite” (youxian er wuxian). He accepted Kant’s system as a good analysis of our finite aspect, which is to say our experience as beings who are limited in space and time and also in understanding, but he also thought that in our exercise of intellectual intuition we transcend our finitude as well.

Accordingly, Mou went to great pains to explain how the world of sensible objects and the realm of noumenal objects, or objects of intellectual intuition, are related to each other in a “two-level ontology” (liangceng cunyoulun) inspired by the Chinese Buddhist text The Awakening of Faith. In this model, all of reality is said to consist of mind, but a mind which has two aspects (yixin ermen). As intellectual intuition, mind directly knows things-in-themselves, without mediation by forms and categories and without the illusion that things-in-themselves are truly separate from mind. This upper level of the two-level ontology is what Mou labels “ontology without grasping” (wuzhi cunyoulun), once again choosing a term of Buddhist inspiration. However, mind also submits itself to forms and categories in a process that Mou calls “self-negation” (ziwo kanxian). Mind at this lower level, which Mou terms the “cognitive mind” (renzhi xin), employs sensory intuition and associated cognitive processes to apprehend things as discrete objects, separate from each other and from mind, having location in time and space, numerical identity, and causal and other relations. This lower ontological level is what Mou calls “ontology with grasping” (zhi de cunyoulun).

c. Perfect Teaching

On Mou’s view, all of the many types of Chinese philosophy he studied taught some version of this doctrine of intellectual intuition, things-in-themselves, and phenomena, and he considered it important to explain how he adjudicated among these many broadly similar strains of philosophy. To that end, he borrowed from Chinese Buddhist scholasticism the concept of a “perfect teaching” (yuanjiao) and the practice of classifying teachings (panjiao) doxographically in order to rank them from less to more “perfect” or complete.

A perfect teaching, in Mou’s sense of the term, is distinguished from a penultimate one not by its content (which is the same in either case) but by its rhetorical form. Specifically, a perfect teaching is couched in the form of a paradox (guijue). In Mou’s opinion, all good examples of Chinese philosophy acknowledge the commonsense difference between subject and object but also teach that we can transcend that difference through the exercise of intellectual intuition. But what distinguishes a perfect teaching is that it makes a show of flatly asserting, in a way supposed to surprise the listener, that subject and object are simply identical to one another, without qualification.

Mou developed this formal concept of a perfect teaching from the example offered by Tiantai Buddhist philosophy. He then applied it to the history of Confucian thought, identifying what he thought of as a Confucian equivalent to the Tiantai perfect teaching in the writings of Cheng Hao (1032-1085), Hu Hong, and Liu Jishan. Though Mou believed that all three families of Chinese philosophy—Confucian, Buddhist, and Daoist—had their own versions of a perfect teaching, he rated the Confucian perfect teaching more highly than the Buddhist or Daoist ones, for it gives an account of the “morally creative” character of intellectual intuition, which he thought essential. In the Confucian perfect teaching, he taught, intellectual intuition is said to “morally create” the cosmos in the sense that it gives order to chaos by making moral judgments and thereby endowing everything in existence with moral significance.

Mou claimed that the perfect teaching was a unique feature of Chinese philosophy and reckoned this a valuable contribution to world philosophy because, in his opinion, only a perfect teaching supplied an answer to what he called the problem of the “summum bonum” (yuanshan) or “coincidence of virtue and happiness” (defu yizhi), that is, the problem of how it can be assured that a person of virtue will necessarily be rewarded with happiness. He noted that in Kant’s philosophy (and, in his opinion, throughout the rest of Western philosophy too) there could be no such assurance that virtue would be crowned with happiness except to hope that God would make it so in the afterlife. By contrast, Mou was proud to say, Chinese philosophy provides for this “coincidence of virtue and happiness” without having to posit either a God or an afterlife. The argument for that assurance differed in Confucian, Buddhist, and Daoist versions of the perfect teaching, but in each case it consisted of a doctrine that intellectual intuition (equivalent to “virtue”) necessarily entails the existence of the phenomenal world (which Mou construed as the meaning of “happiness”), without being contingent on God’s intervention in this world or the next or on any condition other than the operation of intellectual intuition, which Mou considered available to all people at all times.

In arguing for the historical absence of such a solution to the problem of the perfect good anywhere outside of China, Mou did acknowledge that Epicurean and Stoic philosophers also tried to establish that virtue resulted in happiness, but he claimed that their explanations only worked by redefining either virtue or happiness in order to reduce its meaning to something analytically entailed by the other. However, some critics have argued that Mou’s alternative commits the same fault by effectively collapsing happiness into virtue.

4. Criticisms

Mou has often been accused of irrationalism on account of his doctrine of direct, supra-sensory intellectual intuition, which holds that people can apprehend the deeper reality underlying the mere phenomena that are measured and described by scientific knowledge. Mou has also been ridiculed because he did not so much present positive arguments for his main metaphysical beliefs as propound them as definitive facts, presumably known to him through a privileged access to sagely intuition.

Critics also frequently question the relevance of Mou’s philosophy, both to the Confucian tradition from which he took his inspiration and to Chinese society. They point out that Mou’s thought (like that of other inheritors of Xiong Shili’s legacy) inhabits a far different social context than the Confucian tradition with which he identifies. With Mou and his generation, Chinese philosophy was detached from its old homes in the traditional schools of classical learning (shuyuan), the Imperial civil service, and monasteries and hermitages, and was transplanted into the new setting of the modern university, with its disciplinary divisions and limited social role. Critics point to this academicization as evidence that, despite Mou’s aspirations to kindle a massive revival of the Confucian spirit in China, his thought risks being little more than a “lost soul,” deracinated and intellectualized. The first problem, they claim, is that Mou reduces Confucianism to a philosophy in the modern academic sense and leaves out other important aspects of the pre-modern Confucian cultural system, such as its art, literature, and ritual and its political and career institutions. Second, they claim, because Mou’s brand of Confucianism accents metaphysics so heavily, it remains confined to departments of philosophy and powerless to exert any real influence over Chinese society.

Mou has also been criticized for his explicit essentialism. In keeping with his Hegelian tendencies, he presented China as consisting essentially of Chinese culture, and more particularly of Chinese philosophy, which he claimed in turn is epitomized by Confucian philosophy. Furthermore, he presented the Confucian tradition as consisting essentially of an idiosyncratic-looking list of Confucian thinkers. Opponents complain that even if there were good reasons for Mou to enshrine his handful of favorite Confucians as the very embodiment of all of Chinese culture, this would remain Mou’s opinion and nothing more, a mere interpretation rather than the objective, factual historical insight that Mou claimed it was.

5. Influence

In Mou’s last decades, he began to be recognized together with other prominent students of Xiong Shili as a leader of what came to be called the “New Confucian” (dangdai xin rujia) movement, which aspires to revive and modernize Confucianism as a living spiritual tradition. Through his many influential protégés, Mou achieved great influence over the agenda of contemporary Chinese philosophy.

Two of his early students, Liu Shu-hsien (b. 1934) and Tu Wei-ming (b. 1940), have been especially active in raising the profile of contemporary Confucianism in English-speaking venues, as has the Canadian-born scholar John Berthrong (b. 1946). Mou’s emphasis on Kant’s transcendental analytic gave new momentum to research on Kant and the post-Kantians, particularly in the work of Mou’s student Lee Ming-huei (b. 1953), and his writings on Buddhism lie behind much of the interest in and interpretation of Tiantai philosophy among Chinese scholars. Finally, Mou functions as the main modern influence on, and point of reference for, the intense research on Confucianism among mainland Chinese philosophers today.

6. References and Further Reading

  • Angle, Stephen C. Sagehood: The Contemporary Significance of Neo-Confucian Philosophy. New York: Oxford University Press, 2009.
  • Angle, Stephen C. Contemporary Confucian Political Philosophy. Cambridge: Polity, 2012.
  • Asakura, Tomomi. “On Buddhistic Ontology: A Comparative Study of Mou Zongsan and Kyoto School Philosophy.” Philosophy East and West 61/4 (October 2011): 647-678.
  • Berthrong, John. “The Problem of Mind: Mou Tsung-san’s Critique of Chu Hsi.” Journal of Chinese Religion 10 (1982): 367-394.
  • Berthrong, John. All Under Heaven. Albany: State University of New York Press, 1994.
  • Billioud, Sébastien. “Mou Zongsan’s Problem with the Heideggerian Interpretation of Kant.” Journal of Chinese Philosophy 33/2 (June 2006): 225-247.
  • Billioud, Sébastien. Thinking through Confucian Modernity: A Study of Mou Zongsan’s Moral Metaphysics. Leiden and Boston: Brill, 2011.
  • Bresciani, Umberto. Reinventing Confucianism: The New Confucian Movement. Taipei: Taipei Ricci Institute for Chinese Studies, 2001.
  • Bunnin, Nicholas. “God’s Knowledge and Ours: Kant and Mou Zongsan on Intellectual Intuition.” Journal of Chinese Philosophy 35/4 (December 2008): 613-624.
  • Chan, N. Serina. “What is Confucian and New about the Thought of Mou Zongsan?” in New Confucianism: A Critical Examination, ed. John Makeham (New York: Palgrave Macmillan, 2003), 131-164.
  • Chan, N. Serina. The Thought of Mou Zongsan. Leiden and Boston: Brill, 2011.
  • Chan, Wing-Cheuk. “Mou Zongsan on Zen Buddhism.” Dao: A Journal of Comparative Philosophy 5/1 (2005): 73-88.
  • Chan, Wing-Cheuk. “Mou Zongsan’s Transformation of Kant’s Philosophy.” Journal of Chinese Philosophy 33/1 (March 2006): 125-139.
  • Chan, Wing-Cheuk. “Mou Zongsan and Tang Junyi on Zhang Zai’s and Wang Fuzhi’s Philosophies of Qi: A Critical Reflection.” Dao: A Journal of Comparative Philosophy 10/1 (March 2011): 85-98.
  • Chan, Wing-Cheuk. “On Mou Zongsan’s Hermeneutic Application of Buddhism.” Journal of Chinese Philosophy 38/2 (June 2011): 174-189.
  • Chan, Wing-Cheuk and Henry C. H. Shiu. “Introduction: Mou Zongsan and Chinese Buddhism.” Journal of Chinese Philosophy 38/2 (June 2011): 169-173.
  • Clower, Jason. The Unlikely Buddhologist: Tiantai Buddhism in the New Confucianism of Mou Zongsan. Leiden and Boston: Brill, 2010.
  • Clower, Jason. “Mou Zongsan on the Five Periods of the Buddha’s Teaching.” Journal of Chinese Philosophy 38/2 (June 2011): 190-205.
  • Guo Qiyong. “Mou Zongsan’s View of Interpreting Confucianism by ‘Moral Autonomy’.” Frontiers of Philosophy in China 2/3 (June 2007): 345-362.
  • Kantor, Hans-Rudolf. “Ontological Indeterminacy and Its Soteriological Relevance: An Assessment of Mou Zongsan’s (1909-1995) Interpretation of Zhiyi’s (538-597) Tiantai Buddhism.” Philosophy East and West 56/1 (January 2006): 16-68.
  • Kwan, Chun-Keung. “Mou Zongsan’s Ontological Reading of Tiantai Buddhism.” Journal of Chinese Philosophy 38/2 (June 2011): 206-222.
  • Lin, Chen-kuo. “Dwelling in Nearness to the Gods: The Hermeneutical Turn from MOU Zongsan to TU Weiming.” Dao 7 (2008): 381-392.
  • Lin, Tongqi and Zhou Qin. “The Dynamism and Tension in the Anthropocosmic Vision of Mou Zongsan.” Journal of Chinese Philosophy 22/4 (December 1995): 401-440.
  • Liu, Shu-hsien. “Mou Tsung-san (Mou Zongsan)” in Encyclopedia of Chinese Philosophy, ed. A.S. Cua. New York: Routledge, 2003.
  • Mou, Tsung-san [Mou Zongsan]. “The Immediate Successor of Wang Yang-ming: Wang Lung-hsi and his Theory of Ssu-wu.” Philosophy East and West 23/1-2 (1973): 103-120.
  • Neville, Robert C. Boston Confucianism: Portable Tradition in the Late-Modern World. Albany: State University of New York Press, 2000.
  • Schmidt, Stephan. “Mou Zongsan, Hegel, and Kant: The Quest for Confucian Modernity.” Philosophy East and West 61/2 (April 2011): 260-302.
  • Shiu, Henry C.H. “Nonsubstantialism of the Awakening of Faith in Mou Zongsan.” Journal of Chinese Philosophy 38/2 (June 2011): 223-237.
  • Tang, Andres Siu-Kwong. “Mou Zongsan’s ‘Transcendental’ Interpretation of Huayan Buddhism.” Journal of Chinese Philosophy 38/2 (June 2011): 238-256.
  • Tu, Wei-ming. Centrality and Commonality: An Essay on Confucian Religiousness. Albany: State University of New York Press, 1989.
  • Tu, Xiaofei. “Dare to Compare: The Comparative Philosophy of Mou Zongsan.” Kritike 1/2 (December 2007): 24-35.
  • Zheng Jiadong. “Mou Zongsan and the Contemporary Circumstances of the Rujia.” Contemporary Chinese Thought 36/2 (Winter 2004-5): 67-88.
  • Zheng Jiadong. “Between History and Thought: Mou Zongsan and the New Confucianism That Walked Out of History.” Contemporary Chinese Thought 36/2 (Winter 2004-5): 49-66.

Author Information

Jason Clower
Email: jclower@csuchico.edu
California State University, Chico
U. S. A.

Jean Bodin (c. 1529—1596)

The humanist philosopher and jurist Jean Bodin was one of the most prominent political thinkers of the sixteenth century. His reputation is largely based on his account of sovereignty, which he formulated in the Six Books of the Commonwealth. Bodin lived at a time of great upheaval, when France was ravaged by the wars of religion between the Catholics and the Huguenots. He was convinced that peace could be restored only if the sovereign prince was given absolute and indivisible power over the state. Bodin nevertheless believed that different religions could coexist within the commonwealth, and his tolerance in religious matters has often been emphasized. He was also one of the first men to have opposed slavery.

Bodin was extremely erudite, and his works discuss a wide variety of topics, extending from natural philosophy and religion to education, political economy, and historical methodology. Natural philosophy and religion were intimately correlated for Bodin. Furthermore, he sought to reform the judicial system of France, and he formulated one of the earliest versions of the quantitative theory of money. Bodin also believed in the existence of angels and demons; his works cover topics such as demonology and witchcraft, and include extensive passages on astrology and numerology.

Table of Contents

  1. Life and Career
  2. Method for the Easy Comprehension of History
    1. Methodology for the Study of History
    2. Theory of Climates
  3. The Six Bookes of a Commonweale
    1. Concept of Sovereignty
    2. Definition of Law
    3. Limitations upon the Authority of the Sovereign Prince
    4. Difference between Form of State and Form of Government
    5. The Question of Slavery
  4. Bodin’s Economic Thought
    1. Quantitative Theory of Money
    2. The State’s Finances and the Question of Taxation and Property Rights
  5. Writings Concerning Religion
    1. Colloquium heptaplomeres and the Question of Religious Tolerance
    2. The Question of True Religion and Bodin’s Personal Faith
  6. On Witchcraft
  7. Natural Philosophy
  8. Other Works
    1. Juris universi distributio
    2. Moral Philosophy
    3. Writings on Education
    4. Bodin’s Surviving Correspondence
  9. Influence
  10. References and Further Reading
    1. Primary Sources
      1. Modern Editions of Bodin’s works
        1. Collected Works
        2. Individual Works
    2. Secondary Sources
      1. Bibliography
      2. Conference Proceedings and Article Collections

1. Life and Career

Jean Bodin’s last will and testament, dated 7 June 1596, states that he was 66 years old when he died. He was therefore born in either 1529 or 1530, the youngest of seven children, four of whom were girls. Bodin’s father, Guillaume Bodin, was a wealthy merchant and a member of the bourgeoisie of Angers. Very little is known of his mother beyond the fact that her name was Catherine Dutertre and that she died before 1561.

Bodin joined the Carmelite brotherhood at an early age, though surviving documents tell us that he was released from his vows a few years later. He is known to have studied, and later taught, law at the University of Toulouse during the 1550s. Bodin was unable to obtain a professorship there, which may have driven him away from Toulouse and from academic life. During the 1560s, he worked as an advocate at the Parlement of Paris.

Bodin’s first major work, the Method for the Easy Understanding of History (Methodus ad facilem historiarum cognitionem) was published in 1566, the same year that saw the death of his father. Bodin’s most famous work, the Six Books of the Commonwealth (Six livres de la République) was published ten years later, in 1576. In 1570, Bodin was commissioned by the French King Charles IX for the reformation of forest tenures in Normandy. He was at the very heart of French political power in the 1570s – first during the reign of Charles IX and also, after Charles’ death in 1574, during the reign of his brother, Henri III. In 1576, Bodin lost the favor of King Henri III after he opposed, among other things, the king’s fiscal policies during the States General of Blois where Bodin served as representative for the third estate of Vermandois.

Bodin settled in Laon during the last two decades of his life, having moved there shortly after marrying Françoise Trouilliart (or Trouillard), the widow of a Laon official, in 1576. Bodin sought employment with the Duke of Alençon, the king’s youngest brother, who aspired to marry Queen Elizabeth of England, and accompanied the duke on one of his trips to London. In 1582, Bodin followed Alençon to Antwerp, where Alençon sided with the Low Countries in their revolt against Spain. Bodin was appointed Master of Requests and counselor (maître des requêtes et conseiller) to the duke in 1583. He retired from national politics after Alençon’s sudden death in 1584. Following the death of his brother-in-law, Bodin succeeded him in office as procureur du roi, or Chief Public Prosecutor, for Laon in 1587.

Bodin wrote two notable works toward the end of his life. His Colloquium of the Seven about Secrets of the Sublime (Colloquium heptaplomeres de rerum sublimium arcanis abditis) is an engaging dialogue in favor of religious tolerance, and his main contribution in the field of natural philosophy, the Theater of Nature (Universae naturae theatrum), was first published in 1596, the same year that Bodin died of the plague. He was given a Catholic burial in the Franciscan church of Laon.

2. Method for the Easy Comprehension of History

Bodin’s Methodus ad facilem historiarum cognitionem (Method for the Easy Comprehension of History) was first published in 1566 and revised in 1572. It is Bodin’s first important work and contains many of the ideas that are developed further in his other key systematic works: human history, natural history, and divine history are later elaborated in the République, the Theatrum, and the Colloquium heptaplomeres respectively. Bodin’s purpose in writing the Methodus was to set out the art and method to be used in the study of history. His desire to elaborate a system and to synthesize all existing knowledge is easily detectable in the Methodus.

The first four chapters of the Methodus are largely a discussion of methodology. Chapter I defines history and its different categories. Chapters II and III discuss the order in which historical accounts are to be read and the correct order for arranging all material. Chapter IV, on the choice of historians, may be considered an exposition of Bodin’s method for the critical study of history: the student of history should move from generalized accounts to more detailed narratives. Reading should begin from the earliest times of recorded history, and the reader should progress naturally toward more recent times. In order to obtain a thorough comprehension of the whole, certain other subjects – cosmography, geography, chorography, topography and geometry – are to be associated with the study of history. All material should be critically assessed; the background and training of historians must be taken into account, as well as their qualifications.

In order, then, that the truth of the matter may be gleaned from histories, not only in the choice of individual authors but also in reading them we must remember what Aristotle sagely said, that in reading history it is necessary not to believe too much or disbelieve flatly (…) If we agree to everything in every respect, often we shall take true things for false and blunder seriously in administering the state. But if we have no faith at all in history, we can win no assistance from it. (Bodin 1945, 42)

Of the ten chapters that constitute the Methodus, Chapter Six is by far the lengthiest, covering more than a third of the book, and it may be considered as a blueprint for the République. Chapters VII to IX seek to refute erroneous interpretations of history. Bodin’s first rebuttal concerns the myth, based on a biblical prophecy, of the four monarchies or empires as it was emphasized by many German Protestant theologians. Bodin’s second criticism concerns the idea of a golden age (and the superiority of the ancients in comparison with moderns). Furthermore, Bodin refutes the error of those who claim the independent origin of races. The final chapter of the Methodus contains a bibliography of universal history.

a. Methodology for the Study of History

There are three kinds of history, Bodin writes: divine, natural, and human. The Methodus is an investigation into the third type, that is, the study of human actions and of the rules that govern them. Science is not concerned with particulars but with universals. Bodin therefore considers absurd the attempts of jurisconsults to establish principles of universal jurisprudence from Roman decrees or, more generally, from Roman law, thus giving preference to one legal tradition. Roman law concerns the legislation of one particular state, and the laws of particular states – the subject matter of civil law – change within a brief period of time. The correct study of law necessitates a different approach, one already described by Plato: the correct way to establish law and to govern a state is to bring together and compare the legal frameworks of all the states that have existed and compile the very best of them. Together with other so-called legal humanists, like Budé, Alciat, and Connan, Bodin held that the proper understanding of universal law could only be obtained by combining the studies of history and law.

Indeed, in history the best part of universal law lies hidden; and what is of great weight and importance for the best appraisal of legislation – the custom of the peoples, and the beginnings, growth, conditions, changes, and decline of all states – are obtained from it. The chief subject matter of this Method consists of these facts, since no rewards of history are more ample than those usually gathered around the governmental form of states. (Bodin 1945, 8)

Bodin writes that there are four kinds of interpreters of law. The most skilled among them are those who are

 …trained not only by precepts and forensic practice but also in the finest arts and the most stable philosophy, who grasp the nature of justice, not changeable according to the wishes of men, but laid down by eternal law; who determine skillfully the standards of equity; who trace the origins of jurisprudence from ultimate principles; who pass on carefully the knowledge of all antiquity; who, of course, know the power and the dominion of the emperor, the senate, the people, and the magistrates of the Romans; who bring to the interpretation of legislation the discussion of philosophers about laws and state; who know well the Greek and Latin languages, in which the statutes are set forth; who at length circumscribe the entire division of learning within its limits, classify into types, divide into parts, point out with words, and illustrate with examples. (Bodin 1945, 4-6)

b. Theory of Climates

The Theory of Climates is among Bodin’s best-known ideas. Bodin was not the first to discuss the topic; he owes much to classical authors like Livy, Hippocrates, Aristotle, and Tacitus, whom he references himself. He also borrows from his contemporaries – especially historians, travelers, and diplomats – like Commines, Machiavelli, Copernicus, and Jean Cardan. Bodin’s observations on climate differed from those of his medieval predecessors, since Bodin was first and foremost interested in the practical implications of the theory: a correct understanding of the laws of the environment must be thought of as the starting point for all policy, laws, and institutions (Tooley 1953, 83). Bodin believed that climate and other geographical factors influence, although they do not necessarily determine, the temperament of any given people. Accordingly, the form of state and legislation needs to be adapted to the temperament of the people and the territory that it occupies.

Three different accounts of the Theory of Climates are found in Bodin’s writings. The earliest version is in Chapter Five of the Methodus. Although this passage contains the general principles of the theory, Bodin does not relate them to contemporary politics. It is in the first chapter of the fifth book of the République that the theory of climates is further amplified, and its relationship to contemporary politics established. Moreover, the Latin translation of the République contains a few notable additions to the theory.

According to Bodin, no one who has written about states has ever considered the question of how to adapt the form of a state to the territory where it is situated (near the sea or the mountains, etc.), or to the natural aptitudes of its people. Bodin holds that, amid the uncertainty and chaos of human history, natural influences provide us with a sure criterion for historical generalization. These stable and unchanging natural influences have a dominant role in molding the personality, physique, and historical character of peoples (Brown 1969, 87-88). This naturalistic approach is, to some extent, obscured by Bodin’s belief in astrology and numerology. Racial peculiarities, the influence of the planets and Pythagorean numbers were all part of Renaissance Platonism. Bodin combined these ideas with geographic determinism that closely followed the theories of Hippocrates and Strabo. (Bodin 1945, xiii)

Ptolemy divided the world into arctic, temperate, and tropic zones. In adopting the Ptolemaic zones, Bodin divided the earth into areas of thirty degrees from the equator northward. Each people has its own capabilities and weaknesses. Southern peoples are contemplative and religious by nature; they are wise but lack energy. Northern peoples, on the other hand, are active and large in stature but lack sagacity. The people of the South are intellectually gifted and thus resemble old men, while the Northern peoples, because of their physical qualities, remind us of youth. Those who live in between these two regions – the men of the temperate zone – lack the excesses of the other two while being endowed with their better qualities. They may therefore be described as men in middle life – prudent and therefore gifted to become executives and statesmen. They are the Aristotelian mean between two extremes. The superiority of this third group is stressed by Bodin throughout his writings.

3. The Six Bookes of a Commonweale

Bodin’s most prominent contribution in the field of political philosophy was first published in 1576, and in his own Latin translation a decade later. Significant differences exist between the French and Latin versions of the text. Translations into other languages soon followed: Italian (1588), Spanish (1590), German (1592), and English (1606). The République must be considered, at least partially, as Bodin’s response to the most important political crisis in France during the sixteenth century: the French wars of religion (1562-1598). It was written as a defense of the French monarchy against the so-called Monarchomach writers, among them François Hotman (1524-1590), Theodore Beza (1519-1605) and the author of the Vindiciae contra tyrannos. The Monarchomach writers called for tyrannicide and considered it the role of the magistrates and the Estates General to limit the sovereign power of the ruler, holding that this power was originally derived from the people.

Bodin published three different prefaces to the République. The first is an introduction found in all French editions. The second is a prefatory letter in Latin that appears in the French editions from 1578 onwards. The third preface is an introduction to the Latin editions. These three prefaces were an opportunity for Bodin to defend his work against writers who had attacked it. They give us an account of how Bodin’s opinions developed during the years that followed the publication of the République. In 1580, Bodin answered his detractors in a work entitled Apologie de René Herpin pour la République de Jean Bodin. René Herpin was a pseudonym used by Bodin.

The first book of the République discusses the principal ends and aims of the state, its different elements, and the nature and defining marks of sovereign power. In the second book, Bodin discusses different types of states (democracy, aristocracy, and monarchy) and concludes that there cannot exist a mixed state. In Chapter Five, Bodin examines the conditions under which a tyrant, that is, an illegitimate ruler who does not possess sovereign power, may be rightfully killed. A legitimate monarch, on the other hand, may not be resisted by his subjects – even if he should act in a tyrannical manner.

Book Three discusses the different parts of the state: the senate and its role, the role of magistrates and their relationship to sovereign power, and the different degrees of authority among magistrates. Colleges, corporations and universities are also defined and considered. The origin, flourishing and decline of states, and the reasons that influence these changes are the subject of Book Four. Book Five begins with an exposition of the Theory of Climate: laws of the state and the form of government are to be adapted to the nature of each people. Bodin then discusses the climatic variations between the North and South, and how these variations affect the human temperament. The final book of the République opens with the question of census and censorship, that is, the assessment of each individual’s belongings, and the advantages that can be derived from it. Chapters Two and Three discuss the state’s finances, and the problem of debasement of the coinage. Chapter Four is a comparison of the three forms of state; Bodin argues that royal, or hereditary (as opposed to elective), monarchy is the best form of state. The Salic law, or law of succession to the throne, is discussed: Bodin holds that the rule of women is against divine, natural, and human law. The Salic law, together with a law forbidding alienation of the public domain, called Agrarian law in the Methodus (Bodin 1945, p. 253), is one of the two fundamental laws, or leges imperii (Fr. loix royales), which impose legal limitations upon the authority of the sovereign prince. Fundamental laws concern the state of the kingdom and are annexed to the crown, and the sovereign prince therefore cannot detract from them.

The concluding chapter of the République is a discussion concerning the principle of justice in the government of the state. Geometric, arithmetic, and harmonic justice are explained, as well as their relation to the different forms of state. A strong Platonic influence may be detected in the final chapter of the work: a wise ruler establishes harmony within the commonwealth, just as God has established harmony in the universe he has created. Every individual has their proper place and purpose in the commonwealth.

a. Concept of Sovereignty

The République opens with the following definition of a commonwealth: “A Commonweale is a lawfull government of many families, and of that which unto them in common belongeth, with a puissant soveraigntie.” (Bodin 1962, 1) (Fr. “République est un droit gouvernement de plusieurs ménages, et de ce qui leur est commun, avec puissance souveraine.” (Bodin 1583, 1) Lat. “Respublica est familiarum rerumque inter ipsas communium summa potestate ac ratione moderata multitudo.” (Bodin 1586, 1)) The meaning of sovereign power is further clarified in Chapter Eight of the first book:

Maiestie or Soveraigntie is the most high, absolute, and perpetuall power over the citisens and subiects in a Commonweale: which the Latins cal Maiestatem, the Greeks akra exousia, kurion arche, and kurion politeuma; the Italians Segnoria, and the Hebrewes tomech shévet, that is to say, The greatest power to command. (Bodin 1962, 84)

Having defined sovereignty, Bodin then defines the meaning of the terms “perpetual” and “absolute”. A person to whom sovereignty is given for a certain period of time, upon the expiration of which they once again become private citizens, cannot be called sovereign. When sovereign power is given to someone for a certain period of time, the person or persons receiving it are but the trustees and custodians of that power, and the sovereign power can be removed from them by the person or persons that are truly sovereign. Sovereignty, therefore, Bodin writes, “is not limited either in power, charge, or time certaine.” Absolute power is the power of overriding ordinary law, and it has no other condition than that which is commanded by the law of God and of nature:

But it behoveth him that is a soveraigne not to be in any sort subiect to the commaund of another … whose office it is to give laws unto his subiects, to abrogat laws unprofitable, and in their stead to establish other: which hee cannot do that is himselfe subiect unto laws, or to others which have commaund over him. And that is it for which the law saith, That the prince is acquitted from the power of the laws[.] (Bodin 1962, 91)

From this and similar passages Bodin derives the first prerogative of a sovereign prince of which he gives the following definition: “Let this be the first and chiefe marke of a soveraigne prince, to bee of power to give laws to all his subiects in generall and to everie one of them in particular … without consent of any other greater, equall, or lesser than himselfe” (Bodin 1962, 159). All other rights and prerogatives of sovereignty are included in the power of making and repealing laws, Bodin writes, and continues, “so that (to speak properly) a man may say, that there is but this only mark of soveraigne power considering that all other the rights thereof are contained in this”. The other prerogatives include declaring war and making peace, hearing appeals in the last instance, instituting and removing the highest officers, imposing taxes on subjects or exempting them, granting pardons and dispensations, determining the name, value, and measure of the coinage, and finally, requiring subjects to swear their loyalty to their sovereign prince.

Sovereignty and its defining marks or attributes are indivisible, and supreme power within the commonwealth must necessarily be concentrated in a single person or group of persons. Bodin argues that the first prerogative of a sovereign ruler is to give law to subjects without the consent of any other individual. It is from this definition that he derives the logical impossibility of dividing sovereignty, as well as the impossibility of the existence of a mixed state: if sovereignty, in other words the power to give law, were divided within the state, for example between the prince, the nobility, and the people, there would exist in the commonwealth not one, but several agents that possess the power to give law. In such a case, Bodin argues, no one can be called a subject, since all have power to make law. Additionally, no one would be able to give laws to others, since law-givers would be forced to receive law from those upon whom they wish to impose laws. The state would, therefore, be popular or democratic. In the revised Latin edition of the République the outcome of divided sovereignty is described as a state of anarchy, since no one would be willing to obey laws.

b. Definition of Law

Bodin writes that there is a great difference between law (Lat. lex; Fr. loi) and right (Lat. jus; Fr. droit). Law is the command of a sovereign prince who makes use of his power, while right implies that which is equitable. A right connotes something with a normative content; law, on the other hand, has no moral content or normative implications. Bodin writes:

We must presuppose that this word Law, without any other addition, signifieth The right command of him or them, which have soveraigne power above others, without exception of person: be it that such commaundement concerne the subiects in generall, or in particular: except him or them which have given the law. Howbeit to speake more properly, A law is the command of a Soveraigne concerning all his subiects in generall: or els concerning generall things, as saith Festus Pompeius, as a privilege concerneth some one, or some few[.] (Bodin 1962, 156)

c. Limitations upon the Authority of the Sovereign Prince

Although the sovereign prince is not bound by civil law—neither by the laws of his predecessors, which have force only as long as their maker is alive, unless ratified by the new ruler, nor by his own laws—he is not free to do as he pleases, for all earthly princes have the obligation to follow the law of God and of nature. Absolute power is power to override ordinary law, but all earthly princes are subject to divine and natural laws, Bodin writes. To contravene the laws of God, “under the greatnesse of whome all monarches of the world ought to beare the yoke, and to bow their heads in all feare and reverence”, and of nature is to commit treason and rebellion.

Contracts with Subjects and with Foreigners

Bodin mentions a few other things – besides the laws of God and of nature – that limit the sovereign prince’s authority. These include the prince’s contracts with his subjects and foreign princes, property rights of the citizens, and constitutional laws (leges imperii) of the realm. Regarding the difference between contracts and laws, Bodin writes that the sovereign prince is subject to the just and reasonable contracts that he has made, and in the observation of which his subjects have an interest, whilst laws obligate all subjects but not the prince. A contract between a sovereign prince and his subjects is mutually binding and it obligates both parties reciprocally. The prince, therefore, has no advantage over the subject on this matter. The prince must honor his contracts for three reasons: 1) Natural equity, which requires that agreements and promises be kept; 2) The prince’s honor and his good faith, since there is “no more detestable crime in a prince, than to bee false of his oath and promise”; and 3) The prince is the guarantor of the conventions and obligations that his subjects have with each other – it is therefore all the more important that the sovereign prince should render justice for his own act.

Fundamental Laws

Two fundamental laws (leges imperii) are discussed in the République. The first is the Salic law, or the law of succession to the throne. The Salic law guarantees the continuity of the crown, and determines the legitimate successor (see Franklin 1973, Chapter 5). The other fundamental law is the law against alienation of the royal domain, which Bodin calls “Agrarian law” in the Methodus. As Franklin has observed, “The domain was supposed to have been set aside in order to provide a king with a source of annual income normally sufficient to defray the costs of government” (1973, 73). If the domain is alienated, this means less income for the crown, and possibly increased taxation upon the citizens. Fundamental laws are annexed and united to the crown, and therefore the sovereign ruler cannot infringe them. But should the prince decide to do so, his successor can always annul that which has been done in prejudice of the fundamental laws of the realm.

Inviolability of Private Property

Finally, Bodin derives from both natural law and the Old Testament that the sovereign prince may not take the private property of his subjects without their consent, since this would mean violating the law of God and of nature. He writes: “Now then if a soveraigne prince may not remove the bounds which almightie God (of whom he is the living & breathing image) hath prefined unto the everlasting lawes of nature: neither may he take from another man that which is his, without iust cause” (Bodin 1962, 109-110). The only exception to the rule, the just causes that Bodin refers to in this passage, concerns situations where the very existence of the commonwealth is threatened. In such cases, public interest must be preferred over the private, and citizens must give up their private property in order to guarantee the safety and continuing existence of the commonwealth.

The preceding passage is one among many where the sovereign prince is described by Bodin as the “earthly image of God,” “God’s lieutenant for commanding other men,” or the person “to whom God has given power over us”. It is from this principle regarding the inviolability of private property that Bodin derives that new taxes may not be imposed upon citizens without their consent.

d. Difference between Form of State and Form of Government

Bodin holds that sovereignty cannot be divided – it must necessarily reside in one person or group of persons. Having shown that sovereignty is indivisible, Bodin moves on to refute the widely accepted political myth of the Renaissance that the Polybian model of a mixed state was the optimal form of state. Contrary to the opinions of Polybius, Aristotle, and Cicero, Bodin writes that there are only three types of state or commonwealth: monarchy, where sovereignty is vested with one person, aristocracy, where sovereignty is vested with a minority, and democracy, where sovereignty is vested in all of the people or a majority among them. Bodin’s denial of the possibility of dividing sovereignty directly results in the impossibility of a mixed state in the form that most Renaissance political theorists conceived it. It is with the help of historical and modern examples, most notably of Rome and Venice, that Bodin shows that the states that were generally believed to possess a mixed regime were not really so.

Even though Bodin rejects the idea that there are more than three types of commonwealth, he accepts that there is a variety of governments – that is, different ways to govern the state. The way that the state is governed in no way alters its form or its structure. Discussion concerning the difference between the form of state and government is found in Book Two. Bodin remarks that despite the importance of the question, no one before him has ever addressed it. All monarchies, aristocracies and popular states are either tyrannical, despotic, or legitimate (that is, royal). These are not different species of commonwealth, Bodin observes, but diverse ways of governing the state. Tyrannical monarchy is one in which the sovereign ruler violates the laws of God, oppresses his subjects and treats their private property as his own. Tyrannical monarchy must not be confused with despotic monarchy, Bodin writes. Despotic, or lordly, monarchy “is that where the prince is become lord of the goods and persons of his subiects, by law of arms and lawfull warre; governing them as the master of a familie doth his slaves.” Bodin holds that there is nothing unfitting in a prince who has defeated his enemies in a just war, and who governs them under the laws of war and the law of nations. Finally, royal or legitimate monarchy is one in which the subjects obey the laws of the sovereign prince, and the prince in his turn obeys the laws of God and of nature; natural liberty and the right to private property are secured to all citizens.

Although most of Bodin’s examples concern monarchy, he writes that “The same difference is also found in the Aristocratique and popular estate: for both the one and the other may be lawful, lordly, and tirannicall, in such sort as I have said” (Bodin 1962, 200). Bodin qualifies as “absurd” and “treasonable” opinions according to which the constitution of France is a mixture of the three types of state—the Parlement representing aristocracy, the Estates General democracy, and the King representing monarchy.

e. The Question of Slavery

The question of slavery is addressed in Book One, Chapter Five of the République. Bodin is recognized today as one of the earliest advocates of the abolition of slavery. For him, slavery was a universal phenomenon in the sense that slaves have existed in all parts of the world, and slavery was widely accepted by the droit des gens. Bodin writes that there are difficulties concerning slavery that have never been resolved. He wishes to answer the following question: “Is slavery natural and useful, or contrary to nature?”

Bodin opposes Aristotle’s opinion (Politics 1254a) according to which slavery is something natural – some people are born to govern and command, while it is the role of others to serve and obey. Bodin admits that “there is certain plausibility in the argument that slavery is natural and useful in the commonwealth.” After all, Bodin continues, the institution of slavery has existed in all commonwealths, and in all ages wise and good men have owned slaves. But if we are to consider the question according to commonly received opinions, thus allowing ourselves to be less concerned with philosophical arguments, we will soon understand that slavery is unnatural and contrary to human dignity.

Bodin’s opposition to slavery is manifold. First of all, he considers slavery in most cases to be unnatural, as the following passage attests: “I confesse that servitude is well agreeing unto nature, when a strong man, rich and ignorant, yeeldeth his obedience and service unto a wise, discreet and feeble poore man: but for wise men to serve fools, men of understanding to serve the ignorant, and the good to serve the bad; what can bee more contrarie unto nature?” (Bodin 1962, 34) Secondly, slavery is an affront to religion since the law of God forbids making any man a slave against their good will and consent. Thirdly, slavery is against human dignity, because of the countless indescribable humiliations that slaves have been forced to suffer. According to one interpretation, Bodin’s opposition to slavery must above all be understood within the context of his opinions concerning the commonwealth in that slavery poses a permanent threat to the stability of the state. Bodin relies on a historical narrative to prove that slavery is incompatible with a stable commonwealth (Herrel 1994, 56). Thus, in the following passage, he states:

Wherefore seeing it is proved by the examples of so many worlds of years, so many inconveniences of rebellions, servile warres, conspiracies eversions and changes to have happened unto Commonweals by slaves; so many murthers, cruelties, and detestable villanies to have bene committed upon the persons of slaves by their lords and masters: who can doubt to affirme it to be a thing most pernitious and daungerous to have brought them into a Commonweale; or having cast them off, to receive them againe? (Bodin 1962, 44)

4. Bodin’s Economic Thought

Bodin’s main economic ideas are expressed in two works: first in his Response to the Paradoxes of Malestroit, published in 1568, and then in a revised second version of 1578. The Response is an analysis of the reasons for the significant and continuous price rises that afflicted sixteenth century Europe. It is in this work that Bodin is said to have given one of the earliest formulations of the Quantity Theory of Money. In its most elementary form, the Quantity Theory of Money is the affirmation that the money supply directly affects price levels. Chapter Two of the sixth book of the République is a lengthy discussion of the possible resources of the state. There is a partial overlap between the two works, since Bodin included certain passages of the Response in his République, and then incorporated them again in a revised form into the second edition of the Response.

a. Quantity Theory of Money

High inflation was rampant in sixteenth century Europe. It began in Spain, and soon spread to its neighboring states. This was mainly due to the increase in the quantity of precious metals, namely silver and gold, that were shipped to Europe from the Spanish colonies in the New World. In 1563, the Chambre des Comptes de Paris decided to investigate the reasons for inflation, and the results of the investigation were published in 1566 in a study entitled The Paradoxes of the Seigneur de Malestroit on the Matter of Money. The author of the study was one Jean Cherruies, “Seigneur de Malestroit”, of whom fairly little is known. It was these “paradoxes” that Jean Bodin sought to refute in his work.

Malestroit held that the price rises were simply changes in the unit of account occasioned by debasement, and that the prices of precious metals had remained constant for three hundred years.

Bodin refuted Malestroit’s analysis on two counts. First, he was able to show that Malestroit’s use of data was incorrect: Malestroit’s central claim to back up his thesis was the unchanging price of velvet since the fourteenth century. Bodin, however, cast doubt on whether velvet was even known in France at so early a period. Secondly, Bodin was able to demonstrate that debasement alone did not explain such major and significant price rises; while debasement was one of the factors that had occasioned such inflation, it was far from being the principal cause.

Bodin lists five major factors as contributory causes for such widespread inflation:  (1) The sudden abundance of precious metals, namely silver and gold, throughout Europe; (2) Monopolies; (3) Scarcity, caused by excessive export trade, quasi non-existing import trade, and waste; (4) Fashionable demand by rich people for certain luxury products; and, finally, (5) Debasement.

Of these five causes, Bodin considered the abundance of precious metals to be the most important.

b. The State’s Finances and the Question of Taxation and Property Rights

In Chapter Two of the final book of the République Bodin discusses the question of how the commonwealth secures its finances. Seven possible sources of income are listed. These are: (1) Public domain; (2) Profits of conquests; (3) Gifts from friends; (4) Tributes from allies; (5) Profits of trading ventures; (6) Customs on exports and imports; and, finally, (7) Taxes on the subject. Bodin considers the public domain to be the most honest and the most reliable source of income for the commonwealth. He writes that throughout history sovereign princes and their citizens have taken it as a universal rule that the public domain should be holy, inviolable and inalienable. The inalienability of the public domain is of the utmost importance, Bodin writes, in order that “princes should not bee forced to overcharge their subiects with imposts, or to seeke any unlawfull meanes to forfeit their goods”. The seventh method of raising revenue on Bodin’s list is by levying taxes on the subject, but it may be used only when all other measures have failed and the preservation of the commonwealth demands it.

Bodin considers the inalienability of the public domain, together with the Salic law, to be one of the fundamental laws (Lat. leges imperii; Fr. loix royales) of the state. Like many of his contemporaries, Bodin held that the levying of new taxes without consent was a violation of the property rights of the individual, and, as such, contrary to the law of God and nature. He was particularly firm in opposing new taxation without proper consent and sought confirmation for his opinion in French and European history. One of the main differences between a legitimate ruler and an illegitimate one concerns the question of how each treats the private property of their subjects. Property rights are protected by the law of God and of nature, and therefore, violation of the private property of citizens is a violation of the law of God and of nature. A tyrant makes his subjects into his slaves, and treats their private property as if it were his own.

5. Writings Concerning Religion

The sixteenth and seventeenth centuries witnessed fierce internal conflict and power struggles at the heart of Christianity. The country most seriously ravaged by the combat between the Catholics and the Huguenots was France. Furthermore, a world of hugely diverse religious beliefs had recently been unveiled beyond the walls of Christendom, and the question of which religion was the true religion (vera religio), that is, the one God wanted humanity to follow, needed to be addressed. Bodin’s main contributions concerning religion are the Démonomanie, the Colloquium heptaplomeres and the Universae naturae theatrum. Additionally, the République contains passages that discuss religion and the stability of the state.

a. Colloquium heptaplomeres and the Question of Religious Tolerance

Bodin’s Colloquium of the Seven about Secrets of the Sublime (Colloquium heptaplomeres de rerum sublimium arcanis abditis) is often described as one of the earliest works of comparative religion. It is believed to have been written sometime during the 1580s, although it circulated in manuscript for nearly three centuries before it was published in its entirety in 1857. The Colloquium is a discussion between seven men of different religions or convictions who have gathered in the home of Coronaeus, a Catholic living in Venice, Italy. The participants are Salomon, a Jew; Octavius, a convert from Catholicism to Islam; Toralba, a natural philosopher; Senamus, a skeptic; Fridericus, a Lutheran; and Curtius, a Calvinist. The men engage in listening to music, reading, gastronomical delights, and discussions concerning religion.

The Colloquium begins with a story that is told by Octavius. A ship leaves the port of Alexandria as gentle winds blow, but an intense tempest soon arises. The ship’s captain, terrified by the situation, is forced to drop the anchors, and urges everyone to pray to God. The crewmen, who come from many different places and profess various faiths, all pray to the one God in whom they have faith. The storm eventually calms and the ship is brought safely to port. When Octavius has finished his story, Coronaeus asks the following question: “Finally, with such a variety of religions represented [on the ship], whose prayers did God heed in bringing the ship safely to port?”

The matter of true religion is discussed in the final three books of the Colloquium heptaplomeres. True religion, Bodin holds, is tolerant of all religions, and accepts different ways to approach God. Leathers Kuntz has observed that “no religion is true whose point of view is not universal, whose expression is not free, whose center does not reflect the intimate harmony of God and nature” (Bodin 2008, xliii). The same opinion is expressed in the Démonomanie and in Bodin’s letter to one Jean Bautru des Matras, an advocate working in Paris. In the latter, Bodin writes that “different opinions concerning religion must not lead you astray, as long as you understand that true religion is nothing else than the turning of a purged soul toward true God”.

b. The Question of True Religion and Bodin’s Personal Faith

Leathers Kuntz has detected three stages in the development of Bodin’s religious thinking. She has argued that Bodin’s religious views became more liberal as he grew older (Bodin 2008, xliii-xliv). In 1559, when he wrote the Discours au Sénat et au peuple de Toulouse, Bodin held that people should be brought up publicly in one religion. This he considered an indispensable element in the cohesiveness of the state. Religious unity should be preserved, and religion should not be debated, since disputations damage religion and cast doubt upon it. When writing the République, Bodin’s main concern was the political stability of the French state. He considered religion to provide for the unity of the state and to support the king’s power. Furthermore, religion strengthened the subjects’ obedience toward their sovereign prince and their respect for the execution of laws. Uniformity of worship must be enforced within the commonwealth when it is possible, but tolerance should become the norm when religious minorities become influential enough to no longer be repressed. The final and most liberal stage of Bodin’s religious opinions becomes most apparent in the Colloquium heptaplomeres, “in which his religious opinions seem to have developed into a kind of theism which leaves each man’s religion, provided he has some, to his own personal conscience” (ibid.).

It is impossible to say anything definitive concerning Bodin’s religious views. We may observe that Bodin’s faith seems less loyal to a particular established church than to a deep sense of honoring God. Bodin’s public religious opinions fluctuated throughout his life. As a consequence, he was accused of many things during his lifetime and after his death, including being a Jew, a Calvinist, a heretical Catholic, and an atheist. Some scholars have even suggested that there are traces of Nicodemism, or religious dissimulation, in both his works and actions.

Scholars have long debated which of the opinions expressed in the Colloquium heptaplomeres should be regarded as Bodin’s personal beliefs. Considering that so many, often contradictory, opinions have been advanced, it may be wise to remark that perhaps “all the speakers represent Bodin’s thinking at one time or another. No one represents his thinking exclusively, but Bodin is sympathetic to some views of each as the dialogue develops. The point seems to be, however, that regardless of Bodin’s approval or disapproval of the religious views represented in the dialogue, he constantly stresses the need for toleration of all religions” (Bodin 2008, xliv). It has also been suggested that Bodin’s opinions and views regarding religious faith are so full of compromise that they ultimately amount to a sort of natural religion. Finally, it has been suggested that Bodin’s writings on the topic of religion “transcended the narrow bounds of confessional religion” (Bodin 1980, 1).

Although Bodin’s understanding of true religion as something profoundly personal, for which no church was required, made him an unorthodox believer in the eyes of many, it seems inconceivable that he should be considered an atheist (Bodin 2008, xxix). In fact, he considered atheism to be extremely dangerous to the commonwealth, as the following passage from the République (4, VII), discussing the difference between atheism and superstition, proves:

 And truely they (in mine opinion) offend much, which thinke that the same punishment is to be appointed for them that make many gods, and them that would have none at all: or that the infinitie of gods admitted, the almightie and everliving God is thereby taken away. For that superstition how great soever it be, doth yet hold men in feare and awe, both of the laws and of the magistrats; as also in mutuall duties and offices one of them towards another: whereas mere Atheisme doth utterly root out of mens minds all the feare of doing evill. (Bodin 1962, 539)

Bodin’s reasons for combating atheism in this passage concern the stability of the state: atheists must not be tolerated in the commonwealth, since they have no moral or ethical scruples about breaking its laws. But Bodin had another reason to detest atheism: atheists are blasphemous because they deny the existence of God.

6. On Witchcraft

Bodin’s De la démonomanie des sorciers (On the Demon-Mania of Witches) was first published in 1580 in French, and soon translated into Latin (1581), German (1581) and Italian (1587). Because of its wide distribution and numerous editions, historians have held it accountable for the prosecutions of witches during the years that followed its publication. Many readers have been perplexed by the intolerant character of the Démonomanie. Bodin firmly believed in the existence of angels and demons, and held that they served as intermediaries between God and human beings; God intervenes directly in the world through their activity. Demonism, together with atheism and any attempt to manipulate demonic forces through witchcraft or natural magic, was treason against God, to be punished with extreme severity. The principal reason to punish someone for witchcraft is therefore “to appease the anger of God, especially if the crime is directly against the majesty of God, as this one is”.

Bodin was prompted to write the Démonomanie after taking part in the proceedings against a witch in April 1578. His objective in writing it was to “throw some light on the subject of witches, which seems marvelously strange to everyone and unbelievable to many.” Furthermore, the work was to serve as “a warning to all those who read it, in order to make it clearly known that there are no crimes which are nearly so vile as this one, or which deserve more serious penalties.” Finally, he wished to “respond to those who in printed books try to save witches by every means, so that it seems Satan has inspired them and drawn them to his line in order to publish these fine books” (Bodin 2001, 35-7). Among these “protectors of witches,” as Bodin called them, was a German Protestant by the name of Johann Weyer, who considered witches to be delusional and excessively melancholic, and who recommended physical healing and religious instruction, rather than corporal or capital punishment, as a remedy to their condition. Bodin feared that this might lead judges to consider witches mentally ill and, as a consequence, to let them escape punishment.

The Démonomanie is divided into four books. Book One begins with a set of definitions. Bodin then discusses to what extent men may engage in the occult, and the difference between lawful and unlawful means of accomplishing things. He also discusses the powers of witches and their practices: whether witches are able to transform men into beasts, induce illnesses in them, or perhaps even bring about their death. The final book discusses ways to investigate and prosecute witches. Bodin’s reputation for severity and rigor in condemning witches and witchcraft rests largely on the contents of this final book.

Bodin lists three necessary and indisputable proofs upon which a sentence can be based: (1) Truth of the acknowledged and concrete fact; (2) Testimony of several sound witnesses; and (3) Voluntary confession of the person who is charged and convicted of the crime. Certain other types of evidence, such as public reputation or forced confession, are not regarded by Bodin as indisputable proofs, but simply as “presumptions”, or circumstantial evidence, concerning the guilty nature of the person being charged. Presumptions may serve in the conviction and sentencing of witches in cases where clear proof is lacking.

There are fifteen “detestable crimes” that witches may be guilty of, and even the least of them, Bodin affirms, merits painful death. The death penalty, however, may be imposed only by a competent judge and must rest on solid proof that eliminates all possibility of error. In cases where sufficient proof is wanting, where there are neither witnesses, nor confession, nor factual evidence, and where only mere presumptions, even strong ones, exist, Bodin is opposed to a death sentence: “I do not recommend that because of strong presumptions one pass sentence of death – but any other penalty except death…One must be very sure of the truth to impose the death sentence.” Bodin may have considered witchcraft an insult against God, meriting as such the penalty of death, but he nevertheless believed in the rule of law, as when he unequivocally states elsewhere that “it is better to acquit the guilty than to condemn the innocent” (Bodin 2001, 209-210).

7. Natural Philosophy

The Universae naturae theatrum, published in 1596, the year of Bodin’s death, may be considered the most systematic exposition of his vision of the world. It remains the least studied of his works and has never been translated into English. Bodin himself informs us that the Theatrum was written in 1590. The French translation of the work (Le Théâtre de la nature universelle) was published in 1597.

From the beginning of his career, Bodin sought to study all things, human and divine, methodically. He writes:

Of history, that is, the true narration of things, there are three kinds: human, natural, and divine. The first concerns man; the second, nature; the third, the Father of nature. […] So it shall come about that from thinking first about ourselves, then about our family, then about our society we are led to examine nature and finally to the true history of Immortal God, that is, to contemplation. (Bodin 1945, 15-16)

The Theatrum is the culmination of Bodin’s systematic examination of things, and as such it is a deeply religious work. Bodin turns to the study of nature in order to better know God:

And indeed the Theater of Nature is nothing other than the contemplation of those things founded by the immortal God as if a certain tablet were placed under the eyes of every single one so that we may embrace and love the majesty of that very author, his goodness, wisdom, and remarkable care in the greatest matters, in moderate affairs, in matters of the least importance. (Bodin 2008, xxx)

Bodin believed that the French civil wars were occasioned, at least partly, by God’s dissatisfaction – God was punishing the French for their growing irreligious sentiment. The Theatrum has been described as an attack against those arrogant and ungodly philosophers, or naturalists, who wish to explain everything without reference to the creator and father of all things that is God. God is the author of all existing things, and the contemplation of nature brings us closer to Him. Furthermore, contemplating nature makes us love God for the care and goodness that he shows us.

The Theatrum is written in pseudo-dialogue form, as a discussion between an informant, Mystagogus, and his questioner, Theorus. The work opens with a short overview of the text, in which Bodin stresses the importance of order for the study of things. This gives him the opportunity to criticize Aristotle, who failed to discuss things in the right order: simpler things must be discussed before more complex ones, and matters of physics should therefore have been treated after metaphysical things. Arranging all the material under consideration in a convenient order – the simplest notions studied first, the difficult ones later – is one of the distinctive characteristics of the Ramist framework of knowledge, as McRae has observed (McRae 1955, 8). McRae considers that, together with the Juris universi distributio, Bodin’s Theatrum “is perhaps the most thoroughly Ramist of any of his works.” Bodin’s two main objectives in the first book of the Theatrum are to prove that there is only one principle in nature, namely God, and that it is He who has created this world and He who governs it.

Other topics that Bodin discusses in Book One include matter, form and the causes of things. Movement, generation, corruption and growth are also considered, as well as things related to them: time and place, void, finitude and infinitude. In Book Two, Bodin examines the elements, meteorites, rocks, metals and minerals. Book Three discusses the nature of plants and animals. The fourth book contains Bodin’s doctrine concerning the soul; angels are also discussed there. The final book treats the celestial bodies – their natural movement, the admirable harmony that exists between them, and the structure of the heavens. It also attests to Bodin’s hostility toward Copernicus’ heliocentric system (Bodin 1596, 554 and especially 574-583); Bodin relies on the writings of Ptolemy, Aristotle, and Holy Scripture in combating Copernicus. He dismisses the heliocentric hypothesis on the grounds that it is “contrary to the evidence of the senses, to the authority of the Scriptures, and incompatible with Aristotelian physics.”

According to a recent interpretation by Blair, Bodin’s objective in writing the Theatrum was first and foremost to combat three impious propositions of ancient philosophy: (1) The eternity of the world; (2) The necessity of the laws of nature; and (3) The mortality of the soul.

Against the Eternity of the World

One solution to the conflict between the Aristotelian doctrine of the eternity of the world and the Judeo-Christian account of creation (God created the world, therefore it is not eternal) had been proposed by Thomas Aquinas. He argued that human reason alone cannot establish whether the world is eternal or not; the problem can be solved only by an appeal to faith and to biblical authority. Bodin’s argument differs from that of Aquinas. Bodin offers a rational demonstration based on “arguments for an all-powerful God, who knows no necessity and has complete free will”. Several scholars have observed that Bodin’s emphasis on divine free will is “characteristic of Christian nominalists like Duns Scotus and of Jewish philosophers like Maimonides” (Blair 1997, 118). The concluding syllogism for the “voluntary first cause” that is God runs as follows: “Nothing can be eternal by nature whose first cause is voluntary; but the first cause of the world is voluntary; therefore the world cannot be eternal by nature, since its state and condition depend on the decision and free will of another.” (Blair 1997, 118)

Against Natural Necessity

The second conclusion is drawn from the unlimited freedom of God’s will: not only is it impossible that the world should be eternal, but furthermore it is arranged according to a divine plan. According to Bodin, providential divine governance is twofold: ordinary providence, where laws that govern nature under so-called normal circumstances are chosen by God, and extraordinary providence, where God is able to suspend those laws at will at any time he chooses, in order to intervene in the world (Blair 1997, 120). Bodin offers the following explanation for the existence of apparently useless or evil features of nature. He begins by claiming that everything in creation is good, and evil is simply the absence of good; this same idea is repeated in the Paradoxon. Then he attempts to illustrate, through various examples, that even things that are apparently evil in nature serve a “useful purpose in God’s good and wise plan” (Blair 1997, 122).

Immortality of the Soul

Bodin’s demonstration concerning the immortality of the soul is based on the soul’s intermediate nature: the soul is both corporeal and immortal. Blair defines this particular demonstration as “possibly Bodin’s most noteworthy innovation” and as a “significant departure from the standard or orthodox accounts [concerning the soul]” (Blair 1997, 137; 142). In combating the mortality of the soul, Blair writes, Bodin is reacting against all forms of impious philosophizing: against Averroes for denying the personal immortality of the soul; against Pomponazzi for claiming that philosophy shows the soul to be mortal; and against all those, like Pomponazzi or even Duns Scotus, who deny the rational demonstrability of this central doctrine. But Bodin calls his opponents only “Epicureans,” using the term to designate at first, generally, those who doubt the immortality of the soul, then more specifically those who, barely above the level of brutes, take pleasure and pain as the measure of good and evil and believe in the random distribution of atoms. (Blair 1997, 138)

Bodin’s first argument in favor of the immortality of the soul is based on empirical evidence concerning the ability of the soul to function independently of the body: during ecstatic experiences, as these have been conveyed by many learned men, the soul has reportedly been able to hear, feel and understand while being temporarily transported outside the living body. Two further demonstrations follow. First, Bodin affirms that extremes are always joined by intermediates: passing from one extreme to another always necessitates passing through a ‘middle’ being. There exist only two extremes in the world: (1) form completely separated from matter, meaning angels and demons, and (2) form entirely concrete, inseparable from matter except by destruction, that is, natural bodies. Between these two extremes there must necessarily exist some intermediate which joins the two. This intermediate is form separable from matter, or, as Bodin states it, the soul. He concludes: “if therefore the human soul [mens] is separable from the dead body, it follows necessarily that it survives and carries out its actions without the operation of the senses” (Blair 1997, 139). Bodin’s final demonstration is as follows:

Given the extremes, of which one is totally corruptible (natural elements or bodies) and one is totally incorruptible (angels and demons), there must be an intermediate, which is corrupted in one part of itself, but free from corruption in the other; but this is nothing other than man, who participates in both natures: brute elements, plants, stones are far inferior to man in worth and dignity, and since man alone associates with angels and demons, he alone can link the celestial to the terrestrial, superior to inferior, immortal to mortal. (Blair 1997, 139)

Humans participate in both extremes and yet form an entity that is distinct from them. According to the standard view, the corporeal body is connected with the incorporeal soul, but Bodin’s demonstration is not built on this distinction because, for him, the soul is both immortal and corporeal. As Blair has observed, “for Bodin the human hypostasis mediates between form separated from matter (disembodied souls and angels) and form fully embedded in matter (as in all natural bodies), by virtue of its soul, which is corporeal, yet separable from the material body” (Blair 1997, 139-40). The following passage elucidates Bodin’s rather peculiar demonstration:

The body of the soul is not material, but spiritual – yet corporeal nonetheless: “from which it follows that human souls, angels and demons consist of the same corporeal nature, but not of bone, nor of flesh, but of an invisible essence. Like air, or fire, or both, or of a celestial essence, surpassing with its fineness the most subtle bodies: thus, even if we grant it is a spiritual body, it is a body nonetheless.” (Blair 1997, 140)

According to Blair, Bodin constructs a new type of natural philosophy that seeks to combine religion with philosophy, a combination of philosophical research concerning causes with a pious recognition of divine providence and the greatness of God.

Although Bodin often refers to Holy Scripture, he also constantly reminds us of the importance of reason and reasoning – so long as we do not overstep the limits of reason. Bodin uses physics to serve religious ends, and the fundamental principle behind his strategy is the Augustinian precept, later adopted by Aquinas in his synthesis of reason and faith, that truth is one and that there is, indeed, unity of knowledge: a necessary agreement between philosophy and religion exists, and therefore “natural philosophy as a reasoned investigation can never contradict true religion” (Blair 1997, 143).

8. Other Works

a. Juris universi distributio

The Juris universi distributio (Fr. Exposé du droit universel) was first published in 1578, but, as the Dedicatory Epistle of the Methodus informs us, it already existed in manuscript form twelve years earlier. Unlike later editions of the work that were published as books, the first edition of the Distributio was in the form of a poster, measuring approximately 40 by 180 cm, to be hung on the walls of universities.

Bodin’s objective in writing the Juris universi distributio was to arrive at a systematization of universal law. He sought to realize this through the study of history, paired with a comparative method that analyzes the different legal systems which either currently exist or have existed in the past. Bodin uses the same method in his main political works (the Methodus and the République), in which comparative public law and its historical study permit him to construct a theory of the state. Bodin is interested in “universal history”, of which his Methodus is an example, in the same way that he is interested in “universal law”, and it seems that the same type of historical and comparative method may be used in discovering both.

According to Bodin, law is divided into two categories: natural (ius naturale) and human (ius humanum). Bodin thus rejects the common threefold division based on the Digest – natural law, law of peoples and civil law – because he considers a dichotomy more convenient. The two principal divisions of human law are ius civile (civil law) and ius gentium (law of peoples). Bodin strongly criticizes the law professors, or Romanists, for having concentrated almost exclusively on the ius civile – particularly the civil law of the Romans – with the consequence that the ius gentium has not been properly studied and therefore has no proper methodology. Bodin’s personal interest lies precisely in the ius gentium, because it is concerned with the universal laws that are common to all peoples. The methods of the Romanists are inadequate for the study of the ius gentium because the ius civile varies from state to state and no universally valid truths can be derived from it; in this sense it is not even part of legal science. A new critical method is therefore required: a method that is both historical and comparative.

Bodin’s system of universal law is a drastic rupture with the exegetical methods of the Middle Ages. Medieval jurists applied Roman law to their own societies and saw no problem in doing so. It is with the arrival of the so-called humanist scholars, in the sixteenth century, and their use of the methods of classical philology, that the internal coherence and authority of the Corpus juris civilis were challenged.

b. Moral Philosophy

Bodin’s Paradoxon quod nec virtus ulla in mediocritate nec summum hominis bonum in virtutis actione consistere possit (Fr. Paradoxe de Jean Bodin qu’il n’y a pas de vertu en médiocrité ni au milieu de deux vices) was first published in Latin in 1596, although Bodin had completed the text in 1591. Two French translations were later published. Bodin’s own translation dates from 1596, but it remained unpublished until 1598. It may be considered a revised version of the Latin text rather than a simple translation. The Latin edition includes a preface that does not exist in the French version.

The Paradoxon is written in dialogue form, as a discussion between a father and a son. During the course of the dialogue, the son repeatedly refers to the authority of Aristotle. His opinions are often refuted by the father, who refers to the writings of Plato and to Holy Scripture. The term “paradox” in the title refers to the fact that Bodin acknowledged his views to be in contradiction with the moral opinions generally accepted in his day – especially the Aristotelian doctrine of the mean.

The work opens with a discussion concerning the question of good and evil and that of divine justice. This is followed by an outline of the basic structure of Bodin’s moral philosophy: God is the sovereign good, or, “that which is the most useful and the most necessary to every imaginable creature”. He is also the source of all other things that are good. Evil is defined as the privation of good – a definition that Bodin traces to St. Augustine. The same definition is found in the Theatrum, where it is used to support the argument that everything in Creation is good – God has not created anything evil (Blair 1997, 122). The good of man and a contented life are discussed next, followed by a discussion of particular virtues and vices and their origins. Bodin refutes Aristotle’s doctrine of the mean. A discussion of moral and intellectual virtues follows. Bodin then examines prudence, claiming that prudence alone helps us choose between good and evil. The final section discusses wisdom and the love of God. The father affirms that wisdom is found in the fear of offending God. Fear of God is inseparable from love of God – together they form the basis of wisdom.

c. Writings on Education

Bodin wrote or compiled four works in which he discusses the education of children: The Address to the Senate and People of Toulouse on the Education of Youth in the Commonwealth, the Epître à son neveu, the Sapientia moralis epitome, and the Consilium de institutione principis aut alius nobilioris ingenii. The earliest of them, the Oratio, is a discourse that was given in Toulouse in 1559 and published the same year. The three other works date from a later period: the Epître is a letter written to Bodin’s nephew, dated November 1586, and the Epitome was first published in 1588. Evidence within the Consilium suggests that it was written sometime between 1574 and 1586, although it remained unpublished until 1602.

Address to the Senate and People of Toulouse on the Education of Youth in the Commonwealth

Bodin’s Oratio de instituenda in repub. juventute ad senatum populumque tolosatem (Fr. Le discours au sénat et au peuple de Toulouse sur l’éducation à donner aux jeunes gens dans la république) is the single most valuable document informing us of Bodin’s stay in Toulouse in the 1550s. Furthermore, it is Bodin’s earliest surviving work on education and contains a detailed portrayal of the humanist ideal that Bodin embraced during this period.

Nothing is more salutary to a city, Bodin argues, than that those who shall one day rule the nation be educated in virtue and science. Only by providing youth with a proper education and intellectual and moral culture can the glory of France, and that of its cities, be preserved. Art and science are the auxiliaries of virtue, and one cannot conceive of living – much less leading a happy life – without them. Bodin urges the people of Toulouse to participate in the movement of the Renaissance. The town is well known for its faculty of law, and he argues that the study of humanities and belles-lettres should be joined to the study of law.

In Bodin’s time, the children of Toulouse were either given a public education – in which case they were most often sent to Paris – or taught privately at home. While both systems have their inconveniences, Bodin considers that public schooling must be favored. To prevent children from being sent to Paris to be educated, however, a collège must be built in Toulouse so that the children of Toulouse can be educated in their own hometown. Bodin proposes that all children – including gifted children belonging to the poorest classes – be sent to public schools, where they shall be taught according to the official method.

Epître de Jean Bodin touchant l’institution de ses enfants à son neveu

This short work is Bodin’s response in the form of a letter dated November 9, 1586, to his nephew’s enquiry concerning the education of children. Bodin’s nephew had welcomed a newborn son to his family, and had turned to Bodin for advice on how to give him a proper education. Bodin’s advice came in the form of a description of how he taught his own children when they were three and four years old.

Bodin began by teaching his children the Latin names of things. Having observed that they had good memories and the necessary mental capacities, Bodin asked them to repeat more abstract words, and began informing them about such things as how old the world is (5,534 years), how many planets there are, and the names of these planets. He taught them the names of the body parts, the senses we have, the virtues and vices, and so forth. Knowledge of different things was acquired through continuous daily exercise. Soon after, Bodin had his children interrogate each other, thus allowing himself to retire from this task. The study of Latin grammar soon followed, as well as the study of moral sentences in both French and Latin. The children would then begin the study of arithmetic and geometry, followed by the translation of Cicero’s writings from Latin into French.

Sapientia moralis epitome 

The Sapientia moralis epitome was published in Paris in 1588. It consists of 210 moral maxims arranged into groups of seven sentences. Each group addresses a common topic: youth and education, nature, truth and opinion, virtue, war, liberty, marriage, and so on. The majority of the maxims are Bodin’s own formulations of ideas expressed by Ovid, Horace, Juvenal and Lucretius.

Consilium de institutione principis

Bodin’s Consilium de institutione principis was first published in 1602 as part of a compilation entitled Consilia Iohannis Bodini Galli et Fausti Longiani Itali de principe recte instituendo. Although the determination of a precise date seems impossible, evidence within the work suggests that Bodin composed it sometime between 1574 and 1586.

The Consilium is a collection of precepts for the young princes of the Saxon court. The content of the Consilium is in many ways identical to the views that were expressed in the Epître, although the Consilium is more detailed. Young princes are to be taught in small groups, and their eating and sleeping habits are to be observed, so that they remain alert and in good health.

Bodin particularly recommends the study of two texts: Peter Ramus’ Dialectica and Guy du Faur de Pibrac’s Quatrains. The education of the princes is to be completed by the study of law and the art of government. Knowledge of practical matters should be acquired by studying “the state of the republica and its offices and the laws, customs and natures of various peoples.” Such practical knowledge is necessary in order to acquire prudence, and according to Bodin, only a prudent prince is worthy of his people (Rose 1980, 57-58).

d. Bodin’s Surviving Correspondence

Several letters from Bodin’s personal correspondence have survived to the present day (for the complete list, see Couzinet 2001, 32-36). Chauviré published a series of letters as an appendix to his Jean Bodin, auteur de la République. The most important among them are Bodin’s letter, written in Latin, to one Jean Bautru des Matras, as well as Bodin’s account from January 1583, addressed to his brother-in-law, regarding the events that took place in Antwerp when the Duke of Alençon was trying to help the Low Countries in their efforts to drive out the Spanish.

Later, Moreau-Reibel discovered, in manuscript collection 4897 of the Bibliothèque Nationale’s fonds français, a series of five letters that had been brought together by a certain Philippe Hardouyn, and published them. These letters, written between 1589 and 1593, complete our understanding of the possible reasons that made Bodin a ligueur. A sixth letter from this same period is Bodin’s notorious letter of 20 January 1590, in which he explains the reasons that made him a supporter of the Catholic League. A couple of letters from the correspondence between Bodin and Walsingham, dating from 1582, have also survived.

9. Influence

As the work’s numerous editions and translations attest, Bodin’s République was widely read in Europe after its publication, up until the mid-seventeenth century. It was subsequently forgotten, however, and Bodin’s influence during the eighteenth century was only marginal. It was not until the twentieth century that his works, slowly, but decisively, began to interest scholars again. Growing interest in his works has assured Bodin the place he deserves among the most important political thinkers of the sixteenth century. New translations and modern editions of his works have made his ideas accessible to wider audiences.

Among Bodin’s best-known ideas is the Theory of Climate, today most often associated with another French philosopher, Montesquieu (1689-1755). Bodin’s comparative and empirical approach in the fields of historical methodology, jurisprudence, and religion represented a break with medieval traditions. He was among the most influential legal philosophers of his time, and his Colloquium heptaplomeres is one of the earliest works of comparative religious studies. Bodin’s ideas concerning religious tolerance and the abolition of slavery found an echo among European writers of both the seventeenth and eighteenth centuries. Although the Colloquium heptaplomeres remained unpublished until the 1840s, scholars were familiar with its ideas thanks to manuscript copies that circulated in Europe. The numerous editions of the Démonomanie, on the other hand, testify to an earlier interest in his ideas regarding witchcraft. Finally, Bodin’s Response to the Paradoxes of Malestroit contains one of the earliest formulations of the Quantity Theory of Money.

In political theory, Bodin’s most influential contribution remains his Theory of Sovereignty and his conceptualization of sovereign power. A majority of scholars have labeled Bodin an absolutist. For others, he favored a type of constitutionalism. Still others have observed that he shifted from the perceived constitutionalism of his early writings toward a more absolutist theory in the République. His writings were received in various ways in different parts of Europe, and interpretations of them were often contradictory, depending on the country. His Theory of Sovereignty was used by royalists and parliamentarians alike to defend their widely differing opinions. In France, for example, his political theory was largely absorbed into the absolutist movement and the doctrine of the divine right of kings that became highly influential soon after Bodin’s death; one need only think of Cardinal Richelieu and Louis XIV. Jacques-Bénigne Bossuet (1627-1704), tutor to the oldest son of Louis XIV, argued from Scriptural sources in favor of an absolute hereditary monarchy in his Politics Drawn from the Very Words of Holy Scripture (Politique tirée des propres paroles de l’Écriture sainte). Other French writers who incorporated absolutist elements of Bodin’s theory into their own writings are Pierre Grégoire de Toulouse (c. 1540-1597), Charles Loyseau (1566-1627), and Cardin Le Bret (1558-1655).

The term “monarchomachs” (Fr. monarchomaques) denotes the writers – Protestant or Catholic – who opposed the powers of the monarch. The term was coined by the Scottish jurist and royalist William Barclay (1546-1608) in his De Regno et Regali Potestate (1600), in which, much as Bodin had done in the République, Barclay defended the rights of kings. Giovanni Botero (1544-1617) was one of the earliest writers to use the expression “reason of state” (It. ragion di Stato) in his work Della ragion di Stato (1589). Bodin’s political writings may have been one of the sources used by Botero and his followers.

In Germany, Johannes Althusius (1557-1638) adopted Bodin’s theory of sovereignty in his Politica methodice digesta (1603), but argued that the community is always sovereign. In this sense, every commonwealth – no matter what its form may be – is popular. Dutch jurist Hugo Grotius published his renowned De jure belli ac pacis in 1625; Grotius does not conceal his admiration for Bodin, nor for the method used by French writers that consisted of combining the study of history with the study of law.

Bodin’s République was among the works that introduced the idea of legislative sovereignty in England. His considerable influence upon Elizabethan and Jacobean political thought in England, one scholar has observed, was largely due to his precise definition of sovereignty. Among the political writers who defended the powers of the king, Sir Robert Filmer (c. 1588-1653) drew heavily upon Bodin’s writings. One shorter text in particular, The Necessity of the Absolute Power of all Kings and in particular of the King of England, published in 1648, is hardly more than a collection of ideas expressed in the République. John Locke’s First Treatise of Government (1689) may, therefore, be considered not only a refutation of Filmer’s political ideas, but also a critical commentary upon Bodin’s political theory. Thomas Hobbes, in his The Elements of Law (1640), cites Bodin by name and endorses his view that sovereign power in the commonwealth may not be divided (II.8.7, “Of the Causes of Rebellion”). This principle of indivisible sovereign power is also expressed in Hobbes’ later political works De cive (1642) and Leviathan (1651).

10. References and Further Reading

a. Primary Sources

  • Oppiani De venatione (1555)
  • Oratio de instituenda iuventute… (1559)
  • Methodus ad facilem historiarum cognitionem (1566)
  • La réponse aux paradoxes de Malestroit (1568)
  • La harangue de Messire Charles des Cars (1573)
  • Les Six Livres de la République (1576; all references in this article are to the edition of 1583)
  • Apologie de René Herpin pour la République (before 1581)
  • Recueil de tout ce qui s’est négocié en la compagnie du tiers état… (1577)
  • Juris universi distributio (1578)
  • De la démonomanie des sorciers (1580)
  • De republica libri sex (1586)
  • Sapientiae moralis epitome (1588)
  • Paradoxon (1596)
  • Universae naturae theatrum (1596)
  • Consilia de principe recte instituendo (1602)
  • Colloquium heptaplomeres (1841)
  • Epître de Jean Bodin touchant l’institution de ses Enfans de 1586 (1841)

i. Modern Editions of Bodin’s works

1. Collected Works
  • Bodin, Jean. Oeuvres philosophiques de Jean Bodin. Ed. Pierre Mesnard. Trans. Pierre Mesnard. Paris: PUF, 1951.
    • Includes the following Latin works, together with their French translations: Oratio de instituenda in repub. juventute ad senatum populumque tolosatem, Juris universi distributio, and Methodus ad facilem historiarum cognitionem.
  • Bodin, Jean. Selected Writings on Philosophy, Religion and Politics. Ed. Paul L. Rose. Genève: Droz, 1980.
    • Includes the following seven works: Bodin’s letter to his nephew (1586), Consilium de institutione principis (1574-86), Sapientia moralis epitome (1588), Latin dedicatory letter to the Paradoxon quod nec virtus ulla in mediocritate nec summum hominis bonum in virtutis actione consistere possit (1596) and the French translation of the text, Le Paradoxe de Jean Bodin Angevin (1598), Bodin’s letter to Jean Bautru des Matras (1560s), as well as a letter to a friend in which he gives reasons for supporting the Catholic League (1590).
2. Individual Works
  • Bodin, Jean. Method for the Easy Comprehension of History. Trans. Beatrice Reynolds. New York: Columbia University Press, 1945.
    • Includes an introduction by Reynolds.
  • Bodin, Jean. Six Books of the Commonwealth. Abr. ed. Trans. Marian J. Tooley. Oxford: Basil Blackwell, 1955.
    • An abridgment of Bodin’s major work, together with an introduction.
  • Bodin, Jean. The Six Bookes of a Commonweal. Trans. Richard Knolles. Ed. Kenneth Douglas McRae. Cambridge: Harvard University Press, 1962.
    • This is the only existing full English translation of the work; facsimile reprint of Knolles’ English translation of 1606 that compares the French and Latin versions of the text. McRae’s introductory material discusses Bodin’s life, his career and his influence.
  • Bodin, Jean. Address to the Senate and People of Toulouse on Education of Youth in the Commonwealth. Trans. George Albert Moore. Chevy Chase, Md: Country Dollar Press, 1965.
    • Moore’s translation of an important and interesting early text by Bodin.
  • Bodin, Jean. Colloquium of the Seven about Secrets of the Sublime. Trans. Marion Leathers Kuntz. Princeton, N. J.: Princeton University Press, 1975. Second edition. University Park, PA: Pennsylvania State University Press, 2008.
    • First complete modern translation of the work, together with highly informative introductory material.
  • Bodin, Jean. On Sovereignty. Trans. Julian H. Franklin. Ed. Julian H. Franklin. Cambridge: Cambridge University Press, 1992.
    • Contains chapters 8 and 10 of the First book, and chapters 1 and 5 of the Second book of the République. Concentrates on Bodin’s analysis of sovereignty. Franklin’s textual notes are informative.
  • Bodin, Jean. Response to the Paradoxes of Malestroit. Trans. Henry Tudor. Eds. Henry Tudor and R. W. Dyson. Bristol: Thoemmes Continuum, 1997.
    • The most recent English translation of the text; it is based on the first edition of the work, but also includes the major changes made between the first (1568) and second (1578) editions. Includes a concise and useful introduction.
  • Bodin, Jean. On the Demon-Mania of Witches. Abr. ed. Trans. Randy A. Scott and Jonathan L. Pearl. Toronto: Centre for Reformation and Renaissance Studies, 2001.
    • Abridged translation of Bodin’s Démonomanie that contains about two-thirds of the original text and informative notes.

b. Secondary Sources

  • Blair, Ann. The Theater of Nature. Jean Bodin and Renaissance Science. Princeton, N. J.: Princeton University Press, 1997.
    • Indispensable study concerning the methods and practices of Renaissance science in the light of Bodin’s Theatrum.
  • Brown, John L. The Methodus ad facilem historiarum cognitionem of Jean Bodin. Washington, D.C.: Catholic University of America Press, 1939. Reprint. New York: AMS Press, 1969.
    • Central study that analyses the background and influence of Bodin’s Methodus. Brown establishes that Bodin’s earlier work contains many of the political and legal principles that were further developed in the République.
  • Franklin, Julian H. Jean Bodin and the Rise of Absolutist Theory. Cambridge: Cambridge University Press, 1973.
    • An influential study on the topic of the formation of Bodin’s absolutist view, as it is expressed in the République.
  • Heller, Henry. “Bodin on Slavery and Primitive Accumulation.” The Sixteenth Century Journal 25.1 (1994): 53-65.
    • Argues that Bodin conceived of slavery not only as something irrational and unnatural, but as a permanent threat to the stability of the state.
  • McRae, Kenneth D. “Ramist Tendencies in the Thought of Jean Bodin.” Journal of the History of Ideas 16.3 (1955): 306-323.
    • Argues that several of Bodin’s writings reveal the influence of Ramist concepts; even the République (in which the Ramist influence is less evident) can be described as Ramist in its structure.
  • O’Brien, Denis P. “Bodin’s Analysis of Inflation.” History of Political Economy 32.2 (2000): 267-292.
    • A longer version of the introduction that O’Brien wrote to the 1997 edition of Bodin’s Response. Argues that Bodin should be regarded as the pioneer formulator of the quantity theory of money.
  • Pearl, Jonathan L. “Humanism and Satanism: Jean Bodin’s Contribution to the Witchcraft Crisis.” Canadian Review of Sociology and Anthropology 19.4 (1982): 541-548.
    • On Bodin’s influence on the “witchcraft crisis”. Pearl reminds us that the Renaissance witnessed, not only a revival of the arts and the birth of modern science, but also the re-appearance of the occult: magic, astrology and witchcraft.
  • Remer, Gary. “Dialogues of Toleration: Erasmus and Bodin.” Review of Politics 56.2 (1994): 305-336.
    • Examines two different types of dialogues of toleration: Erasmus’ common truth and Bodin’s subjective truth. Erasmus’ traditional conception aims at the discovery of truth in religious questions; Bodin’s conception, on the contrary, does not presuppose that a common truth may be discovered, since every opinion is one part of the truth.
  • Rose, Paul Lawrence. Bodin and the Great God of Nature. The Moral and Religious Universe of a Judaiser. Genève: Droz, 1980.
    • A valuable study concerning Bodin’s ideas on religion and ethics; many of Bodin’s less-known works are considered. Rose argues that Bodin went through three religious conversions in his lifetime.
  • Salmon, John Hearsey McMillan. “The Legacy of Jean Bodin: Absolutism, Populism or Constitutionalism?” History of Political Thought 17 (1996): 500-522.
    • Discusses the ways in which Bodin’s ideas were understood and transformed in France’s neighboring countries during the seventeenth century.
  • Tooley, Marian J. “Bodin and the Mediaeval Theory of Climate.” Speculum 28.1 (1953): 64-83.
    • A scholarly investigation of Bodin’s medieval predecessors regarding the theory of climates. Argues that contrary to his predecessors, Bodin was more interested in the practical implications of the things he observed.
  • Ulph, Owen. “Jean Bodin and the Estates-General of 1576.” Journal of Modern History 19.4 (1947): 289-296.
    • Examines Bodin’s role, as deputy from the bailiwick of Vermandois, during the estates-general at Blois in 1576.
  • Wolfe, Martin. “Jean Bodin on Taxes: The Sovereignty-Taxes Paradox.” Political Science Quarterly 83.2 (1968): 268-284.
    • Argues that Bodin’s main objective in writing about taxes was to push for reform in France’s fiscal system.

i. Bibliography

  • Couzinet, Marie-Dominique, ed. Jean Bodin. Roma: Memini, 2001.
    • Indispensable for conducting serious research on Bodin. Contains references to over 1,500 articles, books and other documents.

ii. Conference proceedings and Article Collections

  • Denzer, Horst, ed. Jean Bodin – Proceedings of the International Conference on Bodin in Munich. München: C.H. Beck, 1973.
    • Fine collection of twenty-four articles (in English, French and German) by the foremost Bodin scholars. Part II contains discussions, and part III an exhaustive bibliography on Bodin from the year 1800 onwards.
  • Franklin, Julian H., ed. Jean Bodin. Aldershot: Ashgate, 2006.
    • Collection of twenty previously published articles or book chapters (in English).


Author Information

Tommi Lindfors
Email: tommi.lindfors@helsinki.fi
University of Helsinki
Finland

John Langshaw Austin (1911-1960)

J. L. Austin was one of the more influential British philosophers of his time, due to his rigorous thought, extraordinary personality, and innovative philosophical method. According to John Searle, he was both passionately loved and hated by his contemporaries. Like Socrates, he seemed to destroy all philosophical orthodoxy without presenting an alternative, equally comforting, orthodoxy.

Austin is best known for two major contributions to contemporary philosophy: first, his ‘linguistic phenomenology’, a peculiar method of philosophical analysis of the concepts and ways of expression of everyday language; and second, speech act theory, the idea that every use of language carries a performative dimension (in the well-known slogan, “to say something is to do something”). Speech act theory has had consequences and import in research fields as diverse as philosophy of language, ethics, political philosophy, philosophy of law, linguistics, artificial intelligence and feminist philosophy.

This article describes Austin’s linguistic method and his speech act theory, and it describes the original contributions he made to epistemology and philosophy of action. It closes by focusing on two main developments of speech act theory: the dispute between conventionalism and intentionalism, and the debate on free speech, pornography, and censorship.

Table of Contents

  1. Life and Method
  2. Philosophy of Language
    1. Meaning and Truth
    2. Speech Acts
  3. Epistemology
    1. Sense and Sensibilia
    2. Other Minds
      1. Knowledge of Particular Empirical Facts
      2. Knowledge of Someone Else’s Mental States
  4. Philosophy of Action
  5. Legacy
    1. Speech Act Theory
      1. Conventionalism and Intentionalism
      2. Free Speech and Pornography
    2. Epistemology
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life and Method

John Langshaw Austin was born on March 26, 1911, in Lancaster, England, and was trained as a classicist at Balliol College, Oxford. He first came to philosophy by studying Aristotle, who deeply influenced his own philosophical method. He also worked on the philosophy of Leibniz and translated Frege’s Grundlagen. Austin spent his whole academic life in Oxford, where he was White’s Professor of Moral Philosophy from 1952 until his death in 1960. During the Second World War Austin was commissioned in the Intelligence Corps, and played a leading role in the organization of D-Day, leaving the British Army in 1945 with the rank of Lt. Colonel. Austin published only seven articles. According to Searle, Austin’s reluctance to publish was partly characteristic of his own attitude, but it was also part of the culture of Oxford at the time: “Oxford had a long tradition of not publishing during one’s lifetime, indeed it was regarded as slightly vulgar to publish” (Searle 2001, 227). Most of Austin’s work was thus published posthumously, and includes a collection of papers (Austin 1961), and two series of lectures reconstructed by the editors on the basis of Austin’s lecture notes: the lectures on perception, edited by Geoffrey Warnock (Austin 1962a), and the William James Lectures held at Harvard in 1955, devoted to speech acts and edited by James O. Urmson (Austin 1962b) for the first edition, and by Urmson and Marina Sbisa for the second (Austin 1975).

Austin was profoundly dissatisfied not only with the traditional way of philosophizing, but also with Logical Positivism (whose leading figure in Oxford was Alfred J. Ayer). His dissatisfaction was directed in particular at a way of practicing philosophy which, in Austin’s view, produced tidy dichotomies and, instead of clarifying the problems at issue, led to oversimplifications and to dogmatic and preconceived schemes of thought. Austin thus developed a new philosophical methodology and style, which became paradigmatic of Ordinary Language Philosophy. Austin does not claim that this method is the only correct one to adopt. Rather, it represents a valuable preliminary approach to at least some of the most stubborn problems in the tradition of Western philosophy, such as those of freedom, responsibility, and perception. According to Austin, the starting point in philosophy should be the analysis of the concepts and ways of expression of everyday language, and the reconnaissance of our ordinary language. This would help, on the one hand, to dismantle the ‘philosophical mistakes’ induced by the way philosophers use certain ordinary words, and, on the other, to gain access to actual features of the world picked out by the expressions we use to describe it.

Ordinary language is not the last word: in principle it can everywhere be supplemented and improved upon and superseded. Only remember, it is the first word. [Austin 1956a/1961, 185]

According to Austin, in ordinary language are deposited all the distinctions and connections established by human beings, as if our words in their daily uses “had stood up to the long test of the survival of the fittest” (Austin 1956a/1961, 182). It is necessary, first of all, to carefully consider the terminology available to us, by making a list of the expressions relevant to the domain at issue: a sort of ‘linguistic phenomenology’ carried out with the help of the dictionary and of the imagination, by searching for combinations of expressions and synonyms, devising linguistic thought-experiments and unusual contexts and scenarios, and speculating about our linguistic reactions to them. The examination of ordinary language enables us to pay attention to the richness of linguistic facts and to tackle philosophical problems from a fresh and unprejudiced perspective.

To be sure, this is not a new methodology in the history of philosophy. Still, this strategy is now carried out with distinctive meticulousness and on a large scale, on the one hand, and is undertaken and evaluated collectively, so as to gain a reasonable consensus, on the other. For Austin, philosophy is not an endeavor to be pursued privately, but a collective labor. This was in fact the gist of Austin’s “Saturday mornings,” weekly meetings held during term at Oxford and attended by philosophers of language, moral philosophers and jurists. Austin’s method was better suited to philosophical discussion and research along these lines than to publication: we can nonetheless fully appreciate it in How to Do Things with Words (1962b), and in papers like “A Plea for Excuses” (1956a) and “Ifs and Cans” (1956b).

Austin’s method has been regarded by some as pedantic, as a mere insistence that all we must do is use our words carefully, with no genuine interest in the phenomena which arouse our philosophical worries. There are indeed limitations to his methodology: on the one hand, many philosophical questions still remain untouched even after a meticulous reformulation; on the other hand, our everyday language does not embody all the distinctions which could be relevant for a philosophical inquiry (compare Searle 2001, 228-229). But Austin is far from being concerned merely by language: his focus is always also on the phenomena talked about – as he says, “It takes two to make a truth” (Austin 1950/1961, 124fn; compare Martin 2007).

2. Philosophy of Language

a. Meaning and Truth

With the help of his innovative methodology, Austin takes a new stance towards our everyday language. As is well known, philosophers and logicians like Gottlob Frege, Bertrand Russell, the earlier Ludwig Wittgenstein, Alfred Tarski and Willard Quine want to build a perfect language for philosophical and scientific communication, that is, an artificial language devoid of all the ambiguities and imperfections that characterize natural languages. Conversely, ordinary language philosophers (besides Austin, the later Wittgenstein, Friedrich Waismann, Paul Grice, Peter Strawson) view natural language as an autonomous object of analysis – and its apparent imperfections as signs of richness and expressive power.

In a formal language, semantic conventions associate with each term and each sentence a fixed meaning, once and for all. By contrast, the expressions of a natural language seem essentially incomplete; as a result, it seems impossible to fully verify our everyday sentences. The meanings of our terms are only partially constrained, depending on the beliefs, desires, goals, activities, and institutions of our linguistic community. The boundaries, even when temporarily fixed, are unstable and open to new uses and new conventions in unusual situations. In “The Meaning of a Word,” Austin takes into consideration different contexts of utterance of sentences containing familiar terms, in order to create unusual occasions of use: extraordinary or strange cases are meant to contrast with our intuitions and reveal moments of tension hidden in natural language. What are we to say about “It’s a goldfinch” uttered about a goldfinch that “does something outrageous (explodes, quotes Mrs. Woolf, or what not)”? (Austin 1940/1961, 88). Austin’s main point is that it is in principle impossible to foresee all the possible circumstances which could lead us to modify or retract a sentence. Our everyday terms are extremely flexible, and can still be used in odd cases. Concerning “That man is neither at home nor not at home,” Austin writes: “Somehow we cannot see what this ‘could mean’ – there are no semantic conventions, explicit or implicit, to cover this case: yet it is not prohibited in any way – there are no limiting rules about what we might or might not say in extraordinary cases” (Austin 1940/1961, 68). Of a dead man, lying on his bed, what would we say? That he is at home? That he is not at home?

Similarly, should the utterance “France is hexagonal” be taken as true or false? According to Austin we must take into consideration the speaker’s goals and intentions, the circumstances of utterance, and the obligations we undertake in asserting something. Assertions are not simply true or false, but more or less objective, adequate, exaggerated, and rough: “‘true’ and ‘false’ […] do not stand for anything simple at all; but only for a general dimension of being a right or proper thing to say as opposed to a wrong thing, in these circumstances, to this audience, for these purposes and with these intentions” (Austin 1975, 145).

b. Speech Acts

Austin’s most celebrated contribution to contemporary philosophy is his theory of speech acts, presented in How to Do Things with Words (Austin 1975). While for philosophers interested mainly in formal languages the main function of language is describing reality, representing states of affairs and making assertions about the world, for Austin our utterances have a variety of different uses. A similar point is made in Philosophical Investigations by Wittgenstein, who underlines the “countless” uses we may put our sentences to (Wittgenstein 1953: § 23). Austin contrasts the “desperate” Wittgensteinian image of the countless uses of language with his accurate catalogue of the various speech acts we may perform – a taxonomy similar to the one employed by an entomologist trying to classify the many (but not countless) species of beetles.

Not all utterances, then, are assertions concerning states of affairs. Take Austin’s examples

(1) I name this ship the ‘Queen Elizabeth’

as uttered in the course of the launching of a ship, or

(2) I bet you sixpence it will rain tomorrow.

The utterer of (1) or (2) is not describing the launching ceremony or a bet, but doing it. By uttering these sentences we bring about new facts, “as distinguished from producing consequences in the sense of bringing about states of affairs in the ‘normal’ way, that is, changes in the natural course of events” (Austin 1975: 117): by uttering (1) or (2) we modify the social reality, institute new conventions, and undertake obligations. In the first lessons of How to Do Things with Words, Austin traces a tentative distinction between constatives and performatives, to be abandoned in the subsequent lessons. Constatives, on the one hand, are sentences like

(3) The cat is on the mat:

they aim to describe states of affairs and are assessable as true or false. Performatives like (1) and (2), on the other hand, do rather than report something: they perform acts governed by norms and institutions (such as the act of marrying or baptizing) or social conventions (such as the act of betting or promising) and do not seem assessable as true or false. This last controversial claim is not argued for (“I assert this as obvious and do not argue it,” Austin 1975, 6): the claim is only provisional and subject to revision in the light of later sections (Austin 1975, 4n).

According to Austin it is possible and fruitful to shed light on standard cases of successful communication, and to specify the conditions for the smooth functioning of a performative, by focusing on non-standard cases and communicative failures. As we have said, performatives cannot be assessed as true or false, but they are subject to different types of invalidity or failure, called “infelicities.” In some cases the attempt to perform an act fails or “misfires.” The act is “null and void” on the basis of the violation of two kinds of rules:

A.1: there must exist an accepted conventional procedure having a certain conventional effect, that procedure to include the uttering of certain words by certain persons in certain circumstances;

A.2: that procedure must be invoked in adequate circumstances and by appropriate persons.

Further infelicities concern the execution of the procedure, for it must be executed by all participants both

B.1: correctly, and

B.2: completely.

Finally, there are cases in which the performance of an act is achieved, but there is an abuse of the procedure, due to the violation of two kinds of rules:

C.1: the procedure must be executed by the speaker with appropriate thoughts, feelings or intentions;

C.2: the participants must subsequently conduct themselves in accordance with the procedure performed.

As we said, in How to Do Things with Words Austin draws the distinction between constatives and performatives merely as a preliminary to the presentation of his main thesis, namely that there is a performative dimension in any use of language. The putative class of performatives seems to admit only specific verbs (like to promise, to bet, to apologize, to order), all in the first person singular present. Any attempt to characterize the class with grammatical or lexical criteria, however, is bound to fail. We may in fact perform the act of, say, ordering by using an explicit performative, as in

(4) I order you to close the door

but also with

(5) Close the door!

Similarly, there are also performative verbs for acts of stating, asserting, or concluding, as in

(6) I assert that the Earth is flat.

The very distinction between utterances assessable along the dimension of truth and falsehood (constatives) and utterances assessable along the dimension of felicity or infelicity (performatives) is a mere illusion. To show this, Austin presents two arguments:

a) on the one hand, constatives may be assessed as happy or unhappy: like performatives, assertions require appropriate conditions for their felicitous performance (to give an example, it does not seem appropriate to make an assertion one does not believe);

b) on the other hand, performatives may be assessed in terms of truth and falsehood, or in terms of some conformity to the facts: of a verdict we say that it is fair or unfair, of a piece of advice that it is good or bad, of praise that it is deserved or not.

By a) and b) Austin is led to the conclusion that the distinction between constatives and performatives is inadequate: all sentences are tools we use in order to do something – to say something is always to do something. Therefore it is necessary to develop a general theory of the uses of language and of the acts we perform by uttering a sentence: a general theory of what Austin calls illocutionary force.

Within the same total speech act Austin distinguishes three different acts: locutionary, illocutionary and perlocutionary.

  • The locutionary act is the act of saying something, the act of uttering certain expressions, well-formed from a syntactic point of view and meaningful. It may furthermore be analyzed into a phonetic act (the act of uttering certain noises), a phatic act (the act of uttering words, that is, sounds as conforming to a certain vocabulary and grammar), and a rhetic act (the act of using these words with a certain meaning – sense or reference).
  • To perform a locutionary act is also and eo ipso to perform an illocutionary act (Austin 1975, 98). An illocutionary act is a way of using language, and its performance is the performance of an act in saying something as opposed to performance of an act of saying something. It corresponds to the force that an utterance like (5) has in a particular context: order, request, entreaty, or challenge.
  • The perlocutionary act corresponds to the effects brought about by performing an illocutionary act, to its consequences (intentional or non-intentional) on the feelings, thoughts, or actions of the participants. According to Austin the speaker, by saying what she says, performs another kind of act (like persuading, convincing, or alerting) because she can be taken as responsible for those effects (compare Sbisa 2006 and 2013). Yet the perlocutionary consequences of illocutionary acts are non-conventional, not being completely under the speaker’s control, but rather related to the specific circumstances in which the act is performed. Austin makes a further distinction between perlocutionary objects (the consequences brought about by an illocutionary act in virtue of its force – as alerting can be a consequence of the illocutionary act of warning) and perlocutionary sequels (the consequences brought about by an illocutionary act without a systematic connection to its force – as surprising can be a consequence of the illocutionary act of asserting) (Austin 1975: 118).

In the last lesson of How to Do Things with Words Austin tentatively singles out five classes of illocutionary acts, using as a starting point a list of explicit performative verbs: Verdictives, Exercitives, Commissives, Behabitives, Expositives.

  • The class of Verdictives includes acts (formal or informal) of giving a verdict, estimate, or appraisal (as acquitting, reckoning, assessing, diagnosing). These may concern facts or values.
  • The class of Exercitives includes acts of exerting powers, rights or influence (as appointing, voting, ordering, warning). These presuppose that the speaker has a certain kind of authority or influence.
  • The class of Commissives includes acts that commit the speaker to doing something (as promising, undertaking, consenting, opposing, betting).
  • The class of Expositives includes acts that clarify reasons, arguments, or communications (as affirming, denying, stating, describing, asking, answering).
  • The class of Behabitives includes acts having to do with attitudes and social behavior (as apologizing, congratulating, commending, thanking). These include reactions to other people’s behavior or fortune, and are particularly vulnerable to insincerity (condition C.1).

Austin characterizes the illocutionary act as the conventional aspect of language (to be contrasted with the perlocutionary act). As we said before, for any speech act there must exist an accepted conventional procedure having a certain conventional effect (condition A.1): if the conventional procedure is executed according to further conditions, the act is successfully performed. This claim seems plausible as far as institutional or social acts (like naming a ship, or betting) are concerned: the conventional dimension is here manifest because it is our society (and sometimes our laws) that validates those acts. The claim seems less plausible as far as speech acts in general are concerned: nothing conventional, or semantic, makes (5) an order, or a challenge, or an entreaty – the illocutionary force of the utterance is fixed by the context of utterance. More generally, according to Austin the speaker’s intentions play only a minor role in the performance of a speech act (violation of condition C.1 leads to an abuse of the procedure, but not to a failure of the speech act). Drawing on Gricean ideas, Peter Strawson argues that what makes (5) an illocutionary act of ordering instead of entreating are the speaker’s intentions – intentions the speaker may (but need not) make available to the audience using linguistic conventions:

I do not want to deny that there may be conventional postures or procedures for entreating… But I do want to deny that an act of entreaty can be performed only as conforming to some such conventions. What makes X’s words to Y an entreaty not to go is something complex enough, no doubt relating to X’s situation, attitude to Y, manner, and current intention. [Strawson 1964, 444; compare Warnock 1973 and Bach & Harnish 1979]

Marina Sbisa disagrees with Strawson’s reading of the conventionality of illocutionary acts and identifies two distinct claims in Austin’s conventionalism: (a) illocutionary acts are performed via conventional devices (like linguistic conventions); and (b) illocutionary acts produce conventional effects. Austin specifies three kinds of conventional effects: the performance of an illocutionary act involves the securing of uptake, that is, bringing about the understanding of the meaning and force of the locution; the illocutionary act takes effect in conventional ways, as distinguished from producing consequences in the sense of bringing about changes in the natural course of events; and many illocutionary acts invite by convention a response or sequel (Austin 1975, 116-117). According to Sbisa, Austin deals too briefly with the conventionality of the effects produced by illocutionary acts (b) as opposed to the conventionality of the devices by which we perform illocutionary acts (a), leaving room for Strawson’s distinction between two groups of illocutionary acts: those depending on some convention the speaker follows, and those depending on a particular kind of intention of the speaker (Sbisa 2013, 26). On this point see below, § 5.a.

3. Epistemology

There are two main examples of Austin’s philosophical method applied to epistemological issues: the paper “Other Minds” (1946), and the series of lectures Sense and Sensibilia, delivered at Oxford and Berkeley between 1947 and 1958, and published in 1962. “Other Minds” was delivered at a symposium of the same name at a joint session of the Mind Association and the Aristotelian Society (John Wisdom, Alfred J. Ayer, and Austin were the main participants), and its topic was a much debated one in the middle decades of the twentieth century. In Sense and Sensibilia Austin applies his linguistic analysis to the sense-data theory and to the more general foundational theory of knowledge, within which sense-data played the role of the basis of the very structure of empirical knowledge, in order to clarify the concept of perception.

a. Sense and Sensibilia

These lectures represent a very detailed criticism of the claims put forward by A. J. Ayer in The Foundations of Empirical Knowledge (1940), and, to a lesser extent, of those contained in H. H. Price’s Perception (1932) and G. J. Warnock’s Berkeley (1953). Austin challenges the sense-data theory, according to which we never directly perceive material objects. Rather, the theory claims, we perceive nothing but sense-data.

The notion of sense-data is introduced to identify the object of perception in abnormal, exceptional cases, for example, refraction, mirages, mirror-images, hallucinations, and so forth. In such cases perceptions can be either ‘qualitatively delusive’ or ‘existentially delusive,’ depending on whether sense-data endow material things with qualities that they do not really possess, or whether the material things presented do not exist at all. In all such cases, the sense-data theorist maintains, we directly perceive sense-data. The subsequent step in this argument, known as the argument from illusion, is the claim that in ordinary cases too we directly perceive merely sense-data.

Austin’s goal is not to answer the question “What are the objects of perception?” Austin aims to get rid of “such illusions as ‘the argument from illusion’” on the one hand, and to offer on the other a “technique for dissolving philosophical worries” by clarifying the meaning of words such as ‘real,’ ‘look,’ ‘appear’ and ‘seem’ (Austin 1962a, 4-5). The argument from illusion amounts to a misconception inasmuch as it introduces a bogus dichotomy: that between sense-data and material objects. Austin challenges this dichotomy, and the subsequent claim that abnormal, illusory perceptions do not differ from normal, veridical ones in terms of quality (in both cases sense-data are perceived, though in different degrees), by presenting different cases of perceptions in order to show that “there is no one kind of thing that we ‘perceive’ but many different kinds, the number being reducible if at all by scientific investigation and not by philosophy” (Austin 1962a, 4).

Besides chairs, tables, pens and cigarettes, indicated by the sense-data theorist as examples of material objects, Austin draws attention to rainbows, shadows, flames, vapors and gases as cases of things we ordinarily say that we perceive, even though we would not classify them as material things. Likewise, Austin argues, there is no single way in which we are ‘deceived by senses’ (that is, to perceive something unreal or not material), but “things may go wrong […] in lots of different ways – which don’t have to be, and must not be assumed to be, classifiable in any general fashion” (Austin 1962a, 13). Moreover, Austin asks whether we would be prone to speak of ‘illusions’ with reference to dreams, phenomena of perspective, photos, mirror-images or pictures on the screen at the cinema. By recalling the familiarity of the circumstances in which we encounter these phenomena and the ways in which we ordinarily consider them, Austin intends to show how the dichotomies between sense-data and material objects, and between illusory perceptions and veridical ones, are in fact spurious alternatives.

The facts of perception are “diverse and complicated,” and the analysis of words in their contexts of use enables us to make the subtle distinctions obliterated by the “obsession” some philosophers have with certain words (for example, ‘real’ and ‘reality’), and by the lack of attention to the (not even remotely interchangeable) uses of verbs like ‘look,’ ‘appear,’ and ‘seem.’ The way the sense-data theorists use the words ‘real’ and ‘directly’ in the argument from illusion is not the ordinary use of these words, but a new use, which nevertheless fails to be explained. Austin does not want to rule out the possibility of tracing, for theoretical purposes, new distinctions, and thus of amending our linguistic practices by introducing technical terms, but he proposes that we always pay attention to the ordinary uses of our words, in order to avoid oversimplifications and distortions.

As an example, Austin examines the word ‘real’ and contrasts the ordinary, firmly established meanings of that word, as fixed by the everyday ways we use it, with the ways it is used by sense-data theorists in their arguments. What Austin recommends is a careful consideration of the ordinary, multifarious meanings of that word, so as not to posit, for example, a non-natural quality designated by that word, common to all the things to which the word is applied (‘real ducks,’ ‘real cream,’ ‘real progress,’ ‘real color,’ ‘real shape,’ and so forth).

Austin highlights the complexities proper to the uses of ‘real’ by observing that it is (i) a substantive-hungry word that often plays the role of (ii) adjuster-word, a word by means of which “other words are adjusted to meet the innumerable and unforeseeable demands of world upon language” (Austin 1962a, 73). Like ‘good,’ it is (iii) a dimension-word, that is, “the most general and comprehensive term in a whole group of terms of the same kind, terms that fulfil the same function” (Austin 1962a, 71): that is, ‘true,’ ‘proper,’ ‘genuine,’ ‘live,’ ‘natural,’ ‘authentic,’ as opposed to terms such as ‘false,’ ‘artificial,’ ‘fake,’ ‘bogus,’ ‘synthetic,’ ‘toy,’ but also to nouns like ‘dream,’ ‘illusion,’ ‘mirage,’ ‘hallucination.’ ‘Real’ is also (iv) a word whose negative use “wears the trousers” (a trouser-word) (Austin 1962a, 70).

In order to determine the meaning of ‘real’ we have to consider, case by case, the ways and contexts in which it is used. Only by doing so, according to Austin, can we avoid introducing false dichotomies (for a criticism of Austin’s attack on sense-data see Ayer 1967 and Smith 2002).

b. Other Minds

In this paper Austin tackles the philosophical problems of the possibility of knowing someone else’s mental states (for example, that another man is angry) and of the reliability of the reasons to which we appeal when we justify our assertions about particular empirical facts (for example, that the bird moving around my garden is a goldfinch). The target of Austin’s analysis is the skeptical conclusion that certain philosophers draw from challenging these possibilities (in this case Austin is addressing some of John Wisdom’s contentions). As to the knowledge of particular empirical facts, given that “human intellect and senses are, indeed, inherently fallible and delusive” (Austin 1946/1961, 98), the skeptic claims that we should never, or almost never, say that we know something, except for what we can perceive with our senses now, for example, “Here is something that looks red to me now.” On the other hand, the possibility of knowing someone else’s mental states is challenged by means of the idea of a privileged access to our own sensations and mental states, such that only about them can we not ‘be wrong’ “in the most favoured sense” (Austin 1946/1961, 90).

i. Knowledge of Particular Empirical Facts

Austin engages in an examination of the kinds of answers we would provide, in ordinary, concrete and specific circumstances, to challenges to our claims of knowledge. For instance, in responding to someone’s question, “How do you know?” in the face of my claim “That is a goldfinch,” my answer could appeal to my past experience, by virtue of which I have learned something about goldfinches, and hence to the criteria for determining that something is a goldfinch, or to the circumstances of the current case, which enable me to determine that the bird moving around my garden now is a goldfinch. The ways in which, in ordinary circumstances, our claims can be challenged, or be wrong, are specific (ways that the context helps us to determine), and there are recognized procedures appropriate to the particular type of case to which we can appeal to justify or verify such claims.

The precautions to take, in ordinary cases, in order to claim to know something “cannot be more than reasonable, relative to current intents and purposes” (Austin 1946/1961, 88), inasmuch as in order to suppose that one is mistaken there must be some concrete reason relative to the specific case. On the contrary, the “wile of the metaphysician,” Austin claims, amounts to phrasing her doubts and questions in a very general way, and “not specifying or limiting what may be wrong,” “so that I feel at a loss ‘how to prove’” what she has challenged (Austin 1946/1961, 87).

By drawing a parallel with the performative formula ‘I promise,’ Austin claims that in uttering ‘I know’ the speaker does not describe her mental state (this would be, in Austin’s terms, a descriptive fallacy). Rather, in the appropriate circumstances, she does something: she gives others her word, that is, her authority for saying that ‘S is P.’

Austin’s analysis of the epistemological terms in their ordinary, specific uses is meant to determine the conditions under which our claims are felicitous, successful speech acts. These conditions are the ones we typically appeal to in order to justify our claims of knowledge should they be challenged.

Whether Austin’s strategy proves to be successful against the skeptical challenge, which rests on a metaphysical and logical possibility, is a further issue to be resolved.

ii. Knowledge of Someone Else’s Mental States

Austin objects to the idea (as claimed, for example, by Wisdom) that we know (if ever we do) someone else’s feelings only from the physical symptoms of those feelings: we never know someone else’s feelings in themselves, as we know our own. According to Austin, claims about someone else’s mental states can be tackled like those about particular empirical facts, even though the former are more complex, owing to “the very special nature (grammar, logic) of feelings” (Austin 1946/1961, 105). It is on this special nature that Austin’s analysis in this paper is meant to shed light.

Affirming of someone, ‘I know he is angry,’ requires, on the one hand, a certain familiarity with the person to whom we are attributing the feeling; in particular, familiarity with situations of the same type as the current one. On the other hand, it seems necessary to have had a first-person experience of the relevant feeling or emotion.

A feeling (say anger), Austin claims, presents a close connection both with its natural expressions/manifestations, and with the natural occasions of those manifestations, so that “it seems fair to say that ‘being angry’ is in many respects like ‘having mumps’. It is a description of a whole pattern of events, including occasion, symptoms, feeling and manifestation, and possibly other factors besides” (Austin 1946/1961, 109).

Against Wisdom’s claim that we never get at someone else’s anger, but only at the symptoms/signs of his anger, Austin draws attention to the ways we talk about others’ feelings, and highlights the general pattern of events “peculiar to the case of ‘feelings’ (emotions)” on which our attributions are based (Austin 1946/1961, 110). Moreover, emphasis is placed on the fact that in order for feelings and emotions to be attributed, and also self-attributed, a problem of recognition, and of familiarity with the complexities of such a pattern, arises, owing to the very way in which the uses of the relevant terms have been learnt.

Emotion terms, for their part, are vague, for, on the one hand, they are often applied to a wide range of situations, while on the other, the patterns they cover are rather complex, such that in “unorthodox” cases there may be hesitation in attribution. Apart from this kind of intrinsic vagueness, doubts may arise as to the correctness of a feeling attribution, or its authenticity, due to cases of misunderstanding or deception. But these cases are special ones, and there are, as in the goldfinch case, “established procedures” for dealing with them.

Unlike the goldfinch case, in which “sensa are dumb” (Austin 1946/1961, 97), in the case of feeling attribution a special place is occupied, within its complex pattern of events, by “the man’s own statement as to what his feelings are” (Austin 1946/1961, 113). According to Austin, “believing in other persons, in authority and testimony, is an essential part of the act of communicating, an act which we all constantly perform. It is as much an irreducible part of our experience as, say, giving promises, or playing competitive games, or even sensing coloured patches” (Austin 1946/1961, 115). Austin thus aims at blocking the skeptical argument by claiming that the possibility of knowing others’ states of minds and feelings is a constitutive feature of our ordinary practices as such, for which there is no “justification” (ibid.). Again, it may still be open to contention whether this is sufficient to refute skepticism.

4. Philosophy of Action

Austin’s contribution to the philosophy of action is traceable mainly to two papers: “A Plea for Excuses” (1956a) and “Three Ways of Spilling Ink” (1966), where the notions of ‘doing an action’ and ‘doing something’ are clarified by means of the linguistic analysis of excuses, that is, by considering “the different ways, and different words, in which on occasion we may try to get out of things, to show that we didn’t act ‘freely’ or were not ‘responsible’” (Austin 1966/1961, 273). In keeping with the method dear to Austin, it is through the analysis of abnormal cases, or failures, that light can be thrown on the normal and standard cases. An examination of excuses should enable us to gain an understanding of the notion of action, by means of a preliminary elucidation of the notions of responsibility and freedom.

As in the case of ‘knowing,’ Austin’s contribution is one of clarification of use, which sheds light on the notion of ‘doing an action.’

According to Austin, from the analysis of the modifying expressions occurring in excuses (for example, ‘unwittingly,’ ‘impulsively’) and in accusations (‘deliberately,’ ‘purposely,’ ‘on purpose’), it is possible to classify the different breakdowns affecting actions, and thus to dismantle the complex internal details of the machinery of action. Far from being reducible to merely making some bodily movements, doing an action is organized into different stages: the intelligence, the appreciation of the situation, the planning, the decision, and the execution. Moreover, apart from the stages, “we can generally split up what might be named as one action in several distinct ways, into different stretches or phases” (Austin 1956a/1961, 201). In particular, by using a certain term to describe what someone did, we can cover either a smaller or larger stretch of events, distinguishing the act from its consequences, results, or effects. Austin thus points out that it is crucial to determine ‘what modifies what,’ that is, what is being excused, because of the different possible ways in which it is possible to refer to ‘what she did.’ Depending on the way we delimit the act, we may hold a person responsible or not, hence it is extremely important to determine with precision what is being justified.

In order to ascertain someone’s responsibility for a certain action, for example, that of a child who has spilled ink in class, it is necessary to consider as distinct cases whether he did it ‘intentionally,’ ‘deliberately,’ or ‘on purpose.’ Austin reveals the differences among the three terms in two steps. First, he considers imaginary actions whose descriptions present two of the terms as expressly dissociated: for example, to do something intentionally, but not deliberately (for example, to impulsively stretch out one’s hand to make things up at the end of a quarrel; see Austin 1966/1961, 276-277); intentionally, but not on purpose (for example, to act wantonly); deliberately, but not intentionally (for example, unintended, though foreseeable, consequences/results of some actions of mine – to ruin one’s debtor by insisting on payment of due debts; see Austin 1966/1961, 278); and on purpose, but not intentionally (this last case seems to verge on impossibility, and has a paradoxical flavor). The logical limits of these combinations enable Austin to single out the differences among the concepts under investigation. Subsequently, as a second step, such differences are made apparent by an analysis of the grammar and philology of the terms (adjectival terminations, negative forms, the prepositions used to form adverbial expressions, and so forth). From this examination each term turns out to mark a different way in which it is possible to do an action, or a different ‘aspect’ of action.

Austin’s analysis of the notions of action and responsibility casts light on that of freedom, which is clarified by the examination of “all the ways in which each action may not be ‘free’” (Austin 1956a/1961, 180). As with the word ‘real,’ in the case of ‘free’ it is the negative use that wears the trousers: the notion of freedom derives its meaning from the concepts excluded, case by case, by each use of the term ‘free’ (for example, consider the following utterances: “I had to escape, since he threatened me with a knife;” “The glass slipped out of my hand because of my astonishment;” “I can’t help checking the email every five minutes”). By considering the specific, ordinary situations in which it is possible or not to ascertain someone’s responsibility for an act (where an issue of responsibility arises), the notions of freedom and responsibility emerge as closely intertwined. In Austin’s words: “As ‘truth’ is not a name for a characteristic of assertions, so ‘freedom’ is not a name for a characteristic of actions, but a name of a dimension in which actions are assessed” (Austin 1956a/1961, 180). Austin does not provide a positive account of the notion of freedom: it is rather elucidated through attention to the different ways in which our actions may not be free.

5. Legacy

a. Speech Act Theory

Since the end of the 1960s Speech Act Theory (SAT) has been developed in many directions. Here we will concentrate on two main strands: the dispute between conventionalism and intentionalism on the one hand, and the debate on pornography, free speech, and censorship on the other.

i. Conventionalism and Intentionalism

Peter Strawson’s contribution to Austin’s SAT, with ideas drawn from Paul Grice’s analysis of the notion of non-natural meaning, marks the beginning of the dispute over the role played by convention and intention respectively in the performance of speech acts. Although almost all the developments of SAT contain, in different degrees, both a conventionalist and an intentionalist element, it may be useful to distinguish two main traditions, depending on the preponderance of one element over the other: a conventionalist, Austinian tradition, and an intentionalist, neo-Gricean one. There are two versions of conventionalism: John Searle’s conventionalism of the means, and Marina Sbisa’s conventionalism of the effects (the distinction is drawn by Sbisa).

Central to Searle’s systematization of SAT is the hypothesis that speaking a language amounts to being engaged in a rule-governed activity, and that human languages may be regarded as the conventional realizations of a set of underlying constitutive rules, that is, rules that create, rather than just regulate, an activity, to the effect that the existence of such an activity logically depends on that of the rules constituting it. Searle’s analysis aims at specifying a set of necessary and sufficient conditions for the successful performance of an illocutionary act (Searle concentrates on the act of promising, but the analysis is extended to the other types of illocutionary acts, of which several classifications have been developed within SAT). Starting from the felicity conditions for the performance of an act, a set of semantic rules for the use of the illocutionary force indicating device for that act is extracted. Searle’s position thus qualifies as a conventionalism of the means because the felicitous performance of an illocutionary act obtains thanks to the conformity of the meaning of the utterance used to perform it to the linguistic conventions proper to the language within which the act is performed.

Sbisa developed Austin’s thesis of the conventionality of the illocutionary force by linking it to the second of the three kinds of conventional effects characteristic of the illocutionary act: the production of states of affairs in a way different from natural causation, namely, the production of normative states of affairs, such as obligations, commitments, entitlements, rights, and so forth. According to Sbisa, each type of illocutionary act produces specific conventional effects, which correspond to “assignments to or cancellations from each one of the participants of modal predicates” (Sbisa 2001, 1797), which are conventional insofar as their production depends on intersubjective agreement. The hallmark of such effects is, unlike physical actions, their being liable to annulment, their defeasibility.

The main exemplification of the intentionalist tradition within SAT is the analysis developed by Kent Bach and Robert Harnish. Moving in the opposite direction from Searle, Bach and Harnish claim that the link between the linguistic structure of an utterance and the illocutionary act it serves to perform is never semantic, or conventional, but rather inferential.

In order for the illocutionary act to be determined by the hearer, in addition to ‘what is said’ (the semantic content of the utterance), the following are also necessary: the speaker’s illocutionary communicative intention (this intention is reflexive: its fulfillment consists in its being recognized, on the part of the hearer, as intended to be recognized); the contextual beliefs shared by the interlocutors; general beliefs about the communicative situation which play the role of presumptions, that is, expectations about the interlocutor’s behavior so basic that they constitute the very conditions of possibility of the communicative exchange (for example, the belief of belonging to the same linguistic community, and the presumption that whenever a member of the community utters an expression, she does so with some recognizable illocutionary intent); and, finally, conversational assumptions, drawn from Gricean conversational maxims.

All these elements combine to bring about the inference that enables the hearer to move from the level of the utterance act up to the illocutionary level, going through the locutionary one (note that Bach and Harnish, and Searle himself, part company with Austin in distinguishing the components of the speech act). Bach and Harnish put forward a model, the Speech Act Schema (SAS), which represents the pattern of the inference a hearer follows (and that the speaker intends that she follow) in order to understand the illocutionary act as an act of a certain type (as an order, a promise, an assertion, and so forth).

The merit of the intentionalist analysis offered by Bach and Harnish is that it attempts to integrate SAT within a global account of linguistic communication whose aim is to be psychologically plausible. A shortcoming of this approach concerns an essential element of SAT itself: the emphasis put on the normative dimension produced by the performance of speech acts. The intentionalist analysis fails to provide an account of this normative dimension, since it maintains that the creation of commitments and obligations is a “moral” question not answerable by SAT. On the other hand, the conventionalist tradition in both its variants seems to show an opposite and equally unsatisfactory tendency. It has in fact been argued, specifically by relevance theorists (see Sperber and Wilson 1995), that the illocutionary level, as identified by SAT, does not effectively play a role in the process of linguistic comprehension. That it does play such a role, as SAT claims, is an unjustified hypothesis, so runs the relevance theorists’ objection. More generally, this objection urges speech act theorists to confront the cognitive turn in the philosophy of language and linguistics.

ii. Free Speech and Pornography

One of the more surprising applications of Speech Act Theory concerns the debate on free speech, pornography and censorship (compare Langton 1993, Hornsby 1993, Hornsby & Langton 1998, Saul 2006, Bianchi 2008). Liberal defenders of pornography maintain that pornography – even when violent and degrading – should be protected to defend a fundamental principle: the right to freedom of speech or expression. Conversely, Catharine MacKinnon claims that pornography violates women’s right to free speech: more precisely, pornography not only causes the subordination and silencing of women, but it also constitutes women’s subordination (a violation of their civil right to equal civil status) and silencing (a violation of their civil right to freedom of speech; MacKinnon 1987). MacKinnon’s thesis has been widely discussed and criticized. Rae Langton and Jennifer Hornsby offer a defense of her claim in terms of Austin’s speech act theory: works of pornography can be understood as illocutionary acts of subordinating women or of silencing women. On the one hand, pornography (or at least violent and degrading pornography where women are portrayed as willing sexual objects) subordinates women by ranking them as inferior and legitimizing discrimination against them. On the other hand, pornography silences women by creating a communicative environment that deprives them of their illocutionary potential. The claim that pornography silences women can be analysed along the lines drawn by Austin. According to Langton (1993), one may prevent the performance of a locutionary act by preventing the utterance of certain expressions, using physical violence, institutional norms, or intimidation (locutionary silence). Or one may prevent the performance of a perlocutionary act by disregarding the illocutionary act, even if felicitously achieved (perlocutionary frustration).
Or else one may prevent the performance of an illocutionary act by creating a communicative environment that precludes the uptake of the force of the speech act, or the recognition of the speaker’s authority, particularly with respect to women’s refusals of unwanted sex (illocutionary disablement). Indeed in Langton and Hornsby’s perspective, pornography might prevent women from performing the illocutionary act of refusing sex – by contributing to situations where a woman’s ‘No’ is not recognized as a refusal, and rape occurs as a result. Against the protection of pornography as a form of expression (a simple form of locution), Langton and Hornsby argue that pornography does more than simply express: it acts to silence the expressions of women (a form of illocution), thereby restricting their freedom of speech. We are thus confronted with two different, and conflicting, rights or, better, with different people in conflict over the same right, the right to free speech.

b. Epistemology

Austin’s legacy has been taken up in epistemology by authors such as Charles Travis (2004), who, in the debate about the nature of perception, argues that perception is nonrepresentational, and, along the same lines, by Michael G. F. Martin (2002), who defends a form of nonrepresentational realism.

Austin’s work may be considered a source of inspiration for contextualism in epistemology, but some scholars resist tracing much affinity between the two, pointing out important differences between Austin’s method and the contextualist approach (see, for example, Baz 2012 and McMyler 2011).

More generally, Mark Kaplan (2000, 2008) has insisted on the necessity of considering our ordinary practices of knowledge attribution as a methodological constraint for epistemology, in order for it to preserve its own intellectual integrity. According to Kaplan, the adoption of an Austinian methodology in epistemology would undermine skeptical arguments about knowledge.

Such an Austinian trend is surely a minority one within the epistemological landscape, but later contributions (Gustafsson and Sørli 2011, Baz 2012) have shown how Austin’s method can play a significant role in dealing with several issues on the contemporary philosophical agenda, in epistemology and the philosophy of language in particular.

6. References and Further Reading

a. Primary Sources

  • Austin, John L. 1940. “The Meaning of a Word.” The Moral Sciences Club of the University of Cambridge and the Jowett Society of the University of Oxford. Printed in 1961, James O. Urmson and Geoffrey J. Warnock (eds.), Philosophical Papers (pp. 55-75). Oxford: Clarendon Press.
  • Austin, John L. 1946. “Other Minds.” Proceedings of the Aristotelian Society, Supplementary Volume 20, 148-187. Reprinted in 1961, James O. Urmson and Geoffrey J. Warnock (eds.), Philosophical Papers (pp. 76-116). Oxford: Clarendon Press.
  • Austin, John L. 1950. “Truth.” Proceedings of the Aristotelian Society, Supplementary Volume 24, 111-128. Reprinted in 1961, James O. Urmson and Geoffrey J. Warnock (eds.), Philosophical Papers (pp. 117-133). Oxford: Clarendon Press.
  • Austin, John L. 1956a. “A Plea for Excuses.” Proceedings of the Aristotelian Society 57, 1-30. Reprinted in 1961, James O. Urmson and Geoffrey J. Warnock (eds.), Philosophical Papers (pp. 175-204). Oxford: Clarendon Press.
  • Austin, John L. 1956b. “Ifs and Cans.” Proceedings of the British Academy 42, 109-132. Reprinted in 1961, James O. Urmson and Geoffrey J. Warnock (eds.), Philosophical Papers (pp. 205-232). Oxford: Clarendon Press.
  • Austin, John L. 1961. Philosophical Papers. J. O. Urmson and G. J. Warnock (eds.), Oxford: Clarendon Press.
  • Austin, John L. 1962a. Sense and Sensibilia, G. J. Warnock (ed.). Oxford: Oxford University Press.
  • Austin, John L. 1962b. How to Do Things with Words. Oxford: Oxford University Press.
  • Austin, John L. 1966. “Three Ways of Spilling Ink.” The Philosophical Review 75, 427-440. Reprinted in James O. Urmson and Geoffrey J. Warnock (eds.), Philosophical Papers, 3rd edition (pp. 272-287). Oxford: Clarendon Press.
  • Austin, John L. 1975. How to Do Things with Words, James O. Urmson and Marina Sbisa (eds.). Oxford: Oxford University Press, 2nd edition.

b. Secondary Sources

  • Ayer, Alfred J. 1940. The Foundations of Empirical Knowledge. New York: Macmillan.
  • Ayer, Alfred J. 1967. “Has Austin Refuted the Sense-Datum Theory?” Synthese 17, 117-140.
  • Bach, Kent, and Harnish, Robert M. 1979. Linguistic Communication and Speech Acts. Cambridge: MIT Press.
    • The most important development of Speech Act Theory in an intentionalist/inferentialist framework of linguistic communication.
  • Baz, Avner. 2012. When Words Are Called For: A Defense of Ordinary Language Philosophy. Cambridge: Harvard University Press.
    • Arguments against Ordinary Language Philosophy are scrutinized and dismantled, and its general method is applied to the debate about the reliance on intuitions in philosophy, and to that between contextualism and anti-contextualism with respect to knowledge.
  • Berlin, Isaiah, Forguson, Lynd W., Pears, David F., Pitcher, George, Searle, John R., Strawson, Peter F., and Warnock, Geoffrey J. (Eds.). 1973. Essays on J. L. Austin. Oxford: Clarendon Press.
    • Together with Fann 1969, the most important collection of papers on Austin.
  • Bianchi, Claudia. 2008. “Indexicals, Speech Acts and Pornography.” Analysis 68, 310-316.
    • A defense of Langton’s thesis according to which works of pornography can be understood as illocutionary acts of silencing women.
  • Fann, Kuang T. (Ed.). 1969. Symposium on J. L. Austin. London: Routledge & Kegan Paul.
    • The standard anthology on Austin’s philosophy, with papers by several of Austin’s colleagues and students.
  • Gustafsson, Martin, and Sørli, Richard. (Eds.). 2011. The Philosophy of J. L. Austin. Oxford: Oxford University Press.
    • A collection of essays, aiming to reassess the importance of Austin’s philosophy by developing his ideas in connection with several issues in the contemporary philosophical agenda, especially in epistemology and philosophy of language.
  • Hornsby, Jennifer. 1993. “Speech Acts and Pornography.” Women’s Philosophy Review 10, 38-45.
  • Hornsby, Jennifer, and Langton, Rae. 1998. “Free Speech and Illocution.” Legal Theory 4, 21-37.
    • An analysis of works of pornography as illocutionary acts of subordinating women, or illocutionary acts of silencing women.
  • Kaplan, Mark. 2000. “To What Must an Epistemology Be True?” Philosophy and Phenomenological Research 61, 279-304.
  • Kaplan, Mark. 2008. “Austin’s Way with Skepticism.” In John Greco (ed.), The Oxford Handbook of Skepticism (pp. 348-371). Oxford: Oxford University Press.
    • Kaplan draws attention to the ordinary practices of knowledge attribution as a methodological constraint that epistemology should respect, a move that would undermine skeptical arguments about knowledge.
  • Langton, Rae. 1993. “Speech Acts and Unspeakable Acts.” Philosophy and Public Affairs 22, 293-330.
    • A discussion of pornography as a kind of speech act.
  • MacKinnon, Catharine. 1987. “Francis Biddle’s Sister: Pornography, Civil Rights and Speech.” In Catharine MacKinnon (ed.), Feminism Unmodified: Discourses on Life and Law (pp. 163-197). Cambridge: Harvard University Press.
    • A passionate plea against pornography as a violation of women’s civil rights.
  • Martin, Michael G. F. 2002. “The Transparency of Experience.” Mind and Language 17, 376-425.
  • Martin, Michael G. F. 2007. “Austin: Sense and Sensibilia Revisited.” London Philosophy Papers, 1-31.
    • Austin’s work on perception is discussed in relation to disjunctive theories of perception (Martin 2007), and a nonrepresentational realist account of perception is put forward (Martin 2002).
  • McMyler, Benjamin. 2011. “Believing What the Man Says about His Own Feelings.” In Martin Gustafsson and Richard Sørli (eds.), The Philosophy of J. L. Austin (pp. 114-145). Oxford: Oxford University Press.
  • Price, Henry H. 1932. Perception. London: Methuen & Co.
  • Saul, Jennifer. 2006. “Pornography, Speech Acts and Context.” Proceedings of the Aristotelian Society 106, 229-248.
    • A criticism of Langton’s thesis.
  • Sbisa, Marina. 2001. “Illocutionary Force and Degrees of Strength in Language Use.” Journal of Pragmatics 33, 1791-1814.
  • Sbisa, Marina. 2006. “Speech Act Theory.” In Jef Verschueren and Jan-Ola Oestman (eds.), Key Notions for Pragmatics: Handbook of Pragmatics Highlights, Volume 1 (pp. 229-244). Amsterdam: John Benjamins.
  • Sbisa, Marina. 2013. “Locution, Illocution, Perlocution.” In Marina Sbisa and Ken Turner (eds.), Pragmatics of Speech Actions: Handbooks of Pragmatics, Volume 2. Berlin: Mouton de Gruyter.
    • An overview of Austin’s distinction and of its reformulations.
  • Sbisa, Marina, and Turner, Ken (Eds.) 2013. Pragmatics of Speech Actions: Handbooks of Pragmatics, Volume 2. Berlin: Mouton de Gruyter.
  • Searle, John. 2001. “J. L. Austin (1911-1960).” In Aloysius P. Martinich and David Sosa (eds.), A Companion to Analytic Philosophy (pp. 218-230). Oxford: Blackwell.
    • A biographical sketch and brief discussion of Austin’s speech act theory.
  • Smith, Arthur D. 2002. The Problem of Perception. Cambridge: Harvard University Press.
  • Sperber, Dan, and Wilson, Deirdre. 1986. Relevance: Communication and Cognition. Oxford: Blackwell.
    • Chapter 4, §10 presents a criticism of Speech Act Theory.
  • Strawson, Peter F. 1964. “Intention and Convention in Speech Acts.” Philosophical Review 73, 439-460.
    • The paper that first discussed the role of conventional and intentional elements in the performance of speech acts.
  • Travis, Charles. 2004. “The Silence of the Senses.” Mind 113, 57-94.
    • Nonrepresentationalism about perception is defended, along Austin-inspired lines.
  • Warnock, Geoffrey J. 1953. Berkeley. Harmondsworth: Penguin Books.
  • Warnock, Geoffrey J. 1973. “Some Types of Performative Utterances.” In Isaiah Berlin, Lynd W. Forguson, David F. Pears, George Pitcher, John R. Searle, Peter F. Strawson and Geoffrey J. Warnock (eds.), Essays on J. L. Austin (pp. 69-89). Oxford: Clarendon Press.
  • Wittgenstein, Ludwig. 1953. Philosophical Investigations. Gertrude E. M. Anscombe (trans.). Oxford: Blackwell.
    • A discussion of the countless uses we may put our sentences to.

 

Author Information

Federica Berdini
Email: federica.berdini2@unibo.it
University of Bologna
Italy

and

Claudia Bianchi
Email: claudia.bianchi@unisr.it
Vita-Salute San Raffaele University
Italy

Robert Nozick: Political Philosophy 

Although Robert Nozick did not consider himself to be primarily a political philosopher, he is best known for his contributions to political philosophy. Undoubtedly, Nozick’s work in epistemology and metaphysics (especially with respect to free will and the “closest continuer” theory of personal identity) has had a significant impact on those fields. However, it was the publication of his first book, Anarchy, State and Utopia (1974), that revitalized the political right and set off a firestorm of critical replies and commentaries. While Nozick’s accomplishments reach far beyond the confines of political philosophy, it is safe to say that most recognize him for his work on attempting to provide a justification for the state, setting the limits of government, and trying to convince us that accepting his minimal state could foster a framework for a constellation of communities constituting a sort of utopia.

Anarchy, State and Utopia can also be seen as a critical response to John Rawls’ A Theory of Justice, which was published just three years earlier and was considered to be the most robust and sophisticated defense of liberal egalitarianism. Although many credit Rawls with single-handedly rekindling interest in political philosophy, this is likely overstated praise. There is little doubt that Nozick’s systematic criticism of Rawls’ theory of justice and establishment of a rival political theory in Anarchy, State and Utopia also played a major role in bringing significant attention back to political philosophy.

Table of Contents

  1. Life
  2. Libertarianism versus Conservatism and Nozick’s Appeal to Capitalists
  3. Anarchy, State and Utopia
    1. An Invisible Hand Justification of the State
    2. Individual Rights and the Nature of the Minimal State
    3. Self-Ownership Argument
    4. Separateness of Persons
    5. The Three Principles of Justice
    6. The Wilt Chamberlain and “Sunset/Yacht Lover” Examples
    7. The Experience Machine
    8. The Lockean Proviso
    9. Utopia
  4. Later Thought on Libertarianism
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life

Robert Nozick was born to Jewish immigrants in Brooklyn, New York, in 1938 and died in 2002 of stomach cancer. He was a philosopher of wide-ranging interests who worked in metaphysics, epistemology, decision theory, political philosophy, and value theory more generally. Early in his career, Nozick taught at Princeton and Rockefeller Universities, but he spent the vast majority of his career on the faculty at Harvard. Nozick was unique in his approach to philosophy. While he is most properly thought of as an analytic philosopher, throughout his career he increasingly resisted the discipline’s usual tactic of arguing from sure premises to a definite conclusion. Instead, Nozick tried to detail plausible explanatory theories to properly describe phenomena and to provide viable alternatives to conventional wisdom. He thought that the aim of philosophy shouldn’t necessarily be to convince opponents that a particular theory is true. Instead, Nozick saw philosophy as an exploratory venture of making connections between ideas and explaining how things could be as they are. This does not mean that he shirked the responsibility of engaging in rigorous analysis of concepts, and when warranted he used technical arguments to make highly sophisticated points. It is also noteworthy that Nozick used work from several disciplines outside of philosophy to enhance his scholarship within it. He incorporated insights from economics, evolutionary biology, Buddhism, and moral psychology to clarify his theoretical work and sometimes ground it in empirical studies.

With respect to political philosophy, Nozick was a right-libertarian, which in short means he accepted the idea that individuals own themselves and have a right to private property. While he argued that the state is legitimate, he thought that only a very scaled-down version that provides security to individuals and protects private property can be justified. Nozick’s version of legitimate government is sometimes called the “nightwatchman” state, which underscores what he saw as its essential function: to protect individuals and their private property. This means that to properly respect contracts and resolve property disputes, a judiciary is necessary; and with respect to protecting persons and their property, a police force and a military are legitimate, and citizens may justifiably be taxed to support these basic functions. However, no more expansive function of the state can be justified according to Nozick, which means that taxation aimed at building funds to be redistributed for welfare purposes (for example, health care, education, poverty relief) is illegitimate. This is very different from the political theory of his one-time Harvard colleague John Rawls. Rawls argued that in addition to protecting the basic liberties, it is legitimate for the state to use its coercive powers (via redistributive taxation) to maximize the position of the least well-off.

2. Libertarianism versus Conservatism and Nozick’s Appeal to Capitalists

What Nozick did share with Rawls is that he was part of the liberal political tradition, which places a premium on the individual and opposes the idea that the state’s function is to make its citizens moral. Instead, for liberals in political theory such as both Rawls and Nozick, the state exists to provide the appropriate conditions for individuals to define the good life for themselves (just so long as they don’t impede the ability of others to do the same). Notice that the labels are a bit confusing here. For political theorists “liberal” refers to the political tradition of Hobbes, Locke, Rousseau, and Kant. This is different from what “liberal” means in party politics in America (where “liberal” is associated with the Democratic Party).

While Nozick was held up at the time of its publication as a champion of conservatives in party politics (especially fiscal ones), Anarchy, State and Utopia is best thought of specifically as a right-libertarian tract. In fact, Nozick’s political ideas diverge quite radically from those thought of as “conservative” in political theory. For example, the usual markings of conservatism in political theory include veneration of custom, trust of long-standing traditions, and skepticism of wholesale political reform. Conservative political thinkers also usually insist that authority is crucial because it yields societal stability. However, none of these could plausibly be construed as traits of a Nozickian state. Nozick’s political philosophy does not explicitly refer to custom or tradition at all. In fact, if the result of individuals making consensual decisions happens to lead to radical breaks with the customs and traditions of a culture, so much the worse for that culture. There would be nothing illegitimate about these consequences in Nozick’s view.

Nozick’s arguments for merely a “nightwatchman” state suggest that there is nothing particularly interesting about authority in his political theory except as it allows for security of citizens and their property. However, if authority is used to curtail individual freedoms in order to fit some pre-ordained set of societal norms, Nozick would have been vehemently opposed to such intrusion of the state into private lives. He was opposed to any sort of paternalistic legislation. For example, Nozick would have criticized laws prohibiting or restricting the riding of motorcycles without helmets on the ground that the state should not have the right to interfere with an individual’s choice to risk serious injury from accidents. Furthermore, Nozick opposed what might be called the “legislation of morality.” For instance, he would have thought it wrongful for the state to interfere with homosexual couples who want to marry and wish to exercise property rights and provisions with respect to domestic partnerships.

Yet, even with that said, it is clear why Nozick appeals to those who are classified as part of the right in party politics (especially in the United States). Again, there is some confusion to be cleared up with respect to labels. Nozick’s views are amenable to some conservatives as they are called in party politics (that is, those members of the Republican Party in the United States who call themselves fiscal conservatives). First of all, Nozick thought that we have robust property rights, and, borrowing from John Locke, he endorsed the notion that we own ourselves. As Nozick famously asserts, “Individuals have rights, and there are things no person or group may do to them (without violating their rights)” (Nozick, 1974, ix). Capitalists and/or free market supporters will find this attractive since it means that we can sell our labor to whomever we wish and (more importantly) buy it from whoever will consent. Self-ownership, thought Nozick, means that we own our labor. This also allows for legitimate transfer of goods: since we own ourselves, through our actions with other items in the world we justifiably come to own those items as well. While Nozick doesn’t use this particular argument, his position is consistent with the capitalist idea that our use and manipulation of properly acquired raw materials also seems to confer value onto these items. Free market advocates endorse much of Locke’s theory of just property acquisition, and Nozick, despite recognizing some possible criticisms of this position, seems to end up implicitly accepting important components of it.

Furthermore, Nozick’s insistence that one may legitimately transfer her property any way she wishes just so long as no one is coerced or defrauded is very friendly to the views of free market economists. It is significant to note that Nozick thought these transfers to be legitimate without any consideration of whether the recipients deserve the acquisition of goods. While it is true that free market advocates tend to proclaim the idea that entrepreneurs have an additional claim to profits because they deserve them due to effort, contribution to society, and acceptance of risk, they still share with Nozick the fundamental principle that property owners possess absolute property rights. In sum, the implication of robust property rights – that freedom has a great deal to do with what one may do with his or her property without state interference – is a notion strongly shared by Nozick and capitalists.

Another area of agreement between capitalists and Nozick continues the theme of the limited role of governmental action, this time involving a disavowal of redistributive taxation. Nozick thinks individual rights are so strong that his theory will require significant constraints on state action vis-à-vis individuals. For him, the state’s proper purview is protection of persons and their assorted property rights. However, the state is not justified in forced takings of holdings (that is, via taxation) in order to serve the welfare needs of the poor.

In fact, Nozick is both supported (most notably by Edward Feser (2000)) and criticized (most notably by Brian Barry (1975, 332)) for comparing taxation with forced labor. “Taxation of earnings from labor,” argues Nozick, “is on a par with forced labor” (1974, 169). He adds that “Seizing the results of someone’s labor is equivalent to seizing hours from him and directing him to carry on various activities. If people force you to do certain work, or unrewarded work, for a certain period of time, they decide what you are to do and what purposes your work is to serve apart from your decisions. This process whereby they take this decision from you makes them a part-owner of you; it gives them a property right in you. Just as having such partial control and power of decision, by right, over an animal or inanimate object would be to have a property right in it” (1974, 172).

3. Anarchy, State and Utopia

a. An Invisible Hand Justification of the State

In Anarchy, State and Utopia, Nozick first accepted the task of evaluating whether government is necessary at all. Notice that Nozick is not merely addressing Rawls in writing Anarchy, State and Utopia. Nozick became a libertarian partly as a result of his discussions, while a graduate student, with Murray Rothbard. Rothbard not only convinced him of the plausibility of libertarianism, but also of the strength of challenges from anarchists to the idea that any state could be legitimate. Thus, Nozick took anarchist doubts about the legitimacy of the state seriously and thought these concerns needed to be addressed. Additionally, since Nozick commenced with the assertion that individuals have rights, he needed to see if the very notion of a state is compatible with these rights. It is important to note that Nozick believed in Lockean natural rights, and thus he had to show that a legitimate state could arise without the violation of them. What some commentators on Nozick disagree about is why Nozick did not use a social contract type argument (that is, in the tradition of Hobbes, Locke, and Rousseau) for how a legitimate state could arise.

Some commentators (most forcefully Ralf Bader) do not believe that Nozick avoided a social contract approach to showing how a state could arise because he worried that such an approach is problematic. On this interpretation, Nozick’s goal was not to show how an actual state came about; he simply wanted to give a plausible explanatory account of how a legitimate state could come to be (Bader, 2010, 29-30). Bader notes that Nozick gives what he calls a fundamental potential explanation. A fundamental potential explanation is, as Nozick put it, “an explanation that would explain the whole realm under consideration were it the actual explanation” (Nozick, 1974, 8). But it would also do so without appealing to the realm itself. What this means is that Nozick tried to give an explanation of the political realm in non-political terms. He went on to say that an adequate theory of the state of nature would describe how a state could arise from it, and that such an account “will serve our explanatory purposes, even if no actual state arose in that way” (Nozick, 1974, 7). Since Nozick had this explanatory goal, this is part of the reason why he didn’t use a consent-based approach to showing the formation of the state. Using the consent approach, on this interpretation, would be a trivial case because it doesn’t explain the justification of the state in non-political terms and thus would not qualify as a fundamental potential explanation. Nozick didn’t want simply to show that people could agree to form a state; he wanted to reveal that even without their intention, the actions of people would lead to the establishment of a state.

However, other commentators on Nozick’s work (such as Jeffrey Paul) contend that as a natural rights libertarian, Nozick also knew that any argument he made for the justification of the state could not proceed from the approach that there is a social contract to which individuals would need to consent. There are two reasons for this. First of all, the state created via the social contract monopolizes protective services and thus constrains the property rights of those of succeeding generations who were not involved with the negotiations of the original contract. Secondly, in the social contract approach, the state would transfer the power to tax to the majority, which would bind subsequent generations to that power, and thereby would seem to violate Lockean property rights. On this interpretation, while Nozick still desired to give a fundamental potential explanation, he did so because it fits well in addressing anarchist complaints. For all of these substantive reasons, Paul believes that Nozick needed to take an approach that does not follow from social contract theory (J. Paul, 1981, 68).

No matter which interpretation is endorsed, most agree that in effect Nozick proposed a new solution to the very old problem of political legitimacy. His starting point was still a state of nature (just as it was for Locke, Hobbes and Rousseau), understood as the condition people find themselves in when no political authority exists. Nozick began with the question of whether there are any methods of efficient administration of justice within the state of nature. One of his original contributions to the debate over this question was to offer an account of how such a method could arise. In fact, Nozick thought that the method will inevitably result in the creation of a state despite the fact that no one intends to create one. This is known as an invisible hand justification of the state. Nozick proposed that individuals, simply via their pursuit of improving their own conditions, will perform actions that will lead to a minimal state. Again, this result comes without anyone intending to found (or perhaps even conceive of) one. Additionally, and perhaps more importantly, Nozick argued that such a conclusion could be reached without violation of anyone’s property rights.

How did Nozick think he answered the anarchists who insist that no state could be legitimate because such an entity inherently violates individual rights? First, we have to more fully understand how he envisioned the anarchists’ main objection to the justifiable formation of a state. The anarchist objection as Nozick saw it involves the idea that the state by its nature monopolizes coercive power and then proceeds to punish those who violate this monopoly. When this notion is combined with the idea that states provide protection for all within a specific geographical area by coercing some to finance the protection of others, there is a violation of “side constraints” that come in the form of natural rights. Therefore, since these actions are performed by the state, and these actions are immoral, the state is immoral. Of course, anarchists think the criticism goes even deeper than this. They complain that these actions flow from what they see as the main, inherent function of the state, not merely as the result of any particular state that happens to violate the rights of individuals.

How exactly did Nozick think the state arises through his invisible hand approach? We have to imagine that we are in a free market state of nature, where individuals possess natural rights. Generally speaking, these individuals are moral actors. However, if some do not act morally (as some likely won’t on some occasions) we would enter into what Nozick called mutual protection agencies. These agencies sell security packages like a free market economic good that individuals in a state of nature would surely buy for self and property protection.

Mutual protection agencies will eventually develop into dominant protection agencies, which are basically the pre-eminent mutual protection agencies within their respective territories. In turn, dominant protection agencies will ultimately become ultra minimal states. An ultra minimal state is a dominant protection agency where protection is still purchased as an economic good but where the agency also has a monopoly on force in the region. The ultra minimal state will evolve into the minimal state when, in addition to having the two features of the ultra minimal state, it also makes a “redistribution” of protection to some independents (non-clients of the ultra minimal state) within its territorial boundaries at the cost of the ultra minimal state’s clients.

Nozick regarded this evolution of protection agencies into states as significant because it would provide a moral justification for state legitimacy. First, he believed he had given an account of the legitimacy of the state using only an invisible hand explanation. Notice that at each step, individuals are voluntarily consenting to arrangements that are not necessarily designed to develop a state. Rather, private protection agencies are simply negotiating arrangements between their clients and independents in order to keep the peace. Second, Nozick thought this explanation qualifies as a fundamental potential explanation. That is, the invisible hand explanation explains the formation of the state in a nonpolitical way. In sum, Nozick proposed that the state is justified if it can be explained with a fundamental potential explanation that makes use of an invisible hand explanation without making any morally impermissible steps.

To show how this argument is supposed to run, we need to reflect on the right to self-defense, which in Nozick’s view is presumably a natural right that all have. More specifically, however, everyone has a right to resist if others try to enforce upon them unreliable or unfair procedures of justice. Individuals can elect to transfer this right to the dominant protection agency given that they have purchased the proper protection policy. On these grounds the dominant protection agency thereby can acquire the right to intervene and prohibit independents from trying to exact punishment on their clients. So, in cases where individuals consent to transfer these rights to their dominant protection agency, the dominant protection agency has this right and legitimately monopolizes force. At this point the dominant protection agency in effect becomes an ultra minimal state.

It is significant to notice that Nozick believed that clients of dominant protection agencies presumably do not have to show actual violations of their rights by independents to justify punishment of independents who try to enforce unreliable procedures of justice. In turn, he thought that the dominant protection agency, working on behalf of its clients, could legitimately prohibit independents who use unreliable procedures of justice. This is because when independents try to use unreliable procedures of justice they are effectively risking rights violations. In other words, this riskiness alone is all the justification the dominant protection agency needs in order to prevent independents from punishing their clients.

However, Nozick did realize that this transition from the dominant protection agency to the ultra minimal state places those independents in its territory at a disadvantage. In effect, independents are being prohibited from defending themselves or enforcing their right to punish wrongdoing. It seems that if one is disadvantaged by being prohibited from committing a risky action (that is, is kept from committing the potentially risky activity of applying a procedure of justice that is unreliable), he or she must be compensated for that disadvantage. This compensation can come in the form of money or “in kind” services. In this case, Nozick thought it would be most feasible for the ultra minimal state to simply offer services instead of money to those disadvantaged by its monopoly of force. At this stage, when the ultra minimal state provides protective services to the disadvantaged, it thereby transforms into a minimal state.

In order to try to buttress his position on the legitimacy of the minimal state, Nozick provides a second strand of argument. He sets out to explain why, if the client of the ultra minimal state is actually guilty of violating the rights of an independent, the independent still doesn’t have a legitimate case for exacting punishment on the client of the ultra minimal state. After all, everyone else appears to have the right to punish wrongdoers when actual rights violations occur, so why shouldn’t the independent also have this right? In essence, Nozick had to try to justify the dominant protection agency’s prohibiting victims of rights violations by its clients from defending themselves and exercising their rights of punishment.

Again recall that Nozick thought no one has the right to use an unreliable procedure (or at least one of unproven reliability) to determine whether or not to punish anyone. He contended that it doesn’t matter whether we say a person does not have a right to do something in the absence of appropriate knowledge of wrongdoing or that she has the right but does wrong by using it in the absence of such knowledge. As it turns out, Nozick opts to use the latter expression which allows him to put forth what he calls the Epistemic Principle of Border Crossing. Nozick described this principle as the idea that if someone knows that doing act A would violate Q’s rights unless condition C obtained, he may not do A if he has not ascertained that C obtains through being in the best feasible position for ascertaining this.

Nozick believed that anyone may punish those who violate this principle, just so long as she does not violate the principle herself in the process. With this said, the independent who is being prohibited from exercising punishment ought to be prohibited because she is about to use a procedure that has not been proven reliable. That is, condition C has not been fulfilled, and the independent, by attempting to exact punishment, would be in violation of the Epistemic Principle of Border Crossing. So, in this way, the dominant protection agency is justified in opposing the action of the independent or in punishing the independent if she does carry on punishing its (admittedly guilty) client. Also notice that Nozick tries to deflect the charge from anarchists that granting independents protection regardless of the ability to pay is a form of redistribution of resources and hence a rights violation. Nozick is careful to point out that this action is a form of compensation to independents for effectively disallowing them the right to punish.

b. Individual Rights and the Nature of the Minimal State

As noted earlier, Nozick begins Anarchy, State and Utopia with the assertion that “individuals have rights.” He never defended this proposition; its intuitive appeal is supposed to be widely shared. Nozick even admitted that he had no bedrock foundation for individual rights. Instead, he surveyed a number of candidate reasons why individuals possess rights and found them, taken individually at least, to be wanting. He noted that moral agency, free will, and rationality are not in themselves sufficient to ground rights and serve as the moral foundation for his theory. For instance, he explained that the possession of free will, by itself, doesn’t imply that a being ought to act freely.

While Nozick had no structured argument for the notion that individuals possess rights, he still thought there are reasons in its favor. He concluded that some combination of the characteristics above constitutes something closer to a moral ground for rights. He attempted to ground the intuition that individuals have rights in the idea that each individual has unique value: unlike non-human animals and commodities, humans have the rational capacity to choose, devise, and pursue projects of their own.

It bears mentioning that what it means, for Nozick, to respect people as equals draws on the work of Immanuel Kant. Kant thought that we show proper respect for persons when we treat them as ends in themselves. That is, we should treat others as having goals and projects of their own, and we mustn’t use them merely as instruments to get what we want. Humans can deliberate about which behaviors will allow them to reach their goals and may only be used in ways that respect that rational capacity. This also means that people cannot be used in any way without their consent.

The individual rights that people possess amount to moral side constraints on what can be done to them. The only condition that could permit the violation of such side constraints is if doing so were the only way to “avoid catastrophic moral horror.” Barring such a dramatic condition, the rights of individuals are not to be ‘traded in,’ even if this would mean gains for the entire society. Why it is so important that individual rights to self and property not be violated can be further explained by the idea that individuals need the space to make their lives meaningful: “I [Nozick] conjecture that the answer [to the question as to what grounds rights] is connected with that elusive and difficult notion: the meaning of life.” Nozick went on to say that the ability to shape one’s life in accordance with some sort of life plan is the very way one brings meaning to one’s life. This offers another perspective from which we can understand why human life is uniquely valuable: only beings with the rational capacity to shape their own lives can have, or even pursue, a meaningful life.

c. Self-Ownership Argument

Nozick attempted to vindicate the intuition that only free market exchanges respect persons as equals. To do this he developed his “entitlement theory of justice.” To make this clear, we have to start with what might be called his self-ownership argument, which runs essentially as follows: 1) People own themselves (this is based on the intuition Nozick supplies above). 2) The world and its objects are originally unowned. 3) One can acquire an absolute right to a disproportionate share of the world if this doesn’t worsen the material condition of others. 4) It is relatively easy to acquire property rights over a disproportionate amount of the world. 5) Therefore, once private property has been appropriated, a free market in goods and resources is morally required. Furthermore, since Nozick thought each person owns herself, each also owns her talents; he reasoned that this translates into ownership of the products of those talents, since the entity that owns and is owned is the same entity, the whole person. With this stipulation, Nozick understood individuals as having absolute property rights over themselves and, once more, absolute property rights over the resources they acquire.

d. Separateness of Persons

An important related strand that runs in various ways throughout all of Anarchy, State and Utopia is the “separateness of persons.” As noted earlier, Nozick used Kant as an inspiration and borrowed Kant’s idea that individuals are ends in themselves. But in order to be different “ends-in-themselves,” Kant observed, people must be separate entities. The idea is that each of us is a distinct individual, each mattering from a moral point of view. This view that individuals, like distinct atoms, are self-contained entities in some ways evokes Hobbes’ metaphysical picture of human existence. An implication of the notion that individuals matter morally is that benefits to some supposed larger entity, such as the “greater society,” cannot be used as a justification for violating individuals as persons. Consider what is often called “the common good.” Nozick challenged the idea that any such entity could really exist. In his view, there are only individuals, and the common good can only really be the good of these individuals added together. While some individuals might sacrifice some of their interests in order to gain benefits for others of their interests, society can never be justified in sacrificing the interests of some individuals for the sake of others.

The separateness of persons doctrine is important to Nozick insofar as he thought it supports the existence of the moral side constraints that individuals possess. Moral side constraints set the boundaries for what can permissibly be done to people (whether by other individuals or the state). Violations of such side constraints are wrongful because they treat the individuals who suffer them merely as means or as instruments to the ends of other people. Notice that these side constraints are supposed to be understood as absolute prohibitions on what can permissibly be done to individuals. That is, even if we thought that violating the moral side constraint of just one person would benefit many others within the society, such a violation could not be justified.

In trying to describe what exactly moral side constraints are, perhaps the most obvious prohibition that would follow from them is one against any sort of physical aggression. But the very existence of a state seems to suggest that coercion of individuals is permissible in some circumstances. As a consequence, Nozick needed to argue carefully that at least one possible function of the state, redistributive taxation, is a form of coercion that is not legitimate for a state to exercise.

e. The Three Principles of Justice

Nozick argued that there are three principles of just distribution. These principles make up the basic framework of his entitlement theory.

The first is the principle of just acquisition. Under this principle, individuals may acquire any property they wish just so long as it is previously unowned and is not taken by theft, coercion or fraud. The second is the principle of just transfer, whereby property may be exchanged just so long as the transfer is not executed (again) by theft, coercion or fraud. These two principles constitute the legitimate means of acquiring and transferring goods. All valid transactions come from repeated actions on these two principles. While Nozick did not explicitly define what is now known as his third principle, he described its function. This third principle – just rectification – works, as the name suggests, to rectify violations of the first two principles.

What actions on these three principles reflect is what Nozick called a “historical” theory of justice. He thought there was no way, simply by looking at a pattern of distribution, to tell whether or not a distribution of goods is just. Nozick emphasized that we have to know exactly how the distribution came about. If somewhere in the chain of transactions theft, coercion or fraud occurred, then we can safely say that the property involved is unjustly held. The three principles of just distribution thereby regulate proper property acquisition and exchange.

As alluded to earlier, the entitlement theory of justice does not depend on the concept of just desert (reward) in order to validate the acquisition or transfer of property. For example, consider the principle of just transfer. One implication of this principle is that inheritance is perfectly legitimate. It is irrelevant from the perspective of the entitlement theory that the scions of billionaires may not have worked hard or contributed in any way to their parents’ fortunes. Moreover, it is not germane to the entitlement theorist whether the beneficiary of the inheritance is morally upstanding or has somehow made great contributions to society. Finally, it doesn’t matter that the scion’s being part of a billionaire family is morally arbitrary – there is no sense in thinking that he is somehow undeserving of the inheritance because he didn’t “make his way” into the family. If the billionaire wishes to gift the family fortune to the ne’er-do-well son, the son is entitled to that inheritance. The idea that just holdings for Nozick are not dependent on desert is an important consideration, given that many free market advocates have tried to justify the accumulation of enormous amounts of wealth based on the desert of those acquiring it. Common notions of desert appeal to the concepts of effort, innovation, contribution or merit in attempting to justify why some should have more than others. But Nozick saw how strained the arguments justifying property holdings on the basis of desert were. For Nozick, desert is beside the point. More importantly, on his view desert theory cannot serve as the basis for justifying property holdings because it is a patterned theory (more on this below).

The state, then, can be seen as an institution that serves to protect private property rights and the transactions that follow from them, regardless of whether we think some people deserve more or less than they have. This function is fulfilled via preservation of the three principles of distribution. This means that the state functions to secure individuals’ safety and their goods, which presumably could be threatened by forces within or outside of one’s territory. For security within the state, this necessitates a civil police force. For security from threats outside the territory, the need is for a standing army. Provisions for these functions are legitimate concerns of the Nozickian state. In addition, given the third principle of distributive justice in Nozick’s scheme, there will need to be a judiciary. This is due to the practical matter that there will likely be disagreements over the status of contracts, whether they have been enacted properly, and whether alleged violations of personal or property rights have occurred. Somebody needs to adjudicate these conflicts, and the most effective means (according to Nozick) is through tort law proceedings. The courts also function in a perhaps more obvious way given the principle of rectification – to ensure that holdings are restored to their rightful owners in cases where the justice in acquisition and justice in transfer principles are violated. An implication of all of this is that the state is within its rights to demand tax monies to support these functions.

f. The Wilt Chamberlain and “Sunset/Yacht Lover” Examples

Nozick devises an imaginative example using an actual former professional basketball player (Wilt Chamberlain) to criticize what he calls “patterned” theories of justice. Patterned theories of justice are those on which a distribution of holdings is considered just when it realizes some purpose or goal. For example, radical egalitarians argue that the most just distribution of goods and resources is one in which these items are equally shared across a society. Utilitarianism is another example of a patterned theory, since all goods are justly distributed when they maximize the overall good of a society. Finally, John Rawls’ theory of justice would be considered a patterned theory because goods may only be redistributed when doing so benefits the least well-off members of society. What all of these theories share is that distributions of goods will only be declared just when they conform to a pattern stipulated by a favored principle of justice.

To argue against patterned theories of just distribution, Nozick wanted to show that patterns can only be imposed either by disallowing acts that disrupt the pattern (or, as Nozick puts it, by forbidding “capitalist acts between consenting adults”) or by constantly redistributing goods in order to reset the pattern. Nozick tried to demonstrate the truth of two propositions. The first is that if individuals have acquired their holdings justly, then the free exchange of them with others (provided no theft, fraud or coercion is involved) is just. The second proposition is supposed to follow from the first. It holds that free exchanges will always end up disrupting alleged patterns of just distribution. For it seems that if we own property legitimately, we can dispose of it in any way we wish, no matter what distribution of goods results. Nozick certainly realized that allowing individuals to exchange their goods without any patterning principle would lead to extreme inequalities of resources. However, he thought any governmental intervention to “correct” for such inequalities would be unjust.

In the Wilt Chamberlain example, Nozick has us imagine a society with the reader’s favorite distribution of resources. For the sake of argument, we’ll say that there are one million and one members and all have equal shares – we’ll call this distribution D1. We are also to imagine that Wilt signs a contract with a team with the stipulation that for each home game he gets 25 cents from the price of every ticket sold. As the season goes on, imagine that each person in the society gladly contributes 25 cents to see Wilt perform. After a million fans come to see him play, he has made $250,000. Therefore, there will be a new distribution, D2, in which Wilt has his original resources plus $250,000. Excluding all other transactions, we can assume that all others in society have much less than Wilt. Nozick asks if there is anything unjust about this example and, if so, what the reason could be. Since each person agreed to D1, each person should be entitled to his or her share. This also means that each individual may choose to do with his resources what he likes, and many have decided to give some to Wilt. Obviously D1 and D2 are not the same. However, D2 came from what we agreed was a just distribution plus a number of free exchanges. So, D2 is just, but D2 violates the pattern in D1 in which each is supposed to remain on the same level of resources.
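The arithmetic behind the move from D1 to D2 can be written out explicitly (writing Wilt’s holdings under each distribution as $D1_{W}$ and $D2_{W}$, notation introduced here purely for illustration):

\[
1{,}000{,}000 \text{ tickets} \times \$0.25 \text{ per ticket} = \$250{,}000,
\qquad
D2_{W} = D1_{W} + \$250{,}000.
\]

Each of the million 25-cent payments is a voluntary transfer from a just starting point, which is why Nozick takes the resulting inequality to be just as well.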

Nozick’s first point is that allowing individuals the liberty to exchange their goods as they wish, given that they have fully consented to the transfers, will clearly (and continually) disrupt the pattern of holdings. Notice that if you agree with Nozick that no injustice is done in the steps of forming the new set of holdings in D2, there is nothing unjust about D2. But what if one wants to re-establish D1? Notice what Nozick thinks would have to happen in this case. In order to realign the pattern of holdings, some entity (most likely the state) would need to intervene. That is, if D1 was a distribution in which all had equal holdings and now at D2 we realize we want to revert to D1, we would have to take a portion of the holdings from those who have more and give it to those who have less. But here is the problem: What if those who have greater holdings in D2 do not want to give up their holdings, protesting that they acquired them justly? If the state takes a portion of their holdings away, how is this different from Robin Hood doing the same thing? Isn’t this merely a form of theft?

With the Wilt Chamberlain example in place, Nozick was able to bolster his argument against patterned theories. He emphasized that preserving patterns will often necessitate redistributing goods through taxation from those who have to those who have not, and that this is not an innocuous result. In effect, this redistribution amounts to forced labor. After all, Nozick pointed out, what do taxes represent? Certainly, money is taken, but Nozick argued that “seizing the results of someone’s labor is equivalent to seizing hours from him and directing him to carry on various activities.” In essence, Nozick thought that this is tantamount to forcing someone to work for an amount of time for purposes he has not chosen. Patterned principles in effect give some people a claim to the labor of other people. Nozick believed this translates into a transgression of self-ownership, as the enforceable claim of some on the labor of others amounts to the partial ownership of some people by others.

There is another problem here as well. Nozick also worried that the degree of interference in peoples’ lives will vary depending on their different chosen lifestyles in a way that is discriminatory against certain people. He wondered why people shouldn’t have the right to choose their own lifestyles, provided that they are not making others worse off than they would have been in a state of nature. Nozick objected that individuals who have more materialistic interests are required to work longer hours than those with more aesthetic lifestyles. He questioned why the person who wishes to “see a movie” must be required (since he must earn money for the movie ticket) to provide “money for the needy,” while those who prefer to look at sunsets (who don’t need to earn money to enjoy them) are not. The point Nozick was trying to make is that if owning a luxury yacht is what makes me happy, I will have to work much longer than someone who simply likes sunsets. The person who likes sunsets doesn’t need money to enjoy them – he can pursue his happiness and meet his tax obligations to the state with limited labor time. However, I may need to work six extra months to have enough to purchase the yacht and meet my tax obligations to the state. Not only do I need the principal amount of money for the purchase, I am also taxed to provide for social programs. In this scenario, Nozick wonders why the yacht lover, simply because of his lifestyle, has to work six months to acquire what he wants and pay his taxes, as opposed to the sunset lover, who needs to work much less to achieve these same ends.

g. The Experience Machine

While the Wilt Chamberlain and sunset/yacht lover examples take aim at any sort of patterned theory (though they were most likely targeted at Rawls’ theory of justice), Nozick uses yet another creative example to argue against classical utilitarianism. He has us imagine a machine developed by “super duper neuropsychologists” which one could enter and have any sort of experience she desires. A person’s brain could be stimulated so she would think and feel that she was reading a book, writing a great novel, or climbing Mt. Everest. But all the while the person would simply be floating in a tank with electrodes attached to her head. If one worries that she would get bored by a life of pleasant circumstances, nothing disallows her from simply having problematic events programmed in to keep things interesting. As this is a thought experiment, and Nozick doesn’t want readers to be distracted by details that don’t force them to test their intuitions, we are to imagine that the machine is reliable, in fact unbreakable, so that there are no technical or trivial reasons to decline to enter. Nozick asks the reader if she would enter the machine.

Nozick thought we would not enter, expecting that people would share his intuition that such programmed experiences are not real. He argued that people don’t merely want to experience certain actions; they want to actually do them. Nozick suspected that we wouldn’t enter the machine because we don’t merely wish to experience being famous, but want to be certain types of people who do certain types of things. For example, I don’t merely want to experience that I am a great novelist, I want to genuinely be a great novelist.

In what way is the intuition we are supposed to gain from the experience machine a criticism of classical utilitarianism? The idea is that the fundamental value underlying utilitarianism, as classically described by Jeremy Bentham and John Stuart Mill, is that happiness is the highest and only intrinsic good. That is, happiness is the good for whose sake all other goods are pursued. Presumably the initial allure of entering Nozick’s machine would be the promise of acquiring pleasure. However, the unwillingness to ultimately enter seems to signify that we want something beyond happiness in our lives, whether that is reality or genuineness. Hence, it appears that happiness is not the highest good. Though this appears to strike a significant blow against classical utilitarianism, it doesn’t seem to affect other forms of utilitarianism such as preference utilitarianism. Preference utilitarians could claim that people would refuse to enter the machine not because they care about values other than happiness, but because they prefer to attain happiness only via means that involve actively pursuing it rather than merely experiencing it. Furthermore, they might see happiness in a complex way whereby they don’t merely desire “brute” pleasure, but want a nuanced sort of happiness.

h. The Lockean Proviso

Are there any limits, even on Nozick’s view, to how many more resources some may own than others, assuming that the distribution of holdings came about in a just way? There is one (albeit loose) constraint – the Lockean Proviso. While Nozick did not accept the entire Lockean theory of property, he does utilize components of it and retrofits them for use in his theory of entitlement. One of Locke’s famous constraints on property accumulation is that one may own property just so long as he has left “enough and as good” for others. This prevents the monopolization of goods that are necessary for life. For instance, owning the only remaining water well would not be permitted under the terms of the proviso.

However, Nozick re-interprets the proviso to mean that if an initial acquisition fails to make anyone worse off who was using the resource before, then the resource is justly acquired. That is, on Nozick’s interpretation of the proviso, others cannot be left worse off after the acquisition than they were when the property in question was unowned. Of course, this rendering of the proviso means that someone could appropriate all of a good, just so long as he allowed access to it to those who were using the resource before and the appropriation does not make them materially worse off than they were before. This alternative rendering of the Lockean proviso leaves only an extremely limited constraint on the accumulation of resources. Hence, it would seem that even with the amended Lockean proviso qualifying the acquisition of resources, Nozick’s entitlement theory would still allow for rather dramatic inequality of holdings throughout a libertarian society.

i. Utopia

The third and final part of Anarchy, State and Utopia deals with the topic of utopia and is generally the section of the book that receives the least attention. Many scholars have noted that this section seems to be Nozick’s attempt to turn what may appear a very bleak picture of property rights (in which the poor are aided only at the discretion of private benefactors) into a potentially inspiring political model. But a second purpose of this section is likely to provide an independent argument for the minimal state.

Nozick begins the section by noting that we could use two different methods for figuring out what is the best sort of society to live in. One method he calls a “design” method and the other he dubs a “filter” method. By a design method Nozick had in mind the idea of people brainstorming about what the best sort of society might be (for example, Rawls’ method in A Theory of Justice). Then, the group could pattern a single society based on the desiderata arising from these deliberations. However, Nozick was skeptical about the viability of this model to yield successful results. He didn’t think it possible for there to be just one best form of society for everyone. Human beings are complex, with lots of different ideas about what the best possible world would be like. To make this point, Nozick has us imagine a bevy of different historical figures (including Beethoven, Wittgenstein, Henry Ford, Thoreau, Elizabeth Taylor, and Moses). Nozick challenges us to consider whether there could really be any one ideal world for all of these very different individuals. At the least, Nozick says that even if there were one ideal society for all, it is highly unlikely that it could be determined in “an a priori fashion.”

Instead, he turns to the possibility of using a filtering device and promotes the idea of a sort of meta-utopia. In this filtering method, people consider many different societies and critique them, eliminating some and modifying others. In this process, we can imagine people trying out societies as a sort of experiment, abandoning those they find hopeless and altering those they would find acceptable with some tinkering. As we might expect, some communities will simply be abandoned, others will flourish, and some will continue on, though not without struggles. So, this meta-utopia would serve as a framework or platform for many diverse experimental communities. These would all be formed as voluntary communities, whereby no one would have the right to impose one utopian vision on the others. This effectively disallows “imperialistic” communities that may take as their mission to gain control of other communities for their own expansion. Some commentators have likened this anti-imperialistic clause in Nozick’s framework of utopia to a version of cultural relativism.

The point of Nozick’s framework for utopia is that consenting citizens could band together into political subunits and voluntarily select from any number of different societies to join. For example, those sympathetic with trade unionists could elect to develop a community where ownership groups consisted only of workers. In fact, Nozick was surprised that this sort of arrangement had not already evolved from vocal members of the political left who argue that the only just notion of property is one held by all in a classless society. A different subgroup which collectively decides that a strong welfare safety net is essential could vote to tax themselves at a high level in order to redistribute goods to help out the least fortunate via extensive rights to health care, child care, unemployment assistance, and so forth.

Nozick thought that following his political principles could lead to a vast array of different political entrepreneurships. All that is required is that one (or many) recruit fellow members to join an association of likeminded citizens who all voluntarily agree to follow the stipulations of their new community. Notice that there is no violation of rights here. Just so long as all agree to the rules of the group, they could agree to form a society that looks much like the social democracies of Scandinavia. Likewise, it seems possible that individuals could join to form quite illiberal communities. The minimal requirements are that no one can be forced or deceived into joining, and that anyone who is disenchanted with his or her current political association must have a right to exit.

Some commentators have thought that, for the sake of clarity, it would be best to think of Nozick’s utopian vision here in a federalist fashion. The minimal state would serve as a replacement for the federal government. Smaller political entities, such as states or provinces, could be developed, consisting of groups of citizens who voluntarily (but unanimously) agree to their own separate sets of political ideals. Additionally, reflecting on the best method for allowing individuals to achieve their own utopian aspirations leads us to a separate argument for the minimal state. For if we agree with Nozick that the filtering method is the best way for individuals to realize their utopian dreams, then we should endorse his conception of a minimal state. After all, what is the framework for utopia but a minimal state? Notice that in the final section of Anarchy, State and Utopia, the minimal state is not justified on the grounds of individual rights. Instead, the minimal state here provides a political framework that respects diversity and allows different individuals to pursue their own conceptions of the good. Nozick thinks that this should be considered a major advantage of endorsing libertarian principles as a platform or framework for political organization.

4. Later Thought on Libertarianism

Famously, Nozick largely abandoned work in political philosophy after the publication of Anarchy, State and Utopia. He mentioned in Socratic Puzzles that he had no interest in defending his libertarian views from their critics by writing “Son of Anarchy, State and Utopia” or “Return of the Son of Anarchy.” However, in The Examined Life, Nozick did indicate that he had strayed away from libertarianism. The details of this turn, though, are thin and somewhat mysterious. Nozick said that he did not mean to work out his own alternative to libertarianism later in life and only pointed out what he viewed as a major failure of the theory.

He noted that libertarianism is seriously inadequate, partly because the theory did not “fully knit the humane considerations and joint cooperative activities it left room for more closely into its fabric” (that is, a communitarian or Aristotelian analysis). Apparently he also thought that the theory did not sufficiently take into account the “symbolic” importance of some collective actions. He acknowledged that there are some goals citizens wish to achieve through the means of government as an expression of human solidarity and perhaps of human respect in itself. Some commentators have noted that we shouldn’t interpret these reservations as a wholesale abandonment of libertarianism. In an interview late in his life, Nozick explained that his turn away from libertarianism had been largely exaggerated, adding that he still endorsed the theory but was no longer a “hard core” follower.

5. References and Further Reading

a. Primary Sources

  • Nozick, Robert. Anarchy, State and Utopia. New York: Basic Books, 1974.
  • Nozick, Robert. Philosophical Explanations. Cambridge, MA: Belknap Press, 1981.
  • Nozick, Robert. Socratic Puzzles. Cambridge, MA: Harvard University Press, 1997.
  • Nozick, Robert. The Examined Life. New York: Simon and Schuster, 1989.

b. Secondary Sources

  • Altham, J.E.J. Review of Anarchy, State and Utopia. Philosophy, Vol. 52, No. 199 (1977), pp. 102-105.
  • Bader, Ralf. Robert Nozick. London: Continuum Press, 2010.
  • Bakaya, Santosh. The Political Theory of Robert Nozick. Delhi, India: Gyan Books, 2006.
  • Barry, Brian. Review of Anarchy, State and Utopia by Robert Nozick. Political Theory, Vol. 3, No. 3 (1975), pp. 331-336.
  • Brighouse, Harry. Justice. Cambridge, U.K.: Polity Press, 2004.
  • Cohen, G.A. Self-Ownership, Freedom, and Equality. Cambridge, UK: Cambridge University Press, 1995.
  • Exdell, John. Distributive Justice: Nozick on Property Rights. Ethics, Vol. 87, No. 2 (1977), pp. 142-149.
  • Feser, Edward. Taxation, Forced Labor, and Theft. The Independent Review, Vol. 2, No. 2 (2000), pp. 219-235.
  • Fried, Barbara. Wilt Chamberlain Revisited: Nozick’s ‘Justice in Transfer’ and the Problem of Market-Based Distribution. Philosophy and Public Affairs, Vol. 24, No. 3 (1995), pp. 226-245.
  • Hailwood, Simon A. Exploring Nozick. Aldershot, UK: Avebury, 1996.
  • Holmes, Robert. Nozick on Anarchism. Political Theory, Vol. 5, No. 2 (1977), pp. 247-256.
  • Kearl, J.R. Do Entitlements Imply that Taxation is Theft? Philosophy and Public Affairs, Vol. 7, No. 1 (1977), pp. 74-81.
  • Kymlicka, Will. Contemporary Political Philosophy. Oxford: Clarendon Press, 1990.
  • Lacey, A.R. Robert Nozick. Princeton, NJ: Princeton University Press, 2001.
  • Litan, Robert. On Rectification in Nozick’s Minimal State. Political Theory, Vol. 5, No. 2 (1977), pp. 233-246.
  • Machan, Tibor. Some Recent Work in Human Rights Theory. American Philosophical Quarterly, Vol. 17, No. 2 (1980), pp. 103-115.
  • Murray, Dale. Nozick, Autonomy and Compensation. London: Continuum Press, 2007.
  • Otsuka, Michael. Self-Ownership and Equality: A Lockean Reconciliation. Philosophy and Public Affairs, Vol. 27, No. 1 (1998), pp. 65-92.
  • Paul, Ellen Frankel. Natural Rights Liberalism from Locke to Nozick. Cambridge, UK: Cambridge University Press, 2005.
  • Paul, Jeffrey. Reading Nozick. Totowa, NJ: Rowman & Littlefield, 1981.
  • Rigterink, Roger. Robert Nozick: Anarchy, State and Utopia. (Unpublished essay, 2007).
  • Sandel, Michael. Liberalism and its Critics. New York: New York University Press, 1984.
  • Scanlon, Thomas. Nozick on Rights, Liberty, and Property. Philosophy and Public Affairs, Vol. 6, No. 1 (1976), pp. 3-25.
  • Schmidtz, David. Robert Nozick. Cambridge: Cambridge University Press, 2002.
  • Sterba, James. Recent Work on Alternative Conceptions of Justice. American Philosophical Quarterly, Vol. 23, No. 1 (1986), pp. 1-22.
  • Wolff, Jonathan. Robert Nozick: Property, Justice, and the Minimal State. Stanford, CA: Stanford University Press, 1991.

Author Information

Dale Murray
Email: dale.murray@uwc.edu
University of Wisconsin-Baraboo/Sauk County
University of Wisconsin-Richland
U. S. A.

Rudolf Carnap: Modal Logic

In two works, a paper in The Journal of Symbolic Logic in 1946 and the book Meaning and Necessity in 1947, Rudolf Carnap developed a modal predicate logic containing a necessity operator N, whose semantics depends on the claim that, where α is a formula of the language, Nα represents the proposition that α is logically necessary. Carnap’s view was that Nα should be true if and only if α itself is logically valid, or, as he put it, is L-true. In the light of the criticisms of modal logic developed by W.V. Quine from 1943 on, the challenge for Carnap was how to produce a theory of validity for modal predicate logic in a way which enables an answer to be given to these criticisms. This article discusses Carnap’s motivation for developing a modal logic in the first place; and it then looks at how the modal predicate logic developed in his 1946 paper might be adapted to answer Quine’s objections. The adaptation is then compared with the way in which Carnap himself tried to answer Quine’s complaints in the 1947 book. Particular attention is paid to the problem of how to treat the meaning of formulas which contain a free individual variable in the scope of a modal operator, that is, to the problem of how to handle what Quine called the third grade of ‘modal involvement’.

Table of Contents

  1. Introduction
  2. Carnap’s Propositional Modal Logic
  3. Carnap’s (Non-Modal) Predicate Logic
  4. Carnap’s 1946 Modal Predicate Logic
  5. De Re Modality
  6. Individual Concepts
  7. References and Further Reading

1. Introduction

In an important article (Carnap 1946) and in a book a year later (Carnap 1947), Rudolf Carnap articulated a system of modal logic. Carnap took himself to be doing two things: the first was to develop an account of the meaning of modal expressions; the second was to extend it to apply to what he called “modal functional logic” — that is, what we would call modal predicate logic or modal first-order logic. Carnap distinguishes between a logic or a ‘semantical system’ and a ‘calculus’, which is an axiomatic system, and states on p. 33 of the 1946 article that “So far, no forms of MFC [modal functional calculus] have been constructed, and the construction of such a system is our chief aim.” In fact, in the preceding issue of The Journal of Symbolic Logic, the first presentation of Ruth Barcan’s axiomatic systems of modal predicate logic had already appeared, though it was an axiomatic presentation only, without a semantics (Barcan 1946). The principal importance of Carnap’s work is thus his attempt to produce a semantics for modal predicate logic, and it is that concern that this article will focus on.

Nevertheless, first-order logic is founded on propositional logic, and Carnap first looks at non-modal propositional logic and modal propositional logic. I shall follow Carnap in using ~ and ∨ for negation and disjunction, though I shall use ∧ in place of Carnap’s ‘.’ for conjunction. Carnap takes these as primitive together with ‘t’ which stands for an arbitrary tautologous sentence. He recognises that ∧ and t can be defined in terms of ~ and ∨, but prefers to take them as primitive because of the importance to his presentation of conjunctive normal form. Carnap adopts the standard definitions of ⊃ and ≡. I will, however, deviate from Carnap’s notation by using Greek in place of German letters for metalinguistic symbols. In place of ‘valid’ Carnap speaks of L-true, and in place of ‘unsatisfiable’, L-false. α L-implies β iff (if and only if) α ⊃ β is valid. α and β are L-equivalent iff α ≡ β is valid.

One might at this stage ask what led Carnap to develop a modal logic at all. The clue here seems to be the influence of Wittgenstein. In his philosophical autobiography Carnap writes:

For me personally, Wittgenstein was perhaps the philosopher who, besides Russell and Frege, had the greatest influence on my thinking. The most important insight I gained from his work was the conception that the truth of logical statements is based only on their logical structure and on the meaning of the terms. Logical statements are true under all conceivable circumstances; thus their truth is independent of the contingent facts of the world. On the other hand, it follows that these statements do not say anything about the world and thus have no factual content. (Carnap 1963, p. 25)

Wittgenstein’s account of logical truth depended on the view that every (cognitively meaningful) sentence has truth conditions (Wittgenstein 1921, 4.024). Carnap certainly appears to have taken Wittgenstein’s remark as endorsing the truth-conditional theory of meaning. (See for instance Carnap 1947, p. 9.) If all logical truths are tautologies, and all tautologies are contentless, then no metaphysics is needed to explain (logical) necessity.

One of the features of Wittgenstein’s view was that any way the world could be is determined by a collection of particular facts, where each such fact occupies a definite position in logical space, and where the way that position is occupied is independent of the way any other position of logical space is occupied. Such a world may be described in a logically perfect language, in which each atomic formula describes how a position of logical space is occupied. So suppose that we begin with this language, and instead of asking whether it reflects the structure of the world, we ask whether it is a useful language for describing the world. From Carnap’s perspective (Carnap 1950) one might describe it as follows. Given a language ℒ we may ask whether ℒ is adequate, or perhaps merely useful, for describing the world as we experience it. It is incoherent to speak about what the world in itself is like without presupposing that one is describing it. What makes ℒ a Carnapian equivalent of a logically perfect language would be that each of its atomic sentences is logically independent of any other atomic sentence, and that every possible world can be described by a state-description.

2. Carnap’s Propositional Modal Logic

In (non-modal) propositional logic the truth value of any well-formed formula (wff) is determined by an assignment of truth values to the atomic sentences. For Carnap an assignment of truth values to the atomic sentences is represented by what he calls a ‘state-description’. This term, like much in what follows, is only introduced at the predicate level (1946, p. 50) but it is less confusing to present it first for the propositional case, where a state-description, which I will refer to as s, is a class consisting of atomic wff or their negations, such that for each atomic wff p, exactly one of p or ~p is in s. (Here we may think of p as a propositional variable, or as a metalinguistic variable standing for an atomic wff.) Armed with a state-description s we may determine the truth of a wff α at s in the usual way, where s ╞ α means that α is true according to s, and s ╡ α means that not s ╞ α:

If α is atomic, then s ╞ α if α ∈ s, and s ╡ α if ~α ∈ s

s ╞ ~α iff s ╡ α

s ╞ α ∨ β iff s ╞ α or s ╞ β

s ╞ α ∧ β iff s ╞ α and s ╞ β

s ╞ t

This is not the way Carnap describes it. Carnap speaks of the range of a wff (p. 50). In Carnap’s terms the truth rules would be written:

If α is atomic then the range of α is those state-descriptions s such that α ∈ s.

Where V is the set of all state-descriptions, the range of ~α is V minus the range of α, that is, it is the class of those state-descriptions which are not in the range of α.

The range of α ∨ β is the range of α ∪ the range of β, that is, the class of state-descriptions which are either in the range of α or the range of β.

The range of α ∧ β is the range of α ∩ the range of β, that is, the class of state-descriptions which are in both the range of α and the range of β.

The range of t is V.

It should I hope be easy to see, first that Carnap’s way of putting things is equivalent to my use of s ╞ α, and second that these are in turn equivalent to the standard definitions of validity in terms of assignments of truth values.
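These truth clauses translate directly into a small program. The following is a minimal sketch in which the tuple encoding and function names are my own, not Carnap’s: a formula is a nested tuple, and a state-description is represented by the set of atomic sentences it contains unnegated.

```python
# A state-description s is represented by the set of atomic sentences it
# contains unnegated; every atom not in the set counts as occurring negated.
# Formulas: ('atom', 'p'), ('not', f), ('or', f, g), ('and', f, g), ('t',).

def holds(s, f):
    """Return True iff the state-description s makes formula f true (s |= f)."""
    op = f[0]
    if op == 'atom':
        return f[1] in s
    if op == 'not':
        return not holds(s, f[1])
    if op == 'or':
        return holds(s, f[1]) or holds(s, f[2])
    if op == 'and':
        return holds(s, f[1]) and holds(s, f[2])
    if op == 't':
        return True          # 't' stands for an arbitrary tautologous sentence
    raise ValueError('unknown operator: %r' % (op,))

p, q = ('atom', 'p'), ('atom', 'q')
print(holds({'p'}, ('or', p, q)))    # True: p is in the state-description
print(holds(set(), ('not', p)))      # True: ~p holds when p is absent
```

The clause for atoms makes the rules for ranges fall out automatically: the range of a formula is just the set of state-descriptions for which `holds` returns True.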

By a ‘calculus’ Carnap means an axiomatic system, and he uses ‘PC’ to indicate any axiomatic system which is closed under modus ponens (the ‘rule of implication’, p. 38) and contains “‘t’ and all sentences formed by substitution from Bernays’s four axioms [See Hilbert and Ackermann, 1950, p. 28f] of the propositional calculus”. (loc. cit.) Carnap notes that the soundness of this axiom system may be established in the usual way, and then shows how the possibility of reduction to conjunctive normal form (a method which Carnap, p. 38, calls P-reduction) may be used to prove completeness.

Modal logic is obtained by the addition of the sentential operator N. Carnap notes that N is equivalent to Lewis’s ~◊~. (Note that the □ symbol was not used by Lewis, but was invented by F.B. Fitch in 1945, and first appeared in print in Barcan 1946. It was not then known to Carnap.) Carnap tells us early in his article that “the guiding idea in our construction of systems of modal logic is this: a proposition p is logically necessary if and only if a sentence expressing p is logically true.” When this is turned into a definition in terms of truth in a state-description we get the following:

s ╞ Nα iff sʹ ╞ α for every state-description sʹ.

This is because L-truth, or validity, means truth in every state-description. I shall refer to validity when N is interpreted in this way as Carnap-validity, or C-validity. This account enables Carnap to address what was an important question at the time — what is the correct system of modal logic? While Carnap is clear that different systems of modal logic can reflect different views of the meaning of the necessity operator, he is equally clear that, as he understands it, principles like Np ⊃ NNp and ~Np ⊃ N~Np are valid. It is easy to see that the validity of both these formulae follows from Carnap’s semantics for N. From this it is a short step to establishing that Carnap’s modal logic includes the principles of Lewis’s system S5, provided one takes the atomic wff to be propositional variables. However, we immediately run into a problem. Suppose that p is an atomic wff. Then there will be a state-description sʹ such that ~p ∈ sʹ. And this means that for every state-description s, s ╡ Np, and so s ╞ ~Np. But this means that ~Np will be L-true. One can certainly have a system of modal logic in which this is so. An axiomatic basis and a completeness proof for the logic of C-validity can be found in Thomason 1973. (For comments on this feature of C-validity see also Makinson 1966 and Schurz 2001.) However, Carnap is clear that his system is equivalent to S5 (footnote 8, p. 41, and p. 46); and ~Np is not a theorem of S5. Further, the completeness theorem that Carnap proves, using normal forms, is a completeness proof for S5, based on Wajsberg 1933.
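The anomaly just described can be checked mechanically. Below is a hedged sketch that restricts attention to state-descriptions over a fixed finite stock of atoms (Carnap’s language has infinitely many atomic sentences, so this is an illustration rather than a faithful model); the encoding of formulas is my own.

```python
from itertools import combinations

ATOMS = ['p', 'q']     # an illustrative finite stock of atomic sentences

def state_descriptions():
    """All state-descriptions, each given by its set of true atoms."""
    return [set(c) for r in range(len(ATOMS) + 1)
            for c in combinations(ATOMS, r)]

def holds(s, f):
    op = f[0]
    if op == 'atom':
        return f[1] in s
    if op == 'not':
        return not holds(s, f[1])
    if op == 'or':
        return holds(s, f[1]) or holds(s, f[2])
    if op == 'N':
        # Carnap's rule: s |= N(a) iff a holds in EVERY state-description
        return all(holds(t, f[1]) for t in state_descriptions())
    raise ValueError(op)

def c_valid(f):
    """C-validity: truth in every state-description."""
    return all(holds(s, f) for s in state_descriptions())

p = ('atom', 'p')
print(c_valid(('or', p, ('not', p))))   # True: an ordinary tautology
print(c_valid(('not', ('N', p))))       # True: ~Np comes out C-valid
```

Since p fails in some state-description, Np fails in all of them, which is exactly why ~Np is C-valid even though it is not an S5 theorem.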

How then should this problem be addressed? Part of the answer is to look at Carnap’s attitude to propositional variables:

We here make use of ‘p’, ‘q’, and so forth, as auxiliary variables; that is to say they are merely used (following Quine) for the description of certain forms of sentences. (1946, p.41)

Quine 1934 suggests that the theorems of logic are always schemata. If so, then we can call a wff α QC-valid (Quine/Carnap valid) iff every substitution instance of α is C-valid. The wffs which are QC-valid are precisely the theorems of S5.

3. Carnap’s (Non-Modal) Predicate Logic

In presenting Carnap’s 1946 predicate logic (or as he prefers to call it ‘functional logic’, FL or FC depending on whether we are considering it semantically or axiomatically) I shall use ∀x in place of (x), and ∃x in place of (∃x). FL contains a denumerable infinity of individual constants, which I will often refer to simply as ‘constants’. Carnap uses the term ‘matrix’ for wff, and the term ‘sentence’ for closed wff, that is wff with no free variables. A state-description is as for propositional logic in containing only atomic sentences or their negations. Each of these will be a wff of the form Pa1…an or ~Pa1…an, where P is an n-place predicate and a1,…, an are n individual constants, not necessarily distinct.

To define truth in such a state-description Carnap proceeds a little differently from what is now common. In place of relativising the truth of an open formula to an assignment to the variables of individuals from a domain, Carnap assumes that every individual is denoted by one and only one individual constant, and he only defines truth for sentences. If s is any state-description, and α and β are any sentences, the rules for propositional modal logic can be extended by adding the following:

s ╞ Pa1…an if Pa1…an ∈ s, and s ╡ Pa1…an if ~Pa1…an ∈ s

s ╞ a = b iff a and b are the same constant

s ╞ ∀xα iff s ╞ α[a/x] for every constant a, where α[a/x] is α with a replacing every free x.
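Carnap’s substitutional treatment of the quantifier and of identity can likewise be sketched in code. The encoding below is my own, and, contrary to axiom I3, it uses only finitely many constants so that the enumeration terminates; it is an illustration of the rules, not of Carnap’s full system.

```python
CONSTANTS = ['a', 'b', 'c']   # illustrative; Carnap requires denumerably many

def subst(f, x, c):
    """f with constant c replacing every free occurrence of variable x."""
    op = f[0]
    if op == 'pred':
        return ('pred', f[1], tuple(c if t == x else t for t in f[2]))
    if op == 'eq':
        return ('eq', c if f[1] == x else f[1], c if f[2] == x else f[2])
    if op == 'not':
        return ('not', subst(f[1], x, c))
    if op == 'or':
        return ('or', subst(f[1], x, c), subst(f[2], x, c))
    if op == 'all':
        # do not substitute under a quantifier that binds x itself
        return f if f[1] == x else ('all', f[1], subst(f[2], x, c))
    raise ValueError(op)

def holds(s, f):
    op = f[0]
    if op == 'pred':
        return ((f[1],) + f[2]) in s
    if op == 'eq':
        return f[1] == f[2]   # s |= a = b iff a and b are the same constant
    if op == 'not':
        return not holds(s, f[1])
    if op == 'or':
        return holds(s, f[1]) or holds(s, f[2])
    if op == 'all':
        # s |= forall x: alpha iff s |= alpha[a/x] for every constant a
        return all(holds(s, subst(f[2], f[1], c)) for c in CONSTANTS)
    raise ValueError(op)

s = {('P', 'a'), ('P', 'b'), ('P', 'c')}
print(holds(s, ('all', 'x', ('pred', 'P', ('x',)))))   # True
print(holds(s, ('eq', 'a', 'b')))                      # False: distinct constants
```

Because every individual is denoted by exactly one constant, identity can be evaluated purely syntactically, as the second example shows.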

Carnap produces the following axiomatic basis for first-order predicate logic, which he calls ‘FC’. In place of Carnap’s ( ) to indicate the universal closure of a wff, I shall use ∀, so that Carnap’s D8-1a (1946, p. 52) can be written as:

PC       ∀α where α is a PC-tautology

and so on. Carnap refers to axioms as ‘primitive sentences’ and in addition to PC, using more current names, we have:

       ∀(∀x(α ⊃ β) ⊃ (∀xα ⊃ ∀xβ))

VQ      ∀(α ⊃ ∀xα), where x is not free in α.

∀1a     ∀(∀xα ⊃ α[y/x]), where α[y/x] is just like α except in having y in place of free x, where y is any variable that is free for x in α

∀1b     ∀(∀xα ⊃ α[b/x]), where α[b/x] is just like α except in having b in place of free x, where b is any constant

I1         ∀x x = x

I2         ∀(x = y ⊃ (α ⊃ β)), where α and β are alike except that α has free x in 0 or more places where β has free y.

I3         a ≠ b, where a and b are different constants.

The only transformation rule is modus ponens:

MP      ├ α, ├ α ⊃ β therefore ├ β

The only thing non-standard here, except perhaps for the restriction of theorems to closed wffs, is I3, which ensures that all state-descriptions are infinite, and, as Carnap points out on p. 53, validates ∃x∃y x ≠ y. It is possible to prove the completeness of this axiomatic system with respect to Carnap’s semantics.

4. Carnap’s 1946 Modal Predicate Logic

Perhaps the most important issue in Carnap’s modal logic is its connection with the criticisms of W.V. Quine. These criticisms were well known to Carnap who cites Quine 1943. Some years later, in Quine 1953b, Quine distinguishes three grades of what he calls ‘modal involvement’. The first grade he regards as innocuous. It is no more than the metalinguistic attribution of validity to a formula of non-modal logic. In the second grade we say that where α is any sentence then Nα is true iff α itself is valid — or logically true. On pp. 166-169 Quine argues that while such a procedure is possible it is unilluminating and misleading. The third grade applies to modal predicate logic, and allows free individual variables to occur in the scope of modal operators. It is this grade that Quine finds objectionable. One of the points at issue between Quine and Carnap arises when we introduce what are called definite descriptions into the language. Much of Carnap’s discussion in his other works — see especially Carnap 1947 — elevates descriptions to a central role, but in the 1946 paper these are not involved.

The extension of Carnap’s semantics to modal logic is exactly as in the propositional case:

s ╞ Nα iff sʹ ╞ α for every state-description sʹ.

As before, a wff can be called C-valid iff it is true in every state-description, when ╞ satisfies the principle just stated. As in the propositional case if α is S5-valid then α is C-valid. However, also as in the propositional case, (quantified) S5 is not complete for C-validity. This is because, where Pa is an atomic wff, ~NPa is C-valid even though it is not a theorem of S5 — and similarly with any atomic wff. Unlike the propositional case it seems that this is a feature which Carnap welcomed in the predicate case, since he introduces some non-standard axioms.

The first set of axioms all form part of a standard basis for S5. They are as follows (p. 54, but with current names and notation):

LPCN  Where α is one of the LPC axioms PC–I3, both α and Nα are axioms of MFC.

K         N∀(N(α ⊃ β) ⊃ (Nα ⊃ Nβ))

T         ∀(Nα ⊃ α)

5          N∀(Nα ∨ N~Nα)

BFC     N∀(N∀xα ⊃ ∀xNα)

BF       N∀(∀xNα ⊃ N∀xα)

The non-standard axioms, which show that he is attempting to axiomatise C-validity, are what Carnap calls ‘Assimilation’, ‘Variation and Generalization’ and ‘Substitution for Predicates’. (Carnap 1946, p. 54f.) In our notation these can be expressed as follows:

Ass      N∀x∀y∀z1…∀zn((x ≠ z1 ∧ … ∧ x ≠ zn) ⊃ (Nα ⊃ Nα[y/x])), where α contains no free variables other than x, y, z1,…, zn, and no constants and no occurrences of =.

VG      N∀x∀y∀z1…∀zn((x ≠ z1 ∧ … ∧ x ≠ zn ∧ y ≠ z1 ∧ … ∧ y ≠ zn) ⊃ (Nα ⊃ Nα[y/x])), where α contains no free variables other than x, y, z1,…, zn, and no constants.

SP       N∀(Nα ⊃ Nβ), where β is obtained from α by uniform substitution of a complex expression for a predicate.

None of these axiom schemata is easy to process, but it is not difficult to see what the simplest instances would look like. A very simple instance, which is an instance of both Ass and VG, is

AssP     N∀x∀y∀z(x ≠ z ⊃ (NPxyz ⊃ NPyyz))

To establish the validity of AssP it is sufficient to show that if a and c are distinct constants then NPabc ⊃ NPbbc is valid. This is trivially so, since there is some s such that s ╡ Pabc, and therefore for every s, s ╡ NPabc, and so, for every s, s ╞ NPabc ⊃ NPbbc. More telling is the case of SP. Let P be a one-place predicate and consider

SPP      N∀x(NPx ⊃ N(Px ∧ ~Px))

In this case α is Px, while β is Px ∧ ~Px, so that, in Carnap’s words, β ‘is formed from α by replacing every atomic matrix containing P by the current substitution form of β’. That is, where β is Px ∧ ~Px, it replaces α’s Px. If α had been more complex and contained Py as well as Px, then the replacement would have given Py ∧ ~ Py, and so on, where care needs to be taken to prevent any free variable being bound as a result of the replacement. In this case we have ├ ~N(Pa ∧ ~ Pa), and so ├ ~NPa.

In fact, although Carnap appears to have it in mind to axiomatise C-validity, it is easy to see that the predicate version is not recursively axiomatisable. For, where α is any LPC wff, α is not LPC-valid iff ~Nα is C-valid, and so, if C-validity were axiomatisable then LPC would be decidable. There is a hint on p. 57 that Carnap may have recognised this. He is certainly aware that the kind of reduction to normal form, with which he achieves the completeness of propositional S5, is unavailable in the predicate case, since it would lead to the decidability of LPC.

5. De Re Modality

What then can be said on the basis of Carnap 1946 to answer Quine’s complaints about modal predicate logic? Quine illustrates the problem in Quine 1943, pp. 119-121, and repeats versions of his argument many times, most famously perhaps in Quine 1953a, 1953b and 1960. The example  goes like this:

(1)                                9 is necessarily greater than 7

(2)                                The number of planets = 9

therefore

(3)                                The number of planets is necessarily greater than 7.

Carnap 1946 does not introduce definite descriptions into the language, so I shall present the argument in a formalisation which only uses the resources found there. I shall also simplify the discussion by using the predicate O, where Ox means ‘x is odd’, rather than the complex predicate ‘is greater than 7’. This will avoid reference to ‘7’, which is of no relevance to Quine’s argument. P means ‘is the number of the planets’, so that Px means ‘there are x-many planets’. With this in mind I take ‘9’ to be an individual constant, and use O and P to express (1) and (2) by

(4)                                NO9

(5)                                ∃x(Px ∧ x = 9)

One could account for (4) by adding O9 as a meaning postulate in the sense of Carnap 1952, which would restrict the allowable state-descriptions to those which contain O9, though from some remarks on p. 201 of Carnap 1947 it seems that Carnap might have regarded both O and 9 as complex expressions defined by the resources of the Frege/Russell account of the natural numbers and their arithmetical properties. It also seems that he might have treated the numbers as higher-order entities referred to by higher-order expressions. If so then the necessity of arithmetical truths like (4) would derive from their analysis into logical truths. In my exposition I shall take the numerals as individual constants, and assume somehow that O9 is a logical truth, true in every state-description, and that therefore (4) is true.

In this formalisation I am ignoring the fact that the description ‘the number of the planets’ is intended to imply that there is only one such number. So much for the premises. But what about the conclusion? The problem is where to put the N. There are at least three possibilities:

(6)                                N∃x(Px ∧ Ox)

(7)                                ∃xN(Px ∧ Ox)

(8)                                ∃x(Px ∧ NOx)

It is not difficult to show that (6) and (7) do not follow from (4) and (5). In contrast to (6) and (7), (8) does follow from (4) and (5), but there is no problem here, since (8) says that there is a necessarily odd number which is such that there happen to be that many planets. And this is true, because 9 is necessarily odd, and there are 9 planets. All of this should make clear how the phenomenon which upset Quine can be presented in the formal language of the 1946 article. Quine of course claims not to make sense of quantifying in. (See for instance the comments on Smullyan 1948 in Quine 1969, p. 338.)
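The difference between the three scope placements can be checked in a toy model. In the sketch below (all names are illustrative and the setup is my own) a state-description is identified with the number of planets it contains, O is parity, and the constants are a handful of numbers; the three formulas are evaluated directly by quantifying over constants and state-descriptions.

```python
states = [6, 7, 8, 9]      # each state-description fixes how many planets there are
domain = [6, 7, 8, 9]      # illustrative stock of individual constants (numbers)
actual = 9                 # the state-description representing the actual world

def P(x, s):               # 'there are x-many planets' in state-description s
    return x == s

def O(x):                  # 'x is odd' -- the same in every state-description
    return x % 2 == 1

# (6): N over the whole claim -- fails in any state with an even planet count
six = all(any(P(x, t) and O(x) for x in domain) for t in states)

# (7): some x which in EVERY state numbers the planets and is odd -- fails,
# since no single x numbers the planets in every state-description
seven = any(all(P(x, t) and O(x) for t in states) for x in domain)

# (8): some x which actually numbers the planets and is necessarily odd --
# true, witnessed by 9
eight = any(P(x, actual) and all(O(x) for t in states) for x in domain)

print(six, seven, eight)   # False False True
```

The witness for (8) is 9 itself: its parity does not vary across state-descriptions, while which number satisfies P does.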

6. Individual Concepts

Even if something like what has just been said might be thought to enable Carnap to answer Quine’s complaints about de re modality, it seems clear that Carnap had not availed himself of it in the 1947 book, and I shall now look at the modal logic presented in Carnap 1947. On p. 193f Carnap cites the argument (1)–(3) from Quine 1943 discussed above. He does not appear to recognise any potential ambiguity in the conclusion, and characterises (3) as false. Carnap does not consider (8), and on p. 194 simply says:

“we obtain the false statement [(3)]”

In Carnap’s view the problem with Quine’s argument is that it assumes an unrestricted version of what is sometimes called ‘Leibniz’ Law’:

I2         ∀x∀y(x = y ⊃ (α ⊃ β)), where α and β differ only in that α has free x in 0 or more places where β has free y.

In the 1946 paper this law holds in full generality, as does a consequence of it which asserts the necessity of true identities.

LI        ∀x∀y(x = y ⊃ Nx = y)

For suppose LI fails. Then there would have to be a state-description s in which, for some constants a and b, s ╞ a = b but s ╡ Na = b. So there is a state-description sʹ such that sʹ ╡ a = b. But then a and b are different constants, and so s ╡ a = b, which gives a contradiction.

In the 1947 book Carnap holds that I2 must be restricted so that neither x nor y occurs free in the scope of a modal operator. In particular the following would be ruled out as an allowable instance of I2:

(9)                                x = y ⊃ (NOx ⊃ NOy)

In order to explain how this failure comes about, and solve the problems posed by co-referring singular terms, Carnap modifies the semantics of the 1946 paper. The principal difference from the modal logic of the 1946 paper, as Carnap tells us on p. 183, is that the domain of quantification for individual variables now consists of individual concepts, where an individual concept i is a function from state-descriptions to individual constants. Where s is a state-description, let i(s) denote the constant which is the value of the function i for the state-description s. Carnap is clear that the quantifiers range over all individual concepts, not just those expressible in the language.

Using this semantics it is easy to see how (9) can fail. For let x have as its value the individual concept i which is the function such that i(s) is 9 for every state-description s, while the value of y is the function j such that, in any state-description s, j(s) is the individual which is the number of the planets in s; that is, j(s) is the (unique) constant a such that Pa is in s. (Assume that in each state-description there is a unique number, possibly 0, which satisfies P.) Assume that x = y is true in any state-description s iff, where i is the individual concept which is the value of x, and j is the individual concept which is the value of y, i(s) is the same individual constant as j(s). In the present example it happens that when s is the state-description which represents the actual world, i(s) and j(s) are indeed the same, for in s there are nine planets, making x = y true at s. Now NOx will be true if Ox is true in every state-description sʹ, which is to say if i(sʹ) satisfies O in every sʹ. Since i(sʹ) is 9 in every state-description, i(sʹ) does satisfy O in every sʹ, and so NOx is true at s. But suppose sʹ represents a situation in which there are six planets. Then j(sʹ) will be 6, and so Oy will be false in sʹ, and for that reason NOy will be false in s, thus falsifying (9). (It is also easy to see that LI is not valid, since it is easy to have i(s) = j(s) even though i ≠ j.)
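The counterexample just described can be made concrete in a few lines. In this sketch (the names and the two-state setup are my own), an individual concept is a function from state-descriptions to numbers: i is the constant concept expressed by ‘9’, and j tracks the number of planets.

```python
states = ['s_actual', 's_six']            # two illustrative state-descriptions
planets = {'s_actual': 9, 's_six': 6}     # how many planets each one contains

def i(s):                  # the concept expressed by '9': the same in every state
    return 9

def j(s):                  # 'the number of the planets': varies with the state
    return planets[s]

def odd(n):
    return n % 2 == 1

def N_odd(concept):
    """NOx: the concept picks out an odd number in EVERY state-description."""
    return all(odd(concept(s)) for s in states)

print(i('s_actual') == j('s_actual'))  # True: the identity premise holds actually
print(N_odd(i))                        # True: NOx
print(N_odd(j))                        # False: NOy fails, because of s_six
```

The identity holds at the actual state-description while NOx and NOy diverge, which is exactly the combination that the restricted I2 is designed to block.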

The difference between the modal semantics of Carnap 1946 and Carnap 1947 is that in the former the only individuals are the genuine individuals, represented by the constants of the language ℒ. In the proof of the invalidity of (9) it is essential that the semantics of identity require that when x is assigned an individual concept i and y is assigned an individual concept j, x = y is true at a state-description s iff i(s) and j(s) are the same individual.

And now we come to Quine’s complaint (Quine 1953a, p. 152f). It is that Carnap replaces the domain of things as the range of the quantifiers with a domain of individual concepts. Quine then points out that the very same paradoxes arise again at the level of individual concepts. Thus for instance it might be that the individual concept which represents the number of planets in each state-description is identical with the first individual concept introduced on p. 193 of Meaning and Necessity. Carnap is alive to Quine’s criticism that ordinary individuals have been replaced in his ontology by individual concepts. In essence Carnap’s reply to Quine, on pp. 198–200 of Carnap 1947, is that if we restrict ourselves to purely extensional contexts then the entities which enter into the semantics are precisely the same entities as are the extensions of the intensions involved. What this amounts to is that although the domain of quantification consists of individual concepts, the arguments of the predicates are only the genuine individuals. For suppose, as Quine appears to have in mind, we permit predicates which apply to individual concepts. Then suppose that i and j are distinct individual concepts. Let P be a predicate which can apply to individual concepts, and let s be a state-description in which P applies to i but not to j, but in which i(s) and j(s) are the same individual. We now have two options depending on how = is to be understood.
If we take x = y to be true in s when i(s) and j(s) are the same individual, then if x is assigned i and y is assigned j we would have that x = y and Px are both true in s, but Py is not. So that even the simplest instance of I2

I2P       x = y ⊃ (Px ⊃ Py)

fails, and here there are no modal operators involved. The second option is to treat = as expressing a genuine identity. That is to say x = y is true only when the individual concept assigned to x is the same individual concept as the one assigned to y. In the example I have been discussing, since i and j are distinct individual concepts if i is assigned to x and j to y, then x = y will be false. But on this option the full version of I2 becomes valid even when α and β contain modal operators. This is just another version of Quine’s complaint that if an operator expresses identity then the terms of a true identity formula must be interchangeable in all contexts. Presumably Carnap thought that the use of individual concepts could address these worries. The present article makes no claims on whether or not an acceptable treatment of individual concepts is desirable, and if it is whether one can be developed.

7. References and Further Reading

This list contains all items referred to in the text, together with some other articles relevant to Carnap’s modal logic.

  • Barcan, (Marcus) R.C., 1946, A functional calculus of first order based on strict implication. The Journal of Symbolic Logic, 11, 1–16.
  • Burgess, J.P., 1999, Which modal logic is the right one? Notre Dame Journal of Formal Logic, 40, 81–93.
  • Carnap, R., 1937, The Logical Syntax of Language, London, Kegan Paul, Trench, Trubner.
  • Carnap, R., 1946, Modalities and quantification. The Journal of Symbolic Logic, 11, 33–64.
  • Carnap, R., 1947, Meaning and Necessity, Chicago, University of Chicago Press (Second edition 1956, references are to the second edition.).
  • Carnap, R., 1950, Empiricism, semantics and ontology. Revue Internationale de Philosophie, 4, pp. 20–40 (Reprinted in the second edition of Carnap 1947, pp. 205–221. Page references are to this reprint.).
  • Carnap, R., 1952, Meaning postulates. Philosophical Studies, 3, pp. 65–73. (Reprinted in the second edition of Carnap 1947, pp. 222–229. Page references are to this reprint.)
  • Carnap, R., 1963, Intellectual autobiography. The Philosophy of Rudolf Carnap, ed. P.A. Schilpp, La Salle, Ill., Open Court, pp. 3–84.
  • Church, A., 1973, A revised formulation of the logic of sense and denotation (part I). Noûs, 7, pp. 24–33.
  • Cocchiarella, N.B., 1975a, On the primary and secondary semantics of logical necessity. Journal of Philosophical Logic, 4, pp. 13–27.
  • Cocchiarella, N.B., 1975b, Logical atomism, nominalism, and modal logic. Synthese, 31, pp. 23–67.
  • Cresswell, M.J., 2013, Carnap and McKinsey: Topics in the pre-history of possible worlds semantics. Proceedings of the 12th Asian Logic Conference, J. Brendle, R. Downey, R. Goldblatt and B. Kim (eds), World Scientific, pp. 53–75.
  • Garson, J.W., 1980, Quantification in modal logic. Handbook of Philosophical Logic, ed. D.M. Gabbay and F. Guenthner, Dordrecht, Reidel, Vol. II, Ch. 5, pp. 249–307.
  • Gottlob, G., 1999, Remarks on a Carnapian extension of S5. In J. Wolenski, E. Köhler (eds.), Alfred Tarski and the Vienna Circle, Dordrecht, Kluwer, pp. 243–259.
  • Hilbert, D., and W. Ackermann, 1950, Mathematical Logic, New York, Chelsea Publishing Co., (Translation of Grundzüge der Theoretischen Logik.).
  • Hughes, G.E., and M.J. Cresswell, 1996, A New Introduction to Modal Logic, London, Routledge.
  • Lewis, C.I., and C.H. Langford, 1932, Symbolic Logic, New York, Dover Publications.
  • Makinson, D., 1966, How meaningful are modal operators? Australasian Journal of Philosophy, 44, pp. 331–337.
  • Quine, W.V.O., 1934, Ontological remarks on the propositional calculus. Mind, 43, pp. 473–476.
  • Quine, W.V.O., 1943, Notes on existence and necessity. The Journal of Philosophy, 40, pp. 113–127.
  • Quine, W.V.O., 1953a, Reference and modality. From a Logical Point of View, Cambridge, Mass., Harvard University Press, second edition 1961, pp. 139–59.
  • Quine, W.V.O., 1953b, Three grades of modal involvement, The Ways of Paradox, Cambridge Mass., Harvard University Press, 1976, pp. 158–176.
  • Quine, W.V.O., 1960, Word and Object, Cambridge, Mass, MIT Press.
  • Quine, W.V.O., 1969, Reply to Sellars. Words and Objections, (ed D. Davidson and K.J.J. Hintikka), Dordrecht, Reidel, 1969, pp. 337–340.
  • Schurz, G., 2001, Carnap’s modal logic. In W. Stelzner and M. Stockler (eds.), Zwischen traditioneller und moderner Logik. Paderborn, Mentis, pp. 365–380.
  • Smullyan, A.F., 1948, Modality and description. The Journal of Symbolic Logic, 13, 31–7.
  • Thomason, S.K., 1973, A new representation of S5. Notre Dame Journal of Formal Logic, 14, pp. 281–284.
  • Wajsberg, M., 1933, Ein erweiterter Klassenkalkül. Monatshefte für Mathematik und Physik, 40, pp. 113–126.
  • Wittgenstein, L., 1921, Tractatus Logico-Philosophicus. (Translated by D.F. Pears and B.F. McGuinness.) 2nd printing 1963. London, Routledge and Kegan Paul.


Author Information

M. J. Cresswell
Email: max.cresswell@msor.vuw.ac.nz
Victoria University of Wellington
New Zealand

Differential Ontology

Differential ontology approaches the nature of identity by explicitly formulating a concept of difference as foundational and constitutive, rather than thinking of difference as merely an observable relation between entities, the identities of which are already established or known. Intuitively, we speak of difference in empirical terms, as though it is a contrast between two things; a way in which a thing, A, is not like another thing, B. To speak of difference in this colloquial way, however, requires that A and B each has its own self-contained nature, articulated (or at least articulable) on its own, apart from any other thing. The essentialist tradition, in contrast to the tradition of differential ontology, attempts to locate the identity of any given thing in some essential properties or self-contained identities, and it occupies, in one form or another, nearly all of the history of philosophy. Differential ontology, however, understands the identity of any given thing as constituted on the basis of the ever-changing nexus of relations in which it is found, and thus, identity is a secondary determination, while difference, or the constitutive relations that make up identities, is primary. Therefore, if philosophy wishes to adhere to its traditional, pre-Aristotelian project of arriving at the most basic, fundamental understanding of things, perhaps its target will need to be concepts not rooted in identity, but in difference.

“Differential ontology” is a term that may be applied particularly to the works and ideas of Jacques Derrida and Gilles Deleuze. Their successors have extended their work into cinema studies, ethics (including animal ethics), theology, technology, politics, and the arts, among other fields.

This article consists of three main sections. The first explores a brief history of the problem. The historical emergence of the problem in ancient Greek philosophy reveals not only the dangers of a philosophy of difference, but also demonstrates that it is a philosophical problem that is central to the nature of philosophy as such, and is as old as philosophy itself. The second section explores some of the common themes and concerns of differential ontology. The third section discusses differential ontology through the specific lenses of Jacques Derrida and Gilles Deleuze.

Table of Contents

  1. The Origins of the Philosophy of Difference in Ancient Greek Philosophy
    1. Heraclitus
    2. Parmenides
    3. Plato
    4. Aristotle
  2. Key Themes of Differential Ontology
    1. Immanence
    2. Time as Differential
    3. Critique of Essentialist Metaphysics
  3. Key Differential Ontologists
    1. Jacques Derrida as a Differential Ontologist
    2. Gilles Deleuze as a Differential Ontologist
  4. Conclusion
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources
    3. Other Sources Cited in this Article

1. The Origins of the Philosophy of Difference in Ancient Greek Philosophy

Although the concept of differential ontology is applied specifically to Derrida and Deleuze, the problem of difference is as old as philosophy itself. Its precursors lie in the philosophies of Heraclitus and Parmenides; it is made explicit in Plato and deliberately shut down by Aristotle, remaining dormant for some two and a half millennia before being raised again, and turned into an explicit object of thought, by Derrida and Deleuze in the middle of the twentieth century.

a. Heraclitus

From its earliest beginnings, what distinguishes ancient Greek philosophy from a mythologically oriented worldview is philosophy’s attempt to offer a rationally unified picture of the operations of the universe, rather than a cosmos subject to the fleeting and conflicting whims of various deities, differing from humans only in virtue of their power and immortality. The early Milesian philosophers, for instance, had each sought to locate among the various primitive elements a first principle (or archê). Thales had argued that water was the primary principle of all things, while Anaximenes had argued for air. Through various processes and permutations (or in the case of Anaximenes, rarefaction and condensation), this first principle assumes the forms of the various other elements with which we are familiar, and of which the cosmos is composed. All things come from this primary principle, and eventually, they return to it.

Against such thinkers, Heraclitus of Ephesus (fl. c.500 BCE) argues that fire is the first principle of the cosmos: “The cosmos, the same for all, no god or man made, but it always was, is, and will be, an everlasting fire, being kindled in measures and put out in measures.” (DK22B30) From this passage, we are able to glean a few things. The most obvious divergence is that Heraclitus names fire as the basic element, rather than water, air, or, in the case of Anaximander, the boundless (apeiron). But secondly, unlike the Milesians, Heraclitus does not posit any origin of the cosmos at all. The universe always was and always will be this self-manifesting, self-quenching, primordial fire, expressed in nature’s limitless ways. So while fire, for Heraclitus, may be ontologically basic in some sense, it is not temporally basic or primordial: it did not, in the temporal or sequential order of things, come first.

However, like his Milesian predecessors, Heraclitus appears to provide at least a basic account of how fire as first principle transforms: “The turnings of fire: first sea, and of sea, half is earth, half lightning flash.” (DK22B31a) Elsewhere we can see more clearly that fire has ontological priority only in a very limited sense for Heraclitus: “The death of earth is to become water, and the death of water is to become air, and the death of air is to become fire, and reversely.” (DK22B76) Earth becomes water; water becomes air; air becomes fire, and reversely. Combined with the two passages above, we can see that the ontological priority of fire is its transformative power. Fire from the sky consumes water, which later falls from the sky nourishing the earth. Likewise, fire underlies water (which in its greatest accumulations rages and howls as violently as flame itself), out of which come earth and meteorological or ethereal activity. Thus we can see the greatest point of divergence between Heraclitus and his Milesian forebears: the first principle of Heraclitus is not a substance. Fire, though one of the classical elements, is of its very nature a dynamic element—a vital element that is nothing more than its own transformation. It creates (volcanoes produce land masses; furnaces temper steel; heat cooks our food and keeps us safe from elemental exposure), but it also destroys in countless obvious ways; it hardens and strengthens, just as it weakens and consumes. Fire, then, is not an element in the sense of a thing, but rather a process. While in contemporary scientific terms we would say that it is the result of a chemical reaction, its essence for Heraclitus lies in its obvious dynamism. When we look at things (like tables, trees, homes, people, and so forth), they seem to exemplify a permanence which is saliently missing from our experience of fire.

This brings us to the next point: things, for Heraclitus, only appear to have permanence; or rather, their permanence is a result of the processes that make up the identities of the things in question. Here it is appropriate to cite the famous river example, found in more than one Heraclitean fragment: “You cannot step twice into the same rivers; for fresh waters are flowing in upon you.” (DK22B12) “We step and do not step into the same rivers; we are and are not.” (DK22B49a) These passages highlight the seemingly paradoxical nature of the cosmos. On the one hand, of course there is a meaningful sense in which one may step twice into the same river; one may wade in, wade back out, then walk back in again; this body of water is marked between the same banks, the same land markers, and the same flow of water, and so forth. But therein lies the paradox: the water that one waded into the first time is now completely gone, having been replaced by an entirely new configuration of particles of water. So there is also a meaningful sense in which one cannot step into the same river twice. But it is this particular flowing that makes this river this river, and nothing else. Its identity, therefore (as also my own identity), is an effervescent impermanence, constituted on the basis of the flows that make it up. We peer into nature and see things—rivers, people, animals, and so forth—but these are only temporary constitutions, even deceptions: “Men are deceived in their knowledge of things that are manifest.” (DK22B56) The true nature of nature itself, however, continuously eludes us: “Nature tends to conceal itself.” (DK22B123)

The paradoxical nature of things, (that their identities are constituted on the basis of processes), helps us to make sense of Heraclitus’ proclamation of the unity of opposites, (which both Plato and Aristotle held to be unacceptable). Fire is vital and powerful, raging beneath the appearances of nature like a primordial ontological state of warfare: “War is the father of all and the king of all.” (DK22B53) The very same process-driven nature of things that makes a thing what it is also tends, by the same operations, toward the thing’s undoing. As we saw, fire is responsible for the opposite and complementary functions of both creativity and destruction. The nature of things is to tend toward their own undoing and eventual passage into their opposites: “What opposes unites, and the finest attunement stems from things bearing in opposite directions, and all things come about by strife.” (DK22B8).

It is in this sense that all things are, for Heraclitus, one. The creative-destructive operations of nature underlie all of its various expressions, binding the whole of the cosmos together in accordance with a rational principle of organization, recognized only in the universal and timeless truth that everything is constantly subject to the law of flux and impermanence: “It is wise to hearken, not to me, but to the Word [logos—otherwise translatable as reason, argument, rational principle, and so forth], and to confess that all things are one.” (DK22B50). Thus it is that Heraclitus is known as the great thinker of becoming or flux. The being of the cosmos, the most essential fact of its nature, lies in its becoming; its only permanence is its impermanence. For our purposes, we can say that Heraclitus was the first philosopher of difference. Where his predecessors had sought to identify the one primordial, self-identical substance or element, out of which all others had emerged, Heraclitus had attempted to think the world, nothing more than the world, in a permanent state (if this is not too paradoxical) of flux.

b. Parmenides

Parmenides (b. 510 BCE) was likely a young man at the time when Heraclitus was philosophically active. Born in Elea, in Lower Italy, Parmenides’ name is the one most commonly associated with Eleatic monism. While (ironically) there is no one standard interpretation of Eleatic monism, probably the most common understanding of Parmenides is filtered through our familiarity with the paradoxes of his successor, Zeno, who argued, in defense of Parmenides, that what we humans perceive as motion and change are mere illusions.

Against this backdrop, what we know of Parmenides’ views comes to us from his didactic poem, titled On Nature, which now exists only in fragmentary form. Here, however, we find Parmenides almost explicitly objecting to Heraclitus: “It is necessary to say and to think that being is; for it is to be/but it is by no means nothing. These things I bid you ponder./For from this first path of inquiry, I bar you./then yet again from that along which mortals who know nothing/wander two-headed: for haplessness in their/breasts directs wandering understanding. They are borne along/deaf and blind at once, bedazzled, undiscriminating hordes,/who have supposed that being and not-being are the same/and not the same; but the path of all is back-turning.” (DK28B6) There are a couple of important things to note here. First, the mention of those who suppose that being and not-being are the same and not the same hearkens almost explicitly to Heraclitus’ notion of the unity of opposites. Secondly, Parmenides declares this to be the opinion of the undiscriminating hordes, the masses of non-philosophically-minded mortals.

Therefore, Heraclitus, on Parmenides’ view, does not provide a philosophical account of being; rather, he simply coats in philosophical language the everyday experience of the mob. Against Heraclitus’ critical diagnosis that humans (deceptively) see only permanent things, Parmenides claims that permanence is precisely what most people, Heraclitus included, miss, preoccupied as we are with the mundane comings and goings of the latest trends, fashions, and political currents. The being of the cosmos lies not in its becoming, as Heraclitus thought. Becoming is nothing more than an illusion, the perceptions of mortal minds. What is, for Parmenides, is and cannot not be, while what is not, is not, and cannot be. Neither is it possible to even think of what is not, for to think of anything entails that it must be an object of thought. Thus to meditate on something that is not an object is, for Parmenides, contradictory. Therefore, “for the same thing is for thinking and for being.” (DK28B3)

Being is indivisible; for in order to divide being from itself, one would have to separate being from being by way of something else, either being or not-being. But not-being is not and cannot be, (so not-being cannot separate being from being). And if being is separated from being by way of being, then being itself in this thought-experiment is continuous, that is to say, being is undivided (and indivisible): “for you will not sever being from holding to being.” (DK28B4.2) Being is eternal and unchanging; for if being were to change or become in any way, this would entail that in some sense it had participated or would participate in not-being which is impossible: “How could being perish? How could it come to be?/For if it came to be, it was not, nor if it is ever about to come to be./In this way becoming has been extinguished and destruction is not heard of.” (DK28B8.19-21)

Being, for Parmenides, is thus eternal, unchanging, and indivisible spatially or temporally. Heraclitus might have been right to note the way things appear, (as a constant state of becoming) but he was wrong, on Parmenides’ view, to confuse the way things appear with the way things actually are, or with the “steadfast heart of persuasive truth.” (DK28B1.29) Likewise, Parmenides has argued, thought can only genuinely attend to being—what is eternal, unchanging, and indivisible. Whatever it is that Heraclitus has found in the world of impermanence, it is not, Parmenides holds, philosophy. While unenlightened mortals may attend to the transience of everyday life, the path of genuine wisdom lies in the eternal and unchanging. Thus, while Heraclitus had been the first philosopher of difference, Parmenides is the first to assert explicitly that self-identity, and not difference, is the basis of philosophical thought.

c. Plato

It is these two accounts, the Heraclitean and the Parmenidean (with an emphatic privileging of the latter), that Plato attempts to answer and fuse in his theory of forms. Throughout Plato’s Dialogues, he consistently gives credence to the Heraclitean observation that things in the material world are in a constant state of flux. The Parmenidean inspiration in Plato’s philosophy, however, lies in the fact that, like Parmenides, Plato will explicitly assert that genuine knowledge must concern itself only with that which is eternal and unchanging. So, given the transient nature of material things, Plato will hold that knowledge, strictly speaking, does not apply to material things specifically, but rather, to the Forms (eternal and unchanging) of which those material things are instantiations. In Book V of the Republic, Plato (through Socrates) argues that each human capacity has a matter that is proper exclusively to it. For example, the capacity of seeing has as its proper matter waves of light, while the capacity of hearing has as its proper matter waves of sound. In a proclamation that hearkens almost explicitly to Parmenides, knowledge, Socrates argues, concerns itself with being, while ignorance is most properly affiliated with not-being.

From here, the account continues, (with an eye toward Heraclitus). Everything that exists in the world participates both in being and in not-being. For example, every circle both is and is not “circle.” It is a circle to the extent that we recognize its resemblance, but it is not circle (or the absolute embodiment of what it means to be a circle) because we also recognize that no circle as manifested in the world is a perfect circle. Even the most circular circle in the world will possess minor imperfections, however slight, that will make it not a perfect circle. Thus the things in the world participate both in being and in not-being. (This is the nature of becoming). Since being and not-being each has a specific capacity proper to it, becoming, lying between these two, must have a capacity that is proper only to it. This capacity, he argues, is opinion, which, as is fitting, is that epistemological mode or comportment lying somewhere between knowledge and ignorance.

Therefore, when one’s attention is turned only to the things of the world, she can possess only opinions regarding them. Knowledge, reserved only for that domain of being, that which is and cannot not be, for Plato, applies only to the Form of the thing, or what it means to be x and nothing but x. This Form is an eternal and unchanging reality. If one has knowledge of the Form, then one can evaluate each particular in the world, in order to accurately determine whether or not it in fact accords with the principle in question. If not, one may only have opinions about the thing. For example, if one possesses knowledge of the Form of the beautiful, then one may evaluate particular things in the world—paintings, sculptures, bodies, and so forth—and know whether or not they are in fact beautiful. Lacking this knowledge, however, one may hold opinions as to whether or not a given thing is beautiful, but one will never have genuine knowledge of it. More likely, she will merely possess certain tastes on the matter—(I like this poem; I do not like that painting, and so forth.) This is why Socrates, especially in the earlier dialogues, is adamant that his interlocutors not give him examples in order to define or explain their concepts (a pious action is doing what I am doing now…). Examples, he argues, can never tell me what the Form of the thing is, (such as piety, in the Euthyphro). The philosopher, Plato holds, concerns himself with being, or the essentiality of the Form, as opposed to the lover of opinion, who concerns himself only with the fleeting and impermanent.

From this point, however, things in the Platonic account start to get more complicated. “Participation” is itself a somewhat messy notion that Plato never quite explains in any satisfactory way. After all, what does it mean to say, as Socrates argues in the Republic, that the ring finger participates in the form of the large when compared to the pinky, and in the form of the small when compared to the middle finger? It would seem to imply that a thing’s participation in its relevant Form derives, not from anything specific about its nature, but only insofar as its nature is related to the nature of another thing. But the story gets even more complicated in that at multiple points in his later dialogues, Plato argues explicitly for a Form of the different, which complicates what we typically call Platonism, almost beyond the point of recognition, (see, for example, Theaetetus 186a; Parmenides 143b, 146b, and 153a; and Sophist 254e, 257b, and 259a).

On its face, this should not be surprising. If a finger sometimes participates in the form of the large and sometimes in the form of the small, it should stand to reason that any given thing, when looked at side by side with something similar, will be said to participate in the form of the same, while by extension, when compared to something that differs in nature, will be said to participate in the form of the different—and participate more greatly in the form of the different, the more different the two things are. A baseball, side by side with a softball, will participate greatly in the form of the same, (but somewhat in the Form of the different), but when looked at side by side with a cardboard box, will participate more in the Form of the different.

But consistently articulating what a Form of the different would be is more complicated than it may at first seem. To say that the Form of x is what it means to be x and nothing but x is comprehensible enough, when one is dealing with an isolable characteristic or set of characteristics of a thing. If we say, for instance, that the Form of circle is what it means to be a circle and nothing but a circle, we know that we mean all of the essential characteristics that make a circle, a circle, (a plane figure the boundary of which consists of points equidistant from a center, an arc of 360°, and so forth). By implication, each individual Form, to the extent that it completely is what it is, also participates equally in the Form of the same, in that it is the same as itself, or it is self-identical. But what can it possibly mean to say that the Form of the different is what it means to be different and nothing but different? It would seem to imply that the identity of the Form of the different is that it differs, but this requires that it differs even from itself. For if the essence of the different is that it is the same as the different, (in the way that the essence of circle is self-identical to what it means to be circle), then this would entail that the essence of the Form of the different is that, to the same extent that it is different, it participates equally in the Form of the same, (or that, like the rest of the Forms, it is self-identical). But the Form of the different is defined by its being absolutely different from the Form of the same; it must bear nothing of the Form of the same. But this means that the Form of the different must be different from the different as well; put otherwise, while for every other conceivable Platonic Form, one can say that it is self-identical, the Form of the different would be absolutely unique in that its nature would be defined by its self-differentiation.

But there are further complications still. Each Form in Plato’s ontology must relate to every other Form by way of the Form of the different. From this it follows that, just as the Form of the same pervades all the other Forms, (in that each is identical to itself), the Form of the different also pervades all the other Forms, (in that each Form is different from every other). This implies, in some sense, that the different is co-constitutive, along with Plato’s Form of the same, of each other individual Form. After all, part of what makes a thing what it is, (and hence, self-same, or self-identical), is that it differs from everything that it is not. To the extent that the Form of the same makes any given Form what it is, it is commensurably different from every other Form that it is not.

This complication, however, reaches its apogee when we consider the Form of the same specifically. As we said, the Form of the different is defined by its being absolutely different from the Form of the same. The Form of the same differs from all other Forms as well. While, for instance, the Form of the beautiful participates in the Form of the same, (in that the beautiful is the same as the beautiful, or it is self-identical), nevertheless, the Form of the same is different from, (that is, it is not the same as) the Form of the beautiful. The Form of the same differs, similarly, from all other Forms. However, its difference from the Form of the different is a special relation. If the Form of the different is defined by its being absolutely different from the Form of the same, we can say reciprocally, that the Form of the same is defined by its being absolutely different from the Form of the different; it relates to the different through the different. But this means that, to the extent that the Form of the same is self-same, or self-identical, it is so because it differs absolutely from the Form of the different. This entails, however, that its self-sameness derives from its maximal participation in the Form of the different itself.

We see, then, the danger posed by Plato’s Form of the different, and hence, by any attempt to formulate a concept of difference itself. Plato’s Form of the same is ubiquitous throughout his ontology; it is, in a certain sense, the glue that holds together the rest of the Forms, even if in many of his Middle Period dialogues, it never makes an explicit appearance. Simply by understanding what a Form means for Plato, we can see the central role that the Form of the same plays for this, or for that matter, any other essentialist ontology. By simply introducing a Form of the different, and attempting to rigorously think through its implications, one can see that it threatens to fundamentally undermine the Form of the same itself, and hence by implication, difference threatens to devour the rest of the ontological edifice of essentialism. Plato, it seems, was playing with Heraclitean fire. It is likely largely for this reason that Aristotle, Plato’s greatest student, eliminates the Form of the different in his Metaphysics.

d. Aristotle

In the Metaphysics, Aristotle attempts to correct what he perceives to be many of Plato’s missteps. For our purposes, what is most important is his treatment of the notion of difference. For Aristotle inserts into the discussion a presupposition that Plato had not employed, namely, that ‘difference’ may be said only of things which are, in some higher sense, identical. Where Plato’s Form of the different may be said to relate everything to everything else, Aristotle argues that there is a conceptual distinction to be made between difference and otherness.

For Aristotle, there are various ways in which a thing may be said to be one, in terms of: (1) Continuity; (2) Wholeness or unity; (3) Number; (4) Kind. The first two are a bit tricky to distinguish, even for Aristotle. By continuity, he means the general sense in which a thing may be said to be a thing. A bundle of sticks, bound together with twine, may be said to be one, even if it is a result of human effort that has made it so. Likewise, an individual body part, such as an arm or leg, may be said to be one, as it has an isolable functional continuity. Within this grouping, there are greater and lesser degrees to which something may be said to be one. For instance, while a human leg may be said to be one, the tibia or the femur, on their own, are more continuous, (in that each is numerically one, and the two of them together form the leg).

With respect to wholeness or unity, Aristotle clarifies the meaning of this as not differing in terms of substratum. Each of the parts of a man, (the legs, the arms, the torso, the head), may be said to be, in their own way, continuous, but taken together, and in harmonious functioning, they constitute the oneness or the wholeness of the individual man and his biological and psychological life. In this sense, the man is one, in that all of his parts function naturally together towards common ends. In the same respect, a shoelace, each eyelet, the sole, and the material comprising the shoe itself, may be said to be, each in their own way, continuous, while taken together, they constitute the wholeness of the shoe.

Oneness in number is fairly straightforward. A man is one in the organic sense above, but he is also one numerically, in that his living body constitutes one man, as opposed to many men. Finally, there is generic oneness, the oneness in kind or in intelligibility. There is a sense in which all human beings, taken together, may be said to be one, in that they are all particular tokens of the genus human. Likewise, humans, cats, dogs, lions, horses, pigs, and so forth, may all be said to be one, in that they are all types of the genus animal.

Otherness is the term that Aristotle uses to characterize existent things which are, in any sense of the term, not one. There is, as we said, a sense in which a horse and a woman are one, (in that both are types of the genus animal), but an obvious sense in which they are other as well. There is a sense in which my neighbor and I are one, (in that we are both tokens of the genus human), but insofar as we are materially, emotionally, and psychologically distinct, there is a sense in which I am other than my neighbor as well. There is an obvious sense in which I and my leg are one but there is also a sense in which my leg is other than me as well, (for if I were to lose my leg in an accident, provided I received prompt and proper medical attention, I would continue to exist). Every existent thing, Aristotle argues, is by its very nature either one with or other than every other existent thing.

But this otherness does not, (as it does for Plato’s Form of the different), satisfy the conditions for what Aristotle understands as difference. Since everything that exists is either one with or other than everything else that exists, there need not be any definite sense in which two things are other. Indeed, there may be, (as we saw above, my neighbor and I are one in the sense of tokens of the genus human, but are other numerically), but there need not be. For instance, you are so drastically other than a given place, say, a cornfield, that we need not even enumerate the various ways in which the two of you are other.

This, however, is the key for Aristotle: otherness is not the same as difference. While you are other than a particular cornfield, you are not different than a cornfield. Difference, strictly speaking, applies only when there is some definite sense in which two things may be said to differ, which requires a higher category of identity within which a distinction may be drawn: “For the other and that which it is other than need not be other in some definite respect (for everything that is existent is either other or same), but that which is different is different from some particular thing in some particular respect, so that there must be something identical whereby they differ. And this identical thing is genus or species…” (Metaphysics, X.3) In other words, two human beings may be different, (that is, one may be taller, lighter-skinned, a different gender, and so forth), but this is because they are identical in the sense that both are specific members of the genus ‘human.’ A human being may be different than a cat, (that is, one is quadrupedal while the other is bipedal, one is non-rational while the other is rational, and so forth), but this is because they are identical in the sense that both are specific members of the genus ‘animal.’

But between these two, generic and specific, specific difference or contrariety is, according to Aristotle, the greatest, most perfect, or most complete. This assessment too is rooted in Aristotle’s emphasis on identity as the basis of differentiation. Differing in genus in Aristotelian terminology means primarily belonging to different categories of being, (substance, time, quality, quantity, place, relation, and so forth). You are other than ‘5,’ to be sure, but Aristotle would not say that you are different from ‘5,’ because you are a substance and ‘5’ is a quantity; given that these two are distinct categories of being, for Aristotle there is not really a meaningful sense in which they can be said to relate, and hence, there is not a meaningful sense in which they can be said to differ. Things that differ in genus are so far distant (closer, really, to otherness), as to be nearly incomparable. However, a man may be said to be different than a cat, because the characteristics whereby they are distinguished from each other are contrarieties, occupying opposing sides of a given either/or, for instance, rational v. non-rational. Specific difference or contrariety thus provides us with an affirmation or a privation, a ‘yes’ or a ‘no’ that constitutes the greatest difference, according to Aristotle. Differences in genus are too great, while differences within species are too minute and numerous (skin color, for instance, is manifested in an infinite myriad of ways), but specific contrariety is complete, embodying an affirmation or negation of a particular given quality whereby genera are differentiated into species.

There are thus two senses in which, for Aristotle, difference is thought only in accordance with a principle of identity. First, there is the identity that two different things share within a common genus. (A rock and a tree are identical in that both are members of the genus, ‘substance,’ differentiated by the contrariety of ‘living/non-living.’) Second, there is the identity of the characteristic whereby two things are differentiated: material (v. non-material), living (v. non-living), sentient (v. non-sentient), rational (v. non-rational), and so forth.

We see, then, that with Aristotle, difference becomes fully codified within the tradition as the type of empirical difference that we mentioned at the outset of this article: it is understood as a recognizable relation between two things which, prior to (and independently of) their relating, possess their own self-contained identities. This difference then is a way in which a thing A, (which is identical to itself) is not like a thing B (which is identical to itself), while both belong to a higher category of identity, (in the sense of an Aristotelian genus). Given the difficulties that we encountered with Plato’s attempt to think a Form of the different, it is perhaps little wonder that Aristotle’s understanding of difference was left unchallenged for nearly two and a half millennia.

2. Key Themes of Differential Ontology

As noted in the Introduction, differential ontology is a term that can be applied to two specific thinkers (Deleuze and Derrida) of the late twentieth century, along with those philosophers who have followed in the wake of these two. It does not, however, designate a school of thought, in that there is no identifiable doctrine or set of doctrines that defines what they think. Indeed, for as many similarities as one can find between them, there are likely equally many distinctions as well. They work out of different philosophical traditions: Derrida primarily from the Hegel-Husserl-Heidegger triumvirate, with Deleuze, (speaking critically of the phenomenological tradition for the most part), focusing on the trinity of Spinoza-Nietzsche-Bergson. Theologically, they are interested in different sources, with Derrida giving constant nods to thinkers in the tradition of negative theology, such as Meister Eckhart, while Deleuze is interested in the tradition of univocity, specifically in John Duns Scotus. They have different attitudes toward the history of metaphysics, with Derrida working out of the Heideggerian platform of the supposed end of metaphysics, while Deleuze explicitly rejects all talk of any supposed end of metaphysics. They hold different attitudes toward their own philosophical projects, with Derrida coining the term, (following Heidegger’s Destruktion), deconstruction, while for Deleuze, philosophy is always a constructivism. In many ways, it is difficult to find two thinkers who are less alike than Deleuze and Derrida.
Nevertheless, what makes them both differential ontologists is that they are working within a framework of specific thematic critiques and assumptions, and that on the basis of these factors, both come to argue that difference in itself has never been recognized as a legitimate object of philosophical thought, to hold that identities are always constituted on the basis of difference in itself, and to explicitly attempt to formulate such a concept. Let us now look to these thematic, structural elements.

a. Immanence

The word immanence is contrasted with the word transcendence. “Transcendence” means going beyond, while “immanence” means remaining within, and these designations refer to the realm of experience. In most monotheistic religious traditions, for instance, which emphasize creation ex nihilo, God is said to be transcendent in the sense that He exists apart from His creation. God is the source of nature, but God is not, Himself, natural, nor is He found within anything in nature except perhaps in the way in which one might see reflections of an artist in her work of art. Likewise, God does not, strictly speaking, resemble any created thing. God is eternal, while created beings are temporal; God is without beginning, while created things have a beginning; God is a necessary being, while created things are contingent beings; God is pure spirit, while created things are material. The creature closest in nature to God is the human being who, according to the Biblical book of Genesis, is created in the image of God, but even in this case, God is not to be understood as resembling human beings: “For my thoughts are not your thoughts, neither are my ways your ways…” (Isaiah 55:8).

By contrast, we can say that historically, the trend in Eastern philosophies and religions (in which philosophy and religion are not as radically differentiated as they are in the Western tradition) has always leaned much more in the direction of immanence than of transcendence, and definitely more so than nearly all strains of Western monotheism. In schools of Eastern philosophy that have some notion of the divine (and a great many of them do not), many if not most understand the divine as somehow embodied or manifested within the world of nature. Such a position would be considered idolatrous in most strains of Western monotheism.

With respect to the Western philosophical tradition specifically, we can say that, even in moments when religious tendencies and doctrines do not loom large, (as they do, for instance, during the Medieval period), there nevertheless remains a dominant model of transcendence throughout, though this transcendence is emphasized in greater and lesser degrees across the history of philosophy. There have been outliers, to be sure—Heraclitus comes to mind, along with Spinoza and perhaps David Hume, but they are rare. A philosophy rooted in transcendence will, in some way, attempt to ground or evaluate life and the world on the basis of principles or criteria not found (indeed not discoverable de jure) among the living or in the world. When Parmenides asserts that the object of philosophical thought must be that which is, and which cannot not be, which is eternal, unchanging, and indivisible, he is describing something beyond the realm of experience. When Plato claims that genuine knowledge is found only in the Form; when he argues in the Phaedo that the philosophical life is a preparation for death; that one must live one’s life turning away from the desires of the body, in the purification of the spirit; when he alludes, (through the mouth of Socrates) to life as a disease, he is basing the value of this world on a principle not found in the world. When St. Paul writes that “…the flesh lusts against the Spirit, and the Spirit against the flesh; and these are contrary to one another, so that you do not do the things you wish” (Galatians 5:17), and that “the mind governed by the flesh is death,” (Romans 8:6), he is evaluating this world against another.
When René Descartes recognizes his activity of thinking and finds therein a soul; when John Locke bases Ideas upon a foundation of primary qualities, immanent, allegedly, to the thing, but transcendent to our experience; when Immanuel Kant bases phenomenal appearances upon noumenal realities, which, outside of space, time, and all causality, ever elude cognition; when an ethical thinker seeks a moral law, or an absolute principle of the good against which human behaviors may be evaluated; in each of these cases, a transcendence of some sort is posited—something not found within the world is sought in order to make sense of or provide a justification for this world.

Famously, Friedrich Nietzsche argued that the history of philosophy was one of the true world finally becoming a fable. Tracing the notion of the true world from its sagely Platonic (more accurately, Parmenidean) inception up through and beyond its Kantian (noumenal) manifestation, he demonstrates that as the demand for certainty (the will to Truth) intensifies, the so-called true world becomes less plausible, slipping further and further away, culminating in the moment he called “the death of God.” The true world has now been abolished, leaving only the apparent world. But the world was only ever called apparent by comparison with a purported true world (think here of Parmenides’ castigation of Heraclitus). Thus, when the true world is abolished, the apparent world is abolished as well; and we are left with only the world, nothing more than the world.

One of the key features of differential ontology will be the adoption of Nietzsche’s proclaimed (and reclaimed) enthusiasm for immanence. Deleuze and Derrida will, each in his own way, argue that philosophy must find its basis within and take as its point of departure the notion of immanence. As we shall see below, in Deleuze’s philosophy, this emphasis on immanence will take the form of his enthusiasm for the Scotist doctrine of the univocity of being. For Derrida, it will be his lifelong commitment to the phenomenological tradition, inspired by the vast body of research conducted over nearly half a century by Edmund Husserl, out of which Derrida’s own professional research began, (and in which he discovers the notion of différance).

b. Time as Differential

Related to the privileging of immanence is the second principle of central importance to differential ontology: a careful and rigorous analysis of time. Such analysis, inspired by Edmund Husserl, will yield the discovery of a differential structure, which stands in opposition to the traditional understanding of time, the spatially organized, puncti-linear model. This model derives from a conglomeration of various elements from Plato, Aristotle, St. Augustine, Boethius, and ultimately, the Modern period.

Few thinkers have attempted so rigorously as Aristotle to think the paradoxical nature of time. If we take the very basic model of time as past-present-future, Aristotle notes that one part of this (the past) has been but is no more, while another part of it (the future) will be but is not yet. There is an inherent difficulty, therefore, in trying to understand what time is, because it seems as though it is composed of parts made up of things that do not exist; we are thus attempting to understand what something that does not exist is.

Furthermore, the present or the now itself, for Aristotle, is not a part of time, because a part is so called in virtue of its being a constitutive element of a whole. However, time, Aristotle claims, is not made up of nows, in the same way that a line, though it has points, is not made up of points.

Likewise, the now cannot simply remain the same, but nor can it be said to be discrete from other nows and ever-renewed. For if the now is ever the same, then everything that has ever happened is contained within the present now, (which seems absurd); but if each now is discrete, and is constantly displaced by another discrete, self-contained now, the displacement of the old now would have to take place within (or simultaneously with) the new, incoming now, which would be impossible, as discrete nows cannot be simultaneous; hence time as such would never pass.

Aristotle will therefore claim that there is a sense in which the now is constantly the same, and another sense in which it is constantly changing. The now is, Aristotle argues, both a link of and the limit between future and past. It connects future and past, but is at the same time the end of the past and the beginning of the future; but the future and the past lie on opposite sides of the now, so the now cannot, strictly speaking, belong either to the past or to the future. Rather, it divides the past from the future. The essence of the now is this division—as such, the now itself is indivisible, “the indivisible present ‘now’.” (Physics IV.13). Insofar as each now succeeds another in a linear movement from future to past, the now is ever-changing—what is predicated of the now is constantly filled out in an ever-new way. But structurally speaking, inasmuch as the now is always that which divides and unites future and past, it is constantly the same.

Without the now, there would be no time, Aristotle argues, and vice versa. For what we call time is merely the enumeration that takes place in the progression between some moment, future or past, and the now moment: “For time is just this—number of motion in respect of ‘before’ and ‘after’.” (Physics, IV.11)

Here, then, are the elements that Aristotle leaves us with: an indivisible now moment that serves as the basis of the measure of time, which is a progression of enumeration taking place between moments, and a notion of relative distance that marks that progression of enumeration.

In Plato, St. Augustine, and Boethius, we find an important distinction between time and eternity: (it is important to note that Aristotle’s discussion of time is found in his Physics, not in his Metaphysics, because time, as the measure of change, belongs only to the things of nature, not to the divine). The reason that this is an important distinction for our purposes is that eternity, (for Plato, Augustine, and Boethius), is the perspective of the divine, while temporality is the perspective of creation. Eternity, for all three of these thinkers, does not mean passing through every moment of time in the sense of having always been there, and always being there throughout every moment of the future, (which is called ‘sempiternity’). All three of these philosophers view time as itself a created thing, and eternity, the perspective of the divine, stands outside of time.

Having created the entire spectrum of time, and standing omnisciently outside of time, the divine sees the whole of time, in an ever-present now. This complete fullness of the now is how Plato, St. Augustine, and Boethius understand eternity. Once we have made this move, however, it is a very short leap to the conclusion that, in a sense, all of time is ever-present; certainly not from our perspective, but from the perspective of the divine. In other words, right now, in an ever-present now, God is seeing the exodus of the Israelites, the sacking of Troy, the execution of Socrates, the crucifixion of Jesus, the fall of the Roman Empire, and the moment (billions of years from now) that the sun will become a Red Giant. Therefore, what we call the now is, on this model, no more or less significant, and no more or less NOW, than any other moment in time. It only appears to be so, from our very limited, finite perspectives. From the perspective of eternity, however, each present is equally present. Plato refers to time in the Timaeus as a “moving image of eternity.” (Timaeus, 37d)

The final piece of the puncti-linear model of time comes from the historical moment of the scientific revolution, with the conceptual birth of absolute time. On the Modern view, time is not the Aristotelian measure of change; rather, the measure of change is how we typically perceive time. Time, in and of itself, however, just is, in the same way that space, on the Modern view, just is—it is mathematical, objectively uniform and unitary, and the same in all places, its units (or moments) unfolding with precise regularity. Though the term “absolute time” was officially introduced by Sir Isaac Newton in his Philosophiae Naturalis Principia Mathematica (1687), as that which “from its own nature flows equably without regard to anything external,” it is nevertheless clear that the experiments and theories of both Johannes Kepler and Galileo Galilei (both of whom historically preceded Newton) assume a model of absolute time. None other than René Descartes, the philosopher who did more than any other to usher in the modern sciences, writes in the third of his famous Meditations on First Philosophy, (where he argues for the existence of God) “for the whole time of my life may be divided into an infinity of parts, each of which is in no way dependent on any other…”

The puncti-linear model of time, then, conceives the whole of time as a series of now-points, or moments, each of which makes up something like an atom of time (as the physical atom is a unit of space). Each of those moments is ontologically and logically independent of every other, with the present moment being the now-point most alive. The past, then, is conceived as those presents that have come and gone, while the future is conceived as those presents that have not yet come. Insofar as we speak of past and future moments as occupying points of greater and lesser distances with respect to the present, and to each other as well, we are, whether we realize it or not, conceiving of time as a linear progression; thus, when we attempt to understand the essence of time, we tend to conceptually spatialize it. This prejudice is most easily seen in the ease with which our mind leaps to timelines in order to conceptualize relations of historical events.

Edmund Husserl’s 1904-1905 lectures On the Phenomenology of the Consciousness of Internal Time probably did more to shape the future of the continental tradition in philosophy in the twentieth century than any other single text. There, Husserl constructs a model of time consciousness that he calls “the living present.” Whether or not there is any such thing as real time, or absolute time, is, for Husserl, one of those questions that is bracketed in the performance of the phenomenological epochē; time, for Husserl, like everything else, is to be analyzed not in terms of its objective qualities, but in terms of how it is lived by a constituting subject. Husserl’s point of departure is an objection to the theory of experience offered by his mentor, Franz Brentano. Assuming (though not making explicit) the puncti-linear model of time, Brentano distinguishes between two basic types of experience. The primary mode is perception, which is the consciousness of the present now-point. All modes of the past are understood in terms of memory, which is the imagined recollection or representation of a now-moment no longer present. If you are asked to call to mind your celebration of your tenth birthday, you are employing the faculty of memory. And every mode of experience dealing with the past (or the no-longer-present) is understood by Brentano as memory.

This understanding presents a problem, though. From the moment you began reading the first word of this sentence until this moment right now, as you are reading these words, there is a type of memory being employed. Indeed, in order to genuinely experience any given experience as an experience, and not just a random series of moments, there must be some operation of memory taking place. To cognize a song as a song, rather than a random series of notes, we must have some memory of the notes that have come just before, and so on. However, the type of memory that is being used here seems to be qualitatively different than the sort of memory employed when you are encouraged to reflect upon your tenth birthday, or even, to reflect upon what you had for dinner last night. This type of reflection, (or representation), is a calling-back-to-mind of an event that was experienced at some point in the past, but has long since left consciousness, while the other (the sort of memory it takes to cognize a sentence or a paragraph, for instance), is an operative memory of an event that has not yet left consciousness. Both are modes of memory, to be sure, but they are qualitatively different modes of memory, Husserl argues.

Moreover, in each moment of experience, we are at the same time looking forward to the future in some mode of expectation. This, too, is something we experience on a frequent basis. For instance, when walking through a door that we walk through frequently, we might casually tap the handle and lead with our head and shoulders because we expect that the door, unlocked, will open for us as it always does; when sitting at the bus stop, as the bus approaches the curb, we stand, because we expect that it is our bus, and that it is stopping to let us on. Our expectations are not quite as salient as are our primary memories, but they are there. All it takes is a rupture of some sort—the door may be locked, causing us to hit our head; the bus may not be our bus, or the driver may not see us and may continue driving—to realize that the structure of expectation was present in our consciousness.

Time, Husserl argues, is not experienced as a series of discrete, independent moments that arise and instantly die; rather, experience of the present is always thick with past and future. What Aristotle refers to as the now, Husserl calls the ‘primal impression,’ the moment of impact between consciousness and its intentional object, which is ever-renewed, but also ever-passing away; but the primal impression is constantly surrounded by a halo of retention (or primary memory) and protention (primary expectation). This structure, taken altogether, is what Husserl calls, “the living present.”

Derrida and Deleuze will each employ (while subjecting to strident criticism) Husserl’s concept of the living present. If the present is always constituted as a relation of past to future, then the very nature of time is itself relational, that is to say, it cannot be conceived as points on a line or as seconds on a clock. If time is essentially or structurally relational, then everything we think about ‘things’ (insofar as things are constituted in time), and knowledge (insofar as it takes place in time), will be radically transformed as well. To fully think through the implications of Husserl’s discovery entails a fundamental reorientation toward time and along with it, being. Deleuze employs the retentional-protentional structure of time, while discarding the notion of the primal impression. Derrida will stick with the terms of Husserl’s structure, while demonstrating that the present in Husserl is always infected with or contaminated by the non-present, the structure of which Derrida calls différance.

c. Critique of Essentialist Metaphysics

In a sense it is difficult to talk in a synthetic way about what Deleuze and Derrida find wrong with traditional metaphysics, because they each, in the context of their own specific projects, find distinct problems with the history of metaphysics. However, the following is clear: (1) if we can use the term ‘metaphysics’ to characterize the attempt to understand the fundamental nature of things; and (2) if we can acknowledge that traditionally this effort has been carried out through the lenses of representational thinking in some form or other; then (3) the critiques that Deleuze and Derrida offer against the history of metaphysical thinking are centered on the claim that traditional metaphysics ultimately fails at this self-appointed task.

For Deleuze, this will come down to a necessary and fundamental imprecision that accompanies traditional metaphysics. Inasmuch as the task of metaphysics is to think the nature of the thing, and inasmuch as it sets for itself essentialist parameters for doing so, necessarily filtering its own understanding of the thing through conceptual representations, philosophy can never arrive at an adequate concept of the individual. To our heart’s content, we may compound and multiply the concepts that we use to characterize a thing, but there will always necessarily be some aspect of that thing left untouched by thinking. Let us use a simple example, an inflated ball—we can describe it by as many representational concepts as we like: it is a certain color or a certain pattern, made of a certain material (rubber or plastic, perhaps), a certain size, it has a certain degree of elasticity, is filled with a certain amount of air, and so forth. However many categories or concepts we may apply to this ball, the nature of this ball itself will always elude us. Our conceptual characterization will always reach a terminus; our concepts can only go so far down. The ball is these characterizations, but it is also different from its characterizations as well. For Deleuze, if our ontology cannot reach down to the thing itself, if it is structurally and essentially incapable of comprehending the constitution of the thing, (as any substance metaphysics will be), then it is, for the most part, worthless as an ontology.

Derrida casts his critiques of the history of metaphysics in the Heideggerian language of presence. The history of philosophy, Derrida argues, is the metaphysics of presence, and the Western religious and philosophical tradition operates by categorizing being according to conceptual binaries: (good/evil, form/matter, identity/difference, subject/object, voice/writing, soul/body, mind/body, invisible/visible, life/death, God/Satan, heaven/earth, reason/passion, positive/negative, masculine/feminine, true/apparent, light/dark, innocence/fallenness, purity/contamination, and so forth.) Metaphysics consists of first establishing the binary, but from the moment it is established, it is already clear which of the two terms will be considered the good term and which the pejorative term—the good term is the one that is conceptually analogous to presence in either its spatial or temporal sense. The philosopher’s task, then, is to isolate the presence as the primary term, and to conceptually purge it of its correlative absent term.

A few examples will help elucidate this point: for Parmenides, divine wisdom entails an attempt to think that which is, at every present moment, the same. Heraclitean flux is the way of the masses. This in part shapes Plato’s emphasis on the eternality of the Form—while the material world changes with the passage of each new present, the Form remains the same, (ever-present or eternal); but in addition, the Form is also that which is closest to (a term of presence) the nature of the soul. In Plato the body, (given that it is in constant flux), is the prison of the soul, and in the Phaedo, life is declared a disease, for which death is the cure. In Aristotle’s Nicomachean Ethics, the most complete happiness for the human being lies in the life of contemplation, because the capacity for contemplation is that which makes us most like the gods, (who are ever the same). So we would be happiest if we could contemplate all the time; however, we unfortunately cannot (as we also have bodies and so, various familial, social, and political responsibilities). In Christianity, the flesh is subject to change (which is the very essence of corruptibility) while the Spirit is incorruptible (or ever-present); for Descartes, (and for the early phenomenological tradition), only what is spatially present (that is, immediate to consciousness), is indubitable. Whether or not the object of my present perception exists in the so-called real world, it is nevertheless indubitable that I am experiencing what I am experiencing, in the moment at which I am experiencing it—Descartes even goes so far as to define clarity (one of his conditions for ‘truth’) as that which is present. And for Descartes, the body (insofar as it is at least possible to doubt its existence), is not most essentially me; rather, my soul is what I am. We saw Aristotle’s emphasis on the now, or the present, as the yardstick against which time is measured. 
The spatial and temporal senses of the emphasis on presence are completely solidified in the phenomenology of Edmund Husserl, whose phenomenological reduction attempts to focus its attention exclusively on the phenomena of consciousness, because only then can it accord with the philosophical demand for apodicticity; and he understands this experience as constituted on the model of the living present. In each of these cases, a presence is valorized, and its correlative absence is suppressed.

As was the case with Deleuze, however, Derrida will argue that the self-proclaimed task of metaphysics ultimately fails. Thus, (and against some of his more dismissive critics), Derrida operates in the name of truth—when the history of metaphysics posits that presence is primary, and absence secondary, this claim, Derrida shows, is false. Metaphysics, in all of its manifestations, attempts to cast out the impure, but somehow, the impure always returns, in the form of a supplementary term; the secondary term is always ultimately required in order to supplement the primary term. Derrida’s project then is dedicated to demonstrating that if the subordinate term is required in order to supplement the so-called primary term, it can only be because the primary term suffers always and originally from a lack or a deficiency of some sort. In other words, the absence that philosophy sought to cast out was present and infecting the present term from the origin.

3. Key Differential Ontologists

a. Jacques Derrida as a Differential Ontologist

It is in the phenomenology of Edmund Husserl, as we said above, that Derrida discovers the constitutive play known as différance. In its effort to isolate ideal essences, constituted within the sphere of consciousness, phenomenology brackets or suspends all questions having to do with the real existence of the external world, the belief in which Husserl refers to as the natural attitude. The natural attitude is the non-thematic subjective attitude that takes for granted the real existence of the real world, absent and apart from my (or any) experience of it. Science or philosophy, in the mode of the natural attitude, ontologically distinguishes being from our perceptions of being, from which point it becomes impossible to ever again bridge the gap between the two. Try as it might, philosophy in the natural attitude can only ever have representations of being, and certainty (the telos of philosophical activity) becomes untenable. With this in mind, we get Husserl’s famous principle of all principles: “that everything originarily… offered to us in ‘intuition’ is to be accepted simply as what it is presented as being, but also only within the limits in which it is presented there.” (Ideas I, §24) Husserl thus proposes a change in attitude, in what he calls the phenomenological epochē, which suspends all questions regarding the external existence of the objects of consciousness, along with the constituting priority of the empirical self. Both self and world are bracketed, revealing the correlative structure of the world itself in relation to consciousness thereof, opening a sphere of pure immanence, or, in Derrida’s terminology, pure presence. It is for this reason that Husserl represents for Derrida the most strident defense of the metaphysics of presence, and it is for this reason that his philosophy also serves as the ground out of which the notion of constitutive difference is discovered.

In his landmark 1967 text, Voice and Phenomenon, Derrida takes on Husserl’s notion of the sign. The sign, we should note, is typically understood as a stand-in for something that is currently absent. Linguistically a sign is a means by which a speaker conveys to a listener the meaning that currently resides within the inner experience of the speaker. The contents of one person’s experience cannot be transferred or related to the experience of a listener except through the usage of signs, (which we call ‘language’). Knowing this, and knowing that Husserl’s philosophy is an attempt to isolate a pure moment of presence, it is little wonder that he wants, inasmuch as it is possible, to do away with the usage of signs, and isolate an interior moment of presence. It is precisely this aim that Derrida takes apart.

In “Essential Distinctions,” the opening chapter of his First Logical Investigation, Husserl draws a distinction between different types or functions of signs: expressions and indications. He claims signs are always pointing to something, but what they point to can assume different forms. An indication signifies without pointing to a sense or a meaning. The flu or a bodily infection, for instance, is not the meaning of the fever, but it is brought to our attention by way of the fever—the fever, that is, points to an ailment in the body. An expression, however, is a sign that is, itself, charged with conceptual meaning; it is a linguistic sign. There are countless types of signs—animal tracks in the snow point to the recent presence of life, a certain aroma in the house may indicate that a certain food, or even a certain ethnicity of food, is being prepared—but expressions are signs that are themselves meaningful.

This distinction (indication/expression) is itself problematic, however, and does not seem to be fundamentally sustainable. An indication may very well be an expression, and expressions are almost always indications. If one’s significant other leaves one a note on the table, for instance, before one has even read the words on the paper, the sheer presence of writing, left in a certain way, on a sheet of paper, situated a certain way on the table, indicates: (1) that the beloved is no longer in the house; and (2) that the beloved has left one a message to be read. These signs are both indications and expressions. Furthermore, every time we use an expression of some sort, we are indicating something, namely, we are pointing toward an ideal meaning, empirical states of affairs, and so forth. In effect, one would be hard pressed to find a single example of a use of an expression that was not, in some sense, indicative.

Husserl, however, will argue that even if, in fact, expressions are always caught up in an indicative function, this has nothing to do with the essential distinction, obtaining de jure (if not de facto), between indications and expressions. Husserl cannot relinquish this conviction because, after all, he is attempting to isolate a pure moment of self-presence of meaning. So if expressions are signs charged with meaning, then Husserl will be compelled to locate a pure form of expression, which will require the absolute separation of the expression from its indicative function. Indeed, he thinks that this is possible. The reason expressions are almost always indicative is that we use them to communicate with others, and in the going-forth of the sign into the world, some measure of the meaning is always lost—no matter how many signs we use to articulate our experience, the experience can never be perfectly and wholly recreated within the mind of the listener. So to isolate the expressive essence of the expression, we must suspend the going-forth of the sign into the world. This is accomplished in the soliloquy of the inner life of consciousness.

In one’s interior monologue, there is nothing empirical, (nothing from the world), and hence, nothing indicative. The signs themselves have only ideal, not real, existence. Likewise, the signs employed in the interior monologue are not indicative in the way that signs in everyday communication are. Communicative expressions point us to states of affairs or the internal experiences of another person; in short, they point us to empirical events. Expressions of the interior monologue, however, do not point us to empirical realities, but rather, Husserl claims that in the interior expression, the sign points away from itself, and toward its ideal sense. For Husserl, therefore, the purest, most meaningful mode of expression is one in which nothing is, strictly speaking, expressed to anyone.

One might nevertheless wonder: is it not the case that when one ‘converses’ internally with oneself, one is, in some sense, articulating meaning to oneself? Here is a mundane example, one which has happened to each of us at some point: we walk into a room, and forget why we have entered the room. “Why did I come in here?” we might silently utter to ourselves, and after a moment, we might say to ourselves, “Ah, yes, I came in here to turn down the thermostat,” or something of the like. Is it not the case that the individual is communicating something to herself in this monologue?

Husserl responds in the negative. The signs we are using are not making known to the self a content that was previously inaccessible to the self (which is what takes place in communication). In pointing away from itself and directly toward the sense, the sign does not convey the sense from a self to a self; rather, the sense of the expression is experienced by the subject at exactly the same moment in time.

This is where Derrida brings into the discussion Husserl’s notion of the living present, for this emphasis on the exact same moment in time relies upon Husserl’s notion of the primal impression. Derrida writes, “The sharp point of the instant, the identity of lived-experience present to itself in the same instant bears therefore the whole weight of this demonstration.” (Voice and Phenomenon, 51) In our discussion above of Husserl’s living present, we saw that the primal impression is constantly displaced by a new primal impression—it is in constant motion. Nevertheless, this does not keep Husserl from referring to the primal impression in terms of a point; it is, he says, the source-point on the basis of which retention is established. While the primal impression is always, as we saw with Husserl, surrounded by a halo of retention and protention, nevertheless this halo is always thought from the absolute center of the primal impression, as the source-point of experience. It is the “non-displaceable center, an eye or living nucleus” of temporality. (Voice and Phenomenon, 53)

This, we note, is the Husserlian manifestation of the emphasis on the temporal sense of presence in the philosophical tradition. Here, we should also note: Derrida never attempts to argue that philosophy should move away from the emphasis on presence. The philosophical tradition is defined, Derrida claims, by its emphasis on the present; the present provides the very foundation of certainty throughout the history of philosophy; it is certainty, in a sense. The present comprises an ineliminable essential element of the whole endeavor of philosophy. So to call it into question is not to try to bring a radical transformation to philosophy, but to shift one’s vantage point to one that inhabits the space between philosophy and non-philosophy. Indeed, Derrida motions in this direction, which is one of the reasons that he is more comfortable than many traditional academic philosophers writing in the style of, and in communication with, literary figures. The emphasis on presence within philosophy, strictly speaking, is incontestable.

Husserl, we saw, formulated his notion of the living present on the basis of his insistence on a qualitative distinction between retention, as a mode of memory still connected to the present moment of consciousness, and representational memory, which deliberately calls to mind a moment of the past that has, since its occurrence, left consciousness. This means, for Husserl, that retention must be understood in the mode of Presentation, as opposed to Representation. Retention is a direct intuition of what has just passed, directly presented, and fully seen, in the moment of the now, as opposed to representational memory, which is not. Retention is not, strictly speaking, the present; the primal impression is the present. Nevertheless, retention is still attached to the moment of the now. Furthermore, there is a sense in which retention is necessary to give us the experience of the present as such. The primal impression, where the point of contact occurs, is in a constant mode of passing away, and the impression only becomes recognizable in the mode of retention—as we said, to truly experience a song as a song, one must keep in one’s memory the preceding few notes; but this is just another way of saying that the present is constituted in part on the basis of memory, even if memory of what has just passed.

Derrida diagnoses, then, a tension operating at the heart of Husserl’s thinking. On the one hand, Husserl’s phenomenology requires the sharp point of the instant in order to have a pure moment of self-presence, wherein meaning, without the intermediary of signs, can be found. In this sense, retention, though primary memory, is memory nonetheless, and does not give us the present, but is rather, Husserl claims, the “antithesis of perception.” (On the Phenomenology of the Consciousness of Internal Time, 41) Nevertheless, the present as such is only ever constituted on the basis of its becoming concretized and solidified in the acts that constitute the stretching out of time, in retention. Moreover, given that retention is still essentially connected to the primal impression (which is itself always in the mode of passing away), there is not a radical distinction between retention and primal impression; rather, they are continuous, and the primal impression is really only, Husserl claims, an ideal limit. There is thus a continual accommodation of Husserlian presence to non-presence, which entails the admission of a form of otherness into the self-present now-point of experience. This accommodation is what keeps memory as retention fundamentally distinct from memory as representation.

Nevertheless, the common root, making possible both retention and representational memory, is the structural possibility of repetition itself, which Derrida calls the trace. The trace is the mark of otherness, or the necessary relation of interiority to exteriority. Husserl’s living present is marked by the structure of the trace, Derrida argues, because the primal impression for Husserl, never occurs without a structural retention attached to it. Thus, in the very moment, the selfsame point of time, when the primal impression is impressed within experience, what is essentially necessary to the structure, (in other words, not an accidental feature thereof), is the repeatability of the primal impression within retention. To be experienced in a primal impression therefore requires that the object of experience be repeatable, such that it can mark itself within the mode of retention, and ultimately, representational memory. Thus the primal impression is traced with exteriority, or non-presence, before it is ever empirically stamped.

On the basis of this trace that constitutes the presence of the primal impression, Derrida introduces the concept of différance:

[T]he trace in the most universal sense, is a possibility that must not only inhabit the pure actuality of the now, but must also constitute it by means of the very movement of the différance that the possibility inserts into the pure actuality of the now. Such a trace is, if we are able to hold onto this language without contradicting it and erasing it immediately, more ‘originary’ than the phenomenological originarity itself… In all of these directions, the presence of the present is thought beginning from the fold of the return, beginning from the movement of repetition and not the reverse. Does not the fact that this fold in presence or in self-presence is irreducible, that this trace or this différance is always older than presence and obtains for it its openness, forbid us from speaking of a simple self-identity ‘im selben Augenblick’? (Voice and Phenomenon, 58)

Différance, Derrida says, is a movement, a differentiating movement, on the basis of which presence (the ground of philosophical certainty) is opened up. The self-presence of subjectivity, in which certainty is established, is inseparable from an experience of time, and the structural essentialities of the experience of time are marked by the trace. The trace is more originary than the primordiality of phenomenological experience, because it is what makes that experience possible in the first place. From this it follows, Derrida claims, that Husserl’s project of locating a moment of pure presence will be necessarily thwarted, because in attempting to rigorously think that moment through, we have found hiding behind it this strange structure, signifying a movement and hence not a ‘this,’ that Derrida calls différance.

Using Derrida’s terminology, (which we shall presently dissect), we can say that différance is the non-originary, constituting-disruption of presence. Let us take this bit by bit. Différance is constituting—this signifies, as we saw in Husserl, that it is on the basis of this movement that presence is constituted. Derrida claims that it is the play of différance that makes possible all modes of presence, including the binary categories and concepts in accordance with which philosophy, since Plato, has conducted itself. That it is constitutive does not, however, mean that it is originary, at least not without qualification. To speak of origins, for Derrida, implies an engagement with a presumed moment of innocence or purity, in other words, a moment of presence, from which our efforts at meaning have somehow fallen away. Derrida says, rather, that différance is a “non-origin which is originary.” (“Freud and the Scene of Writing,” Writing and Difference, 203) Différance is, in a sense, an origin, but one that is already, at the origin, contaminated; hence it is a non-originary origin.

At the same time, because différance as play and movement always underlies the constitutive functioning of philosophical concepts, it likewise always attenuates this functioning, even as it constitutes it. Différance prevents philosophical concepts from ever carrying out fully the operations that their author intends them to carry out. This constituting-disruption is the source of one of Derrida’s more (in)famous descriptions of the deconstructive project: “But this condition of possibility turns into a condition of impossibility.” (Of Grammatology, 74)

Différance attenuates both senses of presence to which we referred above: (1) Spatially, it differs, creating spaces, ruptures, chasms, and differences, rather than the desired telos of absolute proximity; (2) Temporally, it defers, delaying presence from ever being fully attained. Thus it is that Derrida, capitalizing on the “two motifs of the Latin differre…,” will understand différance in the dual sense of differing and deferring. (“Différance,” in Margins of Philosophy, 8)

Derrida is therefore a differential ontologist in that, through his critiques of the metaphysical tradition, he attempts to think the fundamental explicitly on the basis of a differential structure, reading the canonical texts of the philosophical tradition both with and against the intentions of their authors. In so doing, he attempts to expose the play of force underlying the constitution of meaning, and thereby, to open new trajectories of thinking, rethinking the very concept of the concept, and forging a path “toward the unnameable.” (Voice and Phenomenon, 66)

b. Gilles Deleuze as a Differential Ontologist

As early as 1954, in one of Deleuze’s first publications (a review of Jean Hyppolite’s book, Logic and Existence), Deleuze stresses the urgency of an ontology of pure difference, one that does not rely upon the notion of negation, because negation is merely difference pushed to its outermost limit. An ontology of pure difference means an attempt to think difference as pure relation, rather than as a ‘not-’.

The reason for this urgency, as we saw above, is that the task of philosophy is to have a direct relation to things, and in order for this to happen, philosophy needs to grasp the thing itself, and this means, the thing as it differs from everything else that it is not, the thing as it essentially is. Reason, he claims, must reach down to the level of the individual. In other words, philosophy must attempt to think what he calls internal difference, or difference internal to Being itself. Above we noted that however many representational concepts we may adduce in order to characterize a thing, (our example above was a ball), so long as we are using concepts rooted in identity to grasp it, we will still be unable to think down to the level of what makes this, this. Therefore, philosophy suffers from an essential imprecision, which only differential ontology can repair.

This imprecision is rooted, Deleuze argues, in the Aristotelian moment, specifically in Aristotle’s metaphysics of analogy. Aristotle’s notion of metaphysics as first philosophy is that, while other sciences concentrate on one or another domain of being, metaphysics is to be the science of being qua being. Unlike every other science (which all study some genus or species of being), the object of the metaphysical science, being qua being, is not a genus. This is because any given genus is differentiated (by way of differences) into species. Any species, beneath its given genus, is fully a member of that genus, but the difference that has so distinguished it is not. Let us take an example: the genus animal. In this case, the difference, we might say, is rational, by which the genus animal is speciated into the species of man and beast, for Aristotle. Both man and beast are equally members of the genus animal, but rational, the differentia whereby they are separated, is not (rationality is not itself an animal). Differentiae cannot belong, properly, to the given genus of which they are differentiae, and yet, differences exist, that is, they have being. Therefore, if differences have being, and the differentiae of any given genus cannot belong, properly speaking, to that genus, it follows that being cannot be a genus (otherwise its own differentiae would both have being and fall outside of being).

Metaphysics therefore cannot be the science of being as such (because, given that there is no genus ‘being,’ there is no being as such), but rather, of being qua being, or, put otherwise, the science that focuses on beings insofar as they are said to be. The metaphysician, Aristotle holds, must study being by way of abstraction—all that is, is said to be in virtue of something common, and this commonality, Aristotle holds, is rooted in a thing’s being a substance, or a bearer of properties. Properties are, no doubt, but they are only to the extent that there is a substance there to bear them. There is no ‘red’ as such; there are only red things. Metaphysics, therefore, is the science that studies primarily the nature of what it means to be a substance, and what it means to be an attribute of a substance, but only insofar as it relates to a substance. All of these other qualities, and along with them, the Aristotelian categories, are related analogically, through substances, to being. This explains Aristotle’s famous dictum: Being is said in many ways. (Metaphysics, IV.2) This is Aristotle’s doctrine known as the equivocity of being. Being is hierarchical in that there are greater and lesser degrees to which a thing may be said to be. In this sense, analogical being is a natural bedfellow of the theological impetus, in that it provides a means of speaking and thinking about God that does not flirt with heresy, that falls into neither of the two following traps: (1) Believing that a finite mind could ever possess knowledge of an infinite being; (2) Thinking that God’s qualities, for instance, his love or his mercy, are to be understood in like fashion to our own. Analogical being easily and naturally allows the Scholastic tradition to hierarchize the scala naturae, the great chain of being, thus creating a space in language and in thought for the ineffable.

Deleuze, however, will argue that the Aristotelian equivocity disallows ontology a genuine concept of Being, of the individual, or of difference itself: “However, this form of distribution commanded by the categories seemed to us to betray the nature of Being (as a cardinal and collective concept) and the nature of the distributions themselves (as nomadic rather than sedentary and fixed distributions), as well as the nature of difference (as individuating difference).” (Difference and Repetition, 269) By ‘cardinal’ and ‘collective,’ respectively, Deleuze means Being as the fundamental nature of what-is (which is the individuating power of differentiation), and being as the whole of what-is. Analogical being, Deleuze objects, cannot think being as such (because being is not a genus, for Aristotle); then, because it posits substances and categories (fixed, rather than fluid, models of the distribution of being), it misses the nature of difference itself (because difference is always ‘filtered’ through a higher concept of identity), as well as the ability to think the individual (because thought can only go as far ‘down’ as the substance, which is always comprehended through our concepts). Thus, ontology, understood analogically, cannot do what it is designed to do: think being.

To think a genuine concept of difference (and hence, being) requires an ontology which abandons the analogical model of being and affirms the univocity of being, that being is said in only one sense of everything of which it is said. Here, Deleuze cites three key figures that have made such a transformation in thinking possible: John Duns Scotus, Benedict de Spinoza, and Friedrich Nietzsche.

Deleuze calls Duns Scotus’ book, the Opus Oxoniense, “the greatest book of pure ontology.” (Difference and Repetition, 39) Scotus argues against Thomas Aquinas, and hence against the Aristotelian doctrine of equivocity, in an effort to salvage the reliability of proofs for God’s existence, rooted, as they must be, in the experiences and the minds of human beings. Any argument that draws a conclusion about the being of God on the basis of some fact about his creatures presupposes, Scotus thinks, the univocal expression of the term ‘being.’ If the being of God is wholly other than, or even merely analogous to, the being of man, then the relation of the given premises to the conclusion, “God exists,” loses all validity. While, as noted, understanding being analogically affords the theological tradition a handy way of keeping the creation distinct from its creator, this same distance, Scotus thinks, also keeps us from truly possessing natural knowledge about God. So in order to save that knowledge, Scotus had to abolish the analogical understanding of being. However, Deleuze notes, in order to keep from falling into a heresy of another sort (namely, pantheism, the conviction that everything is God), Scotus had to neutralize univocal being. The abstract concept of being precedes the division into the categories of ‘finite’ and ‘infinite,’ so that, even if being is univocal, God’s being is nevertheless distinguished from man’s by way of God’s infinity. Being is therefore univocal, but neutral and abstract, not affirmed.

Spinoza offers the next step in the affirmation of univocal being. Spinoza creates an elaborate ontology of expression and immanent causality, consisting of substance, attributes, and modes. There is but one substance, God or Nature, which is eternal, self-caused, necessary, and absolutely infinite. A given attribute is conceived through itself and is understood as constituting the essence of the substance, while a mode is an affection of the substance, a way the substance is. God is absolutely free because there is nothing outside of Nature which could determine God to act in such and such a way, so God expresses himself in his creation from the necessity of his own nature. God’s nature is his power, and his power is his virtue. It is sometimes said that Spinoza’s God is not the theist’s God, and this is no doubt true; but it is equally true that Spinoza’s Nature is not the naturalist’s nature, because Spinoza equates Nature and God; in other words, he makes the world of Nature an object of worshipful admiration, or affirmation. Thus Spinoza takes the step that Scotus could not; nevertheless, Spinoza does not quite complete the transformation to the univocity of being, for he leaves intact the priority of the substance over its modes. Modes can be thought only through substance, but the converse is not true. Thus, for all the great strides Spinoza takes toward the immanentizing of philosophy, he leaves a tiny bit of transcendence untouched. Deleuze writes, “substance must itself be said of the modes and only of the modes.” (Difference and Repetition, 40)

This shift is made, Deleuze claims, with Nietzsche’s notion of eternal return. Deleuze understands Nietzsche’s eternal return in the terms of what Deleuze calls the disjunctive synthesis, and in the disjunctive synthesis, we find the in itself of Deleuze’s notion of difference in itself. The eternal return, or the constant returning of the same as the different, constitutes systems, but these systems are, as we saw above, nomadic and fluid, constituted on the basis of disjunctive syntheses, each of which is a differential communication between two or more divergent series.

Given in experience, Deleuze claims, is diversity—not difference as such, but differences, different things, limits, oppositions, and so forth. But this experience of diversity presupposes, Deleuze claims, “a swarm of differences, a pluralism of free, wild, untamed differences; a properly differential and original space and time.” (Difference and Repetition, 50) In other words, the perceived, planar effects of limitations and oppositions presuppose a pure depth or a sub-phenomenal play of constitutive difference. Insofar as these differences are the conditions of space and time as we sense them, the depth Deleuze is looking for is itself an imperceptible spatio-temporality. This depth (or ‘pure spatium’), he claims, is where difference, singularities, series, and systems relate and interact. It is necessary, Deleuze claims, because the developmental and differential processes whereby living systems are constituted take place so rapidly and violently that they would tear apart a fully-formed being. Deleuze, comparing the world to an egg, argues that there are “systematic vital movements, torsions and drifts, that only the embryo can sustain: an adult would be torn apart by them.” (Difference and Repetition, 118)

At the embryonic level are series, with each series being defined by the differences between its terms, rather than by the terms themselves. Indeed, we should say that its terms are themselves differences, or what Deleuze calls intensities: “Intensities are implicated multiplicities, ‘implexes,’ made up of relations between asymmetrical elements.” (Difference and Repetition, 244) An intensity is essentially an energy, but an energy is always a difference or an imbalance, folded in on itself, an essentially elemental asymmetry. These intensities are the terms of a given series. The terms, however, are themselves related to other terms, and through these relations, these intensities are continuously modified. An intensity is an embryonic quantity in that its own internal resonance, which is constitutive of higher levels of synthesis and actualization, pulsates in a pure speed and time that would devastate a constituted being; it is for this reason that qualities and surface phenomena can only come to be on a plane in which difference as such is cancelled or aborted. In explicating its implicated differences, the system, so constituted, cancels out those differences, even if the differential ground rumbles beneath.

A system is formed whenever two or more of these heterogeneous series communicate. Insofar as each series is itself constituted by differences, the communication that takes place between the two heterogeneous series is a difference relating differences, a second-order difference, which Deleuze calls the differenciator: in relating, these differences differenciate the first-order differences. This relation takes place through what Deleuze calls the dark precursor, which he compares to the negative path cleared for a bolt of lightning. Once this communication is established, the system explodes: “coupling between heterogeneous systems, from which is derived an internal resonance within the system, and from which in turn is derived a forced movement the amplitude of which exceeds that of the basic series themselves.” (Difference and Repetition, 117) The constitution of a system thus results in what we referred to above as the explication of qualities, which are themselves the results of a cancelled difference. Thus, Deleuze claims, the compounding or synthesis of these systems, series, and relations is the introduction of spatio-temporal dynamisms, which are themselves the sources of qualities and extensions.

The whole of being, ultimately, is a system, connected by way of these differential relations—difference relating to difference through difference—the very nature of Deleuze’s disjunctive synthesis. Difference in itself teems with vitality and life. As systems differentiate, their differences ramify throughout the system, and in so doing, series form new series, and new systems, de-enlisting and redistributing the singular points of interest and their constitutive and corresponding nomadic relations, which are themselves implicating, and, conversely, explicated in the phenomenal realm. The disjunctive synthesis is the affirmative employment of the creativity brought about by the various plays of differences. Against Leibniz’s notion of compossibility is opposed the affirmation of incompossibility. Leibniz defined the perfection of the cosmos in terms of its compossibility, and this in terms of a maximization of continuity. The disjunctive synthesis brings about the communication and cooperative disharmony of divergent and heterogeneous series; it does not, thereby, cancel the differences between them. Where incompossibility for Leibniz was a means of exclusion (an infinity of possible worlds excluded from reality on the basis of their incompossibility), in the hands of Deleuze it becomes a means of opening the thing to the possible infinity of events. It is in this sense that Nietzsche’s eternal return is, for Deleuze, the affirmation of the return of the different, and hence the affirmation of all of chance, all at once.

Finally, the eternal return is the name that Deleuze gives to the pulsating-contracting temporality according to which the pure spatium or depth differenciates. In Chapter 2 of Difference and Repetition, Deleuze employs the Husserlian discussion of the living present. However, unlike Derrida, Deleuze will simply discard Husserl’s notion of the primal impression—this term never makes an appearance in Difference and Repetition. Deleuze will argue that, with respect to time, the present is all there is, but the present itself is nothing more than the relation of retention to protention. If the present were truly present (in the sense of a self-contained kernel of time, like Husserl’s primal impression), then, just as Aristotle noted, the present could never pass, because in order to pass, it would have to pass within another moment. The present can only pass, Deleuze claims, because the past is already in the present, reaching through the present, into the future, drawing the future into itself. Time, in other words, is nothing more than the contraction of past and future. The present, therefore, is the beating heart of difference in itself, but it is a present constituted on the basis of a differentiation.

Deleuze is therefore a differential ontologist in that he attempts to formulate a notion of difference that is: (1) The constitutive play of forces underlying the constitution of identities; (2) Purely relational, that is, non-negational, and hence, not in any way subordinate to the principle of identity. It is an ontology that attempts to think the conditions of identity but in such a way as to not recreate the presuppositions surrounding identity at the level of the conditions themselves—it is an ontology that attempts to think the conditions of real experience, the world as it is lived.

4. Conclusion

To briefly sum up, we can say that Derrida and Deleuze are the two key differential ontologists in the history of philosophy. While others before them were indeed thinkers of multiplicity, as opposed to thinkers of identity, none so rigorously as Derrida and Deleuze came to the conclusion that what was required in order to truly think multiplicity was an explicit formulation of a concept of difference in itself. One of the key distinctions between the two of them, which explains many of their other differences, is their respective attitudes toward fidelity to the tradition of philosophy. Derrida understands fidelity to the tradition in the sense of embracing its presuppositions and prejudices, using them, ultimately, to think beyond the tradition, but in such a way as to speak constantly of the end of metaphysics, and ultimately to eschew the adoption of traditional philosophical terms such as ontology, being, and so forth. Deleuze, on the contrary, understands fidelity to the philosophical tradition in the sense of embracing what philosophy has always sought to do, namely, to think the fundamental, and will, in the name of this task, happily discard any presuppositions or prejudices that the tradition has attempted to bestow. So, Derrida will, for instance, claim that the founding privilege of presence is not up for grabs in philosophy, while at the same time avoiding terms like experience, being, ontology, concept, and so forth; Deleuze, by contrast, will claim that it is precisely the emphasis on presence (in the form of representational concepts and categories) that has kept philosophy from living up to its task. Therefore, the prejudice should be discarded. But that does not mean, Deleuze will argue, that we should give up metaphysics. If the old metaphysics no longer works, throw it out, and build a new one, from the ground up, if need be, but a new metaphysics, all the same.

5. References and Further Reading

The following is an annotated list of the key sources in which the differential ontologies of Derrida and Deleuze are formulated.

a. Primary Sources

  • Deleuze, Gilles. Bergsonism. Translated by Hugh Tomlinson and Barbara Habberjam. New York: Zone Books, 1988.
    • Henri Bergson was a French “celebrity” philosopher in the early twentieth century, whose philosophy had very much fallen out of favor in France by the mid to late 1920s, as Husserlian phenomenology began to work its way into Paris. Deleuze’s 1966 text on Bergson played no small part in bringing Bergson back into fashion. In this text, Deleuze analyzes the Bergsonian notions of durée, memory, and élan vital, demonstrating the consistent trajectory of Bergson’s work from beginning to end, and highlighting the centrality of Bergson’s notion of time for the entirety of Deleuze’s thought.
  • Deleuze, Gilles. Desert Islands and Other Texts (1953-1974). Ed. David Lapoujade, Trans. Michael Taormina. Los Angeles and New York: Semiotext(e), 2004.
    • This text contains many of Deleuze’s most important essays from his philosophically “formative” years. It contains his 1954 review of Jean Hyppolite’s Logic and Existence, as well as very early essays on Bergson. In addition, it contains interviews and round-table discussions surrounding the period leading up to and following the 1968 release of Difference and Repetition.
  • Deleuze, Gilles. Difference and Repetition. Translated by Paul Patton. New York: Columbia University Press, 1994.
    • This is without question the most important book Deleuze ever wrote. It was his principal thesis for the Doctorat D’Etat in 1968, and it is in this text that the various elements that had emerged from his book-length engagements with other philosophers over the years come together into the critique of representation and the formulation of a differential ontology.
  • Deleuze, Gilles. Expressionism in Philosophy: Spinoza. Translated by Martin Joughin. New York: Zone Books, 1992.
    • This book was his secondary, historical thesis in 1968, though there is good evidence that it was actually first drafted in 1961-62, around the same time as the book on Nietzsche. The concept of expression, here analyzed in Spinoza, plays a very important role for Deleuze, and it is Spinoza who provides an alternative ontology of immanence which, contrary to that of Hegel (quite prominent in mid-1960s France), does not rely upon the movement of negation, a concept that, for Deleuze, does not belong in the domain of ontology. As we saw in the article, Being is itself creative, and hence an object of affirmation, and to make negation an integral element in one’s ontology is, for Deleuze, antithetical to affirmation. Deleuze emphasizes expression, rather than negation. This emphasis already plays a role in the 1954 review of Hyppolite (mentioned above).
  • Deleuze, Gilles. Kant’s Critical Philosophy. Translated by Hugh Tomlinson and Barbara Habberjam. Minneapolis: University of Minnesota Press, 1984.
    • In this 1963 text, Deleuze looks at the philosophy of Immanuel Kant through the lens of the third Critique, The Critique of Judgment, arguing that Kant (thanks in large part to Salomon Maimon) recognized the problems in the first two Critiques, and began attempting to correct them at the end of his life. In this text Deleuze examines Kant’s notion of the faculties, highlighting that by the time of the third Critique (and unlike its two predecessors), the faculties are in a discordant accord—none legislating over the others.
  • Deleuze, Gilles. The Logic of Sense. Translated by Mark Lester with Charles Stivale. Edited by Constantin V. Boundas. New York: Columbia University Press, 1990.
    • This text is interesting for a number of reasons. First, published in 1969, just one year following Difference and Repetition, it explores many of the same themes as Difference and Repetition, such as becoming, time, the eternal return, singularities, and so forth. At the same time, other themes, such as the event, become centrally important, and these themes are now explored through the notion of sense. In a way it is in part a reorientation of Difference and Repetition. But one of the most important points to note is that the notion of the fractal subject is brought into conversation with psychoanalytic theory, and thus this text forms a bridge between Deleuze’s earlier philosophical works and his political works with Félix Guattari, the first of which, Anti-Oedipus, was released just three years later, in 1972. In addition, the appendices in this volume include important essays on the reversal of Platonism, on the Stoics, and on Pierre Klossowski.
  • Deleuze, Gilles. Nietzsche and Philosophy. Translated by Hugh Tomlinson. New York: Columbia University Press, 1983.
    • This book, from 1962, was Deleuze’s second book. It may be a stretch to say that this book is the second most important book in Deleuze’s corpus. Nevertheless, it is certainly high on the list of importance. Here we see, in their formational expressions, some of the motifs that will dominate all of Deleuze’s works up through The Logic of Sense. Among them are the rejection of negation, the articulation of the eternal return, the incompleteness of the Kantian critique, the purely relational ontology of force, which is the will to power, and so forth. This book lays out in very raw and accessible (if somewhat underdeveloped) form some of the most significant criticisms later developed in Difference and Repetition.
  • Derrida, Jacques. Dissemination. Translated by Barbara Johnson. Chicago: The University of Chicago Press, 1981.
    • This text was published in 1972 (along with Positions and Margins of Philosophy) as part of Derrida’s second publication blitz (the first being in 1967). In addition to the titular essay, it contains the very important essay on Plato, “Plato’s Pharmacy.” Like much of what Derrida wrote, this text is extremely difficult, and probably not a good starter text for Derrida. Nonetheless, it is one of his more important books.
  • Derrida, Jacques. Edmund Husserl’s Origin of Geometry: An Introduction. Lincoln, Nebraska: University of Nebraska Press, 1989.
    • In 1962, Derrida translated an essay from very late in Husserl’s career, “The Origin of Geometry.” He also wrote a translator’s introduction to the essay—Husserl’s essay is about 18 pages long, while Derrida’s introduction is about 150 pages in length. This essay is of particular interest because it deals with the problem of how the ideal is constituted in the sphere of the subject, a problem that will occupy Derrida up through the period of Voice and Phenomenon and beyond.
  • Derrida, Jacques. Margins of Philosophy. Translated by Alan Bass. Chicago: The University of Chicago Press, 1982.
    • This book is a collection of essays from the late 1960s up through the early 1970s, and is another from the 1972 publication blitz. It is very important for numerous reasons, but in no small part because it is here that his engagement with the work of Martin Heidegger (which will occupy him throughout the remainder of his career) robustly begins. It contains some of the most important essays from Derrida’s mature work. Among them are the famous “Différance” essay, “The Ends of Man,” “Ousia and Grammē: Note on a Note from Being and Time,” “White Mythology” (on the essential metaphoricity of language), and “Signature Event Context,” known for spawning the famous debate with John Searle over the work of J.L. Austin.
  • Derrida, Jacques. Of Grammatology. Translated by Gayatri Chakravorty Spivak. Baltimore and London: The Johns Hopkins University Press, 1974.
    • This book is one from the 1967 publication blitz, and is probably Derrida’s most famous work, among philosophers and non-philosophers alike. In addition, it is really one of the only book-length pieces Derrida wrote that is (at least in its first part) merely programmatic and expository, as opposed to the prolonged engagements he typically undertakes with a particular text or thinker (the second part of the book is such an engagement, primarily with the thought of Jean-Jacques Rousseau). In this book we get extended discussions of the history of metaphysics, logocentrism, the privilege of voice over writing in the tradition, différance, trace, and supplementarity. It is also in this text that the infamous quote, “There is no outside-text,” is found.
  • Derrida, Jacques. Positions. Translated by Alan Bass. Chicago: The University of Chicago Press, 1981.
    • This book is from the 1972 group, and is a very short collection of interviews. It is highly recommended as a good starting-point for those approaching Derrida for the first time. Here Derrida lays out in very straightforward, programmatic terms, the stakes of the deconstructive project, unencumbered by deep textual analysis.
  • Derrida, Jacques. Voice and Phenomenon. Trans. Leonard Lawlor. Evanston: Northwestern University Press, 2011.
    • This is probably the single most important book that Derrida wrote. It is, like most of the rest of his work, a textual engagement with Husserl. Nevertheless, Derrida concentrates on a few key passages of Husserl’s works, most of which are cited in the body of Derrida’s work, such that a very basic knowledge of the Husserlian project makes reading Derrida’s text quite possible. It is in his engagement with Husserl, and the deconstruction of consciousness, that Derrida formulates the key concepts of différance, trace, and supplementarity that will govern the direction of his work for most of the rest of his career. Derrida himself claims (in Positions) that this is his favorite of his books.
  • Derrida, Jacques. Writing and Difference. Translated by Alan Bass. Chicago: The University of Chicago Press, 1978.
    • This book is from the 1967 publication blitz, and it is a collection of Derrida’s early essays up through the mid-1960s. Here we get a very important essay on Michel Foucault that spawned the bitter fallout between the two for many years, as well as the essay on Emmanuel Levinas that reoriented Levinas’ thinking and demonstrated the broad and deep knowledge that Derrida had of the phenomenological tradition. This latter essay, “Violence and Metaphysics,” highlights the important role that Levinas will play in Derrida’s own thinking. There is also a very important essay on Freud and the trace, which is contemporaneous with the writing of Voice and Phenomenon.

b. Secondary Sources

There are many terrific volumes of secondary material on these two thinkers; here a few of the most relevant are selected, with respect to the themes explored in this article.

  • Bell, Jeffrey A. Philosophy at the Edge of Chaos. Toronto: University of Toronto Press, 2006.
  • Bell, Jeffrey A. The Problem of Difference: Phenomenology and Poststructuralism. Toronto: University of Toronto Press, 1998.
    • Both of Bell’s books deal with the question of difference, and situate primarily Deleuze, but Derrida as well, within the larger context of the history of philosophical attempts to think about difference, including Plato, Aristotle, Whitehead, Descartes, and Kant.
  • Bogue, Ronald. Deleuze and Guattari. London and New York: Routledge, 1989.
    • Bogue’s text was one of the earlier attempts to write a comprehensive introductory text that took into account Deleuze’s historical engagements, along with his own philosophical articulations of his concepts, and the later political and aesthetic texts as well. This text is still a ‘standard’.
  • Bryant, Levi. Difference and Givenness: Deleuze’s Transcendental Empiricism and the Ontology of Immanence. Evanston: Northwestern University Press, 2008.
    • Bryant’s is one of the best books on Deleuze in recent years. It focuses primarily on Difference and Repetition, but examines the concept of transcendental empiricism, what it means to attempt to think the conditions of real experience, through the lens of Deleuze’s previous interactions with the thinkers who informed the research of Difference and Repetition.
  • De Beistigui, Miguel. Truth and Genesis: Philosophy as Differential Ontology. Bloomington and Indianapolis: Indiana University Press, 2004.
    • De Beistigui, whose early works centered on the philosophy of Martin Heidegger, argues in this book that Deleuze completes the Heideggerian attempt to think difference, in that it is Deleuze who overcomes the humanism that Heidegger never quite escapes.
  • Descombes, Vincent. Modern French Philosophy. Trans. L. Scott-Fox and J.M. Harding. Cambridge: Cambridge University Press, 1980.
    • Descombes’ book was one of the first to attempt to capture the heart of French philosophy in the wake of Bergson. It is still one of the best, especially given its brevity. Chapters 5 and 6 are particularly relevant.
  • Gasché, Rodolphe. The Tain of the Mirror: Derrida and the Philosophy of Reflection. Cambridge and London: Harvard University Press, 1986.
    • Gasché’s text was one of the earliest and most forceful attempts to situate Derrida’s thinking in the context of an explicit philosophical problem, the problem of critical reflection, complete with a long philosophical history. In the Anglophone world, Derrida’s work was at this time explored mostly in the context of Literary Theory departments. While that gave Derrida something of a ‘head start’ over the reception of Deleuze in the United States, it also created the unfortunate impression, throughout academic philosophy, that Derrida’s project was one of merely ‘playing’ with canonical texts. Gasché’s book corrected that misconception, and to this day it remains one of the best texts for understanding the overall thrust of Derrida’s project.
  • Hägglund, Martin. Radical Atheism: Derrida and the Time of Life. Stanford: Stanford University Press, 2008.
    • Hägglund’s book enters into the oft-discussed theme of Derrida’s so-called religious turn in his later period. It is one of the best recent books on Derrida’s thinking, taking up the implications of a fully immanent analysis of temporality.
  • Hughes, Joe. Deleuze and the Genesis of Representation. London: Continuum International Publishing Group, 2008.
    • This book is one of the first to take up a close comparison of Deleuze with the philosophy of Husserl, boldly arguing that Deleuze’s project is marked from start to finish with a certain phenomenological impetus.
  • Lawlor, Leonard. Derrida and Husserl: The Basic Problem of Phenomenology. Bloomington and Indianapolis: Indiana University Press, 2002.
    • Lawlor was one of the first American scholars to emphasize the centrality of Husserl to Derrida’s work. This book is particularly helpful because, inasmuch as it deals with the entirety of Derrida’s engagement with Husserl (from 1953 up through Voice and Phenomenon and beyond), it provides a rigorous but accessible explication of the early formation of Derrida’s project. It is without question one of the best books on Derrida available.
  • Marrati, Paola. Genesis and Trace: Derrida Reading Husserl and Heidegger. Stanford: Stanford University Press, 2005.
    • This book, first published in French in 1998, explores the origins of the deconstructive project in Derrida’s engagement with the phenomenological tradition, and more specifically, with the question of time. It demonstrates and articulates the lasting significance of the Husserlian problematic up through Derrida’s immersion into Heidegger’s thought, and beyond.
  • Smith, Daniel W. Essays on Deleuze. Edinburgh: Edinburgh University Press, 2012.
    • Smith has for years been one of the leading voices in the world when it comes to Deleuze, and this 2012 volume at last collects all of his most important essays on Deleuze, along with a few new ones. Of particular interest are the essays on the simulacrum, on univocity, and on the conditions of the new, as well as the comparative essay on Deleuze and Derrida.

c. Other Sources Cited in this Article

  • Aristotle. Complete Works of Aristotle: The Revised Oxford Translation, Vol. 1. Ed. Jonathan Barnes. Princeton: Princeton University Press, 1984.
  • Aristotle. Complete Works of Aristotle: The Revised Oxford Translation, Vol. 2. Ed. Jonathan Barnes. Princeton: Princeton University Press, 1984.
  • Duns Scotus. Philosophical Writings—A Selection. Trans. Allan B. Wolter. Indianapolis: Hackett Publishing Company, 1987.
  • Heraclitus. Fragments: A Text and Translation with a Commentary By T.M. Robinson. Toronto: University of Toronto Press, 1987.
  • Husserl, Edmund. The Crisis of European Sciences and Transcendental Phenomenology: An Introduction to Phenomenological Philosophy. Trans. David Carr. Evanston: Northwestern University Press, 1970.
  • Husserl, Edmund. Ideas Pertaining to a Pure Phenomenology and to a Phenomenological Philosophy, First Book. Trans. F. Kersten. Dordrecht, Boston, and London: Kluwer Academic Publishers, 1998.
  • Husserl, Edmund. Logical Investigations Volume I. Ed. Dermot Moran. Trans. J.N. Findlay. New York: Routledge, 1970.
  • Husserl, Edmund. On the Phenomenology of the Consciousness of Internal Time (1893-1917). Trans. John Barnett Brough. Dordrecht, Boston, and London: Kluwer Academic Publishers, 1991.
  • Hyppolite, Jean. Logic and Existence. Trans. Leonard Lawlor and Amit Sen. Albany: State University of New York Press, 1997.
  • Parmenides. Fragments: A Text and Translation with an Introduction By David Gallop. Toronto: University of Toronto Press, 1984.
  • Plato. Complete Works. Ed. John M. Cooper. Associate Ed. D.S. Hutchinson. Indianapolis: Hackett Publishing Company, 1997.
  • Spinoza. Complete Works. Ed. Michael L. Morgan. Trans. Samuel Shirley. Indianapolis: Hackett Publishing Company, 2002.


Author Information

Vernon W. Cisney
Email: vcisney@gmail.com
Gettysburg College
U. S. A.

Internalism and Externalism in the Philosophy of Mind and Language

This article addresses how our beliefs, our intentions, and other contents of our attitudes are individuated, that is, what makes those contents what they are. Content externalism (henceforth externalism) is the position that our contents depend in a constitutive manner on items in the external world, that they can be individuated by our causal interaction with the natural and social world. In the 20th century, Hilary Putnam, Tyler Burge and others offered Twin Earth thought experiments to argue for externalism. Content internalism (henceforth internalism) is the position that our contents depend only on properties of our bodies, such as our brains. Internalists typically hold that our contents are narrow, insofar as they locally supervene on the properties of our bodies or brains.

Although externalism is the more popular position, internalists such as David Chalmers and Gabriel Segal have developed versions of narrow content that may not be vulnerable to typical externalist objections. This article explains the variety of positions on the issues and explores the arguments for and against the main positions. For example, externalism incurs problems of privileged access to our contents as well as problems about mental causation and psychological explanation, but externalists have offered responses to these objections.

Table of Contents

  1. Hilary Putnam and Natural Kind Externalism
  2. Tyler Burge and Social Externalism
  3. Initial Problems with the Twin Earth Thought Experiments
  4. Two Problems for Content Externalism
    1. Privileged Access
    2. Mental Causation and Psychological Explanation
  5. Different Kinds of Content Externalism
  6. Content Internalism and Narrow Content
    1. Jerry Fodor and Phenomenological Content
    2. Brian Loar and David Chalmers on Conceptual Roles and Two-Dimensional Semantics
    3. Radical Internalism
  7. Conclusion
  8. References and Further Reading

1. Hilary Putnam and Natural Kind Externalism

In “The Meaning of ‘Meaning’” and elsewhere, Putnam argues for what might be called natural kind externalism. Natural kind externalism is the position that our natural kind terms (such as “water” and “gold”) mean what they do because we interact causally with the natural kinds that these terms are about. Interacting with natural kinds is a necessary (but not sufficient) condition for meaning (Putnam 1975, 246; Putnam 1981, 66; Sawyer 1998, 529; Nuttecelli 2003, 172; Norris 2003, 153; Korman 2006, 507).

Putnam asserts that the traditional Fregean theory of meaning leaves a subject “as much in the dark as it ever was” (Putnam 1975, 215) and that it, and any descriptivist heirs to it, are mistaken in all their forms. To show this, Putnam insists that Frege accounts for the sense, intension, or meaning of our terms by making two assumptions:

  1. The meaning of our terms (for example, natural kind terms) is constituted by our being in a certain psychological state.
  2. The meaning of such a term determines its extension (Putnam 1975, 219).

Putnam concedes that assumptions one and two may seem appealing, but the conjunction of these assumptions is “…not jointly satisfied by any notion, let alone the traditional notion of meaning.” Putnam suggests abandoning the first while retaining a modified form of the second.

Putnam imagines that somewhere there is a Twin Earth. Earthlings and Twin Earthlings are exact duplicates (down to the last molecule) and have the same behavioral histories. Even so, there is one difference: the substance that we call water on Twin Earth does not consist of H2O, but of XYZ. This difference has not affected how the inhabitants of either planet use their respective liquids, but the liquids have different “hidden structures” (Putnam 1975, 235, 241; Putnam 1986, 288; Putnam 1988, 36). Putnam then imagines that it is the year 1750, before the development of chemistry on either planet. At that time, no experts knew of the hidden structure of water or of its twin, twater. Still, Putnam says, when the inhabitants of Earth use the term “water,” they still refer to the liquid made of H2O; when the inhabitants of Twin Earth do the same, they refer to their liquid made of XYZ (Putnam 1975, 270).

The reason this is so, according to Putnam, is that natural kind terms like water have an “unnoticed indexical component.” Water, he says, receives its meaning by our originally pointing at water, such that water is always water “around here.” Putnam admits this same-liquid relation “…is a theoretical relation, and may take an indefinite amount of investigation to determine” (Putnam 1975, 234). Given this, earthlings and their twin counterparts use their natural kind terms (for example, “water,” “gold,” “cat,” and so forth) to refer to the natural kinds of their respective worlds (for example, on Earth, “water” refers to H2O, but on Twin Earth it refers to XYZ). In other words, on each planet, twins refer to the “hidden structure” of the liquid of that planet. Thus, Putnam concludes that the first Fregean assumption is false, for while twins across planets are in the same psychological state, when they say “water,” they mean different things. However, understood correctly, the second assumption is true; when the twins say “water,” their term refers to the natural kind (that is, the kind with the proper structural and chemical properties) of their world, and only to that kind.

Natural kind externalism, then, is the position that the meanings of natural kind terms are not determined by psychological states, but are determined by causal interactions with the natural kind itself (that is, kinds with certain structural properties). “Cut the pie any way you like,” Putnam says, “Meaning just ain’t in the head” (Putnam 1975, 227). However, soon after Putnam published “The Meaning of ‘Meaning,’” Colin McGinn and Tyler Burge noted that what is true of linguistic meaning extends to the contents of our propositional attitudes. As Burge says, the twins on Earth and Twin Earth “…are in no sense duplicates in their thoughts,” and this is revealed by how we ascribe attitudes to our fellows (Burge 1982b, 102; Putnam 1996, viii). Propositional attitude contents, it seems, are also not in our heads.

2. Tyler Burge and Social Externalism

In “Individualism and the Mental,” “Two Thought Experiments Reviewed,” “Other Bodies,” and elsewhere Tyler Burge argues for social externalism (Burge, 1979, 1982a, 1982b). Social externalism is the position that our attitude contents (for example, beliefs, intentions, and so forth) depend essentially on the norms of our social environments.

To argue for social externalism, Burge introduces a subject named Bert. Bert has a number of beliefs about arthritis, many of which are true, but in this case, falsely believes that he has arthritis in his thigh. While visiting his doctor, Bert says, “I have arthritis.” After hearing Bert’s complaint, the doctor informs his patient that he is not suffering from arthritis. Burge then imagines a counterfactual subject – let us call him Ernie. Ernie is physiologically and behaviorally the same as Bert (non-intentionally described), but was raised in a different linguistic community. In Ernie’s community, “arthritis” refers to a disease that occurs in the joints and muscles. When Ernie visits his doctor and mentions arthritis, he is not corrected.

Burge notes that although Ernie is physically and behaviorally the same as Bert, he “…lacks some – perhaps all – of the attitudes commonly attributed with content clauses containing the word ‘arthritis…’” Given that arthritis is part of our linguistic community and not his, “it is hard to see how the patient could have picked up the notion of ‘arthritis’” (Burge 1979, 79; Burge 1982a, 286; Burge 1982b, 109). Although Bert and Ernie have always been physically and behaviorally the same, they differ in their contents. However, these differences “…stem from differences outside the patient.” As a result, Burge concludes: “it is metaphysically or constitutively necessary that in order to have a thought about arthritis one must be in relations to others who are in a better position to specify the disease” (Burge 1988, 350; Burge 2003b, 683; Burge 2007, 154).

Unlike Putnam, Burge does not restrict his externalism to natural kinds, or even to obvious social kinds. Even when our fellows incompletely understand terms, typically we “take discourse literally” (Burge 1979, 88). In such situations, we still have “…assumed responsibility for communal conventions governing language symbols,” and, given this, maintain the beliefs of our home communities (Burge 1979, 114). Burge states that this argument

…does not depend, for example, on the kind of word “arthritis” is. We could have used an artifact term, an ordinary natural kind word, a color adjective, a social role term, a term for a historical style, an abstract noun, an action verb, a physical movement verb, or any of other various sorts of words (Burge 1979, 79).

In other papers, Burge launches similar thought experiments to extend his externalism to cases where we have formed odd theories about artifacts (for example, where we believe sofas to be religious artifacts), to the contents of our percepts (for example, to our perceptions of, say, cracks or shadows on a sidewalk), and even to our memory contents (Burge 1985, 1986, 1998). Other externalists—William Lycan, Michael Tye, and Jonathan Ellis among them—rely on similar thought experiments to extend externalism to the contents of our phenomenal states, or seemings (Lycan 2001; Tye 2008; Ellis 2010). Externalism, then, can seem to generalize to all contents.

3. Initial Problems with the Twin Earth Thought Experiments

Initially, some philosophers found Putnam’s Twin Earth thought experiments misleading. In an early article, Eddy Zemach pointed out that Putnam had only addressed the meaning of water in 1750, “…before the development of modern chemistry” (Zemach 1976, 117). Zemach questions whether twins pointing to water “around here” would isolate the right liquids (that is, H2O on Earth and XYZ on Twin Earth). No, such pointing would not have done this, especially long ago; Putnam just stipulates that “water” on Earth meant H2O and on Twin Earth meant XYZ, and that “water” means the same now (Zemach 1976, 123). D. H. Mellor adds that, in point of fact, “specimens are causally downwind of the usage they are supposed to constrain. They are chosen to fit botanical and genetic knowledge, not the other way around” (Mellor 1977, 305; Jackson 1998, 216; Jackson 2003, 61).

In response to this criticism, externalists have since developed more nuanced formulations of the causal theory of reference, downplaying supposed acts of semantic baptism (for example, “this is water”). These theoretical adjustments can then account for how the meanings of natural kind terms change, or can diminish the role of referents altogether (Sainsbury 2007). Zemach, Mellor, Jackson, and others infer that Putnam should have cleaved to the first Fregean principle (that our being in a certain psychological state constitutes meaning) as well as to the second. By combining both Fregean principles, earthlings and their twins mean the same thing after all.

Similarly, Kent Bach, Tim Crane, Gabriel Segal, and others argue that since Bert incompletely understands “arthritis,” he does not have the contents that his linguistic community does (that is, that the disease refers only to the joints). Given his misunderstanding of “arthritis,” Bert would make odd inferences about the disease, but “these very inferences constitute the evidence for attributing to the patient some notion other than arthritis” (Bach 1987, 267). Crane adds that since Bert and Ernie would have “…all the same dispositions to behavior in the actual and counterfactual situations,” they have the same contents (Crane 1991, 19; Segal 2000, 81). However, Burge never clarifies how much understanding Bert needs in order to have arthritis contents (Burge 1986, 702; Burge 2003a, 276). Burge admits, “when a person incompletely understands an attitude, he has some other content that more or less captures his understanding” (Burge 1979, 95). If Bert does have some other content – perhaps, “tharthritis” – there is little motive to insist that he also has arthritis. Bert and Ernie, it seems, have the same contents.

According to these criticisms, the Twin Earth thought experiments only seem to show that twins across planets, or those raised in different linguistic communities, have differing contents, because the experiments import questionable assumptions: that the causal theory of reference is true, that we refer to hidden structural properties, or that theorists may dictate norms for ascribing contents. Without these assumptions, the Twin Earth thought experiments can easily invoke internalist intuitions. When experimental philosophers tested such thought experiments, they elicited different intuitions. At the very least, these philosophers say, “intuitions about cases continue to underdetermine the selection of the correct theory of reference” (Machery, Mallon, Nichols, and Stich 2009, 341).

4. Two Problems for Content Externalism

a. Privileged Access

René Descartes famously insisted that he could not be certain he had a body: “…if I mean only to talk of my sensation, or my consciously seeming to see or to walk, it becomes quite true because my assertion refers only to my mind.” On this view, privileged access (that is, a priori knowledge of our own contents) would seem to be a truism.

However, as Paul Boghossian has argued, externalism may not be able to honor this apparent truism. Boghossian imagines that we earthlings have slowly been switched from Earth to Twin Earth, unbeknownst to us. As a result, he proposes, externalism would not allow us to discern whether our contents refer to water or twater, to arthritis or tharthritis (Boghossian 1989, 172; Boghossian 1994, 39). As Burge sees the challenge, the problem is that “…[a] person would have different thoughts under the switches, but the person would not be able to compare the situations and note when and where the differences occurred” (Burge 1988, 653; Burge 1998, 354; Burge 2003a, 278).

Boghossian argues that since externalism seems to imply that we cannot discern whether our contents refer to water or twater, arthritis or tharthritis, then we do not have privileged access to the proper contents. Either the approach for understanding privileged access is in need of revision (for example, restricting the scope of our claims to certain a priori truths), or externalism is itself false.

In response, Burge and others claim that the “enabling conditions” (for example, having lived on Earth or Twin Earth) that allow us to have water or twater, arthritis or tharthritis contents need not themselves be known. As long as the enabling conditions are satisfied, first-order contents can be known (for example, “that is water” or “I have arthritis”); this first-order knowledge gives rise to our second-order reflective contents about them. Indeed, our first-order thoughts (that is, about objects) completely determine the second-order ones (Burge 1988, 659). However, as Max de Gaynesford and others have argued, when considering switching cases, one cannot assume that such enabling conditions are satisfied. Since we may be incorrect about such enabling conditions, and may be incorrect about our first-order attitude contents as well, we may be incorrect about our reflective contents too (de Gaynesford 1996, 391).

Kevin Falvey and Joseph Owens have responded that although we cannot discern among first-order contents, doing so is unnecessary. Although the details are “large and complex,” our contents are always determined by the environment where we happen to be, eliminating the possibility of error (Falvey and Owens 1994, 118). By contrast, Burge and others admit that contents are individuated gradually by “how the individual is committed” to objects, and that this commitment is constituted by other factors (Burge 1988, 652; Burge 1998, 352; Burge 2003a, 252). If we were suddenly switched from Earth to Twin Earth, or back again, our contents would not typically change immediately. Externalism, Burge concedes, is consistent with our having “wildly mistaken beliefs” about our enabling conditions (for example, about where we are) and so allows for errors about our first-order contents, as well as about our reflections upon them (Burge 2003c, 344). Because there is such a potential to be mistaken about enabling conditions, perhaps we do need to know them after all.

Other externalists have claimed that switching between Earth and Twin Earth is not a “relevant alternative”. Since such a switch is not relevant, they say, our enabling conditions (that is, conditions over time) are what we take them to be, such that we have privileged access to our first-order contents, and to our reflections on these (Warfield 1997, 283; Sawyer 1999, 371). However, as Peter Ludlow has argued, externalists cannot so easily declare switching cases to be irrelevant (Ludlow 1995, 48; Ludlow 1997, 286). As many externalists concede, relevant switching cases are easy to devise. Moreover, externalists cannot rely on fictional Twin Earth thought experiments to generate externalist intuitions and yet also object to equally fictional switching cases launched to evoke internalist intuitions. Either all such cases of logical possibility are relevant or none of them are (McCulloch 1995, 174).

Michael McKinsey, Jessica Brown, and others offer another set of privileged access objections to externalism. According to externalism, they say, we do have privileged access to our contents (for example, to those expressed by water or twater, or arthritis or tharthritis). Given such access, we should be able to infer, by conceptual implication, that we are living in a particular environment (for example, on a planet with water and arthritis, on a planet with twater and tharthritis, and so forth). Externalism, then, should afford us a priori knowledge of our actual environments (McKinsey 1991, 15; Brown 1995, 192; Boghossian 1998, 208). Since privileged access to our contents does not afford us a priori knowledge of our environments (the latter always remaining a posteriori), externalism implies that we possess knowledge we do not have.

Externalists, such as Burge, Brian McLaughlin and Michael Tye, Anthony Brueckner, and others, have proposed that privileged access to our contents does not afford any a priori knowledge of our environments. Since we can be mistaken about natural kinds in our actual environment (for example, we may even discover that there are no kinds), the facts must be discovered empirically. Indeed, as long as objects exist somewhere, we can form contents about them just by hearing others theorize about them (Burge 1982, 114; Burge 2003c, 344; Brueckner 1992, 641; Gallois and O’Leary-Hawthorne 1996, 7; Ball 2007, 469). Although external items individuate contents, our privileged access to those contents does not afford us a priori knowledge of our environments (such details always remaining a posteriori). However, if we can form contents independent of the facts of our environment (for example, if we can have “x,” “y,” or “z” contents about x, y, or z objects when these objects have never existed around us), then externalism seems to be false in principle. Externalists must explain how this possibility does not compromise their position, or they must abandon externalism (Boghossian 1998, 210; Segal 2000, 32, 55; Besson 2012, 420).

Some externalists, Sarah Sawyer and Bill Brewer among them, have responded to this privileged access objection by suggesting that, assuming externalism is true, we can only think about objects through the causal interaction we have with them. Since this is so, privileged access to our contents allows us to infer, by conceptual implication, that those very objects exist in our actual environments. In other words, externalism affords us a priori knowledge of our environments after all (Sawyer 1998, 529-531; Brewer 2000, 416). Perhaps this response is consistent, but it requires privileged access to our contents by itself to determine a priori knowledge of our environments.

b. Mental Causation and Psychological Explanation

Jerry Fodor and Colin McGinn offer a second objection to externalism, based on scientific psychology. When explaining behavior, psychologists typically cite local contents as causes of our behavior. Psychologists typically ascribe to twins the same mental causes and describe twins as performing the same actions (Fodor 1987, 30; McGinn 1982, 76-77; Jacob 1992, 208, 211). However, according to externalism, contents are individuated in relation to different distal objects. Since twins across planets relate to different objects, they have different mental causes, and so should receive different explanations. Yet as Fodor and McGinn note, since the only differences between twins across planets are the relations they bear to such objects, those objects are not causally relevant. As Jacob says, the difference between twin contents and behavior seems to drop out as irrelevant (Jacob 1992, 211). Externalism, then, seems to violate principles of scientific psychology, rendering mental causation and psychological explanation mysterious (Fodor 1987, 39; McGinn 1982, 77).

Externalists, such as Burge, Robert van Gulick, and Robert Wilson, have responded that although local causes are important when considering mental causation and psychological explanation, there is still good reason to individuate contents between planets by the relations twins bear to distal (and different) objects (Burge 1989b, 309; Burge 1995, 231; van Gulick 1990, 156; Wilson 1997, 141). Externalists have also noted that any relations we bear to such objects may yet be causally relevant to behavior, albeit indirectly (Adams, Drebushenko, Fuller, and Stecker 1990, 221-223; Peacocke 1995, 224; Williamson 2000, 61-64). However, as Fodor, Chalmers, and others have argued, our relations to such distal and different objects are typically not causal but only conceptual (Fodor 1991b, 21; Chalmers 2002a, 621). Assuming this is so, psychologists do not yet have any reason to prefer externalist recommendations for the individuation of content, since these practices do not distinguish the causal from the conceptual aspects of our contents.

5. Different Kinds of Content Externalism

Historically, philosophers as diverse as Georg Wilhelm Friedrich Hegel, Martin Heidegger, and Ludwig Wittgenstein have all held versions of externalism. Putnam and Burge utilize Twin Earth thought experiments to argue for natural kind and social externalism, respectively, which they extend to externalism about artifacts, percepts, memories, and phenomenal properties. More recently, Gareth Evans and John McDowell, Ruth Millikan, Donald Davidson, Andy Clark and David Chalmers have developed four more types of externalism – which do not primarily rely upon Twin Earth thought experiments, and which differ in their motivations, scopes and strengths.

Gareth Evans and John McDowell adapt neo-Fregean semantics to launch a version of externalism for demonstrative thoughts (for example, this is my cat). In particular, we bear a “continuing informational link” to objects (for example, cats, tables, and so forth) whereby we can keep track of them, as manifested in a set of abilities (Evans 1982, 174). Without this continuing informational link and set of abilities, Evans says, we can only have “mock thoughts” about such objects. Such failed demonstrative thoughts appear to have contents but, in fact, do not (Evans 1981, 299, 302; McDowell 1986, 146; McCulloch 1989, 211-215). Evans and McDowell conclude that when we express demonstrative thoughts, our contents are individually dependent on the objects they represent. In response, some philosophers have questioned whether there is any real difference between our successful and failed demonstrative thoughts (that is, whether the latter are really mock thoughts). Demonstrative thoughts, successful or not, seem to have the same contents, and we seem to behave the same regardless (Segal 1989; Noonan 1991).

Other externalists – notably David Papineau and Ruth Garrett Millikan – employ teleosemantics to argue that contents are determined by proper causes, or those causes that best aid in our survival. Millikan notes that when thoughts are “developed in a normal way,” they are “a kind of cognitive response to” certain substances and that it is this process that determines content (Millikan 2004, 234). Moreover, we can “make the same perceptual judgment from different perspectives, using different sensory modalities, under different mediating circumstances” (Millikan 2000, 103). Millikan claims that improperly caused thoughts may feel the same as genuine thoughts, but have no place in this system. Therefore, contents are, one and all, dependent on only proper causes. However, some philosophers have responded that the prescribed contents can be produced in aberrant ways and aberrant contents can be produced by proper causes. Contents and their proper causes may be only contingently related to one another (Fodor 1990).

Donald Davidson offers a third form of externalism that relies on our basic linguistic contact with the world, on the causal history of the contents we derive from language, and the interpretation of those contents by others. Davidson insists that

…in the simplest and most basic cases, words and sentences derive their meaning from the objects and circumstances in whose presence they were learned (Davidson 1989, 44; Davidson 1991, 197).

In such cases, we derive contents from the world directly; this begins a causal history through which we can manifest the contents. An instantly created swampman – a being with no causal history – could not have the contents we have, or even have contents at all (Davidson 1987, 19). Since our contents are holistically related to each other, as interpreters we must find each other to be correct in most things. Interpretations, however, change what our contents are and do so essentially (Davidson 1991, 211). Davidson concludes that our contents are essentially dependent on our basic linguistic contact with the world, on our history of possessing such contents, and on how others interpret us. In response, some philosophers have argued that simple basic linguistic contact with the world, from which we derive specific contents directly, may not be possible. A causal history may not be relevant to our having contents, and the interpretations of others may have nothing to do with content (Hacker 1989; Leclerc 2005).

Andy Clark and David Chalmers have developed what has been called “active externalism,” which addresses the vehicles of content – our brains. When certain kinds of objects (for example, notebooks, mobile phones, and so forth) are engaged to perform complex tasks, the objects change the vehicles of our contents as well as how we process information. When processing information, we come to rely upon objects (for example, phones), such that they literally become parts of our minds, parts of our selves (Clark and Chalmers 1998; Clark 2010). Clark and Chalmers conclude that our minds and selves are “extended” insofar as there is no boundary between our bodies, certain objects, and our social worlds. Indeed, other externalists have broadened active or vehicle externalism to claim that there are no isolated persons at all (Noe 2006; Wilson 2010). Philosophers have suggested various ways to draw a boundary between the mind and objects, against active or vehicle externalism (Adams and Aizawa 2010). Other philosophers have responded that because active or vehicle externalism implies that the mind expands to absorb objects, it is not externalism at all (Bartlett 2008).

These newer forms of externalism do not rely on Twin Earth thought experiments. Moreover, these forms of externalism do not depend on the causal theory of reference, reference to hidden structural properties, incomplete understanding, or dictated norms for ascribing content. Consequently, these newer forms of externalism avoid various controversial assumptions in addition to any problems associated with them. However, even if one or more of these externalist approaches could overcome their respective objections, all of them, with the possible exception of active or vehicle externalism (which is not universally regarded as a form of externalism) must still face the various problems associated with privileged access, mental causation, and psychological explanation.

6. Content Internalism and Narrow Content

Internalism proposes that our contents are individuated by the properties of our bodies (for example, our brains), and these alone. According to this view, our contents locally supervene on the properties of our bodies. In the history of philosophy, René Descartes, David Hume, and Gottlob Frege have defended versions of internalism. Contemporary internalists typically respond to Twin Earth thought experiments that seem to evoke externalist intuitions. Internalists offer their own thought experiments (for example, brain in a vat experiments) to invoke internalist intuitions. For example, as David Chalmers notes, clever scientists could arrange a brain such that it has

…the same sort of inputs that a normal embodied brain receives…the brain is connected to a giant computer simulation of the world. The simulation determines which inputs the brain receives. When the brain produces outputs, these are fed back into the simulation. The internal state of the brain is just like that of a normal brain (Chalmers 2005, 33).

Internalists argue that such brain in vat experiments are coherent; they contend that such brains would have the same contents as we do without having any causal interaction with objects. It follows that neither the contents of brains in vats nor our own contents are essentially dependent on the environment (Horgan, Tienson and Graham 2005, 299; Farkas 2008, 285). Moderate internalists, such as Jerry Fodor, Brian Loar, and David Chalmers, accept Twin Earth inspired externalist intuitions, insofar as they agree that some contents are broad (that is, some contents are individuated by our causally interacting with objects). However, these philosophers also argue that some contents are narrow (that is, some contents depend only on our bodies). Radical internalists, such as Gabriel Segal, reject externalist intuitions altogether and argue that all content is narrow.

Moderate and radical internalists are classed as internalists because they all develop accounts of narrow content, or attempt to show how contents are individuated by the properties of our bodies. Although these accounts of narrow content differ significantly, they are all designed to have particular features. First, narrow content is intended to mirror our perspective, which is the unique set of contents in each person, regardless of how aberrant these turn out to be. Second, narrow content is intended to mirror our patterns of reasoning – again regardless of aberrations. Third, narrow content is intended to be relevant to mental causation and to explaining our behavior, rendering broad content superfluous for that purpose (Chalmers 2002a, 624, 631). As internalists understand narrow content, then, our unique perspective, reasoning, and mental causes are all important – especially when we hope to understand the precise details of ourselves and our fellows.

a. Jerry Fodor and Phenomenological Content

In his early work, Jerry Fodor developed a version of moderate internalism. Fodor concedes that the Twin Earth thought experiments show that some of our contents (for example, water or twater) are broad. Still, for the purposes of scientific psychology, he argues, we must construct some content that respects “methodological solipsism,” which in turn explains why the twins are psychologically the same. Consequently, Fodor holds that narrow content respects syntax, or does so “without reference to the semantic properties of truth and reference” (Fodor 1981, 240; Fodor 1982, 100). Phenomenologically, the content of “water is wet” is something like “…a universally quantified belief with the content that all the potable, transparent, sailable-on…and so forth, kind of stuff is wet” (Fodor 1982, 111). However, soon after Fodor offered this phenomenological version of narrow content, many philosophers noted that if such content by itself has nothing to do with truth or reference, it is hard to see how it could be content at all, or how it could explain our behavior (Rudder-Baker 1985, 144; Adams, Drebushenko, Fuller, and Stecker 1990, 217).

Fodor responded by slightly changing his position. According to his revised position, although narrow content respects syntax, contents are functions, or “mappings of thoughts and contexts onto truth conditions” (for example, the different truth conditions that obtain on either planet). In other words, before we think any thought in context we have functions that determine that when we are on one planet we have certain contents and when on the other we will have different ones. If we were switched with our twin, our thoughts would still match across worlds (Fodor 1987, 48). However, Fodor concedes, independent of context, we cannot say what narrow contents the twins share (that is, across planets). Still, he insists that we can approach these contents, albeit indirectly (Fodor 1987, 53). In other words, we can use broad contents understood in terms of “primacy of veridical tokening of symbols” and then infer narrow contents from them (Fodor 1991a, 268). Fodor insists that while twins across worlds have different broad contents, they have the same narrow functions that determine these contents.

But as Robert Stalnaker, Ned Block, and others have argued, if narrow contents are but functions that map contexts of thoughts onto truth conditions, they do not themselves have truth conditions – arguably, they are not contents. Narrow content, on this view, is just syntax (Stalnaker 1989, 580; Block 1991, 39, 51). Even if there were a way to define narrow content so that it did not reduce to syntax, it would still be difficult to say what about the descriptions must be held constant for them to perform the function of determining different broad contents on either planet (Block 1991, 53). Lastly, as Paul Bernier has noted, even if we could settle on some phenomenological descriptions to be held constant across planets, we could not know that any tokening of symbols really did determine their broad contents correctly (Bernier 1993, 335).

Ernest Lepore, Barry Loewer, and others have added that even if narrow contents could be defined as functions that map thoughts and contexts onto truth conditions, and even if these were short phenomenological descriptions, parts of the descriptions (for example, water or twater “…is wet,” or arthritis or tharthritis “…is painful” and so forth) might themselves be susceptible to new externalist thought experiments. If we could run new thought experiments on terms like “wet” or “painful,” those terms may not be narrow (Lepore and Loewer 1986, 609; Rowlands 2003, 113). Assuming that this is possible, there can be no way to pick out the truly narrow component in such descriptions, rendering this entire account of narrow content hopeless.

In the early 21st century, proponents of “phenomenal intentionality” defended this phenomenological version of narrow content. Experience, these philosophers say, must be characterized as the complex relation that “x presents y to z,” where z is an agent. In other words, agents are presented with various apparent properties, relations, and so forth, all with unique feels. Moreover, these presentations can be characterized merely by using “logical, predicative, and indexical expressions” (Horgan, Tienson, and Graham 2005, 301). These philosophers insist the phenomenology of such presentations is essential to them, such that all who share them also share contents. Phenomenal intentionality, then, is narrow (Horgan, Tienson, and Graham 2005, 302; Horgan and Kriegel 2008, 360). Narrow contents, furthermore, have narrow truth conditions. The narrow truth conditions of “that picture is crooked” may be characterized as “…there is a unique object x, located directly in front of me and visible by me, such that x is a picture and x is hanging crooked” (Horgan, Tienson, and Graham 2005, 313).

Defenders of phenomenological intentionality insist that although it is difficult to formulate narrow truth conditions precisely, we still share them with twins and brains in vats – although brains will not have many of their conditions satisfied (Horgan, Tienson, and Graham 2005, 315; Horgan 2007, 5). Concerning the objections to this account: if phenomenology is essential to content, there is little worry that content will reduce to syntax, about what part of content should be held constant across worlds, or about new externalist thought experiments. Externalists, of course, argue that phenomenology is not essential to content and that phenomenology is not narrow (Lycan 2001; Tye 2008; Ellis 2010). Because proponents of phenomenological intentionality hold that phenomenology is essential to content, they can respond that such arguments beg the question against them.

b. Brian Loar and David Chalmers on Conceptual Roles and Two-Dimensional Semantics

Brian Loar, David Chalmers, and others have developed another version of moderate internalism. Loar and Chalmers concede that the Twin Earth thought experiments show that some contents are broad (for example, water, twater or arthritis, tharthritis, and so forth), but also argue that our conceptual roles are narrow. Conceptual roles, these philosophers say, mirror our perspective and reasoning, and are all that is relevant to psychological explanations.

Conceptual roles, Colin McGinn notes, can be characterized by our “subjective conditional probability function” on particular attitudes. In other words, roles are determined by our propensity to “assign probability values” to attitudes, regardless of how aberrant these attitudes turn out to be (McGinn 1982, 234). In a famous series of articles, Loar adds that we experience such roles subjectively as beliefs, desires, and so forth. Such subjective states, he says, are too specific to be captured in public discourse and only have “psychological content,” or the content that is mirrored in how we conceive things (Loar 1988a, 101; Loar 1988b, 127). When we conceive things in the same way (for example, as twins across worlds do), we have the same contents. When we conceive things differently (for example, when we have different evaluations of a place), we have different contents. However, Loar admits that psychological contents do not themselves have truth conditions but only project what he calls “realization conditions,” or the set of possible worlds where contents “…would be true,” or satisfied (Loar 1988a, 108; Loar 1988b, 123). Loar concedes that without truth conditions psychological contents may seem odd, but they are narrow. In other words, psychological contents project realization conditions and do so regardless of whether or not these conditions are realized (Loar 1988b, 135).

David Chalmers builds on this conceptual role account of narrow content but defines content in terms of our understanding of epistemic possibilities. Considering certain hypotheses, he says, requires us to accept some things a priori and to reject others. Given enough information about the world, agents will be “…in a position to make rational judgments about what their expressions refer to” (Chalmers 2006, 591). Chalmers defines scenarios as “maximally specific sets of epistemic possibilities,” such that the details are set (Chalmers 2002a, 610). By dividing up various epistemic possibilities in scenarios, he says, we assume a “centered world” with ourselves at the center. When we do this, we consider our world as actual and describe our “epistemic intensions” for that place, such that these intensions amount to “…functions from scenarios to truth values” (Chalmers 2002a, 613). Epistemic intensions, though, have epistemic contents that mirror our understanding of certain qualitative terms, or those terms that refer to “certain superficial characteristics of objects…in any world” and reflect our ideal reasoning (for example, our best reflective judgments) about them (Chalmers 2002a, 609; Chalmers 2002b, 147; Chalmers 2006, 586). Given this, our “water” contents are such that “If the world turns out one way, it will turn out that water is H20; if the world turns out another way, it will turn out that water is XYZ” (Chalmers 2002b, 159).

Since our “water” contents constitute our a priori ideal inferences, the possibility that water is XYZ may remain epistemically possible (that is, we cannot rule it out a priori). Chalmers insists that given this background, epistemic contents really are narrow, because we evaluate them prior to whichever thoughts turn out to be true, such that twins (and brains in vats) reason the same (Chalmers 2002a, 616; Chalmers 2006, 596). Lastly, since epistemic contents do not merely possess realization conditions or conditions that address “…what would have been” in certain worlds but are assessed in the actual world, they deliver “epistemic truth conditions” for that place (Chalmers 2002a, 618; Chalmers 2002b, 165; Chalmers 2006, 594). Epistemic contents, then, are not odd but are as valuable as any other kind of content. In sum, Chalmers admits that while twins across worlds have different broad contents, twins (and brains) still reason exactly the same, and must be explained in the same way. Indeed, since cognitive psychology “…is mostly concerned with epistemic contents” philosophy should also recognize their importance (Chalmers, 2002a, 631).

However, as Robert Stalnaker and others have argued, if conceptual roles only have psychological contents that project realization conditions, we can best determine them by examining broad contents and then extracting narrow contents from these (Stalnaker 1990, 136). If this is how we determine narrow contents, they are merely derivative and our access to them is limited. If, on the other hand, psychological contents have actual truth conditions, then, Stalnaker says, they are broad (Stalnaker 1990, 141). Chalmers, though, has responded that on his version of the position, our a priori ideal evaluations of epistemic possibilities in scenarios yield epistemic intensions with epistemic contents, and these contents have epistemic truth conditions of their own. Given these changes, he says, the problems of narrow content do not arise for him. Still, as Christian Nimtz and others have argued, epistemic contents seem merely to be qualitative descriptions of kinds, and so fall victim to typical modal and epistemological objections to descriptivist theories of reference (Nimtz 2004, 136). Similarly, as Alex Byrne and James Pryor argue, because such descriptions do not yield “substantive identifying knowledge” of kinds, it is doubtful we can know what these contents are, or even that they exist (Byrne and Pryor 2006, 45, 50). Lastly, other externalists have responded that the qualitative terms we use to express epistemic contents must be semantically neutral. However, since such neutral terms are not available, such contents are beyond our ken (Sawyer 2008, 27-28).

Chalmers has responded that although epistemic contents can be phrased descriptively, “there is no reason to think that grasping an epistemic intension requires any sort of descriptive articulation by a subject” (Chalmers 2002b, 148). Since this is so, he says, his theory is not committed to the description theory of reference. Similarly, Chalmers insists that as long as we have enough information about the world, we do not require substantive identifying knowledge of kinds, but only “…a conditional ability” to identify them (Chalmers 2006, 600). Some philosophers have proposed that repeated contact with actual or hypothetical referents may change and refine this ability, and others have suggested that it may include abductive reasoning (Brogaard 2007, 320). However, Chalmers denies that this objection applies to his position, because the conditional ability to identify kinds concerns only our ideal reasoning (for example, our best reflective judgments). Chalmers concedes that our qualitative language must be semantically neutral, but he insists that once we have enough information about the world, “…a qualitative characterization… will suffice” (Chalmers 2006, 603). Such a characterization, he says, will guarantee that twins (and brains in vats) have contents with the same epistemic truth conditions.

c. Radical Internalism

Unlike moderate internalists such as Fodor, Loar, and Chalmers, Gabriel Segal makes the apparently radical move of rejecting Twin Earth inspired externalist intuitions altogether (Segal 2000, 24). Radical internalism offers the virtues of not needing to accommodate externalism, and of avoiding typical criticisms of narrow content. Internalists, Segal says, are not required to concede that some of our contents are broad and then to develop versions of narrow content as phenomenology, epistemic contents, or conceptual roles beyond this. Indeed, he insists that this move has always been an “unworkable compromise” at best (Segal 2000, 114). Rather, Segal says, narrow content “is…a variety of ordinary representation” (Segal 2000, 19).

Segal notes that when offering Twin Earth thought experiments, externalists typically assume that the causal theory of reference is true, that we refer to hidden structural properties, that we can have contents while incompletely understanding them, and that tendentious norms govern how we ascribe contents. Moreover, Segal says, externalists seem unable to account for cases of referential failure (for example, cases of scientific failure), and so their account of content is implausible on independent grounds (Segal 2000, 32, 55). When all this is revealed, he says, our intuitions about content can easily change, and do. Interpreted properly, on this view, the Twin Earth thought experiments do not show that our contents are broad, but rather reveal that “psychology, as it is practiced by the folk and the scientist, is already, at root, internalist” (Segal 2000, 122).

Segal describes our narrow contents as “motleys,” or organic entities that can “persist through changes of extension” yet still evolve. Motleys “…leave open the possibility of referring to many kinds,” depending on the circumstances (Segal 2000, 77, 132). Although Segal disavows any general method for determining when contents change or remain the same, he insists that we can employ various charitable means to determine this. When there is any mystery, we can construct neologisms to be more accurate. Neologisms allow us to gauge just how much our fellows deviate from us, and even to factor these deviations out (Segal 2000, 141; Segal 2008, 16). Regardless of these details, motleys are shared by twins (and brains in vats). Segal suggests that we follow the lead of experimental psychologists who track the growth and changes in our concepts by using scientific methods and models, but who only recognize narrow contents (Segal 2000, 155).

Importantly, because Segal rejects Twin Earth inspired externalist intuitions, externalists cannot accuse his version of narrow content of not being content, or of being broad content in disguise, without begging the question against him. Still, as Jessica Brown has argued, his account is incomplete. Segal, she notes, assumes a neo-Fregean account of content which she paraphrases as

If S rationally assents to P (t1) and dissents from or abstains from the truth value of P (t2), where P is in an extensional context, then for S, t1 and t2 have different contents, and S associates different concepts with them (Brown 2002, 659).

Brown insists that Segal assumes this neo-Fregean principle but “…fails to provide any evidence” for it (Brown 2002, 660). Given this lacuna in his argument, she says, externalists might either try to accommodate the principle or reject it outright. Externalists have done both. Although Segal does not explicitly defend the principle, he begins an implicit defense elsewhere (Segal 2003, 425). Externalists, by contrast, have struggled to accommodate it, and those who have abandoned it have rarely offered replacements (Kimbrough 1989, 480).

Because Segal bases his radical internalism on rejecting externalist intuitions altogether, its very plausibility rests on this rejection. Externalists typically take their intuitions about content to be obvious and expect others to agree. Still, as Segal notes, although externalist intuitions are popular, it is reasonable to reject them. By doing this, he hopes to “reset the dialectical balance” between externalism and internalism (Segal 2000, 126). Radical internalism, he says, may not be so radical after all.

7. Conclusion

Externalism and internalism, as we have seen, address how we individuate the content of our attitudes, or what makes those contents what they are. Putnam, Burge, and other externalists insist that contents can be individuated by our causal interaction with the natural and social world. Externalists seem to incur problems of privileged access to our contents as well as problems about mental causation and psychological explanation. Externalist responses to these issues have been numerous, and have dominated the literature. By contrast, Chalmers, Segal, and other content internalists argue that contents are individuated by what lies within our bodies. Internalists, though, have faced the charge that since narrow content does not have truth conditions, it is not content. Moreover, if such content does have truth conditions, it is just broad content in disguise. Internalist responses to these charges have ranged from trying to show how narrow content is not vulnerable to these criticisms, to suggesting that the externalist intuitions upon which the charges rest should themselves be discarded.

Externalists and internalists have very different intuitions about the mind. Externalists seem unable to imagine minds entirely independent of the world (repeatedly claiming that this possibility revises our common practices, or is incoherent in some way). By contrast, internalists see the split between mind and world as fundamental to understanding ourselves and our fellows, and so concentrate on questions of perspective, reasoning, and so forth. Given these contrasting intuitions, each side often mischaracterizes the other and sees the other as clinging to doubtful, biased, or even incoherent conventions. Unless some way to test such intuitions is developed, proponents of both sides will continue to cleave to their assumptions in the debate, convinced that the burden of proof must be borne by the opposition.

8. References and Further Reading

  • Adams, F., Drebushenko, D., Fuller, G., and Stecker., R. 1990. “Narrow Content: Fodor’s Folly.” Mind and Language 5, 213-229.
  • Adams, F. & Aizawa, K. 2010. “Defending the Bounds of Cognition.” In R. Menary (ed.) The Extended Mind, Cambridge, MA: MIT Press, 67-80.
  • Bach, K. 1987. Thought and Reference. Oxford: Oxford University Press.
  • Ball, D. 2007. “Twin Earth Externalism and Concept Possession.” Australasian Journal of Philosophy, 85.3, 457-472.
  • Bartlett, G. 2008. “Whither Internalism? How Should Internalists Respond to the Extended Mind?” Metaphilosophy 39.2, 163-184.
  • Bernecker, S. 2010. Memory: a Philosophical Study. Oxford: Oxford University Press.
  • Bernier, P. 1993. “Narrow Content, Context of Thought, and Asymmetric Dependency.” Mind and Language 8, 327-342.
  • Besson, C. 2012. “Externalism and Empty Natural Kind Terms.” Erkenntnis 76, 403-425.
  • Biggs, Stephen & Wilson, Jessica. (forthcoming). “Carnap, the Necessary A Posteriori, and Metaphysical Nihilism.” In Stephan Blatti and Sandra Lapointe (eds.) Ontology After Carnap. Oxford: Oxford University Press.
  • Block, N., 1991. “What Narrow Content is Not.” In Lepore and Rey eds., Meaning and Mind: Fodor and His Critics. Oxford: Blackwell Publishers, 33-64.
  • Boghossian, P. 1989. “Content and Self-Knowledge.” In Ludlow and Martin eds., Externalism and Self-Knowledge. Stanford: CSLI Publications, 149-175.
  • Boghossian, P. 1994. “The Transparency of Mental Content.” Philosophical Perspectives 8, 33-50.
  • Boghossian, P. 1998. “What Can the Externalist Know A Priori?” Philosophical Issues 9, 197-211.
  • Brewer, B. 2000. “Externalism and A Priori Knowledge of Empirical Facts.” In Peacocke and Boghossian eds., New Essays on the A Priori. Oxford: Oxford University Press.
  • Brogaard, B. 2007. “That May Be Jupiter: A Heuristic for Thinking Two-Dimensionally.” American Philosophical Quarterly 44.4, 315-328.
  • Brown, J. 1995. “The Incompatibility of Anti-Individualism and Privileged Access.” Analysis 55.3, 149-156.
  • Brown, J. 2002. “Review of ‘A Slim Book about Narrow Content’.” Philosophical Quarterly, 657-660.
  • Brueckner, A. 1992. “What the Anti-Individualist can Know A Priori.” Reprinted in Chalmers, ed. The Philosophy of Mind: Classic and Contemporary Readings. Oxford: Oxford University Press, 639-651.
  • Burge, T. 1979. “Individualism and the Mental.” In French, P., Uehling, T., and Wettstein, H., eds., Midwest Studies in Philosophy 4. Minneapolis: University of Minnesota Press, 73-121.
  • Burge, T. 1982a. “Two Thought Experiments Reviewed.” Notre Dame Journal of Formal Logic 23.3, 284-293.
  • Burge, T. 1982b. “Other Bodies.” In Woodfield, Andrew, ed., Thought and Object: Essays on Intentionality. Oxford: Oxford University Press, 97-121.
  • Burge, T. 1985. “Cartesian Error and the Objectivity of Perception.” In: Pettit, P. and McDowell, J., eds. Subject, Thought, and Context. Oxford: Clarendon Press, 117-135.
  • Burge, T. 1986 “Intellectual Norms and the Foundations of Mind.” Journal of Philosophy 83, 697-720.
  • Burge, T. 1988. “Individualism and Self-Knowledge.” Journal of Philosophy 85, 647-663.
  • Burge, T. 1989a. “Wherein is Language Social?” Reprinted in Owens J. ed., Propositional Attitudes: The Role of Content in Logic, Language and Mind Cambridge: Cambridge University Press, 113-131.
  • Burge, T. 1989b. “Individuation and Causation in Psychology.” In Pacific Philosophical Quarterly, 303-322.
  • Burge, T. 1995. “Intentional Properties and Causation.” In C. Macdonald (ed.), Philosophy of Psychology: Debates on Psychological Explanation. Cambridge: Blackwell, 225-234.
  • Burge, T. 1998. “Memory and Self-Knowledge.” In Ludlow and Martin eds., Externalism and Self-Knowledge. Stanford: CLSI Publications, 351-370.
  • Burge, T. 2003a. “Replies from Tyler Burge.” In Frapolli, M. and Romero, E., eds. Meaning, Basic Self-Knowledge, and Mind. Stanford, Calif.: CSLI Publications, 250-282.
  • Burge, T. 2003b. “Social Anti-Individualism, Objective Reference.” Philosophy and Phenomenological Research 73, 682-692.
  • Burge, T. 2003c. “Some Reflections on Scepticism: Reply to Stroud.” In Martin Hahn & B. Ramberg (eds.), Reflections and Replies: Essays on the Philosophy of Tyler Burge. Mass: MIT Press, 335-346.
  • Burge, T., 2007. “Postscript to ‘Individualism and the Mental.’” Reprinted in Foundations of Mind, by Tyler Burge. Oxford: Oxford University Press, 151-182.
  • Byrne, A. and J. Pryor, 2006, “Bad Intensions.” In Two-Dimensional Semantics: Foundations and Applications, M. Garcia-Carprintero and J. Macia (eds.), Oxford: Oxford University Press, 38–54.
  • Chalmers, D. 2002a. “The Components of Content.” Reprinted in Chalmers, ed. The Philosophy of Mind: Classic and Contemporary Readings. Oxford: Oxford University Press, 607-633.
  • Chalmers, D. 2002b. “On Sense and Intension.” In Philosophical Perspectives; Language and Mind, 16, James Tomberlin, ed., 135-182.
  • Chalmers, D. 2005. “The Matrix as Metaphysics.” Reprinted in Philosophy and Science Fiction: From Time Travel to Super Intelligence, ed. Schneider, S. London: Wiley-Blackwell Press, 33-53.
  • Chalmers, D. 2006. “Two-Dimensional Semantics.” In Oxford Handbook of Philosophy of Language, E. Lepore and B. Smith eds., Oxford: Oxford University Press, 575–606.
  • Clark, A. and Chalmers, D. 1998. “The Extended Mind.” Analysis 58, 10-23.
  • Clark, A. 2010. Supersizing the Mind: Embodiment, Action, and Cognitive Extension. Oxford: Oxford University Press.
  • Crane, T. 1991. “All the Difference in the World.” Philosophical Quarterly 41, 3-25.
  • Davidson, D. 1987. “Knowing One’s Own Mind.” Reprinted in Davidson, Subjective, Intersubjective, Objective. Oxford: Oxford University Press, 15-39.
  • Davidson, D. 1989. “The Myth of the Subjective.” Reprinted in Davidson, Subjective, Intersubjective, Objective. Oxford: Oxford University Press, 39-52.
  • Davidson, D. 1991. “Epistemology Externalized.” Reprinted in Davidson, Subjective, Intersubjective, Objective. Oxford: Oxford University Press, 193-204.
  • De Gaynesford, M. 1996. “How Wrong can One Be?” Proceedings of the Aristotelian Society 96, 387-394.
  • Descartes, 1934-1955. Haldane, E. S. and Ross, G. R. T., trs. The Philosophical Works, vols. 1 and 2. Cambridge: Cambridge University Press.
  • Ellis, J. 2010. “Phenomenal Character, Phenomenal Consciousness, and Externalism.” Philosophical Studies, 147.2, 279-298.
  • Evans, G. 1981. “Understanding Demonstratives.” In Evans, Collected Papers. Oxford: Oxford University Press, 291-321.
  • Evans, G. 1982. The Varieties of Reference. Oxford: Oxford University Press.
  • Falvey, K. and Owens, J. 1994. “Externalism, Self-Knowledge, and Skepticism.” Philosophical Review 103.1, 107-137.
  • Farkas, K. 2003. “What is Externalism?” Philosophical Studies 112.3, 187-201.
  • Farkas, K. 2008. “Phenomenal Intentionality without Compromise.” The Monist 91.2, 273-293.
  • Fodor, J. 1981. “Methodological Solipsism Considered as a Research Strategy in Cognitive Science.” In Fodor, J. RePresentations: Philosophical Essays on the Foundations of Cognitive Science. Cambridge MA: MIT Press, 225-256.
  • Fodor, J. 1982. “Cognitive Science and the Twin Earth Problem.” Notre Dame Journal of Formal Logic 23, 97-115.
  • Fodor, J. 1987. Psychosemantics: The Problem of Meaning in the Philosophy of Mind. Cambridge, Mass: MIT Press.
  • Fodor, J. 1990. A Theory of Content and Other Essays. Cambridge, Mass: MIT Press.
  • Fodor, J. 1991a. “Replies.” In Loewer, B. and Rey, G. eds. Meaning in Mind: Fodor and His Critics. Cambridge, Mass.: B. Blackwell, 255-319.
  • Fodor, J. 1991b. “A Modal Argument for Narrow Content,” Journal of Philosophy 87, 5-27.
  • Gallois, A. and O’Leary-Hawthorne, 1996. “Externalism and Skepticism,” Philosophical Studies 81.1, 1-26.
  • Hacker, P. 1989. “Davidson and Externalism,” Philosophy 73, 539-552.
  • Horgan, T. 2007. “Agentive Phenomenal Intentionality and the Limits of Introspection.” Psyche 13.1, 1-29.
  • Horgan, T., Kriegel, J. 2008. “Phenomenal Intentionality Meets the Extended Mind.” The Monist 91, 353-380.
  • Horgan, T., Tienson J., and Graham, G. 2005. “Phenomenal Intentionality and the Brain in a Vat.” In The Externalist Challenge, ed. Shantz. Berlin and New York: de Gruyter, 297-319.
  • Jacob, P. 1992. “Externalism and Mental Causation.” Proceedings of the Aristotelian Society 92, 203-219.
  • Jackson, F. 1998. “Reference and Description Revisited.” in Philosophical Perspectives, 12, Language, Mind and Ontology, 201-218.
  • Jackson, F. 2003. “Narrow Content and Representation- Or Twin Earth Revisited.” Proceedings and Addresses of the American Philosophical Association 77.2, 55-70.
  • Kimbrough, S. 1989. “Anti-Individualism and Fregeanism.” Philosophical Quarterly 193, 471-483.
  • Korman, D 2006. “What Should Externalists Say about Dry Earth?” Journal of Philosophy 103, 503-520.
  • Leclerc, A. 2005. “Davidson’s Externalism and Swampman’s Troublesome Biography.” Principia 9, 159-175.
  • Lepore E., and Loewer B. 1986. “Solipsistic Semantics.” Midwest Studies in Philosophy 10, 595-613.
  • Loar, B., 1988a. “Social Content and Psychological Content.” In Grim and Merrill, eds., Contents of Thought: Proceedings of the Oberlin Colloqium in Philosophy, Tucson, Arizona: University of Arizona Press, 99-115.
  • Loar, B. 1988b. “A New Kind of Content.” In Grim, R. H. and Merrill, D. D., eds. Contents of Thought: Proceedings of the Oberlin Colloquium in Philosophy. Tucson, Arizona: University of Arizona Press, 117-139.
  • Ludlow, P. 1995. “Externalism, Self-Knowledge, and the Relevance of Slow Switching.” Analysis 55.1, 45-49.
  • Ludlow, P. 1997. “On the Relevance of Slow Switching.” Analysis, 57.4, 285-286.
  • Lycan. W. 2001. “The Case for Phenomenal Externalism.” Philosophical Perspectives 15, 17-35.
  • Machery, Mallon, Nichols and Stich, 2009. “Against Arguments from Reference.” Philosophy and Phenomenological Research 79.2, 332-356.
  • McCulloch, G. 1989. The Game of the Name: Introducing Logic, Language, and Mind. Oxford: Clarendon Press.
  • McCulloch, G. 1995. The Mind and its World. London: Routledge Press.
  • McDowell, J. 1986. “Singular Thought and the Extent of Inner Space.” In McDowell and Pettit eds., Subject, Thought, Context. Oxford: Clarendon Press, 137-169.
  • McGinn, C. 1982. “The Structure of Content,” In Woodfield, A., ed. Thought and Object: Essays on Intentionality. Oxford: Oxford University Press, 207-254.
  • McKinsey, M. 1991. “Anti-Individualism and Privileged Access.” Analysis 51, 9-15.
  • McLaughlin, B and Tye, M. 1998. “Is Content Externalism Compatible with Privileged Access?” Philosophical Review, 107.3, 149-180.
  • Mellor, H. 1979. “Natural Kinds.” British Journal for the Philosophy of Science 28, 299-313.
  • Millikan, R. G. 2000. On Clear and Confused Concepts. Cambridge: Cambridge University Press.
  • Millikan R. G. 2004. “Existence Proof for a Viable Externalism,” In The Externalist Challenge, ed. Shantz. Berlin and New York: de Gruyter, 297-319.
  • Nimtz, C. 2004. “Two-Dimensionalism and Natural Kind Terms.” Synthese 138.1, 125-48.
  • Noe, A. 2006. “Experience without the Head,” In Gendler and Hawthorne, ed. Perceptual Experience. Oxford: Oxford University Press, 411-434.
  • Noonan, H. 1991. “Object-Dependent Thoughts and Psychological Redundancy,” Analysis 51, 1-9.
  • Norris, C. 2003. “Twin Earth Revisited: Modal Realism and Causal Explanation.” Reprinted in Norris, The Philosophy of Language and the Challenge to Scientific Realism. London: Routledge Press, 143-174.
  • Nuttetelli, S. 2003. “Knowing that One Knows What One is Talking About.” In New Essays on Semantic Externalism and Self-Knowledge, ed. Susanna Nuttetelli. Cambridge. Cambridge University Press, 169-184.
  • Peacocke, C. 1995. “Content.” In A Companion to the Philosophy of Mind, S. Guttenplan, ed. Cambridge: Blackwell Press, 219-225.
  • Putnam, H., 1975. “The Meaning of Meaning.” In Mind, Language and Reality; Philosophical Papers Volume 2. Cambridge: Cambridge University Press, 215-271.
  • Putnam, H. 1981. Reason, Truth, and History. Cambridge: Cambridge University Press.
  • Putnam, H. 1986. “Meaning Holism.” Reprinted in Conant ed., Realism with a Human Face, by Hilary Putnam. Harvard: Harvard University Press, 278-303.
  • Putnam, H. 1988. Representation and Reality. Cambridge, Mass: A Bradford Book, MIT Press.
  • Putnam, H. 1996. “Introduction.” In: The Twin Earth Chronicles, eds. Pessin and Goldberg, i-xii.
  • Rowlands, M. 2003. Externalism: Putting Mind and World Back Together Again. New York: McGill-Queens Press.
  • Rudder-Baker, L. 1985. “A Farewell to Functionalism.” In Silvers, S. ed. Rerepresentations: Readings in the Philosophy of Mental Representation. Holland: Kluwer Academic Publishers, 137-149.
  • Sainsbury, M. 2007. Reality without Referents. Oxford. Oxford University Press.
  • Sawyer, S. 1998. “Privileged Access to the World.” Australasian Journal of Philosophy, 76.4, 523-533.
  • Sawyer, S. 1999. “An Externalist Account of Introspective Experience.” Pacific Philosophical Quarterly, 4.4, 358-374.
  • Sawyer, S. 2008. “There is no Viable Notion of Narrow Content.” In Contemporary Debates in the Philosophy of Mind, eds. McLaughlin and Cohen. London: Blackwell Press, 20-34.
  • Segal, G. 1989. “Return of the Individual.” Mind 98, 39-57.
  • Segal, G. 2000. A Slim Book about Narrow Content. Cambridge Mass: MIT Press.
  • Segal, G. 2003. “Ignorance of Meaning.” In The Epistemology of Language, Barber, A. ed. Oxford: Oxford University Press, 415-431.
  • Segal, G. 2008. “Cognitive Content and Propositional Attitude Attributions.” In Contemporary Debates in the Philosophy of Mind, eds. McLaughlin and Cohen. London: Blackwell Press, 4-19.
  • Stalnaker, R. 1989. “On What’s in the Head.” Reprinted in Rosenthal, ed., The Nature of Mind. Oxford: Oxford University Press, 576-590.
  • Stalnaker, R. 1990. “Narrow Content.” In Anderson, C.A. and Owens, J., eds. Propositional Attitudes: The Role of Content in Logic, Language and Mind. Stanford: Center for the Study of Language and Information, 131-148.
  • Tye, M. 2008. Consciousness Revisited; Materialism without Phenomenal Concepts. Cambridge, Mass: MIT Press.
  • Van Gulick, R. 1990. “Metaphysical Arguments for Internalism and Why They Do Not Work.” In Silvers, Stuart, ed. Rerepresentations: Readings in the Philosophy of Mental Representation. Holland: Kluwer Academic Publishers, 153-158.
  • Warfield, T. 1997. “Externalism, Self-Knowledge, and the Irrelevance of Slow Switching.” Analysis 52, 232-237.
  • Williamson, T. 2002. Knowledge and its Limits. Oxford: Oxford University Press.
  • Wilson, R. 1995. Cartesian Psychology and Physical Minds: Individualism and the Sciences of the Mind. Cambridge: Cambridge University Press.
  • Wilson, R. 2010. “Meaning, Making, and the Mind of the Externalist.” In: Menary, ed. The Extended Mind. Cambridge, Mass. MIT Press, 167-189.
  • Zemach, E. 1976. “Putnam on the Reference of Substance Kind Terms.” Journal of Philosophy 88, 116-127.


Author Information

Basil Smith
Email: Bsmith108@saddleback.edu
Saddleback College
U. S. A.

Michel Foucault: Political Thought

The work of twentieth-century French philosopher Michel Foucault has increasingly influenced the study of politics. This influence has mainly come via concepts he developed in particular historical studies that have since been taken up as analytical tools; “governmentality” and “biopower” are the most prominent of these. More broadly, Foucault developed a radical new conception of social power as forming strategies that embody intentions of their own, above those of the individuals engaged in them; individuals for Foucault are as much products of as participants in games of power.

The question of Foucault’s overall political stance remains hotly contested. Scholars disagree both about how consistent his position was over his career and about the particular position he could be said to have taken at any given time. This dispute is common both to scholars critical of Foucault and to those sympathetic to his thought.

What can be generally agreed about Foucault is that he had a radically new approach to political questions, and that novel accounts of power and subjectivity were at its heart. Critics dispute not so much the novelty of his views as their coherence. Some critics see Foucault as effectively belonging to the political right because of his rejection of traditional left-liberal conceptions of freedom and justice. Some of his defenders, by contrast, argue for compatibility between Foucault and liberalism. Other defenders see him either as a left-wing revolutionary thinker, or as going beyond traditional political categories.

To summarize Foucault’s thought from an objective point of view, his political works would all seem to have two things in common: (1) an historical perspective, studying social phenomena in historical contexts and focusing on the way they have changed throughout history; (2) a discursive methodology, with texts, particularly academic texts, as the raw material for his inquiries. As such, the general political import of Foucault’s thought across its various turns is to understand how the historical formation of discourses has shaped the political thinking and political institutions we have today.

Foucault’s thought was overtly political during one phase of his career, coinciding exactly with the decade of the 1970s and corresponding to a methodology he designated “genealogy”. It is during this period that, alongside the study of discourses, he analysed power as such in its historical permutations. Most of this article is devoted to this period of Foucault’s work. Prior to this, during the 1960s, the political content of his thought was relatively muted, and the political implications of that thought are contested. This article is therefore divided into thematic sections arranged chronologically by their appearance in Foucault’s thought.

Table of Contents

  1. Foucault’s Early Marxism
  2. Archaeology
  3. Genealogy
  4. Discipline
  5. Sexuality
  6. Power
  7. Biopower
  8. Governmentality
  9. Ethics
  10. References and Further Reading
    1. Primary
    2. Secondary

1. Foucault’s Early Marxism

Foucault began his career as a Marxist, influenced as a student by his mentor, the Marxist philosopher Louis Althusser, to join the French Communist Party. Though his membership was tenuous and brief, Foucault’s later political thought should be understood against this background, as a thought that is both under the influence of, and intended as a reaction to, Marxism.

Foucault himself tells us that after his early experience of a Stalinist communist party, he felt sick of politics and shied away from political involvements for a long time. Still, in his first book, which appeared in 1954, less than two years after Foucault had left the Party, his theoretical perspective remained Marxist. This book was a history of psychology, published in English as Mental Illness and Psychology. In the original text, Foucault concludes that mental illness is a result of alienation caused by capitalism. However, he excised this Marxist content from a later edition in 1962, before suppressing publication of the book entirely; an English translation of the 1962 edition continues to be available only by an accident of copyright (MIP vii). Thus, one can see a trajectory in Foucault’s thought decisively away from Marxism and, indeed, increasingly away from politics.

2. Archaeology

Foucault’s first major, canonical work was his 1961 doctoral dissertation, The History of Madness. He gives here a historical account, repeated in brief in the 1962 edition of Mental Illness and Psychology, of what he calls the constitution of an experience of madness in Europe, from the fifteenth to the nineteenth centuries. This encompasses correlative study of institutional and discursive changes in the treatment of the mad, to understand the way that madness was constituted as a phenomenon. The History of Madness is Foucault’s longest book by some margin, and contains a wealth of material that he expands on in various ways in much of his work of the following two decades. Its historical inquiry into the interrelation of institutions and discourses set the pattern for his political works of the 1970s.

Foucault identified three major shifts in the treatment of madness in the period under discussion. The first, in the Renaissance, brought a new respect for madness. Previously, madness had been seen as an alien force to be expelled, but now it was seen as a form of wisdom. This abruptly changed with the beginning of the Enlightenment in the seventeenth century. Now rationality was valorized above all else, and its opposite, madness, was excluded completely. The unreasonable was excluded from discourse, and congenitally unreasonable people were physically removed from society and confined in asylums. This lasted until the end of the eighteenth century, when a new movement “liberating” the mad arose. For Foucault, however, this was no true liberation, but rather the attempt by Enlightenment reasoning to negate madness once and for all by understanding it completely and curing it with medicine.

The History of Madness thus takes seriously the connection between philosophical discourse and political reality. Ideas about reason are treated not merely as abstract concerns, but as having very real social implications, affecting every facet of the lives of thousands upon thousands of people who were considered mad and, thereby, altering the structure of society. Such a perspective marks a departure from Foucault’s former Marxism. Rather than attempting to ground experience in material circumstances, here Foucault might seem to blame cultural transformation for the transformation of society. That is, it might seem that he had embraced idealism, the position that ideas are the motor force of history, Marxism’s opposite. This would, however, be a misreading. The History of Madness posits no causal priority, either of the cultural shift over the institutional, or vice versa. It simply notes the coincident transformation, without etiological speculation. Moreover, while Foucault did not examine the political forces at work in the history of madness in this work, it is clearly a political book, exploring the political stakes of philosophy and medicine.

Later developments in Foucault’s thought, however, convinced many that he was an idealist. After The History of Madness, Foucault began to focus on the discursive, bracketing political concerns almost entirely. This was first, and most clearly, signalled in the preface to his next book, The Birth of the Clinic. Although the book itself essentially extends The History of Madness chronologically and thematically, by examining the birth of institutional medicine from the end of the eighteenth century, the preface is a manifesto for a new methodology that will attend only to discourses themselves, to the language that is uttered, rather than the institutional context. It was the following book in Foucault’s sequence, rather than The Birth of the Clinic itself, that carried this intention to fulfilment: The Order of Things (1966). Whereas in The History of Madness and The Birth of the Clinic Foucault had pursued historical researches relatively balanced between studying conventional historical events, institutional change, and the history of ideas, The Order of Things presented an abstract history of thought that ignored almost everything outside the discursive. This method was in effect what was at that time in France called “structuralism,” though Foucault was never happy with the use of this term. His specific claims were nevertheless distinctive: in the history of academic discourses, in a given epoch, knowledge is organized by an episteme, which governs what kinds of statements can be taken as true. The Order of Things charts several successive historical shifts of episteme in relation to the human sciences.

These claims brought Foucault into collision with French Marxism. This could not have been entirely unintended by Foucault, in particular because in the book he specifically accuses Marxism of being a creature of the nineteenth century that was now obsolete. He also concluded the work by indicating his opposition to humanism, declaring that “man” (the gendered “man” here refers to a concept that in English we have come increasingly to call the “human”) as such was perhaps nearing obsolescence. Foucault here was opposing a particular conception of the human being as a sovereign subject who can understand itself. Such humanism was at that time the orthodoxy in French Marxism and philosophy, championed by the pre-eminent philosopher of the day, Jean-Paul Sartre, and upheld by the French Communist Party’s central committee explicitly against Althusser just a month before The Order of Things was published (DE1 36). In its humanist form, Marxism cast itself as a movement for the full realization of the individual. Foucault, by contrast, saw the notion of the individual as a recent and aberrant idea. Furthermore, his presumption to analyse and criticize discourses without reference to the social and economic system that produced them seemed to Marxists a massive step backwards in analysis. The book indeed seems to be apolitical: it refuses to take a normative position about truth, and accords no importance to anything outside abstract, academic discourses. The Order of Things proved so controversial, its claims so striking, that it became a best-seller in France, despite being a lengthy, ponderous, scholarly tome.

Yet, Foucault’s position is not quite as anti-political as has been imagined. The explicit criticism of Marxism in the book was specifically of Marx’s economic doctrine: it amounts to the claim that this economics is essentially a form of nineteenth century political economy. It is thus not a total rejection of Marxism, or a dismissal of the importance of economics. His anti-humanist position was not in itself anti-Marxist, inasmuch as Althusser took much the same line within a Marxist framework, albeit one that tended to challenge basic tenets of Marxism, and which was rejected by the Marxist establishment. This shows it is possible to use the criticism of the category of “man” in a pointedly political way. Lastly, Foucault’s “archaeological” method of investigation, as he now called it, which looks at transformations of discourses in their own terms without reference to the extra-discursive, does not in itself imply that discursive transformations can be explained without reference to anything non-discursive, only that they can be mapped without any such reference. Foucault thus shows a lack of interest in the political, but no outright denial of the importance of politics.

Foucault was at this time fundamentally oriented towards the study of language. This should not in itself be construed as apolitical. There was a widespread intellectual tendency in France during the 1960s to focus on avant-garde literature as being the main repository for radical hopes, eclipsing a traditional emphasis on the politics of the working class. Foucault wrote widely during this period on art and literature, publishing a book on the obscure French author Raymond Roussel, which appeared on the same day as The Birth of the Clinic. Given Roussel’s eccentricity, this was not far from his reflections on literature in The History of Madness. For Foucault, modern art and literature are essentially transgressive. Transgression is something of a common thread running through Foucault’s work in the 1960s: the transgression of madness and literary modernism is for Foucault directly implicated in the challenge he sees emerging to the current episteme. This interest in literature culminated in what is perhaps Foucault’s best-known piece in this vein, “What Is an Author?”, which combines some of the themes from his final book of the sixties, The Archaeology of Knowledge, with reflections on modern literature in challenging the notion of the human “author” of a work in whatever genre. All of these works, no matter how abstract, can be seen as having important political-cultural stakes for Foucault. Challenging the suzerainty of “man” can in itself be said to have constituted his political project during this period, such was the importance he accorded to discourses. The practical importance of such questions can be seen in The History of Madness.

Foucault was ultimately dissatisfied with this approach, however. The Archaeology of Knowledge, a reflective consideration of the methodology of archaeology itself, ends with an extraordinary self-critical dialogue, in which Foucault answers imagined objections to this methodology.

3. Genealogy

Foucault wrote The Archaeology of Knowledge while based in Tunisia, where he had taken a three-year university appointment in 1966. While the book was in its final stages, the world around him changed. Tunisia went through a political upheaval, with demonstrations against the government, involving many of his students. He was drawn into supporting them, and was persecuted as a result. Much better known and more significant student demonstrations occurred in Paris shortly afterwards, in May of 1968. Foucault largely missed these because he was in Tunis, but he followed news of them keenly.

He returned to France permanently in 1969. He was made the head of the philosophy department at a brand new university at Vincennes. The milieu he found on his return to France was itself highly politicized, in stark contrast to the relatively staid country he had left behind three years before. He was surrounded by peers who had become committed militants, not least his partner Daniel Defert, and including almost all of the colleagues he had hired to his department. He now threw himself into an activism that would characterize his life from that point on.

It was not long before a new direction appeared in his thought to match. The occasion on which this first became apparent was his 1970 inaugural address for another new job, his second in as many years, this time as a professor at the Collège de France, the highest academic institution in France. This address was published as a book in France, L’ordre du discours, “The Order of Discourse” (which is one of the multiple titles under which it has been translated into English). For the first time, Foucault sets out an explicit agenda of studying institutions alongside discourse. He had done this in the early 1960s, but now he proposed it as a deliberate method, which he called “genealogy.” Much of “The Order of Discourse” in effect recapitulates Foucault’s thought up to that point, the considerations of the history of madness and the regimes of truth that have governed scientific discourse, leading to a sketch of a mode of analysing discourse similar to that of The Archaeology of Knowledge. In the final pages, however, Foucault states that he will now undertake analyses in two different directions, critical and “genealogical.” The critical direction consists in the study of the historical formation of systems of exclusion. This is clearly a turn back to the considerations of The History of Madness. The genealogical direction is more novel – not only within Foucault’s work, but in Western thought in general, though the use of the term “genealogical” does indicate a debt to one who came before him, namely Friedrich Nietzsche. The genealogical inquiry asks about the reciprocal relationship between systems of exclusion and the formation of discourses. The point here is that exclusion is not a fate that befalls innocent, pre-existing discourses. Rather, discourses only ever come about within and because of systems of exclusion, the negative moment of exclusion co-existing with the positive moment of the production of discourse.
Now, discourse becomes a political question in a full sense for Foucault, as something that is intertwined with power.

“Power” is barely named as such by Foucault in this text, but it becomes the dominant concept of his output of the 1970s. This output comprises two major books, eight annual lecture series he gave at the Collège de France, and a plethora of shorter pieces and interviews. The signature concept of genealogy, combining the new focus on power with the older one on discourse, is his notion of “power-knowledge.” Foucault now sees power and knowledge as indissolubly joined, such that one never has either one without the other, with neither having causal suzerainty over the other.

His first lecture series at the Collège de France, now published in French as Leçons sur la volonté de savoir (“Lessons on the will to knowledge”), extends the concerns of “The Order of Discourse” with the production of knowledge. More overtly political material followed in the next two lecture series, between 1971 and 1973, both of which dealt with the prison system, and led up to Foucault’s first full-scale, published genealogy, 1975’s Discipline and Punish: The Birth of the Prison.

4. Discipline

This research on prisons began in activism. The French state had banned several radical leftist groups in the aftermath of May 1968, and thousands of their members ended up in prisons, where they began to agitate for political rights for themselves, then began to agitate for rights for prisoners in general, having been exposed by their incarceration to ordinary prisoners and their problems. Foucault was the main organizer of a group formed outside the prison, in effect as an outgrowth of this struggle, the Groupe d’informations sur les prisons (the GIP – the Prisons Information Group). This group, composed primarily of intellectuals, sought simply to empower prisoners to speak of their experiences on their own account, by sending surveys out to them and collating their responses.

In tandem with this effort, Foucault researched the history of the prisons, aiming to find out something that the prisoners themselves could not tell him: how the prison system had come into being and what purpose it served in the broader social context. His history of the prisons turns out to be a history of a type of power that Foucault calls “disciplinary,” which encompasses the modern prison system, but is much broader. Discipline and Punish thus comprises two main historical theses. One, specifically pertaining to the prison system, is that this system regularly produces an empirically well-known effect, a layer of specialized criminal recidivists. This is for Foucault simply what prisons objectively do. Pointing this out undercuts the pervasive rationale of imprisonment that prisons are there to reduce crime by punishing and rehabilitating inmates. Foucault considers the obvious objection to this that prisons only produce such effects because they have been run ineffectively throughout their history, and that better psychological management of rehabilitation in particular is required. He answers this by pointing out that such discourses of prison reform have accompanied the prison system since it was first established, and are hence part of its functioning, propping it up in spite of its failures by providing the constant excuse that it can be made to work differently.

Foucault’s broader thesis in Discipline and Punish is that we are living in a disciplinary society, of which the prison is merely a potent example. Discipline began not with the prisons, but originally in monastic institutions, spreading out through society via the establishment of professional armies, which required dressage, the training of individual soldiers in their movements so that they could coordinate with one another with precision. This, crucially, was a matter of producing what Foucault calls “docile bodies,” the basic unit of disciplinary power. The prison is just one of a raft of broadly similar disciplinary institutions that come into existence later. Schools, hospitals, and factories all combine methods similar to those of prisons for arranging bodies regularly in space, down to their minute movements. Moreover, all combine similar functions. Like the prison, they all have educational, economically productive, and medical aspects to them. The difference between these institutions is a matter of which aspect has primacy.

All disciplinary institutions also do something else that is quite novel, for Foucault: they produce a “soul” on the basis of the body, in order to imprison the body. This eccentric formulation of Foucault’s is meant to capture the way that disciplinary power has increasingly individualized people. Discipline and Punish begins with a vivid depiction of an earlier form of power in France, specifically the execution in 1757 of a man who had attempted to kill the King of France. As was the custom, for this, the most heinous of crimes in a political system focused on the person of the king, the most severe punishment was meted out: the culprit was publicly tortured to death. Foucault contrasts this with the routinized imprisonment that became the primary form of punishing criminals in the nineteenth century. From a form of power that punished by extraordinary and exemplary physical harm against a few transgressors, Western societies adopted a form of power that attempted to capture all individual behaviour. This is illustrated by a particular example that has become one of the best known images from Foucault’s work, the influential scheme of the eighteenth-century philosopher Jeremy Bentham called the “Panopticon,” a prison in which every action of the inmates would be visible. This serves as something of a paradigm for the disciplinary imperative, though it was never realized completely in practice.

Systems of monitoring and control nevertheless spread through all social institutions: schools, workplaces, and the family. While criminals had in a sense already been punished individually, they were not treated as individuals in the full sense that now developed. Disciplinary institutions such as prisons seek to develop detailed individual psychological profiles of people, and seek to alter their behaviour at the same level. Where previously most people had been part of a relatively undifferentiated mass, individuality being the preserve of a prominent or notorious few, and even then a relatively thin individuality, a society of individuals now developed, where everyone is supposed to have their own individual life story. This constitutes the soul Foucault refers to.

5. Sexuality

The thread of individualization runs through his next book, the first of what were ultimately three volumes of his History of Sexuality. He gave this volume the title The Will to Knowledge. It appeared only a year after Discipline and Punish. Still, three courses at the Collège de France intervene between his lectures on matters penitential and the book on sexuality. The first, Psychiatric Power, picks up chronologically where The History of Madness had left off, and applies Foucault’s genealogical method to the history of psychiatry. The next year, 1975, Foucault gave a series of lectures entitled Abnormal. These link together the studies on the prison with those on psychiatry and the question of sexuality through the study of the category of the abnormal, to which criminals, the mad, and sexual “perverts” were all assigned. Parts of these lectures indeed effectively reappear in The Will to Knowledge.

Like Discipline and Punish, The Will to Knowledge contains both general and specific conclusions. Regarding the specific problem of sexuality, Foucault couches his thesis as a debunking of a certain received wisdom in relation to the history of sexuality that he calls “the repressive hypothesis.” This is the view that our sexuality has historically been repressed, particularly in the nineteenth century, but that during the twentieth century it has been progressively liberated, and that we need now to get rid of our remaining hang-ups about sex by talking openly and copiously about it. Foucault allows the core historical claim that there has been sexual repressiveness, but thinks that this is relatively unimportant in the history of sexuality. Much more important, he thinks, is an injunction to talk about our sexuality that has consistently been imposed even during the years of repressiveness, and is now intensified, ostensibly for the purpose of lifting our repression. Foucault again sees a disciplinary technique at work, namely confession. This began in the Catholic confessional, with the Church spreading the confessional impulse in relation to sex throughout society in the early modern period. Foucault thinks this impulse has since been made secular, particularly under the auspices of institutional psychiatry, introducing a general compulsion for everyone to tell the truth about themselves, with their sexuality a particular focus. For Foucault, there is no such thing as sexuality apart from this compulsion. That is, sexuality itself is not something that we naturally have, but rather something that has been invented and imposed.

The implication of his genealogy of sexuality is that “sex” as we understand it is an artificial construct within this recent “device” (dispositif) of sexuality. This includes both the category of the sexual, encompassing certain organs and acts, and “sex” in the sense of gender, an implication spelt out by Foucault in his introduction to the memoirs of Herculine Barbin, a nineteenth century French hermaphrodite, which Foucault discovered and arranged to have published. Foucault’s thought, and his work on sexuality in particular, has been immensely influential in the recent “third wave” of feminist thought. The interaction of Foucault and feminism is the topic of a dedicated article elsewhere in this encyclopedia.

6. Power

The most general claim of The Will to Knowledge, and of Foucault’s entire political thought, is his answer to the question of where machinations such as sex and discipline come from. Who and what is it that is responsible for the production of criminality via imprisonment? Foucault’s answer is, in a word, “power.” That is to say that no one in particular is producing these things, but that rather they are effects generated by the interaction of power relations, which produce intentions of their own, not necessarily shared by any individuals or institutions. Foucault’s account of power is the broadest of the conclusions of The Will to Knowledge. Although similar reflections on power can be found in Discipline and Punish and in lectures and interviews of the same period, The Will to Knowledge gives his most comprehensive account of power. Foucault understands power in terms of “strategies” which are produced through the concatenation of the power relations that exist throughout society, wherever people interact. As he explains in a later text, “The Subject and Power,” which effectively completes the account of power given in The Will to Knowledge, these relations are a matter of people acting on one another to make other people act in turn. Whenever we try to influence others, this is power. However, our attempts to influence others rarely turn out the way we expect; moreover, even when they do, we have very little idea what effects our actions on others have more broadly. In this way, the social effects of our attempts to influence other people run quite outside of our control or ken. This effect is neatly encapsulated in a remark attributed to Foucault that we may know what we do, but we do not know what what we do does. What it does is produce strategies that have a kind of life of their own.
Thus, although no one in the prison system, neither the inmates, nor the guards, nor politicians, want prisons to produce a class of criminals, this is nonetheless what the actions of all the people involved do.

Controversy around Foucault’s political views has focused on his reconception of power. Criticisms of him on this point, however, invariably either fail to appreciate his true position or beg the question against it by simply restating the views he has rejected. He has been interpreted as thinking that power is a mysterious, autonomous force that exists independently of any human influence, and is so all-encompassing as to preclude any resistance to it. Foucault clearly states in The Will to Knowledge that this is not so, though it is admittedly relatively difficult to understand his position, namely that resistance to power is not outside power. The point here for Foucault is not that resistance is futile, but that power is so ubiquitous that in itself it is not an obstacle to resistance. One cannot resist power as such, but only specific strategies of power, and then only with great difficulty, given the tendency of strategies to absorb apparently contradictory tendencies. Still, for Foucault power is never conceived as monolithic or autonomous, but rather is a matter of superficially stable structures emerging on the basis of constantly shifting relations underneath, caused by an unending struggle between people. Foucault explains this in terms of the inversion of Clausewitz’s dictum that war is politics continued by other means into the claim that “politics is war by other means.” For Foucault, apparently peaceful and civilized social arrangements are supported by people locked in a struggle for supremacy, which is eternally susceptible to change, via the force of that struggle itself.

Foucault is nevertheless condemned by many liberal commentators for his failure to make any normative distinction between power and resistance, that is, for his relativism. This accusation is well founded: he consistently eschews any kind of overtly normative stance in his thought. He thus does not normatively justify resistance, but there is no inherent contradiction in a non-normative resistance. The idea is coherent, though of course those who think it impossible to have a non-normative political thought (which is a consensus position within political philosophy) will reject him on this basis. For his part, he offers only analyses that he hopes will prove useful to people struggling in concrete situations, rather than prescriptions as to what is right or wrong.

One last accusation, coming from a particularly noteworthy source, the most prominent living German philosopher, Jürgen Habermas, should also be mentioned. This accusation is namely that Foucault’s account of power is “functionalist.” Functionalism in sociology means taking society as a functional whole and thus reading every part as having distinct functions. The problem with this view is that society is not designed by anyone and consequently much of it is functionally redundant or accidental. Foucault does use the vocabulary of “function” on occasion in his descriptions of the operations of power, but does not show any allegiance to or even awareness of functionalism as a school of thought. His position in any case is not that society constitutes a totality or whole via any necessity: functions exist within strategies that emerge spontaneously from below, and the functions of any element are subject to change.

7. Biopower

Foucault’s position in relation to resistance implies not so much that one is defeated before one begins as that one must proceed with caution to avoid simply supporting a strategy of power while thinking oneself rebellious. This is for him what has happened in respect of sexuality in the case of the repressive hypothesis. Though we try to liberate ourselves from sexual repression, we in fact play into a strategy of power which we do not realize exists. This strategy is for everyone to constitute themselves as “‘subjects’ in both senses of the word,” a process Foucault designates “subjection” (assujettissement). The two senses here are passive and active. On the one hand, we are subjected in this process, made into passive subjects of study by medical professionals, for example. On the other, we are the subjects in this process, having to actively confess our sexual proclivities and indeed in the process develop an identity based on this confessed sexuality. So, power operates in ways that are both overtly oppressive and more positive.

Sexuality for Foucault has a quite extraordinary importance in the contemporary network of power relations. It has become the essence of our personal identity, and sex has come to be seen as “worth dying for.” Foucault details how sexuality had its beginnings as a preoccupation of the newly dominant bourgeois class, who were obsessed with physical and reproductive health, and their own pleasure. This class produced sexuality positively, though one can see that it would have been imposed on women and children within that class quite regardless of their wishes. For Foucault, there are four consistent strategies of the device of sexuality: the pathologisation of the sexuality of women, the parallel pathologisation of the sexuality of children, the concomitant medicalization of the sexually abnormal “pervert,” and the constitution of sexuality as an object of public concern. From its beginning in the bourgeoisie, sexuality was then, Foucault argues, imposed more crudely by public health agencies on the rest of the populace, quite against their wishes.

Why has this happened? For Foucault, the main explanation is how sexuality ties together multiple “technologies of power”, namely discipline on the one hand, and a newer technology, which he calls “bio-politics,” on the other. In The Will to Knowledge, Foucault calls this combination of discipline and bio-politics together “bio-power,” though confusingly he elsewhere seems to use “bio-power” and “bio-politics” as synonyms, notably in his 1976 lecture series, Society Must Be Defended. He also elsewhere dispenses with the hyphens in these words, as the present article will hereafter.

Biopolitics is a technology of power that grew up on the basis of disciplinary power. Where discipline is about the control of individual bodies, biopolitics is about the control of entire populations. Where discipline constituted individuals as such, biopolitics does this with the population. Prior to the invention of biopolitics, there was no serious attempt by governments to regulate the people who lived in a territory, only piecemeal violent interventions to put down rebellions or levy taxes. As with discipline, the main precursor to biopolitics can be found in the Church, which is the institution that did maintain records of births and deaths, and did minister to the poor and sick, in the medieval period. In the modern period, the perception grew among governments that interventions in the life of the people would produce beneficial consequences for the state, preventing depopulation, ensuring a stable and growing tax base, and providing a regular supply of manpower for the military. Hence they took an active interest in the lives of the people. Disciplinary mechanisms allowed the state to do this through institutions, most notably perhaps medical institutions that allowed the state to monitor and support the health of the population. Sex was the most intense site at which discipline and biopolitics intersected, because any intervention in population via the control of individual bodies fundamentally had to be about reproduction, and also because sex is one of the major vectors of disease transmission. Sex had to be controlled, regulated, and monitored if the population was to be brought under control.

There is another technology of power in play, however, older than discipline, namely “sovereign power.” This is the technology we glimpse at the beginning of Discipline and Punish, one that works essentially by violence and by taking, rather than by positively encouraging and producing as both discipline and biopolitics do. This form of power was previously the way in which governments dealt both with individual bodies and with masses of people. While it has been replaced in these two roles by discipline and biopower, it retains a role nonetheless at the limits of biopower. When discipline breaks down, when the regulation of the population breaks down, the state continues to rely on brute force as a last resort. Moreover, the state continues to rely on brute force, and the threat of it, in dealing with what lies outside its borders.

For Foucault, there is a mutual incompatibility between biopolitics and sovereign power. Indeed, he sometimes refers to sovereign power as “thanatopolitics,” the politics of death, in contrast to biopolitics’s politics of life. Biopolitics is a form of power that works by helping you to live, thanatopolitics by killing you, or at best allowing you to live. It seems impossible for any individual to be simultaneously gripped by both forms of power, notwithstanding a possible conflict between different states or state agencies. There is a need for a dividing line between the two, between who is to be “made to live,” as Foucault puts it, and who is to be killed or simply allowed to go on living indifferently. The most obvious dividing line is the boundary between the population and its outside at the border of a territory, but the “biopolitical border,” as it has been called by recent scholars, is not the same as the territorial border. In Society Must Be Defended, Foucault suggests there is a device he calls “state racism,” that comes variably into play in deciding who is to receive the benefits of biopolitics or be exposed to the risk of death.

Foucault does not use this term in any of the works he published himself, but nevertheless does point in The Will to Knowledge to a close relationship between biopolitics and racism. Discourses of scientific racism that emerged in the nineteenth century posited a link between the sexual “degeneracy” of individuals and the hygiene of the population at large. By the early twentieth century, eugenics, the pseudo-science of improving the vitality of a population through selective breeding, was implemented to some extent in almost all industrialized countries. It of course found its fullest expression in Nazi Germany. Nevertheless, Foucault is quite clear that there is something paradoxical about such attempts to link the old theme of “blood” to modern concerns with population health. The essential point about “state racism” is not then that it necessarily links to what we might ordinarily understand as racism in its strict sense, but that there has to be a dividing line in modern biopolitical states between what is part of the population and what is not, and that this is, in a broad sense, racist.

8. Governmentality

After the publication of The Will to Knowledge, Foucault took a one-year hiatus from lecturing at the Collège de France. He returned in 1978 with a series of lectures that followed logically from his 1976 ones, but show a distinct shift in conceptual vocabulary. Talk of “biopolitics” is almost absent. A new concept, “governmentality,” takes its place. The lecture series of 1978 and 1979, Security, Territory, Population and The Birth of Biopolitics, center on this concept, despite the somewhat misleading title of the latter in this regard.

“Governmentality” is a portmanteau word, derived from the phrase “governmental rationality.” A governmentality is thus a logic by which a polity is governed. But this logic is for Foucault, in keeping with his genealogical perspective (which he still affirms), not merely ideal, but rather encompasses institutions, practices and ideas. More specifically, Foucault defines governmentality in Security, Territory, Population as allowing for a complex form of “power which has the population as its target, political economy as its major form of knowledge, and apparatuses of security as its essential technical instrument” (pp. 107–8). Confusingly, however, Foucault in the same breath also defines other senses in which he will use the term “governmentality.” He will use it not only to describe this recent logic of government, but as the longer tendency in Western history that has led to it, and the specific process in the early modern period by which modern governmentality was formed.

“Governmentality” is a slippery concept then. Still, it is an important one. Governmentality seems to be closely contemporaneous and functionally isomorphic with biopolitics, and hence to replace it in Foucault’s thought. That said, unlike biopolitics, it never figures in a major publication of his – he only allowed one crucial lecture of Security, Territory, Population to be published under the title of “Governmentality,” in an Italian journal. It is via the English translation of this essay that the concept has become known in English, this one essay of Foucault’s in fact inspiring an entire school of sociological reflection.

What is the meaning of this fuzzy concept then? Foucault never repudiates biopower. During these lectures he on multiple occasions reaffirms his interest in biopower as an object of study, and does so as late as 1983, the year before he died. The meaning of governmentality as a concept is to situate biopower in a larger historical moment, one that stretches further back in history and encompasses more elements, in particular the discourses of economics and regulation of the economy.

Foucault details two main phases in the development of governmentality. The first is what he identifies as raison d’État, literally “reason of state.” This is the central object of study of Security, Territory, Population. It correlates the technology of discipline, as an attempt to regulate society to the fullest extent, with what was contemporaneously called “police.” This governmentality gave way by the eighteenth century to a new form of governmentality, what would become political liberalism, which reacts against the failures of governmental regulation with the idea that society should be left to regulate itself naturally, with the power of police applied only negatively in extremis. This for Foucault is broadly the governmentality that has lasted to this day, and The Birth of Biopolitics studies it in its most recent form, what is called “neo-liberalism.” With this governmentality, we see freedom of the individual and regulation of the population subtly intertwined.

9. Ethics

The 1980s see a significant turn in Foucault’s work, both in terms of the discourses he attends to and the vocabulary he uses. Specifically, he now focuses mainly on Ancient texts from Greece and Rome, and prominently uses the concepts of “subjectivity” and “ethics.” None of these elements is entirely new to his work, but they assume novel prominence and combination at this point.

There is an article elsewhere in this encyclopedia about Foucault’s ethics. The question here is what the specifically political import of this ethics is. It is often assumed that the meaning of Foucault’s ethics is to retract his earlier political thought and thus to recant on that political project, retreating from political concerns towards a concern with individual action. There is a grain of truth to such allegations, but no more than a grain. While certainly Foucault’s move to the consideration of ethics is a move away from an explicitly political engagement, there is no recantation or contradiction of his previous positions, only the offering of an account of subjectivity and ethics that might enrich these.

Foucault’s turn towards subjectivity is similar to his earlier turn towards power: he seeks to add a dimension to the accounts and approach he has built up. As in the case of power, he does so not by helping himself to an available approach, but by producing a new one: Foucault’s own account of subjectivity is original and quite different to the extant accounts of subjectivity he rejected in his earlier work. Subjectivity for Foucault is a matter of people’s ability to shape their own conduct. In this way, his account relates to his previous work on government, with subjectivity a matter of the government of the self. It is thus closely linked to his political thought, as a question of the power that penetrates the interior of the person.

“Ethics” too is understood in these terms. Foucault does not produce an “ethics” in the sense that the word is conventionally used today to mean a normative morality, nor indeed does he produce a “political philosophy” in the sense that that phrase is conventionally used, which is to say a normative politics. “Ethics” for Foucault is rather understood etymologically by reference to Ancient Greek reflection on the ethike, which is to say, on character. Ancient Greek ethics was marked by what Foucault calls the “care of the self”: it is essentially a practice of fashioning the self. In such practices, Foucault sees a potential basis for resistance to power, though he is clear that no truly ethical practices exist today and it is by no means clear that they can be reestablished. Ethics has, on the contrary, been abnegated by Christianity with its mortificatory attitude to the self. This account of ethics can be found primarily in Foucault’s last three lecture series at the Collège de France, The Hermeneutics of the Subject, The Government of the Self and Others and The Courage of the Truth.

10. References and Further Reading

English translations of works by Foucault named above, in the order they were originally written.

a. Primary

    • Mental Illness and Psychology. Berkeley: University of California Press, 1987.
    • The History of Madness. London: Routledge, 2006.
    • Birth of the Clinic. London: Routledge, 1989.
    • The Order of Things. London: Tavistock, 1970.
    • The Archaeology of Knowledge. New York: Pantheon, 1972.
    • “The order of discourse,” in M. Shapiro, ed., Language and politics (Blackwell, 1984), pp. 108-138. Translated by Ian McLeod.
    • Psychiatric Power. New York: Palgrave Macmillan, 2006.
    • Discipline and Punish. London: Allen Lane, 1977.
    • Abnormal. London: Verso, 2003.
    • Society Must Be Defended. New York: Picador, 2003.
    • An Introduction. Vol. 1 of The History of Sexuality. New York: Pantheon, 1978. Reprinted as The Will to Knowledge, London: Penguin, 1998.
    • Security, Territory, Population. New York: Picador, 2009.
    • The Birth of Biopolitics. New York: Picador, 2010.
    • “Introduction,” M. Foucault, ed., Herculine Barbin: being the recently discovered memoirs of a nineteenth-century French hermaphrodite. Brighton, Sussex: Harvester Press, 1980, pp. vii-xvii. Translated by Richard McDougall.
    • “The Subject and Power,” J. Faubion, ed., Power. New York: New Press, 2000, pp. 326-348.
    • The Hermeneutics of the Subject. New York: Palgrave Macmillan, 2005.
    • The Government of the Self and Others. New York: Palgrave Macmillan, 2010.
    • The Courage of the Truth. New York: Palgrave Macmillan, 2011.

The shorter writings and interviews of Foucault are also of extraordinary interest, particularly to philosophers. In French, these have been published in an almost complete collection, Dits et écrits, by Gallimard, first in four volumes and more recently in a two-volume edition. In English, Foucault’s shorter works are spread across many overlapping anthologies, which even between them omit much that is important. The most important of these anthologies for Foucault’s political thought are:

  • J. Faubion, ed., Power Vol. 3, Essential Works. New York: New Press, 2000.
  • Colin Gordon, ed., Power/Knowledge. Brighton, Sussex: Harvester Press, 1980.

b. Secondary

  • Graham Burchell, Colin Gordon and Peter Miller (eds.), The Foucault Effect. Chicago: University of Chicago Press, 1991.
    • An edited collection that is a mixture of primary and secondary sources. Both parts of the book have been extraordinarily influential. It constitutes a decent primer on governmentality.
  • Gilles Deleuze, Foucault. Trans. Seán Hand. London: Athlone, 1988.
    • The best book about Foucault’s work, from one who knew him. Though predictably idiosyncratic, it is still a pointedly political reading.
  • David Couzens Hoy (ed.), Foucault: A Critical Reader. Oxford: Blackwell, 1986.
    • An excellent selection of critical essays, mostly specifically political.
  • Mark G. E. Kelly, The Political Philosophy of Michel Foucault. New York: Routledge, 2009.
    • A comprehensive treatment of Foucault’s political thought from a specifically philosophical angle.
  • John Rajchman, Michel Foucault: The Freedom of Philosophy. New York: Columbia University Press, 1985.
    • An idiosyncratic reading of Foucault that is particularly good at synthesising his entire career according to the political animus behind it.
  • Jon Simons, Foucault and the Political. London: Routledge, 1995.
    • The first work to focus on Foucault’s thought from a political perspective, this survey of his work serves well as a general introduction to the topic, though it necessarily lacks consideration of much that has appeared in English since its publication.
  • Barry Smart (ed.), Michel Foucault: Critical Assessments (multi-volume). London: Routledge, 1995.
    • Includes multiple sections on Foucault’s political thought.

Author Information

Mark Kelly
Email: markgekelly@gmail.com
Middlesex University
United Kingdom

Western Theories of Justice

Justice is one of the most important moral and political concepts.  The word comes from the Latin jus, meaning right or law.  The Oxford English Dictionary defines the “just” person as one who typically “does what is morally right” and is disposed to “giving everyone his or her due,” offering the word “fair” as a synonym.  But philosophers want to get beyond etymology and dictionary definitions to consider, for example, the nature of justice as both a moral virtue of character and a desirable quality of political society, as well as how it applies to ethical and social decision-making.  This article will focus on Western philosophical conceptions of justice.  These will be the greatest theories of ancient Greece (those of Plato and Aristotle) and of medieval Christianity (Augustine and Aquinas), two early modern ones (Hobbes and Hume), two from more recent modern times (Kant and Mill), and some contemporary ones (Rawls and several successors).  Typically the article considers not only their theories of justice but also how philosophers apply their own theories to controversial social issues—for example, to civil disobedience, punishment, equal opportunity for women, slavery, war, property rights, and international relations.

For Plato, justice is a virtue establishing rational order, with each part performing its appropriate role and not interfering with the proper functioning of other parts. Aristotle says justice consists in what is lawful and fair, with fairness involving equitable distributions and the correction of what is inequitable.  For Augustine, the cardinal virtue of justice requires that we try to give all people their due; for Aquinas, justice is that rational mean between opposite sorts of injustice, involving proportional distributions and reciprocal transactions.  Hobbes believed justice is an artificial virtue, necessary for civil society, a function of the voluntary agreements of the social contract; for Hume, justice essentially serves public utility by protecting property (broadly understood).  For Kant, it is a virtue whereby we respect others’ freedom, autonomy, and dignity by not interfering with their voluntary actions, so long as those do not violate others’ rights; Mill said justice is a collective name for the most important social utilities, which are conducive to fostering and protecting human liberty.  Rawls analyzed justice in terms of maximum equal liberty regarding basic rights and duties for all members of society, with socio-economic inequalities requiring moral justification in terms of equal opportunity and beneficial results for all; and various post-Rawlsian philosophers develop alternative conceptions.

Western philosophers generally regard justice as the most fundamental of all virtues for ordering interpersonal relations and establishing and maintaining a stable political society.  By tracking the historical interplay of these theories, what will be advocated is a developing understanding of justice in terms of respecting persons as free, rational agents.  One may disagree about the nature, basis, and legitimate application of justice, but this is its core.

Table of Contents

  1. Ancient Greece
    1. Plato
    2. Aristotle
  2. Medieval Christianity
    1. Augustine
    2. Aquinas
  3. Early Modernity
    1. Hobbes
    2. Hume
  4. Recent Modernity
    1. Kant
    2. Mill
  5. Contemporary Philosophers
    1. Rawls
    2. Post-Rawls
  6. References and Further Readings
    1. Primary Sources
    2. Secondary Sources

1. Ancient Greece

For all their originality, even Plato’s and Aristotle’s philosophies did not emerge in a vacuum.  As far back in ancient Greek literature as Homer, the concept of dikaion, used to describe a just person, was important.  From this emerged the general concept of dikaiosune, or justice, as a virtue that might be applied to a political society.  The issue of what does and does not qualify as just could logically lead to controversy regarding the origin of justice, as well as that concerning its essence.  Perhaps an effective aid to appreciating the power of their thought is to view it in the context of the teachings of the Sophists, those itinerant teachers of fifth-century ancient Greece who tried to pass themselves off as “wise” men.  In his trial, Socrates was at pains to dissociate himself from them, after his conviction refusing to save himself, as a typical Sophist would, by employing an act of civil disobedience to escape (Dialogues, pp. 24-26, 52-56; 18b-19d, 50a-54b); Plato is more responsible than anyone else for giving them the bad name that sticks with them to this present time; and Aristotle follows him in having little use for them as instructors of rhetoric, philosophy, values, and the keys to success.  So what did these three great philosophers (literally “lovers of wisdom”) find so ideologically objectionable about the Sophists?  The brief answer is, their relativism and their skepticism.  The first important one, Protagoras, captures the former with his famous saying, “Man is the measure of all things—of the things that are, that they are, and of the things that are not, that they are not”; and he speaks to the latter with a declaration of agnosticism regarding the existence of divinities.  
Gorgias (Plato named dialogues after both of them) is remembered for a striking three-part statement of skepticism, holding that nothing really exists, that, even if something did exist, we could not grasp it, and that, even if we could grasp something real, we could never express it to anyone else.  If all values are subjective and/or unknowable, then what counts as just gets reduced to a matter of shifting opinion.  We can easily anticipate how readily Sophists would apply such relativism and skepticism to justice.  For example, Thrasymachus (who figures into the first book of Plato’s Republic) is supposed to have said that there must not be any gods who care about us humans because, while justice is our greatest good, men commonly get away with injustice.  But the most significant Sophist statement regarding justice arguably comes from Antiphon, who employs the characteristic distinction between custom (nomos) and nature (physis) with devastating effect.  He claims that the laws of justice, matters of convention, should be obeyed when other people are observing us and may hold us accountable; but, otherwise, we should follow the demands of nature.  The laws of justice, being extrinsically derived, presumably involve serving the good of others, while the demands of nature, being internal, serve self-interest.  He even suggests that obeying the laws of justice often renders us helpless victims of those who do not (First, pp. 211, 232, 274, 264-266).  If there is any such objective value as natural justice, then it is reasonable for us to attempt a rational understanding of it.  On the other hand, if justice is merely a construction of customary agreement, then such a quest is doomed to frustration and failure.  With this as a backdrop, we should be able to see what motivated Plato and Aristotle to seek a strong alternative.

a. Plato

Plato’s masterful Republic (to which we have already referred) is most obviously a careful analysis of justice, although the book is far more wide-ranging than that would suggest.  Socrates, Plato’s teacher and primary spokesman in the dialogue, gets critically involved in a discussion of that very issue with three interlocutors early on.  Socrates provokes Cephalus to say something which he spins into the view that justice simply boils down to always telling the truth and repaying one’s debts.  Socrates easily demolishes this simplistic view with the effective logical technique of a counter-example:  if a friend lends you weapons, when he is sane, but then wants them back to do great harm with them, because he has become insane, surely you should not return them at that time and should even lie to him, if necessary to prevent great harm.  Secondly, Polemarchus, the son of Cephalus, jumps into the discussion, espousing the familiar, traditional view that justice is all about giving people what is their due.  But the problem with this bromide is that of determining who deserves what.  Polemarchus may reflect the cultural influence of the Sophists, in specifying that it depends on whether people are our friends, deserving good from us, or foes, deserving harm.  It takes more effort for Socrates to destroy this conventional theory, but he proceeds in stages:  (1) we are all fallible regarding who are true friends, as opposed to true enemies, so that appearance versus reality makes it difficult to say how we should treat people; (2) it seems at least as significant whether people are good or bad as whether they are our friends or our foes; and (3) it is not at all clear that justice should excuse, let alone require, our deliberately harming anyone (Republic, pp. 5-11; 331b-335e).  If the first inadequate theory of justice was too simplistic, this second one was downright dangerous.

The third, and final, inadequate account presented here is that of the Sophist Thrasymachus.  He roars into the discussion, expressing his contempt for all the poppycock produced thus far and boldly asserting that justice is relative to whatever is advantageous to the stronger people (what we sometimes call the “might makes right” theory).  But who are the “stronger” people?  Thrasymachus cannot mean physically stronger, for then inferior humans would be superior to finer folks like them.  He clarifies his idea that he is referring to politically powerful people in leadership positions.  But, next, even the strongest leaders are sometimes mistaken about what is to their own advantage, raising the question of whether people ought to do what leaders suppose is to their own advantage or only what actually is so.  (Had Thrasymachus phrased this in terms of what serves the interest of society itself, the same appearance versus reality distinction would apply.)  But, beyond this, Socrates rejects the exploitation model of leadership, which sees political superiors as properly exploiting inferiors (Thrasymachus uses the example of a shepherd fattening up and protecting his flock of sheep for his own selfish gain), substituting a service model in its place (his example is of the good medical doctor, who practices his craft primarily for the welfare of patients).  So, now, if anything like this is to be accepted as our model for interpersonal relations, then Thrasymachus embraces the “injustice” of self-interest as better than serving the interests of others in the name of “justice.”  Well, then, how are we to interpret whether the life of justice or that of injustice is better?  Socrates suggests three criteria for judgment:  which is the smarter, which is the more secure, and which is the happier way of life; he argues that the just life is better on all three counts.  
Thus, by the end of the first book, it looks as if Socrates has trounced all three of these inadequate views of justice, although he himself claims to be dissatisfied because we have only shown what justice is not, with no persuasive account of its actual nature (ibid., pp. 14-21, 25-31; 338c-345b, 349c-354c).  Likewise, in Gorgias, Plato has Callicles espouse the view that, whatever conventions might seem to dictate, natural justice dictates that superior people should rule over and derive greater benefits than inferior people, that society artificially levels people because of a bias in favor of equality.  Socrates is then made to criticize this theory by analyzing what sort of superiority would be relevant and then arguing that Callicles is erroneously advocating injustice, a false value, rather than the genuine one of true justice (Gorgias, pp. 52-66; 482d-493c; see, also, Laws, pp. 100-101, 172; 663, 714 for another articulation of something like Thrasymachus’ position).

In the second book of Plato’s Republic, his brothers, Glaucon and Adeimantus, take over the role of primary interlocutors.  They quickly make it clear that they are not satisfied with Socrates’ defense of justice.  Glaucon reminds us that there are three different sorts of goods—intrinsic ones, such as joy, merely instrumental ones, such as money-making, and ones that are both instrumentally and intrinsically valuable, such as health—in order to ask which type of good is justice.  Socrates responds that justice belongs in the third category, rendering it the richest sort of good.  In that case, Glaucon protests, Socrates has failed to prove his point.  If his debate with Thrasymachus accomplished anything at all, it nevertheless did not establish any intrinsic value in justice.  So Glaucon will play devil’s advocate and resurrect the Sophist position, in order to challenge Socrates to refute it in its strongest form.  He proposes to do this in three steps:  first, he will argue that justice is merely a conventional compromise (between harming others with impunity and being their helpless victims), agreed to by people for their own selfish good and socially enforced (this is a crude version of what will later become the social contract theory of justice in Hobbes); second, he illustrates our allegedly natural selfish preference for being unjust if we can get away with it by the haunting story of the ring of Gyges, which provides its wearer with the power to become invisible at will and, thus, to get away with the most wicked of injustices—to which temptation everyone would, sooner or later, rationally succumb; and, third, he tries to show that it is better to live unjustly than justly if one can by contrasting the unjust person whom everyone thinks just with the just person who is thought to be unjust, claiming that, of course, it would be better to be the former than the latter.  
Almost as soon as Glaucon finishes, his brother Adeimantus jumps in to add two more points to the case against justice:  first, parents instruct their children to behave justly not because it is good in itself but merely because it tends to pay off for them; and, secondly, religious teachings are ineffective in encouraging us to avoid injustice because the gods will punish it and to pursue justice because the gods will reward it, since the gods may not even exist or, if they do, they may well not care about us or, if they are concerned about human behavior, they can be flattered with prayers and bribed with sacrifices to let us get away with wrongdoing (Republic, pp. 33-42; 357b-366e).  So the challenge for Socrates posed by Plato’s brothers is to show the true nature of justice and that it is intrinsically valuable rather than only desirable for its contingent consequences.

In defending justice against this Sophist critique, Plato has Socrates construct his own positive theory.  This is set up by means of an analogy comparing justice, on the large scale, as it applies to society, and on a smaller scale, as it applies to an individual soul.  Thus justice is seen as an essential virtue of both a good political state and a good personal character.  The strategy hinges on the idea that the state is like the individual writ large—each comprising three main parts such that it is crucial how they are interrelated—and that analyzing justice on the large scale will facilitate our doing so on the smaller one.  In Book IV, after cobbling together his blueprint of the ideal republic, Socrates asks Glaucon where justice is to be found, but they agree they will have to search for it together.  They agree that, if they have succeeded in establishing the foundations of a “completely good” society, it would have to comprise four pivotal virtues:  wisdom, courage, temperance, and justice.  If they can properly identify the other three of those four, whatever remains that is essential to a completely good society must be justice.  Wisdom is held to be prudent judgment among leaders; courage is the quality in defenders or protectors whereby they remain steadfast in their convictions and commitments in the face of fear; and temperance (or moderation) is the virtue to be found in all three classes of citizens, but especially in the producers, allowing them all to agree harmoniously that the leaders should lead and everyone else follow.  So now, by this process-of-elimination analysis, whatever is left that is essential to a “completely good” society will allegedly be justice.  It then turns out that “justice is doing one’s own work and not meddling with what isn’t one’s own.”  So the positive side of socio-political justice is each person doing the tasks assigned to him or her; the negative side is not interfering with others doing their appointed tasks.  
Now we move from this macro-level of political society to the psychological micro-level of an individual soul, pressing the analogy mentioned above.  Plato has Socrates present an argument designed to show that reason in the soul, corresponding to the leaders or “guardians” of the state, is different from both the appetites, corresponding to the productive class, and the spirited part of the soul, corresponding to the state’s defenders or “auxiliaries” and that the appetites are different from spirit.  Having established the parallel between the three classes of the state and the three parts of the soul, the analogy suggests that a “completely good” soul would also have to have the same four pivotal virtues.  A good soul is wise, in having good judgment whereby reason rules; it is courageous in that its spirited part is ready, willing, and able to fight for its convictions in the face of fear; and it is temperate or moderate, harmoniously integrated because all of its parts, especially its dangerous appetitive desires, agree that it should be always under the command of reason.  And, again, what is left that is essential is justice, whereby each part of the soul does the work intended by nature, none of them interfering with the functioning of any other parts.  We are also told in passing that, corresponding to these four pivotal virtues of the moral life, there are four pivotal vices, foolishness, cowardice, self-indulgence, and injustice.  One crucial question remains unanswered:  can we show that justice, thus understood, is better than injustice in itself and not merely for its likely consequences?  The answer is that, of course, we can because justice is the health of the soul.  Just as health is intrinsically and not just instrumentally good, so is justice; injustice is a disease—bad and to be avoided even if it isn’t yet having any undesirable consequences, even if nobody is aware of it (ibid., pp. 
43, 102-121; 368d, 427d-445b; it can readily be inferred that this conception of justice is non-egalitarian; but, to see this point made explicitly, see Laws, pp. 229-230; 756-757).

Now let us quickly see how Plato applies this theory of justice to a particular social issue, before briefly considering the theory critically.  In a remarkably progressive passage in Book V of his Republic, Plato argues for equal opportunity for women.  He holds that, even though women tend to be physically weaker than men, this should not prove an insuperable barrier to their being educated for the same socio-political functions as men, including those of the top echelons of leadership responsibility.  While the body has a gender, it is the soul that is virtuous or vicious.  Despite their different roles in procreation, child-bearing, giving birth, and nursing babies, there is no reason, in principle, why a woman should not be as intelligent and virtuous—including as just—as men, if properly trained.  As much as possible, men and women should share the workload in common (Republic, pp. 125-131; 451d-457d).  We should note, however, that the rationale is the common good of the community rather than any appeal to what we might consider women’s rights.  Nevertheless, many of us today are sympathetic to this application of justice in support of a view that would not become popular for another two millennia.

What of Plato’s theory of justice itself?  The negative part of it—his critique of inadequate views of justice—is a masterful series of arguments against attempts to reduce justice to a couple of simplistic rules (Cephalus), to treating people merely in accord with how we feel about them (Polemarchus), and to the power-politics mentality of exploiting them for our own selfish purposes (Thrasymachus).  All of these views of a just person or society introduce the sort of relativism and/or subjectivism we have identified with the Sophists.  Thus, in refuting them, Plato, in effect, is refuting the Sophists.  However, after the big buildup, the positive part—what he himself maintains justice is—turns out to be a letdown.  His conception of justice reduces it to order.  While some objective sense of order is relevant to justice, this does not adequately capture the idea of respecting all persons, individually and collectively, as free rational agents.  The analogy between the state and the soul is far too fragile to support the claim that they must agree in each having three “parts.”  The process-of-elimination approach to determining the nature of justice only works if those four virtues exhaust the list of what is essential here.  But do they?  What, for example, of the Christian virtue of love or the secular virtue of benevolence?  Finally, the argument from analogy, showing that justice must be intrinsically, and not merely instrumentally, valuable (because, like health, it is both intrinsically and instrumentally good) proves, on critical consideration, to fail.  Plato’s theory is far more impressive than the impressionistic view of the Sophists; and it would prove extremely influential in advocating justice as an objective, disinterested value.  Nevertheless, one cannot help hoping that a more cogent theory might yet be developed.

b. Aristotle

After working with Plato at his Academy for a couple of decades, Aristotle was understandably most influenced by his teacher, also adopting, for example, a virtue theory of ethics.  Yet part of Aristotle’s greatness stems from his capacity for critical appropriation, and he became arguably Plato’s most able critic as well as his most famous follower in wanting to develop a credible alternative to Sophism.  Book V of his great Nicomachean Ethics deals in considerable depth with the moral and political virtue of justice.  It begins vacuously enough with the circular claim that it is the condition that renders us just agents inclined to desire and practice justice.  But his analysis soon becomes more illuminating when he specifies it in terms of what is lawful and fair.  What is in accordance with the law of a state is thought to be conducive to the common good and/or to that of its rulers.  In general, citizens should obey such law in order to be just.  The problem is that civil law can itself be unjust in the sense of being unfair to some, so that we need to consider special justice as a function of fairness.  He analyzes this into two sorts:  distributive justice involves dividing benefits and burdens fairly among members of a community, while corrective justice requires us, in some circumstances, to try to restore a fair balance in interpersonal relations where it has been lost.  If a member of a community has been unfairly benefited or burdened with more or less than is deserved in the way of social distributions, then corrective justice can be required, as, for example, by a court of law.  Notice that Aristotle is no more an egalitarian than Plato was—while a sort of social reciprocity may be needed, it must be of a proportional sort rather than equal.  Like all moral virtues, for Aristotle, justice is a rational mean between bad extremes.  
Proportional equality or equity involves the “intermediate” position between someone’s unfairly getting “less” than is deserved and unfairly getting “more” at another’s expense.  The “mean” of justice lies between the vices of getting too much and getting too little, relative to what one deserves, these being two opposite types of injustice, one of “disproportionate excess,” the other of disproportionate “deficiency” (Nicomachean, pp. 67-74, 76; 1129a-1132b, 1134a).

Political justice, of both the lawful and the fair sort, is held to apply only to those who are citizens of a political community (a polis) by virtue of being “free and either proportionately or numerically equal,” those whose interpersonal relations are governed by the rule of law, for law is a prerequisite of political justice and injustice.  But, since individuals tend to be selfishly biased, the law should be a product of reason rather than of particular rulers.  Aristotle is prepared to distinguish between what is naturally just and unjust, on the one hand, such as whom one may legitimately kill, and what is merely conventionally just or unjust, on the other, such as a particular system of taxation for some particular society.  But the Sophists are wrong to suggest that all political justice is the artificial result of legal convention and to discount all universal natural justice (ibid., pp. 77-78; 1134a-1135a; cf. Rhetoric, pp. 105-106; 1374a-b).  What is allegedly at stake here is our developing a moral virtue that is essential to the well-being of society, as well as to the flourishing of any human being.  Another valuable dimension of Aristotle’s discussion here is his treatment of the relationship between justice and decency, for sometimes following the letter of the law would violate fairness or reasonable equity.  A decent person might selfishly benefit from being a stickler regarding following the law exactly but decide to take less or give more for the sake of the common good.  In this way, decency can correct the limitations of the law and represents a higher form of justice (Nicomachean, pp. 83-84; 1137a-1138a).

In his Politics, Aristotle further considers political justice and its relation to equality.  We can admit that the former involves the latter but must carefully qualify this by maintaining that justice involves equality “not for everyone, only for equals.”  He agrees with Plato that political democracy is intrinsically unjust because, by its very nature, it tries to treat unequals as if they were equals.  Justice rather requires inequality for people who are unequal.  But, then, oligarchy is also intrinsically unjust insofar as it involves treating equals as unequal because of some contingent disparity, of birth, wealth, etc.  Rather, those in a just political society who contribute the most to the common good will receive a larger share, because they thus exhibit more political virtue, than those who are inferior in that respect; it would be simply wrong, from the perspective of political justice, for them to receive equal shares.  Thus political justice must be viewed as a function of the common good of a community.  It is the attempt to specify the equality or inequality among people, he admits, that constitutes a key “problem” of “political philosophy.”  He thinks we can all readily agree that political justice requires “proportional” rather than numerical equality.  But inferiors have a vested interest in thinking that those who are equal in some respect should be equal in all respects, while superiors are biased, in the opposite direction, to imagine that those who are unequal in some way should be unequal in all ways.  Thus, for instance, those who are equally citizens are not necessarily equal in political virtue, and those who are financially richer are not necessarily morally or mentally superior.  What is relevant here is “equality according to merit,” though Aristotle cannot precisely specify what, exactly, counts as merit, how much it should count, who is to measure it, and by what standard.
All he can suggest, for example in some of his comments on the desirable aristocratic government, is that it must involve moral and intellectual virtue (Politics, pp. 79, 81, 86, 134, 136, 151, 153; 1280a, 1281a, 1282b, 1301a-1302a, 1307a, 1308a).

Let us now consider how Aristotle applies his own theory of justice to the social problem of alleged superiors and inferiors, before attempting a brief critique of that theory.  While Plato accepted slavery as a legitimate social institution but argued for equal opportunity for women, in his Politics, Aristotle accepts sexual inequality while actively defending slavery.  Anyone who is inferior intellectually and morally is properly socio-politically inferior in a well-ordered polis.  A human being can be naturally autonomous or not, “a natural slave” being defective in rationality and morality, and thus naturally fit to belong to a superior; such a human can rightly be regarded as “a piece of property,” or another person’s “tool for action.”  Given natural human inequality, it is allegedly inappropriate that all should rule or share in ruling.  Aristotle holds that some are marked as superior and fit to rule from birth, while others are inferior and marked from birth to be ruled by others.  This supposedly applies not only to ethnic groups, but also to the genders, and he unequivocally asserts that males are “naturally superior” and females “naturally inferior,” the former being fit to rule and the latter to be ruled.  The claim is that it is naturally better for women themselves that they be ruled by men, as it is better for “natural slaves” that they should be ruled by those who are “naturally free.”  Now Aristotle does argue only for natural slavery.  It was the custom (notice the distinction, used here, between custom and nature) in antiquity to make slaves of conquered enemies who become prisoners of war.  But Aristotle (like Plato) believes that Greeks are born for free and rational self-rule, unlike non-Greeks (“barbarians”), who are naturally inferior and incapable of it.  So the fact that a human being is defeated or captured is no assurance that he is fit for slavery, as an unjust war may have been imposed on a nobler society by a more primitive one.  
While granting that Greeks and non-Greeks, as well as men and women, are all truly human, Aristotle justifies the alleged inequality among them based on what he calls the “deliberative” capacity of their rational souls.  The natural slave’s rational soul supposedly lacks this, a woman has it but it lacks the authority for her to be autonomous, a (free male) child has it in some developmental stage, and a naturally superior free male has it developed and available for governance (ibid., pp. 7-11, 23; 1254a-1255a, 1260a).

This application creates a helpful path to a critique of Aristotle’s theory of justice.  If we feel that it is unjust to discriminate against people merely on account of their gender and/or ethnic origin, as philosophers, we try to identify the rational root of the problem.  If our moral intuitions are correct against Aristotle (and some would even call his views here sexist and racist), he may be mistaken about a matter of fact or about a value judgment or both.  Surely he is wrong about all women and non-Greeks, as such, being essentially inferior to Greek males in relevant ways, for cultural history has demonstrated that, when given opportunities, women and non-Greeks have shown themselves to be their significant equals.  But it appears that Aristotle may also have been wrong in leaping from the factual claim of inequality to the value judgment that it is therefore right that inferiors ought to be socially, legally, politically, and economically subordinate—like Plato and others of his culture (for which he is an apologist here), Aristotle seems to have no conception of human rights as such.  Like Plato, he is arguing for an objective theory of personal and social justice as a preferable alternative to the relativistic one of the Sophists.  Even though there is something attractive about Aristotle’s empirical (as opposed to Plato’s idealistic) approach to justice, it condemns him to the dubious position of needing to derive claims about how things ought to be from factual claims about the way things actually are.  It also leaves Aristotle with little viable means of establishing a universal perspective that will respect the equal dignity of all humans, as such.  Thus his theory, like Plato’s, fails adequately to respect all persons as free, rational agents.  They were so focused on the ways in which people are unequal that they could not appreciate any fundamental moral equality that might provide a platform for natural human rights.

2. Medieval Christianity

When Christian thinkers sought to develop their own philosophies in the middle ages (“medieval” meaning the middle ages and “middle” in the sense of being between antiquity and modernity), they found valuable building-blocks in ancient thought.  This included such important post-Aristotelians as the enormously influential Roman eclectic Cicero, such prominent Stoics as Marcus Aurelius (a Roman emperor) and Epictetus (a Greek slave of the Romans), and neo-Platonists like Plotinus.  But the two dominant paths that medieval philosophy would follow for its roughly thousand-year history had been blazed by Plato and Aristotle.  More specifically, Augustine uses Platonic (and neo-Platonic) philosophy to the extent that he can reconcile it with Christian thought; Aquinas, many centuries later, develops a great synthesis of Christian thought (including that of Augustine) and Aristotelian philosophy.  A great difference, however, between their philosophies and those of Hellenic thinkers such as Plato and Aristotle stems from the commitment of these Christians to the authority of the Hebrew and Christian scriptures.  Aquinas would later agree with Augustine (who is accepting the mandate of Isaiah 7:9) that the quest for philosophical understanding should begin with belief in religious traditions (Choice, pp. 3, 32).  Both the Old Testament and the New Testament call for just behavior on the part of righteous people, with injustice being a sin against God’s law, the references being too numerous to cite (but see Job 9:2, Proverbs 4:18, Proverbs 10:6-7, Ecclesiastes 7:20, Matthew 5:45, Philippians 4:8, and Hebrews 12:23).  The claim that God’s justice will prevail in the form of divine judgment is both a promise for the just and a threat for the unjust.  Righteousness is identified with mercy as well as with justice (e.g., Micah 6:8 and Matthew 5:7) and involves our relationship with God as well as with fellow humans.
The ten commandments of the Old Testament (Exodus 20:1-17) are prescriptions regarding how the righteous are to relate to God as well as to one another.  In the New Testament, Jesus of Nazareth interprets how the righteous are to live (Matthew 22:36-40) in terms of love of both God and their neighbors; the concept of one’s neighbor is meant to extend even to strangers, as is illustrated in the parable of the Good Samaritan (Luke 10:29-37).  In the Beatitudes beginning the Sermon on the Mount, Jesus expands on this gospel of love by advocating that his followers go beyond the duties of justice to behave with compassion in certain supererogatory ways (Matthew 5:3-12).  All of this scriptural tradition essentially influenced medieval thinkers such as Augustine and Aquinas in a way that distinguishes them from ancient Greek philosophers such as Plato and Aristotle.

a. Augustine

Aurelius Augustine was born and raised in the Roman province of North Africa; during his life, he experienced the injustices, the corruption, and the erosion of the Roman Empire.  This personal experience, in dialectical tension with the ideals of Christianity, provided him with a dramatic backdrop for his religious axiology.  Philosophically, he was greatly influenced by such neo-Platonists as Plotinus.  His Christian Platonism is evident in his philosophical dialogue On Free Choice of the Will, in which he embraces Plato’s view of four central moral virtues (which came to be called “cardinal,” from the Latin word for hinges, these being metaphorically imaginable as the four hinges on which the door of morality pivots).  These are prudence (substituted for wisdom), fortitude or courage, temperance, and justice.  His conception of justice is the familiar one of “the virtue by which all people are given their due.”  But this is connected to something new and distinctly Christian—the distinction between the temporal law, such as the law of the state, and the eternal, divine law of God.  The eternal law establishes the order of God’s divine providence.  And, since all temporal or human law must be consistent with God’s eternal law, Augustine can draw the striking conclusion that, strictly speaking, “an unjust law is no law at all,” an oxymoron (Choice, pp. 20, 11, 8; cf. Religion, p. 89, for an analysis of justice that relates it to love).  Thus a civil law of the state that violates God’s eternal law is not morally binding and can be legitimately disobeyed in good conscience.  This was to have a profound and ongoing influence on Christian ethics.

In his masterpiece, The City of God, Augustine draws the dramatic conclusion from this position that the Roman Empire was never a truly just political society.  He expresses his disgust over its long history of “revolting injustice.”  Rome was always a pagan, earthly city, and “true justice” can allegedly only be found in a Christian “city of God.”  The just, rather than the powerful, should rule for the common good, rather than serving their own self-interest.  He strikingly compares unjust societies, based on might rather than on right, to “gangs of criminals on a large scale,” for, without justice, a kingdom or empire is merely ruled by the arbitrary fiat of some leader(s).  A genuinely just society must be based on Christian love, its peaceful order established by the following of two basic rules—that people harm nobody and that they should try to help everyone to the extent that they can do so (City, pp. 75, 67, 75, 138-139, 873).

Despite his Christian commitment to love and peace, Augustine is not a pacifist and can support “just wars” as morally permissible and even as morally obligatory.  Every war aims at the order of some sort of established peace; while an unjust war aims to establish an unjust peace of domination, a just war aims to establish a “just peace.”  He agrees with Cicero that a just war must be defensive rather than aggressive (ibid., pp. 861-862, 866, 868-869, 1031).  In a letter (# 138) to Marcellinus, Augustine uses scripture to deny that Christian doctrine is committed to pacifism, though wars should be waged, when necessary, with a benevolent love for the enemy.  In a letter (# 189) to Boniface, he maintains that godly, righteous people can serve in the military, again citing scripture to support his position.  He repeats the view that a just war should aim at establishing a lasting and just peace and holds that one must keep faith with both one’s allies and one’s enemies, even in the awful heat of warfare.  Augustine’s most important treatment of the just war theory is contained in his writing Against Faustus the Manichean, where he analyzes the evils of war in terms of the desire to harm others, the lust for revenge and cruelty, and the wish to dominate other people.  In addition to the condition that a just war must aim at establishing a just and lasting peace, a second condition is that it must be declared by a leader or body of leaders, with the “authority” to do so, after deliberating that it is justified.  Again Augustine makes it clear that he is no pacifist (Political, pp. 209, 219-223).

While this is a very valuable application of his theory of justice, the doctrine of the just war having stood the test of time to this very day, the general theory on which it is based is more problematic.  The unoriginal (and uninspired) conception of justice as giving others their due had already become familiar to the point of being trite.  It remains vulnerable to the serious problems of vagueness already considered:  what is the relevant criterion whereby it should be determined who deserves what, and who is fit to make such a judgment?  But, also, Augustine should have an advantage over the ancient Greeks in arriving at a theory of justice based on universal equality on account of the Christian doctrine (not to mention because of the influences of Cicero, the Stoics, and Plotinus) that all humans are equally children of God.  Unfortunately, his zealous Christian evangelism leads him to identify justice itself, in a divisive, intolerant, polemical way, with the Christian church’s idea of what God requires, so that only a Christian society can possibly qualify as just, as if a just political society would need to be a theocracy.  Thus, while he has some sense of some moral or spiritual equality among humans, it does not issue in equal respect for all persons as free, rational agents, allowing him, for example, to accept the institution of slavery as a just punishment for sin, despite the belief that God originally created humans as naturally free, because of the idea that we have all been corrupted by original sin (City, pp. 874-875).

b. Aquinas

As Augustine is arguably the greatest Christian Platonist, so Thomas Aquinas, from what is now Italy, is the greatest Christian Aristotelian.  Nevertheless, as we shall see, his theory of justice is also quite compatible with Augustine’s.  Aquinas discusses the same four cardinal moral virtues, including that of justice, in his masterpiece, the multi-volume Summa Theologica.  No more a socio-political egalitarian than Plato, Aristotle, or Augustine, he analyzes it as calling for proportional equality, or equity, rather than any sort of strict numerical equality, and as a function of natural right rather than of positive law.  Natural right ultimately stems from the eternal, immutable will of God, who created the world and governs it with divine providence.  Natural justice must always take precedence over the contingent agreements of our human conventions.  Human law must never contravene natural law, which is reason’s way of understanding God’s eternal law.  He offers us an Aristotelian definition, maintaining that “justice is a habit whereby a man renders to each one his due by a constant and perpetual will.”  As a follower of Aristotle, he defines concepts in terms of genus and species.  In this case, the general category to which justice belongs is that it is a moral habit of a virtuous character.  What specifically distinguishes it from other moral virtues is that by justice, a person is consistently committed to respecting the rights of others over time.  Strictly speaking, the virtue of justice always concerns interpersonal relations, so that it is only metaphorically that we can speak of a person being just to himself.  In addition to legal justice, whereby a person is committed to serving the “common good” of the entire community, there is “particular justice,” which requires that we treat individuals in certain ways.  Justice is a rational mean between the vicious extremes of deficiency and excess, having to do with our external actions regarding others.  
Like many of his predecessors, Aquinas considers justice to be preeminent among the moral virtues.  He agrees with Aristotle in analyzing particular justice into two types, which he calls “distributive” and “commutative”; the former governs the proportional distribution of common goods, while the latter concerns the reciprocal dealings between individuals in their voluntary transactions (Law, pp. 137, 139, 145, 147, 155, 160, 163, 165).

Aquinas applies this theory of justice to many social problems.  He maintains that natural law gives us the right to own private property.  Given this natural right, theft (surreptitiously stealing another’s property) and robbery (taking it openly by force or the threat of violence) must be unjust, although an exception can arise if the thief and his family are starving in an environment of plenty, in which case, stealing is justified and, strictly speaking, not theft or robbery at all.  Secondly, Aquinas refines the Augustinian just war theory by articulating three conditions that must jointly be met in order for the waging of war to be just:  (a) it must be declared by a leader with socio-political authority; (b) it must be declared for a “just cause,” in that the people attacked must be at fault and thus deserve it; and (c) those going to war must intend good and the avoidance of evil.  It is not justifiable deliberately to slay innocent noncombatants.  It is legitimate to kill another in self-defense, though one’s intention should be that of saving oneself, the taking of the other’s life merely being the necessary means to that good end (this, by the way, is the source of what later evolves into the moral principle of “double effect”).  Even acting in self-defense must be done in reasonable proportion to the situation, so that it is wrong to employ more force than is necessary to stop aggression.  Even killing another unintentionally can be unjust if done in the course of committing another crime or through criminal negligence.  Thirdly, while Aquinas thinks we should tolerate the religious beliefs of those who have never been Christians, so that it would be unjust to persecute them, he thinks it just to use force against heretics who adhered to but then rejected orthodox Christianity, even to the point of hurting them, as in the Inquisition, for the good of their own souls.  
In an extreme case of recalcitrant heretics who will not be persuaded to return to the truth of Christianity, it is allegedly just that they should be “exterminated” by execution rather than being allowed to corrupt other Christians by espousing their heterodox religious views.  Fourth, like Augustine, Aquinas accepts slavery, so long as no Christian is the slave of a non-Christian (ibid., pp. 178-183, 186, 221, 224, 226, 228, 250, 256, 253), and considers it just that women should be politically and economically “subject” to men.  Although he considers women to be fully human, he agrees with Aristotle that they are “defective and misbegotten,” the consequence allegedly being inferior rational discretion (Summa, pp. 466-467).

From a critical perspective, his general theory of justice is, by now, quite familiar, a sort of blend of Aristotle’s and Augustine’s, and marked by the same flaws as theirs.  His applications of the theory can be regarded as indicative of its problematic character:  (a) given the assumption of a right to own private property, his discussion of the injustices of theft and robbery seems quite reasonable; (b) assuming that we have a right to self-defense, his analysis of the legitimacy of killing in a just war does also; (c) his attempted defense of the persecution of religious heretics, even unto death, invites suspicions of dogmatic, intolerant fanaticism on his part; and (d) his acceptance of slavery and the political and economic subjection of women as just is indicative of an empirical orientation that is too uncritically accepting of the status quo.  Here again the Christian belief that all humans are personal creatures of a loving God is vitiated by an insufficient commitment to the implications of that, regarding socio-political equality, so that only some humans are fully respected as free, rational agents.  The rationalistic theories of Plato and Augustine and the classical empirical theories of Aristotle and Aquinas all leave us hoping that preferable alternatives might be forthcoming.

3. Early Modernity

Although only half as much time elapsed between Aquinas and Hobbes as between Augustine and Aquinas, from the perspective of intellectual history, the period of modernism represents a staggering sea-change.  We have neither the time nor the space to consider the complex causal nexus that explains this fact; but, for our purposes, suffice it to say that the Protestant Reformation, the revolution of the new science, and the progressive willingness publicly to challenge authority (both political and religious) converge to generate a strikingly different philosophical mentality in the seventeenth century.  In the previous century, the Protestant Reformation shattered the hegemony of the Roman Catholic Church, so that thinkers need not feel so constrained to adhere to established orthodoxy.  The naturalistic worldview of the sixteenth and early seventeenth centuries that eventuated in an empirical and experimental (non-dogmatic) methodology in both natural and political science set an example for philosophers.  Thinkers of the modern era became increasingly comfortable breaking from the mainstream to pursue their own independent reasoning.  Although the influence of great ancient philosophers like Plato and Aristotle and of great medieval thinkers such as Augustine and Aquinas would persist, there was no returning to their bygone perspectives.  This vitally affects moral and political theory, in general, and views on justice, in particular.  As we shall see in this section, views of justice as relative to human needs and interests became prominent as they had not been for a couple of millennia.  This will locate Hobbes and Hume closer to the Sophists than had been fashionable since pre-Socratic times in philosophy, regarding justice as a social construct.

a. Hobbes

Whereas Plato, Aristotle, Augustine, and Aquinas all offer accounts of justice that represent alternatives to Sophism, Thomas Hobbes, the English radical empiricist, can be seen as resurrecting the Sophist view that we can have no objective knowledge of it as a moral or political absolute value.  His radical empiricism does not allow him to claim to know anything not grounded in concrete sense experience.  This leads him in Leviathan, his masterpiece, to conclude that anything real must be material or corporeal in nature, that body is the one and only sort of reality; this is the philosophical position of materialistic monism, which rules out the possibility of any spiritual substance.  On this view, “a man is a living body,” only different in kind from other animals, but with no purely spiritual soul separating him from the beasts.  Like other animals, man is driven by instinct and appetite, his reason being a capacity of his brain for calculating means to desirable ends.  Another controversial claim here is that all actions, including all human actions, are causally determined to occur as they do by the complex of their antecedent conditions; this is causal determinism.  What we consider voluntary actions are simply those we perform in which the will plays a significant causal role, human freedom amounting to nothing more exalted than the absence of external restraints.  Like other animals, we are always fundamentally motivated by a survival instinct and ultimately driven by self-interest in all of our voluntary actions; this is psychological egoism.  It is controversial whether he also holds that self-interest should always be our fundamental motivation, which is ethical egoism.  In Leviathan’s famous Chapter XIII, Hobbes paints a dramatic and disturbing portrait of what human life would be like in a state of nature—that is, beyond the conventional order of civil society.
We would be rationally distrustful of one another, inclined to be anti-social, viewing others as threats to our own satisfaction and well-being.  Interpersonal antagonism would be natural; and, since there would exist no moral distinctions between right and wrong, just and unjust, violent force and fraudulent deception would be desirable virtues rather than objectionable vices.  In short, this would be a state of “war of every man against every man,” a condition in which we could not reasonably expect to survive for long or to enjoy any quality of life for as long as we did.  We are smart enough to realize that this would be a condition in which, as Hobbes famously writes, “the life of man” would inevitably be “solitary, poor, nasty, brutish, and short.”  Fortunately, our natural passions of fear, desire, and hope motivate us to use reason to calculate how we might escape this hellish state.  Reason discovers a couple of basic laws of nature, indicating how we should prudently behave if we are to have any reasonable opportunity to survive, let alone to thrive.  The first of these is double-sided:  the positive side holds that we should try to establish peace with others, for our own selfish good, if we can; the negative side holds that, if we cannot do that, then we should do whatever it takes to destroy whoever might be a threat to our interests.  The second law of nature maintains that, in order to achieve peace with others, we must be willing to give up our right to harm them, so long as they agree to reciprocate by renouncing their right to harm us.  This “mutual transferring of right,” established by reciprocal agreement, is the so-called social contract that constitutes the basis of civil society; and the agreement can be made either explicitly or implicitly (Leviathan, pp. 261-262, 459-460, 79, 136, 82, 95, 74-78, 80-82; for comparable material, see Elements, pp. 78-84, 103-114, as well as Citizen, pp. 109-119, 123-124).

What is conspicuously missing here is any sense of natural justice or injustice.  In the state of nature, all moral values are strictly relative to our desires:  whatever seems likely to satisfy our desires appears “good” to us, and whatever seems likely to frustrate our desires we regard as “evil.”  It is all relative to what we imaginatively associate with our own appetites and aversions.  But as we move from this state of nature to the state of civil society by means of the social contract, we create the rules of justice by means of the agreements we strike with one another.  Prior to the conventions of the contract, we were morally free to try to do whatever we wished.  But when a covenant is made, then to break it is unjust; as Hobbes puts it, “the definition of injustice is no other than the not performance of covenant.”  What is not unjust is just in civil society.  This turns out to be the third law of nature, that, in the name of justice, we must try to keep our agreements.  In civil society, we may justly do anything we have not, at least implicitly, committed ourselves not to do.  A just person typically does just actions, though committing one or a few unjust actions does not automatically render that person unjust, especially if the unjust behavior stems from an error or sudden passion; on the other hand, a person who is typically inclined to commit unjust actions is a guilty person.  Still, if we are as selfishly motivated by our own desires as Hobbes maintains, why should we not break our word and voluntarily commit injustice, if doing so is likely to pay off for us and we imagine we might get away with it (remember the problem posed by Glaucon with the story of the ring of Gyges)?  Clearly one more element is needed to prevent the quick disintegration of the rules of justice so artificially constructed by interpersonal agreement.  This is the power of sovereign authority.
We need laws codifying the rules of justice; and they must be so vigilantly and relentlessly enforced by absolute political power that nobody in his right mind would dare to try to violate them.  People simply cannot be trusted to honor their social commitments without being forced to do so, since “covenants without the sword are but words, and of no strength to secure a man at all.”  In other words, we must sacrifice a great deal of our natural liberty to achieve the sort of security without which life is hardly worth living.  In civil society, our freedom is relative to the lack of specified obligations, what Hobbes calls “the silence of the law.”  If we worry that this invests too much power in the government, which may abuse that power and excessively trample on our freedom, the (cynical) response is that this is preferable to the chaos of the state of nature or to the horrors of civil war (Leviathan, pp. 28-29, 89, 93, 106, 109, 143, 117; for comparable material, see Elements, pp. 88-89, Citizen, pp. 136-140, and Common, p. 34).  One of the most crucial problems of political philosophy is where to strike the balance between personal liberty and public order; Hobbes is, perhaps, more willing than most of us to give up a great deal of the former in order to secure the latter.

As we have with earlier thinkers, let us see how Hobbes applies this theory of justice, as a prelude to evaluating it critically.  He compares the laws of civil society to “artificial chains” binding us to obey the sovereign authority of the state in the name of justice.  The third law of nature, the law of justice, obliges us to obey the “positive” laws of the state.  Any deliberate violation of civil law is a “crime.”  Now the social problem to be considered is that of criminal punishment.  This deliberately inflicts some sort of “evil” on an alleged criminal for violating civil law.  Its rationale is to enforce obedience to the law itself and, thus, to promote security and public order.  Hobbes lays down various conditions that must be met in order for such an infliction of evil to qualify as legitimate “punishment,” including that no retroactive punishment is justifiable.  He also analyzes five sorts of criminal punishment—“corporal, or pecuniary, or ignominy, or imprisonment, or exile,” allowing for a combination of them; he also specifies that the corporal sort can be capital punishment.  It would be wrong for the state deliberately to punish a member of civil society believed to be innocent; indeed, strictly speaking, it would not even qualify as “punishment,” as it fails to meet an essential part of the definition.  The severity of punishment should be relative to the severity of the crime involved, since its rationale is to deter future violations of civil law (Leviathan, pp. 138, 173, 175, 185, 190, 203-208, 230; see, also, Elements, pp. 177-182, and Citizen, pp. 271-279; near the end of his verse autobiography—Elements, p. 264—Hobbes writes, “Justice I Teach, and Justice Reverence”).

While this is a decent consequentialist theory of crime and punishment, the more general view of justice from which it is derived is far more problematic.  It does stand in sharp contrast to the theories of Plato, Aristotle, Augustine, and Aquinas.  It does revive something like the Sophist theory to which they all offered alternatives.  And it does reflect the naturalistic approach represented by the new science.  However, all the foundational elements supporting it are quite dubious:  the radical empiricism, the materialism, the determinism, the egoism, the moral relativism, and the narrow conception of human reason.  Without these props, this theory of justice as artificially constructed by us and purely a function of our interpersonal agreements seems entirely arbitrary.  But in addition to its being insufficiently justified, this theory of justice would justify too much.  For example, what would prevent it from justifying slavery, if the alternative for the slaves were death as enemies in a state of nature?  Even apart from the issue of slavery, in the absence of any substantive human rights, minorities in civil society might be denied any set of civil liberties, such as the right to adopt religious practices to which they feel called in conscience.  Hobbes’s conception of justice is reductionistic, reducing it to conventional agreements that seem skewed to sacrifice too much liberty on the altar of law and order.

b. Hume

As a transition between Hobbes and Hume, brief mention can be made of John Locke, the most important political philosopher between them.  (The reason he is not being considered at length here is that he does not offer a distinctive general theory of justice.)  In his masterful Second Treatise of Government, Locke describes a state of nature governed by God’s law but insecure, in that there is no mechanism for enforcing it when the natural rights of property—comprising one’s life, liberty, and estates—are violated.  In order to protect such property rights, people agree to a social contract that moves them from that state of nature to a state of political society, with government established to enforce the law.  Another great social contract theorist between Hobbes and Hume who is worth mentioning here (again he gives us no distinctive theory of justice) is Jean-Jacques Rousseau.  In The Social Contract, he maintains that, in a well-ordered society, the general will (rather than the will of any individual or group of individuals) must prevail.  True freedom in society requires following the general will, and those who do not choose to do so can legitimately be forced to do so.  A human being is allegedly so transformed by the move from the state of nature to that of civil society as to become capable of such genuine freedom as will allow each citizen to consent to all the laws out of deference to the common good.  David Hume, an eighteenth-century Scottish thinker strongly influenced by Locke’s focus on property while rejecting the social contract theory of Hobbes, Locke, and Rousseau, is an interesting philosopher to consider in relation to Hobbes.  Like Hobbes, Hume is a radical empiricist and a determinist who is skeptical of justice as an objective, absolute virtue.
But Hume does not explicitly embrace materialism, is not a psychological or ethical egoist, and famously attacks the social contract theory’s account of moral and political obligation on both historical grounds (there is no evidence for it, and history shows that force rather than consent has been the basis of government) and philosophical grounds (even if our ancestors had given their consent, that would not be binding on us, and utility is a more plausible explanation of submission than genuine agreement) alike (Essays, pp. 186-201).  In the third section of his Enquiry concerning the Principles of Morals, Hume argues that “public utility is the sole origin of justice.”  To place that claim in context, we can note that, like Hobbes, Hume sees all values, including that of justice, as derived from our passions rather than (as Plato, Aristotle, Augustine, and Aquinas thought) from reason.  Any virtue, he maintains, is desirable in that it provides us with the pleasant feeling of approval; and any vice, including that of injustice, is undesirable in that it provides us with the painful sense of disapproval.  In order to qualify as a virtue, a quality must be “useful or agreeable to the person himself or to others.”  It is possible for some virtues to be rich enough to fit appropriately in more than one of these four categories (for example, benevolence seems to be useful and agreeable to both the benevolent person and to others); but justice is purportedly a virtue only because it is useful to others, as members of society.  Hume offers us a unique and fascinating argument to prove his point.  He imagines four hypothetical scenarios, in which either human nature would be radically different (utterly altruistic or brutally selfish) or our environment would be so (with everything we desire constantly and abundantly available or so destitute that hardly anyone could survive), allegedly showing that, in each of them, justice would not be a virtue at all.  
His conclusion is that justice is only a virtue because, relative to reality, which is intermediate among these extremes, it is beneficial to us as members of society.  He also refuses to identify justice with “perfect equality,” maintaining that the ideal of egalitarianism is both “impracticable” and “extremely pernicious to human society.”  For Hume, the rules of justice essentially involve protecting private property, although property rights are not absolute and may be abridged in extreme cases where “public safety” and the common good require it.  Even international relations normally require that “rules of justice” be observed for mutual advantage, although public utility can also require that they should be suspended (Enquiry, pp. 20, 85, 72, 21-25, 28-35; see also Essays, pp. 20, 202).  Though different from Hobbes’s theory, this one also leans towards the Sophist view of justice as conventional and relative.

In his masterpiece, A Treatise of Human Nature, Hume makes the striking claim, “Reason is, and ought only to be the slave of the passions,” which rules out all forms of ethical rationalism.  He also makes a remarkable distinction between descriptive language regarding what “is, and is not,” on the one hand, and prescriptive language concerning what “ought, or ought not” to be, on the other, challenging the possibility of ever justifying value claims by means of any factual ones, of logically inferring what should be from what is.  The second part of Book 3 of Hume’s Treatise deals extensively with justice.  Here he calls it an “artificial” but “not arbitrary” virtue, in that we construct it as a virtue for our own purposes, relative to our needs and circumstances, as we experience them.  It is valuable as a means to the end of social cooperation, which is mutually “advantageous.”   An especially beneficial, if unnatural, convention is respecting others’ property, which is what the rules of justice essentially require of us.  The psychological grounds of our sense of justice are a combination of “self-interest” and “sympathy” for others.  He holds a very conservative view of property rights, in that, normally, people should be allowed to keep what they already have acquired.  Indeed, justice normally comprises three principles—“of the stability of possession, of its transference by consent, and of the performance of promises.”  He rejects the traditional definition of justice as giving others their due, because it rashly and wrongly assumes that “right and property” have prior objective reality independent of conventions of justice.  Internationally, the rules of justice assume the status of “the law of nations,” obliging civilized governments to respect the ambassadors of other countries, to declare war prior to engaging them in battle, to refrain from using poisonous weapons against them, and so forth.  
The rationale for such principles of international justice is that they reduce the horrors of war and facilitate the advantages of peace.  By respecting other societies’ possessions, leaders minimize the likelihood of war; by respecting the transference of possessions by mutual consent, they enhance the possibilities of international trade; and by keeping their promises, they create a climate for peaceful alliances.  A bit later, Hume adopts a position which, in the twentieth century, has been called a “rule utilitarian” view of justice, writing that, though individual acts of justice might be contrary to public utility, they ought to be performed if they are conducive to “a general scheme or system” of conduct that benefits society as a whole (Treatise, pp. 266, 302, 311, 307, 312, 315, 320-321, 323, 337-338, 362-363, 370-371).  Yet the rules of justice that are normally conducive to public utility are never absolute and can be legitimately contravened where following them would seem to do more harm than good to our society.  He applies this view to the issue of civil disobedience, which is normally unjust because it threatens “public utility” but can be justified as a last resort “in extraordinary circumstances” when that same public utility is in jeopardy (Essays, pp. 202-204).  Whether that is or is not the case in specific circumstances becomes a judgment call.

Hume is important here because of a convergence of several factors.  First, like the Sophists and Hobbes, he makes justice a social construct that is relative to human needs and interests.  Second, like Hobbes, he associates it fundamentally with human passions rather than with reason.  Third, he connects both the virtue of justice and the rules of justice essentially to the protection of private property.  And, fourth, he considers public utility to be the sole basis of justice.  This theory would prove extremely influential, in that Kant will take issue with it, while utilitarians like Mill will build on its flexibility.  This sort of flexibility is both a strength and a weakness of Hume’s theory of justice.  While it may be attractive to allow for exceptions to the rules, this also creates a kind of instability.  Is justice merely an instrumental good, having no intrinsic value?  If that were the case, then it would make sense to say that the role of reason is simply to calculate the most effective means to our most desirable ends.  But then, assuming that our ends were sufficiently desirable, any means necessary to achieve them would presumably be justifiable—so that, morally and politically, anything goes, in principle, regardless of how revolting.  Finally, notice that Hume himself, because of the empirical nature of his practical philosophy, fails to avoid the “is-ought” trap against which he so deftly warned us:  because some end is sufficiently desired, whatever means are necessary, or even most effective, to achieve it ought to be pursued.  Is this the best we can do in our pursuit of an adequate theory of justice?

4. Recent Modernity

Moving from one of the greatest philosophers of the Enlightenment to the other, we shall see that Kant takes the “is-ought” challenge more seriously than Hume himself did.  As justice is both a moral and a political virtue, helping to prescribe both a good character and right conduct, the question of how such obligations arise is crucial.  For Hume, we ought to pursue virtue (including justice) because it (allegedly) is agreeable and/or useful to do so.  But, then, what is the logical link here?  Why should we, morally speaking, act for the sake of agreeableness and utility?  For Kant, the reason we should choose to do what is right has nothing to do with good consequences.  It is merely because it is the right thing to do.  Conceding that prescriptive “ought” claims can never be logically deduced from any set of factually descriptive “is” claims, Kant will forsake the empirical approach to justice (of Hobbes and Hume) in favor of the sort of rationalistic one that will revert to seeing it as an absolute value, not to be compromised, regardless of circumstances and likely consequences.  Then we shall consider the utilitarian response to this, as developed by the philosopher who is, arguably, the greatest consequentialist of modern times, John Stuart Mill, who, as an empiricist, like Hobbes and Hume, will make what is right a function of what is good.

a. Kant

Immanuel Kant, an eighteenth-century German professor from East Prussia, found his rationalistic philosophical convictions profoundly challenged by Hume’s formidable skepticism (as well as being fascinated by the ideas of Rousseau).  Even though he was not convinced, Kant was sufficiently disturbed by that skepticism that he committed decades to trying to answer it, creating a revolutionary new philosophical system in order to do so.  This system includes, but is far from limited to, a vast practical philosophy, comprising many books and essays, including a theory of justice.  It is well known that this practical philosophy—including both his ethical theory and socio-political philosophy—is the most renowned example of deontology (from the Greek, meaning the study or science of duty).  Whereas teleological or consequentialist theories (such as those of Hobbes and Hume) see what is right as a function of and relative to good ends, a deontological theory such as Kant’s sees what is right as independent of what we conceive to be good and, thus, as potentially absolute.  Justice categorically requires a respect for the right, regardless of inconvenient or uncomfortable circumstances and regardless of desirable and undesirable consequences.  Because of the “is-ought” problem, the best way to proceed is to avoid the empirical approach that is necessarily committed to trying to derive obligations from alleged facts.

This is precisely Kant’s approach in the foundational book of his system of practical philosophy, his Grounding for the Metaphysics of Morals.  He argues, in its Preface, that, since the moral law “must carry with it absolute necessity” and since empiricism only yields “contingent and uncertain” results, we must proceed by way of “pure practical reason,” which would be, to the extent possible, “purified of everything empirical,” such as physiological, psychological, and environmental contingencies.  On this view, matters of right will be equally applicable to all persons as potentially autonomous rational agents, regardless of any contingent differences, of gender, racial or ethnic identity, socio-economic class status, and so forth.  If Kant can pull this off, it will take him further in the direction of equality of rights than any previous philosopher considered here.  In order to establish a concept of right that is independent of empirical needs, desires, and interests, Kant argues for a single fundamental principle of all duty, which he calls the “categorical imperative,” because it tells us what, as persons, we ought to do, unconditionally.
It is a test we can use to help us rationally to distinguish between right and wrong; and he offers three different formulations of it which he considers three different ways of saying the same thing:  (a) the first is a formula of universalizability, that we should try to do only what we could reasonably will should become a universal law; (b) the second is a formula of respect for all persons, that we should try always to act in such a way as to respect all persons, ourselves and all others, as intrinsically valuable “ends in themselves” and never treat any persons merely as instrumental means to other ends; and (c) the third is a “principle of autonomy,” that we, as morally autonomous rational agents, should try to act in such a way that we could be reasonably legislating for a (hypothetical) moral republic of all persons.  For the dignity of all persons, rendering them intrinsically valuable and worthy of respect, is a function of their capacity for moral autonomy.  In his Metaphysics of Morals, Kant develops his ethical system, beyond this foundation, into a doctrine of right and a doctrine of virtue.  The former comprises strict duties of justice, while the latter comprises broader duties of merit.  Obviously, it is the former category, duties we owe all other persons, regardless of circumstances and consequences, that concerns us here, justice being a matter of strict right rather than one of meritorious virtue.  At the very end of his Metaphysics of Morals, Kant briefly discusses “divine justice,” whereby God legitimately punishes people for violating their duties (Ethical, pp. 2-3, 30-44, 36, 48, 158-161).

In his Metaphysical Elements of Justice, which constitutes the first part of his Metaphysics of Morals, Kant develops his theory of justice.  (His concept of Rechtslehre—literally, “doctrine of right”—has also been translated as “doctrine of justice” and “doctrine of law.”)  For Kant, justice is inextricably bound up with obligations with which we can rightly be required to comply. To say that we have duties of justice to other persons is to indicate that they have rights, against us, that we should perform those duties—so that duties of justice and rights are correlative.  Three conditions must be met in order that the concept of justice should apply:  (a) we must be dealing with external interpersonal behaviors; (b) it must relate to willed action and not merely to wishes, desires, and needs; and (c) the consequences intended are not morally relevant.  A person is not committing an injustice by considering stealing another’s property or in wanting to do so, but only by voluntarily taking action to appropriate it without permission; and the act is not justified no matter what good consequences may be intended.  According to Kant, there is only one innate human right possessed by all persons; that is the right freely to do what one wills, so long as that is “compatible with the freedom of everyone else in accordance with a universal law.”  Thus one person’s right freely to act cannot extend to infringing on the freedom of others or the violation of their rights.  
This leads to Kant’s ultimate universal principle of justice, which is itself a categorical imperative:  “Every action is just [right] that in itself or in its maxim is such that the freedom of the will of each can coexist together with the freedom of everyone in accordance with a universal law.”  Although the use of coercive force against other persons involves an attempt to restrict their freedom, this is not necessarily unjust, if it is used to counteract their unjust abuse of freedom—for example, in self-defense or punishment or even war.  Kant approvingly invokes three ancient rules of justice:  (1) we should be honest in our dealings with others; (2) we should avoid being unjust towards others even if that requires our trying to avoid them altogether; and (3) if we cannot avoid associating with others, we should at least try to respect their rights (Justice, pp. 29, 38, 30-31, 37; see also Lectures, pp. 211-212).

Kant distinguishes between natural or private justice, on the one hand, and civil or public justice, on the other.  He has an intricate theory of property rights, which we can only touch upon here.  We can claim, in the name of justice, to have rights to (a) physical property, such as your car, (b) the performance of a particular deed by another person, such as the auto shop keeping its agreement to try to fix your car, and (c) certain characteristics of interpersonal relationships with those under our authority, such as obedient children and respectful servants.  Someone who steals your car or the auto mechanic who has agreed to fix it and then fails to try to do so is doing you an injustice.  Children, as developing but dependent persons, have a right to support and care from their parents; but, in turn, they owe their parents obedience while under their authority.  Children are not the property of their parents and must never be treated like things or objects; and, when they have become independent of their parents, they owe them nothing more than gratitude.  Similarly, a master must respect a servant as a person.  The servant may be under contract to serve the master, but that contract cannot be permanent or legitimately involve the giving up of the servant’s personhood (in other words, one cannot justifiably enter into slavery).  While the master has authority over the servant, that must never be viewed as ownership or involve abuse.  This all concerns private or natural justice, having to do with the securing of property rights.  Next let us consider how Kant applies his theory of justice to the problem of crime and punishment, in the area of public or civil justice, involving protective, commutative, and distributive justice, the requirements of which can be legitimately enforced by civil society.  When a person commits a crime, that involves misusing freedom to infringe the freedom of others or to violate their rights.
Thus the criminal forfeits the right to freedom and can become a legitimate prisoner of the state.  Kant considers the rule that criminals should be punished for their crimes to be “a categorical imperative,” a matter of just “retribution” not to be denied or even mitigated for utilitarian reasons.  This extends to the ultimate punishment, the death penalty:  justice requires that murderers, the most heinous criminals, should suffer capital punishment, as no lesser penalty would be just.  A third application to consider here is that of war.  This is in the international part of public justice that Kant calls “the Law of Nations.”  He adopts a non-empirical version of the social contract theory, interpreting it not as a historical fact mysteriously generating obligations but rather as a hypothetical idea of what free and equal moral agents could reasonably agree to in the way of rules of justice.  Unlike Hobbes, he does not see this as a basis for all moral duty.  It does account for the obligation we have to the state and other citizens.  But states have duties to other states, so that there is an international law of nations.  Even though different states, in the absence of international law, are in a natural condition of a state of war, as Hobbes thought, he was wrong to think that, in that state, anything rightly goes and that there is no justice.  War is bad, and we should try to minimize the need for it, although Kant is not a pacifist and can justify it for purposes of self-defense.  Kant proposes an international “league of nations” to help provide for mutual “protection against external aggression” and, thus, to discourage it and reduce the need to go to war.  
Still, when war cannot be avoided, it should be declared rather than launched by means of a sneak attack; secondly, there are legitimate limits that prohibit, for example, trying to exterminate or subjugate all members of the enemy society; third, when a war is over, the winning party cannot destroy the civil freedom of the losing parties, as by enslaving them; and, fourth, certain “rights of peace” must be assured and honored for all involved.  Thus the ultimate goal of international relations and of the league of nations should be the ideal of “perpetual peace” among different states that share our planet (Justice, pp. 41, 43, 91-95, 113, 136-141, 146, 151-158; for more on Kant’s version of the social contract theory, see Writings, pp. 73-85, and for more on his views on war and “perpetual peace,” see Writings, pp. 93-130).  Thus we see Kant applying his own theory of justice in three areas:  in the area of private law having to do with the securing of property rights, in the area of public law having to do with retributive punishment for crimes committed, and in the area of international justice concerned with war and peace.

What shall we critically say about this theory?  First, it argues for a sense of justice in terms of objective, non-arbitrary right—against, say, Hobbes and Hume.  Second, this sense of justice is of a piece with Kant’s categorical imperative, in that the rules of justice (e.g., regarding property rights, punishment, and war) are universalizable, designed to respect persons as intrinsically valuable, and conforming to the principle of autonomy.  Third, if Hume is correct in suggesting that we can never logically infer what ought to be from what actually is, then Kant’s is the only theory we have considered thus far that can pass the test.  To focus the issue, ask the question, why should we be just?  For Plato, this is the way to achieve the fulfillment of a well-ordered soul.  For Aristotle, the achievement and exercising of moral virtue is a necessary condition of human flourishing.  For Augustine and Aquinas, God’s eternal law requires that we, as God’s personal creatures, should be just, with our salvation at stake.  For Hobbes, practicing justice is required by enlightened self-interest.  For Hume, even though our being just may not benefit us directly all the time, it is conducive to public utility or the good of the society of which we are members.  But for each of these claims, we can ask, so what?  If any combination of these claims were to turn out to be correct, we could still legitimately ask why we should therefore be just.  Are we to assume that we ought to do whatever it takes to achieve a well-ordered soul or to flourish or to comply with God’s will or to serve our own self-interest or public utility?  Why?  Consider Kant’s answer:  we should try to be just because it’s the right thing to do and because it is our duty, as rational, moral agents, to try to do what is right.  Kant’s analysis of justice works well; and, given that, his applications to property rights, crime and punishment, and war and peace are also impressive.  
Yet his theory is commonly rejected as too idealistic to be realistically applicable in the so-called “real world,” because it maintains that some things can be absolutely unjust and are, thus, categorically impermissible, regardless of likely consequences.  His theory as we have considered it here is a paradigmatic example of the view of justice being advocated in this article, as essentially requiring respect for persons as free, rational agents.  Yet Kant’s inflexibility in other points of application, such as in his absolute prohibition against lying to a would-be murderer in order to save innocent human life (Ethical, pp. 162-166), his idea that women and servants are merely “passive citizens” unfit to vote, and his categorical denial of any right to resistance or revolution against oppression (Justice, pp. 120, 124-128), is problematic here, inviting an alternative such as is represented by Mill’s utilitarianism.

b. Mill

Let us consider a bit of Karl Marx (and his collaborator Friedrich Engels) as a quick transition between Kant and Mill.  Kant represents the very sort of bourgeois conception of justice against which Marx and Engels protest in their call, in The Communist Manifesto, for a socialistic revolution.  Marx explains the ideal of socio-economic equality he advocates with the famous slogan that all should be required to contribute to society to the extent of their abilities and all should be allowed to receive from society in accordance with their needs.  John Stuart Mill, a nineteenth-century English philosopher, was aware of the call for a Communist revolution and advocated progressive liberal reform as an alternative path of political evolution.  Whereas Kant was the first great deontologist, Mill subscribed to the already established tradition of utilitarianism.  Although earlier British thinkers (including Hobbes and Hume) were proto-utilitarians, incorporating elements of the theory into their own worldviews, the movement, as such, is usually thought to stem from the publication of Jeremy Bentham’s Introduction to the Principles of Morals and Legislation in 1789.  There he proposes the “principle of utility,” which he also later calls the “greatest happiness” principle, as the desirable basis for individual and collective decision-making:  “By the principle of utility is meant that principle which approves or disapproves of every action whatsoever, according to the tendency which it appears to have to augment or diminish the happiness of the party whose interest is in question.”  That single sentence establishes the ultimate criterion for utilitarian reasoning and the root of a great movement.  A famous lawyer named John Austin, under whom Mill studied, wrote a book of jurisprudence based on Bentham’s “principle of general utility.”  Mill’s father, James Mill, was a friend and disciple of Bentham and educated his only son also to be a utilitarian.
Near the end of his life, Mill observed that it was the closest thing to a religion in which his father raised him.  And, if he was not the founder of this secular religion, he clearly became its most effective evangelist.  In Utilitarianism, his own great essay in ethical theory, Mill gives his own statement of the principle of utility (again employing a curiously religious word):  “The creed which accepts as the foundation of morals, Utility, or the Greatest Happiness Principle, holds that actions are right in proportion as they tend to promote happiness, wrong as they tend to produce the reverse of happiness.”  He immediately proceeds to interpret human happiness and unhappiness (as Bentham had done) in hedonistic terms of pleasure and pain (Utilitarianism, pp. 33-34, 329, 257).  This presents the deceptive appearance of a remarkably simple rubric for practical judgment:  if an action generates an excess of pleasure over pain, that contributes to human happiness, which is our greatest good, making the action right; on the other hand, if an action generates an excess of pain over pleasure, that contributes to human unhappiness, which is our greatest evil, making the action wrong.  But what is deceptive about this is the notion that we can sufficiently anticipate future consequences to be able to predict where our actions will lead us.  (Notice, also, that unlike Kantian deontology, which makes what is right independent of good consequences, utilitarianism makes the former a function of the latter.)

Mill acknowledges that concern about a possible conflict between utility and justice has always been “one of the strongest obstacles” to the acceptance of utilitarianism.  If permanently enslaving a minority could produce overwhelming happiness for a majority (he was personally opposed to slavery as an unconscionable violation of human liberty), then, given that utility is the value that trumps all others, why shouldn’t the injustice of slavery be accepted as a (regrettably) necessary means to a socially desirable end, the former, however unfortunate, being thus justified?  Mill thinks that the key to solving this alleged problem is that of conceptual analysis, that if we properly understand what “utility” and “justice” are all about, we shall be able to see that no genuine conflict between them is possible.  We have already discerned what the former concept means and now need to elucidate the latter.  Mill lays out five dimensions of justice as we use the term:  (1) respecting others’ “legal rights” is considered just, while violating them is unjust; (2) respecting the “moral right” someone has to something is just, while violating it is unjust; (3) it is considered just to give a person what “he deserves” and unjust to deny it; (4) it is thought unjust to “break faith” with another, while keeping faith with others is just; and (5) in some circumstances, it is deemed unjust “to be partial” in one’s judgments and just to be impartial.  People commonly associate all of these with justice, and they do seem to represent legitimate aspects of the virtue.  (Interestingly, Mill rejects the idea “of equality” as essential to our understanding of justice, a stand which would be problematic for Marxists.)  
As he seeks his own common denominator for these various dimensions of justice, he observes that justice always goes beyond generic right and wrong to involve what “some individual person can claim from us as his moral right.”  This entails the legitimate sense that anyone who has committed an injustice deserves to be punished somehow (which connects with Kant).  Mill thinks all this boils down to the idea that justice is a term “for certain moral requirements, which, regarded collectively, stand higher in the scale of social utility,” being more obligatory “than any others.”  But this means that justice, properly understood, is a name for the most important of “social utilities” (ibid., pp. 296-301, 305, 309, 320-321).  Therefore there purportedly cannot be any genuine conflict between utility and justice.  If there ever were circumstances in which slavery were truly useful to humanity, then presumably it would be just; the reason it is (typically) unjust is that it violates utility.  The main goal here is to reduce justice to social utility, in such a way as to rule out, by definition, any ultimate conflict between the two.  Thus, the social role played by our sense of justice is allegedly that it serves the common good.

Mill’s other great work is On Liberty, which provides us with a connecting link between this utilitarian theory and applications of it to particular social issues.  The problem Mill sets for himself here is where to draw a reasonable line between areas in which society can rightly proscribe behavior and those in which people should be allowed the freedom to do as they will.  When is it just to interfere with a person’s acting on personal choice?  To solve this problem, which is as relevant today as it was a century and a half ago, he proposes his “one very simple principle” of liberty, which he states in two slightly different ways:  (1) the “self-protection” version holds that people can only legitimately interfere with the freedom of action of others to protect themselves from them; (2) the “harm” version maintains that force can only be justifiably used against other members of the community to prevent their harming others.  It is not acceptable to use power against others to stop them from hurting only themselves.  Mill candidly admits that this principle is reasonably feasible only with regard to mature, responsible members of civilized societies—not to children or to the insane or even necessarily to primitive peoples who cannot make informed judgments about their own true good.  He decisively renounces any appeal to abstract rights as a basis for this principle, basing it simply on “utility in the largest sense, grounded on the permanent interests of a man as a progressive being.”  Notice that this presupposes that we can distinguish between other-regarding behavior, which may be justifiably regulated, and purely self-regarding behavior, which may not be.  If that turns out to be a false distinction, then Mill’s theory may collapse.  
At any rate, he articulates at least three areas of social life in which people’s liberty should be “absolute and unqualified”:  (a) that of freedom of thought and expression; (b) that of freedom of personal lifestyle; and (c) the freedom to associate with others of one’s choice, so long as it is for peaceful purposes.  He seems confident that utility will always require that freedom be protected in these areas (ibid., pp. 135-138).  In other words, on this liberal utilitarian view, it would always be unjust for an individual or a social group, in a civilized society, deliberately to interfere with a responsible, rational person’s actions in any combination of these areas.

Let us now see how Mill applies his utilitarian theory to three problems of justice that are still timely today.  First of all, the issue of punishment is one he considers in Utilitarianism, though his discussion is aimed at considering alternative accounts rather than conclusively saying what he himself thinks (we might also observe that, in this short passage, he attacks the social contract theory as a useless fiction) (ibid., pp. 311-313).  As a utilitarian, he favors the judicious use of punishment in order to deter criminal activity.  He believes in the utility/justice of self-defense and sees the right to punish as anchored in that.  In 1868, as an elected member of Parliament, he made a famous speech in the House of Commons supporting capital punishment on utilitarian grounds.  Although it is clear that he would like to be able to support a bill for its abolition, he maintains that the lawful order of society, a necessary condition of societal well-being, requires this means of deterring the most heinous crimes, such as aggravated murder.  He even thinks it a quicker, more humane punishment than incarcerating someone for the rest of his life.  Mill does worry about the possibility of executing an innocent person, but he thinks a carefully managed legal system can render this danger “extremely rare” (“Punishment,” pp. 266-272).  Thus his utilitarian theory provides him with a basis for supporting capital punishment as morally justifiable.  Mill makes a second famous application of his utilitarian theory of justice to the issue of equal opportunity for women.  
In the very first paragraph of The Subjection of Women, Mill maintains that “the principle which regulates the existing social relations between the two sexes—the legal subordination of one sex to the other—is wrong in itself, and now one of the chief hindrances to human improvement; and that it ought to be replaced by a principle of perfect equality, admitting no power or privilege on the one side, nor disability on the other.”  So he does not call for the preferential treatment of “affirmative action” but only for equal opportunity.  Unlike contemporary feminists, he does not appeal to women’s human rights as his rationale, but only to the maximization of “human happiness” and the liberty “that makes life valuable” (Subjection, pp. 1, 26, 101).  Here, again, we have an issue of social justice to which his utilitarian theory is applied, generating liberal conclusions.  Our third issue of application is that of international non-intervention.  Mill’s general principle here is that using force against others is prima facie unjust.  Although defensive wars can be justifiable, aggressive ones are not.  Yet Mill allows exceptions:  it can be justifiable to go to war without being attacked or directly threatened with an attack, for example, to help civilize a barbarian society, which, as such, allegedly has no rights.  It can likewise be justifiable to save a subjected population from the oppression of a despotic government (“Non-Intervention,” pp. 376-383).  All of this is presumably a function of utilitarian welfare.  Once more, a still timely moral issue has been addressed using the utilitarian theory of justice.

These applications all plausibly utilize the values and reasoning of utilitarianism, which, by its very nature, must be consequentialist.  From that perspective, the deterrence approach to punishment, including capital punishment, seems appropriate, as do Mill’s call for equal opportunity for women and his measured position on international interventionism.  Surely, the premium he places on human happiness is admirable, as is his universal perspective, which views all humans as counting.  The problem is in his assumptions that all values are relative to consequences, that human happiness is the ultimate good, and that this reduces to the maximization of pleasure and the minimization of pain.  The upshot of this position is that, in principle, nothing can be categorically forbidden, that, given sufficiently desirable ends, any means necessary to achieve them can be justified.  If we really believe that there can be no genuine conflict between justice and utility because the former is merely the most important part of the latter, then the rules of justice are reducible to calculations regarding what is generally conducive to the greatest happiness for the greatest number of people—mere inductive generalizations which must admit of exceptions; at least Mill’s ambiguity leaves him open to this interpretation.  There would seem to be a tension in Mill’s thought:  on the one hand, he wants to respect the liberty of all (civilized) responsible persons as rational agents; but, on the other hand, his commitment to utilitarianism would seem to subordinate that respect to the greatest good for the greatest number of people, allowing for the possibility of sacrificing the interests of the few to those of the many.

5. Contemporary Philosophers

From its founding, American political thought had an enduring focus on justice.  The Preamble to the American Constitution says that one of its primary goals is to “establish justice.”  Founding father James Madison, in 1788, wrote in The Federalist Papers that justice should be the goal of all government and of all civil society, that people are willing to risk even liberty in its pursuit.  American schoolchildren are made to memorize and recite a Pledge of Allegiance that ends with the words “with liberty and justice for all.”  So justice is an abiding American ideal.  We shall now consider how one of America’s greatest philosophers, John Rawls, addresses this ideal.  We should notice how he places a greater emphasis on equality than do most of his European predecessors—perhaps reflecting the conviction of the American Declaration of Independence that “all men are created equal.”  (This greater emphasis may reflect the influence of Marx, whom he occasionally mentions.)  After considering the formidable contributions of Rawls to justice theory and some of its applications, we shall conclude this survey with a brief treatment of several post-Rawlsian alternatives.  A key focus that will distinguish this section from previous ones is the effort to achieve a conception of justice that strikes a reasonable balance between liberty and equality.

a. Rawls

Rawls burst into prominence in 1958 with the publication of his game-changing paper, “Justice as Fairness.”  Though it was not his first important publication, it revived the social contract theory that had been languishing in the wake of Hume’s critique and its denigration by utilitarians and pragmatists, though it was a Kantian version of it that Rawls advocated.  This led to a greatly developed book version, A Theory of Justice, published in 1971, arguably the most important book of American philosophy published in the second half of the last century.  Rawls makes it clear that his theory, which he calls “justice as fairness,” assumes a Kantian view of persons as “free and equal,” morally autonomous, rational agents, who are not necessarily egoists.  He also makes it clear early on that he means to present his theory as a preferable alternative to that of utilitarians.  He asks us to imagine persons in a hypothetical “initial situation” which he calls “the original position” (corresponding to the “state of nature” or “natural condition” of Hobbes, but clearly not presented as any sort of historical or pre-historical fact).  This is strikingly characterized by what Rawls calls “the veil of ignorance,” a device designed to minimize the influence of selfish bias in attempting to determine what would be just.  If you must decide on what sort of society you could commit yourself to accepting as a permanent member and were not allowed to factor in specific knowledge about yourself—such as your gender, race, ethnic identity, level of intelligence, physical strength, quickness and stamina, and so forth—then you would presumably exercise the rational choice to make the society as fair for everyone as possible, lest you find yourself at the bottom of that society for the rest of your life.  
In such a “purely hypothetical” situation, Rawls believes that we would rationally adopt two basic principles of justice for our society:  “the first requires equality in the assignment of basic rights and duties, while the second holds that social and economic inequalities, for example inequalities of wealth and authority, are just only if they result in compensating benefits for everyone, and in particular for the least advantaged members of society.”  Here we see Rawls conceiving of justice, the primary social virtue, as requiring equal basic liberties for all citizens and a presumption of equality even regarding socio-economic goods.  He emphasizes the point that these principles rule out as unjust the utilitarian justification of disadvantages for some on account of greater advantages for others, since that would be rationally unacceptable to one operating under the veil of ignorance.  Like Kant, Rawls is opposed to the teleological or consequentialist gambit of defining the right (including the just) in terms of “maximizing the good”; he rather, like Kant, the deontologist, is committed to a “priority of the right over the good.”  Justice is not reducible to utility or pragmatic desirability.  We should notice that the first principle of justice, which requires maximum equality of rights and duties for all members of society, is prior in “serial or lexical order” to the second, which specifies how socio-economic inequalities can be justified (Theory, pp. 12-26, 31, 42-43).  Again, this is anti-utilitarian, in that no increase in socio-economic benefits for anyone can ever justify anything less than maximum equality of rights and duties for all.  Thus, for example, if enslaving a few members of society generated vastly more benefits for the majority than liabilities for them, such a bargain would be categorically ruled out as unjust.

Rawls proceeds to develop his articulation of these two principles of justice more carefully.  He reformulates the first one in terms of maximum equal liberty, writing that “each person is to have an equal right to the most extensive basic liberty compatible with a similar liberty for others.”  The basic liberties intended concern such civil rights as are protected in our Constitution—free speech, freedom of assembly, freedom of conscience, the right to private property, the rights to vote and hold public office, freedom from arbitrary arrest and seizure, etc.  The lexical priority of this first principle requires that it be categorical in that the only justification for limiting any basic liberties would be to enhance other basic liberties; for example, it might be just to limit free access of the press to a sensational legal proceeding in order to protect the right of the accused to a fair trial.  Rawls restates his second principle to maintain that “social and economic inequalities are to be arranged so that they are both (a) reasonably expected to be to everyone’s advantage, and (b) attached to positions and offices open to all.”  Thus socio-economic inequalities can be justified, but only if both conditions are met.  The first condition (a) is “the difference principle” and takes seriously the idea that every socio-economic difference separating one member of society from others must be beneficial to all, including the person ranked lowest.  The second condition is one of “fair equality of opportunity,” in that socio-economic advantages must be connected to positions to which all members of society could have access.  For example, the office of the presidency has attached to it greater social prestige and income than is available to most of us.  Is that just?  
It can be, assuming that all of us, as citizens, could achieve that office with its compensations and that even those of us at or near the bottom of the socio-economic scale benefit from intelligent, talented people accepting the awesome responsibilities of that office.  Just as the first principle must be lexically prior to the second, Rawls also maintains that “fair opportunity is prior to the difference principle.”  Thus, if we have to choose between equal opportunity for all and socio-economically benefiting “the least advantaged” members of society, the former has priority over the latter.  Most of us today might be readily sympathetic to the first principle and the equal opportunity condition, while finding the difference principle to be objectionably egalitarian, to the point of threatening incentives to contribute more than is required.  Rawls does consider a “mixed conception” of justice that most of us would regard as more attractive “arising when the principle of average utility constrained by a certain social minimum is substituted for the difference principle, everything else remaining unchanged.”  But there would be a problem of fairly agreeing on that acceptable social minimum, and it would change with shifting contingent circumstances.  It is curious that his own theory of “justice as fairness” gets attacked by socialists such as Nielsen (whom we shall consider) for sacrificing equality for the sake of liberty and by libertarians such as Nozick (whom we shall also consider) for giving up too much liberty for the sake of equality.  Rawls briefly suggests that his theory of justice as fairness might be applied to international relations, in general, and to just war theory, in particular (ibid., pp. 60-65, 75, 83, 302-303, 316, 378).

Rawls applies his theory of justice to the domestic issue of civil disobedience.  No society is perfectly just.  A generally or “nearly just society” can have unjust laws, in which case its citizens may or may not have a duty to comply with them, depending on how severely unjust they are.  If the severity of the injustice is not great, then respect for democratic majority rule might morally dictate compliance.  Otherwise, citizens can feel a moral obligation to engage in civil disobedience, which Rawls defines as “a public, nonviolent, conscientious yet political act contrary to law usually done with the aim of bringing about a change in the law or policies of the government.”  Certain conditions must be met in order that an act of civil disobedience be justified:  (1) it should normally address violations of equal civil liberties (the first principle of justice) and/or of “fair equality of opportunity” (the second part of the second principle), with violations of the difference principle (the first part of the second principle) being murkier and, thus, harder to justify; (2) the act of civil disobedience should come only after appeals to the political majority have been reasonably tried and failed; (3) it must seem likely to accomplish more good than harm for the social order.  Yet, even if all three of these conditions seem to be met and the disobedient action seems right, there remains the practical question of whether it would be “wise or prudent,” under the circumstances, to engage in the act of civil disobedience.  Ultimately, every individual must decide for himself or herself whether such action is morally and prudentially justifiable or not as reasonably and responsibly as possible.  The acts of civil disobedience of Martin Luther King (to whom Rawls refers in a footnote) seem to have met all the conditions, to have been done in the name of justice, and to have been morally justified (ibid., pp. 350-357, 363-367, 372-376, 389-390, 364n).

Rawls’s second book was Political Liberalism.  Here he works out how a just political conception might develop a workable “overlapping consensus” despite the challenges to social union posed by a pluralism of “reasonable comprehensive doctrines.”  This, of course, calls for some explanation.  A just society must protect basic liberties equally for all of its members, including freedom of thought and its necessary condition, freedom of expression.  But, in a free society that protects these basic liberties, a pluralism of views and values is likely to develop, such that people can seriously disagree about matters they hold dear.  They will develop their own “comprehensive doctrines,” or systems of beliefs that may govern all significant aspects of their lives.  These may be religious (like Christianity) or philosophical (like Kantianism) or moral (like utilitarianism).  Yet a variety of potentially conflicting comprehensive doctrines may be such that all are reasonable.  In such a case, social unity requires respect for and tolerance of other sets of beliefs.  It would be unjust deliberately to suppress reasonable comprehensive doctrines merely because they are different from our own.  The problem of political liberalism nowadays is how we can establish “a stable and just society whose free and equal citizens are deeply divided by conflicting and even incommensurable religious, philosophical, and moral doctrines.”  What is needed is a shared “political conception of justice” that is neutral regarding competing comprehensive doctrines.  This could allow for “an overlapping consensus of reasonable comprehensive doctrines,” such that tolerance and mutual respect are operative even among those committed to incompatible views and values, so long as they are reasonable (Liberalism, pp. 291-292, 340-342, 145, xviii, 13, 152n., 59-60, 133, 154-155, 144, 134).  
Thus, for example, a Christian Kantian and an atheistic utilitarian, while sincerely disagreeing on many ethical principles, philosophical ideas, and religious beliefs, can unite in mutually accepting, for instance, the American Constitution as properly binding on all of us equally.  This agreement will enable them mutually to participate in social cooperation, the terms of which are fair and reciprocal and which can contribute to the reasonable good of the entire society.

Near the end of his life, Rawls published The Law of Peoples, in which he tried to apply his theory of justice to international relations.  Given that not all societies act justly and that societies have a right to defend themselves against aggressive violent force, there can be a right to go to war (jus ad bellum).  Yet even then, not all is fair in war, and rules of just warfare (jus in bello) should be observed:  (1) the goal must be a “just and lasting peace”; (2) it must be waged in defense of freedom and security from aggression; (3) reasonable attempts must be made not to attack innocent non-combatants; (4) the human rights of enemies (for example, against being tortured) must be respected; (5) attempts should be made to establish peaceful relations; and (6) practical tactics must always remain within the parameters of moral principles.  After hostilities have ceased, just conquerors must treat their conquered former enemies with respect—not, for example, enslaving them or denying them civil liberties.  Rawls adds a very controversial “supreme emergency exemption” in relation to the third rule—when a relatively just society’s very survival is in desperate peril, attacking enemy civilian populations, as by bombing cities, can be justifiable.  More generally, Rawls applies his theory of justice to international relations, generating eight rules regarding how the people of other societies must be treated.  While we do not have time to explore them all here, the last one is sufficiently provocative to be worth our considering:  “Peoples have a duty to assist other peoples living under unfavorable conditions that prevent their having a just or decent political and social regime.”  This, of course, goes beyond not exploiting, cheating, manipulating, deceiving, and interfering with others to a positive duty of trying to help them, at the cost of time, money, and other resources.  
Justice demands that we try to assist what Rawls calls “burdened societies,” so that doing so is not morally supererogatory.  What is most interesting here is what Rawls refuses to say.  While different peoples, internationally speaking, might be imagined in an original position under the veil of ignorance, and Rawls would favor encouraging equal liberties and opportunities for all, he refuses to apply the difference principle globally in such a way as to indicate that justice requires a massive redistribution of wealth from richer to poorer societies (Peoples, pp. 94-96, 98-99, 37, 106, 114-117).

From a critical perspective, Rawls’s theory of civil disobedience is excellent, as are his theory of political liberalism and his version of the just war theory, except for that “supreme emergency exemption,” which uncharacteristically tries to make right a function of teleological good.  His views on international aid seem so well worked out that, ironically, they call into question part of his general theory of justice itself.  It does not seem plausible that the difference principle should apply intrasocietally but not internationally.  The problem may be with the difference principle itself.  It is not at all clear that rational agents in a hypothetical original position would adopt such an egalitarian principle.  The veil of ignorance leading to this controversial principle can itself be questioned as artificial and unrealistic; one might object that, far from being methodologically neutral, it sets up a bias (towards, for example, being risk-averse) that renders Rawls’s own favored principles of justice almost a foregone conclusion.  Indeed, the “mixed conception” that Rawls himself considers and rejects seems more plausible and more universally applicable—keeping the first principle and the second part of the second but replacing the difference principle with one of average utility, constrained by some social minimum, adjustable with changing circumstances.  Thus we could satisfactorily specify the requirements of an essentially Kantian conception of justice, as requiring respect for the dignity of all persons as free and equal, rational moral agents.  While less egalitarian than what Rawls offers, it might prove an attractive alternative.  To what extent should liberty be constrained by equality in a just society?  This is a central issue that divides him from many post-Rawlsians, to a few of whom we now briefly turn.

b. Post-Rawls

Rawls’s monumental work on justice theory revitalized political philosophy in the United States and other English-speaking countries.  In this final subsection, we shall briefly survey some of the most important recent attempts to provide preferable alternatives to Rawls’s conception of justice.  They will represent six different approaches.  We shall consider, in succession, (1) the libertarian approach of Robert Nozick, (2) the socialistic one of Kai Nielsen, (3) the communitarian one of Michael Sandel, (4) the globalist one of Thomas Pogge, (5) the feminist one of Martha Nussbaum, and (6) the rights-based one of Michael Boylan.  As this is merely a quick survey, we shall not delve much into the details of their theories (limiting ourselves to a single work by each) or explore their applications or do much in the way of a critique of them.  But the point will be to get a sense of several recent approaches to developing views of justice in the wake of Rawls.

(1)    Nozick

Nozick (a departmental colleague of Rawls at Harvard) was one of the first and remains one of the most famous critics of Rawls’s liberal theory of justice.  Both are fundamentally committed to individual liberty.  But as a libertarian, Nozick is opposed to compromising individual liberty in order to promote socio-economic equality and advocates a “minimal state” as the only sort that can be socially just.  In Anarchy, State, and Utopia (1974), especially in its famous chapter on “Distributive Justice,” while praising Rawls’s first book as the most important “work in political and moral philosophy” since that of Mill, Nozick argues for what he calls an “entitlement conception of justice” in terms of three principles of just holdings.  First, anyone who justly acquires any holding is rightly entitled to keep and use it.  Second, anyone who acquires any holding by means of a just transfer of property is rightly entitled to keep and use it.  It is only through some combination of these two approaches that anyone is rightly entitled to any holding.  But some people acquire holdings unjustly—e.g., by theft or fraud or force—so that there are illegitimate holdings.  So, third, justice can require the rectification of unjust past acquisitions.  These three principles of just holdings—“the principle of acquisition of holdings, the principle of transfer of holdings, and the principle of rectification of the violations of the first two principles”—constitute the core of Nozick’s libertarian entitlement theory of justice.  People should be entitled to use their own property as they see fit, so long as they are entitled to it.  On this view, any pattern of distribution, such as Rawls’s difference principle, that would force people to give up any holdings to which they are entitled in order to give it to someone else (i.e., a redistribution of wealth) is unjust.  
Thus, for Nozick, any state, such as ours or one Rawls would favor, that is “more extensive” than a minimal state and redistributes wealth by taxing those who are relatively well off to benefit the disadvantaged necessarily “violates people’s rights” (State, pp. 149, 183, 230, 150-153, 230-231, 149).

(2)    Nielsen

Nielsen, as a socialist (against both Rawls and Nozick), considers equality to be a more fundamental ideal than individual liberty; this is more in keeping with Marxism than with the liberal/libertarian tradition that has largely stemmed from Locke.  (Whereas capitalism supports the ownership and control of the means of producing and distributing material goods by private capital or wealth, socialism holds that they should be owned and controlled by society as a whole.)  If Nozick accuses Rawls of going too far in requiring a redistribution of wealth, Nielsen criticizes him for favoring individual liberty at the expense of social equality.  In direct contrast to Rawls’s two liberal principles of justice, in “Radical Egalitarian Justice:  Justice as Equality,” Nielsen proposes his own two socialistic principles constituting the core of his “egalitarian conception of justice.”  In his first principle, he calls for “equal basic liberties and opportunities” (rather than for merely “equal basic liberties”), including the opportunities “for meaningful work, for self-determination, and political participation,” which he considers important to promote “equal moral autonomy and equal self-respect.”  Also (unlike Rawls) he does not claim any lexical priority for either principle over the other.  His sharper departure from Rawls can be found in his second principle, which is to replace the difference principle that allegedly justified socio-economic inequality.  After specifying a few qualifications, it calls for “the income and wealth” of society “to be so divided that each person will have a right to an equal share” and for the burdens of society “also to be equally shared, subject, of course, to limitations by differing abilities and differing situations.”  He argues that his own second principle would better promote “equal self-respect and equal moral autonomy” among the members of society.  
Thus we might eliminate social stratification and class exploitation, in accordance with the ideals of Marxist humanism (“Equality,” pp. 209, 211-213, 222-225).

(3)    Sandel

Sandel, as a communitarian, argues (against Rawls and Nozick) that the well-being of a community takes precedence over individual liberty and (against Nielsen) over the socio-economic welfare of its members.  While acknowledging that Rawls is not so “narrowly individualistic” as to rule out the value of building social community, in Liberalism and the Limits of Justice, he maintains that the individualism of persons in the original position is such that “a sense of community” is not a basic “constituent of their identity as such,” so that community is bound to remain secondary and derivative in the Rawlsian theory.  To deny that community values help constitute one’s personal identity is to render impossible any preexisting interpersonal good from which a sense of right can be derived.  Thus, for Sandel, Rawls’s myopic theory of human nature gives him no basis for any pre-political natural rights.  So his conception of justice based on this impoverished view must fail to reflect “the shared self-understandings” of persons as members of a community, understandings that must undergird the basic structure of political society.  Through the interpersonal relationships of community, we establish “more or less enduring attachments and commitments” that help define who we are, as well as the values that will help characterize our sense of justice as a common good that cannot be properly understood by individuals detached from community.  Thus justice must determine what is right as serving the goods we embrace in a social context—“as members of this family or community or nation or people, as bearers of this history, as sons and daughters of that revolution, as citizens of this republic” rather than as abstract individuals (Limits, pp. 66, 60-65, 87, 150, 172-174, 179, 183, 179).

(4)    Pogge

Pogge develops a globalist interpretation of justice as fairness that, in a sense, is more consistent than Rawls’s own.  More specifically, it not only accepts the difference principle but wants to apply it on an international level as well as nationally.  In “An Egalitarian Law of Peoples,” Pogge observes that Rawls means his theory of justice to be relatively “egalitarian.”  And, as applied intranationally, so it is.  But, as applied internationally, it is not.  As he says, there is a disconnect “between Rawls’s conception of domestic and of global justice.”  (We should note that, like Sandel’s critique, which we just considered, Pogge’s is not a complete theory of justice, but more a modification of Rawls’s own.)  While Rawls does believe that well-off societies have a duty to assist burdened societies, he rejects the idea of a global application of his difference principle.  What Pogge is proposing is a global egalitarian principle of distributive justice.  He thinks that this will address socio-economic inequalities that are to the detriment of the world’s worst-off persons.  What he proposes is “a global resources tax, or GRT.”  This means that, although each of the peoples of our planet “owns and fully controls all resources within its national territory,” it will be taxed on all of the resources it extracts.  If it uses those extracted resources itself, it must pay the tax itself.  If it sells some to other societies, presumably at least part of the tax burden will be borne by buyers in the form of higher sales prices.  “The GRT is then a tax on consumption” of our planet’s resources.  Corporations extracting resources (such as oil companies and coal mining companies) would pay their taxes to their governments, which, in turn, would be responsible for transferring funds to disadvantaged societies to help the global poor.  Such payments should be regarded as “a matter of entitlement rather than charity,” an obligation of international justice.  
If the governments of the poorer states were honest, they could disburse the funds; if they were corrupt, then transfers could go through United Nations agencies and/or nongovernmental organizations.  At any rate, they should be channeled toward societies in which they could improve the lot of the poor and disadvantaged.  (Of course, less well-off societies would be free to refuse such funds, if they so chose.)  But, one might wonder, would well-off societies only be motivated to pay their fair share by benevolence, a sense of justice, and possible shame at being exposed for not doing so?  No, there could be international sanctions:  “Once the agency facilitating the flow of GRT payments reports that a country has not met its obligations under the scheme, all other countries are required to impose duties on imports from, and perhaps also similar levies on exports to, this country to raise funds equivalent to its GRT obligations plus the cost of these enforcement measures.”  Pogge believes that well-off societies should recognize that his more egalitarian model of international relations is also more just than Rawls’s law of peoples (“Egalitarian,” pp. 195-196, 210, 199-202, 205, 219, 224).

(5)    Nussbaum

Nussbaum, like Pogge (and unlike Nozick and Nielsen), does not so much reject Rawls’s liberal conception of justice as extend its explicit application.  In Sex and Social Justice, she argues for a feminist interpretation of justice, using what she calls a “capabilities approach” that connects with “the tradition of Kantian liberalism,” nowadays represented by Rawls, tapping into its “notions of dignity and liberty,” as a foundation for discussing the demands of justice regarding “women’s equality and women’s human rights.”  The feminism she embraces has five key dimensions:  (1) an internationalism, such that it is not limited to any one particular culture; (2) a humanism that affirms a basic equal worth in all human beings and promotes justice for all; (3) a commitment to liberalism as the perspective that best protects and promotes the “basic human capacities for choice and reasoning” in virtue of which all humans have an equal worth; (4) a sensitivity to the cultural shaping of our preferences and desires; and (5) a concern for sympathetic understanding between the sexes.  She expresses an appreciation for the primary goods at the core of Rawls’s theory, while asserting that his analysis does not go far enough.  
She offers her own list of ten “central human functional capabilities” that must be respected by a just society:  (1) life of a normal, natural duration; (2) bodily health and integrity, including adequate nourishment and shelter; (3) bodily integrity regarding, for example, freedom of movement and security against assault; (4) freedom to exercise one’s senses, imagination, and thought as one pleases, which includes freedom of expression; (5) freedom to form emotional attachments to persons and things, which includes freedom of association; (6) the development and exercise of practical reason, the capacity to form one’s own conception of the good and to try to plan one’s own life, which includes the protection of freedom of conscience; (7) freedom of affiliation on equal terms with others, which involves provisions of nondiscrimination; (8) concern for and possible relationships with animals, plants, and the world of nature; (9) the freedom to play, to seek amusement, and to enjoy recreational activities; and (10) some control over one’s own political environment, including the right to vote, and one’s material environment, including the rights to seek meaningful work and to hold property.  All of these capabilities are essential to our functioning as flourishing human beings and should be assured for all citizens of a just society.  But, historically, women have been and still are short-changed with respect to them and should be guaranteed their protection in the name of justice (Sex, pp. 24, 6-14, 34, 40-42).

(6)    Boylan

Boylan has recently presented “a ‘rights-based’ deontological approach based upon the necessary conditions for human action.”  In A Just Society, he observes that human goods are more or less deeply “embedded” as conditions of human action, leading to a hierarchy that can be set forth.  There are two levels of basic goods.  The most deeply embedded of these, such as food, clothing, shelter, and protection from physical harm, are absolutely necessary for any meaningful human action.  The second level of basic goods comprises (less) deeply embedded ones, such as the basic knowledge and skills imparted by education, social structures that allow us to trust one another, basic assurance that we will not be exploited, and the protection of basic human rights.  Next, there are three levels of secondary goods.  The most embedded of these are life enhancing, if not necessary for any meaningful action, such as respect, equal opportunity, and the capacity to form and follow one’s own plan of life and to participate actively and equally in community, characterized by shared values.  A second level of secondary goods comprises those that are useful for human action, such as having and being able to use property, being able to benefit from one’s own labor, and being able to pursue goods typically owned by most of one’s fellow citizens.  The third level of secondary goods comprises those that are least embedded as conditions of meaningful action but still desirable as luxuries, such as being able to seek pleasant objectives that most of one’s fellow citizens cannot expect to achieve and being able to compete for somewhat more than others in one’s society.  The more deeply embedded goods are as conditions of meaningful human action, the stronger people’s right to them.  Boylan follows Kant and Rawls in holding that an ultimate moral imperative is that individual human agents and their rights must be respected.  
This is a matter of justice:  distributive justice, involving a fair distribution of social goods and services, and retributive justice, involving proper ways for society to treat those who violate the rules.  A just society has a duty to provide basic goods equally to all of its members, if it can do so.  But things get more complicated with regard to secondary goods.  A just society will try to provide the first level of secondary goods, those that are life enhancing, equally to all its members.  Yet this becomes more problematic with the second and third levels of secondary goods—those that are useful and luxurious—as the conditions for meaningful human action have already been satisfied by more deeply embedded ones.  The need that people have to derive rewards for their work commensurate with their achievement would seem to militate against any guarantee of equal shares in these, even if society could provide them, although comparable achievement should be comparably rewarded.  Finally, in the area of retributive justice, we may briefly consider three scenarios.  First, when one person takes a tangible good from another person, justice requires that the perpetrator return to the victim some tangible good(s) of comparable worth, plus compensation proportionate to the harm done the victim by the loss.  Second, when one person takes an intangible good from another person, justice requires that the perpetrator give the victim some tangible good as adequate compensation for the pain and suffering caused by the loss.  And, third, when one person injures another person through the deprivation of a valued good that negatively affects society, society can justly incarcerate the perpetrator for a period of time proportionate to the loss (Society, pp. x, 53-54, 56-58, 131, 138, 143-144, 164-167, 174-175, 181, 183).

In conclusion, we might observe that, in this rights-based alternative, as in the five previous ones we have considered (the libertarian, the socialistic, the communitarian, the globalist, and the feminist), there is an attempt to interpret justice as requiring respect for the dignity of all persons as free and equal, rational moral agents.  This historical survey has tracked the progressive development of this Kantian idea, which has become increasingly prominent in Western theories of justice.

6. References and Further Readings

a. Primary Sources

  • Thomas Aquinas, On Law, Morality, and Politics, ed. William P. Baumgarth and Richard J. Regan, S.J. (called “Law”).  Indianapolis:  Hackett, 1988.
  • Thomas Aquinas, Summa Theologica, trans. Fathers of the English Dominican Province, Vol. One (called “Summa”).  New York:  Benziger Brothers, 1947.
  • Aristotle, Nicomachean Ethics, trans. Terence Irwin, Second Edition (called “Nicomachean”).  Indianapolis:  Hackett, 1999.
  • Aristotle, On Rhetoric, trans. George A. Kennedy (called “Rhetoric”).  New York:  Oxford University Press, 1991.
  • Aristotle, Politics, trans. C. D. C. Reeve (called “Politics”).  Indianapolis:  Hackett, 1998.
  • Augustine, The City of God, trans. Henry Bettenson (called “City”).  London:  Penguin Books, 1984.
  • Augustine, Of True Religion, trans. J. H. S. Burleigh (called “Religion”).  Chicago:  Henry Regnery, 1959.
  • Augustine, On Free Choice of the Will, trans. Thomas Williams (called “Choice”).  Indianapolis:  Hackett, 1993.
  • Augustine, Political Writings, trans. and ed. Michael W. Tkacz and Douglas Kries (called “Political”).  Indianapolis:  Hackett, 1994.
  • Michael Boylan, A Just Society (called “Society”).  Lanham, MD:  Rowman & Littlefield, 2004.
  • Thomas Hobbes, The Elements of Law, ed. J. C. A. Gaskin (called “Elements”).  Oxford:  Oxford University Press, 1994.
  • Thomas Hobbes, Leviathan, ed. Edwin Curley.  Indianapolis:  Hackett, 1994.
  • Thomas Hobbes, Man and Citizen, ed. Bernard Gert (called “Citizen”).  Indianapolis:  Hackett, 1991.
  • Thomas Hobbes, Writings on Common Law and Hereditary Right, ed. Alan Cromartie and Quentin Skinner (called “Common”).  Oxford:  Oxford University Press, 2008.
  • David Hume, An Enquiry concerning the Principles of Morals, ed. J. B. Schneewind (called “Enquiry”).  Indianapolis:  Hackett, 1983.
  • David Hume, Political Essays, ed. Knud Haakonssen (called “Essays”).  Cambridge:  Cambridge University Press, 1994.
  • David Hume, A Treatise of Human Nature, ed. David Fate Norton and Mary J. Norton (called “Treatise”).  Oxford:  Oxford University Press, 2000.
  • Immanuel Kant, Ethical Philosophy, trans. James W. Ellington, Second Edition (called “Ethical”).  Indianapolis:  Hackett, 1994.
  • Immanuel Kant, Lectures on Ethics, trans. Louis Infield (called “Lectures”).  New York:  Harper & Row, 1963.
  • Immanuel Kant, Metaphysical Elements of Justice, trans. John Ladd, Second Edition (called “Justice”).  Indianapolis:  Hackett, 1999.
  • Immanuel Kant, Political Writings, trans. H. B. Nisbet, ed. Hans Reiss, Second Edition (called “Writings”).  Cambridge:  Cambridge University Press, 1991.
  • John Locke, Second Treatise of Government, ed. C. B. Macpherson.  Indianapolis:  Hackett, 1980.
  • Karl Marx, Selected Writings, ed. Lawrence H. Simon.  Indianapolis:  Hackett, 1994.
  • John Stuart Mill, “A Few Words on Non-Intervention,” in Essays on Politics and Culture, ed. Gertrude Himmelfarb (called “Non-Intervention”).  Garden City, NY:  Anchor Books, 1963.
  • John Stuart Mill, “Capital Punishment,” in Public and Parliamentary Speeches, ed. John M. Robson and Bruce L. Kinzer (called “Punishment”).  Toronto:  University of Toronto Press, 1988.
  • John Stuart Mill, The Subjection of Women (called “Subjection”).  Mineola, NY:  Dover, 1997.
  • John Stuart Mill, Utilitarianism and Other Writings, ed. Mary Warnock (called “Utilitarianism”).  Cleveland:  World Publishing Company, 1962.
    • This anthology also contains some Bentham and some Austin.
  • Kai Nielsen, “Radical Egalitarian Justice:  Justice as Equality” (called “Equality”).  Social Theory and Practice, Vol. 5, No. 2, 1979.
  • Robert Nozick, Anarchy, State, and Utopia (called “State”).  New York:  Basic Books, 1974.
  • Martha C. Nussbaum, Sex and Social Justice (called “Sex”).  New York:  Oxford University Press, 1999.
  • Plato, Five Dialogues, trans. G. M. A. Grube (called “Dialogues”).  Indianapolis:  Hackett, 1981.
  • Plato, Gorgias, trans. Donald J. Zeyl.  Indianapolis:  Hackett, 1987.
  • Plato, The Laws, trans. Trevor J. Saunders (called “Laws”).  London:  Penguin Books, 1975.
  • Plato, Republic, trans. G. M. A. Grube, revised by C. D. C. Reeve.  Indianapolis:  Hackett, 1992.
  • Thomas W. Pogge, “An Egalitarian Law of Peoples” (called “Egalitarian”).  Philosophy and Public Affairs, Vol. 23, No. 3, 1994.
  • John Rawls, Collected Papers, ed. Samuel Freeman (called “Papers”).  Cambridge, MA:  Harvard University Press, 1999.
  • John Rawls, The Law of Peoples (called “Peoples”).  Cambridge, MA:  Harvard University Press, 1999.
  • John Rawls, Political Liberalism (called “Liberalism”).  New York:  Columbia University Press, 1996.
  • John Rawls, A Theory of Justice (called “Theory”).  Cambridge, MA:  Harvard University Press, 1971.
  • Jean-Jacques Rousseau, On the Social Contract, trans. G. D. H. Cole.  Mineola, NY:  Dover, 2003.
  • Michael J. Sandel, Liberalism and the Limits of Justice (called “Limits”).  New York:  Cambridge University Press, 1982.
  • Robin Waterfield, trans., The First Philosophers (called “First”).  New York:  Oxford University Press, 2000.

b. Secondary Sources

  • John Arthur and William H. Shaw, ed., Justice and Economic Distribution.  Englewood Cliffs, NJ:  Prentice-Hall, 1978.
    • This is a good collection of contemporary readings, especially one by Kai Nielsen.
  • Jonathan Barnes, ed., The Cambridge Companion to Aristotle.  New York:  Cambridge University Press, 1995.
    • The articles on Aristotle’s “Ethics” and “Politics” are particularly relevant.
  • Brian Barry, Justice and Impartiality.  New York:  Oxford University Press, 1995.
    • This is a good study.
  • Brian Barry, Theories of Justice.  Berkeley:  University of California Press, 1989.
    • This discussion makes up in depth what it lacks in breadth.
  • Hugo A. Bedau, ed., Justice and Equality.  Englewood Cliffs, NJ:  Prentice-Hall, 1971.
    • This is an old but still valuable anthology.
  • H. Gene Blocker and Elizabeth H. Smith, ed., John Rawls’ Theory of Social Justice.  Athens, OH:  Ohio University Press, 1980.
    • This is an early but still worthwhile collection of papers, with “Justice and International Relations,” by Charles R. Beitz, being particularly provocative.
  • Ronald Dworkin, Taking Rights Seriously.  Cambridge, MA:  Harvard University Press, 1977.
    • See, especially, the chapter on “Justice and Rights,” which contains a critique of Rawls’s theory.
  • Joel Feinberg, Doing and Deserving.  Princeton, NJ:  Princeton University Press, 1970.
    • The fourth chapter, on “Justice and Personal Desert,” is especially relevant.
  • Samuel Freeman, ed., The Cambridge Companion to Rawls.  New York:  Cambridge University Press, 2003.
    • Like all the books in this series, this one offers a fine array of critical articles, with the one by Martha C. Nussbaum being particularly noteworthy.
  • John-Stewart Gordon, ed., Morality and Justice:  Reading Boylan’s A Just Society.  Lanham, MD:  Lexington Books, 2009.
    • Fourteen essays by scholars from eight countries, with a reply by Boylan.
  • Richard Kraut, ed., Plato’s Republic:  Critical Essays.  Lanham, MD:  Rowman & Littlefield, 1997.
    • See, in particular, the articles by John M. Cooper and Kraut himself.
  • Rex Martin and David A. Reidy, ed., Rawls’s Law of Peoples.  Oxford:  Blackwell, 2006.
    • In particular, see “Do Rawls’s Two Theories of Justice Fit Together,” by Thomas Pogge.
  • David Miller, Principles of Social Justice.  Cambridge, MA:  Harvard University Press, 1999.
    • This is a good contemporary treatment.
  • David Fate Norton, ed., The Cambridge Companion to Hume.  New York:  Cambridge University Press, 1993.
    • See, especially, “The Structure of Hume’s Political Theory,” by Knud Haakonssen.
  • Thomas W. Pogge, Realizing Rawls.  Ithaca, NY:  Cornell University Press, 1989.
    • This is a constructive critique of Rawls’s early work.
  • Louis P. Pojman, Global Political Philosophy.  New York:  McGraw-Hill, 2003.
    • The fifth chapter focuses on justice.
  • Wayne P. Pomerleau, Twelve Great Philosophers.  New York:  Ardsley House, 1997.
    • This contains discussions of Plato, Aristotle, Augustine, Aquinas, Hobbes, Hume, Kant, Mill, and (a bit on) Rawls.
  • Tom Regan and Donald VanDeVeer, ed., And Justice for All.  Totowa, NJ:  Rowman and Littlefield, 1982.
    • An interesting collection, with a particularly penetrating article by Kai Nielsen.
  • Henry S. Richardson, “John Rawls (1921-2002),” in the Internet Encyclopedia of Philosophy.
    • This is a very good overview article.
  • Paul Ricoeur, The Just, trans. David Pellauer.  Chicago:  University of Chicago Press, 2000.
    • This is interesting as a contemporary treatment from the continental tradition.
  • Allen D. Rosen, Kant’s Theory of Justice.  Ithaca:  Cornell University Press, 1993.
    • This is a valuable, in-depth analysis.
  • Alan Ryan, ed., Justice.  Oxford:  Oxford University Press, 1993.
    • This is a very good anthology of classical and contemporary readings.
  • Michael J. Sandel, ed., Justice:  A Reader.  New York:  Oxford University Press, 2007.
    • This is an interesting anthology of readings that includes Sandel’s own article on “Political Liberalism.”
  • Gerasimos Santas, Goodness and Justice.  Oxford:  Blackwell, 2001.
    • This is an in-depth examination of Socratic, Platonic, and Aristotelian views.
  • Amartya Sen, The Idea of Justice.  Cambridge, MA:  Harvard University Press, 2009.
    • This is a wide-ranging recent study.
  • John Skorupski, ed., The Cambridge Companion to Mill.  New York:  Cambridge University Press, 1998.
    • See, especially, “Mill’s Utilitarianism,” by Wendy Donner.
  • Robert C. Solomon and Mark C. Murphy, ed., What Is Justice?, Second Edition.  New York:  Oxford University Press, 2000.
    • This is a nice and well-organized collection of classical and contemporary texts.
  • James P. Sterba, The Demands of Justice.  Notre Dame:  University of Notre Dame Press, 1980.
    • This is a good monograph.
  • James P. Sterba, ed., Justice:  Alternative Political Perspectives, Fourth Edition.  Belmont, CA:  Wadsworth/Thomson, 2003.
    • This is a well-organized collection that includes a classic feminist critique of Rawls, taken from Justice, Gender and the Family, by Susan Okin.
  • Gregory Vlastos, ed., Plato:  A Collection of Critical Essays, Vol. II.  Garden City, NY:  Anchor Books, 1971.
    • See, especially, “Justice and Happiness in the Republic,” by Vlastos himself.
  • Michael Walzer, Just and Unjust Wars, Third Edition.  New York:  Basic Books, 2000.
    • This is an in-depth contemporary exploration of the topic.
  • Michael Walzer, Spheres of Justice.  New York:  Basic Books, 1983.
    • This is a comprehensive study.
  • Eric Thomas Weber, Rawls, Dewey and Constructivism:  On the Epistemology of Justice.  London:  Continuum, 2010.
    • This is a good recent comparative analysis.
  • Jonathan Westphal, ed., Justice.  Indianapolis:  Hackett, 1996.
    • This is one of the best anthologies of classic texts on this subject.

Author Information

Wayne P. Pomerleau
Email: Pomerleau@calvin.gonzaga.edu
Gonzaga University
U. S. A.

Art and Emotion

It is widely thought that the capacity of artworks to arouse emotions in audiences is a perfectly natural and unproblematic fact. It just seems obvious that we can feel sadness or pity for fictional characters, fear at the sight of threatening monsters on the movie screen, and joy upon listening to upbeat, happy songs. This may be why so many of us are consumers of art in the first place. Good art, many of us tend to think, should not leave us cold.

These common thoughts, however natural they are, become problematic once we start to make explicit other common ideas about both emotion and our relationship with artworks. If some emotions, such as pity, require that the object of the emotion be believed to exist, how would it then be possible to feel pity for a fictional character that we all know does not exist? A task of fundamental importance, therefore, is to explain the possibility of emotion in the context of our dealings with various kinds of artworks.

How are we motivated to pursue, and find value in, an emotional engagement with artworks when doing so often involves affective states that we generally count as negative or even painful (fear, sadness, anger, and so on)? If we would rather avoid feeling these emotions in real life, how are we to explain cases where we pursue activities, such as watching films, that we know may arouse similar feelings? Alternatively, why are so many people eager to listen to seemingly deeply distressing musical works when they would not want to feel this way in other contexts? Are most of us guilty of irrational pleasure in liking what makes us feel bad? Answering these and related questions is of prime importance if we wish to vindicate the thought that emotion in response to art is not only a good thing to have, but also valuable in enabling us to appreciate artworks.

Table of Contents

  1. Introduction
  2. Emotion in Response to Representational Artworks: The Paradox of Fiction
    1. Denying the Fictional Emotions Condition
    2. Denying the Belief Condition
    3. Denying the Disbelief Condition: Illusion and Suspension of Disbelief
  3. Emotion in Response to Abstract Artworks: Music
    1. Moods, Idiosyncratic Responses and Surrogate Objects
    2. Emotions as Responses to Musical Expressiveness
      1. Music as Expression of the Composer’s Emotions
      2. Reacting Emotionally to Music’s Persona
      3. Contour Accounts of Expressiveness and Appropriate Emotions
    3. Music as Object
  4. Art and Negative Emotion: The Paradox of Tragedy
    1. Pleasure-Centered Solutions
      1. Negligible Pain
        1. Conversion
        2. Control
        3. No Essential Valence
      2. Compensation
        1. Intellectual Pleasure
        2. Meta-Response
        3. Catharsis
        4. Affective Mixture
    2. Non-Pleasure-Centered Solutions
      1. ‘Relief from Boredom’ and Rich Experience
      2. Pluralism about Reasons for Emotional Engagement
  5. The Rationality of Audience Emotion
  6. Conclusion
  7. References and Further Reading

1. Introduction

That emotion is a central part of our dealings with artworks seems undeniable. Yet, natural ideas about emotion, at least taken collectively, make it hard to see why such an assumption should be true. Once we know a bit more about emotion, it isn’t clear how we could feel genuine emotion towards artworks in the first place. Furthermore, even if a plausible theory of emotion vindicated the claim that we experience genuine emotion towards artworks, questions would arise as to why most of us are motivated to engage with artworks, especially when these tend to produce negative emotions. How can the emotion we feel towards artworks be rationally justified? Overall, a proper understanding of our emotional responses to art should shed light on its value.

Although the general question of what an emotion consists in is highly debated, it is nonetheless possible to set out a list of fairly uncontroversial general features of emotions that are most relevant to the present discussion.

Cognitive thesis: Emotions are cognitive states in the sense that they purport to be about things in the world. This means that they have intentionality: they require objects and ascribe certain properties to them. For instance, fear can be thought to attribute a dangerous nature or quality to its object. Given that emotions are representational states, they are subject to norms of correctness. A case of fear is inappropriate if the object is not in fact dangerous or threatening. The nature of the relevant state (whether it is a judgment, a perception, or something else) is a matter of debate (see Deonna & Teroni, 2012, for an excellent introduction). Depending on the answer one gives on this issue, one’s views on the nature of our affective states towards artworks may differ.

Belief requirement: Even though emotions are probably not belief-states, they arguably depend on beliefs for their existence and justification. One cannot, it seems, grieve over someone’s death if one doesn’t believe that they actually died. Or one cannot hope that it will be sunny tomorrow if one believes that the world is going to end tonight. More relevantly for the rest of the discussion, it may be difficult to make sense of why someone is genuinely afraid of something that one believes does not exist.

Phenomenological thesis: Emotions are typically felt. There is something it is like to experience an emotion; emotions are ‘qualitative’ states. Moreover, emotions are experienced with various levels of intensity. A given emotion can be experienced at a higher or lower level of intensity than seems appropriate to its object, suggesting that the correctness conditions of emotion should include a condition to the effect that the emotion is appropriately proportionate to its object.

Valence thesis: Emotions have valence or hedonic tone. Their phenomenology is such that some of them are experienced as negative and some as positive. Some emotion types seem to be essentially positive or negative. For example, fear, sadness, and pity are paradigmatic negative emotions, whereas joy and admiration are paradigmatic positive emotions.

All of these features, taken either individually or in combination, raise issues with respect to the alleged emotional relationship we entertain with artworks. Here are a couple of examples. The cognitive thesis and the belief requirement make it difficult to see how our affective responses to things we believe to be non-existent, or to strings of sounds without representational content (as in instrumental music), could be genuine emotions. Alternatively, they require us to give a cogent account of the objects of the relevant emotions if these are to count as genuine. The valence thesis, combined with the plausible claim that ‘negative’ emotions are frequently felt in response to artworks, raises the question of how we can be motivated to pursue an engagement with artworks that we know are likely to arouse painful experiences in us. Finally, if the claims that we experience genuine emotions towards artworks, and that we are not irrational in seeking out unpleasant experiences, can be made good, the question whether these emotional responses themselves are rational, justified, or proportional remains to be answered.

2. Emotion in Response to Representational Artworks: The Paradox of Fiction

Artworks can be roughly divided into those that are representational and those that are non-representational or abstract. Examples of the former include works of fiction, landscape paintings and pop music songs. Examples of the latter include expressionist paintings and works of instrumental or absolute music. Both kinds of artworks are commonly thought to arouse affective responses in audiences. The basic question is how best to describe such responses.

Starting with representational artworks, it isn’t clear how we can undergo genuine emotional responses towards things we believe not to exist in the first place (creating a tension with the belief requirement above). Taking fiction as the paradigm case, we arrive at the following paradox (first formulated by Radford, 1975):

  1. We experience genuine emotions directed at fictional characters and situations (fictional emotions condition).
  2. In order to experience an emotion towards X, one must believe that X exists (belief requirement).
  3. We do not believe that fictional characters and situations exist (disbelief condition).

Claims (1)-(3) being inconsistent, most theorists have tried to solve the paradox of fiction by rejecting one or more of these premises.

a. Denying the Fictional Emotions Condition

The fictional emotions condition (claim (1)) can be denied in two main ways. On the one hand, one could deny that the affective states we experience in the context of an encounter with fiction are genuine emotions. On the other hand, accepting that the affective states in question are genuine emotions, one could deny that such mental states are in fact directed at fictional characters and situations.

Regardless of how the debate over the nature of our affective responses to fiction turns out, one thing is clear: we can be moved by an encounter with fiction. It is difficult to deny that works of fiction tend to affect us in some way or other. One solution to the paradox of fiction is therefore to claim that the relevant affective states need not be genuine emotions but affective states of some other kind (for example, Charlton, 1970, 97). Some affective states or moods (like cheerfulness) and, perhaps, reflex-like reactions (surprise and shock), do not seem to require existential beliefs in order to occur and are in fact compatible with the presence of beliefs with any given content. There is certainly no contradiction in a state of mind involving both a gloomy mood and the belief that the characters described in fiction are purely fictional. An advantage of this view is that it explains why the responses we have towards fiction are very much emotion-like. Moods, for instance, are not typically ‘cold’ mental states but have a phenomenal character.

The major difficulty with approaches appealing to such seemingly non-cognitive or non-intentional states is that they only seem to account for a small part of the full range of affective states we experience in response to fiction (Levinson, 1997, 23; Neill, 1991, 55). It is quite clear that many of the affective states we have in response to fiction are experienced as directed towards some object or other. One’s sadness upon seeing a story’s main character die is, at the very least, felt as directed towards someone, be it the character himself, a real-world analogue of himself (for instance, a best friend), or any other known reference. An adequate solution to the paradox of fiction, therefore, should allow the affective states one feels in response to fiction to be genuinely intentional.

Another solution that denies that we experience genuine emotions towards fictional characters is the one given by Kendall Walton (1978). According to Walton, it is not literally true that we experience pity for the fictional character Anna Karenina, and it is not literally true that we experience fear at the sight of the approaching green slime on the movie screen. Rather, we are in some sense pretending to experience such emotions. It is only fictionally the case that we feel pity or fear in such contexts. As a result, the affective states we actually experience are not genuine emotions, but what Walton calls ‘quasi-emotions’. Just as one makes-believe that the world is infested with heartless zombies when watching Night of the Living Dead, or that one is reading the diary of Robinson Crusoe, one only makes-believe that one feels fear for the survivors, or sadness when Crusoe is having a bad day. Watching films and reading novels turn out to be very much like children’s games in which objects are make-believedly taken to be something else (for example, children pretend to bake cookies using lumps of mud). In engaging with both fiction and children’s games, for Walton, we can only be pretending to experience emotion.

Walton does not deny that we are genuinely moved by─or emotionally involved with─the world of fiction; he only denies that such affective states are of the garden variety─that is, that they should be described as genuine states of fear, pity, anger, and so on. This denial is backed up by two main arguments. First, in contrast with real-life emotions, quasi-emotions are typically not connected to behavior. We do not, after all, get up on the theater stage in order to warn Romeo that he is about to make a terrible mistake. Second, in line with the belief requirement, Walton appeals to what seems to him a “principle of commonsense”, “one which ought not to be abandoned if there is any reasonable alternative, that fear must be accompanied by, or must involve, a belief that one is in danger.” (1978, 6-7)

Walton’s theory can be attacked on several fronts, but the most common strategy is to reject one or both of the arguments Walton gives. In response to the first argument, one could argue that emotions in general are not necessarily connected to motivation or behavior. For instance, emotions directed at the past, such as regret, are not typically associated with any particular sort of behavior. Of course, emotions can lead to behavior, and many of them include a motivational component, but it is not clear that such a component is a defining feature of emotion in general. Moreover, it is important to note that most works of fiction are composed of representations of certain events, and do not purport to create the illusion that we are in direct confrontation with them. The most natural class of real-world emotions to compare with the relevant affective states is therefore that of emotions experienced towards representations of real-world states of affairs (documentaries and newspapers), not emotions experienced in direct confrontation with the relevant objects. Once this distinction─between emotion in response to direct confrontation and emotion in response to representations of real-world events─is made, it becomes apparent that the lack of distinctive behavioral tendencies in affective states produced by fiction should not count against the claim that they constitute genuine emotions. Granting that it is possible to feel genuine pity, sadness or anger upon reading or hearing the report of a true story, the relevant emotions need not, and typically do not, include a motivational component whose aim is to modify the world in some way. One can read the story of a Jewish family deported in the 1940s, feel deep sadness in response to it, yet lack the desire to act that marks sadness evoked in contexts of direct confrontation. (See Matravers, 1998, Ch. 4, for elaboration.)

The second argument, relying on (something like) the belief requirement, can be rejected by denying the requirement itself. A reason in favor of rejecting the belief requirement rather than the fictional emotions condition is that rejecting the latter leads to a view with the highly revisionary consequence that we never respond to fictions with genuine emotion. This sort of strategy will be introduced in Section 2.b.

If one accepts the claim that fiction can trigger genuine emotions in audiences, but nonetheless does not wish to reject either the belief requirement or the disbelief condition, one can deny that the relevant emotions are directed at fictional entities in the first place. An alternative way to describe our emotions in response to fiction is to say that, although such emotions are caused by fictional entities, they are not directed at them. As Gregory Currie summarizes what is sometimes called the ‘counterpart theory’ (defended in, for instance, Weston, 1975 and Paskins, 1977),

…we experience genuine emotions when we encounter fiction, but their relation to the story is causal rather than intentional; the story provokes thoughts about real people and situations, and these are the intentional objects of our emotions. (1990, 188)

Given that the objects of the relevant emotions─particular people and situations, types of people and situations, the fictional work itself, certain real-world properties inspired by the fictional characters, as well as real human potentialities (Mannison, 1985)─both exist and are believed to exist, the view satisfies the belief requirement. In addition, the view is compatible with the plausible claim that we do not believe in the existence of fictional characters and situations (that is, the disbelief condition). Appealing to surrogate objects is in fact a way to do justice to it.

Despite its elegance and simplicity, this solution to the paradox of fiction has not found many proponents. It manages to capture some of the relevant affective states, but not all of them. More specifically, like the appeal to objectless affective states, it fails to capture those affective states that are experienced as directed at the fictional characters and situations themselves. To many philosophers, it is simply a datum of the phenomenology of our responses to fiction that these are sometimes directed at the characters and situations themselves, and it is precisely these responses that initially called for explanation. “For we do not really weep for the pain that a real person might suffer”, says Colin Radford, “and which real persons have suffered, when we weep for Anna Karenina, even if we should not be moved by her story if it were not of that sort. We weep for her.” (Radford, 1975, 75) A theory of our emotional responses to fiction that appeals to surrogate objects, as a result, owes us an account of why we may be mistaken in thinking otherwise.

b. Denying the Belief Condition

The rejection of the claim that we can respond to fictional characters and situations with genuine emotions, and that such emotions can be genuinely directed at the latter, seems to clash with a rather stable commitment of commonsense. In recent years, the dominant approach has rather been to reject the belief requirement (claim (2)).

One reason why one might deny that the affective states we have in response to fiction are genuine emotions is an acceptance, explicit or implicit, of a certain theory of emotion. According to the so-called ‘cognitive’ theory of emotion (sometimes called ‘judgmentalism’), an emotion necessarily involves the belief or judgment that the object of the emotion instantiates a certain evaluative property (see examples in Lyons, 1980, Solomon, 1993). On this view, an episode of fear would necessarily involve the belief or judgment that one is in danger. This resembles the ‘principle of commonsense’ that Walton speaks of when he claims that “fear must be accompanied by, or must involve, a belief that one is in danger” (1978, 6-7). If emotions are, or necessarily involve, evaluative beliefs about the actual, it is no wonder that they require existential beliefs. The belief that a given big dog constitutes a threat, for instance, seems to entail the belief that the dog exists. Rejecting the cognitive theory, as a result, may seem to be a way to cast doubt on the belief requirement.

The cognitive theory of emotion is widely rejected for well-known reasons. One of them is the counterintuitive consequence that one would be guilty of a radical form of irrationality every time one’s emotions turn out to conflict with one’s judgments, as in cases of phobic fears (Tappolet, 2000, Döring, 2009). Another reason to reject the cognitive theory is that it may lead to the denial that human infants and non-human animals can experience emotions, since they plausibly lack the necessary level of cognitive sophistication (Deigh, 1994). Ultimately, many people seem to think, the theory is too ‘intellectualist’ to be plausible (Goldie, 2000, Robinson, 2005).

It is one thing to reject the cognitive theory of emotion, however, quite another to reject the belief requirement. Although the two views are closely related and are often held together (since the cognitive theory implies the belief requirement, at least understood as a requirement of rationality─see below), the falsity of the cognitive theory does not entail the falsity of the belief requirement. What the latter says is that, no matter how an emotion is to be ultimately defined, the having of one requires that the subject believe that its object exists. As Levinson says,

The sticking point of the paradox of fiction is the dimension of existence and nonexistence, as this connects to the cognitive characterization that emotions of the sort in question minimally require. When we view or conceive an object as having such and such properties, whether or not we strictly believe that it does, we must, on pain of incoherence, be taking said object to exist or be regarding it as existent. For nothing can coherently be viewed or conceived as having properties without at the same time being treated as existent. (1997, 24-25)

It is a mistake, therefore, to think that a simple rejection of the cognitive theory of emotion can solve the paradox of fiction. What one must do is confront the belief requirement directly.

The main way to challenge the belief requirement is to claim that, although our emotional responses to events in the world require the presence of some cognitive states, such states need not be beliefs. On such a view, sometimes called the “thought theory” (Lamarque, 1981, Carroll, 1990), one need not believe that an object O exists in order to be afraid of it; rather, one must at least entertain the thought that O exists (Carroll, 1990, 80), or imagine what O would be like if it existed (Novitz, 1980), or ‘alieve’ O being a certain way (see Gendler, 2008, for the distinction between alief and belief). We surely are capable of responding emotionally as a result of thought processes that fall short of belief, as when we are standing near a precipice and entertaining the thought that we may fall over (Carroll, 1990, 80. Robinson, 2005).

Thoughts, according to Lamarque and Carroll, can generate genuine emotions. On their view, the vivid imagining of one’s lover’s death can produce genuine sadness. It is important, however, to note that what the emotion is about─its intentional object─is not the thought itself, but what the thought is about (its ‘content’). In the case just given, one’s sadness is about one’s lover’s death, and not about the thought that one’s lover may die. Here, as in many everyday instances of emotion, the cause of one’s emotion need not coincide with its object, and the object of one’s emotion need not exist. As a result, the view preserves the claim that we can have genuine emotions directed towards fictional characters and situations.

The thought theory (and its cognates─for example, Gendler’s view) enjoys a great deal of plausibility, in that its claim that existential beliefs are not necessary for emotion in general meshes quite well with recent orthodoxy in both the philosophy and the psychology of emotion (for a review, see Deonna & Teroni, 2012). It nonetheless appears to lead naturally to the claim that there is something utterly irrational, unreasonable, or otherwise wrong, in feeling fear, pity, joy, grief, and so on, both in response to mere thoughts and for things that we know not to exist. If feeling pity for a fictional character is like feeling fear while imagining the possible crash of the plane one is taking (assuming flying is a perfectly safe activity and one knows it), it becomes hard to see how one can be said to be rational in engaging emotionally with fiction, perhaps even where the value of doing so could lie. (See Section 5 below.)

When first presenting the paradox of fiction, Colin Radford decided to accept all three premises of the alleged paradox and declared us ‘irrational’ in being moved by fiction. If one interprets the paradox as a logical paradox, that is, as one in which the claims cannot all be true, then one would be led to the unpalatable thought that the world itself includes some form of incoherence (Yanal, 1994). A more reasonable way to understand Radford’s decision to accept the three premises of the paradox, and in turn to declare us irrational, is to say that, as he understands them, they are not really inconsistent (they can in fact all be true). What interests Radford may not be so much how it is possible for audiences to feel emotions in response to fiction (although he must presuppose an answer to that question) as how it is possible for us to feel such emotions and be rational in doing so. An alternative way to read the belief requirement is, therefore, as a norm of rationality (Joyce, 2000). On such a reading, in order for an emotion to be fully appropriate or rational, it is necessary that the subject believe that the emotion’s object exists. A child’s fear of monsters in his bed at night may be irrational if the child does not believe that monsters in fact exist. Our emotional responses to fiction, although genuine, may be akin to such irrational fears of unreal entities.

It remains to be seen whether the belief requirement formulated as a norm of rationality (of a sort to be elucidated) is acceptable in the context of emotion in general, and in the context of emotion in response to artworks in particular. (See Section 5 below for further details.)

c. Denying the Disbelief Condition: Illusion and Suspension of Disbelief

Among all the strategies that have been adopted in response to the paradox of fiction, the least popular is the one that rejects the disbelief condition. There are two main ways to reject this condition.

The first way to reject the disbelief condition is to claim that, while watching films and reading novels, we are in some way under the illusion that the characters and situations depicted in them are real: we willfully suspend our disbelief in their existence (Coleridge, 1817), or simply forget, temporarily, that they do not exist.

The main problem with this solution is the notable behavioral differences between real-world responses and responses to fiction: if we were really suspending our disbelief, we would behave differently than we actually do in our dealings with fictions. Another problem is that the notion that we can suspend our disbelief at will in the context of fiction seems to presuppose a capacity for voluntary control over our beliefs in the face of opposing evidence─a claim that surely needs a proper defense. (See Doxastic Voluntarism, and Carroll, 1990, Ch. 2, for further discussion.) Finally, it may be partly due to the fact that we are aware that we are dealing with fiction that we can find pleasure, or some worth, in experiencing negative emotions towards fictional entities in the first place. (See section on Art and Negative Emotion: The Paradox of Tragedy.)

Currie (1990, 10) provides an alternative understanding of the notion of willful suspension of disbelief, whereby it is occurrent, as opposed to dispositional, disbelief that should be suppressed. What suspension of disbelief requires, for Currie, is merely that one, while engaged in a work of fiction, not attend to the fact that the story is literally false. On such a view, a proper engagement with fictional stories, while allowing one to believe at some level that these stories are non-actual, requires that one not bring this belief to consciousness. This solution, however, remains rather sketchy and may need to be supplemented with further details about the workings of the relevant mechanism. See Todd, forthcoming, for a proposal (in particular for the notion of bracketed beliefs).

The second way to reject the disbelief condition is by arguing that in some sense we do actually believe what the relevant stories tell us (rather than merely suspend our disbelief in the reality of their content), and therefore that the characters at play in them exist in some way. A plausible way to do so is to say that beliefs need not be directed at propositions about the actual and can sometimes be directed at propositions about what is fictionally the case. For instance, there does not seem to be anything wrong with the utterance “I believe that Sherlock Holmes lives in London”, or with the possibility of a genuine disagreement between the speaker and someone who believes that Holmes lives in Manchester. The fact that the relevant states are about a fictional entity does not give us a reason to deny them the status of beliefs. According to Alex Neill (1993), many of the types of emotion (pity, for example) we experience in response to fiction require the presence of certain beliefs while leaving open the ontological status (actual vs. non-actual) of the entities involved in their content. On this account, what matters for pity, for instance, is not that a given character is believed to have actually existed, but that she is (in the fiction) having a miserable time.

Whether an appeal to belief (as opposed to an appeal to make-believe, imaginings, attention mechanisms, and so on) is the right way to capture the phenomenon is open for debate. Still, this solution not only seems to accommodate some of our intuitions, but also prompts us to further investigate the nature of the kinds of belief, if any, found in the context of an engagement with works of fiction. See Suits (2006) for further discussion.

3. Emotion in Response to Abstract Artworks: Music

Representational artworks are not the only ones capable of eliciting affective states in audiences; non-representational artworks can do so, too. One surely can be moved by Beethoven’s Fifth Symphony or Jackson Pollock’s No. 5, 1948. A problem analogous to the problem of fiction can, however, be found in the case of abstract art. How can it be the case that we are moved by strings of sounds, or configurations of colors, when such entities do not represent anything, strictly speaking? Here, the question of what the object of the relevant emotions might be is even more pressing than for fiction, as in this case there seem to be no characters and situations for an emotion to be about. Focusing on the case of (instrumental or absolute) music, two main questions should therefore be addressed. First, how should we classify our affective responses to music? Second, if such affective responses can be genuine emotions, what are their objects?

a. Moods, Idiosyncratic Responses and Surrogate Objects

As in the case of fiction, there are affective responses to music that may not pose any substantial philosophical problem. Such responses include seemingly non-cognitive or non-intentional states such as moods and feelings. Since moods are not directed at anything in particular, and can be triggered by a variety of things, including events, thoughts, and even certain chemical substances (such as caffeine), they do not constitute a problem in the context of our dealings with musical works. However, can genuine emotions be experienced in response to music? Intuitively, the answer is ‘of course’. The problem is precisely what kinds of emotion these can be and towards what they can be directed.

Moreover, some cases of genuine emotion in response to music do not seem to be philosophically significant either: those that can be explained by idiosyncratic facts about the subject. This is what Peter Kivy calls the ‘our-song phenomenon’ (Kivy, 1999, 4). A piece of music, for instance, may bring to one a host of memories that produce sadness. Here, as elsewhere, the intentional object of the emotion turns out not to be the music itself but what the memories are about (a lost love, say). Such cases are not philosophically puzzling, therefore, because they involve emotions whose object is not (or at least not exclusively) the music or the music’s properties.

Alternatively, one may fear that the loudness of a performance of Vivaldi’s Four Seasons might cause severe auditory damage. In such a case, although the music is the object of the emotion, the emotion is not in any way aesthetically relevant. It is not a response that is made appropriate by the music’s aesthetic properties. Furthermore, it is not a response that everyone is expected to experience in response to Vivaldi’s piece, as opposed to the particular performance of it in a specific context. (By analogy, one may be angry at the sight of a stain on the frame of the Mona Lisa; this, however, does not make anger an appropriate response to the painting as an object of aesthetic appraisal.)

What we need to know is whether there are cases of emotion that can claim to be aesthetically relevant in this intuitive way.

b. Emotions as Responses to Musical Expressiveness

It is commonly thought that musical works can be ‘sad’, ‘happy’, ‘melancholy’, perhaps even ‘angry’. In short, music appears capable of expressing, or being expressive of, emotions. One line of thought is that, just as we can respond to human expressions of emotion with genuine emotions, we can respond to music’s expressive properties with genuine emotions. In the case of human expressions of emotion, one can for instance react to someone’s sadness with genuine sadness, pity, or anger, the person (or the person’s distress) being the intentional object of one’s emotion. Likewise, if music can express sadness, one may respond to it by being sad (Davies, 1994, 279), something that seems to accord with common experience. The problem, of course, is in what sense music can be said to ‘express’ emotion in a way that would cause appropriate corresponding emotions in audiences. (The addition of the word ‘appropriate’ here is important, as one could agree that the relevant properties sometimes cause emotion─just as any (concrete) object can happen to cause emotion─but nonetheless consider them irrelevant to a proper appreciation of the work. For the view that all emotion elicited by musical works is pure distraction, see Zangwill, 2004.)

Of course, it is one thing to ask what musical expressiveness consists in; it is another to wonder about the nature of our affective responses to musical works. One question asks what makes it the case that a piece of music can be said to be ‘sad’, ‘joyful’ or ‘melancholy’; the other asks what makes it the case that we can feel emotions, if any, in response to a piece of music. As a result, nothing prevents one from answering these questions separately. Nevertheless, the various answers one might give to the question about musical expressiveness can plausibly have implications regarding the nature and appropriateness of our emotional responses to music’s expressive properties. If, for instance, the answer we give is that music can express emotions in roughly the same way humans do, then we are likely to count experiences of sadness in response to ‘sad’ music as appropriate. If, by contrast, we take the answer to be that music is expressive in a way analogous to how colors and masks can be expressive, we are likely to consider the affective states the relevant features tend to trigger either inappropriate (when feeling sadness directed at a work’s ‘sadness’) or of a sort that we do not typically find in our responses to paradigmatic expressions of human emotion.

Let’s call the claim that it is appropriate to react emotionally to music’s expressive properties in a way analogous to how it is appropriate to react emotionally to the expression of other people’s emotions the ‘expression-emotion link’ thesis (E), and call the sorts of responses we typically (and appropriately) experience in response to the expression of human emotions─such as pity, sadness, compassion, or sympathy when someone is in a state of distress─‘empathetic emotions’. Accounts of musical expressiveness divide into those that naturally lead to the acceptance of (E) and those that naturally lead to its rejection (and perhaps to the denial that empathetic emotions are ever really felt in the context of an engagement with musical works).

i. Music as Expression of the Composer’s Emotions

The most obvious way to do justice to the thought that music’s expressiveness can appropriately arouse in its audience emotions analogous to those one would feel in response to a real person expressing her emotions is to say that music itself is a medium through which human emotions can express themselves. Music, on one such view, can express the emotions of the composer. So, if when we listen to music we listen to what we take to be the direct expression of the emotions of some real human being, then the usual emotions we can feel towards someone expressing emotion through facial expression, gestures, voice, and so on, are emotions we can feel in response to music. One problematic feature of this view is the highly robust link it posits between the mental states of the artist and the expressive properties of the music he or she produces. A little reflection suggests that it is just not true that whenever a musical work is ‘cheerful’, its composer was cheerful when composing it; a sad composer could have created the work just as well. Moreover, the view does not seem to do justice to our experience of musical works, as we typically do not, and need not, construe them as expressions of human emotion in order to be moved by them; our experience of musical works does not typically go beyond the sounds themselves to the artist who produced them.

ii. Reacting Emotionally to Music’s Persona

One way to preserve the attractive thought that music can genuinely express emotions, and therefore that it may sometimes be appropriate to respond to it with empathetic emotions, is to claim that music can express emotion not by virtue of being produced by an artist experiencing emotion, but by virtue of being imagined as something or someone that expresses emotion. According to Jerrold Levinson (1990), when we listen to music that is expressive of emotion, we hear it as though someone─what he calls the music’s ‘persona’─were expressing the emotion of the music. Of course, the persona is not something that is or ought to be imagined in great detail; for instance, there is no need to attribute any specific physical characteristics to it. All that is needed is that something very much human-like be imagined as expressing human-like emotions. Now, if Levinson is right in claiming that music’s expressive properties are to be explained by appeal to a persona that we imagine to be in the music, then there is less mystery in the possibility of audiences experiencing genuine emotions in response to these properties. At the very least, there may be as much mystery here as in our ability to experience genuine emotions in response to the expression of emotion by fictional characters, and the solution one gives to the paradox of fiction may as a result be expandable so as to include the case of music. See Kivy, 2002, 116-117, for doubts about this strategy.

iii. Contour Accounts of Expressiveness and Appropriate Emotions

Regardless of the appeal of its conclusions, the persona theory relies on a psychological mechanism─imagining hearing a persona in the music─whose nature may need to be further clarified (see Davies, 1997, for problems for the persona account). An account of music’s expressive properties that posits no seemingly mysterious mechanism, and therefore that may seem less controversial, is what is sometimes called the ‘contour’ or ‘resemblance’ theory (examples can be found in Davies, 1994, Kivy, 1989). According to this view, music can be expressive of emotion in virtue of resembling characteristic expressions of human emotion, a resemblance that we seem pretty good at picking out. A heavy, slow-paced piece of music may, for instance, resemble the way a sad, depressed person would walk and behave, and this may be why we are tempted to call it ‘sad’ or ‘depressing’ in the first place. Musical works, on such a view, are analogous to colors, masks, and other things that we may call ‘happy’, ‘sad’, and so on, on the basis of their perceptible properties. A ‘happy’ song is similar to a smiling mask in that they both resemble in some way characteristic behavioral expressions of human emotion.

The contour theorist does not claim that music genuinely expresses emotion in the way that humans paradigmatically do (that is, by first experiencing an emotion, and then expressing it); rather, she claims that music can resemble expressions of human emotion, just as certain other objects can. Given that musical expressiveness is, on this view, different from paradigmatic human expression, we are led to one of two claims: either the emotions we typically experience in response to music’s expressive properties are generally not of the same sort as the emotions we typically experience in response to other people’s expressions of emotion, or, whenever they are, they are inappropriate. The former is a descriptive claim about the nature of our emotional responses to music’s expressive properties; the latter is a normative claim about the sorts of responses music’s expressive properties require, if any. We will consider the descriptive claim in the next section.

Why does the contour theory naturally lead to the normative claim? The answer is that, just as it would be inappropriate to feel ‘empathetic’ emotions in response to the sadness of a mask or the cheerfulness of a color, it would be inappropriate to feel such emotions in response to the sadness, or the cheerfulness, of a piece of music. Unless we believe that what one is sad about when looking at a mask is not really the mask, we should be puzzled by one’s response, for, after all, nobody is (actually or fictionally) sad. (Of course, the mask may produce certain moods in one; this, however, would not constitute a problem, as we have seen.) The sadness, as a result, would count as an inappropriate response to its object.

Something similar may hold in the music case. Although there may be cases where sadness is an appropriate response to a musical work’s ‘sadness’ (for idiosyncratic reasons), this may not be a response that is aesthetically appropriate, that is, appropriate given the work’s aesthetic properties. No matter how often we respond emotionally to music’s expressive properties, these ultimately may count as mere distractions (Zangwill, 2004).

The plausibility of the contour account remains to be assessed. See Robinson, 2005, Ch. 10, for a critique. See also Neill, 1995, for the possibility of a positive role for feelings in attaining a deeper understanding of music’s expressiveness.

c. Music as Object

According to Peter Kivy (1990, 1999), there is an obvious object that can produce genuine emotions in audiences, namely the music itself. On his view, in addition to moods (with qualifications─see Kivy, 2006) and idiosyncratic emotions (the our-song phenomenon), music is capable of eliciting emotions in audiences that are at once genuine (not ‘quasi-emotions’), aesthetically appropriate, and of the non-empathetic sort. When one is deeply moved by a musical work, one may be moved by its beauty, complexity, subtlety, grace, and any other properties predicated of the work (including its expressive properties, construed in terms of the contour account). According to Kivy, these emotions, which can collectively be put under the label ‘being moved’, are the only ones that count as appropriate in the relevant way. Moreover, they are not the ‘empathetic’ responses that were put forward in the aforementioned accounts. When we are moved by ‘sad’ music, what may move us is how beautifully or gracefully the music is expressive of sadness, rather than an alleged persona we may pity. This allows Kivy to explain why we can (rightly) fail to be moved by a musical work that is surely expressive of some emotion but is at the same time mediocre, a phenomenon that is left unexplained on alternative accounts (Kivy, 1999, 12). In addition, the account explains how music can be moving even when it lacks expressive properties altogether (ibid.).

According to Kivy, we are simply mistaken in thinking that whenever we experience a genuine emotion in response to a musical work, this emotion is of the same sort as the emotions we would feel in response to the expression of emotion in other people. It is simply not true, on his account, that music’s sadness produces in audiences a state of sadness (or some other empathetic emotion) whose intentional object is its expressive property (or the entity, real or imagined, that allegedly produced it). What they feel instead is an emotion that, although directed at the music’s sadness, should not be characterized as sadness but rather as ‘exhilaration’, ‘wonder’, ‘awe’, ‘enthusiasm’, or some other non-empathetic emotion. Even if the emotion is felt in response to music’s sadness, and may on occasion even feel like a garden-variety empathetic emotion, Kivy maintains that first appearances may be deceptive.

Whether Kivy’s solution to the original problem (including his rejection of (E)), and its accompanying error theory, are adequate, is a matter of significant debate. (In any case, all parties may agree that it adequately captures some emotions that we can confidently count as appropriate responses to music.)

4. Art and Negative Emotion: The Paradox of Tragedy

Let’s assume that we regularly experience genuine emotions in response to fictional characters and situations, and that among the emotions we commonly experience are paradigmatic ‘negative’ emotions. (If one thinks that music can arouse such emotions as well, the following problem is a problem about music, too.) Now, to the extent that many of us are inclined to regularly pursue an engagement with works of fiction, and thereby with things that tend to produce negative emotions in audiences, we face the problem of explaining how many of us could be motivated to pursue activities that elicit in us such unpleasant states.

The paradox of tragedy (called the ‘paradox of horror’ in the specific context of horror─Carroll, 1990) arises when the three following theses are held simultaneously:

  1. We commonly do not pursue activities that elicit negative emotions.
  2. We often have negative emotions in response to fictional works.
  3. We commonly pursue an engagement with fictional works that we know will elicit negative emotions.

Notice that the so-called ‘paradox’ is not a formal paradox. Unless one defends a strong form of motivational hedonism─namely, the view that all we are motivated to pursue is pleasure and nothing else; see Hedonism─the fact that many of us pursue activities that produce painful experiences is not logically problematic. Rather, the problem is that of explaining why many of us are so eager to seek out an engagement with works of fiction when we know it will result in negative emotions, and negative emotions are things we generally wish to avoid (all other things being equal). Alternatively, there is a problem in explaining the presence of the relevant motivation in the context of fiction when such motivation is arguably lacking in everyday life.

The solutions to the paradox of tragedy can be divided into two broad classes: those that appeal to pleasure in solving the paradox and those that appeal to entities other than pleasure.

a. Pleasure-Centered Solutions

A common way to characterize the paradox of tragedy is by asking how we can derive pleasure from artworks that tend to elicit unpleasant states in their audiences. One solution is to say that the pain that may result from an engagement with fictional works is relatively insignificant or negligible. Another solution is to say that, although the pain that may result from such an engagement can be significant, it is nonetheless compensated for by pleasure derived from either the work or some other source.

i. Negligible Pain

1. Conversion

Various mechanisms have been postulated in the history of philosophy in order to explain how we can sometimes derive pleasure from activities, such as watching tragedies and horror films, that tend to elicit unpleasant experiences in audiences. David Hume provides one such mechanism in his famous essay ‘Of Tragedy’. According to Hume, unpleasant emotional experiences are ‘converted’ into pleasant ones in response to positive aesthetic properties of the work such as the eloquence with which the events of the narrative are depicted. One’s overall initially unpleasant emotional state is thereby converted into a pleasant emotional state thanks to the ‘predominance’ of a positive emotion.

One of the main problems with this proposal is the absence of a clear account of the mechanism by which such conversion is made (Feagin, 1983, 95). Presumably, part of the initial overall experience─the pain─is removed and replaced by another part─the pleasure. This, however, demands explanation as to how such an operation can be performed. If the operation is not that there is first a negative emotion and thereafter a positive emotion that somehow compensates for the first’s unpleasantness (which would make Hume’s view a variant of the ‘compensation’ solution; see below), then what is it precisely?

Another problem with the conversion theory is that it seems to fail to allow for cases where people are motivated to engage with works of fiction that they know will on the whole lead to more pain than pleasure, but that may still provide them with some valuable experience (Smuts, 2007, 64). (See section on Non-Pleasure-Centered Solutions below.)

2. Control

Hume’s theory is meant to account for the possibility of deriving pleasure from an engagement with fictional works eliciting negative emotions. This view assumes that we do tend to experience unpleasant emotions in response to fiction; it is just that the unpleasant part of the emotions is modified in the course of the engagement so as to make the overall experience a pleasant one.

An alternative way to view our engagement with works of fiction is as in fact rarely involving any significantly unpleasant states in the first place. On a recent family of solutions, it is a fact of life that we are able to enjoy, under certain conditions, putative ‘negative’ emotions. According to one version of this solution, the ‘control theory’ (Eaton, 1982, Morreall, 1985), the relevant conditions are those in which we enjoy some suitable degree of control over the situation to which we are attending. The thought is that, whenever we think that the situation is one over which we can have a fair degree of control, the negative emotions that we may feel in response to it will be felt as less painful than they would otherwise be. For instance, the fear that a professional mountain climber, a skydiver, a roller-coaster rider, or a horror film lover may feel may be painful to such an insignificant extent that it turns out to be enjoyable. The reason why the relevant experiences are enjoyable, on this view, is that the subjects are confident that they can handle the situation appropriately. In the fictional case, one can certainly leave the movie theater, or close the book, at any time, and thereby stop being confronted with the work whenever it becomes unbearable (for example, when it depicts extreme violence).

One worry with the present solution is that, in its current form at least, it still leaves it rather mysterious why we are able to derive pleasure from even mildly unpleasant affective states. The control theorist does not deny that the relevant emotions are to some degree unpleasant. The problem is that she fails to provide us with an account as to why we would want to have such experiences in the first place. What’s so enjoyable about feeling a little pain? As Aaron Smuts puts it, “If we feel any pain at all, then the question of why we desire such experiences, why we seek out painful art, is still open.” (Smuts, 2007, 66)

Another worry is that the view may not have the resources to explain why some people, and not others, pursue activities that tend to elicit unpleasant experiences, when everyone enjoys the relevant control (Gaut, 1993, 338). The fact that people have some control over their responses may well be a prerequisite for the possibility of finding pleasure in them, but it does not seem to explain why such pleasure is to be had in the first place. As alternatively put, although confidence that one will have the capacity to exercise some degree of control may be necessary in order to have the relevant motivation, it is surely insufficient. The control theory therefore requires further elaboration.

3. No Essential Valence

One way to solve the paradox of tragedy by relying on the idea that we can enjoy negative emotions is by denying an assumption that was made when we initially posed the problem, namely that the ‘negative’ emotions we feel in response to fiction are necessarily unpleasant states (Walton, 1990, Neill, 1993). One problem with control theories, we saw, is that they fail to explain why we should enjoy even mild instances of pain. The present solution, by contrast, does not claim that the emotions we experience involve any such pain. More generally, the view denies that emotion as such essentially involves valence (as in the valence thesis above). According to Kendall Walton and Alex Neill, when responding emotionally to an undesirable situation, we may confuse the undesirability of the situation with the perceived nature of the emotion, thereby thinking that it is the emotion that is unpleasant. “What is clearly disagreeable”, Walton says, “what we regret, are the things we are sorrowful about─the loss of an opportunity, the death of a friend─not the feeling or experience of sorrow itself.” (Walton, 1990, 257) Neill goes a step further, claiming that when we describe an emotion as painful or unpleasant, we are in fact attributing the relevant properties (painfulness, unpleasantness, and so on) to the situations themselves (Neill, 1992, 62).

The major problem for such a view, of course, is that it seems to radically run counter to common sense in holding that emotions do not in themselves feel good or bad. There certainly appears to be a fairly robust conceptual connection between certain emotion types and felt qualities that can be described as pleasure and pain. Perhaps, however, one could maintain that, although such a conceptual connection exists, it is not so strong as to rule out exceptions. For such a possibility, see Gaut, 1993.

ii. Compensation

The second class of solutions that appeal to pleasure in solving the paradox of tragedy takes the pain involved in our engagement with fiction to be genuine, sometimes even significant, but takes this pain to be compensated for by some pleasure experienced in response to either the work or some other source.

1. Intellectual Pleasure

One general solution to the paradox of tragedy is to say that, although works of fiction can surely elicit unpleasant states in audiences, they can also elicit pleasant ones. An explanation for why we are motivated to engage with such works is that they usually elicit states that are on the whole more pleasant than unpleasant.

A version of this solution is the one defended by Noel Carroll in his book on horror (Carroll, 1990). Carroll argues that monsters, such as vampires and werewolves, often violate our categorial schemes. In doing so, they unsurprisingly can look threatening, scary, and disgusting, explaining in turn why we can experience emotions such as fear and disgust towards them. However, in challenging our categorial schemes, monsters can also trigger our curiosity. Monsters are indeed things that fascinate us. According to Carroll, the pleasure that is derived from our getting to know more about them can compensate for the unpleasant states we may initially experience, particularly when the narration is such as to arouse our curiosity. The pain is for Carroll “the price to be paid for the pleasure of [the monsters’] disclosure” (1990, 184).

One problem with Carroll’s view is that not all horror stories involve monsters, that is, things that challenge our categorial schemes in the way that werewolves do. Psychopaths and serial killers, for instance, do not seem to be monsters in this sense; they are, as Berys Gaut says, “instances of an all-too-real phenomenon” (Gaut, 1993, 334). Carroll, however, could give up his appeal to monsters in particular, and claim rather that the relevant characters in the horror genre have the ability to elicit our curiosity and the resulting intellectual pleasures.

A more serious problem is that Carroll’s explanation does not seem adequate once we go beyond the particular case of horror. The fact that we enjoy watching drama films does not seem to be explainable solely in terms of our curiosity about and fascination with odd characters. One possible response to this worry is that intellectual pleasures need not be derived from the satisfaction of our curiosity, and that such pleasures can rather be derived from a variety of sources, including things that do not elicit our curiosity. One problem with this response is explaining why the relevant pleasures should be exclusively intellectual, rather than (say) both intellectual and affective. (See Affective Mixture.) In addition, the proponent of this solution may need to motivate the thought that it is in virtue of the pleasures they produce that the relevant intellectual states (such as learning about the characters) are valuable to us, as opposed to the thought that such states are intrinsically valuable. Why should it be the pleasure that one derives from learning about the personality of a work’s characters that is the primary motivating factor in our engaging with fiction? Why can it not be the fact that one is able, say, to get a new perspective on what it is to be human, regardless of whether the painful experiences that one may undergo are compensated for by pleasant ones? (See Non-Pleasure-Centered Solutions.)

2. Meta-Response

Susan Feagin (1983), another proponent of the compensatory solution, argues that we are motivated to engage with works of fiction that elicit negative emotions, not because the works themselves elicit pleasant experiences in greater proportion, but because of certain responses that we have towards the negative emotions themselves. In a nutshell, for Feagin, an awareness of the fact that we are the kind of people to experience, for instance, pity for Anna Karenina, results in a pleasurable state that compensates for the painful emotions that we have. It feels good, on this view, to realize that we care about the right things.

This solution suffers from a number of problems. First, it does not appear to be applicable to all the relevant cases, such as horror fictions. Many works of horror fiction certainly do not involve ‘sympathetic’ emotions such as pity, but rather fear and disgust. Furthermore, it does not seem right to say that, when we experience fear and disgust in response to horror, we enjoy the fact that we are the kind of people to feel fear and disgust.

A second problem with Feagin’s view concerns its appeal to the notion of a meta-response. One thing to notice is that it does not sit well with the phenomenology of our engagement with many fictions. When watching a movie involving a serial killer, our attention is mostly focused on the events depicted, and rarely on ourselves; when we feel sad for an innocent character who has been killed, we do not seem both to be sad for her and to enjoy being the kind of person who can be sad for innocent people who have been murdered. Such a thought, it seems, rarely occurs to us. Of course, such self-congratulatory thoughts may occur, and may sometimes contribute to one’s overall enjoyment of the work, but they do not seem necessary for it to be the case that one enjoys fictional works that elicit unpleasant experiences.

3. Catharsis

On one popular understanding of Aristotle’s famous theory of catharsis (see Lucas, 1928), the major motivation behind our desire to watch tragedy is that, by experiencing negative emotions, we are in turn able to expel them (by letting them go away), which somehow provides us with a pleasurable state. It is not, on such a view, the negative emotions that are pleasurable, but the fact that, after having been experienced, they are purged.

One issue this solution fails to address is whether the pleasure that one derives from the termination of an unpleasant state (as when one stops having a toothache after taking medicine) can really compensate for the unpleasantness of the state one was initially in. Moreover, the proponent of the catharsis solution must tell us why the relief that one gets from the extinction of the painful experiences is what typically motivates us to pursue an engagement with the relevant fictional works. The fact that one will suffer for some time but then be ‘healed’ in a way that is very pleasant hardly sounds like a reason to accept the suffering in the first place. (For further reading, see Smuts, 2009, 50-51.)

Perhaps, however, Aristotle meant to capture a different phenomenon by appealing to the notion of purgation. For instance, he could have taken purgation to involve emotions that people unconsciously harbored before engaging with the work, and that the work helps them express, leading to a pleasurable ‘release’ of some inner tension. Despite its possible intuitive appeal, such a solution remains to be developed in a clear and convincing way. (See Aristotle: Poetics for further details.)

4. Affective Mixture

The final pleasure-centered compensatory view that is worth mentioning holds that it does not really matter what kind of experience is supposed to compensate for the pain involved in the negative emotions; what is important is that there be some such experience. For instance, the negative emotions that we feel in response to a drama may be compensated by the joy we experience in realizing that it has a happy ending. Relief may often play a role as well, as when we are relieved that the main character is not going to die after all, something we were afraid would happen throughout the movie or novel.

The view, moreover, does not deny that something like the purgation of certain negative emotions may sometimes elicit pleasurable states. As Patricia Greenspan, a defender of this view, says, “it is not the release of fear that is pleasurable, at least in immediate terms, but the fact that one is soon released from it.” (1988, 32) However, the view denies that this is the only way painful states can be compensated. Such states can be compensated by any positive affective state that is called for by the work: joy, relief, a pleasant sense of immunity, the pleasure of having had a meaningful experience, admiration, and perhaps even self-congratulatory meta-responses. What matters, on this view, is that some pleasant state or other be sufficiently strong to compensate for any unpleasant state the audience may otherwise have.

One advantage of the view is that it readily explains why people sometimes have to close the book they’re reading, or leave the movie theater: they have reached a point where the unpleasant emotions they experience have attained such a level of intensity that they don’t think the work could in any way compensate for them. In addition, the account has a straightforward explanation of why some of us do not like genres, such as horror, that tend to elicit negative emotions in audiences: the positive experiences they have, if any, do not typically compensate for the negative ones.

Another advantage of the present account is its ability to be applied to a wide range of cases, from horror and tragedy to drama and soap opera. Given that it allows a variety of mental states to play the compensatory role, it is less vulnerable to counter-examples.

The main problem with such a view, as with all pleasure-centered views, is simply that not all works of fiction that we are motivated to engage with succeed in compensating for the unpleasant states they elicit in the relevant way and may even leave us utterly depressed or disturbed. For example, Albert Camus’ The Stranger does not seem to elicit many pleasant experiences in readers, at any rate none that are likely to compensate for the pain elicited; many people who fully know what to expect may nevertheless be motivated to read it, suggesting that pleasure is not always what we are trying to get when pursuing an engagement with works of fiction. Pleasure may not be the only thing that negative emotions can buy.

b. Non-Pleasure-Centered Solutions

According to recent proposals, we beg the question if we assume that the paradox of tragedy asks us to explain how we can take pleasure in response to artworks eliciting negative responses in audiences. Unless we are committed to a form of motivational hedonism according to which nothing other than pleasure can account for the pervasiveness of the motivation to engage with works of fiction eliciting negative emotions, the question arises as to whether some other factor(s) may play the relevant motivating role.

We may sometimes find ourselves torn between going to see a movie that, although depressing, may teach us some valuable lessons about life, and doing some less painful activity (such as watching a comedy). If we end up choosing to see the depressing movie, it is clearly not because we are hoping that it will give us more pleasure than pain. The pain may indeed be the cost of gaining something of value, regardless of whether the having of that thing turns out to be pleasurable.

Below are examples of theories that do not take pleasure to be the main, or the only, factor that explains our willingness to engage with works of fiction eliciting negative emotions. Notice that they can be viewed as compensatory theories that do not primarily appeal to pleasure.

i. ‘Relief from Boredom’ and Rich Experience

A clear example of the kind of view under discussion is one that Hume introduces and rejects in his essay on tragedy, and that he attributes to L’Abbé Dubos. According to Dubos, the thing that we most want when we pursue an engagement with tragedy is a ‘relief from boredom’. As life can sometimes be dull, he thinks, we may find it better to feel any emotion, even negative ones, as long as it delivers us from our boredom.

Dubos’ view is attractive for a number of reasons. First, it is very simple in that it explains the motivation we have to engage with tragedy by appealing to a single overarching desire: the desire to avoid boredom. Second, it makes it possible for people to pursue an engagement with works of fiction that they know will not elicit positive experiences that would compensate for the negative ones.

Of course, it is doubtful that a desire to avoid boredom is actually the underlying motivation behind all engagements with tragedy. Even if Dubos’ proposal does not work, it still provides a recipe for constructing a number of plausible views. If the desire to avoid boredom is not the relevant overarching desire, this does not mean that there is no such desire to be found. For instance, it could be that the relevant desire is the desire to have meaningful experiences, whatever their specific nature. According to a recent proposal (Smuts, 2007), when we engage with works of fiction such as tragedies and horror films, what we essentially desire is to get a ‘rich’ experience out of the engagement. This experience, moreover, can be provided by a variety of mental states, including learning about certain characters, having our desire to avoid boredom fulfilled, having a sense of security (since, after all, the events depicted in fictional works are not real), feeling alive, and any other valuable mental state that can be had in the context of an encounter with the work. In addition, such mental states need not be particularly pleasant ones, as when one is made aware of certain nasty truths about human nature.

One might wonder why the focus here is on experience, and what the notion of experience at play is supposed to amount to. It is perfectly conceivable that some consumers of painful fiction are motivated to engage with it for non-experiential reasons. It may not be the fact that they will have valuable experiences per se that motivates them, but the fact that they will acquire certain valuable states whose phenomenology need not be experienced as intrinsically valuable. For instance, one can be reluctant to learn sad truths about the world via literature, but nonetheless pursue this activity because the relevant knowledge is valuable in its own right. Of course, the acquisition of the relevant knowledge may be experienced as meaningful or valuable, but this may not be what one ultimately pursues. One possible response from the rich experience theorist is that, by and large, people (or, perhaps, people who are motivated by distinctively aesthetic or artistic considerations) are predominantly motivated to pursue an engagement with artworks eliciting unpleasant emotions because such works will provide them with some valuable experience. This, however, would prevent the rich experience theorist from generalizing her account to all cases of painful activities that we are motivated to pursue. (See next section.)

ii. Pluralism about Reasons for Emotional Engagement

An alternative way to solve the paradox of tragedy is not by appealing to an overarching desire (though such a desire may ultimately exist), but by simply accepting that there are many motivating reasons for which we pursue an engagement with the relevant fictions. Like the rich experience theory, such a pluralist solution does not deny that pleasure may sometimes be a motivating factor. What matters, however, is that we expect from an engagement with any particular work of fiction more benefits than costs, benefits that can be realized by a variety of things: knowledge (including ‘sad truths’), a sense of being alive, an avoidance of boredom, an appreciation of aesthetic properties, or any other value that can be gained from an interaction with the relevant work. Pluralism about the reasons motivating our engagement with tragedy therefore differs from the rich experience theory in that it remains non-committal regarding the existence of an overarching desire under which all these reasons may fall.

The pluralist solution, notably, not only explains why we are motivated to engage with works of fiction that elicit negative emotions in audiences; it also seems to have the resources to explain why we are motivated to pursue any activity that tends to elicit negative emotions. Why are some people motivated to pursue bungee jumping, mountain climbing, fasting, or any other activity one can think of that involves displeasure to some non-negligible extent? Because, presumably, the activity has some positive value that compensates for the displeasure one may experience, positive value that need not be realized by pleasure. In contrast to the rich experience theory, furthermore, the pluralist solution does not place any constraint on the kind of thing whose positive value may compensate for the displeasure we may experience in the relevant contexts; it does not even have to be a mental state. This is partly what makes an extension of the pluralist solution possible. To take a mundane case, getting one’s wisdom teeth removed may be valuable, and therefore something we may be motivated to pursue despite the unpleasant side-effects, not merely because one will have pleasurable or other valuable experiences in the future, but partly because doing so contributes positively to one’s health.

Whether the paradox of tragedy may ultimately be part of a broader phenomenon, and, if so, whether it should nonetheless be solved in a way that is special to it, are matters for further discussion.

5. The Rationality of Audience Emotion

As hinted at in our discussion of the paradox of fiction, it is one thing to establish that we are capable of experiencing genuine emotions in response to fictional characters and situations (and perhaps in response to expressive properties of musical works, depending on one’s view on the matter); it is quite another to establish that such responses are rational. Colin Radford, let’s recall, attempted to solve the paradox by declaring us irrational in experiencing the relevant emotions. After all, for him, since there is nothing to be afraid of, sad about, or angry at, having such emotions in the context of an encounter with fictional characters cannot be as rational as having them in the context of an encounter with real people. If our emotions in response to artworks fall short of being fully rational, the worry goes, it becomes hard to see where their apparent value could really lie. Are our emotional responses to fiction in any way justifiable?

Let’s distinguish two broad kinds of justification that emotions can be thought to have. On the one hand, emotions can be epistemically justified in the sense that they give us an accurate representation of the world; the fact that emotions can be rational in this way (at least in principle) is a straightforward consequence of their being cognitive or representational states. On the other hand, emotions can be justified by non-epistemic reasons. For instance, a given instance of anger, with its characteristic expressions, can turn out to be useful in scaring potential threats away.

Presumably, when Radford declared us ‘irrational’ in experiencing emotions in response to entities that we believe do not exist, he meant that there is something epistemically wrong with them. By analogy, there is arguably something wrong from an epistemological standpoint with someone who is afraid of a harmless spider precisely because the spider does not constitute a real threat.

One way to deny that emotions in response to fiction (and representational artworks generally) are irrational in the relevant sense is by saying that the epistemic norms with which they ought to comply are in some way different from those at play in the case of belief. Whereas it would be wrong to believe that there is such a person as Sherlock Holmes after having read Conan Doyle’s novels, it may nonetheless be epistemically acceptable to experience admiration for him. The difference between the epistemic norms governing belief and those governing emotion may lie in the fact that the latter put a lower threshold on what counts as adequate evidence than the former. (See Greenspan, 1988, and Gilmore, 2011, for discussion.)

It seems plausible that at least some emotions are epistemically appropriate responses to the content of fictional stories. In many cases, emotions seem required for a proper appreciation and understanding of the work. Reading War and Peace without experiencing any emotion whatsoever hardly makes possible the kind of understanding that is expected of the reader. Furthermore, it appears quite intuitive to say that, without emotion, an appreciation of the aesthetic qualities of certain artworks, such as those possessed by their narrative structure, would be at best highly impoverished. Radford could reply, however, that such emotions are uncontroversial, as they are directed at properties of the work rather than at its fictional content. The problem with this response is that even such emotions may need to rely on prior emotions directed at the content of the work.

Even if the emotions we experience in response to fictional artworks are epistemically irrational, they may still be rational in respects that are not epistemic. The point of departure of such a claim is that there is a sense in which the relevant emotions are not completely irrational; they sometimes make perfect sense. As a result, perhaps their value lies in their enabling us to achieve either non-epistemic goods or knowledge about the actual world (as opposed to some putative fictional world). An example of a non-epistemic good may be what can roughly be called spiritual growth. It is commonly thought that literature can somehow educate its audience, not by giving it information about the world, but by playing with its emotions in a way that is beneficial. It certainly seems right that fictions, by presenting us with novel situations, can condition our emotions by making us feel what we would not have felt in real-life situations. Moreover, fiction may enable its audiences to experience empathetic emotions towards types of characters (and their situations) of which they may have no significant knowledge prior to the engagement with the work, potentially contributing to a certain openness to the unknown. (It is worth noting that this may provide an additional reason to favor the non-pleasure-centered solutions to the paradox of tragedy introduced above.) Additionally, given the capacity of fictions to modify our emotional repertoire, thereby potentially modifying our values, some philosophers have emphasized the genuine possibility of acquiring moral knowledge by means of an engagement with fiction. See Nussbaum, 1992, for a discussion of the ways literature can contribute to moral development.

It is also worth pointing out certain dangers commonly thought to attend our regular emotional engagement with fictional entities. Certain works of fiction, for instance, such as some works of drama, are designed to arouse very intense emotional responses in audiences. Some of us may feel that there is something wrong, indeed highly sentimental, about indulging in such activities. Perhaps this is because a norm of proportionality is at play in our judgment; perhaps some other norm is. In any case, sentimentality is often taken to be a bad thing. See Tanner, 1976, for a classic discussion.

In addition to sentimentality, one may worry that works of fiction have the ability to arouse in audiences ‘sympathetic’ feelings for morally problematic characters (see Feagin, 1983), and may perhaps, in the long run, be detrimental to one’s moral character. Whether this is true, however, is an empirical question that cannot be answered from the armchair.

Whether or not the experience of emotions in response to artworks is on the whole rational remains a very live question, a question that, given the irresistible thought that artworks are valuable partly because of the emotional effect they have on us, is of prime importance. (See Matravers, 2006, for further discussion.)

6. Conclusion

There are at least three main problems that arise at the intersection of emotions and the arts. The first problem is that of explaining how it is possible to experience genuine emotions in response to fictional events and non-representational strings of sounds. The second problem is that of finding a rationale for our putative desire to engage with works of art that systematically trigger negative affective states in us (regardless of whether such states count as genuine emotions). Assuming that works of art do sometimes trigger genuine emotions in us, the third problem is whether the relevant responses can be said to be rational (given that, after all, they are systematically directed at non-existent or non-meaningful entities). We have seen that each of these problems admits of more or less plausible answers that may require a revision of our pre-theoretical beliefs about emotion, on the one hand, and artworks, on the other. Whether a non-revisionary solution to each of these problems is possible, and whether its lack of revisionary implications would give this solution the upper hand over alternative solutions, are questions the reader is encouraged to consider.

It is worth emphasizing that the list of problems this article deals with is far from exhaustive. One promising avenue of research is the thorough study of the relationship between emotions and other mental capacities, including imagination. Of particular interest in this area is the phenomenon of imaginative resistance, whereby an audience is reluctant to engage imaginatively with fictional works that require it to imagine situations that run contrary to its moral beliefs (such as a world where the torture of innocents is morally permissible). Given that imaginative resistance plausibly implies the presence of certain emotions, such as indignation and guilt, and the absence of others, such as joy and pride, we can legitimately ask what the precise relationship is between our reluctance to imagine certain states of affairs and the emotional responses that are often found to accompany it (see, for instance, Moran, 1994). If emotion turned out to play a role in explaining imaginative resistance to fiction, this would give us strong grounds for taking the relationship between emotion, fiction, and morality very seriously.

7. References and Further Reading

  • Budd, M. (1985). Music and the Emotions. Routledge
    • A forceful attack on the major theories of the relationship between music and emotion.
  • Carroll, N. (1990). The Philosophy of Horror. New York: Routledge
    • A classic discussion on the nature of our responses to horror films.
  • Charlton, W. (1970). Aesthetics. London: Hutchinson
    • Dated but useful introduction to aesthetics.
  • Coleridge, S.T. ([1817] 1985). Biographia Literaria, ed. Engell, J. & Bate, W.J. Princeton: Princeton University Press.
  • Currie, G. (1990). The Nature of Fiction. Cambridge: Cambridge University Press
    • A classic treatment of the nature of fiction. Includes a chapter on the paradox of fiction.
  • Davies, S. (1994). Musical Meaning and Expression. Ithaca: Cornell University Press
    • Important contribution to the debate over the nature of musical expressiveness.
  • Davies, S. (1997). “Contra the Hypothetical Persona in Music”, in Hjort, M. & Laver, S. (eds.), Emotion and the Arts. New York: Oxford University Press. 95-109
    • A forceful attack on the persona theory of musical expressiveness.
  • Deigh, J. (1994). “Cognitivism in the Theory of Emotions”, Ethics, 104, 824-854
    • A classic critique of the cognitive theory of emotion.
  • Deonna, J.A. & Teroni, F. (2012). The Emotions: A Philosophical Introduction. New York: Routledge
    • An excellent introduction to the philosophy of emotion.
  • Döring, S. (2009). “The Logic of Emotional Experience: Non-Inferentiality and the Problem of Conflict Without Contradiction”, Emotion Review, 1, 240-247
    • An article exposing a problem for the cognitive theory of emotion.
  • Eaton, M. (1982). “A Strange Kind of Sadness”, Journal of Aesthetics and Art Criticism, 41, 51-63
    • A response to the paradox of tragedy appealing to control.
  • Feagin, S. (1983). “The Pleasures of Tragedy”, American Philosophical Quarterly, 20, 95-104
    • Develops a response to the paradox of tragedy in terms of meta-response.
  • Gaut, B. (1993). “The Paradox of Horror”, British Journal of Aesthetics, 33, 333-345
    • A thorough response to Noel Carroll’s treatment of our responses to works of horror.
  • Gendler, T.S. (2008). “Alief and Belief”, Journal of Philosophy, 105, 10, 634-663
    • Article where the distinction between alief and belief is defended and developed.
  • Gendler, T.S. & Kovakovich, K. (2006). “Genuine Rational Fictional Emotions”, in Kieran, M. (ed.), Contemporary Debates in Aesthetics and the Philosophy of Art. Blackwell. 241-253
    • An article dealing with the paradox of fiction. Argues that our responses to fictional works are rational.
  • Gilmore, J. (2011). “Aptness of Emotions for Fictions and Imaginings”, Pacific Philosophical Quarterly, 92, 4, 468-489
    • An article dealing with the topic of appropriateness of emotions in fictional and imaginative contexts.
  • Goldie, P. (2000). The Emotions: A Philosophical Exploration. Oxford: Oxford University Press
    • A classic text in the philosophy of emotion. Defends the claim that the cognitive theory of emotions is too ‘intellectualist’.
  • Greenspan, P. (1988). Emotions and Reasons. New York: Routledge
    • Explores the relationship between emotions and rationality.
  • Hjort, M. & Laver, S. (eds.) (1997). Emotion and the Arts. New York: Oxford University Press
    • An excellent collection of papers on the relationship between emotions and artworks.
  • Joyce, R. (2000). “Rational Fears of Monsters”, British Journal of Aesthetics, 21, 291-304
    • An article construing the paradox of fiction as a paradox of rationality rather than logic.
  • Kivy, P. (1989). Sound Sentiment: An Essay on the Musical Emotions. Philadelphia: Temple University Press
    • Classic contribution to the philosophy of music, in particular to the topic of emotions in music.
  • Kivy, P. (1990). Music Alone: Philosophical Reflections on the Purely Musical Experience. Ithaca: Cornell University Press
    • Classic contribution to the philosophy of music. Revival of formalism in music theory.
  • Kivy, P. (1999). “Feeling the Musical Emotions”, British Journal of Aesthetics, 39, 1-13
    • An article expressing clearly Kivy’s views on the relationship between music and emotion.
  • Kivy, P. (2002). Introduction to a Philosophy of Music. Oxford: Oxford University Press
    • A good introduction to the philosophy of music, and to Kivy’s views on it.
  • Kivy, P. (2006). “Mood and Music: Some Reflections on Noël Carroll”, Journal of Aesthetics and Art Criticism, 64, 2, 271-281
    • A good place to find Kivy’s views on moods in response to music.
  • Lamarque, P. (1981). “How Can We Fear and Pity Fictions”, British Journal of Aesthetics, 21, 291-304
    • Classic article developing the thought theory as a response to the paradox of fiction.
  • Levinson, J. (1990). Music, Art, and Metaphysics: Essays in Philosophical Aesthetics. Ithaca: Cornell University Press
    • A classic book in analytic aesthetics. Includes a chapter on music and negative emotion.
  • Levinson, J. (1997). “Emotion in Response to Art: A Survey of the Terrain”, in Hjort, M. & Laver, S. (eds.), Emotion and the Arts. New York: Oxford University Press. 20-34
    • A short and useful survey of the various questions and positions about our emotional responses to artworks. Could be used as a supplement to the present article.
  • Lucas, F.L. (1928). Tragedy in Relation to Aristotle’s Poetics. Hogarth
    • Classic treatment of Aristotle’s view on tragedy. Defines catharsis as moral purification.
  • Lyons, W. (1980). Emotion. Cambridge: Cambridge University Press
    • A classic exposition of the cognitive theory of emotion.
  • Mannison, D. (1985). “On Being Moved By Fiction”, Philosophy, 60, 71-87
    • Article on the paradox of fiction. Develops a solution appealing to surrogate objects.
  • Matravers, D. (1998). Art and Emotion. Oxford: Oxford University Press
    • A comprehensive study of the relationship between emotion and art.
  • Matravers, D. (2006). “The Challenge of Irrationalism, and How Not To Meet It”, in Kieran, M. (ed.), Contemporary Debates in Aesthetics and the Philosophy of Art. Blackwell. 254-264
    • An interesting discussion on the rationality and value of emotions in response to works of fiction.
  • Moran, R. (1994). “The Expression of Feeling in Imagination”, Philosophical Review, 103, 1, 75-106
    • Classic article on imagination and the emotions.
  • Morreall, J. (1985). “Enjoying Negative Emotions in Fiction”, Philosophy and Literature, 9, 95-103
    • A response to the paradox of tragedy appealing to control.
  • Neill, A. (1991). “Fear, Fiction and Make-Believe”, Journal of Aesthetics and Art Criticism, 49, 1, 47-56
    • Provides a good discussion of the notion of make-believe.
  • Neill, A. (1992). “On a Paradox of the Heart”, Philosophical Studies, 65, 53-65
    • A critical discussion of Carroll’s solution to the paradox of horror.
  • Neill, A. (1993). “Fiction and the Emotions”, American Philosophical Quarterly, 30, 1-13
    • Proposes a solution to the paradox of fiction in terms of beliefs about what is fictionally the case.
  • Novitz, D. (1980). “Fiction, Imagination, and Emotion”, Journal of Aesthetics and Art Criticism, 38, 279-288
    • Develops a version of the ‘thought’ theory in terms of imagination.
  • Nussbaum, M.C. (1992). Love’s Knowledge: Essays on Philosophy and Literature. Oxford: Oxford University Press
    • Classic text at the intersection of philosophy and literature. Provides a defense of the claim according to which literature can play a positive role in moral philosophy.
  • Paskins, B. (1977). “On Being Moved by Anna Karenina and Anna Karenina”, Philosophy, 52, 344-347
    • Article on the paradox of fiction. Provides a solution in terms of surrogate objects.
  • Radford, C. (1975). “How Can We Be Moved by the Fate of Anna Karenina?”, Proceedings of the Aristotelian Society, supplementary vol. 49, 67-80
    • The classic article where Radford introduces the so-called ‘paradox of fiction’.
  • Radford, C. (1989). “Emotions and Music: A Reply to the Cognitivists”, Journal of Aesthetics and Art Criticism, 47, 1, 69-76
    • Article on emotion and music. Responds to views assuming the cognitive theory of emotion (such as Kivy’s).
  • Ridley, A. (1995). Music, Value, and the Passions. Ithaca, NY: Cornell University Press
    • A thorough study of the relationship between music and emotion. Defense of the claim that the emotions we feel in response to music can positively contribute to a proper understanding of it.
  • Robinson, J. (2005). Deeper than Reason: Emotion and its Role in Literature, Music, and Art. New York: Oxford University Press
    • A comprehensive treatment of the relationship between emotions and artworks. A very good discussion on the nature of emotions included.
  • Smuts, A. (2007). “The Paradox of Painful Art”, Journal of Aesthetic Education, 41, 3, 59-77
    • Excellent discussion on the paradox of tragedy. Puts forward the rich experience theory.
  • Smuts, A. (2009). “Art and Negative Affect”, Philosophy Compass, 4, 1, 39-55
    • Excellent introduction to the paradox of tragedy. Good supplement to the present article.
  • Solomon, R. (1993). The Passions: Emotions and the Meaning of Life. Indianapolis: Hackett
    • Classic text in the philosophy of emotion. Develops a cognitive account of emotion.
  • Suits, D.B. (2006). “Really Believing in Fiction”, Pacific Philosophical Quarterly, 87, 369-386
    • Article on the paradox of fiction. Argues that, while engaging with fictional stories, we do believe in their content.
  • Tappolet, C. (2000). Emotions et valeurs. Paris: Presses Universitaires de France
    • Classic defense of the perceptual theory of emotion (in French).
  • Todd, C. (forthcoming). “Attending Emotionally to Fiction”, Journal of Value Inquiry
    • Article on the paradox of fiction. Proposes a solution appealing to the notion of bracketed beliefs. A plausible version of the suspension of disbelief strategy.
  • Walton, K.L. (1978). “Fearing Fictions”, Journal of Philosophy, 75, 5-27
    • Classic paper on the paradox of fiction. Introduction of the notion of quasi-emotion.
  • Walton, K.L. (1990). Mimesis as Make-Believe. Cambridge, MA: Harvard University Press
    • Classic text on the nature of our engagement with fictions.
  • Walton, K.L. (1997). “Spelunking, Simulation, and Slime: On Being Moved by Fiction”, in Hjort, M. & Laver, S. (eds.), Emotion and the Arts. New York: Oxford University Press. 37-49
    • Article on the paradox of fiction. Worth comparing with Walton’s earlier formulation of his views on the matter.
  • Weston, P (1975). “How Can We Be Moved by the Fate of Anna Karenina? II.” Proceedings of the Aristotelian Society, supplementary vol. 49, 81-93
    • Article on the paradox of fiction. Provides a solution in terms of surrogate objects.
  • Yanal, R.J. (1994). “The Paradox of Emotion and Fiction”, Pacific Philosophical Quarterly, 75, 54-75
    • A good article for an introduction to the paradox of fiction.
  • Zangwill, N. (2004). “Against Emotion: Hanslick Was Right about Music”, British Journal of Aesthetics, 44, 29-43
    • A thorough defense of the view that emotions are aesthetically inappropriate in an engagement with musical works.

Author Information

Hichem Naar
Email: hm.naar@gmail.com
University of Manchester
United Kingdom

Argument

The word “argument” can be used to designate a dispute or a fight, or it can be used more technically. The focus of this article is on understanding an argument as a collection of truth-bearers (that is, the things that bear truth and falsity, or are true and false), some of which are offered as reasons for one of them, the conclusion. This article takes propositions rather than sentences or statements or utterances to be the primary truth-bearers. The reasons offered within the argument are called “premises”, and the proposition that the premises are offered for is called the “conclusion”. This sense of “argument” diverges not only from the above sense of a dispute or fight but also from the formal logician’s sense, according to which an argument is merely a list of statements, one of which is designated as the conclusion and the rest of which are designated as premises regardless of whether the premises are offered as reasons for believing the conclusion. Arguments, as understood in this article, are the subject of study in critical thinking and informal logic courses in which students usually learn, among other things, how to identify, reconstruct, and evaluate arguments given outside the classroom.

Arguments, in this sense, are typically distinguished from both implications and inferences. In asserting that a proposition P implies proposition Q, one does not thereby offer P as a reason for Q. The proposition frogs are mammals implies that frogs are not reptiles, but it is problematic to offer the former as a reason for believing the latter. If an arguer offers an argument in order to persuade an audience that the conclusion is true, then it is plausible to think that the arguer is inviting the audience to make an inference from the argument’s premises to its conclusion. However, an inference is a form of reasoning, and as such it is distinct from an argument in the sense of a collection of propositions (some of which are offered as reasons for the conclusion). One might plausibly think that a person S infers Q from P just in case S comes to believe Q because S believes that P is true and because S believes that the truth of P justifies belief that Q. But this movement of mind from P to Q is something different from the argument composed of just P and Q.

The characterization of argument in the first paragraph requires development since there are forms of reasoning such as explanations which are not typically regarded as arguments even though (explanatory) reasons are offered for a proposition. Two principal approaches to fine-tuning this first-step characterization of arguments are what may be called the structural and pragmatic approaches. The pragmatic approach is motivated by the view that the nature of an argument cannot be completely captured in terms of its structure. In what follows, each approach is described, and criticism is briefly entertained.  Along the way, distinctive features of arguments are highlighted that seemingly must be accounted for by any plausible characterization. The classification of arguments as deductive, inductive, and conductive is discussed in section 3.

Table of Contents

  1. The Structural Approach to Characterizing Arguments
  2. The Pragmatic Approach to Characterizing Arguments
  3. Deductive, Inductive, and Conductive Arguments
  4. Conclusion
  5. References and Further Reading

1. The Structural Approach to Characterizing Arguments

Not any group of propositions qualifies as an argument. The starting point for structural approaches is the thesis that the premises of an argument are reasons offered in support of its conclusion (for example, Govier 2010, p.1; Bassham, G., W. Irwin, H. Nardone, J. Wallace 2005, p.30; Copi and Cohen 2005, p.7; for discussion, see Johnson 2000, p.146ff). Accordingly, a collection of propositions lacks the structure of an argument unless there is a reasoner who puts forward some as reasons in support of one of them. Letting P1, P2, P3, …, and C range over propositions and R over reasoners, a structural characterization of argument takes the following form.

 A collection of propositions, P1, …, Pn, C, is an argument if and only if there is a reasoner R who puts forward the Pi as reasons in support of C.

The structure of an argument is not a function of the syntactic and semantic features of the propositions that compose it. Rather, it is imposed on these propositions by the intentions of a reasoner to use some as support for one of them. Typically in presenting an argument, a reasoner will use expressions to flag the intended structural components of her argument. Typical premise indicators include: “because”, “since”, “for”, and “as”; typical conclusion indicators include “therefore”, “thus”, “hence”, and “so”. Note well: these expressions do not always function in these ways, and so their mere use does not necessitate the presence of an argument.

Different accounts of the nature of the intended support offered by the premises for the conclusion in an argument generate different structural characterizations of arguments (for discussion see Hitchcock 2007). Plausibly, if a reasoner R puts forward premises in support of a conclusion C, then (i)-(iii) obtain. (i) The premises represent R’s reasons for believing that the conclusion is true, and R thinks that her belief in the truth of the premises is justified. (ii) R believes that the premises make C more probable than not. (iii) (a) R believes that the premises are independent of C (that is, R thinks that her reasons for the premises do not include belief that C is true), and (b) R believes that the premises are relevant to establishing that C is true. If we judge that a reasoner R presents an argument as defined above, then by the lights of (i)-(iii) we believe that R believes that the premises justify belief in the truth of the conclusion. In what immediately follows, examples are given to explicate (i)-(iii).

A: John is an only child.

B: John is not an only child; he said that Mary is his sister.

If B presents an argument, then the following obtain. (i) B believes that the premise (that is, Mary is John’s sister) is true, B thinks this belief is justified, and the premise is B’s reason for maintaining the conclusion. (ii) B believes that John said that Mary is his sister makes it more likely than not that John is not an only child, and (iii) B thinks that John said that Mary is his sister is both independent of the proposition that John is not an only child and relevant to confirming it.

A: The Democrats and Republicans don’t seem willing to compromise.

B: If the Democrats and Republicans are not willing to compromise, then the U.S. will go over the fiscal cliff.

B’s assertion of a conditional does not require that B believe either the antecedent or consequent. Therefore, it is unlikely that B puts forward the Democrats and Republicans are not willing to compromise as a reason in support of the U.S. will go over the fiscal cliff, because it is unlikely that B believes either proposition. Hence, it is unlikely that B’s response to A has the structure of an argument, because (i) is not satisfied.

A: Doctor B, what is the reason for my uncle’s muscular weakness?

B: The results of the test are in. Even though few syphilis patients get paresis, we suspect that the reason for your uncle’s paresis is the syphilis he suffered from 10 years ago.

Dr. B offers reasons that explain why A’s uncle has paresis. It is unreasonable to think that B believes that the uncle’s being a syphilis victim makes it more likely than not that he has paresis, since B admits that having syphilis does not make it more likely than not that someone has (or will have) paresis. So, B’s response does not contain an argument, because (ii) is not satisfied.

A: I don’t think that Bill will be at the party tonight.

B: Bill will be at the party, because Bill will be at the party.

Suppose that B believes that Bill will be at the party. Trivially, the truth of this proposition makes it more likely than not that he will be at the party. Nevertheless, B is not presenting an argument. B’s response does not have the structure of an argument, because (iiia) is not satisfied. Clearly, B does not offer a reason for Bill will be at the party that is independent of this proposition itself. Perhaps B’s response is intended to communicate her confidence that Bill will be at the party. By (iiia), a reasoner R puts forward [1] Sasha Obama has a sibling in support of [2] Sasha is not an only child only if R’s reasons for believing [1] do not include R’s belief that [2] is true. If R puts forward [1] in support of [2] and, say, erroneously believes that the former is independent of the latter, then R’s argument would be defective by virtue of being circular. Regarding (iiib), that Obama is U.S. President entails that the earth is the third planet from the sun or it isn’t, but it is plausible to suppose that the former does not support the latter because it is irrelevant to showing that the earth is the third planet from the sun or it isn’t is true.

Premises offered in support of a conclusion are either linked or convergent. This difference marks a structural distinction between arguments.

[1] Tom is happy only if he is playing guitar.
[2] Tom is not playing guitar.
———————————————————————
∴ [3] Tom is not happy.

Suppose that a reasoner R offers [1] and [2] as reasons in support of [3]. The argument is presented in what is called standard form; the premises are listed first and a solid line separates them from the conclusion, which is prefaced by “∴”. This symbol means “therefore”. Premises [1] and [2] are linked because they do not support the conclusion independently of one another; that is, they support the conclusion jointly. It is unreasonable to think that R offers [1] and [2] individually, as opposed to collectively, as reasons for [3]. The following representation of the argument depicts the linkage of the premises.

[1] + [2]
—————————
    ↓
   [3]

Combining [1] and [2] with the plus sign and underscoring them indicates that they are linked. The arrow indicates that they are offered in support of [3]. To see a display of convergent premises, consider the following.

[1] Tom said that he didn’t go to Samantha’s party.
[2] No one at Samantha’s party saw Tom there.
——————————————————————————
∴ [3] Tom did not attend Samantha’s party.

These premises are convergent, because each is a reason that supports [3] independently of the other. The diagram below represents this.

[1]     [2]
  ↘     ↙
    [3]

An extended argument is an argument with at least one premise that a reasoner attempts to support explicitly. Extended arguments are more structurally complex than ones that are not extended. Consider the following.

The keys are either in the kitchen or the bedroom. The keys are not in the kitchen. I did not find the keys in the kitchen. So, the keys must be in the bedroom. Let’s look there!

The argument in standard form may be portrayed as follows:

[1] I just searched the kitchen and I did not find the keys.
—————————————————————————————
∴ [2] The keys are not in the kitchen.
[3] The keys are either in the kitchen or the bedroom.
————————————————————————————
∴ [4] The keys are in the bedroom.

      [1]
       ↓
[2] + [3]
—————————
    ↓
   [4]

Note that although the keys being in the bedroom is a reason for the imperative, “Let’s look there!” (given the desirability of finding the keys), this proposition is not “truth apt” and so is not a component of the argument.

An enthymeme is an argument which is presented with at least one component that is suppressed.

A: I don’t know what to believe regarding the morality of abortion.

B: You should believe that abortion is immoral. You’re a Catholic.

That B puts forward [1] A is a Catholic in support of [2] A should believe that abortion is immoral suggests that B implicitly puts forward [3] all Catholics should believe that abortion is immoral in support of [2]. Proposition [3] may plausibly be regarded as a suppressed premise of B’s argument. Note that [1] and [3] are linked. A premise that is suppressed is never a reason for a conclusion independent of another premise explicitly offered for that conclusion.

There are two main criticisms of structural characterizations of arguments. One criticism is that they are too weak because they turn non-arguments such as explanations into arguments.

A: Why did this metal expand?

B: It was heated and all metals expand when heated.

B offers explanatory reasons for the explanandum (what is explained): this metal expanded. It is plausible to see B offering these explanatory reasons in support of the explanandum. The reasons B offers jointly support the truth of the explanandum, and thereby show that the expansion of the metal was to be expected. It is in this way that B’s reasons enable A to understand why the metal expanded.

The second criticism is that structural characterizations are too strong. They rule out as arguments what intuitively seem to be arguments.

A: Kelly maintains that no explanation is an argument. I don’t know what to believe.

B: Neither do I. One reason for her view may be that the primary function of arguments, unlike explanations, is persuasion. But I am not sure that this is the primary function of arguments. We should investigate this further.

B offers a reason, [1] the primary function of arguments, unlike explanations, is persuasion, for the thesis [2] no explanation is an argument. Since B asserts neither [1] nor [2], B does not put forward [1] in support of [2]. Hence, by the above account, B’s reasoning does not qualify as an argument. A contrary view is that arguments can be used in ways other than showing that their conclusions are true. For example, arguments can be constructed for purposes of inquiry and as such can be used to investigate a hypothesis by seeing what reasons might be given to support a given proposition (see Meiland 1989 and Johnson and Blair 2006, p.10). Such arguments are sometimes referred to as exploratory arguments.  On this approach, it is plausible to think that B constructs an exploratory argument [exercise for the reader: identify B’s suppressed premise].

Briefly, in defense of the structuralist account of arguments, one response to the first criticism is to bite the bullet and follow those who think that at least some explanations qualify as arguments (see Thomas 1986, who argues that all explanations are arguments). Given that there are exploratory arguments, the second criticism motivates either liberalizing the concept of support that premises may provide for a conclusion (so that, for example, B may be understood as offering [1] in support of [2]) or dropping the notion of support altogether in the structural characterization of arguments (for example, a collection of propositions is an argument if and only if a reasoner offers some as reasons for one of them; see Sinnott-Armstrong and Fogelin 2010, p.3).

2. The Pragmatic Approach to Characterizing Arguments

The pragmatic approach is motivated by the view that the nature of an argument cannot be completely captured in terms of its structure. In contrast to structural definitions of arguments, pragmatic definitions appeal to the function of arguments. Different accounts of the purposes arguments serve generate different pragmatic definitions of arguments. The following pragmatic definition appeals to the use of arguments as tools of rational persuasion (for definitions of argument that make such an appeal, see Johnson 2000, p.168; Walton 1996, p.18ff; Hitchcock 2007, p.105ff).

A collection of propositions is an argument if and only if there is a reasoner R who puts forward some of them (the premises) as reasons in support of one of them (the conclusion) in order to rationally persuade an audience of the truth of the conclusion.

One advantage of this definition over the previously given structural one is that it offers an explanation of why arguments have the structure they do. In order to rationally persuade an audience of the truth of a proposition, one must offer reasons in support of that proposition. The appeal to rational persuasion is necessary to distinguish arguments from other forms of persuasion such as threats. One question that arises is: What obligations does a reasoner incur by virtue of offering supporting reasons for a conclusion in order to rationally persuade an audience of the conclusion? One might think that such a reasoner should be open to criticisms and obligated to respond to them persuasively (see Johnson 2000, p.144, for development of this idea). By appealing to the aims that arguments serve, pragmatic definitions highlight the acts of presenting an argument in addition to the arguments themselves. The field of argumentation, an interdisciplinary field that includes rhetoric, informal logic, psychology, and cognitive science, highlights acts of presenting arguments and their contexts as topics for investigation that inform our understanding of arguments (see Houtlosser 2001 for discussion of the different perspectives of argument offered by different fields).

For example, the acts of explaining and arguing—in the sense highlighted here—have different aims. Whereas the act of explaining is designed to increase the audience’s comprehension, the act of arguing is aimed at enhancing the acceptability of a standpoint. This difference in aim makes sense of the fact that in presenting an argument the reasoner believes that her standpoint is not yet acceptable to her audience, whereas in presenting an explanation the reasoner knows or believes that the explanandum is already accepted by her audience (see van Eemeren and Grootendorst 1992, p. 29, and Snoeck Henkemans 2001, p. 232). These observations about the acts of explaining and arguing motivate the above pragmatic definition of an argument and suggest that arguments and explanations are distinct things. It is generally accepted that the same line of reasoning can function as an explanation in one dialogical context and as an argument in another (see Groarke and Tindale 2004, p. 23ff for an example and discussion). Van Eemeren, Grootendorst, and Snoeck Henkemans 2002 delivers a substantive account of how the evaluation of various types of arguments turns on considerations pertaining to the dialogical contexts within which they are presented and discussed.

Note that, since the pragmatic definition appeals to the structure of propositions in characterizing arguments, it inherits the criticisms of structural definitions. In addition, the question arises whether it captures the variety of purposes arguments may serve. It has been urged that arguments can aim at engendering any one of a full range of attitudes towards their conclusions (for example, Pinto 1991). For example, a reasoner can offer premises for a conclusion C in order to get her audience to withhold assent from C, suspect that C is true, believe that it is merely possible that C is true, or be afraid that C is true.

The thought here is that these are alternatives to convincing an audience of the truth of C. A proponent of a pragmatic definition of argument may grant that there are uses of arguments not accounted for by her definition, and propose that the definition is stipulative. But then a case needs to be made for why theorizing about arguments from a pragmatic approach should be anchored to such a definition when it does not reflect all legitimate uses of arguments. Another line of criticism rejects the pragmatic approach's assumption that arguments themselves have a function (Goodwin 2007), arguing instead that the function of persuasion should be assigned to the dialogical contexts in which arguments take place (Doury 2011).

3. Deductive, Inductive, and Conductive Arguments

Arguments are commonly classified as deductive or inductive (for example, Copi and Cohen 2005, Sinnott-Armstrong and Fogelin 2010). A deductive argument is an argument that an arguer puts forward as valid. For a valid argument, it is not possible for the premises to be true with the conclusion false. That is, necessarily, if the premises are true, then the conclusion is true. Thus we may say that the truth of the premises in a valid argument guarantees that the conclusion is also true. The following is an example of a valid argument: Tom is happy only if the Tigers win; the Tigers lost; therefore, Tom is definitely not happy.
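For simple propositional arguments, validity can be checked mechanically: an argument is valid exactly when no assignment of truth values makes every premise true while the conclusion is false. The following sketch (an illustration added here, not part of the original text) tests the Tom/Tigers example, formalized as modus tollens, by brute force over all truth assignments:

```python
from itertools import product

# Formalize the example: H = "Tom is happy", W = "the Tigers win".
# Premise 1: "Tom is happy only if the Tigers win"  ->  H implies W
# Premise 2: "the Tigers lost"                      ->  not W
# Conclusion: "Tom is not happy"                    ->  not H

def implies(p, q):
    return (not p) or q

def is_valid(premises, conclusion, n_vars):
    """An argument is valid iff no truth assignment makes every
    premise true while the conclusion is false."""
    for assignment in product([True, False], repeat=n_vars):
        if all(p(*assignment) for p in premises) and not conclusion(*assignment):
            return False  # found a counterexample
    return True

premises = [lambda h, w: implies(h, w), lambda h, w: not w]
conclusion = lambda h, w: not h

print(is_valid(premises, conclusion, 2))  # prints True: the argument is valid
```

By contrast, the same checker reports the fallacy of denying the antecedent (H implies W; not H; therefore not W) as invalid, since the assignment H false, W true makes its premises true and its conclusion false.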

A step-by-step derivation of the conclusion of a valid argument from its premises is called a proof. In the context of a proof, the given premises of an argument may be viewed as initial premises. The propositions produced at the steps leading to the conclusion are called derived premises. Each step in the derivation is justified by a principle of inference. Whether the derived premises are components of a valid argument is a difficult question that is beyond the scope of this article.   

An inductive argument is an argument that an arguer puts forward as inductively strong. In an inductive argument, the premises are intended only to be so strong that, if they were true, then it would be unlikely, although possible, that the conclusion is false. If the truth of the premises makes it unlikely (but not impossible) that the conclusion is false, then we may say that the argument is inductively strong. The following is an example of an inductively strong argument: 97% of the Republicans in town Z voted for McX; Jones is a Republican in town Z; therefore, Jones voted for McX.

In an argument like this, an arguer often will conclude “Jones probably voted for McX” instead of “Jones voted for McX,” because they are signaling with the word “probably” that they intend to present an argument that is inductively strong but not valid.
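One way to see why the premises make the conclusion likely but not guaranteed is to model "Jones" as a randomly sampled Republican from town Z. The following simulation is a hypothetical model introduced purely for illustration; the 97% figure is the only input taken from the example:

```python
import random

random.seed(0)

# Illustrative model: "Jones" is a randomly chosen Republican in
# town Z, where 97% of Republicans voted for McX.
P_VOTED_MCX = 0.97

def conclusion_true():
    """True iff the sampled Republican voted for McX."""
    return random.random() < P_VOTED_MCX

trials = 100_000
rate = sum(conclusion_true() for _ in range(trials)) / trials

# The premises make the conclusion likely but not certain:
print(f"conclusion true in ~{rate:.0%} of sampled cases")
```

The observed rate hovers near 97%: the conclusion is false in roughly 3% of cases, which is exactly the sense in which an inductively strong argument, unlike a valid one, leaves the conclusion unlikely to be false rather than impossible to be false.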

In order to evaluate an argument it is important to determine whether it is deductive or inductive. It is inappropriate to criticize an inductively strong argument for being invalid. Based on the above characterizations, whether an argument is deductive or inductive turns on whether the arguer intends the argument to be valid or merely inductively strong, respectively. Sometimes the presence of certain expressions such as ‘definitely’ and ‘probably’ in the above two arguments indicates the relevant intentions of the arguer. Charity dictates that an invalid argument which is inductively strong be evaluated as an inductive argument unless there is clear evidence to the contrary.

Conductive arguments have been put forward as a third category of arguments (for example, Govier 2010). A conductive argument is an argument whose premises are convergent; the premises count separately in support of the conclusion. If one or more premises were removed from the argument, the degree of support offered by the remaining premises would stay the same. The previously given example of an argument with convergent premises is a conductive argument. The following is another example of a conductive argument. It most likely won’t rain tomorrow. The sky is red tonight. Also, the weather channel reported a 30% chance of rain for tomorrow.

The primary rationale for distinguishing conductive arguments from deductive and inductive ones is as follows. First, the premises of conductive arguments are always convergent, but the premises of deductive and inductive arguments are never convergent. Second, the evaluation of arguments with convergent premises requires not only that each premise be evaluated individually as support for the conclusion, but also that the degree to which the premises support the conclusion collectively be determined. This second consideration militates against treating conductive arguments merely as a collection of subarguments, each of which is deductive or inductive. The basic idea is that the support that the convergent premises taken together provide the conclusion must be considered in the evaluation of a conductive argument. With respect to the above conductive argument, ‘the sky is red tonight’ and ‘the weather channel reported a 30% chance of rain for tomorrow’ are offered together as (convergent) reasons for ‘it most likely won’t rain tomorrow’. Perhaps collectively, but not individually, these reasons would persuade an addressee that it most likely won’t rain tomorrow.

4. Conclusion

A group of propositions constitutes an argument only if some are offered as reasons for one of them. Two approaches to identifying the definitive characteristics of arguments are the structural and pragmatic approaches. On both approaches, whether an act of offering reasons for a proposition P yields an argument depends on what the reasoner believes regarding both the truth of the reasons and the relationship between the reasons and P. A typical use of an argument is to rationally persuade its audience of the truth of the conclusion. To be effective in realizing this aim, the reasoner must think that there is real potential in the relevant context for her audience to be rationally persuaded of the conclusion by means of the offered premises. What, exactly, this presupposes about the audience depends on what the argument is and the context in which it is given. An argument may be classified as deductive, inductive, or conductive. Its classification into one of these categories is a prerequisite for its proper evaluation.

5. References and Further Reading

  • Bassham, G., W. Irwin, H. Nardone, and J. Wallace. 2005. Critical Thinking: A Student’s Introduction, 2nd ed. New York: McGraw-Hill.
  • Copi, I. and C. Cohen 2005. Introduction to Logic 12th ed. Upper Saddle River, NJ: Prentice Hall.
  • Doury, M. 2011. “Preaching to the Converted: Why Argue When Everyone Agrees?” Argumentation 26(1): 99-114.
  • Eemeren F.H. van, R. Grootendorst, and F. Snoeck Henkemans. 2002. Argumentation: Analysis, Evaluation, Presentation. Mahwah, NJ: Lawrence Erlbaum Associates.
  • Eemeren F.H. van and R. Grootendorst. 1992. Argumentation, Communication, and Fallacies: A Pragma-Dialectical Perspective. Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Goodwin, J. 2007. “Argument has no function.” Informal Logic 27 (1): 69–90.
  • Govier, T. 2010. A Practical Study of Argument, 7th ed. Belmont, CA: Wadsworth.
  • Govier, T. 1987. “Reasons Why Arguments and Explanations are Different.” In Problems in Argument Analysis and Evaluation, Govier 1987, 159-176. Dordrecht, Holland: Foris.
  • Groarke, L. and C. Tindale 2004. Good Reasoning Matters!: A Constructive Approach to Critical Thinking, 3rd ed. Oxford: Oxford University Press.
  • Hitchcock, D. 2007. “Informal Logic and The Concept of Argument.” In Philosophy of Logic. D. Jacquette 2007, 101-129. Amsterdam: Elsevier.
  • Houtlosser, P. 2001. “Points of View.” In Critical Concepts in Argumentation Theory, F.H. van Eemeren 2001, 27-50. Amsterdam: Amsterdam University Press.
  • Johnson, R. and J. A. Blair 2006. Logical Self-Defense. New York: International Debate Education Association.
  • Johnson, R. 2000. Manifest Rationality. Mahwah, NJ: Lawrence Erlbaum Associates.
  • Kasachkoff, T. 1988. “Explaining and Justifying.” Informal Logic X, 21-30.
  • Meiland, J. 1989. “Argument as Inquiry and Argument as Persuasion.” Argumentation 3, 185-196.
  • Pinto, R. 1991. “Generalizing the Notion of Argument.” In Argument, Inference and Dialectic, R. Pinto (2010), 10-20. Dordrecht, Holland: Kluwer Academic Publishers. Originally published in van Eemeren, Grootendorst, Blair, and Willard, eds. Proceedings of the Second International Conference on Argumentation, vol. 1A, 116-124. Amsterdam: SICSAT.
  • Pinto, R. 1995. “The Relation of Argument to Inference.” In Pinto (2010), 32-45.
  • Sinnott-Armstrong, W. and R. Fogelin. 2010. Understanding Arguments: An Introduction to Informal Logic, 8th ed. Belmont, CA: Wadsworth.
  • Skyrms, B. 2000. Choice and Chance, 4th ed. Belmont, CA: Wadsworth.
  • Snoeck Henkemans, A.F. 2001. “Argumentation, explanation, and causality.” In Text Representation: Linguistic and Psycholinguistic Aspects, T. Sanders, J. Schilperoord, and W. Spooren, eds. 2001, 231-246. Amsterdam: John Benjamins Publishing.
  • Thomas, S.N. 1986. Practical Reasoning in Natural Language. Englewood Cliffs, NJ: Prentice Hall.
  • Walton, D. 1996. Argument Structure: A Pragmatic Theory. Toronto: University of Toronto Press.

Author Information

Matthew McKeon
Email: mckeonm@msu.edu
Michigan State University
U. S. A.

Gianni Vattimo (1936–2023)

Gianni Vattimo was an Italian philosopher and cultural commentator. He studied in Turin, Italy with Luigi Pareyson, and in Heidelberg under Hans-Georg Gadamer. Central to Vattimo’s philosophy are the existentialist and proto-postmodernist influences of Nietzsche, Heidegger, Gadamer and Kuhn. He also became prominent outside of philosophical circles through his political activism in supporting gay rights and from his position as a Member of the European Parliament. His ideas have had a wide-ranging influence across disciplines such as feminism, theology, sexuality studies, and globalisation.

In his philosophy Vattimo explores the relationship between postmodernism and nihilism, treating nihilism affirmatively rather than as something to be overcome. Vattimo draws upon both the theoretical work of Nietzsche and Heidegger, and the ‘signs of the times’. By the latter Vattimo refers to the social and political pluralism and the absence of metaphysical foundations that he believes characterise late modernity, a term Vattimo uses to refer to advanced societies in their present state to show their connection to modernity; the term ‘postmodern’ implies more discontinuity than Vattimo would like. Vattimo thinks that in developed societies there is a ‘plurality of interpretations’ because through the media and ever-increasing movement of peoples it is no longer possible to believe in one dominant way of seeing the world. The freeing-up of these interpretations is possible, Vattimo thinks, because one can no longer plausibly conceive Being as a foundation, that is, of the universe as a rational metaphysically-ordered system of causes and effects. In turn, the lack of plausibility in foundationalism is due to the ‘event’ of the death of God (both the idea of Being as foundation and the event of the death of God shall be explained in due course).  A significant portion of Vattimo’s work is devoted to explaining how the only plausible late-modern, Western philosophical outlook is ‘hermeneutical nihilism’. Broadly, this is the view that ‘there are no facts, only interpretations’, to use a phrase from Nietzsche’s unpublished notebooks (Colli and Montinari, 1967, VIII.1, 323, 7 [60]). Vattimo also investigates the implications of this position for religion, politics, ethics, art, technology, and the media.

Vattimo is well known for his philosophical style of ‘weak thought’ (pensiero debole). ‘Weak thought’ is an attempt to understand and re-configure traces from the history of thought in ways that accord with postmodern conditions. In doing so, the aim of ‘weak thought’ is to create an ethic of ‘weakness’. Vattimo’s efforts to create a postmodern ethic are closely related to his return to religion from the late 1980s onwards, a significant point in his own intellectual development. His renewed interest in religion has also acted as a blueprint for a return to philosophical engagement with, and commitment to, Communism.

Table of Contents

  1. Life
  2. The End of History
  3. Hermeneutical Nihilism
    1. The End of History, Nihilism, and Nietzsche’s ‘Death of God’
    2. Heidegger: Being, Technology and Nihilism
    3. Ontology, ‘Exchange Value’ as Language, and Verwindung as Resignation
    4. ‘Weak Thought,’ Hermeneutics and Verwindung as Convalescence-Alteration
  4. Return to Religion
    1. Reasons behind Vattimo’s Return to Religion
    2. The Philosophical Importance of the Christian Message in Vattimo’s Thought
  5. Ethics
  6. Political Philosophy
  7. References and Further Reading
    1. Works by Vattimo
    2. Works on Vattimo
    3. Other Works Cited

1. Life

Gianni Vattimo was born on January 4, 1936, in Turin, Italy. He was sent to an oratory school as a child. This strongly Catholic environment led to him becoming involved with Catholic youth groups such as Azione Cattolica. After completing his schooling, Vattimo studied at the University of Turin under Luigi Pareyson. He graduated in 1959 with a thesis on Aristotle that was published in 1961. To fund his studies Vattimo found employment as a television host and at a local high school. At the same time, Vattimo was working increasingly closely with Pareyson. Throughout this period Vattimo was also involved in activism, including protests against South African apartheid.

Vattimo has stated that he stopped being a Catholic when, after having gone to study in Germany, he ‘no longer read the Italian newspapers’ (Vattimo and Paterlini, 2009: 27). By this remark, he sought to imply that Catholicism and Italian culture were closely linked at the time. In 1963, Vattimo had taken up a two-year Humboldt Fellowship and was living in Heidelberg, Germany, studying under Karl Löwith and Hans-Georg Gadamer. Returning to Turin after his fellowship ended, Vattimo took up a position as adjunct professor at the university in 1964 to teach aesthetics, especially those of Heidegger. In 1968 Vattimo became a full professor of aesthetics at the University of Turin. In 1969, Vattimo finished his translation of Gadamer’s Truth and Method into Italian (it was published in 1970). During the 1970s Vattimo published many books, including his personal favourite, the 1974 work Il soggetto e la maschera (‘The subject and the mask’).

To Pareyson’s dismay, Vattimo became a Maoist after reading Mao’s works while in hospital in 1968. However, Vattimo was not seen as revolutionary enough by some groups. In 1978, the Red Brigades threatened Vattimo, spreading information to the media about Vattimo’s homosexuality. Moreover, some of Vattimo’s students were involved in terrorism around this time. When Vattimo received letters from some of his imprisoned students, he realised that they were attempting to justify their actions on metaphysical grounds. These events contributed to Vattimo reconsidering his own theoretical position. The fruit of this reflection was Vattimo’s notion of ‘weak thought’: that the history of Western metaphysics is a history of the weakening of strong structures (epistemic structures that purport to provide thinking with firm principles and criteria for judgement), and that philosophy should be an ‘adventure of difference’. Through these claims, Vattimo attempts to express the view that one should not aim for fixed philosophical solutions or for certainty with regard to knowledge. Rather, one should embrace the endless play of interpretations constitutive of late modernity. During this period Vattimo wrote some of his most well-known books, such as The End of Modernity and The Transparent Society.

There are recurrent themes and beliefs in Vattimo’s life, especially the philosophies of Nietzsche and Heidegger, Communism, and religion. The latter came back into Vattimo’s life in a significant way from the late 1980s and 1990s. His faith is difficult to categorise, being a non-dogmatic and highly idiosyncratic form of the Catholicism with which he was brought up. Vattimo ‘thanks God’ that he is an atheist, says that he ‘believes that he believes’, and has closely related his faith to his philosophy of weak thought. He also returned to the Communism of his younger years, albeit in a manner similarly ‘weakened’. Aside from his theoretical output, Vattimo was active politically and was elected as a Member of the European Parliament in 1999. After retiring from his university post in Turin in 2008, he continued to publish prolifically. He died in Turin in 2023.

2. The End of History

Vattimo contends that the Western postmodern experience is that of the end of history. By this he means that we can no longer view the past as having a unilinear character. By ‘unilinear character’, Vattimo means a way of writing and viewing history that sees it as a single train of events, often working towards a goal and privileging one interpretation of the past. Vattimo argues that there is no longer a coherent narrative which is accepted in the West. The typical modern narrative was one of progress, whether this concerned scientific and technological innovation, increasing freedom, or even a Marxist interpretation of history. For this narrative to be coherent it must view the past in terms of cause and effect. It must see that which has happened before as determining the present, and therefore determining the future. According to Vattimo, history loses its unilinear character in three principal ways: theoretically, demographically, and through the rise of the society of generalised communication. For the first way, Vattimo turns to Walter Benjamin’s essay ‘Theses on the Philosophy of History’ (1940), in which Benjamin argues that unilinear history is a product of class conflict. The powerful — kings, emperors, nobles — make history in a manner denied to the poor. Vattimo acknowledges that Benjamin was speaking from a nascent tradition, already begun by Marx and Nietzsche, of seeing history as constructed and not impartial. Given the selective, power-laden nature of unilinear history, Vattimo surmises that it would be mistaken to think there is only one true history. Such a realisation has profound consequences for the idea of progress. If there is not one unique history but many histories, then there is no one clear goal throughout historical development. This implication applies as much to sacred eschatology as to secular Marxist hopes of world revolution and of the realisation of a classless society.
In modern Europe, where the unilinear conception of history had flourished, demographic effects have acted to undermine this very notion. In particular, mass immigration has led to a greater prominence of alternative histories.

Furthermore, the rebellion of previously ruled peoples is a common theme in history. However, such rebellion becomes postmodern in the context of the age of mass communication and the after-effects of the two World Wars. Of course, a hallmark of the Reformation was the importance of the printed word. Nevertheless, it did not facilitate the en masse expression and preservation of alternative viewpoints as do radio, television and, most significantly, the internet. Hence for Vattimo the advent of the society of mass communication is the third major component of the ‘end of history’ and the start of postmodernity. Vattimo proposes:

(a) that the mass media play a decisive role in the birth of a postmodern society; (b) that they do not make this postmodern society more ‘transparent’, but more complex, even chaotic; and finally (c) that it is in precisely this relative ‘chaos’ that our hopes for emancipation lie (Vattimo, 1992: 4).

The Transparent Society, in which Vattimo most clearly outlined his ideas on the end of history, was written shortly before the mass uptake of the internet by Western consumers. Nevertheless, Vattimo’s analysis of mass communication applies even more strongly in light of the effects of widespread internet use. While alternative television and radio stations gave voices to more groups, Twitter, Facebook, blogs and web forums go further by allowing anyone with minimal access to technology to express their worldview. Vattimo acknowledges that this view of the effect of the culture of mass communication is in contrast to the positions of Adorno, Horkheimer, and Orwell. These thinkers predicted that a homogenisation of society would be the result of such communications technology. Additionally, following his reading of Nietzsche, Vattimo only believes in the possibility of interpretations, rather than facts. Nevertheless, he takes great pains to show that his diagnosis of the situation of late modernity is a cogent interpretation. In particular, he claims that it makes the best possible sense of the interpretative plurality he sees around him.

For Vattimo, freedom of information and media multiplicity eliminate the possibility of conceiving of a single reality. This has epistemological consequences, since the plurality of histories and voices in the age of mass communication brings multiple rationalities and anthropologies to the fore. This undermines the possibility of constructing knowledge on certain foundations. Thus, the tendency to universalise and impose a single view of how the world is ordered on others is weakened. As a result, Vattimo sees in late modernity the realisation of Nietzsche’s prophecy of the world becoming a fable. That is, Vattimo considers it impossible to find objective reality among images received from the media: there is no way to step outside, or be an impartial spectator, of these images. The dissolution of the unilinear conception of history, and its implications for modern views on knowledge and reality, liberates differences by allowing local rationalities to come to the fore.

3. Hermeneutical Nihilism

a. The End of History, Nihilism, and Nietzsche’s ‘Death of God’

Vattimo argues that the implications for philosophy of the end of history, and therefore of modernity, are profound. The postmodern experience is fragmented, whereas modernity, with its coherent narrative, is unified. Thinking within a unified, coherent narrative is oriented towards a foundation or origin. It sees history as moving forward from this origin through a logical progression. By lacking this sense of progress, postmodern experience for Vattimo thus coincides with nihilism. In searching for an anchor for the self, the postmodern person finds no centre and no certain foundations. As such, Vattimo views the notion of nihilism as the expression of the dislocation humans feel in the postmodern age. Nihilism is encapsulated by a Nietzschean phrase that Vattimo uses in his work The End of Modernity: that man ‘rolls “from the centre toward X”…because, to use a Heideggerian expression for nihilism, “there is nothing left of Being as such’’’ (Vattimo, 1988a: 20). Not only is a foundation for knowledge undesirable in the fragmentation of experience characteristic of the postmodern age, but it is also impossible. Nietzsche’s term for this experience of nihilism is the ‘death of God’. This is not meant in a metaphysical sense, but rather as the loss of the highest values of which God is the highest of all. It is impossible to find a centre, a universally-accessible metaphysical foundation amidst all the images and messages delivered in a society of mass communication. As a result, nihilism entails that there is no clear point of origin, no accessible epistemic foundation, and no universally-shared sense of where we are going. Rather than seeing this in a negative light, Vattimo sees nihilism as our ‘sole opportunity’ for emancipation from the violence of metaphysics, as will be explicated below.

b. Heidegger: Being, Technology and Nihilism

While Heidegger is not conventionally seen as a nihilist, by reading him through Nietzsche, Vattimo is able to draw upon many of Heidegger’s ideas to enrich and deepen his own concept of nihilism. Heidegger thought that metaphysics was the history of the forgetting of Being, of how things are. Heidegger viewed philosophy from Plato to Nietzsche as a history of metaphysics. Since Plato, the question of Being had been pushed aside by metaphysics in favour of the question of truth, and of the relationship between subject and object. Yet the question of truth ignores the prior ontological question, for both subject and object exist. Metaphysics, through the use of reason, establishes foundations upon which truth is made objective and to which one ‘must give one’s assent or conform’ (Vattimo, 1999: 43).  Heidegger undercuts this establishment of foundations through his notion of the human being as ‘Dasein’. The term ‘Dasein’ refers to the human being as that which always has a pre-ontological interpretation of the world. This itself is given by the contingencies of how one is ‘thrown’ into the world, such as where, when and in what kind of environment one is brought up. In Vattimo’s reading of Heidegger, the end of metaphysics entails the same consequence as the death of God: nihilism and the emancipation of different interpretations, the latter no longer held to account by rationalistic metaphysics.

In his reading of Heidegger, Vattimo sees metaphysics reaching its point of culmination in modern technology. Before looking at Vattimo’s specifically nihilistic reading of Heidegger on technology, it is necessary to outline Heidegger’s thoughts on the issue. In ‘The Question Concerning Technology’ and Identity and Difference, Heidegger states that the essence of technology is not something technological: it is not merely instrumental, but also a way of revealing. The idea of ‘revealing’ comes from Heidegger’s phenomenological rejection of Kant’s divorcing of how things appear to us from how they really are; Heidegger thought they are connected, and the appearance of something in our consciousness is how it is revealed to us, how it is brought into unconcealment. Every unconcealment also conceals, however, as our knowledge of beings is always fragmentary; there is always more to the essence of a thing than is revealed to us. Technology’s role in unconcealment for Heidegger is evident in the interest he pays to the ancient Greek etymology of techné, which emphasises technology’s role in ‘opening up’ and ‘revealing’. Techné is a form of poiesis, a Greek term for a poetic revealing that is a bringing-forth from unconcealment, whether an artisan brings-forth a chalice which was previously a potential chalice, or whether a blossom brings itself into bloom. Primitive technology allowed nature to reveal itself ‘poetically’, such as a farmer watching crops grow and harvesting them or a windmill converting the energy generated by the wind when it blew. Industrial technology, on the other hand, ‘challenges’ nature by placing an unreasonable demand on it, forcing it to produce what is required of it by humans. For example, with man-made hydroelectric dams the mode of revealing is a ‘challenging forth’: the way in which the river reveals itself is no longer the same.
Rather than the Rhine appearing poetically as water flowing as a feature of a larger landscape, modern technology has made it into an energy resource. Equally, tourism cannot see the Rhine as an object of nature, but merely as a source of income. All nature is challenged in this way. Humans are also challenged, for they are reduced to the level of objects used for production. For example, human resources departments can be viewed as regarding humans as resources for production. A human waiting to go to work is, in this industrial society, like an aeroplane on a runway: neither has value in being brought-forth for its own sake, but only for something else; essentially both are ‘standing-reserve’, valuable only when employed and at the mercy of a system which uses and manipulates them as and when required. The term for this type of revealing which is a challenging on a global scale is Ge-Stell (enframing). Ge-Stell is the culmination of metaphysics because it involves the total planning of everything in perfectly ordered relationships of cause and effect, all capable of unlimited manipulation.

Heidegger had a negative view of technology because of its nihilistic conclusion as the culmination of metaphysics. While Vattimo agrees with Heidegger’s critique of technology, he also sees liberating opportunities provided by it. These arise principally through drawing upon Heidegger’s later work: the notion of ‘the first flashing up of Ereignis’ (Vattimo, 1988a: 26), where Vattimo references Heidegger’s Identity and Difference. In Vattimo’s reading of this quotation, time spent considering Ge-Stell leads to an understanding of the event-like nature of Being through the Ereignis, the event of transpropriation. Heidegger’s later work is less interested in Dasein as the home or determining site of Being. Rather, Being is a horizon in which things appear, ‘the aperture within which alone man and the world, subject and object, can enter into relationship’ (Vattimo, 2004: 6). For the later Heidegger, Being is a series of irruptions, or ‘events’. There is a Selbst (same) which ‘sends’, or ‘destines’ (Geschick) these events, although it is wrong to think of the Selbst as a being, for this would be to repeat metaphysics. Demonstrating the influence of both Heidegger and Gadamer, Vattimo thinks that Being is nothing other than language. Therefore, the way Being appears to us is in a series of historical announcements (events) that colour our interpretation of the traces of Being from previous epochs (the sendings of Being). The traces of Being are transmitted through linguistic traditions into which we – as Dasein – are always already thrown.

Vattimo extends Heidegger’s understanding of technology to take into account contemporary communications technology. In the play of images and messages attained through media such as television, radio and the internet, the difference between subject and object dissolves. For instance, one may doubt that someone’s online profile is ‘real’. Moreover, how could one ever verify its claim to represent reality? In the Ereignis which results from Ge-Stell, metaphysical designations such as ‘subject’ and ‘object’ disappear as everything is challenged-forth. In the Enlightenment era, the rational Cartesian ‘thinking thing’ was not only the subject, but also the foundation of knowledge. This anthropocentrism continued in different ways through the construction of unilinear narratives surrounding progress and science. As Ge-Stell challenges the distinction between humans and things, reducing them all to causally determined standing-reserve capable of manipulation, the Ereignis is the ‘event of appropriation’ that Vattimo considers a ‘trans-propriation’. In the Ereignis, humanity and Being (traditionally considered as that which grounds the rule of reason) lose their metaphysical properties of subject and object. As a result, Being is shown not as a foundation or a thing, but as an ‘exchange value’: as ‘language and… the tradition constituted by the transmission and interpretation of messages’ (Vattimo, 1988a: 26).

c. Ontology, ‘Exchange Value’ as Language, and Verwindung as Resignation

Before discussing exactly how language and tradition function within Vattimo’s philosophy, it is worth examining why he concerns himself with ontology at all. Some philosophers take the end of metaphysics to constitute a total departure from ontology, for they feel it is too closely associated with metaphysical foundationalism. Vattimo’s problem with this view is that a non-ontological approach to knowledge once again locates the origin of knowledge in beings, each within its own realm. Yet beings will plan and organise their realms in such a way that an authority akin to the one previously associated with metaphysical Being comes to be postulated of beings themselves. This could lead to a relativism in which local epistemologies or groups are incapable of external criticism. Moreover, relativism may itself take on the appearance of a metaphysical principle.

Rather than approaching the end of metaphysics from a relativistic standpoint or retreating into a local epistemology, Vattimo looks at the postmodern experience through a nihilistic ontology. The postmodern experience is fragmented, since the ‘death of God’ means that there is an irreducible plurality of perspectives on the world. This fragmentation is exacerbated by the society of mass communication. Nevertheless, for Vattimo there are traces of traditions by which we can – and must – orient ourselves. It is in this sense that Vattimo contends that Being is reduced to ‘exchange value’. Our experience of existence is absorbed into the language we use. This language is taken from traces of traditions from past epochs. What Vattimo considers to be potentially liberating – our ‘sole opportunity’ and ‘nihilistic vocation’ – is how we approach, consider, and re-use the traces of Being from past traditions. This process involves the Heideggerian concept Verwindung.

Verwindung has multiple meanings for Vattimo: being resigned to tradition, yet also distorting or ‘twisting’ it and, as a result, recovering from it in a form of ‘convalescence’ from the ‘metaphysical malady’ (Vattimo, 2006: 151). Metaphysics cannot be overcome by standing it ‘on its head’ (Vattimo, 2006: 151), for this would be to lay another foundation. Rather, Vattimo thinks that metaphysics can only be overcome by a long convalescence. To this end, Vattimo contrasts Verwindung with an Überwindung (overcoming) of modernity or an Aufhebung (dialectical overcoming in the Hegelian sense). To overcome modernity, to leave metaphysics behind altogether, would be to create a new foundation; whether locally, in the relativist sense described above, or as a new global epistemological foundation, one would be repeating the metaphysical tendency to create foundations. Moreover, in modernity Being was reduced to the value of the new. After the ‘death of God’, the Geschick (the epoch of the history of Being) in which we are living is one of dislocation, in which the narrative of progress has been demythologised.

If metaphysics is not to be overcome, but ‘twisted’, what does this involve and how does it happen? Lexically, Verwindung

is a convalescence (in the sense of ‘ein Krankheit verwinden’: to heal, to be cured of an illness) and a distorting (although this is a rather marginal meaning linked to ‘winden’, meaning ‘to twist’, and to the sense of a deviant alteration which the prefix ‘ver—‘ also possesses). The notion of ‘convalescence’ is linked to another meaning as well, that of ‘resignation’…Besides these meanings of the term, there is that of ‘distortion’ to consider as well (Vattimo, 1988a: 172-173).

This notion of Verwindung is related to Vattimo’s view of nihilism as our sole opportunity. He follows Nietzsche in referring to an ‘accomplished nihilism’, one which aims at creating one’s own values after the highest values have been dissolved. The opportunity of accomplished nihilism is limited by language, and this is where Verwindung comes into play:

Tradition is the transmitting of linguistic messages that constitute the horizon within which Dasein is thrown as an historically determined project: and tradition derives its importance from the fact that Being, as a horizon of disclosure in which things appear, can arise only as a trace of past words (Vattimo, 1988a: 120).

The exchange value of Being is like that of a common currency in the community. Another phrase Vattimo uses to show the inescapable influence of the metaphysical tradition is ‘the ontology of decline’: that we are living in ‘the Occident’ which is ‘the land of sunset (and hence, of Being)’. This nostalgia is a form of resignation, since one cannot escape metaphysics without creating a new foundation. Yet in doing so, one would succumb to the sort of authoritarianism one wishes to escape.

d. ‘Weak Thought,’ Hermeneutics and Verwindung as Convalescence-Alteration

In the postmodern experience, Dasein is thrown into an existence in which experience is fragmented. Furthermore, in this existence thought is limited by the common currency of Being as a collection of linguistic traces mediated by tradition. One cannot overcome metaphysics as a history of Being without falling into authoritarianism. However, in recognising the flaws of authoritarianism, one tacitly realises nihilism as the sole opportunity to find liberation. Such liberation occurs through weakening the traces of the tradition into which one is thrown. Nostalgic resignation, according to Vattimo, is not mere acceptance. The Being of Dasein is interpretative, and therefore the reception of traces of tradition is active, rather than passive. This is the active nihilism of Nietzsche’s ‘philosophy of the morning’. In Vattimo’s work, it is what came to be known in the early 1980s as the philosophical style of ‘weak thought’. Through Andenken (recollection) and Sorge (care) one recollects the traces of Being handed down through the linguistic tradition. In receiving these traces, however, one interprets them in accordance with the sending of Being – the horizon – into which one is thrown. Nostalgia is also a recovery in that it involves a rewriting of the tradition. The sending of Being is an event rather than an unchanging essence, as it would be on a Platonic conception of Being.

If Being is the experience of existence, and Being is linguistic, then the issue of interpretation comes into play when expressing one’s existence. During the event of the sending of Being, one cannot conceive of traces of Being as simply true when recollecting them; they are not ‘facts’ to be remembered. Therefore, although Vattimo sees Verwindung as a resignation to Being, it is an ironic remembrance rather than a total acceptance. Through recollecting and twisting traces of Being in light of the event of the sending of Being, one weakens strong truth claims. As a result, one is ‘healed’ from what Vattimo considers to be the violence of metaphysics. This ‘convalescent’ aspect of Verwindung occurs through the hermeneutical event of the act of interpretation. Determined in this manner, Vattimo’s philosophy of ‘weak thought’ involves a withdrawal from metaphysics by avoiding new foundations or complete assent to any position.

4. Return to Religion

a. Reasons behind Vattimo’s Return to Religion

On the face of it, Vattimo’s philosophy does not appear to be integrally connected to religion. Nevertheless, a significant proportion of Vattimo’s writings in the last twenty years have been devoted to religion and religious themes. Vattimo’s return to writing on religion was gradual, beginning with only brief mentions in his works of the late 1980s before religion appeared prominently in his writings of the 1990s. While personal factors, including old age and the death of loved ones, brought him back to his faith, Vattimo is first and foremost a theorist. During that period, the profile of religion in society was growing through the Iranian Revolution and the role of Pope John Paul II in the breakdown of Communism. Vattimo developed his ideas on religion as an extension of his philosophies of hermeneutical nihilism, ‘weak thought’, and the relation of metaphysics to ontology. This largely entailed a reading of Heidegger through which the gradual weakening of metaphysics, rather than its return, is anticipated. Nevertheless, metaphysics cannot simply be overcome; it must be worked through in the forms of life we have inherited and are developing. The forms of life and traditions we inherit are the limits of our thought and language. Being is disclosed within language, and so it is disclosed within cultural horizons. Hermeneutical nihilism leaves room for faith by dissolving the authoritarianism of reason.

Vattimo feels compelled to ascertain the implications of hermeneutical nihilism and ‘weak thought’ for Christianity. He sees Christianity as a set of beliefs and practices synonymous with Europe, and an inalienable element in the formation of his character and personal life. Vattimo goes beyond merely applying his idea of ‘weak thought’ to Christianity. He twists Christianity in accordance with Verwindung, yet views the message of Christianity as a stimulus: one that explains the sending of Being in late modernity as an irreducible plurality of interpretations. In part, this conception is the result of Vattimo’s acquaintance with the work of René Girard, whose notion of the ‘natural sacred’ he sees as a transcription of Heidegger’s attitude to Being.

b. The Philosophical Importance of the Christian Message in Vattimo’s Thought

René Girard’s theological anthropology has focussed on the importance of Christ’s death and resurrection in revealing the mechanisms that underpin society. On Girard’s view, natural religions are founded upon the need to create victims to keep order in society. The mimetic drive in humans to desire what the other has escalates until violence threatens to consume society. A sacrificial scapegoat is killed to prevent the society’s destruction. Over time this becomes ever more ritualised and assumes a sacral and divine character. Girard sees the Old and New Testaments as intended to reveal this victim-based mechanism. Jesus’ purpose was not, as Christian theology proposes, to be the perfect sacrifice for his father. Instead, Girard thinks that Jesus was put on the cross due to his revelation of this mechanism. Vattimo, who likewise considers the sacred to involve violence, saw the potential for adaptation of Girard’s notion of the natural sacred. Unlike Girard, Vattimo does not see Jesus as revealing an anthropological truth (the victim-based mechanism). Vattimo instead sees Jesus as the instigator of the desacralising weakening that has come to fruition in modernity. This weakening occurs through the exposition of the tendency of religions to be authoritarian and violent, particularly in demanding sacrifice.

Vattimo sees Heidegger’s philosophy ‘as an active…revelation of the same victimary mechanism that Girard makes us discover in the Judeo-Christian Scripture’ (Vattimo, 2010: 80). Vattimo thinks that Heidegger’s philosophy is a transcription of the Judeo-Christian revelation, especially as the latter is interpreted by Girard. For Girard, this is ‘the basic victimary structure of all human culture; for Heidegger, it exposes the “secret” of metaphysics, which is the forgetting of Being and the identification of it with beings, objectivity etc.’ (Vattimo, 2010: 81). The meaning of history for both Girard and Heidegger, as Vattimo reads them, is emancipation from violence. Vattimo sees in Girard a link between the ‘violent’ God of metaphysics and the violence he sees in metaphysics, broadly following Heidegger’s understanding of it. The notion of God referred to here is the onto-theological, maximising concept of God containing the attributes of omnipotence, omniscience, and judgment. In his introduction to Vattimo and Girard’s Christianity, Truth and Weakening Faith, Pierpaolo Antonello makes clear the link between Girard and Vattimo:

The rupture of the sacrificial circle, accomplished by the Judeo-Christian revelation (in Girard’s terms) or the kenosis of God through the incarnation (in Vattimo’s), launched a historical development that culminates in the present age (Antonello, 2010: 9).

For Girard, the culmination is the present age, in which humans must decide whether to continue with mimetic rivalry and the sacrifice of scapegoats even when the mechanism has become fully exposed. The exposition has taken place at an apocalyptic level, that of the brink of a global nuclear holocaust, and through Jesus’ call for charity. For Vattimo, the culmination is the realisation of the process of secularisation in hermeneutics.

Reading Girard helped Vattimo to reappraise Christianity, and to see faith in Christ as faith in the weakening of strong structures. Vattimo began to connect Girard’s theological anthropology to Heidegger’s attitude to Being. The violence of the natural sacred is akin, in Vattimo’s eyes, to the violence of metaphysics. Girard also impressed on Vattimo the importance of Christ for explaining the current situation of hermeneutical nihilism. While Girard focussed on his notions of mimesis and the natural sacred, Vattimo translated these ideas into terms borrowed from the language of theology. Vattimo draws upon the ‘incarnation’ in a variety of ways, principally through the closely related notion of kenosis, a term used in the Bible to refer to the ‘self-emptying’ of Christ (Philippians 2:7). In line with this association, Vattimo refers to a variety of Biblical passages in support of his understanding of kenosis. He quotes Hebrews 1:1-2 in Beyond Interpretation, one of the earlier extended pieces of writing in his return to religion. In that text Vattimo used kenosis to refer to the archetype of hermeneutic plurality in the West, comparing and contrasting it with other, less suitable (that is, in Vattimo’s view, metaphysical) candidates such as Aristotle’s claim that ‘Being is said in many ways’. For Vattimo, Aristotle’s contention remains a metaphysical statement, and hence is less suitable than Paul’s ‘prophetic’ alternative from Hebrews. The latter, Vattimo thinks, is a statement of weakening and of the message revealed in the incarnation of the Son of God. This message referred back to the prophets in the Old Testament, but also pointed forward to the Spirit speaking in many tongues at and after Pentecost (Acts 2). Vattimo uses this prophetic understanding of kenosis to challenge Wilhelm Dilthey’s idea that hermeneutics is a ‘general philosophy’ which grew out of the move away from metaphysical dogmatism.
Instead, Vattimo uses kenosis to ground hermeneutics more fundamentally as a longstanding, archetypal tradition of the West. For Vattimo, ‘kenosis’ and ‘incarnation’ also reconfigure the relationship of the late-modern person to dogmatic thought, without overcoming it completely; it is a Verwindung rather than an Überwindung.

In other writings from his renewed focus on religion, Vattimo uses kenosis to refer to a process he calls ‘secularisation’. Through it, a message of weakening is passed down and dissolves strong structures; this process is both the essence and the fulfilment of the Christian message. Vattimo invokes the notion of secularisation in Beyond Interpretation and in earlier writings, but without the precision he gives the concept in Belief and After Christianity. In these later works, kenosis is the abasement of God. This can be understood as God emptying himself of power and otherness (Philippians 2:7), or as Christ calling humans to be friends rather than servants (John 15:15). Again utilising theological terms, Vattimo regards ‘salvation’ as the reinterpretation of Jesus’ words; that we now feel free to reinterpret those words is evidence of salvation manifesting through history. The process of secularisation as weakening, inaugurated with kenosis, has dissolved the strong structures (metaphysical, political, religious) that had restricted the possibilities of scriptural interpretation. Further weakening (for it is a process that never ends), Vattimo believes, occurs through dissolving strong structures, living charitably and being open to others.

Vattimo draws upon the ideas of the twelfth-century theologian Joachim of Fiore in this regard. Joachim saw history as divided into stages corresponding to the Trinity and the canon. The stage of the ‘Father’ pertained to the Old Testament, and this was about discipline. The stage of the ‘Son’ corresponded to the New Testament, and this was about the rise of the Church. Finally, the stage of the ‘Spirit’ would eventually bring a spiritual reading of scripture (one that rejects literalism and claims to objectivity) and greater interpretative freedom. Although Vattimo rejects literal prophecy as archaic, he is interested in a spiritual reading of scripture and doctrine. He sees the hermeneutics of the late-modern period as having a relationship to scripture and doctrine similar to the one Jesus has to the Old Testament: ‘you heard it was said…but I tell you…’ (the Antitheses, Matthew 5). Indeed, Vattimo sees the hermeneutical nihilism of the postmodern era as the result of secularisation, that is, as the result of the kenosis of God that gradually liberates humanity from the myth of objectivity. On this issue, Vattimo relies heavily on the link between Jesus in Girard’s writing and the weakening of Being in Heidegger’s thought.

5. Ethics

Vattimo sees his philosophy of hermeneutical nihilism as having important ethical ramifications. As Vattimo claims that there are no facts, only interpretations, he opposes moral realism and any claim to objectivity in morals. Instead, Vattimo argues for an ‘ethics of finiteness’, which is best summarised in the following passage from his recent work, A Farewell to Truth:

[An ethic of finiteness should be] understood neither as the compulsion to leap into the void (much twentieth-century religious thought argues this line: acknowledgement of finiteness prepares the leap into faith, hence only a God can save us) nor as the definitive assumption of the alternatives concretely presented by the situation. An ethic of finiteness is one that strives to keep faith with the discovery of the always insuperable finite situatedness of one’s own provenance, while not forgetting the pluralistic implications of this discovery (Vattimo, 2011: 96).

One response to nihilism is to seek safety in the Other, as expressed in the philosophies of Levinas or Derrida. On Vattimo’s view, these philosophers conceive of secularisation ‘as the fall in which God’s transcendence as the wholly other can be revealed through dialectical reversal’ (Vattimo, 2002: 37). Vattimo instead contends that one should acknowledge one’s situatedness, that of being a thrown project (we are born in a specific time and place, and with a particular background). This realisation should guard against strong thought. As he concedes in recent works, strong thought has in fact re-emerged in the late 20th and early 21st centuries. Vattimo’s ethics are therefore not normative in the conventional sense, but focus instead on encouraging the recognition of both one’s situatedness and the provisionality of one’s worldview. Ethically, one should take a step backwards from one’s immediate situation. Although one’s cultural horizon constitutes the limit of thought, this does not entail outright moral relativism. One should be able to refrain from following the ethnic, cultural, religious or political principles local to one’s identity, since a lack of objective first principles does not warrant an immediate retreat into local rationalities. What is required is a disposition in accordance with the event of Being of the late modern, and the ability to step back from one’s immediate situation in order to take other people and their beliefs, values, and traditions into account.

In addition to stepping back from one’s immediate situation, the other main element of Vattimo’s ethics is the weakening of strong structures. Operating here is the influence of the traditionally Christian virtue of caritas (‘charity’), although Vattimo interprets this concept through his theory of ‘weak thought’. Vattimo sees caritas as the force driving secularisation, the application of the message of the kenosis of God through the interpretative act. Kenosis itself is an act of God showing his love for his creatures by emptying himself of his power and authority, and by calling us to be friends rather than servants. Indeed, it is possible to see caritas and kenosis as identical: kenosis is the message of the weakening of God, and caritas is the message of weakening as a formal principle. Imitating God means listening to the message of kenosis, of his weakening, and following it as if following a formal principle. In Vattimo’s case, the principle is one of reducing violence by dissolving strong structures through interpreting and questioning them. As to whether caritas is itself metaphysical, Vattimo states that it ‘is not really ultimate and does not possess the peremptoriness of the metaphysical principle, which cannot be transcended’ (Vattimo, 1999: 64). That caritas lacks the quality of being ‘ultimate’ is due, paradoxically, to caritas itself, since it is a principle of weakening. Its status as the kernel of revelation is guaranteed through Vattimo’s reading of kenosis as the heart of the New Testament, and his understanding of ‘love’ as God’s love shown through the message of his weakening.

For Vattimo caritas is the limit of secularisation. It is that which cannot be secularised and is the standard by which beliefs and practices should be judged when entering the secular public space. For instance, wearing the cross is not, Vattimo judges, offensive any longer because it has been secularised and is part of the cultural furniture of the West. However, the mindset involved in wearing the chador does not exhibit caritas as it is an example of strong thought. Those who wear or endorse the wearing of Islamic dress have not ‘read the signs of the times’, since the wearing of the chador is ‘an affirmation of a strong identity’ (Vattimo, 2002: 101). This mindset does not recognise the provisional nature and situatedness of its own provenance, nor does it accept the plurality of other views in the secular space of modern Europe. Therefore, Vattimo argues, it should not be legally permitted to express itself through such dress codes. Hence caritas prevents one from retreating into and cementing local identities in the absence of objective reality and objective values. However, Vattimo does not explore the possibility that the chador could be worn out of choice in a ‘weak’ sense.

Caritas rules out both specific normative positions and broader meta-ethical approaches. Concerning the latter, Vattimo is against anything that flouts ‘Hume’s law’, according to which one cannot derive an ‘ought’ from an ‘is’. The claim ‘torture is wrong’ can lead to ‘one should not torture’ only with the bridging statements ‘torture causes pain’ and ‘pain is wrong’. However, one must then explain why pain is wrong. Vattimo may well agree that one should minimise pain, but he would not do so on naturalistic grounds. This is because he does not believe that one can derive normative ethical prescriptions from the rational observation of nature or eternal forms. Such a move would always entail that some people — such as Plato’s ‘philosopher kings’ — claim authoritative knowledge of the link between nature and normativity. Vattimo justifies his approach in two ways. Firstly, he critiques the metaphysical nature of naturalistic meta-ethics. Secondly, he claims that the abuse of naturalistic ethics by elites is manifest in the position of the Catholic Church on sexual ethics (the prohibition of homosexual acts and of artificial contraception) and medical or bio-ethics (prohibitions against euthanasia).

6. Political Philosophy

Vattimo was a Marxist during his youth. He eventually moved away from Marxism when he realised that some of his students’ metaphysical commitments were leading them to violent acts. This pushed him to develop his hermeneutics in the direction of weakening metaphysical strong structures of all forms, including Marxist commitments. Drawing upon the implications of hermeneutics for politics, Vattimo warned against specialists, technocrats, and any kind of leadership that presumes exclusive knowledge of the truth by which to lead a country. Likening these kinds of individuals to Plato’s philosopher kings, Vattimo saw any such form of governance as falling prey to the myth about objective truth. Rather, Vattimo argued, politics should be bounded by the cultural horizon of its time and place. In this sense, Vattimo consciously limited politics and politicians in the way Thomas Kuhn limited the claims of scientists with his ‘paradigm’ concept. Eternal forms and objective truths were replaced with provisional judgements. These are based on forms of life limited and constituted by cultural horizons and re-interpreted linguistic traditions.

Plato’s philosopher kings — the specialists, experts, and technocrats — can lead to totalitarianism, which Vattimo was keen to avoid. If the laws that run society were objective, Vattimo thought that democracy would be an irrational choice. However, a ‘weak ontology’ or a ‘philosophy of weakening’ can provide reasons for preferring liberal democracy over totalitarianism. Vattimo related totalitarian government to Heidegger’s ‘Ge-Stell’, and claimed that within the Ge-Stell there is the first flashing-up of Ereignis. This aperture enables one to see that truth is not found in ‘presence’. Rather, it is based in historical events and the consensus formed within cultural horizons. Individuals can liberate themselves from facilitating government agendas through this realisation. Governments themselves can change through acknowledging the provisional and culturally-bound nature of thought. Although Vattimo aimed to move away from revolutionary violence, his justification of democracy over totalitarianism was particularly timely because of the Arab Spring.

In later years, Vattimo returned to a weakened Marxism, a form of Communism deeply informed by his philosophy of ‘weak thought’. Attempting to distance himself from the kind of Communism put into practice in the Soviet era, Vattimo claimed that the problem with the traditional understanding of Marx is that his ideas have been framed metaphysically. Vattimo therefore looked to create a post-metaphysical Marxism. As with Catholicism, Vattimo realised one cannot overcome Marxism, but that it should be twisted. ‘Hermeneutic Communism’ was presented by Vattimo and his collaborator Santiago Zabala as an alternative to forms of liberal capitalism that aim to keep the status quo in favour of those benefitting from the system. Equally, they saw it as an alternative to forms of Marxism that involve unilinear historicism.

7. References and Further Reading

The list of works by Vattimo below is not exhaustive; it covers his major works in translation. For works about Vattimo, the volumes edited by Benso and Schroeder and by Zabala are good places to start, covering a range of his work.

a. Works by Vattimo

  • The End of Modernity: Nihilism and Hermeneutics in Postmodern Culture. Trans. J. R. Snyder. Baltimore: Johns Hopkins University Press 1988a.
  • “Metaphysics, Violence, Secularisation.” Trans. B. Spackman. In Recoding Metaphysics: The New Italian Philosophy. Edited by G. Borradori. Evanston: Northwestern University Press 1988b, 45-61.
  • “Toward an Ontology of Decline.” Trans. B. Spackman. In Recoding Metaphysics: The New Italian Philosophy. Edited by G. Borradori. Evanston: Northwestern University Press 1988c.
  • The Transparent Society. Trans. D. Webb. Cambridge: Polity Press 1992.
  • The Adventure of Difference: Philosophy after Nietzsche and Heidegger. Trans. C. P. Blamires and T. Harrison. Cambridge: Polity Press 1993.
  • Beyond Interpretation: The Meaning of Hermeneutics for Philosophy. Trans. D. Webb. Cambridge: Polity Press, 1997.
  • With J. Derrida. Religion. Stanford: Stanford University Press 1998.
  • Belief. Trans. L. D’Isanto and D. Webb. Cambridge: Polity Press 1999.
  • Nietzsche: An Introduction. Trans. N. Martin. Stanford: Stanford University Press 2002.
  • After Christianity. Trans. L. D’Isanto. New York: Columbia University Press 2002.
  • Nihilism and Emancipation: Ethics, Politics, and Law. Foreword by Richard Rorty. Edited by Santiago Zabala and translated by William McCuaig. New York: Columbia University Press 2004.
  • With R. Rorty. The Future of Religion. Edited by Santiago Zabala. New York: Columbia University Press 2005.
  • Dialogue with Nietzsche. Trans. William McCuaig. New York: Columbia University Press 2006.
  • With John D. Caputo. After the Death of God. Edited by Jeffrey W. Robbins. New York: Columbia University Press 2007.
  • With Piergiorgio Paterlini. Not Being God: A Collaborative Autobiography. Trans. William McCuaig. New York: Columbia University Press 2009.
  • With René Girard. Christianity, Truth and Weakening Faith: A Dialogue. Edited by Pierpaolo Antonello. New York: Columbia University Press 2010.
  • The Responsibility of the Philosopher. Edited by Franca D’Agostini and translated by William McCuaig. New York: Columbia University Press 2010.
  • A Farewell to Truth. Trans. William McCuaig, with a foreword by Robert T. Valgenti. New York: Columbia University Press 2011.
  • With Santiago Zabala. Hermeneutic Communism: From Heidegger to Marx. New York and Chichester, West Sussex: Columbia University Press, 2011.
  • Being and Its Surroundings. Edited by Giuseppe Iannantuono, Alberto Martinengo and Santiago Zabala. Trans. Corrado Federici. Montreal: McGill-Queen’s University Press 2021.

b. Works on Vattimo

  • Benso, S., Schroeder, B, eds. Between Nihilism and Politics: The Hermeneutics of Gianni Vattimo. New York: SUNY 2010.
  • Borradori, G. ““Weak Thought” and Postmodernism: The Italian Departure from Deconstruction.” Social Text 18 (Winter, 1987-1988), 39-49.
  • Depoortere, F. Christ in Postmodern Philosophy. London: T&T Clark 2008.
  • Guarino, T. Vattimo and Theology. New York: Continuum 2009.
  • Pireddu, Nicoletta. “Gianni Vattimo.” In Postmodernism, edited by Johannes Willem Bertens and Joseph P. Natoli. Boston: Blackwell 2002, 302-9.
  • Woodward, Ashley. “Nihilism and the Postmodern in Vattimo’s Nietzsche.” Minerva 6 (2002), 51-67.
  • Zabala, Santiago, ed. Weakening Philosophy: Essays in Honour of Gianni Vattimo. Montreal and Kingston, London, Ithaca: McGill-Queen’s University Press 2007.

c. Other Works Cited

  • Giorgio Colli and Mazzino Montinari, eds. Nietzsche: Werke. Kritische Gesamtausgabe. Berlin and New York, 1967ff.

 

Author Information

Matthew Edward Harris
Email: h031781a@student.staffs.ac.uk
Staffordshire University
United Kingdom

Musonius Rufus (c. 30–before 101-2 C.E.)

Gaius Musonius Rufus was one of the four great Stoic philosophers of the Roman empire, along with Seneca, Epictetus, and Marcus Aurelius. Renowned as a great Stoic teacher, Musonius conceived of philosophy as nothing but the practice of noble behavior. He advocated a commitment to live for virtue, not pleasure, since virtue saves us from the mistakes that ruin life. Though philosophy is more difficult to learn than other subjects, it is more important because it corrects the errors in thinking that lead to errors in acting. He also called for austere personal habits, including the simplest vegetarian diet, and minimal, inexpensive garments and footwear, in order to achieve a good, sturdy life in accord with the principles of Stoicism. He believed that philosophy must be studied not to cultivate brilliance in arguments or an excessive cleverness, but to develop good character, a sound mind, and a tough, healthy body. Musonius condemned all luxuries and disapproved of sexual activity outside of marriage. He argued that women should receive the same education in philosophy as men, since the virtues are the same for both sexes. He praised marriage and the raising of many children. He affirmed Stoic orthodoxy in teaching that neither death, injury, insult, pain, poverty, nor exile is to be feared, since none of them is an evil.

Table of Contents

  1. Life
  2. Teachings
  3. Philosophy, Philosophers, and Virtue
  4. Food and Frugality
  5. Women and Equal Education
  6. Sex, Marriage, Family, and Old Age
  7. Impact
  8. References and Further Reading

1. Life

Gaius Musonius Rufus was born before 30 C.E. in Volsinii, an Etruscan city of Italy, as a Roman eques (knight), the class of aristocracy ranked second only to senators. He was a friend of Rubellius Plautus, whom emperor Nero saw as a threat. When Nero banished Rubellius around 60 C.E., Musonius accompanied him into exile in Asia Minor. After Rubellius died in 62 C.E., Musonius returned to Rome, where he taught and practiced Stoicism, which roused the suspicion of Nero. On discovery of the great conspiracy against Nero, led by Calpurnius Piso in 65 C.E., Nero banished Musonius to the arid, desolate island of Gyaros in the Aegean Sea. He returned to Rome under the reign of Galba in 68 C.E. and attempted to persuade the Flavian army approaching Rome to make peace. In 70 C.E. Musonius secured the conviction of the philosopher Publius Egnatius Celer, who had betrayed Barea Soranus, a friend of Rubellius Plautus. Musonius was exiled a second time, by Vespasian, but returned to Rome in the reign of Titus. Musonius was highly respected and had a considerable following during his life. He died before 101-2 C.E.

2. Teachings

Either Musonius wrote nothing himself, or what he did write is lost, because none of his own writings survive. His philosophical teachings survive as thirty-two apophthegms (pithy sayings) and twenty-one longer discourses, all apparently preserved by others and all in Greek, except for Aulus Gellius’ testimonia in Latin. For this reason, it is likely that he lectured in Greek. Musonius favored a direct and concise style of instruction. He taught that the teacher of philosophy should not present many arguments but rather should offer a few, clear, practical arguments oriented to his listener and couched in terms known to be persuasive to that listener.

3. Philosophy, Philosophers, and Virtue

Musonius believed that (Stoic) philosophy was the most useful of all pursuits. Philosophy persuades us, according to Stoic teaching, that neither life, nor wealth, nor pleasure is a good, and that neither death, nor poverty, nor pain is an evil; thus the latter are not to be feared. Virtue is the only good because it alone keeps us from making errors in living. Moreover, it is only the philosopher who seems to make a study of virtue. The person who claims to be studying philosophy must practice it more diligently than the person studying medicine or some other skill, because philosophy is more important, and more difficult to understand, than any other pursuit. This is because, unlike students of other skills, those who come to philosophy have already been corrupted in their souls by vices and thoughtless habits, having learned things contrary to what they will learn in philosophy. But the philosopher does not study virtue merely as theoretical knowledge. Rather, Musonius insists that practice is more important than theory, as practice more effectively leads us to action than theory. He held that though everyone is naturally disposed to live without error and has the capacity to be virtuous, someone who has not actually learned the skill of virtuous living cannot be expected to live without error any more than someone who is not a trained doctor, musician, scholar, helmsman, or athlete could be expected to practice those skills without error.

In one of his lectures Musonius recounts the advice he offered to a visiting Syrian king. A king must protect and help his subjects, so a king must know what is good or bad, helpful or harmful, useful or useless for people. But to diagnose these things is precisely the philosopher’s job. Since a king must also know what justice is and make just decisions, a king must study philosophy. A king must possess self-control, frugality, modesty, courage, wisdom, magnanimity, the ability to prevail in speech over others, the ability to endure pain, and must be free of error. Philosophy, Musonius argued, is the only art that provides all such virtues. To show his gratitude, the king offered him anything he wanted; Musonius asked only that the king adhere to the principles set forth.

Musonius held that since a human being is made of body and soul, we should train both, but the latter demands greater attention. This dual method requires becoming accustomed to cold, heat, thirst, hunger, scarcity of food, a hard bed, abstaining from pleasures, and enduring pains. This method strengthens the body, inures it to suffering, and makes it fit for every task. He believed that the soul is similarly strengthened by developing courage through enduring hardships, and by making it self-controlled through abstaining from pleasures. Musonius insisted that exile, poverty, physical injury, and death are not evils and a philosopher must scorn all such things. A philosopher regards being beaten, jeered at, or spat upon as neither injurious nor shameful and so would never litigate against anyone for any such acts, according to Musonius. He argued that since we acquire all good things by pain, the person who refuses to endure pain all but condemns himself to not being worthy of anything good.

Musonius criticized cooks and chefs while defending farming as a suitable occupation for a philosopher and no obstacle to learning or teaching essential lessons.

4. Food and Frugality

Musonius’ extant teachings emphasize the importance of daily practices. For example, he emphasized that what one eats has significant consequences. He believed that mastering one’s appetites for food and drink is the basis for self-control, a vital virtue. He argued that the purpose of food is to nourish and strengthen the body and to sustain life, not to provide pleasure. Digesting our food gives us no pleasure, he reasoned, and the time spent digesting food far exceeds the time spent consuming it. It is digestion which nourishes the body, not consumption. Therefore, he concluded, the food we eat serves its purpose when we’re digesting it, not when we’re tasting it.

The proper diet, according to Musonius, was lacto-vegetarian. These foods are least expensive and most readily available: raw fruits in season, certain raw vegetables, milk, cheese, and honeycombs. Cooked grains and some cooked vegetables are also suitable for humans, whereas a meat-based diet is too crude for human beings and is more suitable for wild beasts. Those who eat relatively large amounts of meat seemed slow-witted to Musonius.

We are worse than brute animals when it comes to food, he thought, because we are obsessed with embellishing how our food is presented and fuss about what we eat and how we prepare it merely to amuse our palates. Moreover, too much rich food harms the body. For these reasons, Musonius thought that gastronomic pleasure is undoubtedly the most difficult pleasure to combat. He consequently rejected gourmet cuisine and delicacies as a dangerous habit. He judged gluttony and craving gourmet food to be most shameful and to show a lack of moderation. Indeed, Musonius was of the opinion that those who eat the least expensive food can work harder, are the least fatigued by working, become sick less often, tolerate cold, heat, and lack of sleep better, and are stronger, than those who eat expensive food. He concluded that responsible people favor what is easy to obtain over what is difficult, what involves no trouble over what does, and what is available over what isn’t. These preferences promote self-control and goodness.

Musonius advocated a similarly austere philosophy about clothes. The purpose of our clothes and footwear is strictly protection from the elements. So clothes and shoes should be modest and inexpensive, and should not attract the attention of the foolish. One should dress to strengthen and toughen the body, not to bundle up in many layers so as never to experience cold and heat and make the body soft and sensitive. Musonius recommended dressing to feel somewhat cold in the winter and avoiding shade in the summer. If possible, he advised, go shoeless.

The purpose of houses, he believed, was to protect us from the elements, to keep out cold, excessive heat, and the wind. Our dwelling should protect us and our food the way a cave would. Money should be spent both publicly and privately on people, not on elaborate buildings or fancy décor. Beds or tables of ivory, silver, or gold, hard-to-get textiles, cups of gold, silver, or marble—all such furnishings are entirely unnecessary and shamefully extravagant. Items that are expensive to acquire, hard to use, troublesome to clean, difficult to guard, or impractical, are inferior when compared with inexpensive, useful, and practical items made of cast iron, plain ceramic, wood, and the like. Thoughtless people covet expensive furnishings they wrongly believe are good and noble. He said he would rather be sick than live in luxury, because illness harms only the body, whereas living in luxury harms both the body and the soul. Luxury makes the body weak and soft and the soul undisciplined and cowardly. Musonius judged that luxurious living fosters unvarnished injustice and greed, so it must be completely avoided.

For the ancient Roman philosophers, following their Greek predecessors, the beard was the badge of a philosopher. Musonius said that a man should cut the hair on his scalp the way he prunes vines, by removing only what is useless and bothersome. The beard, on the other hand, should not be shaved, he insisted, because (1) nature provides it to protect a man’s face, and (2) the beard is the emblem of manhood, the human equivalent of the cock’s comb and the lion’s mane. Hair should never be trimmed to beautify or to please women or boys. Hair is no more trouble for men than feathers are for birds, Musonius said. Consequently, shaving or fastidiously trimming one’s beard was, for Musonius, an act of emasculation.

5. Women and Equal Education

Musonius supported his belief that women ought to receive the same education in philosophy as men with the following arguments. First, the gods have given women the same power of reason as men. Reason considers whether an action is good or bad, honorable or shameful. Second, women have the same senses as men: sight, hearing, smell, and the rest. Third, the sexes share the same parts of the body: head, torso, arms, and legs. Fourth, women have an equal desire for virtue and a natural affinity for it. Women, no less than men, are by nature pleased by noble, just deeds and censure their opposites. Therefore, Musonius concluded, it is just as appropriate for women to study philosophy, and thereby to consider how to live honorably, as it is for men.

Moreover, he reasoned, a woman must be able to manage an estate, to account for things beneficial to it, and to supervise the household staff. A woman must also have self-control. She must be free from sexual improprieties and must exercise self-control over other pleasures. She must neither be a slave to desires, nor quarrelsome, nor extravagant, nor vain. A self-controlled woman, Musonius believed, controls her anger, is not overcome by grief, and is stronger than every emotion. But these are the character traits of a beautiful person, whether male or female.

Musonius argued that a woman who studies philosophy would be just, a blameless partner in life, a good and like-minded co-worker, a careful guardian of husband and children, and completely free from the love of gain and greed. She would regard it worse to do wrong than to be wronged, and worse to take more than one’s share than to suffer loss. No one, Musonius insisted, would be more just than she. Moreover, a philosophic woman would love her children more than her own life. She would not hesitate to fight to protect her children any more than a hen that fights with predators much larger than she is to protect her chicks.

He considered it appropriate for an educated woman to be more courageous than an uneducated woman and for a woman trained in philosophy to be more courageous than one untrained in philosophy. Musonius noted that the tribe of Amazons conquered many people with weapons, thus demonstrating that women are fully capable of courageously participating in armed conflict. He thought it appropriate for the philosophic woman not to submit to anything shameful out of fear of death or pain. Nor is it appropriate for her to bow down to anyone, whether well-born, powerful, wealthy, or even a tyrant. Musonius saw the philosophic woman as holding the same beliefs as the Stoic man: she thinks nobly, does not judge death to be an evil, nor life to be a good, nor shrinks from pain, nor pursues lack of pain above all else. Musonius thought it likely that this kind of woman would be self-motivated and persevering, would breast-feed her children, serve her husband with her own hands, and do without hesitation tasks which some regard as appropriate for slaves. Thus, he judged that a woman like this would be a great advantage for her husband, a source of honor for her kinfolk, and a good example for the women who know her.

Musonius believed that sons and daughters should receive the same education, since those who train horses and dogs train both the males and the females the same way. He rejected the view that there is one type of virtue for men and another for women. Women and men have the same virtues and so must receive the same education in philosophy.

When it comes to certain tasks, on the other hand, Musonius was less egalitarian. He believed that the nature of males is stronger and that of females is weaker, and so the most suitable tasks ought to be assigned to each nature accordingly. In general, men should be assigned the heavier tasks (e.g. gymnastics and being outdoors), women the lighter tasks (e.g. spinning yarn and being indoors). Sometimes, however, special circumstances—such as a health condition—would warrant men undertaking some of the lighter tasks which seem to be suited for women, and women in turn undertaking some of the harder ones which seem more suited for men. Thus, Musonius concludes that no chores have been exclusively reserved for either sex. Both boys and girls must receive the same education about what is good and what is bad, what is helpful and what is harmful. Shame towards everything base must be instilled in both sexes from infancy on. Musonius’ philosophy of education dictated that both males and females must be accustomed to endure toil, and neither to fear death nor to become dejected in the face of any misfortune.

6. Sex, Marriage, Family, and Old Age

Musonius’ opposition to luxurious living extended to his views about sex. He thought that men who live luxuriously desire a wide variety of sexual experiences, both legitimate and illegitimate, with both women and men. He remarked that sometimes licentious men pursue a series of male sex-partners. Sometimes they grow dissatisfied with available male sex-partners and choose to pursue those who are hard to get. Musonius condemned all such recreational sex acts. He insisted that only those sex acts aimed at procreation within marriage are right. He decried adultery as unlawful and illegitimate. He judged homosexual relationships as an outrage contrary to nature. He argued that anyone overcome by shameful pleasure is base in his lack of self-control, and so blamed the man (married or unmarried) who has sex with his own female slave as much as the woman (married or unmarried) who has sex with her male slave.

Musonius argued that there must be companionship and mutual care of husband and wife in marriage since its chief end is to live together and have children. Spouses should consider all things as common possessions and nothing as private, not even the body itself. But since procreation can result from sexual relations outside of marriage, childbirth cannot be the only motive for marriage. Musonius thought that when each spouse competes to surpass the other in giving complete care, the partnership is beautiful and admirable. But when a spouse considers only his or her own interests and neglects the other’s needs and concerns, their union is destroyed and their marriage cannot help but go poorly. Such a couple either splits up or their relationship becomes worse than solitude.

Musonius advised those planning to marry not to worry about finding partners from noble or wealthy families or those with beautiful bodies. Neither wealth, nor beauty, nor noble birth promotes a sense of partnership or harmony, nor do they facilitate procreation. He judged bodies that are healthy, normal in form, and able to function on their own, and souls that are naturally disposed towards self-control, justice, and virtue, as most fit for marriage.

Since he held that marriage is obviously in accordance with nature, Musonius rejected the objection that being married gets in the way of studying philosophy. He cited Pythagoras, Socrates, and Crates as examples of married men who were fine philosophers. To think that each should look only to his own affairs is to think that a human being is no different from a wolf or any other wild beast whose nature is to live by force and greed. Wild animals spare nothing they can devour, they have no companions, they never work together, and they have no share in anything just, according to Musonius. He supposed that human nature is very much like that of bees, who are unable to live alone and die when isolated. Bees work and act together. Wicked people are unjust and savage and have no concern for a neighbor in trouble. Virtuous people are good, just, and kind, and show love for their fellow human beings and concern for their neighbors. Marriage is the way for a person to create a family to provide the well-being of the city. Therefore, Musonius judged that anyone who deprives people of marriage destroys family, city, and the entire human race. He reasoned that this is because humans would cease to exist if there were no procreation, and there would be no just and lawful procreation without marriage. He concluded that marriage is important and serious because great gods (Hera, Eros, Aphrodite) govern it.

Musonius opined that the bigger a family, the better. He thought that having many children is beneficial and profitable for cities while having few or none is harmful. Consequently, he praised the wise lawgivers who forbade women from agreeing to be childless and from preventing conception, who enacted punishment for women who induced miscarriages, who punished married couples who were childless, and who honored married couples with many children. He reasoned that just as the man with many friends is mightier than the man without friends, the father with many children is much mightier than the man who has few or none. Musonius was very impressed by the spectacle of a husband or wife with their many children crowded around them. He believed that no procession for the gods and no sacred ritual is as beautiful to behold, or as worthy of being witnessed, as a chorus of many children leading their parents through the city. Musonius scolded those who offered poverty as an excuse for not raising many children. He was outraged that the affluent refused to raise later-born children so that the earlier-born would be better off. Musonius believed that it was an unholy act to deprive the earlier-born of siblings in order to increase their inheritance. He thought it much better to have many siblings than to have many possessions. Possessions invite plots from neighbors. Possessions need protection. Siblings, he taught, are one’s greatest protectors. Musonius opined that the man who enjoys the blessing of many loyal brothers is most worthy of emulation and most beloved by the gods.

When asked whether one’s parents should be obeyed in all matters, Musonius answered as follows. If a father, who was neither a doctor nor knowledgeable about health and sickness, ordered his sick son to take something he believed would help, but which would in fact be useless, or even harmful, and the sick son, knowing this, did not comply with his father’s order, the son would not be acting disobediently. Similarly, if the father was ill and asked for wine or inappropriate food that would worsen his illness if consumed, and the son, knowing better, refused to give them to his ill father, the son would not be acting disobediently. Even less disobedient, Musonius argued, is the son who refuses to steal or embezzle money entrusted to him when his greedy father commands it. The lesson is that refusing to do what one ought not to do merits praise, not blame. The disobedient person disobeys orders that are right, honorable, and beneficial, and acts shamefully in doing so. But to refuse to obey a shameful, blameworthy command of a parent is just and blameless. The obedient person obeys only his parent’s good, appropriate advice, and obeys all of such advice. The law of Zeus orders us to be good, and being good is the same thing as being a philosopher, Musonius taught.

He argued that the best thing to have on hand during old age is living in accord with nature, the definitive goal of life according to the Stoics. Human nature, he thought, can be better understood by comparing it to the nature of other animals. Horses, dogs, and cows are all inferior to human beings. We do not consider a horse to reach its potential by merely eating, drinking, mating without restraint, and doing none of the things suitable for a horse. Nor do we regard a dog as reaching its potential if it merely indulges in all sorts of pleasures while doing none of the things for which dogs are thought to be good. Nor would any other animal reach its potential by being glutted with pleasures but failing to function in a manner appropriate to its species. Hence, no animal comes into existence for pleasure. The nature of each animal determines the virtue characteristic of it. Nothing lives in accord with nature except what demonstrates its virtue through the actions which it performs in accord with its own nature. Therefore, Musonius concluded, the nature of human beings is to live for virtue; we did not come into existence for the sake of pleasure. Those who live for virtue deserve praise, can rightly think well of themselves, and can be hopeful, courageous, cheerful, and joyful. The human being, Musonius taught, is the only creature on earth that is the image of the divine. Since we have the same virtues as the gods, he reasoned, we cannot imagine better virtues than intelligence, justice, courage, and self-control. Therefore, a god, since he has these virtues, is stronger than pleasure, greed, desire, envy, and jealousy. A god is magnanimous and both a benefactor to, and a lover of, human beings. Consequently, Musonius reasoned, inasmuch as a human being is a copy of a god, a human being must be considered to be like a god when he acts in accord with nature. The life of a good man is the best life, and death is its end. 

Musonius thought that wealth is no defense against old age: wealth lets people enjoy food, drink, sex, and other pleasures, but it never supplies contentment to a wealthy person nor banishes his grief. Therefore, the good man lives without regret and in accord with nature, accepting death fearlessly and boldly in his old age, and living happily and honorably until the end.

7. Impact

Musonius seems to have acted as an advisor to his friends the Stoic martyrs Rubellius Plautus, Barea Soranus, and Thrasea. Musonius’ son-in-law Artemidorus was judged by Pliny to be the greatest of the philosophers of his day, and Pliny himself professed admiration for Musonius. Musonius’ pupils and followers include Fundanus, the eloquent philosopher Euphrates of Tyre, Timocrates of Heracleia, Athenodotus (the teacher of Marcus Cornelius Fronto), and the golden-tongued orator Dio of Prusa. His greatest student was Epictetus, who mentions him several times in The Discourses of Epictetus written by Epictetus’ student Arrian. Epictetus’ philosophy was deeply influenced by Musonius.

After his death Musonius was admired by philosophers and theologians alike. The Stoic Hierocles and the Christian apologist Clement of Alexandria were strongly influenced by Musonius. Roman emperor Julian the Apostate said that Musonius became famous because he endured his sufferings with courage and sustained with firmness the cruelty of tyrants. Dio of Prusa judged that Musonius enjoyed a reputation greater than any one man had attained for generations and that he was the man who, since the time of the ancients, had lived most nearly in conformity with reason. The Greek sophist Philostratus declared that Musonius was unsurpassed in philosophic ability.

8. References and Further Reading

  • Engel, David M. ‘The Gender Egalitarianism of Musonius Rufus.’ Ancient Philosophy 20 (Fall 2000): 377-391.
  • Geytenbeek, A. C. van. Musonius Rufus and Greek Diatribe. Wijsgerige Texten en Studies 8. Assen: Van Gorcum, 1963.
  • Inwood, Brad. ‘The Legacy of Musonius Rufus.’ Pages 254-276 in From Stoicism to Platonism: The Development of Philosophy, 100 BCE-100 CE. Ed. T. Engberg-Pedersen. New York: Cambridge University Press, 2017.
  • King, Cynthia (trans.). Musonius Rufus. Lectures and Sayings. CreateSpace, 2011.
  • Klassen, William. ‘Musonius Rufus, Jesus, and Paul: Three First-Century Feminists.’ Pages 185-206 in From Jesus to Paul: Studies in Honour of Francis Wright Beare. Ed. P. Richardson and J. C. Hurd. Waterloo, ON: Wilfrid Laurier University Press, 1984.
  • Musonius Rufus. That One Should Disdain Hardships: The Teachings of a Roman Stoic. Trans. C. E. Lutz; Intro. G. Reydams-Schils. New Haven and London: Yale University Press, 2020.
  • Nussbaum, Martha C. ‘The Incomplete Feminism of Musonius Rufus, Platonist, Stoic, and Roman.’ Pages 283-326 in The Sleep of Reason. Erotic Experience and Sexual Ethics in Ancient Greece and Rome. Ed. M. C. Nussbaum and J. Sihvola. Chicago: The University of Chicago Press, 2002.
  • Stephens, William O. ‘The Roman Stoics on Habit.’ Pages 37-65 in A History of Habit: From Aristotle to Bourdieu. Ed. T. Sparrow and A. Hutchinson. Lanham, MD: Lexington Books, 2013.

Author Information

William O. Stephens
Email: stoic@creighton.edu
Creighton University
U. S. A.

The Philosophy of Anthropology

The Philosophy of Anthropology refers to the central philosophical perspectives which underpin, or have underpinned, the dominant schools in anthropological thinking. It is distinct from Philosophical Anthropology, which attempts to define and understand what it means to be human.

This article provides an overview of the most salient anthropological schools, the philosophies which underpin them and the philosophical debates surrounding these schools within anthropology. It specifically operates within these limits because the broader discussions surrounding the Philosophy of Science and the Philosophy of Social Science have been dealt with at length elsewhere in this encyclopedia. Moreover, the specific philosophical perspectives have also been discussed in great depth in other contributions, so they will be elucidated to the extent that this is useful to comprehending their relationship with anthropology. In examining the Philosophy of Anthropology, it is necessary to draw some borders, however cautious, between anthropology and other disciplines. Accordingly, in drawing upon anthropological discussions, we will define, as anthropologists, scholars who identify as such and who publish in anthropological journals and the like. In addition, early anthropologists will be selected by virtue of their interest in peasant culture and non-Western, non-capitalist and stateless forms of human organization.

The article specifically aims to summarize the philosophies underpinning anthropology, focusing on the way in which anthropology has drawn upon them. The philosophies themselves have been dealt with in depth elsewhere in this encyclopedia. It has been suggested by philosophers of social science that anthropology tends to reflect, at any one time, the dominant intellectual philosophy because, unlike the physical sciences, it relies on qualitative methods and so is more easily shaped by ideology (for example Kuznar 1997 or Andreski 1974). This article begins by examining what is commonly termed ‘physical anthropology.’ This is the science-oriented form of anthropology which came to prominence in the nineteenth century. As part of this section, the article also examines early positivist social anthropology, the historical relationship between anthropology and eugenics, and the philosophy underpinning this.

The next section examines naturalistic anthropology. ‘Naturalism,’ in this usage, is drawn from the biological ‘naturalists’ who collected specimens in nature and described them in depth, in contrast to ‘experimentalists.’ Anthropological ‘naturalists’ thus conduct fieldwork with groups of people rather than engage in more experimental methods. The naturalism section looks at the philosophy underpinning the development of ethnography-focused anthropology, including cultural determinism, cultural relativism, fieldwork ethics and the many criticisms which this kind of anthropology has provoked. Differences in its development in Western and Eastern Europe also are analyzed. As part of this, the article discusses the most influential schools within naturalistic anthropology and their philosophical foundations.

The article then examines Post-Modern or ‘Contemporary’ anthropology. This school grew out of the ‘Crisis of Representation’ in anthropology beginning in the 1970s. The article looks at how the Post-Modern critique has been applied to anthropology, and it examines the philosophical assumptions behind developments such as auto-ethnography. Finally, it examines the view that there is a growing philosophical split within the discipline.

Table of Contents

  1. Positivist Anthropology
    1. Physical Anthropology
    2. Race and Eugenics in Nineteenth Century Anthropology
    3. Early Evolutionary Social Anthropology
  2. Naturalist Anthropology
    1. The Eastern European School
    2. The Ethnographic School
    3. Ethics and Participant Observation Fieldwork
  3. Anthropology since World War I
    1. Cultural Determinism and Cultural Relativism
    2. Functionalism and Structuralism
    3. Post-Modern or Contemporary Anthropology
  4. Philosophical Dividing Lines
    1. Contemporary Evolutionary Anthropology
    2. Anthropology: A Philosophical Split?
  5. References and Further Reading

1. Positivist Anthropology

a. Physical Anthropology

Anthropology itself began to develop as a separate discipline in the mid-nineteenth century, as Charles Darwin’s (1809-1882) Theory of Evolution by Natural Selection (Darwin 1859) became widely accepted among scientists. Early anthropologists attempted to apply evolutionary theory within the human species, focusing on physical differences between different human sub-species or racial groups (see Eriksen 2001) and the perceived intellectual differences that followed.

The philosophical assumptions of these anthropologists were, to a great extent, the same assumptions which have been argued to underpin science itself. This is positivism, rooted in Empiricism, which argued that knowledge could be reached only through the empirical method and that statements were meaningful only if they could be empirically justified (though it should be noted that Darwin himself should not necessarily be termed a positivist). On this view, science needed to be solely empirical, systematic, exploratory, logical and theoretical (and thus focused on answering questions). It needed to make predictions which are open to testing and falsification, and it needed to be epistemologically optimistic (assuming that the world can be understood). Equally, positivism argues that truth-statements are value-neutral, something disputed by the postmodern school. Philosophers of science, such as Karl Popper (1902-1994) (for example Popper 1963), have also stressed that science must be self-critical, prepared to abandon long-held models as new information arises, and thus characterized by falsification rather than verification, though this point was suggested earlier by Herbert Spencer (1820-1903) (for example Spencer 1873). Nevertheless, the philosophy of the early physical anthropologists included a belief in empiricism, the fundamentals of logic, and epistemological optimism. This philosophy has been criticized by anthropologists such as Risjord (2007), who has argued that it is not self-aware – because values, he claims, are always involved in science – and that non-neutral scholarship can be useful in science because it forces scientists to contemplate their ideas more thoroughly.

b. Race and Eugenics in Nineteenth Century Anthropology

During the mid-nineteenth and early twentieth centuries, anthropologists began to systematically examine the issue of racial differences, a topic which attracted even more research after the acceptance of evolutionary theory (see Darwin 1871). That said, it should be noted that Darwin himself did not specifically advocate eugenics or theories of progress. However, even prior to Darwin’s presentation of evolution (Darwin 1859), scholars were already attempting to understand ‘races’ and the evolution of societies from ‘primitive’ to complex (for example Tylor 1865).

Early anthropologists such as the Englishman John Beddoe (1826-1911) (Beddoe 1862) or the Frenchman Arthur de Gobineau (1816-1882) (Gobineau 1915) developed and systematized racial taxonomies which divided humanity into, for example, ‘black,’ ‘yellow’ and ‘white’ races. For these anthropologists, societies were reflections of their racial inheritance, a viewpoint termed biological determinism. The concept of ‘race’ has been variously criticized within anthropology as simplistic and as not being a predictive (and thus not a scientific) category (for example Montagu 1945), and there was already some criticism of the scope of its predictive validity in the mid-nineteenth century (for example Pike 1869). The concept has also been criticized on ethical grounds, because racial analysis is seen to promote racial violence and discrimination and to uphold a certain hierarchy, and some have suggested its rejection because of its associations with such regimes as National Socialism or Apartheid, meaning that it is not a neutral category (for example Wilson 2002, 229).

Those anthropologists who continue to employ the category have argued that ‘race’ is predictive in terms of life history, involves only the same inherent problems as any cautiously essentialist taxonomy, and that moral arguments are irrelevant to the scientific usefulness of a category of apprehension (for example Pearson 1991); but, to a great extent, current anthropologists reject racial categorization. The American Anthropological Association’s (1998) ‘Statement on Race’ began by asserting that: ‘”Race” thus evolved as a worldview, a body of prejudgments that distorts our ideas about human differences and group behavior. Racial beliefs constitute myths about the diversity in the human species and about the abilities and behavior of people homogenized into “racial” categories.’ In addition, a 1985 survey by the American Anthropological Association found that only a third of cultural anthropologists (but 59 percent of physical anthropologists) regarded ‘race’ as a meaningful category (Lynn 2006, 15). Accordingly, there is general agreement amongst anthropologists that the idea, promoted by anthropologists such as Beddoe, that there is a racial hierarchy with the white race superior to others involves importing the old ‘Great Chain of Being’ (see Lovejoy 1936) into scientific analysis and should be rejected as unscientific, as should ‘race’ itself. In terms of philosophy, some aspects of nineteenth century racial anthropology might be seen to reflect the theories of progress that developed in the nineteenth century, such as those of G. W. F. Hegel (1770-1831) (see below). In addition, though we will argue that Herderian nationalism is more influential in Eastern Europe, we should not regard it as having no influence at all in British anthropology. Native peasant culture, the staple of the Eastern European, Romantic nationalism-influenced school (as we will see), was studied in nineteenth century Britain, especially in Scotland and Wales, though it was specifically classified as ‘folklore’ and as outside anthropology (see Rogan 2012). However, as we will discuss, the influence is stronger in Eastern Europe.

The interest in race in anthropology developed alongside a broader interest in heredity and eugenics. Influenced by positivism, scholars such as Herbert Spencer (1873) applied evolutionary theory as a means of understanding differences between societies. Spencer was also seemingly influenced, on some level, by theories of progress of the kind advocated by Hegel and even found in Christian theology. For him, evolution logically led to eugenics. Spencer argued that evolution involved a progression through stages of ever increasing complexity – from lower forms to higher forms – to an end-point at which humanity was highly advanced and in a state of equilibrium with nature. For this perfected humanity to be reached, humans needed to engage in self-improvement through selective breeding.

American anthropologist Madison Grant (1865-1937) (Grant 1916), for example, reflected a significant anthropological view in 1916 when he argued that humans, and therefore human societies, were essentially reflections of their biological inheritance and that environmental differences had almost no impact on societal differences. Grant, like other influential anthropologists of the time, advocated a program of eugenics in order to improve the human stock. According to this program, efforts would be made to encourage breeding among the supposedly superior races and social classes and to discourage it amongst the inferior races and classes (see also Galton 1909). This form of anthropology has been criticized for having a motivation other than the pursuit of truth, which has been argued to be the only appropriate motivation for any scientist. It has also been criticized for basing its arguments on a disputed system of categories – race – and for uncritically holding certain assumptions about what is good for humanity (for example Kuznar 1997, 101-109). It should be emphasized that though eugenics was widely accepted among anthropologists in the nineteenth century, there were also those who criticized it and its assumptions (for example Boas 1907; see Stocking 1991 for a detailed discussion). Proponents have countered that a scientist’s motivations are irrelevant as long as his or her research is scientific, that race should not be a controversial category from a philosophical perspective, and that it is for the good of science itself that the more scientifically-minded are encouraged to breed (for example Cattell 1972). As noted, some scholars stress the utility of ideologically-based scholarship.

A further criticism of eugenics is that it fails to recognize the supposed inherent worth of all individual humans (for example Pichot 2009). Advocates of eugenics, such as Grant (1916), dismiss this as a ‘sentimental’ dogma which fails to accept that humans are animals – something that acceptance of evolutionary theory, it is argued, obliges people to accept – and which would lead to the decline of civilization and science itself. We will note possible problems with this perspective in the discussion of ethics below. It might also be mentioned that the form of anthropology sympathetic to eugenics is today centered around an academic journal called The Mankind Quarterly, which critics regard as ‘racist’ (for example Tucker 2002, 2) and even academically biased (for example Ehrenfels 1962). Although ostensibly an anthropology journal, it also publishes psychological research. A prominent example of such an anthropologist is Roger Pearson (b. 1927), the journal’s current editor. But such a perspective is highly marginal in current anthropology.

c. Early Evolutionary Social Anthropology

Also from the middle of the nineteenth century, there developed a school in Western European and North American anthropology which focused less on race and eugenics and more on answering questions relating to human institutions and how they evolved, such as ‘How did religion develop?’ or ‘How did marriage develop?’ This school was known as ‘cultural evolutionism.’ Members of this school, such as Sir James Frazer (1854-1941) (Frazer 1922), were influenced by the positivist view that science was the best model for answering questions about social life. They also shared with other evolutionists an acceptance of a modal human nature which reflected evolution to a specific environment. However, some, such as E. B. Tylor (1832-1917) (Tylor 1871), argued that human nature was the same everywhere, moving away from the focus on human intellectual differences according to race. The early evolutionists believed that, as surviving ‘primitive’ social organizations – within European Empires, for example – were examples of ‘primitive Man,’ the nature of humanity and the origins of its institutions could best be understood through analysis of these various social groups and their relationship with more ‘civilized’ societies (see Gellner 1995, Ch. 2).

As with the biological naturalists, scholars such as Frazer and Tylor collected specimens relating to these groups – in the form of missionary descriptions of ‘tribal life’ or descriptions of ‘tribal life’ by Westernized tribal members – and compared them to accounts of more advanced cultures in order to answer discrete questions. Using this method of accruing sources, now termed ‘armchair anthropology’ by its critics, the early evolutionists attempted to answer such questions about the origins and evolution of societal institutions. As early sociologist Emile Durkheim (1858-1917) (Durkheim 1965) summarized it, such scholars aimed to discover ‘social facts.’ For example, Frazer concluded, based on his sources, that societies evolved from being dominated by a belief in magic, to a belief in spirits, then a belief in gods and ultimately one God. For Tylor, religion began with ‘animism’ and evolved into more complex forms, but tribal animism was the essence of religion and it had developed in order to aid human survival.

This school of anthropology has been criticized for its perceived inclination towards reductionism (such as defining ‘religion’ purely as ‘survival’), its speculative nature and its failure to appreciate the problems inherent in relying on sources, such as ‘gate keepers’ who will present their group in the light in which they want it to be seen. Defenders have countered that without attempting to understand the evolution of societies, social anthropology has no scientific aim and can turn into a political project or a mere description of perceived oddities (for example Hallpike 1986, 13). Moreover, the kind of stage theories advocated by Tylor have been criticized for conflating evolution with historicist theories of progress – for arguing that societies always pass through certain phases of belief and that Western civilization is the pinnacle of development, a belief known as unilinealism. This latter point has been criticized as ethnocentric (for example Eriksen 2001) and reflects some of the thinking of Herbert Spencer, who was influential in early British anthropology.

2. Naturalist Anthropology

a. The Eastern European School

Whereas Western European and North American anthropology was oriented towards studying the peoples within the Empires run by the Western powers and was influenced by Darwinian science, Eastern European anthropology developed among nascent Eastern European nations. This form of anthropology was strongly influenced by Herderian nationalism and ultimately by Hegelian political philosophy and the Romantic Movement associated with the eighteenth century philosopher Jean-Jacques Rousseau (1712-1778). Eastern European anthropologists believed, following the Romantic Movement, that industrial or bourgeois society was corrupt and sterile. The truly noble life was found in the simplicity and naturalness of communities close to nature. The most natural form of community was a nation of people bonded together by shared history, blood and customs, and the most authentic form of such a nation’s lifestyle was to be found amongst its peasants. Accordingly, Eastern European anthropology elevated peasant life as the most natural form of life, a form of life that should, on some level, be strived towards in developing the new ‘nation’ (see Gellner 1995).

Eastern European anthropologists, many of them motivated by Romantic nationalism, focused on studying their own nations’ peasant culture and folklore in order to preserve it, and because the nation was regarded as unique, studying its most authentic manifestation was seen as a good in itself. As such, Eastern European anthropologists engaged in fieldwork amongst the peasants, observing and documenting their lives. There is a degree to which this kind of anthropology – or ‘ethnology’ – remains, at the time of writing, more popular in Eastern than in Western Europe (see, for example, Ciubrinskas 2007 or Sarkany ND).

Siikala (2006) observes that Finnish anthropology is now moving towards the Western model of fieldwork abroad but as recently as the 1970s was still predominantly the study of folklore and peasant culture. Baranski (2009) notes that in Poland, Polish anthropologists who wish to study international topics still tend to go to the international centers while those who remain in Poland tend to focus on Polish folk culture, though the situation is slowly changing. Lithuanian anthropologist Vytis Ciubrinskas (2007) notes that throughout Eastern Europe there is very little separate ‘anthropology,’ with the focus being ‘national ethnology’ and ‘folklore studies,’ almost always published in the vernacular. But, again, he observes that the kind of anthropology popular in Western Europe is making inroads into Eastern Europe. In Russia, national ethnology and peasant culture also tend to be predominant (for example Baiburin 2005). Indeed, even beyond Eastern Europe, it was noted in the year 2000 that ‘the emphasis of Indian social anthropologists remains largely on Indian tribes and peasants. But the irony is that barring the detailed tribal monographs prepared by the British colonial officers and others (. . .) before Independence, we do not have any recent good ethnographies of a comparable type’ (Srivastava 2000). By contrast, Japanese social anthropology has traditionally followed the Western model, studying cultures regarded as more ‘primitive’ than its own (such as Chinese communities), at least in the nineteenth century. Only later did it start to focus more on Japanese folk culture, and it is now moving back towards a Western model (see Sedgwick 2006, 67).

The Eastern school has been criticized for uncritically placing a set of dogmas – specifically nationalism – above the pursuit of truth, accepting a form of historicism with regard to the unfolding of the nation’s history and drawing a sharp, essentialist line around the nationalist period of history (for example Popper 1957). Its anthropological method has been criticized because, it is suggested, Eastern European anthropologists suffer from home blindness. By virtue of having been raised in the culture which they are studying, they cannot see it objectively and penetrate to its ontological presuppositions (for example Kapferer 2001).

b. The Ethnographic School

The Ethnographic school, which has since come to characterize social and cultural anthropology, was developed by Polish anthropologist Bronislaw Malinowski (1884-1942) (for example Malinowski 1922). Originally trained in Poland, Malinowski brought together key aspects of the Eastern and Western schools in his anthropological philosophy. He argued that, as with the Western European school, anthropologists should study foreign societies. This avoided home blindness and allowed them to perceive these societies more objectively. However, as with the Eastern European school, he argued that anthropologists should observe these societies in person, a method termed ‘participant observation’ or ‘ethnography.’ This method, he argued, solved many of the problems inherent in armchair anthropology.

It is this method which anthropologists generally summarize as ‘naturalism,’ in contrast to the ‘positivism,’ usually followed alongside a quantitative method, of evolutionary anthropologists. Naturalist anthropologists argue that their method is ‘scientific’ in the sense that it is based on empirical observation, but they argue that some kinds of information cannot be obtained in laboratory conditions or through questionnaires, both of which lend themselves to quantitative, strictly scientific analysis. Human culturally-influenced actions differ from the subjects of physical science because they involve meaning within a system, and meaning can only be discerned after long-term immersion in the culture in question. Naturalists therefore argue that a useful way to find out information about and understand a people – such as a tribe – is to live with them, observe their lives, gain their trust and eventually live, and even think, as they do. This latter aim, specifically highlighted by Malinowski, has been termed the empathetic perspective and is considered, by many naturalist anthropologists, to be a crucial sign of research that is anthropological. In addition to these ideas, the naturalist perspective draws upon aspects of the Romantic Movement in that it stresses, and elevates, the importance of ‘gaining empathy’ and respecting the group being studied. Some naturalists argue that there are ‘ways of knowing’ other than science (for example Rees 2010) and that respect for the group can be more important than gaining new knowledge. They also argue that human societies are so complex that they cannot simply be reduced to biological explanations.

In many ways, the successor to Malinowski as the most influential cultural anthropologist was the American Clifford Geertz (1926-2006). Where Malinowski emphasized ‘participant observation’ – and thus, to a greater degree, an outsider perspective – it was Geertz who argued that the successful anthropologist reaches a point where he sees things from the perspective of the native. The anthropologist should bring alive the native point of view, which Roth (1989) notes ‘privileges’ the native, thus challenging a hierarchical relationship between the observed and the observer. He thus strongly rejected a distinction of which Malinowski was merely critical: the distinction between a ‘primitive’ and a ‘civilized’ culture. In many respects, this distinction was also criticized by the Structuralists – whose central figure, Claude Levi-Strauss (1908-2009), belonged to an earlier generation than Geertz – as they argued that all human minds involve similar binary structures (see below).

However, there was a degree to which both Malinowski and Geertz did not divorce ‘culture’ from ‘biology.’ Malinowski (1922) argued that anthropological interpretations should ultimately be reducible to human instincts while Geertz (1973, 46-48) argued that culture can be reduced to biology and that culture also influences biology, though he felt that the main aim of the ethnographer was to interpret. Accordingly, it is not for the anthropologist to comment on the culture in terms of its success or the validity of its beliefs. The anthropologist’s purpose is merely to record and interpret.

The majority of those who practice this form of anthropology are interpretivists. They argue that the aim of anthropology is to understand the norms, values, symbols and processes of a society and, in particular, their ‘meaning’ – how they fit together. This lends itself to the more subjective methods of participant observation. Applying a positivist methodology to studying social groups is regarded as dangerous because scientific understanding, it is argued, leads to greater control over the world and, in this case, over people. Interpretivist anthropology has been criticized, variously, as being indebted to imperialism (see below) and as too subjective and unscientific because, unless there is a common set of analytical standards (such as an acceptance of the scientific method, at least to some extent), there is no reason to accept one subjective interpretation over another. This criticism has, in particular, been leveled against naturalists who accept cultural relativism (see below).

Also, many naturalist anthropologists emphasize the separateness of ‘culture’ from ‘biology,’ arguing that culture cannot simply be traced back to biology but rather is, to a great extent, independent of it; a separate category. For example, Risjord (2000) argues that anthropology ‘will never reach the social reality at which it aims’ precisely because ‘culture’ cannot simply be reduced to a series of scientific explanations. But it has been argued that if the findings of naturalist anthropology are not ultimately consilient with science then they are not useful to people outside of naturalist anthropology and that naturalist anthropology draws too stark a line between apes and humans when it claims that human societies are too complex to be reduced to biology or that culture is not closely reflective of biology (Wilson 1998, Ch. 1). In this regard, Bidney (1953, 65) argues that, ‘Theories of culture must explain the origins of culture and its intrinsic relations to the psychobiological nature of man’ as to fail to do so simply leaves the origin of culture as a ‘mystery or an accident of time.’

c. Ethics and Participant Observation Fieldwork

From the 1970s, the various leading anthropological associations began to develop codes of ethics. This was, at least in part, inspired by the perceived collaboration of anthropologists with the US-led counterinsurgency groups in South American states. For example, in the 1960s, Project Camelot commissioned anthropologists to look into the causes of insurgency and revolution in South American States, with a view to confronting these perceived problems. It was also inspired by the way that increasing numbers of anthropologists were employed outside of universities, in the private sector (see Sluka 2007).

The leading anthropological bodies – such as the Royal Anthropological Institute – hold to a system of research ethics which anthropologists conducting fieldwork are expected, though not obliged, to adhere to. For example, the most recent American Anthropological Association Code of Ethics (1998) emphasizes that certain ethical obligations can supersede the goal of seeking new knowledge. Anthropologists, for example, may not publish research which may harm the ‘safety,’ ‘privacy’ or ‘dignity’ of those whom they study; they must explain their fieldwork to their subjects and emphasize that attempts at anonymity may sometimes fail; they should find ways of reciprocating to those whom they study; and they should preserve opportunities for future fieldworkers.

Though the American Anthropological Association does not make its philosophy explicit, much of that philosophy appears to be underpinned by the golden rule: one should treat others as one would wish to be treated oneself. In this regard, one would not wish to be exploited, misled, or have one’s safety or privacy compromised. For some scientists, the problem with such a philosophy is that, from their perspective, humans should be an objective object of study like any other. The assertion that the ‘dignity’ of the individual should be preserved may be seen to reflect a humanist belief in the inherent worth of each human being. Humanism has been accused of being sentimental and of failing to appreciate the substantial intellectual differences between human beings, with some anthropologists even questioning the usefulness of the broad category ‘human’ (for example Grant 1916). It has also been accused of failing to appreciate that, from a scientific perspective, humans are a highly evolved form of ape, and that scholars who study them should attempt to think, as Wilson (1975, 575) argues, as if they are alien zoologists. Equally, it has been asked why the primary ethical responsibility should be to those studied. Why should it not be to the public or the funding body? (see Sluka 2007) In this regard, it might be suggested that the code reflects a lauding of members of (often non-Western) cultures which might ultimately be traced back to the Romantic Movement: their rights are more important than those of the funders, the public or other anthropologists.

Equally, the code has been criticized in terms of power dynamics, with critics arguing that the anthropologist is usually in a dominant position over those being studied which renders questionable the whole idea of ‘informed consent’ (Bourgois 2007). Indeed, it has been argued that the most recent American Anthropological Association Code of Ethics (1998) is a movement to the right, in political terms, because it accepts, explicitly, that responsibility should also be to the public and to funding bodies and is less censorious than previous codes with regard to covert research (Pels 1999). This seems to be a movement towards a situation where a commitment to the group being studied is less important than the pursuit of truth, though the commitment to the subject of study is still clear.

Likewise, the most recent set of ethical guidelines from the Association of Anthropologists of the UK and the Commonwealth implicitly accepts that there is a difference of opinion among anthropologists regarding to whom they are obligated. It asserts, ‘Most anthropologists would maintain that their paramount obligation is to their research participants . . .’ This document specifically warns against giving subjects ‘self-knowledge which they did not seek or want.’ This may be seen to reflect a belief in a form of cultural relativism: permitting people to preserve their way of thinking is more important than their knowing what a scientist would regard as the truth. Their way of thinking – a part of their culture – should be respected, because it is theirs, even if it is inaccurate. This could conceivably prevent anthropologists from publishing dissections of particular cultures if they might be read by members of that culture (see Dutton 2009, Ch. 2). Thus, philosophically, the debate in fieldwork ethics ranges from a form of consequentialism to, in the form of humanism, a deontological form of ethics. However, it should be emphasized that the standard fieldwork ethics noted here are very widely accepted amongst anthropologists, particularly with regard to informed consent. Thus, the idea of experimenting on unwilling or unknowing humans is strongly rejected, which might be interpreted to imply some belief in human separateness.

3. Anthropology since World War I

a. Cultural Determinism and Cultural Relativism

As already discussed, Western European anthropology, around the time of World War I, was influenced by eugenics and biological determinism. But as early as the 1880s, this was beginning to be questioned by German-American anthropologist Franz Boas (1858-1942) (for example Boas 1907), based at Columbia University in New York. He was critical of biological determinism and argued for the importance of environmental influence on individual personality and thus modal national personality in a way of thinking called ‘historical particularism.’

Boas emphasized the importance of environment and history in shaping different cultures, arguing that all humans were biologically relatively similar and rejecting distinctions of ‘primitive’ and ‘civilized.’ Boas also presented critiques of the work of early evolutionists, such as Tylor, demonstrating that not all societies passed through the phases he suggested, or did not do so in the order he suggested. Boas used these findings to stress the importance of understanding societies individually, in terms of their own history and culture (for example Freeman 1983).

Boas sent his student Margaret Mead (1901-1978) to American Samoa to study the people there, with the aim of proving that they were a ‘negative instance’ in terms of violence and teenage angst. If this could be proven, it would undermine biological determinism and demonstrate that people were in fact culturally determined and that biology had very little influence on personality, echoing John Locke’s (1632-1704) concept of the tabula rasa. This would in turn mean that Western people’s supposed teenage angst could be changed by changing the culture. After six months in American Samoa, Mead returned to the USA and published, in 1928, her influential book Coming of Age in Samoa: A Psychological Study of Primitive Youth for Western Civilization (Mead 1928). It portrayed Samoa as a society of sexual liberty in which there were none of the problems associated with puberty in Western civilization. Accordingly, Mead argued that she had found a negative instance and that humans were overwhelmingly culturally determined. At around the same time, Ruth Benedict (1887-1948), also a student of Boas’s, published research in which she argued that individuals simply reflected the ‘culture’ in which they were raised (Benedict 1934).

The cultural determinism advocated by Boas, Benedict and especially Mead became very popular and developed into a school which has been termed ‘Multiculturalism’ (Gottfried 2004). This school can be compared to Romantic nationalism in the sense that it regards all cultures as unique developments which should be preserved, and it thus advocates a form of ‘cultural relativism’ in which cultures cannot be judged by the standards of other cultures and can only be comprehended in their own terms. However, it should be noted that ‘cultural relativism’ is sometimes also used to refer to the way in which the parts of a whole form a kind of separate organism, though this is usually referred to as ‘Functionalism.’ In addition, Harris (see Headland, Pike, and Harris 1990) distinguishes between ‘emic’ (insider) and ‘etic’ (outsider) understandings of a social group, arguing that both perspectives seem to make sense from their different viewpoints. This might also be understood as cultural relativism and perhaps raises the question of whether the two worlds can so easily be separated. Cultural relativism also argues, as with Romantic nationalism, that so-called developed cultures can learn a great deal from those which they might regard as ‘primitive’ cultures. Moreover, humans are regarded as, in essence, products of culture and as extremely similar in terms of biology.

Cultural relativism led so-called ‘cultural anthropologists’ to focus on the symbols within a culture rather than comparing the different structures and functions of different social groups, as occurred in ‘social anthropology’ (see below). Because each culture was regarded as unique, comparison was frowned upon, and anthropology in the tradition of Mead tended to focus on descriptions of a group’s way of life. Thick description is a trait of ethnography more broadly, but it is especially salient amongst anthropologists who believe that cultures can only be understood in their own terms. Such a philosophy has been criticized for turning anthropology into little more than academic-sounding travel writing, because it renders it highly personal and lacking in comparative analysis (see Sandall 2001, Ch. 1).

Cultural relativism has also been criticized as philosophically impractical and, ultimately, epistemologically pessimistic (Scruton 2000), because it means that nothing can be compared to anything else or even assessed through the medium of a foreign language’s categories. In implicitly defending cultural relativism, anthropologists have cautioned against assuming that some cultures are more ‘rational’ than others. Hollis (1967), for example, argues that anthropology demonstrates that superficially irrational actions may become ‘rational’ once the ethnographer understands the ‘culture.’ Risjord (2000) makes a similar point. This implies that the cultures are separate worlds, ‘rational’ in themselves. Others have suggested that entering the field assuming that the Western, ‘rational’ way of thinking is correct can lead to biased fieldwork interpretation (for example Rees 2010).

Critics have argued that certain forms of behaviour can be regarded as undesirable in all cultures, yet are only prevalent in some. It has also been argued that Multiculturalism is a form of Neo-Marxism on the grounds that it assumes imperialism and Western civilization to be inherently problematic but also because it lauds the materially unsuccessful. Whereas Marxism extols the values and lifestyle of the worker, and critiques that of the wealthy, Multiculturalism promotes “materially unsuccessful” cultures and critiques more materially successful, Western cultures (for example Ellis 2004 or Gottfried 2004).

Cultural determinism has been criticized both from within and from outside anthropology. From within anthropology, New Zealand anthropologist Derek Freeman (1916-2001), having been heavily influenced by Margaret Mead, conducted his own fieldwork in Samoa around twenty years after she did, and again in subsequent fieldwork visits. As he stayed there far longer than Mead, Freeman was accepted to a greater extent and given an honorary chiefly title. This allowed him considerable access to Samoan life. Eventually, in 1983 (after Mead’s death) he published his refutation: Margaret Mead and Samoa: The Making and Unmaking of an Anthropological Myth (Freeman 1983). In it, he argued that Mead was completely mistaken. Samoa was sexually puritanical and violent, and its teenagers experienced just as much angst as teenagers everywhere else. In addition, he highlighted serious faults with her fieldwork: her sample was very small, she chose to live at the American naval base rather than with a Samoan family, she did not speak Samoan well, and she focused mainly on teenage girls; Freeman even tracked one down who, as an elderly lady, admitted she and her friends had deliberately lied to Mead about their sex lives for their own amusement (Freeman 1999). It should be emphasized that Freeman’s critique of Mead related to her failure to conduct participant observation fieldwork properly (in line with Malinowski’s recommendations). In that Freeman rejected distinctions between primitive and advanced, and stressed the importance of culture in understanding human differences, his critique is also in the tradition of Boas. However, it should be noted that Freeman’s (1983) critique of Mead has also been criticized as unnecessarily cutting, prosecuting a case against Mead to the point of bias against her and ignoring points which Mead got right (Schankman 2009, 17).

There remains an ongoing debate about the extent to which culture reflects biology or is on a biological leash. However, a growing body of research in genetics indicates that human personality is heavily influenced by genetic factors (for example Alarcon, Foulks, and Vakkur 1998 or Wilson 1998), though some research also indicates that environment, especially in utero, can alter the expression of genes (see Nettle 2007). This has become part of the critique of cultural determinism from evolutionary anthropologists.

b. Functionalism and Structuralism

Between the 1930s and 1970s, various forms of functionalism were influential in British social anthropology. These schools accepted, to varying degrees, the cultural determinist belief that ‘culture’ was a sphere separate from biology, operating according to its own rules, but they also argued that social institutions could be compared in order to better discern the rules governing them. They attempted to discern and describe how cultures operated and how the different parts of a culture functioned within the whole. Perceiving societies as organisms has been traced back to Herbert Spencer, and there is a degree to which Durkheim (1995) attempted to understand, for example, the function of religion in society. But functionalism also reflected aspects of positivism: the search, in this case, for social facts that were cross-culturally true, based on empirical evidence.

E. E. Evans-Pritchard (1902-1973) was a leading British functionalist from the 1930s onwards. Rejecting grand theories of religion, he argued that a tribe’s religion could only make sense in terms of its function within society, and that a detailed understanding of the tribe’s history and context was therefore necessary; accordingly, he engaged in lengthy fieldwork. British functionalism, in this respect, was influenced by the linguistic theories of the Swiss thinker Ferdinand de Saussure (1857-1913), who suggested that signs only made sense within a system of signs. This school developed into ‘structural functionalism.’ A. R. Radcliffe-Brown (1881-1955) is often argued to be a structural functionalist, though he denied this. Radcliffe-Brown rejected Malinowski’s functionalism – which argued that social practices were grounded in human instincts. Instead, he was influenced by the process philosophy of Alfred North Whitehead (1861-1947). Radcliffe-Brown claimed that the units of anthropology were processes of human life and interaction. These are in constant flux, and so anthropology must explain social stability. He argued that practices, in order to survive, must adapt to other practices, something called ‘co-adaptation’ (Radcliffe-Brown 1957). It might be argued that this leaves us asking where any of the practices came from in the first place.

A leading member of the structural functionalist school was the Scottish anthropologist Victor Turner (1920-1983). Structural functionalists attempted to understand society as a structure with inter-related parts. In attempting to understand Rites of Passage, Turner argued that everyday structured society could be contrasted with the Rite of Passage (Turner 1969): a liminal (transitional) phase which involved communitas (a relative breakdown of structure). Another prominent anthropologist in this field was Mary Douglas (1921-2007). She examined the contrast between the ‘sacred’ and the ‘profane’ in terms of categories of ‘purity’ and ‘impurity’ (Douglas 1966). She also suggested a model – the Grid/Group Model – through which the structures of different cultures could be categorized (Douglas 1970). Philosophically, this school accepted many of the assumptions of naturalism, but it held to aspects of positivism in that it aimed to answer discrete questions using the ethnographic method. It has been criticized, as we will see below, by postmodern anthropologists, and also for its failure to attempt consilience with science.

Turner, Douglas and other anthropologists in this school followed Malinowski in using categories drawn from the study of ‘tribal’ cultures – such as Rites of Passage, Shaman and Totem – to better comprehend advanced societies such as that of Britain. For example, Turner was highly influential in pursuing the Anthropology of Religion, in which he used tribal categories as a means of comprehending aspects of the Catholic Church, such as modern-day pilgrimage (Turner and Turner 1978). This research also involved using the participant observation method. Critics, such as the Romanian anthropologist Mircea Eliade (1907-1986) (for example Eliade 2004), have insisted that categories such as ‘shaman’ only make sense within their specific cultural context. Other critics have argued that such scholarship attempts to reduce all societies to the level of the local community, despite there being many important differences, and fails to take into account considerable differences in societal complexity (for example Sandall 2001, Ch. 1). Nevertheless, there is a growing movement within anthropology towards examining various aspects of human life through the so-called tribal prism and, more broadly, through the cultural one. Mary Douglas, for example, has looked at business life anthropologically, while others have focused on politics, medicine or education. This approach has been termed ‘traditional empiricism’ by critics in contemporary anthropology (for example Davies 2010).

In France, in particular, the most prominent school during this period was known as Structuralism. Unlike British functionalism, structuralism was influenced by Hegelian idealism. Most associated with Claude Levi-Strauss, structuralism argued that all cultures follow the Hegelian dialectic. The human mind has a universal structure and a kind of a priori category system of opposites, a point which Hollis argues can be used as a starting point for any comparative cultural analysis. Cultures can be broken up into components – such as ‘Mythology’ or ‘Ritual’ – which evolve according to the dialectical process, leading to cultural differences. As such, the deep structures, or grammar, of each culture can be traced back to a shared starting point (and, in a sense, the shared human mind), just as one can with a language. But each culture has a grammar, and this allows cultures to be compared and permits insights to be made about them (see, for example, Levi-Strauss 1978). It might be suggested that the same criticisms that have been leveled against the Hegelian dialectic might be leveled against structuralism, such as the charge that it is based on dogma. It has also been argued that category systems vary considerably between cultures (see Diamond 1974). Even supporters of Levi-Strauss have conceded that his works are opaque and verbose (for example Leach 1974).

c. Post-Modern or Contemporary Anthropology

The ‘postmodern’ thinking of scholars such as Jacques Derrida (1930-2004) and Michel Foucault (1926-1984) began to influence anthropology in the 1970s, a development that has been termed anthropology’s ‘Crisis of Representation.’ During this crisis, which many anthropologists regard as ongoing, every aspect of ‘traditional empirical anthropology’ came to be questioned.

Hymes (1974) criticized anthropologists for imposing ‘Western categories’ – such as Western measurement – on those they study, arguing that this was a form of domination and immoral, and insisting that truth statements were always subjective and carried cultural values. Talal Asad (1971) criticized fieldwork-based anthropology for ultimately being indebted to colonialism and suggested that anthropology has essentially been a project to enforce colonialism. Geertzian anthropology was criticized because it involved representing a culture, something which inherently involved imposing Western categories upon it through producing texts. Marcus argued that anthropology was ultimately composed of ‘texts’ – ethnographies – which can be deconstructed to reveal power dynamics, normally the dominant-culture anthropologist making sense of the oppressed object of study through his or her subjective cultural categories and presenting it to his or her own culture (for example Marcus and Cushman 1982). By extension, they argued, since all texts – including scientific texts – could be deconstructed, no text can make objective assertions. Roth (1989) specifically criticizes seeing anthropology as ‘texts,’ arguing that doing so neither undermines the empirical validity of the observations involved nor helps to find the power structures.

Various anthropologists, such as Roy Wagner (b. 1938) (Wagner 1981), argued that anthropologists were simply products of Western culture and could only ever hope to understand another culture through their own. There was no objective truth beyond culture, simply different cultures, with some – the scientific ones – happening to be dominant for various historical reasons. Thus, this school strongly advocated cultural relativism. Critics have countered that, after Malinowski, anthropologists, with their participant observation breaking down the color bar, were in fact an irritation to colonial authorities (for example Kuper 1973), and have criticized cultural relativism, as discussed.

This situation led to what has been called the ‘reflexive turn’ in cultural anthropology. As Western anthropologists were products of their culture, just as those whom they studied were, and as the anthropologist was himself fallible, there developed an increasing movement towards ‘auto-ethnography,’ in which anthropologists analyze their own emotions and feelings towards their fieldwork. The essential case for this detailed self-analysis is Charlotte Davies’ (1999, 6) argument that the ‘purpose of research is to mediate between different constructions of reality, and doing research means increasing understanding of these varying constructs, among which is included the anthropologist’s own constructions’ (see Curran 2010, 109). But implicit in Davies’ argument is that there is no such thing as objective reality and objective truth; there are simply different constructions of reality, as Wagner (1981) also argues. It has also been argued that auto-ethnography is ‘emancipatory’ because it turns anthropology into a dialogue rather than a traditional hierarchical analysis (Heaton-Shreshta 2010, 49). Auto-ethnography has been criticized as self-indulgent and as based on problematic assumptions such as cultural relativism and the belief that morality is the most important dimension of scholarship (for example Gellner 1992). In addition, the same criticisms that have been leveled against postmodernism more broadly have been leveled against postmodern anthropology, including criticism of a sometimes verbose and emotive style and the belief that it is epistemologically pessimistic and therefore leads to a Void (for example Scruton 2000). However, cautious defenders insist on the importance of being at least ‘psychologically aware’ (for example Emmet 1976) before conducting fieldwork, a point also argued by Popper (1963) with regard to conducting any scientific research. And Berger (2010) argues that auto-ethnography can be useful to the extent that it elucidates how a ‘social fact’ was uncovered by the anthropologist.

One of the significant results of the ‘Crisis of Representation’ has been a cooling towards the concept of ‘culture’ (and indeed ‘culture shock’), which was previously central to ‘cultural anthropology’ (see Oberg 1960 or Dutton 2012). ‘Culture’ has been criticized as old-fashioned, boring and problematic because it possesses a history (Rees 2010); as associated with racism because it has come to replace ‘race’ in far-right politics (Wilson 2002, 229); as problematic because it imposes (imperialistically) a Western category on other cultures; as vague and difficult to perfectly define (Rees 2010); as helping to maintain a hierarchy of cultures (Abu-Lughod 1991); and as increasingly questioned by globalization and the breakdown of discrete cultures (for example Eriksen 2002 or Rees 2010). Defenders of culture have countered that many of these criticisms can be leveled against any category of apprehension and that the term is not synonymous with ‘nation,’ so it can be employed even if nations become less relevant (for example Fox and King 2002). Equally, ‘culture shock,’ formerly used to describe a rite of passage amongst anthropologists engaging in fieldwork, has been criticized because of its association with culture and also as old-fashioned (Crapanzano 2010).

In addition, a number of further movements have been provoked by the postmodern movement in anthropology. One of these is ‘Sensory Ethnography’ (for example Pink 2009). It has been argued that traditionally anthropology privileges the Western emphasis on sight and the word and that ethnographies, in order to avoid this kind of cultural imposition, need to look at other senses such as smell, taste and touch. Another movement, specifically in the Anthropology of Religion, has argued that anthropologists should not go into the field as agnostics but should accept the possibility that the religious perspective of the group which they are studying may actually be correct and even work on the assumption that it is and engage in analysis accordingly (a point discussed in Engelke 2002).

During the same period, schools within anthropology developed based around a number of other fashionable philosophical ideologies. Feminist anthropology, like postmodern anthropology, began to come to prominence in the early 1970s. Philosophers such as Sandra Harding (1991) argued that anthropology had been dominated by men, which had led to anthropological interpretations being androcentric and to a failure to appreciate the importance of women in social organizations. It had also led to androcentric metaphors in anthropological writing and to a focus on research questions that mainly concern men. Strathern (1988) uses what she calls a Marxist-Feminist approach. She employs the categories of Melanesia in order to understand Melanesian gender relations and produce an ‘endogenous’ analysis of the situation. In doing so, she argues that actions in Melanesia are gender-neutral and that the asymmetry between males and females is ‘action-specific.’ Thus, Melanesian women are not in any permanent state of social inferiority to men. In other words, if there is a sexual hierarchy, it is de facto rather than de jure.

Critics have countered that prominent feminist interpretations have simply turned out to be empirically inaccurate. For example, feminist anthropologists, such as Weiner (1992), as well as Frances Dahlberg (1981), argued that foraging societies prized females and were peaceful and sexually egalitarian. It has been countered that this is a projection of feminist ideals which does not match the facts (Kuznar 1997, Ch. 3). It has also been argued that it does not follow that just because anthropology is male-dominated it is thus biased (Kuznar 1997, Ch. 3). However, feminist anthropologist Alison Wylie (see Risjord 1997) has argued that ‘politically motivated critiques,’ including feminist ones, can improve science. Feminist critique, she argues, demonstrates the influence of ‘androcentric values’ on theory, which forces scientists to hone their theories.

Another school, composed of some anthropologists from less developed countries or their descendants, has proffered a similar critique, shifting the feminist charge that anthropology is androcentric into the argument that it is Euro-centric. It has been argued that anthropology is dominated by Europeans, and specifically by Western Europeans and those of Western European descent, and therefore reflects European thinking and bias. For example, anthropologists from developing countries, such as the Greenlandic Karla Jessen-Williamson, have argued that anthropology would benefit from the more holistic, intuitive thinking of non-Western cultures and that this should be integrated into anthropology (for example Jessen-Williamson 2006). American anthropologist Lee Baker (1991) describes himself as ‘Afro-Centric’ and argues that anthropology must be critiqued for being based on a ‘Western’ and ‘positivistic’ tradition which is thus biased in favour of Europe. Afrocentric anthropology aims to shift this to an African (or African American) perspective. He argues that metaphors in anthropology, for example, are Euro-centric and justify the suppression of Africans. Thus, Afrocentric anthropologists wish to construct an ‘epistemology’ the foundations of which are African. The criticisms leveled against cultural relativism have also been leveled against such perspectives (see Levin 2005).

4. Philosophical Dividing Lines

a. Contemporary Evolutionary Anthropology

The positivist, empirical philosophy already discussed broadly underpins current evolutionary anthropology, and there is an extent to which it therefore crosses over with biology. This is in line with the Consilience model advocated by Harvard biologist Edward Wilson (b. 1929) (Wilson 1998), who has argued that the social sciences must attempt to be scientific, in order to share in the success of science, and must therefore be reducible to the science which underpins them. Contemporary evolutionary anthropologists accordingly follow the scientific method, and often a quantitative methodology, to answer discrete questions, and attempt to orient anthropological research within biology and the latest discoveries in that field. Some scholars, such as Derek Freeman (1983), have defended a more qualitative methodology but nevertheless argued that their findings must ultimately be underpinned by scientific research.

For example, anthropologist Pascal Boyer (2001) has attempted to understand the origins of ‘religion’ by drawing upon the latest research in genetics and in particular research into the functioning of the human mind. He has examined this alongside evidence from participant observation in an attempt to ‘explain’ religion. This subsection of evolutionary anthropology has been termed ‘Neuro-anthropology’ and attempts to better understand ‘culture’ through the latest discoveries in brain science. There are many other schools which apply different aspects of evolutionary theory – such as behavioral ecology, evolutionary genetics, paleontology and evolutionary psychology – to understanding cultural differences and different aspects of culture or subsections of culture such as ‘religion.’ Some scholars, such as Richard Dawkins (b. 1941) (Dawkins 1976), have attempted to render the study of culture more systematic by introducing the concept of cultural units – memes – and attempting to chart how and why certain memes are more successful than others, in light of research into the nature of the human brain.

Critics, in naturalist anthropology, have suggested that evolutionary anthropologists are insufficiently critical and go into the field thinking they already know the answers (for example Davies 2010). They have also argued that evolutionary anthropologists fail to appreciate that there are ways of knowing other than science. Some critics have also argued that evolutionary anthropology, with its acceptance of personality differences based on genetics, may lead to the maintenance of class and race hierarchies and to racism and discrimination (see Segerstråle 2000).

b. Anthropology: A Philosophical Split?

It has been argued both by scholars and journalists that anthropology, more so than other social scientific disciplines, is rent by a fundamental philosophical divide, though some anthropologists have disputed this and suggested that qualitative research can help to answer scientific research questions as long as naturalistic anthropologists accept the significance of biology.

The divide is trenchantly summarized by Lawson and McCauley (1993), who divide anthropologists between ‘interpretivists’ and ‘scientists,’ or, as noted above, ‘positivists’ and ‘naturalists.’ For the scientists, the views of the ‘cultural anthropologists’ (as they call themselves) are too speculative, especially because pure ethnographic research is subjective, and are meaningless where they cannot be reduced to science. For the interpretivists, the ‘evolutionary anthropologists’ are too ‘reductionistic’ and ‘mechanistic,’ do not appreciate the benefits of a subjective approach (such as garnering information that could not otherwise be garnered), and ignore questions of ‘meaning,’ as they suffer from ‘physics envy.’

Some anthropologists, such as Risjord (2000, 8), have criticized this divide, arguing that the two perspectives can be united and that only through ‘explanatory coherence’ (combining objective analysis of a group with the face-value beliefs of the group members) can a fully coherent explanation be reached. Otherwise, anthropology will ‘never reach the social reality at which it aims.’ But this seems to raise the question of what it means to ‘reach the social reality.’

In institutional terms, the split is already happening, as discussed in Segal and Yanagisako (2005, Ch. 1). They note that some American anthropological departments demand that their lecturers be committed to holist ‘four field anthropology’ (archaeology, cultural, biological and linguistic) precisely because of this ongoing split, and in particular the divergence between biological and cultural anthropology. They observe that already by the end of the 1980s most biological anthropologists had left the American Anthropological Association. Though they argue that ‘holism’ was less necessary in Europe – because of the way that US anthropology, in focusing on Native Americans, ‘bundled’ the four fields – Fearn (2008) notes that there is a growing divide in British anthropology departments as well, along the same dividing lines of positivism and naturalism.

Evolutionary anthropologists and, in particular, postmodern anthropologists do seem to follow philosophies with essentially different presuppositions. In November 2010, this divide became particularly contentious when the American Anthropological Association voted to remove the word ‘science’ from its Mission Statement (Berrett 2010).

5. References and Further Reading

  • Abu-Lughod, Lila. 1991. “Writing Against Culture.” In Richard Fox (ed.), Recapturing Anthropology: Working in the Present (pp. 466-479). Santa Fe: School of American Research Press.
  • Alarcon, Renato, Foulks, Edward and Vakkur, Mark. 1998. Personality Disorders and Culture: Clinical and Conceptual Interactions. New York: John Wiley and Sons.
  • American Anthropological Association. 1998. “American Anthropological Association Statement on “Race.”” 17 May.
  • Andreski, Stanislav. 1974. Social Sciences as Sorcery. London: Penguin.
  • Asad, Talal. 1971. “Introduction.” In Talal Asad (ed.), Anthropology and the Colonial Encounter. Atlantic Highlands: Humanities Press.
  • Baiburin, Albert. 2005. “The Current State of Ethnography and Anthropology in Russia.” For Anthropology and Culture 2, 448-489.
  • Baker, Lee. 1991. “Afro-Centric Racism.” University of Pennsylvania: African Studies Center.
  • Barenski, Janusz. 2008. “The New Polish Anthropology.” Studia Ethnologica Croatica 20, 211-222.
  • Beddoe, John. 1862. The Races of Britain: A Contribution to the Anthropology of Western Europe. London.
  • Benedict, Ruth. 1934. Patterns of Culture. Boston: Houghton Mifflin.
  • Berger, Peter. 2010. “Assessing the Relevance and Effects of “Key Emotional Episodes” for the Fieldwork Process.” In Dimitrina Spencer and James Davies (eds.), Anthropological Fieldwork: A Relational Process (pp. 119-143). Newcastle: Cambridge Scholars Press.
  • Berrett, Daniel. 2010. “Anthropology Without Science.” Inside Higher Ed, 30 November.
  • Bidney, David. 1953. Theoretical Anthropology. New York: Columbia University Press.
  • Boas, Franz. 1907. The Mind of Primitive Man. New York: MacMillan.
  • Bourgois, Philippe. 2007. “Confronting the Ethics of Ethnography: Lessons from Fieldwork in Central America.” In Antonius Robben and Jeffrey Slukka (eds.), Ethnographic Fieldwork: An Anthropological Reader (pp. 288-297). Oxford: Blackwell.
  • Boyer, Pascal. 2001. Religion Explained: The Human Instincts That Fashion Gods, Spirits and Ancestors. London: William Heinemann.
  • Cattell, Raymond. 1972. Beyondism: A New Morality from Science. New York: Pergamon.
  • Ciubrinskas, Vytis. 2007. “Interview: “Anthropology is Badly Needed in Eastern Europe.”” Antropologi.info.
  • Curran, John. 2010. “Emotional Interaction and the Acting Ethnographer: An Ethical Dilemma?” In Dimitrina Spencer and James Davies (eds.), Anthropological Fieldwork: A Relational Process (pp. 100-118). Newcastle: Cambridge Scholars Press.
  • Crapanzano, Vincent. 2010. ““At the Heart of the Discipline”: Critical Reflections on Fieldwork.” In James Davies and Dimitrina Spencer (eds.), Emotions in the Field: The Psychology and Anthropology of Fieldwork Experience (pp. 55-78). Stanford: Stanford University Press.
  • Dahlberg, Frances. 1981. “Introduction.” In Frances Dahlberg (ed.), Woman the Gatherer (pp. 1-33). New Haven: Yale University Press.
  • Darwin, Charles. 1871. The Descent of Man. London: John Murray.
  • Darwin, Charles. 1859. The Origin of Species. London: John Murray.
  • Davies, James. 2010. “Conclusion: Subjectivity in the Field: A History of Neglect.” In Dimitrina Spencer and James Davies (eds.), Anthropological Fieldwork: A Relational Process (pp. 229-243). Newcastle: Cambridge Scholars Publishing.
  • Davies, Charlotte. 1999. Reflexive Ethnography: A Guide to Researching Selves and Others. London: Routledge.
  • Dawkins, Richard. 1976. The Selfish Gene. Oxford: Oxford University Press.
  • Diamond, Stanley. 1974. In Search of the Primitive. New Brunswick: Transaction Books.
  • Douglas, Mary. 1970. Natural Symbols: Explorations in Cosmology. London: Routledge.
  • Douglas, Mary. 1966. Purity and Danger: An Analysis of the Concepts of Pollution and Taboo. London: Routledge.
  • Durkheim, Emile. 1995. The Elementary Forms of Religious Life. New York: Free Press.
  • Dutton, Edward. 2012. Culture Shock and Multiculturalism. Newcastle: Cambridge Scholars Publishing.
  • Ehrenfels, Umar Rolf, Madan, Triloki Nath, and Comas, Juan. 1962. “Mankind Quarterly Under Heavy Criticism: 3 Comments on Editorial Practices.” Current Anthropology 3, 154-158.
  • Eliade, Mircea. 2004. Shamanism: Archaic Techniques of Ecstasy. Princeton: Princeton University Press.
  • Ellis, Frank. 2004. Political Correctness and the Theoretical Struggle: From Lenin and Mao to Marcus and Foucault. Auckland: Maxim Institute.
  • Emmet, Dorothy. 1976. “Motivation in Sociology and Social Anthropology.” Journal for the Theory of Social Behaviour 6, 85-104.
  • Engelke, Matthew. 2002. “The Problem of Belief: Evans-Pritchard and Victor Turner on the Inner Life.” Anthropology Today 18, 3-8.
  • Eriksen, Thomas Hylland. 2003. “Introduction.” In Thomas Hylland Eriksen (ed.), Globalisation: Studies in Anthropology (pp. 1-17). London: Pluto Press.
  • Eriksen, Thomas Hylland. 2001. A History of Anthropology. London: Pluto Press.
  • Fearn, Hannah. 2008. “The Great Divide.” Times Higher Education, 28 November.
  • Fox, Richard, and King, Barbara. 2002. “Introduction: Beyond Culture Worry.” In Richard Fox and Barbara King (eds.), Anthropology Beyond Culture (pp. 1-19). Oxford: Berg.
  • Frazer, James. 1922. The Golden Bough: A Study in Magic and Religion. London: MacMillan.
  • Freeman, Derek. 1999. The Fateful Hoaxing of Margaret Mead: A Historical Analysis of Her Samoan Research. London: Basic Books.
  • Freeman, Derek. 1983. Margaret Mead and Samoa: The Making and Unmaking of an Anthropological Myth. Cambridge: Harvard University Press.
  • Galton, Francis. 1909. Essays in Eugenics. London: Eugenics Education Society.
  • Geertz, Clifford. 1973. The Interpretation of Cultures. New York: Basic Books.
  • Geertz, Clifford. 1999. “From the Native’s Point of View’: On the Nature of Anthropological Understanding.” In Russell T. McCutcheon (ed.), The Insider/Outsider Problem in the Study of Religion: A Reader (pp. 50-63). New York: Cassell.
  • Gellner, Ernest. 1995. Anthropology and Politics: Revolutions in the Sacred Grove. Oxford: Blackwell.
  • Gellner, Ernest. 1992. Post-Modernism, Reason and Religion. London: Routledge.
  • Gobineau, Arthur de. 1915. The Inequality of Races. New York: G. P. Putnam and Sons.
  • Gorton, William. 2010. “The Philosophy of Social Science.” Internet Encyclopedia of Philosophy. https://iep.utm.edu/soc-sci/
  • Gottfried, Paul. 2004. Multiculturalism and the Politics of Guilt: Towards a Secular Theocracy. Columbia: University of Missouri Press.
  • Grant, Madison. 1916. The Passing of the Great Race: Or the Racial Basis of European History. New York: Charles Scribner’s Sons.
  • Hallpike, Christopher Robert. 1986. The Principles of Social Evolution. Oxford: Clarendon Press.
  • Harding, Sandra. 1991. Whose Science? Whose Knowledge? Thinking from Women’s Lives. Ithaca: Cornell University Press.
  • Headland, Thomas, Pike, Kenneth, and Harris, Marvin. 1990. Emics and Etics: The Insider/ Outsider Debate. New York: Sage Publications.
  • Heaton-Shreshta, Celayne. 2010. “Emotional Apprenticeships: Reflections on the Role of Academic Practice in the Construction of “the Field.”” In Dimitrina Spencer and James Davies (eds.), Anthropological Fieldwork: A Relational Process (pp. 48-74). Newcastle: Cambridge Scholars Publishing.
  • Hollis, Martin. 1967. “The Limits of Irrationality.” European Journal of Sociology 8, 265-271.
  • Hymes, Dell. 1974. “The Use of Anthropology: Critical, Political, Personal.” In Dell Hymes (ed.), Reinventing Anthropology (pp. 3-82). New York: Vintage Books.
  • Jessen Williamson, Karla. 2006. Inuit Post-Colonial Gender Relations in Greenland. Aberdeen University: PhD Thesis.
  • Kapferer, Bruce. 2001. “Star Wars: About Anthropology, Culture and Globalization.” Suomen Antropologi: Journal of the Finnish Anthropological Society 26, 2-29.
  • Kuper, Adam. 1973. Anthropologists and Anthropology: The British School 1922-1972. New York: Pica Press.
  • Kuznar, Lawrence. 1997. Reclaiming a Scientific Anthropology. Walnut Creek: AltaMira Press.
  • Lawson, Thomas, and McCauley, Robert. 1993. Rethinking Religion: Connecting Cognition and Culture. Cambridge: Cambridge University Press.
  • Leach, Edmund. 1974. Claude Levi-Strauss. New York: Viking Press.
  • Levi-Strauss, Claude. 1978. Myth and Meaning. London: Routledge.
  • Levin, Michael. 2005. Why Race Matters. Oakton: New Century Foundation.
  • Lovejoy, Arthur. 1936. The Great Chain of Being: A Study of the History of an Idea. Cambridge: Harvard University Press.
  • Lynn, Richard. 2006. Race Differences in Intelligence: An Evolutionary Analysis. Augusta: Washington Summit Publishers.
  • Malinowski, Bronislaw. 1922. Argonauts of the Western Pacific. London: Routledge.
  • Marcus, George, and Cushman, Dick. 1974. “Ethnographies as Texts.” Annual Review of Anthropology 11, 25-69.
  • Mead, Margaret. 1928. Coming of Age in Samoa: A Psychological Study of Primitive Youth for Western Civilization. London: Penguin.
  • Montagu, Ashley. 1945. Man’s Most Dangerous Myth: The Fallacy of Race. New York: Columbia University Press.
  • Nettle, Daniel. 2007. Personality: What Makes Us the Way We Are. Oxford: Oxford University Press.
  • Oberg, Kalervo. 1960. “Culture Shock: Adjustment to New Cultural Environments.” Practical Anthropology 7, 177-182.
  • Pearson, Roger. 1991. Race, Intelligence and Bias in Academe. Washington DC: Scott-Townsend Publishers.
  • Pels, Peter. 1999. “Professions of Duplexity: A Prehistory of Ethical Codes.” Current Anthropology 40, 101-136.
  • Pichot, Andre. 2009. The Pure Society: From Darwin to Hitler. London: Verso.
  • Pike, Luke. 1869. “On the Alleged Influence of Race Upon Religion.” Journal of the Anthropological Society of London 7, CXXXV-CLIII.
  • Pink, Sarah. 2009. Doing Sensory Ethnography. London: Sage Publications.
  • Popper, Karl. 1963. Conjectures and Refutations: The Growth of Scientific Knowledge. London: Routledge.
  • Popper, Karl. 1957. The Poverty of Historicism. London: Routledge.
  • Radcliffe-Brown, Alfred Reginald. 1957. A Natural Science of Society. Chicago: University of Chicago Press.
  • Rees, Tobias. 2010. “On the Challenge – and the Beauty – of (Contemporary) Anthropological Inquiry: A Response to Edward Dutton.” Journal of the Royal Anthropological Institute 16, 895-900.
  • Risjord, Mark. 2007. “Scientific Change as Political Action: Franz Boas and the Anthropology of Race.” Philosophy of Social Science 37, 24-45.
  • Risjord, Mark. 2000. Woodcutters and Witchcraft: Rationality and the Interpretation of Change in the Social Sciences. New York: University of New York Press.
  • Rogan, Bjarn. 2012. “The Institutionalization of Folklore.” In Regina Bendix and Galit Hasam-Rokem (eds.), A Companion to Folklore (pp. 598-630). Oxford: Blackwell.
  • Roth, Paul. 1989. “Anthropology Without Tears.” Current Anthropology 30, 555-569.
  • Sandall, Roger. 2001. The Culture Cult: On Designer Tribalism and Other Essays. Oxford: Westview Press.
  • Sarkany, Mihaly. ND. “Cultural and Social Anthropology in Central and Eastern Europe.” Liebnitz Institute for the Social Sciences.
  • Shankman, Paul. 2009. The Trashing of Margaret Mead: Anatomy of an Anthropological Controversy. Madison: University of Wisconsin Press.
  • Scruton, Roger. 2000. Modern Culture. London: Continuum.
  • Sedgwick, Mitchell. 2006. “The Discipline of Context: On Ethnography Amongst the Japanese.” In Joy Hendry and Heung Wah Wong (eds.), Dismantling the East-West Dichotomy: Essays in Honour of Jan van Bremen (pp. 64-68). New York: Routledge.
  • Segal, Daniel, and Yanagisako, Sylvia. 2005. “Introduction.” In Daniel Segal and Sylvia Yanagisako (eds.), Unwrapping the Sacred Bundle: Reflections on the Disciplining of Anthropology (pp. 1-23). Durham: Duke University Press.
  • Segerstråle, Ullica. 2000. Defenders of the Truth: The Sociobiology Debate. Oxford: Oxford University Press.
  • Siikala, Jukka. 2006. “The Ethnography of Finland.” Annual Review of Anthropology 35, 153-170.
  • Sluka, Jeffrey. 2007. “Fieldwork Ethics: Introduction.” In Antonius Robben and Jeffrey Slukka (eds.), Ethnographic Fieldwork: An Anthropological Reader (pp. 271-276). Oxford: Blackwell.
  • Spencer, Herbert. 1873. The Study of Sociology. New York: D. Appleton and Co.
  • Srivastava, Vinay Kumar. 2000. “Teaching Anthropology.” Seminar 495.
  • Strathern, Marylin. 1988. The Gender of the Gift: Problems with Women and Problems with Society in Melanesia. Berkley: University of California Press.
  • Stocking, George. 1991. Victorian Anthropology. New York: Free Press.
  • Tucker, William. 2002. The Funding of Scientific Racism: Wycliffe Draper and the Pioneer Fund. Illinois: University of Illinois Press.
  • Turner, Victor. 1969. The Ritual Process: Structure and Anti-Structure. New York: Aldine Publishers.
  • Turner, Victor, and Turner, Edith. 1978. Image and Pilgrimage in Christian Culture: Anthropological Perspectives. New York: Columbia University Press.
  • Tylor, Edward Burnett. 1871. Primitive Culture: Researches into the Development of Mythology, Religion, Art and Custom. London: John Murray.
  • Tylor, Edward Burnett. 1865. Researchers into the Early History of Mankind. London: John Murray.
  • Wagner, Roy. 1981. The Invention of Culture. Chicago: University of Chicago Press.
  • Weiner, Annette. 1992. Inalienable Possession. Los Angeles: University of California Press.
  • Wilson, Edward Osborne. 1998. Consilience: Towards the Unity of Knowledge. New York: Alfred A. Knopf.  
  • Wilson, Edward Osborne. 1975. Sociobiology: A New Synthesis. Cambridge: Harvard University Press.
  • Wilson, Richard. 2002. “The Politics of Culture in Post-apartheid South Africa.” In Richard Fox and Barbara King (eds.), Anthropology Beyond Culture (pp. 209-234). Oxford: Berg.

 

Author Information

Edward Dutton
Email: ecdutton@hotmail.com
University of Oulu
Finland

Modern Morality and Ancient Ethics

It is commonly supposed that there is a vital difference between ancient ethics and modern morality. For example, there appears to be a vital difference between virtue ethics and the modern moralities of deontological ethics (Kantianism) and consequentialism (utilitarianism). On closer inspection, however, one acknowledges that the two ethical approaches have more in common than their stereotypes may suggest. Oversimplification, fallacious interpretations, and the broad variation within any particular ethical theory generally make it harder to determine the real differences and similarities between ancient ethics and modern morality. But why should we bother with ancient ethics at all? What is the utility of comparing the strengths and weaknesses of the particular approaches? The general answer is that a proper understanding of the strengths and weaknesses of virtue ethics and modern moral theories can be used to overcome current ethical problems and to initiate fruitful developments in ethical reasoning and decision-making.

This article examines the differences and similarities between ancient ethics and modern morality by analysing and comparing their main defining features in order to show that the two ethical approaches are less distinct than one might suppose. The first part of the article outlines the main ethical approaches in Ancient Greek ethics by focusing on the Cynics, the Cyrenaics, Aristotle’s virtue ethics, the Epicureans, and the Stoics. This part also briefly outlines the two leading modern ethical approaches, Kantianism and utilitarianism, in more general terms in order to provide sufficient background. The second part provides a detailed table of the main defining features of the conflicting stereotypes of ancient ethics and modern morality. Three main issues – the good life versus the good action, the use of the term “moral ought,” and whether a virtuous person can act in a non-virtuous way – are described in more detail in the third part of the article in order to show that the two approaches have more in common than the stereotypes may initially suggest. The fourth part deals with the idea of moral duty in ancient ethics.

Table of Contents

  1. Ancient Ethics and Modern Morality
    1. Ethics and Morality
    2. Ancient Ethics
      1. The Cynics and the Cyrenaics – The Extremes
      2. The Peripatetic School – Aristotle’s Virtue Ethics
      3. Epicureanism and Stoicism
    3. Modern Morality
      1. Kantianism
      2. Utilitarianism
    4. The Up-shot
  2. The Table of Ancient Ethics and Modern Morality – A Comparison
  3. Ancient Ethics and Modern Morality – The Main Differences
    1. The Good Life versus the Good Action
    2. The Moral Ought
    3. Can a Virtuous Person Act in a Non-Virtuous Way?
  4. Special Problem: Kant and Aristotle – Moral Duty and For the Sake of the Noble
  5. Conclusion
  6. References and Further Reading

1. Ancient Ethics and Modern Morality

There are at least two main criteria that every moral theory must fulfil: first, the criterion of justification (that is, the particular moral theory should not contain any contradictions) and, second, the criterion of applicability (that is, the particular moral theory should solve concrete problems and offer ethical orientation). However, many (traditional) moral theories are unable to meet the second criterion and simply fall short of the high demands of applied ethics to solve the complex moral problems of our times. Why is this the case? The main point is that traditional moral theories are not sufficiently well equipped to deal with completely new problems such as those concerning nuclear power, gene technology, cloning, and so forth. Therefore, there is constant interest in updating and enhancing a particular moral theory in order to make it compatible with the latest demands. Examples are neo-Aristotelians such as Hursthouse on abortion (1991) and on nature (2007), as well as neo-Kantians such as Regan on animals (1985), Korsgaard in general and in particular on animals and nature (1996), and Altman’s edited volume on the use and limits of Kant’s practical philosophy in applied ethics (2011). This is a difficult and often very complex process.

a. Ethics and Morality

When people talk about ethical approaches in Antiquity, they refer to these approaches using the words “ancient ethics” rather than “ancient morality”. They talk about “virtue ethics” and not about “virtue morality”. But why is this the case? The challenging question, according to Annas (1992: 119-120), is whether ancient scholars such as Plato and Aristotle, as well as the Stoics and Epicureans, are really talking about morality at all, since their main focus is limited to the agent’s happiness, which obviously “doesn’t sound much like morality” (119). Even if one acknowledges that happiness means a satisfying and well-lived life according to the ethical virtues, and not merely a fleeting happy moment, it still does not sound like morality. Furthermore, the general idea in virtue ethics that the good of other people enters the scene by being a part of one’s own good, and that, for example, the notion of justice is introduced as a character trait and not as the idea of the rights of others (see Dworkin’s phrase “rights as trumps”), makes it obvious that there is a systematic difference between the notions of ethics and morality. Ancient ethics is about living a good and virtuous life according to the ethical virtues, that is, about becoming a virtuous person, while the modern notion of morality is primarily focused on the interests of other people and the idea of deontological constraints. That is, one acts morally because one has to meet certain standards, not because doing so supports one’s own good life. But even this simple picture might be premature, depending on how one conceives the idea of “moral motivation” in ancient ethics (see below).

Historically speaking, from a different perspective, there is no clear evidence as to which term is more legitimate. In Ancient Greek, the term for ethics is êthos and means something like character. When Aristotle analyses the good life in the Nicomachean Ethics and the Eudemian Ethics, he therefore focuses on the central topic of good and bad character traits, that is, virtues and vices. In this original sense, ethics means the analysis of character or character traits. In Ancient Roman thought, which was essentially influenced by Cicero, the Greek term ethikos (the adjective of êthos) was translated with the Latin term moralis (the adjective of mores), whereas the Latin term mores in fact means habits and customs. It is possible to translate the Greek term êthos as habits and customs, but it is more likely that the translation of ethikos as moralis was a mistranslation; the term moralis rather corresponds to the Greek ethos, whose primary meaning is habits and customs. If the term morality refers to mores, then morality means the totality of all habits and customs of a given community. The term moralis nonetheless became a terminus technicus in Latin-shaped philosophy, and it is this usage that covers the present meaning of the term. In modern times, the habits and customs of a given community are termed ‘conventions’, which are authoritative for social life in society. Morality, however, is not simply a matter of mere convention, since conventions often conflict with morality (for example, an immoral convention); hence, it seems inappropriate to narrow the term in this way (Steinfath 2000). At present, there are at least four different ways to distinguish between ethics and morality:

  1. Ethics and morality as distinct spheres: Ethics has to do with the pursuit of one’s own happiness or well-being and private lifestyle, that is, how we should live to make good lives for ourselves. Morality has to do with other people’s interests and deontological constraints (for example Jürgen Habermas).
  2. The equation of ethics and morality (for example Peter Singer).
  3. Morality as a special field in the ethical realm: Ethics is the generic term for ethical and moral issues in the above-mentioned sense. Morality is a special part of ethics (for example, Bernard Williams).
  4. Morality as the object of ethics: Ethics is the philosophical theory of morality which is the systematic analysis of moral norms and values (standard reading).

The upshot is that it is always important to ask how the terms ethics and morality are used, and how one uses them oneself. What is certain is that in claiming that the terms differ, one makes a substantive and not only a conceptual distinction.

b. Ancient Ethics

It is impossible to give a complete depiction of the rich history of ethical reasoning and decision-making in Antiquity here; therefore, this section focuses on the main lines of ethical reasoning of the most important philosophical schools in the classic and Hellenistic periods. This rather simplified overview is nonetheless sufficient for our purposes. One can roughly divide the classic and Hellenistic periods into four different but closely connected parts. The first part concerns Socrates and his arguments with the Sophists (second half of the fifth century BC); the second part covers the post-Socratic formation of important philosophical schools deeply influenced by Socratic thought, for example, Antisthenes’ school of the Cynics, Aristippus’ school of the Cyrenaics, and Plato’s Academy, the most influential ancient school (second half of the fifth and fourth centuries BC). The third part is characterized, on the one hand, by the formation of one new major philosophical school, namely Aristotle’s Peripatetic school, which developed from Plato’s Academy, and, on the other hand, by the exchange of arguments among the existing schools on various issues (fourth century BC). The fourth part concerns the formation of two new important philosophical schools, which became highly influential in Antiquity: first, Epicurus’ school of Epicureanism, standing in the tradition of the Cyrenaics, and, secondly, Zeno’s school of the Stoics, which partly developed from the Cynics (second half of the fourth and third century BC). All the philosophical schools – though at odds with each other – are united by the fact that they are deeply concerned with the most important ethical questions of how to live a good life and how to achieve happiness. Their responses to these vital questions are, of course, diverse.


Figure 1. The Most Prominent Philosophical Schools in Ancient Greece

The following brief depiction focuses on the basic ethical assumptions of the philosophical schools of the Cynics and Cyrenaics, the Peripatetic school, the Epicureans, and the Stoics. Socrates and Plato’s Academy are left out because Socrates did not provide any (written) systematic ethics. His unsystematic ethical position is mainly depicted in Plato’s early dialogues, for example Laches, Charmides, and Protagoras, and in some of Xenophon’s works, such as the Apology, Symposium, and Memorabilia. Plato himself did not provide any systematic ethics comparable to that of the other main ancient schools either, even though one can certainly reconstruct – at least to some extent – his ethical viewpoint from the dialogue Politeia. In addition, most (ethical) works of the classic and Hellenistic periods are lost to history; what remains is a collection of fragments, phrases, and (parts of) letters of various important philosophers (and commentators) standing in the tradition of particular schools of that time. Many rival views on ethics are mediated through the works of Plato and Aristotle, in which they criticize their opponents. Some of these rudiments and testimonials were also mediated by famous writers and politicians such as Xenophon (fifth and fourth century BC) and the important historian of philosophy Diogenes Laertios (third century AD). Aristotle, however, is the only ancient philosopher whose substantial and complete ethical contributions – the Nicomachean Ethics and the Eudemian Ethics, leaving aside the Magna Moralia, whose authorship is unclear – have survived, even though all of his dialogues, including those concerned with ethics and ethical issues, are lost.

i. The Cynics and the Cyrenaics – The Extremes

The founder of the school of the Cynics, Antisthenes of Athens, taught that virtue in terms of practical wisdom is a good and is also sufficient for eudaimonia, that is, happiness. Badness is an evil and everything else is indifferent. In accord with Socrates, Antisthenes claimed that virtue is teachable, and he also accepted the doctrine of the unity of the virtues, that is, the general idea that if a person possesses one ethical virtue, then he or she thereby possesses all the other ethical virtues as well (for a recent contribution to this controversial doctrine, see Russell, 2009). The only good of human beings is what is peculiar to them, that is, their ability to reason. Against the Cyrenaics he argues that pleasure is never a good. Things such as death, illness, servitude, poverty, disgrace, and hard labour are only supposed to be bad but are not real evils. One should be indifferent towards one’s honour, property, liberty, health and life (committing suicide was allowed). The Cynics, in general, lived a beggar’s life and were probably the first real cosmopolitans in human history – a feature that the Stoics wholeheartedly adopted later. They were also against the common cultural and religious rites and practices, a main feature which they shared with the Sophists. They took Socratic frugality to extremes and tried to be as independent of material goods as possible, like Diogenes of Sinope, who lived in a barrel. Furthermore, one should abstain from bad things and seek apathy and tranquillity, important features which the Stoics adopted from the Cynics as well. According to the Cynics, there are two groups of people: first, the wise people, who live a perfect and happy life and cannot lose their virtues once they have achieved this condition (similar to Aristotle), and, secondly, the fools, who are unhappy and make mistakes (Diogenes Laertios VI, 1 and 2; Zeller 1883: 116-121; Long 2007: 623-629).

Aristippus of Cyrene was well known and highly regarded among philosophers in Antiquity and was the first Socratic disciple who took money in exchange for lessons. He was the founder of the Cyrenaics – a famous philosophical school whose members were devoted to (sensualistic) hedonism (which certainly influenced Jeremy Bentham’s version of hedonistic utilitarianism). The school of the Cyrenaics thereby stands in striking contrast to the Cynics. Aristippus claims that knowledge is valuable only insofar as it is useful in practical matters (a feature that the Cyrenaics share with the Cynics); all actions should strive for the utmost pleasure, since pleasure is the highest good. Among the goods there are only gradual, not qualitative, differences. Unlike Aristotle, the hedonists believed that happiness understood as a long-term state is not the overall purpose of life; rather, the bodily pleasure of the very moment is the goal of life. The past has gone by and the future is uncertain; therefore, only the here and now is decisive, since the immediate feelings are the only guide to what is genuinely valuable. Practical wisdom is the precondition of happiness in being instrumentally useful for achieving pleasure. Aristippus and the Cyrenaics sought maximum pleasure in each moment without being swamped by it. Aristippus – known for his cheerful nature and praiseworthy character as well as his distinguished restraint – famously claimed that one should be the master of each moment: “I possess, but I am not possessed”. A. A. Long rightly claims: “Aristippus Senior had served as the paradigm of a life that was both autonomous and effortlessly successful in turning circumstances into sources of bodily enjoyment” (2007: 636).
Aristippus was a true master in making the best out of each situation; he also taught that one should be able to limit one’s wishes if they are likely to cause severe problems for oneself, to preserve self-control (a general feature he shares with Socrates), to secure one’s happiness, to seek inner freedom, and to be cheerful. Obviously, his teaching of a life solely devoted to bodily pleasure – that is, his pursuit of lust and his view concerning the unimportance of knowledge – stands in striking contrast to Socrates’ teachings (as well as to Plato’s and Aristotle’s). His disciples – most notably Aristippus the Younger, Theodoros, Anniceris (who bought the release of Plato), and Hegesias – established new Cyrenaic schools offering sophisticated versions of hedonism developed through fruitful disputes with Epicurus and the Cynics (for a brief overview of Aristippus’ disciples, see A. A. Long 2007: 632-639; for the teachings, see, for example, Diogenes Laertios II, 8; Zeller 1883: 121-125; Döring 1988. For the view that Aristippus’ hedonism is not limited to “bodily pleasures”, see Urstad 2009).

ii. The Peripatetic School – Aristotle’s Virtue Ethics      

Aristotle proposed the most prominent and sophisticated version of virtue ethics in Antiquity; his teachings became authoritative for many scholars and remain alive in the vital contributions of neo-Aristotelians in contemporary philosophy. His main ethical work is the Nicomachean Ethics; less prominent but still valuable and authentic is the Eudemian Ethics, while Aristotle’s authorship of the Magna Moralia is highly questionable. Aristotle claims that happiness (eudaimonia) is the highest good – that is, the final, perfect, and self-contained goal – at which all people aim. In particular, happiness is the goal of life, that is, of a life devoted to “doing” philosophy (EN X, 6–9). Whether a person can be called “happy” can be determined only at the very end of a person’s life, retrospectively. For a good general overview of Aristotle’s ethics, see Broadie (1991) and Wolf (2007).

However, the idea that life should be devoted to reasoning follows from Aristotle’s important human function argument (EN I, 5, 6), in which he attempts to show – by analogy – that human beings as such must have a proper function, just as other things do, such as a pair of scissors (whose proper function is cutting) or a flute player (whose proper function is flute playing), and so forth. If the proper function is performed in a good way, then Aristotle claims that the particular thing has goodness (aretê). For example, if the proper function of a pair of scissors is cutting, then the proper function of a good pair of scissors is cutting well (and likewise in all other cases). Since the proper function of human beings, according to Aristotle, is reasoning, the goodness of human beings depends on the good performance of the proper human function, that is, on reasoning well. In fact, Aristotle claims that the goodness of human beings consists not in the mere performance of the proper function but rather in their disposition. This claim is substantiated by his example of the good person and the bad person, who cannot be distinguished from each other while they are asleep if one refers only to their (active) performance. The only possible way to distinguish them is to refer to their different dispositions. It is a matter of debate whether there is a particular human function as proposed by Aristotle.

All in all, one can distinguish four different lines of reasoning in Aristotle’s ethics: the virtue of the good person (standard interpretation), the idea of an action-oriented virtue ethics, the application of practical wisdom, and the idea of the intrinsic value of virtues. The different approaches are dealt with in order.

The virtue of the good person (EN II, 3, 4): according to Aristotle, an action is good (or right) if a virtuous person would perform that action in a similar situation; an action is bad or wrong (and hence prohibited) if the virtuous person would never perform such an action. Three criteria must be met, according to Aristotle, in order to ensure that an action is virtuous, given that the agent is in a certain condition when he performs it: (i.) the agent must have knowledge of the circumstances of the action (the action must not happen by accident); (ii.) the action is undertaken out of deliberative choice and is done for its own sake; and (iii.) the action is performed without hesitation, that is, by a person with a firm and stable virtuous character.

The action-oriented virtue ethics (EN II, 6, 1107a10–15): Aristotle’s virtue ethics contains some hints that he not only adheres to the standard interpretation, but also claims that there are some actions that are always morally blameworthy under any circumstances, that is, some actions are intrinsically bad. The fine or the noble and the just require the virtuous person to do or refrain from doing certain things, for example, not to murder (in particular, not to kill one’s parents), not to commit adultery, and not to commit theft. This line of reasoning contains deontological limitations insofar as the virtuous person is no longer the overall standard of evaluation, but the virtuous person herself must meet some ethical criteria in order to fulfil the external demands of, for example, “the noble” and “the just” to act virtuously.

Practical wisdom (EN VI): in some passages in book VI of the Nicomachean Ethics, Aristotle argues that it is our practical wisdom that makes our practical considerations good, both with regard to the good or virtuous life and with regard to our particular goals. He claims that a practically wise person has a special sensitivity or special perceptual skill with which to evaluate a situation in a morally correct or appropriate way. Here, the emphasis lies on the practical wisdom – as the capacity of ethical reasoning and decision-making – rather than on adhering to single ethical virtues, even though Aristotle claims that it is impossible to be practically wise without having ethical virtues and vice versa.

The intrinsic value of the virtues: following the standard interpretation of the role of the ethical virtues with regard to living a good life, Aristotle argues in the Nicomachean Ethics (EN X, 6–9) that these virtues are somewhat less important when it comes to the overall goal, that is, the happiness of living a good life. The primary goal is to live a life devoted to “doing” philosophy and thereby to live a good life; the secondary goal is to live a life among other people, which makes it necessary to adopt the ethical virtues as well.

iii. Epicureanism and Stoicism

Epicurus – educated by the Platonist Pamphilus and highly influenced by the important teachings of Democritus – developed his philosophical school of the Epicureans in controversy with the Cyrenaics and the Stoics, meeting their objections and challenges. The lively exchange of arguments concerning the vital issue of how to live a good life enabled Epicurus to articulate a refined and sophisticated version of hedonism, which was regarded as superior to that of the rival philosophical school of the Cyrenaics. He claims that sensation is the only standard for measuring good and evil. Epicurus shares with the Cyrenaics the view that all living beings strive for pleasure and try to avoid pain. But, unlike the Cyrenaic school, he argues that happiness consists not only in the bodily pleasure of the very moment but lasts a whole life and also contains mental pleasure, which is – according to him – preferable to bodily pleasure. In his Letter to Menoeceus, Epicurus comments on flawed views of his ethical position and claims: “For what produces the pleasant life is not continuous drinking and parties or pederasty or womanizing or the enjoyment of fish and the other dishes of an expensive table, but sober reasoning […]” (Epic. Ep. Men. 132, in: Long and Sedley 2011: 114). The ultimate goal in life is not to strive for positive pleasure but to seek the absence of pain. Unlike Aristippus, Epicurus claims, in support of the importance of mental states, that bodily pleasure and pain are limited to the here and now, while the soul is also concerned with the pleasurable and painful states of the past and with prospective pleasure and pain. Thus, sensations based on recollection, hope, and fear – mental states directed at the past and future – are much stronger than the bodily pleasure of the moment. Being virtuous is a precondition of tranquillity, that is, peace and freedom from fear, which is closely connected to happiness.
In addition, Epicurus taught that one should free oneself from prejudices, master and restrict one’s desires, live a modest life (for example, a life not devoted to achieving glory and honour), which does not exclude bodily pleasure, and cultivate close friendships, for which the Epicureans were well known (see Diogenes Laertios X, 1; Zeller 1883: 263-267; Erler and Schofield 2007: 642-674; Long and Sedley 2000: §20-§25).

Shortly after the rise of epicureanism, Zeno of Citium – the founder of stoicism – established a new school in Athens. Its members were well known for their cosmopolitanism, that is, the idea that all human beings belong to a single community that should be cultivated (quite similar to Aristippus’ view); for their self-sufficient lifestyle and deep concern for friendship; and for their strong adherence to ataraxia, that is, freedom from passions such as pleasure, desire, sorrow, and fear, which jeopardize one’s inner independence. The Stoics were influenced by the teachings of the Cynics. Human beings, according to stoicism, are able to perceive the laws of nature through reason and to act accordingly. The best life is a life according to nature (Zeller 1883: 243). Zeno believed that the most general instinct is the instinct of self-preservation; for each living being, the only thing that is valuable is what conduces to the being’s self-preservation and thereby contributes to the being’s happiness. In the case of rational beings, for example, only what is in accord with reason is valuable; only virtue, which is necessary and sufficient for happiness, is a good. Following the Cynics, the Stoics argue that honour, property, health, and life are not goods and that poverty, disgrace, illness, and death are not evils. Against the Cyrenaics and Epicureans, they hold the view that pleasure is not a good and certainly not the highest good; they agree with Aristotle that pleasure is the consequence of our actions – if they are of the right kind – but not the goal itself. Two main doctrines are of utmost importance in the teachings of stoicism: first, the significance of ataraxia and, second, the idea of doing what nature demands. First, happiness is ataraxia – the freedom from passions – and a self-sufficient lifestyle. 
Secondly, the idea that one must act in accordance with one’s own nature, in the sense of acting virtuously, stands in striking contrast to the other philosophical schools of that time. In addition, the right motive transforms the performance of one’s duty into a virtuous action, completely independent of the outcome of the particular action (an important feature that we find again in Kant’s ethics). Following Socrates and Plato, the Stoics believed that virtue is ethical knowledge and that non-virtuous people simply lack ethical knowledge, since virtue consists in the reasonable condition of the soul, which leads to correct views. The Cynic idea of a sharp distinction between the very few wise people and the many fools, that is, all non-wise people, became less sharp in the course of time. In addition, the Roman philosopher and politician Cicero (106–43 BC) is the first author whose work on the notion of duty survives: De Officiis (44 BC), in which he examined the notion in great detail. It should be noted, however, that the stoic philosopher Panaitios of Rhodes (180–110 BC) had already published an important book on the notion of duty prior to Cicero. Panaitios’ work is lost, but some of its essential ideas are known to us through Cicero, who often refers to Panaitios in his De Officiis. With regard to its ethics, stoicism outlived the other philosophical schools by remaining an attractive position for many people, including leading philosophers and politicians such as Seneca (first century AD) and Marcus Aurelius (second century AD) in Ancient Rome (see Diogenes Laertios VII, 1; Zeller 1883: 243-253; Inwood and Donini 2007: 675-738; Long and Sedley 2000: §56-§67).

c. Modern Morality

The two main moral theories of modern morality are Kant’s deontological ethics and utilitarianism. Both theories have been adopted and modified by many scholars in recent history in order to make them (more) compatible with the latest demands in ethical reasoning and decision-making, in particular by meeting the objections raised by modern virtue ethics (or neo-Aristotelianism). The following briefly depicts Kantianism in its original form and the main features of utilitarianism.

i. Kantianism

The German philosopher Immanuel Kant is the founder of deontological ethics. His ethics, which he put forth mainly in the Groundwork of the Metaphysics of Morals (1785), the Critique of Practical Reason (1788), and the Metaphysics of Morals (1797), is one of the most prominent and highly respected theories in modernity. Kant’s ethics is deontological in the sense that one has to obey the duties and obligations that derive from his supreme principle of morality, the Categorical Imperative: “Act only according to that maxim whereby you can at the same time will that it should become a universal law” (Kant 1785). The Categorical Imperative is a test for maxims, which, in turn, determine whether certain acts have moral worth or not. A maxim is an individual’s subjective principle or rule of the will (in German, das subjektive Prinzip des Wollens), which tells the individual what to do in a given particular situation. If the maxim can be universalized, then it is valid and one must act upon it. A maxim fails to be universalizable in two cases: (i.) the case of logical inconsistency (the example of suicide, which violates a “perfect duty”); and (ii.) the case in which one cannot will the maxim to be universalized (failing to cultivate one’s talents, which violates an “imperfect duty”). Perfect duties are those whose violation is blameworthy (for example, the suicide case); imperfect duties allow for human desires and hence are not as strong as perfect duties, but they are still morally binding, and people do not attract blame if they fail to fulfil them (for example, failing to cultivate one’s talents). Kant’s ethics is universal in the sense that the system of moral duties and obligations applies to all rational beings (not only human beings). Morality is based not in interests (as in social contract theories), emotions and intuitions, or conscience, but in reason alone. 
This is the reason why Kant’s ethics is not heteronomous – unlike a divine ethical theory in which God commands what human beings should do (for example, the Bible, the Ten Commandments), or a natural law conception in which nature itself commands what human beings should do by providing them with the faculty of reason, which, in turn, detects what should be done in moral matters – but truly autonomous with regard to rational beings, who make their moral decisions in the light of pure practical reason. Pure practical reason, in determining the moral law or Categorical Imperative, determines what ought to be done without reference to empirical contingent factors (that is, anthropology in the broad sense of the term, including the empirical sciences; see the preface to the Groundwork) such as one’s own desires or any personal inclinations (in German, Neigungen). Pure practical reason is not limited to the particular nature of human reasoning but is the source and the field of universal norms, which stem from a general notion of a rational being as such (see Eisler 2008: 577; Paton 1967; Timmermann 2010; Altman 2011).

ii. Utilitarianism

Historically speaking, Jeremy Bentham, in his Introduction to the Principles of Morals and Legislation (1789), and John Stuart Mill, in Utilitarianism (1863), are the founders of utilitarianism, while Francis Hutcheson (1755) and William Paley (1785) can be seen as their legitimate predecessors, since they pointed out that utility should be regarded as an important standard of evaluation in ethical reasoning and decision-making. Bentham claims that the duration and intensity of pleasure and pain are of utmost importance and that it is even possible to measure the right action by applying a hedonistic calculus, which determines the exact utility of the actions. The action with the best hedonistic outcome should be put into practice. His position is called radical quantitative hedonism. Mill, instead, questions the very idea of a hedonistic calculus and argues that one must distinguish between mental and bodily pleasures, giving more weight to mental pleasures. His position is called qualitative hedonism. Mill’s basic formula of utilitarianism is as follows:

The creed which accepts as the foundation of morals, Utility, or the Greatest Happiness Principle, holds that actions are right in proportion as they tend to promote happiness, wrong as they tend to produce the reverse of happiness. By happiness is intended pleasure, and the absence of pain; by unhappiness, pain and the privation of pleasure. (Mill’s Utilitarianism, chapter 2)

There is widespread agreement that numerous different utilitarian theories exist in modern ethics; hence it would be impossible to provide an adequate depiction of all major strands in this brief subsection. However, the following four aspects are typical of every utilitarian theory. (1.) The consequence principle: Utilitarianism is not about actions but about the consequences of actions. It is thus a form of consequentialism, which means that the moral worth of a particular action is determined by its outcome. (2.) Happiness: Utilitarianism is a teleological theory insofar as happiness (though not in the ancient sense of the term) is the main goal that should be achieved. This goal can be identified with (i.) the promotion of pleasure, (ii.) the avoidance of pain or harm, (iii.) the fulfilment of desires or considered preferences, or (iv.) meeting some objective criteria of well-being. (3.) The Greatest Happiness Principle: Utilitarianism is not about mere happiness but about “the greatest happiness” attainable. Utilitarianism is a single-principle theory that judges the consequences of a given action with regard to its utility, which is the general aim of actions. The moral rightness or wrongness of actions depends on the goal of achieving the greatest happiness for the greatest number of sentient beings, in short, “the greatest happiness for the greatest number”. (4.) Maximising: The collective amount of utility for the sentient beings affected by the action should be maximised. This line of reasoning contains strong altruistic claims because, roughly speaking, one should choose only those actions which improve other sentient beings’ happiness.

Furthermore, one major methodological distinction should be mentioned briefly, since it divides all utilitarian theories into two groups according to whether the principle of utility is applied to actions or to rules. In act utilitarianism (or direct utilitarianism), the principle of utility is applied to the particular action; in this case, one asks whether the action in question is morally right or wrong in this particular situation. In rule utilitarianism (or indirect utilitarianism), instead, the principle of utility is applied only to rules, which, in turn, are applied to particular actions and serve as guidelines for human behaviour in order to guarantee the greatest happiness for the greatest number. Here, the vital question is whether a specific rule maximises the general utility or not. From time to time it happens that rule utilitarianism maximises the general utility to a lesser degree than act utilitarianism would have done. For example, the general rule that one should keep one’s promises maximises the general utility in the long run (rule utilitarianism); in some cases, however, breaking the promise would maximise the general utility to a higher degree, depending on the particular situation and circumstances of the case in question (act utilitarianism).

d. The Upshot

The depiction of the ethical views of some important philosophical schools in Antiquity, as well as their interrelatedness, and the outline of the two leading moral theories of modern morality show that there is – despite the systematic difference concerning the importance of the question of the good life – a significant overlap of important lines of reasoning. In addition, the supposed distinction between ancient ethics and modern morality contains many misleading claims. Socrates can be seen as the initial spark for a broad variety of diverse virtue ethical approaches, such as cynicism, the teachings of the Cyrenaics, Aristotelianism, epicureanism, and stoicism. All of these philosophical schools were concerned with the vital questions of how to live a good life and how to achieve happiness by pointing out what the appropriate actions were. The brief outline of the different philosophical schools in Antiquity supports this view. Modern morality is different in that its focus is on the basic question of how one should act; the ancient question of how one should live is secondary. However, modern morality, in particular Kantianism and utilitarianism, did not start from scratch but had important and highly influential ancient predecessors. For example, the Kantian idea of doing the right thing because reason dictates it has its roots in stoicism (see Cooper 1998; Schneewind 1998), and the utilitarian idea of living a happy life according to pleasure has its roots in the teachings of the Cyrenaics (for example, Bentham 1789) and the Epicureans (for example, Mill 1863). The history of ideas conveyed important ethical insights from Antiquity to modernity. The idea that there is a clear and easy distinction between ancient (virtue) ethics and modern moral theories is premature and misleading. 
Indeed, there are some important differences, but one must acknowledge the simple fact that there is no unity or broad consensus among ancient virtue ethicists concerning the question of how to live a good life and which actions should count as virtuous. Hence, it follows that there is no “ancient ethics” as such but many important and diverse virtue ethical approaches, which have more or less in common with “modern morality”.

In addition, modern morality, in particular contemporary morality, is characterized by the fact that quite a few important scholars elaborated modern versions of Aristotle’s classical virtue ethics in the twentieth century. These scholars argue that virtue ethics was quite successful in solving ethical problems in Antiquity, and they believe that adhering to a refined version of virtue ethics is not only useful but also superior in solving our modern moral problems. Among the most important neo-Aristotelian scholars are Anscombe (1958), Foot (1978, 2001), Hursthouse (1999), MacIntyre (1981), Nussbaum (1992, 1993, 1995), Slote (2001), Swanton (2003), and Williams (1985), who claim that the traditional ethical theories, such as deontological ethics (Kantianism) and consequentialism (utilitarianism), are doomed to failure. In general, they adhere to at least two main hypotheses: (i.) people in Antiquity already employed a very efficient way of ethical reasoning and decision-making; and (ii.) this particular way was lost in modernity without having been properly replaced. Hence it follows that one should overcome the deficient modern ethical theories and again adhere to virtue ethics as a viable alternative, without, of course, abandoning the existing ethical developments (see Bayertz 2005: 115).

The following section depicts the old but still persisting stereotypical differences between ancient ethics and modern morality in order to deepen our understanding of the supposed and real differences and similarities of both ethical approaches.

2. The Table of Ancient Ethics and Modern Morality – A Comparison

This self-explanatory table presents a simple but instructive comparison of the defining features of the stereotypes of ancient ethics and modern morality (for a similar table see Bayertz 2005: 117).

No. | Criteria | Ancient Ethics | Modern Morality
1. | Basic Question | What is the good life? What is happiness and human flourishing? | What should one/I do? The question of the good life plays, at best, a subordinate role.
2. | What is the Object of Concern? | Self-centred: the person’s own interests dominate. | Other-regarding: the interests of other people are most central.
3. | What is most important? | Pursuit of goals: personal perfection, personal projects, and personal relationships. | Universal moral obligations and rules: individuals should seek impartiality (and hence alienate themselves from their own personal projects).
4. | What is examined? | The agent: most important are the acting person and his/her character (agent-centred ethics). | Actions and consequences: most important is the correctness of the action and its consequences (action- and consequence-centred ethics).
5. | Central Notions | Virtues: aretaic notions, for example good, excellence, virtue (aretaic language). | Norms: prescriptive notions concerning rules, duties, obligations, for example must, should (deontic language).
6. | How is rationality seen? | Rationality is seen as a capacity for context-sensitive insight and decision-making. | Rationality is “mainly” seen as the capacity to (rationally) deduce inferences from abstract propositions.
7. | The Goals of Human Actions | The goals of human actions are objective (notion of happiness: for example thinking, pleasure). | The goals of human actions are individually defined by the people themselves (subjectivism), not by God or nature.
8. | Scope of Morality | Adult male citizens with full citizenship. | Men, women, children, animals, the environment.
9. | Individual and Community | The individual is in unity with the community (harmony). | The individual and the community are rather disconnected from each other.

Table 1: Ancient Ethics and Modern Morality

3. Ancient Ethics and Modern Morality – The Main Differences

a. The Good Life versus the Good Action

The most common stereotype with regard to ancient ethics and modern morality is that ancient ethics is only about the question “What is the good life?” and that modern moral theories only deal with the question “What should one do?” or “How should one act?”. Many stereotypes certainly contain some truth, but there is almost always a lot of room for a better understanding of the differences and similarities at issue. To be more precise, it is true that ancient ethics concerns the vital question of how to live a good life and how to become a virtuous person by acting in accordance with the ethical virtues. However, the idea that virtue ethics does not deal with actions and hence is unable to provide concrete answers to ethical problems is premature; it is not only modern moral theories that deal with actions (see Hursthouse 1999, chapters 1-3; Slote 2001, chapter 1; Swanton 2003, chapter 11). An ethical virtue, according to Aristotle, needs to be completely internalized by its agent through many actions of the same type, so that the person acquires a firm disposition. In other words, a person who has the virtue of courage has to perform many brave actions in the domain of fear and confidence in order to acquire a brave disposition; performing the appropriate actions is the only way to do this. Indeed, modern moral theories are rather focused on the question of what one should do in a particular situation, and ethicists usually do not pay much attention to the question of living a good life. Ancient ethicists, instead, believe that the two issues cannot be separated.

A related issue that seems to strongly support the initial idea concerns the claim that, on the one hand, ancient ethics is self-centred because it focuses only on the agent’s interest in living a good life and becoming a virtuous person and, on the other hand, that modern morality is other-regarding because it focuses only on the interests of other people. Broadly speaking, on this view, ancient ethics is egoistic and modern morality is altruistic. The interests of other people enter the stage in virtue ethics by being incorporated into the person’s own interest in becoming virtuous and living a good life. In her article “Ancient Ethics and Modern Morality”, Annas examines this point in more detail and claims that “the confusion comes from the thought that if the good of others is introduced into the agent’s own final good, it cannot really be the good of others, but must in some way be reduced to what matters to the agent”. She points out, against this confusion, that “the good of others must matter to me because it is the good of others, not because it is part of my own good” (Annas 1992: 131). Annas thinks that this is compatible with the overall final good of the virtuous person, since the good of others matters to the virtuous person not because it is part of the agent’s own good but because it is the good of others.

Other people, however, might claim that the difference is one between “morality” and “legality”, to use a Kantian distinction. In this context, legality means simply fulfilling the moral claims that other people have; morality means fulfilling the moral claims that other people have and, in addition, having the right motive in doing so, that is, acting out of “the good will” – out of a sense of moral obligation or duty. Translated into “ancient” language, the virtuous person should consider other people’s interests not with indifference, nor because their interests are merely instrumentally useful to her as an agent, but because the virtuous person wholeheartedly believes, feels, and acknowledges that other people’s interests are important in their own right. Another example is Aristotle, who believes that the good person is living a good life if and only if she devotes her life to “philosophy” and, secondarily, lives a social life among other people. The latter requires the exercise of the ethical virtues, which are by nature other-regarding; the former does not (see Aristotle EN X, 6–9). Even so, according to Aristotle, one cannot be a practically wise person without being virtuous, and vice versa; both concepts are mutually dependent (EN VI).

One might claim that self-interest and the interests of other people do not stand in contrast to each other in ancient ethics but converge through adherence to an objective idea of the good (see Bayertz 2005). The line between moral questions concerning the interests of other people and ethical questions concerning the well-being of the particular agent is thus blurred beyond recognition. In modern morality, however, there is a clear difference, because the question of the good life is secondary and is systematically unimportant for the question of how one should act in a particular situation. Modern moral theories are rather subjective in character and hence lack the strong commitments of virtue ethical theories to an objective basis, as well as their claims regarding elitism and the devaluation of moral common sense. The upshot, however, is that while there is a systematic difference between ancient ethics and modern morality concerning the way in which moral problems are solved, the idea that ancient ethics is egoistic and does not appeal to actions is premature and simply wrong.

b. The Moral Ought

Anscombe points out in her classic paper “Modern Moral Philosophy” (1958) that modern morality is doomed to failure because it focuses only on the analysis of language and notions and, in particular, adheres to the fallacious idea of the moral duty. She argues that the ideas of the moral duty and the moral ought used in deontological ethics originally come from religious reasoning and theological ethics, where God was the ultimate source of morality and people had to obey God’s commands. There, the ideas of a moral duty and a moral ought were appropriate. In secular ethics, however, there is no general consent to the idea of a moral duty that is universally binding on all rational people. The idea of a moral duty, according to Anscombe, should therefore be replaced by the notion of virtue. Furthermore, Schopenhauer convincingly claims in his book On the Basis of Morality that even in the case of religious ethics there is no categorical moral duty, since people obey God’s moral rules simply because they do not want to be punished for failing to act accordingly. But this means that the moral duty is hypothetical rather than categorical. It is commonly said that in ancient ethics there is no moral duty and no moral ought simply because the Greeks and Romans lacked those particular notions. However, from the bare fact that they lacked the notions of moral duty and moral ought, one cannot conclude that they also lacked the corresponding phenomena (Bayertz 2005: 122). In addition, one might claim that this point still misses the general significance of using such notions as the main ethical key terms, which reflects a particular way of ethical reasoning and decision-making. Whether there is something like a ‘moral ought’ in ancient virtue ethics that is comparable to deontological ethics will be briefly examined below by focusing on Aristotle’s ethics. 

c. Can a Virtuous Person Act in a Non-Virtuous Way?

According to ancient ethics, a completely virtuous person, who is the bearer of all ethical virtues, is unable to act in a non-virtuous way. If a person bears one virtue, he thereby bears all the other virtues as well (the thesis of the unity of the virtues). The practically wise person – according to ancient ethicists – will always act in accordance with the ethical virtues. In other words, the virtuous person is always master of her emotions and will never be overwhelmed by them in a way that might lead her to act non-virtuously. Generally speaking, this is a quite demanding line of argumentation, since it can be the case, at least according to our modern way of thinking, that a brave person who has the virtue of courage is not able to show the virtue of liberality. Moreover, even if one acknowledges that a person is virtuous, one might not be convinced that this person will never act in a non-virtuous way. This particular problem has to do with the famous hypothesis of ‘the unity of the virtues’ (for a recent contribution to this problem, see Russell 2009). In modern morality, utilitarianism, for example, convincingly distinguishes between the evaluation of the character of a person and the evaluation of his or her actions. It can easily be the case, according to utilitarianism, that a morally bad person performs a morally right action or that a morally good person performs a morally wrong action. This distinction is impossible to draw for proponents of (classic) virtue ethics, because an ethically right action always presupposes that the person has an ethically good character.

4. Special Problem: Kant and Aristotle – Moral Duty and For the Sake of the Noble

There is widely shared agreement among philosophers that Kant’s deontological ethics and Aristotle’s virtue ethics can easily be distinguished by acknowledging the simple fact that Kant is concerned with acting from duty, on the moral principle, or because one thinks that it is morally right, while Aristotle’s approach completely lacks this particular idea of moral motivation; hence, it would be unsound to claim that the virtuous person is morally obligated to act in a way similar to the Kantian agent. In other words, on this view there is no such thing as acting from a sense of duty in virtue ethics. This common view has been challenged, for example, by neo-Aristotelians (for example, Hursthouse 2010), who claim not only that there is a strong notion of moral motivation in Aristotle’s approach, but also that the virtuous person is better equipped to meet the demands of acting from a sense of duty than the Kantian moral agent. The following sketches out the main line of reasoning (see also Engstrom and Whiting 1998; Jost and Wuerth 2011).

Hursthouse claims in her book On Virtue Ethics that “there is a growing enthusiasm for the idea that the ideal Kantian agent, the person with a good will, who acts “from a sense of duty”, and the ideal neo-Aristotelian agent, who acts from virtue – from a settled state of character – are not as different as they were once supposed to be” (2010: 140). Her view is supported by important works of Hudson (1990), Audi (1995), and Baron (1995), and it has also been acknowledged by neo-Kantian philosophers such as Korsgaard (1998) and Herman (1998). In this respect, it reflects a lack of awareness of current developments in virtue ethics and neo-Kantianism if one still upholds the claim of a clear distinction between ancient ethics and modern morality, in particular concerning Aristotle and Kant, as has been proposed for hundreds of years. A related issue, concerning the question of whether there is a fundamental distinction between aretaic and deontic terms, has been critically discussed by Gryz (2011), who argues against Stocker (1973), who claims that “good” and “right” mean the same thing. Gryz is convinced that even if both groups of terms converge (as closely as possible), either an unbridgeable gap will remain or, if one attempts to define one group of terms by the other, something will be left behind that cannot be explained by the second group. This contemporary debate shows that there is still no common view on the relationship between ancient ethics and modern morality.

Kant claims in the Groundwork that the morally motivated agent acts from good will. In more detail, to act from duty, or to act because one thinks that it is morally right, is to perform an action because one thinks that its maxim has the form of a law (Korsgaard 1998: 218). For example, if a person is in need, the Kantian agent does the right action not – as Korsgaard claims – because it is her purpose simply to do her duty, but because she chooses the action for its own sake; that is, her purpose is to help (Korsgaard 1998: 207).

Even if the ancient Greeks lacked particular notions that can be translated as moral ought, duty, right, and principle (for example, Gryz 2011; Hursthouse 2010), it seems nonetheless correct to claim that the idea of doing the right thing because it is right, or because one is required to do it, is also a well-known phenomenon in classic virtue ethics in general, and in Aristotle and stoicism in particular. There are quite a few passages in the Nicomachean Ethics in which Aristotle clearly claims that morally good actions are done for their own sake or because they are the morally right thing to do:

Now excellent actions are noble and done for the sake of the noble. (EN IV, 2, 1120a23–24)

Now the brave man is as dauntless as man may be. Therefore, while he will fear even the things that are not beyond human strength, he will fear them as he ought and as reason directs, and he will face them for the sake of what is noble; for this is the end of excellence. (EN III, 10 1115b10-13)

The standard of all things is the good and the good man; he is striving for the good with all his soul and does the good for the sake of the intellectual element in him. (EN IX, 4, 1166a10–20)

The good man acts for the sake of the noble. (EN IX, 8, 1168a33-35)

For the wicked man, what he does clashes with what he ought to do, but what the good man ought to do he does; for the intellect always chooses what is best for itself, and the good man obeys his intellect. (EN IX, 8, 1169a15–18)

If the virtuous person acts because she thinks that it is the right thing to do – because she acts for the sake of the noble, with no inclination other than to do good for the sake of the noble – then she is comparable to the Kantian moral agent. For example, according to Aristotle the noble is “that which is both desirable for its own sake and also worthy of praise” (Rhetoric I, 9, 1366a33); and in 1366b38–67a5 he holds the view that nobility is exhibited in actions “that benefit others rather than the agent, and actions whose advantages will only appear after the agent’s death, since in these cases we can be sure the agent himself gets nothing out of it” (Korsgaard 1998: 217). Hence it follows that the virtuous person cannot act in a non-virtuous way, because she acts from a strong inner moral obligation to do the morally right thing: it is the very nature of the virtuous person to act virtuously. The Kantian agent, by contrast, sometimes acts according to the universal law and hence performs a morally right action, and on other occasions fails to do so, because he or she has no stable and firm disposition always to act in accordance with the universal law. That is the very reason why the Aristotelian virtuous person can be seen as an agent who not only acts from duty in the sense of doing the right thing because it is right, but also constantly perceives and adheres to the moral duty, that is, to act virtuously.

5. Conclusion

The upshot is, however, that the vital question of how to live a good life cannot be separated from the essential question of how one should act. Conceptually and phenomenologically, both questions are intimately interwoven and a complete ethical theory will always be concerned with both issues, independently of whether the theory is of ancient or modern origin.

6. References and Further Reading

  • Altman, M. 2011. Kant and Applied Ethics: The Uses and Limits of Kant’s Practical Philosophy. Oxford: Wiley-Blackwell.
  • Annas, J. 1992. “Ancient Ethics and Modern Morality.” Philosophical Perspectives 6: 119-136.
  • Anscombe, E. 1958. “Modern Moral Philosophy.” Philosophy 33 (124): 1-19.
  • Apelt, O., trans. 1998. Diogenes Laertius: Leben und Meinungen berühmter Philosophen. Hamburg: Meiner.
  • Aristotle. 1985. “Nicomachean Ethics.” Translated by W.D. Ross. In Complete Works of Aristotle, edited by J. Barnes, 1729-1867. Princeton: Princeton University Press.
  • Aristotle. 2007. Rhetoric. Translated by George A. Kennedy. New York: Oxford University Press.
  • Audi, R. 1995. “Acting from Virtue.” Mind 104 (415): 449-471.
  • Baron, M. 1995. Kantian Ethics Almost Without Apology. New York: Cornell University Press.
  • Bayertz, K. 2005. “Antike und moderne Ethik.” Zeitschrift für philosophische Forschung 59 (1): 114-132.
  • Bentham, J. 1962. “The Principles of Morals and Legislation” [1789]. In The Works of Jeremy Bentham, edited by J. Bowring, 1-154. New York: Russell and Russell.
  • Broadie, S. 1991. Ethics with Aristotle. Oxford: Oxford University Press.
  • Cicero, M.T. 2001. On Obligations. Translated by P.G. Walsh. Oxford, New York: Oxford University Press.
  • Cooper, J.M. 1998. “Eudaimonism, the Appeal to Nature, and ‘Moral Duty’ in Stoicism.” In Aristotle, Kant, and the Stoics: Rethinking Happiness and Duty, edited by S. Engstrom and J. Whiting, 261-284. Cambridge, New York: Cambridge University Press.
  • Döring, K. 1988. Der Sokratesschüler Aristipp und die Kyrenaiker. Stuttgart, Wiesbaden: Franz Steiner.
  • Dworkin, R. 1984. Rights as Trumps. In Theories of Rights, edited by J. Waldron, 152-167. Oxford: Oxford University Press.
  • Eisler, R. 2008. Kant Lexikon: Nachschlagewerk Zu Kants Sämtlichen Schriften, Briefen Und Handschriftlichem Nachlaß. Hildesheim: Weidmannsche Verlagsbuchhandlung.
  • Engstrom, S. and J. Whiting. 1998. Aristotle, Kant, and the Stoics: Rethinking Happiness and Duty. Cambridge: Cambridge University Press.
  • Erler, M. and M. Schofield. 2007. “Epicurean Ethics.” In The Cambridge History of Hellenistic Philosophy, edited by K. Algra, J. Barnes, J. Mansfield and M. Schofield, 642-669. Cambridge: Cambridge University Press.
  • Foot, P. 1978. Virtues and Vices and other Essays in Moral Philosophy. Berkeley: University of California Press.
  • Foot, P. 2001. Natural Goodness. Oxford: Clarendon Press.
  • Gryz, J. 2010. “On the Relationship Between the Aretaic and the Deontic.” Ethical Theory and Moral Practice 14 (5): 493–501.
  • Herman, B. 1998. “Making Room for Character.” In Aristotle, Kant, and the Stoics: Rethinking Happiness and Duty, edited by S. Engstrom and J. Whiting, 36-60. Cambridge: Cambridge University Press.
  • Hudson, S. 1990. “What is Morality All About?” Philosophia: Philosophical Quarterly of Israel 20 (1-2): 3-13.
  • Hursthouse, R. 1991. “Virtue Theory and Abortion.” Philosophy and Public Affairs 20 (3): 223-246.
  • Hursthouse, R. 2010. On Virtue Ethics. Oxford: Oxford University Press.
  • Hursthouse, R. 2007. “Environmental Virtue Ethics.” In Working Virtue: Virtue Ethics and Contemporary Moral Problems, edited by R.L. Walker and P.J. Ivanhoe, 155-171. New York: Oxford University Press.
  • Hutcheson, F. 2005. A System of Moral Philosophy, in Two Books [1755]. London: Continuum International Publishing Group.
  • Inwood, B. and P. Donini. 2007. “Stoic Ethics.” In The Cambridge History of Hellenistic Philosophy, edited by K. Algra, J. Barnes, J. Mansfield and M. Schofield, 675-736. Cambridge:  Cambridge University Press.
  • Jost, L. and J. Wuerth. 2011. Perfecting Virtue: New Essays on Kantian Ethics and Virtue Ethics. Cambridge: Cambridge University Press.
  • Kant, I. 1991. Metaphysics of Morals [1797]. Translated by M. Gregor. Cambridge: Cambridge University Press.
  • Kant, I. 1997. Critique of Practical Reason [1788]. Translated by M. Gregor. Cambridge: Cambridge University Press.
  • Kant, I. 2005. Groundwork for the Metaphysics of Morals [1785]. Translated by A.W. Wood. New Haven CT:  Yale University Press.
  • Korsgaard, C.M. 1996. Sources of Normativity. Cambridge: Cambridge University Press.
  • Korsgaard, C.M. 1998. “From Duty and for the Sake of the Noble: Kant and Aristotle on Morally Good Action.” In Aristotle, Kant, and the Stoics: Rethinking Happiness and Duty, edited by S. Engstrom and J. Whiting, 203-236. Cambridge: Cambridge University Press.
  • Long, A.A. 2007. “The Socratic Legacy.” In The Cambridge History of Hellenistic Philosophy, edited by K. Algra, J. Barnes, J. Mansfield and M. Schofield, 617-639. Cambridge: Cambridge University Press.
  • Long, A.A and D.N. Sedley. 2000. The Hellenistic Philosophers, Volume 2: Greek and Latin Texts with Notes and Bibliography. Cambridge: Cambridge University Press.
  • Long, A.A. and D.N. Sedley. 2011. The Hellenistic Philosophers, Volume 1: Translations of the Principal Sources, with Philosophical Commentary. Cambridge: Cambridge University Press.
  • MacIntyre, A. 1981.  After Virtue. London: Duckworth.
  • Mill, J.S. 1998. Utilitarianism [1863]. Edited by R. Crisp. Oxford: Oxford University Press.
  • Nussbaum, M.C. 1992. “Human Functioning and Social Justice. In Defense of Aristotelian Essentialism.” Political Theory 20 (2): 202-246.
  • Nussbaum, M. 1993. “Non-Relative Virtues: An Aristotelian Approach.” In The Quality of Life, edited by M.C. Nussbaum and A. Sen, 1-6. New York: Oxford University Press.
  • Nussbaum, M. 1995. “Aristotle on Human Nature and the Foundations of Ethics.” In World, Mind, and Ethics: Essays on the Ethical Philosophy of Bernard Williams, edited by J.E.J. Altham and R. Harrison, 86-131. Cambridge, New York: Cambridge University Press.
  • Paley, W. 2002. The Principles of Moral and Political Philosophy [1785]. Indianapolis: Liberty Fund.
  • Paton, H. J. 1967. The Categorical Imperative: A Study in Kant’s Moral Philosophy. London: Hutchinson & co.
  • Regan, T. 1985.  The Case for Animal Rights. Berkeley: University of California Press.
  • Russell, D.C. 2009. Practical Intelligence and the Virtues. Oxford: Clarendon Press.
  • Schneewind, J.B. 1998. “Kant and Stoic Ethics.” In Aristotle, Kant, and the Stoics: Rethinking Happiness and Duty, edited by S. Engstrom and J. Whiting, 285-302. Cambridge, New York: Cambridge University Press.
  • Schopenhauer, A. 1995. On the Basis of Morality [1841]. Translated by E.F.J. Payne. Providence: Berghahn Books.
  • Slote, M. 2001. Morals from Motives. Oxford: Oxford University Press.
  • Steinfath, H. 2000. “Ethik und Moral.” In Vorlesung: Typen ethischer Theorien, 1-21 (unpublished).
  • Stocker, M. 1973. “Rightness and Goodness: Is There a Difference?” American Philosophical Quarterly 10 (2): 87–98.
  • Swanton, C. 2003. Virtue Ethics: A Pluralistic View. Oxford: Oxford University Press.
  • Timmermann, Jens. 2010. Kant’s Groundwork of the Metaphysics of Morals. Cambridge: Cambridge University Press.
  • Urstad, K. 2009. “Pathos, Pleasure and the Ethical Life in Aristippus.” Journal of Ancient Philosophy III (1): 1-22.
  • Williams, B. 1985. Ethics and the Limits of Philosophy. Cambridge MA: Harvard University Press.
  • Wolf, Ursula. 2002. Aristoteles’ “Nikomachische Ethik”. Darmstadt: Wissenschaftliche Buchgesellschaft.
  • Zeller, E. 1883. Geschichte der griechischen Philosophie. Stuttgart: Magnus.

 

Author Information

John-Stewart Gordon
Email: jgordon@uni-koeln.de
University of Cologne, Germany
Vytautas Magnus University Kaunas, Lithuania

Simplicity in the Philosophy of Science

The view that simplicity is a virtue in scientific theories and that, other things being equal, simpler theories should be preferred to more complex ones has been widely advocated in the history of science and philosophy, and it remains widely held by modern scientists and philosophers of science. It often goes by the name of “Ockham’s Razor.” The claim is that simplicity ought to be one of the key criteria for evaluating and choosing between rival theories, alongside criteria such as consistency with the data and coherence with accepted background theories. Simplicity, in this sense, is often understood ontologically, in terms of how simple a theory represents nature as being—for example, a theory might be said to be simpler than another if it posits the existence of fewer entities, causes, or processes in nature in order to account for the empirical data. However, simplicity can also be understood in terms of various features of how theories go about explaining nature—for example, a theory might be said to be simpler than another if it contains fewer adjustable parameters, if it invokes fewer extraneous assumptions, or if it provides a more unified explanation of the data.

Preferences for simpler theories are widely thought to have played a central role in many important episodes in the history of science. Simplicity considerations are also regarded as integral to many of the standard methods that scientists use for inferring hypotheses from empirical data, the most common illustration of this being the practice of curve-fitting. Indeed, some philosophers have argued that a systematic bias towards simpler theories and hypotheses is a fundamental component of inductive reasoning quite generally.

However, though the legitimacy of choosing between rival scientific theories on grounds of simplicity is frequently taken for granted, or viewed as self-evident, this practice raises a number of very difficult philosophical problems. A common concern is that notions of simplicity appear vague, and judgments about the relative simplicity of particular theories appear irredeemably subjective. Thus, one problem is to explain more precisely what it is for one theory to be simpler than another and how, if at all, the relative simplicity of theories can be objectively measured. In addition, even if we can get clearer about what simplicity is and how it is to be measured, there remains the problem of explaining what justification, if any, can be provided for choosing between rival scientific theories on grounds of simplicity. For instance, do we have any reason for thinking that simpler theories are more likely to be true?

This article provides an overview of the debate over simplicity in the philosophy of science. Section 1 illustrates the putative role of simplicity considerations in scientific methodology, outlining some common views of scientists on this issue, different formulations of Ockham’s Razor, and some commonly cited examples of simplicity at work in the history and current practice of science. Section 2 highlights the wider significance of the philosophical issues surrounding simplicity for central controversies in the philosophy of science and epistemology. Section 3 outlines the challenges facing the project of trying to precisely define and measure theoretical simplicity, and it surveys the leading measures of simplicity and complexity currently on the market. Finally, Section 4 surveys the wide variety of attempts that have been made to justify the practice of choosing between rival theories on grounds of simplicity.

Table of Contents

  1. The Role of Simplicity in Science
    1. Ockham’s Razor
    2. Examples of Simplicity Preferences at Work in the History of Science
      1. Newton’s Argument for Universal Gravitation
      2. Other Examples
    3. Simplicity and Inductive Inference
    4. Simplicity in Statistics and Data Analysis
  2. Wider Philosophical Significance of Issues Surrounding Simplicity
  3. Defining and Measuring Simplicity
    1. Syntactic Measures
    2. Goodman’s Measure
    3. Simplicity as Testability
    4. Sober’s Measure
    5. Thagard’s Measure
    6. Information-Theoretic Measures
    7. Is Simplicity a Unified Concept?
  4. Justifying Preferences for Simpler Theories
    1. Simplicity as an Indicator of Truth
      1. Nature is Simple
      2. Meta-Inductive Proposals
      3. Bayesian Proposals
      4. Simplicity as a Fundamental A Priori Principle
    2. Alternative Justifications
      1. Falsifiability
      2. Simplicity as an Explanatory Virtue
      3. Predictive Accuracy
      4. Truth-Finding Efficiency
    3. Deflationary Approaches
  5. Conclusion
  6. References and Further Reading

1. The Role of Simplicity in Science

There are many ways in which simplicity might be regarded as a desirable feature of scientific theories. Simpler theories are frequently said to be more “beautiful” or more “elegant” than their rivals; they might also be easier to understand and to work with. However, according to many scientists and philosophers, simplicity is not something that is merely to be hoped for in theories; nor is it something that we should only strive for after we have already selected a theory that we believe to be on the right track (for example, by trying to find a simpler formulation of an accepted theory). Rather, the claim is that simplicity should actually be one of the key criteria that we use to evaluate which of a set of rival theories is, in fact, the best theory, given the available evidence: other things being equal, the simplest theory consistent with the data is the best one.

This view has a long and illustrious history. Though it is now most commonly associated with the 14th-century philosopher William of Ockham (also spelt “Occam”), whose name is attached to the famous methodological maxim known as “Ockham’s razor”, often interpreted as enjoining us to prefer the simplest theory consistent with the available evidence, the view can be traced at least as far back as Aristotle. In his Posterior Analytics, Aristotle argued that nothing in nature was done in vain and nothing was superfluous, so our theories of nature should be as simple as possible. Several centuries later, at the beginning of the modern scientific revolution, Galileo espoused a similar view, holding that “[n]ature does not multiply things unnecessarily; that she makes use of the easiest and simplest means for producing her effects” (Galilei, 1962, p396). Similarly, at the beginning of the third book of the Principia, Isaac Newton included the following principle among his “rules for the study of natural philosophy”:

  • No more causes of natural things should be admitted than are both true and sufficient to explain their phenomena.
    As the philosophers say: Nature does nothing in vain, and more causes are in vain when fewer will suffice. For Nature is simple and does not indulge in the luxury of superfluous causes. (Newton, 1999, p794 [emphasis in original]).

In the 20th century, Albert Einstein asserted that “our experience hitherto justifies us in believing that nature is the realisation of the simplest conceivable mathematical ideas” (Einstein, 1954, p274). More recently, the eminent physicist Steven Weinberg has claimed that he and his fellow physicists “demand simplicity and rigidity in our principles before we are willing to take them seriously” (Weinberg, 1993, p148-9), while the Nobel Prize-winning economist John Harsanyi has stated that “[o]ther things being equal, a simpler theory will be preferable to a less simple theory” (quoted in McAleer, 2001, p296).

It should be noted, however, that not all scientists agree that simplicity should be regarded as a legitimate criterion for theory choice. The eminent biologist Francis Crick once complained, “[w]hile Occam’s razor is a useful tool in physics, it can be a very dangerous implement in biology. It is thus very rash to use simplicity and elegance as a guide in biological research” (Crick, 1988, p138). Similarly, here are a group of earth scientists writing in Science:

  • Many scientists accept and apply [Ockham’s Razor] in their work, even though it is an entirely metaphysical assumption. There is scant empirical evidence that the world is actually simple or that simple accounts are more likely than complex ones to be true. Our commitment to simplicity is largely an inheritance of 17th-century theology. (Oreskes et al, 1994, endnote 25)

Hence, while very many scientists assert that rival theories should be evaluated on grounds of simplicity, others are much more skeptical about this idea. Much of this skepticism stems from the suspicion that the cogency of a simplicity criterion depends on assuming that nature is simple (hardly surprising given the way that many scientists have defended such a criterion) and that we have no good reason to make such an assumption. Crick, for instance, seemed to think that such an assumption could make no sense in biology, given the patent complexity of the biological world. In contrast, some advocates of simplicity have argued that a preference for simple theories need not necessarily assume a simple world—for instance, even if nature is demonstrably complex in an ontological sense, we should still prefer comparatively simple explanations for nature’s complexity. Oreskes and others also emphasize that the simplicity principles of scientists such as Galileo and Newton were explicitly rooted in a particular kind of natural theology, which held that a simple and elegant universe was a necessary consequence of God’s benevolence. Today, there is much less enthusiasm for grounding scientific methods in theology (the putative connection between God’s benevolence and the simplicity of creation is theologically controversial in any case). Another common source of skepticism is the apparent vagueness of the notion of simplicity and the suspicion that scientists’ judgments about the relative simplicity of theories lack a principled and objective basis.

Even so, there is no doubting the popularity of the idea that simplicity should be used as a criterion for theory choice and evaluation. It seems to be explicitly ingrained into many scientific methods—for instance, standard statistical methods of data analysis (Section 1d). It has also spread far beyond philosophy and the natural sciences. A recent issue of the FBI Law Enforcement Bulletin, for instance, contained the advice that “[u]nfortunately, many people perceive criminal acts as more complex than they really are… the least complicated explanation of an event is usually the correct one” (Rothwell, 2006, p24).

a. Ockham’s Razor

Many scientists and philosophers endorse a methodological principle known as “Ockham’s Razor”. This principle has been formulated in a variety of different ways. In the early 21st century, it is typically just equated with the general maxim that simpler theories are “better” than more complex ones, other things being equal. Historically, however, it has been more common to formulate Ockham’s Razor as a more specific type of simplicity principle, often referred to as “the principle of parsimony”. Whether William of Ockham himself would have endorsed any of the wide variety of methodological maxims that have been attributed to him is a matter of some controversy (see Thorburn, 1918; entry on William of Ockham), since Ockham never explicitly referred to a methodological principle that he called his “razor”. However, a standard formulation of the principle of parsimony—one that seems to be reasonably close to the sort of principle that Ockham himself probably would have endorsed—is the maxim “entities are not to be multiplied beyond necessity”. So stated, the principle is ontological, since it is concerned with parsimony with respect to the entities that theories posit the existence of in attempting to account for the empirical data. “Entity”, in this context, is typically understood broadly, referring not just to objects (for example, atoms and particles), but also to other kinds of natural phenomena that a theory may include in its ontology, such as causes, processes, properties, and so forth. Other, more general formulations of Ockham’s Razor are not exclusively ontological, and may also make reference to various structural features of how theories go about explaining nature, such as the unity of their explanations. The remainder of this section will focus on the more traditional ontological interpretation.

It is important to recognize that the principle “entities are not to be multiplied beyond necessity” can be read in at least two different ways. One way of reading it is as what we can call an anti-superfluity principle (Barnes, 2000). This principle calls for the elimination of ontological posits from theories that are explanatorily redundant. Suppose, for instance, that there are two theories, T1 and T2, which both seek to explain the same set of empirical data, D. Suppose also that T1 and T2 are identical in terms of the entities that are posited, except for the fact that T2 entails an additional posit, b, that is not part of T1. So let us say that T1 posits a, while T2 posits a + b. Intuitively, T2 is a more complex theory than T1 because it posits more things. Now let us assume that both theories provide an equally complete explanation of D, in the sense that there are no features of D that the two theories cannot account for. In this situation, the anti-superfluity principle would instruct us to prefer the simpler theory, T1, to the more complex theory, T2. The reason is that T2 contains an explanatorily redundant posit, b, which does no explanatory work in the theory with respect to D. We know this because T1, which posits a alone, provides an equally adequate account of D as T2. Hence, we can infer that positing a alone is sufficient to acquire all the explanatory ability offered by T2, with respect to D; adding b does nothing to improve the ability of T2 to account for the data.

This sort of anti-superfluity principle underlies one important interpretation of “entities are not to be multiplied beyond necessity”: as a principle that invites us to get rid of superfluous components of theories. Here, an ontological posit is superfluous with respect to a given theory, T, in so far as it does nothing to improve T’s ability to account for the phenomena to be explained. This is how John Stuart Mill understood Ockham’s razor (Mill, 1867, p526). Mill also pointed to a plausible justification for the anti-superfluity principle: explanatorily redundant posits—those that have no effect on the ability of the theory to explain the data—are also posits that do not obtain evidential support from the data. This is because it is plausible that theoretical entities are evidentially supported by empirical data only to the extent that they can help us to account for why the data take the form that they do. If a theoretical entity fails to contribute to this end, then the data fails to confirm the existence of this entity. If we have no other independent reason to postulate the existence of this entity, then we have no justification for including this entity in our theoretical ontology.

Another justification that has been offered for the anti-superfluity principle is a probabilistic one. Note that T2 is a logically stronger theory than T1: T2 says that both a and b exist, while T1 says only that a exists. It is a consequence of the axioms of probability that a logically stronger theory is never more probable than a logically weaker theory. Thus, so long as the existence of a and the existence of b are probabilistically independent of each other, the probability of a existing is greater than zero, and the probability of b existing is less than 1, we can assert that Pr (a exists) > Pr (a exists & b exists), since Pr (a exists & b exists) = Pr (a exists) * Pr (b exists). According to this reasoning, we should therefore regard the claims of T1 as more a priori probable than the claims of T2, and this is a reason to prefer T1. However, one objection to this probabilistic justification for the anti-superfluity principle is that it does not fully explain why we dislike theories that posit explanatorily redundant entities: the real objection cannot be that they are logically stronger theories; rather, it is that they postulate entities that are unsupported by the evidence.
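The probabilistic point can be illustrated with a toy calculation. The prior probabilities below are hypothetical numbers chosen purely for illustration, not values drawn from the text:

```python
# Toy illustration (hypothetical priors): under independence, a theory
# positing entities a AND b is never more probable than one positing a alone.
p_a = 0.6   # assumed prior probability that entity a exists
p_b = 0.9   # assumed prior probability that entity b exists (less than 1)

p_T1 = p_a          # T1 claims only that a exists
p_T2 = p_a * p_b    # T2 claims that a and b both exist (independent posits)

print(p_T1, p_T2)   # the logically stronger T2 comes out strictly less probable
```

So long as Pr(b exists) < 1 and the posits are independent, multiplying in the extra posit can only lower the joint probability, whatever the particular numbers.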

When the principle of parsimony is read as an anti-superfluity principle, it seems relatively uncontroversial. However, it is important to recognize that the vast majority of instances where the principle of parsimony is applied (or has been seen as applying) in science cannot be given an interpretation merely in terms of the anti-superfluity principle. This is because the phrase “entities are not to be multiplied beyond necessity” is normally read as what we can call an anti-quantity principle: theories that posit fewer things are (other things being equal) to be preferred to theories that posit more things, whether or not the relevant posits play any genuine explanatory role in the theories concerned (Barnes, 2000). This is a much stronger claim than the claim that we should razor off explanatorily redundant entities. The evidential justification for the anti-superfluity principle just described cannot be used to motivate the anti-quantity principle, since the reasoning behind this justification allows that we can posit as many things as we like, so long as all of the individual posits do some explanatory work within the theory. It merely tells us to get rid of theoretical ontology that, from the perspective of a given theory, is explanatorily redundant. It does not tell us that theories that posit fewer things when accounting for the data are better than theories that posit more things—that is, that sparser ontologies are better than richer ones.

Another important point about the anti-superfluity principle is that it does not give us a reason to assert the non-existence of the superfluous posit. Absence of evidence is not (by itself) evidence of absence. Hence, this version of Ockham’s razor is sometimes also referred to as an “agnostic” razor rather than an “atheistic” razor, since it only motivates us to be agnostic about the razored-off ontology (Sober, 1981). Yet in most cases where Ockham’s razor is appealed to in science, it is intended to support atheistic conclusions—the entities concerned are not merely cut out of our theoretical ontology; their existence is also denied. Hence, if we are to explain why such a preference is justified, we will need to look for a different justification. With respect to the probabilistic justification for the anti-superfluity principle described above, it is important to note that it is not an axiom of probability that Pr (a exists & b doesn’t exist) > Pr (a exists & b exists).

b. Examples of Simplicity Preferences at Work in the History of Science

It is widely believed that there have been numerous episodes in the history of science where particular scientific theories were defended by particular scientists and/or came to be preferred by the wider scientific community less for directly empirical reasons (for example, some telling experimental finding) than as a result of their relative simplicity compared to rival theories. Hence, the history of science is taken to demonstrate the importance of simplicity considerations in how scientists defend, evaluate, and choose between theories. One striking example is Isaac Newton’s argument for universal gravitation.

i. Newton’s Argument for Universal Gravitation

At the beginning of the third book of the Principia, subtitled “The system of the world”, Isaac Newton described four “rules for the study of natural philosophy”:

  • Rule 1 No more causes of natural things should be admitted than are both true and sufficient to explain their phenomena.
  • As the philosophers say: Nature does nothing in vain, and more causes are in vain when fewer will suffice. For Nature is simple and does not indulge in the luxury of superfluous causes.
  • Rule 2 Therefore, the causes assigned to natural effects of the same kind must be, so far as possible, the same.
  • Rule 3 Those qualities of bodies that cannot be intended and remitted [i.e., qualities that cannot be increased and diminished] and that belong to all bodies on which experiments can be made should be taken as qualities of all bodies universally.
  • For the qualities of bodies can be known only through experiments; and therefore qualities that square with experiments universally are to be regarded as universal qualities… Certainly idle fancies ought not to be fabricated recklessly against the evidence of experiments, nor should we depart from the analogy of nature, since nature is always simple and ever consonant with itself…
  • Rule 4 In experimental philosophy, propositions gathered from phenomena by induction should be considered either exactly or very nearly true notwithstanding any contrary hypotheses, until yet other phenomena make such propositions either more exact or liable to exceptions.
  • This rule should be followed so that arguments based on induction may not be nullified by hypotheses. (Newton, 1999, p794-796).

Here we see Newton explicitly placing simplicity at the heart of his conception of the scientific method. Rule 1 is a version of Ockham’s Razor which, despite the use of the word “superfluous”, has typically been read as an anti-quantity principle rather than an anti-superfluity principle (see Section 1a). It is taken to follow directly from the assumption that nature is simple, and this assumption is in turn taken to give rise to rules 2 and 3, both principles of inductive generalization (infer similar causes for similar effects, and assume to be universal in all bodies those properties found in all observed bodies). These rules play a crucial role in what follows, the centerpiece being the argument for universal gravitation.

After laying out these rules of method, Newton described several “phenomena”—what are in fact empirical generalizations, derived from astronomical observations, about the motions of the planets and their satellites, including the moon. From these phenomena and the rules of method, he then “deduced” several general theoretical propositions. Propositions 1, 2, and 3 state that the satellites of Jupiter, the primary planets, and the moon are attracted towards the centers of Jupiter, the sun, and the earth respectively by forces that keep them in their orbits (stopping them from following a linear path in the direction of their motion at any one time). These forces are also claimed to vary inversely with the square of the distance of the orbiting body (for example, Mars) from the center of the body about which it orbits (for example, the sun). These propositions are taken to follow from the phenomena, including the fact that the respective orbits can be shown to (approximately) obey Kepler’s law of areas and the harmonic law, and the laws of motion developed in book 1 of the Principia. Newton then asserted proposition 4: “The moon gravitates toward the earth and by the force of gravity is always drawn back from rectilinear motion and kept in its orbit” (p802). In other words, it is the force of gravity that keeps the moon in its orbit around the earth. Newton explicitly invoked rules 1 and 2 in the argument for this proposition (what has become known as the “moon-test”). First, astronomical observations told us how fast the moon accelerates towards the earth. Newton was then able to calculate what the acceleration of the moon would be at the earth’s surface, if it were to fall down to the earth. This turned out to be equal to the acceleration of bodies observed to fall in experiments conducted on earth. 
Since it is the force of gravity that causes bodies on earth to fall (Newton assumed his readers’ familiarity with “gravity” in this sense), and since both gravity and the force acting on the moon “are directed towards the center of the earth and are similar to each other and equal”, Newton asserted that “they will (by rules 1 and 2) have the same cause” (p805). Therefore, the force that acts on falling bodies on earth and the force that keeps the moon in its orbit are one and the same: gravity. Given this, the force of gravity acting on terrestrial bodies could now be claimed to obey an inverse-square law. Through similar deployment of rules 1, 2, and 4, Newton was led to the claim that it is also gravity that keeps the planets in their orbits around the sun and the satellites of Jupiter and Saturn in their orbits, since these forces are also directed toward the centers of the sun, Jupiter, and Saturn, and display similar properties to the force of gravity on earth, such as the fact that they obey an inverse-square law. Therefore, the force of gravity was held to act on all planets universally. Through several more steps, Newton was eventually able to arrive at the principle of universal gravitation: that gravity is a mutually attractive force that acts on any two bodies whatsoever and is described by an inverse-square law, which says that each body attracts the other with a force of equal magnitude that is proportional to the product of the masses of the two bodies and inversely proportional to the squared distance between them. From there, Newton was able to determine the masses and densities of the sun, Jupiter, Saturn, and the earth, and offer a new explanation for the tides of the seas, thus showing the remarkable explanatory power of this new physics.
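The moon-test can be reproduced as a back-of-the-envelope calculation. The sketch below uses modern approximate values for the relevant quantities (these are illustrative figures, not Newton’s own data) to check that the moon’s centripetal acceleration matches what the inverse-square scaling of terrestrial gravity predicts:

```python
import math

# Modern approximate values (illustrative; not Newton's own figures)
g = 9.81                 # acceleration of falling bodies at earth's surface, m/s^2
r_moon = 3.844e8         # mean earth-moon distance, m
r_earth = 6.371e6        # earth's radius, m
T = 27.32 * 24 * 3600    # sidereal period of the moon's orbit, s

# Centripetal acceleration of the moon in its orbit: a = 4*pi^2*r / T^2
a_moon = 4 * math.pi**2 * r_moon / T**2

# Inverse-square prediction: scale surface gravity by (r_earth / r_moon)^2
a_predicted = g * (r_earth / r_moon) ** 2

print(a_moon)       # ~0.00272 m/s^2
print(a_predicted)  # ~0.00269 m/s^2
```

The two figures agree to within about one percent, which is the quantitative heart of the claim that one and the same force acts on falling bodies and on the moon.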

Newton’s argument has been the subject of much debate amongst historians and philosophers of science (for further discussion of the various controversies surrounding its structure and the accuracy of its premises, see Glymour, 1980; Cohen, 1999; Harper, 2002). However, one thing that seems to be clear is that his conclusions are by no means forced on us through simple deductions from the phenomena, even when combined with the mathematical theorems and general theory of motion outlined in book 1 of the Principia. No experiment or mathematical derivation from the phenomena demonstrated that it must be gravity that is the common cause of the falling of bodies on earth, the orbits of the moon, the planets and their satellites, much less that gravity is a mutually attractive force acting on all bodies whatsoever. Rather, Newton’s argument appears to boil down to the claim that if gravity did have the properties accorded to it by the principle of universal gravitation, it could provide a common causal explanation for all the phenomena, and his rules of method tell us to infer common causes wherever we can. Hence, the rules, which are in turn grounded in a preference for simplicity, play a crucial role in taking us from the phenomena to universal gravitation (for further discussion of the apparent link between simplicity and common cause reasoning, see Sober, 1988). Newton’s argument for universal gravitation can thus be seen as an argument to the putatively simplest explanation for the empirical observations.

ii. Other Examples

Numerous other putative examples of simplicity considerations at work in the history of science have been cited in the literature:

  • One of the most commonly cited concerns Copernicus’ arguments for the heliocentric theory of planetary motion. Copernicus placed particular emphasis on the comparative “simplicity” and “harmony” of the account that his theory gave of the motions of the planets compared with the rival geocentric theory derived from the work of Ptolemy. This argument appears to have carried significant weight for Copernicus’ successors, including Rheticus, Galileo, and Kepler, who all emphasized simplicity as a major motivation for heliocentrism. Philosophers have suggested various reconstructions of the Copernican argument (see for example, Glymour, 1980; Rosencrantz, 1983; Forster and Sober, 1994; Myrvold, 2003; Martens, 2009). However, historians of science have questioned the extent to which simplicity could have played a genuine rather than purely rhetorical role in this episode. For example, it has been argued that there is no clear sense in which the Copernican system was in fact simpler than Ptolemy’s, and that geocentric systems such as the Tychonic system could be constructed that were at least as simple as the Copernican one (for discussion, see Kuhn, 1957; Palter, 1970; Cohen, 1985; Gingerich, 1993; Martens, 2009).
  • It has been widely claimed that simplicity played a key role in the development of Einstein’s theories of special and general relativity, and in the early acceptance of Einstein’s theories by the scientific community (see for example, Hesse, 1974; Holton, 1974; Schaffner, 1974; Sober, 1981; Pais, 1982; Norton, 2000).
  • Thagard (1988) argues that simplicity considerations played an important role in Lavoisier’s case against the existence of phlogiston and in favour of the oxygen theory of combustion.
  • Carlson (1966) describes several episodes in the history of genetics in which simplicity considerations seemed to have held sway.
  • Nolan (1997) argues that a preference for ontological parsimony played an important role in the discovery of the neutrino and in the proposal of Avogadro’s hypothesis.
  • Baker (2007) argues that ontological parsimony was a key issue in discussions over rival dispersalist and extensionist bio-geographical theories in the late 19th and early 20th century.

Though it is commonplace for scientists and philosophers to claim that simplicity considerations have played a significant role in the history of science, it is important to note that some skeptics have argued that the actual historical importance of simplicity considerations has been over-sold (for example, Bunge, 1961; Lakatos and Zahar, 1978). Such skeptics dispute the claim that we can only explain the basis for these and other episodes of theory change by according a role to simplicity, claiming other considerations actually carried more weight. In addition, it has been argued that, in many cases, what appear on the surface to have been appeals to the relative simplicity of theories were in fact covert appeals to some other theoretical virtue (for example, Boyd, 1990; Sober, 1994; Norton, 2003; Fitzpatrick, 2009). Hence, for any putative example of simplicity at work in the history of science, it is important to consider whether the relevant arguments are not best reconstructed in other terms (such a “deflationary” view of simplicity will be discussed further in Section 4c).

c. Simplicity and Inductive Inference

Many philosophers have come to see simplicity considerations figuring not only in how scientists go about evaluating and choosing between developed scientific theories, but also in the mechanics of making much more basic inductive inferences from empirical data. The standard illustration of this in the modern literature is the practice of curve-fitting. Suppose that we have a series of observations of the values of a variable, y, given values of another variable, x. This gives us a series of data points, as represented in Figure 1.

Figure 1

Given this data, what underlying relationship should we posit between x and y so that we can predict future pairs of xy values? Standard practice is not to select a bumpy curve that neatly passes through all the data points, but rather to select a smooth curve—preferably a straight line, such as H1—that passes close to the data. But why do we do this? Part of an answer comes from the fact that if the data is to some degree contaminated with measurement error (for example, through mistakes in data collection) or “noise” produced by the effects of uncontrolled factors, then any curve that fits the data perfectly will most likely be false. However, this does not explain our preference for a curve like H1 over an infinite number of other curves—H2, for instance—that also pass close to the data. It is here that simplicity has been seen as playing a vital, though often implicit, role in how we go about inferring hypotheses from empirical data: H1 posits a “simpler” relationship between x and y than H2—hence, it is for reasons of simplicity that we tend to infer hypotheses like H1.
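The contrast between a smooth curve that passes close to the data and a bumpy curve that passes through every point can be made concrete with a small simulation. The sketch below uses made-up data (eight points generated from a linear relationship plus noise, not the data of Figure 1) and fits both a straight line and a maximally flexible polynomial:

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 8)
y = 2.0 * x + 1.0 + rng.normal(0, 1.0, size=x.size)  # linear "signal" plus noise

# An H1-style hypothesis: a straight line fitted by least squares
line = Polynomial.fit(x, y, deg=1)

# An H2-style hypothesis: a degree-7 polynomial, which can pass
# exactly through all 8 data points
bumpy = Polynomial.fit(x, y, deg=7)

print(np.max(np.abs(line(x) - y)))   # clearly nonzero: the line only passes near the points
print(np.max(np.abs(bumpy(x) - y)))  # essentially zero: the bumpy curve interpolates them
```

Both hypotheses are consistent with the data in the relevant sense, which is precisely why something beyond fit—here, simplicity—must be doing the work when we prefer the line.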

The practice of curve-fitting has been taken to show that—whether we are aware of it or not—human beings have a fundamental cognitive bias towards simple hypotheses. Whether we are deciding between rival scientific theories, or performing more basic generalizations from our experience, we ubiquitously tend to infer the simplest hypothesis consistent with our observations. Moreover, this bias is held to be necessary in order for us to be able to select a unique hypothesis from the potentially limitless number of hypotheses consistent with any finite amount of experience.

The view that simplicity may often play an implicit role in empirical reasoning can arguably be traced back to David Hume’s description of enumerative induction in the context of his formulation of the famous problem of induction. Hume suggested that a tacit assumption of the uniformity of nature is ingrained into our psychology. Thus, we are naturally drawn to the conclusion that all ravens have black feathers from the fact that all previously observed ravens have black feathers because we tacitly assume that the world is broadly uniform in its properties. This has been seen as a kind of simplicity assumption: it is simpler to assume more of the same.

A fundamental link between simplicity and inductive reasoning has been retained in many more recent descriptive accounts of inductive inference. For instance, Hans Reichenbach (1949) described induction as an application of what he called the “Straight Rule”, modelling all inductive inference on curve-fitting. In addition, proponents of the model of “Inference to Best Explanation”, who hold that many inductive inferences are best understood as inferences to the hypothesis that would, if true, provide the best explanation for our observations, normally claim that simplicity is one of the criteria that we use to determine which hypothesis constitutes the “best” explanation.
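Reichenbach’s Straight Rule can be stated in a single line: having observed a relative frequency in a sample, posit that same frequency as the limiting frequency in the whole population. A minimal sketch (the raven example is illustrative, not Reichenbach’s own):

```python
def straight_rule(successes: int, trials: int) -> float:
    """Posit the observed relative frequency as the limiting frequency."""
    return successes / trials

# Having observed 8 black ravens out of 10, the rule projects 0.8
# as the true frequency of black ravens
print(straight_rule(8, 10))  # 0.8
```

The "straightness" lies in projecting the observed frequency unchanged—the analogue, in frequency estimation, of drawing a straight line through the data.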

In recent years, the putative role of simplicity in our inferential psychology has been attracting increasing attention from cognitive scientists. For instance, Lombrozo (2007) describes experiments that she claims show that participants use the relative simplicity of rival explanations (for instance, whether a particular medical diagnosis for a set of symptoms involves assuming the presence of one or multiple independent conditions) as a guide to assessing their probability, such that a disproportionate amount of contrary probabilistic evidence is required for participants to choose a more complex explanation over a simpler one. Simplicity considerations have also been seen as central to learning processes in many different cognitive domains, including language acquisition and category learning (for example, Chater, 1999; Lu and others, 2006).

d. Simplicity in Statistics and Data Analysis

Philosophers have long used the example of curve-fitting to illustrate the (often implicit) role played by considerations of simplicity in inductive reasoning from empirical data. However, partly due to the advent of low-cost computing power and the fact that scientists in many disciplines find themselves having to deal with ever larger and more intricate bodies of data, recent decades have seen a remarkable revolution in the methods available to scientists for analyzing and interpreting empirical data (Gauch, 2006). Importantly, there are now numerous formalized procedures for data analysis that can be implemented in computer software—and which are widely used in disciplines from engineering to crop science to sociology—that contain an explicit role for some notion of simplicity. The literature on such methods abounds with talk of “Ockham’s Razor”, “Occam factors”, “Ockham’s hill” (MacKay, 1992; Gauch, 2006), “Occam’s window” (Raftery and others, 1997), and so forth. This literature not only provides important illustrations of the role that simplicity plays in scientific practice, but may also offer insights for philosophers seeking to understand the basis for this role.

As an illustration, consider standard procedures for model selection, such as the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Minimum Message Length (MML) and Minimum Description Length (MDL) procedures, and numerous others (for discussion, see Forster and Sober, 1994; Forster, 2001; Gauch, 2003; Dowe and others, 2007). Model selection is a matter of selecting the kind of relationship that is to be posited between a set of variables, given a sample of data, in an effort to generate hypotheses about the true underlying relationship holding in the population of inference and/or to make predictions about future data. This question arises in the simple curve-fitting example discussed above—for instance, whether the true underlying relationship between x and y is linear, quadratic, cubic, and so on. It also arises in lots of other contexts, such as the problem of inferring the causal relationship that exists between an empirical effect and a set of variables. “Models” in this sense are families of functions, such as the family of linear functions, LIN: y = a + bx, or the family of parabolic functions, PAR: y = a + bx + cx². The simplicity of a model is normally explicated in terms of the number of adjustable parameters it contains (MML and MDL measure the simplicity of models in terms of the extent to which they provide compact descriptions of the data, but produce similar results to the counting of adjustable parameters). On this measure, the model LIN is simpler than PAR, since LIN contains two adjustable parameters, whereas PAR has three. A consequence of this is that a more complex model will always be able to fit a given sample of data better than a simpler model (“fitting” a model to the data involves using the data to determine what the values of the parameters in the model should be, given that data—that is, identifying the best-fitting member of the family).
For instance, returning to the curve-fitting scenario represented in Figure 1, the best-fitting curve in PAR is guaranteed to fit this data set at least as well as the best-fitting member of the simpler model, LIN, and this is true no matter what the data are, since linear functions are special cases of parabolas, where c = 0, so any curve that is a member of LIN is also a member of PAR.
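The point that the more complex model can never fit worse can be verified directly. Below is a sketch using made-up data (truly linear, plus assumed noise) that compares the residual sum of squares of the best-fitting member of LIN with that of the best-fitting member of PAR:

```python
import numpy as np
from numpy.polynomial import Polynomial

def rss(model, x, y):
    """Residual sum of squares: how far the fitted model misses the data."""
    return float(np.sum((model(x) - y) ** 2))

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 20)
y = 3.0 + 0.5 * x + rng.normal(0, 0.5, size=x.size)  # truly linear data plus noise

lin = Polynomial.fit(x, y, deg=1)  # LIN: y = a + bx (two adjustable parameters)
par = Polynomial.fit(x, y, deg=2)  # PAR: y = a + bx + cx^2 (three parameters)

# PAR's best-fitting member can never do worse than LIN's, because every
# line is a parabola with c = 0, so LIN is nested inside PAR
print(rss(lin, x, y) >= rss(par, x, y))  # True
```

Note that PAR fits better here even though the data were generated from a linear relationship: the extra parameter is absorbing noise, which is exactly why better fit alone cannot settle which model to prefer.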

Model selection procedures produce a ranking of all the models under consideration in light of the data, thus allowing scientists to choose between them. Though they do it in different ways, AIC, BIC, MML, and MDL all implement procedures for model selection that impose a penalty on the complexity of a model, so that a more complex model will have to fit the data sample at hand significantly better than a simpler one for it to be rated higher than the simpler model. Often, this penalty is greater the smaller the sample of data. Interestingly—and contrary to the assumptions of some philosophers—this seems to suggest that simplicity considerations do not only come into play as a tiebreaker between theories that fit the data equally well: according to the model selection literature, simplicity sometimes trumps better fit to the data. Hence, simplicity need not only come into play when all other things are equal.

Both statisticians and philosophers of statistics have vigorously debated the underlying justification for these sorts of model selection procedures (see, for example, the papers in Zellner and others, 2001). However, one motivation for taking into account the simplicity of models derives from a piece of practical wisdom: when there is error or “noise” in the data sample, a relatively simple model that fits the sample less well will often be more accurate when it comes to predicting extra-sample (for example, future) data than a more complex model that fits the sample more closely. The logic here is that since more complex models are more flexible in their ability to fit the data (since they have more adjustable parameters), they also have a greater propensity to be misled by errors and noise, in which case they may recover less of the true underlying “signal” in the sample. Thus, constraining model complexity may facilitate greater predictive accuracy. This idea is captured in what Gauch (2003, 2006) (following MacKay, 1992) calls “Ockham’s hill”. To the left of the peak of the hill, increasing the complexity of a model improves its accuracy with respect to extra-sample data because this recovers more of the signal in the sample. However, after the peak, increasing complexity actually diminishes predictive accuracy because this leads to over-fitting to spurious noise in the sample. There is therefore an optimal trade-off (at the peak of Ockham’s hill) between simplicity and fit to the sample data when it comes to facilitating accurate prediction of extra-sample data. Indeed, this trade-off is essentially the core idea behind AIC, the development of which initiated the now enormous literature on model selection, and the philosophers Malcolm Forster and Elliott Sober have sought to use such reasoning to make sense of the role of simplicity in many areas of science (see Section 4biii).

One important implication of this apparent link between model simplicity and predictive accuracy is that interpreting sample data using relatively simple models may improve the efficiency of experiments by allowing scientists to do more with less data—for example, scientists may be able to run a costly experiment fewer times before they can be in a position to make relatively accurate predictions about the future. Gauch (2003, 2006) describes several real world cases from crop science and elsewhere where this gain in accuracy and efficiency from the use of relatively simple models has been documented.

2. Wider Philosophical Significance of Issues Surrounding Simplicity

The putative role of simplicity, both in the evaluation of rival scientific theories and in the mechanics of how we go about inferring hypotheses from empirical data, clearly raises a number of difficult philosophical issues. These include, but are by no means limited to: (1) the question of what precisely it means to say that one theory or hypothesis is simpler than another and how the relative simplicity of theories is to be measured; (2) the question of what rational justification (if any) can be provided for choosing between rival theories on grounds of simplicity; and (3) the closely related question of what weight simplicity considerations ought to carry in theory choice relative to other theoretical virtues, particularly if these sometimes have to be traded-off against each other. (For general surveys of the philosophical literature on these issues, see Hesse, 1967; Sober, 2001a, 2001b). Before we delve more deeply into how philosophers have sought to answer these questions, it is worth noting the close connections between philosophical issues surrounding simplicity and many of the most important controversies in the philosophy of science and epistemology.

First, the problem of simplicity has close connections with long-standing issues surrounding the nature and justification of inductive inference. Some philosophers have actually offered up the idea that simpler theories are preferable to less simple ones as a purported solution to the problem of induction: it is the relative simplicity of the hypotheses that we tend to infer from empirical observations that supposedly provides the justification for these inferences—thus, it is simplicity that provides the warrant for our inductive practices. This approach is not as popular as it once was, since it is taken merely to trade the problem of induction for the equally substantive problem of justifying preferences for simpler theories. A more common view in the recent literature is that the problem of induction and the problem of justifying preferences for simpler theories are closely connected, or may even amount to the same problem. Hence, a solution to the latter problem will provide substantial help towards solving the former.

More generally, the ability to make sense of the putative role of simplicity in scientific reasoning has been seen by many to be a central desideratum for any adequate philosophical theory of the scientific method. For example, Thomas Kuhn’s (1962) influential discussion of the importance of scientists’ aesthetic preferences—including but not limited to judgments of simplicity—in scientific revolutions was a central part of his case for adopting a richer conception of the scientific method and of theory change in science than he found in the dominant logical empiricist views of the time. More recently, critics of the Bayesian approach to scientific reasoning and theory confirmation, which holds that sound inductive reasoning is reasoning according to the formal principles of probability, have claimed that simplicity is an important feature of scientific reasoning that escapes a Bayesian analysis. For instance, Forster and Sober (1994) argue that Bayesian approaches to curve-fitting and model selection (such as the Bayesian Information Criterion) cannot themselves be given a Bayesian rationale, nor can any other approach that builds in a bias towards simpler models. The ability of the Bayesian approach to make sense of simplicity in model selection and other aspects of scientific practice has thus been seen as central to evaluating its promise (see for example, Glymour, 1980; Forster and Sober, 1994; Forster, 1995; Kelly and Glymour, 2004; Howson and Urbach, 2006; Dowe and others, 2007).

Discussions over the legitimacy of simplicity as a criterion for theory choice have also been closely bound up with debates over scientific realism. Scientific realists assert that scientific theories aim to offer a literally true description of the world and that we have good reason to believe that the claims of our current best scientific theories are at least approximately true, including those claims that purport to be about “unobservable” natural phenomena that are beyond our direct perceptual access. Some anti-realists object that it is possible to formulate incompatible alternatives to our current best theories that are just as consistent with any current data that we have, perhaps even any future data that we could ever collect. They claim that we can therefore never be justified in asserting that the claims of our current best theories, especially those concerning unobservables, are true, or approximately true. A standard realist response is to emphasize the role of the so-called “theoretical virtues” in theory choice, among which simplicity is normally listed. The claim is thus that we rule out these alternative theories because they are unnecessarily complex. Importantly, for this defense to work, realists have to defend the idea that not only are we justified in choosing between rival theories on grounds of simplicity, but also that simplicity can be used as a guide to the truth. Naturally, anti-realists, particularly those of an empiricist persuasion (for example, van Fraassen, 1989), have expressed deep skepticism about the alleged truth-conduciveness of a simplicity criterion.

3. Defining and Measuring Simplicity

The first major philosophical problem that seems to arise from the notion that simplicity plays a role in theory choice and evaluation concerns specifying in more detail what it means to say that one theory is simpler than another and how the relative simplicity of theories is to be precisely and objectively measured. Numerous attempts have been made to formulate definitions and measures of theoretical simplicity, all of which face very significant challenges. Philosophers have not been the only ones to contribute to this endeavour. For instance, over the last few decades, a number of formal measures of simplicity and complexity have been developed in mathematical information theory. This section provides an overview of some of the main simplicity measures that have been proposed and the problems that they face. The proposals described here have also normally been tied to particular proposals about what justifies preferences for simpler theories. However, discussion of these justifications will be left until Section 4.

To begin with, it is worth considering why providing a precise definition and measure of theoretical simplicity ought to be regarded as a substantial philosophical problem. After all, it often seems that when one is confronted with a set of rival theories designed to explain a particular empirical phenomenon, it is just obvious which is the simplest. One does not always need a precise definition or measure of a particular property to be able to tell whether or not something exhibits it to a greater degree than something else. Hence, it could be suggested that if there is a philosophical problem here, it is only of very minor interest and certainly of little relevance to scientific practice. There are, however, some reasons to regard this as a substantial philosophical problem, which also has some practical relevance.

First, it is not always easy to tell whether one theory really ought to be regarded as simpler than another, and it is not uncommon for practicing scientists to disagree about the relative simplicity of rival theories. A well-known historical example is the disagreement between Galileo and Kepler concerning the relative simplicity of Copernicus’ theory of planetary motion, according to which the planets move only in perfect circular orbits with epicycles, and Kepler’s theory, according to which the planets move in elliptical orbits (see Holton, 1974; McAllister, 1996). Galileo held to the idea that perfect circular motion is simpler than elliptical motion. In contrast, Kepler emphasized that an elliptical model of planetary motion required many fewer orbits than a circular model and enabled a reduction of all the planetary motions to three fundamental laws of planetary motion. The problem here is that scientists seem to evaluate the simplicity of theories along a number of different dimensions that may conflict with each other. Hence, we have to deal with the fact that a theory may be regarded as simpler than a rival in one respect and more complex in another. To illustrate this further, consider the following list of commonly cited ways in which theories may be held to be simpler than others:

  • Quantitative ontological parsimony (or economy): postulating a smaller number of independent entities, processes, causes, or events.
  • Qualitative ontological parsimony (or economy): postulating a smaller number of independent kinds or classes of entities, processes, causes, or events.
  • Common cause explanation: accounting for phenomena in terms of common rather than separate causal processes.
  • Symmetry: postulating that equalities hold between interacting systems and that the laws describing the phenomena look the same from different perspectives.
  • Uniformity (or homogeneity): postulating a smaller number of changes in a given phenomenon and holding that the relations between phenomena are invariant.
  • Unification: explaining a wider and more diverse range of phenomena that might otherwise be thought to require separate explanations in a single theory (theoretical reduction is generally held to be a species of unification).
  • Lower level processes: when the kinds of processes that can be posited to explain a phenomenon come in a hierarchy, positing processes that come lower rather than higher in this hierarchy.
  • Familiarity (or conservativeness): explaining new phenomena with minimal new theoretical machinery, reusing existing patterns of explanation.
  • Paucity of auxiliary assumptions: invoking fewer extraneous assumptions about the world.
  • Paucity of adjustable parameters: containing fewer independent parameters that the theory leaves to be determined by the data.

As can be seen from this list, there is considerable diversity here. We can see that theoretical simplicity is frequently thought of in ontological terms (for example, quantitative and qualitative parsimony), but also sometimes as a structural feature of theories (for example, unification, paucity of adjustable parameters), and while some of these intuitive types of simplicity may often cluster together in theories—for instance, qualitative parsimony would seem to often go together with invoking common cause explanations, which would in turn often seem to go together with explanatory unification—there is also considerable scope for them pointing in different directions in particular cases. For example, a theory that is qualitatively parsimonious as a result of positing fewer different kinds of entities might be quantitatively unparsimonious as a result of positing more of a particular kind of entity; while the demand to explain in terms of lower-level processes rather than higher-level processes may conflict with the demand to explain in terms of common causes behind similar phenomena, and so on. There are also different possible ways of evaluating the simplicity of a theory with regard to any one of these intuitive types of simplicity. A theory may, for instance, come out as more quantitatively parsimonious than another if one focuses on the number of independent entities that it posits, but less parsimonious if one focuses on the number of independent causes it invokes. Consequently, it seems that if a simplicity criterion is actually to be applicable in practice, we need some way of resolving the disagreements that may arise between scientists about the relative simplicity of rival theories, and this requires a more precise measure of simplicity.

Second, as has already been mentioned, a considerable amount of the skepticism expressed both by philosophers and by scientists about the practice of choosing one theory over another on grounds of relative simplicity has stemmed from the suspicion that our simplicity judgments lack a principled basis (for example, Ackerman, 1961; Bunge, 1961; Priest, 1976). Disagreements between scientists, along with the multiplicity and scope for conflict between intuitive types of simplicity have been important contributors to this suspicion, leading to the view that for any two theories, T1 and T2, there is some way of evaluating their simplicity such that T1 comes out as simpler than T2, and vice versa. It seems, then, that an adequate defense of the legitimacy of a simplicity criterion needs to show that there are in fact principled ways of determining when one theory is indeed simpler than another. Moreover, in so far as there is also a justificatory issue to be dealt with, we also need to be clear about exactly what it is that we need to justify a preference for.

a. Syntactic Measures

One proposal is that the simplicity of theories can be precisely and objectively measured in terms of how briefly they can be expressed. For example, a natural way of measuring the simplicity of an equation is just to count the number of terms or parameters that it contains. Similarly, we could measure the simplicity of a theory in terms of the size of the vocabulary—for example, the number of extra-logical terms—required to write down its claims. Such measures of simplicity are often referred to as syntactic measures, since they involve counting the linguistic elements required to state, or describe, the theory.

A major problem facing any such syntactic measure of simplicity is the problem of language variance. A measure of simplicity is language variant if it delivers different results depending on the language that is used to represent the theories being compared. Suppose, for example, that we measure the simplicity of an equation by counting the number of non-logical terms that it contains. This will produce the result that r = a will come out as simpler than x² + y² = a². However, this second equation is simply a transformation of the first into Cartesian co-ordinates, where r² = x² + y², and is hence logically equivalent. The intuitive proposal for measuring simplicity in curve-fitting contexts, according to which hypotheses are said to be simpler if they contain fewer parameters, is also language variant in this sense. How many parameters a hypothesis contains depends on the co-ordinate scales that one uses. For any two non-identical functions, F and G, there is some way of transforming the co-ordinate scales such that we can turn F into a linear curve and G into a non-linear curve, and vice versa.
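The circle example can be checked mechanically. The snippet below (with an arbitrary radius a, chosen for illustration) confirms that the polar equation r = a and the Cartesian equation x² + y² = a² pick out exactly the same points, so a count of non-logical terms tracks the chosen coordinate language rather than the hypothesis itself:

```python
import math

a = 2.0  # arbitrary radius, for illustration only

def polar_form(r):
    # r = a : the hypothesis stated in polar co-ordinates
    return math.isclose(r, a)

def cartesian_form(x, y):
    # x^2 + y^2 = a^2 : the same hypothesis in Cartesian co-ordinates
    return math.isclose(x**2 + y**2, a**2)

# Every point on the circle satisfies both formulations
for theta in (0.0, 0.7, 2.0, 3.1, 5.5):
    x, y = a * math.cos(theta), a * math.sin(theta)
    assert polar_form(math.hypot(x, y)) and cartesian_form(x, y)
print("same hypothesis, different term counts")
```

Since the two formulations are satisfied by exactly the same points, any simplicity measure that counts their terms differently is measuring the representation, not the theory.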

Nelson Goodman’s (1983) famous “new riddle of induction” allows us to formulate another example of the problem of language variance. Suppose all previously observed emeralds have been green. Now consider the following hypotheses about the color properties of the entire population of emeralds:

  • H1: all emeralds are green
  • H2: all emeralds first observed prior to time t are green and all emeralds first observed after time t are blue (where t is some future time)

Intuitively, H1 seems to be a simpler hypothesis than H2. To begin with, it can be stated with a smaller vocabulary. H1 also seems to postulate uniformity in the properties of emeralds, while H2 posits non-uniformity. For instance, H2 seems to assume that there is some link between the time at which an emerald is first observed and its properties. Thus it can be viewed as including an additional time parameter. But now consider Goodman’s invented predicates, “grue” and “bleen”. These have been defined in a variety of different ways, but let us define them here as follows: an object is grue if it is first observed before time t and the object is green, or first observed after t and the object is blue; an object is bleen if it is first observed before time t and the object is blue, or first observed after t and the object is green. With these predicates, we can define a further property, “grolor”. Grue and bleen are grolors just as green and blue are colors. Now, because of the way that grolors are defined, color predicates like “green” and “blue” can also be defined in terms of grolor predicates: an object is green if first observed before time t and the object is grue, or first observed after time t and the object is bleen; an object is blue if first observed before time t and the object is bleen, or first observed after t and the object is grue. This means that statements that are expressed in terms of green and blue can also be expressed in terms of grue and bleen. So, we can rewrite H1 and H2 as follows:

  • H1: all emeralds first observed prior to time t are grue and all emeralds first observed after time t are bleen (where t is some future time)
  • H2: all emeralds are grue

Recall that earlier we judged H1 to be simpler than H2. However, if we are to retain that simplicity judgment, we cannot say that H1 is simpler than H2 because it can be stated with a smaller vocabulary; nor can we say that H1 posits greater uniformity, and is hence simpler, because it does not contain a time parameter. This is because simplicity judgments based on such syntactic features can be reversed merely by switching the language used to represent the hypotheses from a color language to a grolor language.
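The inter-definability of the two vocabularies can be verified mechanically. In the following Python sketch, T is an arbitrary cutoff time and the handling of objects first observed exactly at t is a stipulation of the example; the point is just that every color fact is recoverable from grolor facts and vice versa:

```python
# A toy model of Goodman's predicates. T is the critical time t; the choice
# of value, and the convention that "after t" means ">= T", are assumptions
# made for illustration.
T = 100

def grue(color, first_observed):
    # grue: (first observed before t and green) or (first observed after t and blue)
    return (first_observed < T and color == "green") or \
           (first_observed >= T and color == "blue")

def bleen(color, first_observed):
    # bleen: (first observed before t and blue) or (first observed after t and green)
    return (first_observed < T and color == "blue") or \
           (first_observed >= T and color == "green")

def green_from_grolor(is_grue, first_observed):
    # green: (before t and grue) or (after t and bleen); for a colored object
    # bleen is just the complement of grue.
    return is_grue if first_observed < T else not is_grue

# Color facts are fully recoverable from grolor facts:
for color in ("green", "blue"):
    for when in (10, 150):
        assert green_from_grolor(grue(color, when), when) == (color == "green")
print("color and grolor languages are inter-translatable")
```

Because the translation is exact in both directions, any syntactic feature of a hypothesis, such as vocabulary size or the presence of a time parameter, can be made to appear or disappear simply by choosing which vocabulary to write it in.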

Examples such as these have been taken to show two things. First, no syntactic measure of simplicity can suffice to produce a principled simplicity ordering, since all such measures will produce different results depending on the language of representation that is used. It is not enough just to stipulate that we should evaluate simplicity in one language rather than another, since that would not explain why simplicity should be measured in that way. In particular, we want to know that our chosen language is accurately tracking the objective language-independent simplicity of the theories being compared. Hence, if a syntactic measure of simplicity is to be used, say for practical purposes, it must be underwritten by a more fundamental theory of simplicity. Second, a plausible measure of simplicity cannot be entirely neutral with respect to all of the different claims about the world that the theory makes or can be interpreted as making. Because of the respective definitions of colors and grolors, any hypothesis that posits uniformity in color properties must posit non-uniformity in grolor properties. As Goodman emphasized, one can find uniformity anywhere if no restriction is placed on what kinds of properties should be taken into account. Similarly, it will not do to say that theories are simpler because they posit the existence of fewer entities, causes and processes, since, using Goodman-like manipulations, it is trivial to show that a theory can be regarded as positing any number of different entities, causes and processes. Hence, some principled restriction needs to be placed on which aspects of the content of a theory are to be taken into account and which are to be disregarded when measuring their relative simplicity.

b. Goodman’s Measure

According to Nelson Goodman, an important component of the problem of measuring the simplicity of scientific theories is the problem of measuring the degree of systematization that a theory imposes on the world, since, for Goodman, to seek simplicity is to seek a system. In a series of papers in the 1940s and 50s, Goodman (1943, 1955, 1958, 1959) attempted to explicate a precise measure of theoretical systematization in terms of the logical properties of the set of concepts, or extra-logical terms, that make up the statements of the theory.

According to Goodman, scientific theories can be regarded as sets of statements. These statements contain various extra-logical terms, including property terms, relation terms, and so on. These terms can all be assigned predicate symbols. Hence, all the statements of a theory can be expressed in a first order language, using standard symbolic notation. For instance, “… is acid” may become “A(x)”, “… is smaller than ____” may become “S(x, y)”, and so on. Goodman then claims that we can measure the simplicity of the system of predicates employed by the theory in terms of their logical properties, such as their arity, reflexivity, transitivity, symmetry, and so on. The details are highly technical but, very roughly, Goodman’s proposal is that a system of predicates that can be used to express more is more complex than a system of predicates that can be used to express less. For instance, one of the axioms of Goodman’s proposal is that if every set of predicates of a relevant kind, K, is always replaceable by a set of predicates of another kind, L, then K is not more complex than L.

Part of Goodman’s project was to avoid the problem of language variance. Goodman’s measure is a linguistic measure, since it concerns measuring the simplicity of a theory’s predicate basis in a first order language. However, it is not a purely syntactic measure, since it does not involve merely counting linguistic elements, such as the number of extra-logical predicates. Rather, it can be regarded as an attempt to measure the richness of a conceptual scheme: conceptual schemes that can be used to say more are more complex than conceptual schemes that can be used to say less. Hence, a theory can be regarded as simpler if it requires a less expressive system of concepts.

Goodman developed his axiomatic measure of simplicity in considerable detail. However, Goodman himself only ever regarded it as a measure of one particular type of simplicity, since it only concerns the logical properties of the predicates employed by the theory. It does not, for example, take account of the number of entities that a theory postulates. Moreover, Goodman never showed how the measure could be applied to real scientific theories. It has been objected that even if Goodman’s measure could be applied, it would not discriminate between many theories that intuitively differ in simplicity—indeed, in the kind of simplicity as systematization that Goodman wants to measure. For instance, it is plausible that the system of concepts used to express the Copernican theory of planetary motion is just as expressively rich as the system of concepts used to express the Ptolemaic theory, yet the former is widely regarded as considerably simpler than the latter, partly in virtue of it providing an intuitively more systematic account of the data (for discussion of the details of Goodman’s proposal and the objections it faces, see Kemeny, 1955; Suppes, 1956; Kyburg, 1961; Hesse, 1967).

c. Simplicity as Testability

It has often been argued that simpler theories say more about the world and hence are easier to test than more complex ones. C. S. Peirce (1931), for example, claimed that the simplest theories are those whose empirical consequences are most readily deduced and compared with observation, so that they can be eliminated more easily if they are wrong. Complex theories, on the other hand, tend to be less precise and allow for more wriggle room in accommodating the data. This apparent connection between simplicity and testability has led some philosophers to attempt to formulate measures of simplicity in terms of the relative testability of theories.

Karl Popper (1959) famously proposed one such testability measure of simplicity. Popper associated simplicity with empirical content: simpler theories say more about the world than more complex theories and, in so doing, place more restriction on the ways that the world can be. According to Popper, the empirical content of theories, and hence their simplicity, can be measured in terms of their falsifiability. The falsifiability of a theory concerns the ease with which the theory can be proven false, if the theory is indeed false. Popper argued that this could be measured in terms of the amount of data that one would need to falsify the theory. For example, on Popper’s measure, the hypothesis that x and y are linearly related, according to an equation of the form, y = a + bx, comes out as having greater empirical content and hence greater simplicity than the hypothesis that they are related according to a parabola of the form, y = a + bx + cx^2. This is because one only needs three data points to falsify the linear hypothesis, but one needs at least four data points to falsify the parabolic hypothesis. Thus Popper argued that empirical content, falsifiability, and hence simplicity, could be seen as equivalent to the paucity of adjustable parameters. John Kemeny (1955) proposed a similar testability measure, according to which theories are more complex if they can come out as true in more ways in an n-member universe, where n is the number of individuals that the universe contains.
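The data-point counting behind Popper's measure can be sketched for the linear case. In the following Python snippet (the helper names are illustrative), a two-parameter line can be drawn exactly through any two points with distinct x-values, so two data points can never falsify the linear hypothesis, while a third generically can:

```python
# Sketch of Popper's point: a k-parameter curve can be fitted exactly through
# any k generic data points, so at least k+1 points are needed to falsify the
# hypothesis class. Here k = 2, the linear case y = a + b*x.

def fit_line(p1, p2):
    # The unique line through two points with distinct x-values.
    (x1, y1), (x2, y2) = p1, p2
    b = (y2 - y1) / (x2 - x1)
    a = y1 - b * x1
    return a, b

def consistent_with_some_line(points):
    # Is there SOME line y = a + b*x through all the points? Any such line
    # must pass through the first two, so fit those and check the rest.
    a, b = fit_line(points[0], points[1])
    return all(abs(a + b * x - y) < 1e-9 for x, y in points[2:])

two_points = [(0, 1), (1, 3)]             # any two points: no falsification possible
three_points = [(0, 1), (1, 3), (2, 6)]   # (2, 6) misses the line y = 1 + 2x

print(consistent_with_some_line(two_points))    # True
print(consistent_with_some_line(three_points))  # False: linear hypothesis falsified
```

The same reasoning extends to the parabolic case, where any three generic points can be fitted exactly, which is why Popper requires a fourth point to falsify it.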

Popper’s equation of simplicity with falsifiability suffers from some serious objections. First, it cannot be applied to comparisons between theories that make equally precise claims, such as a comparison between a specific parabolic hypothesis and a specific linear hypothesis, both of which specify precise values for their parameters and can be falsified by only one data point. It also cannot be applied when we compare theories that make probabilistic claims about the world, since probabilistic statements are not strictly falsifiable. This is particularly troublesome when it comes to accounting for the role of simplicity in the practice of curve-fitting, since one normally has to deal with the possibility of error in the data. As a result, an error distribution is normally added to the hypotheses under consideration, so that they are understood as conferring certain probabilities on the data, rather than as having deductive observational consequences. In addition, most philosophers of science now tend to think that falsifiability is not really an intrinsic property of theories themselves, but rather a feature of how scientists are disposed to behave towards their theories. Even deterministic theories normally do not entail particular observational consequences unless they are conjoined with particular auxiliary assumptions, usually leaving the scientist the option of saving the theory from refutation by tinkering with their auxiliary assumptions—a point famously emphasized by Pierre Duhem (1954). This makes it extremely difficult to maintain that simpler theories are intrinsically more falsifiable than less simple ones. Goodman (1961, pp. 150-151) also argued that equating simplicity with falsifiability leads to counter-intuitive consequences. The hypothesis, “All maple trees are deciduous”, is intuitively simpler than the hypothesis, “All maple trees whatsoever, and all sassafras trees in Eagleville, are deciduous”, yet, according to Goodman, the latter is clearly the easier of the two to falsify.

Both Popper and Kemeny essentially tried to link the simplicity of a theory with the degree to which it can accommodate potential future data: simpler theories are less accommodating than more complex ones. One interesting recent attempt to make sense of this notion of accommodation is due to Harman and Kulkarni (2007). Harman and Kulkarni analyze accommodation in terms of a concept drawn from statistical learning theory known as the Vapnik-Chervonenkis (VC) dimension. The VC dimension of a hypothesis can be roughly understood as a measure of the “richness” of the class of hypotheses from which it is drawn, where a class is richer if it is harder to find data that is inconsistent with some member of the class. Thus, a hypothesis drawn from a class that can fit any possible set of data will have infinite VC dimension. Though VC dimension shares some important similarities with Popper’s measure, there are important differences. Unlike Popper’s measure, it implies that accommodation is not always equivalent to the number of adjustable parameters. If we count adjustable parameters, sine curves of the form y = a sin bx come out as relatively unaccommodating; however, such curves have an infinite VC dimension. While Harman and Kulkarni do not propose that VC dimension be taken as a general measure of simplicity (in fact, they regard it as an alternative to simplicity in some scientific contexts), ideas along these lines might perhaps hold some future promise for testability/accommodation measures of simplicity. Similar notions of accommodation in terms of “dimension” have been used to explicate the notion of the simplicity of a statistical model in the face of the fact that the number of adjustable parameters a model contains is language variant (for discussion, see Forster, 1999; Sober, 2007).

d. Sober’s Measure

In his early work on simplicity, Elliott Sober (1975) proposed that the simplicity of theories be measured in terms of their question-relative informativeness. According to Sober, a theory is more informative if it requires less supplementary information from us in order for us to be able to use it to determine the answer to the particular questions that we are interested in. For instance, the hypothesis, y = 4x, is more informative and hence simpler than y = 2z + 2x with respect to the question, “what is the value of y?” This is because in order to find out the value of y one only needs to determine a value for x on the first hypothesis, whereas on the second hypothesis one also needs to determine a value for z. Similarly, Sober’s proposal can be used to capture the intuition that theories that say that a given class of things is uniform in its properties are simpler than theories that say that the class is non-uniform, because they are more informative relative to particular questions about the properties of the class. For instance, the hypothesis that “all ravens are black” is more informative and hence simpler than “70% of ravens are black” with respect to the question, “what will be the color of the next observed raven?” This is because on the former hypothesis one needs no additional information in order to answer this question, whereas one will have to supplement the latter hypothesis with considerable extra information in order to generate a determinate answer.
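A deliberately crude rendering of this idea can be given in a few lines of Python. Representing a hypothesis simply by the set of variables it mentions is an illustrative simplification, not Sober's own formalism, but it captures the count of supplementary information his examples turn on:

```python
# A toy version of Sober's question-relative informativeness: the supplementary
# information a hypothesis needs is modeled as the set of unknowns, other than
# the one asked about, that must be supplied before the question can be answered.
# The representation of hypotheses as variable tuples is an assumption made
# purely for illustration.

def extra_info_needed(hypothesis_vars, question_var):
    return set(hypothesis_vars) - {question_var}

h1 = ("y", "x")        # y = 4x
h2 = ("y", "z", "x")   # y = 2z + 2x

print(len(extra_info_needed(h1, "y")))  # 1: need only a value for x
print(len(extra_info_needed(h2, "y")))  # 2: need values for both x and z
```

On this toy measure h1 comes out simpler than h2 relative to the question "what is the value of y?", while relative to a question about z the ordering reverses, since h1 says nothing about z at all; this mirrors the question-relativity that Sober's critics focus on.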

By relativizing the notion of the content-fullness of theories to the question that one is interested in, Sober’s measure avoids the problem, faced by Popper’s and Kemeny’s proposals, of the most arbitrarily specific theories, or theories made up of strings of irrelevant conjunctions of claims, turning out to be the simplest. Moreover, according to Sober’s proposal, the content of the theory must be relevant to answering the question for it to count towards the theory’s simplicity. This gives rise to the most distinctive element of Sober’s proposal: different simplicity orderings of theories will be produced depending on the question one asks. For instance, if we want to know what the relationship is between values of z and given values of y and x, then y = 2z + 2x will be more informative, and hence simpler, than y = 4x. Thus, a theory can be simple relative to some questions and complex relative to others.

Critics have argued that Sober’s measure produces a number of counter-intuitive results. Firstly, the measure cannot explain why people tend to judge an equation such as y = 3x + 4x^2 – 50 as more complex than an equation like y = 2x, relative to the question, “what is the value of y?” In both cases, one only needs a value of x to work out a value for y. Similarly, Sober’s measure fails to deal with Goodman’s above-cited counter-example to the idea that simplicity equates to testability, since it produces the counter-intuitive outcome that there is no difference in simplicity between “all maple trees whatsoever, and all sassafras trees in Eagleville, are deciduous” and “all maple trees are deciduous” relative to questions about whether maple trees are deciduous. The interest-relativity of Sober’s measure has also generated criticism from those who prefer to see simplicity as a property that varies only with what a given theory is being compared with, not with the question that one happens to be asking.

e. Thagard’s Measure

Paul Thagard (1988) proposed that simplicity ought to be understood as a ratio of the number of facts explained by a theory to the number of auxiliary assumptions that the theory requires. Thagard defines an auxiliary assumption as a statement, not part of the original theory, which is assumed in order for the theory to be able to explain one or more of the facts to be explained. Simplicity is then measured as follows:

  • Simplicity of T = (Facts explained by T – Auxiliary assumptions of T) / Facts explained by T

A value of 0 is given to a maximally complex theory that requires as many auxiliary assumptions as facts that it explains, and a value of 1 to a maximally simple theory that requires no auxiliary assumptions at all. Thus, the higher the ratio of facts explained to auxiliary assumptions, the simpler the theory. The essence of Thagard’s proposal is that we want to explain as much as we can, while making the fewest assumptions about the way the world is. By balancing the paucity of auxiliary assumptions against explanatory power, it prevents the unfortunate consequence of the simplest theories turning out to be those that are most anaemic.
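Thagard's ratio is straightforward to compute; the following sketch simply restates the formula above as a function and checks its two extreme values:

```python
# Thagard's simplicity ratio: (F - A) / F, where F is the number of facts the
# theory explains and A the number of auxiliary assumptions it requires.
# 1.0 = maximally simple (no auxiliaries); 0.0 = maximally complex (A == F).

def thagard_simplicity(facts_explained, auxiliary_assumptions):
    if facts_explained == 0:
        raise ValueError("the theory must explain at least one fact")
    return (facts_explained - auxiliary_assumptions) / facts_explained

print(thagard_simplicity(10, 0))   # 1.0 -- maximally simple
print(thagard_simplicity(10, 10))  # 0.0 -- maximally complex
print(thagard_simplicity(10, 4))   # 0.6
```

The hard part, as the objections below note, is not the arithmetic but deciding what to count as a distinct fact or a distinct auxiliary assumption in the first place.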

A significant difficulty facing Thagard’s proposal lies in determining what the auxiliary assumptions of theories actually are and how to count them. It could be argued that the problem of counting auxiliary assumptions threatens to become as difficult as the original problem of measuring simplicity. What a theory must assume about the world for it to explain the evidence is frequently extremely unclear and even harder to quantify. In addition, some auxiliary assumptions are bigger and more onerous than others and it is not clear that they should be given equal weighting, as they are in Thagard’s measure. Another objection is that Thagard’s proposal struggles to make sense of things like ontological parsimony—the idea that theories are simpler because they posit fewer things—since it is not clear that parsimony per se would make any particular difference to the number of auxiliary assumptions required. In response, Thagard has argued that ontological parsimony is actually less important to practicing scientists than has often been thought.

f. Information-Theoretic Measures

Over the last few decades, a number of formal measures of simplicity and complexity have been developed in mathematical information theory. Though many of these measures have been designed for addressing specific practical problems, the central ideas behind them have been claimed to have significance for addressing the philosophical problem of measuring the simplicity of scientific theories.

One of the prominent information-theoretic measures of simplicity in the current literature is Kolmogorov complexity, which is a formal measure of quantitative information content (see Li and Vitányi, 1997). The Kolmogorov complexity K(x) of an object x is the length in bits of the shortest binary program that can output a completely faithful description of x in some universal programming language, such as Java, C++, or LISP. This measure was originally formulated to measure randomness in data strings (such as sequences of numbers), and is based on the insight that non-random data strings can be “compressed” by finding the patterns that exist in them. If there are patterns in a data string, it is possible to provide a completely accurate description of it that is shorter than the string itself, in terms of the number of “bits” of information used in the description, by using the pattern as a mnemonic that eliminates redundant information that need not be encoded in the description. For instance, if the data string is an ordered sequence of 1s and 0s, where every 1 is followed by a 0, and every 0 by a 1, then it can be given a very short description that specifies the pattern, the value of the first data point and the number of data points. Any further information is redundant. Completely random data sets, however, contain no patterns, no redundancy, and hence are not compressible.
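Since K(x) itself is non-computable, an off-the-shelf compressor can serve only as a rough, computable proxy for Kolmogorov complexity, but it does illustrate the basic contrast described above between patterned and random strings:

```python
import zlib
import random

# zlib compression is only a crude stand-in for Kolmogorov complexity (which
# no algorithm can compute), but the qualitative point survives: a strictly
# alternating string compresses dramatically, a random string barely at all.

random.seed(0)

patterned = ("10" * 5000).encode()                                  # 1,0 alternation
rand_bits = "".join(random.choice("01") for _ in range(10000)).encode()

len_patterned = len(zlib.compress(patterned, 9))
len_random = len(zlib.compress(rand_bits, 9))

print(len_patterned < len_random)  # True: the patterned string is far more compressible
```

The alternating string needs little more than "repeat '10' 5000 times" to describe it, whereas the random string resists any description much shorter than itself, which is precisely the sense in which it is incompressible.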

It has been argued that Kolmogorov complexity can be applied as a general measure of the simplicity of scientific theories. Theories can be thought of as specifying the patterns that exist in the data sets they are meant to explain. As a result, we can also think of theories as compressing the data. Accordingly, the more a theory T compresses the data, the lower the value of K for the data using T, and the greater is its simplicity. An important feature of Kolmogorov complexity is that it is measured in a universal programming language and it can be shown that the difference between the shortest code length for x in one universal programming language and the shortest code length for x in another programming language is no more than a constant c, which depends on the languages chosen rather than on x. This has been thought to provide some handle on the problem of language variance: Kolmogorov complexity can be seen as a measure of “objective” or “inherent” information content up to an additive constant. Due to this, some enthusiasts have gone so far as to claim that Kolmogorov complexity solves the problem of defining and measuring simplicity.

A number of objections have been raised against this application of Kolmogorov complexity. First, finding K(x) is a non-computable problem: no algorithm exists to compute it. This is claimed to be a serious practical limitation of the measure. Another objection is that Kolmogorov complexity produces some counter-intuitive results. For instance, theories that make probabilistic rather than deterministic predictions about the data must have maximum Kolmogorov complexity. For example, a theory that says that a sequence of coin flips conforms to the probabilistic law, Pr(Heads) = ½, cannot be said to compress the data, since one cannot use this law to reconstruct the exact sequence of heads and tails, even though it offers an intuitively simple explanation of what we observe.

Other information-theoretic measures of simplicity, such as the Minimum Message Length (MML) and Minimum Description Length (MDL) measures, avoid some of the practical problems facing Kolmogorov complexity. Though there are important differences in the details of these measures (see Wallace and Dowe, 1999), they all adopt the same basic idea that the simplicity of an empirical hypothesis can be measured in terms of the extent to which it provides a compact encoding of the data.
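The two-part-code idea behind MML and MDL can be caricatured in a short sketch. The particular cost formulas below, (k/2)·log2(n) for k parameters plus a Gaussian-style residual term with constants dropped, are a common textbook-style approximation and an assumption of this illustration, not the exact MML or MDL criteria; only differences between the two totals matter:

```python
import math
import random

# Crude two-part description length: parameter cost + data-fit cost.
#   parameter cost = (k/2) * log2(n)
#   fit cost       = (n/2) * log2(RSS / n)   (additive constants dropped)
# An extra parameter is accepted only if the improved fit pays for its cost.

random.seed(1)
xs = [i / 10 for i in range(50)]
ys = [2.0 + 3.0 * x + random.gauss(0, 0.1) for x in xs]  # nearly linear data

def rss(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds))

def description_length(k, residual_ss, n):
    return (k / 2) * math.log2(n) + (n / 2) * math.log2(residual_ss / n)

n = len(xs)

# Model 1: constant y = c (one parameter, fitted by the mean).
c = sum(ys) / n
dl_const = description_length(1, rss(ys, [c] * n), n)

# Model 2: line y = a + b*x (two parameters, least-squares fit).
mx, my = sum(xs) / n, sum(ys) / n
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = my - b * mx
dl_line = description_length(2, rss(ys, [a + b * x for x in xs]), n)

print(dl_line < dl_const)  # True: here the extra parameter pays for itself
```

Had the data been pure noise around a constant, the line's extra parameter cost would not have been repaid by a better fit, and the constant model would have won, which is the sense in which such codes trade simplicity against goodness of fit.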

A general objection to all such measures of simplicity is that scientific theories generally aim to do more than specify patterns in the data. They also aim to explain why these patterns are there and it is in relation to how theories go about explaining the patterns in our observations that theories have often been thought to be simple or complex. Hence, it can be argued that mere data compression cannot, by itself, suffice as an explication of simplicity in relation to scientific theories. A further objection to the data compression approach is that theories can be viewed as compressing data sets in a very large number of different ways, many of which we do not consider appropriate contributions to simplicity. The problem raised by Goodman’s new riddle of induction can be seen as the problem of deciding which regularities to measure: for example, color regularities or grolor regularities? Formal information-theoretic measures do not discriminate between different kinds of pattern finding. Hence, any such measure can only be applied once we specify the sorts of patterns and regularities that should be taken into account.

g. Is Simplicity a Unified Concept?

There is a general consensus in the philosophical literature that the project of articulating a precise general measure of theoretical simplicity faces very significant challenges. Of course, this has not stopped practicing scientists from utilizing notions of simplicity in their work, and particular concepts of simplicity—such as the simplicity of a statistical model, understood in terms of paucity of adjustable parameters or model dimension—are firmly entrenched in several areas of science. Given this, one potential way of responding to the difficulties that philosophers and others have encountered in this area—particularly in light of the apparent multiplicity and scope for conflict between intuitive explications of simplicity—is to raise the question of whether theoretical simplicity is in fact a unified concept at all. Perhaps there is no single notion of simplicity that is (or should be) employed by scientists, but rather a cluster of different, sometimes related, but also sometimes conflicting notions of simplicity that scientists find useful to varying degrees in particular contexts. This might be evidenced by the observation that scientists’ simplicity judgments often involve making trade-offs between different notions of simplicity. Kepler’s preference for an astronomical theory that abandoned perfectly circular motions for the planets, but which could offer a unified explanation of the astronomical observations in terms of three basic laws, over a theory that retained perfect circular motion, but could not offer a similarly unified explanation, seems to be a clear example of this.

As a result of thoughts in this sort of direction, some philosophers have argued that there is actually no single theoretical value here at all, but rather a cluster of them (for example, Bunge, 1961). It is also worth considering the possibility that which member of the cluster is accorded greater weight than the others, and how each of them is understood in practice, may vary greatly across different disciplines and fields of inquiry. Thus, what really matters when it comes to evaluating the comparative “simplicity” of theories might be quite different for biologists than for physicists, for instance, and perhaps what matters to a particle physicist is different from what matters to an astrophysicist. If there is in fact no unified concept of simplicity at work in science, that might also indicate that there is no unitary justification for choosing between rival theories on grounds of simplicity. One important suggestion that this possibility has led to is that the role of simplicity in science cannot be understood from a global perspective, but can only be understood locally. How simplicity ought to be measured and why it matters may have a peculiarly domain-specific explanation.

4. Justifying Preferences for Simpler Theories

Due to the apparent centrality of simplicity considerations to scientific methods and the link between simplicity and numerous other important philosophical issues, the problem of justifying preferences for simpler theories is regarded as a major problem in the philosophy of science. It is also regarded as one of the most intractable. Though an extremely wide variety of justifications have been proposed—as with the debate over how to correctly define and measure simplicity, some important recent contributions have their origins in scientific literature in statistics, information theory, and other cognate fields—all of them have met with significant objections. There is currently no agreement amongst philosophers on what is the most promising path to take. There is also skepticism in some circles about whether an adequate justification is even possible.

Broadly speaking, justificatory proposals can be categorized into three types: 1) accounts that seek to show that simplicity is an indicator of truth (that is, that simpler theories are, in general, more likely to be true, or are somehow better confirmed by the empirical data than their more complex rivals); 2) accounts that do not regard simplicity as a direct indicator of truth, but which seek to highlight some alternative methodological justification for preferring simpler theories; 3) deflationary approaches, which actually reject the idea that there is a general justification for preferring simpler theories per se, but which seek to analyze particular appeals to simplicity in science in terms of other, less problematic, theoretical virtues.

a. Simplicity as an Indicator of Truth

i. Nature is Simple

Historically, the dominant view about why we should prefer simpler theories to more complex ones has been based on a general metaphysical thesis of the simplicity of nature. If nature itself is simple, then the relative simplicity of theories can be regarded as direct evidence for their truth. Such a view was explicitly endorsed by many of the great scientists of the past, including Aristotle, Copernicus, Galileo, Kepler, Newton, Maxwell, and Einstein. Naturally, however, the question arises as to what justifies the thesis that nature is simple. Broadly speaking, there have been two different sorts of argument given for this thesis: i) that a benevolent God must have created a simple and elegant universe; ii) that the past record of success of relatively simple theories entitles us to infer that nature is simple. The theological justification was most common amongst scientists and philosophers during the early modern period. Einstein, on the other hand, invoked a meta-inductive justification, claiming that the history of physics justifies us in believing that nature is the realization of the simplest conceivable mathematical ideas.

Despite the historical popularity and influence of this view, more recent philosophers and scientists have been extremely resistant to the idea that we are justified in believing that nature is simple. For a start, it seems difficult to formulate the thesis that nature is simple so that it is not either obviously false, or too vague to be of any use. There would seem to be many counter-examples to the claim that we live in a simple universe. Consider, for instance, the picture of the atomic nucleus that physicists were working with in the early part of the twentieth century: it was assumed that matter was made only of protons and electrons; there were no such things as neutrons or neutrinos and no weak or strong nuclear forces to be explained, only electromagnetism. Subsequent discoveries have arguably led to a much more complex picture of nature and much more complex theories have had to be developed to account for this. In response, it could be claimed that though nature seems to be complex in some superficial respects, there is in fact a deep underlying simplicity in the fundamental structure of nature. It might also be claimed that the respects in which nature appears to be complex are necessary consequences of its underlying simplicity. But this just serves to highlight the vagueness of the claim that nature is simple—what exactly does this thesis amount to, and what kind of evidence could we have for it?

However the thesis is formulated, it would seem to be an extremely difficult one to adequately defend, whether this be on theological or meta-inductive grounds. A theological justification for the claim that nature is simple holds little attraction for modern philosophers and scientists, who do not want to ground the legitimacy of scientific methods in theology. In any case, many theologians reject the supposed link between God’s benevolence and the simplicity of creation. With respect to a meta-inductive justification, even if it were the case that the history of science demonstrates the better-than-average success of simpler theories, we may still raise significant worries about the extent to which this could give sufficient credence to the claim that nature is simple. First, it assumes that empirical success can be taken to be a reliable indicator of truth (or at least approximate truth), and hence of what nature is really like. Though this is a standard assumption for many scientific realists—the claim being that success would be “miraculous” if the theory concerned was radically false—it is a highly contentious one, since many anti-realists hold that the history of science shows that all theories, even eminently successful theories, typically turn out to be radically false. Even if one does accept a link between success and truth, our successes to date may still not provide a representative sample of nature: maybe we have only looked at the problems that are most amenable to simple solutions and the real underlying complexity of nature has escaped our notice. We can also question the degree to which we can extrapolate any putative connection between simplicity and truth in one area of nature to nature as a whole. Moreover, in so far as simplicity considerations are held to be fundamental to inductive inference quite generally, such an attempted justification risks a charge of circularity.

ii. Meta-Inductive Proposals

There is another way of appealing to past success in order to try to justify a link between simplicity and truth. Instead of trying to justify a completely general claim about the simplicity of nature, this proposal merely suggests that we can infer a correlation between success and very particular simplicity characteristics in particular fields of inquiry—for instance, a particular kind of symmetry in certain areas of theoretical physics. If success can be regarded as an indicator of at least approximate truth, we can then infer that theories that are simpler in the relevant sense are more likely to be true in fields where the correlation with success holds.

Recent examples of this sort of proposal include McAllister (1996) and Kuipers (2002). In an effort to account for the truth-conduciveness of aesthetic considerations in science, including simplicity, Theo Kuipers (2002) claims that scientists tend to become attracted to theories that share particular aesthetic features with successful theories that they have been previously exposed to. In other words, we can explain the particular aesthetic preferences that scientists have in terms similar to a well-documented psychological effect known as the “mere-exposure effect”, which occurs when individuals take a liking to something after repeated exposure to it. If, in a given field of inquiry, theories that have been especially successful exhibit a particular type of simplicity (however this is understood), and such theories have thus been repeatedly presented to scientists working in the field during their training, the mere-exposure effect will then lead these scientists to be attracted to other theories that also exhibit that same type of simplicity. This process can then be used to support an aesthetic induction to a correlation between simplicity in the relevant sense and success. One can then make a case that this type of simplicity can legitimately be taken as an indicator of at least approximate truth.

Even though this sort of meta-inductive proposal does not attempt to show that nature in general is simple, many of the same objections can be raised against it as are raised against the attempt to justify that metaphysical thesis by appeal to the past success of simple theories. Once again, there is the problem of justifying the claim that empirical success is a reliable guide to (approximate) truth. Kuipers’ own arguments for this claim rest on a somewhat idiosyncratic account of truth approximation. In addition, in order to legitimately infer that there is a genuine correlation between simplicity and success, one cannot just look at successful theories; one must look at unsuccessful theories too. Even if all the successful theories in a domain have the relevant simplicity characteristic, it might still be the case that the majority of theories with the characteristic have been (or would have been) highly unsuccessful. Indeed, if one can potentially modify a successful theory in an infinite number of ways while keeping the relevant simplicity characteristic, one might actually be able to guarantee that the majority of possible theories with the characteristic would be unsuccessful theories, thus breaking the correlation between simplicity and success. This could be taken as suggesting that in order to carry any weight, arguments from success also need to offer an explanation for why simplicity contributes to success. Moreover, though the mere-exposure effect is well documented, Kuipers provides no direct empirical evidence that scientists actually acquire their aesthetic preferences via the kind of process that he proposes.

iii. Bayesian Proposals

According to standard varieties of Bayesianism, we should evaluate scientific theories according to their probability conditional upon the evidence (posterior probability). This probability, Pr(T | E), is a function of three quantities:

  • Pr(T | E) = Pr(E | T) Pr(T) / Pr(E)

Pr(E | T) is the probability that the theory, T, confers on the evidence, E; it is referred to as the likelihood of T. Pr(T) is the prior probability of T, and Pr(E) is the probability of E. T is then held to have higher posterior probability than a rival theory, T*, if and only if:

  • Pr(E | T) Pr(T) > Pr(E | T*) Pr(T*)

A standard Bayesian proposal for understanding the role of simplicity in theory choice is that simplicity is one of the key determinants of Pr(T): other things being equal, simpler theories and hypotheses are held to have higher prior probability of being true than more complex ones. Thus, if two rival theories confer equal or near equal probability on the data, but differ in relative simplicity, other things being equal, the simpler theory will tend to have a higher posterior probability. This idea, which Harold Jeffreys called “the simplicity postulate”, has been elaborated in a number of different ways by philosophers, statisticians, and information theorists, utilizing various measures of simplicity (for example, Carnap, 1950; Jeffreys, 1957, 1961; Solomonoff, 1964; Li and Vitányi, 1997).
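The structure of this proposal can be made concrete with a toy calculation. Everything below is invented for the purpose of illustration: the probability values are arbitrary, and `posterior_odds` is a hypothetical helper rather than part of any Bayesian library.

```python
# Toy illustration of the "simplicity postulate": two rival theories fit
# the evidence about equally well, but the simpler one is assigned a
# higher prior, so it comes out ahead on posterior probability.
# All probability values are invented for illustration.

def posterior_odds(pr_e_given_t, pr_t, pr_e_given_t_star, pr_t_star):
    """Ratio Pr(T|E) / Pr(T*|E); the common factor Pr(E) cancels."""
    return (pr_e_given_t * pr_t) / (pr_e_given_t_star * pr_t_star)

# The complex rival T* fits the data marginally better than T...
pr_e_given_t, pr_e_given_t_star = 0.80, 0.82
# ...but the simplicity postulate gives the simpler theory T the higher prior.
pr_t, pr_t_star = 0.10, 0.05

odds = posterior_odds(pr_e_given_t, pr_t, pr_e_given_t_star, pr_t_star)
print(f"posterior odds T : T* = {odds:.2f}")  # greater than 1: T is preferred
```

The slightly better fit of the complex theory is swamped by the prior bias towards simplicity; the two effects trade off exactly as the displayed inequality dictates.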

In response to this proposal, Karl Popper (1959) argued that, in some cases, assigning a simpler theory a higher prior probability actually violates the axioms of probability. For instance, Jeffreys proposed that simplicity be measured by counting adjustable parameters. On this measure, the claim that the planets move in circular orbits is simpler than the claim that the planets move in elliptical orbits, since the equation for an ellipse contains an additional adjustable parameter. However, circles can also be viewed as special cases of ellipses, where the additional parameter is set to zero. Hence, the claim that planets move in circular orbits can also be seen as a special case of the claim that the planets move in elliptical orbits. If that is right, then the former claim cannot be more probable than the latter claim because the truth of the former entails the truth of the latter and probability respects entailment. In reply to Popper, it has been argued that this prior probabilistic bias towards simpler theories should only be seen as applying to comparisons between inconsistent theories where no relation of entailment holds between them—for instance, between the claim that the planets move in circular orbits and the claim that they move in elliptical but non-circular orbits.
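Popper's entailment point can be put in miniature. The eccentricity values and the uniform prior below are arbitrary choices for illustration; the point holds for any probability assignment whatsoever.

```python
# Treat "the orbit is elliptical" as a disjunction of hypotheses indexed
# by an eccentricity parameter, with "the orbit is circular" the special
# case e = 0. Since the circular claim entails the elliptical claim, no
# coherent probability assignment can make the former more probable.
eccentricities = [0.0, 0.1, 0.2, 0.3]        # arbitrary discretization
prior = {e: 0.25 for e in eccentricities}    # any distribution would do

pr_elliptical = sum(prior.values())          # probability of the disjunction
pr_circular = prior[0.0]                     # probability of the special case

assert pr_circular <= pr_elliptical          # entailment is respected
print(pr_circular, pr_elliptical)
```

The reply mentioned above amounts to comparing `prior[0.0]` not with the whole disjunction but with the non-zero settings only, between which no entailment relation holds.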

The main objection to the Bayesian proposal that simplicity is a determinant of prior probability is that the theory of probability seems to offer no resources for explaining why simpler theories should be accorded higher prior probability. Rudolf Carnap (1950) thought that prior probabilities could be assigned a priori to any hypothesis stated in a formal language, on the basis of a logical analysis of the structure of the language and assumptions about the equi-probability of all possible states of affairs. However, Carnap’s approach has generally been recognized to be unworkable. If higher prior probabilities cannot be assigned to simpler theories on the basis of purely logical or mathematical considerations, then it seems that Bayesians must look outside of the Bayesian framework itself to justify the simplicity postulate.

Some Bayesians have taken an alternative route, claiming that a direct mathematical connection can be established between the simplicity of theories and their likelihood—that is, the value of Pr(E | T) (see Rosenkrantz, 1983; Myrvold, 2003; White, 2005). This proposal depends on the assumption that simpler theories have fewer adjustable parameters, and hence are consistent with a narrower range of potential data. Suppose that we collect a set of empirical data, E, that can be explained by two theories that differ with respect to this kind of simplicity: a simple theory, S, and a complex theory, C. S has no adjustable parameters and only ever entails E, while C has an adjustable parameter, θ, which can take a range of values, n. When θ is set to some specific value, i, C entails E, but on other values of θ, C entails different and incompatible observations. It is then argued that S confers a higher probability on E. This is because C allows that lots of other possible observations could have been made instead of E (on different possible settings for θ). Hence, the truth of C would make our recording those particular observations less probable than would the truth of S. Here, the likelihood of C is calculated as the average of the likelihoods of each of the n versions of C, defined by a unique setting of θ. Thus, as the complexity of a theory increases—measured in terms of the number of adjustable parameters it contains—the number of versions of the theory that will give a low probability to E will increase and the overall value of Pr(E | T) will go down.
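A minimal numerical sketch of this likelihood argument, with invented numbers: the simple theory S entails the data E outright, while the complex theory C entails E on only one of its n parameter settings and confers probability 0 on E on the others.

```python
# Averaging the likelihood of the complex theory C over the n possible
# settings of its adjustable parameter θ. Only the setting θ = i entails
# the observed data E; on the other settings C entails incompatible
# observations, so those versions confer probability 0 on E.
n = 10                                     # illustrative number of settings

pr_e_given_s = 1.0                         # S entails E outright
likelihoods_of_c_versions = [1.0] + [0.0] * (n - 1)
pr_e_given_c = sum(likelihoods_of_c_versions) / n   # average over versions

print(pr_e_given_s, pr_e_given_c)          # 1.0 versus 0.1
```

Adding further parameter settings only lowers the average, which is the sense in which increasing complexity drives Pr(E | T) down.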

An objection to this proposal (Kelly, 2004, 2010) is that for us to be able to show that S has a higher posterior probability than C as a result of its having a higher likelihood, it must be assumed that the prior probability of C is not significantly greater than the prior probability of S. This is a substantive assumption to make because of the way that simplicity is defined in this argument. We can view C as coming in a variety of different versions, each of which is picked out by a different value given to θ. If we then assume that S and C have roughly equal prior probability we must, by implication, assume that each version of C has a very low prior probability compared to S, since the prior probability of each version of C would be Pr(C) / n (assuming that the theory does not say that any particular parameter setting is more probable than any of the others). This would effectively build in a very strong prior bias in favour of S over each version of C. Given that each version of C could be considered independently—that is, the complex theory could be given a simpler, more restricted formulation—this would require an additional supporting argument. The objection is thus that the proposal simply begs the question by resting on a prior probabilistic bias towards simpler theories. Another objection is that the proposal suffers from the limitation that it can only be applied to comparisons between theories where the simpler theory can be derived from the more complex one by fixing certain of its parameters. At best, this represents a small fraction of cases in which simplicity has been thought to play a role.

iv. Simplicity as a Fundamental A Priori Principle

In the light of the perceived failure of philosophers to justify the claim that simpler theories are more likely to be true, Richard Swinburne (2001) has argued that this claim has to be regarded as a fundamental a priori principle. Swinburne argues that it is just obvious that the criteria for theory evaluation that scientists use reliably lead them to make correct judgments about which theories are more likely to be true. Since, Swinburne argues, one of these criteria is that simpler theories are, other things being equal, more likely to be true, we just have to accept that simplicity is indeed an indicator of probable truth. However, Swinburne does not think that this connection between simplicity and truth can be established empirically, nor does he think that it can be shown to follow from some more obvious a priori principle. Hence, we have no choice but to regard it as a fundamental a priori principle—a principle that cannot be justified by anything more fundamental.

In response to Swinburne, it can be argued that this is hardly going to convince those scientists and philosophers for whom it is not at all obvious that simpler theories are more likely to be true.

b. Alternative Justifications

i. Falsifiability

Famously, Karl Popper (1959) rejected the idea that theories are ever confirmed by evidence and that we are ever entitled to regard a theory as true, or probably true. Hence, Popper did not think simplicity could be legitimately regarded as an indicator of truth. Rather, he argued that simpler theories are to be valued because they are more falsifiable. Indeed, Popper thought that the simplicity of theories could be measured in terms of their falsifiability, since intuitively simpler theories have greater empirical content, placing more restriction on the ways the world can be, and thus are less able to accommodate any future data that we might discover. According to Popper, scientific progress consists not in the attainment of true theories, but in the elimination of false ones. Thus, the reason we should prefer more falsifiable theories is because such theories will be more quickly eliminated if they are in fact false. Hence, the practice of first considering the simplest theory consistent with the data provides a faster route to scientific progress. Importantly, for Popper, this meant that we should prefer simpler theories because they have a lower probability of being true, since, for any set of data, it is more likely that some complex theory (in Popper’s sense) will be able to accommodate it than a simpler theory.

Popper’s equation of simplicity with falsifiability suffers from some well-known objections and counter-examples, and these pose significant problems for his justificatory proposal (Section 3c). Another significant problem is that taking degree of falsifiability as a criterion for theory choice seems to lead to absurd consequences, since it encourages us to prefer absurdly specific scientific theories to those that have more general content. For instance, the hypothesis, “all emeralds are green until 11pm today when they will turn blue” should be judged as preferable to “all emeralds are green” because it is easier to falsify. It thus seems deeply implausible to say that selecting and testing such hypotheses first provides the fastest route to scientific progress.

ii. Simplicity as an Explanatory Virtue

A number of philosophers have sought to elucidate the rationale for preferring simpler theories to more complex ones in explanatory terms (for example, Friedman, 1974; Sober, 1975; Walsh, 1979; Thagard, 1988; Kitcher, 1989; Baker, 2003). These proposals have typically been made on the back of accounts of scientific explanation that explicate notions of explanatoriness and explanatory power in terms of unification, which is taken to be intimately bound up with notions of simplicity. According to unification accounts of explanation, a theory is explanatory if it shows how different phenomena are related to each other under certain systematizing theoretical principles, and a theory is held to have greater explanatory power than its rivals if it systematizes more phenomena. For Michael Friedman (1974), for instance, explanatory power is a function of the number of independent phenomena that we need to accept as ultimate: the smaller the number of independent phenomena that are regarded as ultimate by the theory, the more explanatory is the theory. Similarly, for Philip Kitcher (1989), explanatory power is increased the smaller the number of patterns of argument, or “problem-solving schemas”, that are needed to deliver the facts about the world that we accept. Thus, on such accounts, explanatory power is seen as a structural relationship between the sparseness of an explanation—the fewness of hypotheses or argument patterns—and the plenitude of facts that are explained. There have been various attempts to explicate notions of simplicity in terms of these sorts of features. A standard type of argument that is then used is that we want our theories not only to be true, but also explanatory. If truth were our only goal, there would be no reason to prefer a genuine scientific theory to a collection of random factual statements that all happen to be true. Hence, explanation is an ultimate, rather than a purely instrumental goal of scientific inquiry. 
Thus, we can justify our preferences for simpler theories once we recognize that there is a fundamental link between simplicity and explanatoriness and that explanation is a key goal of scientific inquiry, alongside truth.

There are some well-known objections to unification theories of explanation, though most of them concern the claim that unification is all there is to explanation—a claim on which the current proposal does not depend. However, even if we accept a unification theory of explanation and accept that explanation is an ultimate goal of scientific inquiry, it can be objected that the choice between a simple theory and a more complex rival is not normally a choice between a theory that is genuinely explanatory, in this sense, and a mere factual report. The complex theory can normally be seen as unifying different phenomena under systematizing principles, at least to some degree. Hence, the justificatory question here is not about why we should prefer theories that explain the data to theories that do not, but why we should prefer theories that have greater explanatory power in the senses just described to theories that are comparatively less explanatory. It is certainly a coherent possibility that the truth may turn out to be relatively disunified and unsystematic. Given this, it seems appropriate to ask why we are justified in choosing theories because they are more unifying. Just saying that explanation is an ultimate goal of scientific inquiry does not seem to be enough.

iii. Predictive Accuracy

In the last few decades, the treatment of simplicity as an explicit part of statistical methodology has become increasingly sophisticated. A consequence of this is that some philosophers of science have started looking to the statistics literature for illumination on how to think about the philosophical problems surrounding simplicity. According to Malcolm Forster and Elliott Sober (Forster and Sober, 1994; Forster, 2001; Sober, 2007), the work of the statistician, Hirotugu Akaike (1973), provides a precise theoretical framework for understanding the justification for the role of simplicity in curve-fitting and model selection.

Standard approaches to curve-fitting effect a trade-off between fit to a sample of data and the simplicity of the kind of mathematical relationship that is posited to hold between the variables—that is, the simplicity of the postulated model for the underlying relationship, typically measured in terms of the number of adjustable parameters it contains. This often means, for instance, that a linear hypothesis that fits a sample of data less well may be chosen over a parabolic hypothesis that fits the data better. According to Forster and Sober, Akaike developed an explanation for why it is rational to favor simpler models, under specific circumstances. The proposal builds on the practical wisdom that when there is a particular amount of error or noise in the data sample, more complex models have a greater propensity to “over-fit” to this spurious data in the sample and thus lead to less accurate predictions of extra-sample (for instance, future) data, particularly when dealing with small sample sizes. (Gauch [2003, 2006] calls this “Ockham’s hill”: to the left of the peak of the hill, increasing the complexity of a model improves its accuracy with respect to extra-sample data; after the peak, increasing complexity actually diminishes predictive accuracy. There is therefore an optimal trade-off at the peak of Ockham’s hill between simplicity and fit to the data sample when it comes to facilitating accurate prediction.) According to Forster and Sober, what Akaike did was prove a theorem, which shows that, given standard statistical assumptions, we can estimate the degree to which constraining model complexity when fitting a curve to a sample of data will lead to more accurate predictions of extra-sample data. Following Forster and Sober’s presentation (1994, pp. 9-10), Akaike’s theorem can be stated as follows:

  • Estimated[A(M)] = (1/N)[log-likelihood(L(M)) – k],

where A(M) is the predictive accuracy of the model, M, with respect to extra-sample data, N is the number of data points in the sample, log-likelihood is a measure of goodness of fit to the sample (the higher the log-likelihood score the closer the fit to the data), L(M) is the best fitting member of M, and k is the number of adjustable parameters that M contains. Akaike’s theorem is claimed to specify an unbiased estimator of predictive accuracy, which means that the distribution of estimates of A is centered around the true value of A (for proofs and further details on the assumptions behind Akaike’s theorem, see Sakamoto and others, 1986). This gives rise to a model selection procedure, Akaike’s Information Criterion (AIC), which says that we should choose the model that has the highest estimated predictive accuracy, given the data at hand. In practice, AIC implies that when the best-fitting parabola fits the data sample better than the best-fitting straight line, but not so much better that this outweighs its greater complexity (k), the straight line should be used for making predictions. Importantly, the penalty imposed on complexity has less influence on model selection the larger the sample of data, meaning that simplicity matters more for predictive accuracy when dealing with smaller samples.
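The trade-off that AIC effects can be sketched in a few lines of code. This is a minimal illustration under the standard Gaussian-error assumptions, not Forster and Sober's own presentation: the data set, noise level, and random seed are all invented, and k is counted simply as the polynomial degree plus one.

```python
# Scoring a linear and a parabolic model by the per-datum estimate of
# predictive accuracy, (log-likelihood - k)/N, assuming Gaussian errors.
# The data are simulated from a genuinely linear process plus noise.
import numpy as np

rng = np.random.default_rng(0)
N = 20
x = np.linspace(0, 10, N)
y = 2.0 * x + 1.0 + rng.normal(0.0, 1.0, N)   # linear truth plus noise

def estimated_accuracy(x, y, degree):
    """Fit a polynomial of the given degree and score it as (logL - k)/N."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    sigma2 = np.mean(residuals ** 2)          # maximum-likelihood variance
    log_likelihood = -0.5 * N * (np.log(2 * np.pi * sigma2) + 1)
    k = degree + 1                            # number of adjustable parameters
    return (log_likelihood - k) / N

print("line:", estimated_accuracy(x, y, 1))
print("parabola:", estimated_accuracy(x, y, 2))
```

The parabola always fits the sample at least as well, but it pays a larger complexity penalty k; with data generated by a linear process, the line will typically receive the higher score, and since the penalty is divided by N, its influence shrinks as the sample grows—just as the text notes.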

Forster and Sober argue that Akaike’s theorem explains why simplicity has a quantifiable positive effect on predictive accuracy by combating the risk of over-fitting to noisy data. Hence, if one is interested in generating accurate predictions—for instance, of future data—one has a clear rationale for preferring simpler models. Forster and Sober are explicit that this proposal is only meant to apply to scientific contexts that can be understood from within a model selection framework, where predictive accuracy is the central goal of inquiry and there is a certain amount of error or noise in the data. Hence, they do not view Akaike’s work as offering a complete solution to the problem of justifying preferences for simpler theories. However, they have argued that a very significant number of scientific inference problems can be understood from an Akaikian perspective.

Several objections have been raised against Forster and Sober’s philosophical use of Akaike’s work. One objection is that the measure of simplicity employed by AIC is not language invariant, since the number of adjustable parameters a model contains depends on how the model is described. However, Forster and Sober argue that though, for practical purposes, the quantity, k, is normally spelt out in terms of number of adjustable parameters, it is in fact more accurately explicated in terms of the notion of the dimension of a family of functions, which is language invariant. Another objection is that AIC is not statistically consistent. Forster and Sober reply that this charge rests on a confusion over what AIC is meant to estimate: for example, erroneously assuming that AIC is meant to be an estimator of the true value of k (the size of the simplest model that contains the true hypothesis), rather than an estimator of the predictive accuracy of a particular model at hand. Another worry is that over-fitting considerations imply that an idealized false model will often make more accurate predictions than a more realistic model, so the justification is merely instrumentalist and cannot warrant the use of simplicity as a criterion for hypothesis acceptance where hypotheses are construed realistically, rather than just as predictive tools. For their part, Forster and Sober are quite happy to accept this instrumentalist construal of the role of simplicity in curve-fitting and model selection: in this context, simplicity is not a guide to the truth, but to predictive accuracy. Finally, there are a variety of objections concerning the nature and validity of the assumptions behind Akaike’s theorem and whether AIC is applicable to some important classes of model selection problems (for discussion, see Kieseppä, 1997; Forster, 1999, 2001; Howson and Urbach, 2006; Dowe and others, 2007; Sober, 2007; Kelly, 2010).

iv. Truth-Finding Efficiency

An important recent proposal about how to justify preferences for simpler theories has come from work in the interdisciplinary field known as formal learning theory (Schulte, 1999; Kelly, 2004, 2007, 2010). It has been proposed that even if we do not know whether the world is simple or complex, inferential rules that are biased towards simple hypotheses can be shown to converge to the truth more efficiently than alternative inferential rules. According to this proposal, an inferential rule is said to converge to the truth efficiently, if, relative to other possible convergent inferential rules, it minimizes the maximum number of U-turns or “retractions” of opinion that might be required of the inquirer while using the rule to guide her decisions on what to believe given the data. Such procedures are said to converge to the truth more directly and in a more stable fashion, since they require fewer changes of mind along the way. The proposal is that even if we do not know whether the truth is simple or complex, scientific inference procedures that are biased towards simplicity can be shown a priori to be optimally efficient in this sense, converging to the truth in the most direct and stable way possible.

To illustrate the basic logic behind this proposal, consider the following example from Oliver Schulte (1999). Suppose that we are investigating the existence of a hypothetical particle, Ω. If Ω does exist, we will be able to detect it with an appropriate measurement device. However, as yet, it has not been detected. What attitude should we take towards the existence of Ω? Let us say that Ockham’s Razor suggests that we deny that Ω exists until it is detected (if ever). Alternatively, we could assert that Ω does exist until a finite number of attempts to detect Ω have proved to be unsuccessful, say ten thousand, in which case we assert that Ω does not exist; or, we could withhold judgment until Ω is either detected, or there have been ten thousand unsuccessful attempts to detect it. Since we are assuming that existent particles do not go undetected forever, abiding by any of these three inferential rules will enable us to converge to the truth in the limit, whether Ω exists or not. However, Schulte argues that Ockham’s Razor provides the most efficient route to the truth. This is because following Ockham’s Razor incurs a maximum of only one retraction of opinion: retracting an assertion of non-existence to an assertion of existence, if Ω is detected. In contrast, the alternative inferential rules both incur a maximum of two retractions, since Ω could go undetected ten thousand times, but then be detected on the ten-thousand-and-first attempt. Hence, truth-finding efficiency requires that one adopt Ockham’s Razor and presume that Ω does not exist until it is detected.
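Schulte's worst-case comparison can be simulated directly. The sketch below is an illustrative reconstruction, not Schulte's own formalism: the cutoff is scaled down from ten thousand to ten so the loop runs quickly, and the withholding rule is omitted for brevity.

```python
# Counting worst-case retractions for two of the rules in Schulte's
# example. An inference rule maps the detection history so far to an
# opinion; a retraction is any change of opinion along the sequence.
CUTOFF = 10   # stands in for the article's ten thousand attempts

def ockham(history):
    """Deny that Ω exists until it is actually detected."""
    return "exists" if any(history) else "does not exist"

def optimist(history):
    """Assert that Ω exists until CUTOFF detection attempts have failed."""
    if any(history):
        return "exists"
    return "does not exist" if len(history) >= CUTOFF else "exists"

def count_retractions(rule, observations):
    opinions = [rule(observations[:i + 1]) for i in range(len(observations))]
    return sum(1 for a, b in zip(opinions, opinions[1:]) if a != b)

# Worst case for the optimist: Ω stays hidden for CUTOFF attempts, then shows up.
worst = [False] * CUTOFF + [True]
print(count_retractions(ockham, worst))    # 1
print(count_retractions(optimist, worst))  # 2
```

On this worst-case sequence the Ockham rule reverses itself once (non-existence to existence), while the optimist reverses itself twice, which is the asymmetry the argument turns on.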

Kevin Kelly has further developed this U-turn argument in considerable detail. Kelly argues that, with suitable refinements, it can be extended to an extremely wide variety of real world scientific inference problems. Importantly, Kelly has argued that, on this proposal, simplicity should not be seen as purely a pragmatic consideration in theory choice. While simplicity cannot be regarded as a direct indicator of truth, we do nonetheless have a reason to think that the practice of favoring simpler theories is a truth-conducive strategy, since it promotes speedy and stable attainment of true beliefs. Hence, simplicity should be regarded as a genuinely epistemic consideration in theory choice.

One worry about the truth-finding efficiency proposal concerns the general applicability of these results to scientific contexts in which simplicity may play a role. The U-turn argument for Ockham’s Razor described above seems to depend on the evidential asymmetry between establishing that Ω exists and establishing that Ω does not exist: a detection of Ω is sufficient to establish the existence of Ω, whereas repeated failures of detection are not sufficient to establish non-existence. The argument may work where detection procedures are relatively clear-cut—for instance where there are relatively unambiguous instrument readings that count as “detections”—but what about entities that are very difficult to detect directly and where mistakes can easily be made about existence as well as non-existence? Similarly, a current stumbling block is that the U-turn argument cannot be used as a justification for the employment of simplicity biases in statistical inference, where the hypotheses under consideration do not have deductive observational consequences. Kelly is, however, optimistic about extending the U-turn argument to statistical inference. Another objection concerns the nature of the justification that is being provided here. What the U-turn argument seems to show is that the strategy of favoring the simplest theory consistent with the data may help one to find the truth with fewer reversals along the way. It does not establish that simpler theories themselves should be regarded as in any way “better” than their more complex rivals. Hence, there are doubts about the extent to which this proposal can actually make sense of standard examples of simplicity preferences at work in the history and current practice of science, where the guiding assumption seems to be that simpler theories are not to be preferred merely for strategic reasons, but because they are better theories.

c. Deflationary Approaches

Various philosophers have sought to defend broadly deflationary accounts of simplicity. Such accounts depart from all of the justificatory accounts discussed so far by rejecting the idea that simplicity should in fact be regarded as a theoretical virtue and criterion for theory choice in its own right. Rather, according to deflationary accounts, when simplicity appears to be a driving factor in theory evaluation, something else is doing the real work.

Richard Boyd (1990), for instance, has argued that scientists’ simplicity judgments are typically best understood as just covert judgments of theoretical plausibility. When a scientist claims that one theory is “simpler” than another, this is often just another way of saying that the theory provides a more plausible account of the data. For Boyd, such covert judgments of theoretical plausibility are driven by the scientist’s background theories. Hence, it is the relevant background theories that do the real work in motivating the preference for the “simpler” theory, not the simplicity of the theory per se. John Norton (2003) has advocated a similar view in the context of his “material theory” of induction, according to which inductive inferences are licensed not by universal inductive rules or inference schemas, but rather by local factual assumptions about the domain of inquiry. Norton argues that the apparent use of simplicity in induction merely reflects material assumptions about the nature of the domain being investigated. For instance, when we try to fit curves to data we choose the variables and functions that we believe to be appropriate to the physical reality we are trying to get at. Hence, it is because of the facts that we believe to prevail in this domain that we prefer a “simple” linear function to a quadratic one, if such a curve fits the data sufficiently well. In a different domain, where we believe that different facts prevail, our decisions about which hypotheses are “simple” or “complex” are likely to be very different.

Elliott Sober (1988, 1994) has defended this sort of deflationary analysis of various appeals to simplicity and parsimony in evolutionary biology. For example, Sober argues that the common claim that group selection hypotheses are “less parsimonious”, and hence to be taken less seriously as explanations for biological adaptations than individual selection hypotheses, rests on substantive assumptions about the comparative rarity of the conditions required for group selection to occur. Hence, the appeal to Ockham’s Razor in this context is just a covert appeal to local background knowledge. Other attempts to offer deflationary analyses of particular appeals to simplicity in science include Plutynski (2005), who focuses on the Fisher-Wright debate in evolutionary biology, and Fitzpatrick (2009), who focuses on appeals to simplicity in debates over the cognitive capacities of non-human primates.

If such deflationary analyses of the putative role of simplicity in particular scientific contexts turn out to be plausible, then problems concerning how to measure simplicity and how to offer a general justification for preferring simpler theories can be avoided, since simplicity per se can be shown to do no substantive work in the relevant inferences. However, many philosophers are skeptical that such deflationary analyses are possible for many of the contexts where simplicity considerations have been thought to play an important role. Kelly (2010), for example, has argued that simplicity typically comes into play when our background knowledge underdetermines theory choice. Sober himself seems to advocate a mixed view: some appeals to simplicity in science are best understood in deflationary terms, others are better understood in terms of Akaikian model selection theory.

5. Conclusion

The putative role of considerations of simplicity in the history and current practice of science gives rise to a number of philosophical problems, including the problem of precisely defining and measuring theoretical simplicity, and the problem of justifying preferences for simpler theories. As this survey of the literature on simplicity in the philosophy of science demonstrates, these problems have turned out to be surprisingly resistant to resolution, and there remains a live debate amongst philosophers of science about how to deal with them. On the other hand, there is no disputing the fact that practicing scientists continue to find it useful to appeal to various notions of simplicity in their work. Thus, in many ways, the debate over simplicity resembles other long-running debates in the philosophy of science, such as that over the justification for induction (which, it turns out, is closely related to the problem of justifying preferences for simpler theories). There is arguably more skepticism within the scientific community about the legitimacy of choosing between rival theories on grounds of simplicity than there is about the legitimacy of inductive inference, the latter being a complete non-issue for practicing scientists. Nevertheless, as is the case with induction, very many scientists continue to employ practices and methods that utilize notions of simplicity to great scientific effect, assuming that appropriate solutions to the philosophical problems these practices give rise to do in fact exist, even though philosophers have so far failed to articulate them. As this survey has also shown, however, statisticians, information and learning theorists, and other scientists have been making increasingly important contributions to the debate over the philosophical underpinnings of these practices.

6. References and Further Reading

  • Ackerman, R. 1961. Inductive simplicity. Philosophy of Science, 28, 162-171.
    • Argues against the claim that simplicity considerations play a significant role in inductive inference. Critiques measures of simplicity proposed by Jeffreys, Kemeny, and Popper.
  • Akaike, H. 1973. Information theory and the extension of the maximum likelihood principle. In B. Petrov and F. Csaki (eds.), Second International Symposium on Information Theory. Budapest: Akademiai Kiado.
    • Laid the foundations for model selection theory. Proves a theorem suggesting that the simplicity of a model is relevant to estimating its future predictive accuracy. Highly technical.
  • Baker, A. 2003. Quantitative parsimony and explanatory power. British Journal for the Philosophy of Science, 54, 245-259.
    • Builds on Nolan (1997), argues that quantitative parsimony is linked with explanatory power.
  • Baker, A. 2007. Occam’s Razor in science: a case study from biogeography. Biology and Philosophy, 22, 193-215.
    • Argues for a “naturalistic” justification of Ockham’s Razor and that preferences for ontological parsimony played a significant role in the late 19th century debate in biogeography between dispersalist and extensionist theories.
  • Barnes, E.C. 2000. Ockham’s razor and the anti-superfluity principle. Erkenntnis, 53, 353-374.
    • Draws a useful distinction between two different interpretations of Ockham’s Razor: the anti-superfluity principle and the anti-quantity principle. Explicates an evidential justification for anti-superfluity principle.
  • Boyd, R. 1990. Observations, explanatory power, and simplicity: towards a non-Humean account. In R. Boyd, P. Gasper and J.D. Trout (eds.), The Philosophy of Science. Cambridge, MA: MIT Press.
    • Argues that appeals to simplicity in theory evaluation are typically best understood as covert judgments of theoretical plausibility.
  • Bunge, M. 1961. The weight of simplicity in the construction and assaying of scientific theories. Philosophy of Science, 28, 162-171.
    • Takes a skeptical view about the importance and justifiability of a simplicity criterion in theory evaluation.
  • Carlson, E. 1966. The Gene: A Critical History. Philadelphia: Saunders.
    • Argues that simplicity considerations played a significant role in several important debates in the history of genetics.
  • Carnap, R. 1950. Logical Foundations of Probability. Chicago: University of Chicago Press.
  • Chater, N. 1999. The search for simplicity: a fundamental cognitive principle. The Quarterly Journal of Experimental Psychology, 52A, 273-302.
    • Argues that simplicity plays a fundamental role in human reasoning, with simplicity to be defined in terms of Kolmogorov complexity.
  • Cohen, I.B. 1985. Revolutions in Science. Cambridge, MA: Harvard University Press.
  • Cohen, I.B. 1999. A guide to Newton’s Principia. In I. Newton, The Principia: Mathematical Principles of Natural Philosophy; A New Translation by I. Bernard Cohen and Anne Whitman. Berkeley: University of California Press.
  • Crick, F. 1988. What Mad Pursuit: a Personal View of Scientific Discovery. New York: Basic Books.
    • Argues that the application of Ockham’s Razor to biology is inadvisable.
  • Dowe, D, Gardner, S., and Oppy, G. 2007. Bayes not bust! Why simplicity is no problem for Bayesians. British Journal for the Philosophy of Science, 58, 709-754.
    • Contra Forster and Sober (1994), argues that Bayesians can make sense of the role of simplicity in curve-fitting.
  • Duhem, P. 1954. The Aim and Structure of Physical Theory. Princeton: Princeton University Press.
  • Einstein, A. 1954. Ideas and Opinions. New York: Crown.
    • Einstein’s views about the role of simplicity in physics.
  • Fitzpatrick, S. 2009. The primate mindreading controversy: a case study in simplicity and methodology in animal psychology. In R. Lurz (ed.), The Philosophy of Animal Minds. New York: Cambridge University Press.
    • Advocates a deflationary analysis of appeals to simplicity in debates over the cognitive capacities of non-human primates.
  • Forster, M. 1995. Bayes and bust: simplicity as a problem for a probabilist’s approach to confirmation. British Journal for the Philosophy of Science, 46, 399-424.
    • Argues that the Bayesian approach to scientific reasoning is inadequate because it cannot make sense of the role of simplicity in theory evaluation.
  • Forster, M. 1999. Model selection in science: the problem of language variance. British Journal for the Philosophy of Science, 50, 83-102.
    • Responds to criticisms of Forster and Sober (1994). Argues that AIC relies on a language invariant measure of simplicity.
  • Forster, M. 2001. The new science of simplicity. In A. Zellner, H. Keuzenkamp and M. McAleer (eds.), Simplicity, Inference and Modelling. Cambridge: Cambridge University Press.
    • Accessible introduction to model selection theory. Describes how different procedures, including AIC, BIC, and MDL, trade-off simplicity and fit to the data.
  • Forster, M. and Sober, E. 1994. How to tell when simpler, more unified, or less ad hoc theories will provide more accurate predictions. British Journal for the Philosophy of Science, 45, 1-35.
    • Explication of AIC statistics and its relevance to the philosophical problem of justifying preferences for simpler theories. Argues against Bayesian approaches to simplicity. Technical in places.
  • Foster, M. and Martin, M. 1966. Probability, Confirmation, and Simplicity: Readings in the Philosophy of Inductive Logic. New York: The Odyssey Press.
    • Anthology of papers discussing the role of simplicity in induction. Contains important papers by Ackermann, Barker, Bunge, Goodman, Kemeny, and Quine.
  • Friedman, M. 1974. Explanation and scientific understanding. Journal of Philosophy, LXXI, 1-19.
    • Defends a unification account of explanation, connects simplicity with explanatoriness.
  • Galilei, G. 1962. Dialogues concerning the Two Chief World Systems. Berkeley: University of California Press.
    • Classic defense of Copernicanism with significant emphasis placed on the greater simplicity and harmony of the Copernican system. Asserts that nature does nothing in vain.
  • Gauch, H. 2003. Scientific Method in Practice. Cambridge: Cambridge University Press.
    • Wide-ranging discussion of the scientific method written by a scientist for scientists. Contains a chapter on the importance of parsimony in science.
  • Gauch, H. 2006. Winning the accuracy game. American Scientist, 94, March-April 2006, 134-141.
    • Useful informal presentation of the concept of Ockham’s hill and its importance to scientific research in a number of fields.
  • Gingerich, O. 1993. The Eye of Heaven: Ptolemy, Copernicus, Kepler. New York: American Institute of Physics.
  • Glymour, C. 1980. Theory and Evidence. Princeton: Princeton University Press.
    • An important critique of Bayesian attempts to make sense of the role of simplicity in science. Defends a “boot-strapping” analysis of the simplicity arguments for Copernicanism and Newton’s argument for universal gravitation.
  • Goodman, N. 1943. On the simplicity of ideas. Journal of Symbolic Logic, 8, 107-1.
  • Goodman, N. 1955. Axiomatic measurement of simplicity. Journal of Philosophy, 52, 709-722.
  • Goodman, N. 1958. The test of simplicity. Science, 128, October 31st 1958, 1064-1069.
    • Reasonably accessible introduction to Goodman’s attempts to formulate a measure of logical simplicity.
  • Goodman, N. 1959. Recent developments in the theory of simplicity. Philosophy and Phenomenological Research, 19, 429-446.
    • Response to criticisms of Goodman (1955).
  • Goodman, N. 1961. Safety, strength, simplicity. Philosophy of Science, 28, 150-151.
    • Argues that simplicity cannot be equated with testability, empirical content, or paucity of assumption.
  • Goodman, N. 1983. Fact, Fiction and Forecast (4th edition). Cambridge, MA: Harvard University Press.
  • Harman, G. 1999. Simplicity as a pragmatic criterion for deciding what hypotheses to take seriously. In G. Harman, Reasoning, Meaning and Mind. Oxford: Oxford University Press.
    • Defends the claim that simplicity is a fundamental component of inductive inference and that this role has a pragmatic justification.
  • Harman, G. and Kulkarni, S. 2007. Reliable Reasoning: Induction and Statistical Learning Theory. Cambridge, MA: MIT Press.
    • Accessible introduction to statistical learning theory and VC dimension.
  • Harper, W. 2002. Newton’s argument for universal gravitation. In I.B. Cohen and G.E. Smith (eds.), The Cambridge Companion to Newton. Cambridge: Cambridge University Press.
  • Hesse, M. 1967. Simplicity. In P. Edwards (ed.), The Encyclopaedia of Philosophy, vol. 7. New York: Macmillan.
    • Focuses on attempts by Jeffreys, Popper, Kemeny, and Goodman to formulate measures of simplicity.
  • Hesse, M. 1974. The Structure of Scientific Inference. London: Macmillan.
    • Defends the view that simplicity is a determinant of prior probability. Useful discussion of the role of simplicity in Einstein’s work.
  • Holton, G. 1974. Thematic Origins of Modern Science: Kepler to Einstein. Cambridge, MA: Harvard University Press.
    • Discusses the role of aesthetic considerations, including simplicity, in the history of science.
  • Hoffman, R., Minkin, V., and Carpenter, B. 1997. Ockham’s Razor and chemistry. Hyle, 3, 3-28.
    • Discussion by three chemists of the benefits and pitfalls of applying Ockham’s Razor in chemical research.
  • Howson, C. and Urbach, P. 2006. Scientific Reasoning: The Bayesian Approach (Third Edition). Chicago: Open Court.
    • Contains a useful survey of Bayesian attempts to make sense of the role of simplicity in theory evaluation. Technical in places.
  • Jeffreys, H. 1957. Scientific Inference (2nd edition). Cambridge: Cambridge University Press.
    • Defends the “simplicity postulate” that simpler theories have higher prior probability.
  • Jeffreys, H. 1961. Theory of Probability. Oxford: Clarendon Press.
    • Outline and defense of the Bayesian approach to scientific inference. Discusses the role of simplicity in the determination of priors and likelihoods.
  • Kelly, K. 2004. Justification as truth-finding efficiency: how Ockham’s Razor works. Minds and Machines, 14, 485-505.
    • Argues that Ockham’s Razor is justified by considerations of truth-finding efficiency. Critiques Bayesian, Akaikian, and other traditional attempts to justify simplicity preferences. Technical in places.
  • Kelly, K. 2007. How simplicity helps you find the truth without pointing at it. In M. Friend, N. Goethe, and V.Harizanov (eds.), Induction, Algorithmic Learning Theory, and Philosophy. Dordrecht: Springer.
    • Refinement and development of the argument found in Kelly (2004) and Schulte (1999). Technical.
  • Kelly, K. 2010. Simplicity, truth and probability. In P. Bandyopadhyay and M. Forster (eds.), Handbook of the Philosophy of Statistics. Dordrecht: Elsevier.
    • Expands and develops the argument found in Kelly (2007). Detailed critique of Bayesian accounts of simplicity. Technical.
  • Kelly, K. and Glymour, C. 2004. Why probability does not capture the logic of scientific justification. In C. Hitchcock (ed.), Contemporary Debates in the Philosophy of Science. Oxford: Blackwell.
    • Argues that Bayesians can’t make sense of Ockham’s Razor.
  • Kemeny, J. 1955. Two measures of complexity. Journal of Philosophy, 52, 722-733.
    • Develops some of Goodman’s ideas about how to measure the logical simplicity of predicates and systems of predicates. Proposes a measure of simplicity similar to Popper’s (1959) falsifiability measure.
  • Kieseppä, I. A. 1997. Akaike Information Criterion, curve-fitting, and the philosophical problem of simplicity. British Journal for the Philosophy of Science, 48, 21-48.
    • Critique of Forster and Sober (1994). Argues that Akaike’s theorem has little relevance to traditional philosophical problems surrounding simplicity. Highly technical.
  • Kitcher, P. 1989. Explanatory unification and the causal structure of the world. In P. Kitcher and W. Salmon, Minnesota Studies in the Philosophy of Science, vol 13: Scientific Explanation, Minneapolis: University of Minnesota Press.
    • Defends a unification theory of explanation. Argues that simplicity contributes to explanatory power.
  • Kuhn, T. 1957. The Copernican Revolution. Cambridge, MA: Harvard University Press.
    • Influential discussion of the role of simplicity in the arguments for Copernicanism.
  • Kuhn, T. 1962. The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
  • Kuipers, T. 2002. Beauty: a road to truth. Synthese, 131, 291-328.
    • Attempts to show how aesthetic considerations might be indicative of truth.
  • Kyburg, H. 1961. A modest proposal concerning simplicity. Philosophical Review, 70, 390-395.
    • Important critique of Goodman (1955). Proposes that simplicity be identified with the number of quantifiers in a theory.
  • Lakatos, I. and Zahar, E. 1978. Why did Copernicus’s research programme supersede Ptolemy’s? In J. Worrall and G. Curie (eds.), The Methodology of Scientific Research Programmes: Philosophical Papers of Imre Lakatos, Volume 1. Cambridge: Cambridge University Press.
    • Argues that simplicity did not really play a significant role in the Copernican Revolution.
  • Lewis, D. 1973. Counterfactuals. Oxford: Basil Blackwell.
    • Argues that quantitative parsimony is less important than qualitative parsimony in scientific and philosophical theorizing.
  • Li, M. and Vitányi, P. 1997. An Introduction to Kolmogorov Complexity and its Applications (2nd edition). New York: Springer.
    • Detailed elaboration of Kolmogorov complexity as a measure of simplicity. Highly technical.
  • Lipton, P. 2004. Inference to the Best Explanation (2nd edition). Oxford: Basil Blackwell.
    • Account of inference to the best explanation as inference to the “loveliest” explanation. Defends the claim that simplicity contributes to explanatory loveliness.
  • Lombrozo, T. 2007. Simplicity and probability in causal explanation. Cognitive Psychology, 55, 232–257.
    • Argues that simplicity is used as a guide to assessing the probability of causal explanations.
  • Lu, H., Yuille, A., Liljeholm, M., Cheng, P. W., and Holyoak, K. J. 2006. Modeling causal learning using Bayesian generic priors on generative and preventive powers. In R. Sun and N. Miyake (eds.), Proceedings of the 28th annual conference of the cognitive science society, 519–524. Mahwah, NJ: Erlbaum.
    • Argues that simplicity plays a significant role in causal learning.
  • MacKay, D. 1992. Bayesian interpolation. Neural Computation, 4, 415-447.
    • First presentation of the concept of Ockham’s Hill.
  • Martens, R. 2009. Harmony and simplicity: aesthetic virtues and the rise of testability. Studies in History and Philosophy of Science, 40, 258-266.
    • Discussion of the Copernican simplicity arguments and recent attempts to reconstruct the justification for them.
  • McAleer, M. 2001. Simplicity: views of some Nobel laureates in economic science. In A. Zellner, H. Keuzenkamp and M. McAleer (eds.), Simplicity, Inference and Modelling. Cambridge: Cambridge University Press.
    • Interesting survey of the views of famous economists on the place of simplicity considerations in their work.
  • McAllister, J. W. 1996. Beauty and Revolution in Science. Ithaca: Cornell University Press.
    • Proposes that scientists’ simplicity preferences are the product of an aesthetic induction.
  • Mill, J.S. 1867. An Examination of Sir William Hamilton’s Philosophy. London: Walter Scott.
  • Myrvold, W. 2003. A Bayesian account of the virtue of unification. Philosophy of Science, 70, 399-423.
  • Newton, I. 1999. The Principia: Mathematical Principles of Natural Philosophy; A New Translation by I. Bernard Cohen and Anne Whitman. Berkeley: University of California Press.
    • Contains Newton’s “rules for the study of natural philosophy”, which includes a version of Ockham’s Razor, defended in terms of the simplicity of nature. These rules play an explicit role in Newton’s argument for universal gravitation.
  • Nolan, D. 1997. Quantitative Parsimony. British Journal for the Philosophy of Science, 48, 329-343.
    • Contra Lewis (1973), argues that quantitative parsimony has been important in the history of science.
  • Norton, J. 2000. ‘Nature is the realization of the simplest conceivable mathematical ideas’: Einstein and the canon of mathematical simplicity. Studies in the History and Philosophy of Modern Physics, 31, 135-170.
    • Discusses the evolution of Einstein’s thinking about the role of mathematical simplicity in physical theorizing.
  • Norton, J. 2003. A material theory of induction. Philosophy of Science, 70, 647-670.
    • Defends a “material” theory of induction. Argues that appeals to simplicity in induction reflect factual assumptions about the domain of inquiry.
  • Oreskes, N., Shrader-Frechette, K., Belitz, K. 1994. Verification, validation, and confirmation of numerical models in the earth sciences. Science, 263, 641-646.
  • Palter, R. 1970. An approach to the history of early astronomy. Studies in History and Philosophy of Science, 1, 93-133.
  • Pais, A. 1982. Subtle Is the Lord: The science and life of Albert Einstein. Oxford: Oxford University Press.
  • Peirce, C.S. 1931. Collected Papers of Charles Sanders Peirce, vol 6. C. Hartshorne, P. Weiss, and A. Burks (eds.). Cambridge, MA: Harvard University Press.
  • Plutynski, A. 2005. Parsimony and the Fisher-Wright debate. Biology and Philosophy, 20, 697-713.
    • Advocates a deflationary analysis of appeals to parsimony in debates between Wrightian and neo-Fisherian models of natural selection.
  • Popper, K. 1959. The Logic of Scientific Discovery. London: Hutchinson.
    • Argues that simplicity = empirical content = falsifiability.
  • Priest, G. 1976. Gruesome simplicity. Philosophy of Science, 43, 432-437.
    • Shows that standard measures of simplicity in curve-fitting are language variant.
  • Raftery, A., Madigan, D., and Hoeting, J. 1997. Bayesian model averaging for linear regression models. Journal of the American Statistical Association, 92, 179-191.
  • Reichenbach, H. 1949. On the justification of induction. In H. Feigl and W. Sellars (eds.), Readings in Philosophical Analysis. New York: Appleton-Century-Crofts.
  • Rosencrantz, R. 1983. Why Glymour is a Bayesian. In J. Earman (ed.), Testing Scientific Theories. Minneapolis: University of Minnesota Press.
    • Responds to Glymour (1980). Argues that simpler theories have higher likelihoods, using Copernican vs. Ptolemaic astronomy as an example.
  • Rothwell, G. 2006. Notes for the occasional major case manager. FBI Law Enforcement Bulletin, 75, 20-24.
    • Emphasizes the importance of Ockham’s Razor in criminal investigation.
  • Sakamoto, Y., Ishiguro, M., and Kitagawa, G. 1986. Akaike Information Criterion Statistics. New York: Springer.
  • Schaffner, K. 1974. Einstein versus Lorentz: research programmes and the logic of comparative theory evaluation. British Journal for the Philosophy of Science, 25, 45-78.
    • Argues that simplicity played a significant role in the development and early acceptance of special relativity.
  • Schulte, O. 1999. Means-end epistemology. British Journal for the Philosophy of Science, 50, 1-31.
    • First statement of the claim that Ockham’s Razor can be justified in terms of truth-finding efficiency.
  • Simon, H. 1962. The architecture of complexity. Proceedings of the American Philosophical Society, 106, 467-482.
    • Important discussion by a Nobel laureate of features common to complex systems in nature.
  • Sober, E. 1975. Simplicity. Oxford: Oxford University Press.
    • Argues that simplicity can be defined in terms of question-relative informativeness. Technical in places.
  • Sober, E. 1981. The principle of parsimony. British Journal for the Philosophy of Science, 32, 145-156.
    • Distinguishes between “agnostic” and “atheistic” versions of Ockham’s Razor. Argues that the atheistic razor has an inductive justification.
  • Sober, E. 1988. Reconstructing the Past: Parsimony, Evolution and Inference. Cambridge, MA: MIT Press.
    • Defends a deflationary account of simplicity in the context of the use of parsimony methods in evolutionary biology.
  • Sober, E. 1994. Let’s razor Ockham’s Razor. In E. Sober, From a Biological Point of View, Cambridge: Cambridge University Press.
    • Argues that the use of Ockham’s Razor is grounded in local background assumptions.
  • Sober, E. 2001a. What is the problem of simplicity? In H. Keuzenkamp, M. McAleer, and A. Zellner (eds.), Simplicity, Inference and Modelling. Cambridge: Cambridge University Press.
  • Sober, E. 2001b. Simplicity. In W.H. Newton-Smith (ed.), A Companion to the Philosophy of Science, Oxford: Blackwell.
  • Sober, E. 2007. Evidence and Evolution. New York: Cambridge University Press.
  • Solomonoff, R.J. 1964. A formal theory of inductive inference, part 1 and part 2. Information and Control, 7, 1-22, 224-254.
  • Suppes, P. 1956. Nelson Goodman on the concept of logical simplicity. Philosophy of Science, 23, 153-159.
  • Swinburne, R. 2001. Epistemic Justification. Oxford: Oxford University Press.
    • Argues that the principle that simpler theories are more probably true is a fundamental a priori principle.
  • Thagard, P. 1988. Computational Philosophy of Science. Cambridge, MA: MIT Press.
    • Simplicity is a determinant of the goodness of an explanation and can be measured in terms of the paucity of auxiliary assumptions relative to the number of facts explained.
  • Thorburn, W. 1918. The myth of Occam’s Razor. Mind, 27, 345-353.
    • Argues that William of Ockham would not have advocated many of the principles that have been attributed to him.
  • van Fraassen, B. 1989. Laws and Symmetry. Oxford: Oxford University Press.
  • Wallace, C. S. and Dowe, D. L. 1999. Minimum Message Length and Kolmogorov Complexity. Computer Journal, 42(4), 270-283.
  • Walsh, D. 1979. Occam’s Razor: A Principle of Intellectual Elegance. American Philosophical Quarterly, 16, 241-244.
  • Weinberg, S. 1993. Dreams of a Final Theory. New York: Vintage.
    • Argues that physicists demand simplicity in physical principles before they can be taken seriously.
  • White, R. 2005. Why favour simplicity? Analysis, 65, 205-210.
    • Attempts to justify preferences for simpler theories in virtue of such theories having higher likelihoods.
  • Zellner, A, Keuzenkamp, H., and McAleer, M. 2001. Simplicity, Inference and Modelling. Cambridge: Cambridge University Press.
    • Collection of papers by statisticians, philosophers, and economists on the role of simplicity in scientific inference and modelling.

Author Information

Simon Fitzpatrick
Email: sfitzpatrick@jcu.edu
John Carroll University
U. S. A.

Neo-Kantianism

By its broadest definition, the term ‘Neo-Kantianism’ names any thinker after Kant who both engages substantively with the basic ramifications of his transcendental idealism and casts their own project at least roughly within his terminological framework. In this sense, thinkers as diverse as Schopenhauer, Mach, Husserl, Foucault, Strawson, Kuhn, Sellars, Nancy, Korsgaard, and Friedman could loosely be considered Neo-Kantian. More specifically, ‘Neo-Kantianism’ refers to two multifaceted and internally differentiated trends of thinking in the late Nineteenth and early Twentieth Centuries: the Marburg School and what is usually called either the Baden School or the Southwest School. The most prominent representatives of the former movement are Hermann Cohen, Paul Natorp, and Ernst Cassirer. Among the latter movement are Wilhelm Windelband and Heinrich Rickert. Several other noteworthy thinkers are associated with the movement as well.

Neo-Kantianism was the dominant philosophical movement in German universities from the 1870s until the First World War. Its popularity declined rapidly thereafter, even though its influence can be found on both sides of the Continental/Analytic divide throughout the twentieth century. Sometimes unfairly cast as narrowly epistemological, Neo-Kantianism covered a broad range of themes, from logic to the philosophy of history, ethics, aesthetics, psychology, religion, and culture. More recently, there has been a relatively small but philosophically serious effort to reinvigorate historical study and programmatic advancement of this often neglected philosophy.

Table of Contents

  1. Proto Neo-Kantians
  2. Marburg
  3. Baden
  4. Associated Members
  5. Legacy
  6. References and Further Reading
    1. Principal Works by Neo-Kantians and Associated Members
    2. Secondary Literature

1. Proto Neo-Kantians

During the first half of the Nineteenth-Century, Kant had become something of a relic. This is not to say that major thinkers were not strongly influenced by Kantian philosophy. Indeed there are clear traces in the literature of the Weimar Classicists, in the historiography of Bartold Georg Niebuhr (1776-1831) and Leopold von Ranke (1795-1886), in Wilhelm von Humboldt’s (1767-1835) philosophy of language, and in Johannes Peter Müller’s (1801-1858) physiology. Figures like Schleiermacher (1768-1834), Immanuel Hermann Fichte (1796-1879), Friedrich Eduard Beneke (1798-1854), Christian Hermann Weiße (1801-1866), the Fries-influenced Jürgen Bona Meyer (1829-1897), the Frenchman Charles Renouvier (1815-1903), the evangelical theologian Albrecht Ritschl (1822-1889), and the great historian of philosophy Friedrich Ueberweg (1826-1871) made calls to heed Kant’s warning about transgressing the bounds of possible experience. However, there was neither a systematic nor a programmatic school of Kantian thought in Germany for more than sixty years after Kant’s death in 1804.

The first published use of the term ‘Neo-Kantianer’ appeared in 1862, in a polemical review of Eduard Zeller by the Hegelian Karl Ludwig Michelet. But it was Otto Liebmann’s (1840-1912) Kant und die Epigonen (1865) that most indelibly heralded the rise of a new movement. Here Johann Gottlieb Fichte (1762-1814), Hegel (1770-1831), and Schelling (1775-1854), whose idealist followers held sway in German philosophy departments during the early decades of the Nineteenth Century, are chided for taking over only Kant’s system-building, and for doing so only in a superficial way. They sought to create the world from scratch, as it were, by finding new, more fundamental first principles upon which to create a stronghold of interlocking propositions, one following necessarily from its predecessor. Insofar as those principles were generated from reflection rather than experience, however, the ‘descendants’ would effectively embrace Kant’s idealism at the expense of his empirical realism. The steep decline of Hegelianism a generation later opened a vacuum which was to be filled by the counter-movement of scientific materialism, represented by figures like Karl Vogt (1817-1895), Heinrich Czolbe (1819-1873), and Ludwig Büchner (1824-1899). By reducing speculative philosophy to a system of naturalistic observation consistent with their realism, the materialists utilized a commonsense terminology that reopened philosophical inquiry for those uninitiated in idealist dialectics. Despite their successes in the realm of the natural sciences, the materialists were accused of avoiding serious philosophical problems rather than solving them. This was especially true of matters of consciousness and experience, which the materialists were inclined to treat unproblematically as ‘given’. Against the failings of both the idealists and the materialists, Liebmann could only repeatedly call, “Zurück zu Kant!”

The 1860s were a sort of watershed for Neo-Kantianism, as a series of works emerged which sought to move past the idealism-materialism debate by returning to the fundamentals of the Kantian Transcendental Deduction. Eduard Zeller’s (1814-1908) Ueber Bedeutung und Aufgabe der Erkenntnistheorie (1862) issued a call, similar to Liebmann’s, to return to Kant, maintaining a transcendental realism in the spirit of a general epistemological critique of speculative philosophy. Kuno Fischer’s (1824-1907) Kants Leben und die Grundlagen seiner Lehre (1860), and the second volume of his Geschichte der neuern Philosophie (1860) echoed the same call in what remain important commentaries on Kant’s philosophy. Fischer was at first not so concerned to advance a new philosophical system as to correctly understand the more intricate nuances of the true master. These works gained popular influence in part because of their major literary improvement upon the commentaries of Karl Leonhard Reinhold (1757-1823), but in part also because of the controversy surrounding Fischer’s interpretation of Kant’s argument that space is a purely subjective facet of experience. Worried that the ideality of space led inexorably to skepticism, Friedrich Adolf Trendelenburg (1802-1872) sought to retain the possibility that space was also empirically real, applicable to the things-in-themselves at least in principle. Fischer returned fire by claiming that treating space as synthetic a posteriori would effectively annul the necessity of Newtonian science, something Fischer emphasized was at the very heart of the Kantian project. Their spirited quarrel, which drew in a wide circle of academics over a twenty-year span, culminated in Trendelenburg’s highly personal Kuno Fischer und sein Kant (1869) and Fischer’s retaliatory Anti-Trendelenburg (1870). Although such personal diatribes attract popular attention, the genuine philosophical foundations of Neo-Kantianism were laid earlier.

Hermann Ludwig Ferdinand von Helmholtz (1821-94) outstripped even the scientific credentials of the materialists, combining his experimental research with a genuine philosophical sophistication and historical sensitivity. His advances in physiology, ophthalmology, audiology, electro- and thermo-dynamics duly earned him an honored place among the great German scientists. His Über die Erhaltung der Kraft (1847) ranks only behind The Origin of Species as the most influential scientific treatise of the Nineteenth-Century, even though its principal claim may have been an unattributed adoption of the earlier theories of Julius Robert von Mayer (1814-1878) and James Joule (1818-1889). His major philosophical contribution was an attempt to ground Kant’s theoretical division between phenomena and noumena within empirically verifiable sense physiology. In place of the materialists’ faith in sense perception as a copy of reality, and in contrast to Kant’s general ignorance of the neurological conditions of experience, Helmholtz noted that when we see or hear something outside us there is a complicated process of neural stimulation. Experience is neither a direct projection of the perceived object onto our sense organs nor merely a conjunction of concept and sensuous intuition, but an unconscious process of symbolic inferences by which neural stimulations are made intelligible to the human mind.

The physical processes of the brain are a safer starting point, a scientifically-verifiable ground on which to explain the a priori necessity of experience, than Kant’s supra-naturalistic deduction of conceptual architectonics. Yet two key Kantian consequences are only strengthened thereby. First, experience is revealed to be nothing immediate, but a demonstrably discursive process wherein the material affection of the senses is transformed by subjective factors. Second, any inferences that can possibly be drawn about the world outside the subject must reckon with this subjective side, thereby reasserting the privileged position of epistemology above ontology. Helmholtz and the materialists both thought that Newtonian science was the best explanation of the world; but Helmholtz realized that science must take account of what Kant had claimed of it: science is the circumscription of what can be demonstrated within the limits of possible experience rather than an articulation about objects in-themselves. Empirical physiology would more precisely circumscribe those limits than purely conceptual transcendental philosophy.

Friedrich Albert Lange (1828-1875) was, at least in the Nineteenth-Century, more widely recognized as a theorist of pedagogy and advocate of Marxism in the Vereinstag deutscher Arbeitervereine than as a forerunner to Neo-Kantianism. But his influence on the Marburg school, though brief, was incisive. He took his professorship at Marburg in 1872, one year before Cohen completed his Habilitationsschrift there. Lange worked with Cohen for only three years before his untimely death in 1875.

Much of Lange’s philosophy was conceived before his time in Marburg. As Privatdozent at Bonn, Lange attended Helmholtz’s lectures on the physiology of the senses, and later came to agree that the best way to move philosophy along was by combining the general Kantian insights with a more firmly founded neuro-physiology. In his masterpiece, Geschichte des Materialismus und Kritik seiner Bedeutung in der Gegenwart (1866), Lange argues that materialism is at once the best explanation of phenomena and yet quite naïve in its presumptive inference from experience to the world outside us. The argument is again taken from the progress of the physiological sciences, with Lange providing a number of experiments that reveal experience to be an aggregate construction of neural processes.

Where he progresses beyond Helmholtz is in his recognition that the physiological processes themselves must, like every other object of experience, be understood as a product of a subject’s particular constitution. Even while denying the given-ness of sense data, Helmholtz had too often presumed to understand the processes of sensation on the basis of a non-problematic empirical realism. That is, even while Helmholtz viewed knowledge of empirical objects as an aggregate of sense physiology, he never sufficiently reflected on the conditions for the experience of that very sense physiology. So even while visual experience is regarded as derivative from the physiology of the eyes, optic nerves, and brain, how each of these functions is considered unproblematically given to empirical experience. For Lange, we cannot so confidently infer that our experience of physiological processes corresponds to what is really the case outside our experience of them—that the eyes or ears actually do work as we observe them to—since the argument by which that conclusion is reached is itself physiological. Neither the senses, nor the brain, nor the empirically observable neural processes between them permit the inference that any of these is the causal ground of our experience of them.

With such skepticism, Lange would naturally dispense, too, with attempts to ground ethical norms in either theological or rationalistic frameworks. Our normative prescriptions, however seemingly unshakeable by pure practical reason, are themselves operations of a brain that developed contingently over great spans of evolutionary history. That brain itself is only another experienced representation, nothing which can be considered an immutable and necessary basis from which universal norms could be derived. Lange thought this opened a space for creative narratives and even myths, able to inspire rather than regulate the ethical side of human behavior. Through their mutual philology teacher at Bonn, Friedrich Ritschl, Lange was also a decisive influence on the epistemology and moral-psychology of Friedrich Nietzsche.

The progenitors of Neo-Kantianism, represented principally by Liebmann, Fischer, Trendelenburg, Helmholtz, and Lange, evince a preference for Kant’s theoretical rather than ethical or aesthetic writings. While this tendency would be displaced by both the Marburg school’s social concerns and the Baden school’s concentration on the logic of values, the proto-Neo-Kantians exerted a definite influence on later figures like Hans Vaihinger (1852-1933), the so-called empirio-positivists like Richard Avenarius and Ernst Mach, and the founder of the Vienna Circle, Moritz Schlick (1882-1936).

2. Marburg

Hermann Cohen (1842-1918) was Lange’s friend and successor, and is usually considered the proper founder of Neo-Kantianism at Marburg. The son of a rabbi in Coswig, he was given a diverse schooling by the historian of Judaism Zacharias Frankel (1801-1875) and the philologist Jacob Bernays (1824-1881). Moving to Berlin, he studied philosophy under Trendelenburg, philology under August Boeckh (1785-1867), culture and linguistics with Heymann Steinthal (1823-1899), and physiology with Emil Du Bois-Reymond (1818-1896). One of his earliest papers, “Zur Controverse zwischen Trendelenburg und Kuno Fischer” (1871), was a sort of coming-out in academic society. Against Fischer, Cohen held that the attempt above all to understand the letter of Kant perfectly, even the problems that persisted in his work, was tantamount to historicizing what ought to be a living engagement with serious philosophical problems. Although roughly on the side of his teacher Trendelenburg, Cohen stood mostly on his own ground in denying that objectivity required any appeal to extra-mental objects. Granted a professorship at the University of Marburg in 1876, Cohen came to combine Kant-interpretation with Lange’s instinct to develop Kant’s thinking in light of contemporary developments: a “Verbindung der systematischen und historischen Aufgabe.” It was Cohen who published Lange’s Logische Studien (1877) posthumously and produced several new editions of his Geschichte des Materialismus. More interested in logic than science, however, Cohen took Lange’s initiatives in a decidedly epistemological direction.

Cohen’s early work consists mainly in critical engagements with Kant: Kants Theorie der Erfahrung (1871), Kants Begründung der Ethik (1877), and Kants Begründung der Aesthetik (1889). Where he progresses beyond Kant is most plainly in his attempt to overcome the Kantian dualism between intuition and discursive thinking. Cohen argues that formal a priori laws of the mind not only affect how we think about external objects, but actually constitute those objects for us. “Thinking produces that which is held to be” (Logik der reinen Erkenntnis [Berlin 1902], 67). That is, the laws of the mind not only provide the form but the content of experience, leaving Cohen with a more idealistic picture of experience than Kant’s empirical realism would have warranted. These laws lie beyond, as it were, both Kant’s conceptual categories as well as Helmholtz’s and Lange’s physiological processes, the latter of which Cohen considered unwarrantedly naturalistic. The transcendental conditions of experience lie in the most fundamental rules of mathematical thinking, such that metaphysics, properly understood, is the study of the laws that make possible mathematical, and by derivation, scientific thinking. While Cohen did not deny the importance of the categories of the understanding or of the necessity of sensuous processes within experience, his own advancement beyond these entailed that an object was experienced and only could be experienced in terms of the formal rules of mathematics. The nature of philosophical investigation becomes, in Cohen’s hands, neither a physiological investigation of the brain or senses as persistent ‘things’ nor a transcendental deduction of the concepts of the understanding and the necessary faculties of the mind, but an exposition of the a priori rules that alone make possible any and every judgment. Thought itself becomes the measure of all possible experience.

The view had a radical implication for the notion of a Kantian self. Kant never much doubted that something persistent was the fundament on which judgment was constructed. He denied that this thing was understandable in the way either materialism or commonsense presumed, of course, but posited a transcendental unity of apperception as, at least, the logically-necessary ground for experience. Cohen replaces the ‘unity’ of a posited self, and indeed any ‘faculties’ of the mind, with a variety of ‘rules’, ‘methods’, and ‘procedures’. What we are is not a ‘thing’, but a series of logical acts. The importance of his view can be illustrated in terms of Cohen’s theoretical mathematics, especially his notions of continuity and the infinitesimally small. Neither of these is an object, obviously, in the views of materialists or empiricists. And even idealists were at pains to decipher how either could be represented. By treating both as rules for thinking rather than persistent things, Cohen was able to show their respective necessities in terms of what sorts of everyday experiences would be made impossible without them. Continuity becomes an indispensable rule for thinking about any objects in time, while the infinitesimally smallest thing becomes an indispensable rule for thinking about the composition of objects in space. Though a substantial departure from the Kantian ascription of faculties, Cohen never prided himself so much on ascertaining and propagating the conclusions of Kant as on applying Kant’s transcendental method to its most consistent conclusions.

Like his Marburg colleagues, Cohen is sometimes unfairly cast as an aloof logical hair-splitter. Quite the contrary, he was deeply engaged in ethics as well as in the social and cultural debates of his time. His approach to them is also based on a generally Kantian methodology, which he sometimes dubbed a ‘social idealism’. Dismissing the practical and applied aspects of Kant’s ethics as needlessly individualistic psychology, he stressed the importance of Kant’s project of grounding ethical laws in practical reason, therein presenting for the first time a transcendentally necessary ought. This necessity holds for humanity generally, insofar as humanity is considered generally and not in terms of the privileges or disadvantages of particular individuals. Virtues like truthfulness, honor, and justice, which relate to the concept of humanity, thus take priority over sympathy or empathy, which operate on particular individuals and their circumstances. Like both experience and ethical life, civil laws could only be justified insofar as they were necessary, in terms of their intersubjective determinability. Because class-strata effectively inculcate authoritarian relationships of power, a democratic, or better, a socialist society that effaced autocratically decreed legal dicta would better fulfill Kant’s exhortation to treat people as ends rather than means, as both legislators and legislated at the same time. However, Cohen thought that then-contemporary socialist ideals put too much stock in the economic aspects of Marx’s theory, and in their place tried to instill a greater concentration on the spiritual and cultural side of human social life. Cohen’s grounding of socialism in a Kantian rather than Marxist framework thus circumvents some of its statist and materialist overtones, as Lange had also tried to do in his 1865 Die Arbeiterfrage.

Hermann Cohen also remains important for his contributions to Jewish thought. As his patriotism toward Germany waned, he became an increasingly unabashed stalwart of Judaism in his later writing, and indeed is still considered by many to be the greatest Jewish thinker of his century. After his retirement from Marburg in 1912, he taught at the Berlin Lehranstalt für die Wissenschaft des Judentums. His posthumous Die Religion der Vernunft aus den Quellen des Judentums (1919) maintains that the originally Jewish method of religious thinking, its monotheism, reveals it as a “religion of reason,” a systematic and methodological attempt to think through the mysteries of the natural world and to construct a universal system of morality ordered under a single universal divine figure. As such, religion is not simply an ornament to society or a mere expression of feelings, but an entirely intrinsic aspect of human culture. The grand summation of these various strands of his thought was to have been a unified theory of culture, but it reached only the planning stage before his death in 1918. Even had he completed it, Cohen’s fierce advocacy of socialism and fiercer defense of Judaism, especially in Germany, would have made an already adverse academic life increasingly difficult.

Cohen’s student Paul Natorp (1854-1924) was a trained philologist, a renowned interpreter of Plato and of Descartes, a composer, a mentor to Pasternak, Barth, and Cassirer, and an influence on the thought of Husserl, Heidegger, and Gadamer. Arriving at Marburg in 1881, much of Natorp’s career was concerned with widening the sphere of influence of Cohen’s interpretation of Kant and with tracing the historical roots of what he understood to be the essence of critical philosophy. For Natorp, too, the necessity of Neo-Kantian philosophizing lay in overcoming the speculations of the idealists, in rejoining philosophy with natural science by limiting discourse to what lies within the bounds of possible experience, and in overcoming the Kantian dualism of intuition and discursive thinking.

However close their philosophies, Natorp stressed the experiential side of thinking more indelibly than did Cohen, whose concentration fell on the logical features of thought. Natorp saw it as a positive advance on Kant to articulate the formal rules of scientific inquiry not as a set of axioms, but as an exposition of the rules—the methods for thinking scientifically—to the extreme that the importance of the conclusions reached by those rules becomes subsidiary to the rules themselves. Knowledge is an ‘Aufgabe’ or task, guided by logic, to make the undetermined increasingly more determined. Accordingly, the Kantian ‘thing-itself’ ceases to be an object that lies outside possible experience and becomes instead a sort of regulative ideal that of itself spurs the understanding to work towards its fulfillment. That endless call to new thinking along specifically ordered lines is just what Natorp means by scientific inquiry—a normative requirement to think further rather than a set of achieved conclusions. Thus, where Kant began with a transcendental logic in order to ground math and the natural sciences, Natorp began with the modes of thinking found already in the experimental processes of good science and the deductions of mathematics as evidence of what conditions are necessarily at play in thinking generally. The necessities of math and science do not rest, as they arguably do for Kant, upon the psychological idiosyncrasies of the rational mind, but are self-sufficient examples of what constitutes objective thinking. Thus philosophy itself, as Natorp conceived it, does not begin with the psychological functions of a rational subject and work towards its products. It begins with a critical observation of those objectively real formations—mathematical and scientific thinking—to ground how the mind itself must have worked in order to produce them.

One of the consequences of Natorp’s concentration on the continual process of methodological thinking was his acknowledgement that a proper exposition of its laws required an historical basis. Science, as the set of formal rules for thinking, begins to inquire by reflecting on why a thing is, not just that it is. Whenever that critical reflection on the methods of scientific thinking occurs, there ensues a sort of historical rebirth that generates a new cyclical age of scientific inquiry. This reflection, Natorp thought, involves rethinking how one considers that object: a genetic rather than static reflection on the changing conditions for the possibility of thinking along the lines of the continuing progress of science.

Plato’s notion of ideal forms marks the clearest occasion of a paradigm shift, to borrow Kuhn’s designation for a similar notion. Under Natorp’s interpretation, Plato’s forms are construed not as real subsistent entities that lie beyond common human experience, but as regulative hypotheses intended to guide thinking along systematic lines. No transcendent things per se, the forms are transcendental principles about the possibility of the human experience of objects. In this sense, Natorp presented Plato as a sort of Marburg Neo-Kantian avant la lettre, a view which garnered little popularity.

Like Cohen, Natorp also had significant cultural interests, and thought that the proper engagement with Kant’s thought had the potential to lead to a comprehensive philosophy of culture. Not as interested as Cohen in the philosophy of religion, Natorp espoused, above all in his pedagogical writings, a social-democratic, anti-dogmatic ideal of education, the goal of which was not an orthodox body of knowledge but an attunement to the lines along which we think so as to reach knowledge. Education ought to be an awareness of the ‘Aufgabe’ of further determining the yet undetermined. Accordingly, the presentation of information in a lecture was thought to be intrinsically stilting, in comparison to the guided process of a quasi-Socratic method of question and answer. An educator of considerable skill, he counted among his students Karl Vorländer (1860-1928), Nicolai Hartmann (1882-1950), José Ortega y Gasset (1883-1955), and Boris Pasternak (1890-1960).

Ernst Cassirer (1874-1945) is usually considered the last principal figure of the Marburg school of Neo-Kantianism. Entering Marburg to study with Cohen himself in 1896, though he would never teach there, Cassirer adopted both the school’s historicist leanings and its emphasis on transcendental argumentation. Indeed, he stands as one of the last great comprehensive thinkers of the West, equal parts epistemologist, logician, philosopher of science, cultural theorist, and historian of thought. His final work, written after he emigrated to America in the wake of Nazi Germany, wove Cohen’s defense of Judaism together with his own observations of the fascist employment of mythic symbolism to form a comprehensive critique, The Myth of the State (published posthumously in 1946). His wide interests were not accidental, but a direct consequence of his lifelong attempt to show the logical and creative aspects of human life (the Naturwissenschaften and Geisteswissenschaften) as integrally entwined within the characteristically human mode of mentality: the symbolic form.

Cassirer’s earlier historical works adopt the basic view of history promulgated by Natorp. The development of the history of ideas takes the form of naïve advances interrupted by a series of fundamental reconsiderations of the epistemological methods that gave rise to those advances, through Plato, Galileo, Spinoza, Leibniz, and Kant. For Cassirer, as for Natorp, the progress of the history of ideas is the march of the problems and solutions constructed by the naturally progressive structure of the human mind. Despite this seemingly teleological framework, Cassirer’s histories of the Enlightenment, of Modernity, of Goethe, Rousseau, Descartes, and of epistemology are generally reliable, lucidly written expositions that aim to contextualize an author’s own thought rather than squeeze it into any particular historiographical framework. A sort of inverse of Hegel, for Cassirer the history of ideas is not reducible to an a priori necessary structure brought about by the nature of reason, but reliable evidence upon which to examine the progression of what problems arise for minds over time.

Cassirer’s historicism also led him to rethink Cohen’s notion of the subject. As Cohen thought the self was just a logical placeholder, the subject-term of the various rules of thinking, so Cassirer thought the self was a function that united the various symbolic capacities of the human mind. Yet, that set is not static. What the history of mathematics shows is not a timeless set of a priori rules that can in principle be deciphered within a philosophical logic, but a slow yet fluid shift over time. This is possible, Cassirer argued, because mathematical rules are tied neither to any experience of objects nor to any timeless notion of a context-less subject. Ways of thinking shift over time in response to wider environmental factors, and the mathematical and logical forms shift with them. Einstein’s relativity theory, of which Cassirer was an important promulgator, underscores Cassirer’s insights especially in its adoption of non-Euclidean geometry. Put into the language of Cassirer’s mature thought, the logical model on which Einsteinian mathematics was based is itself an instantiation not of the single correct outlook on the world, but of a powerful new shift in the symbolic form of science. What Einstein accomplished was not just another theory in a world full of theories, but the remarkably concise formula for a modified way of thinking about the physical world.

The complete expression of Cassirer’s own philosophy is his three-volume Philosophie der symbolischen Formen (1923-9). In keeping with a general Kantianism, Cassirer argues that the world is not given as such in human experience, but mediated by subject-side factors. Cassirer, too, thought that experience was constituted by complex structures and by the human necessity to think along certain pre-given lines. What Cassirer adds is a much greater historical sensitivity to earlier forms of human thinking as they were represented in myths and early religious expressions. The more complex and articulate forms of culture (the three major ones are language, myth, and science) are not a priori necessary across time or cultures, but are achieved by a sort of dialectical solution to problems arising when other cultural forms become unsustainable. Knowledge for Cassirer, unlike for Natorp, is not so much the process of determination, but a web of psycho-linguistic relations. The human mind progresses over history through linguistic and cultural forms, from the affective expressions of primitive peoples, to representational language, which situates objects in spatial and temporal relations, to the purely logical and mathematical forms of signification. The unique human achievement is the symbolic form, an energy of the intellect that binds a particular sensory signal to a meaningful general content. By means of symbols, humans not only navigate the world empirically and not only understand the world logically, but make the world meaningful to themselves culturally. This symbolic capacity is indeed what separates the human species from the animals. Accordingly, Cassirer saw himself as having synthesized the Neo-Kantian insights about subjectivity with the roughly Hegelian-themed phenomenology of conscious forms.

3. Baden

The intellectual pride of the Marburg school fostered a prickly relationship with their fellow neo-Kantians. The rivalry that developed with the Baden Neo-Kantians (alternately named the “Southwest” school) was not so much a competition for the claim to doctrinal orthodoxy as a dispute over what the proper aims and goals of Kant-studies should be. Although it is a generalization, where the Marburg Neo-Kantians sought clarity and methodological precision, the Baden school endeavored to explore wider applications of Kantian thought to contemporary cultural issues. Like their Marburg counterparts, they concerned themselves with offering a third, critical path between speculative idealism and materialism, but turned away from how science or mathematics were grounded in the logic of the mind toward an investigation of the human sciences and the transcendental conditions of values. While they paid greater attention to the spiritual and cultural side of human life than the Marburg school, they were less active in the practical currents of political activity.

Wilhelm Windelband (1848-1915), a student of Kuno Fischer and Hermann Lotze (1817-1881), set the tone of the school by claiming that to truly understand Kant was not a matter of philological interpretation (per Fischer) nor a return to Kant (per Liebmann), but of surpassing Kant along the very path he had blazed. Never an orthodox Kantian, he declared: “Kant verstehen, heißt über ihn hinausgehen.” Part of that effort sprang from Kant’s distinction between different kinds of judgment as being appropriate to different forms of inquiry. Whereas the Marburg school’s conception of logic was steeped in Kant, Windelband found certain elements of Fichte, Hegel, and Lotze to be fruitful. Moreover, the Marburgers too-hastily applied theoretical judgment to all fields of intellectual investigation summarily, and thereby conflated the methods of natural science with proper thinking as such. This effectively relegated the so-called cultural sciences, like history, sociology, and the arts – Geisteswissenschaften – to a subsidiary intellectual rank behind logic, mathematics, and the natural sciences – Naturwissenschaften. Windelband thought, on the contrary, that these two areas of inquiry had a separate but equal status. To show that, Windelband needed to prove the methodological rigor of those Geisteswissenschaften, something which had been a stumbling block since the Enlightenment. In a distinctly Kantian vein, he was able to show that the conditions for the possibility of judging the content of any of the Geisteswissenschaften took the shape of idiographic descriptions, which focus on the particular, unique, and contingent. The natural sciences, on the other hand, are generalizing, law-positing, and nomothetic. Idiographic descriptions are intended to inform, nomothetic explanations to demonstrate. The idiographic deals in Gestaltungen, the nomothetic in Gesetze. Both are equally parts of the human endeavor.

The inclination to seek a more multifaceted conception of subjectivity was a hallmark of the Baden Neo-Kantians. The mind is not a purely mathematical or logical function designed to construct laws and apply them to the world of objects; it works in accordance with the environment in which it operates. What the mind constructs, it constructs in response to a need generated by that socio-cultural-historical background. This notion was a sort of leitmotif to Windelband’s own history of philosophy, which for the first time addressed philosophers organically in terms of the philosophical problems they faced and endeavored to solve rather than as either a straightforward chronology, a series of schools, or, certainly, a sort of Hegelian conception of a graduated dialectical unfolding of a single grand idea.

Heinrich Rickert (1863-1936) began his career with a concentration on epistemology. Alongside many Neo-Kantians, he denied the cogency of a thing itself, and thereby reduced an ontology of externalities to a study of the subjective contents of a common, universal mind. Having written his dissertation under Windelband in 1888, and later succeeding him at Heidelberg in 1916, Rickert also came to reject the Marburgers’ assumption that the methodologies of natural science were the rules for thinking as such. Rickert argued instead that the mind engages the world along the dual lines of Geisteswissenschaften and Naturwissenschaften according to idiographic or nomothetic judgments respectively, a division which Rickert borrowed from his friend Windelband and elaborated upon (although it is sometimes mistakenly attributed to Rickert or even Dilthey). Like Windelband, too, Rickert considered it a major project of philosophy to ground the former in a critical method, as a sort of transcendental science of culture. Where Rickert went his own way was in his critique of scientific explanations insofar as they rested on abstracted generalizations, and in his privileging of historical descriptions on account of their attention to life’s genuine particularity and the often irrational character of historical change. Where Windelband saw equality-but-difference between the Geistes- and Naturwissenschaften, Rickert saw the latter as deficient insofar as they were incapable of addressing values.

History, for Rickert, was the exemplary case of a human science. Historians write about demonstrable facts in time and space, and the truth or falsity of a claim can be demonstrated with roughly the same precision as in the natural sciences. The relational links between historical events – causes, influences, consequences – rely on mind-centered concepts rather than on observable features of a world outside the historian; though this of itself would not preclude the possibility of at least phenomenal explanations of history. More importantly, historians deal with events which cannot be isolated, repeated, or tested, with particular individuals whose actions cannot be subsumed under generalizations, and with human values which resist positive nomothetic explanation. Those values constitute the essence of an historiographical account in a way entirely foreign to the physical sciences, whose objects of inquiry – atoms, gravitational forces, chemical bonds, etc. – remain value-neutral. A historian passes judgment about the successes and failures of policies, assigns titular appellations from Alexander the Great to Ivan the Terrible, decides who is a king and who a tyrant, and regards eras as contributions or hindrances to human progress. The values of historians essentially determine what story is told about the past, even if this sacrifices history’s status as an exact, demonstrable, or predictive science.

Rickert’s differentiation of the methods of history and science has been influential on post-modern continental theorists, in part through his friend and colleague Karl Jaspers (1883-1969), and through his doctoral student Martin Heidegger (1889-1976). The traditional scientific reliance on a correspondence theory of truth seemed woefully uncritical and thereby inadequate to the cultural sciences. The very form of reality should be understood as a product of a subject’s judgment. However, Rickert’s recognition of this subjective side did not entangle him in value relativism. Although the historian’s values, for example, do indeed inform the account, Rickert also thought that values were nothing exclusively personal. Values, in fact, express proximate universals across cultures and eras. Philosophy itself, as a critical inquiry into the values that inform judgment, reveals the endurance and trans-cultural nature of values in such a way that grounds the objectivity of an historical account in the objectivity of the values the historian holds. That truth is something that ‘ought to be sought’ is perhaps one such value. But Rickert has been criticized for believing that the values of historians on issues of personal ethics, the defensibility of wars, the treatment of women, or the social effects of religion were universally agreed upon.

4. Associated Members

Twentieth-century attempts to categorize thinkers historically into this or that school led to some debate about which figures were ‘really’ Neo-Kantian. But this overlooks the fact that even into the mid-1880s the term ‘Neo-Kantian’ was rarely used as a self-designation. Unlike other more tightly-knit schools of thought, the nature of the Neo-Kantian movement allowed for a number of loosely-associated members, friends, and students, who were seekers of truth first and proponents of a doctrine second. Although each engaged the basic terminology and transcendental framework of either Kant or the Neo-Kantians, none did so wholesale or uncritically.

Hans Vaihinger (1852-1933) remains arguably the best scholar of Kant after Fischer and Cohen. And he probably maintained the closest ties with Neo-Kantianism without being assimilated into one or the other school. His massive commentary, begun in 1881 and left unfinished due to his health, exposited not only Kant’s work, but also the history of Neo-Kantian interpretation to that point. He founded the Kant-Studien in 1897, which is both the worldwide heart of Kant studies and the model for all author-based periodicals published in philosophy. Vaihinger’s own philosophy, which resonates in contemporary debates about fictionalism and anti-realism, takes as its starting point a curious blend of Kant, Lange, and Nietzsche in his Die Philosophie des Als-Ob (1911). For Vaihinger, the expressions that stem from our subjective makeup render moot the question whether they correspond to the real world. Not only do the concepts of math and logic have a subjective source, as per many of the Neo-Kantians, but the claims of religion, ethics, and even philosophy turn out to be subjectively-generated illusions that are found to be particularly satisfying to certain kinds of life, and thereafter propagated for the sake of more successfully navigating an unconceptualizable reality in-itself. Human psychology is structured to behave ‘as-if’ these concepts and mental projections corresponded to the way things really were, though in reality there is no way to prove whether they do. Freedom, a key notion for Vaihinger, is not merely one side of a misunderstanding between reason and the understanding, but a useful projection without which we could neither act nor live.

Wilhelm Dilthey (1833-1911) stands alongside Vaihinger as one of the great patrons of Kant scholarship for his work on the Academy Edition of Kant’s works, which began in 1900. Taught by both Fischer in Heidelberg and Trendelenburg in Berlin, Dilthey was strongly influenced by the hermeneutics of Schleiermacher as well, and entwines his Neo-Kantian denial of material realism with the hermeneutical impulse to see judgment as socio-culturally informed interpretation. Dilthey shares with the Baden school the distinction between nomothetic and idiographic sciences. Where he stresses the distinction, though, is slightly different. The natural sciences explain by way of causal relationships, whereas history achieves understanding by correctly associating particulars with wholes. In this way, the practice of history itself allows us to better understand the Lebenszusammenhang – how all of life’s aspects are interconnected – in a way the natural sciences, insofar as they treat the objects of their study as abstract generalizations, cannot. The natural sciences utilize an abstract Verstand; the cultural sciences use a more holistic way of understanding, a Verstehen.

Despite his shared interest in the conditions of experience, Dilthey is usually not considered a Neo-Kantian. Among several key differences, he rejects the Marburg proclivity to construct laws of thinking and experience that supposedly hold for all rational beings. Their consideration of the self as a set of logico-mathematical rules was an illegitimate abstraction from really-lived, historically-contextualized human experience. In its place he stresses the ways historicity and social contexts shape the experiences and thinking processes of unique individuals. In distinction from the Baden school, Dilthey rejects Rickert’s theses about both the necessarily phenomenal and the universal character of historical experience. Since we have unmediated access to our lived inner life through a sort of ‘reflexive awareness’, our explanations of historical events may indeed require a degree of generalization and abstraction; however, our sympathetic awareness of agents’ motivations and intentions remains direct. That inner world is not representational or inferential, but – contrary to the natural sciences – lived from within our inter-contextualized experience of the world. In this way, Dilthey also crossed the mainstay Neo-Kantian position on the role of psychology in the human sciences. It is not a field alongside sociology or economics that requires a transcendental grounding in order to gain a nomothetic status. Psychology is an immediate descriptive science that alone enables the other Geisteswissenschaften to be understood properly as a sort of mediation between the individual and their social milieu. Dilthey thought the results of psychology would, contra Windelband, allow us to move beyond a purely idiographic description of individuals to a law-like (though not strictly nomothetic) typology according to worldviews or ‘Weltanschauungen’.

Max Weber (1864-1920) was a student and friend of Rickert’s in Freiburg, and was also concerned with distinguishing the logic of historical judgment from that of the natural sciences. The past itself contains no intelligibility. Intelligibility is something added by means of the historian’s act of contextualizing an individual in his or her social environment. Against Comte (1798-1857), Weber held that whatever ‘laws’ we might discover in history are really just constructions. And against Durkheim (1858-1917), he thought that talk of ‘societies’ runs the risk of hypostasizing what is actually a convenient symbol. Social history should instead be about trying to understand the motivations behind individual and particular changes.

The sociology of Georg Simmel (1858-1918) was also informed by a personal connection to Rickert. From its foundation by Comte and Mill (1806-1873), sociology had been marked by both a positivist theory of explanation-under-law and by an uncritical empiricism. Simmel, with the familiar Neo-Kantian move, undermined their naivety and replaced it with a more careful consideration of how society could be conceived at all. How do individuals relate to society, in what sense does society have a sort of psychology in its own right that generates changes and out of which spiritual shifts emerge, and how do the representational faculties affect societal rituals like money, fashion, and the construction of cities?

Taking history away from its sociological orientation and back to its conceptual roots was Emil Lask (1875-1915). Having written his dissertation on Fichte under Rickert in 1902 and habilitated on legal theory under Windelband in 1905, Lask was called to Heidelberg in 1910, where his inaugural lecture – “Hegel in seinem Verhältnis zur Weltanschauung der Aufklärung” – intimated a rather different line of influence. His work comprises two major published titles: Logik der Philosophie oder die Kategorienlehre (1911) and Lehre vom Urteil (1912). In them, Lask works out an immediate-intuitional theory of knowledge that went beyond the Kantian categories of the understanding of objects into the realm of a logic of values, and even probed the borders of the irrational. In fact he criticized the generalities of the Baden school’s Problemgeschichte for lacking a concrete grounding in actual historical movements. Although Lask’s insistence on fighting at the eastern front in the First World War cut short what should have been a promising career, he had significant influence on German philosophy through Weber, Lukács, and Heidegger. His death, in the same year as Windelband’s, stands as the usual endpoint of the Baden school of Neo-Kantianism.

The physiological leanings of the Neo-Kantian forerunners, like Helmholtz and Lange, were revitalized by a pair of philosophers of considerable scientific reputation. Richard Avenarius (1843-96) and Ernst Mach (1838-1916), founders of empirio-criticism, sought to overcome the idealism-materialism rift in German academic philosophy by closer investigation into the nature of experience. This entailed replacing traditional physics with a phenomenalistic conception of thermodynamics as the model of positivist science. In his Kritik der reinen Erfahrung (1888-90), Avenarius showed that the physiological factors of experience vary according to environmental conditions. As the nervous system develops regular patterns within a relatively constant environment, experience itself becomes regularized. That feeling of being accustomed to typical experiences, rather than logical deduction, is what counts as knowledge. Mach was similarly critical of realist assumptions about sensation and experience, positing instead the mind’s inherent ability to economize the welter of experience into abbreviated forms. Although the concepts we use, especially those in the natural sciences, are indispensable for communication and for navigating the world, they cannot demonstrate the substantial objects from which they are believed to be derived.

Alois Riehl (1844-1924) denied Liebmann’s claim that the Kantian thing-in-itself was nothing more than an incidental remnant of Enlightenment rationalism. Were that so, Riehl argued, Kantianism would have to give up any pretension to a positive engagement with science. Kant in fact never entirely rejected the possibility of apprehending the thing-in-itself, only the attempt to do so by way of the understanding and reason. It was not inconsistent with Kant to try to articulate noumena mediately, via indirect inferences from their perceptible characteristics in observation. Beyond its spatio-temporal and categorical qualifications, the particular characteristics of an object which we perceive, and by which we distinguish it from any other object, depend not so much on us as upon its mode of appearance as it really is outside us.

Although Rudolph Otto (1869-1937) taught in Marburg, he had little interest in the logic of mathematics per se. Otto did, however, follow his school’s inclination to treat the transcendental conditions of experience as fundamental to the proper study of its objects – in his case, specifically religion. What Otto discovered thereby was a new kind of experience – the ‘holy’ – that was irreducible either to the categories of the understanding or to feeling. This had a substantial influence on the religious phenomenology of two other loosely-associated Neo-Kantian thinkers, Ernst Troeltsch (1865-1923) and Paul Tillich (1886-1965).

With certain Hegelian overtones, Bruno Bauch (1877-1942) was another fringe member of the Baden school with much to say about religion. While he studied with Fischer in Heidelberg, Rickert in Freiburg, and wrote his Habilitationsschrift under Vaihinger in Berlin, he was more concerned than most Badeners with logic and mathematics, and in this respect represents a key link between the Neo-Kantians and both Frege, his confidant, and Carnap, his student. But his views about religion and contemporary politics kept Bauch at considerable distance from mainline Neo-Kantians. Due, in fact, to his overt anti-Semitism, he was made to resign his editorship at the Kant-Studien. He had argued that Jews were intrinsically foreign to German culture, and his embrace of Nazism during the very period of Cassirer’s emigration was an obvious affront, too, to the legacy of Cohen, whatever honors it may have gained him from the state-run academies.

Nicolai Hartmann (1882-1950) began his career as a Neo-Kantian under Cohen and Natorp, but soon after developed his own form of realism. While remaining largely beholden to transcendental semantics, Hartmann rejected the basic Neo-Kantian position on the priority of epistemology to ontology. The first condition for thinking about objects is itself the existence of those objects. It was an error of idealism to assume that the human mind constitutes the knowability of those objects, and equally an error of materialism to assume that those objects are all open to knowability. For Hartmann, some objects could be known by way of their appropriate categorization. Others, however, must remain mysterious, closed off to human inquiry at least until philosophy could point the way toward new methods of comprehending them. Philosophy’s task, accordingly, was to discover the discontinuities between the ‘subjective categories’ of thought and the ‘objective categories’ of reality, and, where possible, eventually to overcome them.

Given the predominance of Neo-Kantianism in German academies at the turn of the century, a number of fringe members deserve at least mention here. None can be considered an orthodox ‘Marburg’ or ‘Southwestern’ school Neo-Kantian, but each had certain affinities or at least critical engagements with the general movement. Among them are Friedrich Paulsen (1846-1908), Johannes Volkelt (1848-1930), Benno Erdmann (1851-1921), Rudolf Stammler (1856-1938), Karl Vorländer (1860-1928), the Harvard psychologist Hugo Münsterberg (1863-1916), Ernst Troeltsch (1865-1923), the plant physiologist Jonas Cohn (1869-1947), Albert Görland (1869-1952), Richard Hönigswald (1875-1947), Artur Buchenau (1879-1946), Eduard Spranger (1882-1963), and the neo-Friesian Leonard Nelson (1882-1927), who, though more critical than not of Neo-Kantian epistemology, shared certain of its ethical and social themes. There was also significant Neo-Kantian influence in French universities in the same years, with representative figures like Charles Renouvier (1815-1903), Jules Lachelier (1832-1918), Émile Boutroux (1845-1921), Octave Hamelin (1856-1907), and Léon Brunschvicg (1869-1944). (For a more detailed look at French Neo-Kantians, see Luft and Capeillères 2010, pp. 70-80.)

5. Legacy

From March 17 to April 6, 1929, the city of Davos, Switzerland hosted an International University Course intended to bring together the continent’s best philosophers. The keynote lecturers and subsequent leaders of the famous disputation were the author of the recent Being and Time, Martin Heidegger, and the author of the recent Philosophy of Symbolic Forms, Ernst Cassirer. Heidegger’s contribution was his interpretation of Kant as chiefly concerned with grounding metaphysical speculation in what he called the transcendental imagination through temporally-finite human reason. Cassirer countered by exposing Heidegger’s formulation of that imagination as unwarrantedly accepting of the non-rational. Agreeing with Heidegger that the progress of Kantian philosophy must acknowledge the finitude of human life, Cassirer emphasized that beyond this it must retain the spirit of critical inquiry, the openness to natural science, and the clarity of rational argumentation that marked Kant himself as a great philosopher and not only as an intuitive visionary. The analytic philosopher Rudolf Carnap (1891-1970) was a participant in the Davos congress, and was thereafter prompted to write a highly critical review of Heidegger. What should have been an event that revitalized Neo-Kantianism in contemporary thought became instead a symbol both of the rise of the analytic school of philosophy and of its cleavage from continental thought. Increasingly, Cassirer was viewed as defending an unfashionable way of philosophizing, and after he emigrated from Germany in 1933 in the wake of increasing anti-Jewish sentiment, the influence of Neo-Kantianism waned.

Even while serious Kant scholarship thrives in the Anglophone world of the early 21st century, Neo-Kantianism remains perhaps the most neglected area in the historical study of ideas. This is unfortunate for Kant studies, since it inculcates an attitude of offhandedness toward nearly a century of superb scholarship on Kant, from Fischer to Cohen to Vaihinger to Cassirer. The Neo-Kantians were also the single most dominant group in German philosophy departments for half a century. Few works of Cohen, Natorp, Rickert, or Windelband have been translated in their entirety, to say nothing of the less prominent figures. Cassirer and Dilthey have attracted the interest of solid scholars, though the attention tends to be historical and hermeneutical rather than programmatic. Study of Simmel and Weber has been consistently strong, but more for their sociological conclusions than their philosophical starting points. To the rest it has become customary to assign a secondary rank behind the Idealists, Marxists, and Positivists – and even a tertiary one behind the Romantics, Schopenhauer, Nietzsche, and Kierkegaard, none of whom, it may be noted, held a professorship in the field. It is not uncommon even in lengthy histories of 19th-century philosophy to find their names omitted entirely.

The reasons for this neglect are complex and not entirely clear. The writing of the Marburg school is dense and academically forbidding, true, but that of Windelband, Cassirer, Vaihinger, and Simmel is quite eloquent. The personalities of Cohen and Rickert were eccentric to the point of being off-putting, though Natorp and Cassirer were congenial. Their ad hominem criticisms of rival members seem crass, but such gossip is undeniably salacious as well. Some of their writing was intentionally set against the backdrop of obscure political developments, though with other figures such obscurities have often given rise to cottage industries of historical scholarship (see Willey, 1978). The two World Wars no doubt had a severely negative impact on the continuity of any movement or school, though phenomenology has fared far better. The fact that many of the Neo-Kantians were both social liberals and Jewish virtually guaranteed that their works would be repressed in Hitler’s Germany. Forced resignation and even exile was a reality for many Jewish academics; Cassirer even had his habilitation at Straßburg rejected explicitly on the grounds that he was Jewish. He and Brunschvicg both escaped likely persecution by giving up their homelands. But as the examples of Bauch and Fischer make clear, Neo-Kantianism cannot be subsumed under a specifically Jewish, or even a minimally tolerant, worldview. And as Cohen’s resurgent legacy in Jewish scholarship proves, not all aspects that were once repressed need forever be.

To ignore the Neo-Kantians, however, creates the false impression of a philosophical black hole in the German academies between the decline of Hegelianism and the rise of phenomenology, a space more often devoted to anti-academic philosophers like Kierkegaard and Nietzsche. Moreover, the Neo-Kantian influence on 20th century minds from Husserl and Heidegger to Frege and Carnap is pronounced, and should not be brushed aside lightly. In terms of early 21st century philosophy, Neo-Kantianism reminds us of the importance of reflecting on method and of philosophizing in conjunction with the best contemporary research in the natural sciences, something for which Helmholtz, Lange, and Cassirer stand as exemplars. And just as Liebmann’s motto ‘Back to Kant’ was exhorted to bridge the gap between idealism and materialism in the 19th century, so too may it be a sort of third way that rectifies the split between contemporary continental and analytic partisanship. If the old saying still holds water, then “one can philosophize with Kant, or against Kant; but one cannot philosophize without Kant.”

Fortunately there are signs of renewed interest. In 2010, Luft and Rudolf Makkreel edited Neo-Kantianism in Contemporary Philosophy (Bloomington, IN: Indiana University Press), which presents a well-rounded contemporary collection of research papers. In Continental Europe, no one has done more to revitalize Neo-Kantian studies than Helmut Holzhey and Fabien Capeillères, who have each published a series of books, editions, and papers on the theme. Under the directorship of Christian Krijnen and Kurt Walter Zeidler, the Desiderata der Neukantianismus-Forschung has hosted a number of congresses and meetings throughout Europe. A 2008 issue of the Philosophical Forum was dedicated to Neo-Kantianism generally, and featured articles by Rolf-Peter Horstmann, Paul Guyer, Michael Friedman, and Frederick Beiser among others. Studies of particular Neo-Kantians have been produced by Poma (1997), Krijnen (2001), Zijderveld (2006), Skidelsky (2008), and Munk (2010). And Charles Bambach (1995), Michael Friedman (2000), Tom Rockmore (2000), and Peter Eli Gordon (2012) have composed fine works of intellectual history concerning the heritage of Neo-Kantianism in the twentieth century.

6. References and Further Reading

a. Principal Works by Neo-Kantians and Associated Members

  • Helmholtz
  • Über die Erhaltung der Kraft (Berlin: G. Reimer, 1847).
  • Über das Sehen des Menschen (Leipzig: Leopold Voss, 1855).
  • Lange
  • Die Arbeiterfrage in ihrer Bedeutung für Gegenwart und Zukunft (Duisburg: W. Falk & Volmer, 1865).
  • Geschichte des Materialismus und Kritik seiner Bedeutung in der Gegenwart (Iserlohn: J. Baedeker, 1866).
  • Logische Studien: Ein Beitrag zur Neubegründung der formalen Logik und der Erkenntnisstheorie (Iserlohn: J. Baedeker, 1877).
  • Cohen
  • “Zur Controverse zwischen Trendelenburg und Kuno Fischer,” Zeitschrift für Völkerpsychologie und Sprachwissenschaft 7 (1871): 249–296.
  • Kants Theorie der Erfahrung (Berlin: Dümmler, 1885 [1871]).
  • Die systematischen Begriffe in Kants vorkritischen Schriften nach ihrem Verhältniss zum kritischen Idealismus (Berlin: Dümmler, 1873).
  • “Friedrich Albert Lange,” Preußische Jahrbücher 37 [4] (1876): 353-381.
  • Kants Begründung der Ethik (Berlin: Dümmler, 1877).
  • Kants Begründung der Aesthetik (Berlin: Dümmler, 1889).
  • Die Religion der Vernunft aus den Quellen des Judentums (Leipzig: Fock, 1919).
  • Natorp
  • Forschungen zur Geschichte des Erkenntnisproblems im Altertum: Protagoras, Demokrit, Epikur und die Skepsis (Berlin: Darmstadt, 1884).
  • Platos Ideenlehre: Eine Einführung in den Idealismus (Leipzig: Dürr, 1903).
  • Philosophische Propädeutik (Marburg: Elwert, 1903).
  • Die logischen Grundlagen der exakten Wissenschaften (Leipzig: Teubner, 1910).
  • “Kant und die Marburger Schule,” Kant-Studien 17 (1912): 193-221.
  • Sozialidealismus: Neue Richtlinien sozialer Erziehung (Berlin: Julius Springer, 1920).
  • Cassirer
  • Zur Einsteinschen Relativitätstheorie: Erkenntnistheoretische Betrachtungen (Berlin: Bruno Cassirer, 1921).
  • Philosophie der symbolischen Formen, 3 vols. (Berlin: Bruno Cassirer, 1923-9).
  • Sprache und Mythos: Ein Beitrag zum Problem der Götternamen (Leipzig: Teubner, 1925).
  • Individuum und Kosmos in der Philosophie der Renaissance (Leipzig: Teubner, 1927).
  • Die Idee der republikanischen Verfassung (Hamburg: Friedrichsen, 1929).
  • An Essay on Man (New Haven, CT: Yale University Press, 1944).
  • The Myth of the State (New Haven, CT: Yale University Press, 1946).

  • Windelband
  • Die Geschichte der neueren Philosophie in ihrem Zusammenhange mit der allgemeinen Cultur und den besonderen Wissenschaften dargestellt, 2 vols. (Leipzig: Breitkopf & Härtel, 1878–80).
  • Präludien: Aufsätze und Reden zur Philosophie und ihrer Geschichte (Freiburg im Breisgau: Mohr, 1884).
  • Einleitung in die Philosophie: Grundriß der philosophischen Wissenschaften, edited by Fritz Medicus (Tübingen: Mohr, 1914).
  • Rickert
  • Der Gegenstand der Erkenntnis: ein Beitrag zum Problem der philosophischen Transcendenz (Freiburg im Breisgau: Mohr, 1892).
  • Die Grenzen der naturwissenschaftlichen Begriffsbildung, 2 vols. (Tübingen: Mohr, 1896-1902).
  • Kulturwissenschaft und Naturwissenschaft (Freiburg im Breisgau: Mohr, 1899).
  • Die Heidelberger Tradition und Kants Kritizismus, 2 vols. (Berlin: Junker und Dünnhaupt, 1934).
  • Dilthey
  • Gesammelte Schriften, 26 vols. (Göttingen: Vandenhoeck & Ruprecht, 1914–2006).
  • Vaihinger
  • Hartmann, Dühring und Lange: Zur Geschichte der deutschen Philosophie im XIX. Jahrhundert (Iserlohn: J. Baedeker, 1876).
  • Kommentar zu Kants Kritik der reinen Vernunft, 2 vols. (Stuttgart: Spemann & Union Deutsche Verlagsgesellschaft, 1881-92).
  • Die Philosophie des Als Ob (Berlin: Reuther und Reichard, 1911).
  • Weber
  • Gesammelte Aufsätze zur Soziologie und Sozialpolitik (Tübingen: Mohr, 1924).
  • Simmel
  • Über sociale Differenzierung (Leipzig: Duncker & Humblot, 1890).
  • Die Probleme der Geschichtsphilosophie (Leipzig: Duncker & Humblot, 1892).
  • Grundfragen der Soziologie (Berlin: Göschen, 1917).

  • Mach
  • Die Analyse der Empfindungen und das Verhältnis des Physischen zum Psychischen (Jena: Gustav Fischer, 1886).
  • Bauch
  • Studien zur Philosophie der exakten Wissenschaften (Heidelberg: C. Winter, 1911).
  • Wahrheit, Wert und Wirklichkeit (Leipzig: F. Meiner, 1923).
  • Lask
  • Gesammelte Schriften, 3 vols., edited by Eugen Herrigel (Tübingen: Mohr, 1923-24).
  • Hartmann
  • Zur Grundlegung der Ontologie (Berlin: Walter De Gruyter, 1935).

b. Secondary Literature

  • Bambach, Charles R., Heidegger, Dilthey and the Crisis of Historicism (Ithaca: Cornell University Press, 1995).
  • Beiser, Frederick C., The German Historicist Tradition (Oxford: Oxford University Press, 2011).
  • Ermarth, M., Wilhelm Dilthey: The Critique of Historical Reason (Chicago: University of Chicago Press, 1978).
  • Friedman, Michael, A Parting of the Ways: Carnap, Cassirer, Heidegger (Chicago: Open Court, 2000).
  • Friedman, Michael, Dynamics of Reason: The 1999 Kant Lectures at Stanford University (Chicago: University of Chicago Press, 2001).
  • Glatz, Uwe B., Emil Lask (Würzburg: Königshausen und Neumann, 2001).
  • Gordon, Peter E., Continental Divide: Heidegger, Cassirer, Davos (Cambridge, MA: Harvard University Press, 2012).
  • Hartmann, E. v., Neukantianismus, Schopenhauerianismus und Hegelianismus in ihrer Stellung zu den philosophischen Aufgaben der Gegenwart (Berlin: C. Duncker, 2nd expanded edition 1877).
  • Heidegger, Martin, “Davoser Disputation zwischen Ernst Cassirer und Martin Heidegger,” in his Kant und das Problem der Metaphysik (Frankfurt a.M.: Vittorio Klostermann, 4th expanded edition 1973), 246-68.
  • Holzhey, Helmut, Cohen und Natorp, 2 vols. (Basel/Stuttgart: Schwabe & Co., 1986).
  • Holzhey, Helmut, Historical Dictionary of Kant and Kantianism (Lanham, MD: Scarecrow Press, 2005).
  • Köhnke, K. C., Entstehung und Aufstieg des Neukantianismus: Die deutsche Universitätsphilosophie zwischen Idealismus und Positivismus (Frankfurt am Main: Suhrkamp, 1986).
  • Krijnen, Christian, Nachmetaphysischer Sinn: Eine problemgeschichtliche und systematische Studie zu den Prinzipien der Wertphilosophie Heinrich Rickerts (Würzburg: Königshausen & Neumann, 2001).
  • Krijnen, Christian & Heinz, Marion (eds.), Kant im Neukantianismus (Würzburg: Königshausen & Neumann, 2007).
  • Külpe, Oswald, The Philosophy of the Present in Germany (New York: Macmillan, 1913).
  • Luft, Sebastian & Capeillères, Fabien, “Neo-Kantianism in Germany and France,” in The History of Continental Philosophy Volume 3: The New Century, edited by Keith Ansell-Pearson & Alan D. Schrift (Durham: Acumen, 2010), 47-85.
  • Makkreel, Rudolph, Dilthey: Philosopher of the Human Studies (Princeton: Princeton University Press, 1993).
  • Makkreel, Rudolph & Luft, Sebastian (eds.), Neo-Kantianism in Contemporary Philosophy (Bloomington, IN: Indiana University Press, 2010).
  • Munk, Reinier (ed.), Hermann Cohen’s Critical Idealism (Dordrecht: Springer, 2010).
  • Ollig, Hans-Ludwig, Der Neukantianismus (Stuttgart: Metzler, 1979).
  • Ollig, Hans-Ludwig, Materialien zur Neukantianismus-Diskussion (Darmstadt: Wissenschaftliche Buchgesellschaft, 1987).
  • Poma, Andrea, The Critical Philosophy of Hermann Cohen (Albany: SUNY Press, 1997).
  • Rockmore, Tom (ed.), Heidegger, German Idealism & Neo-Kantianism (Amherst, NY: Humanity Books, 2000).
  • Schnädelbach, Herbert, Philosophy in Germany, 1831–1933 (Cambridge: Cambridge University Press, 1984).
  • Sieg, Ulrich, Aufstieg und Niedergang des Marburger Neukantianismus: Die Geschichte einer philosophischen Schulgemeinschaft (Würzburg: Königshausen & Neumann, 1994).
  • Skidelsky, Edward, Ernst Cassirer: The Last Philosopher of Culture (Princeton, NJ: Princeton University Press, 2008).
  • Willey, Thomas E., Back to Kant: The Revival of Kantianism in German Social and Historical Thought, 1860-1914 (Detroit: Wayne State University Press, 1978).
  • Zijderveld, Anton C., Rickert’s Relevance: The Ontological Nature and Epistemological Functions of Values (Leiden: Brill, 2006).

 

Author Information

Anthony K. Jensen
Email: Anthony.Jensen@providence.edu
Providence College
U. S. A.

Antoine Arnauld (1612—1694)

Antoine Arnauld was considered by his peers to be one of the preeminent European intellectuals of the 17th century. For a long time, Arnauld was remembered primarily as a correspondent of René Descartes, Gottfried Leibniz and Nicolas Malebranche, and as a dogmatic and uncritical Cartesian who made few if any philosophical contributions. In fact, as more recent research suggests, Arnauld was not a dogmatic and uncritical Cartesian, and he made many philosophical contributions over and above facilitating the development of the philosophical systems of Descartes, Leibniz and Malebranche.

Arnauld’s primary endeavors were largely theological. Indeed, the 18th-century Enlightenment thinker Voltaire wrote of Arnauld that “there was no one with a more philosophical mind, but his philosophy was corrupted” because, among other things, he “plunged 60 years in miserable disputes” (Voltaire 1906, p. 728/Kremer 1990, p. xi). In other words, Voltaire claims that Arnauld wasted his time on religious disputes and theology instead of focusing on philosophy. While it is controversial (to say the least) whether Arnauld’s primary intellectual endeavors were in fact “miserable disputes,” it is certainly true that Arnauld devoted the majority of his efforts and writings to theological and religious questions. Indeed, aside from Cornelius Jansen himself, for whom the movement is named, Arnauld is likely the figure most closely associated with Jansenism, a sect of Catholicism prominent in 17th-century France (see section 1).

However, despite his focus on theological and religious issues, Arnauld was a major figure in the philosophical landscape of the latter half of the 17th century. Arnauld burst onto the philosophical scene in 1641, when he authored the Fourth Objections to Descartes’ Meditations on First Philosophy. After Descartes had finished the manuscript of the Meditations, he sent it to Marin Mersenne to acquire comments from leading intellectuals of the day in Paris, including Thomas Hobbes and Pierre Gassendi. In the eyes of many (Descartes included), Arnauld’s objections were the best of the sets.

In the 1660s Arnauld co-authored several philosophical works with others, none more influential and important than the Port-Royal Logic, sometimes called the Art of Thinking. In this work, Arnauld and his co-author Pierre Nicole offer a sophisticated account of reasoning well, which for these thinkers encompassed not just what we might associate with logic proper today, but also issues concerning the nature of ideas, metaphysics, philosophical methodology and ontology. Arnauld corresponded with Gottfried Leibniz in the 1680s concerning an outline of what would later become one of Leibniz’s most influential works, the Discourse on Metaphysics. Arnauld also engaged in a very public controversy with Nicolas Malebranche. In fact, this controversy has been called “one of the intellectual events of the [17th] century,” and it covered, among other things, the nature of ideas and the nature of God (Nadler 1996, p. 147). This article focuses on those texts and positions of Arnauld’s that are of the most philosophical interest.

Table of Contents

  1. Life and Works
    a. The Life of Arnauld
    b. Philosophical Works
    c. The Leibniz-Arnauld Correspondence and the Malebranche-Arnauld Polemic
  2. Arnauld’s Cartesianism
    a. The Fourth Objections
    b. The Nature of Arnauld’s Cartesianism
    c. Arnauld’s Cartesian Ontology: Dual Dualisms
  3. Philosophical Methodology
  4. Occasionalism
  5. Conception of God and Theodicy
    a. The Doctrine of the Creation of the Eternal Truths
    b. Theodicy
    c. Arnauld’s God
  6. Modality
  7. Conclusion
  8. References and Further Reading

1. Life and Works

a. The Life of Arnauld

Antoine Arnauld, often referred to as “le grand Arnauld” on account of his small physical stature, was first and foremost a theologian (Sedgwick 1998, p. 124). He was born on February 6, 1612 into a well-regarded family in Paris. After briefly considering following his late father into a career in law, Arnauld decided, largely at the behest of his mother and of the family friend (and friend of Cornelius Jansen) Jean Duvergier, to pursue a life in theology and the Church (Kremer 1990, p. xiv; Nadler 1989, pp. 15-16). In 1633, Arnauld began his studies at the Sorbonne in Paris, and in 1641 he both received his doctorate in theology and became an ordained priest. Arnauld ultimately went on to serve on the faculty of the Sorbonne (Nadler 1989, p. 16).

Arnauld’s life was tied to Jansenism and the monastery of Port-Royal. Port-Royal was a convent in France that shifted between two locations (and sometimes occupied both), one just outside of Paris and one in Paris – Port-Royal des Champs and Port-Royal de Paris, respectively (Sleigh 1990, pp. 26-27). Port-Royal was associated with Jansenism and was dedicated to intellectual pursuits, notably the education of children (Sleigh 1990, p. 27).

Jansenism is a now defunct sect of Catholicism that sought reform within the Roman Catholic Church. Indeed, many of Jansenism’s critics argued that it was actually a brand of Protestantism, and they called the movement Jansenism to relate it to Calvinism, named after John Calvin (Schmaltz 1999, p. 42). Jansenism developed out of the ideas of Cornelius Jansen and his book Augustinus. As the title suggests, Jansen defended an Augustinian-inspired system. Beyond this inspiration and their views on grace, it is hard to identify many characteristics representative of the Jansenists or of those associated with Port-Royal (see, for example, Schmaltz 1999). For example, while Port-Royal had long been associated with Cartesianism, recent research suggests that many associated with Port-Royal were in fact anti-Cartesian (see, for example, Nadler 1988b). Nevertheless, the central claim of the Jansenist cause that occupies much of Arnauld’s attention at points in his career, and that plays a fundamental role in several of his most interesting philosophical contributions, is the Augustinian doctrine of efficacious grace. This doctrine held that one did not achieve salvation by one’s own merit, but only through the grace of God. Further, if one received God’s grace, one could not fail to achieve salvation. In other words, grace was given (or not given) by God and not earned by one’s own actions (Nadler 1989, p. 16 and Nadler 2008b, pp. 56-57). A second aspect of Jansenist doctrine (or at least of Arnauld’s Jansenism) that plays a central role in Arnauld’s philosophical endeavors is a commitment to a “hidden God” (Moreau 2000, p. 106). By a “hidden God”, Arnauld, at least, does not mean that God’s works are unknowable or that we can know nothing about God, but rather that in some very substantive ways God is not fully comprehensible by us (see section 5).

Arnauld played a fundamental role in the “Quarrel over the Five Propositions”. In 1653, Pope Innocent X declared the following five propositions heretical; all five, the Pope claimed, were endorsed by Jansen in Augustinus:

  1. Some commandments of God are impossible for righteous men, although they wish to fulfill them and strive to fulfill them in accord with the power they presently possess. They lack the grace that would make it possible.
  2. In the state of fallen nature, interior grace is never resisted.
  3. In order to deserve merit or demerit in the state of fallen nature, freedom from necessity is not required in men; rather, freedom from constraint is sufficient.
  4. The Semi-Pelagians admitted the necessity of prevenient and interior grace for each action, even for the beginning of faith; but they were heretics in that they held that this grace is such that the human will can either resist it or obey it.
  5. It is an error of the Semi-Pelagians to say that Christ died or that he shed his blood for everyone without exception. [See, for example, Sleigh (1990), p. 27, from whom the translations are taken]

Arnauld refused to submit to Papal authority and accept that Jansen’s text was heretical. While Arnauld granted that the Pope had the authority to decide what was or was not heretical, he denied that the Pope had a similar authority in interpreting Augustinus. In fact, Arnauld argued, Jansen endorses none of these propositions in Augustinus. This clash with the Pope resulted in much persecution for Arnauld and for other Jansenists who did not submit to Papal authority. Further, as a result of this incident, Arnauld was removed from the faculty of the Sorbonne in 1656 (Kremer 1990, pp. xv-xvi and Sleigh 1990, pp. 27-28). In 1679, Arnauld went into exile in the Netherlands, during which time he even used a fake name (Monsieur Davy); he never returned to France (Sleigh 1990, p. 26/Jacques 1976, p. 34). Arnauld died on August 8, 1694, in Liège.

[For more information on the life of Arnauld, and for the sources from which the above account is taken, see especially: Nadler (1989), Chapter II; Kremer (1990); Sleigh (1990), Chapter 3; and Sedgwick (1998), especially Chapter 7. For a general book-length study of 17th-century French Jansenism, see Sedgwick (1977)].

b. Philosophical Works

In the course of his life, Arnauld wrote a substantial amount on both theology and philosophy. Those philosophical writings that are of the most importance are briefly discussed here.

While studying at the Sorbonne, Arnauld authored the aforementioned Fourth Objections to Descartes’ Meditations. In November of 1640, Descartes asked Marin Mersenne (who was the center of the Parisian intellectual world) to circulate copies of the Meditations among other thinkers in Paris to acquire comments and objections so that Descartes could publish them with the first printing of the Meditations. While the group of objectors included Pierre Gassendi, Thomas Hobbes and Mersenne himself, Descartes claimed of Arnauld’s objections, “I think they are the best of all the sets of objections” (CSMK III 175/AT III 331). Arnauld’s objections to the Meditations have many distinctive features. For example, unlike the objections of Hobbes and Gassendi, Arnauld’s are not external attacks on Descartes’ system, but objections internal to it. Further, Arnauld’s objections resulted in changes to the actual body of the Meditations (see, for example, Carraud 1995, pp. 110-111). Some of Arnauld’s objections are discussed below (section 2a). In addition to the Fourth Objections and Fourth Replies, Arnauld and Descartes exchanged two letters each in 1648, wherein Arnauld further presses Descartes on matters related to the Cartesian philosophy (although it is worth noting that Arnauld remained anonymous in these letters). These letters are often called “The New Objections to Descartes’ Meditations” (and are available in English in Kremer’s 1990 translation of On True and False Ideas). In these works, Arnauld pushes Descartes for further clarification of central aspects of the Cartesian worldview while simultaneously showing sympathy to that worldview.

Arnauld co-authored works with other Port-Royalists, two of which deserve special mention. The first of these is the Port-Royal Grammar. In this 1660 work, Arnauld and co-author Claude Lancelot construct a general text on grammar, which they define as “the art of speaking” (PRG 41/OA 41 5). They define speaking as “explaining one’s thoughts by signs which men have invented for that purpose” (PRG 41/OA 41 5). In 1662, Arnauld co-wrote a similar work with Pierre Nicole called La Logique, ou L’art de Penser, that is, The Logic or the Art of Thinking (henceforth: Logic). Arnauld and Nicole understand the purpose of logic to be to “give rules for all actions of the mind, and for simple ideas as well as for judgments and inferences” (B 15/OA 41 27). Thus, Arnauld and Nicole aim to produce a text on reasoning well and making good judgments. They divide the text into four parts: conceiving, judging, reasoning and ordering. Conceiving is “the simple view we have of things that present themselves to the mind” and is what we do when we represent things to the mind in the form of ideas before making judgments about them. Judging is the act of bringing together different ideas in the mind and affirming or denying one of the other. Reasoning is the “action of the mind in which it forms a judgment from several others.” Ordering (or method) is the mental action of arranging “ideas, judgments and reasonings” in such a way that the arrangement is “best suited for knowing the subject” (B 23/OA 41 125). The Logic was enormously influential. So influential, in fact, that it has been claimed that “The Port-Royal Logic was the most influential logic from Aristotle to the end of the nineteenth century” (Buroker 1996, p. xxiii). Further, while the text is ostensibly one about logic, it also treats extensively of metaphysics, philosophy of language, epistemology, the theory of ideas and philosophy of religion.

Arnauld wrote a majority of his philosophical work while in exile. In Examen d’un Ecrit qui a pour titre: Traité de l’essence du corps, et de l’union de l’âme avec le corps, contre la philosophie de M. Descartes, that is, Examination of a Writing that has for a title: Treatise on the essence of the body, and the union of the soul with the body, against the philosophy of Mr. Descartes (henceforth: Examen), Arnauld responds to an attack on Cartesianism by M. Le Moine, Dean of the Chapter of Vitre. The treatise to which Arnauld is responding is now lost (Schmaltz 2002, p. 54; Nadler 1989, p. 26, n 18). In this work, Arnauld defends a Cartesian philosophy, though not an entirely orthodox Cartesianism (see section 2b). Arnauld divided the text into four sections: whether Descartes’ philosophy can add anything valuable over and above what scripture tells us; that Descartes’ philosophy is consistent with the Catholic stance on the Eucharist – a Catholic ceremony in which wine and bread are blessed by a priest and, as understood in the 17th century, are either converted into, or annihilated and replaced by, Christ’s blood and body (see, for example, Nadler 1988, p. 230); that Descartes does not ruin the Church’s belief in the glory of the body; and on the union of the soul and the body. [For a thorough treatment of Arnauld, Descartes and the Eucharist, see Nadler (1988) and Schmaltz (2002), Chapter 1]. The Examen (1680) is especially notable in that Arnauld was explicitly defending a Cartesian philosophy even after Descartes’ works were condemned by the Congregation of the Index in 1663 – to have one’s works condemned by the Congregation of the Index was to have them banned by the Catholic Church (see, for example, Nadler 1988, p. 239).

c. The Leibniz-Arnauld Correspondence and the Malebranche-Arnauld Polemic

In addition to the philosophical works described above, Arnauld participated in two events of immense importance in the 17th century, namely the Leibniz-Arnauld correspondence and the Malebranche-Arnauld polemic. The most famous part of the correspondence with Leibniz (indeed, the part normally referred to as the “Leibniz-Arnauld Correspondence”) began in February of 1686, when Leibniz wrote to the Landgrave Ernst von Hessen-Rheinfels. Leibniz sent an outline of what would become one of his most important works, the Discourse on Metaphysics, and asked whether the Landgrave could give the outline to Arnauld in hopes that Arnauld would comment on it. Arnauld and Leibniz then corresponded with Hessen-Rheinfels as an intermediary. Arnauld’s initial response to Leibniz gives some justification for his reputation as a harsh critic and a hot-tempered person. Arnauld claims that Leibniz’s work contains “so many things that frighten me and that almost all men, if I am not mistaken, will find so shocking, that I do not see what use such a work can be, which will clearly be rejected by everybody” (M 9/G II 15). Arnauld adds, “Would it not be better if he [Leibniz] abandoned these metaphysical speculations which cannot be of use to him or others, in order to apply himself seriously to the greatest business that he can ever have, the assurance of his salvation by returning to the Church” (M 10/G II 16). In Arnauld’s second letter to Leibniz, he apologizes and offers a more thorough account of what he found so troubling in Leibniz’s philosophy. It is this letter that begins a philosophical exchange whose rigor rivals the best published philosophical works of the period.

The correspondence had a profound effect on the development of Leibniz’s metaphysical system, so much so that Leibniz considered publishing the correspondence (G I 420, see also Sleigh 1990, p. 1). Further, as we shall see below, Arnauld offers some original contributions in the discussion, most notably concerning modal metaphysics, or the nature of possibility. However, it is worth briefly noting two of Arnauld’s objections to Leibniz’s system before we move on. First, Arnauld objects to Leibniz’s account of pre-established harmony. In the Discourse on Metaphysics, Leibniz defends pre-established harmony, or, as Arnauld refers to it, “the hypothesis of the concomitance and harmony between substances” (M 78/G II 64). Leibniz’s view, in brief, is that each substance causes all of its own changes and that there are no causal relations between distinct finite substances. Instead, according to Leibniz, God has set up the universe so that at the exact moment that, for example, Arnauld speaks to Leibniz, Leibniz’s own substance changes itself in such a way that Leibniz hears Arnauld. For example, in the Discourse Leibniz claims:

We could therefore say…that one particular substance never acts upon another particular substance nor is acted upon by it. (AG 47/G IV 358)

And:

God alone (from whom all individuals emanate continually and who sees the universe not only as they see it but also entirely different from all of them) is the cause of the correspondence of their phenomena and makes that which is particular to one of them public to all of them; otherwise, there would be no interconnection. (AG 47/G IV 375)

Arnauld pushes Leibniz to clarify his view (M 78/G II 64) and then argues that Leibniz’s view collapses into a different account of causation prevalent in the 17th century: occasionalism (M 105-106/G II 84-85). The traditional doctrine of occasionalism, most closely associated with the Cartesian Nicolas Malebranche, is the conjunction of the following two claims:

  1. Finite beings have no causal power.
  2. God is the only true causal agent.

According to occasionalism, only God has causal power. When it appears to us that a baseball breaks a window, it is not the case that the baseball (a finite thing) exercises any causal efficacy over the window. Instead, God, on the occasion of the baseball striking the window, causes the window to break. The baseball in this case is often called an “occasional cause”. However, one should not mistake an occasional cause for something with causal power; rather, the term simply signifies the occasion for God to exercise His causal power. Returning to Arnauld’s objection to Leibniz, he explains:

It seems to me that this [the concomitance and harmony between substances] is saying the same thing in other words as those who claim that my will is the occasional cause of the movement of my arm and God is the real cause. For they do not claim that God does that in time through a new act of will which he exercises each time I wish to raise my arm; but by that single act of the eternal will (M 105-106/G II 84).

Arnauld’s concern is that if substances do not causally affect each other, and the correspondence of seeming causal relations occurs only because God has set things up so that these relations correspond, how could this view be any different from claiming that only God is causally efficacious? We might restate Arnauld’s question as follows: given that God wills things not one at a time but from a single eternal will, what difference is there between God willing, in that single eternal act, that the ‘interactions’ between distinct substances merely correspond without any actual causal relations between them, and God willing the causal relations between the substances themselves?

Arnauld may have misunderstood Leibniz in this regard [see, for example, Sleigh (1990), p. 150], as Leibniz would claim that each substance causes its own changes and insist that this is substantively different from occasionalism. At the least, however, Arnauld’s objections push Leibniz to explain his view; at best, they do not rest on a misunderstanding of Leibniz’s position and pose a legitimate problem for it.

Arnauld also criticizes Leibniz’s account of substances. According to Leibniz, each created substance has a certain ‘form’ that constitutes its essence (see, for example, Sleigh 1990, p. 116). A full (or even adequate) account of Leibniz on this issue would take us on a long detour. For our purposes we can focus on one particular aspect of what Leibniz says:

If the body is a substance…[and not] an entity united by accident or by aggregation like a heap of stones, it cannot consist of extension, and one must necessarily conceive of something there that one calls a [form], and which corresponds in a way to a soul. (M 66/G II 58)

Take for example a human body. We naturally think of the human body, say Arnauld’s body, as an individual thing with some sort of unity or togetherness. According to Descartes and Arnauld (see section 2c), bodies are essentially extended things and are infinitely divisible. That is, any body can be divided into an infinite number of parts. Leibniz claims in this passage that if there are extended bodies, they cannot be simply things like heaps of stones or mere aggregates. There must be something that unifies the body, and this thing, Leibniz says, corresponds in a way to the soul. The “if” is quite important, for Leibniz seems to be acknowledging the possibility that physical substances are merely “true phenomenon like the rainbow” (see, for example, M 95/G II 77, Parkinson 1967, pp. xxvi-xxvii and Sleigh 1990, pp. 101-110); the comparison to a rainbow suggests that such substances might very well be nothing but phenomena. Nevertheless, Leibniz denies that an extended substance could exist without some “form” providing it a unity.

Arnauld offers numerous arguments against Leibniz’s conception of “forms”. Indeed, as pointed out by Parkinson (1967, p. xxvi), Arnauld makes at least seven distinct objections to this concept. These include: denying that he has any clear notion of what such a “form” is (M 134/G II 107); claiming that he only has knowledge of two types of substances, minds and bodies, and that it does not make sense to call these “forms” minds or bodies (M 134-135/G II 107-108); and even arguing that if these “forms” did exist, Leibniz’s account of them would be indefensible (M 135/G II 108).

[For a general discussion of Arnauld and Leibniz’s discussion of “forms”, see Sleigh (1990), Chapter 6. For an excellent book-length treatment of the entire Leibniz-Arnauld correspondence see Sleigh (1990), and for a shorter treatment see Parkinson (1967)].

Arnauld also engaged in a decade-long public dispute with Malebranche over the nature of ideas and over the correct account of God and God’s modus operandi. The debate between Arnauld and Malebranche should be interpreted as a debate between two Cartesians who had strayed from the letter of Descartes in different ways. The public and private debate between Malebranche and Arnauld consists of many works and letters. The controversy began with Malebranche’s publication of the Treatise on Nature and Grace (henceforth: Treatise). Prior to publication, Malebranche had sent the work to Arnauld for his opinion. However, Malebranche then decided to publish the Treatise without waiting to hear back from Arnauld. This event clearly irked Arnauld and is likely the beginning of the polemic (see Moreau 2000, p. 88 and OA 2 95, OA 2 101 and OA 2 116). In the Treatise, Malebranche offers a theodicy, or an attempt to reconcile the existence of an all-good, all-knowing, all-powerful God with the existence of evil. Prima facie, it seems that such a God could create a world with no evil (being all-powerful), would desire to create such a world (being all-good) and would know how to create such a world (being all-knowing). Malebranche offers a highly original theodicy. Malebranche’s theodicy and Arnauld’s objections are discussed in section 5b in regard to Arnauld’s conception of God and theodicy.

Arnauld vigorously responded to Malebranche’s view in print, and the two then exchanged numerous writings and composed many books arguing against each other over the next ten years (for an overview of the chronology of the debate, see Moreau 2000, pp. 88-92). Three of Arnauld’s works are of particular importance: On True and False Ideas (henceforth: VFI; from the French title: Des vraies et des fausses idées), Réflexions philosophiques et théologiques sur le nouveau système de la nature et de la grace, that is, Philosophical and theological reflections on the new system of nature and grace (henceforth: Réflexions), and Dissertation sur la manière dont Dieu a fait les fréquents miracles de l’Ancienne Loi par le ministère des Anges, that is, Dissertation on the manner in which God performed the frequent miracles of the Ancient Law through the ministry of the Angels (henceforth: Dissertation). [Unfortunately, only the first is currently available in English, and even it has only been available in English since 1990; see Kremer (1990), p. xii]. In the latter two, Arnauld goes after Malebranche’s theodicy and conception of God. VFI (1683) was Arnauld’s first contribution to the debate. Curiously, it does not address Malebranche’s theodicy or the Treatise (the work that started the polemic); rather, Arnauld offers an attack on Malebranche’s The Search After Truth, specifically the theory of ideas there presented.

Malebranche offered a very distinctive indirect realist theory of ideas. Malebranche’s account of ideas and perception is not always straightforward and is not without its own interpretative issues. In brief, however, according to Malebranche, ideas are things distinct from and independent of both the mind and the objects that they represent. Further, when a person considers an idea, say of an extended body, the direct and immediate object of thought is an idea, not the extended body itself; one has only indirect perception of the extended body. Malebranche also holds that these ideas are, in an interesting sense, “in God”: in human knowledge and perception, the human mind perceives an idea, and this idea is found in the divine understanding and is God’s idea. [See, for example, The Search After Truth book III, part ii, chapters 1-7; PS 27-50 and Nadler (1992), pp. 98-99, to whom this account is indebted. For more elaborate discussions of Malebranche’s account of ideas, see Nadler (1992) and Schmaltz (2000)].

Arnauld vigorously opposed Malebranche’s account of ideas. It is clear that Arnauld argues against Malebranche’s view that there exist, in God, objects independent of our perception that play a role in our perception. Further, Steven Nadler (1989) has persuasively (although perhaps not definitively) argued that Arnauld was arguing against any indirect realist conception of ideas, and instead offered an account in which ideas are modes of the mind and acts of perception, such that the direct object of any perception is the thing perceived.

While most of the attention the Arnauld-Malebranche debate has received has concerned the theory of ideas, most of the debate itself actually concerned theodicy and God. In the Réflexions and Dissertation, for example, Arnauld argues directly against Malebranche’s conception of God and theodicy, and against Malebranche’s account of the order of providence.

This account of the polemic raises an interesting question. While Arnauld’s main target was Malebranche’s theodicy and account of God’s modus operandi, Arnauld began his attack on a seemingly unrelated issue, namely Malebranche’s theory of ideas. Why would Arnauld begin his attack on Malebranche’s theodicy with an attack on his theory of ideas? Denis Moreau has recently offered a very plausible explanation, one we will be in a position to consider in section 5b.

Though not addressed below, two other works related to the Arnauld-Malebranche debate warrant mention: the Dissertatio Bipartitia, that is, Dissertation in Two Parts (1692), and the Règle du bon sens, that is, the Rules of Good Sense (1693). The former was written by Arnauld in response to a work by Gommaire Huygens, in which Huygens defends a view similar to Malebranche’s concerning the theory of ideas (see TP 36). The latter was written in response to François Lamy, who, at the request of Pierre Nicole (Arnauld’s co-author on the Logic), defended Huygens from Arnauld’s attack in the Dissertatio (see TP 37).

With this brief account of Arnauld’s life and works in hand, we can proceed to the heart of the project.

[For excellent treatments of the Arnauld-Malebranche polemic, see Moreau (2000) and Nadler (1989). Moreau (2000) is an article-length summary of the debate, while Nadler (1989) is a book-length treatment of the debate concerning ideas. In addition, Moreau (1999) is an excellent book-length treatment of the debate, though it is only available in French.]

2. Arnauld’s Cartesianism

In the Introduction it was claimed that it is a mistake to view Arnauld simply as an uncritical Cartesian, one who endorses Descartes’ positions and offers no critical evaluation, advancement or original contributions. Nevertheless, Arnauld’s philosophy was thoroughly Cartesian, and his commitment to Cartesianism is so fundamental to his thought that a section on Arnauld’s Cartesianism is in order. We begin with a discussion of the Fourth Objections, as it is one of Arnauld’s earliest philosophical works and involves Arnauld directly responding to Descartes’ Meditations. We then proceed to discuss the nature of Arnauld’s Cartesianism and, finally, Arnauld’s commitment to a basic Cartesian ontology.

a. The Fourth Objections

As discussed above, Arnauld was one of the authors of the Objections to Descartes’ Meditations. Arnauld’s objections, the Fourth Objections, include many objections that are of central importance, both philosophically and because they facilitate Descartes further articulating his own positions. Prior to receiving the manuscript of the Meditations from Mersenne, Arnauld was already familiar with and likely sympathetic to Cartesian philosophy. As he tells Mersenne:

You can hardly be after my opinion of the author, since you already know how highly I rate his outstanding intelligence and exceptional learning (CSM II 138/AT VII 197).

Arnauld was familiar with at least Descartes’ Discourse on Method (published in 1637), as he refers to it in the Fourth Objections (AT VII 199/CSM II 139). Returning to the philosophical content of the Fourth Objections, Arnauld offers many interesting and important objections to Descartes’ Meditations, not all of which can be adequately covered here. Some of the most important are addressed below.

Arnauld’s first objection to Descartes concerns the first argument in the Sixth Meditation for the claim that the mind and the body are really distinct (the one that occurs at AT VII 78/CSM II 54). Roughly, for the mind and body to be really distinct is for the mind to be able to exist without the body and for the body to be able to exist without the mind. Descartes begins the argument in question by claiming that:

(1) I know that everything which I clearly and distinctly understand is capable of being created by God so as to correspond exactly with my understanding of it (AT 78; CSM II 54).

And he continues:

(2) Hence the fact that I can clearly and distinctly understand one thing apart from another is enough to make me certain that the two things are distinct, since they are capable of being separated, at least by God (AT 78; CSM II 54).

And he concludes by claiming:

(3) On the one hand I have a clear and distinct idea of myself, in so far as I am simply a thinking, non-extended thing; and on the other hand I have a distinct idea of body, in so far as this is simply an extended, non-thinking thing (AT 78; CSM II 54).

So, he concludes:

(4) And accordingly, it is certain that I am really distinct from my body, and can exist without it (AT VII 78/CSM II 54).

The correct interpretation of Descartes’ argument is controversial. However, fundamental to the argument is the inference from Descartes’ clear and distinct idea of himself (that is, whatever is essential to him) as a thinking, non-extended thing and his clear and distinct idea of body as an extended, non-thinking thing, to the claim that “I am really distinct from my body, and can exist without it”. While Descartes does not give us an explicit definition of what a clear and distinct idea is in the Meditations (he does in the Principles of Philosophy, a text written later and so one which Arnauld would not yet have seen at the time of the Fourth Objections), Descartes clearly means an idea that has a certain internal feature such that it is very precise in his mind and strikes him with a certain force. Descartes’ argument moves from our ability to have a clear and distinct idea of ourselves as thinking, non-extended things to the “real distinction” of mind and body. If we clearly and distinctly perceive that mind and body are separable, then minds and bodies are separable. Thus, central to Descartes’ argument is the claim that our ability to clearly and distinctly conceive of mind and body as separate licenses the inference that minds and bodies are capable of being separate (or even are actually separate). [For a more thorough account, see the article “René Descartes: The Mind-Body Distinction”; for various interpretations of Descartes’ argument, see Van Cleve (1983); Wilson (1978); and Rozemond (1998).]

In his treatment of this argument, Arnauld questions the inference from our being able to clearly and distinctly conceive of mind and body as separate to their actually being separable. Arnauld offers what he takes to be a counter-example. He claims:

Suppose someone knows for certain that the angle in a semi-circle is a right angle, and hence that the triangle formed by this angle and the diameter of the circle is right angled. In spite of this, he may doubt, or not yet grasped as certain, that the square on the hypotenuse is equal to the squares on the other two sides; indeed he may even deny this if he is misled by some fallacy. But now, if he uses the same argument as that proposed by our illustrious author, he may appear to have confirmation of his false belief, as follows: ‘I clearly and distinctly perceive’, he may say, ‘that the triangle is right-angled’; but I doubt that the square on the hypotenuse is equal to the squares on the other two sides; therefore it does not belong to the essence of the triangle that the square on its hypotenuse is equal to the squares on the other sides.’ (CSM II 141-142/AT VII 201-202)

Arnauld offers an objection to Descartes’ argument using what he takes to be a different case with an analogous structure. Arnauld thinks that in his case the conclusion is clearly untenable, and so Descartes’ argument must be mistaken. Arnauld argues that, given Descartes’ argument, it seems possible for there to be a right triangle in which the square on the hypotenuse is not equal to the sum of the squares on the other two sides; in other words, a right triangle that does not satisfy Pythagoras’ Theorem. Arnauld argues: imagine that someone considers a right triangle but doubts whether the square on the hypotenuse is equal to the sum of the squares on the other two sides. Arnauld claims that, using Descartes’ argument, since one clearly and distinctly conceives of a right triangle without recognizing that it instantiates Pythagoras’ Theorem, one could conclude that a right triangle could exist that does not instantiate Pythagoras’ Theorem; a conclusion that is surely untenable.

Arnauld suggests that Descartes’ only reply is to claim that he does not clearly and distinctly perceive that the triangle is right-angled. Arnauld continues: “But how is my perception of the nature of my mind any clearer than his perception of the nature of a triangle” (CSM II 142/AT VII 202). So, Arnauld suggests, either we can never know when we have a clear and distinct idea or Descartes’ principle admits of counter-examples. Whether Arnauld’s objection is successful is a difficult question and one that cannot be settled here. However, briefly, Descartes’ reply to Arnauld seems at least prima facie effective (and is possibly a central text for a proper interpretation of Descartes’ argument). Descartes claims that “we can clearly and distinctly understand that a triangle in a semi-circle is right-angled without being aware that the square on the hypotenuse is equal to the squares on the other two sides” and compares this to “clearly and distinctly perceive[ing] the mind without the body and the body without the mind” (CSM II 158/AT VII 224-225). Van Cleve (1983, p. 39) helpfully explains the difference (the material within quotation marks is the content of the clear and distinct perception):

(a) Clearly and distinctly perceiving ‘A’ without clearly and distinctly perceiving ‘B’

(b) Clearly and distinctly perceiving ‘A without B’

So, it seems that Descartes is claiming that, in the case of mind and body, (b) holds. That is, we can have a clear and distinct perception with the content: “mind without body and body without mind”. In Arnauld’s case, we only have a clear and distinct perception of a “right triangle” such that we fail to notice that it instantiates Pythagoras’ Theorem. For Arnauld’s case to be a true counter-example, we would need to be able to have a clear and distinct perception with the content “right triangle that does not instantiate Pythagoras’ Theorem”, and Descartes would deny that we have any such perception.

Arnauld’s objection to Descartes seems to remain a central problem for conceivability-based accounts of the epistemology of modality (see, for example, Yablo 1993, p. 2). A conceivability-based account of the epistemology of modality is one in which our epistemic access to modal claims (that is, claims about possibility, essence, necessity, and so forth) is grounded in our faculty of conceiving of certain states of affairs. Indeed, the conceivability-possibility principle holds that if a state of affairs is conceivable, then it is possible. This principle seems ubiquitous in philosophy, not least in many thought experiments. The problem, akin to Arnauld’s objection, is that of coming up with an internally verifiable criterion for the right sort of conceivability that does not admit of counter-examples. Interestingly, Arnauld ultimately would agree with Descartes and accept that clear and distinct perception was sufficient to establish a real distinction between mind and body (see, for example, Schmaltz 1996, p. 14). Arnauld’s objection to Descartes is thus of central importance, both historically and philosophically.

A second prominent objection from the Fourth Objections concerns the structure of the Meditations. Arnauld was the first to argue that Descartes’ argument in the Meditations is circular. Arnauld claims:

I have one further worry, namely how the author avoids reasoning in a circle when he says that we are sure that what we clearly and distinctly perceive is true only because God exists.

But we can be sure that God exists only because we clearly and distinctly perceive this. Hence, before we can be sure that God exists, we ought to be able to be sure that whatever we perceive clearly and evidently is true. (CSM II 150/AT VII 214)

The worry pointed to by Arnauld has become known as the ‘Cartesian Circle’. Arnauld asks how Descartes can avoid arguing in a circle when, in the Meditations, he seems both to rely on a certain type of idea, namely clear and distinct ideas, to argue for God’s existence, and, at the same time, to use God to validate those clear and distinct ideas. In other words, Descartes’ proof of God’s existence depends on his trusting his clear and distinct ideas, while he relies on God to validate those clear and distinct ideas. It is possible, however, that there is no such circle in Descartes (indeed, Descartes’ response to Arnauld is key to understanding the Meditations and how Descartes avoids arguing in a circle). Nevertheless, there are passages that suggest that Descartes argues in a circle. For example, in the Fifth Meditation, Descartes claims: “if I were unaware of God…I should never have true and certain knowledge of anything, but only shifting and changeable opinions” (CSM II 48/AT VII 69). Regardless of one’s take on the Cartesian Circle, it is among the best-known and most frequently discussed topics in Descartes scholarship, and Arnauld was the first to articulate the worry.

Arnauld goes on to express a concern about Descartes’ dualism. This concern is motivated by the fact that Descartes argues that the mind and the body are entirely different sorts of substances, with nothing in common whatsoever. Arnauld worries that, by exaggerating the distinction between the mind and body, Descartes renders the body an overly inconsequential part of ourselves, claiming “that man is merely a rational soul and the body merely a vehicle for the soul – a view which gives rise to the definition of a man as ‘a soul which makes use of a body’” (CSM II 143/AT VII 203). With this objection, Arnauld wants to ensure that, on Descartes’ account, the human being is a legitimate whole.

b. The Nature of Arnauld’s Cartesianism

As noted above, Arnauld’s philosophy is clearly Cartesian. However, the extent to which he is a Cartesian is less clear. There are three general accounts of Arnauld’s commitment to Cartesianism. Traditionally, Arnauld had been interpreted as an uncritical and orthodox Cartesian (some readings go so far as to suggest that Arnauld treated Descartes’ writings almost as scripture). Marjorie Grene, for example, claims that Arnauld “would indeed prove for more than half a century the defender of pure Cartesian doctrine as he read it” (Grene 1998, p. 172). Other defenders of this general reading (although in forms not as strong as Grene’s claim) include Robinet (1991) and Gouhier (1978). A second reading is that Arnauld was only nominally a Cartesian. That is, Arnauld’s enthusiasm for Cartesian philosophy extended only so far as it furthered his other ends, whether reviving interest in the philosophy of Augustine or advancing religious doctrines associated with Jansenism. Thus, although Arnauld may have furthered, developed and popularized Cartesianism, to call Arnauld a Cartesian is, in the words of Miel, “to be a dupe to a label” (Miel 1969, p. 262). Third, Arnauld can be seen as a critical and innovative Cartesian: a philosopher who embraced, to some extent, a Cartesian world-view, but remained critical of Descartes’ views and developed aspects of the Cartesian world-view or abandoned them as he saw fit. The most prominent defenders of this interpretation are Nadler (1995) and Moreau (1999); see also Kremer (1996) and Faye (2005). The third reading seems the best. Indeed, the account of Arnauld defended in this article supports the third reading, since Arnauld strays from Descartes on several important issues described below.
[For a good discussion of Arnauld’s Cartesianism, see Nadler (1989), Chapter II.]

c. Arnauld’s Cartesian Ontology: Dual Dualisms

Arnauld, like Descartes, has two exhaustive and exclusive dualisms at the foundation of his ontology, namely a substance-mode ontology and a mind-body substance dualism. Arnauld’s commitment to a substance-mode ontology is clear in Book I, Chapter 2 of the Logic, where he claims:

I call whatever is conceived as subsisting by itself and as the subject of everything conceived about it, a thing. It is otherwise called a substance.

I call a manner of a thing, or mode, or attribute, or quality, that which, conceived as in the thing and not able to subsist without it, determines it to be a certain way and causes it to be so named.  (B 30/OA 41 46-47).

Arnauld is endorsing a substance-mode Cartesian ontology. Substances have two features. First, a substance is something which subsists by itself. Arnauld explains that by subsisting by itself, a substance needs “no other subject to exist” (B 31/OA 41 47). Arnauld holds that substances have independent ontic existence (relying only on God). The second feature of a substance is that it is the subject of everything conceived about it. That is, whatever features or properties the substance is conceived to have depend on the substance for their existence.

The second ontic category is that of modes. Modes cannot subsist on their own and are in a substance. In another passage Arnauld explains that modes “can exist naturally only through a substance” (Book I, Chapter 7, B 43/OA 41 64). Following Descartes, Arnauld defines modes as those things that rely on other things for their existence. Modes are different ways that substances can be (K 6/OA 38 184).

Arnauld, like Descartes, also endorses substance dualism, or the view that minds and bodies are the only two types of substances. That minds and bodies are the only two types of substances is a favorite example of Arnauld’s throughout the Logic. In Book I, Chapter 7, he claims “body and mind are the two species of substance.” In Book II, Chapter 15, Arnauld claims “every substance is a body or a mind” (B 42/OA 41 61; B 123/OA 41 161). In fact, Arnauld tells Leibniz “I am acquainted with only two kinds of substances, bodies and minds; and it is up to those who would claim that there are others to prove it to us” (M 134/G II 107). Minds, for Arnauld, are essentially thinking things, while bodies are essentially extended things (see, for example, B 31-32/OA 41 135). In other words, what it is to be a mind just is to be a thinking thing, and what it is to be a body just is to be an extended thing.

3. Philosophical Methodology

Another area of Arnauld’s philosophy that reveals his commitment to Cartesianism is his philosophical methodology. Two of the most illuminating discussions of Arnauld’s methodology occur in the Logic (especially Book IV) and in the early chapters of VFI.

In Book IV of the Logic, Arnauld lists four rules useful for guarding against error when we try to find the truth in human sciences (L1, for Logic 1, and so forth):

L1. Never accept anything as true that is not known evidently to be so.

L2. Divide each of the difficulties being presented into as many parts as possible, and as may be required to solve them.

L3. To proceed by ordering our thoughts, beginning with the simplest and the most easily known objects, in order to rise step by step, as if by degrees, to knowledge of the most complex, assuming an order even among those that do not precede one another naturally.

L4. Always make enumerations so complete and reviews so comprehensive that we can be sure to leave nothing out. (B 238/OA 41 367-368)

In Chapter I of VFI, Arnauld offers seven “rules which we ought to keep in mind in order to seek the truth” (K 3/OA 38 181).

The rules are (V1 for VFI 1, and so forth):

V1. The first is to begin with the simplest and clearest things, such that we cannot doubt about them provided that we pay attention to them.

V2. The second, to not muddy what we know clearly by introducing confused notions in the attempt to explain it further.

V3. The third is not to seek for reasons ad infinitum; but to stop when we know what belongs to the nature of the thing, or at least to be a certain quality of it.

V4. The fourth is not to ask for definitions of terms which are clear in themselves, and which we could only render more obscure by trying to define them, because we could explicate them only by those less clear.

V5. The fifth is not to confuse the questions which ought to be answered by giving the formal cause, with those that ought to be answered by giving the efficient cause, and not to seek the formal cause of the formal cause, which is a source of many errors, but rather to reply at that point by giving the efficient cause.

V6. The sixth is to take care not to think of minds as being like bodies or bodies as being like minds, attributing to one what pertains to the other.

V7. The seventh, not to multiply beings without necessity. (K 3-4/OA 38 181-182)

Later in VFI, Arnauld adds some “axioms” that complement or add to his method. These axioms include the following (the axiom numbers listed here do not correspond with the numbers as listed in VFI; A1 for Axiom 1, and so forth):

A1. Nothing should make us doubt something if we know that it is so with entire certainty, no matter what difficulties can be put forward against it.

A2.  Nothing is more certain than our knowledge of what takes place in our soul when we pause there. It is very certain, for example, that I conceive of bodies when I think I conceive of bodies, even though it may not be certain that the bodies that I conceive either truly exist, or are such as I conceive them to be.

A3.  It is certain, either by reason, assuming that God is not a deceiver, or at least by faith, that I have a body and that the earth, the sun, the moon and many other bodies which I know as existing outside of my mind, truly exist outside my mind.

There is much of interest in this set of rules. One immediately noticeable aspect is how similar and indebted to Descartes these rules are. Although it would be beyond the scope of this article to offer a full investigation and comparison of Arnauld’s method with Descartes’, a few aspects of the relation are worth noting. Rules L1-L4 are taken from Descartes’ Discourse on Method and identified as such by Arnauld and Nicole. In the Discourse on Method, Descartes offers his method “of rightly conducting one’s reason and seeking truth in the sciences” (CSM I 111/AT VI 1). In the Discourse, Descartes is openly critical (although certainly not entirely dismissive) of the education he received while studying at one of the better schools in Europe at the time. Descartes offers his own rules as a replacement for the method he had been taught (CSM I 120/AT VI 18-19). Descartes is often considered the “Father of Modern Philosophy”, and this title is not without merit, at least in part because of his method. Arnauld’s explicit endorsement of the method offered by Descartes in the Discourse on Method shows Arnauld’s embrace of the new method in philosophy.

The rules from VFI are very similar to those from Descartes’ Rules for the Direction of the Mind (henceforth: Rules; CSM I 9-78/AT X 359-472). For example, V1 above is very similar to (if not simply a restatement of) Rule 5 of Descartes’ Rules, where Descartes claims that we should “reduce complicated and obscure propositions step by step to simple ones, and then, starting with the intuition of the simplest ones of all try to ascend through the same steps to knowledge of all the rest” (Nadler 1989, p. 34 makes this point). V1 also echoes Descartes’ Method of Doubt (the method Descartes uses in the Meditations), as Arnauld wants his method to rely on things that we cannot doubt. In V2, Arnauld warns against muddying what we already know clearly by introducing confused notions in the attempt to explain it further. It is further noteworthy that in his explanation of V4, Arnauld explicitly mentions Descartes.

In V6, Arnauld claims that one must be very clear to distinguish between bodies and minds and not attribute what belongs to one to the other. For example, Arnauld would argue that thought pertains to minds and extension belongs to body. It is a mistake to think of bodies as thinking or minds as extended. In V7, Arnauld endorses what is often called Ockham’s Razor, or the principle that one ought not introduce things into one’s ontology when there is no need to do so. This is no doubt in the background of Arnauld’s dismissal of Leibniz’s “forms” discussed in section 1c. Arnauld thinks not only that these forms are vague, but also that they are unhelpful and unnecessary.

In the axioms from VFI, specifically A1, Arnauld claims that “nothing should make us doubt something if we know it with entire certainty” even if this leads to other difficulties. Arnauld employs this principle often in the Logic. For example, in the Logic, Arnauld claims:

How to understand that the smallest bit of matter is infinitely divisible and that one can never arrive at a part that is so small that not only does it not contain several others, but that it does not contain an infinity of parts; that the smallest grain of wheat contains in itself as many parts, although proportionately smaller, as the entire world…all these things are inconceivable, and yet they must necessarily be true, since the infinite divisibility of matter has been demonstrated. (B 231/OA 41 359)

Here Arnauld applies A1 to the claim that matter is infinitely divisible. Arnauld takes it that it has been proven that matter is infinitely divisible. Yet, it is inconceivable how this is so. We should not doubt this, however, no matter what difficulties are put forward concerning it. This is because it has been demonstrated. Arnauld goes on to relate this same principle to God and God’s omnipotence:

The benefit we can derive from these speculations is not just to acquire this kind of knowledge [extension’s being infinitely divisible]…but to teach us to recognize the limits of the mind, and to make us admit in spite of ourselves that some things exist even though we cannot understand them. This is why it is good to tire the mind on these subtleties, in order to master its presumption and to take away its audacity ever to oppose our feeble insight to the truths presented by the Church, under the pretext that we cannot understand them. For since all the vigor of the human mind is forced to succumb to the smallest atom of matter, and to admit that it clearly sees that it is infinitely divisible without being able to understand how that can be, is it not obviously to sin against reason to refuse to believe the marvelous effects of God’s omnipotence, which is itself incomprehensible, for the reason that the mind cannot understand them. (B 233/OA 41 361).

Here, Arnauld claims that God’s omnipotence and its effects are incomprehensible, yet this should not cause us to doubt that God is omnipotent. We shall return to this point in section 5a.

[See Nadler (1989), Chapter II.3 for a great discussion of the methodology and its relation to Cartesianism].

4. Occasionalism

One of the key questions facing Cartesian dualism is how the mind and body, being so radically different, could causally interact. While there is some debate about the exact nature of Descartes’ position, the dominant reading is that he held mind-body interactionism—the view that the mind and body causally interact with one another. As discussed in section 2c, Arnauld also held mind-body substance dualism. It is not surprising, then, that Arnauld inherited some of the same questions concerning interaction from Descartes. There is also debate about Arnauld’s view in this respect: whether Arnauld followed Descartes and embraced interactionism, or abandoned interactionism in favor of a version of occasionalism.

As discussed in section 1c, the traditional doctrine of occasionalism has two components:

  1. Finite beings have no causal power.
  2. God is the only true causal agent.

Arnauld clearly rejects both 1 and 2. Arnauld holds that finite minds have causal power. Thus, it may seem odd to attribute occasionalism to Arnauld. However, consider the following two claims:

  3. Finite minds have no causal power over bodies and bodies have no causal power over minds.
  4. God is the only true causal agent with respect to mind-body causal relations.

The conjunction of 3 and 4, or mind-body occasionalism, is occasionalism with respect to only a certain type of causal relations, namely mind-body causal relations.

The traditional reading of Arnauld is as a mind-body interactionist. The most sophisticated defense of this reading is A. R. Ndiaye’s. He argues that despite his occasionalist vernacular, Arnauld is in the tradition of Descartes, not in the occasionalist tradition (Ndiaye 1991, p. 308).

While Ndiaye’s interpretation deserves serious consideration, Nadler (1995) has convincingly argued that, at least by the end of his career, Arnauld was a mind-body occasionalist (the following account is taken from Nadler 1995). As pointed out by Nadler, in the Examen Arnauld argues that mental states and bodily states correspond in an interesting way. We have certain mental states, like pain, on the occasion of certain bodily states, like burning flesh. Arnauld argues that such a correspondence needs an explanation; that is, Arnauld claims that there must be some reason for this correlation between certain types of bodily events and certain types of mental states. Arnauld suggests only three possible explanations:

(a) Bodies cause mental states.

(b) The mind causes its own mental states on the occasion of certain bodily states.

(c) God causes mental states on the occasion of certain physical states.

The three options Arnauld gives make perfect sense given Arnauld’s mind-body substance dualism. The only created substances that exist are minds and bodies and of course God exists as well. Given that these are the only things that exist, the three above options seem to be the only three possibilities to explain the correspondence between mental states and bodily states (short of coincidence). Either minds are responsible, bodies are responsible or God is responsible. Arnauld begins by denying that bodies cause mental states, arguing:

For the movement of a body can have at most no other real effect than to move another body, I say at most, for possibly it cannot even do that, for who does not see that a body cannot at all have causal relations with a spiritual soul [qui ne voit qu’il n’en peut causer aucun sur une ame spirituelle] which is incapable, by its nature, of being pushed or moved? (E 99/OA 38 146)

Arnauld clearly denies that bodies can exercise causal power over minds. Thus, Arnauld has eliminated the first possibility. Arnauld then proceeds to consider the second possibility and claims that the mind does not cause its own perceptions on the occasion of certain bodily states, for the mind could not “ha[ve] such a power to give itself all the perceptions of sensible objects” to “produce them so properly and with so marvelous a promptitude” (OA 38 147/E 101/trans. Nadler 1995, p. 135). In other words, the human mind could not cause the correct sensations so consistently and so unerringly. How would the human mind know what sensations to cause and when? Since Arnauld argues there are only three possibilities (namely a, b and c above) and the first two fail (namely, a and b), it must be that God causes them:

It only remains for us to understand that it must be that God desired to oblige himself to cause in our soul all the perceptions of sensible qualities every time certain motions occur in the sense organs, according to the laws that he himself has established in nature (emphasis mine, E 101-102/OA 38 147-148; translation Nadler 1995, p. 136).

Arnauld’s account, at least with respect to body→mind causal relations, is occasionalist. God causes our mental states on the occasion of certain bodily states every time these bodily states occur.

As pointed out by Nadler, there is another work in which Arnauld does address mind→body causal relations, namely the Dissertation. In the Dissertation, Arnauld claims:

Our soul does not know what needs to be done to move our arm…it is properly only this reason…that can lead one to believe that it is not our soul that moves our arm. (OA 38 690/trans. Nadler 1995, pp. 138-139)

This is not an argument for occasionalism, as all Arnauld claims is that finite minds do not causally affect bodies. However, Arnauld has only three options open to explain the correspondence of mental states and physical states: finite minds, bodies or God. Minds do not have the requisite knowledge to produce their own perceptions on the occasion of certain bodily states, so, a fortiori, bodies certainly do not have the capacity to recognize what to do on the occasion of states of the soul (see Nadler 1995, p. 138). Thus, given that Arnauld denies that finite minds causally affect bodies, the only option left to explain the correspondence is God.

These passages illustrate Arnauld’s commitment to mind-body occasionalism. In fact, they illustrate that Arnauld’s mind-body occasionalism is an ad hoc response to some form of the problem of interaction (see Nadler 1995, pp. 142-143). Arnauld argues that it is clear that minds do not causally affect bodies and that bodies do not causally affect minds, so God must be responsible for the causal interaction.

Concerning body-body causation, Arnauld’s position is less clear. Nadler has suggested that Arnauld seems to allow body-body causal relations. Indeed, he cites the passage “For the movement of a body can have at most no other real effect than to move another body” to suggest that Arnauld seems to hold that bodies do have causal power (Nadler 1995, p. 132). However, Arnauld follows this claim with “I say at most, for possibly it cannot even do that.” There appears to be no text that establishes Arnauld’s view on the matter, but the hesitation in this passage may suggest that Arnauld was not completely satisfied with the possibility of body-body causal relations and at least would entertain the notion that body-body causal relations were occasionalist.

[Secondary literature: Ndiaye (1991); Watson (1987); Robinet (1991); Nadler (1995); Kremer (1996); Sleigh (1990).]

5. Conception of God and Theodicy

As noted in the introduction, Arnauld was primarily a theologian. Combining Arnauld’s focus on theological issues with his philosophical acumen, one might expect Arnauld to have especially interesting philosophical discussions and views pertaining to God, God’s modus operandi and theodicy. Such an expectation would not be disappointed. Two questions concerning Arnauld’s views on God are of particular importance. First, did Arnauld follow Descartes and endorse the Doctrine of the Creation of the Eternal Truths? Second, what contributions, if any, did Arnauld make to the problem of theodicy and to an account of God’s modus operandi?

a. The Doctrine of the Creation of the Eternal Truths

One of the most contested issues in Arnauld scholarship is whether Arnauld followed Descartes in endorsing the Doctrine of the Creation of the Eternal Truths (henceforth: CET). Descartes first defends this position in a series of letters to Mersenne in 1630. For example, Descartes claims:

You ask me by what kind of causality God established the eternal truths. I reply: by the same kind of causality as he created all things, that is to say, as their efficient and total cause. For it is certain that he is the author of the essence of created things no less than their existence; and the essence is nothing other than the eternal truths…You ask what necessitated God to create these truths; and I reply that he was free to make it not true that all the radii of the circle are equal – just as free as he was not to create the world. And it is certain that these truths are no more necessarily attached to his essence than are other created things. You ask what God did in order to produce them. I reply that from all eternity he willed and understood them to be, and by that very fact he created them…In God, willing, understanding and creating are all the same thing without one being prior to the other even conceptually. (AT 1 152-153/CSMK III 25-26)

How to interpret CET is a difficult question in its own right. Eternal truths are truths like those of mathematics, for example, 2+2=4, and those about essences, for example, Descartes is a thinking thing. These sorts of truths are usually considered to be necessarily true. That is, they are considered to be truths that could not have been false. Yet, in this passage, Descartes seems to suggest that the eternal truths were created by God in such a way that they could have been false. There are many different interpretations of Descartes’ account of CET, three of which are especially worth noting. On one reading of CET, popularized by Frankfurt (1977), to claim that God freely created the eternal truths is to claim that all truths, even the eternal truths, are contingent. A second interpretation, popularized by Curley, holds that Descartes is not denying that the eternal truths are necessary, but rather denying that they are necessarily necessary (Curley 1984). Finally, a third interpretation, defended by Kaufman, holds that Descartes’ claim that God freely created the eternal truths is simply the claim that there were no factors, independent of God, that compelled God to create the eternal truths as they were in fact created. Kaufman argues, however, that this is consistent with these truths being necessarily true (Kaufman 2002, p. 38). What is central to the doctrine on all three readings is that there is a legitimate sense in which God was free in creating the eternal truths and had the capacity to create them differently (or at least not to create them).
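The contrast between the three readings can be rendered, somewhat anachronistically, in modal notation. The following is only an interpretive sketch, not notation used by Descartes or the commentators cited; let $p$ range over the eternal truths and $\Box$ express necessity:

```latex
\begin{align*}
&\textbf{Frankfurt:} && \neg\Box p
  && \text{even the eternal truths are contingent.}\\
&\textbf{Curley:} && \Box p \;\wedge\; \neg\Box\Box p
  && \text{necessary, but not necessarily necessary.}\\
&\textbf{Kaufman:} && \Box p \;\wedge\; \neg\exists c\,\big(c \neq \mathrm{God} \wedge \text{$c$ compelled God to create $p$}\big)
  && \text{necessary, and God's creation of $p$ unconstrained.}
\end{align*}
```

On this rendering, all three readings preserve the shared core of the doctrine: nothing independent of God fixed the eternal truths in advance of God’s creative act.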

While Arnauld never explicitly addresses CET, whether he endorses it is the focus of much Arnauld scholarship. Interpretations vary from André Robinet’s claim that “Arnauld offers a tranquil Cartesianism, which is not confused by the eternal truths” (Robinet 1991, p. 8), to Nadler’s suggestion that (depending on one’s interpretation of Descartes) Arnauld could be “seen as going further than Descartes, recognizing the truly radical potential of the Cartesian doctrine [CET] and embracing it” (Nadler 2008, p. 536), to Emmanuel Faye’s interpretation that “Arnauld’s doctrinal position” concerning the creation of the eternal truths “is not Cartesian but Thomist, even if it is, in actuality, a nuanced Thomism” (Faye 2005, p. 208), to Tad Schmaltz’s claim that “Arnauld never did take a stand on [the creation of the eternal truths]” (Schmaltz 2002, pp. 15-16).

Before discussing whether Arnauld adopted CET, three other views about God that Arnauld endorses should be addressed, namely the Principle of Divine Incomprehensibility (PDI), the Doctrine of Equivocity and the Doctrine of Divine Simplicity. All three of these views are held by Descartes (although they are not necessarily unique to Cartesianism, and attributing the Doctrine of Equivocity to Descartes is controversial). Further, all three views are relevant to Descartes’ acceptance of CET and to Arnauld’s account of God.

The Principle of Divine Incomprehensibility is the thesis that:

PDI:    (a) We ought to believe everything that is revealed by God.

(b) The fact that we cannot understand how something revealed by God can come about should not deter us from believing it because a finite intelligence ought not to expect to understand the nature or causal power of an infinite being. (Kremer 1996, p. 84)

In the Examen Arnauld directly endorses PDI:

Whatever God has been pleased to reveal to us about himself or about the extraordinary effects of his omnipotence ought to take first place in our belief even though we cannot conceive it, for it is not strange that our mind, being finite, cannot comprehend what an infinite power is capable of. (OA 38, 90/E 12/trans. Kremer 1996, p. 77)

In this passage Arnauld explicitly defends both components of PDI. We should believe everything revealed to us by God and we should believe it even if we cannot understand it because we ought not expect our finite intellect to be able to understand the infinite. In fact, Arnauld holds a view stronger than PDI, claiming that things God reveals to us ought to be placed first in our beliefs (see Kremer 1996, p. 88 fn. 5). Arnauld’s claim here is without doubt related to axiom A1 and his discussion of God and God’s omnipotence from section 3. Arnauld takes it that we know that God is omnipotent and that we should believe everything revealed to us by God even if we cannot understand how it can be true.

Arnauld also endorses the Doctrine of Equivocity, the thesis that predicates, when applied to God and to finite things, have at most equivocal meanings and never univocal meanings. That is, to say that God is powerful and to say that Alexander the Great is powerful are to say entirely different things about God and Alexander the Great: “powerful” means something entirely different when applied to God than when applied to Alexander the Great. Properties possessed by God and properties possessed by finite things are at best equivocally similar. At the conclusion of a discussion about the Eucharist (see section 1b) and its relation to Cartesian philosophy in the Examen, Arnauld adds:

But I add before finishing…nothing would be more unreasonable than to hold that philosophers, who have the right to follow the light of reason in the human sciences, are required to take what is incomprehensible in the mystery of the Incarnation as a rule for their opinion when they attempt to explain the natural union of the soul with the body, as if the soul could do with regard to the body what the eternal Word could do with regard to the humanity he took on, even though the power, as well as the wisdom, of the eternal Word is infinite, while the power of the soul over the body to which it is joined is very limited. We would not have those thoughts which mix up everything in philosophy and theology, if we were more convinced of the clear and certain maxim that Cardinal Bellarmine used against the quibbles of the Socinians: “No inference can be made from the finite to the infinite,” or as others put it “there is no proportion [proportio] between the finite and the infinite”. (OA 38 175/E 141, trans. Carraud 1996, p. 9)

In this passage, Arnauld is defending the Doctrine of Equivocity. Arnauld cites as a “clear and certain maxim” that no inference can be made from the finite to the infinite, and considers it the same claim as there being “no proportion [proportio] between the finite and the infinite.” While attributing the Doctrine of Equivocity to Arnauld is controversial, to claim that there is no proportio (most naturally translated “proportion,” though it can also mean “analogy” or “symmetry”) between the finite and the infinite is at least prima facie to claim that the properties of God and those of created things are in no way similar. [See Nadler (2008), p. 533 and Carraud (1995), pp. 122-123 for a discussion of equivocity in Arnauld.] At the very least, Arnauld flatly denies the claim that predicates apply to God and creatures univocally. For properties to apply to God and creatures univocally is for those properties to be essentially the same kind of properties. Although God is infinite and creatures finite, according to the Doctrine of Univocity, what it is for God to have knowledge and for creatures to have knowledge is roughly the same (see Nadler 2008b, pp. 204-207 and Moreau 2000, p. 104).

Arnauld also endorses the Doctrine of Divine Simplicity, the view that God is absolutely simple. For Arnauld, to claim that God is simple is, among other things, to deny any real distinction between God’s will, God’s understanding and God. God’s will and understanding are one and the same thing; God’s will just is God’s understanding. In addition, on Arnauld’s version of divine simplicity, God’s will and understanding do not perform different functions in God’s action and creation. While the Doctrine of Divine Simplicity was ubiquitous among Arnauld’s contemporaries and predecessors, many seem not to require this latter claim as a component of divine simplicity. Arnauld’s account of divine simplicity appears both in the correspondence with Leibniz and in the debate with Malebranche. In the latter, for example, Arnauld offers this account of divine simplicity while arguing against Malebranche’s conception of God. In discussing God’s creation, Malebranche describes God as “consulting his wisdom.” In the Treatise Malebranche explains: “The wisdom of God reveals to him an infinity of ideas of different works, and all the possible ways of executing his plans” (OC 5 28/R116; for a nice summary of Malebranche’s theodicy, see Rutherford 2000). Thus, Malebranche has an account on which God consults his understanding, considers the many different ways that he could create, and selects one.

Arnauld objects to this conception of God and God’s action in many ways. For example, Arnauld claims it is not appropriate to say that God: “’consults his wisdom’, and it is from there that it happens that all that He wills is wise.” For, Arnauld claims, God does not need to consult his wisdom in order to be wise, “everything that he wills” is “essentially wise as soon as he wills it” (OA 39 578/Nadler 2008, p. 531). Arnauld further claims that: “Can there be a thought more unworthy of God than to imagine such a disagreement between his wisdom and his will as if his will and his wisdom were not the same thing?” (OA 39 748/Moreau 2000, p. 103, trans. Nadler). In these passages, Arnauld is articulating his conception of God such that God’s different faculties do not play different roles in creation; God is absolutely simple. [See also Nelson (1993), pp. 685-688; Moreau (1999), especially pp. 280-286; Moreau (2000), especially section 4.3, pp. 102-104; and Nadler (2008), p. 533, for discussion of Arnauld and divine simplicity]

There is strong evidence that Arnauld endorsed the Doctrine of Divine Simplicity, the Doctrine of Equivocity and PDI. In fact, the texts seem to demand that Arnauld held divine simplicity and PDI, and that he at least strongly supported the Doctrine of Equivocity. Returning to whether Arnauld also adopted the Doctrine of the Creation of the Eternal Truths, the dominant view is that Arnauld denied CET (see for example Kremer 1996, Gouhier 1978, and Robinet 1991). Carraud (1996) and Schmaltz (2002) have argued either that Arnauld never made up his mind, or at least that the texts give us no answer. Faye argues that what is “created” about the eternal truths is not that they depend on the free decrees of God; rather, the eternal truths are created on account of the fact that the truth is known by the human intellect, which is a created intellect (Faye 2005, pp. 206-207). Moreau (1999) has offered a very compelling case, endorsed and developed by Nadler (2009), that Arnauld did in fact accept CET. While this is too difficult a question to answer adequately here, what is clear is that Arnauld holds the same underlying conception of God (for example, a strong notion of divine simplicity, Equivocity and PDI) that Descartes relies on in his exposition and defense of CET, and that his conception of God at least approaches the Cartesian God (see Nadler 2009, p. 533). Arnauld explicitly claims in his account of PDI that we cannot understand the causal power of an infinite being, and in his defense of Equivocity argues that finite beings and God bear no proportion to one another. Finally, in his account of divine simplicity, Arnauld denies that God understands conceptually prior to God’s willing certain states of affairs.

Given PDI, it seems that Arnauld would at least not positively claim that there are things that God cannot do (for example, make 2+3=6), and given Equivocity and divine simplicity, it seems that Arnauld would not conceive of God creating in the way that Malebranche and Leibniz, for example, would hold, namely by understanding all the possible ways of creating conceptually prior to willing to create. In the Logic (see section 3), Arnauld denies that we can understand God’s omnipotence. Put all of these aspects of Arnauld’s conception of God together, and it seems that Arnauld’s conception of God at least approaches CET.

b. Theodicy

Arnauld’s most systematic contributions to the question of theodicy come in his criticisms of Malebranche’s theodicy. As such, it will be beneficial to first briefly examine Malebranche’s theodicy before considering Arnauld’s response and account. Malebranche begins by citing two features of an infinitely perfect being: a wisdom that has no limits and a power that nothing is capable of resisting (R 116/OC V 27). Malebranche’s God surveys all of the possible universes that God could create and chooses to create the one that is the best reflection of God’s attributes. Malebranche claims:

An excellent workman should proportion his action to his work; he does not accomplish by quite complex means that which he can execute by simpler ones; he does not act without an end, and never makes useless efforts. From this one must conclude that God, discovering in the infinite treasure of his wisdom an infinity of possible worlds (as the necessary consequences of the laws of motion which he can establish), determines himself to create that world which could have been produced and preserved by the simplest laws, and ought to be the most perfect, with respect to the simplicity of the ways necessary to its production or to its conservation. (R/OC V 28)

The aim in God’s creation of the world is not to create the world with the least amount of evil, but the world that is the best reflection of God and God’s attributes. God, being infinitely wise, ought to create a world that is produced and preserved by the simplest laws. God acts, as Malebranche likes to say, nearly always by ‘general volitions’ and very rarely by ‘particular volitions’. When confronted with evil in the world, God did not will that that evil exist by a particular volition, but only as a consequence of the general laws that God has willed due to their simplicity, that is, by a general volition. Malebranche goes so far as to say that “God’s wisdom in a sense renders Him impotent; for since it obliges Him to act by the most simple ways, it is not possible for all humans to be saved” (OC 5 47, trans. Nadler 2008, p. 525).

Many things about Malebranche’s theodicy upset Arnauld. Only a few of Arnauld’s more prominent objections shall be discussed here. First and foremost, Arnauld objects to Malebranche’s theodicy because it limits God’s power. Arnauld was adamant in his defense of divine power, so much so that he even criticizes Descartes, whose divine voluntarism is well-known, for limiting God’s power. In the New Objections to Descartes’ Meditations, Arnauld argues concerning Descartes’ claim that a vacuum is impossible: “I would rather acknowledge my ignorance than convince myself that God necessarily conserves all bodies, or at least that he cannot annihilate any one of them unless he at once creates another” (K 188/OA 38 75). Arnauld’s insistence on a strong conception of divine omnipotence continued throughout his career. He was critical of Malebranche’s theodicy because, he thought, it limits God’s power substantially. In the Reflexions, for example, Arnauld objects to Malebranche’s claim that God must choose the world with the simplest laws. In passages pointed out by Moreau (2000, p. 103), Arnauld claims of Malebranche’s theodicy and account of God’s action: “it is indeed strange that someone should so easily take the liberty to provide arbitrary boundaries to the freedom of God” and “He fears not to set limits to the freedom of God…He fears not to proclaim that…God will have no freedom in his choice of ways necessary for the execution of his designs…but on what basis could a doctrine so injurious to divine freedom be grounded?” (OA 39 603 and OA 39 594-595/both translations are Nadler’s and appear in Moreau 2000, p. 103). For Arnauld, such language not only undermines God’s omnipotence but also humanizes his ways of acting. Arnauld claims: “All means for executing his designs are equally easy for God…and his power so renders him master of all things and so independent of the need for help from others, that it suffices that he will for his volitions to be executed” (OA 39 189-190/trans. Nadler 1996, p. 154, who points to this passage). Arnauld objects to Malebranche’s theodicy because he sees it as putting unnecessary and inappropriate constraints on God’s action and power.

Arnauld also objects to Malebranche’s theodicy because Malebranche admits that evil exists in the world. According to Malebranche, there are genuine evils in the world and these evils are allowed to exist by God. In fact, as noted above, Malebranche is an occasionalist, and thus it seems that God is causally responsible for evil. Arnauld, on the other hand, rejects this claim, insisting that apparent evils in the world are only appearances. Arnauld claims God’s work has no faults: “There are no faults in the works of God.” According to Arnauld, if we “consider the whole” world, or perhaps better, were we able to consider the whole, we would recognize that everything contributes to the overall beauty of the universe (OA 39 205/see also Nadler 1996, p. 156 who points to this passage).

Finally, Arnauld attacked Malebranche’s claim that God acts primarily by “general volitions” and very rarely by “particular volitions”. In a section of Malebranche’s Treatise added in a later edition to clarify the distinction (no doubt prompted by Arnauld’s objections), Malebranche defines what it means for God to act by a general volition and by a particular volition.

I say that God acts by general volitions, when he acts in consequence of general laws which he has established. For example, I say that God acts in me by general volitions when he makes me feel pain at the time I am pricked; because in consequence of the general and efficacious laws of the union of the soul and the body which he has established, he makes me feel pain when my body is ill-disposed. (R 195/OC V 147, I have replaced Riley’s “wills” with “volitions” for the French “volontez”)

He continues:

I say on the contrary that God acts by particular volitions when the efficacy of his volition is not determined at all by some general law to produce some effect. Thus, supposing that God makes me feel the pain of pinching without there happening in my body, or in any creature whatsoever, any changes which determine him to act in me according to general laws—I say then that God acts by particular volitions. (R 195/OC V 147-148, I have replaced “wills” and “will” with “volitions” and “volition”, respectively for the French “volontez” and “volonté”)

Arnauld argues that Malebranche’s claim that God acts almost always by general volitions destroys God’s providential concern for creation. According to Arnauld, God acts according to general laws, but does not act by general laws (see, for example, OA 39 175). Arnauld argues that, in fact, God has a volition for every state of affairs in the world. Arnauld’s objection relies on interpreting Malebranche’s account of God’s acting by general volitions as one in which the contents of God’s general volitions are general laws. For example, on this reading the content of God’s volition might be “For all created substances x and y and for every time t1, if x is G at t1 and x bears relation R to y then there is a time t2 such that t2 bears relation T to t1 and y is F at t2” (this example is taken from Black 1997, p. 34). Thus, on this reading, general volitions have contents which are laws. It is not clear that Arnauld interpreted Malebranche correctly in this regard. Indeed, a second plausible interpretation is that when Malebranche claims that God acts by general volitions, he means that God has particular-content volitions in accordance with the natural laws (see Pessin 2001, for this very helpful way of explaining the distinction). On this reading, Malebranche’s God has a volitional content for every particular state of affairs of the world, but God’s volitional contents proceed according to general laws. If this latter reading is correct, then Arnauld’s objection seems to lack force, since Malebranche would then hold, with Arnauld, that God acts according to general laws rather than by them. What is clear is that Arnauld holds that God has a volitional content for every particular state of affairs of the world.

[For discussions of the correct interpretation of Malebranche in this respect see: Pessin (2001); Nadler (2000); Black (1997); and Stencil (2011)]
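The contrast between the two readings of “acting by general volitions” can be put schematically. The following is only an interpretive sketch, not notation found in the texts; write $W(\cdot)$ for “God wills that” and $L$ for a general law of the kind in the schema from Black quoted above:

```latex
\begin{align*}
&\textbf{Reading 1 (Arnauld's):} && W(L)
  && \text{the content of God's volition is the law itself.}\\
&\textbf{Reading 2 (Pessin):} && \forall s\,\big(\text{$s$ obtains} \rightarrow W(s)\big)
  && \text{God wills each particular state of affairs $s$,}\\
& && && \text{and the willed states jointly conform to } L.
\end{align*}
```

On Reading 1 there is one volition whose content is general; on Reading 2 there are many particular-content volitions whose pattern is general, which is why, on that reading, Arnauld’s providence objection loses its target.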

c. Arnauld’s God

If we combine Arnauld’s commitments concerning God from the two discussions above, we can see a distinctive conception of God emerge. First and foremost is Arnauld’s insistence that God (an infinite being) does not act in the way that created finite beings act. God’s understanding and will do not play different roles in creation or in God’s action. Having to consult one’s understanding before willing, having to balance competing desires, and being unable to achieve all of one’s ends are features of finite agents, not of God. God is in this sense not a being with practical rationality (see for example, Nadler 2008, p. 518). Second, God created no evil (at least as a positive thing in the world). If we were capable of understanding creation in its entirety, we would see that everything contributes to the greatness and beauty of the universe. And finally, from Arnauld’s objections to Malebranche’s theodicy, we can see that Arnauld holds that although God acts according to general laws, God does not act by general laws. God acts directly in every particular situation.

Before moving on, we are now in a position to consider Moreau’s suggestion as to why Arnauld begins his attack on Malebranche’s theodicy with an attack on Malebranche’s theory of ideas. From Arnauld’s perspective, Malebranche’s theory of ideas (especially his Vision in God doctrine) and his theodicy are closely related. According to the Vision in God doctrine, created beings have ideas that are in a sense in God, and Malebranche’s God acts much as created beings do, for example, by consulting his understanding. As Moreau (2000, pp. 104-106) argues, Arnauld sees both of these as resting on an illegitimate understanding of God and an endorsement of the Doctrine of Univocity. According to Malebranche, God has ideas that are in a sense accessible to us and acts in much the same way we do. Both claims are problematic for Arnauld, as he denies univocity between God and created beings; indeed, if the above account is correct, he endorses equivocity.

6. Modality

In the Leibniz-Arnauld correspondence, Arnauld not only offers many important objections to Leibniz’s view (some of which were outlined in section 1), but he also offers a positive account of modality. The best way to approach Arnauld’s account is to consider one more of his objections to Leibniz.

Arnauld’s initial objection in his first letter revolves around Leibniz’s endorsement of the Complete Concept Theory of Substance (CCS). As Leibniz explains in the outline sent to Arnauld:

Since the individual concept of each person contains once for all everything that will ever happen to him, one sees in it a priori proofs or reasons for the truth of each event, or why one event has occurred rather than another. (M 5/G II 12)

Leibniz defends this view in the Discourse on Metaphysics:

We are able to say that this is the nature of an individual substance or of a complete being, namely, to afford a conception so complete that the concept shall be sufficient for the understanding of it and for the deduction of all the predicates of which the substance is or may become the subject. (D 13/G IV 349/AG 41)

CCS is the view that it is the nature of an individual substance to have a complete concept such that any true predication which can be made of that substance is contained in that substance’s concept. According to this view, not only is being a man contained in the concept of Caesar, but also that he crossed the Rubicon, that he was the subject of a biography by Plutarch and every other particular event in his life.

Arnauld adamantly objects to this principle:

If that is so, God was free to create or not create Adam; but supposing he wished to create him, everything that has happened since and will ever happen to the human race was and is obliged to happen through a more than fatal necessity. For the individual concept of Adam contained the consequence that he would have so many children, and the individual concept of each of these children, everything that they would do and all the children they would have: and so on. There is therefore no more liberty in God regarding all that, supposing he wished to create Adam, than in maintaining that God was free, supposing he wished to create me, not to create a nature capable of thought. (M 9/G II 15)

As with many of his objections to Malebranche, Arnauld is concerned that Leibniz’s view limits God’s power. Arnauld argues that once God has decided to create Adam, everything else that happens follows through necessity. This is because what is contained in Adam’s complete concept entails every other fact about the universe. For example, Adam’s complete concept entails not only that Caesar exist, but that Plutarch compose a treatise on Caesar. The reason, or at least so Arnauld seems to take it, is that it is a true predication of Adam that Adam exists a certain number of years before Caesar. Since this is a true predication of Adam, and a complete concept of Adam (according to Leibniz) is sufficient to deduce all of Adam’s predicates, Caesar is either a part of Adam’s concept or straightaway follows from Adam’s concept. Arnauld finds this troubling, not least because he sees it as a threat to God’s omnipotence. Once God creates Adam, Arnauld claims, God is bound to create Caesar and everything else that God creates through a “more than fatal necessity.” Surely, Arnauld would claim, God could have created Adam and chosen not to create Caesar, or at least God could have created Caesar and Plutarch such that Plutarch wrote a biography of Aristotle instead. Arnauld treats this as an unacceptable limitation on God’s power.

Leibniz responds, and among other things, offers his account of creation. Leibniz sums up the relevant part of his account of creation in the Monadology: “There is an infinity of possible universes in God’s ideas, and since only one of them can exist, there must be a sufficient reason for God’s choice which determines him towards one thing rather than another” (AG 220/G IV 615-616). Much like Malebranche, Leibniz considers that God’s creation takes place by God considering the possible ways that God could create. God’s decision to create is not piecemeal, as Leibniz claims: “one must not consider God’s will to create a particular Adam separate from all the others which he has regarding Adam’s children and the whole of the human race” (M 14/G IV 19). Leibniz continues that had God wanted to create a world with “Adam” and a different posterity, God would have created a different Adam with a different complete concept. Leibniz holds that God created in the best possible way from an infinite number of options. In these infinite different universes that God could create, there are many different “Adams” and in some of those worlds there are “Caesars” and in some there are not.

Ultimately, Arnauld concedes to Leibniz’s account of a complete concept, but the concession is only a terminological one. It turns out the real debate between the two is about the nature of possibility. As Arnauld tells Leibniz:

It was more than enough to make me decide to confess to you in good faith that I am satisfied by the way you explain what had at first shocked me regarding the concept of an individual nature…I see no other difficulties remaining except on the possibility of things, and this way of conceiving of God as having chosen the universe that he has created amongst an infinite number of other possible universes that he saw at the same time and did not wish to create. But as that strictly has no bearing upon the concept of the individual nature and since I should have to ponder too long to make clear what my views on the subject are, or rather what I take exception to in the ideas of others, because they do not seem to me to be worthy of God, you will think it advisable, Sir, that I say nothing about it. (M 77/G II 64)

As convincingly argued by Alan Nelson (1993), the debate between Leibniz and Arnauld is a debate between a possibilist (Leibniz) and an actualist (Arnauld). Possibilism is the view that there is an ontic distinction between two types of existing things: actual things and possible things. Actualism is the view that the only things that exist are actual things. According to Leibniz, prior to the creation of the universe, God considers all of the possible universes that God could create, chooses the best of those worlds and creates it. The world that God creates is the actual world. In addition to the actual world, there are an infinite number of merely possible worlds, which exist, but only in God’s mind. In the above passage Arnauld offers his disagreement with this way of conceiving of God’s creating: “I see no other difficulties remaining except on…this way of conceiving of God as having chosen the universe that he has created amongst an infinite number of other possible universes.” Nelson (1993, p. 686) has helpfully suggested keeping Arnauld’s endorsement of the Doctrine of Divine Simplicity in mind here. As argued in section 5a, Arnauld held a view of divine simplicity on which God’s different faculties do not play different roles in God’s acting. Leibniz’s God, given that he surveys his understanding prior (at least conceptually) to his willing to create, has faculties participating in God’s action in different ways. This Arnauld sees as an unacceptable way to conceive of God, and so he denies that there are uncreated possible worlds in God’s mind.

After Arnauld objects to Leibniz’s account of creation and the existence of possible worlds in God’s understanding on account of its reliance on a view whereby God’s understanding and will are distinct (see M 27-33/G II 28-33), Arnauld then offers an actualist account:

I confess in good faith that I have no conception of these purely possible substances, that is to say, the ones that God will never create. And I am very much inclined to think that they are figments of the imagination that we create, and that what we call possible, purely possible, substances, cannot be anything other than God’s omnipotence, which being a pure act does not permit the existence in it of any possibility [et que tout ce que nous appelons substances possibles, purement possibles, ne peut être autre chose que la toute-puissance de Dieu, qui étant un pur acte ne souffre point qu’il y ait en lui aucune possibilité]. (M 31/G II 31-32)

Arnauld continues:

But one can conceive of possibilities in the natures which [God] has created…as I can also do with an infinite number of modifications which are in the power of these created natures, such as the thoughts of intelligent natures and the forms of extended substance. (M 31-32/G II 32, see also Nelson 1993, p. 685 who points to this passage)

The view suggested by Arnauld is that all that exists are God and actual substances and all possibility exists in virtue of the modal properties or “powers” of actually existing substances. Prior to God’s creation, there are no ‘possible worlds’ in God’s understanding from which God chooses to create. Instead, God creates all actual things and possibility is grounded in these actual things and their powers. For example, Arnauld would claim, it is possible for Descartes to have written 7 meditations, instead of 6. This is not because God could have chosen to create a Descartes who wrote 7 meditations, but because it is in the power of the actually created Descartes that he could have written a 7th meditation.

7. Conclusion

Antoine Arnauld was a major figure in the intellectual landscape of the latter part of the 17th century. Although he has not received the attention given to others, such as Descartes, Leibniz and Spinoza, Arnauld is finally starting to receive the attention he deserves from scholars. Arguably the most philosophically gifted Cartesian of the latter part of the 17th century, he explicitly defended Cartesianism even after Descartes’ works had been placed on the Index. His Objections to Descartes’ Meditations are arguably the best of any set. He went on to co-author an extremely influential logic text with Pierre Nicole. His correspondence with Leibniz, in which he simultaneously pushes Leibniz to explain his position, offers insightful criticisms of Leibniz’s views, and makes philosophical contributions of his own, is one of the most sophisticated and philosophically interesting texts of the period. Finally, Arnauld was a major player in one of the most important intellectual events of the 17th century, the Malebranche-Arnauld polemic, which resulted in many works from both parties. There is much of interest in Arnauld that scholars are just beginning to investigate. While the lack of English translations of many of his major texts remains a barrier to Arnauld’s receiving more attention in the English-speaking world, Arnauld scholarship will no doubt continue to grow, and Arnauld will finally receive the attention deserved by one of his influence and acumen.

8. References and Further Reading

  • Primary Texts
  • Arnauld, Antoine, (1775-1783) Oeuvres de Messire Antoine Arnauld, 43 vols., Paris: Sigismond D’Arnay, (abbreviated OA).
    • A collection of Arnauld’s works in their original language (Latin and French).
  • Arnauld, Antoine, and Claude Lancelot, (1975) The Port Royal Grammar, Jacques Rieux and Bernard E. Rollin, trans., Paris: Mouton, (abbreviated PRG).
    • English translation of the Port-Royal Grammar.
  • Arnauld, Antoine, (1990) “New Objections to Descartes’ Meditations and Descartes’ Replies”, Elmar Kremer, trans, Lewiston: The Edwin Mellen Press, (abbreviated as NO).
    • English translation of the letters exchanged by Descartes and Arnauld in 1648.
  • Arnauld, Antoine, (1990b) On True and False Ideas, Elmar Kremer, trans., Lewiston: The Edwin Mellen Press, (abbreviated as K)
    • English translation of On True and False Ideas.
  • Arnauld, Antoine, and Pierre Nicole, (1996) Logic or the Art of Thinking, Jill Vance Buroker, trans., Cambridge: Cambridge University Press, (abbreviated as B).
    • English translation of the Port-Royal Logic.
  • Arnauld, Antoine, (1999) Examen du Traité de l’Essence du corps contre Descartes, Libraire Arthème Fayard, (abbreviated E).
    • A recent edition of the Examen. This edition is French (as was the original). No English translations of this work are available.
  • Arnauld, Antoine, (2001) Textes philosophiques, Denis Moreau, trans., Paris: Presses Universitaires de France, (abbreviated as TP).
    • Collection of some of Arnauld’s works including: Philosophical Conclusions (in the original Latin with French translation, no English translation available); excerpt of Quod est nomen Dei (in original Latin with French translation, no English translation available); Dissertation en Deux Parties (French translation from the Latin, Latin not included, no English translation available); and Règles du Bon Sens (French, no English translation available).
  • Descartes, Rene, (1974-1989) Oeuvres de Descartes, 11 vols., Charles Adam and Paul Tannery, eds., Paris: Vrin, (abbreviated as AT).
    • Descartes’ collected works in their original language (French and Latin).
  • Descartes, Rene, (1984-1985) The Philosophical Writings of Descartes, 2 vols., John Cottingham, Robert Stoothoff, and Dugald Murdoch, trans., Cambridge: Cambridge University Press, (abbreviated as CSM I and CSM II).
    • Two-volume collection, with standard English translation of many of Descartes’ works.
  • Descartes, Rene, (1991) The Philosophical Writings of Descartes: Volume III, John Cottingham, Robert Stoothoff, Dugald Murdoch, and Anthony Kenny, trans., Cambridge: Cambridge University Press, (abbreviated as CSMK III).
    • Third volume and companion to CSM I and II. This volume is a collection of many of Descartes’ letters.
  • Leibniz, Gottfried, (1875-1890) Die Philosophischen Schriften von G. W. Leibniz, ed. C.I. Gerhardt, 7 vols (Berlin: Weidmann), (abbreviated as G).
    • A collection of Leibniz’s complete works in their original language (Latin, French and German).
  • Leibniz, Gottfried, (1967) The Leibniz-Arnauld Correspondence, trans. H. T. Mason, (Manchester: Manchester University Press). (abbreviated as M)
    • English translation of the Leibniz-Arnauld correspondence.
  • Leibniz, Gottfried, (1980) Discourse on Metaphysics/Correspondence with Arnauld/Monadology, trans. George Montgomery, La Salle: The Open Court Publishing Company, (abbreviated as D).
    • English translation of Leibniz’s Discourse on Metaphysics, Monadology and the Leibniz-Arnauld correspondence. More readily available than the Mason translation.
  • Leibniz, Gottfried, (1989) Philosophical Essays, eds. and trans. Roger Ariew and Daniel Garber, Indianapolis: Hackett, (abbreviated as AG).
    • One volume collection of many of Leibniz’s most important works in English.
  • Malebranche, Nicolas, (1958-1967) Oeuvres Complètes de Malebranche, Directeur A. Robinet, 20 volumes (Paris: J. Vrin), abbreviated OC. Citation as follows: (OC, volume, page).
    • Malebranche’s collected works in their original language (French).
  • Malebranche, Nicolas, (1992) Philosophical Selections, ed. Nadler, Indianapolis: Hackett, (abbreviated as PS).
    • Collection of Malebranche selections in English.
  • Malebranche, Nicolas, (1992) Treatise on Nature and Grace, trans. Patrick Riley, Oxford: Clarendon Press, (abbreviated as R).
    • English translation of Malebranche’s Treatise on Nature and Grace.
  • Voltaire, (1906) Siècle de Louis XIV (Paris: Hachette)
    • Voltaire’s history of the Age of Louis XIV.
  • Primary Source Abbreviations
  • Antoine Arnauld
  • B         (1996) and Pierre Nicole, Logic or the Art of Thinking, Jill Vance Buroker, trans., Cambridge: Cambridge University Press.
  • E          (1999) Examen du Traité de l’Essence du corps contre Descartes, Libraire Arthème Fayard.
  • K         (1990b) On True and False Ideas, Elmar Kremer, trans., Lewiston: The Edwin Mellen Press.
  • NO      (1990) “New Objections to Descartes’ Meditations and Descartes’ Replies”, Elmar Kremer, trans, Lewiston: The Edwin Mellen Press.
  • OA      (1775-1783) Oeuvres de Messire Antoine Arnauld, 43 vols., Paris: Sigismond D’Arnay.
  • PRG    (1975) and Claude Lancelot, The Port Royal Grammar, Jacques Rieux and Bernard E. Rollin, trans., Paris: Mouton.
  • TP        (2001) Textes philosophiques, Denis Moreau, trans., Paris: Presses Universitaires de France.
  • Rene Descartes
  • AT       (1974-1989) Oeuvres de Descartes, 11 vols., Charles Adam and Paul Tannery, eds., Paris: Vrin.
  • CSM I (1984-1985) The Philosophical Writings of Descartes, 2 vols., John Cottingham, Robert Stoothoff, and Dugald Murdoch, trans., Cambridge: Cambridge University Press.
  • CSMK III (1991) The Philosophical Writings of Descartes: Volume III, John Cottingham, Robert Stoothoff, Dugald Murdoch, and Anthony Kenny, trans., Cambridge: Cambridge University Press.
  • Gottfried Leibniz
  • AG      (1989) Philosophical Essays, eds. and trans. Roger Ariew and Daniel Garber, Indianapolis: Hackett.
  • D         (1980) Discourse on Metaphysics/Correspondence with Arnauld/Monadology, George Montgomery trans, La Salle: The Open Court Publishing Company.
  • G         (1875-1890) Die Philosophischen Schriften von G. W. Leibniz, ed. C.I. Gerhardt, 7 vols (Berlin: Weidmann).
  • M         (1967) The Leibniz-Arnauld Correspondence, trans. H. T. Mason, Manchester: Manchester University Press.
  • Nicolas Malebranche
  • OC      (1958-1967) Oeuvres Complètes de Malebranche, Directeur A. Robinet, 20 volumes (Paris: J. Vrin).
  • PS        (1992) Philosophical Selections, ed. Nadler, Indianapolis: Hackett.
  • R         (1992) Treatise on Nature and Grace, trans. Patrick Riley, Oxford: Clarendon Press.
  • Secondary Works
  • Article Collections on Arnauld in English:
  • The Great Arnauld and Some of His Philosophical Correspondents ed. Elmar Kremer (Toronto: Toronto University Press, 1990).
    • Contains two papers on Arnauld’s logic and scientific method, by Jill Vance Buroker and Fred Wilson respectively; five papers on Arnauld and Malebranche on ideas, by Monte Cook, Elmar Kremer, Steven Nadler, Richard Watson and Norman Wells respectively; two papers on the Leibniz-Arnauld correspondence, by Graeme Hunter and Jean-Claude Pariente respectively; and one paper on Arnauld’s account of grace and free will, by Kremer.
  • Interpreting Arnauld, ed. Elmar Kremer (Toronto: Toronto University Press, 1996).
    • Collection of papers on Arnauld including papers by Jill Vance Buroker, Alan Nelson, Peter Schouls, Thomas Lennon, Aloyse-Raymond Ndiaye, Elmar Kremer, Vincent Carraud, Graeme Hunter, Jean-Luc Solère, Steven Nadler, and Robert C. Sleigh, some of which will be highlighted in the bibliography below.
  • Books/Articles:
  • Black, Andrew, (1997) “Malebranche’s Theodicy”, Journal of the History of Philosophy 35, pp. 27-44.
    • Discussion of Malebranche’s theodicy, containing a defense of the ‘general content’ reading of the particular-general volitions distinction in Malebranche.
  • Buroker, Jill Vance, (1996) “Introduction” in Antoine Arnauld and Pierre Nicole, Logic or the Art of Thinking, Jill Vance Buroker, trans., Cambridge: Cambridge University Press, pp. ix-xxvi.
    • Helpful introduction to the Logic.
  • Carraud, Vincent, (1995) “Arnauld: From Ockhamism to Cartesianism” Descartes and His Contemporaries, in Ariew and Grene, ed., Descartes and His Contemporaries, pp. 110-128.
    • An examination of Arnauld’s early philosophy (pre-Fourth Objections) arguing that Arnauld’s early philosophy is in many respects Ockhamist.
  • Carraud, Vincent, (1996) “Arnauld: A Cartesian Theologian? Omnipotence, Freedom of Indifference, and the Creation of the Eternal Truths”, in Kremer, ed., Interpreting Arnauld, pp. 91-110.
    • A discussion of whether Arnauld held the Doctrine of the Creation of the Eternal Truths.
  • Curley, E.M., (1984) “Descartes on the Creation of the Eternal Truths”, The Philosophical Review 93,  pp. 569-597.
    • Paper focusing on Descartes and the eternal truths. Standard defense of the iterated modality reading.
  • Faye, Emmanuel, (2005) “The Cartesianism of Desgabets and Arnauld and the Problem of the Eternal Truths”, Garber, Daniel and Steven Nadler, eds., Oxford Studies in Early Modern Philosophy vol. 2 (Oxford: Clarendon Press), pp. 191-209.
    • Discussion of the Cartesianism of Arnauld (especially on the Creation of the Eternal Truths) and Robert Desgabets.
  • Frankfurt, Harry, (1977) “Descartes on the Creation of the Eternal Truths”, The Philosophical Review 86, pp. 36-57.
    • Paper focusing on Descartes and the eternal truths. Standard defense of the contingency reading.
  • Gouhier, Henri, (1978) Cartesianisme et augustinisme au XVII siècle, Paris: J Vrin.
    • Book-length study of Augustinianism and Cartesianism in the 17th century with discussion of many figures from the period including two chapters largely on Arnauld. In French.
  • Grene, Marjorie, (1998) Descartes, Indianapolis: Hackett Publishing Company.
    • Book on Descartes’ general philosophy with a chapter on ‘The Port Royal Connection’ with much discussion of Arnauld.
  • Jacques, Emile, (1976) Les années d’exil d’Antoine Arnauld (1679-1694) (Louvain: Publications Universitaires de Louvain and Editions Nauwelaerts).
    • Historical study of Arnauld especially from 1679-1694. In French.
  • Kaufman, Dan, (2002) “Descartes’s Creation Doctrine and Modality”, Australasian Journal of Philosophy 80, pp. 24-41.
    • Paper focusing on Descartes and the eternal truths. Defends the claim that Descartes held both that God freely created the eternal truths and that they are necessarily true.
  • Kremer, Elmar J., (1990) “Introduction”, On True and False Ideas, Elmar Kremer, trans., (Lewiston: The Edwin Mellen Press), pp. xi-xxxiv.
    • Helpful introduction to On True and False Ideas including a discussion of the Malebranche-Arnauld polemic.
  • Kremer, Elmar J., (1994) “Grace and Free Will in Arnauld”, in The Great Arnauld and Some of His Philosophical Correspondents, pp. 219-239.
    • A discussion of grace and free will in Arnauld arguing for a change in his view around 1684.
  • Kremer, Elmar J., (1996) “Arnauld’s Interpretation of Descartes as a Christian Philosopher”, in Interpreting Arnauld, pp. 76-90.
    • A discussion of several issues in which Kremer argues Arnauld departs from Descartes including on the nature of matter, the relation between mind and body and the eternal truths.
  • Lennon, Thomas M., (1977) “Jansenism and the Crise Pyrrhonienne”, Journal of the History of Ideas, 38, pp. 297-306.
    • Discussion of the relationship between Jansenism and skepticism, arguing that skepticism was characteristic of Jansenism.
  • Miel, Jan, (1969) “Pascal, Port-Royal, and Cartesian Linguistics”, Journal of the History of Ideas, 30, pp. 261-271.
    • Article questioning the extent to which the Port-Royal Logic and the Port-Royal Grammar are ‘Cartesian’.
  • Moreau, Denis, (1999) Deux Cartesiens: La Polemique Arnauld Malebranche, Paris: Librairie Philosophique J. Vrin.
    • Book-length study of the Malebranche-Arnauld polemic. No English translation is available. There is an English version of a paper summarizing some of the key aspects of the debate by the same author. See, Moreau (2000).
  • Moreau, Denis, (2000) “The Malebranche-Arnauld Debate”, in ed. Nadler, The Cambridge Companion to Malebranche, pp. 87-111.
    • Article-length summary of some of the main components of the Arnauld-Malebranche polemic, including a helpful chronology of the writings in the debate.
  • Nadler, Steven, (1988) “Arnauld, Descartes, and Transubstantiation: Reconciling Cartesian Metaphysics and Real Presence”, Journal of the History of Ideas, 49, pp. 229-246.
    • Examination of the role that the problem of transubstantiation played in Arnauld’s philosophy (both his ultimate endorsement of Cartesianism and his defense of Cartesianism).
  • Nadler, Steven, (1988b) “Cartesianism and the Port Royal”, Monist, 71, pp. 573-584.
    • Study of the relation between the Port Royal and Cartesianism especially focusing on three figures: Louis-Paul Du Vaucel, Le Maistre de Sacy and Pierre Nicole.
  • Nadler, Steven, (1989) Arnauld and the Cartesian Philosophy of Ideas, Princeton: Princeton University Press.
    • Book-length discussion of Arnauld’s theory of ideas, especially in terms of the Malebranche-Arnauld polemic.
  • Nadler, Steven, (1992) Malebranche and Ideas (New York: Oxford University Press)
    • Book-length study of Malebranche’s theory of ideas.
  • Nadler, Steven, (1995) “Occasionalism and the Question of Arnauld’s Cartesianism”, in Ariew and Grene, ed., Descartes and His Contemporaries, pp. 129-144.
    • Paper arguing that Arnauld was a mind-body occasionalist.
  • Nadler, Steven, (1996) “’Tange montes et fumigabunt’: Arnauld on the Theodicies of Malebranche and Leibniz”, in Kremer, ed. Interpreting Arnauld, pp. 147-163.
    • An examination of Arnauld’s criticisms of the theodicies of Leibniz and Malebranche arguing that from Arnauld’s perspective, despite their differences both suffer the same problems.
  • Nadler, Steven, ed., (2000b) The Cambridge Companion to Malebranche, New York: Cambridge University Press.
    • Collection of papers on Malebranche including papers on the Malebranche-Arnauld Debate, Malebranche’s Theodicy and Vision in God.
  • Nadler, Steven, (2008) “Arnauld’s God”, Journal of the History of Philosophy, 46, pp. 517-538.
    • Paper-length treatment of Arnauld’s conception of God arguing that Arnauld had a thoroughly voluntarist conception of God.
  • Nadler, Steven, (2008b) The Best of All Possible Worlds, New York: Farrar, Straus and Giroux.
    • Book-length (and especially accessible) discussion of the theodicies and interactions of Leibniz, Arnauld and Malebranche.
  • Ndiaye, Aloyse-Raymond, (1991) La Philosophie D’Antoine Arnauld, Paris: Librairie Philosophique J. Vrin.
    • Book-length general study of Arnauld’s philosophy. In French.
  • Ndiaye, Aloyse-Raymond, (1996) “The Status of the Eternal Truths in the Philosophy of Antoine Arnauld”, in Kremer, ed. Interpreting Arnauld, pp. 64-75.
    • Paper-length argument that Arnauld (at least early in his career) did not endorse the Doctrine of the Creation of the Eternal Truths.
  • Nelson, Alan, (1993) “Cartesian Actualism in the Leibniz-Arnauld Correspondence”, Canadian Journal of Philosophy 23, pp. 675-94.
    • Paper arguing that Arnauld endorsed an actualist theory of possibility in the Leibniz-Arnauld correspondence.
  • Parkinson, G. H. R., (1967) “Introduction”, in The Leibniz-Arnauld Correspondence, trans. H. T. Mason, (Manchester: Manchester University Press).
    • Relatively brief (about 50 pages) summary and analysis of the Leibniz-Arnauld correspondence.
  • Pessin, Andrew, (2001) “Malebranche’s Distinction Between General and Particular Volitions”, Journal of the History of Philosophy 39, pp. 77-99.
    • Defense of the ‘particular content’ interpretation of Malebranche’s distinction between general and particular volitions.
  • Robinet, Andre, (1991) Preface, in Ndiaye (1991), Paris: J. Vrin.
    • Brief preface to Ndiaye’s Antoine Arnauld.
  • Rozemond, Marleen, (1998) Descartes’s Dualism, Cambridge Mass.: Harvard University Press.
    • Book-length study of Descartes’ dualism including discussion of the Real Distinction argument and the mind-body union.
  • Rutherford, Donald, (2000) “Malebranche’s Theodicy”, in Steven Nadler ed., The Cambridge Companion to Malebranche (New York: Cambridge University Press), pp. 165-189.
    • Excellent overview of Malebranche’s theodicy.
  • Schmaltz, Tad, (1996) Malebranche’s Theory of the Soul, Oxford: Oxford University Press.
    • Book-length treatment of Malebranche’s theory of the soul and its relation to his Cartesianism. Includes many discussions of Arnauld and his interactions with Malebranche.
  • Schmaltz, Tad, (1999) “What Has Cartesianism to Do with Jansenism?”, Journal of the History of Ideas, 60, pp. 37-56.
    • This paper explores the relationship between Cartesianism and Jansenism, discussing both their theoretical relation and their historical relation in 17th century France.
  • Schmaltz, Tad, (2002) Radical Cartesianism, New York: Cambridge University Press.
    • Book-length investigation of the philosophy of two Seventeenth Century Cartesians: Robert Desgabets and Pierre-Sylvain Regis, with substantive discussions of Arnauld throughout.
  • Sedgwick, Alexander, (1977) Jansenism in Seventeenth-Century France, Charlottesville: University Press of Virginia.
    • Book-length study of Jansenism in 17th century France.
  • Sedgwick, Alexander, (1998) The Travails of Conscience, Cambridge: Harvard University Press.
    • Book-length historical study of the Arnauld family.
  • Sleigh, Robert C, (1990) Leibniz and Arnauld: A Commentary on Their Correspondence, New Haven: Yale University Press.
    • A book-length treatment of the Leibniz-Arnauld correspondence, including a chapter devoted to Arnauld and chapters focusing on the exchange between Leibniz and Arnauld on substance, freedom, and action.
  • Stencil, Eric, (2011) “Malebranche and the General Will of God”, British Journal for the History of Philosophy 19 (6), pp. 1107-1129.
    • Discussion of Malebranche’s conception of the general/particular volition distinction and argument focusing on Malebranche’s theodicy that Arnauld misunderstood the distinction.
  • Van Cleve, James, (1983) “Conceivability and the Cartesian Argument for Dualism”, Pacific Philosophical Quarterly 64, pp. 35-45.
    • Paper focusing on Descartes’ argument for mind-body substance dualism.
  • Watson, Richard A., (1987) The Breakdown of Cartesian Metaphysics, Atlantic Heights: Humanities Press.
    • Book-length discussion of the “downfall” (both historically and philosophically) of Cartesian Metaphysics covering many debates about Cartesianism after Descartes including discussions of Simon Foucher, Louis de La Forge and Arnauld.
  • Wilson, Margaret, (1978) Descartes (London and New York: Routledge).
    • Book-length general (and very influential) study of Descartes’ philosophy.
  • Yablo, Stephen, (1993) “Is Conceivability a Guide to Possibility?”, Philosophy and Phenomenological Research 53, pp. 1-42.
    • An examination of several objections to the claim that conceivability is a guide to possibility.

 

Author Information

Eric Stencil
Email: eric.stencil@uvu.edu
Utah Valley University
U. S. A.

Feng Youlan (Fung Yu-lan, 1895—1990)

Feng Youlan (romanized as Fung Yu-lan) was a representative of modern Chinese philosophy. Throughout his long and turbulent life, he consistently engaged the problem of reconciling traditional Chinese thought with the methods and concerns of modern Western philosophy. Raised by a modernist family who nonetheless gave him a traditional Confucian education, Feng pursued two great goals during his career: rewriting the history of Chinese philosophy from a modern perspective, and developing a reconstructed version of Chinese philosophy that could respond to the modern situation. The famous translator of Chinese philosophical texts, Wing-tsit Chan, summed up Feng’s contribution by saying: “At a time when Chinese intellectuals saw little value in the Chinese tradition, Professor Feng upheld it.”

Although Feng published his monumental and still-influential Zhongguo zhexue shi (History of Chinese Philosophy) in 1934, his effort to synthesize the traditional Neo-Confucianism of Cheng Hao, Cheng Yi, and Zhu Xi with the philosophical traditions of the modern West that he studied with John Dewey took much longer to complete and arguably was never entirely successful. Between 1939 and 1947, during the early years of China’s occupation by Japan, Feng published Xin lixue (New Teaching of Principle), in which he presented Chinese philosophy, particularly Confucianism, as valuable for addressing both “the True Realm” (corresponding to the transcendent, metaphysical world of what Confucians call li—“principle” or cosmic norms—and ti, “substance”) and “the Real Realm” (corresponding to the immanent, physical world of qi, “energy” or material forms, and yong, “function”). Feng contrasted Daoist philosophy, which he labeled “the philosophy of subtraction” (that is, inward-looking and seeking unity with nature), and Mohist and Western philosophy, which he described as “the philosophy of augmentation” (that is, outward-looking and seeking to master nature), with Confucian philosophy, which he saw as “the middle way” between so-called “other-worldly” and “this-worldly” philosophies.

Due to political upheaval in China following the defeat of Japan and the successful Communist revolution in the 1940s, Feng proved unable to continue this philosophical project and in many ways reversed direction. Telling Chinese Communist Party chairman Mao Zedong that he was “unwilling to be a remnant of a bygone age in a time of greatness,” Feng devoted himself to reinterpreting both his earlier work and Chinese philosophy in general through the lens of Marxist thought, which had become the orthodox ideology in mainland China under Mao’s rule, even going so far as to denounce Confucius during the final years of the Cultural Revolution (1966-1976). This effort provoked great controversy among his peers, many of whom had fled China to establish “New Confucian” enclaves within the universities of Hong Kong and Taiwan as well as in the West. Partly as a result of his controversial immersion in Maoist politics, Feng’s most enduring contribution to Chinese philosophy probably is his historical account of the subject rather than his own philosophical synthesis. Nonetheless, his work remains an interesting example of the way in which early 20th century Chinese thinkers attempted to develop a rational philosophical system that was credible to both Chinese and Western traditions.

Table of Contents

  1. Life
  2. Philosophical Works
    1. A Comparative Study of Life Ideals
    2. Six Books of Zhengyuan
  3. A History of Chinese Philosophy
  4. Influence
  5. References and Further Reading

1. Life

Feng Youlan, whose zi (“style”) or courtesy name was Zhisheng, was born on December 4, 1895, in Tanghe County, Henan Province, to an affluent and prominent family.  Feng’s father completed the highest level of study required by the Qing dynasty imperial civil service and headed the local Confucian academy, but also maintained a personal library that included books about the West and modern affairs. From the age of six, Feng pursued a private education in the Confucian curriculum typical of the time, but had little interest in traditional rote learning. At the age of fifteen, he began studying in the county middle school and one year later transferred to high school, first in the provincial capital of Kaifeng, and then in neighboring Hubei Province. At the age of seventeen, he was admitted to the preparatory class of the Chinese Public University in Shanghai, where all courses were taught with English textbooks and the curriculum included Western logic and philosophy.

Feng studied philosophy at Peking (Beijing) University from 1915 to 1918, during which time he was exposed to various forms of foreign thought, ranging from the philosophy of Henri Bergson to the “New Realism” associated with American philosophers such as W. P. Montague. Also during this time, the New Culture Movement, which questioned traditional Confucian values, was in full swing, provoking counter-reactions from conservative Confucian scholars. At Peking University, Feng met both Hu Shi (1891-1962), the iconoclast critic of Confucianism, and Liang Shuming (1893-1988), the New Confucian apologist and social activist. In 1919, he traveled to the United States and, like Hu Shi before him, studied with John Dewey at Columbia University. He was deeply impressed by the United States’ advanced technology, emphasis on commerce, and atmosphere of law and order, which helped to confirm his sense that the Chinese psyche was inward-looking, contemplative, and holistic, while the Western spirit was outward-looking, analytical, and reductionistic. He was among the first to introduce Liang Shuming’s work to America.

In 1923, Feng completed his Ph.D. at Columbia with a dissertation entitled A Comparative Study of Life Ideals. Looking back at this early work some sixty years later, Feng wrote:

I maintained that the difference between cultures is the difference between the East and the West. This was in fact the prevailing opinion at that time.  However, as I further studied the history of philosophy, I found this prevailing opinion to be incorrect. I discovered that what is considered to be the philosophy of the East has existed in the history of the philosophy of the West as well, and vice versa. I discovered that mankind has the same essential nature and the same problems of life… (Feng 2008: 658)

After completing his doctoral studies at Columbia, Feng returned to China and began his long teaching career there, having paid close attention to intellectual trends in China while he was abroad. Between 1923 and 1926, Feng held appointments in the philosophy departments of several Chinese universities, culminating in his being named Chair of the Philosophy Department and Dean of the College of Humanities at Beijing’s prestigious Tsinghua (Qinghua) University in 1927.

In 1934, Feng had a fateful encounter with Communist ideology that would influence the future course of his life. While en route to a philosophical conference in Prague, he stopped to visit the Soviet Union, where he found a grand social experiment in progress.  Although he was not blind to the imperfections of Soviet Communism, he also was attracted to its utopian possibilities, as he later proclaimed in public speeches. As a result of his visit to the Soviet Union, he was detained briefly by Jiang Jieshi’s (Chiang Kai-shek’s) Nationalist government, but upon his release established a close relationship with the regime, which then ruled China despite the revolutionary activities of Communist factions. The outbreak of war between China and Japan in 1937 inspired Feng to work for the patriotic recognition and preservation of China’s unique philosophical heritage.  Between 1937 and 1946, he published a series of books to help justify Jiang Jieshi’s New Life Movement, which sought to safeguard the prosperity of the nation by instilling traditional values into Chinese citizens’ daily lives and transforming their moral character.

Due to political upheaval in China following the defeat of Japan in 1945 and the successful Communist revolution in 1949, Feng proved unable to continue this philosophical project and in many ways reversed direction. Telling Chinese Communist Party chairman Mao Zedong that he was “unwilling to be a remnant of a bygone age in a time of greatness,” Feng renounced his association with the Nationalist regime and devoted himself to reinterpreting both his earlier work and Chinese philosophy in general through the lens of Marxist thought, which had become the orthodox ideology in mainland China under Mao’s rule. This new work provoked great controversy among his peers, many of whom had fled China to establish “New Confucian” enclaves within the universities of Hong Kong and Taiwan as well as in the West.

Despite his apparent conversion to doctrinaire Marxism, Feng attempted to preserve a place for Confucianism under Mao’s aegis, asserting that this and other Chinese philosophical traditions, despite their reactionary historical baggage, could be “conceptually and abstractly” relevant in the Communist era. For example, Feng argued that the Confucian concept of ren or “benevolence” had in the past served the purpose of making the oppressed and exploited masses lose sight of intense class struggle and had sedated them into believing a dream of an impossible classless and harmonious society. Feng believed, however, that if taken “abstractly,” ren for all people could be a helpful idea even in Mao’s “New China.”

During China’s Cultural Revolution (1966-1976) Feng, along with many other academics, was denounced by Mao. Near the end of this period, however, Feng was able to regain a measure of public influence by allying himself with Mao’s wife, Jiang Qing, who enlisted his aid in her propaganda campaign against Confucius between 1973 and 1976. After Mao’s death and Jiang’s arrest and imprisonment, Feng’s political decisions cost him both his mianzi (“face” or public reputation) and, for a short time, his freedom.  By the 1980s, Feng was able to return to teaching and began work on a new manuscript entitled A New History of Chinese Philosophy, which was incomplete when he died in 1990.

2. Philosophical Works

a. A Comparative Study of Life Ideals

In 1923, Feng wrote his doctoral thesis A Comparative Study of Life Ideals. As the title of his thesis suggests, the relationship between Chinese and Western cultures was the focal point of his philosophical thinking. According to Feng, all forms of philosophy fall into three categories: “the philosophy of subtraction,” “the philosophy of augmentation,” and “the middle way.” Philosophers who value the natural world and disdain human interference with nature wish to return to a state of innocence by reducing human artifice. In this camp Feng placed Laozi and Zhuangzi, both of whom advocated “abandoning humanity and righteousness” and “eliminating sagehood and wisdom.” On the other hand, Feng saw both Western thought and Mozi’s philosophy as encouraging the conquest and transformation of nature and thus belonging to “the philosophy of augmentation.” Finally, Confucianism as Feng sees it insists on cultivating a balanced relationship between humans and nature and is therefore the middle way.

b. Six Books of Zhengyuan

The so-called Six Books of Zhengyuan include A New Philosophy of Principle, New Discourses on Events, New Social Admonitions, A New Inquiry into Man, A New Inquiry into the Tao, and A New Understanding of Language. Here, Feng’s approach was strongly constructive, as he intended to carry forward the Chinese philosophical tradition through these publications. Inspired by Neo-Confucian thinkers such as Cheng Yi and Zhu Xi, Feng called his philosophy the “New Philosophy of Principle” (Xin lixue).

In order to understand Feng’s reinterpretation of Neo-Confucian thought, it is necessary to examine the Neo-Confucian concepts of li (“principle” or “cosmic pattern”) and qi (“energy” or “material force”). Feng understood these paired concepts as representing the ontological relationship between the universal and the particular. The universal is eternal, abstract and intangible, yet it has a certain material basis in the concrete world or the world of “instruments,” as the Yijing (Book of Changes) describes it: “That which is above physical form is the Dao (“Way”); those things contained by physical form are instruments.” Objects in the concrete world that have physical form are not the proper objects of philosophical thinking. Rather, philosophy is concerned with knowing the universal through logical analysis. Feng distinguished between the metaphysical world of li (which he called “the True Realm”) and ti (“substance”) and the physical world of qi (“the Real Realm”) and yong (“function”). The True Realm is like an arcane book, such that those with untrained eyes see only blank pages, but philosophical minds see meaningful words engraved on them. Between the two, the True Realm is “first,” not temporally but ontologically. That is, a given class of things in the Real Realm is an imperfect revelation of a certain principle in the True Realm. Feng’s reimagined Confucian ethics, in shaping human society, is functionally derived from the Principle of the True Realm.

Feng’s “New Philosophy of Principle” is realist in outlook – that is, it assumes that moral truths exist as facts, and thus that ethics and ontology are interrelated. As facts, moral truths really exist, and possess an inherent but abstract existence as “principle” (li). This “principle” manifests itself concretely as “material force” (qi) in the ever-flowing phenomenal world of the Way (Dao). Like Neo-Confucian thinkers long before him, Feng adapted Chinese Buddhism’s notion of “the emptiness of emptiness” to develop his notion of the Great Whole (daquan), or the totality of all that exists. Within the Great Whole, all things are one, and human beings find their purpose. The self-definition of human beings inevitably comes from their understanding of the universe.

All of this is very much in keeping with traditional Chinese thought’s concern for the harmonious relationship between Heaven, nature and human beings. Feng’s comparative studies of Plato and Zhu Xi, on the one hand, and of Immanuel Kant and the Daoists, on the other hand, represent a revamped lixue based on the fundamental outline of Zhu Xi’s highly influential philosophical synthesis, a project which, unlike the efforts of most of Feng’s fellow “New Confucians,” bid fair to become a major contribution to world philosophy. Unfortunately, Feng proved unable to revise, refine, or elaborate upon this groundbreaking work of the 1930s.

3. A History of Chinese Philosophy

Feng began work on A History of Chinese Philosophy during the 1920s. Prior to its publication in 1934, the only available modern critical history of Chinese philosophy was Hu Shi’s Outlines of the History of Chinese Philosophy (1919), which was the first attempt to break away from traditional genres of writing about the history of Chinese philosophy. The latter are often based on unexamined genealogical traditions and historically unreliable materials. In addition, the traditional writings are typically in the form of commentaries and even random notes and reflections on classical texts, and lack a systematic approach. Hu attempted to address these two issues by offering critical scholarship analyzing and authenticating historical documents, on the one hand, and providing a survey of Chinese thought by means of employing Western philosophical concepts, on the other hand. The problem with Hu’s work was that he never finished it. Feng decided to follow in Hu’s footsteps and compose a comprehensive and critical historical account of Chinese philosophy, using the scholarly tools he had mastered at Columbia University. In contrast with Hu’s effort to dismantle and debunk Chinese traditions, however, Feng claimed that his approach more accurately and honestly interpreted the various schools and thinkers of Chinese philosophical traditions. Having based his own philosophy on Confucian concepts, Feng insisted on the central role of Confucianism in Chinese intellectual history.

Feng self-consciously adopted a modern approach to Chinese philosophy. He abandoned the traditional view that all historical Chinese thought was an imperfect interpretation of perfect truths revealed by ancient sages. However, unlike Hu Shi and other iconoclastic scholars of the time, Feng refused to see history as pious fabrications concocted by the political and religious powers of old. His method was to interpret and appreciate tradition as a valuable historical legacy. For instance, against traditional commentators, Feng insisted that the Laozi (also known as the Daodejing) was written at a time much later than the Spring and Autumn Period (770-481 B.C.E.)—a view since borne out by archaeological discoveries, although the date of the text remains a contested issue among scholars. But this does not make the Laozi a worthless forgery, as the iconoclasts contended. Feng was also extremely keen on creating a comprehensive understanding of the entire history of Chinese philosophy. Feng’s intention was to give his readers a sense of the direction and development in Chinese philosophy. He wanted to show that there were periods when Chinese thought was original and creative (such as in the era prior to the Qin dynasty’s establishment in 221 B.C.E.), followed by other periods when it became rigid and stagnant (as in the late feudal society of the 18th and 19th centuries). He also stressed that true philosophy came from observing nature and social life, not from lofty theories and hair-splitting textual analysis.

Many scholars see Feng’s historiographical contributions to Chinese philosophy as much more significant than his original constructive contributions as a Chinese philosopher.  Prior to the publication of A History of Chinese Philosophy, Chinese thinkers did not group their traditional schools of thought with “philosophy” as understood in the West.  It was not until the very late 19th century, in fact, that a word for “philosophy” appeared in East Asian languages. The Japanese thinker Nishi Amane (1829-1897) coined the term tetsugaku, “the study of wisdom,” to approximate the Western concept of “philosophy,” and this term was adopted for use in Chinese by the scholar-official Huang Zunxian (1848-1905) as zhexue, which typically was reserved for the description of Western thought, excluding traditional Chinese thought. Although this exclusive and Eurocentric view of philosophy still has its adherents, Feng’s achievement was in defining Chinese thought and Western thought in terms of the common category of zhexue or “philosophy.”

A History of Chinese Philosophy appeared in several different translations, first in Derk Bodde’s English edition and later in several other languages, including Japanese, French, Italian, Korean, and German. During Feng’s 1948 visit to the University of Pennsylvania, he gave lectures that later became the basis of the English-language book, A Short History of Chinese Philosophy (1948). In the 1980s, Feng began – but never completed – a revised, Marxist version of A History of Chinese Philosophy, which was to be called A New History of Chinese Philosophy.

4. Influence

Views of Feng’s philosophical legacy vary according to cultural context. In China, Feng is considered to be one of the few original philosophers that twentieth-century China produced. Although his former students do not remember him for his classroom eloquence, Feng Youlan nevertheless was able to impart his scholarship on, and passion for, Chinese philosophy to generations of students, many of whom currently hold teaching positions in elite Chinese universities.

As a result of their renewed interest in twentieth-century “New Confucian” thinkers, contemporary Chinese scholars have devoted a considerable amount of scholarship to Feng’s “New Philosophy of Principle” (xin lixue), which they typically regard as a systematic and sophisticated endeavor. For many of his Chinese admirers, Feng’s works are evidence that ancient Chinese (especially Confucian) conceptions can be retooled and made useful for modern times. The influence of Feng’s thought is likely to become greater as Confucian traditions enjoy something of a revival across contemporary China.

In the West, on the other hand, Feng’s influence is limited mainly to the reception of A History of Chinese Philosophy, which has been translated into multiple Western languages. With the exception of a few somewhat obscure publications, there has not been substantial Western scholarly engagement of general philosophical interest with Feng’s works. Often recommended as an indispensable read for students of Chinese culture, history, and thought, A History of Chinese Philosophy has also received criticism for its apparent partiality to Confucianism over Buddhism and Daoism, China’s two other great spiritual and intellectual traditions. Although Feng considered himself to be a philosopher, his reputation outside of China is that of an historian, and Western academic interest has focused on Feng as an historical figure in his own right, who lived during a particularly critical and troubled moment in the development of modern China.

5. References and Further Reading

  • Chen, Derong. Metaphorical Metaphysics in Chinese Philosophy: Illustrated with Feng Youlan’s New Metaphysics. Lanham, MD: Lexington Books, 2011.
  • Chen, Lai. Tradition and Modernity: A Humanist View. Leiden and Boston: Brill, 2009.
  • Cheng, Chung-Ying, and Nicholas Bunnin, eds. Contemporary Chinese Philosophy. Malden, MA: Blackwell Publishers, 2002.
  • de Bary, William Theodore, and others, eds. Sources of Chinese Tradition. 2 vols. New York: Columbia University Press, 1999-2000.
  • Feng, Youlan. Selected Philosophical Writings of Feng Yu-lan. Beijing: Foreign Languages Press, 2008.
  • Feng, Youlan. The Hall of Three Pines: An Account of My Life. Trans. Denis C. Mair. Honolulu: University of Hawai’i Press, 1999.
  • Feng, Youlan. The Spirit of Chinese Philosophy. Trans. Ernest Richard Hughes. Boston: Beacon Press, 1962.
  • Feng, Youlan. A History of Chinese Philosophy. Trans. Derk Bodde. Princeton: Princeton University Press, 1952-53.
  • Feng, Youlan. A Short History of Chinese Philosophy. Trans. Derk Bodde. New York: Free Press, 1948.
  • Makeham, John. New Confucianism: A Critical Examination. New York: Palgrave Macmillan, 2003.
  • Masson, Michel C. Philosophy and Tradition: The Interpretation of China’s Philosophic Past — Fung Youlan, 1939-1949. Taipei: Ricci Institute, 1985.
  • Obenchain, Diane B., ed. Something Exists: Selected papers of the International Research Seminar on the Thought of Feng Youlan. Honolulu: Dialogue Publishing, 1994.
  • Teoh, Vivienne. The Post ’49 Critiques of Confucius in the PRC, with Special Emphasis on the Views of Feng Youlan and Yang Rongguo: A Case Study of the Relationship between Contemporary Political Values and the Re-evaluation of the History of Chinese Philosophy. Thesis (Ph.D.) — University of New South Wales, 1982.
  • Wycoff, William Alfred. The New Rationalism of Fung Yu-Lan. Thesis (Ph.D.) — Columbia University, 1975.

 

Author Information

Xiaofei Tu
Email: tux@appalachian.edu
Appalachian State University
U. S. A.

Moral Particularism

Moral particularism is the view that the moral status of an action is not in any way determined by moral principles; rather, it depends on the configuration of the morally relevant features of the action in a particular context. It can be seen as a reaction against a traditional principled conception of morality as comprising a true and coherent set of moral principles. The chief motivation for moral particularism derives from the observation that exceptions to principles are common, and exceptions to exceptions are not unusual. Moral principles, which are equipped only to deal with homogeneous cases, seem to be too crude to handle the delicate nuances in heterogeneous moral situations.

Two counterarguments against moral particularism have been popular. One is the argument from supervenience. The other is the argument from the atomism of reason. The argument from supervenience is meant to establish the existence of true absolute moral principles, whereas the argument from the atomism of reason is meant to establish the existence of true, pro tanto, moral principles. If these arguments succeed, they will jointly consolidate a principled conception of morality. This article introduces the idea of moral particularism in more detail and assesses moral particularists’ responses to the two counterarguments mentioned above.

Table of Contents

  1. Introduction
  2. Absolute Moral Principlism and the Supervenience Argument
  3. Moral Particularists’ Responses to the Supervenience Argument
  4. Argument from Atomism of Reason for Pro tanto Principlism
  5. Criticism of Argument from Atomism of Reason
  6. Concluding Remarks
  7. References and Further Reading

1. Introduction

Moral particularism, on a first approximation, is the view that the moral status of an action is not in any way determined by moral principles; rather, it depends on the configuration of the morally relevant features of the action in a particular context. To illustrate, whether killing is morally wrong is not determined by whether it violates any sort of universal moral principles, say, the moral principle against killing; rather, its moral status is determined by its morally relevant features in the particular context where it occurs. If it occurs in a normal context, it is wrong; however, if it occurs in a context where the killer kills because it is the only way to defend against an assailant’s lethal attack, it is not wrong. In this case, the moral principle against killing does not determine the moral status of the action; rather, its moral status is determined by its morally relevant features, that is, the feature of killing an aggressor and the feature of saving one’s own life, in the particular context. Now, let us call those who contend, contra moral particularists, that the moral status of an action is determined by moral principles the ‘moral principlists’. In the killing case, the moral principlists might well contend that the principle against killing is not the right sort of principle, but the principle which says that ‘killing is wrong except when it is done in self-defense’ is. So the moral principlists might still contend that it is the principle that determines the moral status of the action. However, according to moral particularism, even in cases where the moral verdicts issued respectively by moral principles and particular features converge, it is still the particular features rather than universal principles that determine the moral status of an action. This is not to say that the moral particularists are right to think so; this merely illustrates the difference between the views of moral particularists and principlists.
The gist of moral particularism is best summed up in the words of Terence Irwin (2000): the particulars are normatively prior to the universals.

To further illustrate the differences between moral particularism and principlism, moral particularism can be seen as a reaction against a top-down principled approach to morality. That is, it opposes the moral principlists’ idea that we can get morally right answers by applying universal moral principles to particular situations.

In more detail, in moral principlists’ conception of morality, morality is made up of a true and coherent set of moral principles. It follows from this conception that if one negates the existence of moral principles, one negates morality altogether. For without moral principles, it seems that there would be no standards against which the moral status of actions can be determined.

The above-mentioned principlists’ conception of morality has been, without a doubt, a dominant view. This can be confirmed by examining the work in normative ethics. One chief concern of normative ethics has been to formulate basic moral principles that govern the moral terrain. It is generally believed by the normative ethicists that in basic moral principles lies the ultimate source of moral truths. The normative ethicists, though arguing among themselves over what the correct basic moral principles are, all tacitly agree that a major part of normative ethics is built upon the articulation of the basic moral principles and their application to practical moral issues.

While the debate amongst normative ethicists is continuing about the correct formulation and application of the basic moral principle(s), the common principled conception of morality underlying it has not received proper attention—not until the appearance of the contemporary particularists.

Contrary to the principlists, the particularists argue that morality does not depend upon codification into a true and coherent set of moral principles. On their view, general principles fail to capture the complexity and uniqueness of particular circumstances. Exceptions to principles are common and exceptions to exceptions are not unusual (Davis 2004; Tsu 2010). In other words, there are no exceptionless principles of the sort which the principlists have in mind. The particularists believe that the moral status of an action is not determined by moral principles; instead it always relies on the particular configuration of its contextual features. In brief, moral particularists take a bottom-up approach and, as Irwin pointed out, they give normative priority to the particulars. That is, according to moral particularism, the moral status of an action is determined by its morally relevant features in a particular context.

To elaborate, one chief motivation for moral particularists’ emphasis on the particulars comes from their observation that the whole history of moral philosophy has witnessed brilliant moral philosophers searching for true moral principles that codify the moral landscape, yet no principles have generated wide agreement. There can be many explanations for this, but one is particularly salient: there are no true moral principles to be had in the first place. Parallel situations like this are common in many corners of philosophy. Many brilliant philosophers have spent their entire careers analyzing concepts of probability, truth or knowledge, trying to supply non-trivial, non-circular necessary and sufficient conditions for them. As is generally acknowledged today, it is extremely difficult, if not entirely impossible, to come up with a widely accepted analysis. One plausible explanation is that no analysis of the kind is to be had in the first place. This is not to say that there cannot be alternative explanations. But until they are produced and justified, there is at least a prima facie case for doubting whether conceptual analysis can produce non-trivial, non-circular necessary and sufficient conditions for the meanings of many philosophically interesting terms. Likewise, as hardly any moral principle receives wide acceptance, there is also a prima facie case that can be made for doubting whether there are any true moral principles in the first place, until an alternative explanation of why agreement on principles is so hard to come by is produced and justified.

Adding to the untenability of the principled conception of morality, moral particularists argue, is the fact that the moral status of an action is context-sensitive in character. That is, whether an action is right or wrong depends entirely on the particular contexts where it takes place; no universal principles, which are only equipped to deal with homogeneous cases, are capable of capturing the context-sensitive character of the moral status of an action in various heterogeneous contexts. Take the action of killing for instance. A principle against killing will run into exceptions in cases of self-defense and a further qualified principle against killing except in cases of self-defense will run into exceptions again in cases of overreacting self-defense. The exceptions may well go on infinitely. This shows that, the moral particularists believe, no principles are able to handle the essentially context-sensitive character of the moral status of an action.

To forestall some possible misunderstanding, moral particularism is not just the view that the moral status of an action is not determined by moral principles of the absolute kind – the absolute moral principle against killing; rather, it is the more encompassing and more radical view that it is not determined by moral principles of the pro tanto kind either. The difference between absolute moral principles and pro tanto ones roughly lies in the fact that the former purport to determine the moral status of an action conclusively, whereas the latter do not; instead, the pro tanto moral principles, on a standard construal, merely specify the contribution a morally relevant feature makes to the determination of the moral status of an action. To use an example to illustrate, the difference between an absolute principle against lying and a pro tanto principle against lying lies in the fact that according to the absolute principle, lying is wrong (period) whereas according to the pro tanto principle, lying is wrong-making. To put it differently, according to the absolute principle against lying, the fact that an action has a feature of lying conclusively determines its wrongness whereas according to the pro tanto principle against lying, the same fact merely contributes to the wrongness of the action and leaves open the possibility that it is right overall. For instance, if an action has not only a feature of lying, but also a feature of saving a life, then the wrongness the feature of lying contributes to the action may well be outweighed by the rightness the feature of saving a life contributes to the action. Overall, the action may still turn out to be right, despite its violation of the pro tanto principle against lying.

Moral particularism, as we have explained, not only opposes absolute principlism, i.e. the idea that the moral status of an action is determined by absolute principles, but also pro tanto principlism, i.e. the idea that pro tanto moral principles jointly determine the moral status of an action. As moral particularists see things, there are neither true absolute principles nor true pro tanto principles. They both are bound to run into exceptions in some cases. In what follows, we will introduce the arguments for absolute moral principlism and pro tanto principlism and moral particularists’ responses to them.

Before we move on, however, there is an important caveat to be noted about terminology. It is this. Some philosophers use ‘prima facie principles’ to mean pro tanto principles. This is largely due to W. D. Ross’s influence. Ross was perhaps the first philosopher who used ‘prima facie’ to mean ‘pro tanto’. Strictly speaking, however, there is a difference in the meanings of these two phrases, as Ross himself later recognized and apologized for. The difference between ‘prima facie’ and ‘pro tanto’ principles can be put in the following way: a prima facie principle specifies a moral duty at first glance whereas a pro tanto principle specifies a moral duty sans phrase, unless it is overridden by other moral duties. To illustrate with examples, a pro tanto principle against lying means that other things being equal, we have a duty not to lie. By contrast, a prima facie principle against lying means that at first glance, we have a duty not to lie; however, as we take a closer look at the facts of the situation, we may discover that we do not in fact have such a duty, as, for example, in the case where we are playing a bluffing game, the rule of which states that lying to the other contestants is permitted.

Now, having clarified the distinction between ‘pro tanto principle’ and ‘prima facie principle’, it has to be noted, for our purposes, that pro tanto principlism is the view that the moral status of an action is determined jointly by pro tanto principles, instead of by the prima facie principles just elucidated. In fact, the prima facie principles do not in any way determine the moral status of an action, for as we have explained, they are merely indicative of its moral status (as our moral duty) at first glance.

With the above caveat in mind, we can proceed to examine the arguments for absolute moral principlism and pro tanto principlism, and moral particularists’ responses to them.

2. Absolute Moral Principlism and the Supervenience Argument

Absolute moral principlism is the idea that there are true absolute moral principles. The most powerful argument that has been advanced in support of this idea is the supervenience argument. It was advanced collaboratively by Frank Jackson, Philip Pettit, and Michael Smith (henceforth the Canberrans, as they all worked in Canberra in Australia when they advanced the argument) in their co-authored paper “Ethical Particularism and Patterns”. Roughly, the supervenience argument takes the following form:

P1: The thesis of global supervenience is true.

P2: If the thesis of global supervenience is true, then there must be a true conditional of the form ‘(x) (Nx → Mx)’.

P3: The true conditional of the form ‘(x) (Nx → Mx)’ must be shaped.

P4: A true shaped conditional of the form ‘(x) (Nx → Mx)’ is a true absolute moral principle.

C: There are true absolute moral principles, that is, absolute moral principlism is true.

The above argument is valid, but each premise needs some comment.

P1: The thesis of global supervenience is true.

The thesis of global supervenience means, according to the Canberrans, the following:

The thesis of global supervenience: Naturally identical worlds are morally identical.

To illustrate, suppose that N1 stands for the sum of natural properties of world 1 and M1 for the sum of its moral properties. According to the thesis of global supervenience, for any world that is naturally identical with world 1, that is for any world that has N1, it must have M1. The Canberrans take the global thesis to be true, for it is a dictum in the realm of ethics that there cannot be a moral difference without a natural difference. So it follows from this dictum that naturally identical worlds are morally identical.

P2: If the thesis of global supervenience is true, then there must be a true conditional of the form ‘(x) (Nx → Mx)’.

Suppose again that N1 stands for the sum of natural properties of world 1 and M1 for the sum of its moral properties. As we have illustrated, according to the thesis of global supervenience, for any world that is naturally identical with world 1, that is for any world that has N1, it must have M1. Likewise, let N2 and M2 stand respectively for the sum of natural properties of world 2 and the sum of its moral properties. According to the thesis of global supervenience, for any world that is naturally identical with world 2, it must have M2. Such being the case, from the thesis of global supervenience we can get the following raft of conditionals:

If x has N1, then x has M1.

If x has N2, then x has M2.

For the sake of simplicity, let us focus on the moral property of rightness and use ‘M’ for it, and use ‘N’ for ‘N1 or N2 or N3…’. Now, we can simplify the above raft of conditionals as follows:

If x has N, then x has M.

Hence, we can infer from the thesis of global supervenience that there must be a true conditional of the form ‘(x) (Nx → Mx)’.
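As an illustrative sketch (not part of the Canberrans’ own presentation), the compression just described can be written out in standard quantificational notation, with ‘M’ fixed as the property of rightness and ‘N’ abbreviating the disjunction ‘N1 or N2 or N3 …’ as in the text:

```latex
% The raft of conditionals, one for each natural way a world might be:
\forall x\,(N_1 x \rightarrow M x), \qquad
\forall x\,(N_2 x \rightarrow M x), \qquad \dots
% Abbreviating  Nx  as  N_1 x \lor N_2 x \lor N_3 x \lor \dots,
% the raft collapses into the single conditional cited in P2:
\forall x\,\bigl[(N_1 x \lor N_2 x \lor N_3 x \lor \dots) \rightarrow M x\bigr]
\quad\Longleftrightarrow\quad \forall x\,(N x \rightarrow M x)
```

The point of the sketch is that the single conditional of P2 is simply the disjunctive summary of the raft, which is why its truth follows directly from the thesis of global supervenience.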

P3: The true conditional of the form ‘(x) (Nx → Mx)’ must be shaped.

The meaning of ‘shaped’ requires some explanation. According to the Canberrans, a conditional of the form ‘(x) (Nx → Mx)’ is shaped if and only if there is some commonality in the various Ns ‘N’ refers to. Remember that ‘N’ stands for ‘N1 or N2 or N3…’. So this just means that a conditional of the form ‘(x) (Nx → Mx)’ is shaped if and only if there is some commonality amongst N1, N2, N3… Having clarified the meaning of ‘shaped’, one question that needs to be addressed here is why there must be some commonality amongst N1, N2, N3…

The Canberrans contend that there must be some commonality amongst the various Ns, for if there were not, an absurd consequence would follow. The absurd consequence is this: we are not competent with the use of moral terms. This is absurd because we are indeed competent with the use of moral terms. We condemn a cold-blooded case of murder as immoral whereas we approve of an act of self-sacrifice as virtuous. It is absurd to suggest that we are not competent with the use of moral terms. Let’s call this absurd consequence conceptual incompetence for short. So how is it the case that if there were no commonality amongst the various Ns, conceptual incompetence would follow? Or to put the question somewhat differently, why is commonality in the various Ns essential for our competence with the use of moral terms?

To answer this question, let us use an example to illustrate. According to the Canberrans, the moral term ‘M’ (or ‘right’ in our current case) applies to various actions that have Ns. These actions that have Ns can be infinite in number. In order to know that ‘M’ can be applied to these various actions that have Ns, we, as finite human beings, cannot possibly acquaint ourselves with each and all of the actions that have Ns which ‘M’ applies to. That is to say, we cannot become competent with the use of ‘M’ by acquainting ourselves with all the possible actions ‘M’ applies to. The only viable way via which we can become competent with the use of ‘M’ is when these various actions have some finite commonality we can latch on to. Once we latch on to that commonality, we become competent with the use of ‘M’. So, the fact that we are indeed competent with the use of ‘M’ shows that there must be some commonality amongst the various Ns.

P4: A true shaped conditional of the form ‘(x) (Nx → Mx)’ is a true absolute moral principle.

This premise is by definition true, as the Canberrans contend that a true absolute moral principle is one that displays a shaped connection between the natural ways things might be and the moral ways things might be. Suppose that ‘N’ stands for the natural property of maximal utility and ‘M’ for the moral property of rightness. Then the true shaped conditional of the form ‘(x) (Nx → Mx)’ just says that an action that has maximal utility is right. This is clearly equivalent to a true absolute moral principle of utility.

C: There are true absolute moral principles, that is, absolute moral principlism is true.

C follows validly from P1 to P4. If there are true absolute moral principles, this just means that absolute moral principlism is true. Besides, this also means that moral particularism, the claim that there are neither true absolute moral principles nor true pro tanto moral principles, will be refuted.

3. Moral Particularists’ Responses to the Supervenience Argument

In section 2, we examined the Canberrans’ supervenience argument for absolute moral principlism, or against moral particularism. Now, in this section, we will examine moral particularists’ response to the Canberrans’ supervenience argument. Much of the philosophical action has focused on P3, especially on ‘conceptual incompetence’. This is understandable, as all the other premises seem to be relatively uncontroversial. So now, without further ado, let us cut to the chase and introduce moral particularists’ response to P3. Jonathan Dancy (1999a, 2004), a chief proponent of moral particularism, responds to the supervenience argument by taking issue with P3. He contends that the Canberrans have not provided us with very convincing reasons to believe that P3 is true. In particular, Dancy argues that the Canberrans’ claim with regard to ‘conceptual incompetence’ is far from being justified.

As a reminder, P3 is the claim that the true conditional of the form ‘(x) (Nx → Mx)’ must be shaped. As we have explained, the meaning of a ‘shaped’ conditional boils down to this: there is some commonality in the various Ns. And the Canberrans contend that if there were no commonality in the various Ns, then the absurd consequence of ‘conceptual incompetence’ would result, that is, the absurd consequence that we are not competent with the use of moral terms.

Remember that the Canberrans’ reason for thinking that ‘conceptual incompetence’ would result is this. The Canberrans argue that the only plausible way for us to learn a moral term is by latching onto a commonality that the various actions that have Ns share, for we, as finite beings, cannot possibly acquaint ourselves with all the various actions that have Ns to which the moral term can be applied. So if there were no commonality amongst the various actions that have Ns, then it would be impossible for us to become competent with the use of moral terms.

However, Dancy maintains that the Canberrans’ line of reasoning is a non sequitur. He argues that we can become competent with the use of moral terms even if there is no commonality amongst the various actions that have Ns. For there is one theory of competence the Canberrans unduly neglected, namely the prototype theory, which was first advanced by Rosch (1975). According to the prototype theory, there is no commonality amongst the various Ns the moral term ‘right’ can be applied to; we do not latch onto a commonality amongst the various Ns when we learn the term ‘right’, since there is none. Instead, what we latch onto are the ‘prototypical properties’ of a right action, that is, the properties that are typical of a right action.

To use an analogy, the various things the concept ‘bird’ can be applied to really have no commonality. It is true that birds can typically fly; however, a penguin cannot. It is also true that birds typically have feathers; nonetheless, some baby birds do not. So there is really no commonality that all birds share. Does this mean that we are unable to learn how to use the term ‘bird’? No, for we are indeed competent with the use of the term ‘bird’. So how do we pick up the term ‘bird’ if there is really no commonality amongst the various birds? According to the prototype theory, what we pick up are the prototypical properties that are typical of a bird: for instance, the capacity to fly, having feathers, and having wings. With regard to the atypical cases, such as penguins and baby birds, we can rightly categorize them as birds because they have enough of those prototypical properties. Likewise, Dancy contends that in the moral cases, we can become competent with the use of moral terms, say, the term ‘right’, by picking up those prototypical properties that are typical of right actions. With regard to the atypical cases, we can rightly categorize them as right actions because they have enough of those prototypical properties, just as we can rightly categorize penguins and baby birds as birds because they have enough of the prototypical properties of birds.
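The bird analogy can be made concrete with a toy sketch (the property names and the threshold below are my own stipulations for illustration, not part of Rosch’s or Dancy’s accounts): categorization proceeds by counting shared prototypical properties rather than by finding a single property common to every member.

```python
# Toy illustration of prototype-based categorization: an item counts
# as a bird if it shares "enough" prototypical properties, even though
# no single property is common to all birds.

PROTOTYPE = {"can_fly", "has_feathers", "has_wings", "has_beak"}
THRESHOLD = 2  # how many shared properties count as "enough" (stipulated)

def is_bird(properties: set) -> bool:
    """Categorize by overlap with the prototype, not by a shared essence."""
    return len(properties & PROTOTYPE) >= THRESHOLD

penguin = {"has_feathers", "has_wings", "has_beak"}  # cannot fly
chick = {"has_wings", "has_beak"}                    # no feathers yet
fish = {"has_fins", "can_swim"}

print(is_bird(penguin))  # True: shares enough prototypical properties
print(is_bird(chick))    # True: atypical, but still enough
print(is_bird(fish))     # False: no overlap with the prototype
```

On Dancy’s picture, the same structure would apply to ‘right’: atypical right actions are recognized by sufficient overlap with the prototypical properties of right actions, with no commonality required.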

So far, we have seen Dancy’s response to the ‘conceptual incompetence’ claim that is used to support P3 in the supervenience argument. Whether Dancy’s response is successful requires further investigation. The crux of the question lies in whether the term ‘right’ is really analogous to the term ‘bird’ with regard to the lack of commonality amongst the things they can be applied to. The analogy, while illuminating, does not by itself prove that the term ‘right’ is analogous to the term ‘bird’ in that respect. To put it somewhat differently, is there any good reason to think that the prototype theory, which can explain our competence with the term ‘bird’, can explain equally well our competence with the term ‘right’? Perhaps a natural response from the moral particularists would be to say that morally right actions are multiply realizable at the natural level: even killing, stealing, and lying can be morally right in some circumstances, and if so, it seems plausible to contend that these morally right actions have nothing in common with other morally right ones, such as saving a drowning kid in a pond or donating to Oxfam. This is one possible response. Exactly how plausible it is, and whether there are other more plausible responses, awaits further exploration. These issues will not be pursued here.

4. Argument from Atomism of Reason for Pro tanto Principlism

In sections 2 and 3, we examined, respectively, the supervenience argument that purports to establish the truth of absolute principlism and the moral particularists’ response to it. Remember that moral particularism holds not just that absolute principlism is false, but also that pro tanto principlism is false. So if there are good arguments showing that pro tanto principlism is true, then moral particularism will be falsified.

Remember that pro tanto principlism is the claim that there are true pro tanto moral principles. Pro tanto moral principles specify a feature of an action that is either right-making or wrong-making. For instance, the pro tanto principle against lying specifies the feature of lying as a wrong-making feature whereas the pro tanto principle demanding promise-keeping specifies the feature of promise-keeping as right-making. The strongest argument that has been advanced to support pro tanto principlism is the argument from the atomism of reason. It runs as follows:

P1. The atomism of reason is true.

P2. If the atomism of reason is true, then there are true pro tanto moral principles.

C. There are true pro tanto moral principles.

The above argument is valid, but each premise requires some comment.
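The validity here is just modus ponens. A minimal sketch in Lean makes the logical form explicit (the proposition names are mine, for illustration):

```lean
-- The argument's form: from P1 (Atomism) and P2 (Atomism → ProTanto),
-- the conclusion ProTanto follows by modus ponens.
variable (Atomism ProTanto : Prop)

example (p1 : Atomism) (p2 : Atomism → ProTanto) : ProTanto :=
  p2 p1
```

Since the form is unimpeachable, any dispute must target the truth of the premises themselves.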

P1: The atomism of reason is true.

In its simplest form, the atomism of reason is the idea that features of an action, qua reasons, behave in an atomistic way. It is contrasted with the holism of reason, the view that features of an action, qua reasons, behave in a holistic way. To grasp the difference between atomism and holism, we need to understand what ‘atomistic way’ and ‘holistic way’ mean. According to atomism, features, qua reasons, behave in an atomistic way in the sense that their status as reasons for or against performing an action is constant, unaffected by changes of context. It is as if they had an intrinsic nature that remains constant across contexts; hence the analogy with the ‘atom’. To illustrate: according to the atomism of reason, if the feature of lying is a reason against performing an action in one context, its status as a reason against remains constant; even in a context where you have to lie in order to save a life, the feature of lying still remains a reason against performing the action. Its status as a reason against does not change, although its moral significance might well be outweighed by that of saving a life.

On the other hand, holism contends that features, qua reasons, behave in a holistic way in the sense that their status as reasons for or against might change due to the change of contexts. For instance, the fact that an action has a feature of lying might be a reason against performing it in a normal context; however, it might not be so in a context where we are playing a bluffing game. Indeed, it might even become a reason for performing the action because it makes the game more exciting and enjoyable.

In the face of the challenges from the holists, there are typically three routes the atomists take in response. First, the atomists might well bite the bullet and contend that the feature of lying is a reason against no matter what. In the context of playing a bluffing game, the atomists might well contend that the fact that bluffing has a feature of lying is still a reason against so doing; it is just that its moral significance is outweighed by the harmless pleasure of the participants. That is why bluffing seems overall morally permissible in that context. However, this does not mean that the status of the feature of lying has changed from a reason against to a reason for in that context.

Another response from the atomists is this: the features that change their reason status are overly simplified, and the more a feature gets specified, the less compelling the case that it can change its reason status. For instance, the feature of killing might change its status from a reason against in a normal context to a reason for in a context where the only possible way to defend oneself against a rapist is to kill him. However, when the feature of killing is further specified as the feature of killing a little girl merely for one’s own pleasure, it does not seem likely that its status as a reason against can change.

A third line of response from the atomists is this. There are two sorts of features we need to distinguish in the current context: the morally thin and the morally thick. The difference between the thin and the thick, according to Crisp (2000), lies chiefly in whether they are couched in virtue terms. For instance, the features of lying and killing belong to the thin, whereas the features of honesty and cruelty belong to the thick. They also differ in how they behave: while the morally thin might change their reason status according to context, the morally thick do not. For instance, while it might well be the case that the feature of lying changes its status from a reason against to a reason for, this is never the case for the feature of dishonesty. The feature of dishonesty is always a reason against; to think otherwise is to distort the meaning of ‘dishonesty’. In fact, according to Crisp, the invariant reason status of the feature of dishonesty is the best explanation for the variance in the reason status of lying. In cases where lying brings about dishonesty, it is a reason against, whereas in cases where it does not, it is not a reason against. To put it more generally, the invariant reason status of the morally thick provides an illuminating explanation for why the morally thin change their reason status. So the atomists contend that it is both linguistically intuitive and explanatorily illuminating to regard the morally thick as invariant reasons. Such being the case, the atomists contend that while holism might be true of the morally thin, it is not true of the morally thick.

P2. If the atomism of reason is true, then there are true pro tanto moral principles.

The atomism of reason is the claim that features, qua reasons, behave in an atomistic way. As we have explained, this means that their reason status does not change according to context. If the feature of lying is a reason against in one context, it is a reason against in all contexts. If so, this just means that there is always a reason against lying. Correspondingly, according to the pro tanto moral principle against lying, the fact that an action has the feature of lying is always a reason against performing it. So if the atomism of reason about lying is true, it follows naturally that there is a true pro tanto moral principle against lying. The same line of reasoning generalizes to other features as well.

5. Criticism of Argument from Atomism of Reason

P1 in the argument from the atomism of reason is where all the philosophical action is. In section 4, we saw the atomists’ three possible replies to the particularists’ holism charge against P1. However, the particularists are not satisfied. In what follows, I will introduce the particularists’ further challenges to each of the atomists’ three replies.

First, the atomists contend that atomism is simply true of how all features qua reasons behave. Even in cases where we are playing a bluffing game, the feature of lying is still a reason against and not a reason for; it does not change its reason status. Against this line of reasoning, the particularists might well contend that the atomists’ point is simply question-begging, for whether atomism is true of how all features qua reasons behave is exactly the issue in dispute. The atomists’ assertion is no substitute for persuasion.

Second, the atomists contend that as features get more and more specified, it is less and less likely that their reason status can change. For instance, while it might be possible for the feature of killing to change its reason status from a reason against to a reason for, it seems very unlikely for a more specified feature, killing a little girl merely for fun, to change its reason status. In reply, the particularists might well contend that even if the specified feature of killing a little girl merely for fun cannot change its moral status, it does not thereby establish a pro tanto moral principle in the relevant sense. For the ‘merely for fun’ bit is a description of the agent’s motive, whereas a pro tanto principle is exclusively about the moral status of actions. When the controversial bit of the description is eliminated, it no longer seems as intuitively plausible to maintain that the feature of killing a little girl is always a reason against; killing her might end her needless suffering as a terminally ill patient, and so serve as a reason for. Against this line of reasoning, the atomists might well contend that the ‘merely for fun’ bit is added to the description only to prevent other principles from getting involved (for instance, to rule out the girl’s being killed as demanded by some other principle, where doing so is necessary to save a large number of girls), and that the specified feature can thus establish the truth of a pro tanto principle in the relevant sense. If the atomists were to so contend, the particularists might counter, as Dancy (2000) did, that true moral principles generated out of specified features like this are few and far between in the moral landscape. So while the atomists might have won a few battles against the particularists by producing a few true moral principles based on specified features, they certainly have not won the war. For morality, for the most part, still cannot be codified into exceptionless moral principles and does not depend on them for its existence, the particularists might insist.

Finally, the atomists contend that atomism is true of the morally thick, the features that are couched in virtue terms, such as honesty or cruelty. In reply, the particularists might well contend that this is not necessarily the case. A spy might well be too honest, whereas a platoon leader relentless in training soldiers who are about to be sent to the battlefield might well be cruel only to be kind. In these cases, the feature of honesty might well change its reason status from a reason for to a reason against, whereas the feature of cruelty might well do the reverse. In short, the morally thick, contra what the atomists claim, might well change their reason status. And calling someone ‘too honest’ or ‘cruel to be kind’ does not really run counter to our linguistic intuitions. On the other hand, with regard to explaining why a morally thin feature such as lying changes its reason status, the holists are not as clueless as the atomists like to think. In fact, they might well appeal to the change of context as providing an explanation: the reason why the feature of lying, normally a reason against, becomes a reason for in the context of a bluffing game is precisely the change of context. The atomists thus have no theoretical edge in explaining why the morally thin can change their reason status.

6. Concluding Remarks

Moral particularism, as we have introduced it, is the view that the particulars are normatively prior to the universals. That is, the moral status of an action is not in any way determined by universal moral principles, but by the morally relevant features of the particular context in which the action takes place. This view is motivated by the observation that universal principles, whether absolute or pro tanto, often run into exceptions. This observation has led moral particularists to believe that the moral status of an action is context-sensitive, susceptible to the influences of various morally relevant factors in the particular circumstances. And since both absolute and pro tanto moral principles deliver the same sort of moral verdict in all the contexts they cover, moral particularists believe that they oversimplify moral matters. Moral principles, be they absolute or pro tanto, fly in the face of the context-sensitivity of the moral status of an action. Moral particularists therefore contend that there are neither true absolute moral principles nor true pro tanto ones.

As we have seen, two major arguments against moral particularism have been advanced, respectively, by absolute moral principlists and pro tanto moral principlists. However, as we have also seen, they are far from conclusive: P3 in the supervenience argument and P1 in the argument from the atomism of reason can both be disputed. Of course, this does not mean that there can never be compelling arguments against moral particularism. In fact, several outstanding issues remain for moral particularism. Roughly, they all point to the fact that moral particularism has not really undermined a principled conception of morality when the idea of a moral principle is construed in various specific ways. For instance, Holton (2002) has argued that the gist of moral particularism is entirely compatible with the existence of what he calls ‘that’s-it’ principles. Ridge and McKeever (2006), on the other hand, propose to view moral principles as ‘regulative ideals’ and argue that this view of moral principles is unscathed by any particularist argument that has been advanced so far. Moreover, Väyrynen (2006) champions the notion of ‘hedged moral principles’ and argues that they might well play an indispensable role in action guidance. Finally, Robinson (2006) has urged a dispositionalist understanding of moral principles; when moral principles are so understood, Robinson contends, they are immune to the particularists’ attacks. Pursuing the details of these proposals, and determining whether they are successful, lies beyond the scope of this article. However, they all provide interesting topics for future research. It is fair to say that debates on moral particularism are still very lively. They cut deep into the nature of morality, and this article has barely scratched the surface.

7. References and Further Reading

  • Audi, Robert. (2004). The Good in the Right, Princeton, NJ: Princeton University Press
  • Audi, Robert. (2008). “Ethical Generality and Moral Judgment”, in M. Lance, M. Potrč, and V. Strahovnik (eds.), Challenging Moral Particularism. New York: Routledge, pp. 31-52
  • Berker, Selim. (2007). “Particular Reasons”, Ethics, vol. 118, pp. 109-139
  • Blackburn, Simon. (1992). “Through Thick and Thin”, Proceedings of The Aristotelian Society, Suppl., vol. 66, pp. 285-99
  • Crisp, Roger. (2000). “Particularizing Particularism” in Brad Hooker and Margaret Little (eds.), Moral Particularism, Oxford: Oxford University Press, pp. 23-47
  • Cullity, Garrett. (2002). “Particularism and Moral Theory: Particularism and Presumptive Reasons”, Aristotelian Society Supplementary Volume, vol. 76, issue 1, pp. 169-190
  • Dancy, Jonathan. (1981). “On Moral Properties.”, Mind, 90, no. 359, pp. 367-85
  • Dancy, Jonathan. (1983). “Ethical particularism and Morally Relevant Properties”, Mind, vol. 92, pp. 530-547
  • Dancy, Jonathan. (1993). Moral Reasons, Oxford: Blackwell
  • Dancy, Jonathan. (1999a). “Can the Particularist Learn the Difference Between Right and Wrong”, in K. Brinkmann (ed.), The Proceedings of the Twentieth World Congress of Philosophy, vol. 1, Bowling Green, OH: Philosophy Documentation Center, pp. 59-72
  • Dancy, Jonathan. (1999b). “Defending Particularism”, Metaphilosophy, vol. 30, no. 1/2, pp. 25-32
  • Dancy, Jonathan. (2000). “Particularist’s Progress” in Brad Hooker and Margaret Little (eds.), Moral Particularism, Oxford: Oxford University Press, pp. 130-156
  • Dancy, Jonathan. (2002). “What Do Reasons Do?”, The Southern Journal of Philosophy, Supplement, vol. XLI, pp. 95-113
  • Dancy, Jonathan. (2004). Ethics Without Principles, Oxford: Oxford University Press
  • Dancy, Jonathan. (2007). “Defending the Right”, Journal of Moral Philosophy, vol. 4, no. 1, pp. 85-98
  • Dancy, Jonathan. (2009). “Moral Particularism”, The Stanford Encyclopedia of Philosophy, Edward N. Zalta (ed.)
  • Davis, Darin. (2004). Rules and Vision: Particularism in Contemporary Ethics, Ph.D. Thesis, St. Louis University
  • Garfield, Jay. (2000). “Particularity and Principle”, in Brad Hooker and Margaret Little (eds.), Moral Particularism, Oxford: Oxford University Press, pp. 178-204
  • Gert, Bernard. (2004). Common Morality, New York: Oxford University Press
  • Gert, Bernard. (2005). Morality: Its Nature and Justification, Oxford: Oxford University Press
  • Gewirth, Alan. (1978). Reason and Morality, Chicago: University of Chicago Press
  • Hare, Richard. (1963). Freedom and Reason, Oxford: Clarendon Press
  • Hare, Richard. (1981). Moral Thinking, Oxford: Oxford University Press
  • Hare, Richard. (1984). “Supervenience”, Aristotelian Society Supplementary Volume 58, pp. 1-16
  • Harman, Gilbert. (2005). “Moral Particularism and Transduction”, Philosophical Issues, vol. 15(1), pp. 44-55
  • Holton, Richard. (2002). “Principles and Particularisms”, Proceedings of the Aristotelian Society Supplementary Volume 76, pp. 191-209
  • Hooker, Brad. (2000). “Moral Particularism: Wrong and Bad” in Brad Hooker and Margaret Little (eds.), Moral Particularism, Oxford: Oxford University Press, pp. 1-22
  • Irwin, Terence. (2000). “Ethics as an Inexact Science” in Brad Hooker and Margaret Little (eds.), Moral Particularism, Oxford: Oxford University Press
  • Jackson, Frank, Philip Pettit & Michael Smith. (2000). “Ethical Particularism and Patterns” in Brad Hooker and Margaret Little (eds.), Moral Particularism, Oxford: Oxford University Press, pp. 79-99
  • Jackson, Frank, Philip Pettit & Michael Smith. (2004). Mind, Morality, and Explanation, New York: Oxford University Press
  • Kagan, Shelly. (1988). “The Additive Fallacy”, Ethics, vol. 99, pp. 5-31
  • Kagan, Shelly. (1998). Normative Ethics, Boulder, Colorado: Westview Press
  • Kirchin, Simon. (2007). “Particularism and Default Valency”, Journal of Moral Philosophy, vol. 4 (1), pp. 16-32
  • Little, Margaret. (2000). “Moral Generalities Revisited” in Brad Hooker and Margaret Little (eds.), Moral Particularism, Oxford: Oxford University Press, pp. 276-304
  • Little, Margaret & Mark Lance. (2005a). “Particularism and Antitheory” in David Copp (ed.) The Oxford Handbook of Ethical Theory, pp. 567-595
  • Little, Margaret & Mark Lance. (2005b). “Defeasibility and the Normative Grasp of Context”, Erkenntnis, vol. 61, pp. 435-455
  • Little, Margaret & Mark Lance. (2008). “From Particularism to Defeasibility in Ethics”, in M. Lance, M. Potrč, and V. Strahovnik (eds.), Challenging Moral Particularism. New York: Routledge, pp. 53-75
  • McDowell, John. (2002). “Virtue and Reason” in Mind, Value & Reality, Massachusetts: Harvard University Press, pp. 50-73
  • McKeever, Sean & Michael Ridge. (2006). Principled Ethics: Generalism as a Regulative Ideal, Oxford: Oxford University Press
  • McKeever, Sean & Michael Ridge. (2007). “Turning on Default Reasons”, Journal of Moral Philosophy, vol. 4 (1), pp. 55-76
  • McNaughton, David. (1988). Moral Vision, Oxford: Blackwell
  • McNaughton, David. (1996). ‘An Unconnected Heap of Duties?’, The Philosophical Quarterly, Vol. 46, No. 185, pp. 433-437
  • McNaughton, David & Piers Rawling. (2000). “Unprincipled Ethics” in Brad Hooker and Margaret Little (eds.), Moral Particularism, Oxford: Oxford University Press, pp. 256-275
  • Noble, C. (1989). “Normative Ethical Theories” in S. Clark and E. Simpson (eds), Anti-Theory in Ethics and Moral Conservativism, State University of New York Press, Albany, N.Y, pp. 49-64
  • Nussbaum, Martha. (1986). The Fragility of Goodness, Cambridge: Cambridge University Press
  • Nussbaum, Martha. (1990). Love’s Knowledge, N.Y.: Oxford University Press
  • Olson, Jonas & Frans Svensson. (2003). “A Particular Consequentialism: Why Moral Particularism and Consequentialism Need Not Conflict”, Utilitas 15, pp. 194-205
  • O’Neill, Onora. (1996). Towards Justice and Virtue: A Constructive Account of Practical Reasoning, Cambridge: Cambridge University Press
  • Pellegrino, Gianfranco. (2006). “Particularism and Individuation: Disappearing, Not Varying, Features”, Acta Analytica, vol. 21, no. 2, pp. 54-70
  • Raz, Joseph. (2000). “The Truth in Particularism” in Brad Hooker and Margaret Little (eds.), Moral Particularism, Oxford: Oxford University Press, pp. 48-78
  • Richardson, H. (1990). “Specifying Norms as a Way to Resolve Concrete Ethical Problems”, Philosophy and Public Affairs, vol. 19, pp. 279-310
  • Ridge, Michael. (2008) “Moral Non-Naturalism”, The Stanford Encyclopedia of Philosophy, Edward N. Zalta (ed.),
  • Robinson, Luke. (2006). “Moral Holism, Moral Generalism and Moral Dispositionalism”, Mind, vol. 115, pp. 331-360
  • Rosch, E. (1975). “Cognitive Representation of Semantic Categories”, Journal of Experimental Psychology, vol. 104, pp. 192-233
  • Ross, W. D. (2002). The Right and The Good, Oxford: Oxford University Press
  • Shafer-Landau, Russ. (2001). “Knowing Right from Wrong”, Australasian Journal of Philosophy, vol. 79, no. 1, pp. 62-80
  • Sinnott-Armstrong, Walter. (1999). “Some Varieties of Particularism”, Metaphilosophy 30, pp. 1-12
  • Stratton-Lake, P. (ed.) (2002). Ethical Intuitionism: Re-evaluations, Oxford: Oxford University Press
  • Thomas, Alan. (2007). “Practical Reasoning and Normative Relevance”, Journal of Moral Philosophy, vol. 4, no. 1, pp. 77-84
  • Thomson, Judith Jarvis. (1990). The Realm of Rights, Cambridge, Mass.: Harvard University Press
  • Timmons, Mark. (2002). Moral Theory, USA: Rowman & Littlefield Publishers
  • Tsu, Peter Shiu-Hwa. (2008). “Book Note on Principled Ethics: Generalism as a Regulative Ideal”, Australasian Journal of Philosophy, vol. 86 (3), pp. 521-524
  • Tsu, Peter Shiu-Hwa. (2009). “How the Ceteris Paribus Principles of Morality Lie”, Public Reason, vol. 2 (1), pp. 89-94
  • Tsu, Peter Shiu-Hwa. (2010). “Can Morality Be Codified?”, Australian Journal of Professional and Applied Ethics, vol. 11, no. 1 & 2, pp. 145-154
  • Tsu, Peter Shiu-Hwa. (2011a). Defending Moral Particularism, Ph.D. Thesis, Australian National University
  • Tsu, Peter Shiu-Hwa. (2011b). “Defending Particularism from Supervenience/Resultance Attack”, Acta Analytica, vol. 24, no. 4, pp. 387-492
  • Väyrynen, Pekka. (2002). Ruling Passions: A Defense of Moral Generalism, Ph.D. Thesis, Cornell University
  • Väyrynen, Pekka. (2006). “Moral Generalism: Enjoy in Moderation”, Ethics, vol. 116, pp. 704-741
  • Williams, Bernard. (2006). Ethics and The Limits of Philosophy, Oxford: Routledge
  • Winch, Peter. (1972). “The Universalizability of Moral Judgements”, Ethics and Actions, London: Routledge & Kegan and Paul Ltd, pp. 151-170
  • Wittgenstein, Ludwig. (1963). Philosophical Investigations, tr. G. E. M. Anscombe, Oxford: Basil Blackwell & Mott

Author Information

Peter Shiu-Hwa Tsu
Email: u4079238@alumni.anu.edu.au
Taipei Medical University
Republic of China

Ethics and Care-Worker Migration

Cited as one of the most pressing issues of our times, health care worker migration is now occurring at unprecedented rates. In the post-colonial era, as many developing countries began to expand their health services and to educate their citizens to staff them, more educated workers left for wealthier countries, often those of the former colonists or ones that shared the colonists’ language and other cultural ties.

Health worker migration has seen more than one phase, and taken on many forms. The transnational flow, however—especially from low and middle-income countries to wealthier ones—has never been higher. Of particular concern is asymmetrical migration – the main focus of this article – because it is skewing the distribution of the global health workforce and contributing to severe shortages in some parts of the world. There are also concerns about health worker distribution between urban and rural areas, between the public and private health care sectors, and among fields of specialization, but these will not be discussed here. Because many so-called “source countries” have higher burdens of disease and suffer from lower health care worker-to-population ratios than do destination countries, asymmetrical migration is deepening health inequities and creating what many have described as a global crisis in health.

This article begins by identifying a range of factors that contribute to the movement of health care workers around the globe, specifically from low and middle-income countries to affluent ones. From there it explores ethical issues that arise concerning the deepening of global health inequalities; the status and treatment of migrant health workers, along with the implications for their families and communities; and the structure of human health resource planning. It then considers the range of agents who might be said to have responsibilities to address these concerns, and what could be said to ground those responsibilities. Key efforts made to date, as well as ideas for further reform, are also noted.

Although this article refers to both migrants and emigrants, it is primarily emigrants – those who leave one country and take up residence in another – who are the main concern here. Additionally, the focus will be on nurses and care workers, such as nurse aides and home care aides. These health care workers tend to receive less attention than physicians, yet comprise a substantial share of migrant health care labor, in large part to meet the growing demands and expectations for affordable, quality long-term care services in high income countries. Moreover, the loss of nurses and other care workers is especially troubling for they tend to be the backbone of primary care in developing countries. This focus also highlights important issues concerning gender equity.

Table of Contents

  1. Care Workers on the Move: Contributing Factors
    1. Global Economic Policy
    2. Colonial Legacies
    3. Immigration Policies
    4. Health Care Policies
    5. The “Choices” of Privileged Families
    6. Care Regimes
  2. Ethical Issues
    1. Health Inequities in Source Countries
    2. Autonomy and Equity for Migrant Care Workers
    3. Social Reproduction and Political Capacity
  3. Responsibilities
    1. Responsibilities of Destination Countries
    2. Responsibilities of Other Agents
  4. Remedies
    1. Political and Institutional Strategies
    2. Compensation for Source Countries
    3. Retention Efforts in Source Countries
    4. Economic Policy Reform
    5. Health Policy Planning for the Long-term and Shared Governance for Health
    6. Practices of “Privileged Responsibility”
      1. Preventive Foresight
      2. Recognition
      3. Solidarity
  5. Conclusion
  6. References and Further Reading

1. Care Workers on the Move: Contributing Factors

a. Global Economic Policy

Neo-liberal economic policies may be the greatest contributor to the contemporary movement of health care workers from low- and middle-income to high-income countries. International financial institutions, chiefly the World Bank and International Monetary Fund (IMF), have aimed at making these countries more competitive players in the global marketplace. Their main tools, structural adjustment policies, have in many places led to reductions in health sector (and other) employment. Many people have thus been motivated to seek work in richer nations. Some countries, such as the Philippines and India, have also taken to recruiting and training their own citizens specifically for care work overseas (Yeates 2009). Intent on gaining a share of workers’ remittances, these countries have made such recruitment an integral part of their economic development plans.

When asked why they migrate, Philippine-trained nurses, the largest group working abroad, give responses that reflect these background conditions (ILO 2006; Lorenzo, Galvez-Tan, Icamina et al. 2007; Alonso-Garbayo and Maben 2009). They point to high and growing rates of unemployment and feelings of being “underutilized”, even when employed, because their skills often surpass what they can do for patients given the scarcity of available resources. At the same time, many are “overutilized” where staffing is short. These emigrant nurses identify desires for lower nurse-to-patient ratios, better working hours, higher salaries, and better opportunities for professional development and their families’ well-being. Some also describe familial pressure: given economic conditions and policies at home, working abroad has increasingly come to be seen as an expected strategy for family survival, even a woman’s duty (Kelly and D’Addario 2008; Nowak 2009). Certain forms of nationalist rhetoric, which I discuss below, may contribute to the pressures brought to bear.

b. Colonial Legacies

Even though the current rates of migration are unprecedented, in no small part due to global economic policy, nurses and other health care workers in many major source countries have long been primed for emigration. Missionary and military involvement in the Philippines, along with targeted foreign policy strategies, began fueling the mobility of Filipino nurses over a century ago (Choy 2003). As part of a broad effort to serve American colonists and military personnel stationed there with “modern” medicine, the Baptist Foreign Mission Society established the first nursing school in the country – the Iloilo Mission Hospital School of Nursing – in 1906. Many nurses were sent to the US for additional training, sponsored by groups like Rockefeller, Daughters of the American Revolution, and the Catholic Scholarship Fund. By the 1920s, efforts were well underway to export “American professionalism, standardization, and efficiency” in nursing education and practice – lauded as “rational, scientific, and universalistic” – to the Philippines (Brush 1995, 552). Later, in the wake of World War II, in collaboration with the International Council of Nurses, Rockefeller introduced the Exchange Visitor Program to offer experience in American hospitals for nurses trained abroad, including those from the Philippines. By the mid-1960s, most were moving to jobs in US hospitals.

Since these early days, the numbers leaving have grown exponentially (Cheng 2009). The contemporary surge can be traced to the export-oriented, debt-servicing development strategy established by President Ferdinand Marcos. By the late 1990s, a complex, labor-exporting bureaucracy had emerged. Cultivating this care labor and its export have been government-supported educational institutions that educate and train nurses, chiefly for foreign markets in Saudi Arabia, the US, and the UK. With major capital at stake, some governments have been “only too eager to provide this habitat [my emphasis]” for producing care workers, deemed most valuable as export (Tolentino 1996, 53).

As the example of the Philippines demonstrates, colonial histories figure prominently in the contemporary emigration of care workers. These longstanding “interdependent relationships…established over centuries” (Raghuram 2009, 30) contribute to the global flow of health care workers. In addition to those from the Philippines, Indian nurses now constitute one of the largest groups of emigrant health care workers (Kingma 2006; Khadria 2007; Hawkes, Kolenko, Shockness et al. 2009). Their modern-day migration can trace its roots back to the British Empire’s Colonial Nursing Association (Rafferty 2005).

c. Immigration Policies

Immigration policies also figure into the flow of health care labor across borders to the extent that selective immigration, especially for skilled workers in areas with shortages, is a strategy increasingly used as an instrument of industrial policy under globalization (Ahmad 2005). Health and long-term care industry organizations in high-income countries, which regard international recruitment as a way to address shortages, reduce hiring costs, and improve retention, indeed often lobby to ease immigration requirements in order to gain access to nurses and other care workers (Buchan, Parkin, and Sochalski 2003; Pittman, Folsam, Bass, et al. 2007).

In this context, a for-profit international recruitment industry, involved in a range of activities related to recruitment, testing, credentialing, and immigration has emerged and flourished (Connell and Stillwell 2006; Pittman, Folsam, Bass, et al. 2007). Not only has the size of the industry surged, but so too has the number of countries in which recruiters operate. Many of these countries have high burdens of disease and low nurse-to-population ratios.

d. Health Care Policies

The health care policies and planning failures of governments and other agents operating in the health care arena, especially around cost-containment, also play a role (Pond and McPake 2006). The unprecedented vacancies and turnover rates, and growing trend toward early retirement, that characterize nursing and direct care work in the US, for example, are attributable to underinvestment, staffing that is insufficient to support quality patient care, increasing hours, movement between units, centralized decision-making, inadequate opportunity for continuing education and professional development, poor compensation and benefits, and a pervasive sense of disrespect (Aiken, Clarke, Sloan et al. 2002; Berliner and Ginzberg 2002; Allan and Larsen 2003; Institute of Medicine 2003). The worldwide burgeoning need for long-term care, a sector with especially persistent shortages and poor working conditions, alongside what critics cite as the longstanding absence of coherent long-term care policy, is a major driver of care worker migration. A recent report argues that the unprecedented reliance on migrant care workers around the world is a symptom of inadequate long-term care policy (International Organization for Migration 2010, 7).

e. The “Choices” of Privileged Families

Middle class and more privileged families also arguably contribute, albeit unwittingly, to the emigration of care workers from less well-off parts of the world. As Joan Tronto (2006) suggests, the tendency to understand caring in private terms, that is, as a matter involving the needs of their loved ones exclusively, has implications for the use of human health resources. Home care provided by emigrant women, for example, is on the rise in many places for those who can afford private help. Seen by the privileged as “[c]heap and flexible, this model is [embraced] to overcome the structural deficiencies of public family care provision [and often, crucially, workplace policies regarding family leave] and strikes a good balance between the conflicting needs of publicly supporting care of the elderly and controlling public expenditure” in privileged parts of the world (Bettio, Simonazzi, and Villa 2006). Trying to do the best they can for their families, often under constraints, families in privileged countries may not consider the implications for those less well off in other parts of the world.

f. Care Regimes

Finally, countries’ “care regimes” can contribute to the flow of migrant workers.  How governments structure provisions for the care of children, the ill, and the elderly, including support for family caregivers, has implications for the demand for migrant workers.  In countries with strong welfare states and provisions for care of the dependent, there appears to be less demand than in countries with weak welfare states.  What’s more, even in strong welfare states, the structure of the support seems to matter: where support for the dependent and their caregivers is professionalized, or formalized, there is more demand for migrant workers than where schemes are more informal (Michel 2010).

2. Ethical Issues

a. Health Inequities in Source Countries

The hope of many government officials and other policy makers has been that remittances sent back to source countries by those working abroad will stimulate economic growth and development through investment and business opportunities, increase trade and knowledge transfer, and over time, reduce poverty. While remittances channel billions of dollars in money and other goods, there is little agreement on the overall impact of migration on countries that export workers (Page and Plaza 2006; Connell 2010). Most studies do not separate out health care workers specifically. Those that do make distinctions among work type have found variation by particular profession, gender, and family circumstance. Evidence suggests that remittances primarily benefit households, often through poverty reduction. In some cases, migrants remit to invest in community programs and groups, or perhaps businesses. Remittances have also been credited for gains in educational attainment, production increases in sectors like agriculture and manufacturing, and entrepreneurial activity.

Still, there is an overwhelming consensus that when health workers leave, population health erodes. Recent evidence suggests that the adverse effects of losing health workers are not likely compensated by remittances, for remittances do not contribute to the development of health systems or care provision, nor do they compensate for the economic losses of educated workers (Ball 1996; Brown and Connell 2006; OECD 2008; Packer, Runnels and Labonté 2010). Low-income countries’ investment in education and training for health care workers who ultimately leave for other shores, say some critics, reflects a “perverse subsidy” (Mackintosh, Mensah, Henry et al. 2006).

Fifty-seven countries face severe health worker shortages (WHO 2006).  These shortages affect both the quantity and the quality of health services available and provided. They worsen inequalities in infant, child and maternal health, vaccine coverage, response capacity for outbreaks and conflict, and mental health care. They lead to striking patient-to-health care worker ratios (especially in rural and remote areas), hospital, clinic, and program closures, and an increased workload for remaining health care workers. Shortages in health personnel are said to be the most critical constraint in achieving the U.N. Millennium Development Goals and the WHO/UNAIDS 3 by 5 Initiative (Chen, Evans, Anand, et al. 2004; Médecins sans Frontières 2007). There is, of course, wide variation among countries when it comes to the impact of health care worker migration, depending upon such factors as size of the country, its geography and demographic make-up, disease burden, stock of trained workers, and so on. People in source countries, however, may get less care than they once did, and it may be given by someone with less education and training than would be the case but for migration. In addition, people who formerly received care may get none at all. In some source countries, the absence of health care workers has caused a “virtual collapse” of health services (Packer, Labonté, and Runnels, 2009, 214). The gap created is often filled within families, as some governments facing under-resourced health care systems cope by “downloading” the work of caring onto women in individual households (Akintola 2004; Wegelin-Schuringa 2006; Harper, Aboderin, and Ruchieva 2008; Makina 2009). The problem is so grave that the Joint Learning Initiative (2004) concluded that the fate of global health and development in the 21st century lies in ensuring the equitable management of human health resources.

There are many ways of capturing what is at stake ethically for the people in source countries whose health care systems are failing. We might say they are harmed because resources are inequitably distributed (Wibulpolprasert and Pengaibon 2003; Dussault and Franceschini 2006), because the equal moral worth of particular groups of people, and in turn, equal opportunity, is denied (Daniels 2008). We might say, as well, that their welfare interests are threatened. Another view would suggest that global health inequities are morally troubling because deprivations in individuals’ capabilities to function threaten their well-being (Ruger 2006). Following Aristotle’s conception of human flourishing and its contemporary formulation in the “capability approach” of Sen and Nussbaum (Sen 1993), well-being on Ruger’s account is defined as having capabilities to achieve a range of beings and doings, or the freedom to be what we want to be and to do what we want to do.

Although the capabilities approach has not specifically been brought to bear on the problem of health worker migration, other rights-based arguments have played a prominent role. A number of commentators launch critiques of the depletion of source countries’ human health resources based on a human right to health (Gostin 2008; Shah 2010; O’Brien and Gostin 2011), a right ensconced in a number of international declarations including the United Nations Charter, the UN Covenant on Economic, Social and Cultural Rights (1966), and the WHO Constitution (1948). The necessity of the health workforce to the health system, and in turn, to realizing this asserted right, is recognized in the WHO’s Code of Practice on the International Recruitment of Health Personnel (2010b).

The harm, finally, might be explained in terms of structural injustice. Structural injustice

exists when social processes [that is, social norms and economic structures, institutional rules, incentive structures, and sanctions, decision-making processes, etc] put large categories of persons under a systematic threat of domination or deprivation of the means to develop and exercise their capacities, at the same time as these processes enable others to dominate or have a wider range of opportunities for developing and exercising their capacities.

The ethical concern is not merely that structures constrain. “Rather the injustice consists in the way they constrain and enable,” serving systematically to expand opportunities for the privileged while contracting them for the less well-off (Young 2006, 114).

From a plurality of ethical perspectives, then, the asymmetrical migration of care workers is deeply problematic.

b. Autonomy and Equity for Migrant Care Workers

The ethical, social, political, and economic implications for migrant care workers are also significant. Here I explore them with an emphasis on autonomy and equity. If we understand autonomy to mean something like being relatively free to choose one’s actions and course in life from a decent set of options, and equity to mean something like the absence of avoidable and unfair inequalities, the picture that emerges is complex and yields no simple conclusions.

Threats to autonomy and equity for migrant care workers come from several sources. As part of the feminization of international migration, women seeking employment in more affluent countries as maids, nannies, nurses, and other care workers have become an especially integral part of the global economy (Ehrenreich and Hochschild 2002; Van Eyck 2005; Kingma 2006; Dumont, Martin, and Spielvogel 2007). Yet, to the extent that the global migration of nurses and other care workers is fueled by “the ideological construction of jobs and tasks in terms of notions of appropriate femininity” and in terms of racial and cultural stereotypes, it raises concerns. The construction of Filipinas, for instance, as caring, obedient, meticulous workers, “sacrificing heroines” (Schwenken 2008), or of Indian and Caribbean women as naturally warm-hearted and joyful, serves the aims of governments, industry organizations and employers, recruiters, and even family caregivers in the North. Yet it potentially perpetuates stereotypes and constrains the imaginations, opportunities, and choices of women and girls.

Emigrants from low and middle-income countries are situated amidst different genres of nationalist rhetoric that support neoliberal economic policies. One form is organized around specific conceptions of national community, compelling labor migrants to organize their conduct around what is beneficial to states’ economies. Indeed, “the brain drain migrant is a particular subjectivity, forged by the needs of late capitalism.” Her subjectivity becomes “constituted as an assemblage of morality and economic rationality,” within which she “acts in socially appropriate ways not because of force or coercion but [allegedly] because [her] choices align with . . . ‘community interests’” (Ilcan, Oliver, and O’Connor 2007, 80). Another variety of nationalist rhetoric emphasizes “the active citizen,” a “reconfigured political identity . . . whose aim is to maximize . . . quality of life . . . by being [an] active agent in the market” (Schild 2007, 181). These rhetorical strategies operate with a caring face, suggesting that labor emigrants, especially women, will enjoy expanded opportunities for choice and prospects for equality. Yet to the extent that they “encourage[e] and cultivat[e] . . . forms of subjectivity that are congruent with capitalism in its latest phase” (199), enforce expectations for individual responsibility for familial well-being and constrain the set of options available, they may thwart autonomy for many emigrants.

Although married women with children who migrate are encouraged, even pressed—by governments, by family members, or by both given conditions at home—to provide from abroad for their families and countries, and are celebrated as “modern heroes”, they are often blamed for such social ills as divorce, poor school performance by children, and teen pregnancy in their home countries (Parreñas 2005). Gender norms, then, persist and are manipulated, further contributing to the erosion of autonomy and equity.

Most if not all migrant laborers, including those engaged in care work, are subject to “flexibilization,” a “process of self-constitution that correlates with, arises from, and resembles a mode of social organization” (Fraser 2009, 129). Its central features are fluidity, provisionality, and a temporal horizon of no long-term. Transnational economic and other structures compel care workers to mobilize, for example, when most say they would rather work at home. Movement to care labor markets in the North may involve taking jobs below the education and skill level of care workers, a practice known as “down-skilling”. There is also the rapid expansion of the informal or “grey” economy, and the tendency under neo-liberal economic policies to define more and more jobs as temporary and unskilled. Inequities may persist under such schemes and choices may be constrained.

Furthermore, although countries often incentivize immigration for some workers, including some categories of care workers, questions of immigration and citizenship are contested.  Countries “differentially incorporate” migrants when it comes to immigration and citizenship status (Parreñas 2000; Kofman and Raghuram 2006; Carens 2008). Care workers, especially the “unskilled”, often lack citizenship in the countries where they are employed. They therefore have a limited set of political rights and labor protections (Ball and Piper 2002; Ball 2004; Stasiulis and Bakan 2005; Dauvergne 2009; Bosniak 2009). Access to health and social services may also be diminished or lacking altogether (Meghani and Eckenwiler 2009; Deeb-Sosa and Mendez 2008).

More generally, like other migrants who describe feelings of dislocation, some emigrant nurses describe the experience of “having a foot here, a foot there, and a foot nowhere” (DiCicco-Bloom 2004, 28) and lacking a sense of belonging (Van Eyck 2004; Sørenson 2005; Hausner 2011).  In addition to potential political disenfranchisement and exclusion from labor protections, many live in transnational families and engage in transnational care practices, adjusting to “extended family relations and obligations across space and time” (Baldock 2000, 221; Parreñas 2005). To the extent that selves are relational, that is, our identities shaped by familial relationships and engagement in the communities and places from which we come (Mackenzie and Stoljar 2000), migration leads not just to a geographic rupture but a “self-rupture” in many instances (Kittay unpublished). Indeed, these care workers experience a sort of “bi-placement” of identity, enduring the harm of “never feeling oneself as fully here” (Kittay unpublished; Hondagneu-Sotelo 1997).

Moreover, relations with family members and with their social and political communities may be transformed (Parreñas 2000; Espiritu 2005). They may be improved from the perspective of migrant care workers, or they may be eroded or fractured. Indeed, these moral harms faced by individuals can at the same time threaten the relationships themselves; they may lead others to “reinterpret our social or moral standing . . . [and] compromise the…bonds we have with them” (Miller 2009, 513).

Research finds that migrant care workers can reap significant benefits. There can be important gains for women in areas like self-trust and confidence, household decision-making and expenditures, as well as in spatial mobility and freedom from restrictive gender norms (McKay 2004; Pessar 2005; Percot 2006). They may advance their “migration project”, that is, achieve goals they have (albeit under constrained conditions) set for themselves, whether this means contributing to the well-being of their families and themselves at home, or ultimately gaining traction and stability in destination countries. Depending upon a range of factors, many care workers may well be vulnerable, yet become more autonomous and gain greater opportunity.

In sum, to the extent that migrants’ decisions to migrate come about because their own countries lack the capacities to support them and their families, and even facilitate their departures, and to the extent that they live and work under conditions that do not just alter but distort their identities, threaten their self-respect, relationships, and political engagement, and perpetuate their status as lower-paid workers with few options in the global economy, their overall prospects for enhanced autonomy and equity are highly unclear (Abraham 2004; Connell and Voigt-Graf 2006; Espiritu 2006; Barber 2009; Nowak 2009).

c. Social Reproduction and Political Capacity

As global capital uses particular places for particular forms of reproduction, and care workers are cultivated, extracted, and exported for the global marketplace – their labor revalued, repackaged, and relocated – not only do health care systems and the ill and dependent suffer, but family and community life, even political capacities, face erosion. Given that the care done within families generates public goods, like citizens, and contributes to their development and duration over the course of life, when a country exports care labor, that is, the women who provide it, it exports capacities for social reproduction (Truong 1996; Parreñas 2000). Most troubling of all, to the extent that those with more resources have greater capacities to care – now by importing it – and those who receive more care are further produced and sustained to become more capable citizens, the outflow of caregivers may generate profound additional global inequalities in social and political capacity.

3. Responsibilities

Who is responsible for addressing the harms suffered by emigrant care workers, their families and communities, and populations in source countries confronting high disease burdens and worker shortages? The array of agents involved, and therefore candidates for having responsibilities, includes governments in destination as well as source countries, international lending bodies, transnational health care corporations and recruitment agencies, and those who employ emigrant care workers. I begin by considering views on the matter of whether destination countries have obligations to source countries suffering from health inequities, and from there expand to explore what if any responsibilities might be said to exist for these other agents to source countries as well as emigrants.

a. Responsibilities of Destination Countries

A variety of arguments hold that destination countries are not obligated to address health inequities in source countries. One line of thought, which assumes that there is something ethically essential about the nation-state, is that although we might have some obligations to the world’s poor, we have duties to prioritize our compatriots (Miller 2004). Another line of argument holds that we have duties not to interfere in the affairs of other societies (Rawls 1999). A third view emphasizes distance rather than shared state membership; our moral intuitions, according to this account, suggest that our strongest obligations are to those nearby (Kamm 2004). Finally, libertarian theorists suggest that we have no obligations save for situations where we have caused harm, which on their account comes in the form of unfairly acquiring goods or resources (Nozick 1974).

Others, however, maintain that there are at least negative duties and possibly positive ones to address health inequities and other problems generated or worsened by the asymmetrical migration of health care workers. Some argue that our shared human dignity is sufficient to ground responsibilities for global health equity. We have duties, in other words, to prevent suffering when we have the capacity to do so on the basis of our equal moral worth as persons (Singer 1972). Countering the argument that we ought not to interfere in the affairs of other societies, a republican approach holds that freedom should be conceived in terms of non-domination rather than non-interference, and this may generate responsibilities of justice that cross borders (Pettit 1997). Luck egalitarian accounts maintain that people should not suffer worse opportunities based on where they come from, so justice demands assistance (Caney 2005).

Arguing from various relational conceptions of justice, some theorists point to the dense relations of interdependence that connect people transnationally to justify principles of justice that transcend the boundaries of states. There are several ways to think about these relations, which are more particular than our shared humanity. According to Onora O’Neill (2000), an agent’s moral obligation encompasses all those people whom his or her activities depend upon, and so, is often global in scope. Thomas Pogge maintains that by “shaping and enforcing the social conditions that foreseeably and avoidably cause the monumental suffering of global poverty, we are harming the global poor …” (2005, 33). According to this view, our relationship is a matter of being “materially involved” in or “substantially contributing to” upholding the institutions responsible for injustice (2004, 137). Iris Marion Young proposes a third relational approach: a “social connection model of responsibility”. Here “[o]bligations of justice arise between [agents] by virtue of the social [often transnational] processes that connect them” (Young 2006, 102). Relational conceptions of justice, thus, implicate not only destination countries, but also the other agents who contribute through their policies and practices to the global, asymmetrical flow of care workers and the inequities this exacerbates. They also highlight the limits of libertarian and non-interventionist accounts.

The complexity of the processes and relations involved in generating injustice presents challenges when it comes to the work of attributing and assigning responsibilities. Given the way that structures and processes operate, responsibility is diffused, or dispersed. It can therefore be difficult if not impossible to identify a particular perpetrator (individual, institutional, or corporate) to whom particular harms might be traced directly. As Young explains, while “structural processes that produce injustice result from the actions of many persons and the policies of many organizations, in most cases it is not possible to trace which specific actions of which specific agents cause which specific parts of the structural processes or their outcomes” (Young 2006, 115). At the same time, adverse effects are not necessarily intended.  Indeed, structural injustice often occurs as a result of our (individual and institutional) choices and actions as we try to advance our own interests “within given institutional rules and accepted norms” (114).

There is one further line of argument that might be brought to bear here. Responsibility can be motivated by prudential arguments, specifically those that acknowledge global interdependence. In other words, motivated by the idea that negative and positive externalities flow over national boundaries that are increasingly porous, the prudential approach justifies state-based action with transnational implications by affirming our interdependence along with the existence of ‘global public goods’. The idea is “to let international cooperation start ‘at home’, with national policies meant, at a minimum, to reduce or avoid altogether negative cross-border spillovers – and preferably to go beyond that to generate positive externalities in the interest of all” (Kaul, Grunberg and Stern 1999). Even, then, if agents are not motivated by moral reasons, prudence may generate responsible action and policy on the part of destination countries for the sake of source countries (Eckenwiler, Chung, and Straehle 2012).

b. Responsibilities of Other Agents

States are certainly not the sole actors on this moral landscape. Typically in discussions of global justice, including discussions of global health equity, states are assumed to have primary responsibility; that is, the “most direct and prior obligations” (Ruger 2006, 1001). Yet, states’ limitations are many. Some states do not have the capability to ensure justice. Others may lack the desire. And some states with the desire to be just are rendered more porous by the activities of other agents—what Onora O’Neill calls “networking institutions” (here, for instance: international lenders, health care corporations and recruiters) operating and exercising power within their borders (O’Neill 2000, 182-185; 2004, 246-47). Conceptions of justice, therefore, must reckon with such agents if they are to have any traction under globalization. As noted earlier, relational approaches do, and in turn, assign responsibilities according to such parameters as their scope of power, resources, and degree of contribution to injustice (O’Neill 2000; Young 2006).

As for emigrant and other migrant care workers, the responsibilities of destination countries and those of networking institutions might include reforms in areas such as: recruitment policies and practices; immigration policy; and compensation and working conditions. I discuss these further in the final section.

4. Remedies

An array of interventions and ideas has emerged to try to address at least some of the concerns raised here (Stilwell, Diallo, Zurn et al. 2004; Mensah, Mackintosh and Henry 2005; Packer, Labonté, and Spitzer 2007). They aim variously at responding to health inequities, managing the global health workforce, and improving workers’ status and treatment. In the discussion below I focus on efforts underway along with ideas for further reform.

a. Political and Institutional Strategies

Perhaps the most important mechanisms embraced by governments to date are legally binding bilateral or multilateral agreements between source countries or regions and destination countries. These provide for supplies of health professionals from particular countries (specifically, those not suffering from low care worker-to-population ratios and/or high disease burdens) for specified lengths of time to address particular skill shortages. The UK and South Africa, for example, have a bilateral agreement, and some countries within the European Union have multilateral agreements. Recent evidence suggests that state policies can indeed have an impact on health worker migration (Bach 2010). In a related vein, some have called for global resource sharing, including staff sharing programs (Mackey and Liang 2012).

Codes of practice and position statements aimed at protecting the health systems of source countries and ensuring ethical treatment for migrant health workers have also emerged. They emphasize not recruiting from countries with severe shortages and high disease burdens, refraining from unethical recruitment practices, such as deception and misrepresentation, and treating migrant care workers with respect and providing them with labor protections. There are roughly twenty such documents, developed by individual governments, including the United Kingdom (UK Department of Health 2004); associations of governments, such as the Commonwealth countries (Commonwealth Health Ministers 2003) and the Pacific Island countries (Ministers of Health for Pacific Island Countries 2007); and intergovernmental bodies and health professional associations, such as the World Health Organization (2010), the World Federation of Public Health Associations (2005), Academy Health (2008) and the American Public Health Association (Hagopian and Friedman 2006).

While such instruments draw attention to important issues and can raise the level of discourse, a major limitation is that they do not address the root causes of migration. They are also voluntary, and their impact is difficult to monitor (Buchan, McPake, Mensah, and Rae 2009). Moreover, the structure and financing of a country’s health system significantly affect a code’s reach. In the UK, for example, which has one major public sector employer and one point of entry for health professionals, a code may have more of an impact than in countries with a wide array of independent private sector health care employers (Buchan, Parkin, and Sochalski 2003; Buchan, McPake, Mensah, and Rae 2009).

b. Compensation for Source Countries

Motivated by compensatory justice, various compensation schemes have been suggested (Agwu and Llewelyn 2009). One model calls for governments of host or destination countries to pay source countries for the investments made in educating health professionals. Alternatively, destination countries could offer funding or other forms of investment for the purpose of capacity building and strengthening health systems in source countries (Mackintosh, Mensah, Henry et al. 2006).

c. Retention Efforts in Source Countries

Perhaps one of the most difficult questions concerns whether health workers have obligations to serve the countries that have provided their education and training (leaving aside countries where export is embedded in economic policy) (Cole 2010; Raustol 2010). Closely tied to this question is the extent to which, if at all, coercion (and what this means precisely) is justified in countries’ policies around health worker migration, particularly retention policies (Eyal and Hurst 2010). Proposals that raise such concerns range from financial and other incentives, where possible, to compulsory service requirements of some specified duration and taxes on migrants (Packer, Labonté, and Spitzer 2007; Masango, Gathu, and Sibandze 2008; Bärnighausen and Bloom 2009). Another idea aimed at retaining health care workers is to organize medical school and other health worker curricula in low- and middle-income countries around “locally relevant” needs and capacities (Eyal and Hurst 2008). Meanwhile, many source countries are trying to expand the training of community and other auxiliary health care workers to fill gaps (Global Health Workforce Alliance 2008).

d. Economic Policy Reform

Global economic policy reform that addresses the constraints in source countries that drive care workers to migrate is often the first item on health advocates’ list of reforms. This would mean an end to structural adjustment policies and investment in health system infrastructure sufficient to employ adequate numbers and varieties of health care workers, to compensate and treat them fairly, and to serve the public equitably.

e. Health Policy Planning for the Long-term and Shared Governance for Health

At the country level, specifically in destination countries, long-range planning is clearly essential. When it comes to public health, notes one commentator, “the future is nowhere in sight” (Graham 2010). Indeed,

[In] spite of its importance, workforce planning and management have traditionally been viewed as low priority by many countries… supply driven with limited attention afforded to population health needs, service demand factors and social, political, geographical, technological and economic factors….[and] carried out in…silos rather than integrated across the various health disciplines/occupations (International Council of Nurses 2006, 10).

Policy makers in such places indisputably have obligations to address these issues for the sake of their own populations. And as has been shown here, many conceptions of justice maintain that they also owe it to the poorer nations upon whom they have come to rely.

Some might argue for self-sufficiency in human health resource planning (Mullan, Frehywot and Jolley 2008). This goal seems morally praiseworthy in its quest to avoid unjust relationships between richer and poorer countries. Yet skepticism about whether it can be realized under conditions of globalization suggests to many that “shared health governance” ought to become the new model (Ruger 2006, 1001). The precise meaning of this notion is a matter of vigorous debate; in the most general terms, however, it involves governments and global institutions, along with non-governmental organizations, businesses, and foundations, engaged collaboratively in decision-making processes aimed at meeting the specific needs and addressing the constraints particular countries face. Advocates for shared governance over human health resources point out that some form of it is already reflected in a number of areas, including access to essential medicines and vaccines, and tobacco control (O’Brien and Gostin 2011).

f. Practices of “Privileged Responsibility”

The discussion so far has emphasized states and “networking institutions”, and political-legal-institutional reforms. Some contemporary moral and political philosophers, however, would maintain that there is a tendency to underestimate the potential of personal interactions and practices among individuals (Walker 1998; Kurasawa 2007). According to Fuyuki Kurasawa, for example, a “formalist bias” that understands global justice as emerging principally through prescriptive or legislative means overlooks “the social labour and modes of practice that supply the ethical and political soil within which the norms, institutions and procedures of global justice are rooted” (Kurasawa 2007, 6). He identifies five practices: bearing witness, forgiveness, giving aid, solidarity, and foresight. Such practices may accompany or in some cases facilitate political, legal, economic, and institutional reform.

Here I consider practices that concern the ethical stance people in destination countries – especially but not exclusively those who employ them – owe to these emigrants. There are international agreements that call for elements of what I note below, such as the International Convention on the Protection of the Rights of All Migrant Workers and Members of Their Families. Drawing upon Tronto’s (2006) concept of “privileged responsibility”, I describe three practices: preventive foresight and solidarity (following Kurasawa), and recognition.

i. Preventive Foresight

Tronto’s argument is that the tendency among middle class and more affluent families to understand caring (particularly for children, the ill, and the dependent) in private terms, that is, as a matter involving the needs of their loved ones exclusively, can lead to moral hazards including social harm. “In a competitive society,” she observes, “what it means to care well for one’s own [family] is to make sure that they have a competitive edge against other [families]” (2006, 10). Ultimately, those acting with what Tronto describes as “privileged irresponsibility” can “ignore the ways in which their own caring activities continue to perpetuate inequality” (13).

To respond to this and other myopias, individuals and families with resources in wealthy countries might practice “preventive foresight” (Kurasawa 2007) and plan ahead for their own care needs and think critically about their anticipated use of resources. They might ask themselves, for example: To what extent do my/our “expectations” or actual needs have implications for others in need of care? Other families? Other communities? Could I/we plan and act in such a way that might avoid or lessen participation in the perpetuation of injustice?

ii. Recognition

Another way of taking seriously the moral status and significance of migrant care workers, their work, and the implications of their mobility might be through the practice of recognition. Recognition has come to be understood in at least two senses: recognition of each individual’s unique identity as an autonomous person, and recognition of persons as belonging to particular communities or groups. It also includes a third dimension, the recognition of others’ needs for relationships, both interpersonal and associative (Gould 2007, 250). Recognizing emigrant care workers in this way might enrich appreciation for the fact that many live in transnational families. Additionally, recognizing interdependence could help counter ignorance concerning health inequities in source countries. A fourth dimension might be added: recognition of the conditions – the places – in which care workers provide care, including conditions in the places where their numbers are few, the disease burden is high, and available resources are scant.

iii. Solidarity

Solidarity, a concept increasingly examined in both health ethics and global justice, may have relevance here. Contemporary thinking on solidarity suggests that it involves reaching beyond the scope of one’s community to cultivate ties with others who have suffered injustice across distance and amid asymmetries, and standing together in advancing justice and resisting injustice (Gould 2007; Lenard 2010). Relations of solidarity might be forged, for moral or prudential reasons, between governments, employers and employees, people with diminished access to health care, migrant workers, and family caregivers and care workers (Eckenwiler, Chung, and Straehle 2012).

5. Conclusion

Health worker migration is not the main cause of health inequities; yet it contributes to them and may generate others. Thus, the conditions that facilitate migration (above all, asymmetrical migration), and the moral agents who participate in facilitating it, call for moral scrutiny. Additionally, the treatment of emigrant care workers warrants ethical investigation. Policies and practices that undermine values such as autonomy and equity should be seen as morally suspect. Finally, the state-level and global management of human health resources demands urgent attention. The present structure, characterized by a “perverse subsidy”, ongoing underinvestment in health care (often where it is most needed), even in destination countries, and narrowly nationalist thinking, threatens justice for populations everywhere, especially the poor.

6. References and Further Reading

  • Abraham, Binumal. 2004. Women nurses and the notion of their “empowerment.” Discussion paper no. 88. Kerala Research Programme on Local Level Development. Thiruvananthapuram: Centre for Development Studies.
  • Academy Health. 2008. Voluntary Code of Ethical Conduct for the Recruitment of Foreign-Educated Nurses to the United States. Washington, DC: Academy Health.
  • Agwu, Kenechukwu and Llewelyn, Megan, on behalf of undergraduates in International Health at UCL. 2009. Compensation for the brain drain from developing countries. Lancet 373 (May 16): 1665-1666.
  • Ahmad, Omar B. 2005. Managing medical migration from poor countries. British Medical Journal 331: 43-45.
  • Aiken, Linda, Clarke, Sean, Sloane, Douglas, Sochalski, Julie, and Silber, Jeffrey. 2002. Hospital nurse staffing and patient mortality, nurse burnout, and job dissatisfaction. JAMA 288: 1987-93.
  • Akintola, Olagoke. 2004. A Gendered Analysis of the Burden of Care on Family and Volunteer Caregivers in Uganda and South Africa. Durban: HEARD, University of KwaZulu-Natal.
  • Allan, Helen, and Larsen, John A. 2003. “We Need Respect”: Experiences of Internationally Recruited Nurses in the UK. London: Royal College of Nursing.
  • Alonso-Garbayo, Alvaro, and Maben, Jill. 2009. Internationally recruited nurses from India and the Philippines in the United Kingdom: The decision to emigrate. Human Resources for Health 7 (37): 1-11.
  • Anderson, Barbara and Isaacs, A. 2007. Simply not there: The impact of international migration of nurses and midwives—perspectives from Guyana. Journal of Midwifery and Women’s Health 52: 392-7.
  • Bach, Stephen. 2010. Managed migration? Nurse recruitment and the consequences of state policy. Industrial Relations Journal 41 (3): 249-6.
  • Baldock, Cora V. 2000. Migrants and their parents: Caregiving from a distance. Journal of Family Issues 21: 205-24.
  • Ball, Rochelle. 2004. Divergent development, racialised rights: Globalised labour markets and the trade of nurses—the case of the Philippines. Women’s Studies International Forum 27: 119-133.
  • Ball, Rochelle. 1996. Nation building or dissolution: The globalization of nursing—the case of the Philippines. Pilipinas 27: 67-92.
  • Ball, Rochelle and Piper, Nicola. 2002. Globalisation and regulation of citizenship: Filipino migrant workers in Japan. Political Geography 21: 1013-34.
  • Barber, Pauline G. 2009. Border contradictions and the reproduction of gender and class inequalities in Philippine global migration. Paper presented at International Studies Association, New York, New York, 17 February.
  • Berliner, Howard and Ginzberg, Eli. 2002. Why this hospital nursing shortage is different. JAMA 288: 2742-2744.
  • Bettio, Francesca, Simonazzi, Annamaria, and Villa, Paola. 2006. Changes in care regimes and female migration: The “care drain” in the Mediterranean. Journal of European Social Policy 16 (3): 271-85.
  • Bärnighausen, Till and Bloom, David. 2009. Financial incentives for return of service in underserved areas: A systematic review. BMC Health Services Research 9 (86): 1-17.
  • Bosniak, Linda. 2009. Citizenship, non-citizenship, and the transnationalization of domestic work. In Seyla Benhabib and Judith Resnik, eds. Migrations and Mobilities: Citizenship, Borders, and Gender. New York: New York University Press.
  • Brown, Richard and Connell, John. 2006. Occupation specific analysis of migration and remittance behaviour: Pacific Island nurses in Australia and New Zealand. Asia-Pacific Viewpoint 47: 133-48.
  • Brush, Barbara L. 1995. The Rockefeller agenda for American/Philippines nursing relations. Western Journal of Nursing Research 17 (5): 540-55.
  • Buchan, James, McPake, Barbara, Mensah, Kwado, and Rae, George. 2009. Does a code make a difference? Assessing the English code of practice on international recruitment. Human Resources for Health 7 (33): 1-8.
  • Buchan, James, Parkin, Tina, and Sochalski, Julie. 2003. International Nurse Mobility: Trends and Policy Implications. Geneva: WHO.
  • Caney, Simon. 2005. Justice Beyond Borders: A Global Political Theory. Oxford: Oxford University Press.
  • Carens, Joseph. 2008. Live-in domestics, seasonal workers, and others hard to locate on the map of democracy. Journal of Political Philosophy 16: 371-96.
  • Chatterjee, Patralekha. 2011. Progress patchy on health worker crisis.  Lancet 377: 456.
  • Chen, Lincoln, Evans, Timothy, Anand, Sudhir, Boufford, Jo Ivey, Brown, Hilary, Chowdhury, Mushtaque, Cueto, Marcus, Dare, Lola, Dussault, Gilles, Elzinga, Gijs, Fee, Elizabeth, Habte, Demissie, Hanvoravongchai, Piya, Jacobs, Marian, Kurowski, Christophe, Michael, Sarah, Pablos-Mendez, Ariel, Sewankambo, Nelson, Solimano, Giorgio, Stilwell, Barbara, de Waal, Alex, and Wibulpolprasert, Suwit. 2004. Human resources for health: Overcoming the crisis. Lancet 364: 1984-90.
  • Cheng, M. 2009. The Philippines’ health worker exodus. Lancet 373: 111-12.
  • Choy, Catherine C. 2003. Empire of Care: Nursing and Migration in Filipino American History. Durham, NC: Duke University Press.
  • Cole, Phillip. 2010. The right to leave vs. the duty to remain: Health care workers and the ‘brain drain’. In Rebecca S. Shah, ed. The International Migration of Health Workers: Ethics, Rights and Justice. Basingstoke, Hampshire: Palgrave Macmillan, 118-129.
  • Commonwealth Health Ministers. 2003. Commonwealth Code of Practice for the International Recruitment of Health Workers. Adopted at the Pre-WHA Meeting of Commonwealth Health Minsters. Geneva, May 18.
  • Connell, John. 2010. Migration and the Globalization of Health Care: The Health Worker Exodus? Cheltenham, UK: Edward Elgar.
  • Connell, John, and Stilwell, Barbara. 2006. Recruiting agencies in the global health care chain. In Christiane Kuptsch, ed. Merchants of Labour. Geneva: ILO, 239-253.
  • Connell, John and Voigt-Graf, Carmen. 2006. Towards autonomy? Gendered migration in Pacific Island states. In Wallner, Margo, and Bedford, Richard, eds. Migration Happens: Reasons, Effects, Opportunities of Migration in the South Pacific. Vienna: Lit Verlag, 43-62.
  • Daniels, Norman. 2008. Just Health: Meeting Health Needs Fairly. New York: Cambridge University Press.
  • Dauvergne, Catherine. 2009. Globalizing fragmentation: New pressures on women caught in the immigration law-citizenship law dichotomy. In Seyla Benhabib and Judith Resnik, eds. Migrations and Mobilities: Citizenship, Borders, and Gender. New York: New York University Press, 333-355.
  • Deeb-Sossa, Natalia, and Mendez, Jennifer B. 2008. Enforcing borders in the Nuevo South: Gender and migration in Williamsburg, Virginia and the Research Triangle, North Carolina. Gender and Society 22 (October): 613-38.
  • DiCicco-Bloom, Barbara. 2004. The racial and gendered experiences of immigrant nurses from Kerala, India. Journal of Transcultural Nursing 15: 26-33.
  • Dumont, Jean-Christophe, Martin, John P., and Spielvogel, Gilles. 2007. Women on the move: The neglected gender dimension of the brain drain. IZA Discussion paper no. 2920. Bonn.
  • Dumont, Jean-Christophe and Zurn, Pascale. 2007. Immigrant health workers in OECD countries in the broader context of highly skilled migration. In International Migration Outlook. Paris: OECD.
  • Dussault, Gilles and Franceschini, Maria C. 2006. Not enough there, too many here: Understanding the geographical imbalances in the distribution of the health workforce. Human Resources for Health 4 (12): 1-16.
  • Eckenwiler, Lisa, Chung, Ryoa and Straehle, Christine. 2012. Global solidarity, migration, and global health inequities. Bioethics 26 (7): 382-390.
  • Ehrenreich, Barbara, and Hochschild, Arlie R., eds. 2002. Global Woman: Nannies, Maids, and Sex Workers in the New Economy. New York: Henry Holt.
  • Espiritu, Yen L. 2005. Gender, migration and work: Filipina health care professionals to the United States. Revue Européenne des Migrations Internationales 21: 55-75.
  • Eyal, Nir, and Hurst, Samia. 2008. Brain drain: Is nothing to be done? Journal of Public Health Ethics 1 (2): 180-92.
  • Eyal, Nir, and Hurst, Samia. 2010. Coercion in the fight against medical brain drain. In Rebecca S. Shah, ed. The International Migration of Health Workers: Ethics, Rights and Justice. Basingstoke, Hampshire: Palgrave Macmillan, 137-158.
  • Fraser, Nancy. 2009. Scales of Justice: Reimagining Political Space in a Globalizing World. New York: Columbia University Press.
  • Global Health Workforce Alliance. 2008. Scaling Up, Saving Lives. Geneva: WHO.
  • Gostin, Larry. 2008. The international migration and recruitment of nurses: Human rights and global justice. JAMA 299 (15): 1827-29.
  • Gould, Carol. 2007. Transnational solidarities. Journal of Social Philosophy 38 (1): 148-64.
  • Graham, Hilary. 2010. Where is the future in public health? Milbank Quarterly 88 (2): 149-68.
  • Hagopian, Amy and Friedman, E. 2006. Ethical Restrictions on International Recruitment of Health Professionals to the US. Washington, DC: APHA.
  • Harper, Sarah, Aboderin, Isabella, and Ruchieva, Iva. 2008. The impact of outmigration on informal family care in Nigeria and Bulgaria. In John Connell, ed., The International Migration of Health Workers. New York: Routledge, 163-71.
  • Hausner, Sondra L. 2011. Nepali nurses in Great Britain: The paradox of professional belonging. Centre on Migration, Policy and Society Working paper no. 90.  Oxford: COMPAS.
  • Hawkes, Michael, Kolenko, Mary, Shockness, Michelle, and Diwaker, Krishna. 2009. Nursing brain drain from India. Human Resources for Health 7 (5): 1-2.
  • Hondagneu-Sotelo, Pierrette and Avila, Ernestine. 1997. “I’m here, but I’m not here”: The meanings of Latina transnational motherhood. Gender and Society 11 (5): 548-571.
  • Ilcan, Suzanne, Oliver, Marcia, and O’Connor, Daniel. 2007. Spaces of governance: Gender and public sector restructuring in Canada. Gender, Place, and Culture: A Journal of Feminist Geography 14 (1): 75-92.
  • Institute of Medicine. 2003. Keeping Patients Safe: Transforming the Work Environment of Nurses. Washington, DC: National Academies Press.
  • International Council of Nurses. 2006. The Global Nursing Shortage: Priority Areas for Intervention. Geneva: ICN.
  • International Labour Office. 2006. Migration of Health Workers: Country Case Study Philippines. Geneva: ILO.
  • International Organization for Migration (IOM). 2010. The Role of Migrant Care Workers in Ageing Societies: Report on Research Findings in the United Kingdom, Ireland, and the United States. Geneva: IOM.
  • Joint Learning Initiative. 2004. Human Resources for Health: Overcoming the Crisis. Cambridge: Harvard University Press.
  • Kamm, F.M. 2004. The new problem of distance in morality. In Dean Chatterjee, ed. Morality and the Distant Needy. Cambridge: Cambridge University Press, 59-74.
  • Kaul, Inge, Grunberg, Isabelle and Stern, Marc. 1999. Global Public Goods: International Cooperation in the 21st Century. Oxford: Oxford University Press.
  • Kelly, Philip and D’Addario, Sylvia. 2008. “Filipinos are very strongly into medical stuff”: Labour market segmentation in Toronto, Canada. In J. Connell, ed., The International Migration of Health Workers. New York: Routledge, 77-98.
  • Khadria, Binod. 2007. International nurse recruitment in India. Health Services Research 42 (3): 1429-36.
  • Kingma, Mirielle. 2006. Nurses on the Move: Migration and the Global Health Care Economy. Ithaca: Cornell University Press.
  • Kittay, Eva Feder. The body as the place of care. Unpublished manuscript.
  • Kofman, Eleanor and Raghuram, Pavarti. 2006. Gender and global labour migrations: Incorporating skilled workers. Antipode 38 (2): 282-303.
  • Kurasawa, Fuyuki. 2007. The Work of Global Justice: Human Rights as Practices. Cambridge: Cambridge University Press.
  • Lenard, Patti. 2010. What is solidaristic about global solidarity? Contemporary Political Theory 9 (1): 100-110.
  • Lesser, Anthony. 2010. The right to the free movement of labour. In Rebecca S. Shah, ed. The International Migration of Health Workers: Ethics, Rights and Justice. Basingstoke, Hampshire: Palgrave Macmillan, 130-136.
  • Lorenzo, Fely M.E., Galvez-Tan, Jaime, Icamina, Kriselle, and Javier, Lara. 2007. Nurse migration from a source country perspective: Philippine country case study. Health Services Research 42 (3 Pt 2): 1406-18.
  • Lynch, Sharonann, Lethola, Pheello, and Ford, Nathan. 2008. International nurse migration and HIV/AIDS, Reply. JAMA 300 (9): 1024.
  • Mackenzie, Catriona and Stoljar, Natalie, eds. 2000. Relational Autonomy: Feminist Perspectives on Autonomy, Agency, and the Social Self. New York: Oxford University Press.
  • Mackey, Timothy and Liang, Bryan. 2012. Rebalancing brain drain: Exploring resource re-allocation to address health worker migration and promote global health. Health Policy 107: 66-73.
  • Mackintosh, Maureen, Mensah, Kwado, Henry, Leroi, and Rowson, Michael. 2006. Aid, restitution, and fiscal redistribution in health care: Implications of health professionals’ migration. Journal of International Development 18: 757-70.
  • Makina, Anesu. 2009. Caring for people with HIV: State policies and their dependence on women’s unpaid work. Gender and Development 17 (2): 309-19.
  • Masango, Sidumo, Gathu, Kamanja, and Sibandze, Sibusiso. 2008. Retention strategies for Swaziland’s health sector workforce: Assessing the role of non-financial incentives. Harare: EQUINET Discussion Paper no. 68.
  • Matsuno, Ayaka. 2009. Nurse Migration: The Asian perspective. ILO/EU Asian Programme on the Governance of Labour Migration Technical Note.
  • McKay, Deirdre. 2004. Performing identities, creating cultures of circulation: Filipina migrants between home and abroad. Presented at Asian Studies Association of Australia. Canberra, June 29.
  • Médecins sans Frontières. 2007. Help Wanted: Confronting the Health Worker Crisis to Expand Access to HIV/AIDS Treatment: MSF Experience in Southern Africa. Johannesburg: MSF.
  • Meghani, Zahra, and Eckenwiler, Lisa. 2009. Care for the caregivers: Transnational justice and undocumented non-citizen care workers. International Journal of Feminist Approaches to Bioethics 2 (1): 77-101.
  • Mensah, Kwado, Mackintosh, Maureen, and Henry, Leroi. 2005. The Skills Drain of Health Professionals from the Developing World: A Framework for Policy Formation. London: Medact.
  • Michel, Sonya. 2010. Filling the gaps: Migrants and care work in Europe and North America. Paper presented at ESPA. Budapest.
  • Miller, David. 2004. National responsibility and international justice. In Dean Chatterjee, ed. Morality and the Distant Needy. Cambridge: Cambridge University Press, 123-143.
  • Miller, Sarah C. 2009. Moral injury and relational harm: Analyzing rape in Darfur. Journal of Social Philosophy 40 (4): 504-23.
  • Ministers of Health for Pacific Island Countries. 2007. Pacific Code of Practice for Recruitment of Health Workers. Port Vila, Vanuatu: WHO.
  • Mullan, Fitzhugh, Frehywot, Seble and Jolley, Laura. 2008. Aging, primary care, and self-sufficiency: Health care workforce challenges ahead. Journal of Law, Medicine and Ethics 36: 703-8.
  • Narasimhan, Vasant, Brown, Hilary, Pablos-Mendez, Ariel et al. 2004. Responding to the global human resources crisis. Lancet 363: 1469-72.
  • Nozick, Robert. 1974. Anarchy, State, and Utopia. New York: Basic Books.
  • Nowak, Joanne. 2009. Gendered perceptions of migration among skilled female Ghanaian nurses. Gender and Development 17: 269-80.
  • O’Brien, Paula and Gostin, Lawrence O. 2011. Health Worker Shortages and Global Justice. New York: Milbank Memorial Fund.
  • O’Neill, Onora. 2000. Bounds of Justice. Cambridge: Cambridge University Press.
  • O’Neill, Onora. 2004. Global justice: Whose obligations? In D. K. Chatterjee, ed. The Ethics of Assistance: Morality and the Distant Needy. Cambridge: Cambridge University Press, 242-259.
  • Organisation for Economic Cooperation and Development (OECD). 2005. Ensuring Quality Long-term Care for Older People. Paris: OECD.
  • Organisation for Economic Cooperation and Development (OECD). 2008. International mobility of health workers: Interdependency and ethical challenges. In The Looming Crisis in the Health Workforce: How Can OECD Countries Respond? Paris: OECD, 57-74.
  • Organisation for Economic Cooperation and Development (OECD). 2010. International Migration of Health Workers: Improving International Cooperation to Address the Global Health Workforce Crisis. Policy Brief.
  • Packer, Corrine, Labonté, Ronald, and Runnels, Vivien. 2009. Globalization and the cross-border flow of health workers. In Ronald Labonté, Ted Schrecker, Corrine Packer, and Vivien Runnels, eds. Globalization and Health: Pathways, Evidence and Policy. New York: Routledge, 213-234.
  • Packer, Corrine, Labonté, Ronald, and Spitzer, Denise (for the WHO Commission on Social Determinants of Health). 2007. Globalization and the Health Worker Crisis. Globalization and Health Knowledge Network Research Papers.
  • Packer, Corrine, Runnels, Vivien, and Labonté, Ronald. 2010. Does the migration of health workers bring benefits to the countries they leave behind? In Rebecca Shah, ed. The International Migration of Health Workers: Ethics, Rights, and Justice. Basingstoke, Hampshire: Palgrave Macmillan, 44-61.
  • Page, J., and Plaza, S. 2006. Migration remittances and development: A review of global evidence. Journal of African Economies 2: 245-336.
  • Parreñas, Rhacel. 2000. Migrant Filipina domestic workers and the international division of reproductive labour. Gender and Society 14 (4): 560-581.
  • Parreñas, Rhacel. 2001a. Servants of Globalization: Women, Migration, and Domestic Work. Stanford, CA: Stanford University Press.
  • Parreñas, Rhacel.  2001b. Transgressing the nation state: The partial citizenship and “imagined (global) community” of migrant Filipina domestic workers. Signs 26 (4): 1129-1154.
  • Parreñas, Rhacel. 2005. Children of Global Migration: Transnational Families and Gendered Woes. Stanford, CA: Stanford University Press.
  • Percot, Marie. 2006. Indian nurses in the gulf: Two generations of female migration. South Asia Research 26: 41-62.
  • Pessar, Patricia R. 2005. Women, Gender, and International Migration Across and Beyond the Americas: Inequalities and Limited Empowerment. Expert Group Meeting on International Migration and Development in Latin America and the Caribbean. Population Division, Department of Economic and Social Affairs, United Nations Secretariat. Mexico City, 30 November.
  • Pettit, Philip. 1997. Republicanism: A Theory of Freedom and Government. Oxford: Clarendon Press.
  • Pogge, Thomas. 2004. Relational conceptions of justice: Responsibilities for health outcomes. In Sudhir Anand, Fabienne Peter, and Amartya Sen, eds. Public Health, Ethics, and Equity. New York: Oxford University Press, 135-162.
  • Pogge, Thomas.  2005. Real world justice. Journal of Ethics 9: 29-53.
  • Pittman, Patricia, Folsom, Amanda, Bass, Emily, and Leonhardy, Kathryn. 2007. U.S. Based International Nurse Recruitment: Structure and Practices of a Burgeoning Industry. Washington, DC: Academy Health.
  • Pond, Bob, and McPake, Barbara. 2006. The health migration crisis: The role of four Organisation for Economic Cooperation and Development countries. Lancet 367: 1448-55.
  • Rafferty, Anne Marie. 2005. The seductions of history and the nursing diaspora. Health and History 7 (2): 2-16.
  • Raghuram, Pavarti. 2009. Caring about ‘brain drain’ migration in a postcolonial world. Geoforum 40: 25-33.
  • Raustøl, Anne. 2010. Should I stay or should I go? Brain drain and moral duties.  In Rebecca S. Shah, ed. The International Migration of Health Workers: Ethics, Rights and Justice. Basingstoke Hampshire: Palgrave Macmillan, 175-188.
  • Rawls, John. 1999. The Law of Peoples. Cambridge, MA: Harvard University Press.
  • Ruger, Jennifer P. 2006. Ethics and governance of heath inequalities. Journal of Epidemiology and Community Health 60: 998-1002.
  • Schild, Veronica. 2007. Empowering “consumer-citizens” or governing poor female subjects? The institutionalization of “self-development” in the Chilean social policy field. Journal of Consumer Culture 7 (2): 179-203.
  • Schwenken, Helen. 2008. Beautiful victims and sacrificing heroines: Exploring the role of gender knowledge in migration policies. Signs 33 (4): 770-6.
  • Sen, Amartya. 1993. Capability and well-being. In Amartya Sen  and Martha Nussbaum, eds. Quality of Life. Oxford: Clarendon Press, 30-53.
  • Shah, Rebecca S. 2010. The right to health, state responsibility and global justice. In Rebecca S. Shah, ed. The International Migration of Health Workers: Ethics, Rights and Justice. Basingstoke Hampshire: Palgrave Macmillan, 78-102.
  • Singer, Peter. 1972. Famine, affluence, and morality. Philosophy and Public Affairs 1: 229-243.
  • Sørenson, Ninna Nyberg. 2005. Narratives of longing, belonging, and caring in the Dominican diaspora. In J. Besson and K. F. Olwig, eds., Caribbean Narratives of Belonging: Fields of Relations, Sites of Identity. Thailand: Macmillan Caribbean.
  • Stasiulis, Davia and Abigail, Bakan. 2005. Negotiating Citizenship: Migrant Women in Canada and the Global System. Toronto: University of Toronto Press.
  • Stilwell, Barbara, Diallo, Khassoum, Zurn, Pascale, Vujicic, Marko, Adams, Orvill, and Dal Poz, Mario. 2004. Migration of health care workers from developing countries: Strategic approaches to its management. Bulletin of the World Health Organization 82: 595-600.
  • Thomas, Philomena. 2008. Indian nurses: Seeking new shores. In J. Connell, ed., The International Migration of Health Workers. New York: Routledge, 99-109.
  • Tolentino, Roland B. 1996. Bodies, letters, catalogs: Filippinas in transnational space. Social Text 48 (Autumn): 49-76.
  • Tronto, Joan. 2006. Vicious circles of privatized caring. In Maurice E. Hamington, Dorothy C. Miller, eds. Socializing Care. Lanham, MD: Rowman and Littlefield, 3-26.
  • Truong, Thanh-Dam. 1996. Gender, international migration and social reproduction: Implications for theory, policy, research, and networking. Asian and Pacific Migration Journal 5 (1): 27-52.
  • United Kingdom Department of Health. 2004. Code of Practice of the International Recruitment of Healthcare Professionals. Leeds: UK Department of Health.
  • UN General Assembly. 1966. Resolution 2200A (XXI), International Covenant on Economic, Social, and Cultural Rights. December 16.
  • Van Eyck, Kim. 2004. Women and International Migration in the Health Sector. Ferney-Voltaire, France: PSI.
  • Van Eyck, Kim. 2005. Who Cares? Women Health Workers in the Global Labour Market. Ferney-Voltaire, France: PSI.
  • Walker, Margaret U. 1998. Moral Understandings: A Feminist Study in Ethics. New York: Routledge.
  • Wegelin-Schuringa, M. 2006. Local responses to HIV/AIDS from a gendered perspective. In Anke Van der Kwaak, and Madaleen Wegelin-Schuringa, eds. Gender and Health: A Global Sourcebook. Amsterdam: Netherlands: KIT Publishers, 79-94.
  • Wibulpolprasert, Suwit and Pengaibon Paichit. 2003. Integrated strategies to tackle the inequitable distribution of doctors in Thailand: Four decades of experience. Human Resources for Health 1 (12): 1-17.
  • World Federation of Public Health Associations. 2005. Ethical Restrictions on International Recruitment of Health Professionals from Low-Income Countries. Geneva: World Federations of Public Health Associations.
  • World Health Organization (WHO). 1948. Constitution of the World Health Organization. Geneva: WHO.
  • World Health Organization (WHO). 2006. World Health Report 2006: Working Together for Health.  Geneva: WHO.
  • World Health Organization (WHO). 2008. The Global Burden of Disease: 2004 Update. Geneva: WHO.
  • World Health Organization (WHO). 2010a. Global Atlas of the Health Workforce. Geneva: WHO.
  • World Health Organization (WHO). 2010b. WHO Global Code of Practice on the International Recruitment of Health Personnel. Geneva: WHO.
  • Yeates, Nicola. 2009. Production for export: The role of states in the development and operation of global care chains. Population, Space, and Place 15: 175-87.
  • Young, Iris M. 2006. Responsibility and global justice: A social connection model. Social Philosophy and Policy 23: 102-30.

Author Information

Lisa Eckenwiler
Email: leckenwi@gmu.edu
George Mason University
U. S. A.

Ralph Cudworth (1617—1688)

Ralph Cudworth was an English philosopher and theologian, a representative of the seventeenth-century movement known as the Cambridge Platonists.  These were the first English philosophers to publish primarily in their native tongue, and the first to take Plato as a core influence.  Three of Cudworth’s works, The True Intellectual System of the Universe, A Treatise Concerning Eternal and Immutable Morality, and A Treatise on Freewill, together constitute the most complete available exposition of the Platonist world-view.

The Platonist School formed as a response to the intellectual crisis following the Calvinist victory in the English Civil Wars.  The Calvinists believed that human intellect was useless for understanding God’s will.  Only revelation was acceptable. With the death of Charles I, and the failure of traditional authority, several factions of Calvinism sought to define the new order that would replace the old order according to their own revelations. Without any mediating authority, or grounds to negotiate compromise, violence was often the result.

The Platonists constructed a natural theology supporting the concept of free will, and opposing the materialism of Thomas Hobbes. To its members, there was no natural divide between philosophy and theology.  Reason could, therefore, sort out rival theological and ethical claims without the violence that had troubled their generation.

To support this agenda, Cudworth devoted himself to developing a model of the universe, based on a vast body of both ancient and contemporary sources.  His ontology was based upon Neoplatonism, and involved a World-Soul he called “Plastic Nature.”  His epistemology was an amended Platonism, where the “Essences” that served as the standards of rationality, ordering both the mind and the universe, were innate to God.  In order to support the concept that people have free will, he developed a modern-sounding psychology derived from Epictetus’s Stoic psychology.  With this theory he attacked the concepts of materialism, voluntarism, and determinism.

The Cambridge School was primarily a reactionary assembly, and they largely dissolved when the Restoration of the Monarchy provided a political resolution to their generation’s concerns.  However, Cudworth exerted a subtle influence on later generations.  Through his daughter Damaris, his ideas helped to shape the philosophies of John Locke and Gottfried Leibniz, among others.

Table of Contents

  1. Biography
  2. Publications
    1. Sermons
    2. The True Intellectual System of the Universe
    3. Posthumous works
  3. The Cambridge Platonists
    1. Calvinist Doctrines
    2. Platonist Responses
    3. Sources and Influences
  4. Themes in Cudworth’s Work
    1. The Essences and Rational Theology
    2. Ontology
      1. The Necessity of Dualism
      2. Atomic Materialism
      3. “Stratonical” Materialism
      4. Plastic Nature
    3. Epistemology
      1. Knowledge as Propositional
      2. The Essences in Epistemology
      3. Plastic Nature in Epistemology
      4. The Failings of Plastic Nature in Epistemology
    4. Free Will
      1. The Hegemonikon
      2. God’s Choice
      3. Human Choice
  5. Cudworth’s Influence
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Biography

 

Ralph Cudworth was born in Aller, Somerset, in 1617.  His father was Ralph Cudworth, the Rector of Aller, a Fellow of Emmanuel College, Cambridge, and a former chaplain to King James I.  His mother, Mary Machell Cudworth, had been a nurse for James’s elder son, Henry.  Cudworth Senior died in 1624, and Mary Cudworth then married Doctor John Stoughton who replaced him as Rector of Aller.  Stoughton also saw to young Ralph’s home schooling.

In 1632, Cudworth enrolled in Emmanuel.  He received a Bachelor of Arts in 1635 and a Master of Arts in 1639.  He was also elected a Fellow of Emmanuel at this time, and began to serve as a tutor.  Cudworth earned a Bachelor of Divinity in 1646 and a Doctorate in Divinity in 1651.  During this time, he also became a friend of Benjamin Whichcote, the founder of the philosophical and theological movement known as the Cambridge Platonists.

During the English Civil Wars, Cudworth’s sympathies were with Parliament.  In 1644, representatives of Parliament appointed Cudworth Master of Clare Hall, replacing a monarchist who had been ejected from that post.  In the next year, he was named Regius Professor of Hebrew.  From 1650 to 1654, he also served as the rector of North Cadbury, Somerset, which operated under the authority of Emmanuel College.  In 1654, Cudworth left both Clare Hall and the rectory to become the master of Christ’s College, Cambridge.

Judged a political moderate, Cudworth retained his position at Christ’s College upon the restoration of the monarchy in 1660.  Sheldon, the Bishop of London, named him Vicar of Ashwell, Hertfordshire, in 1662.  He was also made a Prebendary of Gloucester in 1678.  He died on June 26, 1688, at Cambridge, and was buried in the chapel at Christ’s College.

After becoming master of Christ’s College, Cudworth married Damaris Craddock Andrews.  They had two sons, John and Charles, and one daughter, Damaris.  John and Charles Cudworth both died before their father.  Damaris Cudworth survived, and would anonymously author two philosophic works of her own: A Discourse Concerning the Love of God, and Occasional Thoughts.  She wrote the first biography of John Locke, of whom she was a close friend and correspondent, and she also corresponded with Gottfried Leibniz, with whom she debated the merits of both her father’s works and Locke’s.

2. Publications

a. Sermons

Cudworth’s publications include theological texts such as A Discourse Concerning the True Notion of the Lord’s Supper (1642) and The Union of Christ and the Church (1642).  He also published various sermons, including A Sermon before the House of Commons (1647).  As with most of the Platonists, a good deal of his philosophical theory is expressed through published transcriptions of sermons.  This made the Platonists the first philosophers to express their theories primarily in English.  Cudworth would also write more conventional philosophical arguments to support the Platonist program.

b. The True Intellectual System of the Universe

His philosophical work is dominated by The True Intellectual System of the Universe: the First Part wherein All the Reason and Philosophy of Atheism is Confuted and Its Impossibility Demonstrated (1678).  The System was intended to be an exhaustive, three-part presentation of his entire Platonist world-view.  In the first volume he would attack atheism, particularly as interpreted by Hobbes, and in doing so argue against materialism as well.  The second volume would attack the voluntarism accepted by John Calvin.  In the third, he would directly argue against the fatalism accepted by both Calvin and Hobbes.

However, the first volume of the System became controversial upon publication.  Some saw a crypto-atheism in Cudworth’s didactic style.  He first stipulated what he saw as all of the arguments for materialism and atheism, and then, after outlining his general philosophic positions, showed how that system answered all of these arguments.  Many found his initial arguments more compelling than his later responses.  This led some to wonder if this was not the intent.  Others disputed interpretations of Christian doctrine expressed in the System.  In response to these problems, Cudworth chose to suspend the project.

c. Posthumous works

Cudworth’s posthumous publications include A Treatise Concerning Eternal and Immutable Morality (1731) and A Treatise on Freewill (1838).  These are based upon surviving manuscripts of the projected continuation of the True System.  In 1733, his True Intellectual System was translated into Latin by J. L. von Mosheim and published as Systema intellectuale hujus Universi, seu de veris Naturae originibus, a version which corrected several Greek translations in the original and introduced the work to a Continental audience.

3. The Cambridge Platonists

Cudworth was nominally a Calvinist, but he was not orthodox.  As a member of the Platonist movement, he rejected significant elements of the Calvinist theology.

a. Calvinist Doctrines

Orthodox Calvinists are voluntarists.  To them, God is primarily omnipotent, and nothing, not even logic, can restrain Him.  If He chose to make a man a married bachelor, for example, that would be easily within His power.  On this view, all of His activities would be merely the absurd assertions of the universe’s unique existentialist subject.

As a consequence, Calvinists are also Enthusiasts, holding that all theological knowledge comes to man through divine revelation.  Man’s rational powers, bound to logic, are simply useless with reference to God.  Because Calvinists believe that theology is the only acceptable grounding for ethics, this implies that ethical standards are similarly dependent on divine fiat and revelation.

In addition, Calvinists were Fatalists, rejecting the concept of human free will.  If free will existed, they would argue, an individual would have more power over their own actions than had God.  This would compromise God’s absolute power.  Human actions would also be contingent, and thus, unpredictable.  This would compromise His omniscience.  Neither compromise was acceptable to the Calvinists, so they restricted all agency to the omnipotence of the Supreme Being.

Finally, Calvinism taught that, as a result of Original Sin, man’s nature was totally depraved, and irremediable through human efforts.  Unable to control his fate, man was wholly dependent on God for his moral status.  Neither his reason, nor his will could improve his character.

b. Platonist Responses

The Cambridge Platonists unanimously rejected all of these positions, which Cudworth called “the theory of the arbitrary deity” (The True Intellectual System of the Universe, II.529).  Their goal was to vindicate the power of the human intellect, and human moral responsibility.  To do otherwise, in their eyes, rendered any conception of God’s wisdom and goodness meaningless.

Instead, they supported a natural theology which could prove the existence of God, and the superiority of Christianity.  Beyond these basic points, disputes between individuals with different beliefs could and should be settled with debate, when this was possible.  When this method failed to produce a definitive resolution, they argued, differences between belief systems should be tolerated in the spirit of humility.  If humanity really needed to understand something, God, as a rational and benevolent entity, would allow it the information required to develop an understanding.  Thus, all people who make an honest effort to understand God should, and in fact do, come to the same basic theological positions.  Education and rational persuasion are the only methods required to correct differences that exist between good people on fundamental matters.  Because man’s theological and ethical deliberations are capable of yielding some results, he must be, at least to the limited extent that his finite reasoning faculty allows, capable of taking some remedial steps towards his own salvation.

This position is formally known as “Latitudinarianism.” It would dominate Cudworth’s writings and sermons, beginning with A Sermon before the House of Commons.  The largest segment of the True System amounts to an attempted historical demonstration of Latitudinarianism.  There, he seeks to prove that all great thinkers in history were believers in God who agreed on the basic points of Christian doctrine.  He conducts an encyclopedic review of the history of philosophy, as he knew it, to garner support for this point.

Unfortunately, these historical digressions are more efforts at myth-making than legitimate arguments.  In order to demonstrate the reasonableness of Christianity, the author rewrites philosophic history to show that as many ancient philosophers as possible were, in some sense, Christian.  He also expands the borders of Christian doctrine so as to accommodate as many diverse philosophies as possible within it.  Both philosophy and Christian doctrine suffer some violence in this process.  As a result, after the System’s publication, several readers accused Cudworth of some form of heresy, generally Tritheism or Arianism.

c. Sources and Influences

The Cambridge Platonists’ primary intellectual resource was, obviously, Plato.  They would be the first English philosophers to use him so prominently in their works.  However, their understanding of Plato was mediated by St. Augustine’s Neoplatonism.  Thus, they tended to confuse Plato’s beliefs with those of Plotinus.  Aristotle and the Stoics were also among their major influences.

The Platonists also felt the influence of their contemporaries.  They particularly appreciated the rationalism of both Hobbes and René Descartes.  But Descartes was a voluntarist.  At the same time, his theory of innate ideas seemed, to the Platonists, to lead to a psychological determinism analogous to his mechanistic conception of material actions.  This paradoxically left his philosophy with what Cudworth called a “tang of the atheistic mechanistic humour.”(True System, I.283)  On the other hand, Hobbes’s rationalism led him to determinism, materialism, moral relativism, and, it seemed, atheism.  The Platonists could not accept either option.

Cudworth, in particular, was very historically-minded.  He tried to incorporate as many historic philosophers as possible into his arguments.  He seems to have believed that all of his contemporaries’ philosophical positions were passed down from some ancient thinker.  This skewed his understanding of contemporary thinkers, such as Thomas Hobbes, who seemed, to Cudworth, nothing more than a contemporary follower of Democritus.

4. Themes in Cudworth’s Work

a. The Essences and Rational Theology

To Cudworth, in a universe where God’s omnipotence trumped His omniscience, there would be no final truths for Him to know. “Truth and falsehood would be only names.  Neither would there be any more certainty in the knowledge of God, himself.” (True System, III. 539)  So, he concluded, if God is omniscient, as Christian doctrine dictates, then there must be eternal truths to know, which are invulnerable to His power.

To be eternal, at least in the sense intended by the Platonic theism that Cudworth espouses, these truths would have to be self-justifying, logically necessary principles, and not mere conditional facts.  Cudworth called them the “Essences.”  They are his equivalent of Plato’s Forms.  To Cudworth, such eternal principles “do not exist without us…but in the mind, itself.”(True System, III. 622)  Human minds cannot directly bear such entities, because they, themselves, are not eternal: “of that which is in constant change nothing may be affirmed as constantly true.”(True System, III. 627; quoting Aristotle, Metaphysics, 4.5)  The principles must exist within a mind that is also eternal.  The only such mind is God’s.  And so, Cudworth concludes, logically necessary principles exist as natural configurations of God’s mind.  They do not exist above Him, but as a self-disciplining element of His divine psychology.

With God’s mind disciplined by the logic of the Essences, He and His works must always be rational.  This means that, to the extent allowed by finite human capacities, rational deliberation can, with confidence, develop sound opinions concerning God, and His works.  Revelation, although possible, in such a theory, is not necessary for theological or ethical judgments.  Moreover, revealed theological truths must be reasonable, so disputes between revelations may be solved through sufficient rational analysis.  If man cannot resolve such a dispute, it must be because his rational powers are insufficient to the task, and so, his proper attitude towards the issue should be one of humility, not violent intolerance.

Cudworth also uses these Essences as a Design Argument for the existence of God.  If the human mind can understand the universe, at least to some extent, through reason, he contends, then necessary logical principles must guide the universe.  But necessary logical principles must be eternal truths, and eternal truths can exist only within an eternal mind.  This mind, by virtue of containing them, would know, and direct, the universe.  Thus, because man can use reason to gain knowledge of the universe, there is a rational God.
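The structure of this argument can be laid out as a simple chain of implications.  The following schematization is a modern reconstruction, not Cudworth’s own notation, and the letters are labels introduced here for convenience:

```latex
% A reconstruction of Cudworth's Design Argument as a chain of implications.
% K, E, T, M are labels introduced here, not Cudworth's terminology.
%   K: humans can gain knowledge of the universe through reason
%   E: necessary logical principles (the Essences) order the universe
%   T: the Essences are eternal truths
%   M: there exists an eternal mind (God) containing the Essences
\begin{align*}
(1)\quad & K               && \text{premise: reason does yield knowledge}\\
(2)\quad & K \rightarrow E && \text{knowledge presupposes ordering principles}\\
(3)\quad & E \rightarrow T && \text{necessary principles are eternal truths}\\
(4)\quad & T \rightarrow M && \text{eternal truths require an eternal mind}\\
\therefore\quad & M        && \text{by repeated modus ponens from (1)--(4)}
\end{align*}
```

On this reconstruction the argument is formally valid, so its force rests entirely on premises (2) through (4), each of which Cudworth defends in the surrounding text.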

b. Ontology

i. The Necessity of Dualism

Cudworth’s Ontology is the primary focus of The True System.  He begins by gathering all of the various forms of atheism and reducing them to two general types, each founded on a different form of materialism.  The first is atomism.  It holds that matter consists of individual particles, each of which is incapable of initiating motion on its own.  Cudworth calls the other “Stratonical” materialism, after Strato of Lampsacus, the third director of the Lyceum.  It allows matter to initiate action, claiming that the universe is a single, self-organizing, but non-conscious entity.  In keeping with his Latitudinarian beliefs, Cudworth is willing to adopt each of these theories, up to a point.

But, Cudworth defines matter as being necessarily non-conscious.  As the universe is active, and the logical order of the universe implies, to Cudworth, the existence of an eternal logical mind, it seems obvious that matter cannot account for the entire universe.  Cudworth addresses, and rejects, the possibility that consciousness comes into existence as an emergent property.  Citing the Neoplatonic doctrine that “an effect cannot be superior to its cause”(True System, III.81) and the logical principle that “nothing comes from nothing,”(True System, II.67) he argues that it is clearly absurd and paradoxical that such things come from a substance that does not itself demonstrate any potential for either property.  Therefore, either line of thought goes astray when it leads to materialism and atheism.

ii. Atomic Materialism

Cudworth supports the conception that matter is made of atomic particles, but holds that this belief and atheism are fundamentally incompatible.  To him, matter is an essentially passive entity.  Therefore, the atomic motion which accounts for all mechanical action is a reaction to an outside stimulus.  But this stimulus cannot be material, or else we are left with a vicious cycle.  Following Thomas Aquinas’s Cosmological Argument, this cycle can only be broken by an unmoved mover operating somewhere in the causal history of an event.  Only God has such a resume.  And so, Cudworth concludes, atomists must not be materialists.  They must be dualists who believe in an eternal God, if they are to be logically consistent.

Having established this, to his satisfaction, Cudworth turns to myth-making.  He advances a history of atomic theory that he shared with his friend, fellow Cambridge Platonist, Henry More.  Democritus, he claims, was not the first atomist, but merely the first atheistic atomist.  Atomism was, in fact, taught by Moses, and was brought to Greece by Pythagoras.  From him, it was supposedly passed down to Plato, Aristotle, Plutarch and others.  Leucippus and Democritus evidently took this original philosophy and corrupted it into atheism.

iii. “Stratonical” Materialism

“Stratonical” materialism is a variation of Plato’s Organicism. It agrees with Plato that the universe is a single, self-organizing, entity, but stipulates that this principle is wholly unconscious and material.  Instead it grants matter the ability to initiate action.

Cudworth rejects this adaptation of Plato.  To him, it is impossible for an entity to operate in a logical, orderly manner, without both the regulation of the Essences to establish what logic and order are, and some sort of consciousness to access these principles.  There are clearly apparent, and logically predictable, patterns in the objects, and actions of the universe.  Therefore, logic and the Essences exist.

But, the Essences are phenomena of the mind of God, existing nowhere else.  So, there must be more than even activist matter in the universe.  There must be a God, and matter must have some connection to God, in order to operate in accord with the Essences, which are mental phenomena.  There must be some sort of connector between matter and spirit.  This connector would, indeed, bind the universe, in one sense, into a single entity, but not on the purely material level of atoms.

iv. Plastic Nature

Cudworth believes that this connection is provided by a form of World Soul.  He finds ancient authority to support this conception, with much more success than he had in reference to atomism.  This disciplining force is clearly related to the World Soul described in Plato’s Timaeus, and in the Stoic tradition.  It is also an interpretation of Plotinus’s Third Hypostasis.  Such a world-soul does indeed bind the universe into one entity, but not through mere matter.

However, the greatest influence behind Cudworth’s conception of this force was Henry More.  Originally a follower of Descartes, More eventually opposed Cartesian ontology, due to the general Platonist distaste for mechanism.  To replace this theory, he developed his conception of “the Natural Spirit,” or “the Hylarchic Principle.”  This principle traced the causal history of events through the interaction of matter with spirit, instead of through interactions between two or more material objects, while maintaining deterministic causality for material events.  Cudworth took this idea, and called it “Plastic Nature.”

More suggested that matter and spirit both exist in space.  When an atom of matter coexists with spirit, within the same space, they become “bound by the law of fate.”(True System, III.674)  The result is Plastic Nature.  When the active spiritual element of Plastic Nature moves in accord with the logic of the Essences, it carries with it the passive material atom which it overlays, resulting in a physical event.

This overlay renders the spiritual element of Plastic Nature unconscious.  Its motion is not deliberative, but analogous to the body’s autonomic system.  Nevertheless, all spiritual actions are defined by logical principles.  So, the activities of Plastic Nature are rational, determined, and predictable through the exercise of reason.  This allows animals, plants, and non-living nature to behave in an orderly manner, though all three lack native intelligence.

Still, the full scope of the Essences cannot be apprehended or enacted by an unconscious hybrid like Plastic Nature.  Some of them, such as the rational principles which imply moral standards, are too subtle.  So, Plastic Nature sometimes operates amorally, or even self-destructively, as when moths seek the sun, and fly into a flame; or rain falls on the just and the unjust alike.  God might be aware of the fall of every sparrow, but he need not directly conspire in the event.  Neither need He be found responsible for the death of moths, or the democracy of the weather.  Such events are attributed to the autonomic system of the universe, operating according to a crude apprehension of the true logical order found in God’s mind.  Contemplation of this true order requires the use of a fully conscious mind, without material contamination.

c. Epistemology

i. Knowledge as Propositional

Cudworth’s epistemology is based on the understanding that all knowledge is propositional.  Beliefs that cannot be expressed in a proposition, such as “All men are mortal,” are not “known.”  Thus, the elements of a logical proposition—quantifiers, subjects, predicates and copulas—must be somehow present in the mind as a precondition of knowledge.  However, none of these things are determinable by the senses.  Both subjects and predicates are universals, such as “men” and “mortal.”  Quantifiers, like “all,” and copulas, such as “are,” are logical relations.  The senses only have contact with individual objects and properties as they exist relative to the observer.  Therefore, propositional knowledge is not empirical in nature.  It is a judgment arising from an active intellect already possessed of some logical capacities.
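Cudworth’s point can be illustrated by rendering his own example in modern notation.  This is an anachronistic gloss introduced here for clarity; the predicate letters are not Cudworth’s, and he worked with the traditional subject-predicate form:

```latex
% "All men are mortal" in modern quantificational form.
% M(x): "x is a man"; D(x): "x is mortal" -- labels introduced here.
% The quantifier (\forall) and the conditional binding subject to
% predicate are precisely the logical elements Cudworth says the
% senses cannot supply.
\forall x \,\bigl( M(x) \rightarrow D(x) \bigr)
```

Nothing in sense experience delivers the universal M, the universal D, or the relation between them; on Cudworth’s view these must be supplied by the active intellect.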

As God has knowledge, these universals must also be known to Him.  They must also be the models of his creations and the rational determinants of His acts.  They must be, in short, the Essences.

ii. The Essences in Epistemology

Humans do not have the direct access to these ideas that God enjoys. Cudworth is uncomfortable with the idea that humans might have innate ideas of the Essences.  Such theories as Descartes’ imply, to him, a form of epistemic determinism with the human mind pre-programmed to know certain truths, which have inevitable logical implications.  To Cudworth, who believes that both theology and ethics were developed through human reason, instead of revelation, epistemological determinism would imply an ethical determinism, where man is bound to come to a particular conclusion in either field.  This would make human moral and epistemic error hard to explain.

Because God is perfect, a kind of determinism as to the accuracy of His knowledge and moral decisions is essential to His nature, at least in cases where one belief or option is actually better than the others.  His ideas are innate.  But, to fill what Cudworth saw as our proper role in the order of things, humans must possess a useful, but less accurate, method of developing a consciousness of the same Essences.

iii. Plastic Nature in Epistemology

The senses do have a significant epistemological role in this method.  They stimulate the rational faculties of the mind.  The physical senses, guided by Plastic Nature, react to a sense-stimulation and draw information into the mind.  But all that the senses alone can detect would be “a thing which affects our sense in respect of figure or color.”(True System, III.584)  The determination that these properties are conjoined to an object, and that the same properties are universals, capable of manifesting simultaneously in multiple objects and locations, are rational judgments.  “(The mind)… exerts ‘conceptions’…upon our perceptions” (Ibid.) so that we might know objects.

Moreover, these conceptions must represent objective, universal principles.  If they did not exist as objective and universal entities, then knowledge would be of a purely relativistic and subjective nature.  Our conceptions would be “phantasms” which we would be unable to communicate to one another, because we would each exist in a different epistemological universe.  They would also be irredeemably vague, because they would be without any disciplining principle except our own will, which is in flux.  And, finally, we would be unable to infer anything from them, because they are all particular.

To Cudworth and the Platonists, this ordering role is filled by Plastic Nature.  In addition to operating material events in accord with logic, Plastic Nature is also responsible for the autonomic elements of human behavior.  Cudworth ascribes dreams and all the other operations of what we call the unconscious mind to Plastic Nature as manifest in the human mind.  Included among these, Cudworth holds, is the most basic projection of order onto our perceptions, which makes us capable of logical deliberation.

According to Cudworth, when our eyes are stimulated by seeing a white object, or a black object, Plastic Nature instinctively abstracts the notions of Whiteness or Blackness from these experiences, and submits these concepts to the conscious mind.  The same is true for all basic universals.  Now the mind has something to reason about, and may detect logical relations among these basic Essences.  Cudworth lists these relations as “Cause, Effect, Means, End, Order, Proportion, Similitude, Dissimilitude, Equality, Inequality, Aptitude, Inaptitude, Symmetry, Asymmetry, Whole, Part, Genus, Species, and the like.”(True System, III.586)  Propositions, and thus knowledge, become possible.

Knowing objects, we also come to see corporate entities made up of several particular individuals “which, though sometimes locally discontinued, yet are joined together by…relations…,” and, “all conspiring into one thing imperceptible by sense…yet…not mere figments.”  Conceptions of such things as a nation are examples of such a “totem.”  “(The development of such ideas)…proceeds merely from the intuitive power and the activity of the mind” which provides just those relations which allow that abstraction to be possible. (Ibid. 593)

Similarly,

…a house, or a palace is not only stone, brick, mortar, timber, iron, glass, heaped together…it is made up of relative… notions it being a certain disposition of those materials into a ‘totem’ consisting of several parts, rooms, stairs, passages, doors, chimneys, windows, convenient for habitation, and fit for several functions among men….this logical form which is the passive stamp or print of intellectuality upon it. (Ibid. 594)

At this point, the spiritual, conscious element of the mind can also infer the existence of Essences which are not directly represented in experience.  The ideas that the mind has already developed logically imply these new ideas.  Examples of such ideas are: “Wisdom, Folly, Prudence, Imprudence, Knowledge, Ignorance, Verity, Falsity, Virtue, Vice, Honesty, Dishonesty, Justice, Injustice, Volition, Cognition, and Sense, Itself.”(Ibid. 586)  Ethics and Mathematics are also products of this process.  From our perception of universal order among material things, the mind may develop rational arguments for the existence of God.

As the disciplining of natural activities and of human thought are directly analogous functions of Plastic Nature, and are regulated by the same Essences, the logic of the human mind is always analogous to the natural order in the universe.  Where the Essences are represented in dumb matter by the laws of nature, they are represented in spirit by the activities in our mind which generate conceptions of those Essences.  Therefore, skepticism regarding the accuracy of human beliefs, as is manifest in Descartes’ “Evil Genius” thought experiment, is unjustified at this basic level.  The logic of our minds and the rationality of the universe are both emanations from the same source, and mirror each other.

iv. The Failings of Plastic Nature in Epistemology

However, after Plastic Nature has communicated the basic universals to the conscious mind, humans must reason further, at increasingly great levels of sophistication, to develop these higher conceptions; unlike God, we do not possess them innately.  This opens the possibility of confusion or error.  For example, the conscious mind might simultaneously receive a message from its automatic epistemological processes, telling it that there is water on the horizon, and a second message from the reasoning spirit that this is a mirage.  When this occurs, the problem has defeated our basic epistemological system, and the mind must choose which alternative to believe, settling the question with another faculty, the Will.

d. Free Will

Cudworth’s conception of the nature of the will is expressed in A Treatise on Freewill.  It is somewhat eccentric by the standards of his contemporaries.  To them, the human mind is more an arena where will, intellect, and the various passions constantly vie for power over the individual.  To Cudworth, it is far more orderly and integrated.

i. The Hegemonikon

He follows the Stoics, especially Epictetus and Iamblichus, in describing the mind as a “hegemonikon,” a conjunction of imagination, logic, passion and impulse.  According to Cudworth, the mind receives impulses from the reason, the body, and the instinctive drives of Plastic Nature, acting as a kind of central repository.  Then it orders them, so that the whole can act as one coherent being.  It is “the man…that understands, and the man that wills, as it is the man who walks, and the man that speaks or talks….”(A Treatise on Free Will, p. 25)  It is not merely the will, which is a product, not a component, of the man.  This concept allowed for a recovery of the connection between body and mind, lost in Descartes’ dualism.  In effect, it posits what we might call the “personality” as a multi-leveled emergent locus of physical, conscious, and unconscious forces out of which psychological activity arises.

Will is the faculty which brings order to these mixed signals.  Plastic Nature links it to the most basic Essences.  These include the Essence of the Good.  So, included in its make-up is “the instinct to do good.”(A Treatise on Free Will, p. 4)  It tries to come up with the best internal order for the mind.  So, in cases where the intellect receives confused or mixed signals, the hegemonikon chooses which alternative to accept.  It is “free,” but it is not arbitrary.  It tries to arrive at the objectively right answer, but the very fact that it has been invoked means that there is no overwhelming influence determining which option to accept.

ii. God’s Choice

In some cases even the intellect of God is defeated, and is unable to discern which of a set of alternatives is better.  This is because Cudworth believes in a variety of Molina’s doctrine of middle knowledge.  To Cudworth, it is quite possible for two or more contradictory alternatives to really have equally forceful intellectual value.  Thus, there is no rational cause to prefer one option to the other.  In order to enact, or believe, one or the other option, the active intellect of God must have the ability to will an alternative without the guidance of predetermination.  So, God makes choices when all of His alternatives are equally justified.  Fortunately, in God’s case, this only occurs when the alternatives actually are equally valid, so either option turns out to be perfectly right and good.

iii. Human Choice

Humans have a similar power.  But, due to the failings of our human epistemological method, sometimes we err.  In such cases, we only believe that the alternatives are all equally good, and engage the will, when the intellect alone should be enough to determine the appropriate alternative. When this happens, we might make the wrong choice, believing in the lesser truth, or performing the lesser act.  And so, moral and intellectual error enters the picture.  We are responsible for these errors because the balance of intellectual influences which required our will to act as a tie-breaker also eliminates the possibility for a predetermined solution.  We are the determining factor responsible for the action, event or belief produced.

5. Cudworth’s Influence

The Cambridge Platonists faded rather quickly from the English intellectual scene.  Most of the problems they attempted to solve were simply no longer of moment once the Restoration of the Monarchy forced a retreat of Calvinism.  Moreover, Cudworth’s vast erudition and the originality of his psychology were lost in his bulky, inelegant style, controversial implications, and tendency towards mythmaking.
In 1703, Jean Le Clerc drew new attention to Cudworth by publishing excerpts from the System.  This initiated a long debate between him and Pierre Bayle over whether a belief in Plastic Nature could lead to atheism.  It also renewed interest in Cudworth and would eventually lead to the posthumous publication, and re-publication, of his works.
Still, Cudworth’s philosophic influence is, for the most part, felt indirectly.  Damaris Cudworth Masham was a close friend of John Locke and a correspondent of Gottfried Leibniz, whom she encouraged to publish.  Thus, parallels between Cudworth’s theories and Locke’s seem to be no coincidence.  Locke shares Cudworth’s conception of human free will, and his epistemological theory also adopts a model of the mind as an integrated collection of powers, faculties, and modes.  Similarly, there are points of comparison between Cudworth’s conception of Plastic Nature and Leibniz’s theory of Pre-established Harmony.  Unfortunately, no complete study of these connections has yet been made.

6. References and Further Reading

a. Primary Sources

  • Cragg, Gerald R. (ed.). 1968. The Cambridge Platonists. Oxford.
  • Cudworth, Ralph. 1731. A Treatise on Eternal and Unalterable Morality. London.
  • Cudworth, Ralph. 1996. A Treatise on Free Will. Cambridge.
  • Cudworth, Ralph. 1678. The True Intellectual System of the Universe. 3 vols. London.
  • Patrides, C.A. 1969. The Cambridge Platonists. Cambridge.

b. Secondary Sources

  • Birch, Thomas. “An Account of the Life and Writings of R. Cudworth, D.D.” In Cudworth, Works, 4 vols. Oxford, 1829. Vol. 1, 7-37.
  • Carter, Benjamin. “Ralph Cudworth and the Theological Origins of Consciousness.” History of the Human Sciences 23:3 (July 2010), 29-47.
  • Gysi, Lydia. Platonism and Cartesianism in the Philosophy of Cudworth. 1962.
  • Lahteenmaki, Vili. “Cudworth on Types of Consciousness.” British Journal for the History of Philosophy 18:1 (2010), 9-34.
  • Mijuskovic, Ben Lazare. The Achilles of Rationalist Arguments. The Simplicity, Unity, and Identity of Thought and Soul from the Cambridge Platonists to Kant: A Study in the History of an Argument (International Archives of the History of Ideas, Series Minor 13). The Hague: Martinus Nijhoff, 1974.
  • Muirhead, John H. The Platonic Tradition in Anglo-Saxon Philosophy. New York, 1931.
  • Osborne, Catherine. “Ralph Cudworth’s The True Intellectual System of the Universe and the Presocratic Philosophers.” In Oliver Primavesi and Katharina Luchner (eds.), The Presocratics from the Latin Middle Ages to Hermann Diels. Steiner Verlag, 2011.
  • Passmore, John Arthur. Ralph Cudworth: An Interpretation. University Press, University of Michigan, 1951.
  • Rodney, Joel M. “A Godly Atomist in 17th-Century England: Ralph Cudworth.” The Historian 32 (1970), 243-9.
  • Tulloch, J. Rational Theology and Christian Philosophy in England in the Seventeenth Century. Reprint of 2nd ed., 2 vols. New York, 1972. Vol. 2, 193-302.
  • Dictionary of National Biography. Reprint, London: Oxford University Press, 1949-1950. Vol. 5, 271-2.
  • Biographia Britannica, 2nd ed. London, 1778-93. Vol. 4, 544-9.

 

Author Information

Charles M. Richards
Email: Charles_Richards@tulsacc.edu
Tulsa Community College, Rogers State University
U. S. A.

17th Century Theories of Substance

In contemporary, everyday language, the word “substance” tends to be a generic term used to refer to various kinds of material stuff (“We need to clean this sticky substance off the floor”), while its adjectival form refers to something’s mass, size, or importance (“That is a substantial bookcase”).  In 17th century philosophical discussion, however, this term’s meaning is only tangentially related to our everyday use of the term.  For 17th century philosophers, the term is reserved for the ultimate constituents of reality on which everything else depends.  This article discusses the most important theories of substance from the 17th century: those of Descartes, Spinoza, and Leibniz.  Although these philosophers were highly original thinkers, they shared a basic conception of substance inherited from the scholastic-Aristotelian tradition from which philosophical thinking was emerging.  In a general sense each of these theories is a way of working out dual commitments: a commitment to substance as an ultimate subject and a commitment to the existence of God as a substance.  In spite of these systematic similarities between the theories, they ultimately offer very different accounts of the nature of substance.  Given the foundational role that substance plays in the metaphysical schemes of these thinkers, it will not be surprising to find that these theories of substance underlie dramatically different accounts of the nature and structure of reality.

Table of Contents

  1. 17th Century Theories of Substance: A Shared Background
  2. Descartes
    1. Descartes’ Account of Substance
    2. What Substances are There?
    3. Are Embodied Human Beings Substances?
    4. How is Substance Independent?
    5. How Many Material Substances?
  3. Spinoza
    1. Spinoza’s Account of Substance
    2. What Substances are There?
    3. Why doesn’t Spinoza Countenance Created Substance?
    4. How Can a Substance Have More than One Attribute?
    5. An Extended and Indivisible Substance?
  4. Leibniz
    1. Leibniz’s Account of Substances
    2. What Substances are There?
    3. Experience and Reality
    4. What is Wrong with Composite Beings?
    5. Leibniz and Spinoza
  5. 17th Century Theories of Substance in Perspective
  6. References and Further Reading
    1. Primary Texts in English
    2. Secondary Texts

1. 17th Century Theories of Substance: A Shared Background

In thinking about 17th century accounts of substance we need to keep in mind that a concern with substance and its nature was nothing new to the period. In fact, philosophical thinking about the nature of substance stretches all the way back to ancient Greece.  While the new philosophers of the 17th century were keen to make a break with the past and to tackle philosophical and scientific problems from new foundations, their views did not develop in an intellectual vacuum.  Indeed, the scholastic-Aristotelian tradition of the day informed their thinking about substance in a number of ways, and contributed to a number of commonalities in their thought. Before looking at specific theories of substance, it is important to note four commonalities in particular.

Substance, Mode, Inherence

For the philosophers we will discuss, at the very deepest level the universe contains only two kinds or categories of entity: substances and modes.  Generally speaking, modes are ways that things are; thus shape (for example, being a rectangle), color (for example, redness), and size (for example, length) are paradigm modes.  As a way a thing is, a mode stands in a special relationship with that of which it is a way.  Following a tradition reaching back to Aristotle’s Categories, modes are said to exist in, or inhere in, a subject.  Similarly, a subject is said to have or bear modes.  Thus we might say that a door is the subject in which the mode of rectangularity inheres.  One mode might exist in another mode (a color might have a particular hue, for example), but ultimately all modes exist in something which is not itself a mode, that is, in a substance.  A substance, then, is an ultimate subject.

Independence and Priority

The new philosophers of the 17th century follow tradition in associating inherence with dependence.  They all agree that the existence of a mode is dependent in a way that the existence of a substance is not.  The idea is that modes, as the ways that things are, depend for their existence on that of which they are modes; for example, there is no mode of ‘being eight feet long’ without there being a subject that is eight feet long.  Put otherwise, the view is that the existence of a mode ultimately requires or presupposes the existence of a substance.  This point is sometimes put by saying that substances, as subjects, are metaphysically prior to modes.

Degrees of Reality

In contrast to contemporary philosophers, most 17th century philosophers held that reality comes in degrees—that some things that exist are more or less real than other things that exist.  At least part of what dictates a being’s reality, according to these philosophers, is the extent to which its existence is dependent on other things: the less dependent a thing is on other things for its existence, the more real it is.  Given that there are only substances and modes, and that modes depend on substances for their existence, it follows that substances are the most real constituents of reality.

God Exists and is a Substance

Furthermore, each of the philosophers we will discuss maintains (and offers arguments on behalf of the claim) that God exists, and that God’s existence is absolutely independent.  It is not surprising, then, given the above, that each of these philosophers holds that God is a substance par excellence.

2. Descartes

Descartes’ philosophical system, including his account of substance, was extremely influential during the 17th century.  For more details see the IEP article René Descartes: Overview.  Unlike those of Spinoza and Leibniz, however, Descartes’ theory of substance was not the centerpiece of his philosophical system.  Nonetheless, Descartes offered a novel theory of substance which diverged in important ways from the scholastic-Aristotelian tradition.

a. Descartes’ Account of Substance

It is sometimes said that Descartes gives two different definitions of substance, and indeed in the Principles and Second Replies he defines substance in distinct ways.  We should not, however, see this as evidence that Descartes changed his mind.  On the contrary, it is clear that for Descartes these definitions express two sides of a unified account of substance.

Let us begin with the definition he offers in his Principles of Philosophy (I.51-52).  There he defines ‘substance’ in terms of independence.  He begins by making clear that there are really two different philosophical senses of the term (corresponding to two degrees of independence).  For reasons that will become clear in a moment, let us distinguish the two senses by calling one Substance and the other Created Substance.  Descartes’ definitions can be paraphrased as follows:

Substance: A thing whose existence is dependent on no other thing.

Created Substance: A thing whose existence is dependent on nothing other than God.

Strictly speaking, for Descartes there is only one Substance (as opposed to Created Substance), since there is only one thing whose existence is independent of all other things: God.  However, within the universe that God has created there are entities the existence of which depends only on God.  These lesser substances are the ultimate constituents of the created world.

The definition of substance that Descartes offers in the Second Replies (and elsewhere) ignores the distinction between God and creation and defines substance in a much more traditional way, claiming that a substance is a subject that has or bears modes, but is not itself a mode of anything else.  This fits with his other comments about substance in the Principles.  Thus, he tells us that each created substance has exactly one attribute (Principles I.53).  An attribute of a substance, Descartes explains, is its “principal property which constitutes its nature and essence, and to which all its other properties are referred” (Ibid.).  A substance’s attribute, consequently, dictates its kind, since attributes “constitute” a substance’s nature, and all and only those things of the same nature are of the same kind.  Moreover, in claiming that all a substance’s properties are referred through the substance’s attribute, Descartes is claiming that a substance’s attribute dictates the properties that a substance may have.

Descartes specifies two attributes: thought and extension.  Consequently, there are at least two kinds of created substance—extended substances and thinking substances.  By ‘extension’ Descartes just means having length, breadth, and depth.  More colloquially we might say that to be extended is just to take up space or to have volume.  By ‘thinking substance’, on the other hand, Descartes just means ‘mind’.  Although Descartes only ever discusses these two attributes, he never explicitly rules out the possibility of other attributes.  Nevertheless, the tradition has interpreted Descartes as holding that there are only two kinds of created substance, and it is for this reason that Descartes is often called a substance dualist.

With this specification in hand we are in a better position to understand what Descartes means when he says that all a substance’s properties are referred through the substance’s attribute or “principal property.”  Consider an extended substance, say, a particular rock.  Among this rock’s properties are shape and size; but having these properties presupposes the property of extension.  Put otherwise, something cannot have a shape or a size without also being extended.  Furthermore, the properties that the rock may have are limited to modifications of extension—a rock cannot have the property of experiencing pain, for example, since the property of experiencing pain is not a way of being extended.  In general, we can say that for Descartes i) the attribute of a substance is its most general property, and ii) every other property of a substance is merely a specification of, way of being, or mode of that attribute.

b. What Substances are There?

Given this account of the nature of substance, what substances exist?  Descartes famously argues in Meditation Six that human minds and bodies are really distinct—that is, that they are each substances.  Indeed, every individual consciousness or mind is a thinking substance.  Furthermore, Descartes treats bodies, including the objects of our everyday experience (chairs, trees, spoons, etc.), as extended substances.  This makes sense.  Extension is an attribute of substance, so it would seem to follow that anything that is extended (has the attribute of extension) is itself a substance.  Moreover, the parts of extended substances, as themselves extended, would seem to be extended substances for Descartes (see Principles I.60).  Given that Descartes thinks that matter is infinitely divisible (Principles II.20)—that each part of matter is itself extended all the way down—it follows that there are an infinite number of extended substances.

We are thus left with the following picture of reality.  The most real thing is God on which all other things depend.  However, within the created realm there are entities that are independent of everything besides God.  These are the created substances.  Created substances come in two kinds—extended substances and minds, and there is a plurality of both.

This brief summary of Descartes’ account of substance raises a number of deeper questions and controversies.  One central question that naturally arises is why Descartes thinks that extension and thought are the most general properties of substances.  For a detailed discussion of Descartes’ reasons see the IEP article René Descartes: The Mind-Body Distinction.  This article will briefly consider the role of embodied human beings in Descartes’ metaphysics, what Descartes means in calling substances independent, and a related controversy concerning the number of material substances to which Descartes is entitled.

c. Are Embodied Human Beings Substances?

Embodied human beings fit uneasily into Descartes’ metaphysics.  As embodied, humans are composite beings; an embodied human being consists of a mental substance (our mind) and a physical one (our body), for Descartes.  Descartes thinks that this composite being is, however, something over and above a mere aggregation.  He writes in Meditation Six: “Nature also teaches me…that I am not merely present in my body as a sailor is present in a ship, but that I am very closely joined and, as it were, intermingled with it, so that I and the body form a unit” (my italics).  In general, it is clear that Descartes thinks that embodied humans are exceptional beings in some regard, but how we should understand this mind/body union and its place in Descartes’ metaphysics has been a matter of some controversy among scholars.  One of the more prominent disputes has been between those scholars who read Descartes as holding that embodied human beings are a distinct kind of created substance, and those scholars who do not.  The former see Descartes as a substance trialist, whereas the latter read him along traditional lines as a substance dualist.  For trialist readings see Hoffman 1986 and Skirry 2005: Chapter 4.  For recent defenses of substance dualism against trialist interpretations see Kaufman 2008 and Zaldivar 2011.

d. How is Substance Independent?

As we have seen, Descartes defines substance in terms of independence.  This, however, is only a very general claim.  In order to better understand Descartes’ account of substance we need to have a better idea of the way in which substances are independent.  On one hand, in his thinking about substance Descartes is working with the traditional conception of independence according to which a substance’s existence is independent in a way that a mode’s existence is not, since substances are ultimate subjects.  Accordingly, let us say that substances are subject-independent.  On the other hand, in his account of substance Descartes is also working with a causal sense of independence.  After all, the reason that God is the only Substance (as opposed to Created Substance) is that all other things “can exist only with the help of God’s concurrence” (Principles I.51), and Descartes understands this as the causal claim that all other things are God’s creation and require his continual conservation.  Consequently, scholars have seen Descartes as holding that, in general, i) God is both causally independent and subject-independent (God is not, after all, a mode of anything else), ii) created substances are causally independent of everything but God and are subject-independent, and iii) modes are both causally dependent and subject-dependent, in that they depend both on God’s continual conservation and on created substances as subjects. (See, for example, Markie 1994: 69; Rodriguez-Pereyra 2008: 79-80)

e. How Many Material Substances?

That created substances are causally independent of everything but God suggests a startling conclusion—that despite what Descartes seems to say, bodies are not material substances, since they are not sufficiently independent.  Bodies are causally dependent on other bodies in a host of different ways.  For example, bodies come to be and are destroyed by other bodies: a person is the product of their parents and could die as the result of being hit by a car.  Indeed, according to one scholarly tradition, there is only one material thing that satisfies Descartes’ definition of created substance—the material universe as a whole (see, for example, Cottingham 1986: 84-85).  Following tradition, we can call this view the Monist Interpretation, and the opposing view that there are many material substances, the Pluralist Interpretation (for a distinct view see Woolhouse 1993: 22-23).  It would appear, then, that there is philosophical evidence for Monism; in other words, it would seem that Descartes’ views about created substance commit him to thinking there can be only one material substance.  Proponents of this interpretation claim that there is textual evidence as well, pointing to a passage in the Synopsis to the Meditations.  There Descartes writes:

[W]e need to recognize that body, taken in the general sense, is a substance, so that it too never perishes.  But the human body, in so far as it differs from other bodies, is simply made up of a certain configuration of limbs and other accidents of this sort; whereas the human mind is not made up of any accidents in this way, but is a pure substance.

Monists read ‘body, taken in the general sense’ as referring to the material universe as a whole.  Consequently, they see this passage as claiming that the material universe is a substance, but that the human body is not—since it is made up of a configuration of limbs and accidents.  Assuming the monists are right, two questions immediately arise.  First, if bodies are not substances, then what are they?  Monists typically claim that bodies are modes.  This makes sense: if bodies are not substances, they must be modes, given Descartes’ ontology.  Second, if Descartes does not think that bodies are substances, why does he so often talk as if they are?  Monists answer that Descartes is speaking loosely in these contexts, using the term ‘substance’ in a secondary or derivative sense.

Pluralists have objected on a number of grounds.  First, pluralists have challenged the monist’s textual evidence, offering alternative readings of the Synopsis.  Second, they have challenged the motivation of monism, pointing out that the monist interpretation requires a very strong conception of causal independence, and that it just isn’t clear that this is Descartes’ view. Third, pluralists note that although Descartes writes of bodies as substances on numerous occasions, he never clearly refers to them as modes.  Last, pluralists have denied that Descartes could have held that bodies are modes noting that for Descartes i) parts of things are not modes of them and ii) bodies are parts of the material universe.  Hoffman 1986 briefly raises each of these objections.  For more lengthy discussions see Skirry 2005: Chapter 3 and Slowik 2001.

3. Spinoza

Spinoza’s most important work is his Ethics Demonstrated in Geometric Order—henceforth the Ethics. Spinoza worked on the text throughout the 1660s and 70s.  By this time Descartes’ philosophy had become widely read and indeed Spinoza’s thinking was heavily influenced by it—including his account of substance.  Nevertheless, Spinoza’s account diverges in important ways and leads to a radically different picture of the world.

a. Spinoza’s Account of Substance

Spinoza offers a definition of substance on the very first page of the Ethics.  He writes:  “By substance I understand what is in itself and is conceived through itself…” (E1d3).  Spinoza follows Descartes (and the tradition) in defining substance as “in itself” or as an ultimate subject.  Correspondingly, he follows the tradition in defining ‘mode’ as that which is had or borne by another; as Spinoza puts it, a mode is “that which is in another…” (E1d5).  For a discussion of the scholastic-Aristotelian roots of Spinoza’s definition see Carriero 1995.  Spinoza also follows Descartes in thinking that i) attributes are the principal properties of substance, ii) among those attributes are thought and extension, iii) all other properties of a substance are referred through, or are ways of being, that attribute, and iv) God exists and is a substance.  Here the agreement ends.

The first obvious divergence from Descartes is found at E1P5.  For Descartes there are many extended substances (at least on the pluralist interpretation) and many minds.  Spinoza, however, thinks this is dead wrong.  At E1P5 Spinoza argues that substance is unique in its kind—there can be only one substance per attribute.  This fact about substance (in combination with a number of other metaphysical theses) has far-reaching consequences for his account of substance.

It follows, Spinoza argues at E1P6, that to be a substance is to be causally isolated, on the grounds that i) there is only one substance per kind or attribute and ii) causal relations can obtain only between things of the same kind.  Causal isolation does not, however, entail causal impotence.  An existing substance must have a cause in some sense, but as causally isolated its cause cannot lie in anything outside itself.  Spinoza concludes that substance “will be the cause of itself…it pertains to the nature of a substance to exist” (E1P7).  Not only is a substance the cause of itself, but Spinoza later tells us that it is the immanent cause of everything that is in it (E1P18).  Spinoza continues, in E1P8, by claiming that “every substance is necessarily infinite.”  In general Spinoza argues that if there is only one substance per attribute, then substance cannot be limited, since limitation is a causal notion and substances are causally isolated.  Last, Spinoza makes the case that substances are indivisible.  He argues in E1P12-13 that if substance were divisible, it would be divisible either into parts of the same nature or parts of a different nature.  If the former, then there would be more than one substance of the same nature, which is ruled out by E1P5.  If the latter, then the substance could cease to exist, which is ruled out by E1P7; consequently, substance cannot be divided.

b. What Substances are There?

Given this account of the nature of substance, what substances exist?  From what has been said so far in the Ethics it would be reasonable to suppose that, for Spinoza, reality consists of the following substances: God, one extended substance, one thinking substance, and one substance for every further attribute, should there be any.  As it turns out, however, this is only partially right.  It is true that Spinoza ultimately holds that God exists, that there is one extended substance, and one thinking substance.  However, Spinoza denies that these are different substances. The one thinking substance is numerically identical to the one extended substance which is numerically identical to God.  Put otherwise, there is only one substance, God, and that substance is both extended and thinking.

Spinoza’s official argument for this conclusion is at E1P14.  He argues as follows: God exists (which was proven at E1P11).  Given that God is defined as a being that possesses all the attributes (E1d6) and that there is only one substance per attribute (E1P5), it follows that God is the only substance.  For a detailed discussion of this argument, see the IEP article Spinoza: Metaphysics.

Given Spinoza’s substance/mode ontology and that God is the only substance, it follows that the material objects of our experience are not independently existing substances, but instead are modes of the one extended substance.  So too, the minds that Descartes had thought of as thinking substances are, according to Spinoza, modes of the attribute of thought.

We are thus left with the following picture of reality.  Like Descartes, Spinoza holds that the most real thing is God on which all other things depend.  However, there are no created substances.  God as the one substance has all the attributes, and consequently is both an extended substance and a thinking one.  What Descartes had taken for created substances are actually modes of God.  Nevertheless, Spinoza agrees with Descartes that the contents of reality come in two kinds—modes of extension and modes of thought, and there is a plurality of both.

This account of the nature of substance yields a very different picture of the metaphysical structure of the world from Descartes’ (and from common sense).  This article focuses on three questions in particular: (i) Why doesn’t Spinoza accept created substances? (ii) How can a substance have more than one attribute? And (iii) how can a substance be indivisible, as Spinoza suggests?

c. Why doesn’t Spinoza Countenance Created Substance?

Spinoza will not countenance Descartes’ distinction between Substance and Created Substance for a number of reasons.  First, created substances are the causal products of God.  However, substances are causally isolated, and so even if there were multiple substances, one could not be the causal product of the other.  Second, as we have seen Descartes holds that despite their causal dependence on God, finite minds and bodies warrant the name ‘substance’ at least partially because such beings are ultimate subjects.  Spinoza agrees that being an ultimate subject is an essential part of being a substance; the problem is that finite bodies and minds are not ultimate subjects.  Spinoza’s official grounds for this thesis are found in the arguments for E1P4 and 5.  In general, Spinoza claims that what is distinctive of substances as ultimate subjects is that they can be individuated by attribute alone.  According to Spinoza there are only two kinds of mark by which entities might be individuated—by attribute and by mode.  Substances as ultimate subjects cannot be individuated by mode, since subjects are metaphysically prior to modes.  Two finite bodies, for example, are not individuated by attribute (since they are both extended) and so cannot be substances.

d. How Can a Substance Have More than One Attribute?

As we’ve seen, for Descartes each substance has one—and only one—attribute.  Spinoza’s argument for substance monism, on the other hand, claims that there is a substance that possesses all the attributes.  Spinoza justifies this move defensively; at E1P10s he claims that nothing we know about the attributes entails that they must belong to different substances, and consequently there is nothing illegitimate about claiming that a substance may have more than one attribute.  Although this is Spinoza’s stated defense, a number of scholars have claimed that Spinoza has the philosophical resources to make a much stronger argument.  Specifically, they claim he has a positive case that a substance possessing anything less than all the attributes (including a substance with just one) is impossible.  In brief, Lin 2007 asks us to suppose that Spinoza is wrong, and that it is possible for there to be a substance that has fewer than all of the attributes (but at least one).  Spinoza is a strong proponent of the Principle of Sufficient Reason (see, for example, E1P8s2) according to which there is an explanation for every fact.  Given the PSR it follows that there is an explanation of why the substance in question fails to have all the attributes.  However, any such explanation will have to appeal to the substance’s existing attribute (or attributes).  Attributes are conceptually independent, however, and consequently one cannot appeal to an existing attribute to explain the absence of another.  For a different but closely related version of this argument see Della Rocca 2002.

e. An Extended and Indivisible Substance?

Unlike Descartes, Spinoza holds that substance is indivisible, and this raises a number of questions about the consistency of Spinoza’s account of substance.  For example, how is substance’s indivisibility consistent with Spinoza’s claims that (i) substance has many attributes which constitute its essence and (ii) that substance is extended?  For a discussion of (i), see the IEP article Spinoza: Metaphysics.  Here the focus is on (ii).

Spinoza’s extended substance, or God considered under the attribute of extension, is normally understood as encompassing the whole of extended reality (though for an alternative see Woolhouse 1990).  According to a philosophical tradition going back at least to Plato’s Phaedo, to be extended or corporeal is to have parts, to be divisible, and hence to be corruptible.  Spinoza, however, holds that “it pertains to the nature of substance to exist.”  Consequently, it would seem to follow that Spinoza cannot consistently hold that substance is extended.  Spinoza was well aware of this argument and his official rejoinder is found in E1P15s.  The problem with the argument is that it is “founded only on [the] supposition that corporeal substance is composed of parts.”  On its face, this is a confusing claim—if extended or corporeal substance just is the whole of extended reality, it surely has parts.  For example, there is the part of extension which constitutes an individual human body, a part which constitutes the Atlantic Ocean, a part that constitutes Earth, etc.  Despite his wording, Spinoza is not denying that extended substance has parts in every sense of the term.  Rather, Spinoza is especially concerned to counter the idea that his extended substance is a composite substance, built out of parts which are themselves substances, and into which it might be divided or resolved.  This makes sense, since a) it is not having parts that is the problem, but being corruptible, and b) this account of extended substance as divisible into further extended substances is just what Descartes (one of the main influences on Spinoza’s thought) seems to have held.

Spinoza makes his case in two ways in E1P15s.  First, Spinoza points us back to the arguments at E1P12 and 13 for the indivisibility of substance.  Second, Spinoza offers a new argument that focuses specifically on extended substance, one that, interestingly, does not presuppose the prior apparatus of the Ethics.  In putting aside his own previous conclusions, Spinoza’s apparent goal is to show that a view like Descartes’, according to which any extended substance has parts which are themselves extended substances, fails on its own terms.  In general, he argues as follows.  Consider an extended substance, say a wheel of cheese.  If the parts of this wheel are themselves extended substances, then it is—at least in principle—possible for one or more of the parts to be annihilated without any consequence for the other parts.  The idea here is that because substances are independent subjects, the annihilation of one subject cannot have any consequence for the others.  Suppose then that the middle of our wheel of cheese is annihilated; we are thus left with a “donut” of cheese.  The problem with this is that the hole in the cheese is measurable—it has a diameter, a circumference, etc.  In short, it is extended.  However, we have supposed that the extended substance—the subject of the extension—in the middle was destroyed.  We are thus left with an instance of an attribute, extension, without a substance as its subject—an impossibility by both Descartes’ and Spinoza’s standards.  For detailed discussions of this argument see Huenemann 1997 and Robinson 2009.

4. Leibniz

Leibniz’s views were informed by the accounts of both Descartes and Spinoza.  In fact, Leibniz corresponded with Spinoza during the early 1670s and briefly visited with Spinoza in 1676.  Unlike Spinoza, Leibniz did not write a single authoritative account of his metaphysical system.  Not only that, but his metaphysical views changed in significant ways over his lifetime.  Nevertheless, it is possible to identify a core account of the nature of substance that runs throughout his middle to later works (from the Discourse on Metaphysics of 1686 through the Monadology of 1714).

a. Leibniz’s Account of Substances

Substances are independent and are ultimate subjects.

First, like Descartes, Leibniz thinks that God is the only absolutely independent thing, and that there are, in addition, created substances which are “like a world apart, independent of all other things, except for God” (Discourse on Metaphysics §8).  Second, Leibniz explicitly agrees with Descartes, Spinoza, and the tradition in maintaining that substances are the ultimate bearers of modes or properties.  He writes “when several predicates are attributed to a single subject and this subject is attributed to no others, it is called an individual substance” (Ibid.).

Substances are unities.

To be a unity for Leibniz is to be simple and without parts, and so the ultimate constituents of reality are not composite or aggregative beings. That substances are simple has metaphysically significant consequences; Leibniz infers in the Principles of Nature and Grace and elsewhere that “Since the monads have no parts, they can neither be formed nor destroyed.  They can neither begin nor end naturally, and consequently they last as long as the universe.”  A being comes to be naturally only as the result of a composition; an entity is destroyed naturally only through dissolution or corruption.  Thus only composite entities are naturally generable or destructible.  Leibniz emphasizes, however, that substances’ unity and consequent simplicity are entirely consistent with the possession of and changes in modes or properties.

Substances are active.

To say that a substance is active is to say not only that it is causally efficacious, but that it is the ultimate (created) source of its own actions. Thus he writes, “every substance has a perfect spontaneity…that everything that happens to it is a consequence of its idea or of its being, and that nothing determines it, except God alone” (DM §32).  Substances, in some sense, have their entire history written into their very nature.  The history of each substance unfolds successively, each state causally following from the previous state according to laws.  From this it follows that if we had perfect knowledge of a substance’s state at a time and of the laws of causal succession, we could foresee the entire life of the substance.  As Leibniz elegantly put the point in the Principles of Nature and Grace “the present is pregnant with the future; the future can be read in the past; the distant is expressed in the proximate.”

Substances are causally isolated.

Like Spinoza, Leibniz holds both that substances are causally efficacious, and that their efficacy does not extend to other substances.  In other words, although there is intra-substantial causation (insofar as substances cause their own states), there is no inter-substantial causation.  Leibniz offers a number of different arguments for this claim.  On some occasions he argues that causal isolation follows from the nature of substance.  If a state of a substance could be the causal effect of some other substance, then a substance’s spontaneity and independence would be compromised.  Elsewhere he argues that inter-substantial causation is itself impossible, claiming that the only way that one substance might cause another is through the actual transfer of accidents or properties.  Thus Leibniz famously writes that substances “have no windows through which something can enter or leave.  Accidents cannot be detached, nor can they go about outside of substances” (Monadology §7).  For a more detailed discussion of Leibniz’s views of causation see the IEP article Leibniz: Causation.

b. What Substances are There?

Although Leibniz agrees with Descartes that God is an infinite substance which created and conserves the finite world, he disagrees about the fundamental constituents of this world.  For Descartes there are fundamentally two kinds of finite substance—thinking substances or minds and extended substances or bodies.  Leibniz disagrees; according to Leibniz (and this is especially clear in the later works) there are no extended substances.  Nothing extended can be a substance since nothing that is extended is a unity.  To be extended is to be actually divided into parts, according to Leibniz, and consequently to be an aggregate.  The ultimate created substances, for Leibniz, are much more like Cartesian thinking substances, and indeed Leibniz refers to simple substances as “minds” or “souls.”  This terminology can be confusing, and it is important to be clear that in using these terms Leibniz is not thereby claiming that all simple substances are individual human consciousnesses (although human consciousnesses are simple substances for Leibniz).  Rather, there is a whole spectrum of simple substances of which human minds are a particularly sophisticated example.

We are thus left with the following picture of reality.  God exists and is responsible for creating and continually conserving everything else.  The ultimate constituents of reality are monads which are indivisible and unextended minds or mind-like substances.  Although monads are causally isolated, they have properties or qualities that continually change, and these changes are dictated by the monad’s nature itself.  Leibniz’s account of substance, and his metaphysics in general, raise a number of questions.  This article will take up three in particular.  First, Leibniz’s account of substance yields (in conjunction with a number of other metaphysical commitments) a picture of reality that diverges in significant ways both from common sense and from Descartes and Spinoza.  How does our experience of an extended world of causal interaction fit into Leibniz’s metaphysical picture?  Second, that substances are unities is a crucial feature of Leibniz’s account, and it is important to consider why Leibniz is so opposed to composite substances.  Last, Spinoza and Leibniz offer very similar accounts of substance, yet end up with very different metaphysical pictures, and so this article will consider where Leibniz’s account diverges from Spinoza’s.

c. Experience and Reality

How does the world of our experience fit into Leibniz’s account of reality?  Our everyday experience is of extended objects causally interacting, but for Leibniz at the fundamental level there is no inter-substantial causation and there are no extended substances.  How, then, is the world of our experience related to the world as it really is?

Let us begin with the apparent causal relations between things.  Recall that, for Leibniz, monads are active and spontaneous.  Each individual human mind is a monad, and this means that all of a human’s experiences—including their sensations of the world—are the effects of their own previous states.  For example, a person’s sensation of a book’s being on the desk is not caused by the book (or the light bouncing off the book, entering the eye,…etc.) but is rather a progression in the unfolding of the history written into the person’s nature.  Although a monad’s life originates from its nature alone, God has created the world so that the lives of created monads perfectly correspond.  Leibniz writes in A New System of Nature,

God originally created the soul (and any other real unity) in such a way that everything must arise for it from its own depths…yet with a perfect conformity relative to external things…There will be a perfect agreement among all these substances, producing the same effect that would be noticed if they communicated through the transmission of species or qualities, as the common philosophers imagine they do.

Thus, when Katie walks around the corner and sees Beatrice, and Beatrice sees Katie, they do so because it was written into Katie’s very nature that she would see Beatrice, and into Beatrice’s nature that she would see Katie.  This is Leibniz’s famous doctrine of pre-established harmony.  For more see the IEP article Leibniz: Metaphysics.

How does our experience of an extended world of bodies arise? To start, Leibniz certainly doesn’t think that bodies are built out of, or are composites of, monads.  Thus he writes in his Notes on Comments by Michel Angelo Fardella, “just as a point is not a part of a line…so also a soul is not a part of matter.”  Instead in many cases Leibniz characterizes bodies as phenomena or appearances.  He writes in an oft-cited passage to DeVolder:

[M]atter and motion are not substances or things as much as they are the phenomena of perceivers, the reality of which is situated in the harmony of the perceivers with themselves (at different times) and with other perceivers.

Leibniz seems to be saying here and elsewhere that bodies are merely appearances (albeit shared appearances) that do not correspond to any mind-independent reality, and indeed a number of scholars have claimed that this is Leibniz’s considered view (see, for example, Loeb 1981: 299-309).  In other texts, however, Leibniz claims that bodies result from, or are founded in, aggregates of monads, and this suggests that bodies are something over and above the mere perceptions of monads.  In general, scholars have offered interpretations that attempt to accommodate both sets of texts and which see bodies as being aggregates of monads that are perceived as being extended.  There is a great deal of debate, however, about how such aggregates might ultimately be related to bodies and their perception (for one account see Rutherford 1995b: 143-153).

d. What is Wrong with Composite Beings?

Leibniz thinks composite beings are excluded as possible substances on a number of grounds.  First, no composite is (or can be) a unity, since according to Leibniz there is no way that two or more entities might be united into a single one.  He famously illustrates this claim by appealing to two diamonds.  He writes in his Letters to Arnauld:

One could impose the same collective name for the two…although they are far apart from one another; but one would not say that these two diamonds constitute a substance…Even if they were brought nearer together and made to touch, they would not be substantially united to any greater extent… contact, common motion, and participation in a common plan have no effect on substantial unity.

In general, there is no relation that two or more entities might be brought into that would unify them into a single being.

A second and perhaps even deeper problem with composites is that according to Leibniz they cannot be ultimate subjects. He writes, again in the Letters to Arnauld, “It also seems that what constitutes the essence of a being by aggregation is only a mode of things of which it is composed.  For example, what constitutes the essence of an army is only a mode of the men who compose it.”  Leibniz’s claim is that no aggregate is a substance because aggregates are modes or states of their parts, and no mode is an ultimate subject.  This leaves us with a question, however: Why does Leibniz think that aggregates are mere modes or states of their parts?  In his influential book R.C. Sleigh (1990: 123-124) makes the case that Leibniz’s grounds for thinking aggregates are modes is that aggregates are semantically and ontologically dispensable.  That is, everything that is true of an aggregate can be expressed by attributing various modes to the parts, all without appealing to the aggregate itself.  This tells us that all of an aggregate’s purported modes are in fact modes of the parts, and that consequently the aggregate is not an ultimate subject.  Given a substance/mode ontology, it follows that to the extent that aggregates exist, they must be modes.

e. Leibniz and Spinoza

Although Spinoza and Leibniz offer very different pictures of the structure of reality, their respective accounts of substance overlap in important ways: Both agree that to be a substance is to be at least i) an ultimate subject, ii) causally isolated but causally efficacious, and iii) indivisible.  Indeed, a number of scholars have suggested that Leibniz briefly adopted or was at least tempted by a Spinozistic metaphysics early in his philosophical career (see, for example, the discussion in Adams 1994: 123-130).  Even later in life Leibniz seems to have held Spinoza’s views in high regard saying in a Letter to Louis Bourguet that “[A]ccording to Spinoza…there is only one substance.  He would be right if there were no monads.”  Given this it is worth considering where Leibniz breaks with Spinoza and why.

Although they differ in a number of important ways, perhaps the most prominent difference between the metaphysics of Spinoza and Leibniz is that Leibniz holds that reality is split into two: God and creation.  God is a substance and He produces finite substances—created monads.  This signals a break from Spinoza in at least two significant ways.  First, it means that Leibniz’s agreement with Spinoza about the causal isolation of substances applies only to created substances; although for Leibniz God is a substance, He is not causally isolated.  Recall that at least one of Leibniz’s reasons for denying inter-substantial causation is that it would require the actual transfer of properties or accidents, and that such a transfer is impossible.  Jolley (2005) makes the case that, for Leibniz, God’s causal activity is of a different kind.  God does not produce effects in a metaphysically intolerable way, and consequently, God need not be causally isolated.

Second, Leibniz holds, in contrast to Spinoza, that created substances are ultimate subjects.  Leibniz is very explicit about his objection to Spinoza on this score.  Although he agrees that substances require individuation, he holds that Spinoza’s proof at E1P5 that there can be only one substance per nature or attribute is unsound.  Furthermore Leibniz holds that Monads can be individuated, ultimately claiming in the Monadology that “Monads…are…differentiated by the degrees of their distinct perceptions.”

5. 17th Century Theories of Substance in Perspective

Looking back, we might see Descartes, but especially Spinoza and Leibniz, as working through the metaphysical consequences of holding that substances are ultimate subjects.  More generally, we can see these theories of substance as different ways of trying to reconcile the notion of substance as an ultimate subject with a commitment to God’s existence and independence.

Epistemological considerations led prominent late 17th and 18th century philosophers to abandon such questions, and to give substance a much more modest position in their metaphysical systems.  John Locke, for example, holds that there are substances and that they are ultimate subjects, but is wary of drawing any further conclusions.  As Locke famously claims, “if any one will examine himself concerning his Notion of pure Substance in general, he will find he has no other Idea of it at all, but only a Supposition of he knows not what support of such Qualities…commonly called Accidents” (EHU 2.23.2).  David Hume goes further, claiming that it is not within our power to know the ultimate structure of reality, and that our idea of a substance as a subject is merely the result of our imagination: “The imagination is apt to feign something unknown and invisible, which it supposes to continue the same under all these variations; and this unintelligible something it calls a substance” (Treatise 1.4.3).  Humean skepticism about substance (and about metaphysics more generally) survives in one form or another to the present day.

Of course not everyone agrees with this tradition, and the nature of substance has been a question that many contemporary philosophers have taken up—albeit from different starting points than Descartes, Spinoza, and Leibniz.  Unlike in the 17th century, in contemporary philosophical use the term ‘substance’ is not necessarily intended to refer to the ultimate constituents of reality (although it may).  Rather, the term is usually taken to refer to what are sometimes called “concrete particulars”, that is, to individual material things or objects.  Furthermore, among contemporary philosophers there is nothing like the consensus that we find among the 17th century philosophers regarding ontology, dependence, reality, and God.  Thus, it is commonly held that there are categories of reality beyond substance and mode (or property), perhaps most prominently events or processes.  Many philosophers have questioned both the relation of inherence and the connection between inherence and ontological dependence. Bundle-theories of substance, for example, deny that substances are subjects at all—they are merely bundles or collections of properties.  Furthermore, most contemporary philosophers deny that it makes sense to talk about degrees of reality: Things are either real or not.  Last, and perhaps most obviously, contemporary philosophers no longer agree that God exists and is a substance.  For a contemporary effort to offer an account of substance that is in the spirit of 17th century discussions, see Hoffman and Rosenkrantz 1997.

6. References and Further Reading

a. Primary Texts in English

  • Descartes
  • John Cottingham, Robert Stoothoff, Dugald Murdoch, and Anthony Kenny ed. and trans. The Philosophical Writings of Descartes, 3 vols., Cambridge: Cambridge University Press, 1984-1991.
    • This is the standard English edition of Descartes’ work.
  • Spinoza
  • Edwin Curley, trans. and ed. The Collected Works of Spinoza Vol. 1. Princeton: Princeton University Press, 1985.
    • This is the standard English translation.
  • Leibniz
  • Roger Ariew and Daniel Garber trans. and ed. G.W. Leibniz: Philosophical Essays. Indianapolis: Hackett, 1989.
    • This is a great collection of many of Leibniz’s most important works.
  • Leroy L. Loemker trans. and ed. Philosophical Papers and Letters 2nd ed. Dordrecht: Reidel, 1969.
    • This is a much broader collection of Leibniz’s work than the Ariew and Garber text.

b. Secondary Texts

  • Adams, Robert Merrihew. Leibniz: Determinist, Theist, Idealist. New York: Oxford University Press, 1994.
    • This is one of the most influential books written on Leibniz in recent years.  Adams’ book includes detailed discussions of Leibniz on modality and identity, the ontological argument, and the place of bodies in Leibniz’s mature metaphysics, among other topics.
  • Carriero, John.  “On the Relationship Between Mode and Substance in Spinoza’s Metaphysics,” Journal of the History of Philosophy, vol. 33, no. 2 (1995), pp. 245-273.
    • In this article Carriero argues that Spinoza’s account of substance is a traditional one according to which substances are ultimate subjects.
  • Cottingham, John.  Descartes.  New York: Blackwell, 1986.
    • This is a good introduction to Descartes’ thought, and raises the question of a trialist interpretation.
  • Cottingham, John. The Rationalists.  Oxford: Oxford University Press, 1988.
    • This is a clearly written summary and comparison of the philosophies of Descartes, Spinoza, and Leibniz.  Chapter 3 on substance is recommended.
  • Della Rocca, Michael.  “Spinoza’s Substance Monism,” Spinoza: Metaphysical Themes. Ed. Olli Koistinen and John Biro. Oxford: Oxford University Press, 2002.
    • In this article Della Rocca considers Spinoza’s official argument that there is only one substance, and defends it from a number of objections—including the claim that Spinoza is not entitled to hold that substance can have more than one attribute.
  • Della Rocca, Michael.  Spinoza.  Routledge, 2008.
    • This book is an excellent overview of Spinoza’s life and philosophy; Della Rocca’s discussion of Spinoza’s account of substance in contrast to Descartes’ is especially good.
  • Huenemann, Charles. “Predicative Interpretations of Spinoza’s Divine Extension,” History of Philosophy Quarterly, vol. 14, no. 1 (1997), pp. 53-76.
    • In this article Huenemann offers an account of Spinoza’s extended substance which differs from other influential interpretations in important ways.  In doing so, he takes up the question of the divisibility of substance and Spinoza’s vacuum argument.
  • Hoffman, Paul.  “The Unity of Descartes’s Man,” Philosophical Review, vol. 95, no. 3 (1986), pp. 339-370.
    • In this often cited article, Hoffman makes the case for a trialist reading of Descartes and along the way offers a number of criticisms of monist interpretations of substance.
  • Hoffman, Joshua, and Rosenkrantz, Gary S.  Substance: Its Nature and Existence.  Routledge, 1997.
    • In this book Hoffman and Rosenkrantz draw on the ideas of philosophers from the past (including Descartes, Spinoza, and Leibniz) as well as from contemporary philosophical advancements to develop and defend an account of substance based on independence.
  • Jolley, Nicholas.  Leibniz.  Routledge, 2005.
    • This book is an excellent overview of Leibniz’s life and philosophy.  The book is written for the non-specialist and would be a good place for a person with no previous knowledge to start.
  • Kaufman, Dan. “Descartes on Composites, Incomplete Substances, and Kinds of Unity,” Archiv für Geschichte der Philosophie, vol. 90, no. 1 (2008), pp. 39-73.
    • In this excellent article Kaufman argues that Descartes is a dualist and that the trialist interpretation espoused by Hoffman (see above) and others is mistaken.
  • Lin, Martin. “Spinoza’s Arguments for the Existence of God,” Philosophy and Phenomenological Research, vol. 75, no. 2 (2007), pp. 269-297.
    • In this article Lin takes a new look at Spinoza’s arguments for God’s existence, and attempts to defend Spinoza from the charge that it is incoherent to think that God has more than one (much less all) of the attributes.
  • Loeb, Louis E. From Descartes to Hume: Continental Metaphysics and the Development of Modern Philosophy. Ithaca: Cornell University Press, 1981.
    • This book is one of the standards of the field, and in chapter 2 Loeb offers a comparison of Descartes, Spinoza, and Leibniz on substance.
  • Markie, Peter.  “Descartes’s Concepts of Substance,” Reason, Will and Sensation: Studies in Descartes’s Metaphysics. Ed. John Cottingham. Oxford: Clarendon Press, 1994.
    • In this influential article, Markie claims to find not two, but three accounts of substance in Descartes’ work.
  • Robinson, Thaddeus S.  “Spinoza on the Vacuum and the Simplicity of Corporeal Substance,” History of Philosophy Quarterly, vol. 26, no. 1 (2009), pp. 63-81.
    • In this article Robinson offers a novel interpretation of Spinoza’s vacuum argument, and makes the case that Descartes’ account of extended substance, at least by Spinoza’s lights, is incoherent.
  • Rodriguez-Pereyra, Gonzalo.  “Descartes’s Substance Dualism and His Independence Conception of Substance,” Journal of the History of Philosophy, vol. 46, no. 1 (2008), pp. 69-90.
    • In this article Rodriguez-Pereyra focuses on clarifying the respects in which Descartes’ substances are independent, and argues that other prominent features of Descartes’ account of substance follow from independence so understood.
  • Rutherford, Donald. Leibniz and the Rational Order of Nature.  Cambridge: Cambridge University Press, 1995a.
    • Although written for specialists, this influential book is highly readable.  Rutherford offers an account of Leibniz’s metaphysics which gives Leibniz’s theodicy an especially important role.
  • Rutherford, Donald. “Metaphysics: The Late Period.” The Cambridge Companion to Leibniz. Ed. Nicholas Jolley.  Cambridge: Cambridge University Press, 1995b.
    • This article is an excellent summary and discussion of Leibniz’s metaphysics from 1695’s New System of Nature to 1714’s Monadology, and focuses on Leibniz’s account of matter during this period.
  • Skirry, Justin.  Descartes and the Metaphysics of Human Nature.  New York: Continuum, 2005.
    • This book traces Descartes’ scholastic influences and develops a pluralist and trialist interpretation of Descartes’ account of substance.
  • Sleigh, R.C. Leibniz and Arnauld: A Commentary on Their Correspondence. New Haven: Yale University Press, 1990.
    • This is an extremely influential book which offers a reading of one of the most important of Leibniz’s philosophical exchanges.
  • Slowik, Edward. “Descartes and Individual Corporeal Substance,” British Journal for the History of Philosophy, vol. 9, no. 1 (2001), pp. 1-15.
    • Slowik picks up where Hoffman leaves off, developing several arguments against the monist interpretation of Descartes.
  • Woolhouse, R.S. “Spinoza and Descartes and the Existence of Extended Substance,” Central Themes in Early Modern Philosophy. Ed. J.A. Cover and Mark Kulstad. Indianapolis: Hackett, 1990.
    • In this article Woolhouse offers a novel reading of Spinoza’s extended substance, claiming that it refers to an essence as opposed to an actually existing infinite extension.
  • Woolhouse, R.S. The Concept of Substance in Seventeenth Century Metaphysics.  New York: Routledge, 1993.
    • This is a good general work on substance during the 17th century.  In addition, Woolhouse offers novel readings of Descartes and Spinoza (see above) on extended substance.  This work offers an especially good look at the relations between mechanics, causation, and substance during the period.
  • Zaldivar, Eugenio E. “Descartes’s Theory of Substance: Why He Was Not a Trialist,” British Journal for the History of Philosophy, vol. 19, no. 3 (2011), pp. 395-418.
    • The title says it all.  Zaldivar argues against Cottingham, Skirry, and others.

Author Information

Tad Robinson
Email: trobinson@muhlenberg.edu
Muhlenberg College
U. S. A.

Knowledge by Acquaintance and Knowledge by Description

Our own experiences of pain are better known to us than the bio-chemical structure of our brains. Some philosophers hold that this difference is due to different kinds of knowledge: knowledge by acquaintance and knowledge by description. I have first-hand or direct knowledge of my own experiences, whereas I have only second-hand or indirect knowledge of my brain’s being in a particular bio-chemical state. These two different kinds of knowledge indicate an essential difference in one’s awareness of certain kinds of truths. There is, however, considerable controversy among philosophers about whether this distinction can be applied consistently and whether knowledge by acquaintance generally offers a stronger or better epistemic position than other kinds of knowledge.

Some philosophers distinguish knowledge by acquaintance from knowledge by description roughly along the following lines: knowledge by acquaintance is a unique form of knowledge where the subject has direct, unmediated, and non-inferential access to what is known whereas knowledge by description is a type of knowledge that is indirect, mediated, and inferential. There are some significant philosophical issues in spelling out how exactly to make this distinction and even whether it is possible to maintain that there is a privileged kind of knowledge by acquaintance. Nonetheless, many philosophers have put this distinction to work in issues related to epistemology and philosophy of mind.

Table of Contents

  1. The Distinction: Knowledge by Acquaintance and Knowledge by Description
  2. The Epistemology of Knowledge by Acquaintance and Knowledge by Description
    1. Bertrand Russell
    2. Recent Accounts
    3. Criticisms
    4. Replies
  3. The Philosophy of Mind and Knowledge by Acquaintance
  4. Conclusion
  5. References and Further Reading

1. The Distinction: Knowledge by Acquaintance and Knowledge by Description

The concept of acquaintance was introduced to contemporary philosophy by Bertrand Russell in his seminal article “Knowledge by Acquaintance and Knowledge by Description” (1910) and in chapter five of The Problems of Philosophy (1912) (although see Russell 1905, pp. 479-480, 492-493 for an earlier passing discussion of it). Russell explains that a person is acquainted with an object when he stands in a “direct cognitive relation to the object, i.e. when [the subject is] directly aware of the object itself” (Russell 1910, p. 108). In another place, he writes “we have acquaintance with anything of which we are directly aware, without the intermediary of any process of inference or any knowledge of truths” (Russell 1912, p. 46). Knowledge by acquaintance, according to Russell, occurs when the subject has an immediate or unmediated awareness of some propositional truth. Knowledge by description, by contrast, is propositional knowledge that is inferential, mediated, or indirect.

The traditional account of knowledge by acquaintance is susceptible to being misunderstood or conflated with merely being directly acquainted with something (on this point see Russell 1914, p. 151). It is important to notice that there is a difference between being directly acquainted with something and having knowledge by acquaintance that something is the case. For a subject to be directly acquainted with something requires only that the subject have unmediated access to the object of awareness.  Knowledge by acquaintance that something is the case, however, is more than being directly acquainted with something’s being the case. Knowledge by acquaintance, after all, is a kind of knowledge, which requires the subject to hold a belief under the right conditions. For a subject to be directly acquainted with something does not necessarily require the subject to hold a belief about it. Notice the difference in the following claims:

(1) S is directly acquainted with p

and

(2) S knows by direct acquaintance that p.

Propositions (1) and (2) do not mean the same thing. The truth-conditions for (1) are different than the truth-conditions for (2). In order for (1) to be true, it needs to be the case that the subject is in fact acquainted with p. For example, when p is the fact that one is experiencing a mild pain, all it takes for (1) to be true is that the subject has some unmediated access or awareness of his pain experience.  However, for (2) to be true, more is required. Some person may be directly acquainted with his mild pain experience but fail to have knowledge that he is having a pain experience, perhaps by failing to attend to this experience in such a way that he forms a propositional belief on the basis of his direct acquaintance with the pain experience. (Consider the possibility of an animal–perhaps a fish or a worm–that has the capacity to be acquainted with its pains, but lacks the capacity to form any propositional attitudes.) Thus, (1) may be a necessary condition for (2), but it would be hasty to conclude that (1) is sufficient for (2) or that (1) and (2) are equivalent. The subject needs to be aware of no propositional content about p to satisfy (1). But in order to satisfy (2), the subject needs to have a belief with propositional content about p, which is properly based on his direct acquaintance with p.

2. The Epistemology of Knowledge by Acquaintance and Knowledge by Description

The epistemological issues involving the distinction between knowledge by acquaintance and knowledge by description have focused primarily on whether there is privileged unmediated knowledge by acquaintance. I begin with Russell’s epistemic use of the distinction, and then I will survey some contemporary accounts of knowledge by acquaintance. Finally, to round out this overview of the epistemic applications of this distinction, I will highlight a couple of criticisms of classical accounts of knowledge by acquaintance and responses to them.

a. Bertrand Russell

Russell used the distinction between knowledge by acquaintance and description to articulate a foundationalist epistemology where knowledge by acquaintance is the most basic kind of knowledge and knowledge by description is inferential (Russell 1910 and 1912, ch. 5). “All our knowledge,” wrote Russell, “rests upon acquaintance for its foundation” (Russell 1912, p. 48). Knowledge by acquaintance, therefore, is a direct kind of knowledge; it is a kind of knowledge that does not depend on inference or mediation. The test Russell employs for determining what someone knows by acquaintance is based on dubitability. For this reason, Russell maintained a person cannot know by acquaintance that physical objects, like an iPod, exist; after all, even when someone is seeing an iPod, it is possible to doubt whether the iPod exists (due to the possibilities of dreaming, illusion, hallucination, and so forth). The sense data, or sensory experiences, of an iPod, however, cannot consistently be doubted by a person who is experiencing them. Thus, sense data can be known by acquaintance, whereas physical objects cannot.  Russell also believed that one could be directly acquainted with memory experiences, introspective experiences (awareness of one’s own direct acquaintances and other internal sensations), universals, and (probably) even one’s own self (see Russell 1912, ch. 5; however, for one place where he is less confident of being directly acquainted with the self, see his 1914, p. 81).

On Russell’s view one cannot know by acquaintance that physical objects exist. Consequently, knowledge by description provides the only possibility of knowing physical objects. Knowledge by description, according to Russell, is dependent on direct acquaintance in at least two ways. First, knowledge by description depends on acquaintance for its propositional content. Russell unequivocally stated, “every proposition which we can understand must be composed wholly of constituents with which we are acquainted” (Russell 1912, p. 58). Although one’s knowledge by description may concern objects that outstrip the range of one’s immediate acquaintance, the propositional content is composed of concepts with which the subject is directly acquainted.

There is a similarity between the way one can know something about particular things outside of one’s experience and the way Russell envisioned knowledge by description to allow a person to think about physical objects. Consider the belief that the tallest living man in the world exists. Someone can form this belief, even though he may have no clue who this individual is. Understanding the concepts of being the tallest, being alive, and being a man are sufficient for allowing a person to hold this belief, even though its referent may lie beyond the set of people that one has met first-hand. Likewise, Russell believed that one could form beliefs about physical objects, despite the fact that one cannot ever be directly acquainted with them. When a person holds the belief, for example, that there is a cup of coffee, he is not directly acquainted with the coffee as a physical object, but he is able to think about the physical object through descriptions with which he is directly acquainted. The descriptive content might consist of there being an object that is the cause of his experiences of blackness, bitterness, hotness, and liquidness. On Russell’s account, the subject’s acquaintance with the right concepts allows him to form beliefs about physical objects.

The second way in which knowledge by description depends on acquaintance is that knowledge by description is inferentially dependent on knowledge by acquaintance. In other words, the propositions one knows by description ultimately are inferred from one’s propositional knowledge by acquaintance. Consequently, this gives rise to a foundationalist epistemology in which all of one’s knowledge is either foundational or inferentially based on foundational knowledge. It is well-known that Russell believed one’s knowledge of the world beyond his or her own mind needed to be inferred from more basic knowledge of one’s own mental experiences (see Russell 1912, ch. 2 and Russell 1914). This raises the difficulty for Russell’s position of how to provide a plausible account of this inferential relationship between knowledge by acquaintance and knowledge by description in a way that successfully accommodates the commonsense notion that most people know that physical objects exist.

b. Recent Accounts

Since Russell, much work has been done in his name promoting and applying the distinction between knowledge by acquaintance and knowledge by description in epistemology. On the one hand, there have been traditional acquaintance theorists who have more-or-less kept the original Russellian conception of knowledge by acquaintance; namely the view that one is directly acquainted with one’s own states of mind and not the extra-mental world. Brie Gertler helpfully characterizes contemporary accounts of knowledge by acquaintance as involving judgments that (a) are tied directly to their truthmakers, (b) depend only on the subject’s conscious states of mind for their justification, and (c) are more strongly justified than any empirical judgments that are not directly tied to their truthmakers and dependent only on conscious states of mind for their justification (Gertler 2012, p. 99). The traditional position on knowledge by acquaintance has been most influentially promulgated by the works of Richard Fumerton and Laurence BonJour (for other accounts related to the traditional approach see Balog 2012, Chalmers 2003, Fales 1996, Feldman 2004, Gertler 2001, Gertler 2011, Gertler 2012, Hasan forthcoming, and Pitt 2004).

Fumerton has proposed the following three conditions as necessary and sufficient for knowledge by acquaintance: (i) S is directly acquainted with the fact that p; (ii) S is directly acquainted with the thought that p; and (iii) S is directly acquainted with the correspondence that holds between the fact that p and the thought that p (Fumerton 1995, pp. 73-79). These three acquaintances parallel the requirements for a proposition’s being true (on a correspondence theory of truth): (a) the truth-maker (S is directly acquainted with the fact that p); (b) the truth-bearer (S is directly acquainted with the thought that p); and (c) the correspondence relation (S is directly acquainted with the correspondence between the fact that p and the thought that p). Given the subject’s direct acquaintance with (i), (ii), and (iii), he is directly acquainted with everything necessary to constitute being directly acquainted with a true proposition.

To the extent that BonJour provides an account of knowledge by acquaintance, it is directed exclusively to basic empirical beliefs. BonJour describes his theory of basic empirical justification as taking place when a person directly apprehends that his experience fits or satisfies the description offered by the content of his belief (BonJour 2001; BonJour 2003, especially pp. 60-76, 191-193; BonJour uses the language of “direct acquaintance” in 2001, while he prefers “direct apprehension” in 2003). It is the subject’s ability to have a direct acquaintance or apprehension of the contents of one’s own conscious experiences and how they fit or satisfy the content of one’s basic beliefs that makes knowledge by acquaintance possible. BonJour stresses, however, that fallibility can occur due to the subject’s misapprehension of one’s experience or failure to see the fit between the experience and the belief.  Despite the possibility of error, he believes that this does not undermine genuine cases when a subject does correctly apprehend the character of experience and sees its fit with one’s basic belief.

On the other hand, there are non-traditional acquaintance theorists who have modified knowledge by acquaintance in such a way that one can be acquainted with and know physical objects directly (such as Brewer 2011). Another deviation from the Russellian tradition is to maintain that knowledge by acquaintance is a different kind of knowledge than propositional knowledge (Tye 2009, pp. 95-102). On this non-propositional approach to knowledge by acquaintance, there is a sense in which one can be said to know something with which one is acquainted, even though the person does not necessarily have any propositional belief states about the thing that is said to be known by acquaintance.

c. Criticisms

Perhaps the most influential problem raised for knowledge by acquaintance is commonly called the problem of the speckled hen (due to Gilbert Ryle as reported by Chisholm 1942). The problem arises by considering the case where someone looks at a hen with exactly 48 speckles on one side. Yet, the perceiver’s experience does not provide adequate grounds for him to distinguish it from an experience of a hen with exactly 47 or 49 speckles. Indeed, even if someone were to hold the belief that one’s experience is of a hen with exactly 48 speckles, it would by most standards fail to count as knowledge because the subject could have easily formed a false belief (for example, that the hen has 47 or 49 speckles) on the basis of being acquainted with that very experience (for one way to understand this epistemic principle see “The Safety Condition for Knowledge”). In other words, typically people cannot distinguish between having a visual experience of a 47-, 48-, or 49-speckled hen. However, if a person is directly acquainted with these experiences and can plausibly satisfy the other conditions required for knowledge by acquaintance, then cases of this sort stand as potential counterexamples to possessing knowledge or justification through direct acquaintance. Given the indiscernibility of the contents of mental states through direct acquaintance, this raises doubts whether one’s justification based on direct acquaintance can offer some unique, privileged state of knowledge. After all, one motivation for accepting knowledge by acquaintance is that the subject’s knowledge of his own mental states may be indubitable, whereas other kinds of knowledge cannot be. The problem of the speckled hen suggests that one may fail to have indubitable justification through knowledge by acquaintance, because for just about any putative mental state with which one is acquainted there may be a different mental state that is virtually indistinguishable from it. It even raises the question whether these closely related states constitute genuinely different experiences for the subject at all.

The problem of the speckled hen has continued to challenge contemporary accounts of knowledge by acquaintance (see Sosa 2003a, 2003b; Markie 2009; Poston 2010). The challenge, as Sosa puts it, is for the acquaintance theorist to “tell us which sorts of features of our states of consciousness are epistemically effective ones, the ones such that by corresponding to them specifically that our basic beliefs acquire epistemically foundational status” (Sosa 2003a, pp. 277-278; similar remarks can be found in Sosa 2003b, p. 121). Here the concern is that the acquaintance theorist has no principled way of explaining why one’s acquaintance with a simple mental state (for example, an experience of three black dots against a white background) is able to ground a justified belief, whereas one’s acquaintance with a complex mental state (for example, a hen with 48 speckles) fails to serve as appropriate grounds for a justified belief. The problem is for acquaintance theorists to provide a plausible account of the conditions and limits of knowledge by acquaintance that naturally explains cases like the speckled hen.

A second influential problem for knowledge by acquaintance is due to Wilfrid Sellars. Sellars (1963) famously critiqued the possibility of acquiring non-inferentially justified beliefs through some privileged, direct relation to one’s sensory experience. The problem that Sellars has raised can be put in terms of a dilemma (for a recent defense of this type of dilemma, see Williams 1999). The dilemma focuses on whether the conscious experiential states of mind are propositional or non-propositional. If, on the one hand, mental experiences are non-propositional, then it is mysterious (at best) to explain how a belief can derive any justification from a non-propositional basis. Donald Davidson expresses the core intuition behind this horn of the dilemma: “nothing can count as a reason for holding a belief except another belief” (1986, p. 126). The central claim at the crux of this side of the dilemma is that only propositional entities can stand in logical or justificatory relations to other propositional entities.

On the other side of the dilemma, if one accepts that mental experiences are propositional, then the problem for acquaintance theorists is to explain how one is justified in accepting the propositional content of experiences. Defenders of this objection stress that this horn of the dilemma pushes the problem of generating non-inferential justification back a step. If beliefs derive their justification from the propositional contents of experiences, then experiences too must derive their justification from some other appropriate source of propositional content. Contrary to the kind of foundationalism proposed by the acquaintance theorist, this horn of the dilemma essentially states that mental experiences do not constitute a non-arbitrary way to stop the regress of justification needed to arrive at a foundational bedrock for empirical knowledge or justification.

Thus, the Sellarsian dilemma appears to leave no viable alternative for the defender of direct acquaintance: (i) if experiences are non-propositional, then they cannot stand in justificatory relations to propositional beliefs; (ii) if experiences are propositional, then there must be some further basis for one to be justified in holding the propositional content of the experiences. The first option alleges that deriving justification from non-propositional content is mysterious and inexplicable. The second option alleges that granting propositional content to experiences does not stop the regress of reasons. Either way, the foundational and unique role that is supposed to be filled by knowledge by acquaintance is undermined.

d. Replies

In response to the problem of the speckled hen, Fumerton (2005) has suggested a variety of options available to the acquaintance theorist. First, it is possible that knowledge by acquaintance fails in cases like the speckled hen because the subject fails to have an acquaintance with the correspondence that holds between the character of his experience and the thought that the experience has a specific character (compare Poston 2007). Recall that on the traditional account, knowledge by acquaintance requires more than being acquainted with the truth-maker for one’s belief; it also involves some kind of awareness of the correspondence or fit that holds between one’s thought and the experience that is the basis for one’s thought. A second proposal is based on phenomenal concepts (compare Feldman 2004). A phenomenal concept is a concept that one can immediately recognize (for example, being three-sided) as opposed to a concept that involves a process of thought to recognize (for example, being 27-sided). Restricting knowledge by acquaintance to belief-states involving phenomenal concepts can accommodate the challenge of the speckled hen by plausibly maintaining that the concept of being 48-speckled is not a phenomenal concept and thereby not a candidate for knowledge by acquaintance. Third, one can maintain that there are degrees between determinate and determinable properties with which one is acquainted (compare Fales 1996, pp. 173-180). Properties fall on a continuum from being more general to being more specific or determined. For example, a ripe tomato’s surface can be described generally as colored, more determinately as red, or even more determinately as vermilion. Given this distinction in the determinateness of properties, it is possible that one could be directly acquainted with differing degrees of determinable properties. If experiences can instantiate varying degrees of these determinable properties (and this is a matter of controversy that cannot be addressed here), then in cases like the speckled hen the acquaintance theorist may hold that the subject is not directly acquainted with the experience of a hen with 48 speckles, but with the experience’s having a less determinate property, such as the property of being many-speckled.

Another proposed solution to the problem of the speckled hen follows from distinguishing between the phenomenal and epistemic appearances of an experience (Gertler 2011, pp. 103-106). The phenomenal appearance of an experience is determined by the properties that constitute the experience. The epistemic appearance of an experience is what the experience inclines the subject to believe. For example, if someone takes a white plate and holds it under a green light, the plate phenomenally appears green (that is, the property of being green partly constitutes the experience of the plate) but with sufficient knowledge of the effects of green lighting on white objects, it does not epistemically appear green (that is, the subject is not inclined to think that the plate is green). In cases like the speckled hen, then, the experience may phenomenally appear to be 48-speckled, but it may only epistemically appear to be many-speckled. The key to this solution is disambiguating the meaning of “appearance” to explain how in one sense the subject may have an appearance of a hen with 48 speckles (phenomenally) and in another sense the subject may not have an appearance of a hen with 48 speckles (epistemically).

With respect to the Sellarsian dilemma, one response takes the horn of the dilemma that states propositional beliefs derive their justification from non-propositional experiences. For instance, Fumerton (1995, pp. 74-76) proposes an account of knowledge by acquaintance (see above section 2b) where the subject is in a position to know a truth through three acquaintances. Since acquaintance by itself is not an epistemic concept in need of justification, it enables the subject to be appropriately related to the source of one’s justification without necessitating further levels of justification. Thus, in response to the challenge to explain how non-propositional experiences (such as the raw experience of searing pain) can justify propositional thoughts (such as the belief that I am experiencing pain), Fumerton maintains that justification is made possible by being directly acquainted with the relation of correspondence that holds between the non-propositional experience and the propositional thought.

BonJour offers another influential response to Sellars’s dilemma (2001, pp. 28-34; 2003, especially pp. 69-74). On BonJour’s account of basic empirical knowledge (see above section 2b), the awareness or apprehension of the justification for one’s belief is built-in or partly constitutive of the justifying non-conceptual experience. While non-conceptual experiences do not stand in certain kinds of logical relations (for example, inferring) to conceptual beliefs, there is a relation of description or fit that holds between experiences and beliefs. Since certain kinds of experiences have a built-in awareness of their contents, these experiences contain within themselves a kind of reason for thinking that the given description accurately fits. The crucial move in BonJour’s solution is to see that the built-in awareness renders the basic empirical belief justified without any further need for justification.

Staunch defenders of the Sellarsian dilemma will likely remain unimpressed with these responses. The problem, they might urge, is that these alleged solutions push the problem back a level. Those defending the dilemma will press these proposals to explain how propositional beliefs can correspond or accurately describe non-propositional experiences. In some places, Fumerton and BonJour seem to suggest that first-hand experience with our own conscious states of mind adequately demonstrate how these relations are able to hold (Fumerton 1995, p. 77; BonJour 2003, pp. 69-74).

Other friends of direct acquaintance have suggested that experiences, while being non-propositional, may have a propositional structure (or proto-propositional structure) that allows for propositional content to map onto it (Fales 1996, pp. 166-169). For example, a non-propositional experience may be constituted by presenting particulars exemplifying specific properties, which naturally provides a structure resembling statements in the subject-predicate form. Timothy McGrew contends that basic empirical beliefs can be formed by indexically referring to one’s non-propositional experience as part of what constitutes the propositional content of the belief (1995, especially pp. 89-90). By embedding the non-propositional content as a constituent part of the belief (for example, “I am being appeared to thusly”), it is possible to show how non-propositional experiences may provide a basis for forming justified propositional beliefs (for some concerns about the richness of indexically formed beliefs to serve as a foundation see Sosa 2003b, especially pp. 122-124).

3. The Philosophy of Mind and Knowledge by Acquaintance

While knowledge by acquaintance has its most immediate application to philosophical topics in epistemology, it has increasingly been applied to issues in metaphysics, especially in the philosophy of mind. In particular, knowledge by acquaintance has played a role in the knowledge argument against physicalism. Some argue that knowledge of qualia is direct and unmediated, which provides an insight into the nature of the mind that cannot be known through the physical sciences. Frank Jackson presents this argument through a compelling thought experiment about Mary (Jackson 1982; 1986). Mary is a scientist who learns all the physical truths from her exhaustive study of the completed physics. For whatever reasons, Mary has lived her entire life without experiencing any colors besides black, white, and shades of gray. One day after she has mastered all the physical truths and everything that can be deduced a priori from them, Mary leaves her black-and-white environment and sees a ripe tomato.  Intuitively, it seems that Mary learns something new with this experience; she learns this is what it’s like to have a red experience. Since Mary knew all the physical truths prior to seeing the ripe tomato and since Mary learned something new about the world after seeing the ripe tomato, the implication of the thought experiment, then, is that physicalism is false.

More recently, David Chalmers has made use of knowledge by acquaintance to support property dualism in the same vein as the knowledge argument. His arguments rely in part on the notion that our direct knowledge of phenomenal conscious states justifies our beliefs about them (see Chalmers 1996, especially pp. 196-198; 2004; 2007). Among his arguments is one from the asymmetry between a person’s knowledge of consciousness and her knowledge of the rest of the world (Chalmers 1996, pp. 101-103). A person’s knowledge of consciousness is based on first-hand, direct experience of it, not on evidence external to one’s immediate access. His argument, roughly stated, is that since subjects know by acquaintance that phenomenal consciousness exists and possesses certain features, and since this knowledge cannot be deduced a priori from one’s knowledge of physical truths, it follows that these features of conscious experience are not physical.

Some philosophers have attempted to defend physicalist theories of mind with the notion of knowledge by acquaintance, albeit by employing a non-traditional approach to knowledge by acquaintance (compared to the traditional approach described in section 1 and section 2). Generally, these approaches hold that knowledge by acquaintance is nothing more than a subject’s being directly acquainted with a property or fact, and they deviate from the traditional position that knowledge by acquaintance requires propositional belief (see Conee 1994 and Tye 2009). Those who use knowledge by acquaintance to defend physicalism claim that the knowledge argument only highlights two different ways of knowing the same thing. One way this case has been made is by suggesting that one can know all the propositional truths about something (for example, the city of Houston) and yet not know it directly. The difference between knowledge of phenomenal consciousness and knowledge of brain states is like the difference between knowing about Houston (by reading a very thorough visitor’s guide) and knowing Houston directly (by visiting the city). According to these views, it is because knowledge by acquaintance is a different kind of knowledge that phenomenal knowledge appears to differ from descriptive, physical knowledge about brain states. Although there is not space for a full evaluation of these views, one problem that has been raised is that knowledge by acquaintance cannot by itself account for the epistemic disparity that this solution is attempting to explain (see Nida-Rümelin 1995; Gertler 1999). In other words, the problem is that there appears to be propositional, factual content about the properties of conscious experience that these non-standard accounts of knowledge by acquaintance fail to capture.

4. Conclusion

The distinction between knowledge by acquaintance and knowledge by description has a number of important applications in philosophy. In epistemology, it underwrites a tradition from Bertrand Russell that continues to influence debates on the nature of foundationalism and the possibility of a privileged class of knowledge. In metaphysics, knowledge by acquaintance has increasingly been incorporated into arguments concerning the nature of conscious experience and the viability of physicalism. The current trend suggests that knowledge by acquaintance will continue to be refined and put to work on a variety of philosophical fronts.

5. References and Further Reading

  • Balog, K. 2012. “Acquaintance and the Mind-Body Problem.” In Simone Gozzano and Christopher Hill (eds.), New Perspectives on Type Identity: The Mental and the Physical (pp. 16-43). Cambridge: Cambridge University Press.
  • BonJour, L. 2001. “Toward a Defense of Empirical Foundationalism.” In Michael Raymond DePaul (ed.), Resurrecting Old-Fashioned Foundationalism (pp. 21-38). Lanham: Rowman and Littlefield.
  • BonJour, L. 2003. “A Version of Internalist Foundationalism.” In Laurence BonJour and Ernest Sosa (eds.), Epistemic Justification: Internalism Vs. Externalism, Foundations Vs. Virtues (pp. 5-96). Malden: Blackwell.
  • Brewer, B. 2011. Perception and its Objects. Oxford: Oxford University Press.
  • Chalmers, D. 1996. The Conscious Mind. Oxford: Oxford University Press.
  • Chalmers, D. 2003. “The Content and Epistemology of Phenomenal Belief.” In Quentin Smith and Aleksandar Jokic (eds.), Consciousness: New Philosophical Perspectives (pp. 220-272). Oxford: Oxford University Press.
  • Chalmers, D. 2004. “Phenomenal Concepts and the Knowledge Argument.” In Peter Ludlow, Yujin Nagasawa, and Daniel Stoljar (eds.), There’s Something about Mary: Essays on Phenomenal Consciousness and Frank Jackson’s Knowledge Argument (pp. 269-298). Cambridge: MIT Press.
  • Chalmers, D. 2007. “Phenomenal Concepts and the Explanatory Gap.” In Torin Alter and Sven Walter (eds.), Phenomenal Concepts and Phenomenal Knowledge: New Essays on Consciousness and Physicalism (pp. 167-194). Oxford: Oxford University Press.
  • Chisholm, R. 1942. “The Problem of the Speckled Hen.” Mind 51, 368-373.
  • Conee, E. 1994. “Phenomenal Knowledge.” Australasian Journal of Philosophy 72, 136-150.
  • Davidson, D. 1986. “A Coherence Theory of Truth and Knowledge.” In Ernest Lepore (ed.), Truth and Interpretation: Perspectives on the Philosophy of Donald Davidson (pp. 423-438). Malden: Blackwell.
  • Fales, E. 1996. A Defense of the Given. Lanham: Rowman and Littlefield.
  • Feldman, R. 2004. “The Justification of Introspective Beliefs.” In Earl Conee and Richard Feldman (eds.), Evidentialism: Essays in Epistemology (pp. 199-218). Oxford: Oxford University Press.
  • Fumerton, R. 1995. Metaepistemology and Skepticism. Lanham: Rowman and Littlefield.
  • Fumerton, R. 2005. “Speckled Hens and Objections of Acquaintance.” Philosophical Perspectives 19, 121-138.
  • Gertler, B. 1999. “A Defense of the Knowledge Argument.” Philosophical Studies 93, 317-336.
  • Gertler, B. 2001. “Introspecting Phenomenal States.” Philosophy and Phenomenological Research 63, 305–328.
  • Gertler, B. 2011. Self-Knowledge. New York: Routledge.
  • Gertler, B. 2012. “Renewed Acquaintance.” In Declan Smithies and Daniel Stoljar (eds.), Introspection and Consciousness (pp. 93-128). Oxford: Oxford University Press.
  • Hasan, A. Forthcoming. “Phenomenal Conservatism, Classical Foundationalism, and Internalist Justification.” Philosophical Studies.
  • Jackson, F. 1982. “Epiphenomenal Qualia.” Philosophical Quarterly 32, 126-136.
  • Jackson, F. 1986. “What Mary Didn’t Know.” The Journal of Philosophy 83, 291-295.
  • Markie, P. 2009. “Classical Foundationalism and Speckled Hens.” Philosophy and Phenomenological Research 79, 190-206.
  • McGrew, T. 1995. The Foundations of Knowledge. Lanham: Littlefield Adams.
  • Nida-Rümelin, M. 1995. “What Mary Couldn’t Know: Belief about Phenomenal States.” In Thomas Metzinger (ed.), Conscious Experience (pp. 219-241). Exeter: Imprint Academic.
  • Pitt, D. 2004. “The Phenomenology of Cognition, or, What Is It Like to Think That P?” Philosophy and Phenomenological Research 69, 1-36.
  • Poston, T. 2007. “Acquaintance and the Problem of the Speckled Hen.” Philosophical Studies 132, 331-346.
  • Poston, T. 2010. “Similarity and Acquaintance.” Philosophical Studies 147, 369-378.
  • Russell, B. 1905. “On Denoting.” Mind 14, 479-493.
  • Russell, B. 1910. “Knowledge by Acquaintance and Knowledge by Description.” Proceedings of the Aristotelian Society 11, 108-128.
  • Russell, B. 1997 [1912]. Problems of Philosophy, ed. John Perry. Oxford: Oxford University Press.
  • Russell, B. 1993 [1914]. Our Knowledge of the External World. New York: Routledge.
  • Sellars, W. 1963. “Empiricism and the Philosophy of Mind.” In Science, Perception, and Reality. London: Routledge and Kegan Paul.
  • Sosa, E. 2003a. “Privileged Access.” In Quentin Smith and Aleksandar Jokic (eds.), Consciousness: New Philosophical Perspectives (pp. 273-292). Oxford: Oxford University Press.
  • Sosa, E. 2003b. “Beyond Internal Foundations to External Virtues.” In Laurence BonJour and Ernest Sosa (eds.), Epistemic Justification: Internalism Vs. Externalism, Foundations Vs. Virtues (pp. 99-170). Malden: Blackwell.
  • Tye, M. 2009. Consciousness Revisited. Cambridge: MIT Press.
  • Williams, M. 1999 [1977]. Groundless Belief. Princeton: Princeton University Press.

Author Information

John M. DePoe
Email: jdepoe@marywood.edu
Marywood University
U. S. A.