
A New Kind of Relevance for Archaeology

In recent years an increasing number of archaeologists have conducted research that is explicitly designed to address contemporary issues (Sabloff, 2008; Cooper and Sheets, 2012; Ingram and Gilpin, 2015; Nelson et al., 2015; Liebmann et al., 2016; Kohler et al., 2017; Hambrecht et al., 2018; Hegmon and Peeples, 2018; Jackson et al., 2018). Despite many exciting results emanating from this work, it seems as yet to have had little impact on actual public policy discussions. For example, despite extensive research by archaeologists on human responses to climate change, to date the results of such research have been largely absent from reports by the Intergovernmental Panel on Climate Change (Jackson et al., 2018). Given that the archaeological record is the most extensive compendium of human experience there is, it seems only natural that the results of archaeological research should have an impact on discussions concerning contemporary issues (Smith et al., 2012; Kintigh et al., 2014; Altschul, 2016; Altschul et al., 2017). But so far there seems to have been limited success in this regard. Why is this? What would an archaeology that has practical relevance beyond archaeology look like? How would it be different from the archaeology many of us practice right now?

In this essay I will offer my own opinions on these sorts of questions. I will argue, first, that the traditional focus of archaeology (constructing historical narratives) is valuable but is unlikely to expand its practical relevance because the results are too contingent on local details. Second, I will argue that traditional “grand synthesis” and cross-cultural research are also insufficient because their results are too general to connect to specific issues and solutions. Finally, I will suggest, perhaps surprisingly, that a productive way forward is to resuscitate and reformulate aspects of the New Archaeology that were not realized in the 1970s. I use the example of settlement scaling theory to illustrate that the New Archaeology’s interest in developing predictive knowledge of specific social phenomena is both possible and productive, and that additional work in this spirit may be our best way forward. In a nutshell, I believe that if archaeology is to achieve greater practical relevance, it will not be through research that reconstructs the past or makes broad generalizations. Rather, it will come from studies of specific social phenomena regardless of where or when they occur.

What is Practical Relevance?

Before getting into the main arguments of this paper, I should say a few words about what I mean by practical relevance. There are many aspects of archaeology that yield practical benefits in the present, from developing sites for cultural tourism to creating the raw material for museum exhibits to promoting social justice for marginalized groups. Here, I use the term “practical relevance” to refer to something more specific: predictive knowledge of specific social phenomena that can help us make informed decisions regarding issues we face today.

Two questions come immediately to mind. First, is it really worthwhile to view human behavior as predictable? There are of course many aspects of the behavior of individuals, and of groups, that are not predictable. But at least some are. As examples: today’s demographic rates have predictable effects for tomorrow’s economy; insurance companies use actuarial tables to predict payouts and adjust premiums with reasonable confidence; political scientists create models based on demographic and socioeconomic characteristics of subgroups that predict election results; the daily movements of individuals follow predictable patterns that allow our smartphones to plot the most time-efficient route of travel between two places; simple models often surpass expert judgment in predicting the outcomes of sporting events; and tech companies use browsing and posting habits to predict which products we are most likely to purchase.

It’s also important to recognize that the ability to predict is generally beneficial. Knowing how many people of different ages will be around at a future date is critical for maintaining the finances of the social safety net; actuarial tables ensure that insurance companies can honor their commitments to people in need; predicting travel times helps individuals use their time more effectively, and connecting people with the products they are likely to want helps consumers in addition to tech companies. So even though many aspects of human behavior may never be entirely predictable, at least some are, at least partly, and it therefore stands to reason that social scientists should be able to expand knowledge of predictable behavior with appropriate effort.

Second, even if one grants that human behavior is at least partly predictable, is it really reasonable to imagine that the knowledge generated through archaeology is relevant for issues we face today? After all, societies of the past were different in innumerable ways from those of the present. They were smaller, lacked modern transport and information technologies, had different social and political institutions, and operated in terms of diverse cultural concepts that for the most part do not characterize late-stage capitalist nations of the present. Given all these differences, why should anyone think the results of archaeology actually apply to today’s decisions?

One possible answer involves social theory. For many decades archaeologists have engaged with social theorists in cultural anthropology, sociology, geography, and related fields to make ontological claims regarding sociocultural phenomena, and in many cases these frameworks have been devised in the context of contemporary societies and then applied to past societies by archaeologists (Shanks and Tilley, 1987; Trigger, 1989; Hodder, 1991, 2012; Olsen, 2010; Alt and Pauketat, 2019). So there is an established tradition that argues, in effect, that the basic properties and relations of human social life apply to all societies. This approach has yielded many insights, but it seems limited from the perspective of practical relevance in that the approach generally does not lead to predictions that can be evaluated empirically. Instead, in most cases the process involves mapping or indexing a conceptual framework onto archaeological information from a given context (Smith, 2015, 2017). Most of the time, this approach helps one interpret the archaeological evidence better, but it doesn’t lead to empirical predictions such that one can know if or how a particular idea is wrong.

By and large, the social sciences do not yet possess a body of such ideas, and I suspect many archaeologists would question whether it is even possible. We should be under no illusions that developing a predictive theory of human society is easy. Still, the history of other sciences provides a basis for optimism. Newtonian mechanics applies to all objects and has sufficient predictive power to engineer spaceships that get people to the moon and rovers to Mars. The periodic table applies to all elements and makes it feasible to develop new compounds. The Neo-Darwinian synthesis leads to predictions about how populations of organisms as simple as bacteria, and as complex as human beings, change from one generation to the next. And some would even argue that behavioral economics reflects intrinsic aspects of human cognition and leads to predictions about human decision-making in any context (Kahneman, 2011; Thaler, 2016). Developing these frameworks is hard—the very fact that the scientists most responsible for these insights are household names is a hint of the difficulties involved. But we need to do it, and more importantly, we need to believe it is a good thing to do, if the social sciences are to play a more important role in our future.

With this perspective in mind, what would an archaeology that has practical relevance for today look like? Recognizing that ultimately this will require theoretical development, I focus in the following sections on the epistemological and methodological basis of what I believe would be a productive approach. I will approach this vision by first illustrating why several traditional approaches to archaeological interpretation are unlikely to achieve practical relevance. Then, I’ll suggest that archaeology got close to moving in this direction in the 1970s but for various reasons turned away from it. Finally, I’ll develop an example which illustrates that practical relevance can be achieved if we are willing to apply the same reasoning and analytical processes that are used throughout the sciences to the material proxies for human behavior we can now derive from the archaeological record.

The Relevance of History

Archaeologists are good at historical reconstruction, and getting better all the time. I’d like to think I’ve contributed to this effort myself (Ortman, 2012, 2016a; Ortman and McNeil, 2018). From GIS to AMS-dating, isotopes, ICP-MS, micromorphology, phytoliths, ancient DNA, LiDAR, UAV photogrammetry, linguistic paleontology and more, archaeological methods continue to expand our ability to reconstruct past human behavior and environments; and as these methods expand, our historical narratives become increasingly detailed, accurate, and compelling. Today, we really do know far more about the details of the human experience through archaeology than we did a few decades ago. And these narratives are important. They feed our imaginations regarding the range of social worlds humans have created and the range of worlds that are possible (e.g., Fowles, 2013); they provide both celebratory and cautionary tales regarding what can happen (Harper, 2017); they support human diversity and multi-culturalism by illuminating the heritage of contemporary peoples (Popa, 2019); and they even contribute to social justice by getting the facts of history right (Preucel, 2002; Cameron, 2008). All of these outcomes are valuable and I want to stress that, in making the argument of this paper, I do not mean to suggest that archaeologists should not continue doing good work in all these areas. Instead, the question I wish to ask is whether the historical narratives that most archaeologists contribute to can lead to predictive knowledge that might help us make informed practical decisions: how (or if) to define land-use zones, redistribute wealth, stimulate economic growth, reduce environmental impacts, improve public health, mitigate the effects of climate change, and so forth. In other words, my question is whether such narratives give us a basis for predicting future outcomes based on actions we could take today.

An example may help to illustrate what I have in mind. Let’s say that one is interested in using the results of archaeological research to suggest productive ways of adapting to climate change. One way to proceed would be to examine specific cases where the long-term history of human-climate relations is understood in great detail. A good example is the Village Ecodynamics Project (VEP), an interdisciplinary project involving archaeologists, computer scientists and ecologists that has worked since 2003 to examine human-environment relationships in the US Southwest. The project has received substantial financial support from agencies and organizations in the US, and I have been fortunate to be a part of it. Working together, we have retrodicted past precipitation and temperature in two study areas by correlating tree-ring series with weather station and pollen core data (Wright, 2012; Bocinsky and Kohler, 2014), and then translated these into productivity estimates (at a temporal resolution of 1 year and a spatial resolution of 4 ha) by combining paleoclimate reconstructions with soils and historic crop yield data (Kohler et al., 2007, 2012b; Varien et al., 2007; Bocinsky and Varien, 2017). We also compiled architectural, ceramic, and chronometric data for thousands of archaeological sites and used these data to estimate the population histories of our study areas at a temporal resolution approaching a single human generation (Ortman et al., 2007; Ortman, 2016b; Schwindt et al., 2016). We created time series for interpersonal violence rates, demographic rates and hunting pressure on wild game (Johnson et al., 2005; Kohler et al., 2008, 2009, 2014; Kohler and Reese, 2014), and we reconstructed patterns of settlement, community organization and migration into and out of our study areas (Glowacki and Ortman, 2012; Ortman, 2012; Glowacki, 2015; Kemp et al., 2017). Finally, we developed agent-based models that provide robust null models for assessing the effects of climate on demographic rates and social organization (Kohler, 2012; Kohler et al., 2012a; Crabtree et al., 2017).

Through this work we have developed an incredibly detailed reconstruction of the social and environmental history of the ancestral Pueblo people who lived in our study areas. Indeed, I think it is fair to say that our syntheses of the archaeological and environmental records of these areas incorporate more cumulative expenditures on archaeology than for any comparably sized areas anywhere in the world. As a result, we now have a much clearer picture regarding how this society collapsed around 1280 CE. Several centuries of rapid population growth, in the context of a subsistence farming society with a modest division of labor, led to a substantial fraction of the population living on land that was vulnerable to drought. When drought finally hit, the overall landscape was still productive enough to feed the population, but people who lived on the most productive lands were not accustomed to producing food surpluses, and people who needed food the most had no means of obtaining it through the economic system. Social breakdown, characterized by extreme internecine violence and a rejection of existing social institutions, led to mass migration and the end of a cultural tradition.

Based on VEP research, it is now clear that the social response to drought was far in excess of its actual impact on regional agricultural potential. And the organization of the society seems to have been a primary reason. Indeed, it is tempting to conclude from this work that a good way to ameliorate the social consequences of climate change is to promote development of non-agricultural sectors in developing nations. But here is where the problem with history begins to show itself. There are competing views on just about every issue in society, even among those who are committed to fact-based analysis. So it is not difficult to imagine someone cross-examining the VEP research by asking “How do you know from this specific case that there is a predictable relationship between climate change, level of economic integration, and extent of sociopolitical disruption?” At this stage, the only honest response would be “we don’t.” Despite all our efforts to get the details of history right in this case, and the exceptional investment of resources in doing so, in the end we cannot say whether the observed level of sociopolitical disruption is a predictable outcome of general processes or a contingent outcome of specific circumstances. We hoped our agent-based models might do this, and they do seem to account for certain aspects of this history, but none of these models reproduces the most obvious and important outcome, which is the actual collapse of the society.

Traditional Responses

This is just one example, but I think it serves to illustrate the point that historical reconstructions always arrive at the same destination. When history is the goal, increasing research time and effort inevitably lead to greater focus on local details at increasing levels of magnification. We do learn a lot more about specific episodes of human experience, but as the narratives become more detailed our ability to extract practical knowledge from them declines. This is not a new problem, as archaeologists have recognized local contingency as a barrier to generalization for a long time. Faced with this problem, archaeologists interested in generalization have traditionally pursued one of two approaches.

The first is the process Altschul (2016) has labeled “traditional synthesis”: qualitative comparison of a series of case studies (Childe, 1936; Adams, 1966; Ford, 1969; Blanton et al., 1993; Johnson and Earle, 2000; Trigger, 2003; Diamond, 2005; Flannery and Marcus, 2012; Jennings, 2016). Such studies have always identified interesting patterns, at least some of which must reflect predictable regularities in human affairs. But due to the inter-correlated nature of many properties of human societies it remains extremely difficult to identify predictable causal pathways that relate to specific issues. To offer just one example: in Understanding Early Civilizations, Trigger (2003) found that early civilizations exhibit idiosyncratic cultural variation but strong regularities in their economies and social and political organizations that cannot be explained by historical connections or shared ancestry. He concludes that the primary factors behind the emergence of civilization are more political and economic than strictly ecological or cultural (Trigger, 2003, p. 674–676). “Some of the parallels appear to result from the operation of practical reason, while others reflect little-understood tendencies of the human mind to produce particular types of analogies” (Trigger, 2003, p. 685). These are deep insights, but they are very general, and as such they do not provide much basis for practical decisions one could make to address a specific contemporary issue. So although traditional synthesis yields fascinating generalizations, it is not structured enough to provide more than a starting point for an archaeology with practical relevance.

The second approach is cross-cultural analysis. As with traditional synthesis, there is a long and varied tradition in this sort of work, in both cultural anthropology and archaeology (Murdock, 1949; Driver and Massey, 1957; Oliver, 1962; Carneiro, 1967; Jorgensen, 1980; Ember and Ember, 1994; Peregrine, 2003; Gell-Mann, 2011; Whitehouse et al., 2019). Much of it has involved extraction of nominal or ordinal variables from primary ethnographic and archaeological literature that was rarely created or written for this purpose. For the most part, these studies focus on establishing statistical relationships between variables. A good recent example of this style of research is the SESHAT project, which has compiled a global archaeological and historical database and used it to test hypotheses about the underlying structure of variation in human social organization at the level of polities (Turchin et al., 2018). SESHAT researchers collected data for 51 (nominal, ordinal and continuous) variables from 414 polities dating from 9600 BCE to 1900 CE and aggregated these into nine “complexity characteristics”: polity population, polity territory, capital population, hierarchy, “texts,” information system, infrastructure, money, and government. Principal components analysis of the scores for each of these characteristics shows that all are highly correlated, indicating that they all tend to evolve together.

This is a strong finding that expands knowledge of the general process of human social evolution. More importantly for the purposes of this paper, the results allow one to predict that if one dimension of social complexity increases, the others are more likely than not to follow suit. Still, notice that what is being predicted in this case is a correlated increase in measures that are complex combinations of many nominal, ordinal, and/or continuous variables. As a result, from this analysis it is not possible to determine how a certain amount of change in any specific property will affect any other property. This is what would be needed for these results to have practical value in addition to scientific value. Also, since the unit of analysis is the polity, many problems related to the internal functioning of societies, cities or households cannot be addressed. So although cross-cultural analysis can lead to predictive knowledge, such studies tend to operate at a level of abstraction that is too general to address specific social problems and solutions.

These two traditional responses to the problem of historical contingency, then, have the opposite problem: instead of leading to results that are too contingent on local and historical factors to apply elsewhere, they lead to results that are too general to be useful for predicting the outcomes of specific actions. Identifying cross-cultural regularities and patterns in (pre)history is extremely interesting, and one would expect much useful information to be embedded in the results of such studies. But the relationships identified through such studies are typically too general for practical application.

Unfinished Business?

What then to do? I’d begin by noting that the issues discussed above are once again nothing new, as archaeologists have been aware of the shortcomings of traditional approaches as a means of generating predictive knowledge of human affairs ever since the foundational writings of the New Archaeology. This intellectual movement of the 1960s and 70s drew on the philosophy of logical positivism, which was viewed by its proponents as the foundation of the natural sciences, in an attempt to generate “covering laws” that applied to the entirety of the archaeological record (Hempel, 1966; Binford, 1968; Watson et al., 1971). The New Archaeology was not successful in its stated aims, but I want to suggest that the reasons behind its failure may help archaeology chart a path toward enhanced practical relevance.

The New Archaeology had several shortcomings. One was the appeal to philosophers of science as opposed to actual practice. This was unfortunate because logical positivism is an abstraction that never characterized actual practice in the natural sciences (Smith, 2017). To give just one example, contrary to the formal, binary logic of logical positivism (“In C, if A, then B”) (Watson et al., 1971, p. 6–7), most scientific knowledge claims are actually statistical: what the average outcome should be, the likelihood of a certain level of effect, and so forth. A second shortcoming was a faulty conception of “explanation.” In its best-known manifesto, Watson et al. (1971) argued that the major goal of archaeology was to show that specific past events are instances of a general or “covering” law. In their words, “A scientist explains a particular event by subsuming its description under the appropriate confirmed general law, that is, by finding a general law that covers the particular event by describing the general circumstances, objects, and behavior of which the particular case is an example” (Watson et al., 1971, p. 5). This formulation suggests the goal of archaeology is to explain the specific historical case by showing that it is an instance of a general rule. Archaeologists can certainly do this. But the earlier discussion of history suggests that if the goal of archaeology is to explain the specific event, delving into the details toward historically contingent factors will be far more productive. So following this procedure actually drives one away from the search for generalizations that have practical relevance.

A final shortcoming of the New Archaeology was a fuzzy distinction between explanation of human behavior vs. explanation of the archaeological record (Schiffer, 1972). Explaining why the archaeological record has the properties it has—what has come to be known as middle-range theory—is a necessary step in translating observations of that record into proxies for past human behavior. But such theory was largely absent in the 1960s, and as a result early attempts at explanation in archaeology, notably the work of the so-called “ceramic sociologists” (Longacre, 1964; Hill, 1970), were readily deconstructed (e.g., Allen and Richardson, 1971). Still, several aspects of contemporary archaeology, including the study of site formation processes, taphonomy and ethnoarchaeology, are positive outcomes. In the US Southwest, for example, archaeologists today routinely use generalizations derived from ethnoarchaeological studies of abandoned structures to interpret the fill stratigraphy and floor assemblages of ancient dwellings (Stevenson, 1982; Schiffer, 1985; Cameron and Tomka, 1993); and they use the discard equation to relate artifact accumulations to household inventories, people, and time (Schiffer, 1987; Mills, 1989; Varien and Mills, 1997; Varien and Potter, 1997; Varien and Ortman, 2005). The relationships between human behavior and site formation processes captured in these approaches are highly predictable; indeed, one can rightly claim that these relationships explain basic properties of the archaeological record.
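
To make the logic of the discard equation concrete, the following is a minimal sketch in Python, assuming the form in which the equation is commonly written: total discard equals the number of items in use, multiplied by the elapsed time, divided by the average use-life of an item. The function names and the numerical values (four cooking pots per household, a two-year use-life, a 25-year occupation) are hypothetical and chosen only for illustration; they are not taken from the studies cited above.

```python
def total_discard(items_in_use, years, use_life_years):
    """Expected number of items discarded over a period.

    Assumes the commonly written form of the discard equation:
    total discard = (items in use * elapsed time) / average use-life.
    """
    if use_life_years <= 0:
        raise ValueError("use_life_years must be positive")
    return items_in_use * years / use_life_years


def occupation_years(total_discarded, items_in_use, use_life_years):
    """Rearrange the same relationship to estimate occupation span
    from an observed accumulation of discarded items."""
    if items_in_use <= 0:
        raise ValueError("items_in_use must be positive")
    return total_discarded * use_life_years / items_in_use


# Hypothetical household: 4 cooking pots in use, each lasting about 2 years.
pots_in_use = 4
use_life = 2.0

# Forward direction: accumulation expected after a 25-year occupation.
expected = total_discard(pots_in_use, 25, use_life)
print(f"Expected discarded pots after 25 years: {expected:.0f}")

# Inverse direction: occupation span implied by an observed accumulation.
span = occupation_years(expected, pots_in_use, use_life)
print(f"Occupation span implied by {expected:.0f} pots: {span:.0f} years")
```

The same relationship can be solved for any one of its terms, which is what allows an observed accumulation to serve as a proxy for people or time once the other quantities are estimated independently.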

But in the end, explaining the archaeological record as a present-day phenomenon is only an instrumental goal. It’s a necessary step, but things only start to get relevant outside of archaeology when one uses this knowledge to study human social dynamics. Since little of this theory existed in the 1960s, proponents of the New Archaeology quickly realized that middle-range theory had to come first, and as a result the scientific knowledge they produced focused on the archaeological record as a present-day phenomenon. Kent Flannery famously derided the initial results as mere “Mickey Mouse laws” (Flannery, 1973), and such critiques led archaeologists to abandon the ultimate goal of the New Archaeology program and return to the goals of traditional synthesis, leading (among other things) to the variety of evolutionary approaches that continue to have practitioners (and critics) today (Wright and Johnson, 1975; Sanders et al., 1979; Flannery and Marcus, 1983, 2012; Feinman and Marcus, 1998; Johnson and Earle, 2000; Laland and Brown, 2002; Shennan, 2002; Smith, 2003; Yoffee, 2005; Pauketat, 2007; Jennings, 2016; Lekson, 2018).

Much of this intellectual history is well-known to archaeologists, and the field has advanced in many ways since the 1970s. Still, notice what the ultimate goal of the New Archaeology actually was: to discover regularities in human social behavior that are context independent, with the implication that they apply to the present as well as the past. And notice what its methodology was: to develop theory that leads to predictions (“test expectations”) that can be checked against measurements derived from the archaeological record. This sounds precisely like the kind of knowledge that would contribute to contemporary conversations regarding urban planning, economic development, inequality, sustainability, migration, health, and other issues. In short, the New Archaeology would appear to represent an initial, and still unrealized, attempt to achieve practical relevance for archaeology. In the process of thinking through what archaeology as a social science would look like, it became apparent that archaeologists needed to translate material traces into reliable proxies for past human behavior before they could hope to investigate human social dynamics. Archaeologists today routinely use the results of this effort as part of normal practice. Perhaps the issue, then, is that it was not possible to realize the ultimate goal of the New Archaeology because the field needed to develop middle-range theory first. In other words, perhaps the failure of the New Archaeology was due not to a mismatch between scientific reasoning and human society, but to the fact that archaeology had to build the capacity to study human social dynamics before it could apply such reasoning to the study of specific social phenomena. The New Archaeology was highly successful with this initial goal. Perhaps we can still accomplish the second?

I suspect many readers will have an immediate negative reaction to this suggestion. The New Archaeology was clearly not successful in its stated aims. And I suspect many readers would argue that it failed because a natural science-type of reasoning does not apply to human affairs. After all, archaeology is a historical science, like paleontology, where it’s not possible to achieve experimental control or re-run the tape of history again and again (Gould, 1989). And the archaeological record is hopelessly haphazard and partial in its details. The material residues of past behavior that it preserves vary dramatically for all manner of reasons, from the material cultures and technologies of past societies to subsequent disturbance to decomposition and so forth. It’s also quite expensive to collect enough data, in systematic enough ways, to really use this record in a natural science kind of way. So we shouldn’t pretend we can. And so the argument goes.

But let’s think about this argument a bit more, using an example of how scientific research is actually practiced in a field that generates useful knowledge. Although it does not provide a perfect analogy, the example of clinical trials is instructive. When medical researchers test the efficacy of a new drug, they typically study three groups—one that receives the treatment, a second that receives a placebo and a third that does not receive a treatment at all. As the patients are human beings with free will, it is impossible to completely control for variation in human biology, the life history of patients prior to treatment, and the behavior of patients during or after treatment. So, in clinical trials “experimental control” is achieved by stratifying patients into genetic, demographic and/or life-style subgroups and then examining the effect of the treatment across large numbers of people in each group, under the assumption that the uncontrolled effects will effectively cancel each other out. There is no attempt to quantify or even document all of these uncontrollable factors.

In other words, variation in the biology and experience of individuals is at best only partly controlled in such experiments. Instead, experimental control is achieved through sample size and stratification into subgroups. The logic of such studies is that despite the myriad uncontrollable factors that govern outcomes for any given individual, it is still possible to determine the average effect of a single factor across a population, and to determine courses of action that have a significant impact on people’s lives, through statistical analysis of outcomes for many individuals across subgroups. The main methodological principle in clinical trials, then, is that to learn something useful about a particular unit of study (in this case, individual humans), the best way to control for all the factors that one cannot control at the level of that unit of study is to compare results across large numbers of units. When this is done, one can develop predictive knowledge concerning the average effects of a specific factor for specific outcomes. And the results are clearly useful. Indeed, in the case of clinical trials, many people’s lives depend on them.
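
The principle described here, that uncontrolled individual variation averages out when many units are compared within subgroups, can be illustrated with a small simulation. The following Python sketch is an illustration under stated assumptions, not a model of any actual trial: the treatment effect, subgroup baselines, and noise level are hypothetical values, and the "uncontrolled factors" are represented simply as random noise added to each individual's outcome.

```python
import random
import statistics


def simulate_outcome(rng, baseline, treated, effect, noise_sd):
    """One individual's outcome: subgroup baseline, plus the treatment
    effect if treated, plus noise standing in for uncontrolled factors."""
    return baseline + (effect if treated else 0.0) + rng.gauss(0.0, noise_sd)


def stratified_effect(rng, baselines, n_per_arm, effect, noise_sd):
    """Estimate the average effect by taking the treated-minus-control
    difference in means within each subgroup, then averaging across subgroups."""
    diffs = []
    for baseline in baselines:
        treated = [simulate_outcome(rng, baseline, True, effect, noise_sd)
                   for _ in range(n_per_arm)]
        control = [simulate_outcome(rng, baseline, False, effect, noise_sd)
                   for _ in range(n_per_arm)]
        diffs.append(statistics.fmean(treated) - statistics.fmean(control))
    return statistics.fmean(diffs)


# Hypothetical values chosen only for illustration.
TRUE_EFFECT = 2.0               # average effect of the treatment
BASELINES = [50.0, 60.0, 70.0]  # three subgroups with different baselines
NOISE_SD = 10.0                 # individual variation, much larger than the effect

rng = random.Random(42)
small = stratified_effect(rng, BASELINES, 10, TRUE_EFFECT, NOISE_SD)
large = stratified_effect(rng, BASELINES, 5000, TRUE_EFFECT, NOISE_SD)

print(f"True effect:                   {TRUE_EFFECT:.2f}")
print(f"Estimate, 10 per arm/stratum:   {small:.2f}")
print(f"Estimate, 5000 per arm/stratum: {large:.2f}")
```

With only a handful of individuals per subgroup the estimate can land far from the true value, whereas with thousands it settles close to it, even though no individual's noise term is ever measured or controlled.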

Notice that the practical relevance of clinical trials does not necessarily derive from exotic analytical or statistical methods. Indeed, the statistical techniques typically used in clinical trials (statistical tests, regression, etc.) are also part of the basic toolkit of archaeologists. And using these tools, it is possible to say that we “know” that a certain type of pottery is older than another; that the average house grew larger over time; or that the length of a knife is unrelated to its width. In other words, this logic, which characterizes both clinical trials and archaeology, can and does lead to secure and even predictive knowledge of the world. For example, we can use it to predict, with high confidence, that an archaeological site at which a certain variety of pottery is common was occupied during a certain time period.

The main difference between artifact analysis and the clinical trial, then, is the practical relevance of the unit of analysis. Knowing that a certain treatment will increase life expectancy for patients makes a difference in people’s lives today. Knowing that sites bearing a certain kind of pottery were inhabited during a certain period, in and of itself, does not. The point here is that archaeologists know how to do this kind of analysis; we just don’t typically do it in such a way that the results could have practical relevance. For archaeologists, our potentially relevant units include households, neighborhoods, settlements, polities, ethnic groups, and populations. But in most studies of these units, we have been content with historical reconstruction, traditional synthesis, or cross-cultural comparison. We do not have a tradition of applying the same techniques we normally apply in everyday analysis and interpretation to the units that matter beyond our field.

What I am suggesting, then, is that what archaeology needs to do to achieve greater practical relevance is replace the patient in the clinical trial with a household, neighborhood, settlement, polity, ethnic group, or population, based on relevant material proxies supported by middle-range theory. There is no reason why archaeologists cannot do this. We just need to apply this logic to relevant units of analysis, design and implement appropriate methodologies, and use the law of large numbers to provide effective controls. In addition, we need to work with other social scientists to develop theories, models, and expectations regarding how proxies for human behavior derived from the archaeological record might be expected to vary under specific conditions. There is no road map for doing this, but in the final section I’d like to develop an example, drawn from my more recent work, which shows that this can be done.

An Example

In this final section, I discuss the ideas, methodology, and results of the Social Reactors Project, a collaboration among archaeologists, urban scientists and economists that is investigating agglomeration effects, past and present, using ideas from network science and complex systems. I present this example not so much to promote these particular ideas (although I do find them compelling), but to illustrate the more general point that it is possible to develop predictive knowledge of human affairs that is relevant for the present and future by combining familiar archaeological proxies and analytical methods with a dose of theoretical abstraction.

The basic idea at the center of our approach is that when humans arrange themselves in space, they do so in ways that balance the material benefits of social contact with the cost of moving around to do it. We do not view this as a utility-maximizing process (as in economics), but as a balancing of costs and benefits following the tradition in geography (Alonso, 1964; Christaller, 1966; von Thünen, 1966). We suggest the spatial equilibrium resulting from this balancing act leads to the concentration of humans, their interactions, and their outcomes, in space and time. As a result, individuals in larger settlements have more social contacts and exchanges per unit time; and there are also increased opportunities for specialization as individuals can meet more of their material needs through human networks as opposed to their own individual effort. This process, which we label the “social reactor process,” induces human networks to grow in consistent, non-linear, and open-ended ways with population (Bettencourt, 2013, 2014; Ortman et al., 2015, 2016; Cesaretti et al., 2016; Hanson et al., 2017, 2019; Ortman and Coffey, 2017).

The key question, for the purposes of this essay, is how we justify the claim that the social reactor process is an intrinsic property of human settlements. After all, there are innumerable social, cultural, geographic and historical factors, beyond population, which interact in complex and often unobservable ways to produce the observable properties of each individual settlement. How can one claim to know that population, by itself, has a predictable effect on such properties? There are two parts to the answer. First, we use the results of middle-range research to identify archaeological proxies for the parameters of settlement scaling models. The sorts of measures we have used include house and structure counts and densities; the lengths and widths of roads, paths and public spaces; the areas and volumes of houses and public works; and the densities, ratios, and diversity of artifact types.

Second, we use the logic of the clinical trial. The archaeological record is obviously haphazard when viewed in detail. Not only is preservation partial, but investigation of the remaining traces is also biased in several ways due to the time and expense involved in archaeological field and laboratory work and the changing interests of investigators over time. As a result, there is error associated with every measurement, and we cannot know, for example, the exact momentary population, or the precise rate of pottery consumption, for any past settlement. And more importantly, even if we could measure the properties of individual settlements precisely and accurately, it would still be the case that every settlement has a unique history, such that a myriad of factors beyond population, only some of which are observable, have combined in unique ways to produce its specific observed properties. Due to these combined effects of measurement error and historical contingency, it is not reasonable or feasible to test predictions of settlement scaling theory (SST) through analysis of a single settlement. The only way to do it is to compare many settlements, ideally from many settlement systems, to see if the predicted effects are apparent, on average, across all of them.

It turns out this task is relatively straightforward once one has compiled relevant data. SST argues that the average effect of settlement population for an aggregate property of interest is given by a power function $Y = Y_0 N^{\beta}$, where Y is the aggregate property, $Y_0$ is a baseline value, N is the settlement population, and β is an exponent that summarizes the rate of increase of the property relative to the population. The theory also includes mathematical models that derive predictions for what the exponent β should be, depending on whether the property of interest represents a socio-economic rate, a measure of functional diversity, or a measure of physical infrastructure. These scalar effects of human networks can be observed empirically by fitting a linear function to log-transformed measures of N and Y across a sample of settlements in a system. This is feasible because $Y = Y_0 N^{\beta}$ and $\log Y = \beta \log N + \log Y_0$ are equivalent expressions. When this is done, the slope of the fit line is an estimate of β, and its intercept is an estimate of $\log Y_0$, and thus of $Y_0$. The details of SST have been presented in a variety of places (Bettencourt, 2013, 2014; Ortman et al., 2014, 2015; Youn et al., 2016), but for present purposes the key point is that the analysis determines whether, on average, the estimated exponent β falls within the range of statistical tolerance of the value predicted by the relevant model. So when we conduct a scaling analysis, we are testing whether a specific prediction of the framework is borne out by the data. When the data do not conform to the prediction, it tells us that something is wrong, either with the model or with the data.
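The log-log fit described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `fit_scaling`, the synthetic data, the noise level, and the use of a normal critical value (1.96) for the approximate 95% interval are all choices made here, not details taken from the Social Reactors Project.

```python
import numpy as np

def fit_scaling(population, aggregate):
    """Ordinary least squares fit of log Y = beta * log N + log Y0.

    Returns (beta, Y0, standard error of beta).
    """
    log_n = np.log(np.asarray(population, dtype=float))
    log_y = np.log(np.asarray(aggregate, dtype=float))
    n = log_n.size
    x_mean, y_mean = log_n.mean(), log_y.mean()
    sxx = np.sum((log_n - x_mean) ** 2)
    beta = np.sum((log_n - x_mean) * (log_y - y_mean)) / sxx
    log_y0 = y_mean - beta * x_mean
    residuals = log_y - (beta * log_n + log_y0)
    # Residual variance uses n - 2 degrees of freedom (slope and intercept).
    se_beta = np.sqrt(np.sum(residuals ** 2) / (n - 2) / sxx)
    return beta, np.exp(log_y0), se_beta

# Synthetic (not archaeological) settlements generated from a known exponent.
rng = np.random.default_rng(0)
true_beta, true_y0 = 7 / 6, 20.0
population = np.exp(rng.uniform(np.log(10), np.log(100_000), size=400))
noise = np.exp(rng.normal(0.0, 0.4, size=400))  # stands in for error and contingency
aggregate = true_y0 * population ** true_beta * noise

beta_hat, y0_hat, se = fit_scaling(population, aggregate)
ci_low, ci_high = beta_hat - 1.96 * se, beta_hat + 1.96 * se
print(f"beta = {beta_hat:.3f}, approx. 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
print("prediction 7/6 inside interval:", ci_low <= 7 / 6 <= ci_high)
```

The final check mirrors the test described in the text: the question is whether the predicted exponent falls within the statistical tolerance of the estimate, not whether any single settlement matches the model.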

An example of such a test is shown in Figure 1, which examines the relationship between settlement population and aggregate settlement productivity in the archaeological record of five New World societies: the Basin of Mexico; the Prehispanic Upper Mantaro Valley of highland Peru; the Mesa Verde region in Colorado, USA; the Middle Missouri region in North and South Dakota, USA; and the Lower Santa Valley of coastal Peru (Data Sheet 1). The data for the Lower Santa Valley derive from a settlement pattern survey of the region by Wilson (1985, 1988), and the other datasets have been analyzed in previous publications (Ortman et al., 2015, 2016; Ortman and Coffey, 2017). These settlements encompass six orders of magnitude in population, 60° in latitude, and 6,000 years in time.

Figure 1. The relationship between settlement population and total house area in five New World societies. Note that the intercept of the fit line for each society is different, but the slope (coefficient) of the fit line is very similar across cases.

The proxy for settlement population varies across societies. In the Basin of Mexico population is estimated either by multiplying the domestic mound count by the average household population, or by multiplying the site area by a population density indexed to the surface artifact density; in the Middle Missouri, Upper Mantaro and Lower Santa the estimated population is simply the number of domestic residences in the settlement; and in the central Mesa Verde the estimated population is the number of pit structures present. In all cases, the proxy for a socio-economic rate is the total area of the domestic structures (or mounds) in the settlement. We treat the latter as a measure of total settlement productivity based on a variety of archaeological and ethnoarchaeological studies which support an association between house size and wealth (Smith, 1987; Blanton, 1994; Kohler and Smith, 2018). The basic argument is that, because most wealth in past societies took the form of tangible goods, households that had more stuff per person needed more floor area per person. A recent demonstration of this comes from a study of households in Aztec-period Central Mexico which found that larger houses are associated with greater amounts of more valuable possessions (Olson and Smith, 2016).

The process by which total house areas are estimated varies substantially across settlements both within and between regions. In some cases, total roofed space was measured directly based on complete surface preservation or geophysical survey; in others counts and average areas of different classes of structure are reported; and in still others only the counts and areas of those domestic mounds that happen to be preserved are reported. In such cases we either multiplied the average mound area by the house count to estimate the total space, or we calculated a weighted average area per structure based on reported counts and average areas of documented structure types and then multiplied the total structure estimate by this average.
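The weighted-average procedure described above amounts to simple arithmetic, sketched here for concreteness. The function name, the structure classes, the counts, and the use of square meters are hypothetical values chosen for illustration; they are not drawn from Data Sheet 1.

```python
def estimate_total_house_area(structure_types, total_structures):
    """Estimate total roofed area from partially documented structures.

    structure_types: list of (documented_count, mean_area) pairs, one per class.
    total_structures: estimated number of structures in the whole settlement.
    """
    documented = sum(count for count, _ in structure_types)
    if documented == 0:
        raise ValueError("at least one documented structure is required")
    # Weighted average area per structure across the documented classes.
    weighted_mean = sum(count * area for count, area in structure_types) / documented
    return weighted_mean * total_structures

# Hypothetical site: 12 small houses averaging 30 m^2 and 4 large houses
# averaging 60 m^2 are documented; 40 structures are estimated in total.
documented_types = [(12, 30.0), (4, 60.0)]
total_area = estimate_total_house_area(documented_types, total_structures=40)
print(total_area)
```

The simpler case mentioned in the text, multiplying the average mound area by the house count, is the same calculation with a single structure class.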

Due to the realities of the archaeological record, and the resulting data, there are obviously errors in the estimates of both population and total domestic roofed area at every site. These data at best represent conditions at the moment of peak occupation, which need not have occurred simultaneously across sites in a region. The relationships between structure count and population, and roofed space and wealth, are also only approximate. Finally, even if we could measure population and wealth exactly on an annual basis, with no error, the actual wealth possessed in each site at a given moment would have derived from all sorts of factors in addition to population size. So even if we had perfect data, and even if our model is right, we would not expect it to predict the observed value of Y for each site. In the real world, the best one can hope is that all of these factors cancel each other out, allowing us to recover the average relationship between N and Y reflected by the slope of the fit line. This is exactly the logic of a clinical trial: one cannot predict the precise outcome of treatment for any individual patient, but one can predict the average outcome across individuals in a sample.

In addition, the average relationship between settlement population and total house area varies across regions because the specific measures vary. In some cases population estimates are in persons, and in others they are in households. In some cases, house areas are based on mound dimensions, whereas in others they are based on actual wall foundations. Finally, the baseline amount of roofed space per capita varies across regions due to a variety of factors, including but not necessarily limited to the productivity of environments, farming technologies, transport costs, and a variety of social institutions that affected the productivity of social interaction.

Despite all of these caveats, Figure 1 and Table 1 show that there is a striking regularity in the relationship between settlement population and house area. Across these five societies the size distribution of settlements varies, and the overall height of the relationship varies, but the slopes of the fit lines capturing the relationship are nearly identical. Table 1 shows that these slopes are all in excess of one and in the vicinity of the theoretical prediction of 7/6, or 1.167. All of the regressions have high r-squared values, but these are in part autocorrelation effects that derive from using the house count to construct the roofed space estimate at many sites. Still, in most cases the 95% confidence interval of the estimate for β excludes one, which is what the slope of the relationship would be if the estimates of roofed space per capita were independent of the site population. These results thus provide striking evidence for a specific empirical regularity in the relationship between population and material productivity.

Table 1. Estimated scaling coefficients for the relationship between settlement population and total house area in five New World societies.

This uniformity can be made even clearer by centering the data from each region so that the mean coordinate of each dataset is at the origin. This is done using the following formula:

$$\operatorname{centered}(x_i) = x_i - \left[\frac{\sum_{i=1}^{n} x_i}{n}\right], \qquad (1)$$

which allows one to use the data from all five regions in a single regression analysis. The relationship for the centered data is presented in Figure 2, and in the bottom row of Table 1. This analysis leads to a remarkable result. The value of β predicted by SST for socio-economic rates is 7/6, or 1.167; the observed value in this centered dataset is 1.165, with a standard error of 0.026. This means that, when one controls for regional differences by centering, and for other factors beyond population through sample size, the resulting estimate of the average rate of gain in productivity with increasing settlement population is within two one-thousandths of the predicted value. This result provides striking support for the model.
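A minimal sketch of the centering step and pooled regression follows. It assumes that centering is applied to the log-transformed population and house-area values of each region before pooling; the function names and the two noise-free synthetic regions are illustrative only.

```python
import numpy as np

def center(values):
    """Subtract the dataset mean so that the centered values average to zero."""
    values = np.asarray(values, dtype=float)
    return values - values.mean()

def pooled_slope(regions):
    """Slope of one regression through the centered data of all regions.

    regions: dict mapping a region name to (log_population, log_area) arrays.
    """
    xs = np.concatenate([center(log_n) for log_n, _ in regions.values()])
    ys = np.concatenate([center(log_y) for _, log_y in regions.values()])
    # Both pooled variables have mean zero, so the least-squares intercept is
    # zero and the slope reduces to sum(x * y) / sum(x ** 2).
    return float(np.sum(xs * ys) / np.sum(xs ** 2))

# Two synthetic regions sharing a slope of 1.2 but with different intercepts.
log_n_a = np.log([50, 200, 800, 3200])
log_n_b = np.log([5, 10, 40, 90, 300])
regions = {
    "region_a": (log_n_a, 1.2 * log_n_a + 2.0),
    "region_b": (log_n_b, 1.2 * log_n_b + 5.0),
}
slope = pooled_slope(regions)
print(round(slope, 6))
```

Because each region is shifted to its own mean, differences in intercept (units, baseline roofed space per capita) drop out, and only the shared slope remains to be estimated.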

Figure 2. Evidence that house areas follow a single scaling relationship across societies. In this plot, the data have been centered by subtracting the mean coordinates of the data for each society from each data point. This process re-scales the data so that the center of the data for each society is at the origin. The estimated coefficient of the scaling relationship for the centered data is within two one-thousandths of the predicted value.

There is one final point that should be made about this analysis. In the contemporary world the height of a scaling relationship, captured by the intercept of the fit line, generally increases from year to year. Current theory suggests such increases are due to decreases in transport costs and increases in the energetic productivity of individual interactions. As a result, one needs to center the data by year if one wishes to use contemporary data from different years in a single analysis. In contrast, Figures 1 and 2 combine sites that date to different moments in the history of each region. There is no theoretical reason to expect that the intercept of the fit line capturing the relationship between N and Y should be static, but previous studies have not found evidence for a changing intercept over time (Ortman et al., 2015, 2016; Ortman and Coffey, 2017). The fact that the pooled analysis presented here leads to an estimate of β that is so close to the theoretical prediction provides additional evidence for consistency in the basic energetics of the economy in each of these regions over long periods of time. This does not mean past economies were static, but it does suggest the easiest way for societies to increase productivity is through agglomeration. This is a striking finding with obvious relevance for social policy.

This framework and type of analysis have been applied to a range of urban and non-urban settlement systems known through history and archaeology (Hamilton et al., 2007, 2018; Ortman et al., 2014, 2016; Cesaretti et al., 2016; Hanson and Ortman, 2017; Hanson et al., 2017; Ortman and Coffey, 2017; Altaweel and Palmisano, 2018). And it has also been applied to a range of data from contemporary urban systems (Pumain et al., 2006; Bettencourt, 2013; Lobo et al., 2013; Schläpfer et al., 2014; Bettencourt and Lobo, 2016; Mahjabin et al., 2018). So far, with allowance for a few wrinkles, the data have been consistent with specific expectations of settlement scaling models in every case. These results suggest that, at least with respect to population size, human agglomeration effects are highly predictable. This does not mean that doubling the size of a given city today would necessarily increase its per capita socio-economic rates by the roughly 12 percent (a factor of $2^{1/6}$) that an exponent of 7/6 implies. Indeed, there are cases from recent times where specific cities have grown substantially in population without a corresponding increase in GDP, for example (Henderson, 2003; Jedwab and Vollrath, 2015). But the theory does say that this is the average expectation. In essence SST allows one to control for agglomeration effects, thus bringing other factors that influence outcomes in specific situations into greater focus. It doesn’t disregard history or context; it simply captures the physical and energetic factors that constrain the range of histories that are possible. SST only deals with the material effects of agglomeration. It does not address associated psychological or emotional effects, or indeed, any other aspects of life in cities that it would be worthwhile to know more about. But it is still a good start. Indeed, I think being able to make mathematical predictions regarding anything specific about human networks is an exciting advance.
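The arithmetic linking the exponent to per capita gains can be checked directly. Under the power function Y = Y0·N^β, the per capita value Y/N scales as N^(β − 1), so multiplying population by a factor k multiplies the per capita value by k^(β − 1). For β = 7/6, the quantity β − 1 = 1/6 ≈ 0.167 is the elasticity (the proportional gain per unit of proportional growth), while a full doubling compounds to 2^(1/6) − 1, or about 12 percent. The function name below is a hypothetical helper introduced for this illustration.

```python
def per_capita_gain(growth_factor, beta):
    """Average proportional change in a per capita rate when population is
    multiplied by growth_factor, under Y = Y0 * N ** beta."""
    return growth_factor ** (beta - 1.0) - 1.0

# Average expectation for a doubling of population at the predicted exponent.
doubling_gain = per_capita_gain(2.0, 7 / 6)
print(f"{doubling_gain:.1%}")
```

As the text stresses, this is an average expectation across many settlements, not a forecast for any particular city.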

An additional important aspect of the Social Reactors Project is that archaeological evidence is not merely being used to confirm an existing theory. Rather, it is being used to expand and elaborate the theory. For example, the theory proposes that the increasing returns to scale that characterize contemporary urban systems derive from the expansion of human connectivity brought about by density. In a modern context density is a tricky concept because it is sensitive to the area over which people are counted, and when they are counted. What area should be used? What time of day? Today, the edges of built-up urban areas bear little resemblance to administrative and political boundaries, and many workers commute across such boundaries on a daily basis. As a result, it is very difficult to define the relevant spatial units, and interacting populations, that should exhibit increasing returns in contemporary urban systems. It turns out that this problem is much less severe for the smaller and simpler societies known through archaeology. In most cases, the physical settlement and its associated mixing population correspond much more closely in the archaeological record than they do today. As a result, it is actually more straightforward to test SST using archaeological evidence than it often is using contemporary data (Lobo et al., 2019).

Some may question whether the assumptions embedded in this approach—the balancing of costs and benefits, that socio-economic rates are proportional to interaction rates, the idea that interactions have energetic benefits, etc.—are appropriate. It is also reasonable to question whether the archaeological proxies used in testing these models are appropriate, and whether the data at our disposal are of sufficient quality. All of these issues aside, settlement scaling models generate testable predictions that are borne out in many datasets, using a variety of measures, from many societies, with radically different forms of political and economic organization, both past and present. So the empirical support for settlement scaling theory exists regardless of one’s prior beliefs regarding the assumptions in these models and proxies. This is important because it helps the theory stand up to cross-examination by someone who is not predisposed to accept it. To reject the theory, one needs to show that an alternative model accounts for the empirical evidence better. Urban geographers are beginning to interrogate some of the assumptions and results of settlement scaling research more closely (Arcaute et al., 2015; Depersin and Barthelemy, 2018; Keuschnigg et al., 2019). And it would be great if archaeologists contributed to this as well. This is what it will take to build an understanding of agglomeration effects that is strong, clear, and specific enough to guide us into the future.

A New Kind of Relevance

In presenting the example of SST, I do not mean to suggest that the only way archaeology can achieve practical relevance is through the development of explicit formal models. Indeed, in many cases medical researchers show that specific medicines have quantifiable therapeutic effects even when they can’t explain the mechanisms behind them. And there are examples of this kind of logic being applied to archaeology. As an example, Ingram (2015) recently investigated human vulnerability to drought by comparing paleoclimate records with measures of settlement instability for large numbers of settlements located in a variety of ecological settings. Among other things, his analyses found a strong relationship between drought and migration that was insensitive to the proximity of residents to a perennial water source. Additional studies of specific situations like this clearly have the potential to guide future decisions, even in the absence of a formal model.

Regardless of how well SST stands the test of time, I hope this example successfully illustrates that it is possible to build predictive knowledge of human affairs that incorporates but also transcends the archaeological record. The process has just barely begun, but if we believe the archaeological record is at least partly systematic, that human behavior is at least partly predictable, and that scientific reasoning can be employed to improve the human condition overall, this seems like a very good thing to incorporate into an expanding scope of archaeological practice.

Such research is challenging. It requires careful observation of the phenomenon to be explained, definition of key concepts and relations, formulation of theory and models, painstaking work to compile the relevant evidence for testing, careful analysis of the data, and critical evaluation of the entire process. But it is not impossible. The basic logic and analytical procedures for testing such models are already part of the standard training of archaeologists. Middle-range theory continues to provide a basis for constructing valid proxies for human behavior that researchers outside of archaeology will find relevant. Dramatic recent expansion in the ability to collect data on contemporary human behavior is stimulating exciting developments in other social sciences. And the example of settlement scaling theory shows that it is possible to develop predictive theory that is amenable to empirical testing and applies as well to societies known through archaeology as it does to societies that can be observed directly today. As a result, knowledge of the social reactor process emanating from archaeological research should be relevant for urban science and urban policy. I see no reason why this could not also be done for a wider range of contemporary social issues with additional effort, and with more interdisciplinary collaboration.

There is great social benefit in all the things archaeologists do—from heritage management to museum exhibits, cultural tourism, advocacy, historical reconstruction, traditional synthesis, and cross-cultural analysis. We should be proud of everything we do, and keep on doing it. The purpose of this paper has been to suggest that in addition to all this we can and should strive to expand the contemporary relevance of archaeology such that the results of archaeological research can help us make informed decisions in charting a better future as we confront today’s challenges. The archaeological record is the richest and most extensive source of information on human social experience we have. In coming years, I hope more of us will work to develop this record to its full potential.

Data Availability Statement

All datasets generated for this study are included in the manuscript/Supplementary Files.

Author Contributions

SO conceived of and wrote this paper.


Funding

Publication of this research has been supported by a grant from the James S. McDonnell Foundation (#220020438).

Conflict of Interest

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The handling editor is currently organizing a Research Topic with the author, SO, and confirms the absence of any other collaboration.


Acknowledgments

I wish to thank Tim Kohler, Mike Smith, Keith Kintigh, Jeff Altschul, and the article reviewers for many helpful comments on previous versions.

Supplementary Material

The Supplementary Material for this article can be found online at:

Data Sheet 1. House counts and total house areas for archaeological settlements in five New World societies.

