1Over the past 5,500 years, humans have invented and used more than 100 distinct graphic, relatively permanent, and largely non-linguistic means for representing numbers. Numerical notations are semiotically and causally linked to, but distinct from, both the numeral words of their users’ languages and the writing systems used to encode language visually. This essay aims to investigate the interrelationship of writing, language, and numerical notation both in historical time – spanning the period from the origin of literacy to the present day – as well as in everyday practice. Numerical notations are neither mere records of language, nor mere adjuncts to writing. Rather, they constitute a distinct, parallel representational modality, one that is so cross-culturally widespread that it is easy to take for granted.
2In the world as experienced by humans, there are many domains of experience that could potentially have special associated notation systems: sound, color, movement, and kinship, to name only a few. But while such notations do exist, such as Labanotation for notating human movement (Farnell 1994) or Iberian Neolithic stone plaques used to notate social relations (Lillios 2008), they are cross-culturally rare and highly specialized. In contrast, almost every literate person is also a user of a numerical notation system, and in fact these are some of the earliest graphic symbols learned by children (Tolchinsky 2003). Numerical notations developed at least six times independently worldwide (possibly more), in contrast to phonographic writing which developed independently only four times. They are so widespread, and so frequent in daily use, that it is easy to overlook their remarkable ubiquity. Why should these notations be so cross-culturally recurrent, given that there are ready alternatives to their use? And how does their structure differ from other semiotic systems for representing number?
3The comparative study of numerical notations has a long and complex history across traditions in both philology and mathematics (Pihan 1860; Menninger 1969; Guitel 1975; Ifrah 1987; Chrisomalis 2010). Because their analysis is, strictly speaking, neither linguistic nor mathematical, it is an interdisciplinary and complex topic that, inevitably, relies on data collected in a vast array of local historical, linguistic, and archaeological traditions. Linguists have, quite understandably, focused on numeral words over numerical notations, so the understanding of pattern and historical process is far more advanced in the study of lexical numbers (Greenberg 1978; Hurford 1975). Numerals are a word class with their own peculiar morphosyntax that has occupied linguistic theorists for over a century. Since Saussure, linguists have worked within a framework that regards spoken language as the unmarked and default subject of inquiry, with the result that graphic notations detached from language have been analyzed as if they were adjuncts to language. Similarly, mathematicians (and historians of mathematics, many of whom are mathematicians themselves) have their own set of interests and goals, such as the identification of innovations, blind alleys, and useful notations for particular problems. Until the past decade or two, historians of mathematics have taken for granted that the principal function of numerical notation is arithmetical manipulation – quite understandably, given their focus. Only with the more recent greater integration of the history of mathematics with the broader history of science has there been a shift in this perception. It is comparatively rare to use numerical notation directly for arithmetic even today, and in pre-modern periods this function was almost nonexistent.
4Once we conceptualize numerical notations as representational systems and not computational systems, we are able to analyze their development and use as information-processing and information-sharing systems rather than, more narrowly, as adjuncts to mathematical activity. Numerical notation is multifunctional, but its principal function in any case is to represent numerical quantities in a relatively fixed, visual format – in other words, to establish relations of meaning between writers and readers. Rather than asking “Why are the Roman numerals so poor for arithmetic?”, we ought to instead ask, “For what purposes and in what contexts did and do people use Roman numerals?” If, in certain contexts (as in the modern West, from roughly 1600 to 1950), written arithmetic was an important function, then that certainly deserves our attention, but it is not an eternal or inevitable function of numerical notation. To understand the structure of a numeral system requires that we ask for what purposes that structure might be useful. This semiotic and pragmatic orientation differs from the orientations of earlier scholars such as Geneviève Guitel (1975), whose Histoire comparée des numérations écrites treats numerical notation systems principally as formal structures. This is not to say that no formal typology is possible. Rather, it is to insist that whatever typology we adopt must address the factors of use and context relevant to writers and readers, as well as panhuman cognitive capacities relevant to understanding why these human representations take the form they do.
5The present paper builds on the finding that numerical notations are representational and not computational systems to argue that their evolution and cognitive manipulation must, therefore, be explained in representational terms – linking cognition with semiotics rather than arithmetic. Even a brief examination of the structures of numerical notations shows that they differ systematically from the morphosyntax of spoken number words. Because number words are so common, and because every user of numerical notation is also a speaker of one or more languages, the place to begin is in juxtaposing these two representations. To these, we can add systems of tallying and computation technologies like the abacus, each of which has its own properties. The case can then be advanced that numerical notation is a hybrid, incorporating aspects of language and aspects of notation into a system that allows for a fluid recomposition, or transcoding, from one system to another (e.g., through reading and writing). Finally, it is suggested that once we abandon the computational bias of most histories of numeration, we can address the origin of numerical notation, not functionally as an adjunct to bookkeeping, but semiotically as an adjunct to state formation more generally.
6Numerical notations are severely constrained in terms of their structure. The range of attested patterns is far less than the range of imaginable notations. For instance, it would be perfectly coherent to have a system where size is used semantically to indicate numerical value, so that a big 1 might indicate 10 and an even bigger 1 might indicate 100. We could construct a system in which each prime number has its own sign, and each composite number is written through a combination of these signs, whose product is then taken. There is no logical prohibition against a numeral system with a base of, say, seven, so that 100 means (in base 10) 49, but no such system is attested to have ever been used. That these imaginable-but-unattested systems are so readily able to be conceptualized, despite their absence in the historical record of practices of numeral-users, suggests that a set of cognitive and graphic constraints are at work, restricting what systems will actually occur and be accepted by users. Following the argument of Dan Sperber (1985, 1996), not all representations are equal, since some are more readily acquired and shared by human minds.
Fig. 1. Typology of numerical notations (after Chrisomalis 2004, 2010)
7All numerical notations fall into five basic typological categories, across two dimensions (fig. 1). To classify a particular notation, the two central questions we must ask are: How are the signs formed and combined within any power of the base? And how are the different powers combined together to take a total of a numeral-phrase?
8We may call these, respectively, intraexponential and interexponential principles. There are only three basic intraexponential principles. Cumulative systems are those where signs are repeated and added within each power, so, e.g., Roman numeral XXX = 30. Ciphered systems, in contrast, use only one sign per power of the base being expressed; our own Western (sometimes called Hindu-Arabic) numerals have this property. Multiplicative systems use two signs per power, which are multiplied together. This is very common in numeral word systems (e.g., three hundred) but somewhat rarer in numerical notations. Similarly, there are only two interexponential principles. Additive systems require the reader merely to take the sum of signs for the various expressed powers. Positional or place-value systems require the additional step of accounting for the order or location of signs within the phrase, which implicitly indicate multiplication by various powers of the base.
Fig. 2. Cumulative-additive Greek acrophonic numerals
Fig. 3. Cumulative-positional Babylonian sexagesimal cuneiform numerals
9Combining these two principles provides six possible types, of which one (multiplicative-positional) is logically excluded because multiplicative systems express power through special power signs, not through place value (Chrisomalis 2004, 2010; see also Widom & Schlimm 2012 for an interesting variant). The Greek acrophonic numerals are a cumulative-additive system (fig. 2), using different signs for 1 and 5 of each power of 10, combined through repetition and addition, like the (related) Roman numerals. The Babylonian sexagesimal (base 60) numerals exemplify cumulative-positional notation (fig. 3), combining units of 10 and 1 within each power, multiplied by (implicit) powers of 60 based on their order within the phrase. Ciphered-additive systems (fig. 4) like the Georgian alphabetic numerals have distinct signs for each of 1-9, 10-90, 100-900, and 1000-9000, combined additively. Ciphered-positional systems (fig. 5) like the Lao numerals (or for that matter like Western numerals) use the same set of signs, for 1 up to one less than the base, plus a zero, in each position. Finally, multiplicative-additive systems (fig. 6) like the classical Chinese numerals use two signs for each power (except the ones), a multiplier-sign and a power-sign (in this case, 1-9 and 10, 100, 1000…, respectively).
Fig. 4. Ciphered-additive Georgian alphabetic
Fig. 5. Ciphered-positional Lao numerals
Fig. 6. Multiplicative-additive classical Chinese numerals
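The five attested types can be made concrete with a minimal sketch. It assumes a pure base of 10 with no sub-base of 5, works only for 1 to 9999, and uses hypothetical stand-in signs (`I`, `X`, `C`, `M` for the powers, `|` for a unit stroke, `-` for an empty position) rather than the signs of any historical system; all function names are illustrative.

```python
# Hypothetical power signs for 1, 10, 100, 1000 (not tied to any real script).
POWER_SIGNS = {0: "I", 1: "X", 2: "C", 3: "M"}


def digits_by_power(n):
    """Return (digit, power) pairs for n in base 10, highest power first."""
    text = str(n)
    return [(int(ch), len(text) - 1 - i) for i, ch in enumerate(text)]


def cumulative_additive(n):
    # Repeat each power sign as many times as its digit; zero digits vanish.
    return "".join(POWER_SIGNS[p] * d for d, p in digits_by_power(n))


def cumulative_positional(n):
    # Repeat one unit stroke within each position; "-" marks an empty position.
    return " ".join("|" * d if d else "-" for d, _ in digits_by_power(n))


def ciphered_additive(n):
    # One distinct sign per (digit, power) pair, written here as its value.
    return [str(d * 10**p) for d, p in digits_by_power(n) if d]


def ciphered_positional(n):
    # One digit sign per position, including a zero sign.
    return str(n)


def multiplicative_additive(n):
    # A multiplier sign followed by a power sign; the ones take no power sign.
    parts = []
    for d, p in digits_by_power(n):
        if d:
            parts.append(f"{d}{POWER_SIGNS[p]}" if p else str(d))
    return " ".join(parts)


for encode in (cumulative_additive, cumulative_positional, ciphered_additive,
               ciphered_positional, multiplicative_additive):
    print(f"{encode.__name__:24} {encode(4203)}")
```

Under these assumptions, 4203 comes out as `MMMMCCIII`, `|||| || - |||`, three distinct signs (for 4000, 200, and 3), `4203`, and `4M 2C 3` respectively, so the same quantity receives five structurally different phrases.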
10There are, of course, additional complexities to be considered, such as differences in base structure, or systems that combine two or more principles, or systems that contain irregularities of various sorts, but all known systems use these five principles, and no others. The advantage of a two-dimensional typology is that it gets past the older, much more typical tendency to only look at whether systems use place-value or not, to provide a fuller account of how systems actually represent numbers. This avoids the teleological argument of Ifrah (1987) and Dehaene (1997) that sees decimal, positional numeration as the end-point of a developmental sequence towards perfection. In fact, using this typology it is evident that there is no perfect system. Ciphered-additive systems write numerals more compactly than any of the other types; compared to a ciphered-positional system like our own, any number with zeroes will be written with fewer signs. Cumulative-positional systems use a much smaller inventory of signs than other systems (often as few as two or three), but are far less compact. One might argue that the most appealing quality of the Western numerals, and other ciphered-positional systems, is that they are mediocre in both of these criteria.
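The trade-off between compactness and sign inventory can be tallied in a rough sketch. It assumes idealized base-10 systems with no sub-base and leaves empty positions uncounted in the cumulative-positional case; attested systems such as the Babylonian one, which combines signs for 1 and 10, need a slightly larger inventory than this idealization suggests. The function names are hypothetical.

```python
def sign_counts(n):
    """Signs needed to write n under each type (idealized base 10, no sub-base)."""
    digits = [int(ch) for ch in str(n)]
    nonzero = [d for d in digits if d]
    ones = digits[-1]
    return {
        "cumulative-additive": sum(digits),
        "cumulative-positional": sum(digits),  # empty positions not counted
        "ciphered-additive": len(nonzero),
        "ciphered-positional": len(digits),
        # Two signs per expressed power, except the ones, which take one.
        "multiplicative-additive": 2 * len(nonzero) - (1 if ones else 0),
    }


def inventory_sizes(k):
    """Distinct signs needed to write every number from 1 to 10**k - 1."""
    return {
        "cumulative-additive": k,             # one sign per power
        "cumulative-positional": 1,           # one unit stroke (idealized)
        "ciphered-additive": 9 * k,           # nine signs per power
        "ciphered-positional": 10,            # 1-9 plus zero
        "multiplicative-additive": 9 + (k - 1),  # multipliers plus power signs
    }


print(sign_counts(4003))
print(inventory_sizes(4))
```

For 4003, the ciphered-additive phrase needs two signs against four for the ciphered-positional one, while for numbers below 10,000 the ciphered-additive inventory is 36 signs against 10; no single type wins on both measures.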
11The highly formal, severely constrained set of attested structures of numerical notations raises the important question: what purpose do they serve? Why are there those five basic systems, and no others? Given that numerical notations, unlike language, are not universal, we have a historical problem regarding their evolution, and also a functional problem regarding the relationship of their structure and contexts of use.
12In an important essay, Malcolm Hyman (2006) sets out a framework for understanding writing, which sheds light on the relationship between language, notation, and the written word. Rejecting the term semasiography employed by Sampson (1985) as “vacuous” and defined only by what it is not, he distinguishes glottographic writing, visual marks that encode language, from non-glottographic writing, encompassing all varieties of relatively permanent notation that do not do so. Hyman uses “writing” as an umbrella term encompassing a variety of notations, directing his focus not simply to the typology of notations but to the processes through which we engage with graphic marks. Noting that “we read glottographic writing, but verbalize non-glottographic writing” (Hyman 2006: 244), he argues that the cognitive processes by which we draw meaning from notations like numerical notation are quite different from those used to process linguistic codes. Written numerals, from this perspective, are a structured system of non-glottographic writing with no necessary correspondence with language, one that coexists with glottographic writing (often co-occurring with it in close proximity) and that has its own distinct properties.
13We might be tempted to think of numerals as logograms, each representing a single word with a single sign, e.g. 2 → two. However, this would be erroneous for two reasons. First, because numerical notation is translinguistic – that is, it does not represent any specific language, but rather, can be read in any language with which the reader is familiar – there is no simple mapping between one sign and one word. The numeral 17 is seventeen in English but dix-sept in French and dau ar bymtheg in (traditional) Welsh. Is it one word or two or three? Posing the issue differently, we read the same notation as 10 + 7 in French, 7 + 10 in English, and 2 + 15 in Welsh, but Welsh speakers do not experience significant difficulty even though the notational encoding is deeply different from the linguistic one. Additionally, even within any given language, a numeral sign can be read in a diverse range of ways. In English, 2 is two when standing alone, half in ½, sec- (perhaps) in 2nd, squared in x², and twen- in 20. While not all numerals are so polyvalent, the argument that numerals are nothing more than words in different clothes cannot be sustained. Rather, they are ideograms – what links them is neither phonetics nor the lexicon of any language, but the conceptual framework that unites all instances of 2 by their two-ness (Edgerton 1941).
14It is thus a mistake to consider numerical notations simply as graphic representations of words or of language. Of course, to be useful, numerical notation is normally translated both from language and back into language – its users, in other words, are also speakers. It is better, however, to think of numerical notation as a transcoding engine, facilitating one or more possible readings, dependent on context but also subject to individual choice and variation. So, for instance, twenty seventeen and two thousand seventeen are both available to English speakers upon reading the numeral 2017. In silent reading, it may not always be possible for the reader to reliably identify which reading they have just used. Yet, so fluent are we in managing this variation that we are able to attend carefully to context when choosing appropriate readings. So, for instance, no one reads the phone number 867-5309 as eight hundred and sixty seven, five thousand three hundred and nine, but rather, digit by digit. However, if your phone number were 867-4000, an American English speaker would be very likely to read it as eight six seven, four thousand (not four zero zero zero) – the difference between 5309 and 4000 being rapidly perceived, activating a different transcoding strategy almost without perceptible effort. Numerical notation provides us a template onto which our lexical resources can be mapped in multiple ways. In contrast, it is generally true that each lexical numeral has a single representation in a particular numerical notation system. Upon hearing six thousand four hundred and twenty-nine, almost any English speaker will be able to produce 6429 without variation or error, and no one (probably) will write 6.429 × 10³.
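The one-to-many mapping from a written numeral to English readings can be sketched as follows. This is only an illustration of the idea of multiple transcoding strategies, not a model of how readers choose among them; it handles numerals up to four digits, and the "paired" strategy is deliberately naive (it would not produce forms such as "oh nine"). All names are hypothetical.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def below_hundred(n):
    """English words for 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + ("-" + ONES[ones] if ones else "")


def full_reading(numeral):
    """Read a numeral of up to four digits as a single cardinal number."""
    thousands, rest = divmod(int(numeral), 1000)
    hundreds, rest = divmod(rest, 100)
    parts = []
    if thousands:
        parts.append(ONES[thousands] + " thousand")
    if hundreds:
        parts.append(ONES[hundreds] + " hundred")
    if rest or not parts:
        parts.append(below_hundred(rest))
    return " ".join(parts)


def paired_reading(numeral):
    """Read a four-digit numeral as two two-digit numbers, as with years."""
    return below_hundred(int(numeral[:2])) + " " + below_hundred(int(numeral[2:]))


def digit_reading(numeral):
    """Read a numeral sign by sign, as with phone numbers."""
    return " ".join(ONES[int(ch)] for ch in numeral)


for strategy in (full_reading, paired_reading, digit_reading):
    print(f"{strategy.__name__:15} {strategy('2017')}")
```

The same four signs yield "two thousand seventeen", "twenty seventeen", and "two zero one seven"; the reverse direction, from a spoken numeral to a written one, would return a single phrase.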
15Numerical notations, therefore, serve a principal role as tools for decomposing spoken (or verbal written) numbers into a set of constituent elements and recomposing them in new ways. This process of decomposition and recomposition, or transcoding, occurs almost whenever a numeral is read or written. Of course, transcoding occurs with other notations as well, including writing systems, none of which are perfectly faithful representations of any particular language. The orthographic peculiarities of any writing system, however, are largely at the phonological level. In contrast, numerical notation systems facilitate morphosyntactic transcoding to and from the numeral words (spoken or written) of any particular language – for instance, breaking apart words and phrases into elements, and transforming them into a set of signs with a different order and structure.
16This does not deny that in some cases, no transcoding is needed: there may be special cases, especially when working in higher arithmetic, where language is only tangentially involved in manipulating numbers. But because there has been such a strong bias in the study of numerals towards its mathematical functions, the tendency has been to overemphasize these processes. If we treat Graphic Numeral → Number Concept and Lexical Numeral → Number Concept as essentially independent, as if the two operated without interference or intermediation, we ignore the unmarked, ordinary case whereby graphic numerals are read (as a non-glottographic writing system) into language as they are being conceptualized.
17There is some psycholinguistic evidence that transcoding errors are affected by notation, based on research over the past twenty-five years (Noël 1991). For instance, speakers of languages like Dutch and German that put the units before the tens (dreiundvierzig = 3 + 40 = 43) require slightly more time to read or write such numbers than do speakers of tens-units languages (Brysbaert et al. 1998). Various errors in reading and writing numerals are also observable in persons suffering with particular forms of neurological impairment. In a recent study, Benavides-Varela et al. (2016) show that the transcoding of numerals involving zero, which is represented graphically in numerical notation but not verbally in number systems, is extremely challenging for individuals with certain sorts of brain injury. However, for the most part, the reading and writing of numerals is so fluid and error-free that readers and writers are not fully conscious of the ingenuity of the transcoding process. We may not even be consciously aware whether we are transcoding them into language at all. For instance, when doing complex pen-and-paper arithmetic, many people have the intuition that their work is language-independent, although the work on numerical cognition suggests that this is unlikely.
18The question of how numbers are stored in the mind – how numerical representations are encoded cognitively to facilitate reading, hearing, speaking, and writing them – has been one of the central topics in numerical cognition over the past thirty years (Rips et al. 2008; Carey 2001; Dehaene 1997). However, almost no numerical notation other than ciphered-positional systems (like Western numerals, but also like the various Arabic and South Asian numeral systems, as well as modern East Asian systems) is used widely and fluently by anyone today. This poses a serious empirical challenge to the investigation of numerical cognition. While many schoolchildren learn to read (roughly) Roman numerals in European and American schools, almost no one has sufficient fluency with them to be considered “bi-numerical” in the way that many people are biscriptal (fluent users of multiple writing systems). Five hundred years ago, there would have been thirty or more actively used numerical notations worldwide, and even a hundred years ago, one could find considerable diversity among fluent users of local notations. We can, of course, still find texts that use multiple numerical notations – for instance (fig. 7), the postage stamp below from Italian-occupied Ethiopia in 1936, which uses Roman (cumulative-additive), Western (ciphered-positional), Arabic (ciphered-positional) and Amharic (ciphered-additive) numerals in a single tiny text. But because many notations are used for restricted functions only, or only by a small group of specialists, direct comparison of systems using cognitive experiments is often unfeasible. Thus, while there is an important and growing literature on the cognitive effects of using different writing systems (Ellis et al. 2004; Share 2014), as well as literature on the number concepts of nonliterate peoples (Zebian 2008), the cognitive-semiotic comparison of number systems requires an inferential approach.
Nevertheless, using a general awareness of the various cognitive principles that might underlie the human number sense, it is possible to reconstruct plausible evolutionary paths through which the history of numerical notations makes sense.
Fig. 7. Italo-Ethiopian postage stamp, 1936
It contains four numerical notations of three different structures.
19One of the most widespread properties of many numerical notations is that they use the repetition of similar signs to indicate addition. These are the cumulative systems described above, in both their additive and positional variants. They are among the more common systems used worldwide, and are also among the earliest systems used. Both the proto-cuneiform numerals used in Mesopotamia and the early hieroglyphic numerals used in predynastic Egypt, the two systems dating unequivocally from the 4th millennium BCE, are cumulative-additive. The earliest notations of the Americas – the Mesoamerican bar-and-dot numerals and the Inka khipu – although much later chronologically, are similarly cumulative. Not every early notation is cumulative – the Shang oracle-bone script (jiaguwen 甲骨文) in China and the Brahmi script in India are only minimally so, for the first few numbers represented through parallel lines – but every region of the world and script tradition has some cumulative notations.
20The iterative property of numerical notations focuses our attention on a parallel, but rather distinct, numerical practice, namely tallying. Tallying relies on the principle of one-to-one correspondence to coordinate relationships between some set of counted entities (days, drinks, people) and a set of notches, knots, lines, or other marks. Tallying is well-attested from the Upper Paleolithic, although there is still considerable debate as to its earliest date or frequency in early periods (d’Errico 1995; d’Errico & Villa 1998). It is cross-culturally extremely widespread, including a broad range of non-literate societies (Lagercrantz 1968, 1970, 1973), suggesting that tallying is a core semiotic system, practically panhuman, for engaging with countable aspects of the world. This suggests an obvious source for the base-structured numerical notations that emerge with writing at the end of the Neolithic. But although early and nearly universal, tallies are not simple; they come in numerous varieties across different materials and sign-structures.
21Tallying is distinct from numerical notation in several key respects. First, whereas numerical notation systems are structured through a base – some number whose powers receive distinct notation – tallies do not have a base. Even if we consider complex tallies where every fifth or tenth mark is different from the ones preceding it, each mark still represents the count of a single unit. Contrast the tally IIIIVIIIIXIIIIVII with the Roman numeral XVII; in the former, the X does not represent 10, but is actually an ordinal representation of “the tenth mark”. One result of this is that even moderately long tallies are much longer than numerical notations; they are unsuitable for use in most written contexts, and rarely occur within texts, but rather, when they survive, are found as adjuncts to text (in margins, on scrap paper, etc.). We can think of a tally system like this one as structured, not by a base and its powers, but by a modulus that, like a clock, once it reaches its maximum value, starts anew at 1.
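The contrast between a modulus-structured tally and a base-structured numeral can be sketched in a few lines. The `tally` function below is a hypothetical illustration in which every fifth mark is written V and every tenth X, while `additive_roman` writes Roman numerals without subtractive forms (an assumption made for simplicity) for 1 to 3999.

```python
def tally(count):
    """One mark per counted unit; the 5th, 10th, 15th... marks look different."""
    marks = []
    for i in range(1, count + 1):
        if i % 10 == 0:
            marks.append("X")  # the tenth mark, still worth one unit
        elif i % 5 == 0:
            marks.append("V")  # the fifth mark, still worth one unit
        else:
            marks.append("I")
    return "".join(marks)


def additive_roman(n):
    """Roman numerals without subtractive forms, for 1..3999."""
    signs = [(1000, "M"), (500, "D"), (100, "C"), (50, "L"),
             (10, "X"), (5, "V"), (1, "I")]
    phrase = []
    for value, sign in signs:
        count, n = divmod(n, value)
        phrase.append(sign * count)  # here each sign carries its own value
    return "".join(phrase)


print(tally(17), len(tally(17)))                    # IIIIVIIIIXIIIIVII 17
print(additive_roman(17), len(additive_roman(17)))  # XVII 4
```

In the tally, the number of marks always equals the count, whatever shape the marks take; in the numeral, four signs suffice because X and V stand for ten and five rather than for the tenth and fifth mark.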
22Additionally, while some tallying practices may be oriented towards some audience of readers, just as often, they are a cognitive practice undertaken for the writer, as part of some arithmetical practice. Because they are long, reading them takes considerable effort. Even when, as in the Western tallying tradition, every group of five is specially demarcated (e.g., by having the fifth tally cross the first four), reading is tedious, although straightforward. Taking a tally serves a mnemonic function but not necessarily a socially representational one – there is no need for there to be a reader other than the writer. The tally-marks of particular literate traditions are often vastly different from the numerical notations of those traditions, and should properly be understood as adjoining them, rather than as part of them.
23Perhaps the best evidence for the distinct quality of tallying practices is that they coexist quite readily with numerical notations – for instance, when tallying votes on some key issue, we might still mark each vote with a tally in the appropriate column, and only after the full count has been taken would we write down, in Western numerals, the final result for each candidate. Tallying serves a set of purposes that cannot readily be served by numerical notation, namely working with an ongoing count that allows the possibility of extension. Tallies form part of core economic practices in a wide range of societies, as indicators of amounts paid, as “checksum” records to be shared across participants in a transaction, as markers of credit or loans, and in a variety of other circumstances where numerical notation might be less desirable (Folmer & Henkelman 2016). Even in numerical notations like the Roman numerals that look like tallies, one cannot simply take some numeral-phrase like CCCLXXIII and add more signs freely.
24Related to tallying, and making similar use of iteration, are the host of counting devices that often (somewhat erroneously) go by the name abacus, which share the common property that they allow some set of counters (beads, stones, rods, etc.) to be manipulated across a set of registers or columns that represent different powers of the base. Each column uses repetition of units (or sometimes units and fives) while the ordinal and spatial relations among the columns indicate their power value. Figure 8 depicts the number 2018 on a Chinese suanpan, with the two beads above the median each indicating five and the five beads below each indicating one. Such devices seem to have been developed independently in the Near East, East Asia, India, the Andes, and Mesoamerica. This suggests that the combination of iteration within any power of the base along with place-value is a highly practical arithmetical tool. There is strong experimental evidence showing that the manipulation of devices like the Japanese soroban (“bead-abacus”) has powerful cognitive effects facilitating rapid mental arithmetic even when the device is not available, affecting numerical representation among expert users (Stigler 1984; Miller & Stigler 1991).
Fig. 8. “5+2” suanpan computational device/“abacus”, showing the numeral 2018
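The column structure of such a device can be sketched as data. The following is a minimal illustration, assuming a decimal reading in which each column shows one digit as a pair (upper beads worth five, lower beads worth one); it ignores the fact that a "5+2" column could physically show values above nine, and the function names are hypothetical.

```python
def suanpan_columns(n, width=4):
    """Return (upper_beads, lower_beads) per column, highest power first."""
    columns = []
    for ch in str(n).rjust(width, "0"):
        upper, lower = divmod(int(ch), 5)  # fives above the bar, ones below
        columns.append((upper, lower))
    return columns


def read_columns(columns):
    """Recover the number: iterate within a column, use position across columns."""
    total = 0
    for upper, lower in columns:
        total = total * 10 + (5 * upper + lower)
    return total


board = suanpan_columns(2018)
print(board)                # [(0, 2), (0, 0), (0, 1), (1, 3)]
print(read_columns(board))  # 2018
```

Within a column the representation is cumulative (beads are simply counted), while across columns it is positional, which is the same pairing of principles found in cumulative-positional numerals.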
25The similarity between these notations and the cumulative-positional numerals described above is no coincidence. But we should also exercise caution in attributing too much to this similarity. Cumulative-positional numeral systems are cross-culturally very rare, in fact the rarest of the five types described above. There is good reason for this: while this structure is excellent for manipulating numbers, it is among the least concise ways to write a number. Because numerical notation is, first and foremost, a system for numerical representation, not for computation, it is unsurprising to find that computational devices often have structures very different from those of their users’ numerals. Place-value (positional) arithmetical devices often coincide with purely additive numerical notation systems – this is of course true for the Roman abacus, and also seems to be the case for Mesopotamian abaci (Woods 2017). Although Taisbak (1965: 158) claims that the Greco-Roman abacus preceded and led to the development of the Roman numerals, there is no evidence that this is true. Rather, because Roman numerals were not used directly for arithmetic, the two numerical subsystems served radically different functions, thus explaining their different structures. Working with a computational device thus requires still another level of transcoding – from the representation on the board to a representation in numerical notation, both of which differ from the corresponding representation in some language.
26Tallying, cumulative numerical notations, and abacus-like arithmetical devices all share in common a cognitive underpinning in the capacity for subitizing, or immediately and rapidly perceiving small quantities. Subitizing is not only a pan-human capacity but is also found in numerous animal species (Whalen et al. 1999; Piazza & Izard 2009; Boysen & Capaldi 2014). It allows one, upon seeing up to four objects in a line or cluster, to accurately judge its numerosity without needing to count the objects. This limit helps explain why, even though languages like Latin do not have a base of five, Roman numerals do not simply keep repeating I up to nine, but rather end at four (IIII), or if using subtractive notation, as is common today, three (III). Other cumulative notations, like Phoenician-Aramaic numerals, solve this problem by grouping lines for 1 by threes, so that nine would be III III III. Subitizing is also useful in non-cumulative numerical notations, for instance when grouping long sets of digits in numbers like 1,000,000; the use of commas to separate chunks of three falls within the subitizing limit. We should not make too much of the fact that numerical notations take advantage of our cognitive capacity for subitizing, but nonetheless this capacity scaffolds a variety of iterative numerical activities, notational and otherwise.
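Both kinds of grouping mentioned here, strokes chunked by threes and digits separated by commas, keep every visual cluster within the subitizing range. A small sketch, with hypothetical function names and without modeling the direction or exact sign shapes of any historical script:

```python
def group_strokes(n, group_size=3, stroke="I"):
    """Write n unit strokes in clusters of at most group_size."""
    full, rest = divmod(n, group_size)
    groups = [stroke * group_size] * full
    if rest:
        groups.append(stroke * rest)
    return " ".join(groups)


def group_digits(n):
    """Separate the digits of n into chunks of three using commas."""
    return f"{n:,}"


print(group_strokes(9))       # III III III
print(group_digits(1000000))  # 1,000,000
```

In both outputs no cluster exceeds three signs, so each can be apprehended at a glance and only the clusters themselves need to be counted.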
27But what about number words? In fact, when we look at the lexical numeral systems of the world, we are struck by the profound absence of iteration. Iteration is a property of language in general, for instance in the form of reduplication, which serves a variety of functions in different languages, or as a very basic form of recursive structure. In English, where reduplication is rare, one of its main functions is to identify prototypical members of some class in contrastive focus with other members – i.e., a “salad salad” is a prototypical or normal green salad as opposed to a fruit salad, potato salad, etc. (Ghomeshi et al. 2004). But in lexical number systems, reduplication is rare in all languages, in contrast to its ubiquity in cumulative numerical notations, as well as in tallies and counting devices, it is nearly absent in languages. The sole exception is that a surprising number of restricted-numeral languages – those with small numerical limits, less than 10, and no numerical base, such as is found in some of the languages of Australia and North America. But curiously, whereas iterative notations use marks that represent one, restricted-numeral languages use reduplicated expressions not of the number one, but of the number two, such that six would be two two two. Epps et al. (2012) discuss the fascinating complexities of restricted-numeral systems in detail, noting, in particular, that regional explanations, rather than ones dependent on social scale, best explain their worldwide distribution. These iterative, additive linguistic structures disappear as new words for higher numbers enter into common use in linguistic communities, and do not survive (even vestigially) in languages that have a numerical base.
28Noam Chomsky, very famously, has drawn an analogy between the recursive property of natural language (fundamental to Universal Grammar) and the natural numbers, which he calls “discrete infinity” (Chomsky 2000). Discrete infinity means that a limited set of entities can be used to produce a potentially infinite set of utterances (the set of natural numbers and the set of grammatical utterances in a language). That two different domains of experience come to have the same mental structure underpins the orientation that sees recursion as central to cognition. It is true that an infinite set of numbers can be constructed through a discrete set of numeral-signs, which could be as simple as the word “one” or a single tally (N=1) and the successor function N+1. But is this how we think and talk about numbers? The worldwide absence of iterative “one one one” structures in the world’s languages gives reason to doubt the analogy between the successor function and linguistic recursion (Watanabe 2017). The claim for discrete infinity has been challenged both for spoken number systems (Greenberg 1978) and for language as a whole (Pullum & Scholz 2010). We can think of three as “one, plus another one, plus another one” and we may even notate numbers in tallies as if that were so, but we do not talk as if it were so, not because iteration is fundamental to recursion, but because it would be exceedingly uneconomical to use the auditory channel in such a way.
29No known language uses anything like hundred hundred hundred for 300, although there would be no ambiguity in doing so, and even though notations (like Roman numerals) do so frequently. When numerals are repeated, these phrases are multiplicative, not additive, in structure, so, for instance, tunka-tunka-tunka “ten-ten-ten” is 1000 in Tacana, not 30 (Salzmann 1950: 82). Once again, the unattested-but-imaginable formulation suggests a cognitive constraint, only this time, one applicable to number words.
30Lexical numeration, cross-linguistically, is built on a multiplicative foundation, not an additive one. When drawing on lexical resources to express high numbers, speakers across almost all language families use multiplication as the step-ladder upon which higher numerical values can be compactly and unambiguously expressed. In Zdeněk Salzmann’s formal analysis of numeral word systems (1950), what he calls the cyclic pattern describes the recurrence of particular morphemes multiplicatively throughout a system – in French, for instance, -(a)nte for 10, vingt (to some degree), cent, mille, and million (and so on) are used this way. Without the terms in the cyclic pattern, it would be essentially impossible to compose large numbers in languages. Note that the cyclic pattern need not correspond with the mathematical concept of a base – vingt is not a base because its powers are not specially designated.
31Many of the differences between lexical number and numerical notation derive from issues of modality, with the oral-auditory channel used for speech differing from the graphic-visual channel used in notation. Notation favours brevity, clarity, and legibility by the human visual and cognitive system. Because notation helps readers and writers to minimize ambiguity, features that might be plausible in language are often absent. So, for instance, quatorze and quarante are both etymologically “4 10” (deriving from Latin quattuordecim and quadraginta, respectively), and neither has any indicator of the operation to be taken. This apparent ambiguity is easily resolved, however, by having a more flexible set of stem morphemes for any number word (in this case, -ze and -ante for 10). Because numerical notation lacks this flexibility, it requires a different strategy to distinguish 14 and 40 – in this case, using strict descending ordering by powers to distinguish 10 + 4 from 10 × 4. The rigid descending structure of numerical notation, which contrasts with the more fluid power ordering in many languages’ number systems, is thus explainable as a product of cognitive constraints relating to modality.
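How ordering alone can separate 10 + 4 from 10 × 4 may be sketched as follows. This is an illustrative reading rule, not a description of any particular historical system: the name `read_signs`, the set `POWERS`, and the convention that a unit sign placed immediately before a power sign multiplies it are all assumptions of the sketch.

```python
POWERS = {10, 100, 1000}  # assumed inventory of power signs


def read_signs(values):
    """Read a sequence of sign values under a strict descending-order rule.

    A unit (1-9) directly followed by a power multiplies that power;
    a bare power or a bare trailing unit is simply added. Powers must
    strictly descend from left to right, otherwise the sequence is rejected.
    """
    total = 0
    last_power = float("inf")
    i = 0
    while i < len(values):
        v = values[i]
        nxt = values[i + 1] if i + 1 < len(values) else None
        if v not in POWERS and nxt in POWERS:
            term, power, step = v * nxt, nxt, 2   # e.g. 4 10 -> 40
        elif v in POWERS:
            term, power, step = v, v, 1           # e.g. bare 10
        else:
            term, power, step = v, 1, 1           # e.g. trailing 4
        if power >= last_power:
            raise ValueError("powers must strictly descend")
        total += term
        last_power = power
        i += step
    return total


print(read_signs([10, 4]))  # 14
print(read_signs([4, 10]))  # 40
```

The same two sign values yield 14 or 40 depending only on their order, which is the work that flexible stem morphemes such as -ze and -ante do in the spoken forms.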
32There are, however, strong correspondences between numerical notations and lexical numeration, particularly among multiplicative-additive systems, which combine signs for units with signs for powers. So, for instance, in languages like Mandarin where each numeral word is a single morpheme and each morpheme has a single character, there is a close (though not perfect) correspondence between the spoken numeral words and their graphic notation. Such systems still eliminate some of the phonological, morphological, and syntactic complexities of numeral words, but retain much of the essential cyclical pattern. A French multiplicative-additive system might require a reader to transcode deux cent quatre-vingt-douze as “2 100 9 10 2” rather than “2 100 4 20 12”, for instance. We see this notation partially employed in European medieval and early modern texts in which the higher parts of the cyclical pattern are written out in words, with the rest in Roman numerals, as in the portion of the document in Figure 9, which lists an amount of “2. mille 4. cent florins” in the body of the text and “ij. mille iv. cent florins” as the total (Motyf van rechten… 1674: 61). These multiplicative hybrids of Roman or Western numeral-phrases for 1-9, with numeral words for 100 and 1000 are common in Western European texts, and although cumbersome at first to the uninitiated, are probably no more so than “127 million” would be to a contemporary reader. Multiplicative-additive numerical notations are still transcodings relative to number words, but they require less cognitive effort than other additive systems, or indeed any positional system, where the cyclical pattern is implicitly encoded through linear organization rather than with graphic signs.
Fig. 9. Amount of 2400 florins expressed twice, using hybrid multiplicative notation: first Western numerals with French numeral words, then Roman numerals with French numeral words
Source : Motyf van rechten… 1674: 61
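The multiplicative-additive transcoding discussed here can be sketched in a few lines. The helper names `cyclic_pattern` and `render` are hypothetical, and the sketch assumes a purely decimal set of power signs (10, 100, 1000); it always writes the unit before a power, even where a real system might omit a unit of 1.

```python
def cyclic_pattern(n, powers=(1000, 100, 10)):
    """Split n into (unit, power) pairs, highest power first.

    Powers with a zero unit are skipped, since multiplicative-additive
    notations have no need for a zero sign.
    """
    if not 0 < n < 10 * powers[0]:
        raise ValueError("n is outside the range covered by these powers")
    parts = []
    for p in powers:
        unit, n = divmod(n, p)
        if unit:
            parts.append((unit, p))
    if n:
        parts.append((n, 1))  # remaining bare units
    return parts


def render(parts, names=None):
    """Write each unit followed by its power sign (or a word for that power)."""
    names = names or {}
    out = []
    for unit, power in parts:
        out.append(str(unit))
        if power != 1:
            out.append(names.get(power, str(power)))
    return " ".join(out)


print(render(cyclic_pattern(292)))                               # 2 100 9 10 2
print(render(cyclic_pattern(2400), {1000: "mille", 100: "cent"}))  # 2 mille 4 cent
```

Passing words for the higher powers, as in the second call, approximates the hybrid of numerals and numeral words seen in the 1674 document, minus its punctuation.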
33Transcoding is thus a recompositional cognitive process, building on multiple pre-existing cognitive capacities and semiotic resources available to the user of a notation. Iteration alone is insufficient to provide the framework for an expansive, fully-fleshed-out numerical notation, but likewise, simply abbreviating number words is inadequate. We can instead think of this as a cognitive tool, partly internal, partly external – i.e., a materially-anchored, distributive cognitive system for facilitating numerical representations (Hutchins 2005). Figure 10 shows the representation of 364 in a variety of systems, some lexical (English, French), some notational (Hittite, Western). One can take a numeral phrase (three hundred sixty-four), break it apart into its cyclical pattern, and recode it as 3 6 4, which can then be read as trois cent soixante-quatre. Many other transcodings are of course possible. The advantages of such a system extend beyond cognitive convenience for any speaker, hearer, reader, or writer. Because numerical notations are trans-linguistic, they can be read in any language. But this is not merely a process of providing a translinguistic code to facilitate translation – it is about providing a semiotic resource that enables reading, rereading, emphasis, manipulation, and re-expression.
Fig. 10. Recomposition of numeral words as symbols
34There are parallels to be drawn between the process of recomposition described here and the cognitive processes outlined by David Wengrow (2013) in his discussion of visual representations of human-animal and animal-animal hybrids, or “monsters”, in Neolithic and early state societies in Eurasia. Wengrow argues that it is not merely a historical particularity that in the Eurasian Bronze Age a new tradition of recompositional artistic representation emerged, in which not only were there far more hybrids than previously had been the case, but that hybrids became characterized by a “modular logic of depiction” in which graphic elements (heads, limbs, etc.) were isolated from their context and then recombined into new forms (Wengrow 2013: 56). Wengrow links this artistic process to a cognitive shift drawing on the modular logic of state bureaucracies, which de-individualize and decontextualize information in order to more readily work with it at scale. This process is observable across domains, e.g., in the standardization of mud-brick architecture in early Mesopotamian states. It is by no means simple – it involves craft specialists with sufficient time and energy to develop and refine these techniques, made possible by surpluses extracted from non-elites. In short, decomposition/recomposition is the artistic and semiotic signature of the early state.
35One would not want to draw too close a connection between the state and recompositional thinking. While artistic representations of hybrids may be rare in the Near East before the state, they exist in the stories, myths, rituals, and indeed, the visual material culture of many non-state societies and nonliterate peoples. Obviously, in the domain of number, any multilingual individual must, to some degree, be able to translate across languages with different structures. Wengrow’s claim is that the logic of the state creates both the time and the cognitive impetus to permit modular recompositions of images that, while possible in non-state societies, were rarely actualized in practice. In the cognitive literature on literacy and numeracy, this idea parallels David Olson’s (1996) argument that literacy promotes greater metalinguistic awareness of speech sounds, a process of decomposition of words into syllables and then into phonemes. The claim is not that there was no phonemic awareness in non-literate societies, but that literacy is a technology that more readily affords thinking in such a way.
36The recomposition of numerals into numerical notation systems is a related process, one that requires a combination of notational and structural principles drawn from the resources and modalities available. On the one hand, tallying, which requires iterative markings using one-to-one correspondence, is nearly universal, is accessible to almost anyone, and has been widespread since the Paleolithic. But as discussed above, tallying is hard to integrate with text, and very slow to read and write directly. Certainly it is hard to imagine numbers greater than around 10 being represented with tallying regularly. On the other hand, numeral words are language-bound, full of morphosyntactic complexities, and subject to diachronic change. We cannot meaningfully communicate without them, but we need some framework to notate them.
37The result, numerical notation, is a hybrid, helping us decompose number words into their constituents (units and powers) and then recompose them using a limited inventory of consistent signs. Some of them are likely to rely heavily on iteration – the cumulative systems, which build on principles similar to those of tallying systems and yet are so different from them. Others recompose number words into signs that build on the cyclic pattern of their speakers’ languages, highlighting multiplicative aspects of the number words explicitly through signs (multiplicative-additive systems) or implicitly through place value (positional systems). But no numerical notation is either a pure re-encoding of language, or a tallying or iterative system extended to its logical conclusion. All of them require input from multiple systems across multiple modalities in order to make sense as representational systems.
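The decomposition into units and powers, and its recomposition along the three structural lines named here, can be sketched for the number 364 used earlier. All function names are hypothetical, the sketch covers only 1 to 999, and the cumulative signs are an assumption: they borrow the letters C, X, and I with one sign per power of ten, and so are not Roman numerals (there is no V or L and no subtraction).

```python
def decompose(n):
    """Split n (1-999) into (unit, power) pairs, skipping empty powers."""
    if not 0 < n < 1000:
        raise ValueError("sketch covers 1-999 only")
    digits = str(n)
    return [(int(d), 10 ** (len(digits) - i - 1))
            for i, d in enumerate(digits) if d != "0"]


CUMULATIVE_SIGNS = {100: "C", 10: "X", 1: "I"}  # assumed sign inventory


def cumulative(n):
    """Repeat each power's sign as many times as its unit (iteration)."""
    return "".join(CUMULATIVE_SIGNS[p] * u for u, p in decompose(n))


def multiplicative_additive(n):
    """Pair each unit sign with an explicit power sign."""
    return " ".join(str(u) if p == 1 else f"{u} {p}" for u, p in decompose(n))


def positional(n):
    """Leave powers implicit in position, so empty places need a zero."""
    units = {p: u for u, p in decompose(n)}
    out = []
    p = max(units)
    while p >= 1:
        out.append(str(units.get(p, 0)))
        p //= 10
    return "".join(out)


print(cumulative(364))               # CCCXXXXXXIIII
print(multiplicative_additive(364))  # 3 100 6 10 4
print(positional(364))               # 364
```

All three outputs start from the same list of unit and power pairs; they differ only in whether the power is expressed by repetition, by an explicit sign, or by position.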
38Because numerical notations emerge, in different parts of the world, at or near the development of the earliest state societies in each region, their origins have often been linked narrowly to bureaucratic or bookkeeping practices, on the assumption that numeracy exists to support state administrative tasks (Postgate, Wang & Wilkinson 1995; Steensberg 1989). In fact, the case of pre-colonial West African states suggests that written numeracy is overrated as a necessary corollary of large-scale bureaucracy (Trigger 2003: 595). While bookkeeping is surely a function that numerical notation can serve, there is little evidence that it actually is the central purpose of the earliest numerical notations in Mesoamerica, China, Egypt, and India (Chrisomalis 2009). So, we need a better answer than functional necessity to explain why numerical notations co-evolve with the state.
39In studies of writing systems, there is an over-reliance on the “token → tablet” model popular among Assyriologists, drawn from the single case of the Uruk period, which contends that numerical notation precedes and causes the development of writing in conjunction with bookkeeping / accounting activities (Schmandt-Besserat 1992; Nissen, Damerow & Englund 1993). There are no grounds to imagine that, just because numeracy was causally linked to script origins in one place, all inventions of writing follow this pattern. Even in Mesopotamia, Jean-Jacques Glassner (2000) points out, numerical notations are only one of a suite of representational systems including seals, visual art, and iconography that become integrated in what would eventually become cuneiform writing. Similarly, Andréas Stauder (2010) argues that the “tomb tags” found in the late fourth millennium BCE proto-royal tomb at Abydos need not be interpreted as evidence that writing served state bureaucracies, but rather that they had ceremonial, aesthetic, and semiotic functions related to royal burials. It is not that they are not writing (and numeration) – and it is not that they are not linked to the state. It is that the particular framework in which we, rather ethnocentrically, view writing and numeration as serving a narrow set of functions within the state is not well-supported. These arguments do not deny that the state is relevant – at the very least, one needs craft specialists with the time and energy to devote to reflection upon, and elaboration upon, pre-existing semiotic resources. They do allow for the development of numerical notations for purposes other than conducting calculations, a task for which there were other subsystems in place in those societies (like counting devices) and for which there is no direct evidence that numerals were used.
40Under this view, to ask, from a functionalist perspective, for what purposes numerical notations developed, misses the point. Instead we should ask: in what contexts, by what processes, and using what preexisting resources were they developed? Recognizing that this upends the traditional narrative told about numerical notations, writing, and the state, it is nonetheless consistent with a framework that sees written numeracy as a likely corollary of state formation, but is agnostic as to its specific role within ancient states. In place of a perspective that sees arithmetical or accounting functions leading directly to numerical notation, it seeks the origin in a broader, cognitive revolution that appropriated recompositional thinking and scaffolded it onto pre-existing numerical systems, thereby producing a system for transcoding numbers across modalities.
41The proposition that numerical notations permit transcoding from lexical to non-lexical systems, and that they hybridize the properties of other visual and manipulable notations like tallies and abaci with those of language, is the first step in a broader demonstration and evaluation of the cognitive processes underlying them. Here, we have focused principally on a comparison of the structures of those systems, leaving untested the specific cognitive mechanisms that might underpin them. But, once having opened the door on the prospect of an analysis that treats representation, not computation, as central, the prospects for a semiotic science of numeration are excellent.