Language documentation and documentary linguistics

I firmly believe that linguistics is, or at least should aim to be, about a deep and holistic understanding of human Language, understood as the human faculty of communication through (human) language(s).[1] One way of achieving (part of) this objective is to describe and explain the linguistic structure(s) of the languages of the world, since these are expected to mirror the features of the complex faculty of Language, at least to a significant extent. The first step towards this task (description and explanation) is certainly the appropriate collection of linguistic data, on which a linguistic analysis can be performed.

It is onto this (ideal) conceptualisation of the linguistic enterprise that what has been called “documentary linguistics”, the product of which is “language documentation” (Himmelmann 2002:2), should somehow be superimposed. According to Himmelmann (2006:1), “a language documentation is a lasting, multipurpose record of a language.” This is indeed a noble aim, but it does not run parallel to the description-and-explanation enquiry. What I mean is that the goal of language documentation stretches across several stages of the linguist’s work of describing (and explaining) language(s) and should not be considered opposed to that work.

Himmelmann (2002) draws a distinction in which data collection is to be understood as “documentary linguistics”, while, broadly speaking, linguistic analysis is “descriptive linguistics.” But in Himmelmann (2006), the documentation enterprise is clearly broader than a “mere” collection of linguistic data. Indeed, data collection is just what it is: data collection. And its product is (or should be) “a lasting, multipurpose record” (Himmelmann 2006:1) of primary linguistic data.

Himmelmann further states that “resources allocated to documentation should not be ‘wasted’ on writing a grammar but are better spent on enlarging the corpus of primary data” (2006:22). This is so because, in his view of language documentation, description sits in a very different part of the landscape of the linguistic discipline. But, as the author himself acknowledges (2006:23), descriptive linguistics is indeed essential for language documentation. That is because documentation is not complete without an “apparatus” and a set of “annotations” (2006:11–14). These are not possible without first undertaking a thorough linguistic analysis of the primary data, and for this purpose, linguistic analysis equals data analysis (more on linguistic analysis and its components in a post to follow).

This surface contradiction has already been pointed out by other scholars, among them Evans (2008). He indirectly refers to Himmelmann’s position as an instance of “documentarist fundamentalism,” and he argues against the idea that “analysis stops well before the completion of a descriptive grammar” (Evans 2008:347). Backed up by Rhodes et al. (2006:3–4), he goes on to state that “[t]he question is, of course, how far the analysis needs to go” (Evans 2008:346). And this question relates closely to linguistic analysis and hence to language description.

Documentary linguistics is thus an activity that spans several tasks and sub-goals of language description: data collection, data analysis and, finally, data annotation complemented with a language description apparatus. Whereas the final aim of language description is, obviously, the description of a language (which goes through data collection and data analysis), language documentation goes a step further, closing the circle and providing that “lasting, multipurpose record of a language” (Himmelmann 2006:1, emphasis added), which is indeed composed of primary data and the application of the product of language description, i.e. annotation. Clearly, annotation is useless without a key for its interpretation, and providing that key is the job of the language description resources (approximately, a grammar and a dictionary). Such resources must be complete and extensive if the interpretation process is to be sound and solid. If the record of a language cannot be safely interpreted, its use and usefulness become less clear. If interpretation of a record is to be considered secondary or just a support, a “luxury”, I do not see why one should bother with a complete record of data and annotation instead of a simple database of primary data, i.e. a record of linguistic behaviours and of speakers’ awareness of and opinions on those behaviours (more or less corresponding to “metalinguistic knowledge” as defined by Himmelmann 2006:8).

To conclude, language documentation and documentary linguistics should not be considered and conceptualised as something parallel to descriptive (and explanatory) linguistics. On the contrary, documentary linguistics and its activity/product—language documentation—should be considered as an enterprise which is superimposed on descriptive linguistics.


Evans, Nicholas. 2008. Review of “Gippert, Jost, Nikolaus Himmelmann, and Ulrike Mosel. 2006. Essentials of language documentation. Berlin; New York: Walter de Gruyter. x + 424 pp.” Language Documentation and Conservation 2(2). 340–350.

Himmelmann, Nikolaus P. 2002. Documentary and descriptive linguistics (full version). In Osamu Sakiyama & Fubito Endo (eds.), Lectures on endangered languages: 5, 37–83. Kyoto: Endangered Languages of the Pacific Rim.

Himmelmann, Nikolaus P. 2006. Language documentation: What is it and what is it good for? In Jost Gippert, Nikolaus P. Himmelmann & Ulrike Mosel (eds.), Essentials of language documentation, 1–30. Berlin & New York: Mouton de Gruyter.

Rhodes, Richard A., Leonore A. Grenoble, Anna Berge & Paula Radetzky. 2006. Adequacy of documentation: A preliminary report to the CELP [Committee on Endangered Languages and their Preservation, Linguistic Society of America].

Notes

1. In order to express clearly a distinction lost in the English lexicon, I consistently use “Language” for the human faculty and “language” for a specific linguistic system.