The world's linguistic diversity is in danger. CNRS linguists are racing against the clock to record endangered languages before they completely disappear.


© A. Lahaussois/CNRS

Children from the Rai ethnic group (Eastern Nepal) no longer speak Thulung Rai, their group's language.

Languages around the world are disappearing at an astonishing rate. According to UNESCO's recently published Atlas of the World's Languages in Danger, an online database tracking language status across the globe,1 half of the 6700 languages currently spoken around the world will probably disappear over the next century. These languages are often unwritten and transmitted orally from one generation to the next, meaning that once the last speaker dies, so do all traces of the language. Although linguists have no power to reverse this trend, they are doing their best to record these languages before it is too late.
A number of linguists at CNRS, many of whom contributed data used in the UNESCO Atlas, carry out what is known as descriptive work, collecting and analyzing data on endangered languages before they become extinct. They go on location to interact with a community, often over a period of several years, to learn and record its language. The linguist's work will usually result in a dictionary, a grammatical description and a collection of “stories.” These include traditional stories, conversations, recipes and personal narratives, but also materials relevant to experts, such as the names and uses of plants, recordings of religious ceremonies, family lineages and incest “laws,” and origin myths that may hold clues to geological events. This kind of information can be very useful to many scientists, including botanists, ethnographers, geologists, sociologists, or even historians.
CNRS laboratories working in the field either focus all their research on a specific geographical area or study a number of geographical areas at once. LLACAN,2 for example, is a geographically oriented lab which specializes in languages of Sub-Saharan Africa. It is one of three such labs at CNRS, together with CELIA,3 which studies native languages of the Americas (particularly those spoken in Latin America), and CRLAO,4 a lab which focuses on the linguistics of East Asia and carries out documentation projects on minority languages of Southwestern China.
Other laboratories that have a strong commitment to endangered languages work across the globe, notably DDL,5 which runs a special program on endangered languages called AALLED,6 and LACITO,7 where the bulk of research is on purely oral languages and cultures from around the world.
In all cases, the collected field data is the basis for analytic work. It can be used to group languages into families (language classification), to recreate what earlier stages of the languages looked and sounded like (language reconstruction), to learn more about how neighboring languages influence one another's sounds and structures (language contact), to study these sounds and structures (phonology, morphology, syntax), and to determine whether they are unique to a single language or common to many (language typology).
One crucial factor in language preservation is data storage: The collected data needs to be safeguarded for future generations, and it needs to be accessible to community members and to academics. CNRS has created the CRDO8 for this purpose, an online archive of linguistic materials that will be constantly updated as technologies evolve.
LACITO has its own archive (hosted at CRDO) dedicated to rare languages, where the materials can be viewed and listened to directly from an online interface,9 with sound recordings synchronized with transcriptions and translations. This is a far cry from the early days of linguistic fieldwork, when researchers had no choice but to keep their (non-digital) recordings to themselves.

