BantUGent documentation
BantUGent maintains an extensive digital, searchable inventory of scientific documentation on Bantu language communities. Because we began accumulating this documentation during the successive cross-disciplinary KongoKing (2012-2016) and BantuFirst (2018-2023) projects, it pertains first and foremost to the languages, archaeology, botany, ethnography & anthropology, and history of West-Coastal Bantu-speaking peoples, and more particularly the Kikongo Language Cluster. Nonetheless, because BantUGent research covers the entire Bantu language distribution area and beyond, the inventory increasingly incorporates documentation from other parts of Africa, including many academic sources that tend to be difficult to access. View the contents of our digital BantUGent documentation inventory.
Extract from the Vocabularium Latinum, Hispanicum, e Congense. Ad Usum Missionariorû transmittendorû ad Regni Congo Missiones from 1652 (Rome: National Central Library, Fundo Minori 1896, MS Varia 274), the oldest surviving dictionary of a Bantu language.
[This unique resource was transcribed using professional dictionary writing software and is, as is the case for all other BantUGent documentation, now available in various digital, searchable formats, and downloadable here.]

Map from 1619 by Jodocus Hondius, Sr. representing Congo and Ethiopia.
[In addition to texts, the BantUGent documentation also includes images of various kinds, such as historical maps, but also pictures from recent research trips. In our Documentation Archive we also keep all the raw material collected via fieldwork.]
Corpora
(Historical) Bantu corpus linguistics is one of BantUGent’s fields of expertise. Following Tognini-Bonelli (2001), corpus linguistics is the analysis and description of language use as documented in texts, including transcribed oral texts. To analyze and describe ‘real’ language, one needs “large quantities of actual occurrences of that language” assembled in an electronic corpus. Over the past two decades, BantUGent has developed such corpora for several Bantu languages in order to allow for their corpus-based (and in some cases even corpus-driven) study, both in terms of grammar and the lexicon.
BantUGent warmly welcomes scholars who want to do research using one of our existing corpora, or to build and examine a new corpus of their own (Bantu) language as part of a doctoral or postdoctoral research project.
In addition to corpora for Afrikaans, English, Hausa and Somali, which may be used for comparative purposes, text corpora are available for the following Bantu languages: Cilubà, isiNdebele, isiXhosa, isiZulu, Kikongo (a set of several dozen synchronic and diachronic corpora for the Kikongo Language Cluster), Kirundi, Kiswahili, Lingála, Luganda, Lusoga, Northern Sotho, Sesotho, Setswana, siSwati, Tshivenda, Xitsonga.
Other documentation
- Online Swahili – English Dictionary by Gilles-Maurice de Schryver (BantUGent), Sarah Hillewaert, Pitta & David Joffe
- Online Cilubà – French Dictionary by Ngo Semzara Kabuta
- Online Lingála – Dutch Dictionary by Michael Meeuwis
- Kwilu Bantu by Joseph Koni Muluwa (ISP Kikwit – BantUGent) & Koen Bostoen (BantUGent) on endangered languages in the Kwilu Province (DRC)
- Supporting data for Tense and Aspect in Bantu by Derek Nurse
- Tense and Aspect in Niger-Congo by Derek Nurse, Sarah Rose, and John Hewson
- Bibliography on Tense and Aspect in Bantu & Niger-Congo by Derek Nurse
- Bajuni documentation by Derek Nurse: general (people, society, geography, history, language), grammatical sketch, lexicon