
Baquaqua : The Afro-Diasporic Text Corpus Project: Home

Transcribed and Annotated Texts by African-American Authors


The goals of the Baquaqua : Afro-Diasporic Text Corpus project are:

  • To create a complete, dynamic, diachronic corpus of Afro-Diasporic writings in all genres.

This project will provide access to as many writings by Diasporic authors as we can find, in all genres and from all periods, within the limits of copyright and other licensing. We will continually add to the database as the public domain threshold advances or as licensing agreements permit.

  • To annotate texts with Part-of-Speech and other descriptive tags to facilitate text analysis and corpus linguistics.

This project will provide access to texts in both annotated and unannotated formats to meet the diverse needs of researchers in many scholarly environments.
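As a rough illustration of how annotated and unannotated formats might coexist, the sketch below stores one line of text as (token, tag) pairs and renders both views from it. The `word_TAG` layout, the tag labels, the sample sentence, and the function names are all assumptions made for illustration; they do not describe the project's actual annotation scheme.

```python
from typing import List, Tuple

# One line of text as (token, tag) pairs. The tags are hand-assigned for
# illustration and follow no particular project standard (assumption).
TaggedLine = List[Tuple[str, str]]


def render_annotated(line: TaggedLine, sep: str = "_") -> str:
    """Join each token to its tag, e.g. 'river_NOUN'."""
    return " ".join(f"{token}{sep}{tag}" for token, tag in line)


def render_plain(line: TaggedLine) -> str:
    """Drop the tags and return the human-readable text."""
    return " ".join(token for token, _ in line)


def parse_annotated(text: str, sep: str = "_") -> TaggedLine:
    """Recover (token, tag) pairs from the annotated form."""
    pairs = []
    for item in text.split():
        # Split on the last separator so tokens may contain the separator.
        token, _, tag = item.rpartition(sep)
        pairs.append((token, tag))
    return pairs


# Hypothetical sample sentence, not taken from the corpus.
sample: TaggedLine = [
    ("The", "DET"),
    ("river", "NOUN"),
    ("runs", "VERB"),
    ("deep", "ADV"),
]
annotated = render_annotated(sample)
plain = render_plain(sample)
```

Keeping the tagged pairs as the single stored form means the plain text never drifts out of step with its annotation, since both are generated from the same data.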

  • To create a searchable database with interactive analytic tools and export options.

The tags and field values described above will be discoverable through a database whose search parameters allow retrieval by annotation type, genre, author, and numerous other metadata values. Export options will allow researchers to generate sub-corpora specific to their needs, or to produce reports of analyses run with the built-in tools.
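A minimal sketch of this kind of metadata search and sub-corpus export might look like the following. The record fields, function names, and CSV layout are assumptions for illustration only, and the two placeholder entries are invented; the sketch does not reflect the project's actual database design.

```python
import csv
import io
from dataclasses import asdict, dataclass
from typing import Iterable, List, Optional


@dataclass
class TextRecord:
    """Hypothetical metadata record for one text in the corpus."""
    author: str
    title: str
    year: int
    genre: str


def search(
    records: Iterable[TextRecord],
    author: Optional[str] = None,
    genre: Optional[str] = None,
    year_from: Optional[int] = None,
    year_to: Optional[int] = None,
) -> List[TextRecord]:
    """Return the records that match every filter that was supplied."""
    results = []
    for record in records:
        if author is not None and record.author != author:
            continue
        if genre is not None and record.genre != genre:
            continue
        if year_from is not None and record.year < year_from:
            continue
        if year_to is not None and record.year > year_to:
            continue
        results.append(record)
    return results


def export_csv(records: Iterable[TextRecord]) -> str:
    """Serialize a sub-corpus listing as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["author", "title", "year", "genre"],
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


catalog = [
    TextRecord("Lucy Terry Prince", "Bars Fight", 1746, "poetry"),
    # The two entries below are placeholders, not real corpus items.
    TextRecord("Placeholder Author A", "Placeholder Narrative", 1789, "narrative"),
    TextRecord("Placeholder Author B", "Placeholder Poem", 1801, "poetry"),
]

# Sub-corpus: poetry published no later than 1800.
sub_corpus = search(catalog, genre="poetry", year_to=1800)
csv_text = export_csv(sub_corpus)
```

Filters left as `None` are ignored, so the same function serves a search on one field or on several at once.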

  • To serve as a venue for future computational linguistics and digital humanities research.

The corpus and its infrastructure can be used to study Computational Linguistics and broader aspects of Digital Humanities, either comparatively or in the abstract. We can use this and related research to make informed decisions about adding new features, so that the resource stays responsive to developments in these fields.

  • To serve as a bibliographic resource for African-American authors.

As a compendium of African Diasporic writers, this project can also serve as a bibliographic resource that provides users with authors and writings by genre, time period, location and more, as well as a library that provides human-readable texts.

At present we are developing several test corpora to evaluate the technical challenges involved in pursuing these goals.

Last updated 30 May 2024


Scope : Writings by all members and descendants of the African Diaspora.

Time : 1746-1927. By general consensus, the first African Diaspora author is Lucy Terry Prince, whose "Bars Fight" from 1746 represents the inaugural testimony of African American literature. Writings of all genres dating from 1746 to 1927 are included. The terminal date of 1927 is observed because, as of this writing in 2023, it is the latest publication year that has entered the public domain in the United States. Under current legislation this date advances by one year every year, so more authors will be included as the public domain expands.
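The arithmetic behind the moving terminal date can be sketched as follows. The sketch assumes a 95-year US copyright term for published works of this era, expiring at the end of the calendar year, which agrees with the 1927 cutoff in 2023 stated above. It ignores earlier rule changes and licensed material, and the function names are illustrative.

```python
# Assumed US copyright term, in years, for published works of this era.
PUBLICATION_TERM_YEARS = 95

# First year covered by the corpus, per the scope statement above.
FIRST_YEAR = 1746


def public_domain_cutoff(current_year: int) -> int:
    """Latest publication year that is in the public domain in current_year."""
    # A work published in year Y is protected through the end of Y + 95,
    # so it enters the public domain on 1 January of Y + 96.
    return current_year - PUBLICATION_TERM_YEARS - 1


def in_scope(publication_year: int, current_year: int) -> bool:
    """True if a work's publication year falls inside the corpus time range."""
    return FIRST_YEAR <= publication_year <= public_domain_cutoff(current_year)


cutoff_2023 = public_domain_cutoff(2023)
cutoff_2024 = public_domain_cutoff(2024)
```

Under these assumptions a work published in 1928 is out of scope in 2023 and comes into scope in 2024.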

Cost : This project is currently being developed without any funding and relies on whatever resources are available through Richardson Library.

Project Plan

The AATC project goals are divided into nine phases. While several phases are in progress to varying degrees, the principal emphasis is presently on phases 1 and 2, with the goal of producing a project exemplar that is both manageable in scale and usable for basic scholarship in digital humanities.

Phase 1 : Complete collection of Diasporic writings prior to 1801 with annotated set (In Progress).

Phase 2 : Develop a test corpus to evaluate technical and other requirements (In Progress).

Phase 3 : Seek input from experts in the Morgan community and beyond, and form an advisory panel to develop a research agenda, provide informed advice on issues and challenges experienced during project development, and guide the growth and progress of the AATC so that it is responsive to a broad range of research and learning interests (In Progress).

Phase 4 : Develop research projects that apply text analysis and data mining (TDM) techniques to the corpus to demonstrate the utility of the AATC (In Progress).

Phase 5 : Develop a searchable database to facilitate searching by genre, author, year, place and more (In Progress).

Phase 6 : Expand collection of Diasporic writings from 1801 to 1865 (Planned).

Phase 7 : Develop descriptive and analytic tools and integrate them into the database to facilitate reports and visualizations that can be generated online and exported.

Phase 8 : Promote the project as a scholarly resource to external learning and research communities (Planned).

Phase 9 : Expand collection of Diasporic writings from 1866 to 1927 (Planned).


Bryan Fuller, MLIS, MS
Reference and Government Documents Librarian
133-A Richardson Library
Morgan State University
1700 E. Cold Spring Ln.
Baltimore, MD 21251

Adrian Clarindo, MA, CPE
Professor of English and Portuguese
Instituto Federal do Paraná, Paraná, Brazil
Universidade Estadual de Ponta Grossa, Paraná, Brazil
Ph.D. student at São Paulo University, São Paulo, Brazil

©2018 Morgan State University | 1700 East Cold Spring Lane Baltimore, Maryland 21251 | 443-885-3333