Spoken Corpus

This page documents The Student-Transcribed Corpus of Spoken American English. It covers metadata on the files as well as social and situational variables for the speakers. Every text file is fully coded for each of these pieces of information. Choose below to see either a list of all text files currently included in the corpus or an explanation of each of the coded variables.

Complete List of Files

Click here to see a complete list of all the files included in the corpus and their associated information.

Explanation of Variables

Click here to read how each of the variables coded in the corpus files was operationalized.

“Documentation is like sex: when it is good, it is very, very good; and when it is bad, it is better than nothing.” - Dick H. Brandon

Convenience-sampled data, like that used in this corpus, is difficult to document consistently and comprehensively. The proposed classification attempts to make sense of the many parameters along which speech may vary. However, because of speakers' mobility, the limited availability of exact data, fluctuations in lifestyle, and other difficulties, the information provided may not always be fully accurate.

The Student-Transcribed Corpus of Spoken American English

www.SpokenCorpus.org
