Documentation
Complete List of Files
Click here to see a complete list of all the files included in the corpus and their associated information.
Explanation of Variables
This page explains the operationalization of the individual variables that all corpus files were coded for.
Variable Explanation
Under construction
Complete explanation of variables will be added here.
Education
The 'Education' variable reflects the highest academic diploma or qualification a speaker has obtained. The information is taken from publicly available sources. The level recorded is the one the speaker held at the time of the speech recording, so it may differ for the same speaker across files. The levels form an ordinal scale from 1 (lowest) to 5 (highest). The criteria used to determine the levels are shown in the table below.
Education level | Criteria | Example |
---|---|---|
1_VeryLow | No completed formal education, no diploma | High school drop-out |
2_Low | High school diploma | 18-year-old high school graduate, freshman |
3_Middle | Bachelor's degree, Master's degree in humanities or with an applied focus, other degrees from unaccredited institutions | Bachelor of Science in Geography, Master's in Accounting, Doctor of Ministry |
4_High | Ph.D., J.D., Master's degree in science or engineering | Ph.D. in linguistics |
5_VeryHigh | Ph.D. in any field and continuing involvement in research | Professor at a university |
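
Because the level labels begin with their rank, they can be treated as an ordered scale when working with the corpus metadata. The following is a minimal sketch, assuming the labels appear in the metadata exactly as written in the table above; the function name `education_rank` and the example records (file names and values) are hypothetical and only serve as illustration.

```python
# Level labels as listed in the table above, ordered from lowest to highest.
EDUCATION_LEVELS = ["1_VeryLow", "2_Low", "3_Middle", "4_High", "5_VeryHigh"]


def education_rank(label: str) -> int:
    """Return the ordinal rank (1 to 5) of an Education label."""
    if label not in EDUCATION_LEVELS:
        raise ValueError(f"unknown Education level: {label!r}")
    return EDUCATION_LEVELS.index(label) + 1


# Hypothetical records; these file names and values are invented for illustration.
records = [
    {"file": "example_a", "education": "4_High"},
    {"file": "example_b", "education": "2_Low"},
    {"file": "example_c", "education": "5_VeryHigh"},
]

# Sort the records from lowest to highest education level.
sorted_records = sorted(records, key=lambda r: education_rank(r["education"]))
sorted_files = [r["file"] for r in sorted_records]
print(sorted_files)
```

Since the scale is ordinal, the ranks support comparisons and sorting, but differences between ranks should not be read as equal intervals.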
Gender
The variable 'Gender' records the gender of the speaker. Classification is based simply on immediate perception of the speaker: gendered names, facial appearance, pitch of the voice, etc. For the vast majority of speakers, the levels are either 'Male' or 'Female'. In a small number of cases, however, the value may be different, e.g., for speakers who are transsexual or otherwise self-identify in meaningful ways as another kind of gender.