The Institute for Historical Dialectology (‘IHD’) continued an extensive research programme into variation in medieval written vernaculars that was started in the early 1950s by Professor Angus McIntosh of Edinburgh University and Professor Michael Samuels of Glasgow University (later joined by Professor Michael Benskin, now of Oslo University). Between 1994 and 2013, the IHD attracted research grants in excess of £2 million to produce a suite of large-scale, online, scholarly resources, viz: A Linguistic Atlas of Early Middle English (LAEME), A Linguistic Atlas of Older Scots (LAOS), a heavily revised electronic version of the original Linguistic Atlas of Late Mediaeval English (eLALME), and A Corpus of Narrative Etymologies and accompanying Corpus of Changes (CoNE). In December 2013, the Institute for Historical Dialectology became the Angus McIntosh Centre for Historical Linguistics (‘AMC’). The change from IHD to AMC marks a major initiative intended to increase IHD’s visibility in the research community locally, nationally and internationally. It implies a broadening of the IHD’s scope, an extension of its remit, and thereby the transformation of a small research unit into a vibrant hub of research activity in historical linguistics.
Methodologies used for the making of the linguistic atlases
The LALME project was largely carried out before the computer age. It was compiled using filing slips and paper, with pen or pencil, and it collected data using the tool traditionally employed by dialectologists, the questionnaire. By 1987 technology had progressed to the point where we were able to use computers from the inception of the ‘daughter’ atlas projects (LAEME and LAOS), in a way that is integral to their methodology. Instead of completing questionnaires comprising a set of predetermined ‘items’, we developed a method whereby entire texts were transcribed, keyed onto computer disk and analysed linguistically using programs written in-house.
Each word or morpheme in a text was tagged according to its lexical meaning and grammatical function and each newly tagged text was added to the corpus of such texts. Programs then allowed information on particular ‘items’ (defined by one or more tags) to be abstracted from the corpus to identify spatial or temporal distributions of the forms associated with the item. Output may be produced in different formats including concordances, text profile comparisons, time charts and maps.
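As a purely illustrative sketch of this idea (not the in-house programs themselves), the Python fragment below models a tagged corpus and abstracts the forms recorded for an ‘item’. All names in it (Token, TaggedText, forms_for_item), the tag labels and the toy data are hypothetical, invented for the example; the real atlases use their own tag sets and software.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One tagged word or morpheme (hypothetical structure)."""
    form: str         # spelling as transcribed from the source text
    lexical: str      # tag for lexical meaning
    grammatical: str  # tag for grammatical function


@dataclass
class TaggedText:
    """One transcribed and tagged text in the corpus (hypothetical structure)."""
    text_id: str
    tokens: list  # list of Token


def forms_for_item(corpus, item_tags):
    """Count, per text, the forms whose (lexical, grammatical) tag pair
    belongs to the item. An item is defined by one or more tag pairs."""
    profile = {}
    for text in corpus:
        counts = Counter(
            tok.form
            for tok in text.tokens
            if (tok.lexical, tok.grammatical) in item_tags
        )
        if counts:  # leave out texts in which the item is not attested
            profile[text.text_id] = counts
    return profile


# Invented toy data, for illustration only.
corpus = [
    TaggedText("text_A", [
        Token("hem", "they", "pronoun-object"),
        Token("chirche", "church", "noun"),
        Token("hem", "they", "pronoun-object"),
    ]),
    TaggedText("text_B", [
        Token("them", "they", "pronoun-object"),
        Token("kirke", "church", "noun"),
    ]),
]

# Items are chosen after tagging, from the full inventory of tags.
THEM = {("they", "pronoun-object")}
CHURCH = {("church", "noun")}

them_profile = forms_for_item(corpus, THEM)
church_profile = forms_for_item(corpus, CHURCH)
print(them_profile)
print(church_profile)
```

Because every token carries its tags, a new item can be defined at any time simply by naming a different set of tag pairs; nothing needs to be collected again. If each text also carried a place and a date, the per-text counts could feed a map or a time chart.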
This corpus method of analysis has considerable advantages over the traditional questionnaire. Items for a questionnaire must be selected before analysis begins, or very early in the investigation, on a trial-and-error basis. The results are consequently restricted and provide only a fraction of the information achievable by the corpus method. By contrast, tagged texts in the corpus are immediately and constantly available to be processed and compared. Not all the material will be of use for dialectal work, but this method allows items to be selected from a complete inventory of linguistic forms rather than from some predetermined sample.
The method short-circuits Gilliéron’s paradox: that, for results to be optimal, a questionnaire ought to be devised after the investigation. The tagged corpora provide a detailed lexical-grammatical taxonomy that is useful not just for dialect mapping but for the historical study of phonology, morphology, syntax or semantics. The implementation of the corpus approach to linguistic analysis makes feasible a dynamic, interactive concept of the dialect atlas.
The CoNE project, which is still ongoing, is compiling an etymological corpus that aims to contain a narrative etymology of every form type that appears in the LAEME database. In principle, the methodology could also be extended to include the LAOS corpus and the (e)LALME data.