AMC Affiliates complete syntactically parsed corpus for Early Middle English

The AMC is delighted to kick off 2018 by announcing the successful completion of an exciting project: the Parsed Linguistic Atlas of Early Middle English (PLAEME). Spearheaded by AMC affiliates Robert Truswell and Rhona Alcorn (Edinburgh), PLAEME provides syntactic annotation (in the format of the Penn Parsed Corpora of Historical English) for much of the Linguistic Atlas of Early Middle English (LAEME).

PLAEME parses a substantial chunk (68 files; 173,000 words) of the LAEME corpus, making it available to the general public and plugging a well-known gap in the electronic resources for the syntactic record of English. The bulk of the work was conducted by Rob with the help of Jim Donaldson, a doctoral candidate at Edinburgh. This fantastic achievement will hopefully be the first step toward the longer-term goal of creating fully parsed versions of the entire LAEME corpus (which contains 650k words from 1150-1325) and its sister LAOS (Linguistic Atlas of Older Scots, 280k words from 1380-1500).

The corpus and a more technical description of its contents and compilation procedures can be found on the University of Edinburgh’s DataShare platform (https://datashare.is.ed.ac.uk/handle/10283/3027) or on GitHub: https://github.com/rtruswell/PLAEME_current.

Congratulations to the PLAEME team!
