The Digital Exposure of English Place-Names

Kelly Kilpatrick, Research Fellow, Institute for Name-Studies (University of Nottingham)

The Digital Exposure of English Place-Names (DEEP) project aims to digitise the Survey volumes of the English Place-Name Society. This information has not previously been available electronically, and digitising it will allow the material to be cross-searched and mapped for the first time. DEEP is funded by JISC; it has been running since 2011 and will be completed in October 2013. This collaborative, multi-institutional digital project draws on the expertise of toponymists, technical developers and web designers from the following institutions: The Institute for Name-Studies (University of Nottingham), Centre for Data Digitisation and Analysis (Queen’s University Belfast), Language Technology Group (University of Edinburgh) and The Centre for e-Research (King’s College London).

The development of English place-names has been systematically surveyed by the English Place-Name Society (EPNS) since 1922, and the Survey details over four million place-name forms from classical sources, through the Middle Ages to the modern period. To date, the Society has published 89 volumes, and the Survey is still ongoing. Digitising the volumes in XML makes it possible to link and cross-search the historic place-name forms, dates and sources, amongst a number of other useful functions, thereby creating an important online research tool. There will be two main outputs from the DEEP project: the first an online Historical Gazetteer of place-names and their attestations, and the second a website containing all the EPNS Survey data, allowing for complex free-text and element searching. The Historical Gazetteer is a free online database, and the Survey website will be available to members of Higher Education Institutions and the English Place-Name Society.

Over the past year considerable progress has been made, and we are now nearing the completion of the Historical Gazetteer database. Numerous methodological and technical challenges have arisen throughout the project, notably how to cope with variation in the presentation of the material in the Survey volumes. Though the basic format has remained relatively consistent since the publication of The Place-Names of Buckinghamshire in 1925, there has been considerable development and variation over the past nine decades. The Shropshire survey, for example, is presented differently from other counties. The major names in Shropshire are dealt with in the first volume, and the following volumes discuss place-names parish by parish, with cross-references to names in the first. This is easy to follow in the paper volumes, but difficult from a technical processing perspective; linking the correct information between the volumes required considerable planning. The administrative hierarchy also varies from county to county, and preserving this structure has generated numerous challenges and required innovative solutions. While the administrative geography of many counties is organised into hundreds or wapentakes, some have different administrative levels. The highest administrative level within Sussex, for example, is a Rape, and in Westmorland a Barony. These different levels require different processing to ensure that they are represented correctly. Additionally, with over 7,000 sources cited throughout the Survey, we also had to devise a systematic process to cope with the Survey bibliography.

Generating the digitised output involves numerous stages of collaborative effort. The volumes are scanned and OCRed by the Centre for Data Digitisation and Analysis, then passed to the Institute for Name-Studies, where we prepare the volumes for digital conversion. This preparation involves checking the texts for any errors that may cause digital processing difficulties and expanding abbreviated forms, in order to make the historic place-name forms fully searchable. For example: Buldywas Parva, ~ Maugna needs to become Buldywas Parva, Buldywas Maugna in order for the second form, Buldywas Maugna, to be searchable. Once the texts are prepared, they are sent to the Language Technology Group at Edinburgh, where they are converted to XML, a markup language that is machine readable. At this stage, complex lower-level XML tags are applied to the texts, the modern place-names are geocoded, the bolded place-name elements are linked to the elements tabled within the Key to English Place-Names database and the sources are assigned identification numbers. Below is a short example (excluding geo-references) of the minor name Frogmarsh from A. H. Smith, The Place-Names of Gloucestershire, Part 1, EPNS 38 (Cambridge, 1964), p. 99:

FROGMARSH, Froggemore 1381 MinAcct, Frogmore Shard 1777 M, v. frogga, mōr ‘marshy moor’, sceard ‘gap’.

represented in XML:

<?xml version="1.0" encoding="utf-16"?>
<mappedname modname="Frogmarsh" shortmodname="Frogmarsh">
    <modname>
        <w pws="yes" id="w179036" style="small-caps">Frogmarsh</w>
    </modname>
    <w id="w179045" pws="no">,</w>
    <altset>
        <alt>
            <histform>
                <w pws="yes" id="w179047" style="italic">Froggemore</w>
            </histform>
            <attested>
                <date end="1381" begin="1381" type="date" subtype="simple">
                    <w pws="yes" id="w179058">1381</w>
                </date>
                <source id="go600">
                    <w pws="yes" id="w179063" style="italic">MinAcct</w>
                </source>
            </attested>
        </alt>
        <w id="w179070" pws="no">,</w>
        <alt>
            <histform>
                <w pws="yes" id="w179072" style="italic">Frogmore</w>
                <w pws="yes" id="w179081" style="italic">Shard</w>
            </histform>
            <attested>
                <date end="1777" begin="1777" type="date" subtype="simple">
                    <w pws="yes" id="w179087">1777</w>
                </date>
                <source id="go316">
                    <w pws="yes" id="w179092">M</w>
                </source>
            </attested>
        </alt>
    </altset>
    <w id="w179093" pws="no">,</w>
    <w pws="yes" id="w179095" style="italic">v.</w>
    <etympart>
        <pn-element langcode="O" hversion="1" headword="frogga" kepnid="54684" ambig="no" hquery="headword">
            <w pws="yes" id="w179098" style="bold">frogga</w>
        </pn-element>
    </etympart>
    <w id="w179104" pws="no">,</w>
    <etympart>
        <pn-element langcode="O" hversion="1" headword="mōr" kepnid="52974" ambig="no" hquery="headword">
            <w pws="yes" id="w179106" style="bold">mōr</w>
        </pn-element>
    </etympart>
    <w pws="yes" id="w179110">'</w>
    <gloss>
        <w id="w179111" pws="no">marshy</w>
        <w pws="yes" id="w179118">moor</w>
    </gloss>
    <w id="w179122" pws="no">'</w>
    <w id="w179123" pws="no">,</w>
    <etympart>
        <pn-element langcode="O" hversion="1" headword="sceard" kepnid="54930" ambig="no" hquery="headword">
            <w pws="yes" id="w179125" style="bold">sceard</w>
        </pn-element>
    </etympart>
    <w pws="yes" id="w179132">'</w>
    <gloss>
        <w id="w179133" pws="no">gap</w>
    </gloss>
    <w id="w179136" pws="no">'</w>
    <w id="w179137" pws="no">.</w>
</mappedname>
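
As a rough illustration of how markup of this kind can be consumed, the sketch below reads a trimmed copy of the fragment above with Python's standard xml.etree.ElementTree module and lists each historic form with its date range and source identifier. The function names attestations and in_range are hypothetical, the XML declaration and the attributes on the w elements are left out for brevity, and none of this is the project's own processing code.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the Frogmarsh entry: only the elements needed here are kept.
FRAGMENT = """
<mappedname modname="Frogmarsh" shortmodname="Frogmarsh">
  <altset>
    <alt>
      <histform><w>Froggemore</w></histform>
      <attested>
        <date end="1381" begin="1381" type="date" subtype="simple"><w>1381</w></date>
        <source id="go600"><w>MinAcct</w></source>
      </attested>
    </alt>
    <alt>
      <histform><w>Frogmore</w><w>Shard</w></histform>
      <attested>
        <date end="1777" begin="1777" type="date" subtype="simple"><w>1777</w></date>
        <source id="go316"><w>M</w></source>
      </attested>
    </alt>
  </altset>
</mappedname>
"""


def attestations(xml_text):
    """Return one record per <alt>: modern name, historic form, dates, source id."""
    root = ET.fromstring(xml_text)
    rows = []
    for alt in root.iter("alt"):
        # A historic form may span several <w> tokens, e.g. "Frogmore Shard".
        form = " ".join(w.text for w in alt.findall("./histform/w"))
        date = alt.find("./attested/date")
        source = alt.find("./attested/source")
        rows.append({
            "modern": root.get("modname"),
            "form": form,
            "begin": int(date.get("begin")),
            "end": int(date.get("end")),
            "source_id": source.get("id"),
        })
    return rows


def in_range(rows, start, stop):
    """Keep records whose date span overlaps the years start..stop (inclusive)."""
    return [r for r in rows if r["begin"] <= stop and r["end"] >= start]


if __name__ == "__main__":
    for row in in_range(attestations(FRAGMENT), 1300, 1400):
        print(row["modern"], row["form"], row["begin"], row["source_id"])
```

Run against the fragment, a date range of 1300 to 1400 keeps only the 1381 form Froggemore; widening the range to 1800 would also return Frogmore Shard.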

Processing the data in this way will allow for complex text searching in the online database: date-ranges can be selected, specific sources can be searched and place-name searches can be narrowed down to counties and types of names. In the Survey website it will also be possible to search by place-name element. The place-names will also be represented on a map; where the locations of minor names, field-names and historic place-names are unknown, the point on the map will be of the parish in which they are located. This XML structure will make it possible to perform different data mining functions and visualise the material in a way not possible with the paper volumes. Following the conversion to XML, various checks are conducted between the institutions, before the data is finally uploaded to the Gazetteer website.
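
Of the stages described above, the expansion of abbreviated forms during preparation is the easiest to illustrate in isolation. The Python sketch below handles only the simple case from the Buldywas example, and it assumes that a tilde always stands for the first word of the preceding full form. That assumption is ours for the purpose of illustration; abbreviation in the printed volumes is more varied, and the function name expand_tilde is hypothetical.

```python
def expand_tilde(forms, placeholder="~"):
    """Expand a leading placeholder in each comma-separated form.

    Assumption (for illustration only): the placeholder replaces the first
    word of the most recent form that was written out in full.
    """
    expanded = []
    headword = None
    for part in forms.split(","):
        words = part.split()
        if not words:
            continue  # skip empty segments such as a trailing comma
        if words[0] == placeholder:
            if headword is None:
                raise ValueError("placeholder appears before any full form")
            words[0] = headword
        else:
            headword = words[0]
        expanded.append(" ".join(words))
    return ", ".join(expanded)


if __name__ == "__main__":
    print(expand_tilde("Buldywas Parva, ~ Maugna"))
    # prints: Buldywas Parva, Buldywas Maugna
```

Forms that contain no placeholder pass through unchanged, so the same function can be applied to every list of forms in a prepared text.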

[Screenshot of the DEEP Gazetteer website. Site currently under construction]

The DEEP Gazetteer website offers a number of data mining functions and will be a useful onomastic tool. The Gazetteer will enable the user to search by both modern place-names and historic forms, or to browse by county. The information can also be restricted in various ways to fit the needs of the user. With a historical search you can select a source or choose from popular sources, select the counties you would like to search and also specify the type of place (e.g. hundred, major settlement, field-name).

[Screenshot of the DEEP Gazetteer website. Site currently under construction]

To search by modern name, the user selects this option from a drop-down box and can specify either a free-text search or a precise character search that returns an exact match. If the browse function is selected, the user can drill down through the administrative hierarchy.

[Screenshot of the DEEP Gazetteer website. Site currently under construction]

For example, if Derbyshire is selected, a page displaying the hundreds in Derbyshire will be returned. The boundaries of the hundreds will be displayed in the map on the right of the page. Selecting a hundred will then open a page with its parishes, and selecting a parish will return the settlements within that parish, as well as the minor names, field-names and other types of names associated with the settlements. Clicking on each of the name-links will display the historic forms of the place-name with dates and sources on the right of the screen.

[Screenshot of the DEEP Gazetteer website. Site currently under construction]

Another useful feature of the DEEP Gazetteer is the ability to search by type of place-name. Users can select from a variety of names, from the names of larger administrative divisions (hundreds, wapentakes, honours) to pub-names, field-names and names of features such as roads, wells, gates, etc. Additionally, users can search for place-names by date range or by sources. The date range function will enable the user to narrow down the search to precise dates or selected centuries. Searching by source will return the place-names found in the chosen sources, and the source search can also be restricted by county, place-name type, etc. Furthermore, the place-names in the Gazetteer will be linked to the Survey website through URIs, providing easy navigation to further Survey information.

The DEEP Gazetteer will be a useful resource for onomasts, local and national historians, linguists, genealogists, archaeologists and anyone with an interest in names. The webpage will be available at: http://www.placenames.org.uk. We will notify www.onomastics.co.uk when the site is completed, and in the meantime please visit our Facebook page for updates on DEEP progress and posts of interesting and funny place-names:

https://www.facebook.com/EnglishPlaceNames?ref=hl

or contact [email protected] for further information. We look forward to hearing from you!
