Operation Heron corpus

Uploaded: 2021-09-29
Languages: English
Collected from: 2020
To: 2020
Access category: Open


The dataset is a subset of the abusive letters sent by Margaret Walkers to individuals in the public eye between 2007 and 2009.

Subject keywords: corpus, letter series, abusive language
Data types: Written
Funders: N/A
Associated AIFL centres: Forensic Linguistic Databank (FoLD)
License: Non-Commercial Government Licence for public sector information


The corpus represents an incomplete subset of the entire collection of abusive letters sent by Margaret Walkers between 2007 and 2012. The present dataset spans January 2007 to April 2009 and consists of 50 letters and 49 envelopes, amounting to 10,650 tokens. The letters are addressed to private individuals (50%), healthcare professionals, especially doctors (28%), and other categories such as imams, city county officials, hairdressers, etc. (22%). Each file name encodes metadata about the file itself: the original number of the document (as given by the police), the type of document (letter or envelope), and the date (in the format MONTH/YEAR). For example, letter no. 1, sent to a medical doctor in January 2007, is coded as "01_letter_doctor_012007"; as this example shows, the file name also records the recipient category.

Associated publication: Busso, L., Petyko, M., Atkins, S., & Grant, T. (2022). Operation Heron–Latent topic changes in an abusive letter series. Corpora 17(2).
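The file-coding scheme described above lends itself to automatic parsing. The following is a minimal Python sketch, not part of the dataset itself. It is based only on the single example "01_letter_doctor_012007" and assumes that envelope files use the word "envelope" in the second field, that the third field is a recipient category, and that any file extension has already been removed. The names `CorpusFileName` and `parse_file_name` are hypothetical.

```python
import re
from typing import NamedTuple


class CorpusFileName(NamedTuple):
    """Metadata fields recovered from one coded file name (hypothetical type)."""
    number: int      # original document number, as given by the police
    doc_type: str    # "letter" or "envelope" (the latter spelling is assumed)
    recipient: str   # recipient category, e.g. "doctor" (assumed third field)
    month: int       # 1-12
    year: int        # four-digit year


# Assumed layout: NUMBER_TYPE_RECIPIENT_MMYYYY, inferred from the one example.
_PATTERN = re.compile(r"^(\d+)_(letter|envelope)_(.+)_(\d{2})(\d{4})$")


def parse_file_name(name: str) -> CorpusFileName:
    """Split a coded file name (without extension) into its metadata fields."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected file name format: {name!r}")
    number, doc_type, recipient, month, year = match.groups()
    month_number = int(month)
    if not 1 <= month_number <= 12:
        raise ValueError(f"invalid month in file name: {name!r}")
    return CorpusFileName(int(number), doc_type, recipient, month_number, int(year))


if __name__ == "__main__":
    print(parse_file_name("01_letter_doctor_012007"))
```

File names in the actual download may deviate from this pattern, so the sketch raises an error on anything it does not recognise instead of guessing.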

Data Donors

