Child Sexual Abuse Dark-Web Discussion Forum Corpus

Uploaded: 2021-09-08
Languages: English
Collected from: 2021
To: 2021
Access category: Controlled
Email: Not available


Dark Web CSEA forum corpus

Subject keywords: corpus, dark web, forum
Data types: Written
Funders: N/A
Associated AIFL centres: Centre for Forensic Text Analysis (FTA)
License: N/A


This dataset comprises 114,756 posts collected from a dark web child abuse forum. The data are exclusively textual and were scraped for research purposes for the Aston Institute for Forensic Linguistics. The posts were written by 2,074 different forum users. The corpus contains approximately 8.4 million tokens.

Information: This dataset contains highly sensitive material or third-party data that are subject to heavy constraints on access and use. It is therefore stored not on the FoLD web server but on an air-gapped, offline computer in our secure data lab at the Aston Institute for Forensic Linguistics. Users who wish to access this dataset must make a detailed application to FoLD and the researcher, and may also need to obtain additional agreement from an external organisation, before they can be approved for access.
