Deceptive Opinion Spam Corpus v1.4
Corpus of truthful and deceptive hotel reviews of 20 Chicago hotels
Data types: Written
Associated AIFL centres: None
License: CC0 (The Creative Commons Public Domain Dedication)
This corpus contains:

- 400 truthful positive reviews from TripAdvisor (described in [1])
- 400 deceptive positive reviews from Mechanical Turk (described in [1])
- 400 truthful negative reviews from Expedia, Hotels.com, Orbitz, Priceline, TripAdvisor and Yelp (described in [2])
- 400 deceptive negative reviews from Mechanical Turk (described in [2])

Each of the above datasets consists of 20 reviews for each of the 20 most popular Chicago hotels (see [1] for more details).

The files are named according to the following conventions:

- Directories prefixed with fold correspond to a single fold from the cross-validation experiments reported in [1] and [2].
- Files are named according to the format %c_%h_%i.txt, where:
  - %c denotes the class: (t)ruthful or (d)eceptive
  - %h denotes the hotel:
    - affinia: Affinia Chicago (now MileNorth, A Chicago Hotel)
    - allegro: Hotel Allegro Chicago - a Kimpton Hotel
    - amalfi: Amalfi Hotel Chicago
    - ambassador: Ambassador East Hotel (now PUBLIC Chicago)
    - conrad: Conrad Chicago
    - fairmont: Fairmont Chicago Millennium Park
    - hardrock: Hard Rock Hotel Chicago
    - hilton: Hilton Chicago
    - homewood: Homewood Suites by Hilton Chicago Downtown
    - hyatt: Hyatt Regency Chicago
    - intercontinental: InterContinental Chicago
    - james: James Chicago
    - knickerbocker: Millennium Knickerbocker Hotel Chicago
    - monaco: Hotel Monaco Chicago - a Kimpton Hotel
    - omni: Omni Chicago Hotel
    - palmer: The Palmer House Hilton
    - sheraton: Sheraton Chicago Hotel and Towers
    - sofitel: Sofitel Chicago Water Tower
    - swissotel: Swissotel Chicago
    - talbott: The Talbott Hotel
  - %i serves as a counter to make the filename unique

References

[1] M. Ott, Y. Choi, C. Cardie, and J.T. Hancock. 2011. Finding Deceptive Opinion Spam by Any Stretch of the Imagination. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies.

[2] M. Ott, C. Cardie, and J.T. Hancock. 2013. Negative Deceptive Opinion Spam. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
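
As an illustration of the %c_%h_%i.txt naming convention described above, the following minimal Python sketch splits a filename into its class, hotel key, and counter. The names parse_review_filename, ReviewFile, and FILENAME_RE are hypothetical and not part of the corpus. The sketch also assumes that the hotel key consists only of lowercase letters (as in the list above) and that the counter is a decimal integer; the corpus description does not state the counter's format explicitly.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern for "%c_%h_%i.txt": class letter, hotel key, numeric counter.
# The numeric counter and lowercase-only hotel key are assumptions.
FILENAME_RE = re.compile(r"^(?P<cls>[td])_(?P<hotel>[a-z]+)_(?P<counter>\d+)\.txt$")


class ReviewFile(NamedTuple):
    """Hypothetical container for the fields encoded in a review filename."""
    label: str    # "truthful" or "deceptive"
    hotel: str    # hotel key, e.g. "hilton"
    counter: int  # per-hotel counter that makes the filename unique


def parse_review_filename(name: str) -> Optional[ReviewFile]:
    """Return the parsed fields, or None if the name does not fit the pattern."""
    match = FILENAME_RE.match(name)
    if match is None:
        return None
    label = "truthful" if match.group("cls") == "t" else "deceptive"
    return ReviewFile(label, match.group("hotel"), int(match.group("counter")))
```

For example, under these assumptions parse_review_filename("t_hilton_1.txt") yields a record with label "truthful", hotel "hilton", and counter 1, while a name that does not follow the convention yields None. Note that the filename encodes only the truthful/deceptive class; it does not say whether a review is positive or negative.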
The data is stored externally. Please follow the link below for access.