Calliope Sentiment Parser Series

Opinion mining over labelled and unlabelled data

Welcome to CSPS

The Calliope Sentiment Parser Series (CSPS) is an educational undertaking in statistical methods for opinion mining. CSPS was started in 2014 by Trent Hyer and has several iterations.

Orpheus (2014)

Orpheus was the first iteration of the CSPS and was based on the work of Jonathon Read. Read's research suggested that a sentiment-scoring corpus could be seeded from emoticons, and Orpheus was a proof of concept for that idea. Developed as a side project for Domo, Inc. in American Fork, Utah, the tool was extensively tested and considered a minor success. Source code, implementation details, and detailed metrics for Orpheus are unavailable because they are the property of Domo.
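
Since the Orpheus source is not public, the following is only a minimal sketch of the general idea of seeding a sentiment corpus from emoticons, not the Orpheus implementation. The emoticon sets and the names `seed_label` and `build_seed_corpus` are hypothetical and chosen purely for illustration.

```python
# Hypothetical emoticon sets (assumption; not taken from Orpheus).
POSITIVE = {":)", ":-)", ":D", "=)"}
NEGATIVE = {":(", ":-(", ":'("}
EMOTICONS = POSITIVE | NEGATIVE


def seed_label(text):
    """Return 'pos', 'neg', or None if emoticons are absent or conflicting."""
    tokens = text.split()
    has_pos = any(t in POSITIVE for t in tokens)
    has_neg = any(t in NEGATIVE for t in tokens)
    if has_pos == has_neg:
        # Either no emoticon at all, or both polarities: too noisy to use.
        return None
    return "pos" if has_pos else "neg"


def build_seed_corpus(texts):
    """Turn raw texts into (text_without_emoticons, label) pairs."""
    corpus = []
    for text in texts:
        label = seed_label(text)
        if label is None:
            continue
        # Drop the emoticons so a later model cannot simply memorize them.
        words = [t for t in text.split() if t not in EMOTICONS]
        corpus.append((" ".join(words), label))
    return corpus


if __name__ == "__main__":
    sample = ["great day :)", "missed my bus :(", "no emoticon", "mixed :) :("]
    for row in build_seed_corpus(sample):
        print(row)
```

Texts with no emoticon, or with emoticons of both polarities, are skipped in this sketch because they give no clear signal for a seed label.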

Linus (2015)

The second iteration of the CSPS, Linus was an attempt to improve on the accuracy of Orpheus by combining unlabelled data (emoticons in Twitter text) with labelled data (Yelp reviews with star ratings). Linus was also the first iteration to introduce several accuracy metrics and to baseline the tools. It was developed as a class project; source code and a report are available here.

Biston (2015-2016)

Biston was the first attempt to combine sentiment prediction with usefulness metrics. These metrics were generated by extracting textual features and passing them through machine learning models, using Yelp usefulness ratings as labels. Biston was a phenomenal success, largely because of the contributions of fellow BYU students Courtni Byun (Linguistics) and Stephen Carron (Statistics). Biston is available as an open-source repository.
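
As a rough illustration of the feature-extraction step described above, the sketch below turns review text into a small feature dictionary and pairs it with a usefulness label. The three features and the function names `extract_features` and `build_training_rows` are assumptions made for this example. They are not the features or code that Biston uses, and the model-training step is left out.

```python
def extract_features(review_text):
    """Compute a few simple textual features (hypothetical feature set)."""
    words = review_text.split()
    n_words = len(words)
    return {
        "n_words": n_words,
        # Average token length, guarding against empty reviews.
        "avg_word_len": sum(len(w) for w in words) / n_words if n_words else 0.0,
        "n_exclaims": review_text.count("!"),
    }


def build_training_rows(reviews):
    """Pair each review's features with its usefulness label.

    `reviews` is an iterable of (text, useful_votes) pairs, where
    useful_votes stands in for a Yelp usefulness rating.
    """
    return [(extract_features(text), votes) for text, votes in reviews]


if __name__ == "__main__":
    rows = build_training_rows([("Great food!", 3), ("", 0)])
    for features, label in rows:
        print(features, label)
```

Rows of this shape could then be passed to any supervised learner, with the feature dictionary as input and the usefulness value as the target.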