UNED Online Reputation Monitoring Team at RepLab 2013

Abstract

This paper describes the participation of the UNED Online Reputation Monitoring Team at RepLab 2013. Several approaches were tested. First, an instance-based learning approach that uses Heterogeneity Based Ranking to combine seven different similarity measures was applied to all the subtasks. The filtering subtask was also tackled by automatically discovering filter keywords: those whose presence in a tweet reliably confirms (positive keywords) or rules out (negative keywords) that the tweet refers to the company. Three approaches were submitted for the topic detection subtask: agglomerative clustering over wikified tweets, co-occurrence term clustering, and an LDA-based model that uses temporal information. Finally, the polarity subtask was tackled by generating domain-specific semantic graphs, following a previously presented approach, in order to automatically expand the general-purpose lexicon SentiSense. The resulting domain-specific sub-lexicons were then used to classify tweets according to their reputational polarity, following an emotional concept-based system for sentiment analysis. Our results confirm that using entity-level training data improves the filtering step. Additionally, the proposed topic detection approaches obtained the highest scores in the official evaluation, which suggests that they are promising directions for addressing the problem. In the reputational polarity task, our results suggest that a deeper analysis is needed to correctly identify the main differences between the Reputational Polarity task and traditional Sentiment Analysis tasks. A final remark is that the overall performance of a monitoring system in RepLab 2013 depends heavily on the performance of the initial filtering step.
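
To make the notion of filter keywords concrete, the following is a minimal Python sketch of how positive and negative keywords could be applied to a tweet once they are known. It is an illustration only, not the system described in the paper: the function name, the whitespace tokenization, the "undecided" fallback, and the example keywords for a hypothetical car-maker entity are all assumptions, and the automatic keyword discovery step is not shown.

```python
def classify_with_filter_keywords(tweet, positive_keywords, negative_keywords):
    """Label a tweet as 'related', 'unrelated', or 'undecided'.

    Hypothetical helper: a positive keyword confirms that the tweet refers
    to the entity, a negative keyword rules it out. Tweets with no keyword,
    or with conflicting keywords, are left undecided (an assumption).
    """
    # Naive lowercase whitespace tokenization, assumed for brevity.
    tokens = set(tweet.lower().split())
    has_positive = bool(tokens & positive_keywords)
    has_negative = bool(tokens & negative_keywords)
    if has_positive and not has_negative:
        return "related"
    if has_negative and not has_positive:
        return "unrelated"
    return "undecided"


# Made-up keywords for an ambiguous entity name shared by a car maker.
positive = {"xf", "dealership"}
negative = {"zoo", "jungle"}

label_car = classify_with_filter_keywords(
    "Test drove the new Jaguar XF today", positive, negative)
label_animal = classify_with_filter_keywords(
    "Saw a jaguar at the zoo", positive, negative)
label_unknown = classify_with_filter_keywords(
    "I love jaguar", positive, negative)
```

In this sketch, tweets left undecided would need to be handled by another classifier, such as one trained on the tweets that the keywords already labeled; that choice is likewise an assumption.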

Publication
CLEF'13 Eval. Labs and Workshop Online Working Notes