Automatic Identification of Political Ideology in Online News Articles


  • Vikki FIORE, Montclair State University, Montclair, NJ, USA


This research discusses text analysis approaches for automatically categorizing news articles by their political ideology. Here, ideology is defined as a writer expressing either a liberal or a conservative point of view. Classification is performed at both the document level and the phrase level, because previous research has indicated that doing so improves classifier performance over a "bag of words" approach. Linguistic features related to lexical richness are extracted from the articles with Python, and features related to emotions and values are extracted with the Linguistic Inquiry and Word Count (LIWC) software. The machine learning software Weka is then used to apply various classification algorithms to the numeric features. Additionally, Amazon Mechanical Turk is used to measure human accuracy and inter-rater agreement in identifying the ideology of the same texts. Overall, the trained classifiers perform well above the baseline and outperform the human annotators on the same tasks.
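As an illustration of the lexical richness step, the following is a minimal Python sketch of how such numeric features might be computed for one article. The abstract does not say which measures were used, so the three shown here (type-token ratio, hapax legomena ratio, and mean word length) are assumptions, as are the function name `lexical_richness` and the simple regular-expression tokenizer.

```python
import re
from collections import Counter


def lexical_richness(text):
    """Return a few common lexical-richness measures for one article.

    The choice of measures is an assumption for illustration; the
    original study may have used a different feature set.
    """
    # Assumed tokenizer: lowercase alphabetic words, allowing one
    # internal apostrophe (e.g. "don't").
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not tokens:
        return {
            "tokens": 0,
            "types": 0,
            "type_token_ratio": 0.0,
            "hapax_ratio": 0.0,
            "mean_word_length": 0.0,
        }

    counts = Counter(tokens)
    # Hapax legomena: word types that occur exactly once in the text.
    hapaxes = sum(1 for c in counts.values() if c == 1)

    return {
        "tokens": len(tokens),
        "types": len(counts),
        "type_token_ratio": len(counts) / len(tokens),
        "hapax_ratio": hapaxes / len(tokens),
        "mean_word_length": sum(len(t) for t in tokens) / len(tokens),
    }


if __name__ == "__main__":
    sample = "The tax cut helps the economy"
    for name, value in lexical_richness(sample).items():
        print(f"{name}: {value}")
```

Each article would yield one row of numbers like these, which could then be written to a CSV or ARFF file and loaded into Weka alongside the LIWC features.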