Applying social media analysis to gauge public sentiment during an election campaign
By: Flytxt R&D Team
Social media networks such as Facebook have become platforms where millions of people broadcast their thoughts and opinions on a great variety of topics. They have also become an important venue for political conversation throughout the world.
Data analytics that applies natural language processing models to publicly available data to infer people's attitudes has huge potential to provide insights into public sentiment, much as opinion pollsters do by querying a population. It can also help political parties understand their popularity in the public domain and monitor the impact of the events and campaigns they conduct. Compared with conventional polling, mining public opinion from freely available text content is much faster and more cost-effective.
Flytxt undertook an experimental project that used text analytics to capture the range of public opinion, especially on the major political parties, surrounding the State Assembly Elections in the southern Indian state of Kerala. The study had two objectives. The first was to find out whether the issues widely discussed in various media and political campaigns were also trending on social media. The second was to find out how these issues were affecting overall public sentiment towards the different political fronts and leaders.
For the analysis, the social networking website Facebook was selected as the data source because it is a major source of online political commentary and discussion in Kerala. The posts were sourced from the official pages maintained by the political parties as well as from the pages of some prominent leaders in each party. Data from at least the six months preceding the election date were taken into account.
The analysis essentially involved grouping similar comments with a clustering algorithm and then examining the most relevant words in each cluster, that is, the words with the highest weights in the cluster centroid. Prior to clustering, the text was pre-processed to filter out unwanted content, and the filtered text was converted to a Term Frequency-Inverse Document Frequency (TF-IDF) matrix in which each comment is a vector of TF-IDF scores. A TF-IDF score measures how important a word is to a comment relative to the whole corpus: it rises with the word's frequency in that comment and falls as the word appears in more comments. The clusters were then visualised as word clouds so that the most relevant terms could be perceived quickly. The importance of each word was represented by font colour, e.g. a lighter font to indicate words of higher importance. The size of each cluster represented the volume of the sentiment it contained.
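The pipeline described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the article does not name the clustering algorithm, so k-means is assumed here; the four comments and the stop-word list are invented for the example; and the smoothed IDF formula is one common variant among several. All function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical stop-word list standing in for the "filter out unwanted text" step.
STOP_WORDS = {"a", "and", "is", "of", "on", "the", "what"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens only, and drop stop words."""
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return [w for w in cleaned.split() if w not in STOP_WORDS]

def tfidf_matrix(comments):
    """Return (vocabulary, rows); each row is an L2-normalised TF-IDF vector."""
    docs = [tokenize(c) for c in comments]
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    doc_freq = Counter(w for d in docs for w in set(d))
    # Smoothed IDF (an assumed variant): ln((1 + n) / (1 + df)) + 1
    idf = {w: math.log((1 + n) / (1 + doc_freq[w])) + 1.0 for w in vocab}
    rows = []
    for d in docs:
        tf = Counter(d)
        row = [tf[w] * idf[w] for w in vocab]
        norm = math.sqrt(sum(x * x for x in row)) or 1.0
        rows.append([x / norm for x in row])
    return vocab, rows

def kmeans(rows, k, iters=20):
    """Plain k-means; centroids start at the first k rows so runs are repeatable."""
    centroids = [list(rows[i]) for i in range(k)]
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(r, centroids[j])))
            for r in rows
        ]
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def top_terms(vocab, centroid, n=3):
    """Words with the largest centroid weights; ties are broken alphabetically."""
    ranked = sorted(zip(centroid, vocab), key=lambda t: (-t[0], t[1]))
    return [w for score, w in ranked[:n] if score > 0]

# Invented comments, for illustration only.
comments = [
    "Great development work on roads and schools",
    "Corruption scandal, shame on the leaders",
    "Development of roads and schools is great",
    "Another corruption scandal, what a shame",
]

vocab, rows = tfidf_matrix(comments)
labels, centroids = kmeans(rows, k=2)

# Cluster size as a share of all comments, i.e. the "volume" of each sentiment.
sizes = Counter(labels)
shares = {j: sizes[j] / len(comments) for j in sorted(sizes)}

for j in sorted(sizes):
    print(j, top_terms(vocab, centroids[j]), f"{shares[j]:.0%}")
```

The top-ranked centroid words are what a word cloud would emphasise, and the share of comments in each cluster plays the role of cluster size. In practice a library such as scikit-learn would replace the hand-written TF-IDF and k-means steps, and the number of clusters would need to be chosen by inspection or by a validity measure.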
The majority of comments on each front's pages favoured that front: around 60% for the Left Democratic Front (LDF), 65% for the United Democratic Front (UDF), and 70% for the Bharatiya Janata Party (BJP). However, the share of critical comments was considerably higher for the UDF (15%) than for the others (1% for the LDF and 5% for the BJP). Our experimental analysis suggested that this higher share of criticism might indicate a decline in the UDF's popularity, and the election results eventually supported our conclusion.