Throughout the twenty-first century there have been quite a few influential movie franchises and series, but none compare to those of Marvel and DC Comics. Both companies were established in the 1930s as comic book publishers, and they have been neck and neck ever since, whether in comics, TV shows, or movies. Recently each released one of its highest-grossing movies: Marvel’s Avengers: Endgame was proclaimed the highest-grossing movie of all time, and DC Comics’ Joker became one of Warner Bros.’ highest-grossing movies. I wanted to find out what fans thought of these movies, so I used Reddit’s API to analyze the sentiment of their respective fan communities.
I started this analysis by researching platforms from which I could gauge audience sentiment. I looked into ways to pull data from sites such as Twitter, Facebook, IMDb, and Reddit. Because Reddit is a platform where people gather around common interests, I decided it was the most likely to yield valuable insights. Other studies had used the Natural Language Toolkit (NLTK) to analyze the sentiment of tweets about Donald Trump and his presidential campaign. NLTK is a Python library that contains text processors for parsing, stemming, classification, tokenization, semantic reasoning, and tagging. Using this library and a few others, I wrote code that pulls and analyzes Reddit posts. NLTK allowed me to generate a polarity score ranging from negative one (most negative) to positive one (most positive). It also let me pull out specific words that had a positive or negative association with a post. Once the code was finished, I searched the subreddits r/joker and r/Endgame for common trends and for abnormalities in viewpoints, beliefs, and polarity scores.
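The scoring step described above can be sketched roughly as follows. This is a minimal illustration, not the exact code used in the analysis. It assumes NLTK's VADER analyzer (`SentimentIntensityAnalyzer`), whose "compound" value falls between -1 and 1; the text above does not say which NLTK component produced the scores. The function names and the 0.05 cutoff for calling a post positive or negative are also assumptions, and the posts are assumed to have already been pulled from Reddit as plain strings.

```python
from statistics import mean
from typing import Callable, Iterable


def make_vader_scorer() -> Callable[[str], float]:
    """Return a function that maps text to a polarity score in [-1, 1].

    Assumption: NLTK's VADER analyzer is the scorer; the original
    analysis may have used a different NLTK component.
    """
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # fetches the lexicon if missing
    analyzer = SentimentIntensityAnalyzer()
    return lambda text: analyzer.polarity_scores(text)["compound"]


def summarize_polarity(
    posts: Iterable[str],
    score: Callable[[str], float],
    cutoff: float = 0.05,  # assumed threshold, not taken from the analysis
) -> dict:
    """Score every post, then report the mean and a positive/neutral/negative tally."""
    scores = [score(post) for post in posts]
    if not scores:
        return {"count": 0, "mean": None, "positive": 0, "neutral": 0, "negative": 0}
    return {
        "count": len(scores),
        "mean": mean(scores),
        "positive": sum(s >= cutoff for s in scores),
        "neutral": sum(-cutoff < s < cutoff for s in scores),
        "negative": sum(s <= -cutoff for s in scores),
    }
```

With NLTK installed, calling `summarize_polarity(posts, make_vader_scorer())` once on the r/joker posts and once on the r/Endgame posts would give two summaries that can be compared side by side. Passing the scorer in as an argument keeps the tallying logic independent of the sentiment model, so a different scorer could be swapped in without changing the rest.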