Master’s student at the University of Colorado


Marvel vs. DC Comics Sentiment Analysis

The twenty-first century has seen quite a few influential movie franchises and series, but none compare to Marvel or DC Comics. Both companies were established around the 1930s as comic book publishers, and they have been neck and neck ever since, whether in comics, TV shows, or movies. Recently each franchise released one of its highest-grossing movies: Marvel’s Avengers: Endgame was proclaimed the highest-grossing movie of all time, and DC Comics’ Joker was one of Warner Bros.’ highest-grossing movies. I wanted to find out what fans thought of these movies, so I used Reddit’s API to analyze the sentiment of each film’s fan community.

I started by researching platforms from which I could gauge audience sentiment, looking into ways to pull data from Twitter, Facebook, IMDb, and Reddit. I settled on Reddit because it is a platform where people gather around common interests, which made me expect it to yield valuable insights. Other studies had used the Natural Language Toolkit (NLTK) to analyze the sentiment of tweets about Donald Trump and his presidential campaign. Using this Python library and a few others, I wrote code that could pull and analyze Reddit posts. NLTK is a library of text processors for parsing, stemming, classification, tokenization, semantic reasoning, and tagging. It allowed me to generate a polarity score for each post ranging from -1 (most negative) to +1 (most positive), and to pull out the specific words associated with positive or negative posts. Once the code was finished, I searched the subreddits r/Joker and r/Endgame for common trends and for abnormalities in viewpoints, beliefs, and polarity scores.
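To make the pull-and-score step concrete, here is a minimal sketch of how it could look. It is not the original code: the use of the PRAW client, the credential placeholders, the function names, and the ±0.05 cut-offs (the conventional VADER thresholds) are all assumptions on my part.

```python
from typing import Iterable, List

# Conventional VADER cut-offs (an assumption; the analysis above does not
# state which thresholds were actually used).
POSITIVE_CUTOFF = 0.05
NEGATIVE_CUTOFF = -0.05


def label_from_compound(compound: float) -> str:
    """Map a polarity score in [-1, 1] to a coarse sentiment label."""
    if compound >= POSITIVE_CUTOFF:
        return "positive"
    if compound <= NEGATIVE_CUTOFF:
        return "negative"
    return "neutral"


def fetch_post_texts(subreddit_name: str, limit: int = 1000) -> List[str]:
    """Pull the newest posts from a subreddit (PRAW is an assumed client)."""
    import praw  # imported lazily so the labeling helper works without it

    reddit = praw.Reddit(
        client_id="YOUR_CLIENT_ID",          # placeholder credentials
        client_secret="YOUR_CLIENT_SECRET",  # placeholder credentials
        user_agent="sentiment-sketch by u/your_username",
    )
    posts = reddit.subreddit(subreddit_name).new(limit=limit)
    # Combine the title and body so link-only posts still have text to score.
    return [f"{post.title} {post.selftext}".strip() for post in posts]


def score_posts(texts: Iterable[str]) -> List[float]:
    """Return one VADER compound polarity score per post."""
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download
    analyzer = SentimentIntensityAnalyzer()
    return [analyzer.polarity_scores(text)["compound"] for text in texts]
```

With real credentials filled in, something like `[label_from_compound(s) for s in score_posts(fetch_post_texts("joker"))]` would give one label per post.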

The first subreddit I analyzed was r/Joker, a place where fans of the character come together to discuss all things related to DC’s Joker. At first glance the subreddit seems geared toward memes, but those memes say a lot about how people generally feel about the subject. The common trend I noticed while reading through it was that posts centered on fans sharing art and their own interpretations of the character. The first step in the code was to pull the posts from Reddit; I pulled around a thousand, ranging from positive to negative polarity. Running the subreddit through the code, I found many more neutral posts than I had expected. The combination of NLTK and VADER sentiment analysis rated around 56% of the content as neutral, significantly more than the 32% rated positive or the 12% rated negative. The analysis also shows that a word such as “Joker” appears as both a positive word and a negative word, meaning that its sentiment depends on the rest of the post.
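The neutral, positive, and negative percentages are simple shares of the labeled data. A small sketch of that arithmetic follows; the function name is hypothetical, and the sample labels are synthetic, built only to mirror the reported r/Joker split rather than taken from the real data.

```python
from collections import Counter
from typing import Dict, Iterable


def sentiment_shares(labels: Iterable[str]) -> Dict[str, float]:
    """Return the percentage of items carrying each sentiment label."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}  # nothing was labeled, so there are no shares to report
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Synthetic labels that mimic the reported split (not the real posts).
sample_labels = ["neutral"] * 56 + ["positive"] * 32 + ["negative"] * 12
shares = sentiment_shares(sample_labels)
print(shares)
```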

For Avengers: Endgame, the code again picked up around a thousand posts, and again a large share of them were neutral: 61% of the posts were neutral, 17% negative, and 22% positive. As in the Joker analysis, the name of the subreddit was both the number one positive word and the number one negative word the code picked up. While reading through the subreddit I found that, unlike the Joker subreddit, where users mainly posted memes and information about the movie, Endgame users were posting questions and their own fan fiction. I also found it peculiar that the Endgame subreddit had many more members than the Joker subreddit but not nearly as much new content; its newest post was two days old. I attribute this to the age of the movie, since Endgame was released in April whereas Joker had only just been released in October.
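One plausible reason the subreddit’s name tops both word lists is that words are counted inside posts that have already been labeled, so a word that appears in nearly every post ranks first on both sides. The sketch below assumes that counting method; the function name and the four toy posts are made up for illustration, and no stop words are removed.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def top_words_by_label(
    labeled_posts: Iterable[Tuple[str, str]], n: int = 3
) -> Dict[str, List[str]]:
    """Return the n most frequent words among the posts of each label."""
    counters: Dict[str, Counter] = {}
    for text, label in labeled_posts:
        words = re.findall(r"[a-z']+", text.lower())  # crude tokenizer
        counters.setdefault(label, Counter()).update(words)
    return {
        label: [word for word, _ in counter.most_common(n)]
        for label, counter in counters.items()
    }


# Toy posts (invented for this example, not pulled from Reddit).
toy_posts = [
    ("Endgame was a perfect ending", "positive"),
    ("I loved Endgame", "positive"),
    ("Endgame ruined the character", "negative"),
    ("Endgame felt too long", "negative"),
]
top = top_words_by_label(toy_posts, n=1)
print(top)
```

In this toy set “endgame” is the top word for both labels, simply because every post mentions it. Filtering out the subreddit name and common stop words before counting would likely surface more informative words.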

One limitation I came across was that the output of the Natural Language Toolkit was hard to interpret, since most of the posts were considered neutral. With around 60% of the data the code pulled rated neutral, I was unable to pull valuable insights from it. My research was effectively limited to the remaining 40% of the posts, so my information was not rich enough to support any solid conclusions about the sentiment of the respective subreddits.

The patterns I observed suggest that interaction on community platforms such as Reddit generally leans positive. Fans across the Marvel and DC communities are generally supportive of one another and provide knowledge when a specific post calls for it. The fan community also supports itself by suggesting improvements to future plot lines, or even writing its own. Negative interactions, however, were mostly seen when users posted false claims or spoilers. These findings show that while both companies’ fans are mostly supportive and devoted, they will still express negative opinions about the Joker and Avengers: Endgame films.