“Popcorn & Politics” — data viz of movies, their audience and their impact on the 2008 US Presidential Race
As a jumping off point:
In journalism, we recognize a kind of hierarchy of fame among the famous. We measure it in two ways: by the length of an obituary and by how far in advance it is prepared.
The grey bars represent the ten longest obituaries of the last 7 months. From left to right, they are:
“A Star Idolized and Haunted, Michael Jackson Dies at 50” (2839 words)
“Jack Nelson, an Investigative Reporter, Dies at 80” (1266 words)
“Dominick Dunne, Writer Who Chronicled High-Profile Crime, Is Dead at 83” (1966 words)
“Robert Rines, Inventor and Monster Hunter, Dies at 87” (2839 words)
“Howard Unruh, 88, Dies; Killed 13 of His Neighbors in Camden in 1949” (1304 words)
“Roy DeCarava, Harlem Insider Who Photographed Ordinary Life, Dies at 89” (1485 words)
“Walter Cronkite, 92, Dies; Trusted Voice of TV News” (2968 words)
“Budd Schulberg, ‘On the Waterfront’ Writer, Dies at 95” (1855 words)
“Henrich, Yankees Clutch Hitter, Dies at 96” (1086 words)
“Bela Kiraly Dies at 97; Led Revolt In Hungary” (1136 words)
Notable outliers are:
Michael Jackson received a 2839-word obituary at the age of 50 (Billy Mays is the other 50-year-old). Jack Rose is the youngest recipient of a NYTimes obituary. Like Michael Jackson, Walter Cronkite received a very long obituary, running 50% longer than the next two longest obituaries.
TweetCatcha seeks to uncover the organic nature of news as it travels through Twitter over time, by examining the movement of NY Times articles through Twitter.
Nick Hardeman + Bruce Drummond.
- Overview: An exploration of the NYTimes obituary section to examine who are supposed to be the most notable people to have died on a given day. This very short daily list is then contrasted against the much larger, likely more mundane but certainly more varied, nationwide set of newspaper obituaries. This exercise seeks both to bring attention to the large number of deaths that occur every day and to find an alternative snapshot of American life through its daily deceased.
- Data: My data sets will be the NYTimes Article Search API (searching for “obituary”) and an RSS feed from the site obituaries101.com, located at http://www.big101.com/obituary_search_find_famous_death_notices.php.
- Design Questions: My initial approach is to use scale and variation in type size to underline how small a snapshot the NYTimes obituary section is of the greater body of obituaries in the United States. I’m not yet sure whether I will use any graphics, as text is a central part of this exploration.
- Rupa’s project, Out of Sight, Out of Mind, deals with visualizing deaths, and she previously mentioned using bar graphs to identify spikes in casualties across time.
- NPR did a radio show called The Art of the Obituary which revealed the behind-the-scenes process of identifying aging members of society of note and prewriting and subsequently updating obituaries. Hearing this story when it aired a number of years ago really piqued my interest in the topic of obituary writing.
- Almost forgot the most striking infographic I’ve seen related to death: the visualization of suicides on the Golden Gate Bridge: http://www.sfgate.com/cgi-bin/object/article?f=/c/a/2005/10/30/MNG2NFF7KI1.DTL&m=/c/pictures/2005/10/30/mn_suicide30_loc_tt.gif.
My final project will explore the relationship between the geographical location of Twitter users and the New York Times articles they tweet about. I’m interested in seeing, geographically, where the interests of Twitter users lie on a daily, monthly, and (possibly) yearly basis. I also plan to implement filters that let users explore where New York Times article topics are being talked about most, and how tweets about New York Times articles are distributed by section.
- NYTimes Articles API
- BackTweets API
- Twitter API
- Google Maps API
3. Design Questions
- What does the distribution of Twitter users’ interests in various topics, locations, and sections from the New York Times look like visually?
- Do current issues in the news affect which locations Twitter users tweet about?
- Do patterns emerge based on country/region, or are the tweet/article relationships random?
- Are there unseen political/economic/social relationships between countries/regions that are hidden in the data?
4. Prior Art / Precedents
Flight Patterns by Aaron Koblin
This visualization elegantly maps air traffic patterns. Some of the images in this series show incredibly intricate networks that are formed by air traffic, as well as the locations of the largest airports.
Just Landed by Jer Thorp
Jer Thorp’s Processing-based visualization shows the locations of Twitter users and the places that they fly to, cleverly scraped based on the two tweeted words “just landed”. One of the most compelling aspects of this piece is the 3D translation of data, allowing for an exploration into the intricacies of the paths.
Presentation is here >
>> Presentation <<
Here is what Bruce and I presented in class today. The tweets are not loading online for some reason, but the basic functionality is there.
This prototype is for functionality purposes only. We will move towards this aesthetic once we achieve the functionality that we want.
Since Steve and I began with a different proposal, I’ll recap what we’ve done and where we’re heading:
Final Project Proposal
Stephen Varga & Kunal D Patel
A short summary of what your project will be. Give me your best elevator pitch here.
Our project focuses on the polarity of comments on the New York Times website as a means of tracking public opinion about articles and topics. Through a robust time-based interface, we are interested in seeing the potential ebbs and flows of commenting, whether we can spot “influencers” that shift polarity, and which topics generate the most positive and negative discussion.
The data sources you will be using
- The Community API – we’re making a call once a day to pull all of the previous day’s comments and store them in a SQL database
- The Article Search API – after grabbing all the comments for the previous day, we store the articles they correspond to into a separate table in the same database
- Amplify API – we’ve now been granted access to the client services for Amplify, a sentiment engine that we are using to parse through the comments
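The storage step in the list above could be sketched roughly as follows. This is a minimal sketch, not the project's actual code: it uses Python's built-in `sqlite3` module instead of the PHP scripts and SQL server described here, and the table names, column names, and function names are all hypothetical.

```python
import sqlite3


def create_schema(conn):
    """Create one table for articles and one for the comments that refer to them.

    Table and column names are hypothetical, chosen only for illustration.
    """
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS articles (
            url   TEXT PRIMARY KEY,
            title TEXT
        );
        CREATE TABLE IF NOT EXISTS comments (
            comment_id  INTEGER PRIMARY KEY,
            article_url TEXT NOT NULL REFERENCES articles(url),
            posted_on   TEXT NOT NULL,
            body        TEXT NOT NULL
        );
        """
    )


def store_daily_batch(conn, articles, comments):
    """Store one day's pull. INSERT OR IGNORE keeps a repeated run from
    duplicating rows that share a primary key."""
    conn.executemany(
        "INSERT OR IGNORE INTO articles (url, title) VALUES (?, ?)", articles
    )
    conn.executemany(
        "INSERT OR IGNORE INTO comments (comment_id, article_url, posted_on, body) "
        "VALUES (?, ?, ?, ?)",
        comments,
    )
    conn.commit()


def comments_per_article(conn):
    """Return (article_url, comment_count) pairs, ordered by URL."""
    return conn.execute(
        "SELECT article_url, COUNT(*) FROM comments "
        "GROUP BY article_url ORDER BY article_url"
    ).fetchall()
```

Keeping articles in their own table means each article is stored once no matter how many comments point to it, which matches the two-table split described above.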
3. Design Questions
A set of questions that you intend to answer or explore. At least one question should be about the data itself (i.e., what is the story you’re hoping to tell?), but these questions may also address design methods or technical approaches.
- Does the polarity of an article (how positive or negative it is) substantially influence the polarity of its comments?
- Do comments for an article tend to fall within a narrow range, or is there often variation?
- Through a time-based visualization, will we be able to identify “influential” comments that (qualitatively) appear to shift comment polarity?
- Will Amplify be able to conduct meaningful and consistent topic extraction and sentiment analysis?
- Do we have enough time to develop interface components robust enough to support comment AND topic searches?
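The first two questions above (whether article polarity influences comment polarity, and how widely comment polarity varies) could be explored with a summary like the one below. This is only a sketch under stated assumptions: it assumes each polarity has already been mapped to a number between -1.0 (negative) and 1.0 (positive), which is not necessarily the form Amplify returns, and the function and key names are hypothetical.

```python
from statistics import mean, pstdev


def polarity_summary(article_score, comment_scores):
    """Summarize how one article's comments relate to the article itself.

    Assumes every score is a float in [-1.0, 1.0]; this scale is an
    assumption for illustration, not a documented Amplify output format.
    """
    if not comment_scores:
        raise ValueError("need at least one comment score")
    avg = mean(comment_scores)
    return {
        "mean": avg,                                         # average comment polarity
        "spread": pstdev(comment_scores),                    # population standard deviation
        "range": max(comment_scores) - min(comment_scores),  # widest disagreement
        "offset": avg - article_score,                       # comments minus article
    }
```

A small `spread` and `range` would suggest that comments fall within a narrow band, while an `offset` near zero across many articles would hint that comments track the polarity of the article.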
4. Prior Art / Precedents
Discuss at least two existing works that are similar in some respect to your proposed project. How do you see your project in relation to this ecosystem of other works? Will it contribute something unique? Will it address problems that you see in other works?
10×10 – Jonathan Harris’ hourly scraping of several international news feeds, visualized as a sorted list of the 100 most “important” words in the news connected to a 10×10 grid of corresponding images. These words are generated by his own algorithms from the articles themselves. Our focus is on comments, to see what the NYT community feels are the most important and polarizing topics for discussion.
Twittermood – Twittermood conducts basic sentiment analysis on tweets using the ANEW dataset (which assigns emotional values to words) and maps the results in real-time. Besides the difference in audience, our planned interface is far more robust, allowing for navigation by week, day, author, and article in order to view patterns.
A brief explanation of how you plan to collaborate with your partner.
Steve and I worked together to write the PHP scripts that parse through the NYT APIs, pull the relevant data, and store it in a SQL database running on Steve’s web server. I’ve been focusing on working with the Amplify API and developing interface wireframes, while Steve’s attention has been devoted to developing a working prototype in openFrameworks. For the rest of the semester, we will retain our individual foci in order to divide tasks efficiently, but our work will converge much more as the interface takes shape.
Interaction between stock market prices and New York Times news articles
by Yoon and Seung
- Company stock prices are currently not normalized (they can go off the screen)
- Full screen is supported (Right click – ‘Go Full Screen’)
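One way the normalization noted above might be handled is a min-max rescale of each company's prices into the drawable height. This is a hedged sketch, not the project's code: the function name, the `margin` parameter, and the choice of min-max scaling are all assumptions made for illustration.

```python
def normalize_to_screen(prices, screen_height, margin=0):
    """Map prices to y pixel coordinates that always stay on screen.

    Screen y grows downward, so the highest price maps to y == margin
    (the top) and the lowest to y == screen_height - margin (the bottom).
    """
    if not prices:
        return []
    lo, hi = min(prices), max(prices)
    usable = screen_height - 2 * margin
    if hi == lo:
        # A flat series has no range to scale, so draw it mid-screen.
        return [screen_height / 2 for _ in prices]
    return [margin + (hi - p) / (hi - lo) * usable for p in prices]
```

For example, `normalize_to_screen([10, 20, 30], 100, margin=10)` places the three points at y = 90, 50, and 10. Scaling each company separately keeps every line visible, at the cost of hiding absolute price differences between companies.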