[Final] NYT Article Polarity Visualizer Proposal
Posted: November 12th, 2009 | Author: Kunal | Filed under: Assignments
Since Steve and I began with a different proposal, I’ll recap what we’ve done and where we’re heading:
Final Project Proposal
Stephen Varga & Kunal D Patel
1. Overview
A short summary of what your project will be. Give me your best elevator pitch here.
Our project focuses on the polarity of comments on the New York Times website as a means of tracking public opinion about articles and topics. Through a robust time-based interface, we want to see the ebbs and flows of commenting, find out whether we can spot “influencers” who shift polarity, and explore the topics that generate the most positive and negative discussion.
2. Data
The data sources you will be using
- The Community API – we’re making a call once a day to pull all of the previous day’s comments and store them in a SQL database
- The Article Search API – after grabbing all the comments for the previous day, we store the articles they correspond to in a separate table in the same database
- Amplify API – we’ve now been granted access to the client services for Amplify, a sentiment engine that we are using to parse through the comments
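The storage step described above could be sketched roughly as follows. This is a minimal illustration, not our actual PHP scripts: it uses Python with an in-memory SQLite database, and every table name, column name, and sample record is a hypothetical stand-in. The API calls themselves are left out; the sketch assumes the day's comments and articles have already been fetched and turned into dictionaries.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the project's actual database layout.
SCHEMA = """
CREATE TABLE IF NOT EXISTS articles (
    url   TEXT PRIMARY KEY,
    title TEXT
);
CREATE TABLE IF NOT EXISTS comments (
    comment_id  INTEGER PRIMARY KEY,
    article_url TEXT NOT NULL REFERENCES articles(url),
    author      TEXT,
    posted_at   TEXT,   -- ISO 8601 timestamp
    body        TEXT,
    polarity    REAL    -- left empty here; filled in by the sentiment step
);
"""


def store_day(conn, comments, articles):
    """Store one day's comments and the articles they belong to.

    INSERT OR IGNORE skips rows whose primary key already exists,
    so re-running the job for the same day adds no duplicates.
    """
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO articles (url, title) VALUES (:url, :title)",
            articles,
        )
        conn.executemany(
            "INSERT OR IGNORE INTO comments "
            "(comment_id, article_url, author, posted_at, body) "
            "VALUES (:comment_id, :article_url, :author, :posted_at, :body)",
            comments,
        )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up sample records standing in for already-fetched API results.
sample_articles = [{"url": "http://example.com/a1", "title": "Sample article"}]
sample_comments = [
    {"comment_id": 1, "article_url": "http://example.com/a1",
     "author": "reader_a", "posted_at": "2009-11-11T09:00:00",
     "body": "Great piece."},
    {"comment_id": 2, "article_url": "http://example.com/a1",
     "author": "reader_b", "posted_at": "2009-11-11T09:30:00",
     "body": "I disagree entirely."},
]

store_day(conn, sample_comments, sample_articles)
store_day(conn, sample_comments, sample_articles)  # second run is a no-op

comment_count = conn.execute("SELECT COUNT(*) FROM comments").fetchone()[0]
article_count = conn.execute("SELECT COUNT(*) FROM articles").fetchone()[0]
print(comment_count, article_count)
```

Keeping articles in their own table, keyed by URL, means each article is stored once no matter how many comments point to it.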
3. Design Questions
A set of questions that you intend to answer or explore. At least one question should be about the data itself (i.e., what is the story you’re hoping to tell?), but these questions may also address design methods or technical approaches.
- Does the polarity of an article (how positive or negative it is) substantially influence the polarity of its comments?
- Do comments for an article tend to fall within a narrow range, or is there often variation?
- Through a time-based visualization, will we be able to identify “influential” comments that (qualitatively) appear to shift comment polarity?
- Will Amplify be able to conduct meaningful and consistent topic extraction and sentiment analysis?
- Do we have enough time to develop interface components robust enough to support comment AND topic searches?
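Two of the questions above, the spread of comment polarity within an article and the search for “influential” comments, can be explored numerically before any interface exists. The Python sketch below is one possible approach under stated assumptions: polarity scores are taken to lie between -1 and 1 (we have not confirmed the scale Amplify returns), the sample scores are made up, and the function names and the window size of 3 are our own illustrative choices.

```python
from statistics import mean, pstdev


def polarity_spread(scores):
    """Return (min, max, mean, population std dev) for one article's comments."""
    return min(scores), max(scores), mean(scores), pstdev(scores)


def shift_at(scores, index, window=3):
    """Polarity shift at the comment in position `index`.

    Computed as the mean of the `window` comments starting at `index`
    minus the mean of up to `window` comments before it. Returns None
    for the first comment, which has nothing before it.
    """
    before = scores[max(0, index - window):index]
    after = scores[index:index + window]
    if not before:
        return None
    return mean(after) - mean(before)


# Made-up polarity scores for one article's comments, in time order.
# The -1..1 scale is an assumption, not a documented Amplify range.
scores = [0.5, 0.5, 0.5, -0.75, -0.5, -0.5, -0.5]

low, high, avg, spread = polarity_spread(scores)

# Rank comments by the size of the shift that begins with them.
shifts = {i: shift_at(scores, i) for i in range(len(scores))}
candidates = [i for i, s in shifts.items() if s is not None]
largest = max(candidates, key=lambda i: abs(shifts[i]))

print(low, high, avg, spread)
print(largest, shifts[largest])
```

A large shift only flags a comment as worth a closer look. Whether it actually influenced the comments that followed is still the qualitative judgment the question describes.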
4. Prior Art / Precedents
Discuss at least two existing works that are similar in some respect to your proposed project. How do you see your project in relation to this ecosystem of other works? Will it contribute something unique? Will it address problems that you see in other works?
10×10 – Jonathan Harris’ hourly scraping of several international news feeds, visualized as a sorted list of the 100 most “important” words in the news connected to a 10×10 grid of corresponding images. These words are generated by his own algorithms from the articles themselves. Our focus is instead on comments, to see which topics the NYT community feels are the most important and polarizing for discussion.
Twittermood – Twittermood conducts basic sentiment analysis on tweets using the ANEW dataset (which assigns emotional values to words) and maps the results in real-time. Besides the difference in audience, our planned interface is far more robust, allowing for navigation by week, day, author, and article in order to view patterns.
5. Collaboration
A brief explanation of how you plan to collaborate with your partner.
Steve and I worked together to write the PHP scripts to parse through the NYT APIs, pull the relevant data, and store it in a SQL database running on Steve’s web server. I’ve been focusing on working with the Amplify API and developing interface wireframes while Steve’s attention has been devoted to developing a working prototype in openFrameworks. For the rest of the semester, we will retain our individual foci in order to efficiently divide tasks, but our work will converge much more as the interface takes shape.