Issue One — Media Hit Data Scraper demo

I developed the Media Hit Data Scraper as part of my work on political campaigns and social change initiatives, where it was often imperative to keep close tabs on earned media, both for strategic analysis and for documenting relationships built with the media. However, cataloging media mentions is often time-consuming, sometimes subjective and arbitrary, and always burns through precious staff hours. To address this problem, I wrote a Python script that turns a list of earned media URLs into a rich relational database. [Learn more about the Media Hit Data Scraper]

To demonstrate the capabilities of the Media Hit Data Scraper, I have retrieved the list of earned media URLs available on the Issue One website alongside the names of Issue One staff, advisers, and ReFormers (last retrieved on 11/20/2020). With these lists as a starting point, the Media Hit Data Scraper helps generate the following outputs:

  • A relational database containing media hits, publications, and people (i.e., reporters and Issue One team members).
    • The top publications featuring Issue One have been Roll Call (59), TheHill (55), POLITICO (47), Washington Post (35), and Bloomberg (25).
    • The top reporters featuring or citing Issue One have been Kate Ackley (32), Zach Montellaro (23), Karl Evers-Hillstrom (13), Viktor Reklaitis (11), and Courtney Bublé (11).
    • In terms of staff mentions, Meredith McGehee has been mentioned 213 times, Michael Beckel 115 times, and Nick Penniman 32 times. The most-mentioned board member is Tom Ridge (18), the most-mentioned adviser is Zach Wamp (78), the most-mentioned advisory board member is Trevor Potter (19), and the most-mentioned ReFormer is Tom Daschle (19).
    • View the entire relational database on Airtable.
  • A visual overview of key statistics of media hits (see graphs below).
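
Because the scraper itself is proprietary, the following is only a minimal sketch of how a relational layout for media hits, publications, and people could look, using Python's built-in sqlite3 module. The table names, column names, and sample rows are all hypothetical placeholders, not the actual schema or data behind the Airtable database.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical media-hit schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE publications (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT UNIQUE, role TEXT);
        CREATE TABLE media_hits (
            id INTEGER PRIMARY KEY,
            url TEXT UNIQUE,
            publication_id INTEGER REFERENCES publications(id)
        );
        -- Link table: one hit can involve several people (reporters, staff).
        CREATE TABLE hit_people (
            hit_id INTEGER REFERENCES media_hits(id),
            person_id INTEGER REFERENCES people(id),
            PRIMARY KEY (hit_id, person_id)
        );
        """
    )
    return conn


def top_publications(conn, limit=5):
    """Return (publication name, hit count) pairs, most hits first."""
    rows = conn.execute(
        """
        SELECT p.name, COUNT(h.id) AS hits
        FROM publications AS p
        JOIN media_hits AS h ON h.publication_id = p.id
        GROUP BY p.id, p.name
        ORDER BY hits DESC, p.name ASC
        LIMIT ?
        """,
        (limit,),
    )
    return rows.fetchall()


# Placeholder rows for illustration only (not real Issue One data).
conn = build_demo_db()
conn.executemany(
    "INSERT INTO publications (id, name) VALUES (?, ?)",
    [(1, "Example Daily"), (2, "Example Weekly")],
)
conn.executemany(
    "INSERT INTO media_hits (url, publication_id) VALUES (?, ?)",
    [
        ("https://example.com/a", 1),
        ("https://example.com/b", 1),
        ("https://example.com/c", 2),
    ],
)
print(top_publications(conn))
```

Rankings like the top publications and top reporters listed above then reduce to a join and a count over the link tables.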

Issue One has recorded a total of 968 media hits over its lifetime, with 353 (36.5%) this year alone. In particular, there appears to be a spike correlated with the 2020 U.S. elections.
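
As a quick check of the share quoted above, the percentage can be reproduced from the two counts in the text. The helper name `share_percent` is an illustrative assumption.

```python
def share_percent(part, total):
    """Return part as a percentage of total, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * part / total, 1)


# 353 hits in 2020 out of 968 lifetime hits (both figures from the text above).
print(share_percent(353, 968))  # 36.5
```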

Not all media mentions are created equal. Alexa Page Rank is a means of approximating the relative influence of a particular website. Generally, the higher the page rank (the darker the color) the better.
As this graph shows, Issue One’s proportion of media hits from different tiers of page rank has not changed significantly over the years. This is good news: it means the increase in media hits in 2020 has not come from lower-quality publications (i.e., blogs).

Historically, October and November appear to be good months for Issue One to get media coverage, although this spike is likely due to the 2020 U.S. elections. The December figure is lower than expected because December media mentions had not yet been reported for 2020, Issue One’s best year for media mentions.

The sentiment score for Issue One – that is, the positivity or negativity of the context in which Issue One is mentioned – has been slightly negative overall. This negativity is not significant by any means (see the error bars, including a positive sentiment score for 2018) and most likely reflects the number of critical watchdog-like statements Issue One has made in the press rather than a negative attitude held by reporters. Sentiment scores are calculated using artificial intelligence algorithms provided by IBM’s Watson.
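
To illustrate what "not significant" means here, the sketch below computes a mean sentiment score with its standard error and checks whether an approximate 95% interval includes zero. This is an assumption about how such error bars might be built, not a description of how the graph was produced. The scores are hypothetical placeholders, and the function names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def mean_with_error(scores):
    """Return (mean, standard error of the mean) for a list of scores."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to estimate spread")
    return mean(scores), stdev(scores) / sqrt(n)


def interval_includes_zero(scores, z=1.96):
    """True if the mean +/- z standard errors spans zero.

    z=1.96 gives an approximate 95% interval under a normal approximation.
    """
    m, se = mean_with_error(scores)
    return m - z * se <= 0 <= m + z * se


# Hypothetical per-article sentiment scores (negative to positive scale).
sample_scores = [-0.4, 0.2, -0.1, 0.3, -0.2]
m, se = mean_with_error(sample_scores)
print(f"mean={m:.3f}, standard error={se:.3f}")
print("interval includes zero:", interval_includes_zero(sample_scores))
```

A slightly negative mean whose interval spans zero, as in this placeholder sample, cannot be distinguished from neutral coverage.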

While emotions are difficult to measure in absolute terms, we can compare the relative rank of the emotions expressed in the articles and contexts that mention Issue One. Sadness appears to be the dominant emotion (perhaps correlated with expressions of disappointment directed towards certain elected officials), whereas fear appears to be rarely expressed (suggesting that apocalyptic arguments are not part of Issue One’s rhetorical voice). Emotion scores are calculated using artificial intelligence algorithms provided by IBM’s Watson.

The Media Hit Data Scraper is proprietary code that I continue to develop as a side project. Learn more about the Media Hit Data Scraper by viewing this presentation.