Monday, November 14, 2016

I scraped all the 2016 U.S. election data

I'm a weird guy. I enjoy building webscrapers; I find it relaxing. I have no real plans to use the 2016 U.S. election data, no particular axes to grind (I'm not too thrilled at the outcome, but hey), but I've been hanging around /r/datasets on Reddit, and lots of people were asking for the data, and wondering if someone was going to scrape it. So, I did.

All of the data is in this GitHub repo. Obviously, I did not put a license of any sort on it, so feel free to use it. If you end up doing any interesting analyses with it (even boring ones, I'm not picky), I'd love to hear about it!

If I understand correctly, official verified (as opposed to reported) results will start to be released by the states in 2-4 weeks, but it seemed a shame the data was not, as far as I could tell, easily available right now.

(Of course, given my experience as to how the universe works, probably there will be a better data dump of all this info somewhere soon, or someone will point out it already exists somewhere my Google Fu wasn't strong enough to find, making my 10-12 hours of effort redundant. But it wouldn't have happened if I hadn't forced fate's hand!)

Is scraping legal? Yes. Is it ethical? Pretty much. Feel free to disagree; my position is that this is all public data, not under any copyright, presented publicly, and I scraped it by automating an actual web browser, so I did not use up any more of the websites' resources than a regular visitor would. If I have breached any terms of service, I'll just have to live with the consequences, of which there are likely to be none. Here's a good Quora post about the subject.
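
For anyone wondering what "no more resources than a regular visitor" could look like in code, here is a minimal sketch. It is not the scraper used for this data. The `Throttle` class, its parameter names, and the two-second gap are all assumptions made up for illustration; the idea is simply to pause between page loads so an automated browser never requests pages faster than a person clicking through would.

```python
import time


class Throttle:
    """Hypothetical helper: enforce a minimum gap between page loads."""

    def __init__(self, min_gap_seconds, clock=time.monotonic, sleep=time.sleep):
        # clock and sleep are injectable so the class can be tested
        # without actually waiting.
        self.min_gap = min_gap_seconds
        self.clock = clock
        self.sleep = sleep
        self.last = None  # time of the previous request, if any

    def wait(self):
        """Block until min_gap seconds have passed since the last call.

        Returns the number of seconds slept (0.0 if no pause was needed).
        """
        waited = 0.0
        if self.last is not None:
            remaining = self.min_gap - (self.clock() - self.last)
            if remaining > 0:
                self.sleep(remaining)
                waited = remaining
        self.last = self.clock()
        return waited


if __name__ == "__main__":
    # Assumed pacing: at most one page every 2 seconds.
    throttle = Throttle(2.0)
    for page in ["page-1", "page-2", "page-3"]:  # placeholder names
        throttle.wait()
        print("would load", page)  # the real browser call would go here
```

Calling `throttle.wait()` before each page load is enough; whatever drives the browser goes where the `print` is.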





• • •

12 comments:

  1. Hi David, you may be weird for enjoying web scraping, but you've made at least one geek very, very happy...
    I'll use your data for some coursework I'm doing for my MSc in Data Science if that's OK? I'd just started trying to get a dataset for the US elections together when I stumbled on your rather tasty dataset. Aiming to discover which factors influenced voting in the election. I'll be assembling some extra data on demographics, prior voting patterns, economic performance and a few others and then doing some machine learning and visualisations in search of some answers. Should you ever be in London (UK, not ON), I think I owe you a few beers... best, Peter

    1. I'm overjoyed you can make use of my efforts! Do so with my blessing!

    2. Thanks David! I will share what conclusions I reach with you when done... might take a few weeks though. All the best, Peter

  2. Hi David, coursework reports done, thanks again for that data! If you're suffering from insomnia and want a look, let me know the best way of getting them over to you -- I don't really want to post them openly...
