How to Hydrate Tweets using Hydrator

EDIT (Feb 2023): As of 09/02/2023, Twitter is closing down public access to the API and replacing it with a paid service. In this two-part series, we cover two ways to hydrate tweets for data analysis. I thought it was best to split this into two parts covering the easy way and the slightly …

Self-Hosting with Cloudflare Tunnels (feat Raspberry Pi)

EDIT (Feb 2024): All of this can now be managed through the Cloudflare dashboard. If you’ve ever self-hosted services on your local network and wanted to expose them to the world, you know that this is not a straightforward matter. This involves opening up your firewall by forwarding ports 80 (HTTP) and 443 (HTTPS) …

Creating Reply Networks from Reddit Comment Threads

In the previous blog post, we learned how to use PRAW to scrape and process data from Reddit. We finished off by looking at how to collect top-level comments from a post submission and briefly mentioned how to collect replies. However, due to the complexity of modelling nested replies, I thought it would be …

How to Collect Data From Reddit – Introducing PRAW

If you’re a nerd like me, you’ll probably be very familiar with Reddit. Reddit describes itself as “the front page of the internet”, which is certainly true for me and many others. I use it pretty much daily, for anything from tech news digests to niche topics and tech support. There’s …