Notes on Crawling Instances on the Known Fediverse

One of my favourite features of the Fediverse is how different instances (whether they run Mastodon, Lemmy or Pixelfed) cooperate with each other using the ActivityPub protocol. This is a fundamental feature of decentralised social media. For instance (pun unintended), users of mastodon.social can communicate with mastodonapp.uk, and mastodonapp.uk can communicate with pixelfed.social. As a …
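As a rough illustration of what crawling federated instances could look like, here is a minimal sketch of a breadth-first walk. It is not necessarily the approach taken in the post: `crawl_instances`, `get_peers` and `max_instances` are hypothetical names, and `get_peers` stands in for whatever lookup returns the domains an instance federates with (on Mastodon, for example, the `/api/v1/instance/peers` endpoint returns such a list).

```python
from collections import deque


def crawl_instances(seed, get_peers, max_instances=100):
    """Breadth-first walk over instances, starting from `seed`.

    `get_peers` is a hypothetical callable that takes a domain name and
    returns an iterable of peer domain names.
    """
    seen = {seed}
    queue = deque([seed])
    while queue:
        domain = queue.popleft()
        for peer in get_peers(domain):
            # Stop once the cap is reached so the crawl stays bounded.
            if len(seen) >= max_instances:
                return seen
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return seen
```

Passing the lookup in as a function keeps the traversal separate from the network code, so the walk can be tried on a small in-memory graph before pointing it at real instances.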

Interacting With REST APIs in Python With 5 Lines of Code

An essential skill for any web scraper or data scientist is knowing how to collect information from a publicly available REST API. In short, a REST API is a very simple web service where simple HTTP requests (just like the ones a web browser makes) are used to collect data, usually in the form of a JSON …
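To make the idea concrete, here is a minimal sketch using only the Python standard library. It is not necessarily the code from the post: `fetch_json` and its `opener` parameter are hypothetical names, and the URL is a placeholder rather than a real service.

```python
import json
import urllib.request


def fetch_json(url, opener=urllib.request.urlopen):
    """Send an HTTP GET request to `url` and decode the JSON response body.

    `opener` is a hypothetical hook (defaulting to urlopen) so the function
    can be exercised without a network connection.
    """
    with opener(url) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder endpoint; substitute any public REST API that returns JSON.
    data = fetch_json("https://api.example.com/items")
    print(data)
```

The result is ordinary Python data (dictionaries, lists, strings and numbers), so it can be inspected or filtered straight away.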

How to Scrape a Twitter Timeline

EDIT (Feb 2023): As of 09/02/2023, Twitter is closing down public access to the API and replacing it with a paid service. In a separate blog post, we covered the basics of streaming tweets from Twitter using Python. This is ideal if you are interested in collecting tweets in real-time as and …