How To: Create trivia from Wikipedia

Datasaurs Milestone

Updating the Datasaurs Twitter bot is my go-to task when I’m in the mood to do something fun in R but not feeling particularly creative or motivated to start a new project. To celebrate the bot’s 1,500th post, I decided it would be fun to add random trivia about the featured animal to the bottom of each Datasaur image.

The Goal

Sticking with my brand, this post is an overview of the R process behind each tweet, from processing the data and plotting the new creature to posting the text and image on Twitter.

All scripts and data used in Datasaurs can be found on my GitHub.

The Data

The bot depends on two key inputs: silhouette images of dinosaurs (or other animals) and cause-of-death time series.

The images are from PhyloPic. While there are some awesome R packages that can import images from PhyloPic, for now I have manually curated the images used in the bot to make sure each one works well for my goal. This also allows me to save some extra metadata with each image, including the direction the animal is facing, common family names, and the Twitter handles of the artists.
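
As a rough illustration of what that curated metadata could look like, here is a minimal base R sketch. The column names, file names, family names, and handles are all placeholders I made up for the example, not the actual files in the repo, and the "needs_flip" flag is just one hypothetical use of the facing direction.

```r
# Hypothetical metadata table for manually curated silhouettes.
# Every value below is a placeholder, not real bot data.
silhouettes <- data.frame(
  file   = c("animal_one.png", "animal_two.png"),
  facing = c("left", "right"),
  family = c("tyrannosaurs", "stegosaurs"),
  artist = c("@artist_one", "@artist_two"),
  stringsAsFactors = FALSE
)

# Assumed use of the direction field: flag images that would need
# mirroring so that every animal ends up facing right.
silhouettes$needs_flip <- silhouettes$facing != "right"

# Draw one row at random to feature in the next post.
pick_silhouette <- function(meta) {
  meta[sample.int(nrow(meta), 1), ]
}

featured <- pick_silhouette(silhouettes)
```

Keeping everything in one table means the artist handle and family name travel with the image, so they are on hand when the tweet text is written.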

The U.S. cause-of-death time series data is downloaded from the Centers for Disease Control and Prevention (CDC) WONDER database. I downloaded 24 subsets of the data by age, race, gender, and region. The CDC redacts any queries that yield too few results, so I’ve kept these subsets high-level.
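
To show how redacted values might be handled once a subset is loaded, here is a small base R sketch. It assumes the export marks redacted counts with the text "Suppressed" in a Deaths column; the toy data frame, its column names, and the clean_deaths() helper are invented for illustration and are not the bot's actual code.

```r
# Toy stand-in for one exported subset; the columns are assumptions.
raw <- data.frame(
  Year   = c(1999, 2000, 2001),
  Cause  = c("Cause A", "Cause A", "Cause A"),
  Deaths = c("1520", "Suppressed", "1498"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn redacted cells into NA, then convert the
# remaining counts to integers.
clean_deaths <- function(df) {
  df$Deaths[df$Deaths == "Suppressed"] <- NA
  df$Deaths <- as.integer(df$Deaths)
  df
}

series <- clean_deaths(raw)

# Share of redacted points in this series; a high share would suggest
# the query was too granular to plot.
suppressed_share <- mean(is.na(series$Deaths))
```

A check like this makes it easy to see which subsets are too sparse and should be dropped or broadened.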