R packages

castarter - content analysis starter toolkit for R

castarter is a more modern, fully-featured, and consistent iteration of castarter - Content Analysis Starter Toolkit for the R programming language (a previous iteration is still available as castarter.legacy. It facilitates text mining and web scraping by taking care of many of the most common file management issues, keeps tracks of download advancement in a local database, facilitates extraction through dedicated convenience functions, and allows for basic exploration of textual corpora through a Shiny interface.

tidywikidatar - Interact with Wikidata and get tidy data frames in response

The goal of tidywikidatar is to facilitate interaction with Wikidata: all responses are transformed into data frames or simple character vectors it is easy to enable efficient caching in a local sqlite database (integration with other databases is also available) If you want to benefit of the wealth of information stored by Wikidata, but you do not like SPARQL queries and nested lists, then you may find tidywikidatar useful.


tifkremlinen is a package providing a single dataset - kremlin_en - including all contents published on the English-language version of kremlin.ru starting with 31 December 1999 and until 31 December 2020. Yearly updates will likely be made available. Link to repo on GitHub Link to official version of dataset with all details

castarter.legacy - content analysis starter toolkit for R

castarter (now castarter.legacy) is designed to make it easy also for relatively inexperienced users to create a textual dataset from a website, or a section of a website, keep it up-to-date, and explore it through word frequency graphs or a web interface that makes it possibe to tag items. Documentation is available on castarter’s website.

EDJNet's Quote Finder

EDJNet’s Quote Finder facilitates finding different takes on European affairs. It provides an interactive interface to explore and filter tweets by all members of the European Parliament who are on Twitter, and to visualise word frequencies as wordclouds. It is possible to filter contents based on keywords, hashtags, political affiliation and language of the tweet. A different interface allows for interactively exploring textual contents published by EU-institutions such as press-releases. In this case, available visualisations include time series in order to highlight changes in the relative prominence of certain issues within the official EU discourse.

ganttrify - Create beautiful gantt charts with ggplot2

ganttrify facilitates the creation of nice-looking Gantt charts, commonly used in project proposals and project management. Documentation is available on GitHub.

genderedstreetnames - Find and plot on a map gendered street names

genderedstreetnames automatically finds the gender of street names, facilitates manually fixing what the automatic part got wrong, and plots the results. It gets information from OpenStreetMap and Wikidata. There is currently a vignette showing examples of what can be done with this package on the package’s website. Documentation is available on GitHub.


networkedwebsitesdetector offers a structured approach for finding websites which have clear signs of common ownership or are otherwise related. There is currently a vignette showing examples of what can be done with this package on the package’s website. Documentation is available on GitHub.

zoteror - Access the Zotero API in R

zoteror introduces basic functionalities to access the Zotero API. It allows to create new Zotero items, and to take a csv file (or data frame) and import it into Zotero, as long as data are properly mapped. zoteror has function that facilitate giving to tabular data a structure that can properly be read into Zotero. It facilitates resizing the storage space used, by ordering items by attachment size, and by allowing to add items to a collection if certain criteria are met.