GERDA R Package

This package provides tools to download comprehensive datasets of local, state, and federal election results in Germany from 1990 to 2025. It facilitates access to data on turnout, vote shares for major parties, and demographic information at different levels of government. It also includes county-level socioeconomic covariates from INKAR, municipality-level data from the German Census 2022, and a party crosswalk that maps GERDA party names to ParlGov attributes.

GERDA was compiled by Vincent Heddesheimer, Florian Sichart, Andreas Wiedemann and Hanno Hilbig. For additional information, see the GERDA website (www.german-elections.com) and the accompanying publication: doi.org/10.1038/s41597-025-04811-5

Note: This package is a work in progress. Comments and suggestions are welcome; please send them to hhilbig@ucdavis.edu.

Installation

You can install GERDA from CRAN:

install.packages("gerda")

Or install the development version from GitHub:

# Install devtools if you haven't already
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# Install GERDA development version
devtools::install_github("hhilbig/gerda")

Main Functions

The functions used in this README are:

- gerda_data_list(): lists the available datasets and their descriptions.
- load_gerda_web(): downloads a dataset by name, for example "municipal_harm".
- add_gerda_covariates(): adds the county-level INKAR covariates to election data.
- gerda_covariates() and gerda_covariates_codebook(): return the raw covariate data and its codebook.
- add_gerda_census(): adds the Census 2022 data to election data.
- gerda_census_codebook(): returns the data dictionary for the census data.

Usage Examples

# Load the package
library(gerda)

# List available datasets
available_data <- gerda_data_list()

# Load a dataset
data_municipal_harm <- load_gerda_web("municipal_harm", verbose = TRUE, file_format = "rds")

County-Level Covariates

The package provides access to socioeconomic and demographic indicators from INKAR for 400 German counties, covering 1995 to 2022. Covariates can therefore be matched to federal elections from 1998 onwards; earlier elections fall outside the INKAR coverage window. The covariates can be added to both county-level and municipal-level GERDA election data:

library(gerda)
library(dplyr)

# Works with county-level data
county_merged <- load_gerda_web("federal_cty_harm") %>%
  add_gerda_covariates()

# Also works with municipal-level data
# (Note: All municipalities in the same county get identical covariate values)
muni_merged <- load_gerda_web("federal_muni_harm_21") %>%
  add_gerda_covariates()

# Done! Your data now includes 30 county-level covariates
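
To make the two caveats above concrete (municipalities inherit their county's values, and elections before 1998 have no INKAR match), here is a minimal sketch on toy data. It does not call the package: the join is reproduced with base R's merge(), and the values as well as the column names muni_code and unemployment_rate are made up for illustration. Only county_code, election_year, and year follow the manual-merge keys shown in this README.

```r
# Toy county-level covariates (hypothetical values and variable name)
covs_toy <- data.frame(
  county_code = c("01001", "01001", "01002", "01002"),
  year = c(1998, 2021, 1998, 2021),
  unemployment_rate = c(12.1, 7.4, 9.8, 5.2)
)

# Toy municipal election rows: two municipalities in county 01001,
# one in county 01002, plus one 1994 row that predates the covariates
muni_toy <- data.frame(
  muni_code = c("01001001", "01001002", "01002001", "01001001"),
  county_code = c("01001", "01001", "01002", "01001"),
  election_year = c(2021, 2021, 2021, 1994)
)

# Left join on county and year: all municipal rows are kept
merged_toy <- merge(
  muni_toy, covs_toy,
  by.x = c("county_code", "election_year"),
  by.y = c("county_code", "year"),
  all.x = TRUE
)

# Both 2021 municipalities in county 01001 carry the same county value
same_county <- merged_toy$county_code == "01001" &
  merged_toy$election_year == 2021
print(merged_toy[same_county, ])

# The 1994 row has no covariate match and is filled with NA
print(merged_toy[merged_toy$election_year == 1994, ])
```

Because the covariates vary only by county and year, any analysis of municipal data that uses them should treat them as county-level information, for example when choosing the level at which to cluster standard errors.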

For more control, use the accessor functions:

# Get raw covariate data
covs <- gerda_covariates()

# View the codebook
codebook <- gerda_covariates_codebook()

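# Manual merge (advanced); requires dplyr for %>% and left_join()
library(dplyr)
elections <- load_gerda_web("federal_cty_harm")
merged <- elections %>%
  left_join(covs, by = c("county_code" = "county_code", "election_year" = "year"))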

The dataset contains 30 variables, among them core indicators on demographics, the economy, and the labor market.

Coverage note: Core variables (demographics, economy, labor market) are available for all election years 1998-2021. Some newer INKAR indicators are available for recent elections only. Check gerda_covariates_codebook() for per-variable coverage details.

See ?gerda_covariates for full documentation and gerda_covariates_codebook() for a complete data dictionary with variable descriptions, units, and missing data information.
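
Alongside the codebook, coverage can also be checked directly on a merged dataset. The sketch below computes the share of non-missing values per variable and election year. It runs on a toy data frame rather than real GERDA output, and the column names population_density and broadband_share are hypothetical stand-ins for a core variable and a newer indicator.

```r
# Toy merged data: one fully covered variable, one available only recently
cov_check_toy <- data.frame(
  election_year = c(1998, 1998, 2021, 2021),
  population_density = c(150, 900, 155, 920),  # hypothetical core variable
  broadband_share = c(NA, NA, 85.0, NA)        # hypothetical newer indicator
)

covariate_cols <- c("population_density", "broadband_share")

# Share of non-missing values per variable, by election year.
# aggregate() on a data frame passes NAs through to FUN, so the
# anonymous function sees them and can count them.
coverage <- aggregate(
  cov_check_toy[covariate_cols],
  by = list(election_year = cov_check_toy$election_year),
  FUN = function(x) mean(!is.na(x))
)

print(coverage)
```

With real data, the same call should work after replacing cov_check_toy with the merged election data and covariate_cols with the covariate column names of interest.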

Census 2022 Data

The package also provides municipality-level data from the German Census 2022 (Zensus 2022). This cross-sectional snapshot covers approximately 10,800 municipalities and can be merged with any GERDA election dataset. The main advantage of this data is that it is observed at the municipal level (unlike the county-level INKAR data), allowing for more fine-grained analyses of local election outcomes. However, the census is a single time point (2022), so it does not vary across election years – users should not conduct analyses that rely on over-time variation in these covariates.

library(gerda)

# Add census data to municipal-level elections
muni_merged <- load_gerda_web("federal_muni_harm_21") |>
  add_gerda_census()

# Also works with county-level data (aggregated from municipalities)
county_merged <- load_gerda_web("federal_cty_harm") |>
  add_gerda_census()

The census data includes 14 indicators across four categories; gerda_census_codebook() describes each one.

Since the census is a 2022 snapshot, the same values are attached to all election years.
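
The consequence of attaching a single snapshot to several election years can be seen in a small sketch. It uses toy data and base R's merge() instead of the package functions. muni_code and turnout are hypothetical column names and all values are made up; only avg_household_size_census22 is a variable name taken from this README.

```r
# Toy census snapshot: one row per municipality, no year dimension
census_toy <- data.frame(
  muni_code = c("01001001", "01001002"),
  avg_household_size_census22 = c(2.1, 1.9)
)

# Toy election panel: two municipalities observed in two elections
elections_toy <- data.frame(
  muni_code = rep(c("01001001", "01001002"), each = 2),
  election_year = rep(c(2017, 2021), times = 2),
  turnout = c(0.74, 0.76, 0.70, 0.73)
)

# Merge on municipality only: the snapshot is repeated for every year
panel_toy <- merge(elections_toy, census_toy, by = "muni_code", all.x = TRUE)

# Count distinct census values per municipality across election years
n_distinct_values <- tapply(
  panel_toy$avg_household_size_census22,
  panel_toy$muni_code,
  function(x) length(unique(x))
)

print(n_distinct_values)
```

Each municipality has exactly one distinct census value, so the census covariates would be absorbed by municipality fixed effects. They are informative for cross-sectional comparisons only.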

Coverage note: Most census variables have >95% municipality coverage. avg_household_size_census22 has approximately 12.5% missing values due to Destatis disclosure rules that suppress data for small municipalities.

See ?gerda_census for full documentation and gerda_census_codebook() for the complete data dictionary.

Note

For a complete list of available datasets and their descriptions, use the gerda_data_list() function. Depending on how it is called, it either prints a formatted table to the console and invisibly returns a tibble, or returns the tibble directly without printing; see ?gerda_data_list for the argument that controls this.

Feedback

As this package is a work in progress, we welcome feedback. Please send your comments to hhilbig@ucdavis.edu or open an issue on the GitHub repository.