Using ACLED's API

Accessing the API

To use ACLED’s API, you must first register an account in myACLED. You can find more information about registering your account by visiting myACLED’s FAQs.

You will authenticate your API requests using your myACLED email and password. Unlike in older versions of acledR and the ACLED API, you will not need an API key to authenticate your requests. Instead, ACLED’s API uses OAuth, and acledR will handle OAuth tokens internally for you. We recommend storing the email and password associated with your ACLED account as environment variables for easy and safe access in your scripts.

To do so, run

file.edit(file.path("~", ".Renviron"))

which will open your .Renviron file. Once open, set:

email_address = "your_email"
acled_password = "your_password"

Then save the file and restart your R session. You can confirm that both values have been stored properly by running Sys.getenv("email_address") to return the stored email address and Sys.getenv("acled_password") to return the stored password in the console. You can then enter your credentials within each API call like:

acled_api(
  email = Sys.getenv("email_address"),
  password = Sys.getenv("acled_password"),
  ...)
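Before making a call, it can help to confirm that both variables are visible to the current session without printing the password to the console. The following is a minimal sketch; credentials_available() is a hypothetical helper written for this illustration and is not part of acledR.

```r
# Hypothetical helper (not part of acledR): returns TRUE only if every named
# environment variable is set to a non-empty value. It never prints the values.
credentials_available <- function(vars = c("email_address", "acled_password")) {
  values <- Sys.getenv(vars, unset = "")
  all(nzchar(values))
}

if (!credentials_available()) {
  message("ACLED credentials not found; check your .Renviron and restart R.")
}
```

Checking for a non-empty value rather than echoing it keeps the password out of your console history and rendered reports.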

As an alternative, you can manually enter your credentials within each API call. While we do not recommend this approach because it is the least secure, it may be suitable for some use cases. You can use this option by following this example:

acled_api(
  email = "email_address", 
  password = "acled_password",
  ...)

You can also leave the password field empty to open an interactive prompt for typing in your password:

acled_api(
  email = "email_address", 
  ...)

Whether you set your email and password via environment variables, manually, or via an interactive prompt, acledR will manage the associated OAuth tokens for you via acledR::acled_auth(), which relies on httr2::req_oauth_password().

ACLED API

acled_api() is a function you can use to request and process ACLED API calls. The function takes the following arguments:

acled_api(email = NULL,
          password = NULL,
          country = NULL,
          regions = NULL,
          start_date = "1997-01-01",
          end_date = Sys.Date(),
          timestamp = NULL,
          event_types = NULL,
          population = "none",
          monadic = FALSE,
          inter_numeric = FALSE,
          ...)

Parameters for the API

Geographical filters

You can use the country and regions parameters to specify the locations from which you would like to request data. If both values are NULL or are not included, the API will return data for all countries and regions. If you would like to request data for multiple countries, you can do so by using a vector of country names (e.g., c("Argentina","Spain","Bolivia")). Similarly, you can request data from one or more regions by using either a vector of region names or numeric codes. acledR::acled_countries and acledR::acled_regions show the full lists of countries and regions available. Please visit ACLED’s Knowledge Base for region-specific methodology questions.

Temporal filters

You can specify the date range for which you would like to receive data by using the start_date and end_date parameters, both of which require data in the “yyyy-mm-dd” format.
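If your dates come from user input or another dataset, you may want to normalize them to “yyyy-mm-dd” before passing them on. The sketch below uses only base R; as_acled_date() is a hypothetical helper introduced for illustration, not an acledR function.

```r
# Hypothetical helper: returns the date as a "yyyy-mm-dd" string, or stops
# with an error if the input cannot be read in that format.
as_acled_date <- function(x) {
  parsed <- as.Date(as.character(x), format = "%Y-%m-%d")
  if (is.na(parsed)) {
    stop("Date must be in yyyy-mm-dd format: ", x)
  }
  format(parsed, "%Y-%m-%d")
}

as_acled_date("2022-06-01")  # a valid string is returned unchanged
as_acled_date(Sys.Date())    # Date objects are converted to a string
```

A value such as "06/01/2022" would raise an error here instead of being sent to the API in the wrong format.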

You can use the timestamp parameter to select data that were added or updated over a specific time period. Please keep in mind that timestamp indicates when the event was added or modified in ACLED’s dataset, meaning that an event that occurred far in the past (i.e., with an old event date) may still have a recent timestamp if it was recently updated.

In practice, the timestamp parameter is typically not used for analysis but is instead used to keep your own dataset up to date as changes are made to ACLED’s data. To learn more about how to keep your datasets up to date, visit the Keeping your datasets up to date page for an acledR approach or this guide more relevant to Excel or other spreadsheet tools.
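Because timestamps can be handled as either dates or Unix time, it may be useful to convert between the two. The sketch below uses base R only and assumes UTC; whether the API interprets a date string as midnight UTC is an assumption made here for illustration.

```r
# Convert a calendar date (interpreted as midnight UTC) to a Unix timestamp.
ts_numeric <- as.numeric(as.POSIXct("2022-01-24", tz = "UTC"))
ts_numeric
#> [1] 1642982400

# Convert a Unix timestamp back to a readable date-time in UTC.
ts_readable <- format(
  as.POSIXct(1643056974, origin = "1970-01-01", tz = "UTC"),
  "%Y-%m-%d %H:%M:%S"
)
ts_readable
#> [1] "2022-01-24 20:42:54"
```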

Additional filters

You can also use the event_types argument to filter to specific event_types in ACLED data. To do so, you should enter the event_type of interest as a string or as a vector of strings (e.g., event_types = "Battles" or event_types = c("Battles", "Protests")). For a description of all available event_types in ACLED’s dataset, please refer to ACLED’s codebook.

ACLED data defaults to a wide (or dyadic) format, in which each row contains multiple actor columns for the actors interacting during the event. However, you can request a long (or monadic) format using the monadic argument. By default, this argument is FALSE, meaning you will receive the dyadic version of the data. When monadic = TRUE, the function will return a monadic (“long-form”) data frame in which each row has only one actor (based on actor1 and actor2). To transform your dataset from wide to long without using the API, or to transform it based on different sets of columns, see acled_transform_longer(). For more information on the difference between our wide/dyadic and monadic/long datasets, please visit our API guide.
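To make the wide/long distinction concrete, here is a toy illustration in base R using made-up events. It is not how acled_api() or acled_transform_longer() is implemented, and the output column names (actor, actor_role) are hypothetical; dropping empty actor slots is a choice made here for readability.

```r
# Toy dyadic (wide) data: one row per event, two actor columns.
wide <- data.frame(
  event_id_cnty = c("EV1", "EV2"),
  actor1 = c("Group A", "Protesters"),
  actor2 = c("Group B", NA),
  stringsAsFactors = FALSE
)

# Monadic (long) version: stack the actor columns so each row has one actor.
long <- rbind(
  data.frame(event_id_cnty = wide$event_id_cnty, actor = wide$actor1,
             actor_role = "actor1", stringsAsFactors = FALSE),
  data.frame(event_id_cnty = wide$event_id_cnty, actor = wide$actor2,
             actor_role = "actor2", stringsAsFactors = FALSE)
)

# Drop empty actor slots and keep rows for the same event together.
long <- long[!is.na(long$actor), ]
long <- long[order(long$event_id_cnty), ]
long
```

In this toy case the two-row wide table becomes a three-row long table: two rows for EV1 and one for EV2.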

Finally, you can use the population argument to specify whether you want to include the estimated population exposure columns. This argument takes three options: none, which returns no extra columns; best, which returns only the population_best column; and full, which returns all the estimated population columns. For more information, visit our Conflict Exposure piece.

The ... parameter represents any other arguments you might want to include in your API query, such as ISO or Interaction. If you want to use these filters or others not included in the list of parameters described above, then you can write them as &parameter=value. For instance, you might wish to include &iso=4 at the end of the function call. You can visit ACLED’s API guide to learn more about other valid parameters.
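If you need several extra filters, assembling the &parameter=value string by hand can be error-prone. The sketch below shows one way to build it; build_extra_query() is a hypothetical helper, not part of acledR, and it does not URL-encode values, so it is only suitable for simple numeric or alphanumeric values.

```r
# Hypothetical helper: turn named values into a string of "&name=value" pairs.
build_extra_query <- function(...) {
  params <- list(...)
  if (length(params) == 0) {
    return("")
  }
  paste0("&", names(params), "=", unlist(params), collapse = "")
}

build_extra_query(iso = 4)
#> [1] "&iso=4"
build_extra_query(iso = 4, interaction = 57)
#> [1] "&iso=4&interaction=57"
```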

Handling big API calls

As is common when executing API calls, handling large volumes of data requires some special consideration. In ACLED’s case, the base API uses pagination to address some of these issues, but pagination can be confusing for newer users (see our API guide for a more detailed explanation). Fortunately, this package avoids this issue. Instead of manual pagination, the acled_api() function splits the call automatically.

acled_api() will first estimate how much data you are requesting. You will then be prompted with a message which includes the following:

The number of countries for which data is being requested,
The number of estimated events requested (based only on country and year, and NOT event type),
The number of API calls needed, based on an estimate of how big the call is,
A question asking whether, given this information and the number of available API calls linked to your account, you would like to proceed with your API call.
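The relationship between the estimated number of events and the number of calls is simple arithmetic. The sketch below illustrates it; estimate_calls() and the rows_per_call value are hypothetical and chosen only for illustration, and they do not reflect the actual limits or estimation logic used by acled_api().

```r
# Hypothetical illustration: number of calls needed to retrieve a given number
# of events when each call returns at most `rows_per_call` rows.
estimate_calls <- function(estimated_events, rows_per_call) {
  ceiling(estimated_events / rows_per_call)
}

# With made-up numbers: 120,000 events at 50,000 rows per call.
estimate_calls(estimated_events = 120000, rows_per_call = 50000)
#> [1] 3
```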

Example - Requesting data with `acled_api()`

Imagine you are interested in events that occurred in Argentina between June 1 and June 30, 2022:

library(acledR)
library(dplyr)

# Note: This is only an example. You will need to supply your own credentials rather than the email and password placeholders used below.

df_ar <- acled_api(
  email = Sys.getenv("email_address"),
  password = Sys.getenv("acled_password"),
  country = c("Argentina"),
  start_date = "2022-06-01",
  end_date = "2022-06-30",
  monadic = FALSE)

This returns a tibble that includes each ACLED event in Argentina during the specified period:

head(df_ar, 5)
#> # A tibble: 5 × 31
#>   event_id_cnty event_date  year time_precision disorder_type  event_type
#>   <chr>         <date>     <dbl>          <dbl> <chr>          <chr>     
#> 1 ARG10607      2022-06-30  2022              1 Demonstrations Protests  
#> 2 ARG10626      2022-06-30  2022              1 Demonstrations Protests  
#> 3 ARG10618      2022-06-30  2022              1 Demonstrations Protests  
#> 4 ARG10615      2022-06-30  2022              1 Demonstrations Protests  
#> 5 ARG10627      2022-06-30  2022              1 Demonstrations Protests  
#> # ℹ 25 more variables: sub_event_type <chr>, actor1 <chr>, assoc_actor_1 <chr>,
#> #   inter1 <dbl>, actor2 <chr>, assoc_actor_2 <chr>, inter2 <dbl>,
#> #   interaction <dbl>, civilian_targeting <chr>, iso <dbl>, region <chr>,
#> #   country <chr>, admin1 <chr>, admin2 <chr>, admin3 <lgl>, location <chr>,
#> #   latitude <dbl>, longitude <dbl>, geo_precision <dbl>, source <chr>,
#> #   source_scale <chr>, notes <chr>, fatalities <dbl>, tags <chr>,
#> #   timestamp <dbl>

If you wanted data from both Brazil and Colombia, you would execute the following:

df_br_co <- acled_api(
  email = Sys.getenv("email_address"),
  password = Sys.getenv("acled_password"),
  country = c("Brazil", "Colombia"),
  start_date = "2022-01-01",
  end_date = "2022-12-01",
  monadic = FALSE)

If you are interested in events occurring over a larger area, it may be simpler to omit the country parameter and include a regions argument instead. You could also include an event_types argument to receive only a specific type of event:

df_sa <- acled_api(email = Sys.getenv("email_address"),
                   password = Sys.getenv("acled_password"),
                   regions = c("South America"),
                   start_date = "2022-01-01",
                   end_date = "2022-12-01",
                   event_types = "Protests",
                   monadic = FALSE)

You can use the timestamp parameter to specify the date from which you would like to receive new or updated data. You can supply the argument as either a string (“yyyy-mm-dd”) or a numeric Unix timestamp:

df_br_co <- acled_api(
  email = Sys.getenv("email_address"),
  password = Sys.getenv("acled_password"),
  country = c("Brazil", "Colombia"),
  start_date = "2022-01-01",
  end_date = "2022-12-01",
  monadic = FALSE,
  # timestamp = "2022-01-24"  # as a string
  timestamp = 1643056974)     # as a numeric Unix timestamp

If you would like to include only one type of interaction (e.g., “Rioters versus Civilians (57)”), then you can add the interaction code to the ... argument:

df_sa <- acled_api(email = Sys.getenv("email_address"),
                   password = Sys.getenv("acled_password"),
                   country = c("Brazil", "Colombia"),
                   start_date = "2022-01-01",
                   end_date = "2022-12-01",
                   monadic = FALSE,
                   ... = "&interaction=57")

You could also request the monadic version of the data by setting monadic = TRUE:

df_sa_monadic <- acled_api(email = Sys.getenv("email_address"),
                           password = Sys.getenv("acled_password"),
                           regions = c("South America"),
                           start_date = "2022-01-01",
                           end_date = "2022-01-01",
                           monadic = TRUE)