We now have a very extensive set of software projects hosted on GitHub: https://github.com/openeventdata. This includes a number of different automated event data coders, dictionaries, pipelines, formatting programs and geolocation programs.
Phoenix is up and running again thanks to funding from the National Science Foundation and the great people at the University of Texas at Dallas. The link is
This provides access to real-time events which are updated multiple times during the day. Access to the data is via a web-based REST API which allows substantial control over the events which are downloaded, as well as an R package for accessing the data. A free API key is required to access the data: there is a form on the web site for obtaining this.
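As a rough illustration of what a filtered REST request of this kind looks like, the sketch below assembles a query URL from an access key and a few event filters. The base URL, the `api_key` parameter name, and the filter names (`start_date`, `end_date`, `country`) are all assumptions made for the example; they are not taken from the actual service, whose documentation should be consulted for the real endpoint and parameters.

```python
from urllib.parse import urlencode


def build_query_url(base_url, api_key, **filters):
    """Return a GET URL carrying the key and the event filters as query parameters."""
    params = {"api_key": api_key}  # hypothetical parameter name
    params.update(filters)
    # Sort the parameters so the resulting URL is deterministic.
    return base_url + "?" + urlencode(sorted(params.items()))


url = build_query_url(
    "https://example.org/api/events",  # placeholder, not the real endpoint
    "YOUR_KEY",
    start_date="2018-09-01",  # hypothetical filter names
    end_date="2018-09-20",
    country="SYR",
)
print(url)
```

The URL could then be fetched with any HTTP client; only the filters that the real API documents would have an effect.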
TERRIER (Temporally Extended, Regular, Reproducible International Event Records) BETA is a new machine-coded event dataset using the CAMEO ontology, produced from a historical corpus ranging from 1979 to 2016 and available for download at OSF. This dataset is an initial beta release of the data, lacking event geolocation. The historical text corpus used to code TERRIER is the largest used to date for an event data project. It includes the complete archives of all major US and international newspapers and wire services going back to the 1970s.
The dataset was produced by a team at the University of Oklahoma as part of the NSF RIDIR grant “Modernizing Political Event Data” SBE-SMA-1539302.
The link is http://terrierdata.org/
Cline Center has a lengthy historical data set at http://www.clinecenter.illinois.edu/data/event/phoenix/
The successor to the DARPA ICEWS project has data which, as of 20 Sept 2018, was available to the public for 1995 to February 2017, though until fairly recently it was being updated monthly with a lag of one year: https://dataverse.harvard.edu/dataverse/icews
The ICEWS Dataverse site includes actor dictionaries as well as the data.
If you happen to be working on an application for the US government, the near-real-time data—it seems to be updated about once a week—is available at various places, though apparently some negotiation is required to get it.
The format of the ICEWS data is a little unconventional: this program reformats it into a version that looks more like conventional event data. Also note that there are slight differences between the formats of the Dataverse and government versions.
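To give a sense of what such a reformatting step involves, here is a minimal sketch that reads tab-delimited records and emits compact date, source, target and event-code tuples. It is not the program referred to above. The column names (`Event Date`, `Source Country`, `Target Country`, `CAMEO Code`) are assumed for the example and should be checked against the header of the file actually downloaded, particularly since the Dataverse and government versions differ.

```python
import csv
import io


def reformat_icews(tsv_text):
    """Convert tab-delimited rows into (date, source, target, code) tuples."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    events = []
    for row in reader:
        # Turn an ISO date such as 2015-03-01 into the compact form 20150301.
        date = row["Event Date"].replace("-", "")
        events.append(
            (date, row["Source Country"], row["Target Country"], row["CAMEO Code"])
        )
    return events


# A single made-up record, for illustration only.
sample = (
    "Event Date\tSource Country\tCAMEO Code\tTarget Country\n"
    "2015-03-01\tFrance\t042\tGermany\n"
)
print(reformat_icews(sample))
```

A fuller converter would also map actor names and sectors onto standard actor codes, which this sketch leaves out.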
These data projects are not formally affiliated with OEDA but the links may be useful if you are looking for data on political conflict events, broadly defined. And as you can see from the list, there's a lot of open data available on political conflict.
ACLED is a disaggregated conflict collection, analysis and crisis mapping project. ACLED collects the dates, actors, types of violence, locations, and fatalities of all reported political violence and protest events across Africa, South Asia, South East Asia and the Middle East.
UCDP is the world’s main provider of data on organized violence and the oldest ongoing data collection project for civil war, with a history of almost 40 years.
The ur-data for quantitative studies of conflict, the Correlates of War (COW) project provides a wide variety of datasets on inter-state (and occasionally sub-state) war and threats of war, and a variety of structural variables such as national capacity, treaties and alliances, trade and diplomatic relations and so on.
The GTD is an open-source database including information on terrorist events around the world from 1970 through 2016 (with annual updates planned for the future). GTD includes systematic data on domestic as well as international terrorist incidents that have occurred during this time period and now includes more than 170,000 cases.
The Mass Mobilization in Autocracies Database (MMAD) contains sub-national data on mass mobilization events in autocracies worldwide. It includes instances of both anti- and pro-regime protest at the level of cities with daily resolution. The data is coded based on news reports obtained from AP, the AFP and BBC Monitoring. The main database contains information at the level of reports, which means that every mention of a political protest in a news report constitutes a new entry in the database.
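Because the main database is organized by report rather than by event, several entries can describe the same protest. One simple way to move from reports to city-day events is to group entries by city and date, as sketched below. The field names (`city`, `date`, `source`) and the sample rows are invented for illustration and do not reflect the actual MMAD variable names or contents.

```python
from collections import defaultdict


def group_reports(reports):
    """Group report-level entries into city-day events keyed by (city, date)."""
    grouped = defaultdict(list)
    for report in reports:
        grouped[(report["city"], report["date"])].append(report)
    return dict(grouped)


# Made-up report entries, for illustration only.
reports = [
    {"city": "City A", "date": "2012-01-01", "source": "AP"},
    {"city": "City A", "date": "2012-01-01", "source": "AFP"},
    {"city": "City B", "date": "2012-01-02", "source": "BBC Monitoring"},
]

events = group_reports(reports)
for (city, date), entries in events.items():
    print(city, date, len(entries))
```

Grouping on city and date alone would merge distinct protests that happened in the same city on the same day, so a real analysis would likely need further keys such as the side of the protest.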
The Nonviolent and Violent Campaigns and Outcomes (NAVCO) Data Project is a multi-level data collection effort that catalogues major nonviolent and violent resistance campaigns around the globe from 1900-2013. The project produces aggregate-level data on resistance campaigns from 1900-2013 (NAVCO 1), annual data on campaign behavior from 1946-2013 (NAVCO 2), and events data on tactical selection in a sample of 26 countries with major nonviolent and violent campaigns from 1991-2012 (NAVCO 3).
Version 3.0 of the Ethnic Power Relations dataset (EPR3) identifies all politically relevant ethnic groups and their access to state power in every country of the world from 1946 to 2010. It includes annual data for 157 countries and 758 groups and codes the degree to which their representatives held executive-level state power, from total control of the government to overt political discrimination.
The GROWup data portal unites a number of datasets on ethnic groups and intrastate conflict from various sources in a single relational database.
The SIPRI arms transfers database shows all international transfers of major conventional arms since 1950 and is the most comprehensive publicly available source of information on international arms transfers. The SIPRI military expenditure database gives the annual military spending of countries since 1988, allowing comparison of countries’ military spending in local currency at current prices; in US dollars at constant prices and exchange rates; and as a share of GDP.
The Global Conflict Risk Index (GCRI) is an index of the statistical risk of violent conflict in the next 1-4 years and is exclusively based on quantitative indicators from open sources. With the assumption that structural conditions in a country are linked to the occurrence of violent conflict, the GCRI collects 24 variables in 5 dimensions (social, economic, security, political, geographical/environmental) and uses statistical regression models to calculate probability and intensity of violent conflict.
The freely accessible Europe Media Monitor (EMM) is a fully automatic system that analyses both traditional and social media. It gathers about 300,000 news articles per day in up to 70 languages, groups related items, categorises them into thousands of categories, extracts information, produces statistics and timelines, detects breaking news and sends out alerts. While the system primarily focuses on real-time texts (updates every 10 minutes), an event data component was recently added.
The PITF Worldwide Atrocities Dataset is a geolocated global dataset that describes, in quantitative terms, the deliberate killing of non-combatant civilians in the context of a wider political conflict. The cases involve the deaths of five or more individuals in a single incident, and targeted killings of politically-relevant individuals such as politicians, journalists, and educators, which were reported by one or more of the major international news services. The dataset covers 1 January 1995 to 30 September 2017 and includes about 14,000 events.
CShapes is a new dataset that provides historical maps of state boundaries and capitals in the post-World War II period. The dataset is coded according to both the Correlates of War and the Gleditsch and Ward (1999) state lists, and is therefore compatible with a great number of existing databases in the discipline. Provided in a geographic data format, CShapes can be used directly with standard GIS software.
The Archigos data provide a list of leaders for all independent states in the world 1885 through 2015.
Last update: 20 September 2018