An Integrated Machine Learning Approach To Studying Terrorism


This project investigates an integrated machine learning approach to the classification and analysis of global terrorist activity. It makes three contributions: 1) an exploration of supervised machine learning as a novel technique in the study of terrorist activity; 2) a model that classifies historical events in the Global Terrorism Database (GTD) that have not yet been attributed to a responsible party; and 3) the release of a new dataset, QFactors_Terrorism, that integrates event-specific features derived from the GTD with population-level demographic data from open sources such as the World Bank and the United Nations. A random forest model trained on this dataset classifies the actor responsible for an identified incident with up to 68% accuracy. This project makes no claim about the ability to forecast or predict future terrorist activity; rather, it is intended to highlight the importance of a machine learning approach that, when integrated with domain expertise, can augment the study of complex social issues.
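
As a rough illustration of what integrating event-level and population-level data can look like, the sketch below joins event records to demographic indicators. It assumes the join key is country and year, which the abstract does not state. The function `integrate`, the field names, and all values are hypothetical and are not drawn from the GTD or from QFactors_Terrorism.

```python
# Hypothetical illustration only: the field names and values below are
# invented and do not come from the GTD or the QFactors_Terrorism dataset.

def integrate(events, demographics):
    """Left-join event records to country-year demographic indicators."""
    # Index demographic rows by (country, year) for direct lookup.
    index = {(d["country"], d["year"]): d for d in demographics}
    merged = []
    for event in events:
        row = dict(event)  # copy so the input records are not mutated
        demo = index.get((event["country"], event["year"]), {})
        # Copy indicator columns, skipping the join keys themselves.
        for name, value in demo.items():
            if name not in ("country", "year"):
                row[name] = value
        merged.append(row)
    return merged


events = [
    {"event_id": 1, "country": "A", "year": 2000, "attack_type": "bombing"},
    {"event_id": 2, "country": "B", "year": 2001, "attack_type": "armed assault"},
]
demographics = [
    {"country": "A", "year": 2000, "population": 1000000},
]

rows = integrate(events, demographics)
for row in rows:
    print(row)
```

In this sketch, an event with no matching demographic row keeps only its event-level features. A real pipeline would need an explicit policy for such gaps before training a classifier.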

Undergraduate Thesis, Yale University