Google Trends and Refugee Modeling

This project builds on work by researchers at the University of Oulu, the University of Melbourne, and Harokopio University of Athens, published in their paper "Correlating Refugee Border Crossings with Internet Search Data". The paper asks: "Can Internet search data be used as a proxy to predict refugee mobility?" Per their findings, "Results indicate that the reuse of internet search data considerably improves the predictive power of the models." However, that research focused solely on refugees fleeing North Africa and the Middle East for Greece. This project aims to apply the same techniques to refugees fleeing Somalia and South Sudan for Ethiopia.

This project is open source and available at https://github.com/jataware/refugee-trends. It provides a tool that uses Google Translate and Google Trends to help a user identify search terms that may be correlated with refugee arrivals.

The tool currently supports refugee arrivals to Ethiopia from Somalia and South Sudan. The user selects a country of interest (Somalia or South Sudan) and then inputs a term. That term can optionally be translated into Somali or Sudanese, and it is then checked against Google Trends for that country. The tool returns the Pearson correlation coefficient between the Google Trends time series for the term and refugee arrivals to Ethiopia.
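The correlation step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual implementation: `pearson_r` is a hypothetical helper name, and both series are made-up values standing in for a Google Trends series and an arrivals series sampled over the same periods.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length and at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: search interest for one term and refugee arrivals,
# aligned on the same time periods. These are NOT real observations.
search_interest = [12, 18, 25, 40, 38, 55]
refugee_arrivals = [300, 420, 510, 900, 870, 1200]

r = pearson_r(search_interest, refugee_arrivals)
print(f"Pearson r = {r:.3f}")
```

A value near 1 or -1 suggests the term moves closely with arrivals, while a value near 0 suggests little linear relationship. The two series must cover the same periods at the same frequency before they are compared.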

Once you have identified a set of target terms, the tool lets you build a time series for each target term. We then provide a wrapper around the Panos Kostakos et al. model, which determines whether the prediction of refugee arrivals to Ethiopia improves compared to a baseline model:

########### RESULTS ###########
             Method  RMSE   MAE
1          Baseline 16.13 13.59
2 Linear Regression 12.16  8.88
3         Full tree 19.90 15.57
4       Pruned tree 16.13 13.59
5     Random forest 16.66 13.03
6               SVM 12.07  9.96
###############################
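To make the comparison concrete, the sketch below defines the two error metrics reported in the table and checks which methods score lower than the baseline on both. It is an illustration only: the function names `rmse`, `mae`, and `beats_baseline` are hypothetical and are not taken from the project's wrapper, and the scores are copied from the results table above rather than recomputed.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Scores copied from the results table above: method -> (RMSE, MAE).
results = {
    "Baseline": (16.13, 13.59),
    "Linear Regression": (12.16, 8.88),
    "Full tree": (19.90, 15.57),
    "Pruned tree": (16.13, 13.59),
    "Random forest": (16.66, 13.03),
    "SVM": (12.07, 9.96),
}


def beats_baseline(scores, baseline="Baseline"):
    """List the methods whose RMSE and MAE are both strictly below the baseline's."""
    base_rmse, base_mae = scores[baseline]
    return [
        name
        for name, (method_rmse, method_mae) in scores.items()
        if name != baseline and method_rmse < base_rmse and method_mae < base_mae
    ]


print(beats_baseline(results))
```

Lower is better for both metrics. In this run, only Linear Regression and SVM score below the baseline on both RMSE and MAE. The pruned tree matches the baseline exactly, and the random forest has a lower MAE but a higher RMSE.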