Our projects

The Internet as Quantitative Social Science Platform

In this project, we analysed over one trillion online/offline observations to build the world’s richest data set on internet activity around the globe. With this data, we can examine human behaviour on a previously unimaginable scale. The data has already been used to study global sleep patterns, the diffusion of the internet and the internet’s impact on the economy, but that is just the beginning. The culmination of over three years’ work, our ground-breaking database provides a first glimpse of the potential for global internet activity data to profoundly change the way research on human behaviour and social interactions is conducted, and the types of questions we can ask and answer. This has become possible because, for the first time in human history, half of the world’s population is connected to a single network on which any device can instantly, and at negligible cost, passively query another’s online or offline status, rendering the internet a powerful and unprecedented social data-science platform.

Classifying Satellite Images for Socio-Economic Activity

The goal of this project is to go beyond nighttime light luminosity as a measure of subnational economic development or social activity and to use daytime imagery. High-resolution daytime satellite images are becoming increasingly available for the entire world and, compared to night lights, they contain more information about the landscape that is reflective of economic activity. Night lights are limited in their ability to distinguish between poor and densely populated areas, and in these cases daytime images can fill the gap. However, because daytime images carry more information, they are highly unstructured, which makes it difficult to extract information that can be scaled to an economic measure. We first employ a convolutional neural network (CNN) approach to extract physical features from the daytime images (e.g., roads, railways, buildings). The predicted values are then used to build a second model that predicts economic indicators (e.g., GDP) from the aggregated OSM predictions, from grid cells to regions.
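The second stage described above (grid-cell predictions aggregated to regions, then related to an economic indicator) can be illustrated with a minimal sketch. It is not the project's actual pipeline: the CNN stage is assumed to have already produced per-cell feature scores, the function names `aggregate_to_regions` and `fit_line` are hypothetical, averaging is only one possible aggregation rule, and a one-regressor least-squares fit stands in for whatever second model the project uses.

```python
from collections import defaultdict


def aggregate_to_regions(cell_predictions):
    """Average per-cell feature scores within each region.

    cell_predictions: iterable of (region_id, {feature_name: score}) pairs,
    one pair per grid cell. A feature missing from a cell counts as zero.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for region, features in cell_predictions:
        counts[region] += 1
        for name, score in features.items():
            sums[region][name] += score
    return {
        region: {name: total / counts[region] for name, total in feats.items()}
        for region, feats in sums.items()
    }


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one regressor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical per-cell scores, as if produced by the first-stage model.
cells = [
    ("A", {"road": 0.2, "building": 0.4}),
    ("A", {"road": 0.6, "building": 0.8}),
    ("B", {"road": 1.0, "building": 0.0}),
]
regions = aggregate_to_regions(cells)
```

With regional averages in hand, `fit_line` could relate a single aggregated feature to an indicator such as regional GDP; a realistic second model would use several features and out-of-sample validation.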

Predicting Social Unrest via Textual Analytics

Social unrest impacts societies all over the globe. The causes of unrest are many and varied, but it is often triggered by either machinations in the political class or basic economic factors. Unfortunately, in many authoritarian or autocratic regimes, social unrest leads to violence, destruction of property, and, sadly, physical harm and death. The aim of this project is to develop a method to predict the likelihood of social unrest occurring in a given location by using alternative and economic data. Using alternative data from the GDELT textual analysis project (http://www.gdeltproject.org/), together with economic series from the OECD and other sources, we use AI methods to predict the propensity of social unrest one to four weeks in advance. It is anticipated that such a method will provide the international community with much-needed forewarning of likely flash-points around the globe, such that diplomatic, observer, and other actions can be taken to defuse the crisis, protect citizens, and document any resulting harms.

The Effect of the Internet on Political Mobilization

This project empirically studies the effect of the Internet on protests worldwide. We compile a novel panel dataset that combines geo-referenced data on Internet quality and weekly protests for over 18,907 subnational (ADM2) districts from 236 countries over the years 2006-2012. The Internet penetration data was constructed by combining over a trillion (1.5 × 10^12) IP activity (offline/online) observations with a commercially available IP-geolocation library. Our identification strategy exploits random weekly variation in global Internet latency to identify the causal effect of the Internet on local protests. According to our estimates, latency-adjusted Internet increases the occurrence of local protests. We show that most of the variation in the effect of the Internet on local protests comes from national differences in political institutions and local differences in Internet penetration.

Cyber-cultural Similarity from Google Search

The aim of this project is to develop methods to provide international socio-cultural similarity measures, at scale, from revealed human preferences. For decades the World Values Survey (WVS) has been the gold standard of international cultural similarity and comparison. However, the WVS's strength is also its weakness: by conducting face-to-face values surveys with carefully constructed representative sample sub-populations, the WVS delivers data-source control but limited up-scaling potential. Moreover, face-to-face surveys are known to be highly problematic in eliciting personal values information, especially in repressive or semi-autocratic political contexts. Furthermore, the WVS machinery is highly costly to implement, with survey waves necessarily separated by five-year periods. With more than 50% of humanity now online, alternative data sources, like Google search queries, provide promising opportunities in this space: queries are self-revealed and often highly intimate, queries can be geo-located and analysed in near real-time, and in almost all countries queries to Google cover increasingly large fractions of the population.

Testing Motivational Crowding-Out Theory on reddit

This project explores the relationship between extrinsic (monetary) rewards and intrinsically motivated behaviour in the context of the online discussion platform reddit. Data was obtained from the online forum over selected periods between January 2014 and August 2017. A Python script was written to scrape data from the website, including the following variables for each comment: the opening poster of the thread in which the comment is contained, the date of that thread, the comments in that thread, the text of the comment, the upvote ratio, and the time of the comment. Our Python script to scrape reddit subreddits can be downloaded here.
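To make the per-comment variables listed above concrete, here is a minimal sketch of how one scraped record might be represented and written to CSV. It is not the project's script: the class `CommentRecord`, its field names and the helper `to_csv` are hypothetical, and storing "comments in that thread" as a count is an assumption.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import datetime


@dataclass
class CommentRecord:
    """One scraped comment; all field names are illustrative."""
    thread_author: str         # opening poster of the containing thread
    thread_date: datetime      # date of that thread
    thread_comment_count: int  # comments in that thread (assumed to be a count)
    comment_text: str          # text of the comment
    upvote_ratio: float        # upvote ratio
    comment_time: datetime     # time of the comment


def to_csv(records):
    """Serialise records to CSV text, writing timestamps in ISO 8601 form."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(CommentRecord)]
    )
    writer.writeheader()
    for record in records:
        row = asdict(record)
        row["thread_date"] = record.thread_date.isoformat()
        row["comment_time"] = record.comment_time.isoformat()
        writer.writerow(row)
    return buffer.getvalue()


# Made-up example record.
example = CommentRecord(
    thread_author="alice",
    thread_date=datetime(2014, 1, 5),
    thread_comment_count=12,
    comment_text="Nice post",
    upvote_ratio=0.93,
    comment_time=datetime(2014, 1, 5, 14, 30),
)
csv_text = to_csv([example])
```

A flat record per comment keeps the thread-level variables alongside each comment, which makes later panel-style analysis easier at the cost of some repetition.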

Occupational mobility and automation: a data-driven network model

The potential impact of automation on the labour market is a topic that has generated significant interest and concern amongst scholars, policymakers and the broader public. A number of studies have estimated occupation-specific risk profiles by examining how suitable associated skills and tasks are for automation. However, little work has sought to take a more holistic view of the process of labour reallocation and how employment prospects are impacted as displaced workers transition into new jobs. In this article, we develop a data-driven model to analyse how workers move through an empirically derived occupational mobility network in response to automation scenarios. At a macro level, our model reproduces the Beveridge curve, a key stylized fact in the labour market. At a micro level, our model provides occupation-specific estimates of changes in short- and long-term unemployment corresponding to specific automation shocks. We find that the network structure plays an important role in determining unemployment levels, with occupations in particular areas of the network having few job transition opportunities. In an automation scenario where low-wage occupations are more likely to be automated than high-wage occupations, the network effects are also more likely to increase the long-term unemployment of low-wage occupations.
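The idea that network position limits transition opportunities can be illustrated with a toy one-step reallocation. This is not the article's model: the function `reallocate`, the occupation names and the weights are hypothetical, and displaced workers are simply assumed to move to neighbouring occupations in proportion to the transition weights, with no vacancies, search frictions or dynamics.

```python
def reallocate(displaced, transitions):
    """Distribute displaced workers across destination occupations.

    displaced:   {occupation: number of displaced workers}
    transitions: {occupation: {destination: transition weight}}
    Returns (arrivals, stranded), where stranded holds workers whose
    occupation has no outgoing transition weight at all.
    """
    arrivals = {}
    stranded = {}
    for origin, workers in displaced.items():
        weights = transitions.get(origin, {})
        total = sum(weights.values())
        if total == 0:
            # No transition opportunities in the network for this occupation.
            stranded[origin] = workers
            continue
        for destination, weight in weights.items():
            share = workers * weight / total
            arrivals[destination] = arrivals.get(destination, 0.0) + share
    return arrivals, stranded


# Made-up shock and network.
shock = {"clerk": 100.0, "weaver": 40.0}
network = {"clerk": {"analyst": 3.0, "cashier": 1.0}, "weaver": {}}
arrivals, stranded = reallocate(shock, network)
```

Even in this toy version, an occupation with no outgoing links leaves its displaced workers stranded, which hints at why network structure matters for long-term unemployment.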

Diverting domestic turmoil

When faced with intense domestic turmoil, governments may strategically engage in foreign interactions to divert the public’s attention away from pressing domestic issues. I test this hypothesis for a globally representative sample of 190 countries, at the monthly level, over the years 1997-2014. Using textual data on media-reported events retrieved from the GDELT database, I find robust evidence that governments resort to diversionary tactics in times of domestic turmoil and that such diversion takes the form of verbally aggressive foreign interactions, typically targeted at ‘weak’ countries and countries closely linked along religious, linguistic and geographic dimensions. Strategically important trade partners are unlikely to be victimized. These findings suggest that diversionary foreign policy is, in fact, systematically practised by governments as a strategic tool, and that such diversion is exercised in a manner that may not lead to large-scale costs or risks of retaliation.

The value of names - Civil society, information, and governing multinationals on the global periphery

Civil society is essential to governance, especially where laws and authority are weak. We study how a core strategy of international civil society groups, informing on and publicizing human rights abuses, impacts those tied to abuse. Our study focuses on a major trend at the center of on-going international media campaigns: the assassination of civil society activists involved in mining activity. Collecting and coding 20 years of data on assassination events, we use Event Study Methodology to study how publicity of these events impacts the asset prices of firms associated with abuse. We show that publicizing abuses has a significant impact on multinationals. Firms associated with an assassination have large, negative abnormal returns following the event. We calculate a median loss in market capitalisation of over 100 million USD ten days following the violence. We highlight the role of media publicity in our results. We show that negative returns from assassinations are stronger during periods of low media pressure than when they coincide with competing newsworthy events. We also argue that our results are driven by events where companies are explicitly named in media publicity, using a set of placebo events where no firms were identified by news coverage. Furthermore, we reject that our results are driven by other forms of unrest and conflict. Finally, we show that activist assassinations are positively related to the royalties paid by firms to domestic governments.
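For readers unfamiliar with event studies, the sketch below shows one common way to compute abnormal returns. It is an illustration only: the study does not state its benchmark here, so the market model (firm return = alpha + beta × market return, estimated on a pre-event window) is an assumption, and the function names and all numbers are hypothetical.

```python
def market_model(firm_returns, market_returns):
    """Estimate alpha and beta by least squares on the estimation window."""
    n = len(firm_returns)
    mean_m = sum(market_returns) / n
    mean_f = sum(firm_returns) / n
    var_m = sum((m - mean_m) ** 2 for m in market_returns)
    cov = sum(
        (m - mean_m) * (f - mean_f)
        for m, f in zip(market_returns, firm_returns)
    )
    beta = cov / var_m
    return mean_f - beta * mean_m, beta


def cumulative_abnormal_return(firm, market, event_index, est_window, event_window):
    """Sum of abnormal returns over the event window.

    The benchmark is fitted on the est_window days before event_index, then
    each event-day abnormal return is actual minus predicted return.
    """
    start = event_index - est_window
    if start < 0 or event_index + event_window > len(firm):
        raise ValueError("windows fall outside the return series")
    alpha, beta = market_model(
        firm[start:event_index], market[start:event_index]
    )
    return sum(
        firm[t] - (alpha + beta * market[t])
        for t in range(event_index, event_index + event_window)
    )


# Made-up daily returns: the firm tracks the market, then loses an extra
# 2 percentage points on each of three event days.
market = [0.01, -0.02, 0.005, 0.015, -0.01, 0.0, 0.02, -0.005]
firm = [0.001 + 1.5 * m for m in market]
for day in (5, 6, 7):
    firm[day] -= 0.02
car = cumulative_abnormal_return(firm, market, event_index=5,
                                 est_window=5, event_window=3)
```

In practice the estimation window is much longer than five days and the cumulative abnormal returns are averaged across events and tested for significance.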