Lachlan O'Neill, Nandini Anantharama, Wray Buntine and Simon D. Angus
Empirical social science requires structured data. Traditionally, these data have arisen from statistical agencies, surveys, or other controlled settings. But what of language, political speech, and discourse more generally? Can text be data? Until very recently, the journey from text to data has relied on human coding, severely limiting study scope. Here, we introduce natural language processing (NLP), a field of artificial intelligence (AI), and its application to discourse analysis at scale. We introduce AI/NLP’s key terminology, concepts, and techniques, and demonstrate its application to the social sciences. In so doing, we emphasise a major shift in AI/NLP technological capability now underway, due largely to the development of transformer models. Our aim is to provide quantitative social scientists with both a guide to state-of-the-art AI/NLP in general, and something of a road-map for the transformer revolution now sweeping through the landscape.
Richard Bluhm, Christian Lessmann and Paul Schaudt
We study the link between subnational capital cities and urban development using a global data set of hundreds of first-order administrative and capital city reforms from 1987 until 2018. We show that gaining subnational capital status has a sizable effect on city growth in the medium run. We provide new evidence that the effect of these reforms depends on locational fundamentals, such as market access, and that the effect is greater in countries where urbanization and industrialization occurred later. Consistent with both an influx of public investments and a private response of individuals and firms, we document that urban built-up, population, foreign aid, infrastructure, and foreign direct investment in several sectors increase once cities become subnational capitals.
I study the impact of industrial policy on industrial development by considering a canonical intervention. Following a political crisis, South Korea dramatically altered its development strategy with a sector-specific industrial policy: the Heavy and Chemical Industry (HCI) drive, 1973-1979. With newly assembled data, I use the sharp introduction and withdrawal of industrial policies to study the impacts of industrial policy both during and after the intervention period.
- HCI promoted the expansion and dynamic comparative advantage of directly targeted industries.
- Using variation in exposure to policies through the input-output network, I show HCI indirectly benefited downstream users of targeted intermediates.
- I find direct and indirect benefits of HCI persisted even after the end of HCI, following the 1979 assassination of the president.
These effects include the eventual development of directly targeted exporters and their downstream counterparts. Together, my findings suggest that the temporary drive shifted Korean manufacturing into more advanced markets and created durable industrial change. These findings clarify lessons drawn from South Korea and the East Asian growth miracle.
Lachlan O'Neill, Simon D Angus, Satya Borgohain, Nader Chmait, David L Dowe
As the discipline has evolved, research in machine learning has been focused more and more on creating more powerful neural networks, without regard for the interpretability of these networks. Such "black-box models" yield state-of-the-art results, but we cannot understand why they make a particular decision or prediction. Sometimes this is acceptable, but often it is not.
We propose a novel architecture, Regression Networks, which combines the power of neural networks with the understandability of regression analysis. While some methods for combining these exist in the literature, our architecture generalizes these approaches by taking interactions into account, offering the power of a dense neural network without forsaking interpretability. We demonstrate that the models exceed the state-of-the-art performance of interpretable models on several benchmark datasets, matching the power of a dense neural network. Finally, we discuss how these techniques can be generalized to other neural architectures, such as convolutional and recurrent neural networks.
Do citizens hold their government accountable for the delivery of public goods? The literature has traditionally answered this question using temporally aggregated voting data. This paper proposes an alternative, fine-grained approach to explore the short-term dynamics underlying public sentiments towards governments, for 132 countries over the period 2002-2016.
Focusing on terror attacks as a government accountability shock, and using high-frequency, text-based event data to quantify public sentiments, I find that the average level of Public Discontent increases by approximately 14% in the 11 months following a successful terror attack. This effect is not merely driven by fear, and is influenced by information on government competence and attack-specific features. Citizens are less reproachful if the government made a reasonable effort to keep the public safe, and for events that may be beyond the government's control. Interestingly, young leaders and new leaders demonstrate an ability to mobilize the masses to rally 'round the flag in the aftermath of terror attacks.
Communication barriers and infant health: Intergenerational effects of randomly allocating refugees across language regions
Daniel Auer (U Mannheim & WZB), Johannes S. Kunz (Monash U)
This paper investigates the intergenerational effect of communication barriers on child health at birth using a natural experiment in Switzerland. We leverage the fact that refugees arriving in Switzerland originate from places that have large shares of French (or Italian) speakers for historical reasons and upon arrival are by law randomly allocated across states that are dominated by different languages but subject to the same jurisdiction.
Our findings, based on administrative records of all refugee arrivals and birth events between 2010 and 2017, show that children born to mothers who were exogenously allocated to an environment that matched their linguistic heritage are on average 72 grams heavier (or 2.2%) than those who were allocated to an unfamiliar language environment.
The differences are driven by growth rather than gestation and manifest in a 2.9 percentage point difference in low birth weight incidence. We find substantial dose-response relationships in terms of language exposure in both the origin country and the destination region. Moreover, French- (Italian-) exposed refugees only benefit from French- (Italian-) speaking destinations, but not vice versa. Contrasting the language match with co-ethnic networks, we find that high-quality networks are acting as a substitute rather than a complement.
Klaus Ackermann, Sefa Awaworyi Churchill and Russell Smyth
We examine the effects of mobile phone coverage on violent conflicts in Africa using a new monthly panel dataset on mobile phone coverage at the 55 × 55 km grid-cell level for 32 African countries covering the period from 2008 to 2018. The base rate of a conflict event in a month across our data set is 0.0039 with a standard deviation of 0.0620. We find that access to mobile phone coverage increases the probability of a conflict occurring in the next month by 0.0028. This finding is robust to a suite of sensitivity checks including the use of various specifications and alternative datasets.
We examine heterogeneity in the impact of mobile phone coverage across state-based conflict, non-state-based conflict and one-sided conflict, and find that our results are driven by non-state conflicts. We examine economic growth as a channel through which mobile phone coverage influences conflict. In doing so, we construct new satellite data for night-time light activity as a proxy for economic growth.
We find that economic activity is a channel through which mobile phone coverage influences conflicts, and that higher economic growth weakens the positive effect of mobile phone coverage on conflict.
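As a rough, back-of-the-envelope reading of the magnitudes quoted in the abstract above (this calculation is an illustration, not one taken from the paper), the estimated effect of 0.0028 can be compared with the base rate of 0.0039 and the standard deviation of 0.0620. The variable names below are chosen for illustration only.

```python
# Figures quoted in the abstract above.
base_rate = 0.0039  # monthly probability of a conflict event in a grid cell
std_dev = 0.0620    # standard deviation of the conflict indicator
effect = 0.0028     # estimated increase in next-month conflict probability

# Express the effect relative to the base rate and in standard-deviation units.
relative_increase = effect / base_rate
effect_in_sd = effect / std_dev

print(f"Relative to base rate: {relative_increase:.1%}")
print(f"In standard deviations: {effect_in_sd:.3f}")
```

Under this reading, the effect is large relative to how rare conflict events are in a given cell and month (roughly 72% of the base rate), while remaining small in absolute and standard-deviation terms.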
Johannes S. Kunz, Kevin E. Staub, and Rainer Winkelmann
Many applied settings in empirical economics require estimation of a large number of individual effects, like teacher effects or location effects; in health economics, prominent examples include patient effects, doctor effects, or hospital effects. Increasingly, these effects are the object of interest of the estimation, and predicted effects are often used for further descriptive and regression analyses. To avoid imposing distributional assumptions on these effects, they are typically estimated via fixed effects methods. In short panels, the conventional maximum likelihood estimator for fixed effects binary response models provides poor estimates of these individual effects since the finite sample bias is typically substantial. We present a bias-reduced fixed effects estimator that provides better estimates of the individual effects in these models by removing the first-order asymptotic bias. An additional, practical advantage of the estimator is that it provides finite predictions for all individual effects in the sample, including those for which the corresponding dependent variable has identical outcomes in all time periods (either all zeros or all ones); for these, the maximum likelihood prediction is infinite. We illustrate the approach in simulation experiments and in an application to health care utilization. The installation documents for the Stata command are found here.
Sascha O. Becker, Francisco Pino, Jordi Vidal-Robert
The Protestant Reformation in the early 16th century challenged the monopoly of the Catholic Church. The printing press helped the new movement spread its ideas well beyond the cradle of the Reformation in Luther’s city of Wittenberg. The Catholic Church reacted by issuing indexes of forbidden books which blacklisted not only Protestant authors but all authors whose ideas were considered to be in conflict with Catholic doctrine.
We use newly digitized data on the universe of books censored by the Catholic Church during the Counter-Reformation, containing information on titles, authors, printers and printing locations. We classify censored books by topic (religion, sciences, social sciences and arts) and language and record when and where books were indexed. Our results show that Catholic censorship did reduce printing of forbidden authors, as intended, but also negatively affected the diffusion of knowledge and city growth.
When faced with intense domestic turmoil, governments may strategically engage in foreign interactions to divert the public’s attention away from pressing domestic issues. I test this hypothesis for a globally representative sample of 190 countries, at the monthly level, over the years 1997-2014. Using textual data on media-reported events retrieved from the GDELT database, I find robust evidence that governments resort to diversionary tactics in times of domestic turmoil and that such diversion takes the form of verbally aggressive foreign interactions, typically targeted at ‘weak’ countries and countries closely linked along religious, linguistic and geographic dimensions. Strategically important trade partners are unlikely to be victimized. These findings suggest that diversionary foreign policy is, in fact, systematically practised by governments as a strategic tool, and that such diversion is exercised in a manner that may not lead to large-scale costs or risks of retaliation.
Sascha O. Becker, Volker Lindenthal, Sharun Mukand, Fabian Waldinger
We study the role of professional networks in facilitating the escape of persecuted academics from Nazi Germany. From 1933, the Nazi regime started to dismiss academics of Jewish origin from their positions. The timing of dismissals created individual-level exogenous variation in the timing of emigration from Nazi Germany, allowing us to estimate the causal effect of networks for emigration decisions. Academics with ties to more colleagues who had emigrated in 1933 or 1934 (early émigrés) were more likely to emigrate. The early émigrés functioned as “bridging nodes” that helped other academics cross over to their destination. Furthermore, we provide some of the first empirical evidence of decay in social ties over time. The strength of ties also decays across space, even within cities. Finally, for high-skilled migrants, professional networks are more important than community networks.
Sascha O. Becker, Cheongyeon Won
In the mid-19th century, pre-colonial Korea under the Joseon dynasty was increasingly isolated and lagging behind in its economic development. Joseon Korea was forced to sign unequal treaties with foreign powers, as a result of which Christian missionaries entered the country and contributed to the establishment of private schools. We show that areas with a larger presence of Christians had higher literacy rates in 1930, during the Japanese colonial period. We also show that a higher number of Protestants is associated with higher female literacy, consistent with a stronger emphasis on female education in Protestant denominations.