1 Introduction
One of the main challenges in comparative studies on populism is how to measure it across a large number of cases, including several countries and parties within countries. Previous literature has explored this possibility using different methods, textual analysis among them (Armony 2005; Hawkins 2009; Jagers and Walgrave 2007; Ribera Payá 2019; Rooduijn and Pauwels 2011; Wettstein et al. 2020). The advent of machine learning has cleared the way for further research in this direction, allowing faster data processing and more accurate predictions. Text-as-data approaches based on automated tools are useful for investigating differentiated political questions because they make it possible to analyze large quantities of data with fewer resources, infer actors’ positions directly from the texts, and obtain more replicable results. Given these features, an increasing number of studies on comparative populism have relied on computer-assisted textual analysis through supervised learning (Hawkins et al. 2018). Using text-as-data to measure populism has several advantages. For example, it allows focusing on the elites and their ideas; measuring populism across a large number of cases, within and between countries; and obtaining continuous populism measures which, unlike dichotomous ones, better account for the multi-dimensionality of populism and differentiate between its degrees (Meijers and Zaslove 2020a). However, most of the methods proposed to date are resource-intensive or suffer from structural limitations, particularly when they rely heavily on human annotation for the analysis of vast corpora. Valuable contributions, such as expert surveys or extensive human-coded works, are expensive both financially and in the time needed to obtain results. Consequently, these contributions might be inadequate to capture the rapid changes and transformations of the party landscape.
Here, we propose a method for measuring populism based on Supervised Machine Learning (Hindman 2015; Ho 1995), drawing on techniques commonly used in Natural Language Processing. We show that the use of text data (Laver, Benoit, and Garry 2003) and machine learning can significantly improve research in this field and reduce limitations inherent in human-coding techniques. Focusing on six western European countries that exhibit a long-standing tradition of populist parties (Italy, France, Spain, Germany, Austria, and the Netherlands), we used a Random Forest classification algorithm (Breiman 2001) to derive a score of populism for every party observed. To obtain the score, we performed text analysis on
The use of electoral manifestos for measuring populism is not uncontested. One of the standard arguments against their use is that they might show lower levels of populist rhetoric relative to other types of text, for example, party magazines (Pauwels 2017). However, their use is grounded in the literature (e.g., Rooduijn and Pauwels 2011; Rooduijn and Akkerman 2017) and meets some practical needs. Even if party manifestos are seldom widely read, they are official documents and offer the advantage of exploring parties’ discourses as institutions, rather than focusing on leaders who might promote narratives that significantly differ from those of their parties (Hawkins et al. 2018). Furthermore, they convey party arguments in given times (Rooduijn and Pauwels 2011) and show the type of engagement that parties have with their electorates. They show how political actors use economic, social, and psychological crises as leverage for their electoral campaigns and set the boundaries between what parties have promised to do and what they have done. Not only are they the documents that summarize party stances while addressing a broader audience, but they are also the documents that (besides the speeches) are produced and made public with similar goals across cases (Hawkins et al. 2018). Unlike other types of text, such as speeches, manifestos are also easy to access. The ease of collecting them and their other characteristics, above all the comparability of the textual source, make them suitable for comparative analyses that aim to obtain a refined time line of party positions (Klemmensen, Hobolt, and Hansen 2007), clearing the way for consistent, valid, and reliable temporal and spatial comparisons of levels of populism across parties (Hawkins et al. 2019).
Due to the absence of a monolingual comprehensive corpus, we trained six different models, one for each country. The algorithm was trained by assigning labels to chunks of text depending on whether they were drawn from manifestos of populist or nonpopulist parties. For our working definition of populism, we adhered to the broad ideational approach (Hawkins and Kaltwasser 2017; Hawkins and Littvay 2019) and defined “populists” as those parties that understand politics as a Manichean struggle between a reified will of the people and a corrupt, conspiring elite (Hawkins 2009). The algorithm was trained on
Our method addresses four main issues associated with measuring populism across parties. First, it allows for the measurement of a great selection of parties without resource-intensive human-coding processes. Second, it ensures its measurement across space and over time, allowing for comparative temporal and spatial analyses (however, based on contemporary classifications). Third, as a continuous measure, it allows for obtaining more accurate analyses of the party landscape, reducing the risk of classifications that could be arbitrary (Meijers and Zaslove 2020a). Finally, unlike other more resource-intensive methods, it easily allows for obtaining updated results every time that a new party enters the political arena or researchers want to measure if (and to what extent) a party is populist or became “more” populist over time.
To show how the score can be applied, we studied trends across countries and over time using average levels of populism from the early 2000s for nearly two decades. Results show that the average amount of populism has significantly increased in Italy, whereas other countries show weaker growth or uneven trends. Our results suggest that textual data are a promising tool for expanding political research possibilities on measuring populism and its trajectories.
The paper develops as follows: in Section 2, we introduce the literature on populism and its measurement. In Section 3, we describe the datasets used to train the Random Forest algorithm. In Section 4, we describe how we trained the algorithm, the preprocessing procedure of the data, the algorithm accuracy, and the final derivation of the score. In Section 5, we validate the score, comparing it with expert-surveys’ scores and other text-based measures of populism. We discuss the possibility of using different datasets (political speeches and manually coded data) for the score derivation. Finally, we show how it can be used to describe countries’ temporal evolution of populism levels.
2 Populism and Its Measurement
The issue of defining populism has been the core of several studies, each of them highlighting the difficulty in finding a shared conceptualization of this phenomenon. The unclear nature of this term has led to an abundance of definitions in books, papers, and articles (for an extensive review on this, see Gidron and Bonikowski (2013) and Hawkins et al. (2018)). While some of them pivot around organizational features such as strong leadership or top-down mobilization (Weyland 2001), others highlight the centrality of economic aspects, for example, the promotion of unsustainable redistributive policies (Acemoglu, Egorov, and Sonin 2013; Guiso et al. 2017) or discursive elements such as the presence of a moral and Manichean language nourishing people’s opposition against the elite (Mudde 2004). A common approach to the definition of populism is the “ideational approach.” It sees populism as a set of ideas understanding politics as a Manichean struggle between a reified will of the people and a conspiring elite (Hawkins 2009; Hawkins et al. 2018). It entails the combined presence of three features (Hawkins and Kaltwasser 2017): the Manichean and moral cosmology, the depiction of the people as homogeneous and virtuous, and the elite’s depiction as selfish and corrupt (Hawkins et al. 2018). The simplicity of populism’s set of ideas allows it to adapt to different contexts. Accordingly, several “varieties” of populism develop based on the most relevant social grievances politicized by populist forces in each society (Caiani and Graziano 2019; Hawkins and Kaltwasser 2017). 
Consistent with the ideational approach, we considered populism as a set of ideas expressed through political texts (e.g., manifestos or speeches), which exalt popular sovereignty and understand the political field as a struggle between “the people” and “the elite.” This definition rests on the assumptions that parties’ populism and its levels can be assessed via textual analysis on political corpora (Bonikowski and Gidron 2016; Deegan-Krause and Haughton 2009; Hawkins 2009) and are not necessarily stable. There can be substantial variation in the presence of populist claims in different temporal and spatial settings. This means that political actors might not always exhibit the same levels of populism over time.
2.1 Measuring Populism with Textual Analysis
Extant literature has made extensive use of texts to investigate differentiated political questions, including inferring parties’ and leaders’ political positions. Laver et al. (2003) and Slapin and Proksch (2008) estimated political positions using word frequencies in party manifestos. Stewart and Zhukov (2009) used public statements by Russian leaders to understand whether military or political elites influence Russia’s decisions to intervene in neighboring countries. Eggers and Spirling (2011) relied on parliamentary debates to analyze exchanges among politicians in the British House of Commons. Debus and Gross (2016) inferred local actors’ policy preferences based on information in local parties’ manifestos. Given the advantages of adopting a text-as-data approach (e.g., the possibility of analyzing large quantities of data with fewer resources, inferring actors’ positions directly from their texts, and obtaining more replicable results), it is not surprising that textual analysis is increasingly used to study (and measure) populism across parties. Researchers have relied on electoral manifestos, blogs, websites, leaders’ tweets, speeches, posts, and newspapers to infer and quantify parties’ amount of populism (Aslanidis 2018; Bonikowski and Gidron 2016; Bracciale and Martella 2017; Engesser et al. 2017; Hawkins et al. 2019; Hawkins, Riding, and Mudde 2012; Herkman and Matikainen 2019; Jagers and Walgrave 2007; Stulik 2019).
There are at least two reasons for supporting the use of textual analysis for measuring populism. First, it allows for focusing on the elites and their ideas, and, second, it allows for measuring populism across a large number of cases, within and between countries. However, notwithstanding the relevant achievements of the methodological contributions proposed to date, some of them have structural limitations, particularly when they rely heavily on human annotation for the analysis of vast corpora. Reducing biases related to the evaluations of individual readers (Ray 1999) and ensuring intercoder reliability require numerous coders involved in resource-intensive coding processes, and such resources are not always available. In dictionary-based approaches, choosing one dictionary or another can lead to substantially different results (Aslanidis 2018), and even establishing whether the dictionary is valuable or not is far from easy (Grimmer and Stewart 2013).
Furthermore, comparative analyses by their nature involve multilingual datasets, and the coding of such extensive corpora can hardly be done by one or two researchers alone. The advent of automated text methods might help overcome some of these limitations by allowing for the analysis of extensive collections of text with limited resources and in a short time (for a discussion and comparison of these techniques, see Wilkerson and Casas (2017)). The cross-fertilization between political science and Natural Language Processing (or related fields) has already shown its potential. For example, Born and Janssen (2020) used computational linguistics and computer science approaches to analyze MPs’ speeches and infer their positions by estimating distance measures between speeches. Gross and Jankowski (2020) relied on semi-automated content analysis techniques to detect dimensions of political conflict at the local German level using local manifestos. A valuable contribution that shows the potential of using automated text analysis for measuring populism comes from Hawkins and Castanho Silva (Hawkins et al. 2018). They used machine learning techniques to perform supervised classification of 154 documents comprising speeches and manifestos, using “holistic grading” for training the algorithm (Hawkins 2009). Holistic grading is a human-based approach that aims to evaluate the text as a whole and is used in educational psychology for assessing students’ writing (White 1985). The statistical model they developed for analyzing texts is based on a comparison of word frequencies; it weights the words that predict whether a document should be classified as populist or not. After comparing results with those obtained via human coding, they concluded that computerized text analysis could potentially be successful in identifying populism provided that there are bodies of data large enough to train the models (Hawkins et al. 2018).
3 The Dataset
The dataset includes
Building an alternative corpus comprising leaders’ speeches would be an advancement, given that populism can also vary across different types of texts (Hawkins and Littvay 2019). This instability, or perhaps document specificity, stands out as a significant weakness of textual approaches to the measurement of populism (Zaslove and Meijers 2019), and cross-comparisons between speeches and manifestos could yield relevant insights into this issue. Indeed, for Italy only, we also had a corpus of
3.1 Data Processing
We prepared the dataset following standard procedures in automated text analysis (for more details on preprocessing, see Kannan and Gurusamy (2014)). We split each national corpus into sentences according to the structure of the electoral programs and the language. Sentences were preprocessed, turning all words to lowercase, and removing punctuation, numbers, and stop words (e.g., and, but, or, and that). We stemmed the remaining words and removed unnecessary space between words. We then converted each sentence into a “bag-of-words,” in which the words’ order is irrelevant. A bag-of-words is a vector
Table 1 Area under the receiver operating characteristic curve (AuROC) and F1-scores for validation and testing, together with the number of sentences and the fraction of sentences belonging to populist manifestos for each country. For validation, the values shown represent the mean and standard deviation of the AuROC over the different splits of the K-Fold cross-validation.
| Country | AuROC (Valid.) | F1 (Valid.) | AuROC (Test) | F1 (Test) | N sentences | Frac pop sentences |
|---|---|---|---|---|---|---|
| Austria | | | | | | |
| France | | | | | | |
| Germany | | | | | | |
| Italy | | | | | | |
| The Netherlands | | | | | | |
| Spain | | | | | | |
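The preprocessing steps described in Section 3.1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the paper: the stop-word list is a tiny English placeholder, `crude_stem` is a hypothetical stand-in for a proper language-specific stemmer, and the function names are our own.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption); real pipelines use a
# complete, language-specific list.
STOP_WORDS = {"and", "but", "or", "that", "the", "a", "of", "to", "in"}

def crude_stem(word):
    """Strip a few common English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(sentence):
    """Lowercase, drop punctuation and numbers, remove stop words, stem."""
    text = sentence.lower()
    # Replace every run of non-letters (punctuation, digits, underscores)
    # with a single space; accented letters are preserved.
    text = re.sub(r"[\W\d_]+", " ", text)
    tokens = text.split()  # split() also discards redundant whitespace
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]

def bag_of_words(tokens, vocabulary):
    """Count vector over a fixed vocabulary; word order is discarded."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]

tokens = preprocess("The elites betrayed the people in 2018!")
vector = bag_of_words(tokens, ["people", "elit", "tax"])
```

Each position of `vector` counts how often one vocabulary term occurs in the sentence, so two sentences with the same words in a different order map to the same vector.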
4 Methods
For the score derivation, we resorted to a classification algorithm capable of discriminating between sentences belonging to populist or nonpopulist parties’ manifestos of a given country. The final party score is the fraction of its manifesto’s sentences that the classifier considers as belonging to a prototypically populist party manifesto in its nation. The classification algorithm we adopted was the Random Forest algorithm (Breiman 2001), which offers the advantage of ensuring accurate predictions in the case of nonlinear relationships (McAlexander and Mentch 2020). This characteristic has supported its use for casting predictions within many topics, including voting behavior, partisanship, and political sentiments (Ansari et al. 2020; Bindi et al. 2018; Bustikova et al. 2020). We show a synthetic representation of the score computation’s procedure in Figure A in the Supplementary Material. As the choice of the Random Forest algorithm is arbitrary, we show some results also for other classification algorithms, namely a Logistic Regression, a Feedforward Neural Network, and a Gradient Boosting algorithm.
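The score definition above reduces to a simple proportion. The sketch below assumes that a trained classifier has already produced one binary label per sentence (1 for "populist," 0 otherwise); the party names and labels are hypothetical.

```python
def party_score(sentence_labels):
    """Fraction of a manifesto's sentences labeled populist (1) by the classifier."""
    if not sentence_labels:
        raise ValueError("manifesto has no sentences")
    return sum(sentence_labels) / len(sentence_labels)

# Hypothetical classifier output: 1 = populist, 0 = nonpopulist.
predicted = {
    "Party A": [1, 0, 1, 1],
    "Party B": [0, 0, 0, 1],
}

scores = {party: party_score(labels) for party, labels in predicted.items()}
# Rank parties from most to least populist according to the score.
ranking = sorted(scores, key=scores.get, reverse=True)
```

Because the score is a proportion, it is bounded between 0 and 1 and does not depend directly on the number of sentences in a manifesto.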
4.1 Training of the Algorithm
In the absence of a monolingual corpus, we performed separate training for each country, obtaining six different models. With the labeled text data, we built models capable of assigning to each chunk of text its corresponding label (Alpaydin 2020). Considering the training set of a country, we performed a “Grid Search” over a set of hyperparameters of the Random Forest algorithm to find the best combination according to a classification accuracy metric. In other words, we iterated over all the combinations of the chosen hyperparameters and selected the most accurate one. The Random Forest algorithm is well known to perform well with the standard settings found in many software packages (Probst, Wright, and Boulesteix 2019). However, tuning hyperparameters can still improve the classification accuracy in many tasks (Bernard, Heutte, and Adam 2009). We chose the set of hyperparameters among typical values for the Random Forest algorithm, and we show them in Table C in the Supplementary Material.
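The exhaustive search described above can be sketched generically. In this illustration, `evaluate` is a hypothetical callback that stands in for the accuracy estimate of a model trained with a given combination; the grid values and the toy scoring function are placeholders, not the values reported in the Supplementary Material.

```python
from itertools import product

def grid_search(param_grid, evaluate):
    """Try every combination in param_grid and keep the best-scoring one.

    param_grid: dict mapping a hyperparameter name to its candidate values.
    evaluate:   function taking a dict of hyperparameters and returning a
                score where higher is better.
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Placeholder grid and toy scoring function (assumptions for illustration).
grid = {"n_estimators": [100, 200, 500], "max_depth": [5, 10, 20]}

def toy_evaluate(params):
    return -abs(params["n_estimators"] - 200) - abs(params["max_depth"] - 10)

best_params, best_score = grid_search(grid, toy_evaluate)
```

The cost grows with the product of the candidate-list lengths (nine evaluations here), which is why the grid is usually restricted to a few typical values per hyperparameter.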
We estimated the classification accuracy for each combination of hyperparameters using K-Fold cross-validation. The training set is initially split into
Once we found the best combination of hyperparameters, we retrained the model on the whole training set. Finally, we used the whole training set to find the best threshold for the probabilities given by the Random Forest algorithm using the Receiver Operating Characteristic curve and the Youden index (Ruopp et al. 2008). This latter procedure further increases the model’s final accuracy by choosing the best combination of true positive and false positive rates. The average values of AuROC for the best hyperparameters are shown in the “AuROC (Valid.)” column of Table 1, while Table D in the Supplementary Material shows the best hyperparameters’ values for each nation.
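The threshold selection can be illustrated with a short sketch. The Youden index is J = TPR − FPR, and the chosen threshold is the one that maximizes J along the ROC curve. The function name and the toy labels and probabilities below are assumptions for illustration only.

```python
def youden_threshold(labels, probabilities):
    """Return (threshold, J) maximizing Youden's J = TPR - FPR.

    labels:        true classes, 1 = populist, 0 = nonpopulist.
    probabilities: predicted probability of the populist class.
    A sentence is predicted populist when its probability >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    best_threshold, best_j = None, float("-inf")
    for threshold in sorted(set(probabilities)):
        predicted = [p >= threshold for p in probabilities]
        true_pos = sum(1 for y, hat in zip(labels, predicted) if y == 1 and hat)
        false_pos = sum(1 for y, hat in zip(labels, predicted) if y == 0 and hat)
        j = true_pos / positives - false_pos / negatives
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j

# Toy data (assumption): the classes are perfectly separable at 0.6.
labels = [0, 0, 1, 0, 1, 1]
probabilities = [0.1, 0.2, 0.6, 0.4, 0.7, 0.9]
threshold, j = youden_threshold(labels, probabilities)
```

Only the observed probabilities need to be tried as thresholds, because the true and false positive rates change only at those values.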
While the presented practice is quite common in Machine Learning, it can lead to underestimating the cross-validation variance (Bengio and Grandvalet 2004). Better but more computationally intensive practices can be adopted (Cawley and Talbot 2010); however, their application goes beyond the scope of this work. Since the choice of the Random Forest algorithm was somewhat arbitrary, we repeated the training for the Italian corpus using a logistic regression (Muchlinski et al. 2016), a gradient boosting algorithm (Friedman 2001), and a feedforward neural network (Rumelhart, Hinton, and Williams 1986), showing that the prediction results are largely unchanged. Results and information about the models’ hyperparameters are reported in the Supplementary Material (from Table F to Table I).
5 Results
5.1 Testing the Model and Building the Score
Before building the score, we tested the accuracy of the Random Forest with the best parameter sets found for each country. Thus, we used the six country-specific models to classify all the sentences in the test set, and we computed the corresponding AuROCs. For completeness’ sake, we also computed the F1-score for the validation and test sets, which can be used as an alternative accuracy score for the Grid Search. Table 1 shows the AuROCs and the F1-scores for the test and validation sets. While AuROCs for the test sets are not far from the corresponding average validation score, the F1-scores for the test sets are generally higher than those for the validation sets. This fact is due to the Youden index (Ruopp et al. 2008) method that selects a reasonable threshold from the validation sets and increases the test sets’ accuracy. Finally, we also classified all the sentences of the parties excluded from the training. We built the parties’ scores by computing the fraction of the sentence classified as
Figure 1
Example of how parties can be ranked by their relative score. The scores are derived from training one model for each country and refer to the last national election available.
[Figure omitted. See PDF]
5.2 Validation of the Score
We validated the score using two different approaches. On the one hand, we relied on populism-related dimensions drawn from expert surveys (i.e., CHES (Polk et al. 2017) and POPPA (Meijers and Zaslove 2020b)) and the GPD database score (Hawkins et al. 2019). We selected two dimensions relevantly connected to populism from the 2017 CHES and five attributes of populism from the 2018 POPPA. In the CHES, a team of experts estimate the party positioning of national parties regarding integration, ideology, and policy issues in several European countries. In the 2017 wave, Austria is not covered, and we only have data for five of the countries in our analysis. In the POPPA,
5.2.1 Validation with CHES
For the validation, we first relied on the 2017 CHES (Polk et al. 2017). The two CHES dimensions that we selected are “anti-elite salience” and “people vs elite.” Anti-elitism, which is commonly used in the narrative of challenger parties in general (Hobolt and de Vries 2015), can be defined as an explicit attack on “the elites,” portrayed as a homogeneous power bloc (Zulianello 2019). As for “people vs elite,” according to the 2017 CHES codebook, it measures the positions of direct vs representative democracy. However, support for referendums does not constitute a defining feature of the ideational approach, and the way the question was framed can be misleading (Meijers and Zaslove 2020a). Framing the question as it was in the survey gives the impression that populists, by definition, oppose representative democracy and implies that populism is associated with plebiscitary democracy. Nevertheless, we decided to use this attribute because of the close relationship between populism and referendums. For example, populist parties consider themselves the saviors of democracy and claim that direct democracy can help them save the people from the elites (Jacobs, Akkerman, and Zaslove 2018).
We excluded from the validation the Spanish regionalist parties, since they stand out as outliers; furthermore, their manifestos are sometimes in Catalan. We also excluded FI as the score that we have for the Italian 2013 national elections refers to the People of Freedom (PdL), and even if Berlusconi was the main leader of this party, it also included National Alliance (AN), plus some other minor parties. Figure 2 shows the correlations and
Figure 2
Correlation between the score and the relevant dimensions of the 2017 Chapel Hill Expert Survey (Polk et al. 2017) for left-wing parties (L), centrist/other parties (O), right-wing parties (R), and all parties (P). Horizontal bars represent the
[Figure omitted. See PDF]
5.2.2 Validation with POPPA
We repeated the validation process using the POPPA dataset (Meijers and Zaslove 2020b). The attributes that we selected for validating our score are considered as the five components of populism according to the ideational approach by Meijers and Zaslove (2020a). These attributes include the Manichean vision of politics, the indivisibility of the ordinary people, people’s general will, people-centrism, and anti-elitism. Figure 3 presents correlations between our score and the selected dimensions for parties distributed according to the left–center–right classification.
Figure 3
Correlation between the score and the relevant dimensions of the 2018 POPPA (Meijers and Zaslove 2020b) for left-wing parties (L), centrist/other parties (O), right-wing parties (R), and all parties (P). Horizontal bars represent the
[Figure omitted. See PDF]
Since a measurement of the ideational approach to populism is valid when it captures its five components (Hawkins et al. 2018), we also validated our score against a latent populism variable constructed using the five POPPA dimensions as suggested by Meijers and Zaslove (2020a). We performed an iterated principal exploratory factor analysis on the mean expert judgment on the five items operationalizing populism to build the latent variable. We then summed all the dimensions after weighting them by their value in the first factor. Pearson’s coefficient when looking at the correlation between the score of populism and the latent populism variable is
Figure 4
Correlation between the score and the latent populism variable built on the five relevant dimensions of populism in the 2018 POPPA (Meijers and Zaslove 2020b). These dimensions are the Manichean vision of politics, the indivisibility of the ordinary people, people’s general will, people-centrism, and anti-elitism. Horizontal bars represent the
[Figure omitted. See PDF]
5.2.3 Validation with GPD
For further validation of the score, we used the GPD by Hawkins et al. (2019). For the validation, we only used the nine parties available in both our database and theirs for the same years. If we exclude the Spanish Socialist Workers’ Party (PSOE), which stands out as an outlier, there is a significant correlation between the two scores (
5.2.4 Comparison with Leader Speeches’ and Manually Coded Datasets
To check whether our methodology can be applied to different textual sources and lead to substantially different outcomes when using a corpus made of manifestos rather than one comprising leaders’ speeches, we repeated the analysis, building a score using
5.3 Temporal Evolution of the Populist Score
As a first application, we used the score for checking countries’ variations over time by measuring the average aggregate level of parties’ populism per year. We excluded from the analysis parties that gained less than
Figure 5
Trends in the average amount of populism using the score. Parties that gained less than
[Figure omitted. See PDF]
Figure 6
Evolution of the populist score for Austrian People’s Party (ÖVP—Austria), Green Left (GL—The Netherlands), The Left (Linke—Germany), and People’s Party (PP—Spain) in time.
[Figure omitted. See PDF]
6 Discussion
Recent years have seen a growth in “methodological populism,” which attempts to measure party populism systematically and comparatively (Hawkins et al. 2018). This paper adds to the existing literature by proposing a systematic method for measuring parties’ levels of populism using a text-as-data approach based on Supervised Machine Learning. Unlike other methods based on computer-assisted textual analysis (e.g., holistic grading in Hawkins et al. (2018)), the methodology that we proposed is based on the observation of units of text and not the text as a whole. Furthermore, it measures parties’ rather than leaders’ levels of populism; and does not rely on human-coding, nor does it require coders to assign scores to texts based on the elements that define the context of populism. Moreover, unlike dictionary-based approaches (see, e.g., Rooduijn and Pauwels (2011)), our method does not rely on the use of dictionaries, leaving out potential concerns about the validity and the selection of the dictionary used. Hence, it reduces the risk of arbitrary classification of parties.
It offers four main advantages. First, it classifies a vast number of parties by identifying their levels of populism (if any) without resource-intensive human-coding processes. Second, it obtains a party score to perform temporal and spatial analyses of populism, a feature that can lead to significant advancements in comparative studies. Third, it provides a continuous score of parties’ populism. Continuous measures help avoid conceptual confusion on whether populism is “sincere” or “strategic,” clearing the way for more fine-grained analyses of the correlates of populism and reducing the risk of arbitrary classification (Meijers and Zaslove 2020a). Fourth, unlike other methods for measuring populism, it obtains updated and fast results with a low allocation of time and resources. Furthermore, our method allows text analysis to be performed even when researchers have little or no polyglot knowledge, an element that is crucial in the perspective of spatial comparisons.
We validated our populism scores by comparing them with some populism-related dimensions of the 2017 CHES (Polk et al. 2017) and 2018 POPPA datasets (Meijers and Zaslove 2020b). We also validated the scores using the GPD (Hawkins et al. 2019), although we highlighted that the cross-validation should be based on the same corpus or a speech corpus. The scores are significantly correlated with the main attributes of populism, anti-elitism, and people-centrism in particular, as well as with a latent variable of populism built upon the five dimensions of populism proposed by Meijers and Zaslove (2020a).
We also checked the method’s robustness by repeating the analysis with different classification algorithms, which produced only small variations in the results. In the Italian case, we showed that scores measured using sentences from manifestos and those measured using leader speeches are highly consistent. We only focused on Italy, since the collection of a corpus made of speeches remains a difficult task. Finding videos preceding the expansion of social media and the diffusion of modern smartphones is not easy, and not all leaders are on YouTube. When the process cannot be automated, manual transcription is time-consuming and requires excellent knowledge of the languages involved. Furthermore, we explored the potential of extending the method by using manually coded populist sentences from the Italian corpus for training the algorithms. While this method applied to a monolingual corpus would allow for a general score independent from the nation-specific ones, we did not see large variations in the score’s estimation.
Finally, we showed a small application of the score by performing a spatial and temporal comparative analysis of populism for Italy, France, Spain, Austria, Germany, and the Netherlands from the early 2000s for nearly two decades. We found significant differences among the countries, with populism increasing sharply in Italy while showing uneven trends in the other countries. Our first application of the score highlighted the importance of exploring the relatively untapped potential of continuous measures to investigate a wide range of populism-related issues, such as the populist zeitgeist (Mudde 2004), how the different attributes of populism (e.g., anti-elitism, people-centrism, and general will) evolved, or the relationship between populism and the economic (and sociopolitical) crises in a temporal and spatial perspective (Caiani and Graziano 2019; Kriesi and Pappas 2015). The method can also be used to examine a larger temporal interval, other types of textual sources, or other kinds of political and social phenomena.
Our method has some limitations that might be overcome in future developments of the present work. First, the different lengths of party manifestos could lead to less accurate estimates of the score, because the longer the manifesto, the higher the probability of covering more topics; the presence or absence of these topics could affect the score. However, we did not control for manifestos’ length at this stage. Segmenting the data so that each party is represented by a set of sentences belonging to specific topics could help solve this issue.
Second, in the absence of a monolingual corpus, that is, a corpus in which all manifestos have been translated into the same language, we performed a separate analysis for each country, training six different models. Besides adding complexity to the derivation of the score, the country-specific nature of populism can cause populism scores to have different scales. Therefore, a cross-country comparison would be more precise if it relied on an integrated model trained on a single monolingual corpus.
Third, although a simple bag-of-words representation of sentences yielded convincing results, this representation suffers from some shortcomings. For instance, the vocabulary must be carefully designed to keep its size manageable, because the size affects the sparsity of the document representations. Furthermore, by discarding word order, bag-of-words ignores the context and, therefore, the meaning of words in the document. We argue that more refined representations, such as Term Frequency-Inverse Document Frequency, which rescales the frequency of words (Baeza-Yates and Ribeiro-Neto 1999), or word embeddings, in which words with similar meanings have a similar representation (Li and Yang 2018), might enhance the accuracy.
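To make the contrast between raw counts and rescaled frequencies concrete, the following minimal sketch builds both representations in plain Python. It is an illustration only, not the pipeline used in this work: the whitespace tokenizer, the toy sentences, and the unsmoothed weight log(N/df) are assumptions chosen for brevity, and common libraries apply smoothed variants of this weighting.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return sentence.lower().split()


def bag_of_words(sentences):
    """Return the sorted vocabulary and one raw count vector per sentence."""
    vocab = sorted({tok for s in sentences for tok in tokenize(s)})
    rows = []
    for s in sentences:
        counts = Counter(tokenize(s))
        rows.append([counts[t] for t in vocab])
    return vocab, rows


def tf_idf(sentences):
    """Rescale raw counts by log(N / df), one common unsmoothed IDF variant."""
    vocab, rows = bag_of_words(sentences)
    n = len(sentences)
    # Document frequency: number of sentences containing each vocabulary term.
    df = [sum(1 for row in rows if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n / d) for d in df]
    weighted = [[c * w for c, w in zip(row, idf)] for row in rows]
    return vocab, weighted


# Hypothetical toy sentences, invented for illustration.
sentences = [
    "the people against the elite",
    "the elite ignore the people",
    "the budget funds the schools",
]
vocab, counts = bag_of_words(sentences)
_, weights = tf_idf(sentences)

# Word order is discarded: both orderings map to the same count vector.
_, permuted = bag_of_words(["people against elite", "elite against people"])
```

In this toy corpus, "the" occurs in every sentence, so its rescaled weight is zero, while a term confined to a single sentence, such as "against", keeps a positive weight. The two reordered sentences at the end receive identical count vectors, which is the loss of context described above.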
Furthermore, the score seems to show that all parties can potentially exhibit some level of populism. This shortcoming could arise because all manifestos contain some common sentences or expressions, so that a small number of sentences from nonpopulist manifestos could equally belong to populist ones. This effect can be considerably reduced by relying more extensively on manually annotated corpora or by limiting the analysis to sentences belonging only to specific topics. Finally, the use of other types of corpora, such as tweets or Facebook posts, might allow for more fine-grained temporal analyses and help detect significant turning points over the years more precisely.
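The point that even nonpopulist parties receive a nonzero score can be illustrated with a small sketch. It assumes, for illustration only, that a party's score is the share of its manifesto sentences that a classifier labels as populist-like; the function name, the labels, and the topic tags below are hypothetical and are not data from this study.

```python
def populism_score(labels):
    """Share of sentences labelled 1 (populist-like) in a list of 0/1 labels."""
    if not labels:
        raise ValueError("at least one sentence label is required")
    return sum(labels) / len(labels)


# Hypothetical classifier output for a nonpopulist party: 100 sentences,
# 4 of which use generic expressions that the classifier flags as populist-like.
labels = [1] * 4 + [0] * 96
# Hypothetical topic tags: the flagged sentences all sit in a generic section.
topics = ["preamble"] * 4 + ["economy"] * 48 + ["welfare"] * 48

score_all = populism_score(labels)

# Restricting the analysis to specific topics removes the generic sentences.
kept = [lab for lab, top in zip(labels, topics) if top in {"economy", "welfare"}]
score_topics = populism_score(kept)
```

Under these assumptions, the full manifesto scores 0.04 although the party is nonpopulist, and the score drops to 0.0 once the generic sentences are excluded, which mirrors the topic-based restriction suggested above.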
Corresponding author Jessica Di Cocco
Edited by Jeff Gill
© The Author(s) 2021. Published by Cambridge University Press on behalf of the Society for Political Methodology. This work is licensed under the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
One of the main challenges in comparative studies on populism concerns its temporal and spatial measurements within and between a large number of parties and countries. Textual analysis has proved useful for these purposes, and automated methods can further improve research in this direction. Here, we propose a method to derive a score of parties’ levels of populism using supervised machine learning to perform textual analysis on national manifestos. We illustrate the advantages of our approach, which allows for measuring populism for a vast number of parties and countries without resource-intensive human-coding processes and provides accurate, updated information for temporal and spatial comparisons of populism. Furthermore, our method allows for obtaining a continuous score of populism, which ensures more fine-grained analyses of the party landscape while reducing the risk of arbitrary classifications. To illustrate the potential contribution of this score, we use it as a proxy for parties’ levels of populism, analyzing average trends in six European countries from the early 2000s for nearly two decades.
Details
1 Department of Economics, Sapienza University, Via del Castro Laurenziano 19, 00161 Rome, Italy
2 Sony Computer Science Laboratories, 6 Rue Amyot, 75005 Paris, France





