This is an extended version of the Boulder talk that I gave at UMass Amherst 12/7/12. It also includes new work on UN roll call data (joint with Skyler Cranmer and Bruce Desmarais).
Partition Decoupling for roll call data (2)
1. Partition Decoupling
for roll call data
Scott Pauls
Department of Mathematics
Dartmouth College
scott.pauls@dartmouth.edu
University of Massachusetts, Amherst
December 7, 2012
2. Partition Decoupling for roll call
data
This is joint work with Greg Leibon, Dan
Rockmore, and Robert Savell, all from
Dartmouth.
http://arxiv.org/abs/1108.2805
7. Random Model
The null model we use is a bootstrap
null model – one generated by randomly
permuting the data.
This preserves the basic structure of
outcomes of the votes, but destroys any
structure of association between
legislators.
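A permutation null of this kind can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the roll call data is a list of rows (one per legislator) with entries 1, -1, or 0, and that "permuting the data" means shuffling each vote independently across legislators. The function name `permute_votes` and the toy data are hypothetical.

```python
import random

def permute_votes(roll_call, seed=None):
    """Shuffle each vote (column) independently across legislators (rows).

    Per-vote tallies of yea/nay/absent are preserved, while any
    association between legislators is destroyed.
    """
    rng = random.Random(seed)
    n_votes = len(roll_call[0])
    columns = []
    for j in range(n_votes):
        column = [row[j] for row in roll_call]
        rng.shuffle(column)
        columns.append(column)
    # Transpose back to a legislators-by-votes layout.
    return [list(row) for row in zip(*columns)]

# Hypothetical toy data: 4 legislators, 4 votes.
roll_call = [
    [1, 1, -1, 0],
    [1, -1, -1, 1],
    [-1, 1, 1, 1],
    [1, 1, -1, -1],
]
null_sample = permute_votes(roll_call, seed=0)
```

Because each column is only reordered, the outcome of every vote is unchanged in the null sample.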
8. Comparisons
Measure | Minority model | Random model | Poole-Rosenthal: 1 dim. | Poole-Rosenthal: 2 dim. | % of residual captured | PDM: one layer | PDM: layer two | % of residual captured
House APRE | 0 | 0.4561 | 0.534 | 0.593 | 13 | 0.839 | 0.856 | 11
Percent correct (House) | 67.3 | [72,88] | 84.5 | 86.5 | 13 | 94.7 | 95.3 | 11
Senate APRE | 0 | 0.4834 | 0.476 | 0.563 | 17 | 0.809 | 0.822 | 7
Percent correct (Senate) | 66.6 | [70,90] | 82.3 | 85.2 | 16 | 93.6 | 94.1 | 8
9. Example: 108th Senate
[Figure: spectral embedding of the 108th Senate with labeled clusters.]
“Conservative Republicans”: Sessions, Kyl, Cornyn, Santorum, etc.
“Moderate Republicans”: e.g. Snowe, Chafee, Collins, Specter, etc.
Frist, Lott, Brownback, Hagel
Fitzgerald, Gregg, McCain, Sununu, Warner
Zell Miller (D-GA)
“Liberal Democrats”: e.g. Kennedy, Feingold, Boxer, Leahy, Reed
“Conservative Democrats”: e.g. Pryor, Lincoln, Bayh, Breaux, Landrieu, etc.
Annotation: “tax cuts”
10. Distinguishing clusters: 108th
Senate
Coarse picture: one dimensional
ideology (“liberal/conservative”).
Y N N N N N N
N N Y Y N N N
Y Y Y Y Y Y N
Y/N Y/N Y/N Y/N N/Y N/Y N/Y
Y Y Y Y Y N N
Y Y Y N N Y N
Democrats Republicans
11. Distinguishing clusters 108th
Senate
Coarse picture: one dimensional
ideology (“liberal/conservative”).
An amendment to an appropriations bill which would eliminate tax cuts.
(Yes/no vote table, Democrats to Republicans, as on slide 10.)
12. Distinguishing clusters 108th
Senate
Coarse picture: one dimensional
ideology (“liberal/conservative”).
An amendment to repeal authorities and requirements for a base closure.
(Yes/no vote table, Democrats to Republicans, as on slide 10.)
13. Distinguishing clusters 108th
Senate
Coarse picture: one dimensional
ideology (“liberal/conservative”).
Three votes:
1. Sense of the Congress re: global AIDS funding
2. Cloture: Safe, Accountable, Flexible and Efficient Transportation Act of 2004
3. Amendment to provide a brownfields demonstration for qualified green/sustainable design projects
(Yes/no vote table, Democrats to Republicans, as on slide 10.)
14. Distinguishing clusters 108th
Senate
Coarse picture: one dimensional
ideology (“liberal/conservative”).
Two votes:
1. Extend Unemployment Benefits
2. Sense of the Senate re: imposition of an excise tax on tobacco lawyer’s fees that exceed $20,000/hr
(Yes/no vote table, Democrats to Republicans, as on slide 10.)
15. Distinguishing clusters 108th
Senate
Coarse picture: one dimensional
ideology (“liberal/conservative”).
Amendment to protect US workers from foreign competition for performance of Federal and State contracts.
(Yes/no vote table, Democrats to Republicans, as on slide 10.)
16. Distinguishing clusters 108th
Senate
Coarse picture: one dimensional
ideology (“liberal/conservative”).
Amendment to vest sole jurisdiction over Federal budget process in the Committee on the Budget.
(Yes/no vote table, Democrats to Republicans, as on slide 10.)
17. Example: 88th Senate
Party
Civil Rights
Outer shape:
red=midwest, blue=northeast, green=south,
black=southwest, yellow=west
18. Layer two
Regional identification dominates highest
correlations (particularly in recent years).
Clustering on the residual data provides a
new partition of the network which is (often)
completely different from the first layer.
In particular, clusters are not dominated
by party identification.
19. Example: 108th Senate
Three clusters of mixed party.
Four sets of issues distinguish the clusters
effectively:
1. Infrastructure: Three amendments (86, 214 and 230) to H.J. Res.
2, the Appropriations Bill, relating to infrastructure projects.
2. Energy: Seven amendments (515, 843, 844, 851, 853, 856, 884
and 1386) to Senate Bill 14, a bill concerning the energy security
of the United States. One amendment (272) to S. Con. Res. 23,
relating to drilling in the Arctic National Wildlife Refuge.
3. Homeland Security: Two amendments (515 and 3631)
pertaining to Homeland Security.
4. Trade: The passage of the US-Chile Free Trade Agreement
The first and second clusters are well separated by
the Energy votes, the first and third by Energy and
Infrastructure votes and the second and third by one
energy vote, Homeland Security and Trade votes.
22. Application to UN roll call
voting
This is work in progress – joint with
Skyler Cranmer (UNC, Chapel Hill) and
Bruce Desmarais (UMass, Amherst).
Goal: Can methods, such as the PDM, be
used to construct meaningful categories
which capture the positions of states in
the world political system?
23. Test Case
UN roll call votes from the 60th session
through the 66th session (2005-2011).
Consider, as with the U.S. House and
Senate, two layers of the PDM.
26. First Layer: GDP per capita
[Figure: histograms of log GDP per capita, one panel each for Cluster 1 and Cluster 2.]
27. Adaboost results
Cuba:
3 votes: Necessity of ending the economic, commercial and financial embargo imposed
by the United States of America against Cuba : resolution
Human Rights:
Human rights and unilateral coercive measures : resolution
Human rights and cultural diversity : resolution
Globalization and its impact on the full enjoyment of all human rights : resolution
Nuclear Weapons:
Follow-up to the advisory opinion of the International Court of Justice on the Legality
of the Threat or Use of Nuclear Weapons : resolution
Palestine:
The right of the Palestinian people to self-determination : resolution
Palestine refugees' properties and their revenues :
Economic Development:
International trade and development : resolution
The right to development : resolution
29. Second layer
[Figure: three dimensional spectral embedding (Spectral Axes 1, 2 and 3) with two clusters detailed.]
Dark Blue: Ireland, Netherlands, Liechtenstein, Belgium, Switzerland, Luxembourg, Austria, San Marino, Malta, Serbia, Bosnia and Herzegovina, Cyprus, Finland, Sweden, New Zealand, Marshall Islands
Black: UK, France, Spain, Portugal, Poland, Hungary, Czech Republic, Slovakia, Italy, Albania, Slovenia, Bulgaria, Russian Federation, Estonia, Latvia, Lithuania, Georgia, Azerbaijan, Denmark, Turkey, Tajikistan, Kyrgyzstan, Kazakhstan
31. Adaboost results
Human Rights:
Situation of human rights in the Democratic People's Republic of
Korea : resolution
Situation of human rights in the Islamic Republic of Iran :
resolution
Death Penalty:
2 votes: Moratorium on the use of the death penalty : resolution
Racism:
Inadmissibility of certain practices that contribute to fuelling
contemporary forms of racism, racial
discrimination, xenophobia and related intolerance : resolution
40. Polity
Polity IV scores (Marshall, Jaggers and
Gurr) provide a measure of the authority
characteristics of states in the world
political system.
They are often used as a proxy for political
similarity between states, and hence the
potential for cooperation on different
issues. E.g. two democratic states are
more likely to cooperate than one
democratic and one authoritarian state.
44. State classifications
Can the segmentation given by the layers in the
PDM replace polity for use as a covariate in, for
example, models in international relations?
[Figures: three dimensional spectral embedding (Spectral Axes 1, 2 and 3) and a two dimensional view of Spectral Axis 1 against Spectral Axis 2.]
45. Summary
PDM decomposition reveals multiple layers of structure associated with
roll call voting.
Taken together, these form a mathematical description of ideology.
The coarse version of the first layer is close to the results of spatial
models, but even the first layer significantly outperforms spatial models
with respect to standard metrics.
The use of multiple layers allows us to capture a more nuanced picture
of ideology while still retaining the parsimony of the NOMINATE-type
models.
Our dimensionality results confirm those of Poole-Rosenthal while
simultaneously incorporating contradicting evidence (e.g. Heckman-Snyder):
the dimensions appear at different scales.
The labeling given by the clusters at various levels provides a novel, and
potentially useful, set of explanatory variables for use in political
science models.
Editor's notes
Yes, I have a beard now. No, I am substantially taller than the rest of them. They made me sit.
From a sequence of votes, how much can we determine about the legislators themselves? Can we detect party affiliation? Ideology? Procedure? A standard tool in the political science literature is the family of NOMINATE models developed by Poole and Rosenthal. The basic idea in those models is that a legislator votes by considering the relative positions of themselves and a bill in a Euclidean representation of an issue space. The simplest version is the one dimensional model, where the ideal points of the legislators and the bills provide a simple predictive model for voting. This has spurred something of a cottage industry for estimating ideal points: the NOMINATE scheme uses maximum likelihood estimation on roll call data. There are, of course, other approaches. For example, Heckman-Snyder use a factor model on the same data, finding that more dimensions (~5) are needed to satisfactorily explain the data. Inherent in these methodologies are modeling assumptions, the foremost being the a priori determination of the dimension of the space of ideal points. One of the main motivations for our work is to understand to what extent these assumptions are warranted.
So what is our goal? We wish to provide an unsupervised method for empirically modeling ideology from roll call data. In particular, we wish to provide a method by which we gain estimates on the dimensionality of the data. This allows us to validate the extent to which the NOMINATE assumption of “less than 3” dimensions is appropriate. To this end, we present the data and our encoding of it. A legislator is considered to be a bundle of votes, no more, no less. A vote is -1 (nay), 1 (yea) or 0 (not present/not voting). Our basic modeling unit is the notion of a motivation – a (real valued) vector representing an ideologically coherent position on the votes.
Various comments are in order. First, the motivations need not be orthogonal or independent (although in practice they are often the latter); they can overlap substantially. This is not surprising, since different ideological positions may have votes of common salience. Second, the weights are quite important and a priori may appear on multiple different scales. Weights at different scales are precisely the type of issue that may make the NOMINATE paradigm fail, to the extent that it misses smaller scales that are dominated by larger ones. Third, the residual term is included, in part, to capture the presence of noise in the data: things we have no chance of detecting given the limitations of the data. For example, if a legislator votes a particular way due to a single bribe, quid pro quo, log-rolling, etc., this is undetectable from a statistical point of view. Our algorithm aims to discover both the weights and the motivations.
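In symbols, the decomposition these comments refer to can be sketched as below. The notation is an assumption made here for illustration: v_i is the vote vector of legislator i over k votes, m_j are the c motivations of a layer, w_ij are the weights, and ε_i is the residual term.

```latex
v_i \;=\; \sum_{j=1}^{c} w_{ij}\, m_j \;+\; \epsilon_i,
\qquad v_i \in \{-1, 0, 1\}^{k}, \quad m_j \in \mathbb{R}^{k}
```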
We use correlation, but other measures (e.g. percentage of votes in common) yield similar results. For our clustering step, we use spectral clustering. Again, other methods (e.g. k-means) yield similar results. We determine the weights via least squares. This first pass gives us our layer one approximation. This is a dimension reduced version of the data dictated by the motivations. By construction, the motivations will pick out only the structure at the dominant scale. Thus, when we create the residual (i.e. compute 𝜖), we see pieces of smaller scales amplified for further study. In principle, we continue until we cannot distinguish our residual data from noise (our model for this is a randomized version of the roll call data). In practice, we almost always stop after two levels to avoid overfitting.
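The weight and residual step of one layer can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the cluster labels are taken as given (the talk obtains them by spectral clustering on correlations), each motivation is assumed to be the mean vote vector of its cluster, and the weights are found by solving the least squares normal equations in plain Python. All function names and the toy data are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def pdm_layer(votes, labels):
    """Return (motivations, weights, residuals) for one layer."""
    motivations = []
    for c in sorted(set(labels)):
        members = [v for v, lab in zip(votes, labels) if lab == c]
        # Assumed motivation: the mean vote vector of the cluster.
        motivations.append([sum(col) / len(members) for col in zip(*members)])
    gram = [[dot(a, b) for b in motivations] for a in motivations]
    weights, residuals = [], []
    for v in votes:
        # Least squares weights from the normal equations.
        w = solve(gram, [dot(m, v) for m in motivations])
        fit = [sum(wj * m[k] for wj, m in zip(w, motivations))
               for k in range(len(v))]
        weights.append(w)
        residuals.append([vk - fk for vk, fk in zip(v, fit)])
    return motivations, weights, residuals

# Toy data: two blocs that disagree on votes 1-2 and 4-5,
# plus a sixth vote that splits each bloc internally.
votes = [
    [1, 1, 1, -1, -1, 1],
    [1, 1, 1, -1, -1, -1],
    [-1, -1, 1, 1, 1, 1],
    [-1, -1, 1, 1, 1, -1],
]
labels = [0, 0, 1, 1]
motivations, weights, residuals = pdm_layer(votes, labels)
```

In this toy case the residual is zero except on the sixth vote, where legislators 1 and 3 sit opposite legislators 2 and 4. That is the kind of smaller-scale structure, cutting across the first-layer clusters, that a second layer would then cluster on.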
Aggregate Proportional Reduction in Error (APRE) = (Minority Vote – Predicted Errors)/Minority Vote
Random model APREs: 10 randomizations for each congress; the APREs are the mean of those trials. Percent correct is given in ranges due to the random nature of the model.
Observations:
PDM significantly outperforms NOMINATE. Part of this may be due to dimensionality: typically, for example, layer one has ~5-10 motivations, hence 5-10 “dimensions.” While not directly comparable, this would lead us to compare to a 5-10 dimensional NOMINATE model. While the errors are then more comparable, Poole and Rosenthal indicate that they believe these extra dimensions are just overfitting noise, while the motivations come with ideological descriptors. In other words, the dimensions given by the PDM have derived meaning associated to them and hence can be interpreted, compared to one another, etc.
NOMINATE performs just about as well as our random model.
Information difference: for n legislators and k votes, PR uses n+k variables for each dimension. The minority model uses k (binary) variables, and the random model uses 2*k variables (# yea, # nay for each vote). Ours uses c(n+k), where c is the number of clusters (total).
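The APRE formula above can be computed as in the following sketch. It assumes votes are encoded as 1 (yea), -1 (nay), and 0 (not voting), that the sums run over all votes, and that entries encoded 0 are skipped; the skipping rule and the toy data are assumptions made here.

```python
def apre(actual, predicted):
    """Aggregate Proportional Reduction in Error.

    APRE = (sum of minority votes - sum of prediction errors)
           / (sum of minority votes),
    with both sums taken over all roll calls.
    """
    total_minority = 0
    total_errors = 0
    for j in range(len(actual[0])):
        # Keep only legislators who actually cast a yea or nay.
        cast = [(row[j], pred[j])
                for row, pred in zip(actual, predicted) if row[j] != 0]
        yeas = sum(1 for a, _ in cast if a == 1)
        nays = len(cast) - yeas
        total_minority += min(yeas, nays)
        total_errors += sum(1 for a, p in cast if a != p)
    return (total_minority - total_errors) / total_minority

# Toy data: 4 legislators, 2 votes.
actual = [[1, 1], [1, -1], [-1, -1], [1, -1]]
# The minority model predicts the majority outcome for everyone.
majority_prediction = [[1, -1]] * 4

perfect_score = apre(actual, actual)
minority_model_score = apre(actual, majority_prediction)
```

Always predicting the majority side makes exactly as many errors as there are minority votes, so the score is 0, matching the minority model column in the comparison table. A perfect prediction scores 1.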
This example shows the results of the layer one approximation for the 108th Senate. We find six motivations which clearly delineate the two major parties as well as subgroups within them. The only party “cross-over” is Zell Miller (you may remember that, at this time, he endorsed G. W. Bush for reelection over Kerry and spoke at the Republican convention). The embedding given here is a two dimensional spectral embedding, derived from the clustering process. In short, it is a reasonably good approximation of the layer one data. This embedding roughly reflects a one dimensional ideological projection similar to (and correlated with) NOMINATE scores. Moreover, the motivations come with annotation derived from their representative votes. In the next slides, I’ll discuss this in detail, but the point is that the different clusters are distinguished by appropriate and valid ideological indicators as represented by votes. For example, the “Liberal Democrats” are separated from all the other clusters by their votes on some amendments to an appropriations bill concerning tax cuts.
Note: does not conform to ideal point estimation!
A quick example of a higher dimensional representation. Here the PDM recovers a truly two dimensional representation for the 88th Senate which is directly in line with NOMINATE. The two axes, as derived from the motivations, are basically indicators of party ID and opinion on a collection of bills related to Civil Rights.
Note: while these can be explained by NOMINATE style cuts, they cannot be combined with the first layer.
These graphs give dimension estimates from the spectral embeddings (technically, they are dimension estimates using MDS with the traditional cutoff of stress < 0.1). The blue bars are the dimension estimates for the first layer. The red bars are the estimates for the second layer. The black line gives the estimates for the combination of the two layers. Observations: The estimates on the first layer confirm the results of Poole-Rosenthal using NOMINATE. One or two dimensions are sufficient for most congresses. The estimates for the second layer show the amount of information being lost. This is consistent with Heckman-Snyder, who uncovered, using factor analysis, the necessity of more dimensions. The combination (black) shows how these disparate views may be unified: the secondary dimensions are small in scale when compared to the first.
Using 2 significant eigenvalues, we see three distinct clusters. The blue contains Israel, the US and 4 small island countries. The green cluster is basically Europe while the red consists of everything else. The leftmost red state is Russia. Parameters: 𝜎=0.3, 𝑘=3, 𝑙=2.
One conjecture is that the clustering is guided primarily by economic concerns (i.e. developed vs. developing world). This has some credibility – the per capita GDP of cluster 1 (rest) is an order of magnitude lower than that of cluster 2 (Europe).
Investigating further, we use Adaboost to pull out a handful of votes which best distinguish between the three clusters. Economic concerns are among them.
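How AdaBoost can single out distinguishing votes may be sketched with one-vote decision stumps. This is a hand-rolled, binary (one cluster against the rest) illustration in plain Python, not the implementation used for the talk; the stump form, the function names, and the toy data are all assumptions.

```python
import math

def stump(vote, sign):
    """Weak learner on one vote: predict `sign` for a yea, `-sign` otherwise."""
    return sign if vote == 1 else -sign

def adaboost_votes(X, y, rounds=3):
    """Return a list of (vote index, sign, alpha) chosen by AdaBoost.

    X: rows of encoded votes (1, -1, 0); y: labels in {+1, -1},
    e.g. +1 for members of one cluster and -1 for everyone else.
    """
    n = len(X)
    w = [1.0 / n] * n
    chosen = []
    for _ in range(rounds):
        best = None
        for j in range(len(X[0])):
            for sign in (1, -1):
                # Weighted error of this stump under current weights.
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if stump(xi[j], sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, sign)
        err, j, sign = best
        perfect = err < 1e-12
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        chosen.append((j, sign, alpha))
        if perfect:
            break  # one vote already separates the two groups
        # Up-weight the states this stump misclassified.
        w = [wi * math.exp(-alpha * yi * stump(xi[j], sign))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return chosen

# Toy data: 4 states, 3 votes; the third vote separates the groups.
X = [
    [1, 1, 1],
    [1, -1, 1],
    [-1, 1, -1],
    [1, 1, -1],
]
y = [1, 1, -1, -1]
chosen = adaboost_votes(X, y)
```

The vote indices in `chosen` are the candidates for "votes which best distinguish" the clusters; with more than two clusters, one such run per cluster is one possible approach.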
We see that US/Israel are split from the rest on questions regarding Cuba, Nuclear arms, one economic development vote, and (unsurprisingly) Palestine. The European cluster is distinguished from the rest of the world (and with the US/Israel cluster) on Human rights concerns and one economic development vote. The last vote is substantially out of whack with an ideal point estimation (assuming the others determine ideal points).
Upon removal of the first layer, we see four significant eigenvalues (only the three dimensional embedding is shown). We choose eight clusters to gain the best separation. Two clusters are detailed: the blue cluster, which is some of Europe, and the black one, which is roughly the rest plus Russia and former Eastern Bloc countries. This shows how the clusters have realigned: in the first layer Europe was together and Russia (+ former Eastern Bloc) was in another cluster. Parameters: 𝜎=1, 𝑘=8, 𝑙=4.
Dark blue (3) on the last slide is now orange. Black (7) is now medium green. Cyan (6) is light green. Red (1) is grey. Green (2) is red. Yellow (4) is yellow/orange. Magenta (5) is yellow. White (8) is Dark Green.
Using adaboost again, we find five votes which best separate the clusters.
The fourth cluster (yellow) is all negative. This cluster contains Cuba, Belarus, Niger, Somalia, Syria, Zimbabwe, Libya, Sudan, Iran, Egypt, China, N. Korea, Myanmar, Laos, etc.
The eighth cluster (white), containing Gambia, Senegal, Iraq, Lebanon, Jordan, Saudi Arabia, Yemen, etc., is mostly negative.
The second cluster (green), containing the US, Caribbean nations, Nigeria, Chad, Uganda, Japan, Thailand, Singapore, etc., is mostly negative but both more and less so than cluster 8. This is one good indication of where the PDM can overcome shortcomings of NOMINATE: for NOMINATE, we’d expect monotonicity.
Cluster 6 (cyan), including Canada, Germany, Macedonia, Croatia, Greece, Ukraine, Norway, Iceland, Kenya, Ethiopia, etc., is pretty weak on all of these.
Cluster 1 (red), including a mixture of Central American, South American, and African countries, has strong positive scores on the items on the death penalty and racism.
2 o’clock: US, Canada, Israel, much of Europe.
Middle, 3-4 o’clock: Sweden, Norway, Denmark, Japan, S. Korea, former Eastern Bloc.
9 o’clock: rest, including Russia.
We get the same type of picture, but the flexibility in our methodology allows for (marginal) gains in accuracy.
Here are similarities between the two representations. To help clear up a common misconception – it is often asked whether the 3d NOMINATE model does as well as the layer one representation with three clusters. As alluded to before, this is not an apples to apples comparison – the analogous comparison would be to only allow cut lines that miss the clusters entirely (e.g. the orange superimposed lines).
Top: Here we see the comparison of polity scores and the clusters from layer one. Cluster two has polity scores near 10 while cluster one has a wider distribution of polity scores.
Lower Left: A (3d) view of the first layer clusters with centers colored by polity score (-10 = black, 10 = white).
Lower Right: The W-NOMINATE first coordinate is provided for comparison. We see that high values in the first dimension imply high polity scores but that there is no such implication for low values on the first dimension.
Conclusion: Layer 1 PDM and W-NOMINATE contain similar information with respect to polity.
Layer two clusters show a more subtle description: these clusters generally have wide distributions of polity scores. This indicates that the ideological characteristics that are described by the second layer are not well captured by the polity metric.
A different view of the second layer where the centers of the nodes are colored (grayscale) by the polity score (black =-10, white = 10).