Berkeley, California

A couple of months ago, we began our journey to create a system that models the voting blocs in Congress: groups of Congress members that tend to vote together on bills. As the semester has gone by, we’ve worked hard toward that goal, and now we’re excited to share the progress we’ve made. In this article, we highlight the steps we’ve taken in modeling voting blocs and visualizing Congressional votes, and discuss some of the challenges we’ve encountered along the way. Finally, we touch on potential areas for future development and the conclusions we’ve reached.


The purpose of this semester-long project was to research factors that influence Congressional votes while developing methods to model Congressional voting blocs. Last semester, the Roll Call team created a pipeline that uses various APIs to generate a graph of Congress from sponsorship and voting data. This semester, we built on that implementation by applying two clustering algorithms to our graph of Congress: Spectral Clustering and Louvain Clustering. These algorithms give us clusters, or groups of Congress members who vote together most often. We have also looked at individual datasets that may help in understanding the political climate of the next election cycle in 2020. We want lobbyists, business people, and politicians to be able to interact with our open source package in order to better understand the structure of Congress. We also worked to create an interactive mapping tool for roll call votes. Throughout the semester, two teams have worked on these goals: one implementing clustering techniques, the other creating visualizations of the clusters and roll call votes.
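To make the idea of a "graph of Congress" concrete, here is a minimal sketch of how voting data could be turned into weighted edges. It assumes the edge weight between two members is the share of bills on which they cast the same vote; that weighting, the function name `agreement_graph`, and the toy data are illustrative assumptions, not a description of the actual pipeline.

```python
from itertools import combinations


def agreement_graph(votes):
    """Build weighted edges from roll call votes.

    votes maps a bill id to {member: "yea" | "nay"}.
    Returns {(member_a, member_b): weight}, where weight is the share of
    bills both members voted on in which they cast the same vote.
    (Assumed weighting, chosen for illustration only.)
    """
    agree = {}
    shared = {}
    for ballots in votes.values():
        # Sort names so each pair always appears in the same order.
        for a, b in combinations(sorted(ballots), 2):
            shared[(a, b)] = shared.get((a, b), 0) + 1
            if ballots[a] == ballots[b]:
                agree[(a, b)] = agree.get((a, b), 0) + 1
    return {pair: agree.get(pair, 0) / n for pair, n in shared.items()}


# Hypothetical toy data, not real roll call records.
toy_votes = {
    "bill_1": {"A": "yea", "B": "yea", "C": "nay"},
    "bill_2": {"A": "yea", "B": "yea", "C": "yea"},
    "bill_3": {"A": "nay", "B": "yea", "C": "nay"},
}
edges = agreement_graph(toy_votes)
```

In this toy example, members A and B agree on two of three bills, so their edge is heavier than the edge between B and C, who agree on only one.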


The clustering team attempted three methods for building the clusters. The first was Spectral Clustering. First, we turned the graph representation of Congress into an adjacency matrix (the ij entry of the matrix is the edge weight between congresspeople i and j). Next, we used numerical methods on the adjacency matrix to find the key components of the graph; in spectral clustering, these are eigenvectors of a matrix derived from it. The number of important components was far smaller than the number of congresspeople. Lastly, we ran a clustering algorithm (e.g., k-means) on the key components.
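The three steps above can be sketched in a few lines of standard-library Python. This is a simplified illustration, not our package's code: it keeps only one "key component" (the eigenvector of the graph Laplacian with the second-smallest eigenvalue, found by power iteration) and splits it into two groups with a tiny one-dimensional k-means. The function names and the toy adjacency matrix are assumptions made for the example.

```python
def fiedler_vector(adj, iterations=500):
    """Approximate the Laplacian eigenvector with the second-smallest eigenvalue."""
    n = len(adj)
    degree = [sum(row) for row in adj]
    # Strict upper bound on the largest eigenvalue of L = D - A.
    shift = 2 * max(degree) + 1.0
    vec = [float(i) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iterations):
        # Remove the constant component (the trivial eigenvector).
        mean = sum(vec) / n
        vec = [x - mean for x in vec]
        # Multiply by (shift * I - L), so small eigenvalues of L dominate.
        vec = [
            (shift - degree[i]) * vec[i] + sum(adj[i][j] * vec[j] for j in range(n))
            for i in range(n)
        ]
        norm = sum(x * x for x in vec) ** 0.5
        vec = [x / norm for x in vec]
    return vec


def two_means_1d(values, iterations=20):
    """Tiny 1-D k-means with k=2; returns a 0/1 label per value."""
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        if group0:
            lo = sum(group0) / len(group0)
        if group1:
            hi = sum(group1) / len(group1)
    return labels


# Hypothetical toy graph: two tightly connected triples joined by one weak edge.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
labels = two_means_1d(fiedler_vector(adj))
```

On this toy graph, the first three nodes receive one label and the last three the other. With more than two blocs, one would keep several eigenvectors and run k-means with a larger k, which a numerical library handles far better than this hand-rolled version.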

The second method was Louvain Modularity Maximization. Modularity works with the same nodes and edges: a strong edge between two nodes expresses closeness, or similarity, between the two congresspeople, while a weak or missing edge indicates dissimilarity. In the latter case, it is more plausible that the two individuals belong to different parties or voting blocs. For a given grouping of nodes, modularity is a single score that compares how strongly the nodes inside each group are connected with what would be expected if edges were placed at random. A high score means the groups are internally alike, and sharply separated groups (e.g. polarized ones) show up as distinct clusters. The Louvain algorithm searches for a high-modularity grouping greedily: it visits nodes one at a time, often in random order, and moves each node into the neighboring cluster that most improves the score. It then merges each cluster into a single node and repeats the process on the smaller graph, so earlier merges are not revisited.
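As a rough illustration of the score that Louvain tries to maximize, the sketch below computes the standard modularity of a weighted, undirected graph for a given assignment of nodes to clusters. It covers only the score, not the greedy search itself, and the toy graph and names are assumptions made for the example.

```python
def modularity(adj, labels):
    """Modularity of a partition of a weighted, undirected graph.

    adj is a symmetric adjacency matrix; labels[i] is node i's cluster.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    two_m = sum(degree)  # every undirected edge is counted twice
    score = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # Observed weight minus the weight expected at random.
                score += adj[i][j] - degree[i] * degree[j] / two_m
    return score / two_m


# Hypothetical toy graph: two tightly connected triples joined by one weak edge.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
q_blocs = modularity(adj, [0, 0, 0, 1, 1, 1])  # split along the weak edge
q_mixed = modularity(adj, [0, 1, 0, 1, 0, 1])  # arbitrary interleaved split
```

The split along the weak edge scores well above zero, while the interleaved split scores below zero, which is the signal a modularity-maximizing search follows.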

The final method we looked at was Markov Clustering, but we ran into issues with its packaging. As with any project, we faced challenges along the way. The clustering team’s main challenge is that, although we are trying to determine relationships between congresspeople, there is really no way to check how good the graph itself is; we can only check the accuracy of the clusters.
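One way to check the accuracy of clusters is to compare them against a label we already know. The sketch below computes cluster purity against party membership; using party as the reference label is an assumption made for illustration, as are the function name and the toy data.

```python
from collections import Counter


def cluster_purity(cluster_labels, reference_labels):
    """Share of members whose reference label matches their cluster's majority label."""
    by_cluster = {}
    for cluster, reference in zip(cluster_labels, reference_labels):
        by_cluster.setdefault(cluster, Counter())[reference] += 1
    # For each cluster, count the members carrying its most common reference label.
    matched = sum(counts.most_common(1)[0][1] for counts in by_cluster.values())
    return matched / len(cluster_labels)


# Hypothetical toy data: six members, two clusters, party as the reference label.
clusters = [0, 0, 0, 1, 1, 1]
parties = ["D", "D", "R", "R", "R", "R"]
purity = cluster_purity(clusters, parties)
```

A purity of 1.0 would mean every cluster contains members of a single party. Lower values are not necessarily errors, since cross-party blocs are exactly what a voting-bloc model hopes to surface.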


The visualization team worked to create two main types of visualizations: one of the network of congresspeople and their clusters, and the other a map of the U.S. with states shaded according to the roll call votes on individual bills. Most of the focus was on the map, which shows how the representatives and senators from each state voted on a given bill. Features of the visualizations include the ability to hover over a state to view a particular Congress member’s profile and stance on the bill.

In this map of House votes on H.R. 7 to the left, the dark-red states had all of their representatives vote yea on the bill, the dark-blue states had all of their representatives vote nay, and the remaining states are shaded by their proportion of yeas to nays, as indicated by the scale. H.R. 7, the Paycheck Fairness Act, seeks to amend the Fair Labor Standards Act of 1938 to provide more effective remedies to victims of discrimination in the payment of wages on the basis of sex, and for other purposes. Representatives from more conservative states tended to vote nay, and those from more liberal states tended to vote yea. Rep. Mario Diaz-Balart (FL) and Rep. Will Hurd (TX) are both Republicans who voted yea, and they both also voted yea on H.R. 8, which expands background checks on guns. This is unusual: they are the only two Republican representatives to vote yea on both bills, against the majority of their party. H.R. 7 passed the House, 242 to 187.
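The shading rule described above comes down to one number per state: the share of that state's representatives who voted yea. A minimal sketch of that calculation follows; the function name and the toy ballots are hypothetical, and the mapping from share to color is left to the plotting library.

```python
def yea_share_by_state(ballots):
    """Return {state: share of yea votes}.

    ballots is a list of (state, vote) pairs, where vote is "yea" or "nay".
    """
    totals = {}
    for state, vote in ballots:
        yeas, count = totals.get(state, (0, 0))
        totals[state] = (yeas + (vote == "yea"), count + 1)
    return {state: yeas / count for state, (yeas, count) in totals.items()}


# Hypothetical toy ballots, not the actual roll call on any bill.
toy_ballots = [
    ("CA", "yea"),
    ("CA", "yea"),
    ("AZ", "yea"),
    ("AZ", "nay"),
    ("WY", "nay"),
]
shares = yea_share_by_state(toy_ballots)
```

A share of 1.0 or 0.0 corresponds to the two solid colors at the ends of the scale, and a share near 0.5 corresponds to an evenly split delegation.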

In the image to the left, we’ve mapped the votes on H.R. 8. H.R. 8 establishes background check requirements for firearm transfers between private parties. Specifically, it prohibits a transfer between private parties unless a licensed gun dealer or manufacturer first conducts a background check. In total, there were 240 yeas and 190 nays on H.R. 8. It makes sense that the bill passed, since Democrats hold a majority in the House, so the final result isn’t much of a surprise. The legend of the map indicates whether the House representatives from a particular state mostly voted yea or nay. For example, in California, the dark blue shading indicates that most of the state’s representatives voted yea on establishing the new background check requirements, which is consistent with most of California’s House representatives being Democrats. In Arizona, however, the gray shading reflects a split among the state’s House representatives. With our clustering techniques, we can examine how the voting blocs seen in our visualizations came about.

What’s Next?

Looking ahead, the two main paths for improving the project are finding additional features for our model and creating an interactive tool to visualize roll call votes.

In our project, we’re interested in features that can be used to build a better graph of the voting blocs in Congress. It’s possible that attributes beyond our current ones (votes on bills and cosponsorship) could help in modeling voting blocs and their influence on future votes. Additional features could include party membership, subcommittee membership, campaign contributions and lobbyist influence, or constituent demographics. As we noted in our first blog post, many of these features may pose a challenge in terms of gathering data; for example, it’s very difficult to isolate the impact of campaign contributions on day-to-day policy decisions made by Congressional members. Gun control is a good example of how difficult funding data would be to use. It’s no secret that, in general, Democrats are in favor of tighter regulation and Republicans are opposed. Notable exceptions that could demonstrate the significance of funding in the formation of voting blocs are Democrats who receive NRA funding. Would they still vote with their party, or change their votes in favor of looser regulation? No matter the answer, can we tell if their vote is due to the funding, or just their ideological stance on gun policy? While research and data could try to account for these factors, it would be very difficult to separate these confounding variables.

Our other main area for expansion is an interactive tool for mapping bills. We would like to finish adding a drop-down menu for selecting among bills that have been voted on and seeing which members of Congress voted yea or nay. The color scheme of the visualizations should also indicate the party of each Congress member. This was one of our goals at the beginning of the semester, and we hope to complete it in the coming month.

In Conclusion

Now, let’s return to our main question: how does all of the research, clustering, and visualization work done this semester give us context on the trend of voting blocs and polarization within Congress? Well, as we’ve found this semester, the answer is very complicated. To a certain extent, a data-driven approach fails to capture the complexity of the Congressional vote. As the adage goes, “One of the reasons people hate politics is that truth is rarely a politician’s objective.” The quantitative approach we’ve taken during this project can’t possibly address aspects of human relationships like friendships and social pressures. So is all hope lost?

Not quite. We still believe that we can roughly approximate the form of voting blocs in Congress with observable data. With factors like roll call votes, co-sponsorships, funding, and constituent demographics, we can begin to gain an understanding of the complex networks of relationships in Congress. As a future goal, we hope to deploy a Python package that can display a network of Congressional voting blocs based on given input features.

In the end, we hope that we have provided an avenue for policy analysts, lobbyists, researchers, students, educators, and businesses to navigate the complex dynamics of Congressional voting.

Post Author: Roll Call
