Genius Gerrymandering with inteGers^*

^*Great alGorithms for proGramming

By Zach, Milo, Mikail, and Josh

Hidden text? We would never do such a thing (Easter Egg 2)

Introduction

Gerrymandering is when the ruling political party redraws voting district boundaries to benefit themselves. This issue is particularly felt by voters in the opposing party, since their legislators are less likely to respond to their concerns. Gerrymandering is particularly felt amongst minority groups, as they are often either “cracked” across multiple districts (to dilute their ability to influence each election) or “packed” into one district (to prevent them from influencing more than one election). Additionally, the lack of competition within districts means that it is harder for new candidates to challenge incumbents.

Many people have tried to solve this issue by using integer programming to have computers draw fair and non-partisan districts. This website will look at how this may be achieved.

Anecdote

In the 2000 New York State Assembly redistricting, Hakeem Jeffries “ran a strong challenge in a Democratic primary” against the candidate supported by the Democratic party in 2000. However, he was unable to run in the same district again because the new district lines “literally carved out the block on which Hakeem Jeffries lived” (Oliver, 2017). This both harmed Jeffries, who was unable to get the votes of the people who he had won over in the previous election, and harmed his past constituents, many of whom had “hoped to vote for” Jeffries (Oliver, 2017).

History of Gerrymandering

The feasible implementation of algorithmic approaches to redistricting requires computers, so using integer programming to approach gerrymandering was not considered until the 1960s. Even then, the literature was focused on explaining the value of using computers and outlining the criteria to maximize. For example, Weaver and Hess’s “A Procedure for Nonpartisan Districting: Development of Computer Techniques” (1963) begins by discussing why computer-based redistricting is helpful (because a nonpartisan algorithm is difficult for legislatures to sway) and later describes criteria that redistricting systems should have, such as keeping districts compact and centering them around major population centers. Another paper from this time period – Nagel’s “Computers and the Law and Politics of Redistricting” (1972) – proceeds similarly: it states six(!) reasons why using computers for redistricting is useful and then explains the three major criteria that redistricting systems should be judged by: “computer feasibility, federal and state legality, and political realism” (p. 78).

However, now that redistricting algorithms have become more popular, modern papers on redistricting assume that readers are familiar with the basics and dig deeply into their specific advances. For example, Landau, Reid, and Yershov’s “A Fair Division Solution to the Problem of Redistricting” (2009) explains an algorithm which can be used for two partisan actors to agree upon a new electoral map, and Carman’s “Repairing Redistricting” (2021) defines seven criteria which they sought to optimize, describes the process they used to create mathematical implementations of their criteria, and then showed examples of the maps which they produced. (We will elaborate further on Carman’s paper in the next section of this website.)

Mathematical Model

Analysis

$$ \begin{align*} \text{Unit} &= \text{smallest subset of a region }(i) \\ \text{District} &= \text{geographically compact collection of units }(j) \end{align*} $$

Here's an interactive map to demonstrate the effects of gerrymandering. Use the checkbox to toggle between a map in which Democrats (blue) win and Republicans (red) win. Hover over each unit to see which district they are a part of and how that district votes. Other variables that our model uses can be found in the box to the side and are outlined below. Try to count how many districts are red and blue to see who wins, then click the checkbox and count again.

Good for democrats

$$ \begin{aligned} & i = ? \\ & j = ? \\ & p_{i j} = \;? \\ & x_{i \text{ a}} = \;? \\ & x_{i \text{ b}} = \;? \\ & x_{i \text{ c}} = \;? \\ & x_{i \text{ d}} = \;? \\ \end{aligned} $$

Variables

Here is a list of the variable used in the model! Many of them are binary, which means that they can only take on the $$ \begin{align*} x_{ij} &= \begin{cases} 1 & \text{ if unit $i$ is in district $j$ } \\ 0 & \text{ if unit $i$ is not in district $j$ } \end{cases} \\ p_{ij} &= \begin{cases} 1 & \text{ if unit $i$ can be in district $j$ } \\ 0 & \text{ if unit $i$ cannot be in district $j$ } \end{cases} \\ y_{j} &= \begin{cases} 1 & \text{ if there are more Democrats than Republicans in district }j\\ 0 & \text{ if there are more Republicans than Democrats in district }j \end{cases} \\ v_i &= \text{total number of voters in unit }i \\ s_{ik} &= \begin{cases} 1 & \text{ if units $i$ and $k$ are adjacent} \\ 0 & \text{ if they are not adjacent} \end{cases} \end{align*} $$

Fairness

We want to make the gerrymandering as "fair" as possible. But in order to do this, we have to define what "fair" really means so that we can translate it into a mathematical optimization problem. We are going to say that the population of each district should be as close as possible to its target. In math terms, we want to minimize $$z = \max_{j} \left| \left(\sum_{i} x_{ij}v_i\right) - \text{Target} \right|.$$ Observe that the sum is the total number of voters in district $j$. Unfortunately, the absolute value prevents this function $z$ from being linear in our decision variables $x_{ij}$. Instead, we may break this nonlinear equation into two linear inequalities (constraints): $$ \begin{align*} z &\ge \left(\sum_{i} x_{ij}v_i\right) - \text{Target} \\ z &\ge \text{Target} - \left(\sum_{i} x_{ij}v_i\right). \end{align*} $$

Housekeeping

We have a few simple constraints that we include in order to keep the model from breaking. For example, something that hopefully seems self-explanatory is that each unit $i$ can only be assigned to exactly one district $j$. It wouldn't be fair if some units were represented more than others! We can write this mathematically as $$ \sum_j x_{ij} = 1$$ for all units $i$.

Compactness

To keep a district from being too big, we can set a maximum distance between units within the same district, called $\text{maxdist}$. The way we implement these constraints is to say that for all units $i,k$ who are separated by more than $\text{maxdist}$, $$ x_{ij} + x_{kj} \le 1.$$ There can be at most one of $i$ and $k$ in district $j$.

In addition to this, we attempt to minimize the number of "cut edges," or pairs of adjacent units that are in separate districts. To do this, we first define a binary variable $s_{ik}$ that indicates whether two adjacent units are in the same district. We have $$ s_{ik} \ge x_{ij} + x_{kj} - 1$$ and $$ s_{ik} \le \frac{1}{2} (x_{ij} + x_{ik}) + (1 - w_{ikj}). $$ Additionally, $$ \sum_j w_{ikj} \ge 1$$ and $$ \sum_{(i,k)} (1-s_{ik}) \le \text{MaxCut}$$ for all adjacent districts $i$ and $k$, where $\text{MaxCut}$ is the greatest allowed number of cut edges for a district.

Contiguity

Something that may be a little harder to grasp is how we require a district to be contiguous, or that between any two units in a district, there is a path between them made up of other adjacent units. Suppose that we have two units $k$ and $\ell$ in a district $j$ that are separated by a distance $d \le \text{MaxDist}$. Then we require that there is some other unit $i$ that is a distance $1$ from $k$ and $d-1$ from $\ell$ to also be in district $j$. We write this mathematically as $$ \sum_i x_{ij} \ge x_{kj} + x_{\ell j} - 1,$$ where we sum over all units $i$ within distance $1$ of $k$ and $d-1$ of $\ell$.

Political Distribution

We now turn our attention to the political distribution of the districts. In particular, we would like an equation that constrains $y_i$, a binary variable that indicates whether Democrat or Republican is dominant. Under the following constraint, if Democrat is dominant, $y_1$ is forced to be 1: $$ \text{total} \cdot y_j \ge \sum_{i} x_{ij}(\text{Democrat}_i - \text{Republican}_i) $$ Similarly, if Republican is dominant, $y_1$ is forced to be 0: $$ \text{total} \cdot (1-y_j) \ge \sum_{i} x_{ij}(\text{Republican}_i - \text{Democrat}_i) $$

Now that we can measure the number districts that favor Democrat over Republican, we can constrain them to lie within a window of width $\epsilon$ within $\text{Target}$: $$ \begin{align*} \sum_j y_j &\le \text{Target} + \epsilon \\ \sum_j y_j &\ge \text{Target} - \epsilon \end{align*} $$

Controversies

There are a few questions to to the problem of redistricting whose answers spark some disagreement, including:

What does it mean for a districting to be fair? Is it even possible to make a ‘fair’ districting? How do we account for unavoidable features, such as districts with unequal populations or nonstandard geography or topography?
For each criterion, what metric best determines how well a map satisfies said criterion? For example, when measuring compactness of districts, should we look at the number of cut edges, or should we look at the area of the smallest circle or rectangle that circumscribes the district?
What units should we use? Should they be as big as counties, or as small as census blocks? If we choose bigger units, how do we account for the unequal district populations? If we choose smaller units, how do we rewrite the rest of the model to let it run reasonably quickly?
In integer programming, what should the goal (that we’re trying to maximize) be? Should it be population equality, compactness, or competitiveness, or a mixture of multiple? How should the constraints of the other criteria be factored? If the goal includes multiple criteria, how should they be weighted?

Analysis

We disagreed a lot on the upsides and downsides of using redistricting algorithms. Here are a few of the core questions we thought about:

Currently, the biggest problem with these algorithms is that they are extremely slow; in Repairing Redistricting, for example, the authors were forced to use just 100 to 105 units because of the processing power that they had, which is not enough to provide the best maps possible in states which allow for a unit to be smaller than a county. However, this will likely be solved somewhat over the next few decades due to advances in computer technology, and could potentially be solved even more quickly if the United States deems redistricting important enough for it to let redistricting algorithms spend a lot of time running on supercomputers.
Computer-based redistricting does get rid of overt partisan fights over where the lines will be drawn. However, as mentioned in the previous section, there are many different ways for a redistricting algorithm to be created, and different ones may show consistent bias to one party over another. This comes with the disadvantage of less transparency regarding the process of redistricting, as it is hard for the general public to understand, for example, why using the efficiency gap metric of competitiveness, rather than the partisan symmetry metric, would be better for Democrats.
Computer-based redistricting could cause the district lines to completely change every ten years (as opposed to current systems, which mostly try to minimize such change). This could cause considerable social upheaval, as constituents and incumbents would have to deal with constantly-shifting districts. While minimizing this change could be encoded as a goal, it would add even more complexity to our models which are already having difficulty drawing fair districts in a reasonable amount of time.

Works Cited

Carman, B. A. (2021). Repairing Redistricting: Using an Integer Linear Programming Model to Optimize Fairness in Congressional Districts (Doctoral dissertation, Ohio University).
Ingraham, C. (2015). This is the Best Explanation of Gerrymandering you will ever See. The Washington Post.
Landau, Z., Reid, O., & Yershov, I. (2009). A Fair Division Solution to the Problem of Redistricting. Social Choice and Welfare, 32(3), 479-492.
Mighdoll, M., Brower, J., Khaishgi, M. & Zitzewitz, Z. (1982). Gerrymandering is Bad: Easter Egg Number Five Keeping Districts Good (Stunning Thesis Defense, Corollary College).
Nagel, S. S. (1972). Computers & the Law & Politics of Redistricting. Polity, 5(1), 77-93.
Oliver, John. Last Week Tonight. (2017, April 9). Gerrymandering: Last Week Tonight with John Oliver (HBO) [Video]. YouTube. https://www.youtube.com/watch?v=A-4dIImaodQ
Weaver, J. B., & Hess, S. W. (1963). A Procedure for Nonpartisan Districting: Development of Computer Techniques. Yale LJ, 73, 288.