Genius Gerrymandering with inteGers*

*Great alGorithms for proGramming

By Zach, Milo, Mikail, and Josh

Hidden text? We would never do such a thing (Easter Egg 2)

Introduction

Gerrymandering is when the ruling political party redraws voting district boundaries to benefit themselves. This issue is particularly felt by voters in the opposing party, since their legislators are less likely to respond to their concerns. Gerrymandering is particularly felt amongst minority groups, as they are often either “cracked” across multiple districts (to dilute their ability to influence each election) or “packed” into one district (to prevent them from influencing more than one election). Additionally, the lack of competition within districts means that it is harder for new candidates to challenge incumbents.

Many people have tried to solve this issue by using integer programming to have computers draw fair and non-partisan districts. This website will look at how this may be achieved.

Anecdote

In the 2000 New York State Assembly redistricting, Hakeem Jeffries “ran a strong challenge in a Democratic primary” against the candidate supported by the Democratic party in 2000. However, he was unable to run in the same district again because the new district lines “literally carved out the block on which Hakeem Jeffries lived” (Oliver, 2017). This both harmed Jeffries, who was unable to get the votes of the people who he had won over in the previous election, and harmed his past constituents, many of whom had “hoped to vote for” Jeffries (Oliver, 2017).

History of Gerrymandering

The feasible implementation of algorithmic approaches to redistricting requires computers, so using integer programming to approach gerrymandering was not considered until the 1960s. Even then, the literature was focused on explaining the value of using computers and outlining the criteria to maximize. For example, Weaver and Hess’s “A Procedure for Nonpartisan Districting: Development of Computer Techniques” (1963) begins by discussing why computer-based redistricting is helpful (because a nonpartisan algorithm is difficult for legislatures to sway) and later describes criteria that redistricting systems should have, such as keeping districts compact and centering them around major population centers. Another paper from this time period – Nagel’s “Computers and the Law and Politics of Redistricting” (1972) – proceeds similarly: it states six(!) reasons why using computers for redistricting is useful and then explains the three major criteria that redistricting systems should be judged by: “computer feasibility, federal and state legality, and political realism” (p. 78).

However, now that redistricting algorithms have become more popular, modern papers on redistricting assume that readers are familiar with the basics and dig deeply into their specific advances. For example, Landau, Reid, and Yershov’s “A Fair Division Solution to the Problem of Redistricting” (2009) explains an algorithm which can be used for two partisan actors to agree upon a new electoral map, and Carman’s “Repairing Redistricting” (2021) defines seven criteria which they sought to optimize, describes the process they used to create mathematical implementations of their criteria, and then showed examples of the maps which they produced. (We will elaborate further on Carman’s paper in the next section of this website.)

Mathematical Model

Analysis

$$ \begin{align*} \text{Unit} &= \text{smallest subset of a region }(i) \\ \text{District} &= \text{geographically compact collection of units }(j) \end{align*} $$

Here's an interactive map to demonstrate the effects of gerrymandering. Use the checkbox to toggle between a map in which Democrats (blue) win and Republicans (red) win. Hover over each unit to see which district they are a part of and how that district votes. Other variables that our model uses can be found in the box to the side and are outlined below. Try to count how many districts are red and blue to see who wins, then click the checkbox and count again.

$$ \begin{aligned} & i = ? \\ & j = ? \\ & p_{i j} = \;? \\ & x_{i \text{ a}} = \;? \\ & x_{i \text{ b}} = \;? \\ & x_{i \text{ c}} = \;? \\ & x_{i \text{ d}} = \;? \\ \end{aligned} $$

Variables

Here is a list of the variable used in the model! Many of them are binary, which means that they can only take on the $$ \begin{align*} x_{ij} &= \begin{cases} 1 & \text{ if unit $i$ is in district $j$ } \\ 0 & \text{ if unit $i$ is not in district $j$ } \end{cases} \\ p_{ij} &= \begin{cases} 1 & \text{ if unit $i$ can be in district $j$ } \\ 0 & \text{ if unit $i$ cannot be in district $j$ } \end{cases} \\ y_{j} &= \begin{cases} 1 & \text{ if there are more Democrats than Republicans in district }j\\ 0 & \text{ if there are more Republicans than Democrats in district }j \end{cases} \\ v_i &= \text{total number of voters in unit }i \\ s_{ik} &= \begin{cases} 1 & \text{ if units $i$ and $k$ are adjacent} \\ 0 & \text{ if they are not adjacent} \end{cases} \end{align*} $$

Fairness

We want to make the gerrymandering as "fair" as possible. But in order to do this, we have to define what "fair" really means so that we can translate it into a mathematical optimization problem. We are going to say that the population of each district should be as close as possible to its target. In math terms, we want to minimize $$z = \max_{j} \left| \left(\sum_{i} x_{ij}v_i\right) - \text{Target} \right|.$$ Observe that the sum is the total number of voters in district $j$. Unfortunately, the absolute value prevents this function $z$ from being linear in our decision variables $x_{ij}$. Instead, we may break this nonlinear equation into two linear inequalities (constraints): $$ \begin{align*} z &\ge \left(\sum_{i} x_{ij}v_i\right) - \text{Target} \\ z &\ge \text{Target} - \left(\sum_{i} x_{ij}v_i\right). \end{align*} $$

Housekeeping

We have a few simple constraints that we include in order to keep the model from breaking. For example, something that hopefully seems self-explanatory is that each unit $i$ can only be assigned to exactly one district $j$. It wouldn't be fair if some units were represented more than others! We can write this mathematically as $$ \sum_j x_{ij} = 1$$ for all units $i$.

Compactness

To keep a district from being too big, we can set a maximum distance between units within the same district, called $\text{maxdist}$. The way we implement these constraints is to say that for all units $i,k$ who are separated by more than $\text{maxdist}$, $$ x_{ij} + x_{kj} \le 1.$$ There can be at most one of $i$ and $k$ in district $j$.

In addition to this, we attempt to minimize the number of "cut edges," or pairs of adjacent units that are in separate districts. To do this, we first define a binary variable $s_{ik}$ that indicates whether two adjacent units are in the same district. We have $$ s_{ik} \ge x_{ij} + x_{kj} - 1$$ and $$ s_{ik} \le \frac{1}{2} (x_{ij} + x_{ik}) + (1 - w_{ikj}). $$ Additionally, $$ \sum_j w_{ikj} \ge 1$$ and $$ \sum_{(i,k)} (1-s_{ik}) \le \text{MaxCut}$$ for all adjacent districts $i$ and $k$, where $\text{MaxCut}$ is the greatest allowed number of cut edges for a district.

Contiguity

Something that may be a little harder to grasp is how we require a district to be contiguous, or that between any two units in a district, there is a path between them made up of other adjacent units. Suppose that we have two units $k$ and $\ell$ in a district $j$ that are separated by a distance $d \le \text{MaxDist}$. Then we require that there is some other unit $i$ that is a distance $1$ from $k$ and $d-1$ from $\ell$ to also be in district $j$. We write this mathematically as $$ \sum_i x_{ij} \ge x_{kj} + x_{\ell j} - 1,$$ where we sum over all units $i$ within distance $1$ of $k$ and $d-1$ of $\ell$.

Political Distribution

We now turn our attention to the political distribution of the districts. In particular, we would like an equation that constrains $y_i$, a binary variable that indicates whether Democrat or Republican is dominant. Under the following constraint, if Democrat is dominant, $y_1$ is forced to be 1: $$ \text{total} \cdot y_j \ge \sum_{i} x_{ij}(\text{Democrat}_i - \text{Republican}_i) $$ Similarly, if Republican is dominant, $y_1$ is forced to be 0: $$ \text{total} \cdot (1-y_j) \ge \sum_{i} x_{ij}(\text{Republican}_i - \text{Democrat}_i) $$

Now that we can measure the number districts that favor Democrat over Republican, we can constrain them to lie within a window of width $\epsilon$ within $\text{Target}$: $$ \begin{align*} \sum_j y_j &\le \text{Target} + \epsilon \\ \sum_j y_j &\ge \text{Target} - \epsilon \end{align*} $$

Controversies

There are a few questions to to the problem of redistricting whose answers spark some disagreement, including:

Analysis

We disagreed a lot on the upsides and downsides of using redistricting algorithms. Here are a few of the core questions we thought about:

Works Cited

Repairing Redistricting: Using an Integer Linear Programming Model to Optimize Fairness in Congressional Districts (2021)

The promise and perils of computers in redistricting (2010)

Fair redistricting is hard (2019)

Computers and the Law and Politics of Redistricting (1972)

A Fair Division Solution to the Problem of Redistricting (2009)