PR, RCV, race, turnout, and voter error

Does error go up when new people vote?

Does ranked-choice voting (RCV) baffle voters? Our (great-)grandparents used to say so. Better data and methods have led to new evidence, but the popular conversation is no more substantive than the one we’d have had in the 1940s. California Gov. Jerry Brown repeated the “complicated” claim in a local-option veto message last September. Some have even suggested to me that RCV is systematically biased against certain groups.

I think the systematic bias charge is a leap. I think people are talking past each other in the popular RCV conversation. (Witness the number of commas in the title of this post.) I also think there are serious usability problems, but evidence suggests these are not limited to RCV.

First I’ll summarize the work I know. (I will not get into other good work on campaign tone, female and POC candidate entry, etc.) Then I will show my own data from American proportional representation (PR) elections since no modern work speaks to PR. Some of the error rates will be staggering.

My best guess is that high PR error rates resulted when select parties and candidates were mobilizing new voters en masse. If this were true, the actors would not be those who imposed PR in the first place. It also would explain why my error rates spike in some PR elections regarded as positives for people of color.

Race and turnout

I do not yet see a persuasive link between modern RCV implementations and turnout differences across racial groups. One study found lower turnout in San Francisco precincts with high proportions of black, white, young, and less-educated voters when comparing these precincts pre- and post-RCV. The author later found lower overall turnout in RCV when comparing RCV and non-RCV off-year elections. Another group of authors compared decisive-round turnout in RCV, primary-general, and two-round contexts. The units in this part of their analysis were matched cities, so they can’t finely disaggregate turnout by race/ethnicity. On average, they found less drop-off between the first and final rounds of an RCV count than they found between the first and second rounds of a primary-general or two-round cycle.

How should we reconcile the divergent findings from San Francisco and the nationwide sample? Part of this story is probably the different choice of baseline measures. But I also suspect each set of findings reflects the set of interests pushing RCV in each city. Voting-system change is power politics, not public policy. Changing the voting system is about reshuffling coalitions. If people aren’t voting, it’s probably because they’re not being asked to vote. We will see if this changes in future elections.

Race and invalid ballots

Nor am I convinced that modern RCV implementations cause more voter error than other, more prevalent voting systems. One study did find that San Francisco precincts with more black voters tended to produce more over-votes. At the same time, one of its coauthors later found similar over-vote rates in largely black precincts across RCV and non-RCV elections. A Minneapolis study gets the same basic result when comparing RCV (2013) with non-RCV (2005) elections.

Complicated ballots and bias

One troubling finding comes from a recent set of lab experiments. The authors find that multi-candidate elections cause voters to fail to suppress bias about the races and genders of candidates in front of them. This is not a study of RCV, but its implications are clear: in RCV elections with many candidates, voters who would have hidden their biases end up unable to do so. This raises a question: do voters attempt to hide their biases in real-world elections? I don’t know.

Either way, the implications are not RCV-specific. To increase the election of women and minorities, one must reduce voters’ options. Either there must be very few candidates, or the choice must not be among candidates. The question is how to include target populations among this limited set of options. It probably comes down to imposing quotas.


What can we make of these results? The mainstream commentary I’ve seen tends to ask whether RCV is “bad” or “good,” then finds results consistent with the adopted position. The absence of a shared dependent variable makes it difficult to draw conclusions from popular commentary.

I’m not ready to hang my hat on voter-error or turnout bias, but I do see evidence of persistent usability trouble across American voting systems, which varies with race and ethnicity in predictably bad ways.

Historic PR and invalid ballots

Here are the proportions of invalid ballots cast in each PR (“multi-winner RCV”) election I’ve studied, at the lowest possible level of aggregation. Error rates varied. Sometimes they were very high, especially in New York City.


What explains these patterns? Here I give some conjectures. Census data are poor for these places and periods. They certainly are not available biennially. But I hope to analyze the data we do have more systematically.

1) Abstention? New York City stopped separate tabulation of invalid and blank ballots after November 1941. Prior to that, blank ballots accounted for an average of 25 percent of all invalid ballots (minimum 12 percent, maximum 57 percent). There was a lot on the ballot at a New York municipal election, so we may be seeing a lot of roll-off in the data. In Worcester, by contrast, blank ballots never exceeded 5 percent of invalid ballots. This makes sense. Barring a referendum, city council and school board were the only offices on the ballot at a Worcester municipal election. But those NYC error rates are still quite high if we subtract the 1937-41 average blank-ballot rate.

2) Complexity? New York City typically had long ballots. This was especially true in Brooklyn and Manhattan, two hotbeds of PR agitation in the run-up to adoption. Both always had higher error rates than the other three boroughs. Both also spawned more candidate entry than the other three boroughs, which would increase ballot complexity by increasing the volume of text to read. Still, I’m not satisfied. Manhattan and Brooklyn typically had at least as many invalid ballots (when subtracting the 1937-41 blank rate) as the other cities, even when the other cities had more candidates on the ballot.

3) Race? This was my first hunch. Worcester was even whiter in the 1950s than it is today, and it has the lowest average error rate of all seven jurisdictions. I do not know the historic racial composition of Staten Island. Otherwise, within-jurisdiction average error rates appear to increase with racial diversity. Some spikes in the data are also worth noting. We see three in Cincinnati, and each coincides with strong runs by one or more black candidates: Hall and Conrad in 1927, Locker in 1941, and Berry (running for a third term and effectively the mayoralty) from 1953 onward. Less obvious spikes in the Cincinnati trend coincide with less notable runs. Turning to New York City, Powell enters council from Manhattan in 1941, earning the third-highest vote total among 20 declared candidates. But what about the other four boroughs?

4) New-voter mobilization? Berry, Conrad, Hall, Locker, and Powell all may have mobilized new voters when they ran most strongly. In New York, the Communist Party from 1939 to 1941 roughly doubled its first-round vote total in every borough but Staten Island (where it never ran candidates). It may be that this doubling came from new voters. If new voters are the ones who disproportionately invalidate ballots, this would explain the spikes in Cincinnati and four of five New York City boroughs. Worcester’s spikes in 1951 and 1953 remain puzzling, but redevelopment beginning in those years was disruptive.

5) Fraud? Those NYC error rates are staggering. Can’t rule it out.
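
For readers who want to redo the blank-ballot adjustment from conjecture 1, here is a minimal sketch in Python. It rests on assumptions of mine: that "subtracting the blank rate" means removing the 1937-41 average blank share (25 percent of invalid ballots) from a jurisdiction's invalid-ballot rate, and that the function names and the ballot counts in the example are purely hypothetical, not figures from my data.

```python
def invalid_rate(invalid_ballots: int, total_ballots: int) -> float:
    """Share of all ballots cast that were ruled invalid (blank ballots included)."""
    if total_ballots <= 0:
        raise ValueError("total_ballots must be positive")
    return invalid_ballots / total_ballots


def net_of_blanks(rate: float, blank_share: float = 0.25) -> float:
    """Invalid rate after removing the assumed blank-ballot share.

    blank_share is the fraction of invalid ballots assumed to be blank.
    The default, 0.25, is the 1937-41 New York City average quoted above;
    applying it to later years is an assumption, not a measured value.
    """
    if not 0.0 <= blank_share <= 1.0:
        raise ValueError("blank_share must be between 0 and 1")
    return rate * (1.0 - blank_share)


# Hypothetical counts for illustration only.
raw = invalid_rate(invalid_ballots=12_000, total_ballots=100_000)
adjusted = net_of_blanks(raw)
print(f"raw invalid rate: {raw:.3f}, net of blanks: {adjusted:.3f}")
```

Swapping in the minimum (0.12) or maximum (0.57) blank share gives a rough range for how much of a given error rate could be abstention rather than mistakes.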

Going forward

I think we need to be cautious when interpreting the science that exists. I often hear opinions from people who’ve gotten a “sense of the findings,” if you will. We need more findings, and we need to integrate the results we do have into a more coherent picture of voter error, ballot roll-off, and the people most affected by each.