
Exploring the Younger-Older Voter Registrations in the College Counties

Selecting Putative Registration Drives

The college counties are BUNCOMBE, DURHAM, FORSYTH, GUILFORD, JACKSON, MECKLENBURG, ORANGE, PITT, UNION, WAKE, WATAUGA. The voter registration data for these counties exclude October, November and December for even years, i.e., years in which there are general elections, due to registration deadlines and holidays. Odd years include all twelve months. In addition, the data exclude weekends and holidays, which I have discussed in other reports in this analysis.

The process I use for identifying putative registration drives treats the number of registrations by county by day as a collection of time series and looks for anomalous events. Each day carries two counts: N.le22, the registrations of people twenty-two years or younger, and N.gt22, the registrations of older people. The fundamental criterion is that a drive produces a significantly larger N.le22 than is characteristic of the data and, at the same time, a large difference between N.le22 and N.gt22. This process can be seen through an illustration. The following plot is selected from the yearly data from 2015 through 2020. I use yearly intervals because they are more convenient to work with than the entire six years, and because the Thanksgiving and Christmas holidays, along with the general elections at the start of November, provide a natural division.

The process of identifying anomalous events consists of the following steps, which are carried out separately for each county. A driver for the selection process is that the data are noisy: the day-to-day variation in registrations is considerable, which makes the usual time series methods cumbersome to apply. Consequently, I rely on subjective judgment for some steps.

  1. The data is presented as two numbers associated with each day: the number of registrations of people twenty-two years or younger (N.le22), and those of greater ages (N.gt22).
  2. Those days where N.le22 is greater than N.gt22 are considered as candidates for further analysis.
  3. The candidate days are arranged by N.le22 in descending order.
  4. A subjective choice is made to select those days at the top of the list that have what I consider significant N.le22 values and where the difference between N.le22 and N.gt22 is also large.

I also verified that, on the selected dates, the age distribution of registrants over 22 was unambiguously disjoint from that of registrants 22 and under.
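Steps 2 and 3 above are mechanical and can be sketched in a few lines; step 4 remains a judgment call and is not automated here. This is a minimal illustration only, not the code used for the report. The `DayCount` record, the `candidate_days` function, the county name EXAMPLE and all counts are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DayCount:
    """One county-day record (hypothetical layout, not the report's own)."""
    county: str
    day: date
    n_le22: int  # registrations by people 22 or younger (N.le22)
    n_gt22: int  # registrations by people older than 22 (N.gt22)


def candidate_days(rows, county):
    """Step 2: keep days where N.le22 > N.gt22 for one county.
    Step 3: order them by N.le22, largest first (earlier day first on ties)."""
    picked = [r for r in rows if r.county == county and r.n_le22 > r.n_gt22]
    return sorted(picked, key=lambda r: (-r.n_le22, r.day))


# Made-up counts for illustration only; not taken from the registration data.
rows = [
    DayCount("EXAMPLE", date(2019, 1, 7), 40, 12),
    DayCount("EXAMPLE", date(2019, 1, 8), 5, 9),    # dropped: N.le22 <= N.gt22
    DayCount("EXAMPLE", date(2019, 1, 9), 65, 10),
    DayCount("OTHER", date(2019, 1, 9), 30, 8),     # dropped: different county
]

top = candidate_days(rows, "EXAMPLE")
for r in top:
    print(r.day, r.n_le22, r.n_gt22, r.n_le22 - r.n_gt22)
```

The printed difference N.le22 - N.gt22 is the second quantity inspected by eye in step 4, alongside N.le22 itself.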

That process led to the following set of days. The 2020-01-03 early registration harvesting, a statewide event, is also included in this analysis although not shown here.

COUNTY dates
BUNCOMBE 2019-09-26, 2019-11-05, 2018-03-09
DURHAM <2018-04-03, 2018-04-12, 2018-04-13>, 2017-09-29, 2016-02-08, 2015-10-09
FORSYTH 2018-05-31, 2017-02-03, 2016-09-13
GUILFORD <2019-10-08, 2019-10-10>, <2019-10-11, 2018-09-18, 2018-09-24>, 2018-09-26, 2017-09-15, <2017-10-25, 2016-02-03, 2016-02-08>, 2016-02-11, 2015-10-09
JACKSON 2020-02-07, 2019-08-19, 2019-08-27, 2019-09-12, 2019-09-25, 2017-10-10, <2016-09-06, 2016-09-07>, <2016-08-24, 2016-08-29>, <2016-09-16, 2016-09-19>
MECKLENBURG 2019-03-15, 2019-08-16, <2018-03-09, 2018-03-12, 2018-03-13, 2018-03-14>, 2017-07-14
ORANGE 2020-02-07, 2019-09-26, 2019-10-11, 2019-12-09, 2017-10-13, 2016-08-22, 2015-09-29, 2015-10-09
PITT <2019-09-19, 2019-09-24>, 2019-10-04, <2018-08-20, 2018-08-29>, 2017-09-21, 2016-08-29, <2016-09-15, 2016-09-22, 2015-09-23>, 2015-10-07
UNION 2019-03-15, 2018-08-20, 2017-05-03
WAKE 2018-06-19, 2016-02-05, 2015-04-09
WATAUGA 2020-02-07, <2020-08-14, 2020-08-18, 2020-08-24, 2020-08-26>, <2019-08-21, 2019-08-28>, 2019-09-24, <2019-10-09, 2019-10-11>, <2018-08-22, 2018-08-27, 2018-08-30>, 2018-09-18, <2017-10-04, 2017-10-06>, <2017-08-24, 2017-08-25>, <2016-08-17, 2016-08-23>, 2015-08-19, <2015-09-09, 2015-09-14, 2015-09-18>

Complications: Multi-Day Drives and Joint Drives

The selection is complicated by some of the drives appearing to have taken place over a few consecutive or nearby days (shown above in angle brackets). How should these be addressed? There seem to be a few choices, including treating the days separately or combining them in some consistent fashion. I choose not to treat them separately because of the clutter that would cause and the smaller numbers of young people involved on each of those days. Instead, I pragmatically associate nearby days by using a grouping variable, and propagate that into any subsequent analysis. I assign each of these grouped drives a single calendar date: the day with the maximum count of young persons or, in the case of ties, the earliest of the days. Claiming an association between supposed drive days requires some subjective decisions, and accordingly there will be some decreased confidence in the results of the analysis. I will point out these drives in the various reports.
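The rule for assigning a single date to a grouped drive can be written down directly. The sketch below is an illustration under assumptions, not the report's code: the function name `representative_day` and the counts are hypothetical, and the tie-break is read as "the earliest among the days that share the maximum count".

```python
from datetime import date


def representative_day(group):
    """Pick the calendar date that stands for a grouped drive.

    `group` is a list of (day, n_le22) pairs for the days in one group.
    The day with the largest N.le22 wins; ties go to the earliest day.
    """
    if not group:
        raise ValueError("a grouped drive needs at least one day")
    # Sort key: larger count first (negated), then earlier date first.
    best_day, _ = min(group, key=lambda pair: (-pair[1], pair[0]))
    return best_day


# Made-up counts for illustration only.
group = [
    (date(2018, 8, 22), 30),
    (date(2018, 8, 27), 45),
    (date(2018, 8, 30), 45),  # ties with 08-27 on count, but is later
]
chosen = representative_day(group)
print(chosen)
```

Keeping the group membership alongside the chosen date (for example as a group identifier on each row) is what allows the grouping to be propagated into later analysis.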

Another complication is “joint” drives, those that involve older persons as well as younger ones. I consider these to constitute a separate category and exclude them from the present analysis. There is no hard and fast rule for distinguishing joint drives but, roughly speaking, I consider a drive joint if, in the count for a day, the number of older persons is more than half the number of young persons.
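The rough threshold for a joint drive amounts to a one-line test. The sketch below only illustrates that rule of thumb; it is not a hard and fast classifier, and the function name `is_joint_drive` and the counts are hypothetical.

```python
def is_joint_drive(n_le22, n_gt22):
    """Rule of thumb: joint if N.gt22 is more than half of N.le22.

    Written as 2 * n_gt22 > n_le22 to stay in integer arithmetic.
    """
    return 2 * n_gt22 > n_le22


# Made-up counts for illustration only.
print(is_joint_drive(40, 25))  # older count exceeds half of 40
print(is_joint_drive(40, 20))  # exactly half does not count as "more than half"
print(is_joint_drive(40, 10))
```

Days flagged this way would be set aside as a separate category rather than analyzed with the younger-voter drives.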


Detailed plots of the data for the college counties are in Appendix A of this report.


This report was run on 2021-01-07.