Polling Links (General and on Weighting) Appendices: Implementing Sample Weights

Alan Reifman’s
Party ID & Sample Weighting Page

 
Examining Partisan Composition in Public Opinion Polls and…

 

 
...the Ongoing Controversy over Weighting by Party ID





2022
National exit polls (in turnout for U.S. House races) show breakdown of
Republicans 36%, Democrats 33%, and Independents 31%



January/February 2022: Charles Franklin examines Gallup's early 2022 report of GOP party-ID surge (here and here)



2020
National exit polls show breakdown of
Democrats 37%, Republicans 36%, and Independents 26%


2018

National exit polls (in turnout for U.S. House races) show breakdown of
Democrats 37%, Republicans 33%, and Independents 30% (extremely similar to 2016)



Pew Research Center Study of 
Stability and Change in Partisan Identification Among the Same Respondents,
Surveyed Repeatedly Between December 2015-March 2017 (Released May 17, 2017)



2016

National exit polls reveal partisan distribution of Democrats 36%, Republicans 33%, and Independents 31%.

Pollster Nick Gourevitch quoted in Politico article "argu[ing] against letting partisanship float from poll to poll. 'I think you’re in a world where that fundamentally doesn’t work. You have to bring some assumptions to the table'" (November 3).

YouGov researchers argue that "when things are going badly for a candidate, their supporters tend to stop participating in polls" and recommend weighting to vote percentages in past elections (November 1).

Auto Alliance/Entertainment Software Association polls regularly present three different sets of results: for strong Democratic turnout, strong Republican turnout, and neutral. I encourage reporting of results under multiple assumptions. (This poll appears to have grown out of the Auto Alliance's survey of "nearly 5,000 car owners a month." A key question, therefore, is whether/how they account for people without cars.)

Excellent overview article on Party ID, sample weighting, and other sampling issues, by University of San Francisco professor and Bloomberg Politics analyst Ken Goldstein (September 20).

"...conventions are raising both candidates’ poll numbers by temporarily increasing their voters’ response rates to pollsters" (Vox, August 1). Hence the need to be aware of polls' partisan composition.

FiveThirtyEight dismisses the notion that polls are oversampling Democrats -- with some useful Party ID statistics (August 9).





2014 Notes


New York Times's exit-poll-based "Portrait of the Electorate" indicates that self-identified Republicans comprised 37% of the 2014 midterm electorate, Democrats 36%, and Independents 28% (be sure to click on "Size bars according to population of groups")

Columbia University statistician Andrew Gelman and colleagues write that:

The short story is much of the apparent changes in public opinion are actually changes in patterns of nonresponse:  When it looked like [2012 GOP nominee Mitt] Romney jumped in popularity, what was really happening was that disaffected Democrats were not responding to the survey while resurgent Republicans were more likely to respond. From a “methods” point of view, the key step is to poststratify by party ID...
 
(original Washington Post article and addendum; August 2014)

Campaign 2012


CNN exit polls peg 2012 Party ID breakdown as Democrat 38%, Repub. 32%, and Independent 29%.

Article by Samuel Best and Brian Krueger on Gallup missing the 2012 Party ID percentages (based on exit polls) by a wide margin

Because of questionable partisan composition, Pew Research Center weights its final 2012 pre-election poll by who respondents say they voted for in 2008 (similar to weighting by Party ID), "but that's not something we like to do."

INTRODUCTION TO CAMPAIGN 2012:  It's the issue that never seems to go away -- polls' partisan composition (the percentages of self-identified Democrats, Republicans, and Independents in the sample) and what to do (if anything) when these percentages seem out of whack compared to previous baselines. In August 2012 alone, four articles appeared (links below), expressing different opinions on the matter. I invite you to peruse these articles and all the other resources on this site, which I've maintained since 2004.

Stu Rothenberg suggests that "...when the partisan make-up of a sample differs significantly from the previous month’s, you may see a trend that says more about the samples than about the race."

Sean Trende notes, at least with some polls, that even when Party ID figures seem to be off, self-identified political ideology (liberal, moderate, conservative) may not be.

The Pew Research Center contends that "focusing on the partisan balance of surveys is, in almost every circumstance, the wrong place to look."

Harry Enten concurs in part, and dissents in part, with Pew. He argues that "There is one potential instance where re-weighting by party identification could be acceptable..."

Over the years, weighting schemes have gotten more sophisticated. Using, say, 2000 exit-poll figures on Party ID as a template to weight surveys for the 2004 election is indeed very questionable, as much about the political landscape can change in four years (or less). With dynamic weighting, a pollster sets the party weights for the next survey from the average Party ID percentages in that same pollster's own recent surveys, say, those from the past three months (here and here). Dynamic weighting thus allows for some shifting in Party ID due to recent events, while aggregating over multiple recent polls yields greater stability in Party ID than some individual surveys have shown.
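The dynamic-weighting idea can be sketched in a few lines. This is only an illustration under stated assumptions, not any pollster's actual procedure: the survey records are invented, the 90-day window stands in for "the past three months," the function name dynamic_targets is hypothetical, and the average is a simple one that ignores differences in sample size.

```python
from datetime import date, timedelta

def dynamic_targets(past_surveys, as_of, window_days=90):
    """Average each party's share over surveys dated within the window before `as_of`."""
    cutoff = as_of - timedelta(days=window_days)
    recent = [s for s in past_surveys if cutoff <= s["date"] < as_of]
    if not recent:
        raise ValueError("no surveys fall inside the window")
    return {p: sum(s[p] for s in recent) / len(recent) for p in ("D", "R", "I")}

# Hypothetical Party ID percentages from one pollster's earlier surveys.
past_surveys = [
    {"date": date(2012, 5, 15), "D": 33.0, "R": 36.0, "I": 31.0},  # too old, ignored
    {"date": date(2012, 7, 10), "D": 36.0, "R": 33.0, "I": 31.0},
    {"date": date(2012, 8, 14), "D": 38.0, "R": 32.0, "I": 30.0},
    {"date": date(2012, 9, 11), "D": 37.0, "R": 34.0, "I": 29.0},
]

# Weighting targets for a survey fielded at the start of October.
targets = dynamic_targets(past_surveys, as_of=date(2012, 10, 1))
print(targets)
```

Because the targets are recomputed before each new survey, a real shift in Party ID works its way into the weights gradually, while a single odd sample moves them only a little.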
  

Trends in Party Identification: Pollster.com/Huffington Post compilation; Rasmussen
New "Poll Manipulator" Device from the Washington Post
You can adjust the partisan split to see how the Obama-Romney horse-race numbers are affected

Other News on Party ID and Sample Weighting

After the Nov. 3 NBC/Wall St. Journal/Marist Poll showed Obama up by 6 percentage points in the crucial state of Ohio, NBC's Chuck Todd "tweeted" out what the result would have been under an alternative sample weighting on Party ID. I applaud this gesture and encourage other pollsters to report a range of results based on different possible Party ID distributions.
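Reporting a range of results under different Party ID distributions takes very little arithmetic. The sketch below assumes a poll's within-party candidate support is held fixed while the partisan mix is varied; all the numbers, the scenario labels, and the function name margin_under are invented for illustration and do not come from the NBC/Wall St. Journal/Marist poll or any other real survey.

```python
def margin_under(composition, support):
    """Obama-minus-Romney margin (points) implied by a Party ID composition.

    composition: percent of the sample in each party group.
    support: within each group, (Obama %, Romney %).
    """
    obama = sum(composition[g] * support[g][0] for g in composition) / 100
    romney = sum(composition[g] * support[g][1] for g in composition) / 100
    return round(obama - romney, 1)

# Hypothetical within-party candidate support.
support = {"D": (92, 6), "R": (5, 93), "I": (45, 47)}

# Hypothetical alternative Party ID distributions for the same poll.
scenarios = {
    "D+8": {"D": 39, "R": 31, "I": 30},
    "D+2": {"D": 36, "R": 34, "I": 30},
    "R+4": {"D": 33, "R": 37, "I": 30},
}

for name, composition in scenarios.items():
    print(f"Party ID {name}: Obama {margin_under(composition, support):+.1f}")
```

Publishing the whole set of margins lets readers see how much of a lead depends on the partisan mix rather than on movement within each party.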




Insightful analysis of Ohio polls by Nick Gourevitch, offering a simple explanation for why you could have both a large Democratic edge in Party ID and Romney leading among Independents (November 1, 2012)

YouGov's Peter Kellner writes about the Romney-Obama debates and Party ID weighting (or lack thereof)
in the polls that followed (link). Highly recommended! (October 24, 2012)

 Study by Cobb and Nie from the 2012 AAPOR conference (link) appears to show robust individual-level stability of partisan ID, as I read it. (See especially slides 17 and 18.) The same respondents were surveyed 11 times, between Nov. 2007 and Dec. 2008. Most of the switching in Party ID that did occur seemed to take place between “strong” and “weak” identification with the same party, or from weak Democrat or Republican to Independent. Almost never did a self-identified Democrat (strong or weak) become a Republican at the next assessment or vice-versa. In fact, 82% of respondents identified with the same party (ignoring strong or weak intensity) at wave 11 as they did at wave 1. (Posted October 16, 2012.)


The Wall Street Journal's "Numbers Guy" tackles the Party ID-weighting issue. See the contrasting opinions of political scientists John Sides and Donald Green. (Posted October 16, 2012.)

Political "insiders" support weighting by Party ID. (Posted October 16, 2012.)

Mark Blumenthal offers some useful thoughts on how aggregate Party ID can shift
somewhat during a campaign (September 28, 2012).

Historically, it's been Democrats who have complained about partisan composition being off in polls.
This time it's the Republicans (September 25, 2012).

Analysis of Wisconsin Party ID Composition (September 17, 2012)
Updates: Marquette Law School Poll (Sept. 19; D+8, 34-26),
NY Times/CBS/Quinnipiac (Sept. 19; D+8, 35-27)




2011 Party Identification Data
Daily Kos/SEIU/Public Policy Polling: April-June 2011
Gallup: August 2010 to May 2011




2010
11/9/2010  The New York Times presents its biennial exit-poll-based "Portrait of the Electorate." It estimates the self-identified partisan composition of the 2010 voters at 36% each Democrats and Republicans, and 27% Independent... Pollster.com has a nice article that delves into the importance of proper sample weighting, not on Party ID per se, but just on basic demographics... Emory University Professor Alan Abramowitz spanks Gallup for its highly inaccurate generic-ballot polling.

10/13/2010 
The major trackings of Party ID seem to suggest that, heading into the 2010 midterm elections, self-identified Democrats outnumber self-identified Republicans by a small margin (see charts from Pollster.com and Rasmussen). Gallup's early-October likely-voter models, however, are projecting a nine-point GOP edge over the Democrats in composition of the electorate; that would be a major departure from recent election cycles, where partisan composition has ranged from Democrat/Republican parity to Democratic advantages of several percentage points (see below).




Campaign 2008
National Party ID Numbers | Key State Party ID Numbers | Brief News Items on Party ID

My sincere appreciation to those who have provided online coverage of this site, including The Atlantic and National Review.


Pollsters' Party ID Compositions for National Surveys (2008)

Rasmussen Weighting Template for Final Days Leading Up to the Election:  39.9% Democrat, 33.4% Republican
Rasmussen Weighting Template for Week of October 26-November 1:  40.0% Democrat, 32.8% Republican
Rasmussen Weighting Template for Week of October 19-25:  39.7% Democrat, 33.0% Republican
Rasmussen Weighting Template for Week of October 12-18:  39.3% Democrat, 33.0% Republican

Zogby Weights:  38% Democrat, 36% Republican

Daily Kos/Research 2000 Weights: 35% Democrat, 26% Republican
I had erroneously been including this poll in the non-fixed-weight listing below.

The polls below do not re-weight or adjust their results based on Party ID.  Nevertheless, the Party ID percentages they obtain from their random samples are informative in judging the partisan composition of the electorate.  Party ID percentages for a given poll are shown by the respective "D" and "R" notations, with links to the original sources, if still available. Obama (O)-McCain (M) "horse race" numbers are shown below the Party ID percentages in italics.
Daily tracking polls use a moving (or rolling) average of the last few days' (usually 3) readings.  For example, a poll released on a
Thursday could be an average of Monday, Tuesday, and Wednesday's polling.  Because polls released on adjacent days thus have some
overlap in their data, such polls will likely show considerable stability in reported results over the short term.   
If a national poll you've seen in the news is not included below, it's probably because it didn't publicly report Party ID numbers.
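The rolling average used by daily tracking polls can be sketched as follows. The nightly readings are hypothetical and the function name rolling_average is my own; the three-day window follows the "usually 3" convention described above.

```python
def rolling_average(daily, window=3):
    """Average of the most recent `window` nightly readings, one value per release day."""
    return [
        sum(daily[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(daily))
    ]

# Hypothetical nightly Democratic Party ID percentages, Monday through Friday.
nightly = [40.0, 37.0, 43.0, 40.0, 37.0]

# Releases for Thursday, Friday, and Saturday (each covers the prior three nights).
print(rolling_average(nightly))
```

Each release shares two of its three nights with the release before it, which is why the reported figures stay steady even when individual nights swing by several points.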

Poll (Pollster), with readings listed in order of reporting date* from 10/16 through 11/3, followed by Election Day (11/4). Each entry gives Party ID (D-R) and then the Obama-McCain margin.

GWU Battleground (Tarrance-Lake): D43-R39, O+6#; D42-R37, O+3; D42-R37, O+3. Although this is a daily tracking poll, full detailed reports (with Party ID) appear to be released only on Thursdays.

ELECTION DAY: "Democrats made up 39 percent of the electorate and Republicans 32 percent in a national exit poll for The Associated Press and television networks." These exit-poll figures match well with Rasmussen's template (above) and the daily averages (below).

Diageo-Hotline: Not accessible; D41-R36, O+10; Party ID numbers unavailable; D40-R37, O+5; D41-R36, O+6; D41-R38, O+5; D41-R38, O+5; D41-R37, O+7; D41-R37, O+7; D42-R37, O+8; D41-R37, O+8; D41-R37, O+8; D41-R36, O+7; D41-R37, O+6; D42-R36, O+7; D42-R36, O+7; D42-R36, O+5; D41-R36, O+5

Democracy Corps (D-affiliated): D41-R34^, O+5 (10/20); D38-R35^, O+9 (10/24); D40-R31^, O+8 (11/3)

Wash. Post-ABC: D35-R28, O+9; D36-R29, O+9; D36-R29, O+11; D36-R27, O+11; D37-R29, O+9; D37-R29, O+9; D37-R30, O+7; D37-R32, O+7; D36-R33, O+7; D37-R31, O+8; D36-R29, O+8; D37-R28, O+9; D38-R29, O+9; D37-R31, O+11; D37-R32, O+9

American Research Group: D41-R32, O+4; D41-R33, O+5

Pew Research Center: D38-R32, O+14; D39-R29, O+15; D37-R31, O+6

Associated Press-Gfk: D34-R28@, O+1

FOX News: D43-R37, O+9; D41-R39, O+3; D42-R36, O+7

Big Ten Academic (National): D36-R30$, O+9

CBS-New York Times: D41-R28, O+13; D39-R31, O+11; D41-R28, O+13

Newsweek: D36-R27, O+12

Daily (More or Less) Averages, in chronological order (Democratic %, Republican %, Democratic edge):
D39.4, R32.6, D+6.8
D38.5, R33.0, D+5.5
D39.2, R32.0, D+7.2
D38.0, R32.0, D+6.0
D39.2, R33.0, D+6.2
D39.8, R34.6, D+5.2
D39.4, R33.2, D+6.2

*Date the poll was reported.
#This poll measures the "horse race" question both with and without the VP candidates being named.  The version with the VP candidates appears to be considered the primary one.
^For this organization's 10/20 poll, adding in Independents who lean toward the parties, the Democratic number would become 48 and the Republican one, 43.  For the 10/24 poll, the D number would become 49 and the R one, 45.  For the 11/3 poll, the D number would become 51 and the R one, 43.
@Adding in Independents who lean toward the parties, the Democratic number would become 45 and the Republican one, 40.
$Adding in "leaners" would yield a Democratic ID of 51 and a Republican one of 42 (raw numbers of D and R leaners in question 34C were added to respective party totals in PID_3WOL)

Click here to return to top of page.



Estimated Party ID Composition in Key 2008 States

October 9, 2008 -- The American Research Group (ARG) released a batch of state polls today.  The one getting the most discussion is that from West Virginia, where Barack Obama is said to be leading John McCain 50-42%.  Though the Mountaineer State was long a Democratic stronghold, the GOP has dominated at the presidential level of late (Bush beat Gore by 6% in 2000, and Kerry by 13% in 2004).  Further, McCain has consistently led in West Virginia during the current campaign, albeit with some recent narrowing.  Whenever a polling result looks odd, discussion usually turns to partisan composition of the sample.  Did the pollster oversample Democrats or Republicans (or neither)?  This latest situation, involving West Virginia, has motivated me to complete a project I've long been considering -- namely, compiling the best estimates I can find of the Party ID composition in several swing states.  Below is what I've found.  The chart is definitely a work-in-progress, though, so please send me additional data sources if you know of any (alan.reifman@ttu.edu).

State | Voter Registration (if done by party in a given state and available) | 2006 Exit Poll (from U.S. Senate or Gov. race) | Other Sources
Colorado | R+3 (34-31) (a) | -- | Nov. 2007 poll:  EVEN (34-34) with "2-3-2" format; D+5 (49-44) with "3-1-3" format. (n)  See Note 1. Also D+4 (p)
Florida | D+4 (41-37) (a) | R+3 (39-36) (c) | D+6 (p)
Indiana | -- | -- | --
Michigan | -- | D+7 (40-33) (d) | D+7 (p)
Minnesota | -- | D+4 (40-36) (e) | --
Missouri | -- | R+2 (39-37) (f) | D+6 (p)
Nevada | D+6 (43-37) (a) | R+7 (40-33) (g) | --
New Hampshire | R+1 (31-30) (a) | -- | --
New Mexico | D+18 (50-32) (a) | D+9 (41-32) (h) | --
North Carolina | D+12 (45-33) (a) | -- | D+13 (p)
Ohio | -- | D+3 (40-37) (i) | D+10 (p)
Pennsylvania | D+12 (50-38) (a) | D+5 (43-38) (j) | --
Virginia | -- | R+3 (39-36) (k) | D+5 (p)
West Virginia | D+27 (56-29) (b) | D+19 (51-32) (l) | --
Wisconsin | -- | D+4 (38-34) (m) | Compilation by Charles Franklin:  D+15 (40-25), based on visual inspection of graph. (o)

Sources:  (a) (b) (c) (d) (e) (f) (g) (h) (i) (j) (k) (l) (m) (n) (o) (p)

Note 1. Apparently, these pollsters used a seven-point scale, including strong, moderate/somewhat, and weak Democrats and Republicans, plus "pure" independent. The notation "2-3-2," I believe, thus lumps weak Democrats and Republicans with Independents, and counts only strong and somewhat strong party identifiers with their party; "3-1-3" would put weak identifiers with their party.
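The two collapsing schemes in Note 1 can be illustrated with a short sketch. It follows the reading of "2-3-2" and "3-1-3" given in the note, which is itself an interpretation. The seven category counts are hypothetical, chosen only so that the collapsed figures reproduce the two Colorado splits in the chart (34-34 and 49-44); the category labels and the function name collapse are my own.

```python
# Seven-point scale as described in Note 1 (labels are shorthand).
SCALE = ["strong D", "moderate D", "weak D", "independent",
         "weak R", "moderate R", "strong R"]

def collapse(counts, scheme):
    """Collapse seven-point counts to D / I / R percentages.

    "2-3-2" groups weak identifiers with Independents;
    "3-1-3" groups weak identifiers with their party.
    """
    n_d, n_i, n_r = (int(x) for x in scheme.split("-"))
    if n_d + n_i + n_r != len(SCALE):
        raise ValueError("scheme must cover all seven categories")
    values = [counts[label] for label in SCALE]
    total = sum(values)
    groups = {
        "D": sum(values[:n_d]),
        "I": sum(values[n_d:n_d + n_i]),
        "R": sum(values[n_d + n_i:]),
    }
    return {k: round(100 * v / total) for k, v in groups.items()}

# Hypothetical counts per 100 respondents.
counts = {"strong D": 20, "moderate D": 14, "weak D": 15, "independent": 7,
          "weak R": 10, "moderate R": 14, "strong R": 20}

print("2-3-2:", collapse(counts, "2-3-2"))
print("3-1-3:", collapse(counts, "3-1-3"))
```

The same respondents produce an even split under one scheme and a five-point Democratic edge under the other, so it matters which scheme a pollster reports.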

Click here to return to top of page.



Brief
Campaign 2008 Items

Gallup summarizes its 2008 data on national and state-by-state Party ID (January 2009)

The Wall Street Journal examines Party ID weighting

Public Policy Polling survey firm raises interesting question about rising Democratic Party ID: "Are more people identifying as Democrats because more people are voting for Obama? Or has party id remained flat over the last two months, in contrast to what we've found in our polls, and Obama is just doing better because we're over sampling Democrats?"

ABC News polling expert Gary Langer discusses un-leaned and leaned Party ID

A look from Pollster.com at partisan composition trends from states where voter registration is by party

Nate Silver at FiveThirtyEight has a nice posting on weighting for Party ID

Pollster.com's Mark Blumenthal weighs in on "Party ID Wars" for this year's presidential campaign

Click here to return to top of page.
 



Party ID-Related Information Compiled During the
2006 Midterm Election Campaign

Dr. Reifman's summary of the party ID/sample weighting issue in the 2006 elections is available at Pollster.com.

Charles Franklin discusses end-of-the-year 2006 Gallup report on party ID

National Polls
Pollsters not shown either weight to a set partisan template  (Rasmussen, Zogby) or don’t appear to disclose their polls’ partisan composition

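Poll | Days in Field (2006) | Bush Job Approval (%) | % Vote for Congress? (Generic Ballot): D | % Vote for Congress? (Generic Ballot): R | % of Sample Comprised of Self-Identified Partisans: D | % of Sample Comprised of Self-Identified Partisans: R | Partisan Composition of Sample, Relative to Rasmussen Gold Standard (a) [37.0 D, 32.3 R]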

NY Times Exit Poll (Click here, then on "Graphic") | Election Day, November 7 | --- | 54 | 46 | 39 | 36 | D margin (+3) predicted well by late polls
FOX-Opinion Dynamics | Nov 4-5 | 38 | 49 | 36 | 39 | 35 | Margin about right
Pew Research Center (also here) | Nov 1-4 | 41 | 48/47(b) | 40/43 | 35 | 31 | Margin about right
ABC-Washington Post | Nov 1-4 | 43 (registered) | 53/51(b) | 43/45 | 35/33 | 32/34 | R edge in LV contrary to all other known readings
Time-SRBI | Nov 1-3 | 37 | 55 | 40 | 29 | 26 | D edge understated
New York Times-CBS | Oct 27-31 | 34 | 52/52(b) | 33/34 | 35 | 28 | D edge overstated
Cook-RT Strategies (subscription) | Oct 26-29 | 41/39(b) (38)(d) | 52/59 (61)(c) | 39/36 (35) | 32/39 (41)(e) | 30/29 (29) | Reg. understates D edge, more likely categories overstate it
AP-AOL-Ipsos | Oct 20-25 | 37/38(b) | 54/56 | 37/37 | 36/37(e) | 30/30 | D edge overstated
FOX-Opinion Dynamics | Oct 24-25 | 40 | 49 | 38 | 39 | 36 | D edge understated
Diageo-Hotline | Oct 19-23 | 40/40(b) | 49/52 | 34/34 | 31/35(e) | 28/28 | Registered understates D edge, likely-voter overstates it
These green demarcations are to group polls by date, to compute averages for given time frame (see below).
ABC-Washington Post | Oct 19-22 | 40 (registered) | 54 | 41 | 30 | 28 | D edge understated
Cook-RT Strategies (subscription) | Oct 19-22 | 37/35(b) (37)(d) | 49/56 (57)(c) | 37/34 (35) | 31/36 (37)(e) | 29/27 (29) | D edge overstated among likeliest voters
Newsweek (more recent poll done, but party ID not available) | Oct 19-20 | 35 | 55 | 37 | 36 | 29 | D about right, R understated
American Research Group | Oct 18-21 | 36 (registered) | --- | --- | 37 | 34 | D right on, R slightly high
Partisan composition averages of the four polls (above) taken from Oct 18-22 inclusive:  D 34.4 , R 29.8
Pew Research Center | Oct 17-22 | 38 | 49 | 38 | 32(e) | 26 | D edge slightly overstated
Democracy Corps (Dem. affiliated) | Oct 15-17 | 39 | 51 (54)(f) | 40 (40) | 40(e) | 32 | D edge overstated
NBC-Wall St. J. (more recent poll done, but party ID not available) | Oct 13-16 | 38 | 52(c) | 37 | 28(e) | 27 | (D edge 43-37 w/ Ind-leaners matches Ras margin decently)
FOX-Opinion Dynamics | Oct 10-11 | 40 | 50 | 41 | 38 | 34 | Nearly perfect!
These green demarcations are to group polls by date, to compute averages for given time frame.
USA Today-Gallup (more recent poll done, but party ID not available) | Oct 6-8 | 37 | 59 | 36 | 38 | 29 | D about right, R understated
Cook-RT Strategies (subscription) | Oct 5-8 | 41/42(b) (42)(d) | 49/51 (50)(c) | 38/40 (41) | 33/35 (37)(e) | 33/34 (35) | D edge understated
New York Times-CBS | Oct 5-8 | 34 | 49 | 35 | 35 | 30 | D-R margin almost perfect
ABC-Washington Post | Oct 5-8 | 39 | 54 | 41 | 38 | 27 | D edge (11%) unusually large
Newsweek | Oct 5-6 | 33 | 51/51(b) | 38/39 | 38/39 | 27/29 | D edge (10-11%) unusually large
Partisan composition averages of the five polls (above) taken from Oct 5-8 inclusive:  D 36.9, R 29.6
Democracy Corps (Dem. affiliated) | Oct 1-3 | 43 | 51 (50)(f) | 41 (44) | 39(e) | 36 | D edge understated somewhat
AP-Ipsos | Oct 2-4 | 38/39(b) | 54/51 | 38/41 | 36/36(e) | 28-30 | D about right, R understated
Time-SRBI | Oct 3-4 | 36 | 54 | 39 | 35 | 27 | D edge (8%) unusually large
Pew Research Center | Sept 21-Oct 4 | 37 | 51 | 38 | 34 | 27 | D advantage overstated a bit
NBC-Wall St. Journal | Sept 30-Oct 2 | 39 | 48 | 39 | 43 | 37 | D advantage overstated a bit. Shown here.
Cook-RT Strategies (subscription) | Sept 21-24 | 40/40(b) (47)(d) | 49/54 (49)(c) | 35/36 (41) | 34/40 (38) | 31/31 (35) | D-R margins fluctuate around Ras.
Diageo-Hotline | Sept 24-26 | 42/42(b) | 43/46 | 33/33 | 39/43 | 32/34 | D too high; R about right
FOX-Opinion Dynamics | Sept 26-27 | 42 | 49 | 38 | 38 | 34 | Virtually perfect!
American Research Group | Sept 18-21 | 39 (registered) | --- | --- | 35 | 32 | 3-pt difference a little narrow
New York Times-CBS | Sept 15-19 | 37 | 50 | 35 | 32 | 30 | Understates D edge somewhat
USA Today-Gallup | Sept 15-17 | 44 | 51/48(b) | 42/48 | 34 | 31 | 3-pt difference a little narrow

AP-Ipsos | Sept 11-13 | 41/40 | 51/53 | 39/39 | 35/37 | 26/27 | D about right; too few R
FOX-Opinion Dynamics | Sept 12-13 | 40 | 41 | 38 | 36 | 35 | D about right; R slightly high
Pew Research Center | Sept 6-10 | 37 | 50 | 39 | 34 | 30 | 4-pt difference about right
NBC-Wall St. Journal | Sept 8-11 | 42 | 48(c) | 39 | 40 | 37 | 3-pt difference a little narrow

Lettering turns grey when a more recent poll by the same organization becomes available.  All polls above are national.  For discussion of partisan composition of state polls, see other sections of this website.

(a)  I use data from the Rasmussen daily polls as my personal “gold standard” of what the “correct” partisan composition is, due to their large aggregate sample sizes and sophisticated methods:  “At Rasmussen Reports, we adjust our party identification weighting targets each month based upon actual survey results from the previous 90 days. For the month of October, our partisan weighting targets are 37.0% Democrat, 32.3% Republican, and 30.7% unaffiliated. That's little changed from our September targets of 37.0% Democrat, 32.7% Republican, and 30.2% unaffiliated.”  In an October 12 posting on his website, Rasmussen reports his polls have detected "no impact" of the scandal involving former Congressperson Mark Foley on party ID.  On November 1, Rasmussen announced new party ID figures of 37.7% D and 31.5% R.  Since there's not that much time until the elections, I'll just keep the October figures at the top of the chart.

(b)  For registered and “likely” voters, respectively.

(c)  Asked in terms of preference for which party to control Congress.

(d)  Among "most likely voters."

(e) Before "hard push" to see if Independents will align with either major party (or, in other polls, excluding Independents who lean D or R). 

(f) Using candidates' actual names, instead of generic reference to "Democrat" or "Republican."
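The comparison in the chart's last column boils down to setting a poll's D-R margin against the Rasmussen template's margin. A minimal sketch, using the October targets quoted in note (a) and Party ID figures from two rows of the chart; the function name edge_vs_template is hypothetical, and no cutoff for "about right" versus "understated" is assumed because the chart does not state one.

```python
RASMUSSEN_OCT_2006 = {"D": 37.0, "R": 32.3}  # October targets quoted in note (a)

def edge_vs_template(sample_d, sample_r, template=RASMUSSEN_OCT_2006):
    """Return (poll's D-R margin, template's D-R margin, poll minus template)."""
    poll_edge = sample_d - sample_r
    template_edge = template["D"] - template["R"]
    return poll_edge, round(template_edge, 1), round(poll_edge - template_edge, 1)

# Party ID figures taken from two rows of the chart.
print("FOX-Opinion Dynamics, Nov 4-5:", edge_vs_template(39, 35))
print("Time-SRBI, Nov 1-3:", edge_vs_template(29, 26))
```

A difference near zero corresponds to the "about right" entries, and a clearly negative one to "D edge understated."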

My thanks to the pollsters who, in the spirit of openness and transparency, report the internal details of their surveys on their websites!

Click here to return to top of page.



Weighting Pre-Election Polls for Party Composition:
Should Pollsters Do It or Not?

by Alan Reifman

September 9, 2004 -- We have seen a lot of polls thus far in the Bush-Kerry race and we're going to see a lot more.  Often polls by two different survey outfits taken at the same time will show results in pretty stark disagreement.  Literally as I write this, Rasmussen Reports has the race a virtual dead-heat (Bush 47.5, Kerry 46.8; link no longer available), while CBS News has Bush up by 7.

All pollsters try to obtain a random, representative sample of voters to represent the full electorate.  In addition to vote choice (i.e., Bush, Kerry, or other), pollsters always ask respondents which party they align themselves with.  These two measures -- candidate preference and party ID -- often show great overlap, with Republicans (R's) heavily going for Bush and Democrats (D's) heavily going for Kerry.  However, people sometimes vote for the other party's candidate, so candidate preference and party ID are not identical.

One factor (among many) that may contribute to discrepancies between different outfits' polls in their Bush-Kerry margins, I will argue, is polling firms' different philosophies as to whether it's advisable to mathematically adjust their samples -- after all the interviews have been completed -- to make the percentages of D's and R's in their survey sample match the partisan composition that is likely to be evident at the polls on Election Day.  The latter can be estimated from exit polls from previous elections, party registration figures (in states where citizens declare a party ID when registering to vote), and surveys.  

(Another issue that often comes up in evaluating pre-election surveys, with which many of you may be familiar, is whether results are reported for "registered" or "likely" voters.  That is a different issue from what is being discussed presently.  Whether a pollster reports results for registered voters, likely voters, or both, weighting by party ID is a separate, independent decision.)

Exit polls from the three previous presidential elections yield the following percentages of the electorate comprised of self-identified D's, R's, and independents (from Zogby; link no longer available).

  Democrats Republicans Independents
1992 34% 34% 33%
1996 39% 34% 27%
2000 39% 35% 26%

One additional source is Democratic pollster Stanley Greenberg, author of The Two Americas.  He found, after conducting 15 national polls with an aggregate 15,045 voters from late 2001 until early 2003 and allocating "leaners" to the relevant party, that each major party had the allegiance of 46% of the voters.

The controversy occurs when a poll of, say, 1,000 voters shows a partisan composition vastly different from what we've come to expect.  Should the pollster make statistical adjustments (described below) to make the party breakdown conform to more typical estimates, or should he/she just leave the numbers alone and report the findings?  A good summary of the "back and forth" of this controversy is available in this Los Angeles Times article from earlier this summer.  I will be referring back to this article.

The following two scenarios should illustrate the key issues.  Before presenting the scenarios, I want to state that there are noted national authorities on either side of the "weight/no weight" debate (including Democrats on both sides and Republicans on both sides).  Each reader should decide for him- or herself.  A simple illustration of how to actually implement a sample weighting is described in Appendix A, with references for more complex situations in Appendix B.

SCENARIO 1

As noted above, most recent estimates of the partisan composition of the electorate suggest a rough balance between the number of voters leaning toward the D and R parties (i.e., "50/50 nation"), with the possibility that there might be slightly more D's than R's.  

In his aforementioned book, Greenberg characterizes party identification as "... a form of social identity, not unlike ethnicity or race, with considerable durability over time" (p. 93).  I would argue that individual-level stability should generally lead to population-level stability, although not perfectly so (e.g., from one presidential election to the next, some people pass away, others newly turn 18, immigrants become eligible to vote).

Suppose a pollster completes a survey and finds far more self-identified R's in the sample than D's.  This happened in the Newsweek poll released in early September right after the R convention that gave Bush an 11-point lead over Kerry.  Newsweek's sample contained 38% R, 31% D, and 31% I (article links no longer available).  There would seem to be three plausible explanations for the higher-than-usual sampling of R's:

  • There was a sudden, massive shift in party ID after the R convention.

  • Given that Newsweek's polling was done on Sept. 2-3 (partially overlapping the convention), one could argue that more R's than D's would have made it a point to be home to watch the convention, thus making themselves more accessible to telephone interviewers; even after the final day of the convention, R's may have been more politically energized, making them more likely to agree to participate in a survey.

  • It could have just been plain, "old fashioned" sampling error -- just as a coin, with probabilities of 50% heads and 50% tails, can yield 60% heads in a sequence of flips, random sampling of households could have yielded excessive R's just by chance.
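How far chance alone can move a party's share is easy to gauge. The sketch below assumes a simple random sample of 1,000 (Newsweek's actual sample size is not given here) and a "true" Republican share of 35%, the 2000 exit-poll figure; real polls have design effects that make the error somewhat larger, and the function name sampling_z is my own.

```python
import math

def sampling_z(observed_pct, true_pct, n):
    """Standard error (in points) and z-score of an observed share under simple random sampling."""
    p = true_pct / 100
    se = 100 * math.sqrt(p * (1 - p) / n)
    return se, (observed_pct - true_pct) / se

# Newsweek's sample was 38% R; n = 1,000 and a true share of 35% are assumptions.
se, z = sampling_z(observed_pct=38, true_pct=35, n=1000)
print(f"standard error: {se:.2f} points, z = {z:.2f}")
```

Under these assumptions, a 38% reading sits about two standard errors above 35%, a gap that chance alone produces only occasionally.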

If Greenberg is correct about the stability of individuals' party ID, then the first of the three explanations (a sudden shift) seems unlikely.  The fact that other recent polls besides Newsweek's have obtained samples with more R's than D's seems to go against the third explanation (chance).  In any event, we would conclude that if the second or third explanation were the true "culprit," Newsweek's party breakdown  would appear to be "out of whack" relative to the other aforementioned indicators.

(Again, to keep things bipartisan, the article I cited earlier as providing a good discussion of the "back and forth" of the controversy was itself prompted by a mid-summer L.A. Times poll in which it appeared there were way too many Democrats in the sample.) 

It is at this point that pollsters face the choice of whether to adjust the numbers to match more typical estimates of the D-R distribution (i.e., count R's less and D's more), or just leave the sample alone.
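Counting R's less and D's more amounts to giving each respondent a weight equal to the target share for his or her party divided by that party's share of the sample. The sketch below uses the Newsweek composition (38% R, 31% D, 31% I) and the 2000 exit-poll mix (39% D, 35% R, 26% I) as the target. The sample size of 1,000 and the candidate splits within each party are invented for illustration, and the function names are hypothetical.

```python
from collections import Counter

def party_weights(sample_counts, target_pct):
    """Weight per party group: target share divided by observed share."""
    n = sum(sample_counts.values())
    return {g: (target_pct[g] / 100) / (sample_counts[g] / n) for g in sample_counts}

def candidate_shares(respondents, weights):
    """Weighted candidate percentages from (party, choice) records."""
    totals = Counter()
    for party, choice in respondents:
        totals[choice] += weights[party]
    total_weight = sum(totals.values())
    return {c: round(100 * w / total_weight, 1) for c, w in totals.items()}

# Hypothetical respondents: 380 R, 310 D, 310 I, with invented candidate splits.
respondents = (
    [("R", "Bush")] * 353 + [("R", "Kerry")] * 19 + [("R", "Other")] * 8
    + [("D", "Bush")] * 31 + [("D", "Kerry")] * 267 + [("D", "Other")] * 12
    + [("I", "Bush")] * 146 + [("I", "Kerry")] * 140 + [("I", "Other")] * 24
)

sample_counts = Counter(party for party, _ in respondents)
target = {"D": 39, "R": 35, "I": 26}  # 2000 exit-poll mix

weights = party_weights(sample_counts, target)
unweighted = candidate_shares(respondents, {"D": 1, "R": 1, "I": 1})
weighted = candidate_shares(respondents, weights)

print("weights:", {g: round(w, 2) for g, w in weights.items()})
print("unweighted:", unweighted)
print("weighted:", weighted)
```

With these invented splits, the reweighting shrinks the Bush lead from roughly ten points to roughly two, even though no respondent's answer has changed.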

In 2000, pollster Scott Rasmussen went with the "leave things alone" strategy, with the result that his firm forecast a 9-point Bush victory over Gore in its final pre-election poll.  Rasmussen, to his credit, posted a candid summary on his website:

"Simply put, we had too many Republicans in our sample.  For a variety of reasons, our firm has never weighted by party.  However, if we had weighted the data before the election to include an equal number of Republicans and Democrats, we would have shown Bush leading by 2 points.  Had we weighted our data to match the partisan mix reported by the Voter News Service on Election Night, we would have shown Gore leading by a point" (document no longer available online).

One would surmise that Rasmussen is probably weighting this year.  [As shown above on this website in connection with the 2008 presidential election, Rasmussen indeed adopted a practice of weighting by party ID.]

Pollster John Zogby has pioneered the art of sample weighting on party ID.  Taking 1996 and 2000 together, he was the most accurate pollster in forecasting the two presidential elections.  As he noted on his website, "My polls use a party weight of 39% Democrat, 35% Republican and 26% Independent" [this 2004 document is no longer online].

Another apparent sign of polls that weight is that they should exhibit less volatility day-to-day or week-to-week than polls that don't weight.

SCENARIO 2

As previewed above, a number of prominent polling authorities would presumably argue that the Newsweek poll, with its larger-than-typical R composition, or the L.A. Times poll, with the unusually wide D edge, should be left alone and not "retrofitted" into some preconceived template of what the Election Day partisan composition will look like.

According to the L.A. Times "back and forth" article cited above:

"Andrew Kohut of the Pew Research Center for the People and the Press said that he once conducted a survey asking voters their party twice, four days apart, and that he found substantial differences in the responses."

Further, even though Democratic candidates sometimes come out with more favorable readings on party-weighted compared to unweighted polls, Ruy Teixeira, co-author of The Emerging Democratic Majority and operator of a "blog" related to the book [moved to here], opposes weighting for party ID.  In his September 5 entry, he writes:

"Does that mean I favor polls like this weighting their samples by party ID? No, I don't, because the distribution of party ID does shift some over time and polls should be able to capture this. What I do favor is release and prominent display of sample compostions [sic] by party ID, as well as basic demographics, whenever a poll comes out. Consumers of poll data should not have to ferret out this information from obscure places--it should be given out-front by the polling organizations or sponsors themselves. Then people can use this information to make judgements [sic] about whether and to what extent they find the results of the poll plausible."

If I had to argue for not weighting on party ID, I would make two points:

  • As embodied in Teixeira's quote, a "free market of ideas" should prevail.  With as much data as possible being released with polls, consumers can reach their own conclusions.  And, of course, pollsters acquire reputations over the years as to their surveys' accuracy in forecasting elections.  Over the long run, this should serve as a check and balance on polling/statistical procedures that have led them astray.

  • Some pollsters who do not weight on party ID may weight on other demographic characteristics (e.g., sex, race/ethnicity), on the grounds that whether one is male or female is far more stable than whether one is a D or R.  This may help eliminate some of the skew when people appear to be represented disproportionately.

Hopefully my essay has given you something to think about as the pre-election polls roll in.  There are no definitive answers on how to handle these issues.  For all we know, Election Day 2004 may have more R's voting than D's, a departure from the last three presidential elections.  In that event, polls not weighted on party ID may end up being more accurate than those with weightings based on recent past presidential elections.   

My personal advice -- whether, to use a phrase from Senator John McCain, you're a "Republican, Democrat, Libertarian or vegetarian" -- is to interpret polls favorable to your chosen candidate in cautious terms.  If the favorable poll turns out to be accurate on Election Day, you can exult, but if it was flawed, at least you won't be blindsided.

I would like to thank John Welte for discussing sample weighting with me over the years.



Postscript on Sample Weighting in the 2004 Presidential Election

November 9, 2004 -- Here are some of my initial post-election thoughts on the sample-weighting issue.  As more information becomes available, I may refine my opinions.

Given that (a) polls over the campaign's final months with partisan composition weighted to previous turnouts tended to show Kerry doing better than polls without such weighting and (b) Kerry lost, one's immediate reaction might be that party ID weighting should now be abandoned (at least tentatively).  However, the available evidence does not strongly support that view.  Bush won the national popular vote 51%-48%.  Zogby, who we know weights on party ID, had it Bush 48-47 in his final national poll, not too far away.  Rasmussen, who is believed to weight (see below), had it Bush 50.2-48.5.  Other firms that don't weight on party ID, such as Gallup (whose final likely voter poll had it Bush 49-47 before allocation of undecideds), also came pretty close to the actual figure, but the point is that firms that (presumably) weighted on party ID were not led heavily astray.

A central issue, of course, is what the actual partisan composition of the electorate is on Election Day.  According to a national exit poll of 13,660 respondents posted by CNN, this year's electorate consisted of 37% Democrats, 37% Republicans, and 26% Independents.  Compared to 2000 figures (shown below), this would represent a 2-point reduction in Democrats and a corresponding 2-point gain in Republicans, with the percent of Independents the same.  The relatively small size of the shift presumably would explain why weighting polls this year on 2000 turnout did not cause major damage.  (We know that the 2004 exit polls were messed up; if they were accurate, Kerry would now be selecting his Cabinet.  The exit poll posted by CNN apparently included adjustments to make the overall Bush-Kerry percentages match the actual vote.  Given these considerations, I suspect there's still some degree of uncertainty in the 2004 partisan-composition estimates.)

Assuming the 37-37 D-R voter composition to be roughly accurate, however, I think we can safely say that polls from this past election season showing substantial GOP edges in sample representation (one Gallup poll, taken from September 24-26, had 12% more R's than D's!) were clearly out of bounds.

ADDENDUM (11/17/04):  A Los Angeles Times national exit poll with 5,154 respondents came up with the following percentages of the electorate:  40 D, 39 R, and 19 I.  This matches fairly well with the figures from the other exit poll noted above.  (If you go to the linked document for the L.A. Times exit poll, you'll notice the high number of Californians surveyed; they were presumably weighted down to match California's share of the national population.)     




Additional Polling Resources 

Party ID Numbers from the NBC-Wall Street Journal Poll, 1990-2015

Pollster.com (formerly "Mystery Pollster"), a blog devoted to polling methodology issues, including sample weighting by party ID

Polling Report (compilation of poll results)

Performance of different survey outfits' polls in forecasting true presidential vote from 1936-2000, 2004, and 2008

CNN.com Exit Polls (for looking at Party ID and other voter characteristics, nationally and state-by-state)
2010, 2008, 2006, 2004

"Unskewed Polls" -- A 2012 newcomer to the "Party ID wars," this site provided adjusted Obama-Romney horserace numbers by re-weighting non-Rasmussen polls to Rasmussen's Party ID percentages

Dr. Reifman's lecture notes on survey sampling

Letter by Dr. Reifman in the ISR Sampler (published by the University of Michigan's Institute for Social Research), arguing that the older concept of "quota sampling" and sample weighting are conceptually similar (when the document opens up, go to page 12 to see Dr. Reifman's letter and responses by two University of Michigan survey experts).

Statement by polling firm Survey USA on its sample-weighting policy (from Daily Kos)

USA Today article on the sample-weighting controversy (original "Move On" advertisement to which Gallup is responding in article)

Political Arithmetik (quantitative analyses of polls by Wisconsin Prof. Charles Franklin, who taught Dr. Reifman in a summer stats course at Michigan over 20 years ago!)

Party ID in the USA, 1952-2008, from the University of Michigan's National Election Study (gets at the question of the stability of party ID)

"A Consumer's Guide to the Polls" (2004), reviews the practices of the leading polling outlets on several dimensions

Bloggers Chris Bowers and Steve Soto were very active in analyzing the sample-weighting controversy in 2004.  Their writings led me to many of the websites linked above, for which I thank them.

Academic article (Erikson et al., 2004, Public Opinion Quarterly) on the volatility of samples' partisan composition in daily tracking polls.

Daily Kos diary by Dr. Reifman on the partisan composition of a poll in the New Jersey U.S. Senate race, with Kos reader comments (9/25/06).

Memo by pollster David Winston on possible oversampling of Democrats in 2006 pre-election polls  (10/16/06).

Website of Columbia University professor Andrew Gelman (see his 2001 article on post-stratification, in the Journal of the American Statistical Association, as well as other related articles)




APPENDIX A:  Implementing Sample Weights in the SPSS Program

As shown in the grid below, I have created a fictitious data set of 30 respondents.  I made it so that 10 Democrats were sampled (80% of them voting for Kerry, in accordance with actual survey estimates), along with 20 Republicans (with 90% for Bush).  The gray shading is simply meant to break the grid into blocks of five lines, making it easier to count cases.  

If no weighting by party were done, Bush would be leading 67%-33% (20 to 10 in raw numbers).

If we want to re-estimate the sample with 50% Republicans and 50% Democrats (which roughly matches some actual estimates), then we would weight each Democrat 1.50 (to bring them from 10 respondents to 15) and weight each Republican .75 (to bring them from 20 respondents to 15).  
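The weights in the preceding paragraph follow one rule: each party's weight is its target count divided by its sampled count. Here is a minimal Python sketch of that arithmetic. The function name and dictionary layout are my own illustrative choices and not part of SPSS.

```python
def party_weights(sample_counts, target_shares):
    """Weight for each party = target count / sampled count."""
    n = sum(sample_counts.values())  # total respondents (30 in the example)
    return {
        party: target_shares[party] * n / count
        for party, count in sample_counts.items()
    }

# 10 Democrats and 20 Republicans sampled; target mix is 50-50.
weights = party_weights({"D": 10, "R": 20}, {"D": 0.5, "R": 0.5})
print(weights)  # {'D': 1.5, 'R': 0.75}
```

Because each weight is a ratio to the sampled count, the weighted sample still totals 30 respondents (10 X 1.50 + 20 X .75).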

The weights should be treated like any other variable, with a name such as "pweight" for party weight.  In SPSS, you would go to the "Data" menu, then "Weight Cases," then weight by pweight.  

With the weighting implemented, the sample becomes equalized on numbers of D's and R's, and Bush's lead shrinks to 55%-45%.  (The easiest way to visualize this is to aggregate the weights of the original Kerry voters [8 X 1.50] + [2 X .75] = 13.50, which is then divided by 30, yielding .45.)

Party ID (1 = D, 2 = R)   Candidate Preference (1 = Kerry, 2 = Bush)   Weights to be Applied (pweight)
1 1 1.50
1 1 1.50
1 1 1.50
1 1 1.50
1 1 1.50
1 1 1.50
1 1 1.50
1 1 1.50
1 2 1.50
1 2 1.50
2 1 .75
2 1 .75
2 2 .75
2 2 .75
2 2 .75
2 2 .75
2 2 .75
2 2 .75
2 2 .75
2 2 .75
2 2 .75
2 2 .75
2 2 .75
2 2 .75
2 2 .75
2 2 .75
2 2 .75
2 2 .75
2 2 .75
2 2 .75
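For readers without SPSS, the same tabulation can be checked in a few lines of Python. This is only a sketch: it collapses the 30 fictitious respondents in the grid above into four party-by-vote cells, and the function and variable names (other than pweight) are hypothetical.

```python
# (party, vote, number of respondents), tallied from the grid above
cells = [
    ("D", "Kerry", 8),
    ("D", "Bush", 2),
    ("R", "Kerry", 2),
    ("R", "Bush", 18),
]
pweight = {"D": 1.50, "R": 0.75}

def vote_shares(cells, weights=None):
    """Return each candidate's share, optionally weighting by party."""
    totals = {}
    for party, vote, count in cells:
        w = 1.0 if weights is None else weights[party]
        totals[vote] = totals.get(vote, 0.0) + w * count
    grand_total = sum(totals.values())
    return {vote: total / grand_total for vote, total in totals.items()}

unweighted = vote_shares(cells)
weighted = vote_shares(cells, pweight)
print({vote: round(share, 2) for vote, share in unweighted.items()})
print({vote: round(share, 2) for vote, share in weighted.items()})
```

The unweighted shares come out to roughly 33% Kerry and 67% Bush, and the weighted shares to 45% Kerry and 55% Bush, matching the figures in the text.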


APPENDIX B:  References for More Complex Situations
(Thanks to Michael Frone)

Lee, E. S., Forthofer, R. N., & Lorimor, R. J. (1989). Analyzing complex survey data. Thousand Oaks, CA: Sage. (See especially pages 18-21.)

Lehtonen, R., & Pahkinen, E. (2004). Practical methods of design and analysis of complex samples (2nd ed.). NY: Wiley.

Lohr, S. (2010). Sampling: Design and analysis (2nd ed.). Boston, MA: Brooks/Cole.

