Alan Reifman’s
Party ID & Sample Weighting Page
The short story is much of the
apparent changes in public opinion are actually changes in patterns
of nonresponse: When it looked like [2012 GOP nominee Mitt] Romney
jumped in popularity, what was really happening was that disaffected
Democrats were not responding to the survey while resurgent
Republicans were more likely to respond. From
a “methods” point of view, the key step is to poststratify by party
ID... (original Washington Post article and addendum; August 2014)
Campaign 2012
Article by Samuel Best and Brian Krueger on Gallup missing the 2012 Party ID percentages (based on exit polls) by a wide margin
Because of questionable partisan composition, Pew Research
Center
weights its final 2012 pre-election poll by who respondents say
they voted for in 2008 (similar to weighting by Party ID), "but
that's not something we like to do."
INTRODUCTION TO CAMPAIGN 2012:
It's the issue that never seems to go away -- polls' partisan
composition (the percentages of self-identified Democrats,
Republicans, and Independents in the sample) and what to do (if
anything) when these percentages seem out of whack compared to
previous baselines. In August 2012 alone, four articles appeared
(links below), expressing different opinions on the matter. I invite
you to peruse these articles and all the other resources on this
site, which I've maintained since 2004.
Over the years, weighting schemes have gotten more
sophisticated. Using, say, 2000 exit-poll figures on Party ID as a
template to weight surveys for the 2004 election indeed is very
questionable, as much about the political landscape can change in
four years (or less time). With dynamic weighting, a pollster can set party weights for his or her next survey from the average Party ID percentages in the same pollster's own surveys over, say, the previous three months (here and here). Dynamic weighting thus allows for some shifting in Party ID due to recent events, while the aggregation over multiple recent polls yields greater stability in Party ID than some individual surveys have shown.
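The averaging step behind dynamic weighting can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `dynamic_targets` is hypothetical, the Party ID percentages are made up for the example, and a simple unweighted mean across polls is assumed (an actual pollster might instead weight each poll by its sample size or use a different time window).

```python
def dynamic_targets(recent_polls):
    """Average each party's share across a pollster's recent surveys.

    recent_polls: list of dicts mapping party label -> percentage.
    Assumes every poll reports the same set of party labels.
    """
    parties = recent_polls[0].keys()
    n_polls = len(recent_polls)
    return {p: sum(poll[p] for poll in recent_polls) / n_polls for p in parties}


# Hypothetical Party ID percentages from one pollster's last three surveys.
recent = [
    {"D": 38.0, "R": 32.0, "I": 30.0},
    {"D": 36.0, "R": 34.0, "I": 30.0},
    {"D": 37.0, "R": 33.0, "I": 30.0},
]

# These averaged shares would serve as the weighting targets for the next survey.
targets = dynamic_targets(recent)
print(targets)
```

Because the targets move only as fast as the multi-poll average does, a single survey with an unusual partisan split shifts them just slightly.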
Trends in Party Identification
Pollster.com/Huffington Post compilation ● Rasmussen
New "Poll Manipulator" Device from the Washington Post: You can adjust the partisan split to see how the Obama-Romney horse-race numbers are affected.
My sincere appreciation to those who have provided online coverage of this site, including The Atlantic and National Review.
Pollsters' Party ID Compositions for National Surveys (2008)
Rasmussen Weighting Template for Final Days Leading Up to the Election:
39.9% Democrat, 33.4% Republican
Zogby Weights:
38% Democrat, 36% Republican
Daily Kos/Research 2000 Weights:
35% Democrat, 26% Republican
The polls below do not re-weight or adjust their results based on Party ID. Nevertheless, the Party ID percentages they obtain from their random samples are informative in judging the partisan composition of the electorate. Party ID percentages for a given poll are shown by the respective "D" and "R" notations, with links to the original sources, if still available. Obama (O)-McCain (M) "horse race" numbers are shown below the Party ID percentages in italics.
*Date the poll was reported.
October 9, 2008 -- The American Research Group (ARG) released a batch of state polls today. The one getting the most discussion is that from West Virginia, where Barack Obama is said to be leading John McCain 50-42%. Though the Mountaineer State was long a Democratic stronghold, the GOP has dominated at the presidential level of late (Bush beat Gore by 6% in 2000, and Kerry by 13% in 2004). Further, McCain has consistently led in West Virginia during the current campaign, albeit with some recent narrowing. Whenever a polling result looks odd, discussion usually turns to partisan composition of the sample. Did the pollster oversample Democrats or Republicans (or neither)? This latest situation, involving West Virginia, has motivated me to complete a project I've long been considering -- namely, compiling the best estimates I can find of the Party ID composition in several swing states. Below is what I've found. The chart is definitely a work-in-progress, though, so please send me additional data sources if you know of any (alan.reifman@ttu.edu).
Sources: (a) (b) (c) (d) (e) (f) (g) (h) (i) (j) (k) (l) (m) (n) (o) (p)
Note 1. Apparently, these pollsters used a seven-point scale, including strong, moderate/somewhat, and weak Democrats and Republicans, plus "pure" independent. The notation "2-3-2," I believe, thus lumps weak Democrats and Republicans with Independents, and only counts strong and somewhat strong party identifiers with their party; "3-1-3" would put weak identifiers with their party.
Gallup summarizes its 2008 data on national and state-by-state Party ID (January 2009)
The Wall Street Journal examines Party ID weighting
Public Policy Polling survey firm raises interesting question about rising Democratic Party ID:
ABC News polling expert Gary Langer discusses un-leaned and leaned Party ID
A look from Pollster.com at partisan composition trends from states where voter registration is by party
Nate Silver at FiveThirtyEight has a nice posting on weighting for Party ID
Pollster.com's Mark Blumenthal weighs in on "Party ID Wars" for this year's presidential campaign
Dr. Reifman's summary of the party ID/sample weighting issue in the 2006 elections is available at Pollster.com.
Charles Franklin discusses end-of-the-year 2006 Gallup report on party ID
Lettering turns grey when a more recent poll by the same organization becomes available. All polls above are national. For discussion of partisan composition of state polls, see other sections of this website.
(a) I use data from the Rasmussen daily polls as my personal “gold standard” of what the “correct” partisan composition is, due to their large aggregate sample sizes and sophisticated methods: “At Rasmussen Reports, we adjust our party identification weighting targets each month based upon actual survey results from the previous 90 days. For the month of October, our partisan weighting targets are 37.0% Democrat, 32.3% Republican, and 30.7% unaffiliated. That's little changed from our September targets of 37.0% Democrat, 32.7% Republican, and 30.2% unaffiliated.” In an October 12 posting on his website, Rasmussen reports his polls have detected "no impact" of the scandal involving former Congressperson Mark Foley on party ID. On November 1, Rasmussen announced new party ID figures of 37.7% D and 31.5% R. Since there's not that much time until the elections, I'll just keep the October figures at the top of the chart.
(b) For registered and “likely” voters, respectively.
(c) Asked in terms of preference for which party to control Congress.
(d) Among "most likely voters."
(e) Before "hard push" to see if Independents will align with either major party (or, in other polls, excluding Independents who lean D or R).
(f) Using candidates' actual names, instead of generic reference to "Democrat" or "Republican."
My thanks to the pollsters who, in the spirit of openness and transparency, report the internal details of their surveys on their websites!
by Alan Reifman
September 9, 2004 -- We have seen a lot of polls thus far in the Bush-Kerry race and we're going to see a lot more. Often polls by two different survey outfits taken at the same time will show results in pretty stark disagreement. Literally as I write this, Rasmussen Reports has the race a virtual dead-heat (Bush 47.5, Kerry 46.8; link no longer available), while CBS News has Bush up by 7.
All pollsters try to obtain a random, representative sample of voters to represent the full electorate. In addition to vote choice (i.e., Bush, Kerry, or other), pollsters always ask respondents which party they align themselves with. These two measures -- candidate preference and party ID -- often show great overlap, with Republicans (R's) heavily going for Bush and Democrats (D's) heavily going for Kerry. However, people sometimes vote for the other party's candidate, so candidate preference and party ID are not identical.
One factor (among many) that may contribute to discrepancies between different outfits' polls in their Bush-Kerry margins, I will argue, is polling firms' different philosophies as to whether it's advisable to mathematically adjust their samples -- after all the interviews have been completed -- to make the percentages of D's and R's in their survey sample match the partisan composition that is likely to be evident at the polls on Election Day. The latter can be estimated from exit polls from previous elections, party registration figures (in states where citizens declare a party ID when registering to vote), and surveys.
(Another issue that often comes up in evaluating pre-election surveys, with which many of you may be familiar, is whether results are reported for "registered" or "likely" voters. That is a different issue from what is being discussed presently. Whether a pollster reports results for registered voters, likely voters, or both, weighting by party ID is a separate, independent decision.)
Exit polls from the three previous presidential elections yield the following percentages of the electorate comprised of self-identified D's, R's, and independents (from Zogby; link no longer available).
One additional source is Democratic pollster Stanley Greenberg, author of The Two Americas. He found, after conducting 15 national polls with an aggregate 15,045 voters from late 2001 until early 2003 and allocating "leaners" to the relevant party, that each major party had the allegiance of 46% of the voters.
The controversy occurs when a poll of, say, 1,000 voters shows a partisan composition vastly different from what we've come to expect. Should the pollster make statistical adjustments (described below) to make the party breakdown conform to more typical estimates, or should he/she just leave the numbers alone and report the findings? A good summary of the "back and forth" of this controversy is available in this Los Angeles Times article from earlier this summer. I will be referring back to this article.
The following two scenarios should illustrate the key issues. Before presenting the scenarios, I want to state that there are noted national authorities on either side of the "weight/no weight" debate (including Democrats on both sides and Republicans on both sides). Each reader should decide for him- or herself. A simple illustration of how to actually implement a sample weighting is described in Appendix A, with references for more complex situations in Appendix B.
SCENARIO 1
As noted above, most recent estimates of the partisan composition of the electorate suggest a rough balance between the number of voters leaning toward the D and R parties (i.e., "50/50 nation"), with the possibility that there might be slightly more D's than R's. In his aforementioned book, Greenberg characterizes party identification as "... a form of social identity, not unlike ethnicity or race, with considerable durability over time" (p. 93).
I would argue that individual-level stability should generally lead to population-level stability, although not perfectly so (e.g., from one presidential election to the next, some people pass away, others newly turn 18, immigrants become eligible to vote). Suppose a pollster completes a survey and finds far more self-identified R's in the sample than D's. This happened in the Newsweek poll released in early September right after the R convention that gave Bush an 11-point lead over Kerry. Newsweek's sample contained 38% R, 31% D, and 31% I (article links no longer available). There would seem to be three plausible explanations for the higher-than-usual sampling of R's:
If Greenberg is correct about the stability of individuals' party ID, then the first of the three explanations (a sudden shift) seems unlikely. The fact that other recent polls besides Newsweek's have obtained samples with more R's than D's seems to go against the third explanation (chance). In any event, we would conclude that if the second or third explanation were the true "culprit," Newsweek's party breakdown would appear to be "out of whack" relative to the other aforementioned indicators. (Again, to keep things bipartisan, the article I cited earlier as providing a good discussion of the "back and forth" of the controversy was itself prompted by a mid-summer L.A. Times poll in which it appeared there were way too many Democrats in the sample.)
It is at this point that pollsters face the choice of whether to adjust the numbers to match more typical estimates of the D-R distribution (i.e., count R's less and D's more), or just leave the sample alone.
In 2000, pollster Scott Rasmussen went with the "leave things alone" strategy, with the result that his firm forecast a 9-point Bush victory over Gore in its final pre-election poll. Rasmussen, to his credit, posted a candid summary on his website: "Simply put, we had too many Republicans in our sample. For a variety of reasons, our firm has never weighted by party. However, if we had weighted the data before the election to include an equal number of Republicans and Democrats, we would have shown Bush leading by 2 points. Had we weighted our data to match the partisan mix reported by the Voter News Service on Election Night, we would have shown Gore leading by a point" (document no longer available online). One would surmise that Rasmussen is probably weighting this year. [As shown above on this website in connection with the 2008 presidential election, Rasmussen indeed adopted a practice of weighting by party ID.]
Pollster John Zogby has pioneered the art of sample weighting on party ID.
Taking 1996 and 2000 together, he was the most accurate pollster in forecasting the two presidential elections. As he noted on his website, "My polls use a party weight of 39% Democrat, 35% Republican and 26% Independent" [this 2004 document is no longer online]. Another apparent sign of polls that weight is that they should exhibit less volatility day-to-day or week-to-week than polls that don't weight.
SCENARIO 2
As previewed above, a number of prominent polling authorities would presumably argue that the Newsweek poll, with its larger-than-typical R composition, or the L.A. Times poll, with the unusually wide D edge, should be left alone and not "retrofitted" into some preconceived template of what the Election Day partisan composition will look like. According to the L.A. Times "back and forth" article cited above: "Andrew Kohut of the Pew Research Center for the People and the Press said that he once conducted a survey asking voters their party twice, four days apart, and that he found substantial differences in the responses."
Further, even though Democratic candidates sometimes come out with more favorable readings on party-weighted compared to unweighted polls, Ruy Teixeira, co-author of The Emerging Democratic Majority and operator of a "blog" related to the book [moved to here], opposes weighting for party ID. In his September 5 entry, he writes: "Does that mean I favor polls like this weighting their samples by party ID? No, I don't, because the distribution of party ID does shift some over time and polls should be able to capture this. What I do favor is release and prominent display of sample compostions [sic] by party ID, as well as basic demographics, whenever a poll comes out. Consumers of poll data should not have to ferret out this information from obscure places--it should be given out-front by the polling organizations or sponsors themselves. Then people can use this information to make judgements [sic] about whether and to what extent they find the results of the poll plausible."
If I had to argue for not weighting on party ID, I would make two points:
Hopefully my essay has given you something to think about as the pre-election polls roll in. There are no definitive answers on how to handle these issues. For all we know, Election Day 2004 may have more R's voting than D's, a departure from the last three presidential elections. In that event, polls not weighted on party ID may end up being more accurate than those with weightings based on recent past presidential elections.
My personal advice -- whether, to use a phrase from Senator John McCain, you're a "Republican, Democrat, Libertarian or vegetarian" -- is to interpret polls favorable to your chosen candidate in cautious terms. If the favorable poll turns out to be accurate on Election Day, you can exult, but if it was flawed, at least you won't be blindsided.
I would like to thank John Welte for discussing sample weighting with me over the years.
Party ID Numbers from the NBC-Wall Street Journal Poll, 1990-2015
Pollster.com (formerly "Mystery Pollster"), a blog devoted to polling methodology issues, including sample weighting by party ID
Polling Report (compilation of poll results)
Performance of different survey outfits' polls in forecasting true presidential vote from 1936-2000, 2004, and 2008
CNN.com Exit Polls (for looking at Party ID and other voter characteristics, nationally and state-by-state)
"Unskewed Polls" -- A 2012 newcomer to the "Party ID wars," this site provided adjusted Obama-Romney horserace numbers by re-weighting non-Rasmussen polls to Rasmussen's Party ID percentages
Dr. Reifman's lecture notes on survey sampling
Letter by Dr. Reifman in the ISR Sampler (published by the University of Michigan's Institute for Social Research), arguing that the older concept of "quota sampling" and sample weighting are conceptually similar (when the document opens up, go to page 12 to see Dr. Reifman's letter and responses by two University of Michigan survey experts).
Statement by polling firm Survey USA on its sample-weighting policy (from Daily Kos)
USA Today article on the sample-weighting controversy (original "Move On" advertisement to which Gallup is responding in article)
Political Arithmetik (quantitative analyses of polls by Wisconsin Prof. Charles Franklin, who taught Dr. Reifman in a summer stats course at Michigan over 20 years ago!)
Party ID in the USA, 1952-2008, from the University of Michigan's National Election Study (gets at the question of the stability of party ID)
"A Consumer's Guide to the Polls" (2004), reviews the practices of the leading polling outlets on several dimensions
Bloggers Chris Bowers and Steve Soto were very active in analyzing the sample-weighting controversy in 2004. Their writings led me to many of the websites linked above, for which I thank them.
Academic article (Erikson et al., 2004, Public Opinion Quarterly) on the volatility of samples' partisan composition in daily tracking polls.
Daily Kos diary by Dr. Reifman on the partisan composition of a poll in the New Jersey U.S. Senate race, with Kos reader comments (9/25/06).
Memo by pollster David Winston on possible oversampling of Democrats in 2006 pre-election polls (10/16/06).
Website of Columbia University professor Andrew Gelman (see his 2001 article on post-stratification, in the Journal of the American Statistical Association, as well as other related articles)
APPENDIX A:
As shown in the grid below, I have created a fictitious data set of 30 respondents. I made it so that 10 Democrats were sampled (80% of them voting for Kerry, in accordance with actual survey estimates), along with 20 Republicans (with 90% for Bush). The gray shading is simply meant to break the grid into blocks of five lines, making it easier to count cases.
If no weighting by party were done, Bush would be leading 67%-33% (20 to 10 in raw numbers). If we want to re-estimate the sample with 50% Republicans and 50% Democrats (which roughly matches some actual estimates), then we would weight each Democrat 1.50 (to bring them from 10 respondents to 15) and weight each Republican .75 (to bring them from 20 respondents to 15). The weights should be treated like any other variable, with a name such as "pweight" for party weight. In SPSS, you would go to the "Data" menu, then "Weight Cases," then weight by pweight.
With the weighting implemented, the sample becomes equalized on numbers of D's and R's, and Bush's lead shrinks to 55%-45%. (The easiest way to visualize this is to aggregate the weights of the original Kerry voters [8 X 1.50] + [2 X .75] = 13.50, which is then divided by 30, yielding .45.)
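For readers without SPSS, the same arithmetic can be reproduced with a short Python sketch. It rebuilds the fictitious 30-respondent sample described above (8 Kerry and 2 Bush voters among the Democrats, 18 Bush and 2 Kerry voters among the Republicans) and applies a party weight equal to target share divided by observed share. The function names `party_weights` and `weighted_shares` are my own hypothetical helpers, not part of any statistics package.

```python
from collections import defaultdict

# Fictitious sample from the appendix: one (party, vote) pair per respondent.
respondents = (
    [("D", "Kerry")] * 8 + [("D", "Bush")] * 2
    + [("R", "Bush")] * 18 + [("R", "Kerry")] * 2
)


def party_weights(sample, targets):
    """Weight for each party = target share / observed share in the sample."""
    counts = defaultdict(int)
    for party, _vote in sample:
        counts[party] += 1
    n = len(sample)
    # targets[p] * n is the number of respondents the party "should" have.
    return {p: targets[p] * n / counts[p] for p in counts}


def weighted_shares(sample, weights):
    """Candidate shares after each respondent is counted with the party weight."""
    totals = defaultdict(float)
    for party, vote in sample:
        totals[vote] += weights[party]
    grand_total = sum(totals.values())
    return {vote: total / grand_total for vote, total in totals.items()}


# Re-estimate the sample as 50% Democrats and 50% Republicans.
pweight = party_weights(respondents, {"D": 0.5, "R": 0.5})
shares = weighted_shares(respondents, pweight)

print(pweight)
print({vote: round(share, 2) for vote, share in shares.items()})
```

Run as written, the weights should come out to 1.5 for each Democrat and 0.75 for each Republican, and the weighted shares to roughly 0.45 for Kerry and 0.55 for Bush, matching the hand calculation.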
APPENDIX B:
References for More Complex Situations
Lee, E. S., Forthofer, R. N., & Lorimor, R. J. (1989). Analyzing complex survey data. Thousand Oaks, CA: Sage.