





















































A study using Word Association Tests (WATs) to analyze responses from mainland Chinese and Hong Kong participants regarding the concepts of democracy and China. The study compares response patterns, non-response rates, and latency times between the two groups. The data includes a vector of words provided by respondents for each cue word, and the analysis focuses on the probability of responding with a particular word given a specific cue word.
Abstract

The standard approach to measuring political attitudes is to ask survey respondents to map their feelings onto a quantitative scale determined by the researcher. This approach, while widespread, suffers from a number of well-known problems. Such questions can be cognitively demanding, scales differ across cultures and even across individuals of the same culture, and complex attitudes are reduced to a single number. In this paper, we advance the use of Word Association Tests (WATs), where respondents are presented with a series of cue words and asked to provide other words that come to mind as quickly as possible. This approach maps more directly to how attitudes actually operate in the human mind, and it provides a richer set of data than a standard survey question. The paper develops and demonstrates the utility of WATs through an analysis of Chinese citizens’ attitudes towards the Chinese Communist Party (CCP).

Keywords: survey; public opinion; sensitive questions; Word Association Test; China; Chinese Communist Party
†Ph.D. Student, Department of Politics, Princeton University. zeh@princeton.edu. ‡Ph.D. Candidate, Department of Politics, Princeton University. naijial@princeton.edu. §Assistant Professor of Politics and International Affairs, Princeton University. rtruex@princeton.edu. This material is based upon work supported by the Department of Politics, the School of Public and International Affairs, and the Paul and Marcia Wythes Center on Contemporary China at Princeton University. Our gratitude goes to Quentin Beazer, Dan Corstange, Simon de Deyne, Michelle Dion, Yue Hou, Kimuli Kisara, John Marshall, Andrew Nathan, Margaret Roberts, Arturas Rozenas, Arthur Spirling, Yiqing Xu and participants in seminars and panels at Columbia University, New York University, APSA and the PolMeth 2021 Summer Meeting. Any remaining errors are our own.
The standard approach to measuring political attitudes is to ask survey respondents to map their feelings onto a quantitative scale determined by the researcher. Consider the following question commonly used in the study of Chinese politics (Lu and Dickson 2020; Ratigan and Rabin 2020; Shen and Truex 2021):
On a scale of 1 to 10, with 10 meaning very satisfied and 1 meaning not satisfied at all, how satisfied are you with the work of the following? a. Central government officials
Respondents are meant to take their feelings about the central government, reduce them down to a single number, and report that number back faithfully to the researcher. This question format is commonplace in the discipline. In American politics, researchers analyze “feeling thermometer” questions from the American National Election Studies (ANES) that require respondents to assess political figures on a 101-point scale (Hetherington 1998; Winter and Berinsky 1999). Scholars of international relations employ similar measures of citizens’ attitudes towards foreign countries (Gries et al. 2020). In the past five years (2017-2021), 162 articles in the American Political Science Review, American Journal of Political Science, and Journal of Politics have featured an analysis of survey data that measures political attitudes using quantitative scales. This represents roughly 15% of the articles in the top general interest journals in the field.^1

Anyone who has taken a survey knows that the standard approach suffers from a number of problems. Such questions can be cognitively demanding. We might not have well-defined attitudes on every topic, and even if we did, placing those beliefs into a single quantitative dimension can feel arbitrary (Berinsky 1999, 2004; Berinsky and Tucker 2006). Scales are different for different people, and this can make comparison difficult (Brady 1985; King et al. 2004). Perhaps most importantly, we lose a lot of information when we ask people to reduce their attitudes to a single number or response.

(^1) This calculation does not include short articles, letters, or book reviews.
Substantively, our interest diverges from typical WATs that aim to explore how word meaning or semantic information in general is stored in memory (Anisfeld and Deese 1967; McRae et al. 2005; Vinson and Vigliocco 2008). We construct a WAT to measure attitudes towards the Chinese Communist Party (CCP) among Chinese citizens in mainland China and Hong Kong. The aim of this paper is more methodological than substantive. Our goal is to show the utility of the WAT approach and provide a “how to” guide that will allow other researchers to use word association in other political contexts.
Every person possesses a body of preexisting knowledge which is stored in a vast long-term memory. Even though such information might not be front of mind, we have not deleted the knowledge of the color of our first car, or the directions to the movie theater, or the names and actions of our political leaders. When needed or “activated,” these stored pieces of information are moved into working memory, where they can be used for conscious thinking and reasoning (Anderson 1983; Collins and Loftus 1975). The space in our working memories is quite small – about 7 (plus or minus 2) bits of information (Miller 1956). Long-term memory, in contrast, is thought to be essentially limitless (Lodge and Taber 2013). Much of what we “know” might lie outside of our conscious awareness for long periods of time, and we might not even know we know it.

Memories are stored in a vast array of networked associations. Each piece of information is linked to countless other pieces of information, which are in turn linked to countless other pieces of information. When a concept is activated by some external stimulus, other linked pieces of information may be activated as well (Anderson 1983; Collins and Loftus 1975). For example, the concept of graduate school might bring the following related concepts quickly to mind: problem sets, comprehensive exams, job market, paper, job talk, seminar, carrel, desk, code, professor. One can do this for effectively every concept in memory.

We can visualize this idea with a mental map, where concepts in memory are drawn with links to each other that represent associations. Figure 1, which is reproduced and amended slightly from Lodge and Taber (2013), shows the cognitive structure of a hypothetical American citizen approaching the 2012 election. Different types of memory objects are denoted with different shapes, and the lines between the objects denote associations of varying strength. The memory objects are tagged with either a positive or negative affect. Figure 1 shows a mental map of a hypothetical Republican voter with negative associations with Barack Obama.

[Figure 1: Mental map with nodes labeled Obama, Jeremiah Wright, Liberals, Oil Spill, Corrupt, Immigration, War in Iraq, Health Care Reform, Wall Street Bailout, Smart, Republican Vote, Democrat Vote, Republicans, Angry, African Americans, and Democrats, each tagged with positive (+) or negative (_) affect.]

Our memories are an exceedingly complex web of information, and even representing just a few concepts and links on paper can quickly get unwieldy. Note that all figures of this nature
categories or word classes (Johnson et al. 2012; Malek-Ahmadi, Small and Raj 2011; Ross et al. 2007). In “free” WATs, respondents can provide whatever word comes to mind (de Andrade et al. 2016; Judacewski et al. 2019; Rojas-Rivas et al. 2018). In “continuous” WATs, the cue word is presented to the subject only once, and she is asked to give as many associations as possible in a pre-specified period of time (Brown and Ogle 1966; Matthews 1967; Silverstein and Harrow 1982; Silverstein and Chaifetz 1984). In “successive” WATs, the list of stimulus words is presented several times, often with the goal of measuring the stability of the subject’s responses (Pons and Baudet 1979; Pons et al. 1986; Rosen and Russell 1957).

We know that with traditional survey questions, minor differences in question wording can make a big difference in outcomes. Some questions are too restrictive, while others are not constrained enough and include vague words and phrases that make responding difficult (Tourangeau, Rips and Rasinski 2000). WATs rarely include grammatical ambiguity or complicated syntax, and respondents can interpret the prompt relatively easily. WATs also do not involve quantitative scales of any kind and avoid the known issues of such questions
We administered two WATs designed to measure Chinese citizens’ attitudes towards the Chinese Communist Party. The first (“Study 1”) was administered on March 9-10, 2020 to a sample of 1,189 Chinese citizens in mainland China, of whom 616 (51.81%) identified as female and 573 (48.19%) identified as male. The mean age was 36.9 years (SD ≈ 11.19). The second (“Study 2”) was administered on May 21-June 10, 2020 to a sample of 1, Hong Kong residents of Chinese ethnicity, of whom 568 (55.74%) identified as female and 450 (44.16%) identified as male. The mean age was 37.19 years (SD ≈ 11.29).

Both studies were administered online in partnership with a local Chinese marketing company. All respondents were over the age of 18 and had to take the survey on a laptop/desktop computer. Apart from slight differences in the demographic and political attitude questions, the two surveys were identical, to facilitate a comparison between Hong Kong and mainland China. The Supporting Information provides the full questionnaire.

After a standard set of demographic questions, each participant completed a short WAT designed to take about six minutes. The instructions said that a cue word would appear on the screen and told the respondent that she would have 20 seconds to type all words that came to mind. Each participant was presented with a list of 18 cue words. This is considered a “free” and “continuous” WAT – there were no restrictions placed on response words, and each cue word appeared only once.

Some of the design decisions for our WAT merit further discussion. We wanted to give respondents enough time to provide multiple words in response to the cue word, but still limit the time such that the spontaneity and automaticity of the exercise was maintained. For example, if each cue word had a time limit of one minute, this would give respondents enough time to think through their responses and perhaps self-censor on more sensitive items. But if trials were restricted to five seconds, we might get only one-word responses, or perhaps no responses at all. Relatedly, there is a question of how many cue words to include in a WAT. The more cue words, the more data to analyze, but the more likely the task is to induce fatigue among respondents. After interviews with participants who piloted the survey, we felt that 20 seconds and 18 cue words were appropriate for our survey context. The number of cue words and the time for each trial are in line with best practices in various fields (De Deyne and Storms 2008; De Deyne, Navarro and Storms 2013; De Deyne et al. 2019; Gulacar et al. 2015; Li and Wang 2016; Vivas et al. 2019).

A second issue is what words to include among the cue words. To start with, we identified a set of “core” cue words that were the substantive focus of the study. We wanted to learn how
in the Supporting Information. For each respondent i and cue word c, the data include a vector of words W_ic that the respondent entered as associated with the cue word. This vector varies in length across respondents and across words, and we will use its length as a variable, count_ic. Our core substantive analysis will focus on a simple associative strength measure, p(r|c), which is the probability of responding with word r when given word c as a cue (De Deyne et al. 2019). The data also include two latency measures for each trial, latency.firstclick_ic and latency.submit_ic. The former represents the time it took in seconds for respondent i to enter their first response for cue word c. The latter represents the time it took to submit the trial – respondents had the option to submit before the twenty seconds had elapsed.
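To make the two quantities concrete, the following is a minimal sketch of how count_ic and p(r|c) could be computed from trial-level data. It assumes p(r|c) is estimated as the share of all response tokens given to cue c that equal r; the function names, the data layout, and the toy cue words are illustrative and are not taken from the study.

```python
from collections import Counter, defaultdict

def associative_strength(trials):
    """Estimate p(r|c) from (respondent, cue, words) triples.

    Assumption: p(r|c) = tokens equal to r given to cue c,
    divided by all tokens given to cue c.
    """
    counts = defaultdict(Counter)
    for _respondent, cue, words in trials:
        counts[cue].update(words)
    strength = {}
    for cue, counter in counts.items():
        total = sum(counter.values())
        # An empty counter yields an empty dict, so no division by zero.
        strength[cue] = {word: n / total for word, n in counter.items()}
    return strength

def response_counts(trials):
    """count_ic: number of words respondent i gave to cue c."""
    return {(resp, cue): len(words) for resp, cue, words in trials}

# Hypothetical toy data, not the study's responses.
trials = [
    (1, "tea", ["cup", "hot"]),
    (2, "tea", ["cup"]),
    (3, "tea", []),
    (1, "rain", ["wet"]),
]
strength = associative_strength(trials)
counts = response_counts(trials)
```

Other normalizations are possible, for example dividing by the number of respondents who saw the cue, so the denominator should be stated explicitly in any analysis.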
In the remainder of the article, we will focus on showing readers different steps in analyzing WAT data and some of the possibilities for visualization. Where appropriate we will also highlight some of the key substantive findings on public opinion in China.
Step 1: WAT Diagnostics
As with any set of responses to a novel question technique, researchers should first assess how respondents understood the task and identify any patterns or irregularities in the data. We would recommend a close analysis of submission patterns, specifically how long respondents take, how many words they submit, and whether key cue words are outliers on any of these variables. Figure 2 shows a histogram of latency.submit for all respondents I across the full set of cue words J for the two studies. We see a bimodal distribution – nearly identical across the mainland China and Hong Kong samples – with peaks around 5.5 and 20 seconds. This suggests that respondents participated in the WAT in different ways. Most respondents followed the directions and took the full 20 seconds per trial, while others clicked submit
much earlier.
[Figure 2: Histograms of latency to trial submission (x-axis: seconds, 0 to 20; y-axis: count). Panels: Study 1, Mainland China; Study 2, Hong Kong.]
Note: Figure shows the histogram of the latency in seconds for the time it took a trial to be submitted. The allotted time was 20 seconds.
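A submission-time check of this kind can be sketched in a few lines. The example below bins latency.submit values into fixed-width bins and reports the share of trials submitted before a cutoff. The bin width, the 10-second cutoff, the function names, and the sample values are all illustrative assumptions.

```python
from collections import Counter

def latency_histogram(latencies, bin_width=1.0, max_seconds=20.0):
    """Count trials per bin; bin k covers [k*w, (k+1)*w)."""
    n_bins = int(max_seconds / bin_width)
    hist = Counter()
    for t in latencies:
        # Trials submitted exactly at the time limit go in the last bin.
        k = min(int(t // bin_width), n_bins - 1)
        hist[k] += 1
    return [hist[k] for k in range(n_bins)]

def share_early(latencies, cutoff=10.0):
    """Share of trials submitted before `cutoff` seconds."""
    return sum(t < cutoff for t in latencies) / len(latencies)

# Hypothetical submission times in seconds.
latencies = [5.2, 5.8, 6.1, 12.4, 19.9, 20.0, 20.0, 20.0]
hist = latency_histogram(latencies, bin_width=5.0)
early = share_early(latencies)
```

Two separated clusters of counts, one early and one at the time limit, are the pattern described as bimodal in the text.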
Not surprisingly, the time to submission was systematically related to the number of response words provided. Respondents that took the full 20 seconds provided an average of 3.682 (Study 1) and 3.240 (Study 2) response words in the mainland and Hong Kong samples, respectively. Respondents that took less than 10 seconds provided an average of 0.932 (Study
level. Figure 3 shows a histogram of the total number of nonresponses per respondent for the 18 WAT trials. We observe some respondents did not appear to take the WAT portion of the survey seriously at all. In mainland China, roughly 20.4% of respondents provided no answers to more than 50% of the WAT trials. In Hong Kong, about 5.4% of respondents showed that behavior pattern. Some level of item non-response to WAT cue words is understandable – respondents might not know a particular word, or they might struggle to come up with a response in the allotted time. But that level of non-response indicates “speeder” behavior. This data was unusable and will be excluded from the remainder of the analysis.^4 Note that the issue of repeated item non-response affects most online surveys, and it does not appear as though our survey was particularly vulnerable to the problem. In Figures 4 and SI1 in the Supporting Information, we also consider response patterns by trial number, which allows us to assess whether respondents changed how they took the WAT as they progressed through the 18 trials. In both Hong Kong and mainland China, nonresponse rates increased, time to provide the first response shortened, and submission times were faster for later trials. For the first half of trials, respondents in Study 1 provided an average of about 2.54 tokens. By the second half, they provided about 2.44. This difference is not large but suggests researchers should be careful in constructing longer WATs, as there begin to be some costs in data quality. Shorter WATs, in the territory of 10 to 12 trials, might be more successful.
(^4) A related issue which analysts should check for is “matching behavior,” whereby the respondent simply inputs the cue word as the response word. This indicates a misunderstanding of the task. Roughly 2.1% of trials (442 in total) in mainland China had a matching response, and 5.4% of trials (995 in total) in the Hong Kong study were matching responses. These trials were excluded from the analysis.
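The two exclusion rules described here, dropping “speeder” respondents and dropping matching trials, can be sketched as a single cleaning pass. This is an illustration under stated assumptions rather than the authors' procedure: a trial counts as matching if any submitted word equals the cue after trimming and lowercasing, and the field names and toy data are hypothetical.

```python
def is_matching(trial):
    """True if any response word equals the cue (assumed definition)."""
    cue = trial["cue"].strip().lower()
    return any(w.strip().lower() == cue for w in trial["words"])

def clean_trials(trials, max_nonresponse_share=0.5):
    """Drop all trials from speeders, then drop matching trials."""
    totals, empties = {}, {}
    for t in trials:
        r = t["respondent"]
        totals[r] = totals.get(r, 0) + 1
        empties[r] = empties.get(r, 0) + (len(t["words"]) == 0)
    # Speeder: no answer on MORE than the given share of trials.
    speeders = {r for r in totals
                if empties[r] / totals[r] > max_nonresponse_share}
    kept = [t for t in trials
            if t["respondent"] not in speeders and not is_matching(t)]
    return kept, speeders

# Hypothetical toy data with four trials per respondent.
trials = [
    {"respondent": "a", "cue": "tea", "words": []},
    {"respondent": "a", "cue": "rain", "words": []},
    {"respondent": "a", "cue": "book", "words": []},
    {"respondent": "a", "cue": "road", "words": ["car"]},
    {"respondent": "b", "cue": "tea", "words": ["cup", "hot"]},
    {"respondent": "b", "cue": "rain", "words": ["Rain"]},
    {"respondent": "b", "cue": "book", "words": []},
    {"respondent": "b", "cue": "road", "words": ["car"]},
]
kept, speeders = clean_trials(trials)
```

Occasional nonresponses from otherwise engaged respondents stay in the data, since the text treats some item non-response as understandable.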
[Figure 4: Four panels plotted against trial number (1 to 18): Nonresponse Rate; Tokens Provided (mean count); Time to First Click (mean latency); Time to Submission (mean latency).]
Note: Figure shows the mean nonresponse rate, latency to submission, and tokens provided by the trial order number. Data is from Study 1, and is filtered to exclude respondents that engaged in “speeder” behavior (non-responses to more than 50% of trials).
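A fatigue check like the one summarized in this figure amounts to averaging each metric within trial order number and comparing the first and second halves of the test. The sketch below assumes trial-level rows with a `trial_number` field and a numeric metric such as `count`; the field names and toy values are illustrative.

```python
from collections import defaultdict
from statistics import mean

def mean_by_trial(rows, key):
    """Average rows[key] within each trial order number."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["trial_number"]].append(row[key])
    return {n: mean(values) for n, values in sorted(groups.items())}

def half_means(rows, key, n_trials=18):
    """Mean of rows[key] in the first and second halves of the test."""
    midpoint = n_trials // 2
    first = [r[key] for r in rows if r["trial_number"] <= midpoint]
    second = [r[key] for r in rows if r["trial_number"] > midpoint]
    return mean(first), mean(second)

# Hypothetical toy data from a four-trial test.
rows = [
    {"trial_number": 1, "count": 3},
    {"trial_number": 1, "count": 2},
    {"trial_number": 2, "count": 2},
    {"trial_number": 3, "count": 2},
    {"trial_number": 4, "count": 1},
    {"trial_number": 4, "count": 2},
]
by_trial = mean_by_trial(rows, "count")
first_half, second_half = half_means(rows, "count", n_trials=4)
```

The same two functions apply to a 0/1 nonresponse indicator or to either latency measure by changing `key`.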
Our hope in constructing this survey is that the WAT technique reduces the sensitivity of assessing attitudes towards actors like the CCP or Chinese government (Ratigan and Rabin 2020; Shen and Truex 2021).^5 One way to assess this is to compare the latency, count, and nonresponse measures for all the words included in the WAT. If a question item is sensitive, we would expect respondents to pause slightly longer before answering, and perhaps provide fewer associated words as a result. We might also see higher rates of non-response (Ratigan

(^5) We believe WATs have potential as a sensitive question technique. Experiments have shown that WAT participants tend to provide the first word in their mental lexicon, rather than deliberate or strategic responses (Playfoot et al. 2018).
[Figure: Nonresponse rate (x-axis) for each cue word presented in the WAT (y-axis: Term).]
Note: Figure shows the non-response rates for all cue words presented in the WAT. Data is from Study 1 (Mainland China) and is filtered to exclude respondents that engaged in “speeder” behavior (non-responses to more than 50% of trials). Core cue words are shown in blue. Note that all respondents saw these words, which is why the point estimates have smaller confidence intervals than the other words.
Outcome                    nonresponse    latency.submit    count
                           (1)            (2)               (3)
cue: CCP                   -0.013         0.633             0.
                           (0.008)        (0.233)           (0.095)
cue: China                 -0.037         0.037             0.
                           (0.008)        (0.233)           (0.095)
cue: Central Government    -0.013         0.484             0.
                           (0.008)        (0.233)           (0.095)
cue: Democracy             -0.004         -0.074            -0.
                           (0.008)        (0.233)           (0.095)
cue: Me                    -0.007         -0.614            0.
                           (0.008)        (0.234)           (0.094)
female                     0.008          0.157             0.
                           (0.003)        (0.107)           (0.043)
age                        0.003          -0.049            -0.
                           (0.000)        (0.006)           (0.002)
minority                   0.027          -1.802            -0.
                           (0.011)        (0.335)           (0.138)
lowed                      0.065          -1.791            -0.
                           (0.004)        (0.133)           (0.054)
rural                      -0.018         0.968             0.
                           (0.005)        (0.154)           (0.062)
ccp                        -0.001         0.569             0.
                           (0.004)        (0.123)           (0.050)
n                          15,172         15,172            15,

Note: Table shows regressions of WAT metadata on demographic covariates and cue word indicators. The non-core cue words represent the excluded category. Data is from Study 1, and is filtered to exclude respondents that engaged in “speeder” behavior (non-responses to more than 60% of trials) or provided no responses or matched responses towards the cue. Data is organized on the trial level. All models estimated using OLS. Standard errors shown in parentheses.
more diverse set of answers to the political cue words. This tells us something about the diversity of political thought in Hong Kong relative to the mainland.
Step 2: Frequency Analysis and Subgroup Comparisons
The natural next step for a WAT analysis is to consider the frequencies and associative strength p(r|c) measure of different response words and look for substantive patterns therein. Table 3 shows the most common responses among mainland Chinese respondents (Study
Cue Word: Central Government    Cue Word: CCP    Cue Word: China

Note: Table shows most frequent responses for the cue words central government, CCP, and China. Data is from Study 1, and is filtered to exclude respondents that engaged in “speeder” behavior (non-responses