This readme.txt file was generated by Yuchen He and Junming Huang in Dec 2023.

-------------------
GENERAL INFORMATION
-------------------

Title of Dataset: Data for "Declining Chinese Attitudes Toward the United States Amidst COVID-19"

Author Information

Yu Xie
Paul and Marcia Center on Contemporary China, Princeton University, Princeton, NJ 08544 United States
Center for Social Research, Guanghua School of Management, Peking University, Beijing 100871, China

Junming Huang
Paul and Marcia Center on Contemporary China, Princeton University, Princeton, NJ 08544 United States

Yuchen He
Center for Social Research, Guanghua School of Management, Peking University, Beijing 100871, China

Feng Yang
Center for Social Research, Guanghua School of Management, Peking University, Beijing 100871, China

Yi Zhou
Center for Social Research, Guanghua School of Management, Peking University, Beijing 100871, China

Yue Qian
Department of Sociology, University of British Columbia, 6303 NW Marine Drive, Vancouver, BC, V6T 1Z1 Canada

Weicheng Cai
Center for Social Research, Guanghua School of Management, Peking University, Beijing 100871, China

Jie Zhou
Institute of Psychology, Chinese Academy of Sciences, 16 Lincui Road, Chaoyang District, Beijing 100101, China

Date of data collection: 2018 - 2023-3

Description: This dataset encompasses three distinct sets of data analyzed in the study, namely the survey data on favorability to the US, the survey data on trust in Americans, and the social media data. The first part of the dataset comprises the analysis in Study 1 and Study 3, which is collected from three surveys, including the Social Attitude Questionnaire of Urban and Rural Residents (SAQURR) in 2019 and 2020, the COVID-19 Multi-Wave Study (CMWS) between 2020 and 2022, and the Survey on Living Conditions (SLC) in 2023.
The second part of the datasets provides information used in Study 4, involving the 2018 and 2020 waves of the CFPS, Baidu Index data, and the COVID-19 cases and deaths data. The third dataset is provided to depict trends in attitudes toward the US in Study 2.

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

Licenses/restrictions placed on the data, or limitations of reuse: CC BY-NC-SA 4.0

Recommended citation for the data:
Xie, Y., Huang, J., He, Y., Yang, F., Zhou, Y., Qian, Y., Cai, W., Zhou, J., & He, Q. (2023). Data for "Declining Chinese Attitudes Toward the United States Amidst COVID-19" [Data set]. Princeton University. https://doi.org/10.34770/5pk2-8345

Links to other publicly accessible locations of the data:
This data is available at yuxie.com (https://yuxie.scholar.princeton.edu/share-files/data-files-declining-chinese-attitudes-toward-united-states-amidst-covid-19) and Princeton DataSpace.

The COVID-19 Multi-Wave Study (CMWS) and the Survey on Living Conditions (SLC) are conducted by the Population Development Studies Center, Renmin University of China. The Social Attitude of Urban and Rural Residents Survey (SAURRS) is conducted by the Institute of Psychology of the Chinese Academy of Sciences. The China Family Panel Studies (CFPS) is conducted by the Institute of Social Science, Peking University. The Weibo data is owned by Sina.

--------------------
DATA & FILE OVERVIEW
--------------------

File list:
README.txt
media-data-average-opinion-us.csv
survey-data-trust-analytical-sample.xlsx
survey-data-trust-descriptive-sample.xlsx
survey-datasets-favorability.csv

Relationship between files, if important for context:
[survey-datasets-favorability.csv] suffices to replicate the results presented in Study 1 and Study 3.
[survey-data-trust-descriptive-sample.xlsx] reports the trust level for Study 4, and [survey-data-trust-analytical-sample.xlsx] also provides information on other covariates.
[media-data-average-opinion-us.csv] provides the daily attitude score averaged across all users on Weibo for Study 2.

If data was derived from another source, list source:
The first part of the dataset comprises the analysis in Study 1 and Study 3, which is collected from three surveys, including the Social Attitude Questionnaire of Urban and Rural Residents (SAQURR) in 2019 and 2020, the COVID-19 Multi-Wave Study (CMWS) between 2020 and 2022, and the Survey on Living Conditions (SLC) in 2023.
The second part of the datasets provides information used in Study 4, involving the 2018 and 2020 waves of the CFPS, Baidu Index data, and the COVID-19 cases and deaths data.
The third dataset is provided to depict trends in attitudes toward the US in Study 2. The data is collected from 50,658,770 posts containing US-related keywords (美国, 灯塔国, 美利坚, 米国, 美帝) from January 1, 2016, to December 31, 2022, on the Chinese social media platform Weibo, which is similar to Twitter.

The COVID-19 Multi-Wave Study (CMWS) and the Survey on Living Conditions (SLC) are conducted by the Population Development Studies Center, Renmin University of China. The Social Attitude of Urban and Rural Residents Survey (SAURRS) is conducted by the Institute of Psychology of the Chinese Academy of Sciences. The China Family Panel Studies (CFPS) is conducted by the Institute of Social Science, Peking University. The Weibo data is owned by Sina.

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

--------------------------
DATA-SPECIFIC INFORMATION: survey-datasets-favorability.csv
--------------------------

The first part of the dataset comprises the analysis in Study 1 and Study 3, which is collected from three surveys, including the Social Attitude Questionnaire of Urban and Rural Residents (SAQURR) in 2019 and 2020, the COVID-19 Multi-Wave Study (CMWS) between 2020 and 2022, and the Survey on Living Conditions (SLC) in 2023. We append the data from the three surveys.
The raw data at the micro level can be found in [survey-datasets-favorability.csv], which suffices to replicate the results presented in Study 1 and Study 3. We disclose the relevant variables used in the research, including the favorability score, source of survey, year and month of the interview, and background information such as education and age, along with the weights. The PIDs (personal identifiers) are part of the original compilation from SAQURR, CMWS, and SLC. Our study objective is to examine the trends in attitudes toward America, so our sample is limited to those who reported their favorability toward the US: 3,266 observations in SAQURR, 28,897 observations in CMWS, and 2,592 observations in SLC.

--------------------------
DATA-SPECIFIC INFORMATION: survey-data-trust-descriptive-sample.xlsx, survey-data-trust-analytical-sample.xlsx
--------------------------

The second part of the datasets provides information used in Study 4, involving the 2018 and 2020 waves of the CFPS, Baidu Index data, and the COVID-19 cases and deaths data. The China Family Panel Studies (CFPS), conducted by Peking University, is a nationally representative, longitudinal, comprehensive, and biennial social survey started in 2010. The outcome of interest in Study 4 is trust in Americans measured in the 2020 CFPS, incorporating the baseline trust from the 2018 CFPS. We confined the sample to respondents who indicated their level of trust in Americans in both the 2018 and 2020 waves (N=17,497).

[survey-data-trust-descriptive-sample.xlsx] reports the trust level in 2018 and 2020 and the change between the two waves, used to generate the descriptive estimates for Study 4 presented in the main text.

In the regression analysis, we provide the subsample of those who have the "potential" to decrease trust (trust scored above 0) and have complete information on location and interview date (N=11,430).
They were interviewed at some point over the 23 weeks spanning from July 2020 to December 2020.

We measure the Chinese public attention to the pandemic in the US using the Baidu Index [https://index.baidu.com/v2/index.html]. Baidu is the largest search engine in China. The Baidu Index provides query-based data that reflects the daily intensity of keywords entered into Baidu. We applied a logarithmic transformation to the Baidu Index scores for the keywords "美国疫情" (pandemic in the US), "疫情" (pandemic) and "中美贸易战" (Sino-US trade war) to quantify public attention.

Our analysis in this part also involves the COVID-19 cases and deaths data obtained from the Oxford COVID-19 Government Response Tracker [https://www.bsg.ox.ac.uk/research/covid-19-government-response-tracker]. We used two measures with logarithmic transformation: the daily number of confirmed cases and the daily number of deaths occurring one day before the 2020 CFPS interview date. Due to the time difference between China and the US, these statistics are possibly the most up-to-date information available to the survey respondents who closely follow US news.

[survey-data-trust-analytical-sample.xlsx] includes trust in Americans in 2018 and 2020, demographic variables, and location details (province) from CFPS, along with the merged Baidu Index data and the COVID-19 cases and deaths data. It is used to produce the main results (Table 1) and all SI tables for Study 4. The variable meanings are explained below, as well as in Sheet 2 of the file.
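
As an illustration of the logged, lagged, and lead attention measures described above, here is a minimal Python sketch. It is not the authors' code. The function name `logged_index_features`, the example index values, and the choice of a natural logarithm with no offset are all assumptions; the README does not state the log base or how missing or zero scores were handled. Only the output variable names are taken from the variable list in this section.

```python
import math
from datetime import date, timedelta


def logged_index_features(daily_index, interview_date, max_shift=3):
    """Build logged index values for the interview date, its lags, and its leads.

    daily_index: dict mapping datetime.date -> raw positive index score.
    Assumption: natural log, no offset; missing or non-positive scores give None.
    """
    features = {}
    for shift in range(-max_shift, max_shift + 1):
        day = interview_date + timedelta(days=shift)
        raw = daily_index.get(day)
        value = math.log(raw) if raw is not None and raw > 0 else None
        if shift == 0:
            name = "logUS_pandemic"
        elif shift < 0:
            name = f"logUS_pandemic_lag{-shift}"   # value from |shift| days earlier
        else:
            name = f"logUS_pandemic_lead{shift}"   # value from shift days later
        features[name] = value
    return features


# Made-up index scores for seven consecutive days (illustrative only).
example_index = {
    date(2020, 7, 10) + timedelta(days=i): 1000.0 * (i + 1) for i in range(7)
}
features = logged_index_features(example_index, date(2020, 7, 13))
for name in sorted(features):
    print(name, features[name])
```

The same pattern would apply to the one-day-lagged case and death counts (logUS_case_new, logUS_death_new), with the shift fixed at one day before the interview date.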
Variable name           Meaning
trust_americans         Trust in Americans in 2020
trust_parents           Trust in parents in 2020
trust_neighbors         Trust in neighbors in 2020
trust_doctors           Trust in doctors in 2020
trust_officials         Trust in officials in 2020
trust_americans_18      Trust in Americans in 2018
trust_parents_18        Trust in parents in 2018
trust_neighbors_18      Trust in neighbors in 2018
trust_doctors_18        Trust in doctors in 2018
trust_officials_18      Trust in officials in 2018
increase                Trust in Americans increased from 2018 to 2020 (binary)
logUS_pandemic          logged Baidu Search Index score of "pandemic in US"
logpandemic             logged Baidu Search Index score of "pandemic"
logtrade_war            logged Baidu Search Index score of "Sino-American trade war"
logUS_case_new          logged number of new COVID-19 cases in the US one day ago
logUS_death_new         logged number of new COVID-19 related deaths in the US one day ago
age                     Age
age2                    Age squared
married                 Married
male                    Male
hs_above                Completed senior high school or a higher level of education
uhukou                  Urban hukou
internet                Internet user
student                 In full-time education, including undergraduate and postgraduate education
employed                In full- or part-time paid employment or was self-employed
weekend                 Interviewed at weekend
logUS_pandemic_lag1     logged Baidu Search Index score of "pandemic in US" one day ago
logUS_pandemic_lag2     logged Baidu Search Index score of "pandemic in US" two days ago
logUS_pandemic_lag3     logged Baidu Search Index score of "pandemic in US" three days ago
logUS_pandemic_lead1    logged Baidu Search Index score of "pandemic in US" one day later
logUS_pandemic_lead2    logged Baidu Search Index score of "pandemic in US" two days later
logUS_pandemic_lead3    logged Baidu Search Index score of "pandemic in US" three days later
week                    Week indicator
provcd18                Province indicator
date_N15                Indicating at least 15 respondents are interviewed on a given day

--------------------------
DATA-SPECIFIC INFORMATION: media-data-average-opinion-us.csv
--------------------------

The third dataset is provided to depict trends in attitudes toward the US in Study 2. The data is collected from 50,658,770 posts containing US-related keywords (美国, 灯塔国, 美利坚, 米国, 美帝) from January 1, 2016, to December 31, 2022, on the Chinese social media platform Weibo, which is similar to Twitter. The substantial size provides us with a high level of confidence that this dataset encompasses prevalent viewpoints on Chinese social media.

Each post was labeled with an attitude score toward the US on a scale of -2 (most unfavorable), -1 (somewhat unfavorable), 0 (neutral), 1 (somewhat favorable), and 2 (most favorable). Subsequently, we employed fine-tuning on a large language model, BERT, using these annotations for two tasks. The first task involved binary classification to determine whether a Weibo post conveyed attitudes toward the US. The second task was a regression model to predict the attitude score.

The daily attitude averaging across all users is provided in [media-data-average-opinion-us.csv], smoothed using a 270-day sliding window to filter out minor fluctuations.
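
To make the sliding-window smoothing concrete, here is a minimal Python sketch of one possible form: a trailing moving average. It is not the authors' code. The README does not specify whether the 270-day window is trailing or centered, or how the first days of the series are treated, so the function name `trailing_mean`, the trailing alignment, and the partial-window handling are assumptions. The values in the released file are already smoothed, so this step does not need to be reapplied.

```python
from collections import deque


def trailing_mean(values, window=270):
    """Trailing moving average over the last `window` entries.

    Assumption: early entries are averaged over however many days are
    available so far, since the README does not describe edge handling.
    """
    buf = deque()
    total = 0.0
    smoothed = []
    for v in values:
        buf.append(v)
        total += v
        if len(buf) > window:
            total -= buf.popleft()  # drop the day that fell out of the window
        smoothed.append(total / len(buf))
    return smoothed


# Tiny illustrative series with a 2-day window (made-up numbers).
example = trailing_mean([1, 2, 3, 4], window=2)
print(example)  # [1.0, 1.5, 2.5, 3.5]
```

A longer window such as 270 days removes short-lived spikes but also delays the point at which a sustained shift in attitudes becomes visible in the smoothed series.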