Educative Interventions to Combat Misinformation: Evidence from a Field Experiment in India.

Author

Badrinathan, Sumitra

Published

2021

Reference

Badrinathan, Sumitra. 2021. “Educative Interventions to Combat Misinformation: Evidence from a Field Experiment in India.” American Political Science Review 115 (4): 1325–41. https://doi.org/10.1017/S0003055421000459.

Intervention

Code
intervention_info <- tibble(
    intervention_description = 'The study tested the effect of a literacy intervention with two slightly different variants. First, all participants in the treatment group received the following: "Pedagogical intervention: Next, respondents went through a learning module to help inoculate against misinformation. This included an hour-long discussion on encouraging people to verify information along with concrete tools to do so." The difference between the two treatment groups lay in some of the materials they were presented with: "Both treatment groups received the pedagogical intervention. However, one group received corrections to four pro-BJP false stories and the other received corrections to four anti-BJP false stories. Besides differences in the stories that were fact-checked, the tips on the flyer remained the same for both treatment groups." The author pooled the conditions in their study as well as in the data we have at hand. We therefore assign a single label (`intervention_label` = "literacy") to the treatment.',
    intervention_selection = NA,
    intervention_selection_description = 'The author pooled both treatment groups together after finding no differences in the treatment effect between the two. We follow the author, mostly because both treatment conditions seem sufficiently similar.',
    # the author measured detection of misinformation, not discernment  
    originally_identified_treatment_effect = NA,
    control_format = "picture")

# display
show_conditions(intervention_info)

Notes

From an e-mail exchange with the author, we know that the true news items were selected from “mainstream as well as fact checking sites” and that the news was presented in the format of “headline + image in some cases”.

The author measured detection of misinformation, not discernment. The outcome was only based on the misinformation items.
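To make the distinction concrete, here is a minimal sketch on invented toy data (not the study data): detection uses only the false items, while a discernment measure would contrast accuracy ratings of true and false items.

```r
# Toy data (invented for illustration; not the study data).
# 'correct' follows the paper's coding: 1 = correct response,
# i.e. "not accurate" for false news and "accurate" for true news.
toy <- data.frame(
  veracity = c(rep("false", 4), rep("true", 2)),
  correct  = c(1, 0, 1, 1, 1, 0)
)

# Detection (the outcome used here): share of false items answered correctly.
detection <- mean(toy$correct[toy$veracity == "false"])

# Discernment would also use the true items, e.g. mean accuracy rating
# of true news minus mean accuracy rating of false news.
toy$rated_accurate <- ifelse(toy$veracity == "false",
                             1 - toy$correct, toy$correct)
discernment <- mean(toy$rated_accurate[toy$veracity == "true"]) -
  mean(toy$rated_accurate[toy$veracity == "false"])
```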

“Receiving this hour-long media literacy intervention did not significantly increase respondents’ ability to identify misinformation on average.”

Data Cleaning

Read data.

Code
d <- read_csv("badrinathan_2021.csv")
Warning: One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
  dat <- vroom(...)
  problems(dat)
Rows: 1224 Columns: 339
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr   (22): po_name, vill_tow_name, near_place, enu_name, sup_name, q12_oth,...
dbl  (278): Serial.No., DeviceId.x, po_code, Newpo_code, po_pri_nu, enu_code...
lgl   (37): spd_bn, spd_res_name, spd_vill, spd_emu_info, gp_inf_cons, disp_...
dttm   (2): StartTime.x, StartTime.y

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

accuracy_raw, scale, veracity, Conditions (intervention_label, condition)

From the cleaning document of the author, we can deduce which variables correspond to false and which to true news items.

The false news items were:

dv1: CCTV: ‘cctv’
dv2: no terror attacks: ‘attacks’
dv3: pulwama fake photos: ‘pulwama’
dv4: ganga fake photos: ‘ganga’
dv5: fake plastic finger: ‘plastic’
dv6: soldier: ‘soldier’
dv7: gomutra: ‘gomutra’
dv8: rally: ‘rally’
dv9: child kidnap: ‘kidnap_dv’
dv10: 2000 note: ‘note’
dv11: patel statue: ‘patel’
dv12: flag on statue of liberty: ‘flag’
dv13: evm hacking: ‘evm’

The true news items were:

dv14: man ki baat: ‘true1’
dv15: pulwama: ‘true2’

Note that variables are coded 1 for correct responses (i.e. 1 corresponds to ‘not accurate’ for false news, and to ‘accurate’ for true news).

The data set does not distinguish between the two slightly different treatment groups, as the author pooled them for her analysis.

Code
# bring data to long format and recode variables
long_d <- d %>% 
  pivot_longer(c(attacks, pulwama, ganga, plastic, soldier, gomutra, rally, 
                 kidnap_dv, note, patel, flag, evm, true1, true2), 
               names_to = "item",
               values_to = "ratings") %>% 
  # make a binary 'veracity' variable identifying true and fake news
  mutate(veracity = ifelse(grepl('true', item), 'true', 'false'), 
         # make condition a factor
         condition = recode_factor(treatment, `0` = "control", `1` = "treatment"), 
         # add an intervention label
         intervention_label = ifelse(condition == "treatment", "literacy", NA),
         # recode accuracy responses for fake news
         # so that 1 = rated as accurate (just as is measured for true news)
         accuracy_raw = ifelse(veracity == 'false', 
                                  ifelse(ratings == 1, 0, 1), 
                                  ratings), 
         scale = "binary"
         )

# check
# long_d %>% 
#   group_by(item, veracity, treatment, condition) %>% 
#   summarize(n = n(), 
#             mean_rating = mean(ratings, na.rm=TRUE),
#             mean_accuracy= mean(accuracy_raw, na.rm=TRUE))

news_id, news_selection

In the previous section, we already created a news identifier (the item variable). Here we simply rename this identifier and record who selected the news items.

Code
long_d <- long_d |> 
  mutate(news_id = item, 
         news_selection = "researchers")

Concordance (concordance, partisan_identity, news_slant)

While political concordance is not explicitly coded, we can build it from participants’ political leaning and the political slant of the news items.

From the cleaning document, we know that participants’ party id (i.e., pro or contra the governing party) is coded in the variable ‘BJP’. The problem is that we don’t know which level (0, 1) corresponds to which identity. By replicating figure 5 from the paper (using the replication files available online), we determined that 0 = non-BJP and 1 = BJP.
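As an additional plausibility check (sketched here on invented toy data; the expected direction is our assumption, not a result reported in the paper), one could compare mean belief in a pro-BJP false story across the two levels of `BJP`: under the inferred coding, level 1 (BJP identifiers) should show higher belief.

```r
# Toy data (invented; not the study data): one pro-BJP false story,
# 'rating' = 1 if the respondent rated it as accurate.
toy <- data.frame(
  BJP    = c(0, 0, 1, 1),
  rating = c(0, 1, 1, 1)
)

# Mean belief in the pro-BJP false story by inferred party id;
# under the coding 0 = non-BJP, 1 = BJP, the mean for level 1
# should be the larger one.
belief_by_id <- tapply(toy$rating, toy$BJP, mean)
```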

We also know which fake news items are pro-BJP (gomutra, attacks, pulwama, soldier, flag, note) and which are anti-BJP (cctv, evm, ganga, kidnap_dv, plastic, patel). Regarding true news, combining the cleaning document and the supplement (table D.1), we know that true1 (man ki baat) is pro-BJP and true2 (pulwama) is anti-BJP.

Code
pro_BJP <- c("gomutra", "attacks", "pulwama", "soldier", "flag", "note", "true1")

long_d <- long_d %>% 
  # make a binary variable indicating political slant of news
  mutate(news_slant = ifelse(item %in% pro_BJP, "pro_BJP", "anti_BJP"),
         # make a clearer party id variable
         partisan_identity = recode_factor(BJP, `0` = "non_BJP", `1` = "BJP"),
         # combine party id and political slant 
         concordance = case_when(news_slant == "pro_BJP" & partisan_identity == "BJP" ~ "concordant",
                                 news_slant == "anti_BJP" & partisan_identity == "non_BJP" ~ "concordant", 
                                 TRUE ~ "discordant")
  )

# check 
# long_d %>% select(news_slant, partisan_identity, concordance)

subject_id

Code
# check id 
nrow(d) # n participants (one row per participant)
[1] 1224
Code
n_distinct(long_d$Serial.No.) # likely the correct variable
[1] 1224
Code
long_d %>% 
  group_by(Serial.No.) %>% 
  count() # second check: 14 observations per participant (one per news item)
# A tibble: 1,224 × 2
# Groups:   Serial.No. [1,224]
   Serial.No.     n
       <dbl> <int>
 1         1    14
 2         2    14
 3         4    14
 4         9    14
 5        10    14
 6        12    14
 7        13    14
 8        14    14
 9        16    14
10        17    14
# ℹ 1,214 more rows
Code
long_d <- long_d |> 
  mutate(subject_id = Serial.No.)

age

Code
# check age 
table(long_d$age, useNA = "always")

  18   19   20   21   22   23   24   25   26   27   28   29   30   31   32   33 
2086 1330 1680 1330 1204  784  770 1148  812  364  616  210  854  112  448  154 
  34   35   36   37   38   39   40   41   42   43   44   45   48   49   50   51 
 196  462  154   70  294   42  546   56  182   28   42  322  196   14  196   28 
  52   53   55   58   59   61   62   64   65   68   85 <NA> 
 112   42   84   28   42   14   14   14   28   14   14    0 

Identifiers (country, paper_id, experiment_id)

Code
# make final data
badrinathan_2021 <- long_d %>% 
  mutate(
    experiment_id = 1,
    country = "India",
    paper_id = "badrinathan_2021") |> 
  # add intervention info 
  bind_cols(intervention_info) |> 
  # reduce to target variables
  select(any_of(target_variables))

Write out data

Code
save_data(badrinathan_2021)