Making the most of webcam videos using Datavyu and Databrary

Videos are an essential part of conducting unmoderated remote research because they let us verify that data collected via behavioral measures are usable (how else would we know that a child actually did the study?). They also allow us to collect a richer dataset that can be used to answer novel research questions that traditional psychological measures cannot capture.

How our lab uses Datavyu

Once parents have uploaded their video recordings for each participant, the videos go through a Quality Assurance (QA) procedure as the first stage of data processing to determine whether the data are valid (see here for details of how we do this). But even for valid study sessions, some additional processing is needed to make sure that we end up with reliable, high-quality data. For example, because the studies are conducted with no direct contact with an experimenter, it’s important to ensure that children’s responses are not influenced by other people, so we code videos of study sessions for interference. A trained researcher watches the video and marks any portions of the study in which a parent or other person interfered with the child’s response. We do this coding in Datavyu, a free open-source video coding system (Lingeman, Freeman, & Adolph, 2014; Datavyu.org). Coding in Datavyu allows researchers to extract data from video recordings of study sessions in a concise and organized way. This includes coding for potential interference and coding for interesting aspects of children’s interactions with the people around them during testing (see here for more info about the role of parents and why it’s important to code for it).

The proportion of videos we code varies by study, depending on both the topic and whether aspects of the child’s interaction with parents are included as dependent variables in the study design. For studies in which parental interference is likely or is itself of interest, researchers might choose to code 100% of videos for interference. For studies in which interference is unlikely and parent-child interactions are not part of the planned study design, researchers might choose to code a randomly selected subset of videos in order to assess the interference rate (see here for more details).

Our previous unmoderated remote research with these methods has identified extremely low rates of parent or sibling interference (less than 1% of trials), and excluding such trials did not change the overall pattern of findings (Leshin, Leslie, & Rhodes, 2020). So for some studies, we randomly select 20% of videos to be coded by a trained researcher in order to estimate the interference rate. If the interference rate in the coded subset is less than 3%, we conclude that interference does not affect the study results (see this study for more information about interference rates found in our previous studies and detailed analysis). If the estimated rate is more than 3%, however, we instead code every video trial by trial and exclude any trial in which interference raises doubt about whether the answer reflects the child’s own response. If more than 25% of a participant’s trials are excluded due to interference, we do not include that participant’s data in analyses. We specify these coding plans in each study’s pre-registration (see here for an example).
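The decision rule described above can be written out as a short Python sketch. This is only an illustration, not the lab's actual pipeline: the function names are hypothetical, the subset size is rounded to the nearest whole video, and a rate of exactly 3% (which the text does not address) is treated here as requiring full coding.

```python
import random

SUBSET_FRACTION = 0.20         # share of videos coded to estimate the rate
RATE_THRESHOLD = 0.03          # estimated rate that triggers full coding
PARTICIPANT_THRESHOLD = 0.25   # share of excluded trials that drops a participant


def select_subset(video_ids, fraction=SUBSET_FRACTION, seed=None):
    """Randomly pick a fraction of videos to code (at least one video)."""
    rng = random.Random(seed)
    k = max(1, round(len(video_ids) * fraction))
    return sorted(rng.sample(list(video_ids), k))


def interference_rate(trial_flags):
    """Proportion of trials flagged as interfered; trial_flags is a list of bools."""
    if not trial_flags:
        raise ValueError("no coded trials")
    return sum(trial_flags) / len(trial_flags)


def needs_full_coding(rate, threshold=RATE_THRESHOLD):
    """True if every video should be coded trial by trial.

    Assumption: a rate of exactly 3% is treated as requiring full coding.
    """
    return rate >= threshold


def exclude_participant(trial_flags, threshold=PARTICIPANT_THRESHOLD):
    """True if more than 25% of the participant's trials were excluded."""
    return interference_rate(trial_flags) > threshold


# Example: 10 videos, 2 of which are selected for coding.
videos = [f"video_{i:02d}" for i in range(10)]
subset = select_subset(videos, seed=1)
```

Fixing the seed makes the random selection reproducible, which may be useful when the coding plan is pre-registered.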

To exclude trials for interference, the study has to be set up so that we know exactly what participants are seeing at any given point in the video. One way of setting this up is by embedding data in the study about when each trial begins and ends (see our guide for setting this up). Once you have run your study, and you have webcam videos of study sessions and a Qualtrics file with the onsets and offsets of each trial, you can import the timing data into Datavyu to track what participants are seeing during each part of their video. Here's a sample Ruby script that we use to import our Qualtrics data into Datavyu.
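Separately from that Ruby script, the underlying idea of lining up trial timing with coded interference can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the column names (trial, onset_ms, offset_ms) and the sample values are hypothetical, and the times are assumed to be in milliseconds from the start of the video. An actual Qualtrics export will look different.

```python
import csv
import io


def load_trials(csv_text):
    """Parse trial timing rows into (name, onset_ms, offset_ms) tuples.

    Column names are hypothetical; adapt them to your own export.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["trial"], int(row["onset_ms"]), int(row["offset_ms"]))
            for row in reader]


def trials_overlapping(trials, start_ms, end_ms):
    """Return names of trials whose time window overlaps [start_ms, end_ms)."""
    return [name for name, onset, offset in trials
            if onset < end_ms and start_ms < offset]


# Hypothetical timing data for one session.
sample = """trial,onset_ms,offset_ms
practice,0,5000
trial_1,5000,12000
trial_2,12000,20000
"""

trials = load_trials(sample)

# An interference episode coded from 11.0 s to 13.0 s of the video
# touches both trial_1 and trial_2.
affected = trials_overlapping(trials, 11000, 13000)
```

Any trial returned by trials_overlapping would then be a candidate for exclusion, subject to the coder's judgment about whether the interference could have influenced the child's response.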

In addition to using Datavyu to code for interference, we can also use the platform to code for variables of interest, such as conversations between parents and children and behavioral aspects of children’s interactions with the other people in their home environments (see this study for an example of using Datavyu to code for parent and child speech and behavior).

Sharing videos with other researchers via Databrary

Webcam video is rich data: each video offers a wealth of potential developmental data for researchers beyond the scope of the specific study’s questions. To share these data with the research community, we ask parents at the end of every study session for permission to share the video with authorized researchers on Databrary. Sharing videos on Databrary also helps with reproducibility and reliability (see here for more info).

Here is a screenshot of how that looks: