
FAQ
Naturally Occurring Data (NOD), sometimes also referred to as naturalistic data, constitute authentic, real-life records of face-to-face or technology-mediated interactions captured in various formats, including audio, video, and text. Unlike data that researchers generate through interviews, questionnaires, or experiments, NOD involve interactions that unfold naturally, without researchers directly interfering in what individuals are doing.
Examples of NOD include:
Video recordings of family dinner conversations
A Reddit thread discussing soccer
Audio recordings of telephone calls to a mental health helpline
YouTube videos of interactions between police officers and videographers
NOD are analysed by qualitative researchers from different disciplines such as communication, linguistics, psychology, and sociology using methods such as conversation analysis, ethnomethodology, discursive psychology, membership categorisation analysis and interactional linguistics. You can read more about NOD in these papers:
Hepburn, A., & Potter, J. (2022). Designing research for naturally occurring data. In U. Flick (Ed.), The Sage handbook of qualitative data analysis (pp. 483–501). Sage.
Speer, S. A. (2002). “Natural” and “contrived” data: A sustainable distinction? Discourse Studies, 4(4), 511–525.
Potter, J., & Huma, B. (2024). Ensuring quality: The power and potential of naturally occurring data in the social sciences. In U. Flick (Ed.), SAGE handbook of qualitative research quality. Sage.
Potter, J., & Shaw, C. (2018). The virtues of naturalistic data. In U. Flick (Ed.), The Sage handbook of qualitative data collection (pp. 182–199). Sage.
NOD most often come in the form of audio and video recordings or text records of social interactions that take place in real life. Therefore, NOD datasets (also called data corpora) usually comprise either audio/video files and their transcripts or, in the case of text-based interactions, text logs.
At the links below, you can find examples of NOD corpora:
NOD are usually accompanied by various ethnographic details about the participants and the setting/context in which the data were recorded.
Additionally, transcripts of NOD can come in various formats, some of which include specialised conventions. Three sets of conventions are mainly used with NOD:
The conversation analytic (also called Jefferson) transcribing conventions that capture how speech is produced (e.g., speed, volume, pitch, silences, and overlaps). You can read more about these conventions here:
Jefferson, G. (2004). Glossary of transcript symbols with an introduction. In G. H. Lerner (Ed.), Conversation analysis: Studies from the first generation (pp. 13–31). John Benjamins Publishing Company.
The GAT2 transcribing conventions, which, like the conversation analytic conventions, capture how speech is produced but include additional notations for recording prosody in more detail. You can read more about these conventions here: https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/166
The multimodal (also called Mondada) transcribing conventions that capture embodied movement. You can read more about these conventions here:
Mondada, L. (2018). Multiple temporalities of language and body in interaction: Challenges for transcribing multimodality. Research on Language & Social Interaction, 51(1), 85–106. https://doi.org/10.1080/08351813.2018.1413878
Whether NOD can be standardised depends on the research method and objective. In quantitative research, NOD can be “coded” by identifying and counting specific elements, such as how often people say hello, the frequency of pauses, or the number of times someone self-selects as a speaker. In addition, the actions speakers accomplish with their utterances can be categorised. This approach involves tallying elements of the data, which makes the data more structured and measurable.
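As a rough illustration of this kind of coding, the Python sketch below counts greetings and pauses in a short transcript. The transcript lines, speaker labels, and the function name `code_transcript` are invented for this example. The patterns assume a simplified Jefferson-style notation in which `(0.4)` marks a timed silence in seconds and `(.)` marks a micropause. Real transcripts contain many more symbols, so this should be read as a sketch under those assumptions rather than a ready-made coding scheme.

```python
import re
from collections import Counter

# Hypothetical transcript lines in a simplified Jefferson-style notation.
TRANSCRIPT = [
    "CAL: hello (0.4) is this the helpline",
    "OPR: hello (.) yes it is",
    "CAL: (1.2) i wanted to ask about something",
]

# Assumed patterns: (0.4) is a timed silence in seconds, (.) is a micropause.
TIMED_PAUSE = re.compile(r"\((\d+\.\d+)\)")
MICROPAUSE = re.compile(r"\(\.\)")
GREETING = re.compile(r"\bhello\b", re.IGNORECASE)


def code_transcript(lines):
    """Tally greetings and pauses, and sum the timed silence in seconds."""
    counts = Counter()
    total_silence = 0.0
    for line in lines:
        timed = TIMED_PAUSE.findall(line)  # captured durations as strings
        counts["timed_pauses"] += len(timed)
        counts["micropauses"] += len(MICROPAUSE.findall(line))
        counts["greetings"] += len(GREETING.findall(line))
        total_silence += sum(float(t) for t in timed)
    return counts, round(total_silence, 1)


counts, silence = code_transcript(TRANSCRIPT)
print(dict(counts), silence)
```

A tally like this only records that a pause or greeting occurred. It says nothing about what the pause or greeting is doing at that point in the interaction.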
In qualitative research, the focus shifts from counting occurrences to exploring them in detail. Researchers might examine specific instances, such as how speakers select themselves in conversations or what the opening of a phone call looks like. Each instance is analysed in depth to understand how social life is accomplished in interaction, rather than being treated as just another entry in a tally.
Importantly, these two research methods are not mutually exclusive. Many studies effectively combine quantitative coding with qualitative analysis to provide a more comprehensive understanding of social interactions.