
The qlm_trail() function creates audit trails following Lincoln and Guba’s (1985) framework for establishing trustworthiness in qualitative research. It captures the complete decision history of your coding workflow.

The audit trail concept

Lincoln and Guba (1985, pp. 319–320) describe six categories of audit trail materials for qualitative research:

  1. Raw data: Original texts
  2. Data reduction products: Summaries, coded results
  3. Data reconstruction products: Themes, comparisons, findings
  4. Process notes: Methodological decisions, procedures
  5. Materials relating to intentions: Research proposals, notes
  6. Instrument development: Codebooks, protocols

quallmer automatically captures these materials as you work:

Component             What quallmer stores
Raw data              Original texts in coded objects
Data reduction        Coded results from each run
Data reconstruction   Comparison and validation results
Process notes         Model, temperature, timestamps, user notes
Intentions            The R function calls you made
Instruments           Codebook instructions and schema

How it works

Every quallmer object carries metadata about how it was created. The trail system connects these objects to show your complete workflow:

qlm_code()       →  Creates coded object with run metadata
      ↓
qlm_replicate()  →  Creates new object linked to parent
      ↓
qlm_compare()    →  Creates comparison linked to inputs
      ↓
qlm_trail()      →  Extracts and organizes all metadata

Testing branches

After initial coding with qlm_code(), the workflow branches into different types of quality assessment:

                           ┌─── a) Robustness (test-retest reliability)
                           │    qlm_replicate() → qlm_compare()
                           │
qlm_code() ────────────────┼─── b) Reliability (test-test across models)
                           │    qlm_replicate(<alter settings>) → qlm_compare()
                           │
                           └─── c) Accuracy (vs. gold standard)
                                qlm_validate()

a) Robustness via repetition: Run the same model and settings multiple times to establish test-retest reliability. Use qlm_replicate() with identical settings, then qlm_compare() to measure consistency.
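Sketched in code, branch (a) might look like the following. This assumes qlm_replicate() reuses the parent object's model and settings when none are overridden (as the branch diagram implies); the run name and notes are illustrative:

    # Sketch of branch (a): re-run with identical settings, then compare.
    rerun <- qlm_replicate(
      coded1,
      name  = "gpt4o_rerun",
      notes = "Same model and settings; test-retest check"
    )
    
    qlm_compare(coded1, rerun, by = sentiment)

High agreement between the original run and its identical replication indicates that the coding is stable under repeated prompting.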

b) Reliability via different models: Compare results across different LLMs or temperatures to establish test-test reliability. Use qlm_replicate() with altered settings (e.g., different model, temperature), then qlm_compare() to assess agreement.

c) Accuracy versus a gold standard: If human-coded gold standard data exists, use qlm_validate() to compute accuracy metrics (precision, recall, F1, etc.).
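The headline metrics that qlm_validate() reports can also be computed by hand, which is useful for checking your understanding of them. A minimal base-R sketch for a binary code (binary_f1() is our illustrative helper, not part of quallmer):

    # Precision, recall, and F1 for a binary code, given LLM labels and a
    # human-coded gold standard (both character vectors).
    binary_f1 <- function(pred, gold, positive = "pos") {
      tp <- sum(pred == positive & gold == positive)  # true positives
      fp <- sum(pred == positive & gold != positive)  # false positives
      fn <- sum(pred != positive & gold == positive)  # false negatives
      precision <- tp / (tp + fp)
      recall    <- tp / (tp + fn)
      f1        <- 2 * precision * recall / (precision + recall)
      round(c(precision = precision, recall = recall, f1 = f1), 3)
    }
    
    binary_f1(
      pred = c("pos", "pos", "neg", "pos", "neg"),
      gold = c("pos", "neg", "neg", "pos", "neg")
    )
    #> precision    recall        f1
    #>     0.667     1.000     0.800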

qlm_trail() collects all these branches into a single trail object, providing a complete record of what was tested and how the results compare.

Creating a trail

Pass any quallmer objects to qlm_trail():

library(quallmer)

# After your analysis...
coded1 <- qlm_code(
  texts, codebook,
  model = "openai/gpt-4o",
  name = "gpt4o",
  notes = "Initial coding with GPT-4o"
)

coded2 <- qlm_replicate(
  coded1,
  model = "anthropic/claude-sonnet-4",
  name = "claude",
  notes = "Replication to check model agreement"
)

comparison <- qlm_compare(coded1, coded2, by = sentiment)

# Create the audit trail
trail <- qlm_trail(coded1, coded2, comparison)
trail

The print output shows:

  • Run names and parent relationships
  • Timestamps and model information
  • Notes documenting why each run was performed
  • Comparison/validation summaries
  • Whether the chain is complete

Saving the trail

Supply a path argument to save the trail to disk as permanent files:

qlm_trail(coded1, coded2, comparison, path = "my_analysis")

This creates:

  • my_analysis.rds: Complete R object with all coded data
  • my_analysis.qmd: Quarto document for human-readable documentation

The Quarto report

The generated .qmd file contains:

  1. Trail summary: Number of runs, completeness status
  2. Instrument development: All codebooks used with instructions
  3. Process notes: Chronological record of each run
  4. Data reconstruction: Comparison and validation results
  5. Replication section: Code to reproduce the analysis

Render it with:

quarto::quarto_render("my_analysis.qmd")

Loading saved trails

Reload a trail to access the data:

trail <- readRDS("my_analysis.rds")

# Access runs
names(trail$runs)

# Get coded data from a run
trail$runs$gpt4o$data

# Check run metadata
trail$runs$gpt4o$metadata$timestamp
trail$runs$gpt4o$chat_args$name
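Because trail$runs behaves as a plain named list (as the access pattern above suggests), base R iteration works for pulling the same field from every run, e.g. to build a methods table. A small sketch, assuming each run stores its timestamp under metadata$timestamp as shown above:

    # Collect one metadata field across all runs.
    timestamps <- vapply(
      trail$runs,
      function(run) as.character(run$metadata$timestamp),
      character(1)
    )
    timestamps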

Best practices

  1. Name your runs: Use the name parameter for readable trails
  2. Add notes: Use the notes parameter to document why each run was performed
  3. Include all objects: Pass every relevant object to qlm_trail() for a complete chain
  4. Save early: Create trail files before sharing or publishing results

Reference

Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic Inquiry. Sage.