
Changelog
quallmer 0.3.0
CRAN release: 2026-02-16
CRAN submission
- Expanded DESCRIPTION with supported LLM providers, method details, and DOI references.
- Added \value documentation to all exported methods.
- Fixed HTML validation issue in qlm_validate() documentation.
Internal changes
- Refactored corpus methods to use the qlm_corpus wrapper class pattern instead of conditional registerS3method(), eliminating load-order dependencies and runtime checks (#86).
Accessor functions
- New qlm_meta() accessor function provides stratified access to metadata for qlm_coded, qlm_codebook, qlm_comparison, and qlm_validation objects. Metadata is organized into three types following the quanteda convention:
  - type = "user" (default): User-specified fields (name, notes) that can be modified via qlm_meta<-().
  - type = "object": Read-only parameters set at creation time (batch, call, chat_args, execution_args, parent, n_units, input_type).
  - type = "system": Read-only environment information (timestamp, ellmer_version, quallmer_version, R_version).
- New qlm_meta<-() replacement function allows modifying user metadata fields only. Attempting to modify object or system metadata produces an informative error (#72).
- New codebook() extractor retrieves the codebook component from qlm_coded, qlm_comparison, and qlm_validation objects. This is a core component accessor analogous to formula() for lm objects (#72).
- New inputs() extractor retrieves the original input data (texts or image paths) from qlm_coded objects. The function name mirrors the inputs argument in qlm_code() (#72).
- These accessor functions replace direct attr(x, "run")$... access, providing a stable API for extracting and modifying object metadata and components.
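The split between modifiable user metadata and read-only object and system metadata can be illustrated with a minimal base R sketch. The functions new_obj(), meta(), and meta<-() below are hypothetical stand-ins written for this illustration; they are not quallmer's implementation, and the real qlm_meta() signature may differ.

```r
# Hypothetical illustration only: new_obj(), meta(), and `meta<-`() are not
# part of quallmer. They mimic the three metadata types described above.
new_obj <- function(name = NULL, notes = NULL) {
  structure(
    list(),
    meta = list(
      user   = list(name = name, notes = notes),
      object = list(batch = FALSE, n_units = 0L),
      system = list(R_version = as.character(getRversion()))
    )
  )
}

# Accessor: return one metadata type, or a single field within it.
meta <- function(x, field = NULL, type = c("user", "object", "system")) {
  type <- match.arg(type)
  m <- attr(x, "meta")[[type]]
  if (is.null(field)) m else m[[field]]
}

# Replacement function: only user fields may be changed.
`meta<-` <- function(x, field, value) {
  user_fields <- c("name", "notes")
  if (!field %in% user_fields) {
    stop("Only user metadata (", paste(user_fields, collapse = ", "),
         ") can be modified.", call. = FALSE)
  }
  m <- attr(x, "meta")
  m$user[[field]] <- value
  attr(x, "meta") <- m
  x
}

x <- new_obj(name = "run1")
meta(x, "notes") <- "Pilot run"   # allowed: user field
meta(x, "notes")

# Attempting to change a read-only field raises an error.
res <- tryCatch({
  meta(x, "batch") <- TRUE
  "modified"
}, error = function(e) "error")
res
```

Keeping the writable fields in an explicit allow-list is one simple way to get the "informative error" behaviour; the package may enforce this differently.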
Build system
- pkgdown articles are now built locally via a Makefile to enable caching and avoid API key requirements in CI (#68).
Gold standard handling and validation improvements
- New as_qlm_coded() function replaces qlm_humancoded() as the primary function for converting human-coded or external data to qlm_coded objects. The new function includes an is_gold parameter to mark gold standard objects for automatic detection.
- as_qlm_coded() now supports quanteda corpus objects directly via S3 method dispatch. Document variables (docvars) are automatically converted to coded variables, with document names used as identifiers by default. This simplifies the workflow for corpus-based gold standards (#81).
- qlm_validate() now auto-detects gold standards marked with as_qlm_coded(data, is_gold = TRUE), making the gold = parameter optional when using marked objects. Explicit gold = still works for backward compatibility.
- qlm_validate() signature changed to qlm_validate(..., gold, by, ...) to support validating multiple coded objects against a single gold standard in one call. Results include a rater column identifying each object.
- qlm_humancoded() is now marked @keywords internal but remains exported for backward compatibility. New code should use as_qlm_coded().
- Gold standard objects display # Gold: Yes in their print output for easy identification.
- Improved error messages in qlm_validate() detect common mistakes like forgetting gold = or misspelling parameter names, with helpful suggestions for correction.
Confidence intervals and reliability metrics
- ci parameter added to qlm_compare() and qlm_validate() with options "none" (default), "analytic", or "bootstrap".
- Bootstrap confidence intervals now work for all metrics in both functions via percentile method with configurable bootstrap_n parameter (default 1000).
- Analytic confidence intervals available for ICC (via psych package) and Pearson’s r (via cor.test).
- Results include ci_lower and ci_upper columns when ci != "none".
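For readers unfamiliar with the percentile method mentioned above, the following base R sketch shows the general idea on simulated ratings. The helper percentile_ci() and the data are assumptions made for illustration; this is not the code quallmer runs internally.

```r
set.seed(42)

# Simulated (hypothetical) ratings from two coders on 50 units.
n <- 50
coder_a <- sample(1:5, n, replace = TRUE)
coder_b <- pmin(pmax(coder_a + sample(-1:1, n, replace = TRUE), 1), 5)

# Percentile bootstrap: resample units with replacement, recompute the
# statistic each time, and take empirical quantiles of the replicates.
percentile_ci <- function(x, y, stat, n_boot = 1000, conf = 0.95) {
  n <- length(x)
  boot_stats <- replicate(n_boot, {
    idx <- sample.int(n, n, replace = TRUE)  # keeps each (x, y) pair together
    stat(x[idx], y[idx])
  })
  alpha <- 1 - conf
  unname(quantile(boot_stats, c(alpha / 2, 1 - alpha / 2), na.rm = TRUE))
}

estimate <- cor(coder_a, coder_b)                    # Pearson's r
ci <- percentile_ci(coder_a, coder_b, stat = cor)    # lower and upper bounds
estimate
ci
```

Because stat is passed in as a function, the same resampling loop works for any agreement metric computed from paired ratings, which is why a bootstrap option can cover all metrics while analytic intervals exist only for some.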
Rater identification and combinability
- qlm_compare() results now include rater1, rater2, rater3, etc. columns containing the names of compared objects (from name attribute), enabling easy identification when combining multiple comparisons with dplyr::bind_rows().
- qlm_validate() results now include a rater column identifying which object is being validated, enabling easy combining of multiple validations.
- Both functions return data frames (class qlm_comparison and qlm_validation) instead of lists, making them easier to filter, combine, and analyze.
- Results from multiple qlm_compare() or qlm_validate() calls can be combined with bind_rows() for analysis across multiple coders or conditions.
API refinements
- qlm_code() default name parameter changed from "original" to NULL for cleaner output when names aren’t specified.
- Auto-conversion messages now recommend as_qlm_coded() instead of qlm_humancoded().
The quallmer audit trail
- New notes parameter in qlm_code(), qlm_replicate(), and as_qlm_coded() for documenting the rationale behind each coding run. Notes are displayed in print output and captured in qlm_trail().
- The trail API has been simplified to a single function following Lincoln and Guba’s (1985) audit trail concept for establishing trustworthiness in qualitative research.
- qlm_trail() now accepts an optional path argument. When provided, saves RDS archive and generates Quarto report with full audit trail documentation.
- The Quarto report includes all Lincoln and Guba audit trail components: instrument development (codebooks), process notes (run parameters and timeline), data reconstruction (comparisons and validations), and raw data summary.
- New replication section in generated reports provides environment setup instructions, API credential configuration, and executable R code to replicate each coding run.
- Removed helper functions: qlm_trail_save(), qlm_trail_export(), qlm_trail_report(), and qlm_archive(). Use qlm_trail(..., path = "filename") instead.
- qlm_trail() now generates fallback names for objects with missing name attribute.
quallmer 0.2.0
The quallmer audit trail
- New qlm_trail() function creates complete audit trails following Lincoln and Guba’s (1985) concept for establishing trustworthiness in qualitative research.
- Use qlm_trail(..., path = "filename") to save RDS archive and generate Quarto report.
- Trail print output shows summaries of comparisons and validations (level, subjects, raters, etc.) for better visibility into workflow assessment steps.
- All qlm_comparison and qlm_validation objects include run attributes capturing parent relationships, enabling full workflow traceability.
- Audit trail automatically captures branching workflows when multiple coded objects are compared or validated.
New API
The package introduces a new qlm_*() API with richer return objects and clearer terminology for qualitative researchers:
- qlm_codebook() defines coding instructions, replacing task() (#27).
- qlm_code() executes coding tasks and returns a tibble with coded results and metadata as attributes, replacing annotate() (#27). The returned qlm_coded object prints as a tibble and can be used directly in data manipulation workflows. Now includes name parameter for tracking runs and hierarchical attribute structure with provenance support.
- qlm_compare() compares multiple qlm_coded objects to assess inter-rater reliability. Automatically computes all statistically appropriate measures from the irr package based on the specified measurement level (nominal, ordinal, or interval).
- qlm_validate() validates a qlm_coded object against a gold standard (human-coded reference data). Automatically computes all statistically appropriate metrics based on the specified measurement level, using measures from the yardstick, irr, and stats packages. For nominal data, supports multiple averaging methods (macro, micro, weighted, or per-class breakdown).
- qlm_replicate() re-executes coding with optional overrides (model, codebook, parameters) while tracking provenance chain. Enables systematic assessment of coding reliability and sensitivity to model choices.
The new API uses the qlm_ prefix to avoid namespace conflicts (e.g., with ggplot2::annotate()) and follows the convention of verbs for workflow actions, nouns for accessor functions.
Restructured qlm_coded objects
- qlm_coded objects now use a hierarchical attribute structure with a run list containing name, batch, call, codebook, chat_args, execution_args, metadata, and parent fields. This structure supports provenance tracking across replication chains and provides clearer organization of coding metadata (#26).
  - The batch flag indicates whether batch processing was used.
  - execution_args replaces pcs_args and stores all non-chat execution arguments for both parallel and batch processing. Old objects with pcs_args remain compatible.
Example codebooks
- New example codebook data object data_codebook_sentiment provides a ready-to-use codebook for sentiment analysis.
- All predefined task_*() functions are deprecated in favor of using the data objects or creating custom codebooks with qlm_codebook().
Deprecated and superseded functions
- task() is deprecated in favor of qlm_codebook() (#27).
- annotate() is deprecated in favor of qlm_code() (#27).
- validate() is superseded by qlm_compare() (for inter-rater reliability) and qlm_validate() (for gold standard validation). The function remains available but is marked with a lifecycle badge.
- Trail functions (trail_settings(), trail_record(), trail_compare(), trail_matrix(), trail_icr()) are deprecated. Use qlm_code() with model and temperature parameters directly, or qlm_replicate() for systematic comparisons across models.
Backward compatibility: Old code continues to work with deprecation warnings. New qlm_codebook objects work with old annotate(), and old task objects work with new qlm_code(). This is achieved through dual-class inheritance where qlm_codebook inherits from both "qlm_codebook" and "task".
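The dual-class mechanism can be sketched in base R with S3. Everything below except the class names "qlm_codebook" and "task" is hypothetical (constructors, generics, and messages are invented for illustration), so treat it as a sketch of the pattern rather than the package's code.

```r
# Hypothetical constructors; only the class vectors mirror the changelog.
old_task <- function(instructions) {
  structure(list(instructions = instructions), class = "task")
}
new_codebook <- function(instructions) {
  structure(list(instructions = instructions),
            class = c("qlm_codebook", "task"))
}

# An "old API" generic that only knows about the task class.
run_old <- function(x) UseMethod("run_old")
run_old.task <- function(x) paste("old API got:", x$instructions)

# A "new API" generic with a method for each class.
run_new <- function(x) UseMethod("run_new")
run_new.qlm_codebook <- function(x) paste("new API got:", x$instructions)
run_new.task <- function(x) paste("new API got legacy task:", x$instructions)

cb <- new_codebook("code sentiment")
tk <- old_task("code sentiment")

run_old(cb)  # falls through to run_old.task because cb also inherits "task"
run_new(cb)  # dispatches on the first class, "qlm_codebook"
run_new(tk)  # legacy object handled by the task method
```

S3 dispatch walks the class vector from left to right, so placing "qlm_codebook" before "task" lets new methods take precedence while old methods remain reachable.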
Package restructuring
- validate_app() has been extracted into the companion package quallmer.app. This reduces dependencies in the core quallmer package (removing shiny, bslib, and htmltools from Imports). Install quallmer.app separately for interactive validation functionality.
Other changes
- qlm_validate() now uses distinct, statistically appropriate metrics for each measurement level:
  - Nominal (level = "nominal"): accuracy, precision, recall, F1-score, Cohen’s kappa (unweighted)
  - Ordinal (level = "ordinal"): Spearman’s rho, Kendall’s tau, MAE (mean absolute error)
  - Interval/Ratio (level = "interval"): ICC (intraclass correlation), Pearson’s r, MAE, RMSE (root mean squared error)

  The measure argument has been removed entirely; all appropriate measures are now computed automatically based on the level parameter. Function signature changed: level now comes before average, and average only applies to nominal (multiclass) data. Return values renamed for consistency: spearman → rho, kendall → tau, pearson → r. Print output uses “levels” terminology for ordinal data and “classes” for nominal data. This change provides more statistically sound validation that respects the mathematical properties of each measurement scale.
- qlm_compare() now computes all statistically appropriate measures for each measurement level:
  - Nominal (level = "nominal"): Krippendorff’s alpha (nominal), Cohen’s/Fleiss’ kappa, percent agreement
  - Ordinal (level = "ordinal"): Krippendorff’s alpha (ordinal), weighted kappa (2 raters only), Kendall’s W, Spearman’s rho, percent agreement
  - Interval/Ratio (level = "interval"): Krippendorff’s alpha (interval), ICC (intraclass correlation), Pearson’s r, percent agreement

  The measure argument has been removed entirely; all appropriate measures are now computed automatically and returned in the result object. The return structure changed from a single value to a list containing all computed measures for the specified level. Percent agreement is now computed for all levels; for ordinal/interval/ratio data, the tolerance parameter controls what counts as agreement (e.g., tolerance = 1 means values within 1 unit are considered in agreement).
- New qlm_humancoded() function converts human-coded data frames into qlm_humancoded objects (dual inheritance: qlm_humancoded + qlm_coded), enabling full provenance tracking for human coding alongside LLM results. Supports custom metadata for coder information, training details, and coding instructions (#43).
- qlm_validate() and qlm_compare() now accept plain data frames and automatically convert them to qlm_humancoded objects with an informational message. Users can call qlm_humancoded() directly to provide richer metadata (coder names, instructions, etc.) or use plain data frames for quick comparisons (#43).
- qlm_validate() and qlm_compare() now support non-standard evaluation (NSE) for the by argument, allowing both by = sentiment (unquoted) and by = "sentiment" (quoted) syntax. This provides a more natural, tidyverse-style interface while maintaining backward compatibility (#43).
- Print method for qlm_coded objects now distinguishes human from LLM coding, displaying “Source: Human coder” for qlm_humancoded objects instead of model information.
- Improved error messages in qlm_compare() and qlm_validate() now show which objects are missing the requested variable and list available alternatives.
- Adopt tidyverse-style error messaging via cli::cli_abort() and cli::cli_warn() throughout the package, replacing all stop(), stopifnot(), and warning() calls with structured, informative error messages.
- Documentation and CI notes refreshed.
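The tolerance rule described above for percent agreement in qlm_compare() can be illustrated for the two-rater case with a small base R sketch. The helper percent_agreement() and the ratings are hypothetical, and the changelog does not say how the package treats more than two raters, so this only shows the basic idea.

```r
# Hypothetical helper for two raters: share of units whose ratings differ
# by no more than `tolerance`.
percent_agreement <- function(a, b, tolerance = 0) {
  stopifnot(length(a) == length(b))
  mean(abs(a - b) <= tolerance)
}

rater_a <- c(1, 2, 3, 4, 5)
rater_b <- c(1, 3, 5, 4, 2)

exact   <- percent_agreement(rater_a, rater_b)                 # exact matches only
lenient <- percent_agreement(rater_a, rater_b, tolerance = 1)  # within 1 unit counts
exact
lenient
```

With tolerance = 0 only identical ratings count as agreement, which is the natural choice for nominal data; a positive tolerance only makes sense when differences between values are meaningful, as with ordinal or interval scales.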