
Creates a codebook definition for use with qlm_code(). A codebook specifies what information to extract from input data, including the instructions that guide the LLM and the structured output schema.

Usage

qlm_codebook(
  name,
  instructions,
  schema,
  role = NULL,
  input_type = c("text", "image"),
  levels = NULL
)

Arguments

name

Name of the codebook (character).

instructions

Instructions to guide the model in performing the coding task.

schema

Structured output definition, e.g., created by ellmer::type_object(), ellmer::type_array(), or ellmer::type_enum().

role

Optional role description for the model (e.g., "You are an expert annotator"). If provided, this will be prepended to the instructions when creating the system prompt.

input_type

Type of input data: "text" (default) or "image".

levels

Optional named list specifying the measurement level of each variable in the schema. Names should match the schema property names, and each value should be one of "nominal", "ordinal", "interval", or "ratio". If NULL (default), levels are auto-detected from the schema types: type_boolean, type_enum, and type_string map to nominal, type_integer maps to ordinal, and type_number maps to interval.
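The role and levels behaviours described above can be sketched in base R. This is an illustration only, not the package's implementation: compose_system_prompt() and resolve_levels() are hypothetical helpers, the blank-line separator between role and instructions is an assumption, and schema types are represented here as plain strings rather than ellmer type objects.

```r
# Hypothetical helpers for illustration; not part of the package API.

# Assumed composition: role first, then instructions, separated by a blank
# line. The separator actually used by qlm_codebook() is not documented here.
compose_system_prompt <- function(instructions, role = NULL) {
  if (is.null(role)) {
    return(instructions)
  }
  paste(role, instructions, sep = "\n\n")
}

# Documented default mapping from schema type to measurement level.
default_levels <- c(
  type_boolean = "nominal",
  type_enum    = "nominal",
  type_string  = "nominal",
  type_integer = "ordinal",
  type_number  = "interval"
)

# Look up the default level for each variable, then apply explicit overrides.
resolve_levels <- function(schema_types, levels = NULL) {
  resolved <- default_levels[schema_types]
  names(resolved) <- names(schema_types)
  if (!is.null(levels)) {
    valid <- c("nominal", "ordinal", "interval", "ratio")
    stopifnot(
      all(names(levels) %in% names(resolved)),
      all(unlist(levels) %in% valid)
    )
    resolved[names(levels)] <- unlist(levels)
  }
  resolved
}

# Variable name -> schema type, mirroring the sentiment example on this page.
schema_types <- c(score = "type_number", explanation = "type_string")

resolve_levels(schema_types)                                 # defaults only
resolve_levels(schema_types, levels = list(score = "ratio")) # one override
compose_system_prompt("Rate the sentiment.", role = "You are an expert annotator.")
```

Overriding a default is mainly useful when the schema type understates or overstates the measurement level, for example an enum whose categories are ordered.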

Value

A codebook object (a list with class c("qlm_codebook", "task")) containing the codebook definition. Use with qlm_code() to apply the codebook to data.

Details

This function replaces task(), which is now deprecated. The returned object carries both classes (c("qlm_codebook", "task")), so code written for task objects continues to work with codebooks.
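A minimal base R sketch of what the two-element class vector means for S3 dispatch. The object below is a stand-in built with structure(), not a real codebook, and describe() is a hypothetical generic introduced only for this illustration.

```r
# Stand-in object with the documented class vector; a real codebook
# would come from qlm_codebook().
cb <- structure(
  list(name = "Sentiment"),
  class = c("qlm_codebook", "task")
)

# Hypothetical generic with a method written only for the older class.
describe <- function(x, ...) UseMethod("describe")
describe.task <- function(x, ...) paste("task:", x$name)

inherits(cb, "qlm_codebook")  # TRUE
inherits(cb, "task")          # TRUE

# No describe.qlm_codebook() exists, so dispatch falls through to describe.task().
describe(cb)
```

Because "qlm_codebook" comes first in the class vector, a method defined for it would take precedence over the "task" method.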

See also

qlm_code() for applying codebooks to data, data_codebook_sentiment for a predefined codebook example, and task() for the deprecated predecessor of this function.

Examples

# Define a custom codebook (the type_*() schema helpers come from ellmer)
my_codebook <- qlm_codebook(
  name = "Sentiment",
  instructions = "Rate the sentiment from -1 (negative) to 1 (positive).",
  schema = type_object(
    score = type_number("Sentiment score from -1 to 1"),
    explanation = type_string("Brief explanation")
  )
)

# With a role
my_codebook_role <- qlm_codebook(
  name = "Sentiment",
  instructions = "Rate the sentiment from -1 (negative) to 1 (positive).",
  schema = type_object(
    score = type_number("Sentiment score from -1 to 1"),
    explanation = type_string("Brief explanation")
  ),
  role = "You are an expert sentiment analyst."
)

# With explicit measurement levels
my_codebook_levels <- qlm_codebook(
  name = "Sentiment",
  instructions = "Rate the sentiment from -1 (negative) to 1 (positive).",
  schema = type_object(
    score = type_number("Sentiment score from -1 to 1"),
    explanation = type_string("Brief explanation")
  ),
  levels = list(score = "interval", explanation = "nominal")
)

# \donttest{
# Use with qlm_code() (requires an API key in the OPENAI_API_KEY env var)
texts <- c("I love this!", "This is terrible.")
coded <- qlm_code(texts, my_codebook, model = "openai/gpt-4o-mini")
coded
# }
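The examples above all use text input and the default level detection. As a further sketch, the following combines input_type = "image" with an enum variable whose level is overridden to "ordinal". The codebook name, instructions, and variable names are made up for illustration; only the documented arguments of qlm_codebook() and ellmer's type_object(), type_enum(), and type_boolean() are used. The block is guarded so that it runs only when both packages are installed, and it makes no API call.

```r
# Illustrative only: the task and variable names are hypothetical.
if (requireNamespace("quallmer", quietly = TRUE) &&
    requireNamespace("ellmer", quietly = TRUE)) {
  image_codebook <- quallmer::qlm_codebook(
    name = "Image tone",
    instructions = "Classify the overall tone of the image.",
    schema = ellmer::type_object(
      # Arguments are named so the call does not depend on argument order.
      tone = ellmer::type_enum(
        values = c("negative", "neutral", "positive"),
        description = "Overall tone of the image"
      ),
      has_people = ellmer::type_boolean("Whether people are visible")
    ),
    input_type = "image",
    # type_enum defaults to nominal; these categories are ordered,
    # so the level is set explicitly.
    levels = list(tone = "ordinal", has_people = "nominal")
  )
  class(image_codebook)
}
```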