
Example: Salience of topics
The annotate() function, combined with the predefined
task_salience() task, identifies and ranks the topics discussed
in a text by salience. This example applies the task to a sample
corpus of inaugural speeches by US presidents.
Loading packages and data
# We will use the quanteda package
# to load a sample corpus of inaugural speeches.
# If you have not yet installed the quanteda package, you can do so with:
# install.packages("quanteda")
library(quanteda)
## Package version: 4.3.1
## Unicode version: 14.0
## ICU version: 71.1
## Parallel computing: disabled
## See https://quanteda.io for tutorials and examples.
# quallmer provides annotate() and task_salience()
library(quallmer)
## Loading required package: ellmer
# For educational purposes,
# we will use a subset of the inaugural speeches corpus:
# the four most recent speeches in the corpus
data_corpus_inaugural <- quanteda::data_corpus_inaugural[57:60]
Using annotate() for salience of ANY topics discussed in texts
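The annotate() calls in this example send each speech to a hosted OpenAI model, so they need credentials. The check below is a minimal sketch, assuming the key is supplied through the OPENAI_API_KEY environment variable (the convention ellmer's OpenAI backend follows, to our knowledge). The helper has_env_key() is hypothetical and not part of quallmer.

```r
# Hypothetical helper (not part of quallmer): TRUE if the environment
# variable is set to a non-empty string.
has_env_key <- function(var = "OPENAI_API_KEY") {
  nzchar(Sys.getenv(var, unset = ""))
}

# Assumption: the OpenAI backend reads its key from OPENAI_API_KEY.
if (!has_env_key()) {
  message("OPENAI_API_KEY is not set; annotate() calls to OpenAI models will likely fail.")
}
```

Running the check first avoids discovering a missing key only after a batch of requests has started.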
# Apply the predefined salience task with task_salience() in the annotate() function
result <- annotate(data_corpus_inaugural, task = task_salience(),
                   model_name = "openai/gpt-4o",
                   params = list(temperature = 0))
## [working] (0 + 0) -> 3 -> 1 | ■■■■■■■■■ 25%
## [working] (0 + 0) -> 2 -> 2 | ■■■■■■■■■■■■■■■■ 50%
## [working] (0 + 0) -> 0 -> 4 | ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 100%
| id | topics | explanation |
|---|---|---|
| 2013-Obama | American Values and Ideals, Equality and Civil Rights, Collective Action and Unity, Economic Prosperity and Middle Class, Role of Government and Policy Reform | |
| 2017-Trump | Transfer of Power to the People, America First Policy, National Unity and Patriotism, Economic Rebuilding and Job Creation, Critique of Political Establishment | America First Policy: The repeated focus on prioritizing American interests in trade, immigration, and foreign affairs underscores this topic’s prominence. National Unity and Patriotism: The speech frequently calls for unity and patriotism, emphasizing shared values and protection by military and God. Economic Rebuilding and Job Creation: There is significant emphasis on rebuilding infrastructure, creating jobs, and economic prosperity. Critique of Political Establishment: The speech criticizes the political establishment for failing the people, marking it as a key point. |
| 2021-Biden | Unity, Democracy, Challenges facing America, American values and ideals, COVID-19 pandemic | Democracy: Democracy is a key focus, with references to its fragility, the recent attack on the Capitol, and the peaceful transfer of power. The speech celebrates the triumph of democracy and the will of the people. Challenges facing America: The text outlines various challenges such as political extremism, racial injustice, and climate change, emphasizing the need to address these issues collectively. American values and ideals: The speech frequently references American values like liberty, dignity, and truth, framing them as guiding principles for the nation. COVID-19 pandemic: The pandemic is mentioned as a significant current challenge, with references to its impact on lives and the economy, and the need for a united response. |
| 2025-Trump | American Renewal and Greatness, Government Reform and Efficiency, National Security and Border Control, Economic Policies and Energy Independence, Unity and National Pride | Government Reform and Efficiency: There is significant focus on reforming government structures, restoring integrity, and ending corruption. The mention of a new Department of Government Efficiency and stopping government censorship underscores this. National Security and Border Control: The speech discusses securing borders, ending illegal immigration, and designating cartels as terrorist organizations, indicating a strong emphasis on national security. Economic Policies and Energy Independence: Economic revitalization through energy independence, manufacturing, and trade reforms is a major theme, with specific policies like “drill, baby, drill” and ending the Green New Deal. Unity and National Pride: The speech frequently mentions national unity, pride, and the collective American spirit, aiming to inspire and unify the nation under shared values and goals. |
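The topics column lists topics in descending order of salience, so position in the list is the rank. The following is a minimal sketch of reshaping such output into one row per speech and topic. It assumes the column holds a comma-separated string, as rendered in the table; the real annotate() return value may store it differently. The result_demo data frame is a hypothetical stand-in typed in from two rows of the table, so no API call is needed.

```r
# Hypothetical stand-in for two rows of the annotate() output shown above.
# Assumption: `topics` is a single comma-separated string per document.
result_demo <- data.frame(
  id = c("2013-Obama", "2017-Trump"),
  topics = c(
    paste("American Values and Ideals, Equality and Civil Rights,",
          "Collective Action and Unity, Economic Prosperity and Middle Class,",
          "Role of Government and Policy Reform"),
    paste("Transfer of Power to the People, America First Policy,",
          "National Unity and Patriotism, Economic Rebuilding and Job Creation,",
          "Critique of Political Establishment")
  ),
  stringsAsFactors = FALSE
)

# Split each string on commas and strip surrounding whitespace
topic_list <- lapply(strsplit(result_demo$topics, ",", fixed = TRUE), trimws)

# One row per (speech, topic); rank restarts at 1 for every speech
ranked <- data.frame(
  id    = rep(result_demo$id, lengths(topic_list)),
  rank  = sequence(lengths(topic_list)),
  topic = unlist(topic_list),
  stringsAsFactors = FALSE
)

print(ranked)
```

The long format makes it straightforward to filter by rank or to join the ranks with other document-level variables.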
Using annotate() for salience of a SPECIFIED LIST of topics discussed in texts
# Define a list of topics to focus on
topics <- c("economy", "health", "education", "environment", "foreign policy")
# Apply the predefined salience task with task_salience() in the annotate() function
result <- annotate(data_corpus_inaugural, task = task_salience(topics),
                   model_name = "openai/gpt-4o",
                   params = list(temperature = 0))
## [working] (0 + 0) -> 3 -> 1 | ■■■■■■■■■ 25%
## [working] (0 + 0) -> 1 -> 3 | ■■■■■■■■■■■■■■■■■■■■■■■ 75%
## [working] (0 + 0) -> 0 -> 4 | ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 100%
| id | topics | explanation |
|---|---|---|
| 2013-Obama | economy, environment, foreign policy, education, health | The text prominently discusses the economy, emphasizing economic recovery, the importance of a rising middle class, and the need for infrastructure and job creation. The environment is highlighted through the commitment to addressing climate change and sustainable energy. Foreign policy is addressed with references to alliances, peace, and global engagement. Education is mentioned in the context of training teachers and reforming schools. Health is touched upon with references to healthcare costs and social security programs. |
| 2017-Trump | economy, foreign policy, education, health, environment | The text primarily focuses on the economy, discussing jobs, factories, and wealth redistribution, making it the most salient topic. Foreign policy is also emphasized, particularly in terms of trade and alliances. Education is mentioned in the context of failing systems and the need for improvement. Health is briefly touched upon with references to disease. The environment is not directly addressed, making it the least salient of the listed topics. |
| 2021-Biden | health, foreign policy, environment, economy, education | |
| 2025-Trump | foreign policy, economy, health, environment, education | Economy: Economic issues are highlighted through mentions of inflation, energy policies, manufacturing, and tariffs, indicating a focus on economic revitalization. Health: The text briefly touches on public health, mentioning the COVID vaccine mandate and a commitment to ending disease epidemics. Environment: Environmental policies are mentioned in the context of energy production and the rejection of the Green New Deal, indicating a stance on environmental issues. Education: The education system is criticized for teaching children to be ashamed, suggesting a need for reform, though it is less emphasized than other topics. |
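Because every speech is ranked against the same five topics, the orderings can be compared directly. The sketch below builds a topic-by-speech rank matrix and an average rank per topic. The orderings are typed in from the table above rather than read from the annotate() result, and the object names are illustrative only.

```r
topics <- c("economy", "health", "education", "environment", "foreign policy")

# Salience orderings copied from the table above (most salient first)
orderings <- list(
  "2013-Obama" = c("economy", "environment", "foreign policy", "education", "health"),
  "2017-Trump" = c("economy", "foreign policy", "education", "health", "environment"),
  "2021-Biden" = c("health", "foreign policy", "environment", "economy", "education"),
  "2025-Trump" = c("foreign policy", "economy", "health", "environment", "education")
)

# match() returns the position of each topic within a speech's ordering,
# which is that topic's salience rank (1 = most salient)
rank_matrix <- sapply(orderings, function(x) match(topics, x))
rownames(rank_matrix) <- topics

# Lower mean rank = more salient on average across the four speeches
mean_rank <- sort(rowMeans(rank_matrix))

print(rank_matrix)
print(mean_rank)
```

Averaging ranks is only one possible summary; it treats the gap between adjacent ranks as equal, which the model output does not guarantee.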
Adjusting task_salience() so it also returns the stance for each topic
# Customizing the task to include the stance for each topic
custom_task <- task(
  name = "Salience and stance of topics",
  system_prompt = paste(
    "You are an expert analysing the content of texts.",
    "",
    "Task:",
    "- Read the text carefully.",
    "- Identify and rank the salience of the following topics: economy, health, education, environment, foreign policy.",
    "- For each topic mentioned, assign a stance as one of the following:",
    " pro, neutral, or contra.",
    "- Append the stance directly after each topic name in the form 'topic: stance'.",
    "- Return all topic:stance entries in descending order of salience.",
    "- Separate entries with commas when presenting them in a list.",
    "",
    "Do not infer information that is not in the text.",
    "Base all evaluations solely on the language and arguments in the document.",
    "",
    "Output:",
    "- `topic_stance`: a ranked list of topic labels with stance labels appended (e.g., 'economy: pro', 'health: contra').",
    "- `explanation`: a brief justification explaining why the topics were ordered and how stance was determined.",
    sep = "\n"
  ),
  type_def = ellmer::type_object(
    topic_stance = ellmer::type_array(
      ellmer::type_string("Topic and stance label combined (e.g., 'economy: pro'), ranked by salience.")
    ),
    explanation = ellmer::type_string(
      "Brief justification for the salience ordering and stance classification."
    )
  ),
  input_type = "text"
)
# Apply the customized task in the annotate() function
custom_result <- annotate(data_corpus_inaugural, task = custom_task,
                          model_name = "openai/gpt-4o",
                          params = list(temperature = 0))
## [working] (0 + 0) -> 3 -> 1 | ■■■■■■■■■ 25%
## [working] (0 + 0) -> 2 -> 2 | ■■■■■■■■■■■■■■■■ 50%
## [working] (0 + 0) -> 0 -> 4 | ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 100%
| id | topic_stance | explanation |
|---|---|---|
| 2013-Obama | economy: pro, environment: pro, foreign policy: pro, health: pro, education: pro | The text emphasizes the importance of a strong economy, highlighting the need for infrastructure, fair competition, and a rising middle class, indicating a pro stance on the economy. The environment is addressed with a commitment to tackling climate change and leading in sustainable energy, showing a pro stance. Foreign policy is discussed in terms of maintaining alliances and promoting peace, indicating a pro stance. Health is mentioned in the context of reducing healthcare costs and supporting social safety nets, suggesting a pro stance. Education is noted as essential for future success, with a focus on reforming schools and training workers, indicating a pro stance. |
| 2017-Trump | economy: pro, foreign policy: neutral, education: contra, environment: neutral, health: neutral | The text emphasizes economic issues, focusing on job creation, rebuilding infrastructure, and prioritizing American workers, indicating a pro stance on the economy. Foreign policy is addressed with a focus on ‘America first’ and alliances, suggesting a neutral stance as it balances protectionism with international cooperation. Education is mentioned negatively, highlighting a failing system, which suggests a contra stance. The environment is not directly addressed, but infrastructure plans imply a neutral stance. Health is mentioned in the context of eradicating disease, but not in detail, leading to a neutral stance. |
| 2021-Biden | democracy: pro, unity: pro, health: pro, foreign policy: pro, economy: pro, environment: pro | The speech primarily focuses on the theme of democracy, emphasizing its triumph and fragility, making ‘democracy: pro’ the most salient topic. Unity is a central theme, repeatedly mentioned as essential for overcoming challenges, hence ‘unity: pro’ is next. Health is addressed through the context of the pandemic, highlighting its impact and the need for a unified response, leading to ‘health: pro’. Foreign policy is discussed in terms of repairing alliances and engaging globally, resulting in ‘foreign policy: pro’. The economy is mentioned in relation to job losses and rebuilding, thus ‘economy: pro’. The environment is touched upon with a call to address climate crises, making ‘environment: pro’ relevant but less emphasized. |
| 2025-Trump | foreign policy: pro, economy: pro, environment: contra, health: contra, education: contra | The text emphasizes foreign policy with a strong stance on border security, military strength, and international respect, making it the most salient topic with a pro stance. The economy is also prominent, focusing on energy independence, manufacturing, and tariffs, indicating a pro stance. The environment is addressed negatively with the rejection of the Green New Deal, showing a contra stance. Health is mentioned in the context of a failing public health system and COVID vaccine mandates, suggesting a contra stance. Education is criticized for teaching negative views of the country, indicating a contra stance. |
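Note that the 2021-Biden row contains two topics, democracy and unity, that are not in the list given in the prompt, and omits education. It is therefore worth validating the returned labels before analysis. The sketch below splits 'topic: stance' entries into separate columns and flags off-list topics. The entries are typed in from the 2021-Biden row, and all object names are illustrative.

```r
allowed_topics  <- c("economy", "health", "education", "environment", "foreign policy")
allowed_stances <- c("pro", "neutral", "contra")

# 'topic: stance' entries copied from the 2021-Biden row above
entries <- c("democracy: pro", "unity: pro", "health: pro",
             "foreign policy: pro", "economy: pro", "environment: pro")

# Split each entry at the colon into a topic part and a stance part
parts <- strsplit(entries, ":", fixed = TRUE)

parsed <- data.frame(
  rank   = seq_along(entries),
  topic  = trimws(vapply(parts, `[`, character(1), 1)),
  stance = trimws(vapply(parts, `[`, character(1), 2)),
  stringsAsFactors = FALSE
)

# Flag labels that fall outside the lists given in the prompt
parsed$topic_ok  <- parsed$topic %in% allowed_topics
parsed$stance_ok <- parsed$stance %in% allowed_stances

off_list       <- parsed$topic[!parsed$topic_ok]
missing_topics <- setdiff(allowed_topics, parsed$topic)

print(parsed)
print(off_list)
print(missing_topics)
```

Whether to drop off-list rows or tighten the prompt and rerun is a judgment call; the check only makes the deviation visible.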
In this example, we demonstrated how to use
task_salience() to identify and rank the topics
discussed in texts, both with and without a predefined list of topics.
We also showed how to define a custom task that adds a stance
classification for each topic. Together, these examples illustrate the
flexibility of the annotate() function and the task framework in
quallmer for text analysis.