
Example: Fact checking of claims
The annotate() function with the predefined task_fact() task can be
used to fact-check claims made in texts. In this example, we apply the
task to a sample corpus of inaugural speeches by US presidents. The
fact-checking process evaluates the truthfulness of the claims made in
each speech and provides an explanation of the assessment. The outcome
is a truthfulness score from 0 to 10, where 0 indicates completely
false claims and 10 indicates the highest confidence in the
truthfulness of the claims.
Loading packages and data
# We will use the quanteda package
# for loading a sample corpus of inaugural speeches
# If you have not yet installed the quanteda package, you can do so by:
# install.packages("quanteda")
library(quanteda)
## Package version: 4.3.1
## Unicode version: 14.0
## ICU version: 71.1
## Parallel computing: disabled
## See https://quanteda.io for tutorials and examples.
## Loading required package: ellmer
# For educational purposes,
# we will use a subset of the inaugural speeches corpus
# The four most recent speeches in the corpus (2013 to 2025)
data_corpus_inaugural <- quanteda::data_corpus_inaugural[57:60]
Using annotate() for fact checking of claims in texts
# Apply predefined fact checking task with task_fact() in the annotate() function
result <- annotate(data_corpus_inaugural, task = task_fact(),
                   model_name = "openai/gpt-4o",
                   params = list(temperature = 0))
## [working] (0 + 0) -> 3 -> 1 | ■■■■■■■■■ 25%
## [working] (0 + 0) -> 1 -> 3 | ■■■■■■■■■■■■■■■■■■■■■■■ 75%
## [working] (0 + 0) -> 0 -> 4 | ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 100%
| id | truth_score | misleading_topic | explanation |
|---|---|---|---|
| 2013-Obama | 9 | | The text is a speech that reflects on American values, history, and aspirations. It accurately references historical events and foundational principles of the United States, such as the Declaration of Independence and the Constitution. The speech is largely aspirational and rhetorical, focusing on ideals rather than specific factual claims, which contributes to its high truthfulness score. There are no significant misleading topics or inaccuracies present. |
| 2017-Trump | 6 | Transfer of Power, Economic Claims, Foreign Policy, Crime and Poverty, Unity and Division | The speech contains several broad and ambitious claims that are difficult to verify or are overly simplistic. |
| 2021-Biden | 9 | | The text is a speech that emphasizes themes of unity, democracy, and hope. It accurately reflects historical events and current challenges, such as the COVID-19 pandemic and political divisions. The speech is largely aspirational and does not contain factual inaccuracies or misleading claims. Therefore, it receives a high truthfulness score. |
| 2025-Trump | 3 | Historical inaccuracies, Policy claims, Election results, Panama Canal ownership, Energy resources | The text contains several misleading or inaccurate claims. Historical inaccuracies include the assertion about the Panama Canal being ‘given’ to China, which is not true. Policy claims, such as ending the Green New Deal or declaring a national energy emergency, are speculative and lack context. The statement about winning the popular vote by millions and sweeping all swing states is unverifiable and likely exaggerated. The claim about the U.S. having the largest oil and gas reserves is misleading without context. These issues reduce the overall truthfulness of the text. |
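The scores in the table above can be post-processed like any other tabular data. The following is a minimal sketch, assuming the object returned by annotate() can be treated as a data frame with the columns id and truth_score. The data frame `scores` is typed in by hand from the table so that the snippet runs without an API call, and the cut-off of 5 is an arbitrary value chosen for illustration only.

```r
# Hand-typed copy of the id and truth_score columns from the table above
# (assumption: the annotate() result exposes these as data frame columns).
scores <- data.frame(
  id = c("2013-Obama", "2017-Trump", "2021-Biden", "2025-Trump"),
  truth_score = c(9, 6, 9, 3),
  stringsAsFactors = FALSE
)

# Hypothetical cut-off: flag texts scoring at or below 5 for manual review
threshold <- 5
flagged <- scores[scores$truth_score <= threshold, ]

# Sort from lowest to highest score so the least truthful text comes first
ranked <- scores[order(scores$truth_score), ]

print(flagged)
print(ranked$id)
```

A flag like this is best read as a pointer to texts worth a closer manual look, since the score is a model judgment rather than a verified measurement.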
Using annotate() for fact checking with a specific
number of claims to check
# Apply task_fact() again, limiting the check to three claims with max_topics
result_claims <- annotate(data_corpus_inaugural, task = task_fact(max_topics = 3),
                          model_name = "openai/gpt-4o",
                          params = list(temperature = 0))
## [working] (0 + 0) -> 3 -> 1 | ■■■■■■■■■ 25%
## [working] (0 + 0) -> 2 -> 2 | ■■■■■■■■■■■■■■■■ 50%
## [working] (0 + 0) -> 0 -> 4 | ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 100%
| id | truth_score | misleading_topic | explanation |
|---|---|---|---|
| 2013-Obama | 9 | | The text is a ceremonial speech, likely an inaugural address, that emphasizes American values, historical references, and aspirations. It contains broad, aspirational statements rather than specific factual claims, which are generally accurate and consistent with historical and cultural narratives. There are no obvious misleading claims or inaccuracies, hence a high truthfulness score. |
| 2017-Trump | 6 | Transfer of Power, Economic Claims, Foreign Policy | The text is a political speech, which often includes aspirational and rhetorical statements rather than factual claims. |
| 2021-Biden | 9 | | The text is a speech that emphasizes themes of unity, democracy, and hope. It accurately reflects historical events and current challenges, such as the COVID-19 pandemic and political divisions. The speech is largely aspirational and does not contain specific factual inaccuracies. However, as with any political speech, the effectiveness of proposed solutions and the realization of goals are subject to debate and interpretation. Overall, the speech is truthful and accurate in its representation of events and intentions. |
| 2025-Trump | 3 | Historical Inaccuracies, Policy Claims, Election Results | The speech contains several misleading or inaccurate claims. |
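To confirm that the limit was respected, the topics listed for each text can be counted. This is a sketch under assumptions: it treats misleading_topic as a single comma-separated string per text, as it appears in the rendered table above, whereas the actual return value may store topics differently. The vector `topics` is typed in by hand from the table, so no API call is needed.

```r
# Hand-typed copy of the misleading_topic column from the table above
# (assumption: topics arrive as one comma-separated string per text).
topics <- c(
  "2013-Obama" = "",
  "2017-Trump" = "Transfer of Power, Economic Claims, Foreign Policy",
  "2021-Biden" = "",
  "2025-Trump" = "Historical Inaccuracies, Policy Claims, Election Results"
)

# Split each string on commas, trim whitespace, and drop empty entries
split_topics <- lapply(strsplit(topics, ",", fixed = TRUE), function(x) {
  x <- trimws(x)
  x[nzchar(x)]
})

# Number of topics per text, to compare against max_topics = 3
n_topics <- lengths(split_topics)
print(n_topics)

# TRUE if no text lists more than three topics
print(all(n_topics <= 3))
```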
In this example, we demonstrated how to use the annotate() function
with task_fact() to fact-check claims in a corpus of inaugural
speeches. The results include a truth score, the misleading topics
identified, and an explanation for each text evaluated. The number of
claims to check can be adjusted with the max_topics parameter of
task_fact(). You can now apply this approach to your own texts for
fact-checking purposes.