Native implementation of the intraclass correlation coefficient (ICC) family for a subjects x raters matrix of interval/ratio ratings. Six forms are available through the model, type, and unit arguments, following the Shrout-Fleiss naming and the McGraw-Wong calculation tables. The Details section maps each argument combination to its name in both conventions.

Usage

reliability_icc(
  ratings,
  model = c("oneway", "twoway"),
  type = c("consistency", "agreement"),
  unit = c("single", "average"),
  r0 = 0,
  conf.level = 0.95
)

Arguments

ratings

A subjects x raters matrix or data.frame of numeric ratings. Rows are objects of measurement (subjects); columns are raters. Must not contain NA.

model

"oneway" (each subject rated by a different random set of raters) or "twoway" (the same k raters rate every subject).

type

"consistency" (column variance excluded – relative agreement) or "agreement" (column variance included – absolute agreement). Ignored for model = "oneway".

unit

"single" (reliability of one rater's score) or "average" (reliability of the mean across k raters; the Spearman-Brown stepped-up form).

r0

Null-hypothesis value for the F-test. Default 0 tests H0: ICC = 0.

conf.level

Confidence level for the CI on the population ICC (default 0.95).
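
The Spearman-Brown step-up mentioned under unit can be illustrated with a short base R sketch. The helper name spearman_brown is hypothetical and not part of this package; it only shows how an average-unit coefficient relates to the corresponding single-unit coefficient for k raters.

```r
# Hypothetical helper (not part of the package): Spearman-Brown step-up.
# icc_single: reliability of one rater's score
# k:          number of raters whose scores are averaged
spearman_brown <- function(icc_single, k) {
  stopifnot(is.numeric(icc_single), is.numeric(k), k >= 1)
  (k * icc_single) / (1 + (k - 1) * icc_single)
}

# With a single-rater reliability of 0.50, averaging two raters gives 2/3.
spearman_brown(0.50, k = 2)
```

With k = 1 the function returns its input unchanged, which matches the intuition that averaging over one rater is the same as using that rater's score.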

Value

A list with elements:

method

Short label, e.g. "icc_2_1" or "icc_3_k".

value

The ICC estimate.

ci_lower, ci_upper

Confidence interval bounds at conf.level.

per_value

NULL (ICC has no per-category breakdown).

n_observers, n_units, n_pairable

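Counts: n_observers is the number of raters (k), n_units is the number of subjects (n), and n_pairable is the number of ratings (k*n).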

model, type, unit

The configuration that produced the ICC.

icc_name

Canonical Shrout-Fleiss name, e.g. "ICC(2,1)".

F_value, df1, df2, p_value, r0

F-test of H0: ICC = r0.
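
As a rough illustration of how the F-test and confidence interval fit together, the following base R sketch works through the one-way, single-unit case only. It is not this package's code: the function name oneway_single_inference is hypothetical, the returned element names merely echo those listed above, and the formulas are the standard one-way results from the references. The data are the 6 subjects x 4 judges example commonly reproduced from Shrout & Fleiss (1979).

```r
# Example ratings: 6 subjects (rows) x 4 judges (columns).
ratings <- matrix(
  c( 9, 2, 5, 8,
     6, 1, 3, 2,
     8, 4, 6, 8,
     7, 1, 2, 6,
    10, 5, 6, 9,
     6, 2, 4, 7),
  ncol = 4, byrow = TRUE
)

# Hypothetical helper: inference for the one-way, single-unit ICC.
oneway_single_inference <- function(x, r0 = 0, conf.level = 0.95) {
  x <- as.matrix(x)
  stopifnot(is.numeric(x), !anyNA(x))
  n <- nrow(x)  # subjects
  k <- ncol(x)  # raters

  # One-way ANOVA mean squares: between subjects and within subjects.
  msr <- k * sum((rowMeans(x) - mean(x))^2) / (n - 1)
  msw <- sum((x - rowMeans(x))^2) / (n * (k - 1))

  df1 <- n - 1
  df2 <- n * (k - 1)
  f_obs <- msr / msw

  # F statistic for H0: ICC = r0 (reduces to MSR / MSW when r0 = 0).
  f_value <- f_obs * (1 - r0) / (1 + (k - 1) * r0)
  p_value <- pf(f_value, df1, df2, lower.tail = FALSE)

  # Two-sided confidence interval built from F quantiles.
  alpha <- 1 - conf.level
  f_low <- f_obs / qf(1 - alpha / 2, df1, df2)
  f_upp <- f_obs * qf(1 - alpha / 2, df2, df1)

  list(
    value    = (msr - msw) / (msr + (k - 1) * msw),
    ci_lower = (f_low - 1) / (f_low + k - 1),
    ci_upper = (f_upp - 1) / (f_upp + k - 1),
    F_value  = f_value, df1 = df1, df2 = df2,
    p_value  = p_value, r0 = r0
  )
}

res <- oneway_single_inference(ratings)
round(unlist(res), 3)
```

The two-way agreement forms need a more involved interval with approximated degrees of freedom, which this sketch does not attempt.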

Details

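| model  | type        | unit    | Shrout & Fleiss | McGraw & Wong |
|--------|-------------|---------|-----------------|---------------|
| oneway | (n/a)       | single  | ICC(1,1)        | ICC(1)        |
| oneway | (n/a)       | average | ICC(1,k)        | ICC(k)        |
| twoway | consistency | single  | ICC(3,1)        | ICC(C,1)      |
| twoway | consistency | average | ICC(3,k)        | ICC(C,k)      |
| twoway | agreement   | single  | ICC(2,1)        | ICC(A,1)      |
| twoway | agreement   | average | ICC(2,k)        | ICC(A,k)      |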

For model = "oneway" the type argument is ignored (only one form exists). The two-way random and two-way mixed models share the same calculations; they differ only in interpretation (whether the column factor levels are treated as a random sample or as fixed). See Koo & Li (2016) for guidance on selecting a form.
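
To make the six forms concrete, here is a minimal base R sketch that computes each coefficient from the ANOVA mean squares using the standard formulas from the references. It is an illustration under stated assumptions, not this package's implementation: the function name icc_six_forms is hypothetical, and the data are the 6 subjects x 4 judges example commonly reproduced from Shrout & Fleiss (1979).

```r
# Example ratings: 6 subjects (rows) x 4 judges (columns).
ratings <- matrix(
  c( 9, 2, 5, 8,
     6, 1, 3, 2,
     8, 4, 6, 8,
     7, 1, 2, 6,
    10, 5, 6, 9,
     6, 2, 4, 7),
  ncol = 4, byrow = TRUE
)

# Hypothetical helper: all six ICC point estimates from mean squares.
icc_six_forms <- function(x) {
  x <- as.matrix(x)
  stopifnot(is.numeric(x), !anyNA(x))
  n <- nrow(x)  # subjects
  k <- ncol(x)  # raters
  grand <- mean(x)

  # Sums of squares for rows (subjects), columns (raters), and total.
  ss_rows  <- k * sum((rowMeans(x) - grand)^2)
  ss_cols  <- n * sum((colMeans(x) - grand)^2)
  ss_total <- sum((x - grand)^2)

  msr <- ss_rows / (n - 1)                        # between subjects
  msc <- ss_cols / (k - 1)                        # between raters
  msw <- (ss_total - ss_rows) / (n * (k - 1))     # within subjects (one-way)
  mse <- (ss_total - ss_rows - ss_cols) /
    ((n - 1) * (k - 1))                           # residual (two-way)

  c(
    "ICC(1,1)" = (msr - msw) / (msr + (k - 1) * msw),
    "ICC(1,k)" = (msr - msw) / msr,
    "ICC(2,1)" = (msr - mse) /
      (msr + (k - 1) * mse + (k / n) * (msc - mse)),
    "ICC(2,k)" = (msr - mse) / (msr + (msc - mse) / n),
    "ICC(3,1)" = (msr - mse) / (msr + (k - 1) * mse),
    "ICC(3,k)" = (msr - mse) / msr
  )
}

icc <- icc_six_forms(ratings)
round(icc, 2)
```

Rounded to two decimals, the six values should match those published for this dataset: 0.17, 0.44, 0.29, 0.62, 0.71, and 0.91, in the order listed in the code. The large gap between ICC(2,1) and ICC(3,1) reflects the substantial differences between the judges' mean ratings, which count against agreement but not against consistency.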

References

Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86(2), 420-428. doi:10.1037/0033-2909.86.2.420

McGraw, K. O., & Wong, S. P. (1996). Forming inferences about some intraclass correlation coefficients. Psychological Methods, 1(1), 30-46. doi:10.1037/1082-989X.1.1.30

Koo, T. K., & Li, M. Y. (2016). A guideline of selecting and reporting intraclass correlation coefficients for reliability research. Journal of Chiropractic Medicine, 15(2), 155-163. doi:10.1016/j.jcm.2016.02.012