
Fleiss' kappa for many raters
reliability_kappa_fleiss.Rd
Native implementation of Fleiss' generalisation of kappa to a constant
number of raters per subject (Fleiss, 1971), where the raters rating
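one subject need not be the same as those rating another. For two
raters use reliability_kappa() (Cohen's kappa). The two coefficients
can differ even on the same data: Cohen's computes chance agreement
from each rater's own marginal distribution, while Fleiss' pools the
marginals across all raters (with two raters it reduces to Scott's pi,
not Cohen's kappa).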
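
As a rough illustration of the pooled-marginals calculation described above, the point estimate can be sketched in base R from a subjects-by-categories count matrix. The helper name fleiss_kappa_sketch and the count-matrix input format are assumptions made for this example; they are not the package's own code or the interface of reliability_kappa_fleiss().

```r
# Hypothetical helper for illustration only; not the package implementation.
# counts: N x k matrix, counts[i, j] = number of raters who assigned
# subject i to category j. Every row must sum to the same number of raters.
fleiss_kappa_sketch <- function(counts) {
  counts <- as.matrix(counts)
  m <- rowSums(counts)
  stopifnot(length(unique(m)) == 1L, m[[1]] >= 2)
  m <- m[[1]]                      # constant number of raters per subject
  N <- nrow(counts)                # number of subjects

  # Pooled marginals: share of all N * m ratings that fall in each category
  p <- colSums(counts) / (N * m)

  # Observed agreement per subject: proportion of agreeing rater pairs
  P_i <- (rowSums(counts^2) - m) / (m * (m - 1))

  P_bar <- mean(P_i)               # mean observed agreement
  P_e <- sum(p^2)                  # chance agreement from pooled marginals

  # Undefined (NaN) when every rating falls in a single category (P_e == 1)
  (P_bar - P_e) / (1 - P_e)
}

# Made-up data: 4 subjects, 3 raters each, 2 categories
ratings <- rbind(c(2, 1), c(1, 2), c(3, 0), c(0, 3))
fleiss_kappa_sketch(ratings)       # approximately 0.333
```

Because only the per-subject category counts enter the calculation, the identity of the raters never matters, which is why different raters may rate different subjects.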
Value
A list with elements method, value, ci_lower, ci_upper,
per_value, n_observers, n_units, n_pairable. The bounds ci_lower and
ci_upper are derived from the asymptotic standard error in Fleiss
(1971, Eq. 16). per_value gives the per-category kappa_j from Fleiss
(1971, Eqs. 20-21).
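
To make the per-category values more concrete, the following base R sketch computes a kappa_j for each category under the common formulation kappa_j = (P_j - p_j) / (1 - p_j), where p_j is the pooled proportion of ratings in category j and P_j is the conditional probability that a second rater also chooses j. The helper name per_category_kappa_sketch and the count-matrix input are assumptions for illustration; the package's per_value element may be computed or labelled differently.

```r
# Hypothetical helper for illustration only; not the package implementation.
# counts: N x k matrix, counts[i, j] = number of raters who assigned
# subject i to category j. Every row must sum to the same number of raters.
per_category_kappa_sketch <- function(counts) {
  counts <- as.matrix(counts)
  m <- rowSums(counts)
  stopifnot(length(unique(m)) == 1L, m[[1]] >= 2)
  m <- m[[1]]
  N <- nrow(counts)

  # Pooled proportion of ratings in each category
  p <- colSums(counts) / (N * m)

  # Conditional agreement: given one rater chose category j, the
  # proportion of the remaining raters on that subject who also chose j
  P_j <- colSums(counts * (counts - 1)) / (N * m * (m - 1) * p)

  # Chance-corrected agreement per category; NaN for a category that is
  # never used (p_j == 0) or always used (p_j == 1)
  (P_j - p) / (1 - p)
}

# Made-up data: 4 subjects, 3 raters each, 2 categories
ratings <- rbind(c(2, 1), c(1, 2), c(3, 0), c(0, 3))
per_category_kappa_sketch(ratings)
```

With only two categories the two per-category values coincide with each other and with the overall coefficient; with more categories they show which categories the raters distinguish reliably.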
References
Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5), 378-382. doi:10.1037/h0031619