F-measure (F-beta) — metric_f

Native implementation of the F-beta score (default beta = 1, which gives F1, the harmonic mean of precision and recall). The macro and macro-weighted forms compute the (possibly weighted) arithmetic mean of the per-class F-beta scores, the convention used by yardstick and scikit-learn (Manning et al. 2008, ch. 13). This differs from Sokolova & Lapalme (2009, Table 3), where the macro F-score is computed directly from the macro-averaged precision and recall; the two generally give different values and coincide only in special cases, for example when every class has the same precision and the same recall. Micro pools TP, FP, and FN globally before computing F-beta.
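
The difference between the two macro conventions can be checked by hand. The following is a minimal base-R sketch, not the package's implementation: the helper f_beta() and the per-class counts are made up for illustration, and the formula used is the standard F-beta definition, (1 + beta^2) * P * R / (beta^2 * P + R).

```r
# Hypothetical helper for illustration only; not the package's internal code.
f_beta <- function(precision, recall, beta = 1) {
  (1 + beta^2) * precision * recall / (beta^2 * precision + recall)
}

# Made-up one-vs-rest counts for a three-class, single-label problem.
tp <- c(a = 8, b = 3, c = 5)
fp <- c(a = 2, b = 1, c = 4)
fn <- c(a = 1, b = 5, c = 1)

precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)

# Convention described on this page: average the per-class F-beta scores.
macro_mean_of_f <- mean(f_beta(precision, recall))          # about 0.670

# Sokolova & Lapalme convention: F-beta of the averaged precision and recall.
macro_f_of_means <- f_beta(mean(precision), mean(recall))   # about 0.700

# Micro: pool the counts first, then compute F-beta once.
micro <- f_beta(sum(tp) / sum(tp + fp), sum(tp) / sum(tp + fn))
accuracy <- sum(tp) / sum(tp + fn)

# Weighted macro: weight per-class scores by truth-class prevalence.
macro_weighted <- weighted.mean(f_beta(precision, recall), w = tp + fn)
```

With these counts the two macro values differ by about 0.03, and the micro value equals the accuracy (16/23), as expected for single-label multi-class data.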

Usage

metric_f_meas(
  truth,
  estimate,
  estimator = c("binary", "macro", "macro_weighted", "micro"),
  event_level = c("first", "second"),
  beta = 1
)

Arguments

truth: Factor (or coercible) of true class labels.
estimate: Factor (or coercible) of predicted class labels. Must take values from the same level set as truth.
estimator: One of "binary" (exactly two classes; uses event_level), "macro" (unweighted mean of per-class F-beta scores), "macro_weighted" (mean of per-class F-beta scores weighted by truth-class prevalence), or "micro" (TP, FP, and FN pooled across all classes before computing F-beta; for single-label multi-class data this equals accuracy).
event_level: For estimator = "binary": which level is the positive event, "first" (default) or "second".
beta: Positive numeric. beta = 1 (default) gives the familiar F1; beta < 1 weights precision more, beta > 1 weights recall more.
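
The effect of beta can be illustrated with a small base-R sketch. It assumes the standard F-beta formula and uses a hypothetical helper f_beta() with made-up precision and recall values; it does not call the package.

```r
# Hypothetical helper for illustration only; not the package's internal code.
f_beta <- function(precision, recall, beta = 1) {
  (1 + beta^2) * precision * recall / (beta^2 * precision + recall)
}

# A classifier with high precision but modest recall (made-up values).
precision <- 0.9
recall    <- 0.5

betas  <- c(0.5, 1, 2)
scores <- sapply(betas, function(b) f_beta(precision, recall, beta = b))
names(scores) <- paste0("beta=", betas)
print(round(scores, 3))
```

Because precision exceeds recall here, the score falls as beta grows: beta = 0.5 pulls the result toward precision, while beta = 2 pulls it toward recall.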

Value

A single numeric value.

References

Sokolova, M., & Lapalme, G. (2009). A systematic analysis of performance measures for classification tasks. Information Processing & Management, 45(4), 427-437. doi:10.1016/j.ipm.2009.03.002

Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to Information Retrieval, Chapter 13. Cambridge University Press. (Free online: https://nlp.stanford.edu/IR-book/)