
F-measure (F-beta)
metric_f_meas.Rd
Native implementation of the F-beta score (default beta = 1, the harmonic mean of precision and recall). Macro and macro-weighted forms compute the (possibly weighted) arithmetic mean of per-class F-beta scores, the convention used by yardstick and scikit-learn (Manning et al. 2008, ch. 13). This differs from Sokolova & Lapalme (2009, Table 3), where the macro F-score is computed directly from the macro-averaged precision and recall; the two generally give different values, and coincide in special cases such as when every class has the same precision and the same recall. Micro pools TP, FP, and FN globally before computing F-beta.
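The difference between the averaging conventions can be illustrated with a small worked example. The following is a minimal sketch in plain Python, not the package's own code: the per-class counts are made up, the helper `f_beta` is a hypothetical name, and returning 0.0 when precision and recall are both zero is an assumption made only to keep the sketch total.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta from precision and recall; 0.0 when undefined (an assumption)."""
    b2 = beta ** 2
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom > 0 else 0.0

# Hypothetical per-class counts (TP, FP, FN) for a 3-class, single-label problem.
counts = {"a": (8, 2, 2), "b": (3, 1, 6), "c": (5, 5, 0)}

precision = {k: tp / (tp + fp) for k, (tp, fp, fn) in counts.items()}
recall = {k: tp / (tp + fn) for k, (tp, fp, fn) in counts.items()}
n_classes = len(counts)

# Macro as described on this page: arithmetic mean of per-class F scores.
macro_mean_of_f = sum(f_beta(precision[k], recall[k]) for k in counts) / n_classes

# Macro-weighted: same per-class scores, weighted by truth-class prevalence.
support = {k: tp + fn for k, (tp, fp, fn) in counts.items()}
total = sum(support.values())
macro_weighted = sum(
    support[k] / total * f_beta(precision[k], recall[k]) for k in counts
)

# Sokolova & Lapalme style: F computed from macro-averaged precision and recall.
macro_f_of_means = f_beta(
    sum(precision.values()) / n_classes, sum(recall.values()) / n_classes
)

# Micro: pool TP, FP, and FN over all classes before computing F.
tp_all = sum(tp for tp, _, _ in counts.values())
fp_all = sum(fp for _, fp, _ in counts.values())
fn_all = sum(fn for _, _, fn in counts.values())
micro = f_beta(tp_all / (tp_all + fp_all), tp_all / (tp_all + fn_all))

print(round(macro_mean_of_f, 3))   # 0.643
print(round(macro_weighted, 3))    # 0.645
print(round(macro_f_of_means, 3))  # 0.697
print(round(micro, 3))             # 0.667
```

With these counts the mean of per-class F1 scores (about 0.643) is lower than the F1 of the averaged precision and recall (about 0.697), so the two macro definitions should not be compared directly across packages or papers.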
Arguments
- truth
Factor (or coercible) of true class labels.
- estimate
Factor (or coercible) of predicted class labels. Must take values from the same level set as truth.
- estimator
One of "binary" (exactly two classes; uses event_level), "macro" (unweighted mean of per-class F-beta scores), "macro_weighted" (mean of per-class F-beta scores weighted by truth-class prevalence), or "micro" (pooled TP, FP, and FN across all classes; for single-label multi-class data this equals accuracy).
- event_level
For estimator = "binary": which level is the positive event, "first" (default) or "second".
- beta
Positive numeric. beta = 1 (default) gives the familiar F1; beta < 1 weights precision more, beta > 1 weights recall more.
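The effect of beta and event_level can be sketched as follows. This is an illustrative plain-Python sketch, not the package's implementation: `f_beta` and `binary_f_beta` are hypothetical helper names, the labels are invented, and returning 0.0 for an undefined precision or recall is an assumption (the package may handle that case differently).

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta = (1 + beta^2) * P * R / (beta^2 * P + R); 0.0 when undefined."""
    b2 = beta ** 2
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom > 0 else 0.0

def binary_f_beta(truth, estimate, levels, event_level="first", beta=1.0):
    """Binary F-beta where event_level selects which level counts as positive."""
    positive = levels[0] if event_level == "first" else levels[1]
    pairs = list(zip(truth, estimate))
    tp = sum(t == positive and e == positive for t, e in pairs)
    fp = sum(t != positive and e == positive for t, e in pairs)
    fn = sum(t == positive and e != positive for t, e in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return f_beta(precision, recall, beta)

# beta: with precision 0.9 and recall 0.5, a smaller beta rewards the high
# precision and a larger beta penalises the low recall.
by_beta = {b: round(f_beta(0.9, 0.5, b), 3) for b in (0.5, 1.0, 2.0)}
print(by_beta)  # {0.5: 0.776, 1.0: 0.643, 2.0: 0.549}

# event_level: the same predictions scored with each level as the positive event.
levels = ("yes", "no")
truth = ["yes", "yes", "yes", "no", "no", "no", "no", "no"]
estimate = ["yes", "yes", "no", "yes", "no", "no", "no", "no"]
f_first = binary_f_beta(truth, estimate, levels, event_level="first")
f_second = binary_f_beta(truth, estimate, levels, event_level="second")
print(round(f_first, 3), round(f_second, 3))  # 0.667 0.8
```

Because the binary score depends on which level is treated as the event, it is worth checking the factor level order of truth before relying on the default.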
References
Sokolova, M., & Lapalme, G. (2009). A systematic analysis of performance measures for classification tasks. Information Processing & Management, 45(4), 427-437. doi:10.1016/j.ipm.2009.03.002
Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to Information Retrieval, Chapter 13. Cambridge University Press. (Free online: https://nlp.stanford.edu/IR-book/)