
Methodological issues on evaluating agreement between two detection methods by Cohen’s kappa analysis

Abstract

We read with great interest the article by Hendershot et al. (Parasit Vectors 14:473, 2021). The authors compared a PCR method targeting Plasmodium vivax's mitochondrial (mt) cytochrome oxidase I (COX-I) gene with the current "gold standard" circumsporozoite protein (CSP) ELISA for identifying different life stages of Plasmodium vivax during development within Anopheles arabiensis. We found that the Cohen's kappa value reported for the agreement between the mt COX-I PCR and the CSP ELISA is questionable. In addition, we recommend a more appropriate interpretation scale for the kappa values in that article.

In short, any scientific conclusion requires support from the sound application of methodological and statistical approaches.

To the Editor,

We read with interest the article entitled "A comparison of PCR and ELISA methods to detect different stages of Plasmodium vivax in Anopheles arabiensis," published in Parasites & Vectors on 15 September 2021 [1]. The authors compared a PCR method targeting Plasmodium vivax's mitochondrial (mt) cytochrome oxidase I (COX-I) gene with the current "gold-standard" circumsporozoite protein (CSP) ELISA for identifying different life stages of Plasmodium vivax during development within Anopheles arabiensis. They evaluated the agreement between the results of the mt COX-I PCR and the CSP ELISA using Cohen's kappa.

Generally, Cohen’s kappa [2] is calculated as follows:

$${k}_{C}=\frac{\sum_{j=1}^{n}u_{jj}\left(ii^{\prime}\right)-\sum_{j=1}^{n}p_{ij}\,p_{i^{\prime}j}}{1-\sum_{j=1}^{n}p_{ij}\,p_{i^{\prime}j}}\tag{1}$$

The value of \(u_{jj}\left(ii^{\prime}\right)\) is the proportion of objects placed in the same category \(j\) by both raters \(i\) and \(i^{\prime}\), \(p_{ij}\) is the proportion of objects that rater \(i\) assigned to category \(j\), and \(n\) is the number of categories. Cohen suggested that the k value be interpreted as follows: k ≤ 0 indicates no agreement, 0.01–0.20 none to slight, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, and 0.81–1.00 almost perfect agreement [3]. According to the authors' description, under Cohen's interpretation the agreement between mt COX-I PCR and CSP ELISA was "fair" for the head and thorax of mosquitoes bisected at 9–15 dpi (κ = 0.312).
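To make the quantities in Eq. (1) concrete, the following minimal Python sketch (ours, not taken from either article; the 2 × 2 counts are hypothetical) computes Cohen's kappa for two raters from the observed agreement and the chance-expected agreement:

```python
# Minimal sketch of Eq. (1) for two raters and binary categories (e.g. positive/negative).
# The counts below are hypothetical and serve only to illustrate the calculation.

def cohens_kappa(table):
    """table[a][b] = number of samples put in category a by rater i and category b by rater i'."""
    total = sum(sum(row) for row in table)
    # Observed agreement: proportion of samples placed in the same category by both raters.
    p_observed = sum(table[j][j] for j in range(len(table))) / total
    # Chance-expected agreement: sum over categories of the product of the raters' marginal proportions.
    p_expected = sum(
        (sum(table[j]) / total) * (sum(row[j] for row in table) / total)
        for j in range(len(table))
    )
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical 2 x 2 table: rows = rater i (positive, negative), columns = rater i' (positive, negative).
example = [[30, 15],
           [20, 85]]
print(round(cohens_kappa(example), 3))  # kappa for the hypothetical counts
```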

Although this article provides valuable information, some substantive points that could lead to misinterpretation of the results need to be clarified. For the 9–15 dpi samples, we recalculated the agreement between mt COX-I PCR and CSP ELISA with the SPSS 18 statistical package (SPSS Inc., Chicago, IL, USA). The kappa values for the head-and-thorax and abdomen samples were 0.299 and 0.304, respectively. Furthermore, simply summing the two sets of counts gave a kappa value of 0.302 (Table 1). Each of these three values differs from the authors' reported kappa of 0.312. We would be grateful if the authors could explain their calculation in detail and clarify the discrepancy.

Table 1 Kappa values for calculating agreement between COX-I PCR and CSP ELISA for 9–15 dpi
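For reference, this kind of recalculation does not depend on SPSS. Given the per-sample positive/negative calls (the lists below are hypothetical placeholders, not the data behind Table 1 or the results of Hendershot et al.), scikit-learn's cohen_kappa_score computes the same statistic:

```python
# Illustrative recalculation with scikit-learn; the per-sample calls are hypothetical
# placeholders, not the actual results from Hendershot et al. or Table 1.
from sklearn.metrics import cohen_kappa_score

# 1 = positive, 0 = negative; one entry per mosquito sample.
pcr_results   = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1]
elisa_results = [1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0]

print(round(cohen_kappa_score(pcr_results, elisa_results), 3))
```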

Furthermore, McHugh [4] provided a more logical interpretation of the k value: 0–0.20 = no agreement, 0.21–0.39 = minimal agreement, 0.40–0.59 = weak agreement, 0.60–0.79 = moderate agreement, 0.80–0.90 = strong agreement, and 0.91–1.00 = almost perfect agreement. McHugh stated: "For percent agreement, 61% agreement can immediately be seen as problematic. Almost 40% of the data in the data set represent faulty data. In healthcare research, this could lead to recommendations for changing practice based on faulty evidence. For a clinical laboratory, having 40% of the sample evaluations being wrong would be an extremely serious quality problem. This is the reason that many texts recommend 80% agreement as the minimum acceptable interrater agreement. Given the reduction from percent agreement that is typical in kappa results, some lowering of standards from percent agreement appears logical. However, accepting 0.40 to 0.60 as 'moderate' may imply the lowest value (0.40) is adequate agreement." Therefore, we recommend that the authors use McHugh's interpretation rather than Cohen's when analyzing the kappa values; under McHugh's scale, kappa values of approximately 0.30 indicate only minimal agreement between the two methods. In short, any scientific conclusion needs to be supported by the sound application of methodological and statistical approaches, and using appropriate statistical methods improves the credibility of research results.
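As a small illustration (our own sketch, not part of either article), the kappa values discussed above can be mapped onto McHugh's bands programmatically:

```python
# Sketch: map a kappa value onto McHugh's interpretation bands as quoted above.
def mchugh_interpretation(kappa):
    if kappa <= 0.20:
        return "no agreement"
    if kappa <= 0.39:
        return "minimal agreement"
    if kappa <= 0.59:
        return "weak agreement"
    if kappa <= 0.79:
        return "moderate agreement"
    if kappa <= 0.90:
        return "strong agreement"
    return "almost perfect agreement"

for k in (0.299, 0.304, 0.312):
    print(k, "->", mchugh_interpretation(k))  # all three fall in "minimal agreement"
```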

Availability of data and materials

Not applicable.

References

  1. Hendershot AL, Esayas E, Sutcliffe AC, Irish SR, Gadisa E, Tadesse FG, et al. A comparison of PCR and ELISA methods to detect different stages of Plasmodium vivax in Anopheles arabiensis. Parasit Vectors. 2021;14:473.


  2. Conger AJ. Kappa and rater accuracy: paradigms and parameters. Educ Psychol Meas. 2017;77:1019–47.


  3. Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20:37–46.


  4. McHugh ML. Interrater reliability: the kappa statistic. Biochem Med (Zagreb). 2012;22:276–82.



Acknowledgements

Not applicable.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information


Contributions

ML conceived and prepared the first draft of the manuscript. TY critically reviewed the draft. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Tianfei Yu.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

