We show that misclassification of as few as 5% of patients by the comparator can be sufficient to statistically invalidate performance estimates such as sensitivity, specificity, and the area under the receiver operating characteristic (ROC) curve if this uncertainty is not measured and taken into account. In some common diagnostic scenarios, this distortion does not grow linearly with comparator uncertainty. For clinical populations with high classification uncertainty, failing to measure and account for this effect carries a significant risk of drawing erroneous conclusions. The effects of classification uncertainty are amplified for effective tests that would otherwise perform almost perfectly in diagnostic evaluation studies. A very high diagnostic performance requirement for clinical adoption, such as 99% sensitivity, can become almost unattainable even for a perfect test, and even when the comparator diagnosis carries little uncertainty.

This article and an accompanying online simulation tool illustrate the impact of classification uncertainty on the apparent performance of tests across a series of typical diagnostic scenarios. Simulated and real data sets are used to show the degradation of apparent test performance as comparator uncertainty increases. Additional simulations show that for any test, even a perfect one, it is very unlikely to achieve very high measured performance in a diagnostic evaluation study, even when there is little uncertainty in the comparator against which the test is evaluated. For example, as shown in S7 Supporting Information ("Tests with very high performance"), when 99% PPA (sensitivity) or NPA (specificity) is required in a diagnostic evaluation study, a comparator misclassification rate of only 5% means that even a perfect diagnostic test will fail to meet the requirement with probability greater than 99.999%.
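The simulation tool itself is not reproduced here, but the core effect can be sketched with a minimal Monte Carlo experiment: a perfect test is scored against a comparator that mislabels each patient with some fixed probability. The function name, study size, and 50% prevalence below are illustrative assumptions, not taken from the study.

```python
import random

def apparent_sensitivity(n_patients, prevalence, comparator_error, seed=0):
    """Apparent sensitivity (PPA) of a PERFECT test when the comparator
    mislabels each patient independently with probability comparator_error."""
    rng = random.Random(seed)
    agree = total = 0
    for _ in range(n_patients):
        truly_diseased = rng.random() < prevalence
        test_positive = truly_diseased  # a perfect test reproduces the truth
        # non-differential misclassification: the comparator label flips
        # with the same probability regardless of true disease status
        comparator_positive = truly_diseased ^ (rng.random() < comparator_error)
        if comparator_positive:  # denominator: comparator-positive patients
            total += 1
            agree += test_positive
    return agree / total

# A 5% comparator error caps the apparent sensitivity of a perfect
# test near 95% at 50% prevalence (illustrative parameters).
print(round(apparent_sensitivity(200_000, 0.5, 0.05), 3))
```

Even with unlimited sample size, the estimate converges to roughly 95%, not 100%: the "missing" 5% is entirely an artifact of the comparator, not the test.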
A specific numerical test performance requirement, in particular a very high one such as 99% PPA (sensitivity), can therefore only be usefully discussed if overall classification uncertainty in a study is excluded or characterized, and if measured test performance is interpreted against the theoretical limits imposed by comparator uncertainty. Although the formulas for positive and negative percent agreement are identical to those for sensitivity and specificity, it is important to distinguish them because their interpretation differs.
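Why a fixed 99% requirement becomes unreachable can be seen with a simple binomial calculation: if comparator error caps the apparent PPA at 95%, the chance of nonetheless observing 99% agreement in a study is the upper tail of a binomial distribution. The study size of 1000 comparator-positive patients below is an assumed, illustrative value.

```python
from math import ceil, comb

def prob_meeting_requirement(n_pos, apparent_ppa, required_ppa):
    """P(observed PPA >= required_ppa) over n_pos comparator-positive
    patients, each agreeing with the test independently with
    probability apparent_ppa (a simple binomial model)."""
    k_min = ceil(required_ppa * n_pos)
    return sum(comb(n_pos, k) * apparent_ppa**k * (1 - apparent_ppa)**(n_pos - k)
               for k in range(k_min, n_pos + 1))

# Apparent PPA capped at 95% by comparator error, 99% PPA required,
# 1000 comparator-positive patients (assumed size): the requirement
# is essentially out of reach even for a perfect test.
print(prob_meeting_requirement(1000, 0.95, 0.99))
```

Lowering the requirement below the comparator-imposed ceiling reverses the picture: the same test passes a 90% PPA requirement almost surely, which is the sense in which requirements must be read against the theoretical limits.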
To avoid confusion, we recommend always using the terms positive percent agreement (PPA) and negative percent agreement (NPA) when describing the agreement of such tests. In medicine and epidemiology, the effects of classification uncertainty on apparent test performance are often referred to as "information bias," "misclassification bias," or "non-differential misclassification," and are considered under other names in other fields [8-10]. These terms refer to the fact that, as classification uncertainty increases, a growing gap emerges between a test's actual performance and empirical measures of its performance, such as sensitivity, specificity, negative predictive value (NPV), positive predictive value (PPV), or the area under the receiver operating characteristic curve (AUROC).
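The gap between actual and apparent sensitivity/specificity can also be written in closed form under a common simplifying assumption, namely that the errors of the test and of the comparator are independent given the true disease status. The sketch below applies the law of total probability to the four joint cells; the function name and parameters are illustrative.

```python
def apparent_se_sp(prev, se, sp, comp_se, comp_sp):
    """Apparent sensitivity and specificity of a test measured against an
    imperfect comparator, assuming test and comparator errors are
    independent given the true disease status."""
    # P(test+, comparator+) and P(comparator+)
    p_tp = prev * se * comp_se + (1 - prev) * (1 - sp) * (1 - comp_sp)
    p_rp = prev * comp_se + (1 - prev) * (1 - comp_sp)
    # P(test-, comparator-) and P(comparator-)
    p_tn = prev * (1 - se) * (1 - comp_se) + (1 - prev) * sp * comp_sp
    p_rn = prev * (1 - comp_se) + (1 - prev) * comp_sp
    return p_tp / p_rp, p_tn / p_rn

# A perfect test (se = sp = 1.0) against a comparator with 5% error in
# each direction, at 50% prevalence: both apparent values are ≈ 0.95.
se_app, sp_app = apparent_se_sp(0.5, 1.0, 1.0, 0.95, 0.95)
print(se_app, sp_app)
```

Note that the conditional-independence assumption is itself a modeling choice; if test and comparator tend to err on the same hard-to-classify patients, the distortion behaves differently.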