Machine learning-based diagnostic tests differ from traditional diagnostic tests in some of their measurement indicators. In this paper, we describe in detail the definitions, calculation methods, and statistical inference of the common measurement indicators for machine learning-based diagnostic models. We hope that this paper will help clinical researchers better evaluate machine learning diagnostic models.
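As an illustration of the kind of indicators involved, the following minimal Python sketch computes several widely used accuracy measures from a 2x2 confusion matrix, together with a Wilson score interval as one possible form of statistical inference. The choice of indicators, the function names, and the counts are assumptions made for illustration; they are not taken from the paper.

```python
import math


def diagnostic_metrics(tp, fp, fn, tn):
    """Return common accuracy indicators from a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),          # recall among diseased subjects
        "specificity": tn / (tn + fp),          # true negative rate
        "ppv": tp / (tp + fp),                  # precision
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=80, fp=30, fn=20, tn=170)
sens_ci = wilson_interval(successes=80, n=100)  # interval for sensitivity
print(metrics)
print(sens_ci)
```

The Wilson interval is used here because it behaves better than the simple normal approximation when a proportion is close to 0 or 1; other interval methods would be equally valid choices.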
A comparative diagnostic test accuracy study, a type of diagnostic accuracy study, aims to compare the accuracy of two or more index tests within a single study. The application of GRADE to comparative test accuracy differs from its application to single test accuracy, mainly in the selection of appropriate comparative study designs, the additional criteria for judging risk of bias, and the consequences of using comparative measures of test accuracy. This paper focuses on the basic principles and methods of the GRADE approach in systematic reviews of comparative test accuracy, to promote the understanding and application of the method by domestic scholars.
Objective To evaluate whether the paper titled “Application of tumor type M2 pyruvate kinase in the diagnosis of lung cancer” met the standards set in the STARD statement. Methods Based on each of the 25 items of STAndards for the Reporting of Diagnostic accuracy studies (STARD statement), the paper titled “Application of tumor type M2 pyruvate kinase in the diagnosis of lung cancer” was checked and evaluated. Results In the paper titled “Application of tumor type M2 pyruvate kinase in the diagnosis of lung cancer”, the reporting of 1 item of the STARD statement was adequately standardized, 7 items were relatively standardized, 5 items were inadequately standardized, 2 items were not standardized, and the other 10 were not reported. Conclusion Generally speaking, the reporting of diagnostic accuracy studies has not been standardized adequately in China. The methodological quality and applicability of diagnostic accuracy studies should be improved.
A brain-computer interface (BCI) is a control system that links the brain to external devices by translating electroencephalogram (EEG) signals. Because a BCI does not depend on the normal output pathways, such as peripheral nerves and muscle tissue, it can provide a new means of communication and control for people with paralysis or neuromuscular damage. The steady-state visual evoked potential (SSVEP) is a non-invasive EEG signal that has been widely used in research in recent years. SSVEP is a rhythmic brain activity evoked by continuous visual stimulation, and its spectrum consists of the fixed visual stimulation frequency and its harmonics. Two-dimensional ensemble empirical mode decomposition (2D-EEMD) is an improved version of the classical empirical mode decomposition (EMD) algorithm that extends the decomposition to two dimensions. 2D-EEMD has been widely used in image processing, for example on ocean hurricane data, nuclear magnetic resonance imaging (MRI), and the Lena image. This study takes the initiative in applying 2D-EEMD to SSVEP. The decomposition result, a two-dimensional picture of the intrinsic mode functions (IMFs), shows the SSVEP frequency clearly. The SSVEP IMFs, from which noise and artifacts had been filtered, were mapped onto a head map to reflect how the brain's response to the visual stimuli changes over time and how the response intensity differs across brain regions. The results showed that the occipital region had the strongest response. Finally, the short-time Fourier transform (STFT) was used to detect the SSVEP frequency in the 2D-EEMD reconstructed signal, and the accuracy rate increased by 16%.
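The final detection step can be sketched roughly as follows. This is a minimal illustration of STFT-based frequency detection only, not the study's pipeline: it skips the 2D-EEMD stage entirely, runs on a synthetic single-channel signal, and evaluates the short-time Fourier power only at a hypothetical set of candidate stimulation frequencies. The sampling rate, window length, noise level, and candidate list are all assumptions.

```python
import cmath
import math
import random


def stft_power(signal, fs, freqs, win_len, hop):
    """Mean short-time Fourier power of `signal` at each frequency in `freqs`."""
    if len(signal) < win_len:
        raise ValueError("signal is shorter than one analysis window")
    # A Hann window reduces spectral leakage between neighbouring frequencies.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (win_len - 1))
              for n in range(win_len)]
    totals = {f: 0.0 for f in freqs}
    n_frames = 0
    for start in range(0, len(signal) - win_len + 1, hop):
        frame = signal[start:start + win_len]
        for f in freqs:
            # Fourier coefficient of the windowed frame at frequency f (Hz).
            coeff = sum(w * x * cmath.exp(-2j * math.pi * f * n / fs)
                        for n, (w, x) in enumerate(zip(window, frame)))
            totals[f] += abs(coeff) ** 2
        n_frames += 1
    return {f: p / n_frames for f, p in totals.items()}


def detect_ssvep_frequency(signal, fs, candidates, win_len, hop):
    """Return the candidate frequency with the largest mean STFT power."""
    power = stft_power(signal, fs, candidates, win_len, hop)
    return max(power, key=power.get)


# Synthetic 2-second signal: a 12 Hz response plus Gaussian noise (assumed values).
fs = 250
rng = random.Random(0)
signal = [math.sin(2 * math.pi * 12.0 * n / fs) + rng.gauss(0.0, 0.5)
          for n in range(2 * fs)]
candidates = [8.0, 10.0, 12.0, 15.0]
detected = detect_ssvep_frequency(signal, fs, candidates, win_len=fs, hop=fs // 2)
print(detected)
```

Evaluating the transform only at the candidate frequencies keeps the sketch short; a full STFT would compute every frequency bin, and a real detector would typically also use the harmonics mentioned above.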
This paper introduces the preferred reporting items for journal and conference abstracts of systematic reviews and meta-analyses of diagnostic test accuracy studies (PRISMA-DTA for abstracts), which was published in BMJ in March 2021. It presents the 12-item checklist, with explanations and examples of complete reporting, to help domestic researchers write complete and informative abstracts of systematic reviews and meta-analyses of diagnostic test accuracy studies.
Objective To evaluate the accuracy of soluble triggering receptor expressed on myeloid cells-1 (sTREM-1) as a diagnostic index for ventilator-associated pneumonia (VAP). Methods We searched PubMed, EMBase, the Cochrane Library, Wanfang Database, CNKI, and VIP for clinical trials that assessed the diagnostic accuracy of sTREM-1 for VAP. The methodological quality of each study was assessed with the quality assessment for studies of diagnostic accuracy (QUADAS) tool. Meta-DiSc software was used to pool sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, and diagnostic odds ratio. A heterogeneity test was performed and a summary receiver operating characteristic (SROC) curve was constructed. Results Eight studies were included (180 VAP patients and 224 non-VAP patients). The pooled sensitivity, specificity, and diagnostic odds ratio were 0.80, 0.74, and 13.89, respectively. The area under the SROC curve was 0.857, with the Q point at 0.788. Conclusion sTREM-1 showed moderate accuracy for VAP diagnosis in adult mechanically ventilated patients and should be combined with other diagnostic markers to further improve sensitivity and specificity.
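The relationship between the measures named in the Methods can be sketched in a few lines of Python. The function below is a hypothetical helper that derives the likelihood ratios and diagnostic odds ratio (DOR) from a single sensitivity/specificity pair using their standard definitions. Applying it to the pooled values above is only an illustration: a meta-analysis usually pools each measure across studies separately, so the result need not match the reported pooled DOR of 13.89.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-, DOR) for one sensitivity/specificity pair."""
    lr_pos = sensitivity / (1 - specificity)   # how much a positive result raises the odds
    lr_neg = (1 - sensitivity) / specificity   # how much a negative result lowers the odds
    dor = lr_pos / lr_neg                      # diagnostic odds ratio
    return lr_pos, lr_neg, dor


# Pooled sensitivity and specificity from the abstract, used for illustration only.
lr_pos, lr_neg, dor = likelihood_ratios(0.80, 0.74)
print(round(lr_pos, 2), round(lr_neg, 2), round(dor, 2))
```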
Objective To compare the effectiveness of robot-assisted and traditional freehand screw placement in the treatment of atlantoaxial dislocation. Methods The clinical data of 55 patients with atlantoaxial dislocation who met the selection criteria between January 2021 and January 2024 were retrospectively analyzed. According to the screw placement method, the patients were divided into the traditional group (traditional freehand screw placement, 31 cases) and the robot group (Mazor X robot-assisted screw placement, 24 cases). There was no significant difference in gender, age, body mass index, etiology, preoperative visual analogue scale (VAS) score, or preoperative cervical spine Japanese Orthopaedic Association (JOA) score between the two groups (P>0.05). The operation time, intraoperative blood loss, operation cost, and intraoperative complications were recorded and compared between the two groups. The VAS score and cervical spine JOA score were used to evaluate the improvement of pain and cervical spinal cord function before operation and at 1 month after operation. CT examination was performed at 3 days after operation, and the accuracy of screw placement was evaluated according to the Neo grading criteria. Results All 55 patients successfully completed the operation. The operation time, intraoperative blood loss, and operation cost in the robot group were significantly higher than those in the traditional group (P<0.05). A total of 220 C1 and C2 pedicle screws were inserted in the two groups. Of these, 94 were inserted in the robot group, with an accuracy rate of 95.7%; among them, 2 were inserted by traditional freehand screw placement because of bleeding caused by intraoperative slip. The other 126 pedicle screws were inserted in the traditional group, with an accuracy rate of 87.3%, which was significantly lower than that in the robot group (P<0.05).
There was 1 case of venous plexus injury in the robot group and 3 cases in the traditional group, all of which improved after pressure hemostasis. No other intraoperative complication, such as vertebral artery injury or spinal cord injury, occurred in either group. All patients were followed up for 4-16 months, with an average of 6.6 months, and there was no significant difference in follow-up time between the two groups (P>0.05). Postoperative neck pain was significantly relieved in both groups, and neurological symptoms were relieved to varying degrees. The VAS score and cervical spine JOA score of both groups improved significantly at 1 month after operation compared with the preoperative scores (P<0.05), and there was no significant difference in the score change between the two groups (P>0.05). Conclusion In the treatment of atlantoaxial dislocation, the accuracy of robot-assisted screw placement is superior to that of traditional freehand screw placement.
Correct and appropriate statistical analysis methods make the results of comparative diagnostic test accuracy studies more convincing. In this paper, the accuracy of diagnostic tests is considered in two forms: binary-scale outcomes and ordinal-scale or continuous-scale outcomes. Taking diagnostic indicators such as sensitivity, specificity, receiver operating characteristic (ROC) curves, and area under the curve (AUC) values as entry points, and using examples, this paper introduces how to compare the diagnostic results of tests by parameter estimation and hypothesis testing, with the aim of providing a reference for comparative diagnostic test accuracy studies.
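For the binary-outcome case, one standard hypothesis test for comparing the sensitivities of two tests applied to the same diseased subjects is McNemar's test. The sketch below is an assumption about what such a comparison could look like, not a reproduction of the paper's examples; the paired design, the function name, and all counts are hypothetical.

```python
import math


def mcnemar_test(b, c):
    """McNemar chi-square test (no continuity correction) for paired binary results.

    b: subjects positive on test A only; c: subjects positive on test B only.
    Returns (chi-square statistic, two-sided p-value).
    """
    if b + c == 0:
        raise ValueError("no discordant pairs; the test is undefined")
    statistic = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical paired results in 100 diseased subjects:
# a = both tests positive, b = A only, c = B only, d = both negative.
a, b, c, d = 60, 15, 5, 20
n = a + b + c + d
sens_a = (a + b) / n   # parameter estimate for test A
sens_b = (a + c) / n   # parameter estimate for test B
statistic, p_value = mcnemar_test(b, c)
print(sens_a, sens_b, statistic, p_value)
```

Only the discordant pairs (b and c) enter the test statistic, which is why a paired design can detect a difference that two independent samples of the same size might miss. For small numbers of discordant pairs an exact binomial version is generally preferred.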
Methods for network meta-analysis of diagnostic test accuracy are still at an exploratory stage. We have previously explored and introduced several such methods. Using an example, we introduce, step by step, the ANOVA model for performing network meta-analysis of diagnostic test accuracy.
Obtaining the numbers of true positives, false positives, false negatives, and true negatives is necessary when conducting a meta-analysis of diagnostic tests. More advanced data extraction methods are required when these data cannot be obtained directly from the literature. We introduced three such methods and discussed their underlying theory. An example was then given to illustrate how to apply the methods.
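One commonly used way to recover a 2x2 table, which may or may not be among the three methods this paper covers, is to back-calculate the cell counts from a reported sensitivity, specificity, and the numbers of diseased and non-diseased participants. The following Python sketch illustrates that idea under those assumptions; the function name and the input values are hypothetical.

```python
def reconstruct_two_by_two(sensitivity, specificity, n_diseased, n_non_diseased):
    """Back-calculate TP, FP, FN, TN from summary accuracy measures.

    Counts are rounded to the nearest integer, so results are approximate
    when the reported sensitivity or specificity was itself rounded.
    """
    tp = round(sensitivity * n_diseased)
    fn = n_diseased - tp
    tn = round(specificity * n_non_diseased)
    fp = n_non_diseased - tn
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


# Hypothetical study: sensitivity 0.80 in 50 diseased and
# specificity 0.74 in 100 non-diseased participants.
table = reconstruct_two_by_two(0.80, 0.74, n_diseased=50, n_non_diseased=100)
print(table)
```

A useful check after reconstruction is to recompute sensitivity and specificity from the recovered counts and confirm that they match the published values at the reported precision.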