Medical studies have found that tumor mutation burden (TMB) is positively correlated with the efficacy of immunotherapy for non-small cell lung cancer (NSCLC), and the TMB value can be used to predict the efficacy of targeted therapy and chemotherapy. However, calculating the TMB value mainly depends on whole exome sequencing (WES), which is usually time-consuming and expensive. To address this problem, this paper studies the correlation between TMB and pathological slice images, taking advantage of the digital pathological slides commonly used in the clinic, and then predicts the patient's TMB level accordingly. This paper proposes a deep learning model (RCA-MSAG) based on a residual coordinate attention (RCA) structure combined with a multi-scale attention guidance (MSAG) module. The model takes ResNet-50 as the backbone and integrates coordinate attention (CA) into the bottleneck module to capture direction-aware and position-sensitive information, which enables the model to locate and identify the positions of interest more accurately. The MSAG module is then embedded into the network, enabling the model to extract the deep features of lung cancer pathological sections and the interactive information between channels. The Cancer Genome Atlas (TCGA) open dataset is adopted in the experiments; it consists of 200 pathological sections of lung adenocarcinoma, including 80 samples with high TMB, 77 samples with medium TMB and 43 samples with low TMB. Experimental results demonstrate that the accuracy, precision, recall and F1 score of the proposed model are 96.2%, 96.4%, 96.2% and 96.3%, respectively, which are superior to those of existing mainstream deep learning models. The proposed model can promote clinical auxiliary diagnosis and has certain theoretical guiding significance for TMB prediction.
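The coordinate attention idea described above can be illustrated with a minimal numpy sketch: the feature map is pooled separately along the height and width axes to produce two direction-aware descriptors, which are turned into per-position weights. The `w_h`/`w_w` matrices here are illustrative stand-ins for the module's learned 1×1 convolutions, and the shared transform and normalization layers of the real CA block are omitted for brevity.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def coordinate_attention(x, w_h, w_w):
    """Sketch of coordinate attention for one feature map x of shape
    (C, H, W). w_h and w_w stand in for learned 1x1 convolutions."""
    c, h, w = x.shape
    # Direction-aware pooling: average over width -> (C, H),
    # average over height -> (C, W).
    pool_h = x.mean(axis=2)          # position-sensitive along height
    pool_w = x.mean(axis=1)          # position-sensitive along width
    # Per-direction attention weights in (0, 1).
    a_h = sigmoid(w_h @ pool_h)      # (C, H)
    a_w = sigmoid(w_w @ pool_w)      # (C, W)
    # Re-weight every spatial position by its height and width attention.
    return x * a_h[:, :, None] * a_w[:, None, :]

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))
out = coordinate_attention(x, rng.standard_normal((8, 8)),
                           rng.standard_normal((8, 8)))
print(out.shape)  # (8, 16, 16)
```

Because both weight tensors lie in (0, 1), the module can only attenuate activations, steering the network toward the positions of interest without changing the feature map's shape.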
Deformable image registration plays a crucial role in medical image analysis. Although various advanced registration models have been proposed, achieving accurate and efficient deformable registration remains challenging. Leveraging the recent outstanding performance of Mamba in computer vision, we introduced a novel model called MCRDP-Net. MCRDP-Net adopted a dual-stream network architecture that combined Mamba blocks and convolutional blocks to simultaneously extract global and local information from fixed and moving images. In the decoding stage, we employed a pyramid network structure to obtain high-resolution deformation fields, achieving efficient and precise registration. The effectiveness of MCRDP-Net was validated on the public brain registration datasets OASIS and IXI. Experimental results demonstrated significant advantages of MCRDP-Net in medical image registration, with DSC, HD95, and ASD reaching 0.815, 8.123, and 0.521 on the OASIS dataset and 0.773, 7.786, and 0.871 on the IXI dataset. In summary, MCRDP-Net demonstrates superior performance in deformable image registration, proving its potential in medical image analysis. It effectively enhances the accuracy and efficiency of registration, providing strong support for subsequent medical research and applications.
In clinical practice, manual scoring by technicians is the major method for sleep arousal detection, but it is time-consuming and subjective. This study aimed to achieve end-to-end detection of sleep arousal events by constructing a convolutional neural network based on multi-scale convolutional layers and a self-attention mechanism, using 1-min single-channel electroencephalogram (EEG) signals as its input. Compared with the baseline model, the proposed method improved both the mean area under the precision-recall curve and the area under the receiver operating characteristic curve by 7%. Furthermore, we also compared the effects of single modality and multi-modality on the performance of the proposed model. The results revealed the power of single-channel EEG signals in automatic sleep arousal detection, whereas the simple combination of multi-modality signals may be counterproductive to the improvement of model performance. Finally, we explored the scalability of the proposed model and transferred it to the automated sleep staging task on the same dataset. The average accuracy of 73% also suggested the power of the proposed method in task transfer. This study provides a potential solution for the development of portable sleep monitoring and paves the way for automatic sleep data analysis using transfer learning.
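The multi-scale convolutional idea can be sketched in a few lines of numpy: filters of several lengths run in parallel over the same EEG segment, so short kernels respond to fast transients and long kernels to slower arousal-related trends. This is an illustrative sketch only; the filter weights here are simple moving-average kernels, whereas the real network learns them, and the assumed 100 Hz sampling rate is for the toy signal only.

```python
import numpy as np

def multiscale_features(eeg, kernel_sizes=(3, 7, 15)):
    """Run parallel 1-D convolutions of different lengths over one EEG
    segment and stack the branch outputs into a multi-scale feature map."""
    branches = []
    for k in kernel_sizes:
        kernel = np.ones(k) / k                      # averaging filter (illustrative)
        branches.append(np.convolve(eeg, kernel, mode="same"))
    return np.stack(branches)                        # (n_scales, n_samples)

fs = 100                                             # assumed sampling rate (Hz)
t = np.arange(60 * fs) / fs                          # one 1-min segment
eeg = np.sin(2 * np.pi * 10 * t)                     # toy 10 Hz alpha-like signal
feats = multiscale_features(eeg)
print(feats.shape)  # (3, 6000)
```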
Automatic and accurate segmentation of lung parenchyma is essential for the assisted diagnosis of lung cancer. In recent years, researchers in the field of deep learning have proposed a number of improved lung parenchyma segmentation methods based on U-Net. However, the existing segmentation methods ignore the complementary fusion of semantic information in feature maps between different layers and fail to distinguish the importance of different spaces and channels in the feature map. To solve this problem, this paper proposes the double scale parallel attention (DSPA) network (DSPA-Net) architecture, and introduces the DSPA module and the atrous spatial pyramid pooling (ASPP) module into the “encoder-decoder” structure. The DSPA module aggregates the semantic information of feature maps at different levels while obtaining accurate spatial and channel information of the feature map with the help of cooperative attention (CA). The ASPP module uses multiple parallel convolution kernels with different dilation rates to obtain feature maps containing multi-scale information under different receptive fields. The two modules address multi-scale information processing across feature maps of different levels and within feature maps of the same level, respectively. We conducted experimental verification on the Kaggle competition dataset. The experimental results prove that the network architecture has obvious advantages over current mainstream segmentation networks: the dice similarity coefficient (DSC) and intersection over union (IoU) reached 0.972 ± 0.002 and 0.945 ± 0.004, respectively. This paper achieves automatic and accurate segmentation of lung parenchyma and provides a reference for the application of attention mechanisms and multi-scale information in the field of lung parenchyma segmentation.
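The mechanism behind ASPP can be shown with a toy 1-D sketch: spreading the taps of a fixed-size kernel `rate` samples apart enlarges the receptive field without adding weights, and stacking several such branches yields multi-scale features. This is a simplified illustration, assuming 1-D signals and hand-chosen kernels; the real ASPP module uses learned 2-D convolutions plus image-level pooling.

```python
import numpy as np

def dilated_conv1d(x, kernel, rate):
    """Atrous (dilated) convolution: insert rate-1 zeros between kernel
    taps, so a length-3 kernel at rate r covers 2*r + 1 samples."""
    k = np.zeros((len(kernel) - 1) * rate + 1)
    k[::rate] = kernel
    return np.convolve(x, k, mode="same")

def aspp_1d(x, kernel, rates=(1, 2, 4)):
    """Toy ASPP: parallel branches with different dilation rates,
    stacked into a multi-scale feature map."""
    return np.stack([dilated_conv1d(x, kernel, r) for r in rates])

x = np.arange(32, dtype=float)
feats = aspp_1d(x, np.array([1.0, 1.0, 1.0]))
print(feats.shape)  # (3, 32)
```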
Glioma is a primary brain tumor with a high incidence rate. High-grade gliomas (HGG) are those with the highest degree of malignancy and the poorest survival. Surgical resection and postoperative adjuvant chemoradiotherapy are often used in clinical treatment, so accurate segmentation of tumor-related areas is of great significance for the treatment of patients. In order to improve the segmentation accuracy of HGG, this paper proposes a multi-modal glioma semantic segmentation network with multi-scale feature extraction and a multi-attention fusion mechanism. The main contributions are as follows: (1) multi-scale residual structures were used to extract features from multi-modal glioma magnetic resonance imaging (MRI); (2) two types of attention modules were used to aggregate features along the channel and spatial dimensions; (3) to improve the segmentation performance of the whole network, a branch classifier was constructed using an ensemble learning strategy to adjust and correct the classification results of the backbone classifier. The experimental results showed that the Dice coefficient values of the proposed segmentation method were 0.9097, 0.8773 and 0.8396 for the whole tumor, tumor core and enhancing tumor, respectively, and the segmentation results had good boundary continuity in the three-dimensional direction. Therefore, the proposed semantic segmentation network has good segmentation performance for high-grade glioma lesions.
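The Dice coefficient used to report these results is a standard overlap measure, DSC = 2|A ∩ B| / (|A| + |B|); a minimal numpy sketch for binary masks:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary masks:
    DSC = 2 * |intersection| / (|pred| + |target|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

# Toy masks: 4 predicted voxels, 6 reference voxels, 4 shared.
a = np.zeros((4, 4), dtype=int); a[1:3, 1:3] = 1
b = np.zeros((4, 4), dtype=int); b[1:3, 1:4] = 1
print(round(dice_coefficient(a, b), 3))  # 0.8
```

A DSC of 0.9097 for the whole tumor therefore means that roughly 91% overlap (in this harmonic sense) is achieved between the predicted and reference regions.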
In order to meet the need for autonomous control by patients with severe limb disorders, this paper designs a nursing bed control system based on the motor imagery-brain-computer interface (MI-BCI). To address the low cross-subject decoding performance and the dynamic fluctuation of cognitive state in existing MI-BCI technology, improvements are made in neural network structure optimization and user interaction feedback enhancement. First, an optimized dual-branch graph convolution multi-scale neural network integrates dynamic graph convolution and multi-scale convolution; its average classification accuracy is higher than that of the multi-scale attention temporal convolutional network, the Gramian angular field combined with convolutional long short-term memory hybrid network, the Transformer-based graph convolutional network and other existing methods. Second, a dual visual feedback mechanism is constructed, in which electroencephalogram (EEG) topographic map feedback improves the discrimination of spatial patterns and attention state feedback enhances the temporal stability of the signals. Compared with the single EEG topographic map feedback and the non-feedback system, the average classification accuracy of the proposed method is also greatly improved. Finally, in the four-class control task of the nursing bed, the average control accuracy of the system is 90.84%, and the information transfer rate is 84.78 bits/min. In summary, this paper provides a reliable technical solution for improving the autonomous interaction ability of patients with severe limb disorders, which has important theoretical significance and application value.
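The information transfer rate reported above is conventionally computed with the Wolpaw formula, which converts the number of classes and the classification accuracy into bits per selection (the bits/min figure then additionally depends on the selection pace, which the abstract does not state). A sketch for the four-class task:

```python
import math

def itr_bits_per_selection(n_classes, accuracy):
    """Wolpaw information transfer rate in bits per selection:
    B = log2(N) + P*log2(P) + (1-P)*log2((1-P)/(N-1))."""
    p, n = accuracy, n_classes
    if p >= 1.0:
        return math.log2(n)
    return (math.log2(n) + p * math.log2(p)
            + (1 - p) * math.log2((1 - p) / (n - 1)))

# Four-class control task at the reported 90.84% accuracy.
b = itr_bits_per_selection(4, 0.9084)
print(round(b, 3))  # 1.413 bits per command
```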
Convolutional neural networks (CNNs) are renowned for their excellent representation learning capabilities and have become a mainstream model for motor imagery-based electroencephalogram (MI-EEG) signal classification. However, MI-EEG exhibits strong inter-individual variability, which may lead to a decline in classification performance. To address this issue, this paper proposes a classification model based on a dynamic multi-scale CNN and multi-head temporal attention (DMSCMHTA). The model first applies multi-band filtering to the raw MI-EEG signals and feeds the results into the feature extraction module. Then, it uses a dynamic multi-scale CNN to capture temporal features while adjusting attention weights, followed by spatial convolution to extract spatiotemporal feature sequences. Next, the model further refines temporal correlations through temporal dimensionality reduction and a multi-head attention mechanism to generate more discriminative features. Finally, MI classification is completed under the supervision of cross-entropy loss and center loss. Experiments show that the proposed model achieves average accuracies of 80.32% and 90.81% on BCI Competition IV datasets 2a and 2b, respectively. The results indicate that DMSCMHTA can adaptively extract personalized spatiotemporal features and outperforms current mainstream methods.
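The multi-head temporal attention step can be illustrated with a bare-bones numpy sketch over a spatiotemporal feature sequence. To keep it short, the learned query/key/value projections are replaced by identity maps per head, so only the head splitting and scaled dot-product weighting of the mechanism are shown; the real module also learns projection and output matrices.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, n_heads):
    """Multi-head self-attention over a sequence x of shape (T, D):
    split features into heads, weight each time step by scaled
    dot-product similarity, and concatenate the head outputs."""
    t, d = x.shape
    dh = d // n_heads
    heads = []
    for h in range(n_heads):
        q = k = v = x[:, h * dh:(h + 1) * dh]        # (T, d_h) per head
        scores = softmax(q @ k.T / np.sqrt(dh))      # (T, T) temporal attention map
        heads.append(scores @ v)
    return np.concatenate(heads, axis=1)             # (T, D)

rng = np.random.default_rng(1)
seq = rng.standard_normal((50, 16))                  # 50 time steps, 16 features
out = multi_head_attention(seq, n_heads=4)
print(out.shape)  # (50, 16)
```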
To address issues such as loss of detailed information, blurred target boundaries, and unclear structural hierarchy in medical image fusion, this paper proposes an adaptive feature medical image fusion network based on a full-scale diffusion model. First, a region-level feature map is generated using a kernel-based saliency map to enhance local features and boundary details. Then, a full-scale diffusion feature extraction network is employed for global feature extraction, alongside a multi-scale denoising U-shaped network designed to fully capture cross-layer information. A multi-scale feature integration module is introduced to reinforce the texture details and structural information extracted by the encoder. Finally, an adaptive fusion scheme is applied to progressively fuse region-level features, global features, and source images layer by layer, enhancing the preservation of detail information. To validate its effectiveness, the proposed model was evaluated on the publicly available Harvard dataset and an abdominal dataset. Compared with nine other representative image fusion methods, the proposed approach achieved improvements across seven evaluation metrics. The results demonstrate that the proposed method effectively extracts both global and local features of medical images, enhances texture details and target boundary clarity, and generates fusion images with high contrast and rich information, providing more reliable support for subsequent clinical diagnosis.
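The general idea of saliency-driven adaptive fusion can be sketched at pixel level: local-energy saliency maps are turned into soft weights so the more informative source dominates each region. This is an illustrative rule only, not the paper's diffusion-based network; the box-filter saliency and the window size `sigma` are assumptions made for the sketch.

```python
import numpy as np

def saliency_weighted_fusion(img_a, img_b, sigma=9):
    """Fuse two registered images with weights derived from
    kernel-based local-energy saliency maps."""
    def local_energy(img, k):
        kernel = np.ones(k) / k
        # Separable box filter of the squared image approximates a
        # kernel-based saliency map.
        e = np.apply_along_axis(lambda r: np.convolve(r, kernel, "same"), 1, img ** 2)
        return np.apply_along_axis(lambda c: np.convolve(c, kernel, "same"), 0, e)
    sa, sb = local_energy(img_a, sigma), local_energy(img_b, sigma)
    w = sa / (sa + sb + 1e-12)                       # adaptive weight in [0, 1]
    return w * img_a + (1 - w) * img_b

rng = np.random.default_rng(2)
ct, mri = rng.random((64, 64)), rng.random((64, 64))  # toy registered sources
fused = saliency_weighted_fusion(ct, mri)
print(fused.shape)  # (64, 64)
```

Because the weight map lies in [0, 1], every fused pixel is a convex combination of the two source pixels, which is what preserves boundary detail where one modality is clearly more salient.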
The three-dimensional genome structure of chromatin plays a key role in cell function and gene regulation. Single-cell Hi-C techniques can capture genomic structure information at the cellular level, which provides an opportunity to study changes in genomic structure between different cell types. Recently, some excellent computational methods have been developed for single-cell Hi-C data analysis. In this paper, the available methods for single-cell Hi-C data analysis were first reviewed, including preprocessing of single-cell Hi-C data, multi-scale structure recognition based on single-cell Hi-C data, bulk-like Hi-C contact matrix generation from single-cell Hi-C datasets, pseudo-time series analysis, and cell classification. Then the application of single-cell Hi-C data to cell differentiation and structural variation was described. Finally, future directions for single-cell Hi-C data analysis were discussed.
Atrial fibrillation (AF) is a life-threatening heart condition, and its early detection and treatment have garnered significant attention from physicians in recent years. Traditional methods of detecting AF rely heavily on physicians' diagnoses based on electrocardiograms (ECGs), but prolonged analysis of ECG signals is very time-consuming. This paper designs an AF detection model based on the Inception module, constructing multi-branch detection channels to process raw ECG signals, gradient signals, and frequency signals during AF. The model efficiently extracted QRS complex and RR interval features using gradient signals, extracted P-wave and f-wave features using frequency signals, and used raw signals to supplement missing information. The multi-scale convolutional kernels in the Inception module provided various receptive fields and performed a comprehensive analysis of the multi-branch results, enabling early AF detection. Compared to current machine learning algorithms that use only RR interval and heart rate variability features, the proposed algorithm additionally employed frequency features, making fuller use of the information within the signals. Compared to deep learning methods using raw and frequency signals, this paper introduced an enhancement method for the QRS complex, allowing the network to extract features more effectively. By using a multi-branch input mode, the model comprehensively considered the irregular RR intervals and the P-wave and f-wave features of AF. Testing on the MIT-BIH AF database showed that the inter-patient detection accuracy was 96.89%, sensitivity was 97.72%, and specificity was 95.88%. The proposed model demonstrates excellent performance and can achieve automatic AF detection.
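The RR-interval irregularity that the gradient-signal branch targets can be made concrete with a short numpy sketch: from a series of R-peak times, compute the interval series plus two simple variability measures. The specific measures (RMSSD and coefficient of variation) are common heart-rate-variability statistics chosen here for illustration, not necessarily the features the paper's network learns.

```python
import numpy as np

def rr_features(r_peak_times):
    """From R-peak times (seconds), derive the RR-interval series and
    two irregularity measures; AF episodes show markedly higher values."""
    rr = np.diff(r_peak_times)                       # RR intervals (s)
    rmssd = np.sqrt(np.mean(np.diff(rr) ** 2))       # beat-to-beat variability
    cv = rr.std() / rr.mean()                        # relative dispersion
    return rr, rmssd, cv

# Toy rhythms: a regular sinus-like series vs. an irregular AF-like one.
regular = np.arange(0, 10, 0.8)                      # constant 0.8 s intervals
rng = np.random.default_rng(3)
irregular = np.cumsum(rng.uniform(0.4, 1.2, size=12))
_, rmssd_reg, cv_reg = rr_features(regular)
_, rmssd_af, cv_af = rr_features(irregular)
print(rmssd_reg < rmssd_af, cv_reg < cv_af)  # True True
```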