Accurate segmentation of breast ultrasound images is an important precondition for lesion assessment. Existing segmentation approaches involve massive parameters, sluggish inference, and heavy memory consumption. To tackle this problem, we propose T2KD Attention U-Net (dual-teacher knowledge distillation Attention U-Net), a lightweight semantic segmentation method combining dual-path joint distillation for breast ultrasound images. First, we designed two teacher models to learn fine-grained features from each class of images, according to the different feature representations and semantic information of benign and malignant breast lesions. Then we leveraged joint distillation to train a lightweight student model. Finally, we constructed a novel weight balance loss that focuses on the semantic features of small objects, addressing the imbalance between tumor and background. Extensive experiments on Dataset BUSI and Dataset B demonstrated that T2KD Attention U-Net outperformed various knowledge distillation counterparts. Concretely, the accuracy, recall, precision, Dice, and mIoU of the proposed method were 95.26%, 86.23%, 85.09%, 83.59% and 77.78% on Dataset BUSI, respectively, and 97.95%, 92.80%, 88.33%, 88.40% and 82.42% on Dataset B, respectively. Compared with other models, the performance of this model was significantly improved. Meanwhile, compared with the teacher model, the parameter count, size, and complexity of the student model were significantly reduced (2.2×10⁶ vs. 106.1×10⁶ parameters, 8.4 MB vs. 414 MB, 16.59 GFLOPs vs. 205.98 GFLOPs, respectively). The proposed model maintains performance while greatly decreasing the amount of computation, providing a new option for deployment in clinical scenarios.
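The dual-teacher objective can be pictured as a class-weighted segmentation loss plus soft-label terms from both teachers. Below is a minimal PyTorch sketch of such a loss; the teacher names, temperature T, mixing weight alpha, and foreground weight are illustrative assumptions, not the authors' exact weight balance loss.

```python
# Minimal PyTorch sketch of a dual-teacher distillation objective with a
# class-weighted segmentation term (assumes two classes: background, tumor).
import torch
import torch.nn.functional as F

def dual_teacher_distillation_loss(student_logits, teacher_b_logits,
                                   teacher_m_logits, target,
                                   T=4.0, alpha=0.5, fg_weight=5.0):
    """Logits: (N, C, H, W); target: (N, H, W) integer class labels."""
    # Hard-label term: cross-entropy weighted toward the small tumor class to
    # counter the tumor/background imbalance the abstract describes.
    class_weights = torch.tensor([1.0, fg_weight], device=student_logits.device)
    ce = F.cross_entropy(student_logits, target, weight=class_weights)

    # Soft-label terms: KL divergence to each teacher's softened prediction.
    log_p = F.log_softmax(student_logits / T, dim=1)
    kd_b = F.kl_div(log_p, F.softmax(teacher_b_logits / T, dim=1),
                    reduction="batchmean") * T * T
    kd_m = F.kl_div(log_p, F.softmax(teacher_m_logits / T, dim=1),
                    reduction="batchmean") * T * T

    # Joint objective: hard supervision plus the averaged dual-teacher signal.
    return (1 - alpha) * ce + alpha * 0.5 * (kd_b + kd_m)
```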
The skin is the largest organ of the human body, and many visceral diseases are directly reflected in the skin, so accurate segmentation of skin lesion images is of great clinical significance. To address the complex colors, blurred boundaries, and uneven scale information of skin lesions, a segmentation method based on dense atrous spatial pyramid pooling (DenseASPP) and an attention mechanism is proposed. The method is built on the U-shaped network (U-Net). First, the encoder is redesigned, replacing plain stacked convolutions with a large number of residual connections, which effectively retains key features even as the network depth grows. Second, channel attention is fused with spatial attention, and residual connections are added so that the network can adaptively learn the channel and spatial features of the images. Finally, the DenseASPP module is introduced and redesigned to enlarge the receptive field and capture multi-scale feature information. The proposed algorithm obtains satisfactory results on the official public dataset of the International Skin Imaging Collaboration (ISIC 2016): the mean Intersection over Union (mIoU), sensitivity (SE), precision (PC), accuracy (ACC), and Dice coefficient (Dice) are 0.9018, 0.9459, 0.9487, 0.9681, and 0.9473, respectively. The experimental results demonstrate that the method improves the segmentation of skin lesion images and is expected to provide auxiliary diagnosis for professional dermatologists.
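The fused channel and spatial attention with a residual connection can be sketched as follows in PyTorch; the reduction ratio, kernel size, and module name are assumptions for illustration, not the paper's exact configuration.

```python
# Illustrative PyTorch sketch of channel attention fused with spatial
# attention, wrapped in a residual connection.
import torch
import torch.nn as nn

class FusedAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        # Channel attention: squeeze spatial dims, then excite per-channel weights.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())
        # Spatial attention: 7x7 convolution over channel-pooled maps.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid())

    def forward(self, x):
        ca = x * self.channel_gate(x)                     # channel-weighted features
        pooled = torch.cat([ca.mean(dim=1, keepdim=True),
                            ca.max(dim=1, keepdim=True).values], dim=1)
        sa = ca * self.spatial_gate(pooled)               # spatial-weighted features
        return x + sa                                     # residual connection
```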
Speech imagery is an emerging brain-computer interface (BCI) paradigm with potential to provide effective communication for individuals with speech impairments. This study designed a Chinese speech imagery paradigm using three clinically relevant words—“Help me”, “Sit up” and “Turn over”—and collected electroencephalography (EEG) data from 15 healthy subjects. Based on the data, a Channel Attention Multi-Scale Convolutional Neural Network (CAM-Net) decoding algorithm was proposed, which combined multi-scale temporal convolutions with asymmetric spatial convolutions to extract multidimensional EEG features, and incorporated a channel attention mechanism along with a bidirectional long short-term memory network to perform channel weighting and capture temporal dependencies. Experimental results showed that CAM-Net achieved a classification accuracy of 48.54% in the three-class task, outperforming baseline models such as EEGNet and Deep ConvNet, and reached a highest accuracy of 64.17% in the binary classification between “Sit up” and “Turn over”. This work provides a promising approach for future Chinese speech imagery BCI research and applications.
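A rough PyTorch sketch of the kind of pipeline CAM-Net describes, namely parallel multi-scale temporal convolutions, an asymmetric spatial convolution across EEG electrodes, squeeze-and-excitation style channel attention, and a bidirectional LSTM, is given below. All layer sizes, kernel lengths, and the module name are illustrative assumptions, not the authors' implementation.

```python
# Conceptual sketch of a multi-scale EEG decoder with channel attention and BiLSTM.
import torch
import torch.nn as nn

class CAMNetSketch(nn.Module):
    def __init__(self, n_electrodes=64, n_classes=3):
        super().__init__()
        # Multi-scale temporal convolutions with different kernel lengths.
        self.temporal = nn.ModuleList([
            nn.Conv2d(1, 8, (1, k), padding=(0, k // 2)) for k in (15, 31, 63)])
        # Asymmetric spatial convolution collapsing the electrode dimension.
        self.spatial = nn.Conv2d(24, 24, (n_electrodes, 1), groups=24)
        # Channel attention (squeeze-and-excitation style weighting).
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(24, 8), nn.ReLU(), nn.Linear(8, 24), nn.Sigmoid())
        self.lstm = nn.LSTM(24, 32, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(64, n_classes)

    def forward(self, x):                      # x: (batch, 1, n_electrodes, n_samples)
        feats = torch.cat([conv(x) for conv in self.temporal], dim=1)
        feats = self.spatial(feats)            # (batch, 24, 1, n_samples)
        w = self.attn(feats).unsqueeze(-1).unsqueeze(-1)
        feats = feats * w                      # channel weighting
        out, _ = self.lstm(feats.squeeze(2).transpose(1, 2))
        return self.fc(out[:, -1])             # last time step -> class logits
```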
Lung cancer is among the tumor diseases that most threaten human health. Early detection is crucial to improving the survival and recovery rates of lung cancer patients. Existing methods use two-dimensional multi-view frameworks to learn lung nodule features and simply integrate the multi-view features to classify nodules as benign or malignant. However, these methods do not capture spatial features effectively and ignore the variability among views. Therefore, this paper proposes a three-dimensional (3D) multi-view convolutional neural network (MVCNN) framework. To further address the differences among views in the multi-view model, a 3D multi-view squeeze-and-excitation convolutional neural network (MVSECNN) model is constructed by introducing a squeeze-and-excitation (SE) module in the feature fusion stage. Finally, statistical methods are used to analyze the model predictions and doctor annotations. On the independent test set, the classification accuracy and sensitivity of the model were 96.04% and 98.59%, respectively, higher than those of other state-of-the-art methods. The consistency score between the model predictions and the pathological diagnosis results was 0.948, significantly higher than that between the doctor annotations and the pathological diagnosis results. The proposed methods can effectively learn the spatial heterogeneity of lung nodules and address multi-view differences, while achieving classification of benign and malignant lung nodules, which is of great significance for assisting doctors in clinical diagnosis.
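The SE-based fusion stage can be illustrated by weighting per-view feature vectors before summation, as in the hedged PyTorch sketch below; the number of views, feature dimension, and module name are assumed values, not the MVSECNN configuration.

```python
# Sketch of squeeze-and-excitation weighting applied across multi-view features.
import torch
import torch.nn as nn

class SEViewFusion(nn.Module):
    def __init__(self, n_views=9, feat_dim=128, reduction=4):
        super().__init__()
        # The SE block treats each view's feature vector as one "channel".
        self.se = nn.Sequential(
            nn.Linear(n_views, n_views // reduction), nn.ReLU(),
            nn.Linear(n_views // reduction, n_views), nn.Sigmoid())
        self.classifier = nn.Linear(feat_dim, 2)   # benign vs. malignant

    def forward(self, view_feats):                 # (batch, n_views, feat_dim)
        squeeze = view_feats.mean(dim=2)           # (batch, n_views)
        weights = self.se(squeeze).unsqueeze(-1)   # per-view importance
        fused = (view_feats * weights).sum(dim=1)  # weighted sum over views
        return self.classifier(fused)
```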
Manual segmentation of coronary arteries in computed tomography angiography (CTA) images is inefficient, and existing deep learning segmentation models often exhibit low accuracy on coronary artery images. Inspired by the Transformer architecture, this paper proposes a novel segmentation model, the double parallel encoder u-net with transformers (DUNETR). This network employed a dual-encoder design integrating Transformers and convolutional neural networks (CNNs). The Transformer encoder transformed three-dimensional (3D) coronary artery data into a one-dimensional (1D) sequential problem, effectively capturing global multi-scale feature information. Meanwhile, the CNN encoder extracted local features of the 3D coronary arteries. The complementary features extracted by the two encoders were fused through the noise reduction feature fusion (NRFF) module and passed to the decoder. Experimental results on a public dataset demonstrated that the proposed DUNETR model achieved a Dice similarity coefficient of 81.19% and a recall rate of 80.18%, representing improvements of 0.49% and 0.46%, respectively, over the next best model in comparative experiments. These results surpassed those of other conventional deep learning methods. The integration of Transformers and CNNs as dual encoders enables the extraction of rich feature information, significantly enhancing the effectiveness of 3D coronary artery segmentation. Additionally, this model provides a novel approach for segmenting other vascular structures.
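A minimal sketch of a dual-encoder arrangement in this spirit is shown below: a 3D CNN branch extracts local features, a Transformer branch operates on flattened patch tokens for global context, and a 1×1×1 convolution stands in for the NRFF fusion module. All sizes and the fusion design are illustrative assumptions, not the DUNETR specification.

```python
# Conceptual dual-encoder (CNN + Transformer) sketch for 3D volumes.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualEncoderSketch(nn.Module):
    def __init__(self, in_ch=1, embed=96, patch=8):
        super().__init__()
        self.cnn = nn.Sequential(                   # local-feature (CNN) encoder
            nn.Conv3d(in_ch, embed, 3, padding=1), nn.InstanceNorm3d(embed), nn.ReLU(),
            nn.Conv3d(embed, embed, 3, padding=1), nn.ReLU())
        self.patch_embed = nn.Conv3d(in_ch, embed, patch, stride=patch)
        layer = nn.TransformerEncoderLayer(d_model=embed, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.fuse = nn.Conv3d(2 * embed, embed, 1)  # placeholder for the NRFF module

    def forward(self, x):                           # x: (B, 1, D, H, W)
        local = self.cnn(x)
        tokens = self.patch_embed(x)                # (B, embed, d, h, w)
        b, c, d, h, w = tokens.shape
        seq = tokens.flatten(2).transpose(1, 2)     # 3D volume -> 1D token sequence
        glob = self.transformer(seq).transpose(1, 2).reshape(b, c, d, h, w)
        glob = F.interpolate(glob, size=local.shape[2:], mode="trilinear")
        return self.fuse(torch.cat([local, glob], dim=1))
```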
Objective To systematically review the effect of media multitasking on working memory and attention among adolescents. Methods The CNKI, CBM, WanFang Data, VIP, PubMed, Web of Science, and EMbase databases were electronically searched to collect cross-sectional studies on the effect of media multitasking on working memory and attention among adolescents from inception to January 1st, 2021. Two reviewers independently screened the literature, extracted data, and assessed the risk of bias of the included studies; meta-analysis was then performed using Stata 15.1 software. Results A total of 16 cross-sectional studies were included. Meta-analysis showed that media multitasking was negatively associated with working memory (Cohen's d=0.40, 95%CI 0.14 to 0.66, P=0.003) and with attention (Cohen's d=1.02, 95%CI 0.58 to 1.47, P<0.001). Conclusion Current evidence shows that media multitasking has a negative impact on working memory and attention. Due to the limited quality and quantity of the included studies, more high-quality studies are required to verify the above conclusion.
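For readers unfamiliar with how such pooled effects are obtained, the sketch below shows standard inverse-variance pooling with a DerSimonian-Laird random-effects estimate in Python; the per-study effect sizes and variances in the example are hypothetical, not the data of the included studies.

```python
# Toy sketch of inverse-variance pooling with a DerSimonian-Laird
# random-effects estimate, the standard machinery behind pooled Cohen's d values.
import math

def random_effects_pool(effects, variances):
    w = [1.0 / v for v in variances]                                 # fixed-effect weights
    fixed = sum(wi * ei for wi, ei in zip(w, effects)) / sum(w)
    q = sum(wi * (ei - fixed) ** 2 for wi, ei in zip(w, effects))    # heterogeneity Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)                    # between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]                     # random-effects weights
    pooled = sum(wi * ei for wi, ei in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)          # estimate and 95% CI

# Example with made-up per-study Cohen's d values and variances:
print(random_effects_pool([0.30, 0.50, 0.45], [0.04, 0.06, 0.05]))
```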
Objective To systematically review the methodological quality of guidelines concerning attention-deficit/hyperactivity disorder (ADHD) in children and adolescents, and to compare the differences and similarities of the recommended drugs, in order to provide guidance for clinical practice.
Methods Guidelines concerning ADHD were electronically retrieved from PubMed, EMbase, VIP, WanFang Data, CNKI, NGC (National Guideline Clearinghouse), GIN (Guidelines International Network), and NICE (National Institute for Health and Clinical Excellence) from inception to December 2013. The methodological quality of the included guidelines was evaluated with the AGREE II instrument, and the differences between recommendations were compared.
Results A total of 9 guidelines concerning ADHD in children and adolescents were included, developed between 2004 and 2012. Among the 9 guidelines, 4 were from the USA, 3 from Europe, and 2 from the UK. The levels of recommendation were Level A for 2 guidelines and Level B for 7 guidelines. The guideline scores across the AGREE II domains decreased in the order of "clarity of presentation", "scope and purpose", "participants", "applicability", "rigour of development", and "editorial independence". The three evidence-based guidelines ranked in the top three in the "rigour of development" domain. There were slight differences among the recommendations of the different guidelines.
Conclusion The overall methodological quality of ADHD guidelines is suboptimal across countries and regions. Scores vary across the 6 domains and 23 items of AGREE II, and the scores of evidence-based guidelines are higher than those of non-evidence-based guidelines. Future guidelines on ADHD in children and adolescents should be improved in "rigour of development" and "applicability", and conflicts of interest should be addressed. Guidelines are recommended to be developed using evidence-based medicine methods and the best available evidence.
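As background to the appraisal above, the AGREE II standardized domain score is computed from the obtained item ratings relative to the minimum and maximum possible scores for that domain. The small Python sketch below illustrates the calculation with hypothetical ratings, not those of the included guidelines.

```python
# Sketch of the AGREE II standardized domain score calculation.
def agree_domain_score(item_ratings, n_appraisers):
    """item_ratings: flat list of 1-7 ratings (one per item per appraiser)."""
    n_items = len(item_ratings) // n_appraisers
    obtained = sum(item_ratings)
    minimum = 1 * n_items * n_appraisers
    maximum = 7 * n_items * n_appraisers
    return 100.0 * (obtained - minimum) / (maximum - minimum)

# Example: a three-item domain rated by two appraisers.
print(agree_domain_score([5, 6, 4, 5, 7, 6], n_appraisers=2))
```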
Accurate segmentation of ground-glass nodules (GGNs) is clinically important, but it is challenging because GGNs in computed tomography images show blurred boundaries, irregular shapes, and uneven intensity. This paper segments GGNs with a fully convolutional residual network, a residual network based on an atrous spatial pyramid pooling structure and an attention mechanism (ResAANet). The network uses the atrous spatial pyramid pooling (ASPP) structure to expand the feature-map receptive field and extract richer features, and uses attention, residual connections, and long skip connections to fully retain the sensitive features extracted by the convolutional layers. First, we employed 565 GGNs provided by Shanghai Chest Hospital to train and validate ResAANet and obtain a stable model. Then, two groups of data selected from clinical examinations (84 GGNs) and the lung image database consortium (LIDC) dataset (145 GGNs) were used to validate and evaluate the performance of the proposed method. Finally, we applied a best-threshold method to remove false-positive regions and obtain optimized results. The average Dice similarity coefficient (DSC) of the proposed algorithm on the clinical dataset and the LIDC dataset reached 83.46% and 83.26%, respectively, the average Jaccard index (IoU) reached 72.39% and 71.56%, respectively, and the segmentation speed reached 0.1 seconds per image. Compared with other reported methods, the new method segments GGNs accurately, quickly, and robustly, and can provide doctors with important information such as nodule size or density to assist subsequent diagnosis and treatment.
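An atrous spatial pyramid pooling block of the kind ResAANet builds on can be sketched in PyTorch as follows; the dilation rates and channel counts are common defaults assumed for illustration, not the paper's configuration.

```python
# Illustrative ASPP block: parallel atrous convolutions enlarge the receptive
# field without reducing the feature-map resolution, then a 1x1 conv projects
# the concatenated branches back to the target channel count.
import torch
import torch.nn as nn

class ASPP(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
            for r in rates])
        self.project = nn.Conv2d(len(rates) * out_ch, out_ch, 1)

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))
```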
The processing mechanism of the human brain for speech information is an important source of inspiration for speech enhancement technology. Attention and lateral inhibition are key mechanisms in auditory information processing that selectively enhance specific information. Building on this, this study introduces a dual-branch U-Net that integrates lateral inhibition and a feedback-driven attention mechanism. Noisy speech signals input into the first branch of the U-Net led to selective feedback of time-frequency units with high confidence. The resulting activation-layer gradients, combined with the lateral inhibition mechanism, were used to compute attention maps. These maps were then concatenated to the second branch of the U-Net, directing the network's focus and achieving selective enhancement of the speech signal. The speech enhancement effect was evaluated with five metrics, including the perceptual evaluation of speech quality. The method was compared with five other methods: Wiener, SEGAN, PHASEN, Demucs, and GRN. The experimental results demonstrated that, across multiple performance metrics, the proposed method improved speech enhancement by 18% to 21% over the baseline network in various noise scenarios. The improvement was particularly notable under low signal-to-noise ratios, where the proposed method showed a significant advantage over the other methods. The speech enhancement technique based on lateral inhibition and feedback-driven attention holds significant potential for auditory speech enhancement and is suitable for clinical applications related to cochlear implants and hearing aids.
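The feedback-attention idea can be illustrated by deriving an attention map from the gradients of high-confidence time-frequency outputs with respect to the input and feeding that map to a second branch. The PyTorch sketch below uses two tiny placeholder branches and an arbitrary confidence threshold; it is a conceptual illustration, not the paper's dual U-Net design.

```python
# Rough sketch of gradient-based feedback attention between two branches.
import torch
import torch.nn as nn

branch1 = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(8, 1, 3, padding=1), nn.Sigmoid())
branch2 = nn.Conv2d(2, 1, 3, padding=1)          # takes spectrogram + attention map

spec = torch.randn(1, 1, 64, 100, requires_grad=True)   # noisy spectrogram (F x T)
mask = branch1(spec)                                     # estimated time-frequency mask

# Select confident time-frequency units and back-propagate them to the input.
confident = (mask > 0.5).float()                         # arbitrary confidence threshold
(mask * confident).sum().backward()
attention = torch.relu(spec.grad)                        # gradient-derived attention map
attention = attention / (attention.amax() + 1e-8)        # normalize to [0, 1]

# The second branch sees the spectrogram together with the attention map.
enhanced = branch2(torch.cat([spec.detach(), attention], dim=1))
```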
Deep learning-based automatic classification of diabetic retinopathy (DR) helps to enhance the accuracy and efficiency of auxiliary diagnosis. This paper presents an improved residual network model for classifying DR into five different severity levels. First, the convolution in the first layer of the residual network was replaced with three smaller convolutions to reduce the computational load of the network. Second, to address the issue of inaccurate classification due to minimal differences between different severity levels, a mixed attention mechanism was introduced to make the model focus more on the crucial features of the lesions. Finally, to better extract the morphological features of the lesions in DR images, cross-layer fusion convolutions were used instead of the conventional residual structure. To validate the effectiveness of the improved model, it was applied to the Kaggle Blindness Detection competition dataset APTOS2019. The experimental results demonstrated that the proposed model achieved a classification accuracy of 97.75% and a Kappa value of 0.9717 for the five DR severity levels. Compared to some existing models, this approach shows significant advantages in classification accuracy and performance.
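The stem modification described above, replacing a single large first-layer convolution with three smaller stacked convolutions, can be sketched in PyTorch as follows; the channel sizes follow common ResNet practice and are assumptions rather than the paper's exact settings.

```python
# Sketch of replacing a ResNet-style 7x7 stem with three stacked 3x3 convolutions.
import torch.nn as nn

# Conventional ResNet stem: one 7x7 convolution with stride 2.
stem_original = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),
    nn.BatchNorm2d(64), nn.ReLU(inplace=True))

# Modified stem: three stacked 3x3 convolutions cover a similar receptive
# field with fewer parameters per layer and extra non-linearities.
stem_modified = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(32), nn.ReLU(inplace=True),
    nn.Conv2d(32, 32, 3, padding=1, bias=False),
    nn.BatchNorm2d(32), nn.ReLU(inplace=True),
    nn.Conv2d(32, 64, 3, padding=1, bias=False),
    nn.BatchNorm2d(64), nn.ReLU(inplace=True))
```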