Publish Date | Title | Authors | URL | Abstract |
---|---|---|---|---|
2025-02-12 | Rhythmic sharing: A bio-inspired paradigm for zero-shot adaptation and learning in neural networks | Hoony Kang, Wolfgang Losert | Link | The brain can rapidly adapt to new contexts and learn from limited data, a coveted characteristic that artificial intelligence algorithms have struggled to mimic. Inspired by oscillatory rhythms of the mechanical structures of neural cells, we developed a learning paradigm that is based on oscillations in link strengths and associates learning with the coordination of these oscillations. We find that this paradigm yields rapid adaptation and learning in artificial neural networks. Link oscillations can rapidly change coordination, endowing the network with the ability to sense subtle context changes in an unsupervised manner. In other words, the network generates the missing contextual tokens required to perform as a generalist AI architecture capable of predicting dynamics in multiple contexts. Oscillations also allow the network to extrapolate dynamics to never-seen-before contexts. These capabilities make our learning paradigm a powerful starting point for novel models of learning and cognition. Furthermore, learning through link coordination is agnostic to the specifics of the neural network architecture, hence our study opens the door for introducing rapid adaptation and learning capabilities into leading AI models. |
2025-02-12 | Rapid Whole Brain Mesoscale In-vivo MR Imaging using Multi-scale Implicit Neural Representation | Jun Lyu, Lipeng Ning, William Consagra, Qiang Liu, Richard J. Rushmore, Berkin Bilgic, Yogesh Rathi | Link | Purpose: To develop and validate a novel image reconstruction technique using implicit neural representations (INR) for multi-view thick-slice acquisitions while reducing the scan time but maintaining high signal-to-noise ratio (SNR). Methods: We propose Rotating-view super-resolution (ROVER)-MRI, an unsupervised neural network-based algorithm designed to reconstruct MRI data from multi-view thick slices, effectively reducing scan time by 2-fold while maintaining fine anatomical details. We compare our method to both bicubic interpolation and the current state-of-the-art regularized least-squares super-resolution reconstruction (LS-SRR) technique. Validation is performed using ground-truth ex-vivo monkey brain data, and we demonstrate superior reconstruction quality across several in-vivo human datasets. Notably, we achieve the reconstruction of a whole human brain in-vivo T2-weighted image with an unprecedented 180{\mu}m isotropic spatial resolution, accomplished in just 17 minutes of scan time on a 7T MRI scanner. Results: ROVER-MRI outperformed LS-SRR method in terms of reconstruction quality with 22.4% lower relative error (RE) and 7.5% lower full-width half maximum (FWHM) indicating better preservation of fine structural details in nearly half the scan time. Conclusion: ROVER-MRI offers an efficient and robust approach for mesoscale MR imaging, enabling rapid, high-resolution whole-brain scans. Its versatility holds great promise for research applications requiring anatomical details and time-efficient imaging. |
2025-02-12 | Brain Latent Progression: Individual-based Spatiotemporal Disease Progression on 3D Brain MRIs via Latent Diffusion | Lemuel Puglisi, Daniel C. Alexander, Daniele Ravì | Link | The growing availability of longitudinal Magnetic Resonance Imaging (MRI) datasets has facilitated Artificial Intelligence (AI)-driven modeling of disease progression, making it possible to predict future medical scans for individual patients. However, despite significant advancements in AI, current methods continue to face challenges including achieving patient-specific individualization, ensuring spatiotemporal consistency, efficiently utilizing longitudinal data, and managing the substantial memory demands of 3D scans. To address these challenges, we propose Brain Latent Progression (BrLP), a novel spatiotemporal model designed to predict individual-level disease progression in 3D brain MRIs. The key contributions in BrLP are fourfold: (i) it operates in a small latent space, mitigating the computational challenges posed by high-dimensional imaging data; (ii) it explicitly integrates subject metadata to enhance the individualization of predictions; (iii) it incorporates prior knowledge of disease dynamics through an auxiliary model, facilitating the integration of longitudinal data; and (iv) it introduces the Latent Average Stabilization (LAS) algorithm, which (a) enforces spatiotemporal consistency in the predicted progression at inference time and (b) allows us to derive a measure of the uncertainty for the prediction. We train and evaluate BrLP on 11,730 T1-weighted (T1w) brain MRIs from 2,805 subjects and validate its generalizability on an external test set comprising 2,257 MRIs from 962 subjects. Our experiments compare BrLP-generated MRI scans with real follow-up MRIs, demonstrating state-of-the-art accuracy compared to existing methods. The code is publicly available at: https://github.com/LemuelPuglisi/BrLP. |
2025-02-12 | Uncertainty Aware Human-machine Collaboration in Camouflaged Object Detection | Ziyue Yang, Kehan Wang, Yuhang Ming, Yong Peng, Han Yang, Qiong Chen, Wanzeng Kong | Link | Camouflaged Object Detection (COD), the task of identifying objects concealed within their environments, has seen rapid growth due to its wide range of practical applications. A key step toward developing trustworthy COD systems is the estimation and effective utilization of uncertainty. In this work, we propose a human-machine collaboration framework for classifying the presence of camouflaged objects, leveraging the complementary strengths of computer vision (CV) models and noninvasive brain-computer interfaces (BCIs). Our approach introduces a multiview backbone to estimate uncertainty in CV model predictions, utilizes this uncertainty during training to improve efficiency, and defers low-confidence cases to human evaluation via RSVP-based BCIs during testing for more reliable decision-making. We evaluated the framework in the CAMO dataset, achieving state-of-the-art results with an average improvement of 4.56\% in balanced accuracy (BA) and 3.66\% in the F1 score compared to existing methods. For the best-performing participants, the improvements reached 7.6\% in BA and 6.66\% in the F1 score. Analysis of the training process revealed a strong correlation between our confidence measures and precision, while an ablation study confirmed the effectiveness of the proposed training policy and the human-machine collaboration strategy. In general, this work reduces human cognitive load, improves system reliability, and provides a strong foundation for advancements in real-world COD applications and human-computer interaction. Our code and data are available at: https://github.com/ziyuey/Uncertainty-aware-human-machine-collaboration-in-camouflaged-object-identification. |
2025-02-12 | Normative Cerebral Perfusion Across the Lifespan | Xinglin Zeng, Yiran Li, Lin Hua, Ruoxi Lu, Lucas Lemos Franco, Peter Kochunov, Shuo Chen, John A Detre, Ze Wang | Link | Cerebral perfusion plays a crucial role in maintaining brain function and is tightly coupled with neuronal activity. While previous studies have examined cerebral perfusion trajectories across development and aging, precise characterization of its lifespan dynamics has been limited by small sample sizes and methodological inconsistencies. In this study, we construct the first comprehensive normative model of cerebral perfusion across the human lifespan (birth to 85 years) using a large multi-site dataset of over 12,000 high-quality arterial spin labeling (ASL) MRI scans. Leveraging generalized additive models for location, scale, and shape (GAMLSS), we mapped nonlinear growth trajectories of cerebral perfusion at global, network, and regional levels. We observed a rapid postnatal increase in cerebral perfusion, peaking at approximately 7.1 years, followed by a gradual decline into adulthood. Sex differences were evident, with distinct regional maturation patterns rather than uniform differences across all brain regions. Beyond normative modeling, we quantified individual deviations from expected CBF patterns in neurodegenerative and psychiatric conditions, identifying disease-specific perfusion abnormalities across four brain disorders. Using longitudinal data, we established typical and atypical cerebral perfusion trajectories, highlighting the prognostic value of perfusion-based biomarkers for detecting disease progression. Our findings provide a robust normative framework for cerebral perfusion, facilitating precise characterization of brain health across the lifespan and enhancing the early identification of neurovascular dysfunction in clinical populations. |
2025-02-11 | From Brainwaves to Brain Scans: A Robust Neural Network for EEG-to-fMRI Synthesis | Kristofer Grover Roos, Quan Huu Cap, Atsushi Fukuda | Link | While functional magnetic resonance imaging (fMRI) offers rich spatial resolution, it is limited by high operational costs and significant infrastructural demands. In contrast, electroencephalography (EEG) provides millisecond-level precision in capturing electrical activity but lacks the spatial resolution necessary for precise neural localization. To bridge these gaps, we introduce E2fNet, a simple yet effective deep learning model for synthesizing fMRI images from low-cost EEG data. E2fNet is specifically designed to capture and translate meaningful features from EEG across electrode channels into accurate fMRI representations. Extensive evaluations across three datasets demonstrate that E2fNet consistently outperforms existing methods, achieving state-of-the-art results in terms of the structural similarity index measure (SSIM). Our findings suggest that E2fNet is a promising, cost-effective solution for enhancing neuroimaging capabilities. The code is available at https://github.com/kgr20/E2fNet. |
2025-02-11 | An affordable, wearable, fiber-free pulsed-mode diffuse speckle contrast flowmetry (PM-DSCF) sensor for noninvasive measurements of deep cerebral blood flow | Chaebeom Yeo, Xuhui Liu, Mehrana Mohtasebi, Faezeh Akbari, Faraneh Fathi, Guoqiang Yu | Link | Significance: Measuring cerebral blood flow (CBF) is crucial for diagnosing various cerebral diseases. An affordable, wearable, and fiber-free continuous-wave speckle contrast flowmetry (CW-DSCF) technique has been developed for continuous monitoring of CBF variations. However, its application in adult humans is limited by shallow tissue penetration. Aim: To develop an innovative pulse-mode DSCF (PM-DSCF) system for continuous monitoring of CBF variations in adult humans. Approach: The PM-DSCF utilizes an 808 nm laser diode and a small NanEye camera to capture diffuse laser speckle fluctuations caused by red blood cell movement in the brain (i.e., CBF). Operating in short-pulse mode (duty cycle < 5%), the system maximizes peak pulse light power for deeper tissue penetration, while ensuring that the average power density remains within ANSI safety standards for skin exposure. The PM-DSCF was evaluated on tissue-simulating phantoms and in adult humans. Results: The maximum effective source-detector distance increased from 15 mm (CW-DSCF) to 35 mm (PM-DSCF). The PM-DSCF successfully detected CBF variations in adult brains during head-up-tilting experiments, consistent with physiological expectations. Conclusions: Switching from CW mode to PM mode significantly increases the maximum tissue penetration depth from ~7.5 mm (CW-DSCF) to ~17.5 mm (PM-DSCF), enabling successful CBF measurements in adult humans. |
2025-02-11 | Numerical Study on Human Brain Cortical Electrostimulation Assessment During Uniform Magnetic Field Exposure at Intermediate Frequencies | Jose Gomez-Tames, Thomas Tarnaud, Wout Joseph, Emmeric Tanghe | Link | Objectives: Permissible limits have been established by international guidelines and standards for human protection to electromagnetic field exposure to prevent adverse health effects stemming from electrostimulation in the most sensitive body part. That is the peripheral nervous system (PNS) in the intermediate frequency range (300 Hz to 100 kHz) and the central nervous system (CNS) at lower frequencies. However, there is a need to reevaluate protection limits against CNS electrostimulation in the intermediate frequency range, considering the importance of brain tissues during electromagnetic head exposure. This study aims to derive the level of CNS cortical stimulation to evaluate compliance with existing protection limits. Method: Multi-scale computation modelling was used to evaluate neuron stimulation thresholds by integrating individual neurons into realistic anatomical head models. Five different excitable membrane models within the motor cortex were examined across three human head models, providing the most comprehensive and extensive evaluation to date. Results: Current protection limits are confirmed as conservative, with non-compliance observed in only 0.02% and 2.4% of axons under clamped and sealed boundary conditions, respectively. The study highlights significant intersubject variability (up to 600% mean threshold) and clarifies the influence of neural excitation models on permissible level assessments. Conclusion: Current electric field limits are conservative for CNS electrostimulation in the intermediate frequencies range, but the margin of safety decreases at higher frequencies, warranting further evaluation. Impact: The findings and methodology contribute to the rationale and provide valuable insights for revising electromagnetic safety exposure guidelines. |
2025-02-11 | Quantitative evaluation of unsupervised clustering algorithms for dynamic total-body PET image analysis | Oona Rainio, Maria K. Jaakkola, Riku Klén | Link | Background. Recently, dynamic total-body positron emission tomography (PET) imaging has become possible due to new scanner devices. While clustering algorithms have been proposed for PET analysis already earlier, there is still little research systematically evaluating these algorithms for processing of dynamic total-body PET images. Materials and methods. Here, we compare the performance of 15 unsupervised clustering methods, including K-means either by itself or after principal component analysis (PCA) or independent component analysis (ICA), Gaussian mixture model (GMM), fuzzy c-means (FCM), agglomerative clustering, spectral clustering, and several newer clustering algorithms, for classifying time activity curves (TACs) in dynamic PET images. We use dynamic total-body $^{15}$O-water PET images collected from 30 patients with suspected or confirmed coronary artery disease. To evaluate the clustering algorithms in a quantitative way, we use them to classify 5000 TACs from each image based on whether the curve is taken from brain, right heart ventricle, right kidney, lower right lung lobe, or urinary bladder. Results. According to our results, the best methods are GMM, FCM, and ICA combined with mini batch K-means, which classified the TACs with a median accuracies of 89\%, 83\%, and 81\%, respectively, in a processing time of half a second or less on average for each image. Conclusion. GMM, FCM, and ICA with mini batch K-means show promise for dynamic total-body PET analysis. |
2025-02-11 | From Thought to Action: How a Hierarchy of Neural Dynamics Supports Language Production | Mingfang, Zhang, Jarod Lévy, Stéphane d'Ascoli, Jérémy Rapin, F. -Xavier Alario, Pierre Bourdillon, Svetlana Pinet, Jean-Rémi King | Link | Humans effortlessly communicate their thoughts through intricate sequences of motor actions. Yet, the neural processes that coordinate language production remain largely unknown, in part because speech artifacts limit the use of neuroimaging. To elucidate the unfolding of language production in the brain, we investigate with magnetoencephalography (MEG) and electroencephalography (EEG) the neurophysiological activity of 35 skilled typists, while they typed sentences on a keyboard. This approach confirms the hierarchical predictions of linguistic theories: the neural activity preceding the production of each word is marked by the sequential rise and fall of context-, word-, syllable-, and letter-level representations. Remarkably, each of these neural representations is maintained over long time periods within each level of the language hierarchy. This phenomenon results in a superposition of successive representations that is supported by a hierarchy of dynamic neural codes. Overall, these findings provide a precise computational breakdown of the neural dynamics that coordinate the production of language in the human brain. |
Publish Date | Title | Authors | URL | Abstract |
---|---|---|---|---|
2025-02-11 | From Brainwaves to Brain Scans: A Robust Neural Network for EEG-to-fMRI Synthesis | Kristofer Grover Roos, Quan Huu Cap, Atsushi Fukuda | Link | While functional magnetic resonance imaging (fMRI) offers rich spatial resolution, it is limited by high operational costs and significant infrastructural demands. In contrast, electroencephalography (EEG) provides millisecond-level precision in capturing electrical activity but lacks the spatial resolution necessary for precise neural localization. To bridge these gaps, we introduce E2fNet, a simple yet effective deep learning model for synthesizing fMRI images from low-cost EEG data. E2fNet is specifically designed to capture and translate meaningful features from EEG across electrode channels into accurate fMRI representations. Extensive evaluations across three datasets demonstrate that E2fNet consistently outperforms existing methods, achieving state-of-the-art results in terms of the structural similarity index measure (SSIM). Our findings suggest that E2fNet is a promising, cost-effective solution for enhancing neuroimaging capabilities. The code is available at https://github.com/kgr20/E2fNet. |
2025-02-11 | From Thought to Action: How a Hierarchy of Neural Dynamics Supports Language Production | Mingfang, Zhang, Jarod Lévy, Stéphane d'Ascoli, Jérémy Rapin, F. -Xavier Alario, Pierre Bourdillon, Svetlana Pinet, Jean-Rémi King | Link | Humans effortlessly communicate their thoughts through intricate sequences of motor actions. Yet, the neural processes that coordinate language production remain largely unknown, in part because speech artifacts limit the use of neuroimaging. To elucidate the unfolding of language production in the brain, we investigate with magnetoencephalography (MEG) and electroencephalography (EEG) the neurophysiological activity of 35 skilled typists, while they typed sentences on a keyboard. This approach confirms the hierarchical predictions of linguistic theories: the neural activity preceding the production of each word is marked by the sequential rise and fall of context-, word-, syllable-, and letter-level representations. Remarkably, each of these neural representations is maintained over long time periods within each level of the language hierarchy. This phenomenon results in a superposition of successive representations that is supported by a hierarchy of dynamic neural codes. Overall, these findings provide a precise computational breakdown of the neural dynamics that coordinate the production of language in the human brain. |
2025-02-11 | Emotional EEG Classification using Upscaled Connectivity Matrices | Chae-Won Lee, Jong-Seok Lee | Link | In recent studies of emotional EEG classification, connectivity matrices have been successfully employed as input to convolutional neural networks (CNNs), which can effectively consider inter-regional interaction patterns in EEG. However, we find that such an approach has a limitation that important patterns in connectivity matrices may be lost during the convolutional operations in CNNs. To resolve this issue, we propose and validate an idea to upscale the connectivity matrices to strengthen the local patterns. Experimental results demonstrate that this simple idea can significantly enhance the classification performance. |
2025-02-10 | Retrieving Filter Spectra in CNN for Explainable Sleep Stage Classification | Stephan Goerttler, Yucheng Wang, Emadeldeen Eldele, Fei He, Min Wu | Link | Despite significant advances in deep learning-based sleep stage classification, the clinical adoption of automatic classification models remains slow. One key challenge is the lack of explainability, as many models function as black boxes with millions of parameters. In response, recent work has increasingly focussed on enhancing model explainability. This study contributes to these efforts by globally explaining spectral processing of individual EEG channels. Specifically, we introduce a method to retrieve the filter spectrum of low-level convolutional feature extraction and compare it with the classification-relevant spectral information in the data. We evaluate our approach on the MSA-CNN model using the ISRUC-S3 and Sleep-EDF-20 datasets. Our findings show that spectral processing plays a significant role in the lower frequency bands. In addition, comparing the correlation between filter spectrum and data-based spectral information with univariate performance indicates that the model naturally prioritises the most informative channels in a multimodal setting. We specify how these insights can be leveraged to enhance model performance. The code for the filter spectrum retrieval and its analysis is available at https://github.com/sgoerttler/MSA-CNN. |
2025-02-10 | FEMBA: Efficient and Scalable EEG Analysis with a Bidirectional Mamba Foundation Model | Anna Tegon, Thorir Mar Ingolfsson, Xiaying Wang, Luca Benini, Yawei Li | Link | Accurate and efficient electroencephalography (EEG) analysis is essential for detecting seizures and artifacts in long-term monitoring, with applications spanning hospital diagnostics to wearable health devices. Robust EEG analytics have the potential to greatly improve patient care. However, traditional deep learning models, especially Transformer-based architectures, are hindered by their quadratic time and memory complexity, making them less suitable for resource-constrained environments. To address these challenges, we present FEMBA (Foundational EEG Mamba + Bidirectional Architecture), a novel self-supervised framework that establishes new efficiency benchmarks for EEG analysis through bidirectional state-space modeling. Unlike Transformer-based models, which incur quadratic time and memory complexity, FEMBA scales linearly with sequence length, enabling more scalable and efficient processing of extended EEG recordings. Trained on over 21,000 hours of unlabeled EEG and fine-tuned on three downstream tasks, FEMBA achieves competitive performance in comparison with transformer models, with significantly lower computational cost. Specifically, it reaches 81.82% balanced accuracy (0.8921 AUROC) on TUAB and 0.949 AUROC on TUAR, while a tiny 7.8M-parameter variant demonstrates viability for resource-constrained devices. These results pave the way for scalable, general-purpose EEG analytics in both clinical and highlight FEMBA as a promising candidate for wearable applications. |
2025-02-09 | Neurophysiological correlates to the human brain complexity through |
Dimitri Marques Abramov, Daniel de Freitas Quintanilha, Henrique Santos Lima, Roozemeria Pereira Costa, Carla Kamil-Leite, Vladimir V. Lazarev, Constantino Tsallis | Link | The prospects of assessing neural complexity (NC) by |
2025-02-09 | Protecting Intellectual Property of EEG-based Neural Networks with Watermarking | Ahmed Abdelaziz, Ahmed Fathi, Ahmed Fares | Link | EEG-based neural networks, pivotal in medical diagnosis and brain-computer interfaces, face significant intellectual property (IP) risks due to their reliance on sensitive neurophysiological data and resource-intensive development. Current watermarking methods, particularly those using abstract trigger sets, lack robust authentication and fail to address the unique challenges of EEG models. This paper introduces a cryptographic wonder filter-based watermarking framework tailored for EEG-based neural networks. Leveraging collision-resistant hashing and public-key encryption, the wonder filter embeds the watermark during training, ensuring minimal distortion ( |
2025-02-07 | Geometric Machine Learning on EEG Signals | Benjamin J. Choi | Link | Brain-computer interfaces (BCIs) offer transformative potential, but decoding neural signals presents significant challenges. The core premise of this paper is built around demonstrating methods to elucidate the underlying low-dimensional geometric structure present in high-dimensional brainwave data in order to assist in downstream BCI-related neural classification tasks. We demonstrate two pipelines related to electroencephalography (EEG) signal processing: (1) a preliminary pipeline removing noise from individual EEG channels, and (2) a downstream manifold learning pipeline uncovering geometric structure across networks of EEG channels. We conduct preliminary validation using two EEG datasets and situate our demonstration in the context of the BCI-relevant imagined digit decoding problem. Our preliminary pipeline uses an attention-based EEG filtration network to extract clean signal from individual EEG channels. Our primary pipeline uses a fast Fourier transform, a Laplacian eigenmap, a discrete analog of Ricci flow via Ollivier's notion of Ricci curvature, and a graph convolutional network to perform dimensionality reduction on high-dimensional multi-channel EEG data in order to enable regularizable downstream classification. Our system achieves competitive performance with existing signal processing and classification benchmarks; we demonstrate a mean test correlation coefficient of >0.95 at 2 dB on semi-synthetic neural denoising and a downstream EEG-based classification accuracy of 0.97 on distinguishing digit- versus non-digit thoughts. Results are preliminary and our geometric machine learning pipeline should be validated by more extensive follow-up studies; generalizing these results to larger inter-subject sample sizes, different hardware systems, and broader use cases will be crucial. |
2025-02-07 | Removing Neural Signal Artifacts with Autoencoder-Targeted Adversarial Transformers (AT-AT) | Benjamin J. Choi | Link | Electromyogenic (EMG) noise is a major contamination source in EEG data that can impede accurate analysis of brain-specific neural activity. Recent literature on EMG artifact removal has moved beyond traditional linear algorithms in favor of machine learning-based systems. However, existing deep learning-based filtration methods often have large compute footprints and prohibitively long training times. In this study, we present a new machine learning-based system for filtering EMG interference from EEG data using an autoencoder-targeted adversarial transformer (AT-AT). By leveraging the lightweight expressivity of an autoencoder to determine optimal time-series transformer application sites, our AT-AT architecture achieves a >90% model size reduction compared to published artifact removal models. The addition of adversarial training ensures that filtered signals adhere to the fundamental characteristics of EEG data. We trained AT-AT using published neural data from 67 subjects and found that the system was able to achieve comparable test performance to larger models; AT-AT posted a mean reconstructive correlation coefficient above 0.95 at an initial signal-to-noise ratio (SNR) of 2 dB and 0.70 at -7 dB SNR. Further research generalizing these results to broader sample sizes beyond these isolated test cases will be crucial; while outside the scope of this study, we also include results from a real-world deployment of AT-AT in the Appendix. |
2025-02-06 | From Bedside to Desktop: A Data Protocol for Normative Intracranial EEG and Abnormality Mapping | Heather Woodhouse, Sarah J. Gascoigne, Gerard Hall, Callum Simpson, Nathan Evans, Gabrielle M. Schroeder, Peter N. Taylor, Yujiang Wang | Link | Normative mapping is a framework used to map population-level features of health-related variables. It is widely used in neuroscience research, but the literature lacks established protocols in modalities that do not support healthy control measurements, such as intracranial EEG (icEEG). An icEEG normative map would allow researchers to learn about population-level brain activity and enable comparison of individual data against these norms to identify abnormalities. Currently, no standardised guide exists for transforming clinical data into a normative, regional icEEG map. Papers often cite different software and numerous articles to summarise the lengthy method, making it laborious for other researchers to understand or apply the process. Our protocol seeks to remedy this gap by providing a dataflow guide and key decision points that summarise existing methods. This protocol is used heavily in published works from our own lab (twelve peer-reviewed journal publications). Briefly, we take as input, icEEG recordings and neuroimaging data from people with epilepsy who are undergoing evaluation for resective surgery. As final outputs, we obtain a normative icEEG map, comprising signal properties localised to brain regions. Optionally, we can also process new subjects through the same pipeline and obtain their z-scores (or centiles) in each brain region, for abnormality detection and localisation. To date, a single, cohesive, dataflow pipeline for generating normative icEEG maps, along with abnormality mapping, has not been created. We envisage that this dataflow guide will not only increase understanding and application of normative mapping methods, but will also improve the consistency and quality of studies in the field. |
Publish Date | Title | Authors | URL | Abstract |
---|---|---|---|---|
2025-02-12 | Uncertainty Aware Human-machine Collaboration in Camouflaged Object Detection | Ziyue Yang, Kehan Wang, Yuhang Ming, Yong Peng, Han Yang, Qiong Chen, Wanzeng Kong | Link | Camouflaged Object Detection (COD), the task of identifying objects concealed within their environments, has seen rapid growth due to its wide range of practical applications. A key step toward developing trustworthy COD systems is the estimation and effective utilization of uncertainty. In this work, we propose a human-machine collaboration framework for classifying the presence of camouflaged objects, leveraging the complementary strengths of computer vision (CV) models and noninvasive brain-computer interfaces (BCIs). Our approach introduces a multiview backbone to estimate uncertainty in CV model predictions, utilizes this uncertainty during training to improve efficiency, and defers low-confidence cases to human evaluation via RSVP-based BCIs during testing for more reliable decision-making. We evaluated the framework in the CAMO dataset, achieving state-of-the-art results with an average improvement of 4.56\% in balanced accuracy (BA) and 3.66\% in the F1 score compared to existing methods. For the best-performing participants, the improvements reached 7.6\% in BA and 6.66\% in the F1 score. Analysis of the training process revealed a strong correlation between our confidence measures and precision, while an ablation study confirmed the effectiveness of the proposed training policy and the human-machine collaboration strategy. In general, this work reduces human cognitive load, improves system reliability, and provides a strong foundation for advancements in real-world COD applications and human-computer interaction. Our code and data are available at: https://github.com/ziyuey/Uncertainty-aware-human-machine-collaboration-in-camouflaged-object-identification. |
2025-02-07 | Geometric Machine Learning on EEG Signals | Benjamin J. Choi | Link | Brain-computer interfaces (BCIs) offer transformative potential, but decoding neural signals presents significant challenges. The core premise of this paper is built around demonstrating methods to elucidate the underlying low-dimensional geometric structure present in high-dimensional brainwave data in order to assist in downstream BCI-related neural classification tasks. We demonstrate two pipelines related to electroencephalography (EEG) signal processing: (1) a preliminary pipeline removing noise from individual EEG channels, and (2) a downstream manifold learning pipeline uncovering geometric structure across networks of EEG channels. We conduct preliminary validation using two EEG datasets and situate our demonstration in the context of the BCI-relevant imagined digit decoding problem. Our preliminary pipeline uses an attention-based EEG filtration network to extract clean signal from individual EEG channels. Our primary pipeline uses a fast Fourier transform, a Laplacian eigenmap, a discrete analog of Ricci flow via Ollivier's notion of Ricci curvature, and a graph convolutional network to perform dimensionality reduction on high-dimensional multi-channel EEG data in order to enable regularizable downstream classification. Our system achieves competitive performance with existing signal processing and classification benchmarks; we demonstrate a mean test correlation coefficient of >0.95 at 2 dB on semi-synthetic neural denoising and a downstream EEG-based classification accuracy of 0.97 on distinguishing digit- versus non-digit thoughts. Results are preliminary and our geometric machine learning pipeline should be validated by more extensive follow-up studies; generalizing these results to larger inter-subject sample sizes, different hardware systems, and broader use cases will be crucial. |
2025-02-06 | Transfer Learning for Covert Speech Classification Using EEG Hilbert Envelope and Temporal Fine Structure | Saravanakumar Duraisamy, Mateusz Dubiel, Maurice Rekrut, Luis A. Leiva | Link | Brain-Computer Interfaces (BCIs) can decode imagined speech from neural activity. However, these systems typically require extensive training sessions where participants imaginedly repeat words, leading to mental fatigue and difficulties identifying the onset of words, especially when imagining sequences of words. This paper addresses these challenges by transferring a classifier trained in overt speech data to covert speech classification. We used electroencephalogram (EEG) features derived from the Hilbert envelope and temporal fine structure, and used them to train a bidirectional long-short-term memory (BiLSTM) model for classification. Our method reduces the burden of extensive training and achieves state-of-the-art classification accuracy: 86.44% for overt speech and 79.82% for covert speech using the overt speech classifier. |
2025-02-07 | Decoding Human Attentive States from Spatial-temporal EEG Patches Using Transformers | Yi Ding, Joon Hei Lee, Shuailei Zhang, Tianze Luo, Cuntai Guan | Link | Learning the spatial topology of electroencephalogram (EEG) channels and their temporal dynamics is crucial for decoding attention states. This paper introduces EEG-PatchFormer, a transformer-based deep learning framework designed specifically for EEG attention classification in Brain-Computer Interface (BCI) applications. By integrating a Temporal CNN for frequency-based EEG feature extraction, a pointwise CNN for feature enhancement, and Spatial and Temporal Patching modules for organizing features into spatial-temporal patches, EEG-PatchFormer jointly learns spatial-temporal information from EEG data. Leveraging the global learning capabilities of the self-attention mechanism, it captures essential features across brain regions over time, thereby enhancing EEG data decoding performance. Demonstrating superior performance, EEG-PatchFormer surpasses existing benchmarks in accuracy, area under the ROC curve (AUC), and macro-F1 score on a public cognitive attention dataset. The code can be found via: https://github.com/yi-ding-cs/EEG-PatchFormer . |
2025-02-05 | Fine-Tuning Strategies for Continual Online EEG Motor Imagery Decoding: Insights from a Large-Scale Longitudinal Study | Martin Wimpff, Bruno Aristimunha, Sylvain Chevallier, Bin Yang | Link | This study investigates continual fine-tuning strategies for deep learning in online longitudinal electroencephalography (EEG) motor imagery (MI) decoding within a causal setting involving a large user group and multiple sessions per participant. We are the first to explore such strategies across a large user group, as longitudinal adaptation is typically studied in the single-subject setting with a single adaptation strategy, which limits the ability to generalize findings. First, we examine the impact of different fine-tuning approaches on decoder performance and stability. Building on this, we integrate online test-time adaptation (OTTA) to adapt the model during deployment, complementing the effects of prior fine-tuning. Our findings demonstrate that fine-tuning that successively builds on prior subject-specific information improves both performance and stability, while OTTA effectively adapts the model to evolving data distributions across consecutive sessions, enabling calibration-free operation. These results offer valuable insights and recommendations for future research in longitudinal online MI decoding and highlight the importance of combining domain adaptation strategies for improving BCI performance in real-world applications. Clinical Relevance: Our investigation enables more stable and efficient long-term motor imagery decoding, which is critical for neurorehabilitation and assistive technologies. |
2025-02-05 | Multimodal Brain-Computer Interfaces: AI-powered Decoding Methodologies | Siyang Li, Hongbin Wang, Xiaoqing Chen, Dongrui Wu | Link | Brain-computer interfaces (BCIs) enable direct communication between the brain and external devices. This review highlights the core decoding algorithms that enable multimodal BCIs, including a dissection of the elements, a unified view of diversified approaches, and a comprehensive analysis of the present state of the field. We emphasize algorithmic advancements in cross-modality mapping, sequential modeling, besides classic multi-modality fusion, illustrating how these novel AI approaches enhance decoding of brain data. The current literature of BCI applications on visual, speech, and affective decoding are comprehensively explored. Looking forward, we draw attention on the impact of emerging architectures like multimodal Transformers, and discuss challenges such as brain data heterogeneity and common errors. This review also serves as a bridge in this interdisciplinary field for experts with neuroscience background and experts that study AI, aiming to provide a comprehensive understanding for AI-powered multimodal BCIs. |
2025-01-30 | ISAM-MTL: Cross-subject multi-task learning model with identifiable spikes and associative memory networks | Junyan Li, Bin Hu, Zhi-Hong Guan | Link | Cross-subject variability in EEG degrades performance of current deep learning models, limiting the development of brain-computer interface (BCI). This paper proposes ISAM-MTL, which is a multi-task learning (MTL) EEG classification model based on identifiable spiking (IS) representations and associative memory (AM) networks. The proposed model treats EEG classification of each subject as an independent task and leverages cross-subject data training to facilitate feature sharing across subjects. ISAM-MTL consists of a spiking feature extractor that captures shared features across subjects and a subject-specific bidirectional associative memory network that is trained by Hebbian learning for efficient and fast within-subject EEG classification. ISAM-MTL integrates learned spiking neural representations with bidirectional associative memory for cross-subject EEG classification. The model employs label-guided variational inference to construct identifiable spike representations, enhancing classification accuracy. Experimental results on two BCI Competition datasets demonstrate that ISAM-MTL improves the average accuracy of cross-subject EEG classification while reducing performance variability among subjects. The model further exhibits the characteristics of few-shot learning and identifiable neural activity beneath EEG, enabling rapid and interpretable calibration for BCI systems. |
2025-01-29 | Neural Spelling: A Spell-Based BCI System for Language Neural Decoding | Xiaowei Jiang, Charles Zhou, Yiqun Duan, Ziyi Zhao, Thomas Do, Chin-Teng Lin | Link | Brain-computer interfaces (BCIs) present a promising avenue by translating neural activity directly into text, eliminating the need for physical actions. However, existing non-invasive BCI systems have not successfully covered the entire alphabet, limiting their practicality. In this paper, we propose a novel non-invasive EEG-based BCI system with Curriculum-based Neural Spelling Framework, which recognizes all 26 alphabet letters by decoding neural signals associated with handwriting first, and then apply a Generative AI (GenAI) to enhance spell-based neural language decoding tasks. Our approach combines the ease of handwriting with the accessibility of EEG technology, utilizing advanced neural decoding algorithms and pre-trained large language models (LLMs) to translate EEG patterns into text with high accuracy. This system show how GenAI can improve the performance of typical spelling-based neural language decoding task, and addresses the limitations of previous methods, offering a scalable and user-friendly solution for individuals with communication impairments, thereby enhancing inclusive communication options. |
2025-01-29 | EMD-Fuzzy: An Empirical Mode Decomposition Based Fuzzy Model for Cross-Stimulus Transfer Learning of SSVEP | Beining Cao, Xiaowei Jiang, Daniel Leong, Charlie Li-Ting Tsai, Yu-Cheng Chang, Thomas Do, Chin-Teng | Link | The Brain-Computer Interface (BCI) enables direct brain-to-device communication, with the Steady-State Visual Evoked Potential (SSVEP) paradigm favored for its stability and high accuracy across various fields. In SSVEP BCI systems, supervised learning models significantly enhance performance over unsupervised models, achieving higher accuracy in less time. However, prolonged data collection can cause user fatigue and even trigger photosensitive epilepsy, creating a negative user experience. Thus, reducing calibration time is crucial. To address this, Cross-Stimulus transfer learning (CSTL) can shorten calibration by utilizing only partial frequencies. Traditional CSTL methods, affected by time-domain impulse response variations, are suitable only for adjacent frequency transfers, limiting their general applicability. We introduce an Empirical Mode Decomposition (EMD) Based Fuzzy Model (EMD-Fuzzy), which employs EMD to extract crucial frequency information and achieves stimulus transfer in the frequency domain through Fast Fourier Transform (FFT) to mitigate time-domain differences. Combined with a Fuzzy Decoder that uses fuzzy logic for representation learning, our approach delivers promising preliminary results in offline tests and state-of-the-art performance. With only 4 frequencies, our method achieved an accuracy of 82.75% (16.30%) and an information transfer rate (ITR) of 186.56 (52.09) bits/min on the 40-target Benchmark dataset. In online tests, our method demonstrates robust efficacy, achieving an averaged accuracy of 86.30% (6.18%) across 7 subjects. This performance underscores the effectiveness of integrating EMD and fuzzy logic into EEG decoding for CSTL and highlights our method's potential in real-time applications where consistent and reliable decoding is crucial. |
2025-01-27 | SIM: Surface-based fMRI Analysis for Inter-Subject Multimodal Decoding from Movie-Watching Experiments | Simon Dahan, Gabriel Bénédict, Logan Z. J. Williams, Yourong Guo, Daniel Rueckert, Robert Leech, Emma C. Robinson | Link | Current AI frameworks for brain decoding and encoding, typically train and test models within the same datasets. This limits their utility for brain computer interfaces (BCI) or neurofeedback, for which it would be useful to pool experiences across individuals to better simulate stimuli not sampled during training. A key obstacle to model generalisation is the degree of variability of inter-subject cortical organisation, which makes it difficult to align or compare cortical signals across participants. In this paper we address this through the use of surface vision transformers, which build a generalisable model of cortical functional dynamics, through encoding the topography of cortical networks and their interactions as a moving image across a surface. This is then combined with tri-modal self-supervised contrastive (CLIP) alignment of audio, video, and fMRI modalities to enable the retrieval of visual and auditory stimuli from patterns of cortical activity (and vice-versa). We validate our approach on 7T task-fMRI data from 174 healthy participants engaged in the movie-watching experiment from the Human Connectome Project (HCP). Results show that it is possible to detect which movie clips an individual is watching purely from their brain activity, even for individuals and movies not seen during training. Further analysis of attention maps reveals that our model captures individual patterns of brain activity that reflect semantic and visual systems. This opens the door to future personalised simulations of brain function. Code & pre-trained models will be made available at https://github.com/metrics-lab/sim, processed data for training will be available upon request at https://gin.g-node.org/Sdahan30/sim. |
Publish Date | Title | Authors | URL | Abstract |
---|---|---|---|---|
2025-02-11 | From Brainwaves to Brain Scans: A Robust Neural Network for EEG-to-fMRI Synthesis | Kristofer Grover Roos, Quan Huu Cap, Atsushi Fukuda | Link | While functional magnetic resonance imaging (fMRI) offers rich spatial resolution, it is limited by high operational costs and significant infrastructural demands. In contrast, electroencephalography (EEG) provides millisecond-level precision in capturing electrical activity but lacks the spatial resolution necessary for precise neural localization. To bridge these gaps, we introduce E2fNet, a simple yet effective deep learning model for synthesizing fMRI images from low-cost EEG data. E2fNet is specifically designed to capture and translate meaningful features from EEG across electrode channels into accurate fMRI representations. Extensive evaluations across three datasets demonstrate that E2fNet consistently outperforms existing methods, achieving state-of-the-art results in terms of the structural similarity index measure (SSIM). Our findings suggest that E2fNet is a promising, cost-effective solution for enhancing neuroimaging capabilities. The code is available at https://github.com/kgr20/E2fNet. |
2025-02-10 | Direct Estimation of Pediatric Heart Rate Variability from BOLD-fMRI: A Machine Learning Approach Using Dynamic Connectivity | Abdoljalil Addeh, Karen Ardila, Rebecca J Williams, G. Bruce Pike, M. Ethan MacDonald | Link | In many pediatric fMRI studies, cardiac signals are often missing or of poor quality. A tool to extract Heart Rate Variation (HRV) waveforms directly from fMRI data, without the need for peripheral recording devices, would be highly beneficial. We developed a machine learning framework to accurately reconstruct HRV for pediatric applications. A hybrid model combining one-dimensional Convolutional Neural Networks (1D-CNN) and Gated Recurrent Units (GRU) analyzed BOLD signals from 628 ROIs, integrating past and future data. The model achieved an 8% improvement in HRV accuracy, as evidenced by enhanced performance metrics. This approach eliminates the need for peripheral photoplethysmography devices, reduces costs, and simplifies procedures in pediatric fMRI. Additionally, it improves the robustness of pediatric fMRI studies, which are more sensitive to physiological and developmental variations than those in adults. |
2025-02-09 | Topological Time Frequency Analysis of Functional Brain Signals | Moo K. Chung, Aaron F. Struck | Link | We present a novel topological framework for analyzing functional brain signals using time-frequency analysis. By integrating persistent homology with time-frequency representations, we capture multi-scale topological features that characterize the dynamic behavior of brain activity. This approach identifies 0D (connected components) and 1D (loops) topological structures in the signal's time-frequency domain, enabling robust extraction of features invariant to noise and temporal misalignments. The proposed method is demonstrated on resting-state functional magnetic resonance imaging (fMRI) data, showcasing its ability to discern critical topological patterns and provide insights into functional connectivity. This topological approach opens new avenues for analyzing complex brain signals, offering potential applications in neuroscience and clinical diagnostics. |
2025-02-08 | Multi-Site rs-fMRI Domain Alignment for Autism Spectrum Disorder Auxiliary Diagnosis Based on Hyperbolic Space | Yiqian Luo, Qiurong Chen, Yangsong Zhang | Link | In the medical field, most resting-state fMRI (rs-fMRI) data are collected from multiple hospital sites. Multi-site rs-fMRI data can increase the volume of training data, enabling auxiliary diagnostic algorithms for brain diseases to learn more accurate and stable models. However, due to the significant heterogeneity and domain shift in rs-fMRI data across different sites, the accuracy of auxiliary diagnosis remains unsatisfactory. Moreover, there has been limited exploration of multi-source domain adaptation algorithms, and the interpretability of models is often poor. To address these challenges, we proposed a domain-adaptive algorithm based on hyperbolic space embedding. Hyperbolic space is naturally suited for representing the topology of complex networks such as brain functional networks. Therefore, we embedded the brain functional network into hyperbolic space and constructed the corresponding hyperbolic space community network to effectively extract brain network representations. To address the heterogeneity of data across different sites and the issue of domain shift, we introduce a constraint loss function, HMMD (Hyperbolic Maximum Mean Discrepancy), to align the marginal distributions in the hyperbolic space. Additionally, we employ class prototype alignment to align the conditional distributions. This significantly improves the quality of brain representations and enhances diagnostic classification accuracy for Autism Spectrum Disorder (ASD). Experimental results demonstrated that the proposed algorithm is robust to multi-site heterogeneity and shows promising potential for brain network mechanism analysis. |
2025-02-07 | MindAligner: Explicit Brain Functional Alignment for Cross-Subject Visual Decoding from Limited fMRI Data | Yuqin Dai, Zhouheng Yao, Chunfeng Song, Qihao Zheng, Weijian Mai, Kunyu Peng, Shuai Lu, Wanli Ouyang, Jian Yang, Jiamin Wu | Link | Brain decoding aims to reconstruct visual perception of human subject from fMRI signals, which is crucial for understanding brain's perception mechanisms. Existing methods are confined to the single-subject paradigm due to substantial brain variability, which leads to weak generalization across individuals and incurs high training costs, exacerbated by limited availability of fMRI data. To address these challenges, we propose MindAligner, an explicit functional alignment framework for cross-subject brain decoding from limited fMRI data. The proposed MindAligner enjoys several merits. First, we learn a Brain Transfer Matrix (BTM) that projects the brain signals of an arbitrary new subject to one of the known subjects, enabling seamless use of pre-trained decoding models. Second, to facilitate reliable BTM learning, a Brain Functional Alignment module is proposed to perform soft cross-subject brain alignment under different visual stimuli with a multi-level brain alignment loss, uncovering fine-grained functional correspondences with high interpretability. Experiments indicate that MindAligner not only outperforms existing methods in visual decoding under data-limited conditions, but also provides valuable neuroscience insights in cross-subject functional analysis. The code will be made publicly available. |
2025-02-07 | A Foundational Brain Dynamics Model via Stochastic Optimal Control | Joonhyeong Park, Byoungwoo Park, Chang-Bae Bang, Jungwon Choi, Hyungjin Chung, Byung-Hoon Kim, Juho Lee | Link | We introduce a foundational model for brain dynamics that utilizes stochastic optimal control (SOC) and amortized inference. Our method features a continuous-discrete state space model (SSM) that can robustly handle the intricate and noisy nature of fMRI signals. To address computational limitations, we implement an approximation strategy grounded in the SOC framework. Additionally, we present a simulation-free latent dynamics approach that employs locally linear approximations, facilitating efficient and scalable inference. For effective representation learning, we derive an Evidence Lower Bound (ELBO) from the SOC formulation, which integrates smoothly with recent advancements in self-supervised learning (SSL), thereby promoting robust and transferable representations. Pre-trained on extensive datasets such as the UKB, our model attains state-of-the-art results across a variety of downstream tasks, including demographic prediction, trait analysis, disease diagnosis, and prognosis. Moreover, evaluating on external datasets such as HCP-A, ABIDE, and ADHD200 further validates its superior abilities and resilience across different demographic and clinical distributions. Our foundational model provides a scalable and efficient approach for deciphering brain dynamics, opening up numerous applications in neuroscience. |
2025-02-06 | Dark Brain Energy: Toward an Integrative Model of Spontaneous Slow Oscillations | ZhuQing Gong, XiNian Zuo | Link | Neural oscillations facilitate the functioning of the human brain in spatial and temporal dimensions at various frequencies. These oscillations feature a universal frequency architecture that is governed by brain anatomy, ensuring frequency specificity remains invariant across different measurement techniques. Initial magnetic resonance imaging (MRI) methodology constrained functional MRI (fMRI) investigations to a singular frequency range, thereby neglecting the frequency characteristics inherent in blood oxygen level-dependent oscillations. With advancements in MRI technology, it has become feasible to decode intricate brain activities via multi-band frequency analysis (MBFA). During the past decade, the utilization of MBFA in fMRI studies has surged, unveiling frequency-dependent characteristics of spontaneous slow oscillations (SSOs) believed to base dark energy in the brain. There remains a dearth of conclusive insights and hypotheses pertaining to the properties and functionalities of SSOs in distinct bands. We surveyed the SSO MBFA studies during the past 15 years to delineate the attributes of SSOs and enlighten their correlated functions. We further proposed a model to elucidate the hierarchical organization of multi-band SSOs by integrating their function, aimed at bridging theoretical gaps and guiding future MBFA research endeavors. |
2025-02-04 | scBIT: Integrating Single-cell Transcriptomic Data into fMRI-based Prediction for Alzheimer's Disease Diagnosis | Yu-An Huang, Yao Hu, Yue-Chao Li, Xiyue Cao, Xinyuan Li, Kay Chen Tan, Zhu-Hong You, Zhi-An Huang | Link | Functional MRI (fMRI) and single-cell transcriptomics are pivotal in Alzheimer's disease (AD) research, each providing unique insights into neural function and molecular mechanisms. However, integrating these complementary modalities remains largely unexplored. Here, we introduce scBIT, a novel method for enhancing AD prediction by combining fMRI with single-nucleus RNA (snRNA). scBIT leverages snRNA as an auxiliary modality, significantly improving fMRI-based prediction models and providing comprehensive interpretability. It employs a sampling strategy to segment snRNA data into cell-type-specific gene networks and utilizes a self-explainable graph neural network to extract critical subgraphs. Additionally, we use demographic and genetic similarities to pair snRNA and fMRI data across individuals, enabling robust cross-modal learning. Extensive experiments validate scBIT's effectiveness in revealing intricate brain region-gene associations and enhancing diagnostic prediction accuracy. By advancing brain imaging transcriptomics to the single-cell level, scBIT sheds new light on biomarker discovery in AD research. Experimental results show that incorporating snRNA data into the scBIT model significantly boosts accuracy, improving binary classification by 3.39% and five-class classification by 26.59%. The codes were implemented in Python and have been released on GitHub (https://github.com/77YQ77/scBIT) and Zenodo (https://zenodo.org/records/11599030) with detailed instructions. |
2025-02-03 | A Privacy-Preserving Domain Adversarial Federated learning for multi-site brain functional connectivity analysis | Yipu Zhang, Likai Wang, Kuan-Jui Su, Aiying Zhang, Hao Zhu, Xiaowen Liu, Hui Shen, Vince D. Calhoun, Yuping Wang, Hongwen Deng | Link | Resting-state functional magnetic resonance imaging (rs-fMRI) and its derived functional connectivity networks (FCNs) have become critical for understanding neurological disorders. However, collaborative analyses and the generalizability of models still face significant challenges due to privacy regulations and the non-IID (non-independent and identically distributed) property of multiple data sources. To mitigate these difficulties, we propose Domain Adversarial Federated Learning (DAFed), a novel federated deep learning framework specifically designed for non-IID fMRI data analysis in multi-site settings. DAFed addresses these challenges through feature disentanglement, decomposing the latent feature space into domain-invariant and domain-specific components, to ensure robust global learning while preserving local data specificity. Furthermore, adversarial training facilitates effective knowledge transfer between labeled and unlabeled datasets, while a contrastive learning module enhances the global representation of domain-invariant features. We evaluated DAFed on the diagnosis of ASD and further validated its generalizability in the classification of AD, demonstrating its superior classification accuracy compared to state-of-the-art methods. Additionally, an enhanced Score-CAM module identifies key brain regions and functional connectivity significantly associated with ASD and MCI, respectively, uncovering shared neurobiological patterns across sites. These findings highlight the potential of DAFed to advance multi-site collaborative research in neuroimaging while protecting data confidentiality. |
2025-02-01 | TROI: Cross-Subject Pretraining with Sparse Voxel Selection for Enhanced fMRI Visual Decoding | Ziyu Wang, Tengyu Pan, Zhenyu Li, Wu Ji, Li Xiuxing, Jianyong Wang | Link | fMRI (functional Magnetic Resonance Imaging) visual decoding involves decoding the original image from brain signals elicited by visual stimuli. This often relies on manually labeled ROIs (Regions of Interest) to select brain voxels. However, these ROIs can contain redundant information and noise, reducing decoding performance. Additionally, the lack of automated ROI labeling methods hinders the practical application of fMRI visual decoding technology, especially for new subjects. This work presents TROI (Trainable Region of Interest), a novel two-stage, data-driven ROI labeling method for cross-subject fMRI decoding tasks, particularly when subject samples are limited. TROI leverages labeled ROIs in the dataset to pretrain an image decoding backbone on a cross-subject dataset, enabling efficient optimization of the input layer for new subjects without retraining the entire model from scratch. In the first stage, we introduce a voxel selection method that combines sparse mask training and low-pass filtering to quickly generate the voxel mask and determine input layer dimensions. In the second stage, we apply a learning rate rewinding strategy to fine-tune the input layer for downstream tasks. Experimental results on the same small sample dataset as the baseline method for brain visual retrieval and reconstruction tasks show that our voxel selection method surpasses the state-of-the-art method MindEye2 with an annotated ROI mask. |
Publish Date | Title | Authors | URL | Abstract |
---|---|---|---|---|
2025-02-11 | From Thought to Action: How a Hierarchy of Neural Dynamics Supports Language Production | Mingfang, Zhang, Jarod Lévy, Stéphane d'Ascoli, Jérémy Rapin, F. -Xavier Alario, Pierre Bourdillon, Svetlana Pinet, Jean-Rémi King | Link | Humans effortlessly communicate their thoughts through intricate sequences of motor actions. Yet, the neural processes that coordinate language production remain largely unknown, in part because speech artifacts limit the use of neuroimaging. To elucidate the unfolding of language production in the brain, we investigate with magnetoencephalography (MEG) and electroencephalography (EEG) the neurophysiological activity of 35 skilled typists, while they typed sentences on a keyboard. This approach confirms the hierarchical predictions of linguistic theories: the neural activity preceding the production of each word is marked by the sequential rise and fall of context-, word-, syllable-, and letter-level representations. Remarkably, each of these neural representations is maintained over long time periods within each level of the language hierarchy. This phenomenon results in a superposition of successive representations that is supported by a hierarchy of dynamic neural codes. Overall, these findings provide a precise computational breakdown of the neural dynamics that coordinate the production of language in the human brain. |
2025-02-07 | Estimated Roadway Segment Traffic Data by Vehicle Class for the United States: A Machine Learning Approach | Brittany Antonczak, Meg Fay, Aviral Chawla, Gregory Rowangould | Link | The Highway Performance Monitoring System, managed by the Federal Highway Administration, provides essential data on average annual daily traffic across U.S. roadways, but it has limited representation of medium- and heavy-duty vehicles on non-interstate roads. This gap limits research and policy analysis on the impacts of truck traffic, especially concerning air quality and public health. To address this, we use random forest regression to estimate medium- and heavy-duty vehicle traffic volumes in areas with sparse data. This results in a more comprehensive dataset, which enables the estimation of traffic density at the census block level as a proxy for traffic-related air pollution exposure. Our high-resolution spatial data products, rigorously validated, provide a more accurate representation of truck traffic and its environmental and health impacts. These datasets are valuable for transportation planning, public health research, and policy decisions aimed at mitigating the effects of truck traffic on vulnerable communities exposed to air pollution. |
2025-02-07 | Shifting Attention to You: Personalized Brain-Inspired AI Models | Stephen Chong Zhao, Yang Hu, Jason Lee, Andrew Bender, Trisha Mazumdar, Mark Wallace, David A. Tovar | Link | The integration of human and artificial intelligence represents a scientific opportunity to advance our understanding of information processing, as each system offers unique computational insights that can enhance and inform the other. The synthesis of human cognitive principles with artificial intelligence has the potential to produce more interpretable and functionally aligned computational models, while simultaneously providing a formal framework for investigating the neural mechanisms underlying perception, learning, and decision-making through systematic model comparisons and representational analyses. In this study, we introduce personalized brain-inspired modeling that integrates human behavioral embeddings and neural data to align with cognitive processes. We took a stepwise approach, fine-tuning the Contrastive Language-Image Pre-training (CLIP) model with large-scale behavioral decisions, group-level neural data, and finally, participant-level neural data within a broader framework that we have named CLIP-Human-Based Analysis (CLIP-HBA). We found that fine-tuning on behavioral data enhances its ability to predict human similarity judgments while indirectly aligning it with dynamic representations captured via MEG. To further gain mechanistic insights into the temporal evolution of cognitive processes, we introduced a model specifically fine-tuned on millisecond-level MEG neural dynamics (CLIP-HBA-MEG). This model resulted in enhanced temporal alignment with human neural processing while still showing improvement on behavioral alignment. Finally, we trained individualized models on participant-specific neural data, effectively capturing individualized neural dynamics and highlighting the potential for personalized AI systems. These personalized systems have far-reaching implications for the fields of medicine, cognitive research, human-computer interfaces, and AI development. |
2025-02-06 | Detecting Mild Traumatic Brain Injury with MEG Scan Data: One-vs-K-Sample Tests | Jian Zhang, Gary Green | Link | Magnetoencephalography (MEG) scanner has been shown to be more accurate than other medical devices in detecting mild traumatic brain injury (mTBI). However, MEG scan data in certain spectrum ranges can be skewed, multimodal and heterogeneous which can mislead the conventional case-control analysis that requires the data to be homogeneous and normally distributed within the control group. To meet this challenge, we propose a flexible one-vs-K-sample testing procedure for detecting brain injury for a single-case versus heterogeneous controls. The new procedure begins with source magnitude imaging using MEG scan data in frequency domain, followed by region-wise contrast tests for abnormality between the case and controls. The critical values for these tests are automatically determined by cross-validation. We adjust the testing results for heterogeneity effects by similarity analysis. An asymptotic theory is established for the proposed test statistic. By simulated and real data analyses in the context of neurotrauma, we show that the proposed test outperforms commonly used nonparametric methods in terms of overall accuracy and ability in accommodating data non-normality and subject-heterogeneity. |
2025-01-31 | Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming | Mrinank Sharma, Meg Tong, Jesse Mu, Jerry Wei, Jorrit Kruthoff, Scott Goodfriend, Euan Ong, Alwin Peng, Raj Agarwal, Cem Anil, Amanda Askell, Nathan Bailey, Joe Benton, Emma Bluemke, Samuel R. Bowman, Eric Christiansen, Hoagy Cunningham, Andy Dau, Anjali Gopal, Rob Gilson, Logan Graham, Logan Howard, Nimit Kalra, Taesung Lee, Kevin Lin, Peter Lofgren, Francesco Mosconi, Clare O'Hara, Catherine Olsson, Linda Petrini, Samir Rajani, Nikhil Saxena, Alex Silverstein, Tanya Singh, Theodore Sumers, Leonard Tang, Kevin K. Troy, Constantin Weisser, Ruiqi Zhong, Giulio Zhou, Jan Leike, Jared Kaplan, Ethan Perez | Link | Large language models (LLMs) are vulnerable to universal jailbreaks-prompting strategies that systematically bypass model safeguards and enable users to carry out harmful processes that require many model interactions, like manufacturing illegal substances at scale. To defend against these attacks, we introduce Constitutional Classifiers: safeguards trained on synthetic data, generated by prompting LLMs with natural language rules (i.e., a constitution) specifying permitted and restricted content. In over 3,000 estimated hours of red teaming, no red teamer found a universal jailbreak that could extract information from an early classifier-guarded LLM at a similar level of detail to an unguarded model across most target queries. On automated evaluations, enhanced classifiers demonstrated robust defense against held-out domain-specific jailbreaks. These classifiers also maintain deployment viability, with an absolute 0.38% increase in production-traffic refusals and a 23.7% inference overhead. Our work demonstrates that defending against universal jailbreaks while maintaining practical deployment viability is tractable. |
2025-01-28 | "Ownership, Not Just Happy Talk": Co-Designing a Participatory Large Language Model for Journalism | Emily Tseng, Meg Young, Marianne Aubin Le Quéré, Aimee Rinehart, Harini Suresh | Link | Journalism has emerged as an essential domain for understanding the uses, limitations, and impacts of large language models (LLMs) in the workplace. News organizations face divergent financial incentives: LLMs already permeate newswork processes within financially constrained organizations, even as ongoing legal challenges assert that AI companies violate their copyright. At stake are key questions about what LLMs are created to do, and by whom: How might a journalist-led LLM work, and what can participatory design illuminate about the present-day challenges about adapting ``one-size-fits-all'' foundation models to a given context of use? In this paper, we undertake a co-design exploration to understand how a participatory approach to LLMs might address opportunities and challenges around AI in journalism. Our 20 interviews with reporters, data journalists, editors, labor organizers, product leads, and executives highlight macro, meso, and micro tensions that designing for this opportunity space must address. From these desiderata, we describe the result of our co-design work: organizational structures and functionality for a journalist-controlled LLM. In closing, we discuss the limitations of commercial foundation models for workplace use, and the methodological implications of applying participatory methods to LLM co-design. |
2025-01-26 | The Advanced Muon Facility: a proposed multi-purpose muon facility at Fermilab | Sophie Middleton | Link | Charged lepton flavor violation (CLFV) is expected in a diverse set of new physics scenarios. The current generation of experiments probe CLFV in the muon sector in three complementary channels: |
2025-01-28 | Scaling laws for decoding images from brain activity | Hubert Banville, Yohann Benchetrit, Stéphane d'Ascoli, Jérémy Rapin, Jean-Rémi King | Link | Generative AI has recently propelled the decoding of images from brain activity. How do these approaches scale with the amount and type of neural recordings? Here, we systematically compare image decoding from four types of non-invasive devices: electroencephalography (EEG), magnetoencephalography (MEG), high-field functional Magnetic Resonance Imaging (3T fMRI) and ultra-high field (7T) fMRI. For this, we evaluate decoding models on the largest benchmark to date, encompassing 8 public datasets, 84 volunteers, 498 hours of brain recording and 2.3 million brain responses to natural images. Unlike previous work, we focus on single-trial decoding performance to simulate real-time settings. This systematic comparison reveals three main findings. First, the most precise neuroimaging devices tend to yield the best decoding performances, when the size of the training sets are similar. However, the gain enabled by deep learning - in comparison to linear models - is obtained with the noisiest devices. Second, we do not observe any plateau of decoding performance as the amount of training data increases. Rather, decoding performance scales log-linearly with the amount of brain recording. Third, this scaling law primarily depends on the amount of data per subject. However, little decoding gain is observed by increasing the number of subjects. Overall, these findings delineate the path most suitable to scale the decoding of images from non-invasive brain recordings. |
2025-01-21 | Probing Type II Seesaw Leptogenesis Through Lepton Flavor Violation | Chengcheng Han, Yijun Han, Sihui Huang, Zhanhong Lei | Link | Lepton flavor violation (LFV) offers a powerful probe of physics beyond the Standard Model, particularly in models addressing neutrino masses and the baryon asymmetry of the universe. In this study, we investigate LFV processes within the framework of type II seesaw leptogenesis, where the Standard Model is extended by an |
2025-01-20 | Artificial Neural Networks for Magnetoencephalography: A review of an emerging field | Arthur Dehgan, Hamza Abdelhedi, Vanessa Hadid, Irina Rish, Karim Jerbi | Link | Magnetoencephalography (MEG) is a cutting-edge neuroimaging technique that measures the intricate brain dynamics underlying cognitive processes with an unparalleled combination of high temporal and spatial precision. MEG data analytics has always relied on advanced signal processing and mathematical and statistical tools for various tasks ranging from data cleaning to probing the signals' rich dynamics and estimating the neural sources underlying the surface-level recordings. Like in most domains, the surge in Artificial Intelligence (AI) has led to the increased use of Machine Learning (ML) methods for MEG data classification. More recently, an emerging trend in this field is using Artificial Neural Networks (ANNs) to address many MEG-related tasks. This review provides a comprehensive overview of how ANNs are being used with MEG data from three vantage points: First, we review work that employs ANNs for MEG signal classification, i.e., for brain decoding. Second, we report on work that has used ANNs as putative models of information processing in the human brain. Finally, we examine studies that use ANNs as techniques to tackle methodological questions in MEG, including artifact correction and source estimation. Furthermore, we assess the current strengths and limitations of using ANNs with MEG and discuss future challenges and opportunities in this field. Finally, by establishing a detailed portrait of the field and providing practical recommendations for the future, this review seeks to provide a helpful reference for both seasoned MEG researchers and newcomers to the field who are interested in using ANNs to enhance the exploration of the complex dynamics of the human brain with MEG. |
Publish Date | Title | Authors | URL | Abstract |
---|---|---|---|---|
2025-01-04 | Asynchronous Hebbian/anti-Hebbian networks | Henrique Reis Aguiar, Matthias H. Hennig | Link | Lateral inhibition models coupled with Hebbian plasticity have been shown to learn factorised causal representations of input stimuli, for instance, oriented edges are learned from natural images. Currently, these models require the recurrent dynamics to settle into a stable state before weight changes can be applied, which is not only biologically implausible, but also impractical for real-time learning systems. Here, we propose a new Hebbian learning rule which is implemented using plausible biological mechanisms that have been observed experimentally. We find that this rule allows for efficient, time-continuous learning of factorised representations, very similar to the classic noncontinuous Hebbian/anti-Hebbian learning. Furthermore, we show that this rule naturally prevents catastrophic forgetting when stimuli from different distributions are shown sequentially. |
2024-11-27 | NeuroAI for AI Safety | Patrick Mineault, Niccolò Zanichelli, Joanne Zichen Peng, Anton Arkhipov, Eli Bingham, Julian Jara-Ettinger, Emily Mackevicius, Adam Marblestone, Marcelo Mattar, Andrew Payne, Sophia Sanborn, Karen Schroeder, Zenna Tavares, Andreas Tolias | Link | As AI systems become increasingly powerful, the need for safe AI has become more pressing. Humans are an attractive model for AI safety: as the only known agents capable of general intelligence, they perform robustly even under conditions that deviate significantly from prior experiences, explore the world safely, understand pragmatics, and can cooperate to meet their intrinsic goals. Intelligence, when coupled with cooperation and safety mechanisms, can drive sustained progress and well-being. These properties are a function of the architecture of the brain and the learning algorithms it implements. Neuroscience may thus hold important keys to technical AI safety that are currently underexplored and underutilized. In this roadmap, we highlight and critically evaluate several paths toward AI safety inspired by neuroscience: emulating the brain's representations, information processing, and architecture; building robust sensory and motor systems from imitating brain data and bodies; fine-tuning AI systems on brain data; advancing interpretability using neuroscience methods; and scaling up cognitively-inspired architectures. We make several concrete recommendations for how neuroscience can positively impact AI safety. |
2024-11-21 | Evaluating Representational Similarity Measures from the Lens of Functional Correspondence | Yiqing Bo, Ansh Soni, Sudhanshu Srivastava, Meenakshi Khosla | Link | Neuroscience and artificial intelligence (AI) both face the challenge of interpreting high-dimensional neural data, where the comparative analysis of such data is crucial for revealing shared mechanisms and differences between these complex systems. Despite the widespread use of representational comparisons and the abundance classes of comparison methods, a critical question remains: which metrics are most suitable for these comparisons? While some studies evaluate metrics based on their ability to differentiate models of different origins or constructions (e.g., various architectures), another approach is to assess how well they distinguish models that exhibit distinct behaviors. To investigate this, we examine the degree of alignment between various representational similarity measures and behavioral outcomes, employing group statistics and a comprehensive suite of behavioral metrics for comparison. In our evaluation of eight commonly used representational similarity metrics in the visual domain -- spanning alignment-based, Canonical Correlation Analysis (CCA)-based, inner product kernel-based, and nearest-neighbor methods -- we found that metrics like linear Centered Kernel Alignment (CKA) and Procrustes distance, which emphasize the overall geometric structure or shape of representations, excelled in differentiating trained from untrained models and aligning with behavioral measures, whereas metrics such as linear predictivity, commonly used in neuroscience, demonstrated only moderate alignment with behavior. These insights are crucial for selecting metrics that emphasize behaviorally meaningful comparisons in NeuroAI research. |
2024-10-25 | A prescriptive theory for brain-like inference | Hadi Vafaii, Dekel Galor, Jacob L. Yates | Link | The Evidence Lower Bound (ELBO) is a widely used objective for training deep generative models, such as Variational Autoencoders (VAEs). In the neuroscience literature, an identical objective is known as the variational free energy, hinting at a potential unified framework for brain function and machine learning. Despite its utility in interpreting generative models, including diffusion models, ELBO maximization is often seen as too broad to offer prescriptive guidance for specific architectures in neuroscience or machine learning. In this work, we show that maximizing ELBO under Poisson assumptions for general sequence data leads to a spiking neural network that performs Bayesian posterior inference through its membrane potential dynamics. The resulting model, the iterative Poisson VAE (iP-VAE), has a closer connection to biological neurons than previous brain-inspired predictive coding models based on Gaussian assumptions. Compared to amortized and iterative VAEs, iP-VAElearns sparser representations and exhibits superior generalization to out-of-distribution samples. These findings suggest that optimizing ELBO, combined with Poisson assumptions, provides a solid foundation for developing prescriptive theories in NeuroAI. |
2024-09-09 | Evidence from fMRI Supports a Two-Phase Abstraction Process in Language Models | Emily Cheng, Richard J. Antonello | Link | Research has repeatedly demonstrated that intermediate hidden states extracted from large language models are able to predict measured brain response to natural language stimuli. Yet, very little is known about the representation properties that enable this high prediction performance. Why is it the intermediate layers, and not the output layers, that are most capable for this unique and highly general transfer task? In this work, we show that evidence from language encoding models in fMRI supports the existence of a two-phase abstraction process within LLMs. We use manifold learning methods to show that this abstraction process naturally arises over the course of training a language model and that the first "composition" phase of this abstraction process is compressed into fewer layers as training continues. Finally, we demonstrate a strong correspondence between layerwise encoding performance and the intrinsic dimensionality of representations from LLMs. We give initial evidence that this correspondence primarily derives from the inherent compositionality of LLMs and not their next-word prediction properties. |
2024-07-22 | Predictive Coding Networks and Inference Learning: Tutorial and Survey | Björn van Zwol, Ro Jefferson, Egon L. van den Broek | Link | Recent years have witnessed a growing call for renewed emphasis on neuroscience-inspired approaches in artificial intelligence research, under the banner of NeuroAI. A prime example of this is predictive coding networks (PCNs), based on the neuroscientific framework of predictive coding. This framework views the brain as a hierarchical Bayesian inference model that minimizes prediction errors through feedback connections. Unlike traditional neural networks trained with backpropagation (BP), PCNs utilize inference learning (IL), a more biologically plausible algorithm that explains patterns of neural activity that BP cannot. Historically, IL has been more computationally intensive, but recent advancements have demonstrated that it can achieve higher efficiency than BP with sufficient parallelization. Furthermore, PCNs can be mathematically considered a superset of traditional feedforward neural networks (FNNs), significantly extending the range of trainable architectures. As inherently probabilistic (graphical) latent variable models, PCNs provide a versatile framework for both supervised learning and unsupervised (generative) modeling that goes beyond traditional artificial neural networks. This work provides a comprehensive review and detailed formal specification of PCNs, particularly situating them within the context of modern ML methods. Additionally, we introduce a Python library (PRECO) for practical implementation. This positions PC as a promising framework for future ML innovations. |
2023-10-29 | Beyond Geometry: Comparing the Temporal Structure of Computation in Neural Circuits with Dynamical Similarity Analysis | Mitchell Ostrow, Adam Eisen, Leo Kozachkov, Ila Fiete | Link | How can we tell whether two neural networks utilize the same internal processes for a particular computation? This question is pertinent for multiple subfields of neuroscience and machine learning, including neuroAI, mechanistic interpretability, and brain-machine interfaces. Standard approaches for comparing neural networks focus on the spatial geometry of latent states. Yet in recurrent networks, computations are implemented at the level of dynamics, and two networks performing the same computation with equivalent dynamics need not exhibit the same geometry. To bridge this gap, we introduce a novel similarity metric that compares two systems at the level of their dynamics, called Dynamical Similarity Analysis (DSA). Our method incorporates two components: Using recent advances in data-driven dynamical systems theory, we learn a high-dimensional linear system that accurately captures core features of the original nonlinear dynamics. Next, we compare different systems passed through this embedding using a novel extension of Procrustes Analysis that accounts for how vector fields change under orthogonal transformation. In four case studies, we demonstrate that our method disentangles conjugate and non-conjugate recurrent neural networks (RNNs), while geometric methods fall short. We additionally show that our method can distinguish learning rules in an unsupervised manner. Our method opens the door to comparative analyses of the essential temporal structure of computation in neural circuits. |
2023-05-25 | Explaining V1 Properties with a Biologically Constrained Deep Learning Architecture | Galen Pogoncheff, Jacob Granley, Michael Beyeler | Link | Convolutional neural networks (CNNs) have recently emerged as promising models of the ventral visual stream, despite their lack of biological specificity. While current state-of-the-art models of the primary visual cortex (V1) have surfaced from training with adversarial examples and extensively augmented data, these models are still unable to explain key neural properties observed in V1 that arise from biological circuitry. To address this gap, we systematically incorporated neuroscience-derived architectural components into CNNs to identify a set of mechanisms and architectures that comprehensively explain neural activity in V1. We show drastic improvements in model-V1 alignment driven by the integration of architectural components that simulate center-surround antagonism, local receptive fields, tuned normalization, and cortical magnification. Upon enhancing task-driven CNNs with a collection of these specialized components, we uncover models with latent representations that yield state-of-the-art explanation of V1 neural activity and tuning properties. Our results highlight an important advancement in the field of NeuroAI, as we systematically establish a set of architectural components that contribute to unprecedented explanation of V1. The neuroscience insights that could be gleaned from increasingly accurate in-silico models of the brain have the potential to greatly advance the fields of both neuroscience and artificial intelligence. |
2024-11-09 | A Deep Probabilistic Spatiotemporal Framework for Dynamic Graph Representation Learning with Application to Brain Disorder Identification | Sin-Yee Yap, Junn Yong Loo, Chee-Ming Ting, Fuad Noman, Raphael C. -W. Phan, Adeel Razi, David L. Dowe | Link | Recent applications of pattern recognition techniques on brain connectome classification using functional connectivity (FC) are shifting towards acknowledging the non-Euclidean topology and dynamic aspects of brain connectivity across time. In this paper, a deep spatiotemporal variational Bayes (DSVB) framework is proposed to learn time-varying topological structures in dynamic FC networks for identifying autism spectrum disorder (ASD) in human participants. The framework incorporates a spatial-aware recurrent neural network with an attention-based message passing scheme to capture rich spatiotemporal patterns across dynamic FC networks. To overcome model overfitting on limited training datasets, an adversarial training strategy is introduced to learn graph embedding models that generalize well to unseen brain networks. Evaluation on the ABIDE resting-state functional magnetic resonance imaging dataset shows that our proposed framework substantially outperforms state-of-the-art methods in identifying patients with ASD. Dynamic FC analyses with DSVB-learned embeddings reveal apparent group differences between ASD and healthy controls in brain network connectivity patterns and switching dynamics of brain states. The code is available at https://github.com/Monash-NeuroAI/Deep-Spatiotemporal-Variational-Bayes. |
2023-03-11 | Towards NeuroAI: Introducing Neuronal Diversity into Artificial Neural Networks | Feng-Lei Fan, Yingxin Li, Hanchuan Peng, Tieyong Zeng, Fei Wang | Link | Throughout history, the development of artificial intelligence, particularly artificial neural networks, has been open to and constantly inspired by the increasingly deepened understanding of the brain, such as the inspiration of neocognitron, which is the pioneering work of convolutional neural networks. Per the motives of the emerging field: NeuroAI, a great amount of neuroscience knowledge can help catalyze the next generation of AI by endowing a network with more powerful capabilities. As we know, the human brain has numerous morphologically and functionally different neurons, while artificial neural networks are almost exclusively built on a single neuron type. In the human brain, neuronal diversity is an enabling factor for all kinds of biological intelligent behaviors. Since an artificial network is a miniature of the human brain, introducing neuronal diversity should be valuable in terms of addressing those essential problems of artificial networks such as efficiency, interpretability, and memory. In this Primer, we first discuss the preliminaries of biological neuronal diversity and the characteristics of information transmission and processing in a biological neuron. Then, we review studies of designing new neurons for artificial networks. Next, we discuss what gains can neuronal diversity bring into artificial networks and exemplary applications in several important fields. Lastly, we discuss the challenges and future directions of neuronal diversity to explore the potential of NeuroAI. |
Publish Date | Title | Authors | URL | Abstract |
---|---|---|---|---|
2025-02-12 | Ultrasound Image Generation using Latent Diffusion Models | Benoit Freiche, Anthony El-Khoury, Ali Nasiri-Sarvi, Mahdi S. Hosseini, Damien Garcia, Adrian Basarab, Mathieu Boily, Hassan Rivaz | Link | Diffusion models for image generation have been a subject of increasing interest due to their ability to generate diverse, high-quality images. Image generation has immense potential in medical imaging because open-source medical images are difficult to obtain compared to natural images, especially for rare conditions. The generated images can be used later to train classification and segmentation models. In this paper, we propose simulating realistic ultrasound (US) images by successive fine-tuning of large diffusion models on different publicly available databases. To do so, we fine-tuned Stable Diffusion, a state-of-the-art latent diffusion model, on BUSI (Breast US Images) an ultrasound breast image dataset. We successfully generated high-quality US images of the breast using simple prompts that specify the organ and pathology, which appeared realistic to three experienced US scientists and a US radiologist. Additionally, we provided user control by conditioning the model with segmentations through ControlNet. We will release the source code at http://code.sonography.ai/ to allow fast US image generation to the scientific community. |
2025-02-12 | Brain Latent Progression: Individual-based Spatiotemporal Disease Progression on 3D Brain MRIs via Latent Diffusion | Lemuel Puglisi, Daniel C. Alexander, Daniele Ravì | Link | The growing availability of longitudinal Magnetic Resonance Imaging (MRI) datasets has facilitated Artificial Intelligence (AI)-driven modeling of disease progression, making it possible to predict future medical scans for individual patients. However, despite significant advancements in AI, current methods continue to face challenges including achieving patient-specific individualization, ensuring spatiotemporal consistency, efficiently utilizing longitudinal data, and managing the substantial memory demands of 3D scans. To address these challenges, we propose Brain Latent Progression (BrLP), a novel spatiotemporal model designed to predict individual-level disease progression in 3D brain MRIs. The key contributions in BrLP are fourfold: (i) it operates in a small latent space, mitigating the computational challenges posed by high-dimensional imaging data; (ii) it explicitly integrates subject metadata to enhance the individualization of predictions; (iii) it incorporates prior knowledge of disease dynamics through an auxiliary model, facilitating the integration of longitudinal data; and (iv) it introduces the Latent Average Stabilization (LAS) algorithm, which (a) enforces spatiotemporal consistency in the predicted progression at inference time and (b) allows us to derive a measure of the uncertainty for the prediction. We train and evaluate BrLP on 11,730 T1-weighted (T1w) brain MRIs from 2,805 subjects and validate its generalizability on an external test set comprising 2,257 MRIs from 962 subjects. Our experiments compare BrLP-generated MRI scans with real follow-up MRIs, demonstrating state-of-the-art accuracy compared to existing methods. The code is publicly available at: https://github.com/LemuelPuglisi/BrLP. |
2025-02-12 | Representation Learning to Advance Multi-institutional Studies with Electronic Health Record Data | Doudou Zhou, Han Tong, Linshanshan Wang, Suqi Liu, Xin Xiong, Ziming Gan, Romain Griffier, Boris Hejblum, Yun-Chung Liu, Chuan Hong, Clara-Lea Bonzel, Tianrun Cai, Kevin Pan, Yuk-Lam Ho, Lauren Costa, Vidul A. Panickan, J. Michael Gaziano, Kenneth Mandl, Vianney Jouhet, Rodolphe Thiebaut, Zongqi Xia, Kelly Cho, Katherine Liao, Tianxi Cai | Link | The adoption of EHRs has expanded opportunities to leverage data-driven algorithms in clinical care and research. A major bottleneck in effectively conducting multi-institutional EHR studies is the data heterogeneity across systems with numerous codes that either do not exist or represent different clinical concepts across institutions. The need for data privacy further limits the feasibility of including multi-institutional patient-level data required to study similarities and differences across patient subgroups. To address these challenges, we developed the GAME algorithm. Tested and validated across 7 institutions and 2 languages, GAME integrates data in several levels: (1) at the institutional level with knowledge graphs to establish relationships between codes and existing knowledge sources, providing the medical context for standard codes and their relationship to each other; (2) between institutions, leveraging language models to determine the relationships between institution-specific codes with established standard codes; and (3) quantifying the strength of the relationships between codes using a graph attention network. Jointly trained embeddings are created using transfer and federated learning to preserve data privacy. In this study, we demonstrate the applicability of GAME in selecting relevant features as inputs for AI-driven algorithms in a range of conditions, e.g., heart failure, rheumatoid arthritis. We then highlight the application of GAME harmonized multi-institutional EHR data in a study of Alzheimer's disease outcomes and suicide risk among patients with mental health disorders, without sharing patient-level data outside individual institutions. |
2025-02-12 | Impact of Electric Spatially Discordant Alternans on Cardiac Magnetic Field | Martina Nicoletti, Anna Crispino, Alessandro Loppini, Alessio Gizzi, Letizia Chiodo, Christian Cherubini, Simonetta Filippi | Link | Spatially discordant alternans (SDA) play a crucial role in cardiac arrhythmogenesis by creating steep repolarization gradients facilitating conduction block and reentry. While traditionally studied using electrical indicators, this work provides a novel perspective by characterizing SDA through their magnetic field signatures. Using a one-dimensional cardiac fiber model, we demonstrate that magnetic field measurements effectively detect SDA and temperature dependent changes in cardiac action potentials, offering a non-invasive alternative to conventional electrophysiological metrics. Our results reveal that the spatial organization of SDA is mirrored in the magnetic field distribution, with SDA nodes clearly identifiable via spatial mapping. Notably, magnetic restitution curves exhibit a distinct pattern from APD-based indicators, closely following the dynamics of the action potential upstroke. These findings establish the cardiac magnetic field as a powerful diagnostic tool for detecting SDA, opening new avenues for biomagnetic monitoring of arrhythmic risk. |
2025-02-12 |
|
Yining Jiao, Sreekalyani Bhamidi, Huaizhi Qu, Carlton Zdanski, Julia Kimbell, Andrew Prince, Cameron Worden, Samuel Kirse, Christopher Rutter, Benjamin Shields, William Dunn, Jisan Mahmud, Tianlong Chen, Marc Niethammer | Link | The goal of this work is to develop principled techniques to extract information from high dimensional data sets with complex dependencies in areas such as medicine that can provide insight into individual as well as population level variation. We develop |
2025-02-12 | Hi-End-MAE: Hierarchical encoder-driven masked autoencoders are stronger vision learners for medical image segmentation | Fenghe Tang, Qingsong Yao, Wenxin Ma, Chenxu Wu, Zihang Jiang, S. Kevin Zhou | Link | Medical image segmentation remains a formidable challenge due to the label scarcity. Pre-training Vision Transformer (ViT) through masked image modeling (MIM) on large-scale unlabeled medical datasets presents a promising solution, providing both computational efficiency and model generalization for various downstream tasks. However, current ViT-based MIM pre-training frameworks predominantly emphasize local aggregation representations in output layers and fail to exploit the rich representations across different ViT layers that better capture fine-grained semantic information needed for more precise medical downstream tasks. To fill the above gap, we hereby present Hierarchical Encoder-driven MAE (Hi-End-MAE), a simple yet effective ViT-based pre-training solution, which centers on two key innovations: (1) Encoder-driven reconstruction, which encourages the encoder to learn more informative features to guide the reconstruction of masked patches; and (2) Hierarchical dense decoding, which implements a hierarchical decoding structure to capture rich representations across different layers. We pre-train Hi-End-MAE on a large-scale dataset of 10K CT scans and evaluated its performance across seven public medical image segmentation benchmarks. Extensive experiments demonstrate that Hi-End-MAE achieves superior transfer learning capabilities across various downstream tasks, revealing the potential of ViT in medical imaging applications. The code is available at: https://github.com/FengheTan9/Hi-End-MAE |
2025-02-12 | Electron Fourier ptychography for phase reconstruction | Jingjing Zhao, Chen Huang, Ali Mostaed, Amirafshar Moshtaghpour, James M. Parkhurst, Ivan Lobato, Marcus Gallagher-Jones, Judy S. Kim, Mark Boyce, David Stuart, Elena A. Andreeva, Jacques-Philippe Colletier, Angus I. Kirkland | Link | Phase reconstruction is important in transmission electron microscopy for structural studies. We describe electron Fourier ptychography and its application to phase reconstruction of both radiation-resistant and beam-sensitive materials. We demonstrate that the phase of the exit wave can be reconstructed at high resolution using a modified iterative phase retrieval algorithm with data collected using an alternative optical geometry. This method achieves a spatial resolution of 0.63 nm at a fluence of |
2025-02-12 | Screener: Self-supervised Pathology Segmentation Model for 3D Medical Images | Mikhail Goncharov, Eugenia Soboleva, Mariia Donskova, Ivan Oseledets, Marina Munkhoeva, Maxim Panov | Link | Accurate segmentation of all pathological findings in 3D medical images remains a significant challenge, as supervised models are limited to detecting only the few pathology classes annotated in existing datasets. To address this, we frame pathology segmentation as an unsupervised visual anomaly segmentation (UVAS) problem, leveraging the inherent rarity of pathological patterns compared to healthy ones. We enhance the existing density-based UVAS framework with two key innovations: (1) dense self-supervised learning (SSL) for feature extraction, eliminating the need for supervised pre-training, and (2) learned, masking-invariant dense features as conditioning variables, replacing hand-crafted positional encodings. Trained on over 30,000 unlabeled 3D CT volumes, our model, Screener, outperforms existing UVAS methods on four large-scale test datasets comprising 1,820 scans with diverse pathologies. Code and pre-trained models will be made publicly available. |
2025-02-12 | ActiveSSF: An Active-Learning-Guided Self-Supervised Framework for Long-Tailed Megakaryocyte Classification | Linghao Zhuang, Ying Zhang, Gege Yuan, Xingyue Zhao, Zhiping Jiang | Link | Precise classification of megakaryocytes is crucial for diagnosing myelodysplastic syndromes. Although self-supervised learning has shown promise in medical image analysis, its application to classifying megakaryocytes in stained slides faces three main challenges: (1) pervasive background noise that obscures cellular details, (2) a long-tailed distribution that limits data for rare subtypes, and (3) complex morphological variations leading to high intra-class variability. To address these issues, we propose the ActiveSSF framework, which integrates active learning with self-supervised pretraining. Specifically, our approach employs Gaussian filtering combined with K-means clustering and HSV analysis (augmented by clinical prior knowledge) for accurate region-of-interest extraction; an adaptive sample selection mechanism that dynamically adjusts similarity thresholds to mitigate class imbalance; and prototype clustering on labeled samples to overcome morphological complexity. Experimental results on clinical megakaryocyte datasets demonstrate that ActiveSSF not only achieves state-of-the-art performance but also significantly improves recognition accuracy for rare subtypes. Moreover, the integration of these advanced techniques further underscores the practical potential of ActiveSSF in clinical settings. To foster further research, the code and datasets will be publicly released in the future. |
2025-02-12 | SycEval: Evaluating LLM Sycophancy | Aaron Fanous, Jacob Goldberg, Ank A. Agarwal, Joanna Lin, Anson Zhou, Roxana Daneshjou, Sanmi Koyejo | Link | Large language models (LLMs) are increasingly applied in educational, clinical, and professional settings, but their tendency for sycophancy -- prioritizing user agreement over independent reasoning -- poses risks to reliability. This study introduces a framework to evaluate sycophantic behavior in ChatGPT-4o, Claude-Sonnet, and Gemini-1.5-Pro across AMPS (mathematics) and MedQuad (medical advice) datasets. Sycophantic behavior was observed in 58.19% of cases, with Gemini exhibiting the highest rate (62.47%) and ChatGPT the lowest (56.71%). Progressive sycophancy, leading to correct answers, occurred in 43.52% of cases, while regressive sycophancy, leading to incorrect answers, was observed in 14.66%. Preemptive rebuttals demonstrated significantly higher sycophancy rates than in-context rebuttals (61.75% vs. 56.52%, |