Expert-level diagnosis of nasal polyps using deep learning on whole-slide imaging.
Wu Qingwu,Chen Jianning,Deng Huiyi,Ren Yong,Sun Yueqi,Wang Weihao,Yuan Lianxiong,Hong Haiyu,Zheng Rui,Kong Weifeng,Huang Xuekun,Huang Guifang,Wang Lunji,Zhang Yana,Han Lanqing,Yang Qintai
The Journal of allergy and clinical immunology
Detection of anaemia from retinal fundus images via deep learning.
Mitani Akinori,Huang Abigail,Venugopalan Subhashini,Corrado Greg S,Peng Lily,Webster Dale R,Hammel Naama,Liu Yun,Varadarajan Avinash V
Nature biomedical engineering
Owing to the invasiveness of diagnostic tests for anaemia and the costs associated with screening for it, the condition is often undetected. Here, we show that anaemia can be detected via machine-learning algorithms trained using retinal fundus images, study participant metadata (including race or ethnicity, age, sex and blood pressure) or the combination of both data types (images and study participant metadata). In a validation dataset of 11,388 study participants from the UK Biobank, the fundus-image-only, metadata-only and combined models predicted haemoglobin concentration (in g dl⁻¹) with mean absolute error values of 0.73 (95% confidence interval: 0.72-0.74), 0.67 (0.66-0.68) and 0.63 (0.62-0.64), respectively, and with areas under the receiver operating characteristic curve (AUC) values of 0.74 (0.71-0.76), 0.87 (0.85-0.89) and 0.88 (0.86-0.89), respectively. For 539 study participants with self-reported diabetes, the combined model predicted haemoglobin concentration with a mean absolute error of 0.73 (0.68-0.78) and anaemia with an AUC of 0.89 (0.85-0.93). Automated anaemia screening on the basis of fundus images could particularly aid patients with diabetes undergoing regular retinal imaging and for whom anaemia can increase morbidity and mortality risks.
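The two metrics reported in this abstract, mean absolute error for the haemoglobin regression and AUC for anaemia detection, can be illustrated with a minimal sketch. The function names and all numbers below are hypothetical and are not taken from the study; the AUC is computed in its pairwise (Mann-Whitney) form, which is one standard way to define it.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical haemoglobin values (g/dl), not data from the study.
measured = [13.5, 11.2, 14.8, 10.9, 12.6]
predicted = [13.0, 11.9, 14.1, 11.5, 12.9]
mae = mean_absolute_error(measured, predicted)

# Hypothetical anaemia labels (1 = anaemic). A lower predicted haemoglobin
# should mean a higher anaemia score, so the score is the negated prediction.
anaemic = [0, 1, 0, 1, 0]
auc = roc_auc(anaemic, [-p for p in predicted])
```

In this toy example every anaemic participant has a lower predicted haemoglobin than every non-anaemic one, so the ranking is perfect even though the regression error is non-zero; the two metrics measure different things.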
Identifying facial phenotypes of genetic disorders using deep learning.
Gurovich Yaron,Hanani Yair,Bar Omri,Nadav Guy,Fleischer Nicole,Gelbman Dekel,Basel-Salmon Lina,Krawitz Peter M,Kamphausen Susanne B,Zenker Martin,Bird Lynne M,Gripp Karen W
Syndromic genetic conditions, in aggregate, affect 8% of the population. Many syndromes have recognizable facial features that are highly informative to clinical geneticists. Recent studies show that facial analysis technologies measured up to the capabilities of expert clinicians in syndrome identification. However, these technologies identified only a few disease phenotypes, limiting their role in clinical settings, where hundreds of diagnoses must be considered. Here we present a facial image analysis framework, DeepGestalt, using computer vision and deep-learning algorithms, that quantifies similarities to hundreds of syndromes. DeepGestalt outperformed clinicians in three initial experiments, two with the goal of distinguishing subjects with a target syndrome from other syndromes, and one of separating different genetic subtypes in Noonan syndrome. On the final experiment reflecting a real clinical setting problem, DeepGestalt achieved 91% top-10 accuracy in identifying the correct syndrome on 502 different images. The model was trained on a dataset of over 17,000 images representing more than 200 syndromes, curated through a community-driven phenotyping platform. DeepGestalt potentially adds considerable value to phenotypic evaluations in clinical genetics, genetic testing, research and precision medicine.
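The top-10 accuracy quoted above counts a prediction as correct when the true syndrome appears anywhere among the ten highest-ranked candidates. A minimal sketch of that metric follows; the function name, the labels and the rankings are hypothetical and do not come from DeepGestalt.

```python
def top_k_accuracy(ranked_predictions, true_labels, k=10):
    """Fraction of cases whose true label is among the first k ranked guesses."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Hypothetical ranked outputs for four images, best guess first.
ranked = [
    ["syndrome_a", "syndrome_b", "syndrome_c"],
    ["syndrome_b", "syndrome_c", "syndrome_a"],
    ["syndrome_c", "syndrome_a", "syndrome_b"],
    ["syndrome_a", "syndrome_c", "syndrome_b"],
]
truth = ["syndrome_a", "syndrome_a", "syndrome_b", "syndrome_c"]

top1 = top_k_accuracy(ranked, truth, k=1)
top2 = top_k_accuracy(ranked, truth, k=2)
top3 = top_k_accuracy(ranked, truth, k=3)
```

Top-k accuracy can only stay the same or rise as k grows, which is why a top-10 figure should not be compared directly with a top-1 figure.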
Deciphering interaction fingerprints from protein molecular surfaces using geometric deep learning.
Gainza P,Sverrisson F,Monti F,Rodolà E,Boscaini D,Bronstein M M,Correia B E
Predicting interactions between proteins and other biomolecules solely based on structure remains a challenge in biology. A high-level representation of protein structure, the molecular surface, displays patterns of chemical and geometric features that fingerprint a protein's modes of interactions with other biomolecules. We hypothesize that proteins participating in similar interactions may share common fingerprints, independent of their evolutionary history. Fingerprints may be difficult to grasp by visual analysis but could be learned from large-scale datasets. We present MaSIF (molecular surface interaction fingerprinting), a conceptual framework based on a geometric deep learning method to capture fingerprints that are important for specific biomolecular interactions. We showcase MaSIF with three prediction challenges: protein pocket-ligand prediction, protein-protein interaction site prediction and ultrafast scanning of protein surfaces for prediction of protein-protein complexes. We anticipate that our conceptual framework will lead to improvements in our understanding of protein function and design.
A convolutional neural network segments yeast microscopy images with high accuracy.
Dietler Nicola,Minder Matthias,Gligorovski Vojislav,Economou Augoustina Maria,Joly Denis Alain Henri Lucien,Sadeghi Ahmad,Chan Chun Hei Michael,Koziński Mateusz,Weigert Martin,Bitbol Anne-Florence,Rahi Sahand Jamal
The identification of cell borders ('segmentation') in microscopy images constitutes a bottleneck for large-scale experiments. For the model organism Saccharomyces cerevisiae, current segmentation methods face challenges when cells bud, crowd, or exhibit irregular features. We present a convolutional neural network (CNN) named YeaZ, the underlying training set of high-quality segmented yeast images (>10 000 cells) including mutants, stressed cells, and time courses, as well as a graphical user interface and a web application (www.quantsysbio.com/data-and-software) to efficiently employ, test, and expand the system. A key feature is a cell-cell boundary test which avoids the need for fluorescent markers. Our CNN is highly accurate, including for buds, and outperforms existing methods on benchmark images, indicating it transfers well to other conditions. To demonstrate how efficient large-scale image processing uncovers new biology, we analyze the geometries of ≈2200 wild-type and cyclin mutant cells and find that morphogenesis control occurs unexpectedly early and gradually.
Covid-19 classification by FGCNet with deep feature fusion from graph convolutional network and convolutional neural network.
Wang Shui-Hua,Govindaraj Vishnu Varthanan,Górriz Juan Manuel,Zhang Xin,Zhang Yu-Dong
An international journal on information fusion
COVID-19 is an infectious disease spreading across the world this year. In this study, we plan to develop an artificial intelligence based tool to diagnose it on chest CT images. On one hand, we extract features from a self-created convolutional neural network (CNN) to learn individual image-level representations. The proposed CNN employed several new techniques such as rank-based average pooling and multiple-way data augmentation. On the other hand, relation-aware representations were learnt from a graph convolutional network (GCN). Deep feature fusion (DFF) was developed in this work to fuse individual image-level features and relation-aware features from the CNN and GCN, respectively. The best model was named FGCNet. The experiment first chose the best model from eight proposed network models, and then compared it with 15 state-of-the-art approaches. The proposed FGCNet model is effective and gives better performance than all 15 state-of-the-art methods. Thus, our proposed FGCNet model can assist radiologists to rapidly detect COVID-19 from chest CT images.
Neural Network-Based On-Chip Spectroscopy Using a Scalable Plasmonic Encoder.
Brown Calvin,Goncharov Artem,Ballard Zachary S,Fordham Mason,Clemens Ashley,Qiu Yunzhe,Rivenson Yair,Ozcan Aydogan
Conventional spectrometers are limited by trade-offs set by size, cost, signal-to-noise ratio (SNR), and spectral resolution. Here, we demonstrate a deep learning-based spectral reconstruction framework using a compact and low-cost on-chip sensing scheme that is not constrained by many of the design trade-offs inherent to grating-based spectroscopy. The system employs a plasmonic spectral encoder chip containing 252 different tiles of nanohole arrays fabricated using a scalable and low-cost imprint lithography method, where each tile has a specific geometry and thus a specific optical transmission spectrum. The illumination spectrum of interest directly impinges upon the plasmonic encoder, and a CMOS image sensor captures the transmitted light without any lenses, gratings, or other optical components in between, making the entire hardware highly compact, lightweight, and field-portable. A trained neural network then reconstructs the unknown spectrum using the transmitted intensity information from the spectral encoder in a feed-forward and noniterative manner. Benefiting from the parallelization of neural networks, the average inference time per spectrum is ∼28 μs, which is much faster compared to other computational spectroscopy approaches. When blindly tested on 14 648 unseen spectra with varying complexity, our deep-learning based system identified 96.86% of the spectral peaks with an average peak localization error, bandwidth error, and height error of 0.19 nm, 0.18 nm, and 7.60%, respectively. This system is also highly tolerant to fabrication defects that may arise during the imprint lithography process, which further makes it ideal for applications that demand cost-effective, field-portable, and sensitive high-resolution spectroscopy tools.
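The encoding step described above can be pictured as a linear forward model: each tile integrates the incoming spectrum weighted by its own transmission spectrum, and the sensor records one intensity per tile. The sketch below shows only this forward model under simplifying assumptions (discrete wavelength bins, no sensor responsivity or noise); the function name, the three-tile matrix and the spectra are hypothetical, whereas the chip in the abstract has 252 tiles and uses a trained neural network for the inverse step.

```python
def encode(transmission, spectrum):
    """Intensity per tile: sum over wavelength bins of transmission * spectrum."""
    return [sum(t * s for t, s in zip(row, spectrum)) for row in transmission]


# Hypothetical transmission spectra: 3 tiles x 4 wavelength bins.
transmission = [
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 0.8, 0.2, 0.0],
    [0.1, 0.0, 0.3, 0.6],
]

single_peak = [0.0, 1.0, 0.0, 0.0]   # light only in the second bin
two_peaks = [1.0, 0.0, 0.0, 2.0]     # light in the first and last bins

intensities_single = encode(transmission, single_peak)
intensities_double = encode(transmission, two_peaks)
```

Because the readout has far fewer tiles than a fine wavelength grid has bins, recovering the spectrum from the tile intensities is the hard part, and that is the step the abstract assigns to the neural network.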
Intelligent Image-Activated Cell Sorting.
Nitta Nao,Sugimura Takeaki,Isozaki Akihiro,Mikami Hideharu,Hiraki Kei,Sakuma Shinya,Iino Takanori,Arai Fumihito,Endo Taichiro,Fujiwaki Yasuhiro,Fukuzawa Hideya,Hase Misa,Hayakawa Takeshi,Hiramatsu Kotaro,Hoshino Yu,Inaba Mary,Ito Takuro,Karakawa Hiroshi,Kasai Yusuke,Koizumi Kenichi,Lee SangWook,Lei Cheng,Li Ming,Maeno Takanori,Matsusaka Satoshi,Murakami Daichi,Nakagawa Atsuhiro,Oguchi Yusuke,Oikawa Minoru,Ota Tadataka,Shiba Kiyotaka,Shintaku Hirofumi,Shirasaki Yoshitaka,Suga Kanako,Suzuki Yuta,Suzuki Nobutake,Tanaka Yo,Tezuka Hiroshi,Toyokawa Chihana,Yalikun Yaxiaer,Yamada Makoto,Yamagishi Mai,Yamano Takashi,Yasumoto Atsushi,Yatomi Yutaka,Yazawa Masayuki,Di Carlo Dino,Hosokawa Yoichiroh,Uemura Sotaro,Ozeki Yasuyuki,Goda Keisuke
A fundamental challenge of biology is to understand the vast heterogeneity of cells, particularly how cellular composition, structure, and morphology are linked to cellular physiology. Unfortunately, conventional technologies are limited in uncovering these relations. We present a machine-intelligence technology based on a radically different architecture that realizes real-time image-based intelligent cell sorting at an unprecedented rate. This technology, which we refer to as intelligent image-activated cell sorting, integrates high-throughput cell microscopy, focusing, and sorting on a hybrid software-hardware data-management infrastructure, enabling real-time automated operation for data acquisition, data processing, decision-making, and actuation. We use it to demonstrate real-time sorting of microalgal and blood cells based on intracellular protein localization and cell-cell interaction from large heterogeneous populations for studying photosynthesis and atherothrombosis, respectively. The technology is highly versatile and expected to enable machine-based scientific discovery in biological, pharmaceutical, and medical sciences.
Evaluation and development of deep neural networks for image super-resolution in optical microscopy.
Qiao Chang,Li Di,Guo Yuting,Liu Chong,Jiang Tao,Dai Qionghai,Li Dong
Deep neural networks have enabled astonishing transformations from low-resolution (LR) to super-resolved images. However, whether, and under what imaging conditions, such deep-learning models outperform super-resolution (SR) microscopy is poorly explored. Here, using multimodality structured illumination microscopy (SIM), we first provide an extensive dataset of LR-SR image pairs and evaluate the deep-learning SR models in terms of structural complexity, signal-to-noise ratio and upscaling factor. Second, we devise the deep Fourier channel attention network (DFCAN), which leverages the frequency content difference across distinct features to learn precise hierarchical representations of high-frequency information about diverse biological structures. Third, we show that DFCAN's Fourier domain focalization enables robust reconstruction of SIM images under low signal-to-noise ratio conditions. We demonstrate that DFCAN achieves comparable image quality to SIM over a tenfold longer duration in multicolor live-cell imaging experiments, which reveal the detailed structures of mitochondrial cristae and nucleoids and the interaction dynamics of organelles and cytoskeleton.
Analysis of the Human Protein Atlas Image Classification competition.
Ouyang Wei,Winsnes Casper F,Hjelmare Martin,Cesnik Anthony J,Åkesson Lovisa,Xu Hao,Sullivan Devin P,Dai Shubin,Lan Jun,Jinmo Park,Galib Shaikat M,Henkel Christof,Hwang Kevin,Poplavskiy Dmytro,Tunguz Bojan,Wolfinger Russel D,Gu Yinzheng,Li Chuanpeng,Xie Jinbin,Buslov Dmitry,Fironov Sergei,Kiselev Alexander,Panchenko Dmytro,Cao Xuan,Wei Runmin,Wu Yuanhao,Zhu Xun,Tseng Kuan-Lun,Gao Zhifeng,Ju Cheng,Yi Xiaohan,Zheng Hongdong,Kappel Constantin,Lundberg Emma
Pinpointing subcellular protein localizations from microscopy images is easy to the trained eye, but challenging to automate. Based on the Human Protein Atlas image collection, we held a competition to identify deep learning solutions to solve this task. Challenges included training on highly imbalanced classes and predicting multiple labels per image. Over 3 months, 2,172 teams participated. Despite convergence on popular networks and training techniques, there was considerable variety among the solutions. Participants applied strategies for modifying neural networks and loss functions, augmenting data and using pretrained networks. The winning models far outperformed our previous effort at multi-label classification of protein localization patterns by ~20%. These models can be used as classifiers to annotate new images, feature extractors to measure pattern similarity or pretrained networks for a wide range of biological applications.
Human-computer collaboration for skin cancer recognition.
Tschandl Philipp,Rinner Christoph,Apalla Zoe,Argenziano Giuseppe,Codella Noel,Halpern Allan,Janda Monika,Lallas Aimilios,Longo Caterina,Malvehy Josep,Paoli John,Puig Susana,Rosendahl Cliff,Soyer H Peter,Zalaudek Iris,Kittler Harald
The rapid increase in telemedicine coupled with recent advances in diagnostic artificial intelligence (AI) create the imperative to consider the opportunities and risks of inserting AI-based support into new paradigms of care. Here we build on recent achievements in the accuracy of image-based AI for skin cancer diagnosis to address the effects of varied representations of AI-based support across different levels of clinical expertise and multiple clinical workflows. We find that good quality AI-based support of clinical decision-making improves diagnostic accuracy over that of either AI or physicians alone, and that the least experienced clinicians gain the most from AI-based support. We further find that AI-based multiclass probabilities outperformed content-based image retrieval (CBIR) representations of AI in the mobile technology environment, and AI-based support had utility in simulations of second opinions and of telemedicine triage. In addition to demonstrating the potential benefits associated with good quality AI in the hands of non-expert clinicians, we find that faulty AI can mislead the entire spectrum of clinicians, including experts. Lastly, we show that insights derived from AI class-activation maps can inform improvements in human diagnosis. Together, our approach and findings offer a framework for future studies across the spectrum of image-based diagnostics to improve human-computer collaboration in clinical practice.
Understanding Image Memorability.
Rust Nicole C,Mehrpour Vahid
Trends in cognitive sciences
Why are some images easier to remember than others? Here, we review recent developments in our understanding of 'image memorability', including its behavioral characteristics, its neural correlates, and the optimization principles from which it originates. We highlight work that has used large behavioral data sets to leverage memorability scores computed for individual images. These studies demonstrate that the mapping of image content to image memorability is not only predictable, but also non-intuitive and multifaceted. This work has also led to insights into the neural correlates of image memorability, by way of the discovery of a type of population response magnitude variation that emerges in high-level visual cortex as well as higher stages of deep neural networks trained to categorize objects.
Edge-centric functional network representations of human cerebral cortex reveal overlapping system-level architecture.
Faskowitz Joshua,Esfahlani Farnaz Zamani,Jo Youngheun,Sporns Olaf,Betzel Richard F
Network neuroscience has relied on a node-centric network model in which cells, populations and regions are linked to one another via anatomical or functional connections. This model cannot account for interactions of edges with one another. In this study, we developed an edge-centric network model that generates the constructs 'edge time series' and 'edge functional connectivity' (eFC). Using network analysis, we show that, at rest, eFC is consistent across datasets and reproducible within the same individual over multiple scan sessions. We demonstrate that clustering eFC yields communities of edges that naturally divide the brain into overlapping clusters, with regions in sensorimotor and attentional networks exhibiting the greatest levels of overlap. We show that eFC is systematically modulated by variation in sensory input. In future work, the edge-centric approach could be useful for identifying novel biomarkers of disease, characterizing individual variation and mapping the architecture of highly resolved neural circuits.
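The abstract does not spell out how the two constructs are computed. A common formulation, assumed here rather than quoted from the text, takes the edge time series of a region pair as the element-wise product of their z-scored signals (so that its mean equals the Pearson correlation), and eFC as the normalized inner product of two edge time series. All function names and signals below are hypothetical.

```python
import math


def zscore(series):
    """Standardize a series using the population standard deviation."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return [(x - mean) / std for x in series]


def edge_time_series(x, y):
    """Moment-by-moment co-fluctuation of two regional signals (assumed form)."""
    return [a * b for a, b in zip(zscore(x), zscore(y))]


def edge_fc(e1, e2):
    """Normalized inner product of two edge time series (assumed form)."""
    num = sum(a * b for a, b in zip(e1, e2))
    den = math.sqrt(sum(a * a for a in e1)) * math.sqrt(sum(b * b for b in e2))
    return num / den


# Hypothetical regional signals over four time points.
region_a = [1.0, 2.0, 3.0, 4.0]
region_b = [2.0, 4.0, 6.0, 8.0]   # perfectly correlated with region_a
region_c = [4.0, 3.0, 2.0, 1.0]   # perfectly anticorrelated with region_a

ets_ab = edge_time_series(region_a, region_b)
ets_ac = edge_time_series(region_a, region_c)

pearson_ab = sum(ets_ab) / len(ets_ab)   # mean of the edge time series
efc_ab_ac = edge_fc(ets_ab, ets_ac)
```

Under this formulation, node-centric functional connectivity is the time average of an edge time series, while eFC compares how two edges co-fluctuate over time.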
Neural population control via deep image synthesis.
Bashivan Pouya,Kar Kohitij,DiCarlo James J
Science (New York, N.Y.)
Particular deep artificial neural networks (ANNs) are today's most accurate models of the primate brain's ventral visual stream. Using an ANN-driven image synthesis method, we found that luminous power patterns (i.e., images) can be applied to primate retinae to predictably push the spiking activity of targeted V4 neural sites beyond naturally occurring levels. This method, although not yet perfect, achieves unprecedented independent control of the activity state of entire populations of V4 neural sites, even those with overlapping receptive fields. These results show how the knowledge embedded in today's ANN models might be used to noninvasively set desired internal brain states at neuron-level resolution, and suggest that more accurate ANN models would produce even more accurate control.
CDeep3M-Plug-and-Play cloud-based deep learning for image segmentation.
Haberl Matthias G,Churas Christopher,Tindall Lucas,Boassa Daniela,Phan Sébastien,Bushong Eric A,Madany Matthew,Akay Raffi,Deerinck Thomas J,Peltier Steven T,Ellisman Mark H
As biomedical imaging datasets expand, deep neural networks are considered vital for image processing, yet community access is still limited by setting up complex computational environments and availability of high-performance computing resources. We address these bottlenecks with CDeep3M, a ready-to-use image segmentation solution employing a cloud-based deep convolutional neural network. We benchmark CDeep3M on large and complex two-dimensional and three-dimensional imaging datasets from light, X-ray, and electron microscopy.
Image reconstruction by domain-transform manifold learning.
Zhu Bo,Liu Jeremiah Z,Cauley Stephen F,Rosen Bruce R,Rosen Matthew S
Image reconstruction is essential for imaging applications across the physical and life sciences, including optical and radar systems, magnetic resonance imaging, X-ray computed tomography, positron emission tomography, ultrasound imaging and radio astronomy. During image acquisition, the sensor encodes an intermediate representation of an object in the sensor domain, which is subsequently reconstructed into an image by an inversion of the encoding function. Image reconstruction is challenging because analytic knowledge of the exact inverse transform may not exist a priori, especially in the presence of sensor non-idealities and noise. Thus, the standard reconstruction approach involves approximating the inverse function with multiple ad hoc stages in a signal processing chain, the composition of which depends on the details of each acquisition strategy, and often requires expert parameter tuning to optimize reconstruction performance. Here we present a unified framework for image reconstruction, automated transform by manifold approximation (AUTOMAP), which recasts image reconstruction as a data-driven supervised learning task that allows a mapping between the sensor and the image domain to emerge from an appropriate corpus of training data. We implement AUTOMAP with a deep neural network and exhibit its flexibility in learning reconstruction transforms for various magnetic resonance imaging acquisition strategies, using the same network architecture and hyperparameters. We further demonstrate that manifold learning during training results in sparse representations of domain transforms along low-dimensional data manifolds, and observe superior immunity to noise and a reduction in reconstruction artefacts compared with conventional handcrafted reconstruction methods. In addition to improving the reconstruction performance of existing acquisition methodologies, we anticipate that AUTOMAP and other learned reconstruction approaches will accelerate the development of new acquisition strategies across imaging modalities.
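To make "sensor domain" and "inversion of the encoding function" concrete, the sketch below uses the simplest case in which the inverse is known analytically: a discrete Fourier encoding, as in fully sampled Cartesian magnetic resonance imaging, shown in one dimension. This is a textbook illustration and not the AUTOMAP network, which instead learns the sensor-to-image mapping from training data; the function names and the signal are hypothetical.

```python
import cmath


def dft(signal):
    """Encode: map image-domain samples to Fourier (sensor-domain) coefficients."""
    n = len(signal)
    return [
        sum(signal[x] * cmath.exp(-2j * cmath.pi * k * x / n) for x in range(n))
        for k in range(n)
    ]


def inverse_dft(coeffs):
    """Reconstruct: invert the encoding to recover image-domain samples."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * x / n) for k in range(n)) / n
        for x in range(n)
    ]


# Hypothetical one-dimensional "image".
image = [0.0, 1.0, 4.0, 2.0]
sensor_data = dft(image)
reconstruction = inverse_dft(sensor_data)
```

When sampling is incomplete, noisy or distorted, no such exact inverse is available, which is the situation the abstract describes as motivating a learned reconstruction.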
Ultrafast machine vision with 2D material neural network image sensors.
Mennel Lukas,Symonowicz Joanna,Wachter Stefan,Polyushkin Dmitry K,Molina-Mendoza Aday J,Mueller Thomas
Machine vision technology has taken huge leaps in recent years, and is now becoming an integral part of various intelligent systems, including autonomous vehicles and robotics. Usually, visual information is captured by a frame-based camera, converted into a digital format and processed afterwards using a machine-learning algorithm such as an artificial neural network (ANN). The large amount of (mostly redundant) data passed through the entire signal chain, however, results in low frame rates and high power consumption. Various visual data preprocessing techniques have thus been developed to increase the efficiency of the subsequent signal processing in an ANN. Here we demonstrate that an image sensor can itself constitute an ANN that can simultaneously sense and process optical images without latency. Our device is based on a reconfigurable two-dimensional (2D) semiconductor photodiode array, and the synaptic weights of the network are stored in a continuously tunable photoresponsivity matrix. We demonstrate both supervised and unsupervised learning and train the sensor to classify and encode images that are optically projected onto the chip with a throughput of 20 million bins per second.
11 TOPS photonic convolutional accelerator for optical neural networks.
Xu Xingyuan,Tan Mengxi,Corcoran Bill,Wu Jiayang,Boes Andreas,Nguyen Thach G,Chu Sai T,Little Brent E,Hicks Damien G,Morandotti Roberto,Mitchell Arnan,Moss David J
Convolutional neural networks, inspired by biological visual cortex systems, are a powerful category of artificial neural networks that can extract the hierarchical features of raw data to provide greatly reduced parametric complexity and to enhance the accuracy of prediction. They are of great interest for machine learning tasks such as computer vision, speech recognition, playing board games and medical diagnosis. Optical neural networks offer the promise of dramatically accelerating computing speed using the broad optical bandwidths available. Here we demonstrate a universal optical vector convolutional accelerator operating at more than ten TOPS (trillions (10¹²) of operations per second, or tera-ops per second), generating convolutions of images with 250,000 pixels, sufficiently large for facial image recognition. We use the same hardware to sequentially form an optical convolutional neural network with ten output neurons, achieving successful recognition of handwritten digit images at 88 per cent accuracy. Our results are based on simultaneously interleaving temporal, wavelength and spatial dimensions enabled by an integrated microcomb source. This approach is scalable and trainable to much more complex networks for demanding applications such as autonomous vehicles and real-time video recognition.
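The operation that such an accelerator speeds up is the image convolution, in which every output pixel is a dot product between the kernel and one image patch. The sketch below is a plain software reference for that operation, using the sliding-window form without kernel flipping that is common in convolutional networks; it says nothing about how the optical hardware implements it, and the function name, image and kernel are hypothetical.

```python
def convolve2d(image, kernel):
    """Valid-mode sliding-window convolution (no padding, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # Dot product between the kernel and the patch at (r, c).
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 3x4 image and 2x2 diagonal-difference kernel.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
kernel = [
    [1, 0],
    [0, -1],
]

feature_map = convolve2d(image, kernel)
```

Each output value costs one multiply and one add per kernel element, so the operation count grows with image size times kernel size; this is the workload that throughput figures in operations per second describe.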
Power-efficient neural network with artificial dendrites.
Li Xinyi,Tang Jianshi,Zhang Qingtian,Gao Bin,Yang J Joshua,Song Sen,Wu Wei,Zhang Wenqiang,Yao Peng,Deng Ning,Deng Lei,Xie Yuan,Qian He,Wu Huaqiang
In the nervous system, dendrites, branches of neurons that transmit signals between synapses and soma, play a critical role in processing functions, such as nonlinear integration of postsynaptic signals. The lack of these critical functions in artificial neural networks compromises their performance, for example in terms of flexibility, energy efficiency and the ability to handle complex tasks. Here, by developing artificial dendrites, we experimentally demonstrate a complete neural network fully integrated with synapses, dendrites and soma, implemented using scalable memristor devices. We perform a digit recognition task and simulate a multilayer network using experimentally derived device characteristics. The power consumption is more than three orders of magnitude lower than that of a central processing unit and 70 times lower than that of a typical application-specific integrated circuit chip. This network, equipped with functional dendrites, shows the potential of substantial overall performance improvement, for example by extracting critical information from a noisy background with significantly reduced power consumption and enhanced accuracy.
Revealing Fine Structures of the Retinal Receptive Field by Deep-Learning Networks.
Yan Qi,Zheng Yajing,Jia Shanshan,Zhang Yichen,Yu Zhaofei,Chen Feng,Tian Yonghong,Huang Tiejun,Liu Jian K
IEEE transactions on cybernetics
Deep convolutional neural networks (CNNs) have demonstrated impressive performance on many visual tasks. Recently, they became useful models for the visual system in neuroscience. However, it is still not clear what is learned by CNNs in terms of neuronal circuits. When a deep CNN with many layers is used for the visual system, it is not easy to compare the structure components of CNNs with possible neuroscience underpinnings due to highly complex circuits from the retina to the higher visual cortex. Here, we address this issue by focusing on single retinal ganglion cells with biophysical models and recording data from animals. By training CNNs with white noise images to predict neuronal responses, we found that fine structures of the retinal receptive field can be revealed. Specifically, convolutional filters learned are resembling biological components of the retinal circuit. This suggests that a CNN learning from one single retinal cell reveals a minimal neural network carried out in this cell. Furthermore, when CNNs learned from different cells are transferred between cells, there is a diversity of transfer learning performance, which indicates that CNNs are cell specific. Moreover, when CNNs are transferred between different types of input images, here white noise versus natural images, transfer learning shows a good performance, which implies that CNNs indeed capture the full computational ability of a single retinal cell for different inputs. Taken together, these results suggest that CNNs could be used to reveal structure components of neuronal circuits, and provide a powerful model for neural system identification.
Dermatologist-level classification of skin cancer with deep neural networks.
Esteva Andre,Kuprel Brett,Novoa Roberto A,Ko Justin,Swetter Susan M,Blau Helen M,Thrun Sebastian
Skin cancer, the most common human malignancy, is primarily diagnosed visually, beginning with an initial clinical screening and followed potentially by dermoscopic analysis, a biopsy and histopathological examination. Automated classification of skin lesions using images is a challenging task owing to the fine-grained variability in the appearance of skin lesions. Deep convolutional neural networks (CNNs) show potential for general and highly variable tasks across many fine-grained object categories. Here we demonstrate classification of skin lesions using a single CNN, trained end-to-end from images directly, using only pixels and disease labels as inputs. We train a CNN using a dataset of 129,450 clinical images (two orders of magnitude larger than previous datasets) consisting of 2,032 different diseases. We test its performance against 21 board-certified dermatologists on biopsy-proven clinical images with two critical binary classification use cases: keratinocyte carcinomas versus benign seborrheic keratoses; and malignant melanomas versus benign nevi. The first case represents the identification of the most common cancers, the second represents the identification of the deadliest skin cancer. The CNN achieves performance on par with all tested experts across both tasks, demonstrating an artificial intelligence capable of classifying skin cancer with a level of competence comparable to dermatologists. Outfitted with deep neural networks, mobile devices can potentially extend the reach of dermatologists outside of the clinic. It is projected that 6.3 billion smartphone subscriptions will exist by the year 2021 (ref. 13) and can therefore potentially provide low-cost universal access to vital diagnostic care.
Near real-time intraoperative brain tumor diagnosis using stimulated Raman histology and deep neural networks.
Hollon Todd C,Pandian Balaji,Adapa Arjun R,Urias Esteban,Save Akshay V,Khalsa Siri Sahib S,Eichberg Daniel G,D'Amico Randy S,Farooq Zia U,Lewis Spencer,Petridis Petros D,Marie Tamara,Shah Ashish H,Garton Hugh J L,Maher Cormac O,Heth Jason A,McKean Erin L,Sullivan Stephen E,Hervey-Jumper Shawn L,Patil Parag G,Thompson B Gregory,Sagher Oren,McKhann Guy M,Komotar Ricardo J,Ivan Michael E,Snuderl Matija,Otten Marc L,Johnson Timothy D,Sisti Michael B,Bruce Jeffrey N,Muraszko Karin M,Trautman Jay,Freudiger Christian W,Canoll Peter,Lee Honglak,Camelo-Piragua Sandra,Orringer Daniel A
Intraoperative diagnosis is essential for providing safe and effective care during cancer surgery. The existing workflow for intraoperative diagnosis based on hematoxylin and eosin staining of processed tissue is time, resource and labor intensive. Moreover, interpretation of intraoperative histologic images is dependent on a contracting, unevenly distributed, pathology workforce. In the present study, we report a parallel workflow that combines stimulated Raman histology (SRH), a label-free optical imaging method and deep convolutional neural networks (CNNs) to predict diagnosis at the bedside in near real-time in an automated fashion. Specifically, our CNNs, trained on over 2.5 million SRH images, predict brain tumor diagnosis in the operating room in under 150 s, an order of magnitude faster than conventional techniques (for example, 20-30 min). In a multicenter, prospective clinical trial (n = 278), we demonstrated that CNN-based diagnosis of SRH images was noninferior to pathologist-based interpretation of conventional histologic images (overall accuracy, 94.6% versus 93.9%). Our CNNs learned a hierarchy of recognizable histologic feature representations to classify the major histopathologic classes of brain tumors. In addition, we implemented a semantic segmentation method to identify tumor-infiltrated diagnostic regions within SRH images. These results demonstrate how intraoperative cancer diagnosis can be streamlined, creating a complementary pathway for tissue diagnosis that is independent of a traditional pathology laboratory.
Modified U-Net (mU-Net) With Incorporation of Object-Dependent High Level Features for Improved Liver and Liver-Tumor Segmentation in CT Images.
Seo Hyunseok,Huang Charles,Bassenne Maxime,Xiao Ruoxiu,Xing Lei
IEEE transactions on medical imaging
Segmentation of livers and liver tumors is one of the most important steps in radiation therapy of hepatocellular carcinoma. The segmentation task is often done manually, making it tedious, labor intensive, and subject to intra-/inter-operator variations. While various algorithms for delineating organs at risk (OARs) and tumor targets have been proposed, automatic segmentation of livers and liver tumors remains intractable due to their low tissue contrast with respect to the surrounding organs and their deformable shape in CT images. The U-Net has gained increasing popularity recently for image analysis tasks and has shown promising results. Conventional U-Net architectures, however, suffer from three major drawbacks. First, skip connections allow for the duplicated transfer of low resolution information in feature maps to improve efficiency in learning, but this often leads to blurring of extracted image features. Second, high level features extracted by the network often do not contain enough high resolution edge information of the input, leading to greater uncertainty where high resolution edge information dominantly affects the network's decisions, such as in liver and liver-tumor segmentation. Third, it is generally difficult to optimize the number of pooling operations in order to extract high level global features, since the number of pooling operations used depends on the object size. To cope with these problems, we added a residual path with deconvolution and activation operations to the skip connection of the U-Net to avoid duplication of low resolution information of features. In the case of small object inputs, features in the skip connection are not incorporated with features in the residual path. Furthermore, the proposed architecture has additional convolution layers in the skip connection in order to extract high level global features of small object inputs as well as high level features of high resolution edge information of large object inputs.
Efficacy of the modified U-Net (mU-Net) was demonstrated using the public dataset of the Liver Tumor Segmentation (LiTS) challenge 2017. For liver-tumor segmentation, a Dice similarity coefficient (DSC) of 89.72%, volume of error (VOE) of 21.93%, and relative volume difference (RVD) of -0.49% were obtained. For liver segmentation, a DSC of 98.51%, VOE of 3.07%, and RVD of 0.26% were calculated. For the public 3D Image Reconstruction for Comparison of Algorithm Database (3Dircadb), DSCs were 96.01% for the liver and 68.14% for liver-tumor segmentations, respectively. The proposed mU-Net outperformed existing state-of-the-art networks.
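The three figures of merit quoted above can be illustrated with a minimal sketch. It assumes the definitions commonly used for LiTS-style evaluation (DSC as twice the overlap divided by the summed volumes, VOE as one minus intersection over union, RVD as the signed volume difference relative to the reference); the abstract does not spell these out, and the function name overlap_metrics is hypothetical.

```python
def overlap_metrics(pred, truth):
    """Compute DSC, VOE and RVD for two binary masks.

    pred, truth: flat sequences of 0/1 voxel labels of equal length.
    The formulas are the commonly used definitions, assumed here
    because the abstract does not state them.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")

    inter = sum(1 for p, t in zip(pred, truth) if p and t)  # overlapping voxels
    n_pred = sum(1 for p in pred if p)                      # predicted volume
    n_truth = sum(1 for t in truth if t)                    # reference volume
    union = n_pred + n_truth - inter

    if n_truth == 0 or union == 0:
        raise ValueError("reference mask must contain at least one voxel")

    return {
        "dsc": 2.0 * inter / (n_pred + n_truth),  # Dice similarity coefficient
        "voe": 1.0 - inter / union,               # 1 - intersection over union
        "rvd": (n_pred - n_truth) / n_truth,      # signed relative volume difference
    }


# Toy example: two 6-voxel masks that share 2 of their 3 foreground voxels.
metrics = overlap_metrics([1, 1, 1, 0, 0, 0], [0, 1, 1, 1, 0, 0])
```

A negative RVD under this convention means the prediction is smaller than the reference volume; published per-case averages of DSC and VOE need not satisfy the single-case relation between the two.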
U-net based metal segmentation on projection domain for metal artifact reduction in dental CT.
Hegazy Mohamed A A,Cho Myung Hye,Cho Min Hyoung,Lee Soo Yeol
Biomedical engineering letters
Unlike medical computed tomography (CT), dental CT often suffers from severe metal artifacts stemming from high-density materials employed for dental prostheses. Despite the many metal artifact reduction (MAR) methods available for medical CT, those methods do not sufficiently reduce metal artifacts in dental CT images because MAR performance is often compromised by the enamel layer of teeth, whose X-ray attenuation coefficient is not so different from that of prosthetic materials. We propose a deep learning-based metal segmentation method on the projection domain to improve MAR performance in dental CT. We adopted a simplified U-net for metal segmentation on the projection domain without using any information from the metal-artifacts-corrupted CT images. After training the network with the projection data of five patients, we segmented the metal objects on the projection data of other patients using the trained network parameters. With the segmentation results, we corrected the projection data by applying region filling inside the segmented region. We fused two CT images, one from the corrected projection data and the other from the original raw projection data, and then we forward-projected the fused CT image to get the fused projection data. To get the final corrected projection data, we replaced the metal regions in the original projection data with the ones in the fused projection data. To evaluate the efficacy of the proposed segmentation method on MAR, we compared the MAR performance of the proposed segmentation method with a conventional MAR method based on metal segmentation on the CT image domain. For the MAR performance evaluation, we considered the three primary MAR performance metrics: the relative error (REL), the sum of square difference (SSD), and the normalized absolute difference (NAD). The proposed segmentation method improved MAR performances by around 5.7% for REL, 6.8% for SSD, and 8.2% for NAD. 
The proposed metal segmentation method on the projection domain showed better MAR performance than the conventional segmentation on the CT image domain. We expect that the proposed segmentation method can improve the performance of the existing MAR methods that are based on metal segmentation on the CT image domain.
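The final replacement step described in the abstract (metal regions of the original projection data are overwritten with the corresponding values from the fused projection data) can be sketched as a masked copy. This is only an illustration under assumed data layouts: projections are taken to be 2D nested lists, the mask is assumed to come from the segmentation network, and the function name replace_metal_regions is hypothetical.

```python
def replace_metal_regions(original, fused, metal_mask):
    """Return projection data where masked pixels come from `fused`.

    original, fused: 2D nested lists of projection values, same shape.
    metal_mask: 2D nested list of 0/1 flags, 1 marking a metal pixel
    (assumed to be the output of the metal segmentation step).
    """
    if not (len(original) == len(fused) == len(metal_mask)):
        raise ValueError("inputs must have the same number of rows")

    corrected = []
    for o_row, f_row, m_row in zip(original, fused, metal_mask):
        if not (len(o_row) == len(f_row) == len(m_row)):
            raise ValueError("inputs must have the same number of columns")
        # Keep the measured value outside the metal trace, and take the
        # fused value inside it.
        corrected.append([f if m else o for o, f, m in zip(o_row, f_row, m_row)])
    return corrected


# Toy example: a 2 x 3 projection with three pixels flagged as metal.
original = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
fused = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
mask = [[0, 1, 0], [0, 1, 1]]
corrected = replace_metal_regions(original, fused, mask)
```

Restricting the substitution to the segmented region leaves all non-metal measurements untouched, which is why the accuracy of the projection-domain segmentation matters for the final image.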
Recurrent residual U-Net for medical image segmentation.
Alom Md Zahangir,Yakopcic Chris,Hasan Mahmudul,Taha Tarek M,Asari Vijayan K
Journal of medical imaging (Bellingham, Wash.)
Deep learning (DL)-based semantic segmentation methods have been providing state-of-the-art performance in the past few years. More specifically, these techniques have been successfully applied in medical image classification, segmentation, and detection tasks. One DL technique, U-Net, has become one of the most popular for these applications. We propose a recurrent U-Net model and a recurrent residual U-Net model, which are named RU-Net and R2U-Net, respectively. The proposed models utilize the power of U-Net, residual networks, and recurrent convolutional neural networks. There are several advantages to using these proposed architectures for segmentation tasks. First, a residual unit helps when training deep architectures. Second, feature accumulation with recurrent residual convolutional layers ensures better feature representation for segmentation tasks. Third, it allows us to design better U-Net architectures that achieve better performance for medical image segmentation with the same number of network parameters. The proposed models are tested on three benchmark tasks: blood vessel segmentation in retinal images, skin cancer segmentation, and lung lesion segmentation. The experimental results show superior performance on segmentation tasks compared to equivalent models, including SegNet (a variant of a fully convolutional neural network), U-Net, and residual U-Net.
UNet++: A Nested U-Net Architecture for Medical Image Segmentation.
Zhou Zongwei,Siddiquee Md Mahfuzur Rahman,Tajbakhsh Nima,Liang Jianming
Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support : 4th International Workshop, DLMIA 2018, and 8th International Workshop, ML-CDS 2018, held in conjunction with MICCAI 2018, Granada, Spain, S...
In this paper, we present UNet++, a new, more powerful architecture for medical image segmentation. Our architecture is essentially a deeply-supervised encoder-decoder network where the encoder and decoder sub-networks are connected through a series of nested, dense skip pathways. The re-designed skip pathways aim at reducing the semantic gap between the feature maps of the encoder and decoder sub-networks. We argue that the optimizer would deal with an easier learning task when the feature maps from the decoder and encoder networks are semantically similar. We have evaluated UNet++ in comparison with U-Net and wide U-Net architectures across multiple medical image segmentation tasks: nodule segmentation in the low-dose CT scans of chest, nuclei segmentation in the microscopy images, liver segmentation in abdominal CT scans, and polyp segmentation in colonoscopy videos. Our experiments demonstrate that UNet++ with deep supervision achieves an average IoU gain of 3.9 and 3.4 points over U-Net and wide U-Net, respectively.
Retinal vessel segmentation using dense U-net with multiscale inputs.
Yue Kejuan,Zou Beiji,Chen Zailiang,Liu Qing
Journal of medical imaging (Bellingham, Wash.)
A color fundus image is an image of the inner wall of the eyeball taken with a fundus camera. Doctors can observe retinal vessel changes in the image, and these changes can be used to diagnose many serious diseases such as atherosclerosis, glaucoma, and age-related macular degeneration. Automated segmentation of retinal vessels can facilitate more efficient diagnosis of these diseases. We propose an improved U-net architecture to segment retinal vessels. Multiscale input layer and dense block are introduced into the conventional U-net, so that the network can make use of richer spatial context information. The proposed method is evaluated on the public dataset DRIVE, achieving 0.8199 in sensitivity and 0.9561 in accuracy. Especially for thin blood vessels, which are difficult to detect because of their low contrast with the background pixels, the segmentation results have been improved.
Inverse molecular design using machine learning: Generative models for matter engineering.
Sanchez-Lengeling Benjamin,Aspuru-Guzik Alán
Science (New York, N.Y.)
The discovery of new materials can bring enormous societal and technological progress. In this context, exploring completely the large space of potential materials is computationally intractable. Here, we review methods for achieving inverse design, which aims to discover tailored materials from the starting point of a particular desired functionality. Recent advances from the rapidly growing field of artificial intelligence, mostly from the subfield of machine learning, have resulted in a fertile exchange of ideas, where approaches to inverse molecular design are being proposed and employed at a rapid pace. Among these, deep generative models have been applied to numerous classes of materials: rational design of prospective drugs, synthetic routes to organic compounds, and optimization of photovoltaics and redox flow batteries, as well as a variety of other solid-state materials.
Reinforcement Learning, Fast and Slow.
Botvinick Matthew,Ritter Sam,Wang Jane X,Kurth-Nelson Zeb,Blundell Charles,Hassabis Demis
Trends in cognitive sciences
Deep reinforcement learning (RL) methods have driven impressive advances in artificial intelligence in recent years, exceeding human performance in domains ranging from Atari to Go to no-limit poker. This progress has drawn the attention of cognitive scientists interested in understanding human learning. However, the concern has been raised that deep RL may be too sample-inefficient - that is, it may simply be too slow - to provide a plausible model of how humans learn. In the present review, we counter this critique by describing recently developed techniques that allow deep RL to operate more nimbly, solving problems much more quickly than previous methods. Although these techniques were developed in an AI context, we propose that they may have rich implications for psychology and neuroscience. A key insight, arising from these AI methods, concerns the fundamental connection between fast RL and slower, more incremental forms of learning.
Artificial Intelligence in Cardiology.
Johnson Kipp W,Torres Soto Jessica,Glicksberg Benjamin S,Shameer Khader,Miotto Riccardo,Ali Mohsin,Ashley Euan,Dudley Joel T
Journal of the American College of Cardiology
Artificial intelligence and machine learning are poised to influence nearly every aspect of the human condition, and cardiology is not an exception to this trend. This paper provides a guide for clinicians on relevant aspects of artificial intelligence and machine learning, reviews selected applications of these methods in cardiology to date, and identifies how cardiovascular medicine could incorporate artificial intelligence in the future. In particular, the paper first reviews predictive modeling concepts relevant to cardiology such as feature selection and frequent pitfalls such as improper dichotomization. Second, it discusses common algorithms used in supervised learning and reviews selected applications in cardiology and related disciplines. Third, it describes the advent of deep learning and related methods collectively called unsupervised learning, provides contextual examples both in general medicine and in cardiovascular medicine, and then explains how these methods could be applied to enable precision cardiology and improve patient outcomes.
Artificial Intelligence in Precision Cardiovascular Medicine.
Krittanawong Chayakrit,Zhang HongJu,Wang Zhen,Aydar Mehmet,Kitai Takeshi
Journal of the American College of Cardiology
Artificial intelligence (AI) is a field of computer science that aims to mimic human thought processes, learning capacity, and knowledge storage. AI techniques have been applied in cardiovascular medicine to explore novel genotypes and phenotypes in existing diseases, improve the quality of patient care, enable cost-effectiveness, and reduce readmission and mortality rates. Over the past decade, several machine-learning techniques have been used for cardiovascular disease diagnosis and prediction. Each problem requires some degree of understanding of the problem, in terms of cardiovascular medicine and statistics, to apply the optimal machine-learning algorithm. In the near future, AI will result in a paradigm shift toward precision cardiovascular medicine. The potential of AI in cardiovascular medicine is tremendous; however, ignorance of the challenges may overshadow its potential clinical impact. This paper gives a glimpse of AI's application in cardiovascular clinical care and discusses its potential role in facilitating precision cardiovascular medicine.
Control of synaptic plasticity in deep cortical networks.
Roelfsema Pieter R,Holtmaat Anthony
Nature reviews. Neuroscience
Humans and many other animals have an enormous capacity to learn about sensory stimuli and to master new skills. However, many of the mechanisms that enable us to learn remain to be understood. One of the greatest challenges of systems neuroscience is to explain how synaptic connections change to support maximally adaptive behaviour. Here, we provide an overview of factors that determine the change in the strength of synapses, with a focus on synaptic plasticity in sensory cortices. We review the influence of neuromodulators and feedback connections in synaptic plasticity and suggest a specific framework in which these factors can interact to improve the functioning of the entire network.
Nonvolatile Memory Materials for Neuromorphic Intelligent Machines.
Jeong Doo Seok,Hwang Cheol Seong
Advanced materials (Deerfield Beach, Fla.)
Recent progress in deep learning extends the capability of artificial intelligence to various practical tasks, making the deep neural network (DNN) an extremely versatile hypothesis. While such DNN is virtually built on contemporary data centers of the von Neumann architecture, physical (in part) DNN of non-von Neumann architecture, also known as neuromorphic computing, can remarkably improve learning and inference efficiency. Particularly, resistance-based nonvolatile random access memory (NVRAM) highlights its handy and efficient application to the multiply-accumulate (MAC) operation in an analog manner. Here, an overview is given of the available types of resistance-based NVRAMs and their technological maturity from the material- and device-points of view. Examples within the strategy are subsequently addressed in comparison with their benchmarks (virtual DNN in deep learning). A spiking neural network (SNN) is another type of neural network that is more biologically plausible than the DNN. The successful incorporation of resistance-based NVRAM in SNN-based neuromorphic computing offers an efficient solution to the MAC operation and spike timing-based learning in nature. This strategy is exemplified from a material perspective. Intelligent machines are categorized according to their architecture and learning type. Also, the functionality and usefulness of NVRAM-based neuromorphic computing are addressed.
Hierarchical motor control in mammals and machines.
Merel Josh,Botvinick Matthew,Wayne Greg
Advances in artificial intelligence are stimulating interest in neuroscience. However, most attention is given to discrete tasks with simple action spaces, such as board games and classic video games. Less discussed in neuroscience are parallel advances in "synthetic motor control". While motor neuroscience has recently focused on optimization of single, simple movements, AI has progressed to the generation of rich, diverse motor behaviors across multiple tasks, at humanoid scale. It is becoming clear that specific, well-motivated hierarchical design elements repeatedly arise when engineering these flexible control systems. We review these core principles of hierarchical control, relate them to hierarchy in the nervous system, and highlight research themes that we anticipate will be critical in solving challenges at this disciplinary intersection.
Engineering a Less Artificial Intelligence.
Sinz Fabian H,Pitkow Xaq,Reimer Jacob,Bethge Matthias,Tolias Andreas S
Despite enormous progress in machine learning, artificial neural networks still lag behind brains in their ability to generalize to new situations. Given identical training data, differences in generalization are caused by many defining features of a learning algorithm, such as network architecture and learning rule. Their joint effect, called "inductive bias," determines how well any learning algorithm-or brain-generalizes: robust generalization needs good inductive biases. Artificial networks use rather nonspecific biases and often latch onto patterns that are only informative about the statistics of the training data but may not generalize to different scenarios. Brains, on the other hand, generalize across comparatively drastic changes in the sensory input all the time. We highlight some shortcomings of state-of-the-art learning algorithms compared to biological brains and discuss several ideas about how neuroscience can guide the quest for better inductive biases by providing useful constraints on representations and network architecture.
High-performance medicine: the convergence of human and artificial intelligence.
Topol Eric J
The use of artificial intelligence, and the deep-learning subtype in particular, has been enabled by the use of labeled big data, along with markedly enhanced computing power and cloud storage, across all sectors. In medicine, this is beginning to have an impact at three levels: for clinicians, predominantly via rapid, accurate image interpretation; for health systems, by improving workflow and the potential for reducing medical errors; and for patients, by enabling them to process their own data to promote health. The current limitations, including bias, privacy and security, and lack of transparency, along with the future directions of these applications will be discussed in this article. Over time, marked improvements in accuracy, productivity, and workflow will likely be actualized, but whether that will be used to improve the patient-doctor relationship or facilitate its erosion remains to be seen.
Artificial intelligence in radiology.
Hosny Ahmed,Parmar Chintan,Quackenbush John,Schwartz Lawrence H,Aerts Hugo J W L
Nature reviews. Cancer
Artificial intelligence (AI) algorithms, particularly deep learning, have demonstrated remarkable progress in image-recognition tasks. Methods ranging from convolutional neural networks to variational autoencoders have found myriad applications in the medical image analysis field, propelling it forward at a rapid pace. Historically, in radiology practice, trained physicians visually assessed medical images for the detection, characterization and monitoring of diseases. AI methods excel at automatically recognizing complex patterns in imaging data and providing quantitative, rather than qualitative, assessments of radiographic characteristics. In this Opinion article, we establish a general understanding of AI methods, particularly those pertaining to image-based tasks. We explore how these methods could impact multiple facets of radiology, with a general focus on applications in oncology, and demonstrate ways in which these methods are advancing the field. Finally, we discuss the challenges facing clinical implementation and provide our perspective on how the domain could be advanced.
Artificial intelligence in clinical and genomic diagnostics.
Dias Raquel,Torkamani Ali
Artificial intelligence (AI) is the development of computer systems that are able to perform tasks that normally require human intelligence. Advances in AI software and hardware, especially deep learning algorithms and the graphics processing units (GPUs) that power their training, have led to a recent and rapidly increasing interest in medical AI applications. In clinical diagnostics, AI-based computer vision approaches are poised to revolutionize image-based diagnostics, while other AI subtypes have begun to show similar promise in various diagnostic modalities. In some areas, such as clinical genomics, a specific type of AI algorithm known as deep learning is used to process large and complex genomic datasets. In this review, we first summarize the main classes of problems that AI systems are well suited to solve and describe the clinical diagnostic tasks that benefit from these solutions. Next, we focus on emerging methods for specific tasks in clinical genomics, including variant calling, genome annotation and variant classification, and phenotype-to-genotype correspondence. Finally, we end with a discussion on the future potential of AI in individualized medicine applications, especially for risk prediction in common complex diseases, and the challenges, limitations, and biases that must be carefully addressed for the successful deployment of AI in medical applications, particularly those utilizing human genetics and genomics data.
Next-Generation Machine Learning for Biological Networks.
Camacho Diogo M,Collins Katherine M,Powers Rani K,Costello James C,Collins James J
Machine learning, a collection of data-analytical techniques aimed at building predictive models from multi-dimensional datasets, is becoming integral to modern biological research. By enabling one to generate models that learn from large datasets and make predictions on likely outcomes, machine learning can be used to study complex cellular systems such as biological networks. Here, we provide a primer on machine learning for life scientists, including an introduction to deep learning. We discuss opportunities and challenges at the intersection of machine learning and network biology, which could impact disease biology, drug discovery, microbiome research, and synthetic biology.
Artificial intelligence in retina.
Schmidt-Erfurth Ursula,Sadeghipour Amir,Gerendas Bianca S,Waldstein Sebastian M,Bogunović Hrvoje
Progress in retinal and eye research
Major advances in diagnostic technologies are offering unprecedented insight into the condition of the retina and beyond ocular disease. Digital images providing millions of morphological datasets can be analyzed quickly, non-invasively and comprehensively using artificial intelligence (AI). Methods based on machine learning (ML) and particularly deep learning (DL) are able to identify, localize and quantify pathological features in almost every macular and retinal disease. Convolutional neural networks thereby mimic the path of the human brain for object recognition through learning of pathological features from training sets (supervised ML) or even extrapolation from patterns recognized independently (unsupervised ML). The methods of AI-based retinal analyses are diverse and differ widely in their applicability, interpretability and reliability in different datasets and diseases. Fully automated AI-based systems have recently been approved for screening of diabetic retinopathy (DR). The overall potential of ML/DL includes screening, diagnostic grading as well as guidance of therapy with automated detection of disease activity, recurrences, quantification of therapeutic effects and identification of relevant targets for novel therapeutic approaches. Prediction and prognostic conclusions further expand the potential benefit of AI in retina, which will enable personalized health care as well as large scale management and will empower the ophthalmologist to provide high quality diagnosis/therapy and successfully deal with the complexity of 21st century ophthalmology.
Radiomics: Data Are Also Images.
Hatt Mathieu,Le Rest Catherine Cheze,Tixier Florent,Badic Bogdan,Schick Ulrike,Visvikis Dimitris
Journal of nuclear medicine : official publication, Society of Nuclear Medicine
The aim of this review is to provide readers with an update on the state of the art, pitfalls, solutions for those pitfalls, future perspectives, and challenges in the quickly evolving field of radiomics in nuclear medicine imaging and associated oncology applications. The main pitfalls were identified in study design, data acquisition, segmentation, feature calculation, and modeling; however, in most cases, potential solutions are available and existing recommendations should be followed to improve the overall quality and reproducibility of published radiomics studies. The techniques from the field of deep learning have some potential to provide solutions, especially in terms of automation. Some important challenges remain to be addressed but, overall, striking advances have been made in the field in the last 5 y.
Deep Neural Networks as Scientific Models.
Cichy Radoslaw M,Kaiser Daniel
Trends in cognitive sciences
Artificial deep neural networks (DNNs) initially inspired by the brain enable computers to solve cognitive tasks at which humans excel. In the absence of explanations for such cognitive phenomena, in turn cognitive scientists have started using DNNs as models to investigate biological cognition and its neural basis, creating heated debate. Here, we reflect on the case from the perspective of philosophy of science. After putting DNNs as scientific models into context, we discuss how DNNs can fruitfully contribute to cognitive science. We claim that beyond their power to provide predictions and explanations of cognitive phenomena, DNNs have the potential to contribute to an often overlooked but ubiquitous and fundamental use of scientific models: exploration.
Deep neural networks in psychiatry.
Durstewitz Daniel,Koppe Georgia,Meyer-Lindenberg Andreas
Machine and deep learning methods, today's core of artificial intelligence, have been applied with increasing success and impact in many commercial and research settings. They are powerful tools for large scale data analysis, prediction and classification, especially in very data-rich environments ("big data"), and have started to find their way into medical applications. Here we will first give an overview of machine learning methods, with a focus on deep and recurrent neural networks, their relation to statistics, and the core principles behind them. We will then discuss and review directions along which (deep) neural networks can be, or already have been, applied in the context of psychiatry, and will try to delineate their future potential in this area. We will also comment on an emerging area that so far has been much less well explored: by embedding semantically interpretable computational models of brain dynamics or behavior into a statistical machine learning context, insights into dysfunction beyond mere prediction and classification may be gained. Especially this marriage of computational models with statistical inference may offer insights into neural and behavioral mechanisms that could open completely novel avenues for psychiatric treatment.
Using goal-driven deep learning models to understand sensory cortex.
Yamins Daniel L K,DiCarlo James J
Fueled by innovation in the computer vision and artificial intelligence communities, recent developments in computational neuroscience have used goal-driven hierarchical convolutional neural networks (HCNNs) to make strides in modeling neural single-unit and population responses in higher visual cortical areas. In this Perspective, we review the recent progress in a broader modeling context and describe some of the key technical innovations that have supported it. We then outline how the goal-driven HCNN approach can be used to delve even more deeply into understanding the development and organization of sensory cortical processing.
A Primer on Motion Capture with Deep Learning: Principles, Pitfalls, and Perspectives.
Mathis Alexander,Schneider Steffen,Lauer Jessy,Mathis Mackenzie Weygandt
Extracting behavioral measurements non-invasively from video is stymied by the fact that it is a hard computational problem. Recent advances in deep learning have tremendously advanced our ability to predict posture directly from videos, which has quickly impacted neuroscience and biology more broadly. In this primer, we review the budding field of motion capture with deep learning. In particular, we will discuss the principles of those novel algorithms, highlight their potential as well as pitfalls for experimentalists, and provide a glimpse into the future.
Deep learning in mammography and breast histology, an overview and future trends.
Hamidinekoo Azam,Denton Erika,Rampun Andrik,Honnor Kate,Zwiggelaar Reyer
Medical image analysis
Recent improvements in biomedical image analysis using deep learning based neural networks could be exploited to enhance the performance of Computer Aided Diagnosis (CAD) systems. Considering the importance of breast cancer worldwide and the promising results reported by deep learning based methods in breast imaging, an overview of the recent state-of-the-art deep learning based CAD systems developed for mammography and breast histopathology images is presented. In this study, the relationship between mammography and histopathology phenotypes is described, which takes biological aspects into account. We propose a computer based breast cancer modelling approach: the Mammography-Histology-Phenotype-Linking-Model, which develops a mapping of features/phenotypes between mammographic abnormalities and their histopathological representation. Challenges are discussed along with the potential contribution of such a system to clinical decision making and treatment management.
Deep Learning: The Good, the Bad, and the Ugly.
Annual review of vision science
Artificial vision has often been described as one of the key remaining challenges to be solved before machines can act intelligently. Recent developments in a branch of machine learning known as deep learning have catalyzed impressive gains in machine vision-giving a sense that the problem of vision is getting closer to being solved. The goal of this review is to provide a comprehensive overview of recent deep learning developments and to critically assess actual progress toward achieving human-level visual intelligence. I discuss the implications of the successes and limitations of modern machine vision algorithms for biological vision and the prospect for neuroscience to inform the design of future artificial vision systems.
MDCC-Net: Multiscale double-channel convolution U-Net framework for colorectal tumor segmentation.
Zheng Suichang,Lin Xue,Zhang Weifeng,He Baochun,Jia Shuangfu,Wang Ping,Jiang Huijie,Shi Jingjing,Jia Fucang
Computers in biology and medicine
PURPOSE: Multiscale feature fusion is a feasible method to improve tumor segmentation accuracy. However, current multiscale networks have two common problems: 1. Some networks only allow feature fusion between encoders and decoders of the same scale. It is obvious that such feature fusion is not sufficient. 2. Some networks have too many dense skip connections and too much nesting between the coding layer and the decoding layer, which causes some features to be lost and means that not enough information will be learned from multiple scales. To overcome these two problems, we propose a multiscale double-channel convolution U-Net (MDCC-Net) framework for colorectal tumor segmentation. METHODS: In the coding layer, we designed a dual-channel separation and convolution module and then added residual connections to perform multiscale feature fusion on the input image and the feature map after dual-channel separation and convolution. By fusing features at different scales in the same coding layer, the network can fully extract the detailed information of the original image and learn more tumor boundary information. RESULTS: The segmentation results show that our proposed method has a high accuracy, with a Dice similarity coefficient (DSC) of 83.57%, which is an improvement of 9.59%, 6.42%, and 1.57% compared with nnU-Net, U-Net, and U-Net++, respectively. CONCLUSION: The experimental results show that our proposed method has good performance in the segmentation of colorectal tumors and is close to the expert level. The proposed method has potential clinical applicability.
A Hierarchical Graph Convolution Network for Representation Learning of Gene Expression Data.
Tan Kaiwen,Huang Weixian,Liu Xiaofeng,Hu Jinlong,Dong Shoubin
IEEE journal of biomedical and health informatics
The curse of dimensionality, which is caused by high dimensionality and low sample size, is a major challenge in gene expression data analysis. However, the real situation is even worse: labelling data is laborious and time-consuming, so only a small part of the limited samples will be labelled. Having so few labelled samples further increases the difficulty of training deep learning models. Interpretability is an important requirement in biomedicine. Many existing deep learning methods try to provide interpretability, but they are rarely applied to gene expression data. Recent semi-supervised graph convolution network methods try to address these problems by smoothing the label information over a graph. However, to the best of our knowledge, these methods only utilize graphs in either the feature space or the sample space, which restricts their performance. We propose a transductive semi-supervised representation learning method called a hierarchical graph convolution network (HiGCN) to aggregate the information of gene expression data in both feature and sample spaces. HiGCN first utilizes external knowledge to construct a feature graph and a similarity kernel to construct a sample graph. Then, two spatial-based GCNs are used to aggregate information on these graphs. To validate the model's performance, synthetic and real datasets are provided to lend empirical support. Compared with two recent models and three traditional models, HiGCN learns better representations of gene expression data, and these representations improve the performance of downstream tasks, especially when the model is trained on a few labelled samples. Important features can be extracted from our model to provide reliable interpretability.
Manifold Modeling in Embedded Space: An Interpretable Alternative to Deep Image Prior.
Yokota Tatsuya,Hontani Hidekata,Zhao Qibin,Cichocki Andrzej
IEEE transactions on neural networks and learning systems
Deep image prior (DIP), which uses a deep convolutional network (ConvNet) structure as an image prior, has attracted wide attention in computer vision and machine learning. DIP empirically shows the effectiveness of the ConvNet structures for various image restoration applications. However, why the DIP works so well is still unknown. In addition, the reason why the convolution operation is useful in image reconstruction or image enhancement is not very clear. This study tackles this ambiguity of ConvNet/DIP by proposing an interpretable approach that divides the convolution into "delay embedding" and "transformation" (i.e., encoder-decoder). Our approach is a simple, but essential, image/tensor modeling method that is closely related to self-similarity. The proposed method is called manifold modeling in embedded space (MMES) since it is implemented using a denoising autoencoder in combination with a multiway delay-embedding transform. In spite of its simplicity, MMES can obtain quite similar results to DIP on image/tensor completion, super-resolution, deconvolution, and denoising. In addition, MMES is proven to be competitive with DIP, as shown in our experiments. These results can also facilitate interpretation/characterization of DIP from the perspective of a "low-dimensional patch-manifold prior."
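To make the "delay embedding" step concrete, here is a minimal one-dimensional sketch: a signal is unfolded into overlapping windows (the rows of a Hankel matrix), and the inverse step averages overlapping entries back into a signal. This is only an illustration of the general idea; the multiway transform on images and the denoising autoencoder used by MMES are not reproduced, and the names `delay_embed` and `inverse_delay_embed` are hypothetical.

```python
def delay_embed(signal, tau):
    """Unfold a 1-D signal into all overlapping windows of length tau."""
    if not 1 <= tau <= len(signal):
        raise ValueError("tau must be between 1 and len(signal)")
    return [list(signal[i:i + tau]) for i in range(len(signal) - tau + 1)]


def inverse_delay_embed(windows, length):
    """Fold windows back into a signal by averaging overlapping entries."""
    sums = [0.0] * length
    counts = [0] * length
    for start, window in enumerate(windows):
        for offset, value in enumerate(window):
            sums[start + offset] += value
            counts[start + offset] += 1
    return [s / c for s, c in zip(sums, counts)]


x = [1, 2, 3, 4, 5]
patches = delay_embed(x, 3)          # each row is one length-3 patch
restored = inverse_delay_embed(patches, len(x))
print(patches)
print(restored)
```

In this picture, the "transformation" stage would act on the rows of `patches` before they are folded back, which is where a patch-manifold prior could enter.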
Nonstationary Discrete Convolution Kernel for Multimodal Process Monitoring.
Tan Ruomu,Ottewill James R,Thornhill Nina F
IEEE transactions on neural networks and learning systems
Data-driven process monitoring has benefited from the development and application of kernel transformations, especially when various types of nonlinearity exist in the data. However, when dealing with the multimodality behavior that is frequently observed in the process operations, the most widely used radial basis function (RBF) kernel has limitations in describing process data collected from multiple normal operating modes. In this article, we highlight this limitation via a synthesized example. In order to account for the multimodality behavior and improve the fault detection performance accordingly, we propose a novel nonstationary discrete convolution kernel, which derives from the convolution kernel structure, as an alternative to the RBF kernel. By assuming the training samples to be the support of the discrete convolution, this new kernel can properly address these training samples from different operating modes with diverse properties and, therefore, can improve the data description and fault detection performance. Its performance is compared with RBF kernels under a standard kernel principal component analysis framework and with other methods proposed for multimode process monitoring via numerical examples. Moreover, a benchmark data set collected from a pilot-scale multiphase flow facility is used to demonstrate the advantages of the new kernel when applied to an experimental data set.
Learning of 3D Graph Convolution Networks for Point Cloud Analysis.
Lin Zhi-Hao,Huang Sheng Yu,Wang Yu-Chiang Frank
IEEE transactions on pattern analysis and machine intelligence
Point clouds are among the popular geometry representations in 3D vision. However, unlike 2D images with pixel-wise layouts, such representations contain unordered data points, which makes processing and understanding the associated semantic information quite challenging. Although a number of previous works attempt to analyze point clouds and achieve promising performance, their performance degrades significantly when data variations such as shift and scale changes are present. In this paper, we propose 3D Graph Convolution Networks (3D-GCN), which uniquely learn 3D kernels with graph max-pooling mechanisms for extracting geometric features from point cloud data across different scales. We show that, with the proposed 3D-GCN, satisfactory shift and scale invariance can be jointly achieved. We show that 3D-GCN can be applied to point cloud classification and segmentation tasks, with ablation studies and visualizations verifying the design of 3D-GCN.
Convolutional Neural Network Based on Bandwise-Independent Convolution and Hard Thresholding for Hyperspectral Band Selection.
Feng Jie,Chen Jiantong,Sun Qigong,Shang Ronghua,Cao Xianghai,Zhang Xiangrong,Jiao Licheng
IEEE transactions on cybernetics
Band selection has been widely utilized in hyperspectral image (HSI) classification to reduce the dimensionality of HSIs. Recently, deep-learning-based band selection has become of great interest. However, existing deep-learning-based methods usually implement band selection and classification in isolation, or evaluate selected spectral bands by training the deep network repeatedly, which may lead to the loss of discriminative bands and increased computational cost. In this article, a novel convolutional neural network (CNN) based on bandwise-independent convolution and hard thresholding (BHCNN) is proposed, which combines band selection, feature extraction, and classification into an end-to-end trainable network. In BHCNN, a band selection layer is constructed by designing bandwise 1×1 convolutions, which operate on each spectral band of the input HSIs independently. Then, hard thresholding is utilized to constrain the weights of convolution kernels with unselected spectral bands to zero. In this case, these weights are difficult to update. To optimize these weights, the straight-through estimator (STE) is devised by approximating the gradient. Furthermore, a novel coarse-to-fine loss calculated by full and selected spectral bands is defined to improve the interpretability of STE. In the subsequent layers of BHCNN, multiscale 3-D dilated convolutions are constructed to extract joint spatial-spectral features from HSIs with selected spectral bands. The experimental results on several HSI datasets demonstrate that the proposed method uses selected spectral bands to achieve more encouraging classification performance than current state-of-the-art band selection methods.
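The band selection layer described above can be pictured with a small sketch. It assumes one scalar weight per band (so a bandwise 1×1 convolution reduces to per-band scaling), a top-k rule for the hard threshold, and the generic straight-through estimator that passes gradients through unchanged. These are illustrative assumptions; the abstract does not specify BHCNN's exact thresholding rule or gradient approximation, and all function names here are hypothetical.

```python
def hard_threshold(weights, k):
    """Keep the k largest-magnitude band weights and zero the rest.

    Top-k selection is an assumed thresholding rule for illustration.
    """
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(order[:k])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


def bandwise_scale(pixel_spectrum, weights):
    """Apply one weight per spectral band, independently of other bands."""
    return [x * w for x, w in zip(pixel_spectrum, weights)]


def straight_through_gradient(upstream_grad):
    """Generic STE: treat the threshold as the identity in the backward pass,
    so zeroed weights still receive a gradient and can be re-selected."""
    return list(upstream_grad)


weights = [0.9, -0.1, 0.05, -0.7]      # one learnable weight per band
selected = hard_threshold(weights, 2)  # only two bands stay active
print(selected)
print(bandwise_scale([1.0, 2.0, 3.0, 4.0], selected))
```

Without the straight-through step, the thresholded weights would receive zero gradient, which is the update difficulty the abstract mentions.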
Multioutput Convolution Spectral Mixture for Gaussian Processes.
Chen Kai,van Laarhoven Twan,Groot Perry,Chen Jinsong,Marchiori Elena
IEEE transactions on neural networks and learning systems
Multioutput Gaussian processes (MOGPs) are an extension of Gaussian processes (GPs) for predicting multiple output variables (also called channels/tasks) simultaneously. In this article, we use the convolution theorem to design a new kernel for MOGPs by modeling cross-channel dependencies through cross convolution of time- and phase-delayed components in the spectral domain. The resulting kernel is called the multioutput convolution spectral mixture (MOCSM) kernel. The results of extensive experiments on synthetic and real-life data sets demonstrate the advantages of the proposed kernel and its state-of-the-art performance. MOCSM enjoys the desirable property of reducing to the well-known spectral mixture (SM) kernel when a single channel is considered. A comparison with the recently introduced multioutput SM kernel reveals that this is not the case for the latter kernel, which contains quadratic terms that generate undesirable scale effects when the spectral densities of different channels are either very close or very far from each other in the frequency domain.
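For reference, the single-channel spectral mixture (SM) kernel that MOCSM is said to reduce to is commonly written, for one-dimensional inputs, as k(τ) = Σ_q w_q exp(−2π² τ² v_q) cos(2π τ μ_q), where w_q, μ_q, and v_q are the weight, mean, and variance of the q-th spectral component. The sketch below evaluates that formula only; it does not implement MOCSM or its cross-channel terms, and the function and parameter names are chosen here for illustration.

```python
import math


def spectral_mixture_kernel(tau, weights, means, variances):
    """Evaluate the 1-D spectral mixture kernel at lag tau.

    Each component contributes a Gaussian envelope (width set by its
    spectral variance) times a cosine (frequency set by its spectral mean).
    """
    return sum(
        w * math.exp(-2.0 * math.pi ** 2 * tau ** 2 * v) * math.cos(2.0 * math.pi * tau * mu)
        for w, mu, v in zip(weights, means, variances)
    )


weights = [1.0, 0.5]
means = [0.2, 1.0]
variances = [0.01, 0.05]
print(spectral_mixture_kernel(0.0, weights, means, variances))  # sum of the weights at zero lag
print(spectral_mixture_kernel(0.7, weights, means, variances))
```

At zero lag every exponential and cosine equals one, so the kernel value is the sum of the component weights, and the kernel is symmetric in τ.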
Fast semantic segmentation method for machine vision inspection based on a fewer-parameters atrous convolution neural network.
Huang Jian,Guixiong Liu,He Binyuan
Owing to the recent development in deep learning, machine vision has been widely used in intelligent manufacturing equipment in multiple fields, including precision-manufacturing production lines and online product-quality inspection. This study addresses online machine vision inspection, focusing on the method of online semantic segmentation under complex backgrounds. First, the fewer-parameters optimization of the atrous convolution architecture is studied. Atrous spatial pyramid pooling (ASPP) and residual network (ResNet) are selected as the basic architectures of ηseg and ηmain, respectively, which indicate that the improved proportion of the participating input image feature is beneficial for improving the accuracy of feature extraction during the change of the number and dimension of feature maps. Second, this study proposes five modified ResNet residual building blocks, with the main path having a 3 × 3 convolution layer, 2 × 2 skip path, and pooling layer with ls = 2, which can improve the use of image features. Finally, the simulation experiments show that our modified structure can significantly decrease segmentation time Tseg from 719 to 296 ms (decreased by 58.8%), with only a slight decrease in the intersection-over-union from 86.7% to 86.6%. The applicability of the proposed machine vision method was verified through the segmentation recognition of the 2019 version of the Chinese yuan (CNY). Compared with the conventional method, the proposed model of semantic segmentation visual detection effectively reduces the detection time while ensuring the detection accuracy and has a significant effect of fewer-parameters optimization. This allows for the possibility of neural network detection on mobile terminals.
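The two figures of merit quoted above, intersection-over-union and the relative reduction in segmentation time, follow standard definitions. The sketch below checks the reported 58.8% reduction from the stated 719 ms and 296 ms and shows the IoU formula on toy masks; the function names and the flat 0/1 mask encoding are assumptions for illustration, not the authors' evaluation code.

```python
def intersection_over_union(pred, truth):
    """IoU between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if union == 0 else intersection / union


def relative_reduction(before, after):
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


# Segmentation times taken from the abstract, in milliseconds.
print(round(100 * relative_reduction(719, 296), 1))  # 58.8
print(intersection_over_union([0, 1, 1, 1, 0, 0], [0, 0, 1, 1, 1, 0]))  # 2 / 4
```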
Recent Advances in Medical Image Processing.
Huang Zhen,Li Qiang,Lu Ju,Feng Junlin,Hu Jiajia,Chen Ping
BACKGROUND:The application and development of artificial intelligence technology have had a profound impact on the field of medical imaging. It helps medical personnel to make an earlier and more accurate diagnosis. Recently, the deep convolution neural network has been emerging as a principal machine learning method in computer vision and has received significant attention in medical imaging. KEY MESSAGE:In this paper, we review recent advances in artificial intelligence, machine learning, and deep convolution neural networks, focusing on their applications in medical image processing. To illustrate with a concrete example, we discuss in detail the architecture of a convolution neural network through visualization to help understand its internal working mechanism. SUMMARY:This review discusses several open questions, current trends, and critical challenges faced by medical image processing and artificial intelligence technology.