Unbiased spatial proteomics with single-cell resolution in tissues.
Mass spectrometry (MS)-based proteomics has become a powerful technology to quantify the entire complement of proteins in cells or tissues. Here, we review challenges and recent advances in the LC-MS-based analysis of minute protein amounts, down to the level of single cells. Application of this technology revealed that single-cell transcriptomes are dominated by stochastic noise due to the very low number of transcripts per cell, whereas the single-cell proteome appears to be complete. The spatial organization of cells in tissues can be studied by emerging technologies, including multiplexed imaging and spatial transcriptomics, which can now be combined with ultra-sensitive proteomics. Combined with high-content imaging, artificial intelligence and single-cell laser microdissection, MS-based proteomics provides an unbiased molecular readout close to the functional level. Potential applications range from basic biological questions to precision medicine.
A knowledge graph to interpret clinical proteomics data.
Implementing precision medicine hinges on the integration of omics data, such as proteomics, into the clinical decision-making process, but the quantity and diversity of biomedical data, and the spread of clinically relevant knowledge across multiple biomedical databases and publications, pose a challenge to data integration. Here we present the Clinical Knowledge Graph (CKG), an open-source platform currently comprising close to 20 million nodes and 220 million relationships that represent relevant experimental data, public databases and literature. The graph structure provides a flexible data model that is easily extendable to new nodes and relationships as new databases become available. The CKG incorporates statistical and machine learning algorithms that accelerate the analysis and interpretation of typical proteomics workflows. Using a set of proof-of-concept biomarker studies, we show how the CKG might augment and enrich proteomics data and help inform clinical decision-making.
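The flexible graph data model described above can be illustrated with a minimal sketch. This is not the CKG implementation or its schema: the `TinyGraph` class, the node labels (`Protein`, `Disease`, `Drug`), the relationship types, and all identifiers are hypothetical, chosen only to show how a graph can take on a new node type without changing the data already stored.

```python
from collections import defaultdict


class TinyGraph:
    """Toy labeled property graph: typed nodes plus typed, directed relationships."""

    def __init__(self):
        self.nodes = {}                 # node id -> {"label": ..., plus properties}
        self.edges = defaultdict(list)  # source id -> [(relationship type, target id)]

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"label": label, **props}

    def add_edge(self, source, rel_type, target):
        self.edges[source].append((rel_type, target))

    def neighbors(self, source, rel_type):
        # .get avoids creating empty entries for unknown source ids
        return [t for r, t in self.edges.get(source, []) if r == rel_type]


graph = TinyGraph()
# Hypothetical starting content: one protein linked to one disease.
graph.add_node("P1", "Protein", name="hypothetical protein A")
graph.add_node("D1", "Disease", name="hypothetical disease X")
graph.add_edge("P1", "ASSOCIATED_WITH", "D1")

# Extending the model: a new node label and relationship type are simply added;
# existing nodes and relationships are untouched.
graph.add_node("C1", "Drug", name="hypothetical compound Z")
graph.add_edge("C1", "TARGETS", "P1")

print(graph.neighbors("P1", "ASSOCIATED_WITH"))
print(graph.neighbors("C1", "TARGETS"))
```

A dedicated graph database would add indexing and a query language, but the extension step has the same shape.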
Quantitative Proteomics in Translational Absorption, Distribution, Metabolism, and Excretion and Precision Medicine.
A reliable translation of in vitro and preclinical data on drug absorption, distribution, metabolism, and excretion (ADME) to humans is important for safe and effective drug development. Precision medicine, which is expected to provide the right clinical dose for the right patient at the right time, requires a comprehensive understanding of population factors affecting drug disposition and response. Characterizing the protein abundance of drug-metabolizing enzymes and transporters, along with their interindividual, tissue-to-tissue, and cross-species variability, is important for translational ADME and precision medicine. This review first provides a brief overview of quantitative proteomics principles including liquid chromatography-tandem mass spectrometry tools, data acquisition approaches, proteomics sample preparation techniques, and quality controls for ensuring rigor and reproducibility in protein quantification data. Then, potential applications of quantitative proteomics in the translation of in vitro and preclinical data as well as prediction of interindividual variability are discussed in detail with tabulated examples. The applications of quantitative proteomics data in physiologically based pharmacokinetic modeling for ADME prediction are discussed with representative case examples. Finally, various considerations for reliable quantitative proteomics analysis for translational ADME and precision medicine and the future directions are discussed. SIGNIFICANCE STATEMENT: Quantitative proteomics analysis of drug-metabolizing enzymes and transporters in humans and preclinical species provides key physiological information that assists in the translation of in vitro and preclinical data to humans. This review provides the principles and applications of quantitative proteomics in characterizing in vitro, ex vivo, and preclinical models for translational research and interindividual variability prediction. Integration of these data into physiologically based pharmacokinetic modeling is proving to be critical for safe, effective, timely, and cost-effective drug development.
Deep Learning in Proteomics.
Wen Bo,Zeng Wen-Feng,Liao Yuxing,Shi Zhiao,Savage Sara R,Jiang Wen,Zhang Bing
Proteomics, the study of all the proteins in biological systems, is becoming a data-rich science. Protein sequences and structures are comprehensively catalogued in online databases. With recent advancements in tandem mass spectrometry (MS) technology, protein expression and post-translational modifications (PTMs) can be studied in a variety of biological systems at the global scale. Sophisticated computational algorithms are needed to translate the vast amount of data into novel biological insights. Deep learning automatically extracts representations at high levels of abstraction from data, and it thrives in data-rich scientific research domains. Here, a comprehensive overview of deep learning applications in proteomics, including retention time prediction, MS/MS spectrum prediction, de novo peptide sequencing, PTM prediction, major histocompatibility complex-peptide binding prediction, and protein structure prediction, is provided. Limitations and the future directions of deep learning in proteomics are also discussed. This review will provide readers with an overview of deep learning and how it can be used to analyze proteomics data.
An image-based multi-label human protein subcellular localization predictor (iLocator) reveals protein mislocalizations in cancer tissues.
Xu Ying-Ying,Yang Fan,Zhang Yang,Shen Hong-Bin
Bioinformatics (Oxford, England)
MOTIVATION: Human cells are organized into compartments that host different biochemical processes. Proteins must appear at the right time in the correct cellular compartments to carry out their functions in normal cells, whereas mislocalization of proteins can result in disease, including cancer. RESULTS: To reveal cancer-related protein mislocalizations, we developed an image-based multi-label subcellular location predictor, iLocator, which covers seven cellular localizations. iLocator incorporates both global and local image descriptors and generates predictions using an ensemble multi-label classifier. The algorithm can handle both single- and multiple-location proteins. We first trained and tested iLocator on 3240 normal human tissue images with known subcellular location information from the Human Protein Atlas. iLocator was then used to generate protein localization predictions for 3696 protein images from seven cancer tissues that have no location annotations in the Human Protein Atlas. By comparing the output data from normal and cancer tissues, we detected eight potential cancer biomarker proteins that have significant localization differences (P-value < 0.01). AVAILABILITY: http://www.csbio.sjtu.edu.cn/bioinf/iLocator/
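To make the idea of an ensemble multi-label classifier concrete, here is a minimal sketch. The abstract does not say how iLocator combines its base classifiers, so the averaging of per-label scores, the 0.5 threshold, the fallback to the top-scoring label, the function name `ensemble_multilabel`, and the three location names are all assumptions made for illustration (the real predictor covers seven locations, which the abstract does not list).

```python
def ensemble_multilabel(score_sets, labels, threshold=0.5):
    """Average per-label scores across classifiers and keep every label
    whose mean score reaches the threshold; if none does, fall back to
    the single best label so each protein gets at least one location."""
    n_classifiers = len(score_sets)
    mean_scores = [
        sum(scores[i] for scores in score_sets) / n_classifiers
        for i in range(len(labels))
    ]
    chosen = [lab for lab, m in zip(labels, mean_scores) if m >= threshold]
    if not chosen:
        best = max(range(len(labels)), key=lambda i: mean_scores[i])
        chosen = [labels[best]]
    return chosen


# Hypothetical location names and scores, for illustration only.
LOCATIONS = ["nucleus", "cytoplasm", "mitochondria"]

# Two base classifiers agree on two locations: a multiple-location protein.
multi = ensemble_multilabel([[0.9, 0.7, 0.1], [0.7, 0.5, 0.2]], LOCATIONS)

# No label reaches the threshold: the top-scoring label is returned alone.
single = ensemble_multilabel([[0.2, 0.3, 0.1]], LOCATIONS)

print(multi)
print(single)
```

Thresholding each label independently is what lets one protein receive several locations, while the fallback covers the single-location case.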
New Opportunities and Challenges of Smart Polymers in Post-Translational Modification Proteomics.
Qing Guangyan,Lu Qi,Xiong Yuting,Zhang Lei,Wang Hongxi,Li Xiuling,Liang Xinmiao,Sun Taolei
Advanced materials (Deerfield Beach, Fla.)
Protein post-translational modifications (PTMs), which denote covalent additions of various functional groups (e.g., phosphate, glycan, methyl, or ubiquitin) to proteins, significantly increase protein complexity and diversity. PTMs play crucial roles in the regulation of protein functions and numerous cellular processes. However, in a living organism, native PTM proteins are typically present at substoichiometric levels, considerably impeding mass-spectrometry-based analyses and identification. Over the past decade, the demand for in-depth PTM proteomics studies has spawned a variety of selective affinity materials capable of capturing trace amounts of PTM peptides from highly complex biosamples. However, novel design ideas or strategies are urgently required for fulfilling the increasingly complex and accurate requirements of PTM proteomics analysis, which can hardly be met by using conventional enrichment materials. Considering two typical types of protein PTMs, phosphorylation and glycosylation, an overview of polymeric enrichment materials is provided here, with an emphasis on the superiority of smart-polymer-based materials that can function in intelligent modes. Moreover, some smart separation materials are introduced to demonstrate the enticing prospects and the challenges of smart polymers applied in PTM proteomics.
Microfluidic-Mass Spectrometry Interfaces for Translational Proteomics.
Pedde R Daniel,Li Huiyan,Borchers Christoph H,Akbari Mohsen
Trends in biotechnology
Interfacing mass spectrometry (MS) with microfluidic chips (μchip-MS) holds considerable potential to transform a clinician's toolbox, providing translatable methods for the early detection, diagnosis, monitoring, and treatment of noncommunicable diseases by streamlining and integrating laborious sample preparation workflows on high-throughput, user-friendly platforms. Overcoming the limitations of competitive immunoassays - currently the gold standard in clinical proteomics - μchip-MS can provide unprecedented access to complex proteomic assays having high sensitivity and specificity, but without the labor, costs, and complexities associated with conventional MS sample processing. This review surveys recent μchip-MS systems for clinical applications and examines their emerging role in streamlining the development and translation of MS-based proteomic assays by alleviating many of the challenges that currently inhibit widespread clinical adoption.
Detecting Cardiovascular Protein-Protein Interactions by Proximity Proteomics.
Kushner Jared S,Liu Guoxia,Eisert Robyn J,Bradshaw Gary A,Pitt Geoffrey S,Hinson J Travis,Kalocsay Marian,Marx Steven O
Rapidly changing and transient protein-protein interactions regulate dynamic cellular processes in the cardiovascular system. Traditional methods, including affinity purification and mass spectrometry, have revealed many macromolecular complexes in cardiomyocytes and the vasculature. Yet these methods often fail to identify in vivo or transient protein-protein interactions. To capture these interactions in living cells and animals with subsequent mass spectrometry identification, enzyme-catalyzed proximity labeling techniques have been developed in the past decade. Although the application of this methodology to cardiovascular research is still in its infancy, the field is developing rapidly, and the promise is substantial. In this review, we outline important concepts and discuss how proximity proteomics has been applied to study physiological and pathophysiological processes relevant to the cardiovascular system.
A proteomics sample metadata representation for multiomics integration and big data analysis.
The amount of public proteomics data is rapidly increasing but there is no standardized format to describe the sample metadata and their relationship with the dataset files in a way that fully supports their understanding or reanalysis. Here we propose to develop the transcriptomics data format MAGE-TAB into a standard representation for proteomics sample metadata. We implement MAGE-TAB-Proteomics in a crowdsourcing project to manually curate over 200 public datasets. We also describe tools and libraries to validate and submit sample metadata-related information to the PRIDE repository. We expect that these developments will improve the reproducibility and facilitate the reanalysis and integration of public proteomics datasets.
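The kind of check a validation tool for tabular sample metadata might perform can be sketched as follows. This is not one of the tools described above, and it does not reproduce the MAGE-TAB-Proteomics specification: the column names, the sample rows, and the `validate` function are placeholders that only illustrate how each sample row is linked to a dataset file in a tab-delimited table.

```python
import csv
import io

# Hypothetical tab-delimited sample metadata; column names are placeholders.
METADATA_TEXT = (
    "source name\tcharacteristics[organism]\tcomment[data file]\n"
    "sample 1\tHomo sapiens\trun_01.raw\n"
    "sample 2\tHomo sapiens\t\n"
)

REQUIRED_COLUMNS = ["source name", "characteristics[organism]", "comment[data file]"]


def validate(text):
    """Return a list of human-readable problems found in the metadata table."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        return [f"missing column: {col}" for col in missing]

    problems = []
    # Line 1 is the header, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        for col in REQUIRED_COLUMNS:
            if not (row[col] or "").strip():
                problems.append(f"line {line_number}: empty '{col}'")
    return problems


problems = validate(METADATA_TEXT)
print(problems)
```

Here the second sample has no associated data file, which is the sort of gap that prevents a dataset from being reanalyzed.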
Vistain Luke F,Tay Savaş
Trends in biochemical sciences
The inability to make broad, minimally biased measurements of a cell's proteome stands as a major bottleneck for understanding how gene expression translates into cellular phenotype. Unlike sequencing for nucleic acids, there is no dominant method for making single-cell proteomic measurements. Instead, methods typically focus on either absolute quantification of a small number of proteins or highly multiplexed protein measurements. Advances in microfluidics and output encoding have led to major improvements in both aspects. Here, we review the most recent progress that has enabled hundreds of protein measurements and ultrahigh-sensitivity quantification. We also highlight emerging technologies such as single-cell mass spectrometry that may enable unbiased measurement of cellular proteomes.
High-throughput proteomics and AI for cancer biomarker discovery.
Xiao Qi,Zhang Fangfei,Xu Luang,Yue Liang,Kon Oi Lian,Zhu Yi,Guo Tiannan
Advanced drug delivery reviews
Biomarkers are assayed to assess biological and pathological status. Recent advances in high-throughput proteomic technology provide opportunities for developing next-generation biomarkers for clinical practice, aided by artificial intelligence (AI)-based techniques. We summarize the advances and limitations of cancer biomarkers based on genomic and transcriptomic analysis, as well as classical antibody-based methodologies. Then we review recent progress in mass spectrometry (MS)-based proteomics in terms of sample preparation, peptide fractionation by liquid chromatography (LC) and mass spectrometric data acquisition. We highlight applications of AI techniques in high-throughput clinical studies as compared with clinical decisions based on singular features. This review sets out our approach for discovering clinical biomarkers in studies using proteomic big data technology combined with computational and statistical methods.
Paving the way to single-molecule protein sequencing.
Restrepo-Pérez Laura,Joo Chirlmin,Dekker Cees
Proteins are major building blocks of life. The protein content of a cell and an organism provides key information for the understanding of biological processes and disease. Despite the importance of protein analysis, only a handful of techniques are available to determine protein sequences, and these methods face limitations, for example, requiring a sizable amount of sample. Single-molecule techniques would revolutionize proteomics research, providing ultimate sensitivity for the detection of low-abundance proteins and the realization of single-cell proteomics. In recent years, novel single-molecule protein sequencing schemes that use fluorescence, tunnelling currents and nanopores have been proposed. Here, we present a review of these approaches, together with the first experimental efforts towards their realization. We discuss their advantages and drawbacks, and present our perspective on the development of single-molecule protein sequencing techniques.
Spatial proteomics: a powerful discovery tool for cell biology.
Lundberg Emma,Borner Georg H H
Nature reviews. Molecular cell biology
Protein subcellular localization is tightly controlled and intimately linked to protein function in health and disease. Capturing the spatial proteome - that is, the localizations of proteins and their dynamics at the subcellular level - is therefore essential for a complete understanding of cell biology. Owing to substantial advances in microscopy, mass spectrometry and machine learning applications for data analysis, the field is now mature for proteome-wide investigations of spatial cellular regulation. Studies of the human proteome have begun to reveal a complex architecture, including single-cell variations, dynamic protein translocations, changing interaction networks and proteins localizing to multiple compartments. Furthermore, several studies have successfully harnessed the power of comparative spatial proteomics as a discovery tool to unravel disease mechanisms. We are at the beginning of an era in which spatial proteomics finally integrates with cell biology and medical research, thereby paving the way for unbiased systems-level insights into cellular processes. Here, we discuss current methods for spatial proteomics using imaging or mass spectrometry and specifically highlight global comparative applications. The aim of this Review is to survey the state of the field and also to encourage more cell biologists to apply spatial proteomics approaches.
A Golden Age for Working with Public Proteomics Data.
Martens Lennart,Vizcaíno Juan Antonio
Trends in biochemical sciences
Data sharing in mass spectrometry (MS)-based proteomics is becoming common scientific practice, as it already is in other, more mature 'omics' disciplines such as genomics and transcriptomics. We want to highlight that this situation, unprecedented in the field, opens a plethora of opportunities for data scientists. First, we explain in some detail the work already achieved, such as systematic reanalysis efforts. We also explain existing applications of public proteomics data, such as proteogenomics and the creation of spectral libraries and spectral archives. Finally, we discuss the main existing challenges and mention the first attempts to combine public proteomics data with other types of omics data sets.
Integrating Networks and Proteomics: Moving Forward.
Goh Wilson Wen Bin,Wong Limsoon
Trends in biotechnology
Networks can resolve many analytical problems in proteomics, including incomplete coverage and inconsistency. Despite high expectations, network-related research in proteomics has experienced only modest growth. In practice, most current research examines non-quantitative usages, for example determining physical interactions among proteins or contextualizing a differential protein list, rather than addressing practical quantitative usages, for example predicting missing proteins or making sample-class predictions. Moreover, many applications are irreproducible and are not widely adopted owing to a lack of common standards, particularly evaluation criteria and gold-standard datasets. A concerted drive towards quantitative applications and convergence towards common standards is essential for 'network-based proteomics' to realize its development potential and make meaningful contributions to clinical applications.
From Molecules to Mechanisms: Functional Proteomics and Its Application to Renal Tubule Physiology.
Rinschen Markus M,Limbutara Kavee,Knepper Mark A,Payne D Michael,Pisitkun Trairak
Classical physiological studies using electrophysiological, biophysical, biochemical, and molecular techniques have created a detailed picture of molecular transport, bioenergetics, contractility and movement, and growth, as well as the regulation of these processes by external stimuli in cells and organisms. Newer systems biology approaches are beginning to provide deeper and broader understanding of these complex biological processes and their dynamic responses to a variety of environmental cues. In the past decade, advances in mass spectrometry-based proteomic technologies have provided invaluable tools to further elucidate these complex cellular processes, thereby confirming, complementing, and advancing common views of physiology. As one notable example, the application of proteomics to study the regulation of kidney function has yielded novel insights into the chemical and physical processes that tightly control body fluids, electrolytes, and metabolites to provide optimal microenvironments for various cellular and organ functions. Here, we systematically review, summarize, and discuss the most significant findings from functional proteomic studies in renal epithelial physiology. We also identify further improvements in technological and bioinformatics methods that will be essential to advance precision medicine in nephrology.