Prof. An-Yuan Guo’s group develops TCRdb, the most comprehensive T-cell receptor sequence database

Time: 2020-10-02 16:29

On September 30, 2020, an original article entitled “TCRdb: a comprehensive database for T-cell receptor sequences with powerful search function” was published online in Nucleic Acids Research (IF = 11.14). The work was carried out by Prof. An-Yuan Guo’s group at the College of Life Science and Technology, Huazhong University of Science and Technology. PhD student Si-Yi Chen and master’s student Tao Yue are co-first authors; postdoctoral fellow Dr. Qian Lei and Prof. An-Yuan Guo are co-corresponding authors.


The T-cell receptor (TCR), located on the surface of T cells, is responsible for antigen recognition and adaptive immunity. The TCR is encoded by one of the most diverse regions of the human genome and determines how the human immune system adapts to changes in the environment. The TCR consists of a variable region for antigen recognition and a constant region, and can theoretically give rise to 10^15 to 10^20 different clonotypes. The collection of all distinct TCR clonotypes is called the TCR repertoire. The TCR repertoire varies greatly under different conditions (such as disease) and reflects the state of an individual's immune system. Earlier this year the authors published a TCR detection method, CATT (Bioinformatics, 2020), which sensitively detects TCRs in both TCR-Seq and RNA-Seq data from bulk or single-cell sequencing.
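To make the notion of a repertoire concrete, the following minimal Python sketch counts distinct clonotypes and their relative frequencies from a list of CDR3 sequences. The function name `summarize_repertoire` and the example sequences are hypothetical illustrations; this is not the CATT or TCRdb implementation.

```python
from collections import Counter


def summarize_repertoire(cdr3_reads):
    """Count each distinct CDR3 sequence and compute its share of all reads.

    Returns (counts, frequencies), where counts maps sequence -> read count
    and frequencies maps sequence -> fraction of total reads.
    """
    counts = Counter(cdr3_reads)
    total = sum(counts.values())
    frequencies = {seq: n / total for seq, n in counts.items()}
    return counts, frequencies


# Made-up example reads, for illustration only (not real data).
example_reads = [
    "CASSLGQGAYEQYF",
    "CASSLGQGAYEQYF",
    "CASSPGTGGYEQYF",
    "CAWSVGNTEAFF",
]

counts, frequencies = summarize_repertoire(example_reads)
for seq, n in counts.most_common():
    print(seq, n, round(frequencies[seq], 2))
```

In this simplified view, the keys of `counts` play the role of the repertoire and the frequencies describe how strongly each clonotype has expanded.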

In this paper, the authors used the CATT method described above to construct TCRdb, the most comprehensive database of T-cell receptor sequences to date, by integrating and analyzing TCR-Seq data from a wide range of diseases and conditions. The database covers more than 8,200 samples and nearly 300 million TCR CDR3 sequences. For each sample, the TCR repertoire is presented in user-friendly interactive figures that can be used for publication. The database also provides flexible sequence search (including fuzzy search and regular expression matching), meeting, for the first time, the need to query and analyze TCR data at this scale. The main functional modules of the database (http://bioinfo.life.hust.edu.cn/TCRdb/) include (i) searching for similar or identical sequences in large amounts of data to analyze the specificity of TCR sequences; (ii) browsing and analyzing T-cell receptor sequences in different states (e.g., tumor, infection, and immune-related conditions); and (iii) browsing and querying the samples in the database by source, disease state, and cell type. In addition, TCRdb currently contains the TCR repertoires of nearly 1,500 COVID-19 samples. It is the most comprehensive, annotated, and searchable TCR database to date, and was created to aid the understanding of T-cell immune regulation and its mechanisms, as well as studies related to immunotherapy.
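The two search modes mentioned above can be illustrated with a small Python sketch. It assumes that fuzzy search means tolerating a limited number of mismatches between equal-length sequences (Hamming distance); the article does not describe how TCRdb implements its search, so the functions `regex_search` and `fuzzy_search` and the example sequences below are hypothetical.

```python
import re


def hamming_distance(a, b):
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def regex_search(pattern, sequences):
    """Return the sequences that match the regular expression in full."""
    compiled = re.compile(pattern)
    return [s for s in sequences if compiled.fullmatch(s)]


def fuzzy_search(query, sequences, max_mismatches=1):
    """Return same-length sequences within max_mismatches of the query.

    Assumption: mismatch-only (Hamming) matching; insertions and
    deletions are not considered in this sketch.
    """
    return [
        s
        for s in sequences
        if len(s) == len(query)
        and hamming_distance(query, s) <= max_mismatches
    ]


# Made-up CDR3 amino-acid sequences, for illustration only.
cdr3_sequences = [
    "CASSLGQGAYEQYF",
    "CASSLGQGSYEQYF",
    "CASSPGTGGYEQYF",
    "CAWSVGNTEAFF",
]

print(regex_search(r"CASS.*YEQYF", cdr3_sequences))
print(fuzzy_search("CASSLGQGAYEQYF", cdr3_sequences, max_mismatches=1))
```

A linear scan like this is only practical for small lists; searching hundreds of millions of sequences would require indexing, which is outside the scope of this sketch.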

Prof. An-Yuan Guo is a professor and PhD supervisor whose research focuses on bioinformatics analysis of complex diseases and deep mining of gene expression data. His group has developed a series of well-regarded databases, such as AnimalTFDB, miRNASNP, lncRNASNP and EVmiRNA, and a series of methods for expression data mining: FFLtool, a method to study the co-regulation of transcription factors and miRNAs; ImmuCellAI, an immune cell component analysis method; CATT, a TCR sequence identification method; SEGtool, a specifically expressed gene detection method; and CCLA, an expression-based cancer cell line identification method. These methods and databases have been applied to discover important regulatory molecules and mechanisms in studies of, for example, leukemia and extracellular vesicles. They are available on the laboratory's website at http://bioinfo.life.hust.edu.cn/.

This work was supported by grants from the National Natural Science Foundation of China and the Key Research and Development Program of the Ministry of Science and Technology, which are gratefully acknowledged.