Extraction of TGFβ1 protein information using UniProt

UniProt is a protein database that holds an entry for the TGFβ1 protein. To retrieve the relevant information, type keywords into the search bar, as shown in the image below. The results list displays the length of each protein by default; for TGFβ1 (entry ID P01137) the length is 390 amino acids. Clicking the entry link, P01137, opens a new page that describes the function of the protein in the human body.


UniProtKB entry result page to find the TGFβ1 protein
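
The same search and entry lookup can also be scripted. The following is a minimal sketch, assuming the public UniProt REST endpoint at rest.uniprot.org; the helper names (entry_url, search_url, fetch_text) and the example keywords are illustrative and do not come from the UniProt website itself.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the UniProtKB REST service (assumed to be reachable).
UNIPROT_REST = "https://rest.uniprot.org/uniprotkb"


def entry_url(accession, fmt="fasta"):
    """Return the REST URL of one UniProtKB entry in the given format."""
    return f"{UNIPROT_REST}/{accession}.{fmt}"


def search_url(keywords, size=5):
    """Return a REST search URL for free-text keywords (JSON results)."""
    params = urlencode({"query": keywords, "format": "json", "size": size})
    return f"{UNIPROT_REST}/search?{params}"


def fetch_text(url, timeout=30):
    """Download a URL and decode the body as UTF-8 (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only the URLs are printed here; pass them to fetch_text() to download.
    print(entry_url("P01137"))
    print(search_url("TGFB1 human"))
```

Calling fetch_text(entry_url("P01137")) should return the sequence of the entry as FASTA text, provided the service is available.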

Open the link to see general information about the TGFβ1 protein on the entry page. The protein name, gene name, organism and status are stated there. The page also describes the function of the protein, including co-factor binding sites and other non-protein binding sites.


Sample entry information page

Structural characteristics of protein

Further down the same page, the structural characteristics of this region of the TGFβ1 protein are described in terms of the following components:

  • helix: a secondary structure motif in which the backbone coils; the entry shows the position and length of each helix.
  • turn: shows the position and length of the stretches where the polypeptide reverses its direction.
  • beta: the strands that make up a beta sheet; the entry shows the position and length of each stretch of extended backbone conformation.
  • strands and PDB: this part of the entry indicates that an experimentally determined structure of the protein is available.

Secondary structure information on location of helix, turns and beta strands
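
To make the "position and length" idea concrete, here is a small sketch that stores secondary structure features as (type, start, end) rows and totals the residues per type. The positions are hypothetical placeholders, not the real P01137 annotation, and the function names are illustrative.

```python
from collections import defaultdict

# Hypothetical feature rows (type, start, end); NOT the real TGFβ1 values.
FEATURES = [
    ("helix", 10, 21),
    ("turn", 22, 24),
    ("beta strand", 25, 31),
    ("helix", 40, 45),
]


def feature_length(start, end):
    """Length of a feature whose start and end positions are both inclusive."""
    return end - start + 1


def residues_per_type(features):
    """Sum the number of residues covered by each feature type."""
    totals = defaultdict(int)
    for kind, start, end in features:
        totals[kind] += feature_length(start, end)
    return dict(totals)


if __name__ == "__main__":
    for kind, total in residues_per_type(FEATURES).items():
        print(f"{kind}: {total} residues")
```

Because both positions are inclusive, a feature from residue 22 to residue 24 has a length of 3, not 2.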

Protein sequence and 3D model

The same page shows the sequence of TGFβ1. The first amino acid residue is methionine (M) and the last residue is serine (S).


FASTA format of the entry protein sequence
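
Once the FASTA text has been copied or downloaded, the first residue, the last residue and the length can be read programmatically. The sketch below assumes a single-record FASTA string with a UniProt-style header; the record used here is a short toy example with a made-up accession, not the TGFβ1 sequence.

```python
def parse_fasta(text):
    """Parse a single-record FASTA string into (header, sequence)."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    # The header loses its leading '>'; sequence lines are joined together.
    return lines[0][1:], "".join(lines[1:])


def summarize(text):
    """Return accession, length, first residue and last residue of a record."""
    header, sequence = parse_fasta(text)
    parts = header.split("|")
    # UniProt-style headers look like 'sp|ACCESSION|NAME description'.
    accession = parts[1] if len(parts) > 2 else None
    return {
        "accession": accession,
        "length": len(sequence),
        "first": sequence[0],
        "last": sequence[-1],
    }


# Toy record for illustration only (hypothetical accession and sequence).
TOY_RECORD = """>sp|X00000|TOY_HUMAN Hypothetical toy protein
MKTAYIAKQR
QISFVKS
"""

if __name__ == "__main__":
    print(summarize(TOY_RECORD))
```

Applied to the real P01137 record, the reported length should match the value shown on the results page.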

Furthermore, to check the 3D structure, visit the Structure section of the same page, which links to the 3D structure databases that hold the structures. The Method column states the technique used to determine and visualize each 3D structure (NMR or X-ray). The Resolution column gives a measure of the quality of the data obtained with the crystallography or spectroscopy method. The Chain column (A/B/C/D etc.) identifies the polypeptide chains included in the 3D model, and Positions shows the part of the sequence that the model covers. PDBsum is a pictorial database of the protein 3D structures deposited in the Protein Data Bank.

Information on 3D structure of the entry protein
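
The Chain and Positions values can be combined to work out how much of the sequence a model covers. The sketch below assumes the values are written in the compact form "A/B=279-390" (chain identifiers separated by slashes, then an inclusive residue range), with several segments separated by commas; this notation and the example string are assumptions made for illustration.

```python
def parse_chains(spec):
    """Split a string such as 'A/B=279-390' into (chains, start, end) tuples."""
    segments = []
    for part in spec.split(","):
        chains, _, positions = part.strip().partition("=")
        start, _, end = positions.partition("-")
        segments.append((chains.split("/"), int(start), int(end)))
    return segments


def covered_residues(spec):
    """Count the residues covered by each segment (positions are inclusive)."""
    return [end - start + 1 for _, start, end in parse_chains(spec)]


if __name__ == "__main__":
    example = "A/B=279-390"  # illustrative value, see the assumptions above
    print(parse_chains(example))
    print(covered_residues(example))
```

A model that covers positions 279 to 390 spans 112 residues, so it represents only part of a 390-residue sequence.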

Clicking any of the PDBsum links opens the 3D structure. The description gives basic information about the source of the structure and the researchers who contributed to the 3D modelling. Title is the name of the research paper that presented the initial model. Structure gives the type of chain and the section of the protein sequence used for the 3D modelling. Source shows the origin of the protein sequence: the organism, the part from which the protein was taken and other related information. Resolution and method show the technique used to study the 3D structure, either NMR or X-ray crystallography, along with its resolution. Authors lists the researchers who conducted the study. Similarly, the key references give the publication that reports the study. Finally, the page states the date and the publication details.

Information page of 3D structure and 3D model (PDBsum)

Avishek Majumder


Research Analyst at Project Guru
Avishek is a Master in Biotechnology and has previously worked with Lifecell International Private Limited. Apart from data analysis and biological research, he loves photography and reading. He loves to play football and basketball in his spare time with an avid interest in adventure and nature. He was also a member of the Scouts in his school and has attended Military training.
