Extraction of TGFβ1 protein information using UniProt

By Avishek Majumder on June 2, 2017

UniProt is a protein database that holds, among many other entries, information on the TGFβ1 protein. To find the relevant entry, type keywords into the search box, as shown in the image below. Clicking the entry link, P01137 (the entry ID), opens a new page with detailed information on the function of the protein in the human body. The length of the protein is shown by default in the results list alongside the entry ID (see the image below); for TGFβ1 it is 390 amino acids.

UniProtKB entry result page to find the TGFβ1 protein
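The same search can also be run without the browser. The following is a minimal sketch, assuming the public UniProt REST service at rest.uniprot.org and its `gene`, `organism_id` and `reviewed` query fields; the helper names `search_url`, `entry_url` and `fetch_json` are made up for this illustration, and 9606 is the taxonomy ID for human.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the UniProtKB REST service.
UNIPROT_REST = "https://rest.uniprot.org/uniprotkb"


def search_url(gene, organism_id, fmt="json", size=5):
    """Build a search URL for reviewed entries of one gene in one organism."""
    query = f"gene:{gene} AND organism_id:{organism_id} AND reviewed:true"
    params = urlencode({"query": query, "format": fmt, "size": size})
    return f"{UNIPROT_REST}/search?{params}"


def entry_url(accession, fmt="json"):
    """Build the URL of a single entry, e.g. P01137 for human TGFβ1."""
    return f"{UNIPROT_REST}/{accession}.{fmt}"


def fetch_json(url):
    """Download a URL and decode the JSON body (needs network access)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)
```

Calling `fetch_json(entry_url("P01137"))` should return the entry as a dictionary, provided the service is reachable; `entry_url("P01137", "fasta")` gives the link to the plain sequence instead.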

Click the link to open the database page with general information about the TGFβ1 protein. The protein name, gene name, organism and status are stated there. The page also describes the function of the protein, including co-factor binding sites and other non-protein binding sites.

Sample entry information page

Structural characteristics of protein

In the extended image of the same page, the structural characteristics of this region of the TGFβ1 protein are described in terms of the following components:

  • helix– a motif in the secondary structure of proteins; the entry shows the position and length of each helix, which forms part of the backbone of the structure.
  • turn– shows the position and length of a stretch of sequence where the polypeptide chain reverses its direction.
  • beta strand– a component of a beta sheet; the entry shows the position and length of each stretch of extended backbone conformation.
  • PDB– indicates that a structure is known for this part of the protein sequence.
Secondary structure information on location of helix, turns and beta strands
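The position and length of each helix, turn and beta strand can also be tabulated from the entry in JSON form. This is only a sketch: it assumes that each item in the entry's `features` list carries a `type` and a `location` with `start` and `end` values, and the positions in `toy_features` are invented for illustration, not taken from P01137.

```python
# Feature types treated as secondary structure in this sketch.
SECONDARY_TYPES = {"Helix", "Turn", "Beta strand"}


def secondary_structure(features):
    """Return (type, start, end, length) for each secondary-structure feature."""
    rows = []
    for feature in features:
        if feature["type"] not in SECONDARY_TYPES:
            continue
        start = feature["location"]["start"]["value"]
        end = feature["location"]["end"]["value"]
        # Positions are inclusive, so a feature from 10 to 14 spans 5 residues.
        rows.append((feature["type"], start, end, end - start + 1))
    return rows


# Invented positions, for illustration only.
toy_features = [
    {"type": "Signal", "location": {"start": {"value": 1}, "end": {"value": 29}}},
    {"type": "Helix", "location": {"start": {"value": 40}, "end": {"value": 52}}},
    {"type": "Turn", "location": {"start": {"value": 53}, "end": {"value": 55}}},
    {"type": "Beta strand", "location": {"start": {"value": 60}, "end": {"value": 66}}},
]

for row in secondary_structure(toy_features):
    print(row)
```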

Protein sequence and 3D model

The same page shows the amino acid sequence of TGFβ1. The first residue is methionine (M) and the last residue is serine (S).

FASTA format of the entry protein sequence
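Reading the first and last residue and the length from a FASTA record can be sketched in a few lines. The record below is a made-up toy sequence, not the TGFβ1 entry, and `parse_fasta` and `summarize` are hypothetical helper names; the same functions could be applied to the FASTA text downloaded from the entry page.

```python
def parse_fasta(text):
    """Return (header, sequence) for the first record in FASTA-formatted text."""
    header, chunks = None, []
    for line in text.strip().splitlines():
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                break  # stop at the start of a second record
            header = line[1:]
        elif line:
            chunks.append(line)
    return header, "".join(chunks)


def summarize(sequence):
    """Report the length and the first and last residue of a sequence."""
    return {"length": len(sequence), "first": sequence[0], "last": sequence[-1]}


# Toy record for illustration only; not a real UniProt entry.
toy_fasta = """>toy|X00000|TOY_EXAMPLE Illustrative protein
MKTAYIAKQR
QISFVKSHFS
"""

header, sequence = parse_fasta(toy_fasta)
print(header, summarize(sequence))
```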

Furthermore, to check the 3D structure, visit the Structure section of the same page, where 3D structure databases are linked to the sites holding the 3D structures. The Method column states the technique used to determine each 3D structure (NMR or X-ray). The Resolution column gives a measure of the quality of the data obtained by the crystallography or spectroscopy method. The Chain column (A/B/C/D etc.) lists the polypeptide chains included in the 3D model, and the Positions column shows which section of the sequence the model covers. PDBsum is a pictorial database of the protein 3D structures deposited in the Protein Data Bank.

Information on 3D structure of the entry protein
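The Method, Resolution, Chain and Positions columns can be read programmatically as well. The sketch below assumes that the JSON form of an entry lists PDB cross-references under `uniProtKBCrossReferences`, each with `Method`, `Resolution` and `Chains` properties, and that chains are written in the form `A/B=279-390`. The identifiers and values in `toy_xrefs` are placeholders, not real PDB records.

```python
def parse_chains(value):
    """Split a chains string such as 'A/B=279-390' into (chains, start, end) tuples."""
    segments = []
    for part in value.split(","):
        chains, _, span = part.strip().partition("=")
        start, _, end = span.partition("-")
        segments.append((chains.split("/"), int(start), int(end)))
    return segments


def pdb_structures(cross_references):
    """Collect method, resolution and chain coverage for each PDB cross-reference."""
    rows = []
    for xref in cross_references:
        if xref.get("database") != "PDB":
            continue
        props = {p["key"]: p["value"] for p in xref.get("properties", [])}
        rows.append({
            "id": xref["id"],
            "method": props.get("Method"),
            "resolution": props.get("Resolution"),
            "segments": parse_chains(props["Chains"]) if "Chains" in props else [],
        })
    return rows


# Placeholder cross-references for illustration only.
toy_xrefs = [
    {"database": "PDB", "id": "0XXX", "properties": [
        {"key": "Method", "value": "NMR"},
        {"key": "Resolution", "value": "-"},
        {"key": "Chains", "value": "A/B=279-390"},
    ]},
    {"database": "Other", "id": "ignored", "properties": []},
]

for row in pdb_structures(toy_xrefs):
    print(row)
```

A chain range of 279 to 390 covers 112 residues, since both end positions are included.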

Clicking any of the PDBsum links opens the 3D structure. The description gives basic information on the source of the structure and on the researchers who contributed to the 3D model. Title is the name of the research paper that presented the initial model. Structure gives the type of chain and the section of the protein sequence used for the 3D model. Source shows the origin of the protein sequence, the organism, the part from which the protein was taken and other related information. Resolution and method show the technique used to determine the 3D structure, either NMR or X-ray crystallography, along with its resolution. Authors lists the names of the researchers who conducted the study. Similarly, the key reference points to the publication that reports the study. Finally, the entry gives the date and publication details.

Information page of 3D structure and 3D model (PDBsum)