Hh-Ii-Pp Help Page

What is HIP2?

HIP2 stands for Healthy Human Individuals' Integrated Plasma Proteome. It is a web-based database that provides comprehensive information on plasma proteins detected in the blood of "healthy" or "normal" individuals (defined as healthy human adults without major life-threatening disease, known genetic diseases, HIV, or inflammation at the time of blood drawing), using different tandem mass spectrometry techniques.

How should I pronounce HIP2?

We pronounce it "Hip Two". When excited, we also tend to yell out "Hip-Hip", since the original initials of the database name contain two H's, two I's, and two P's. The mathematically oriented folks think we are weird and that the normal way of saying the name should be "Hip squared", both for the superscript position of the number 2 and for the augmented reality emotion that this version could bring forth. So it is really up to our readers to decide which version is the "hippest" to adopt.

Why should I use HIP2?

The HIP2 database gives protein biologists and clinical biomedical researchers new opportunities to investigate which proteins may be used in future biomarker research, by comparing against plasma proteomics results from patients with diseases such as cancer, neurodegenerative diseases, metabolic diseases, and other genetic disorders. HIP2 provides the first such "background" information on which proteins are expected, with ample peptide sequences as evidence, in the plasma of healthy individuals.

Why focus on plasma proteome for healthy individuals only?

Many shotgun proteomics experiments performed today can observe only a few hundred high- to medium-abundance proteins, owing to the limited detection dynamic range of the instruments, the wide range of protein concentrations in blood, and the difficulty of protein separation. An exhaustive search of all MS spectra against all theoretical peptide patterns is also not possible on a routine basis. These technical barriers make it difficult to consistently observe low-abundance but expected plasma proteins in plasma samples from healthy individuals. In effect, observing any "normal and expected" protein becomes a "stochastic sampling" process, even in well-controlled experiments. Therefore, it is essential to compare proteomics results from "treated" or "diseased" samples obtained from any particular proteomics platform against "normal and expected" proteomics results obtained from all proteomics platforms in a database such as HIP2. We want to remain focused on this mission. For a general repository of peptide mapping and proteomics data for human plasma, refer to the ProteomeCommons.Org Data Project and to our manuscript published in BMC Medical Genomics in 2008.
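The comparison described above can be sketched as a simple set operation on protein accessions. This is only an illustration: the two accession lists are hypothetical inputs (HIP2 itself is queried through the web page), and the function names are made up for this example.

```python
def strip_version(ipi_id: str) -> str:
    """Drop a trailing version suffix, e.g. 'IPI00000138.1' -> 'IPI00000138'."""
    return ipi_id.strip().split(".", 1)[0]


def split_by_background(sample_ids, background_ids):
    """Split sample proteins into (expected in healthy plasma, not in background).

    sample_ids: accessions observed in a "treated" or "diseased" sample (hypothetical input).
    background_ids: accessions with peptide evidence in healthy plasma (hypothetical input).
    """
    sample = {strip_version(i) for i in sample_ids}
    background = {strip_version(i) for i in background_ids}
    expected = sorted(sample & background)    # also seen in healthy plasma
    candidates = sorted(sample - background)  # absent from the healthy background
    return expected, candidates


if __name__ == "__main__":
    expected, candidates = split_by_background(
        ["IPI00000138.1", "IPI00179044.1"],  # hypothetical sample result
        ["IPI00000138"],                     # hypothetical healthy background
    )
    print("expected:", expected)
    print("candidates:", candidates)
```

Because of the stochastic sampling noted above, a protein missing from the background list is only a candidate for follow-up, not proof that it is disease-specific.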

How is the HIP2 designed?

The following figure shows the overall design of the database. User queries are sent to the backend database, which is implemented in the Oracle 10g relational database system hosted at the Indiana University High-Performance Computing Facilities. The query box directs the results of a user query to one of three web pages: the protein page, the peptide page, or the experimental information page. The database result pages are also linked to external web pages.

What types of questions will I be able to answer by querying the HIP2 database?

The HIP2 database, although simple in its GUI, is quite powerful for searching all known healthy human plasma proteins, as long as the protein and its peptide evidence have been published and included in our database. In particular, the database can easily answer the following sample questions posed by our users:

How do I enter information in the main database query page?

You may enter your query with one of the two query types:

What information can I obtain from the database?

You may expect three types of information pages (the Protein Page, the Peptide Page, and the Experimental Information Page), with each page type corresponding to a group of information relevant to your search query.

How is information on the "Protein Page" organized?

There are three types of information associated with each protein, if it is found in the HIP2 database. They are:

  1. Protein Summary. For example, the following shows a summary of protein IPI00000138, with all common protein accession numbers and a brief functional description.

    Protein Summary
    IPI ID: IPI00000138.1
    Sequence Length (A.A. Residues): 445
    Uniprot Name: MGAT1_HUMAN
    SwissProt ID: P26572
    Related SwissProt/TrEMBL IDs: P26572
    RefSeq ID: NP_002397
    Vega ID: OTTHUMP00000161546
    Gene Symbol: MGAT1
    Description: Alpha-1,3-mannosyl-glycoprotein 2-beta-N-acetylglucosaminyltransferase
    Functional Keywords: Glycosyltransferase; Golgi stack; Signal-anchor; Transferase; Transmembrane.

  2. Experimental Evidence. For example, the following is a partial view of the peptide results mapped to the query protein IPI00000138. Only 2 of the 6 peptide results are shown. Note that Experimental KeyCode is a field that we use to distinguish one experiment from another. Experimental Type is the separation/mass spectrometry platform used to acquire the raw data. Search Software is the bioinformatics tool used to identify the peptide from the raw data; the hyperlink in this field directs the user to the homepage of the software. Any known posttranslational modifications are listed in the Posttranslational Modification field; if none are known to exist, the field reports "unknown". PubMed Reference is the journal article from which the data were taken. The Comments field contains other information about the data, such as ion charge state, signal intensity, peptide identification score, and peptide identification score thresholds.

    Experimental Evidence
    Total Peptide(s) Mapped: 6

    Peptide Sequence #1: GLLQQIGDALSSQRGRVPTAAPPAQPR
    Experiment KeyCode: PEPATLAS_E1.01
    Experimental Type: LC-MS/MS
    Search Software: SEQUEST
    Posttranslational Modification: unknown
    PubMed Reference: Proteomics_2005_vol5_pp3497
    Comments: PeptideProphet probability 0.97

    Peptide Sequence #2: ALGVMDDLK
    Experiment KeyCode: IUBPPM_E1.01
    Experimental Type: IMS-MS/MS_TOF
    Search Software: MASCOT
    Posttranslational Modification: unknown
    PubMed Reference: J Am Soc Mass Spectrom_2007_vol18_pp1249
    Comments: Z=3 Intensity=2.47E+03 Score=35

  3. Protein-peptide Alignment Map. This map displays the full protein sequence (the first sequence in the alignment map) with trypsin cleavage sites (K and R, not followed by P) in red and other amino acids in black. Identified peptides are displayed in green below the corresponding sequence in the protein. The peptides are shown in the order in which they first appear in the Experimental Evidence table. Non-tryptic sequences can be readily identified by comparing the identified peptides to the trypsin cleavage sites in the full protein sequence. The following is an example showing all 6 peptides that were mapped to the protein IPI00000138 by the underlying experiments.
Protein-Peptide Alignment Map
0
|
MLKKQSAGLVLWGAILFVAWNALLLLFFWTRPAPGRPPSVSALDGDPASLTREVIRLAQDAEVELERQRGLLQQIGDALSSQRGRVPTAAPPAQPRVPVT
---------------------------------------------------------------------GLLQQIGDALSSQRGRVPTAAPPAQPR----
----------------------------------------------------------------------------------------------------
---KQSAGLVLWGAILFVAWNALLLLFFWTRPAPGRPPSVSALDGDPASLTR------------------------------------------------
------------------------------------------------------------------------------------------------VPVT
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
100
|
PAPAVIPILVIACDRSTVRRCLDKLLHYRPSAELFPIIVSQDCGHEETAQAIASYGSAVTHIRQPDLSSIAVPPDHRKFQGYYKIARHYRWALGQVFRQF
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
PAPAVIPILVIACDRSTVR---------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
200
|
RFPAAVVVEDDLEVAPDFFEYFRATYPLLKADPSLWCVSAWNDNGKEQMVDASRPELLYRTDFFPGLGWLLLAELWAELEPKWPKAFWDDWMRRPEQRQG
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
300
|
RACIRPEISRTMTFGRKGVSHGQFFDQHLKFIKLNQQFVHFTQLDLSYLQREAYDRDFLARVYGAPQLQVEKVRTNDRKELGEVRVQYTGRDSFKAFAKA
----------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------A
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------ELGEVRVQYTGR---------
-----------------------------------------------------------------------------------------------AFAKA
400
|
LGVMDDLKSGVPRAGYRGIVTFQFRGRRVHLAPPPTWEGYDPSWN
---------------------------------------------
LGVMDDLK-------------------------------------
---------------------------------------------
---------------------------------------------
---------------------------------------------
LGVMDDLK-------------------------------------
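
The cleavage rule used in the map (cut after K or R unless the next residue is P) and the check for non-tryptic peptides can be sketched as follows. This is a minimal illustration of the rule as stated on this page, not the code behind the HIP2 map; the function names are hypothetical.

```python
def tryptic_cleavage_sites(protein: str) -> list[int]:
    """Return 0-based indices of K/R residues that are not followed by P."""
    sites = []
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            sites.append(i)
    return sites


def is_fully_tryptic(protein: str, peptide: str) -> bool:
    """Check that both ends of the first occurrence of the peptide sit on cleavage sites."""
    start = protein.find(peptide)
    if start == -1:
        raise ValueError("peptide not found in protein")
    end = start + len(peptide) - 1  # index of the peptide's last residue
    sites = set(tryptic_cleavage_sites(protein))
    # N-terminal side: the peptide starts the protein or follows a cleavage site.
    n_term_ok = start == 0 or (start - 1) in sites
    # C-terminal side: the peptide ends the protein or ends on a cleavage site.
    c_term_ok = end == len(protein) - 1 or end in sites
    return n_term_ok and c_term_ok


if __name__ == "__main__":
    # Fragment of the IPI00000138 sequence shown in the map above.
    fragment = "ELERQRGLLQQIGDALSSQRGRVPTAAPPAQPRVPVT"
    print(tryptic_cleavage_sites(fragment))
    print(is_fully_tryptic(fragment, "GLLQQIGDALSSQRGRVPTAAPPAQPR"))
    print(is_fully_tryptic(fragment, "LLQQIGDALSSQR"))
```

Note that a peptide can be fully tryptic at both ends and still contain internal (missed) cleavage sites; the sketch checks only the two termini.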

How is information on the "Peptide Page" organized?

There are three types of information associated with each peptide, if it is found in the HIP2 database. They are:

  1. Peptide Summary. For example, the following shows a summary of the query peptide ALGVMDDLK, showing its amino acid sequence length. In addition, it provides a link to the PeptideAtlas database, in case the peptide is recorded there (the peptide is not always found, since our database integrates more sources of data). It also provides a link to the SAPS (statistical analysis of protein sequences) server for researchers interested in analyzing peptide compositional patterns.

    Peptide Summary
    Peptide Sequence Length: 9
    PeptideAtlas Peptide Search: Search ALGVMDDLK
    SAPS Peptide Statistical Analysis: Analyze ALGVMDDLK

  2. Experimental Evidence. For example, the following shows all the human proteins that can be found by the query peptide ALGVMDDLK. One of the proteins, IPI00000138, is found in the healthy human plasma proteome, whereas the other, IPI00179044, although it comes from the same gene MGAT1, has not been reported there before. Such information may serve as a basis for further investigation.

    Experimental Evidence
    Total Number of Plasma Proteins Containing the Peptide: 2

    Protein #1
    HIP2 DB Protein Entry: Search IPI00000138
    Protein Annotation: IPI:IPI00000138.1; SWISS-PROT:P26572; TREMBL:Q59G70;Q6IBE3; ENSEMBL:ENSP00000332073; REFSEQ:NP_002397; H-INV:HIT000018985; VEGA:OTTHUMP00000161546; Tax_Id=9606 Gene_Symbol=MGAT1 Alpha-1,3-mannosyl-glycoprotein 2-beta-N-acetylglucosaminyltransferase
    Gene Symbol: MGAT1
    Peptide Evidence: Evidence for the protein is found distinctly in 3 data source(s), 3 MS experiment(s), 3 mass spectrometry types, 2 MS search software, and 6 mapped peptide sequences.

    Protein #2
    HIP2 DB Protein Entry: Search IPI00179044
    Protein Annotation: IPI:IPI00179044.1; TREMBL:Q8NBL8; ENSEMBL:ENSP00000311888; Tax_Id=9606 Gene_Symbol=MGAT1 CDNA PSEC0120 fis, clone PLACE1002379, highly similar to Human alpha-1,3-mannosyl-glycoprotein beta-1,2-N-acetylglucosaminyltransferase (MGAT) gene
    Gene Symbol: MGAT1
    Peptide Evidence: No existing evidence is found for the plasma protein.

  3. Peptide-protein Alignment Map. This map displays the full peptide sequence on the first line of the alignment map, followed by all proteins found in the IPI database for that peptide (regardless of whether evidence was found), in the same order in which they were introduced in the Experimental Evidence table. All proteins are shown with trypsin cleavage sites (K and R, not followed by P) in red and other amino acids in black. Users can use this map to examine the context of the peptide in all available proteins and to judge which proteins are likely or unlikely to be inferred from peptide-to-protein experimental results.
Peptide-Protein Alignment Map
0
|
----------------------------------------------------------------------------------------------------
MLKKQSAGLVLWGAILFVAWNALLLLFFWTRPAPGRPPSVSALDGDPASLTREVIRLAQDAEVELERQRGLLQQIGDALSSQRGRVPTAAPPAQPRVPVT
....................................................................................................
100
|
----------------------------------------------------------------------------------------------------
PAPAVIPILVIACDRSTVRRCLDKLLHYRPSAELFPIIVSQDCGHEETAQAIASYGSAVTHIRQPDLSSIAVPPDHRKFQGYYKIARHYRWALGQVFRQF
..........................................MLKKQSAGLVLWGAILFVAWNALLLLFFWTRPAPGRPPSVSALDGDPASLTREVFRQF
200
|
----------------------------------------------------------------------------------------------------
RFPAAVVVEDDLEVAPDFFEYFRATYPLLKADPSLWCVSAWNDNGKEQMVDASRPELLYRTDFFPGLGWLLLAELWAELEPKWPKAFWDDWMRRPEQRQG
RFPAAVVVEDDLEVAPDFFEYFRATYPLLKADPSLWCVSAWNDNGKEQMVDASRPELLYRTDFFPGLGWLLLAELWAELEPKWPKAFWDDWMRRPEQRQG
300
|
---------------------------------------------------------------------------------------------------A
RACIRPEISRTMTFGRKGVSHGQFFDQHLKFIKLNQQFVHFTQLDLSYLQREAYDRDFLARVYGAPQLQVEKVRTNDRKELGEVRVQYTGRDSFKAFAKA
RACIRPEISRTMTFGRKGVSHGQFFDQHLKFIKLNQQFVHFTQLDLSYLQREAYDRDFLARVYGAPQLQVEKVRTNDRKELGEVRVQYTGRDSFKAFAKA
400
|
LGVMDDLK-------------------------------------
LGVMDDLKSGVPRAGYRGIVTFQFRGRRVHLAPPPTWEGYDPSWN
LGVMDDLKSGVPRAGYRGIVTFQFRGRRVHLAPPLTWEGYDPSWN
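
The dash-padded layout used in the alignment maps can be reproduced with a short sketch like the one below. It assumes exact substring matching and a block width of 100 residues, as in the maps above; the function name is hypothetical and this is not the code that generates the HIP2 pages.

```python
def alignment_rows(protein: str, peptide: str, width: int = 100):
    """Return (offset, protein_block, peptide_block) tuples, one per block of `width` residues.

    The peptide track has the same length as the protein: matched positions carry
    the peptide's residues and every other position is a dash.
    """
    track = ["-"] * len(protein)
    start = protein.find(peptide)
    while start != -1:
        # Copy the peptide into the track at each exact match position.
        track[start:start + len(peptide)] = peptide
        start = protein.find(peptide, start + 1)
    track_str = "".join(track)
    return [
        (offset, protein[offset:offset + width], track_str[offset:offset + width])
        for offset in range(0, len(protein), width)
    ]


if __name__ == "__main__":
    # Short made-up sequence and a small block width so that the peptide
    # spans a block boundary, as ALGVMDDLK does in the maps above.
    for offset, protein_block, peptide_block in alignment_rows(
        "MLKAAALGVMDDLKSGVPR", "ALGVMDDLK", width=10
    ):
        print(offset)
        print("|")
        print(protein_block)
        print(peptide_block)
```

Running the same function once per protein gives one track per protein, which is how the peptide's context in each candidate protein can be compared side by side.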

How is information on the "Experimental Information Page" organized?

The Experimental Information Page contains additional details about the experiment, including the human subject's Ethnic Group, Gender, Sample Preparation Method, Protein Separation, Material Type, Peptide Separation, Depletion Method, and Reduction Method (treatment of cysteines), if such information is disclosed in the source publication.

Experimental Information
Experimental KeyCode: HUPO_E46.01
Ethnic Group: Chinese
Gender: ND
Sample Preparation Method: clot at room temp.
Protein Separation: none
Material Type: Plasma
Peptide Separation: rp
Depletion Method: top6
Reduction Method: iam

The following shows the abbreviations used in the experimental information table.

aig depletion of albumin and Ig
ig depletion of Ig
top6 depletion of the most abundant proteins which are albumin, IgG, IgA, haptoglobin, alpha-1-anti-trypsin and transferrin
cho affinity affinity chromatography
sds sodium dodecyl sulfate polyacrylamide gel electrophoresis
rotofor rotofor TM apparatus for fractionations
iam iodoacetamide
scx strong cation exchange
none not done
unknown information not available
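
When working with several experimental information records, the abbreviations above can be expanded with a small lookup table. The sketch below is an illustration only: the dictionary holds just the unambiguous entries from the list above, the record layout is a hypothetical plain dictionary, and any value not in the table (for example "rp" or "ND") is passed through unchanged.

```python
# Abbreviations copied from the table above (unambiguous entries only).
ABBREVIATIONS = {
    "aig": "depletion of albumin and Ig",
    "ig": "depletion of Ig",
    "top6": ("depletion of the most abundant proteins which are albumin, IgG, "
             "IgA, haptoglobin, alpha-1-anti-trypsin and transferrin"),
    "sds": "sodium dodecyl sulfate polyacrylamide gel electrophoresis",
    "iam": "iodoacetamide",
    "scx": "strong cation exchange",
    "none": "not done",
    "unknown": "information not available",
}


def expand_record(record: dict) -> dict:
    """Return a copy of the record with known abbreviations spelled out."""
    return {
        field: ABBREVIATIONS.get(value.strip().lower(), value)
        for field, value in record.items()
    }


if __name__ == "__main__":
    # Values taken from the HUPO_E46.01 example shown above.
    record = {
        "Protein Separation": "none",
        "Peptide Separation": "rp",
        "Depletion Method": "top6",
        "Reduction Method": "iam",
    }
    for field, value in expand_record(record).items():
        print(f"{field}: {value}")
```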

How should the work be referenced?

Sudipto Saha, Scott Harrison, Changyu Shen, Haixu Tang, Predrag Radivojac, Randy J Arnold, Xiang Zhang, and Jake Yue Chen (2008) HIP2: An Online Database of Human Plasma Proteins from Healthy Individuals. BMC Medical Genomics, 1:12.

For questions, please contact Dr. Jake Y. Chen (jakechen@iupui.edu) or Dr. Sudipto Saha (sahas@iupui.edu)

All Rights Reserved. Copyright © 2007 by Discovery Informatics and Computing Group, Indiana University.