What is HIP2?
HIP2 is a web-based database whose name stands for Healthy Human Individuals' Integrated Plasma Proteome. The database provides comprehensive information on plasma proteins detected in the blood of "healthy" or "normal" individuals (defined as healthy human adults without major life-threatening disease, known genetic diseases, HIV, or inflammation at the time of blood drawing), using different tandem mass spectrometry techniques.
How should I pronounce HIP2?
We pronounce it "Hip Two". When excited, we also tend to yell "Hip-Hip", since the initials of the database name contain two H's, two I's and two P's. The mathematically oriented folks think we are weird and that the normal way of saying the name should be "Hip squared", both for the superscript position of the number 2 and for the augmented-reality emotion that this version could bring forth. So it is really up to our readers to decide which version is the "hippest" to adopt.
Why should I use HIP2?
The HIP2 database provides protein biologists and clinical biomedical researchers with new opportunities to investigate which proteins may be used for future biomarker research, by comparing plasma proteomics results from patients with diseases such as cancer, neurodegenerative diseases, metabolic diseases, and other genetic disorders. HIP2 provides the first such "background" information on which proteins are expected, with ample peptide sequences as evidence, in healthy individuals' plasma.
Why focus on plasma proteome for healthy individuals only?
Many shotgun proteomics experiments performed today can observe only a few hundred high- to medium-abundance proteins, owing to the limited dynamic range of instrument detection, the wide range of protein concentrations in blood, and the difficulty of protein separation. An exhaustive search of all MS spectra against all theoretical peptide patterns is also not possible on a routine basis. These technical barriers have made it difficult to consistently observe low-abundance but expected plasma proteins in plasma samples from healthy individuals. In effect, observing any "normal and expected" protein becomes a stochastic sampling process, even in well-controlled experiments. Therefore, it is essential to compare proteomics results from "treated" or "diseased" samples obtained on any particular proteomics platform against "normal and expected" results obtained from all proteomics platforms, collected in a database such as HIP2. We want to remain focused on this mission. For a general repository of peptide mapping and proteomics data for human plasma, refer to the ProteomeCommons.Org Data Project and to our manuscript published in BMC Medical Genomics in 2008.
How is HIP2 designed?
We show in the following figure the overall design of the database. User queries are passed to the backend database, which is implemented on an Oracle 10g relational database system hosted at the Indiana University High-performance Computing Facilities. The query box directs the results of a user query to one of three web pages: the protein page, the peptide page, or the experimental information page. The database result pages are also linked to external web pages.

What types of questions will I be able to answer by querying the HIP2 database?
The HIP2 database, although simple in its GUI, is quite powerful for searching all known healthy human plasma proteins, as long as the peptide evidence and the protein are published and included in our database. In particular, the database can easily answer the following sample questions posed by our users:
- Has my favorite protein, X, been observed in the plasma proteome before?
- Use 1 and 3. (Refer to the figure above for the number references, here and in the answers below. Examine the protein-peptide mapping for the protein.)
- What evidence is available for judging how confident I can be that protein X is present in the healthy plasma proteome?
- Use 1, 3, 6, 8 and 5. (Investigate the peptide number, the protein-peptide alignment, and the trypsin cut match information.)
- For groups of similar proteins (e.g., different variants from the same gene), how are they currently represented in the plasma proteome, and is there a gap in knowledge?
- Use 1, 3, 4 and 5. (Find a common peptide for all the similar proteins, search by peptide, and examine the protein reports and the peptide-protein alignment map.)
- Are there other proteins having the same peptides that could belong to healthy plasma?
- Use 1, 2, 3, 6 and 8. (Find a common peptide for all the similar proteins, examine the missing plasma protein report for one of the proteins, and examine the peptide-protein alignment map.)
- Are there other proteins having the same peptides that may not belong to healthy plasma?
- Use 1, 2, 3, 6 and 8. (Find a common peptide for all the similar proteins, examine the plasma protein report for one of the proteins to see whether the evidence is weak, and examine the peptide-protein alignment map to see whether the trypsin cut match is good for the matched peptide.)
How do I enter information in the main database query page?
You may enter your query with one of the two query types:
- A protein query. You may enter a protein's accession number or name, which can be an IPI number, Swiss-Prot ID, GI number, Vega ID, gene name, or UniProt name, in the "Enter a protein" field.
- A peptide query. You may enter a peptide sequence of up to 4000 letters, using the single-letter representation for each amino acid and no spaces.
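Before submitting a peptide query, it can help to check the sequence locally against the constraints described above (single-letter codes, no spaces, at most 4000 letters). The following Python sketch is only an illustration: the function name is hypothetical, and restricting input to the 20 standard amino acids and converting to upper case are assumptions, since this page does not say how HIP2 treats other symbols or lower-case input.

```python
# Assumption: only the 20 standard amino acid letters are accepted.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")
# Limit stated on this help page for peptide queries.
MAX_PEPTIDE_LENGTH = 4000


def normalize_peptide_query(raw):
    """Return an upper-case peptide string or raise ValueError if it is not usable."""
    peptide = raw.strip().upper()
    if not peptide:
        raise ValueError("peptide query is empty")
    if len(peptide) > MAX_PEPTIDE_LENGTH:
        raise ValueError(f"peptide query exceeds {MAX_PEPTIDE_LENGTH} letters")
    # Internal spaces, digits and non-standard letters all end up here.
    invalid = sorted(set(peptide) - STANDARD_AMINO_ACIDS)
    if invalid:
        raise ValueError(f"unsupported characters in peptide query: {invalid}")
    return peptide


if __name__ == "__main__":
    print(normalize_peptide_query(" algvmddlk "))
```

Rejecting internal spaces, rather than silently removing them, keeps a pasted multi-peptide string from being searched as one long peptide.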
What information can I obtain from the database?
You may expect three types of information pages, with each page type corresponding to a group of information relevant to your search query:
- The Protein Page. You will learn whether a protein is in the healthy human individuals' plasma proteome, what function it has, what peptides can be mapped to the protein, and how the peptides are aligned according to trypsin digestion rules.
- The Peptide Page. You will learn whether a peptide can be mapped to any protein in the IPI database and, if so, whether that protein is a healthy human individuals' plasma protein. If the protein containing the peptide is found in the HIP2 database, summary protein information and all peptide evidence available for the protein will be shown. Additionally, the alignment of the query peptide on all the matching IPI proteins is shown.
- The Experimental Information Page. You will obtain detailed information on the sample collection/treatment and experimental details.
How is information on the "Protein Page" organized?
There are three types of information associated with each protein, if it is found in the HIP2 database. They are:
- Protein Summary. For example, the following shows a summary of protein IPI00000138, with all common protein accession numbers, and brief function descriptions.
| Protein Summary | |
| --- | --- |
| IPI ID | IPI00000138.1 |
| Sequence Length (A.A. Residues) | 445 |
| Uniprot Name | MGAT1_HUMAN |
| SwissProt ID | P26572 |
| Related SwissProt/TrEMBL IDs | P26572 |
| RefSeq ID | NP_002397 |
| Vega ID | OTTHUMP00000161546 |
| Gene Symbol | MGAT1 |
| Description | Alpha-1,3-mannosyl-glycoprotein 2-beta-N-acetylglucosaminyltransferase |
| Functional Keywords | Glycosyltransferase; Golgi stack; Signal-anchor; Transferase; Transmembrane. |
- Experimental Evidence. For example, the following is a partial view of the peptide results mapped to the query protein IPI00000138. Only 2 of the 6 peptide results are shown. Note that Experimental KeyCode is a field that we use to distinguish one experiment from another. Experimental Type is the separation/mass spectrometry platform used to acquire the raw data. Search Software is the bioinformatics tool used to identify the peptide from the raw data. The hyperlink in this field directs the user to the homepage for the software. Any known posttranslational modifications are listed in the Posttranslational Modification field - if none are known to exist, the field reports "unknown". PubMed Reference is the journal article reference source of the data. The Comments field contains other information about the data, such as ion charge state, signal intensity, peptide identification score, and peptide identification score thresholds.
| Experimental Evidence | |
| --- | --- |
| Total Peptide(s) Mapped | 6 |
| Peptide Sequence #1 | GLLQQIGDALSSQRGRVPTAAPPAQPR |
| Experiment KeyCode | PEPATLAS_E1.01 |
| Experimental Type | LC-MS/MS |
| Search Software | SEQUEST |
| Posttranslational Modification | unknown |
| PubMed Reference | Proteomics_2005_vol5_pp3497 |
| Comments | PeptideProphet probability 0.97 |
| Peptide Sequence #2 | ALGVMDDLK |
| Experiment KeyCode | IUBPPM_E1.01 |
| Experimental Type | IMS-MS/MS_TOF |
| Search Software | MASCOT |
| Posttranslational Modification | unknown |
| PubMed Reference | J Am Soc Mass Spectrom_2007_vol18_pp1249 |
| Comments | Z=3 Intensity=2.47E+03 Score=35 |
- Protein-peptide Alignment Map. This map displays the full protein sequence (the first sequence in the alignment map) with trypsin cleavage sites (K and R, when not followed by P) in red and all other amino acids in black. Identified peptides are displayed in green below the corresponding stretch of the protein sequence, in the order in which they first appear in the Experimental Evidence table. Non-tryptic sequences can be readily identified by comparing the ends of the identified peptides with the trypsin cleavage sites in the full protein sequence. The following example shows all 6 peptides that were mapped to the protein IPI00000138 by the underlying experiments.
| Protein-Peptide Alignment Map |
  0 | MLKKQSAGLVLWGAILFVAWNALLLLFFWTRPAPGRPPSVSALDGDPASLTREVIRLAQDAEVELERQRGLLQQIGDALSSQRGRVPTAAPPAQPRVPVT
    | ---------------------------------------------------------------------GLLQQIGDALSSQRGRVPTAAPPAQPR----
    | ----------------------------------------------------------------------------------------------------
    | ---KQSAGLVLWGAILFVAWNALLLLFFWTRPAPGRPPSVSALDGDPASLTR------------------------------------------------
    | ------------------------------------------------------------------------------------------------VPVT
    | ----------------------------------------------------------------------------------------------------
    | ----------------------------------------------------------------------------------------------------
100 | PAPAVIPILVIACDRSTVRRCLDKLLHYRPSAELFPIIVSQDCGHEETAQAIASYGSAVTHIRQPDLSSIAVPPDHRKFQGYYKIARHYRWALGQVFRQF
    | ----------------------------------------------------------------------------------------------------
    | ----------------------------------------------------------------------------------------------------
    | ----------------------------------------------------------------------------------------------------
    | PAPAVIPILVIACDRSTVR---------------------------------------------------------------------------------
    | ----------------------------------------------------------------------------------------------------
    | ----------------------------------------------------------------------------------------------------
200 | RFPAAVVVEDDLEVAPDFFEYFRATYPLLKADPSLWCVSAWNDNGKEQMVDASRPELLYRTDFFPGLGWLLLAELWAELEPKWPKAFWDDWMRRPEQRQG
    | ----------------------------------------------------------------------------------------------------
    | ----------------------------------------------------------------------------------------------------
    | ----------------------------------------------------------------------------------------------------
    | ----------------------------------------------------------------------------------------------------
    | ----------------------------------------------------------------------------------------------------
    | ----------------------------------------------------------------------------------------------------
300 | RACIRPEISRTMTFGRKGVSHGQFFDQHLKFIKLNQQFVHFTQLDLSYLQREAYDRDFLARVYGAPQLQVEKVRTNDRKELGEVRVQYTGRDSFKAFAKA
    | ----------------------------------------------------------------------------------------------------
    | ---------------------------------------------------------------------------------------------------A
    | ----------------------------------------------------------------------------------------------------
    | ----------------------------------------------------------------------------------------------------
    | -------------------------------------------------------------------------------ELGEVRVQYTGR---------
    | -----------------------------------------------------------------------------------------------AFAKA
400 | LGVMDDLKSGVPRAGYRGIVTFQFRGRRVHLAPPPTWEGYDPSWN
    | ---------------------------------------------
    | LGVMDDLK-------------------------------------
    | ---------------------------------------------
    | ---------------------------------------------
    | ---------------------------------------------
    | LGVMDDLK-------------------------------------
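The cleavage rule used in the map (cut after K or R unless the next residue is P) can be sketched in a few lines of Python. This is an illustrative sketch and not the code behind HIP2; the function names are hypothetical, and the test fragment is an 18-residue stretch taken from the IPI00000138 sequence shown above.

```python
import re


def tryptic_cleavage_sites(sequence):
    """Return 0-based indices of K/R residues that are not followed by P."""
    return [match.start() for match in re.finditer(r"[KR](?!P)", sequence)]


def digest(sequence):
    """Split a sequence after every cleavage site (no missed cleavages)."""
    peptides = []
    start = 0
    for site in tryptic_cleavage_sites(sequence):
        peptides.append(sequence[start:site + 1])
        start = site + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # trailing piece without a site
    return peptides


def has_tryptic_ends(protein, peptide):
    """Check both ends of the first occurrence of peptide in protein."""
    start = protein.find(peptide)
    if start < 0:
        return False
    end = start + len(peptide)
    sites = set(tryptic_cleavage_sites(protein))
    n_terminus_ok = start == 0 or (start - 1) in sites
    c_terminus_ok = end == len(protein) or (end - 1) in sites
    return n_terminus_ok and c_terminus_ok


if __name__ == "__main__":
    fragment = "AFAKALGVMDDLKSGVPR"
    print(digest(fragment))                         # ['AFAK', 'ALGVMDDLK', 'SGVPR']
    print(has_tryptic_ends(fragment, "ALGVMDDLK"))  # True
    print(has_tryptic_ends(fragment, "LGVMDDLK"))   # False
```

A peptide for which `has_tryptic_ends` returns False corresponds to what the text above calls a non-tryptic sequence; a peptide such as AFAKALGVMDDLK has tryptic ends but contains one internal, uncut site.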
How is information on the "Peptide Page" organized?
There are three types of information associated with each query peptide, if it is found in the HIP2 database. They are:
- Peptide Summary. For example, the following shows a summary of the query peptide ALGVMDDLK, including its amino acid sequence length. In addition, it provides a link to the PeptideAtlas database if the peptide is recorded there (the peptide is not always found in PeptideAtlas, since our database integrates more sources of data). It also provides a link to the SAPS (statistical analysis of protein sequences) server for researchers interested in analyzing peptide compositional patterns.
- Experimental Evidence. For example, the following shows all the human proteins that can be found with the query peptide ALGVMDDLK. One of the proteins, IPI00000138, is found in the healthy human plasma proteome, whereas the other, IPI00179044, although it comes from the same gene MGAT1, has not been reported before. Such information may serve as the basis for further investigation.
- Peptide-protein Alignment Map. This map displays the full peptide sequence on the first line, followed by all the matching proteins found in the IPI database (whether or not evidence was found for them), in the same order in which they appear in the Experimental Evidence table. All proteins are shown with trypsin cleavage sites (K and R, when not followed by P) in red and all other amino acids in black. Users can use this map to examine the context of the peptide in every available protein and to judge which proteins are likely or unlikely to be inferred from peptide-to-protein experimental results.
| Peptide-Protein Alignment Map |
  0 | ----------------------------------------------------------------------------------------------------
    | MLKKQSAGLVLWGAILFVAWNALLLLFFWTRPAPGRPPSVSALDGDPASLTREVIRLAQDAEVELERQRGLLQQIGDALSSQRGRVPTAAPPAQPRVPVT
    | ....................................................................................................
100 | ----------------------------------------------------------------------------------------------------
    | PAPAVIPILVIACDRSTVRRCLDKLLHYRPSAELFPIIVSQDCGHEETAQAIASYGSAVTHIRQPDLSSIAVPPDHRKFQGYYKIARHYRWALGQVFRQF
    | ..........................................MLKKQSAGLVLWGAILFVAWNALLLLFFWTRPAPGRPPSVSALDGDPASLTREVFRQF
200 | ----------------------------------------------------------------------------------------------------
    | RFPAAVVVEDDLEVAPDFFEYFRATYPLLKADPSLWCVSAWNDNGKEQMVDASRPELLYRTDFFPGLGWLLLAELWAELEPKWPKAFWDDWMRRPEQRQG
    | RFPAAVVVEDDLEVAPDFFEYFRATYPLLKADPSLWCVSAWNDNGKEQMVDASRPELLYRTDFFPGLGWLLLAELWAELEPKWPKAFWDDWMRRPEQRQG
300 | ---------------------------------------------------------------------------------------------------A
    | RACIRPEISRTMTFGRKGVSHGQFFDQHLKFIKLNQQFVHFTQLDLSYLQREAYDRDFLARVYGAPQLQVEKVRTNDRKELGEVRVQYTGRDSFKAFAKA
    | RACIRPEISRTMTFGRKGVSHGQFFDQHLKFIKLNQQFVHFTQLDLSYLQREAYDRDFLARVYGAPQLQVEKVRTNDRKELGEVRVQYTGRDSFKAFAKA
400 | LGVMDDLK-------------------------------------
    | LGVMDDLKSGVPRAGYRGIVTFQFRGRRVHLAPPPTWEGYDPSWN
    | LGVMDDLKSGVPRAGYRGIVTFQFRGRRVHLAPPLTWEGYDPSWN
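A map of this kind can be approximated with plain string matching: locate the query peptide in a protein, draw a row of dashes with the peptide at each matching position, and wrap the rows into fixed-width blocks labelled with their offset. The Python sketch below assumes exact substring matching and a configurable block width; the function names are hypothetical and the code is not taken from HIP2.

```python
def find_occurrences(protein, peptide):
    """Return the 0-based start of every (possibly overlapping) exact match."""
    positions = []
    start = protein.find(peptide)
    while start != -1:
        positions.append(start)
        start = protein.find(peptide, start + 1)
    return positions


def alignment_row(protein, peptide, gap="-"):
    """Return a row as long as the protein, with the peptide at each match."""
    row = [gap] * len(protein)
    for start in find_occurrences(protein, peptide):
        row[start:start + len(peptide)] = peptide
    return "".join(row)


def format_blocks(rows, width=100):
    """Wrap equally long rows into blocks; label the first row of each block."""
    lines = []
    length = max(len(row) for row in rows)
    for offset in range(0, length, width):
        for index, row in enumerate(rows):
            label = str(offset) if index == 0 else ""
            lines.append(f"{label:>4} | {row[offset:offset + width]}")
    return lines


if __name__ == "__main__":
    fragment = "AFAKALGVMDDLKSGVPR"  # stretch of IPI00000138 from the map above
    rows = [alignment_row(fragment, "ALGVMDDLK"), fragment]
    print("\n".join(format_blocks(rows, width=10)))
```

Putting the peptide row first mirrors the Peptide Page layout; on the Protein Page the protein sequence would come first, followed by one row per identified peptide.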
How is information on the "Experimental Information Page" organized?
The Experimental Information Page contains additional details about the experiment, including the human subjects' Ethnic Group and Gender, the Sample Preparation Method, Protein Separation, Material Type, Peptide Separation, Depletion Method, and Reduction Method (treatment of cysteines), if such information is disclosed in the source publication.
| Experimental Information | |
| --- | --- |
| Experimental KeyCode | HUPO_E46.01 |
| Ethnic Group | Chinese |
| Gender | ND |
| Sample Preparation Method | clot at room temp. |
| Protein Separation | none |
| Material Type | Plasma |
| Peptide Separation | rp |
| Depletion Method | top6 |
| Reduction Method | iam |
The following shows the abbreviations used in the experimental information table.
| Abbreviation | Meaning |
| --- | --- |
| aig | depletion of albumin and Ig |
| ig | depletion of Ig |
| top6 | depletion of the six most abundant proteins: albumin, IgG, IgA, haptoglobin, alpha-1-antitrypsin and transferrin |
| cho affinity | affinity chromatography |
| sds | sodium dodecyl sulfate polyacrylamide gel electrophoresis |
| rotofor | Rotofor(TM) apparatus for fractionation |
| iam | iodoacetamide |
| scx | strong cation exchange |
| none | not done |
| unknown | information not available |
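When processing Experimental Information records outside the web page, the abbreviations above can be kept in a small lookup table. The Python sketch below is only an illustration: the dictionary is transcribed from the table above, the function name and the record layout are hypothetical, and values without an entry in the table (such as "rp" in the example record) are deliberately returned unchanged rather than guessed.

```python
# Transcribed from the abbreviation table on this page.
ABBREVIATIONS = {
    "aig": "depletion of albumin and Ig",
    "ig": "depletion of Ig",
    "top6": ("depletion of the six most abundant proteins: albumin, IgG, IgA, "
             "haptoglobin, alpha-1-antitrypsin and transferrin"),
    "cho affinity": "affinity chromatography",
    "sds": "sodium dodecyl sulfate polyacrylamide gel electrophoresis",
    "rotofor": "Rotofor(TM) apparatus for fractionation",
    "iam": "iodoacetamide",
    "scx": "strong cation exchange",
    "none": "not done",
    "unknown": "information not available",
}


def expand_abbreviation(value):
    """Return the documented meaning, or the value itself if it is not listed."""
    return ABBREVIATIONS.get(value.strip().lower(), value)


if __name__ == "__main__":
    # Hypothetical record layout, using the values of experiment HUPO_E46.01.
    record = {
        "Protein Separation": "none",
        "Peptide Separation": "rp",
        "Depletion Method": "top6",
        "Reduction Method": "iam",
    }
    for field, value in record.items():
        print(f"{field}: {expand_abbreviation(value)}")
```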
How should the work be referenced?
Sudipto Saha, Scott Harrison, Changyu Shen, Haixu Tang, Predrag Radivojac, Randy J Arnold, Xiang Zhang, and Jake Yue Chen (2008) HIP2: An Online Database of Human Plasma Proteins from Healthy Individuals. BMC Medical Genomics, 1:12.