HCSD ptmENTO


General description

HCSD is a comprehensive database of human cancer secretome data. It contains more than 70,000 measurements from more than 30 high-throughput studies covering 17 cancer types. Its simple, user-friendly query system offers three main query options: gene name, data type, and cancer type. The results are visualized in a clear and interactive manner.


Figure: statistics of HCSD

Browser requirements

HCSD relies on D3 and JavaScript for full functionality. Any modern browser that supports Flash should work fine. Visit Adobe's Flash pages to install or update the Flash plugin for your browser. We tested the following:

HCSD does not use browser cookies at the moment.

The database structure

(a) First, we collected all publications on human cancer secretome data, then extracted and processed their supplementary data.
(b) The annotation and cross-reference IDs were retrieved from UniProt, Ensembl, bioDBnet, and Entrez.
(c) The secretory pathway features, including signal peptides, transmembrane domains, and non-classical secretion type, were predicted with the CBS prediction servers. Secondary structure and PTM information was obtained from UniProt.
(d) The extracted proteomics data is divided into two types, label-free and label-based (see the paper for the difference). We therefore designed two separate databases in MySQL (http://www.mysql.com/) for data storage and used lighttpd (http://www.lighttpd.net/) as the web server. The interface and web application were implemented using web.py (http://webpy.org/), JavaScript (http://en.wikipedia.org/wiki/JavaScript), jQuery (http://jquery.com/), and D3 (http://d3js.org/).
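
As a rough illustration of the split described in (d), the sketch below creates one database per data type. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and every table name, column name, and example row is a hypothetical assumption; the actual HCSD schema is not described on this page.

```python
import sqlite3

# Hypothetical schemas: table and column names are illustrative assumptions,
# not the actual HCSD schema. sqlite3 stands in for MySQL here.
LABEL_FREE_SCHEMA = """
CREATE TABLE measurement (
    gene_symbol TEXT NOT NULL,
    cancer_type TEXT NOT NULL,
    pubmed_id   TEXT NOT NULL,
    method      TEXT,
    quantified  INTEGER NOT NULL  -- 1 if a quantification was reported, else 0
)
"""

LABEL_BASED_SCHEMA = """
CREATE TABLE measurement (
    gene_symbol TEXT NOT NULL,
    cancer_type TEXT NOT NULL,
    pubmed_id   TEXT NOT NULL,
    stage       TEXT,
    fold_change REAL
)
"""


def create_database(schema):
    """Create an in-memory database that holds a single measurement table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema)
    return conn


# One database per data type, mirroring the label-free / label-based split.
label_free_db = create_database(LABEL_FREE_SCHEMA)
label_based_db = create_database(LABEL_BASED_SCHEMA)

# Made-up example rows, for illustration only (placeholder PubMed ID).
label_free_db.execute(
    "INSERT INTO measurement VALUES (?, ?, ?, ?, ?)",
    ("EGFR", "colorectal carcinoma", "00000000", "spectral counting", 1),
)
label_based_db.execute(
    "INSERT INTO measurement VALUES (?, ?, ?, ?, ?)",
    ("EGFR", "colorectal carcinoma", "00000000", "stage II", 2.5),
)
```

Keeping the two data types apart means each table only carries the columns that make sense for it: a detected/quantified flag for label-free data, and stage and fold change for label-based data.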


Figure: the structure of the HCSD

Querying the HCSD

Quick search

Users can query both the label-free and the label-based data. As described in the paper, label-free data are collected from publications that mostly use spectral counting as the proteomics technique to measure the secretome of the cancer under study, while in label-based proteomics the samples are labeled with stable isotopes that allow the mass spectrometer to distinguish identical proteins from different samples.
In the label-free table of the "QUERY" menu, proteins with quantification are shown with an up arrow (↑) and proteins without quantification with a flat arrow (→). Keep in mind that in spectral counting, a missing quantification does not necessarily mean that the protein is absent.
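
The arrow convention in the label-free table can be sketched as a small function. This is only an illustration: it assumes that a missing quantification is stored as None, which is not stated on this page, and the function name is hypothetical.

```python
from typing import Optional

UP_ARROW = "↑"    # protein has a quantification
FLAT_ARROW = "→"  # protein has no quantification


def quantification_arrow(spectral_count: Optional[float]) -> str:
    """Return the arrow shown for one row of the label-free table.

    Assumption: a missing quantification is represented as None.
    A flat arrow does not imply that the protein is absent.
    """
    return FLAT_ARROW if spectral_count is None else UP_ARROW
```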



In the label-based data table, as in the label-free table, you can filter or sort the rows based on the information in the columns.

Example 1: To find out in which cancer types your gene (protein) of interest is accessible in blood plasma, type the gene symbol (such as EGFR) and "blood plasma" in the search box to filter the table.

Example 2: To find out which proteins are differentially expressed in colorectal carcinoma, search the table for colorectal carcinoma and then sort by the Fold change column.
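
The two examples can be mimicked offline with a short sketch. It assumes that the search box does a case-insensitive substring match across all columns, which is a guess about its behavior; the column names and the rows (including the gene GENE_X) are made up for illustration.

```python
def filter_rows(rows, *terms):
    """Keep rows in which every search term occurs in at least one column.

    Assumption: matching is a case-insensitive substring search.
    """
    def matches(row, term):
        return any(term.lower() in str(value).lower() for value in row.values())

    return [row for row in rows if all(matches(row, term) for term in terms)]


def sort_by_fold_change(rows, descending=True):
    """Sort rows by their fold change value."""
    return sorted(rows, key=lambda row: row["fold_change"], reverse=descending)


# Made-up rows for illustration only; column names are hypothetical.
rows = [
    {"gene": "EGFR", "cancer_type": "colorectal carcinoma",
     "sample": "blood plasma", "fold_change": 2.1},
    {"gene": "EGFR", "cancer_type": "breast cancer",
     "sample": "cell line", "fold_change": 1.4},
    {"gene": "GENE_X", "cancer_type": "colorectal carcinoma",
     "sample": "blood plasma", "fold_change": 3.0},
]

# Example 1: gene symbol plus "blood plasma".
example_1 = filter_rows(rows, "EGFR", "blood plasma")

# Example 2: one cancer type, sorted by fold change (largest first).
example_2 = sort_by_fold_change(filter_rows(rows, "colorectal carcinoma"))
```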

Both tables provide scores for signal peptide prediction and non-classical secretion. The reference column provides, for each record, a hyperlink to the publication from which the data was obtained.


Advanced search

Users can get more details on secretome measurements, annotation, cross references, and primary and secondary structure predictions for the protein of interest. To do an advanced search, you need to:

1- Enter the gene name of interest. For example, if the gene of interest is EGFR, enter EGFR in the gene name box (the first query field). The integrated autocomplete feature helps you find the name in case of uncertainty.
2- Choose all cancer types or a specific cancer type.
3- Choose label-free, label-based, or both as the data type to explore.
4- For the case study data type, it is possible to query the secretome data for all cancer types.
5- Once all options are selected, click the submit button to go to the result page.
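
The steps above can be sketched as a small helper that collects and checks the search options. The function names, field names, and prefix-based autocomplete are hypothetical illustrations and do not describe the actual HCSD form or its server-side code.

```python
VALID_DATA_TYPES = {"label-free", "label-based", "both"}


def suggest_gene_names(prefix, known_names, limit=10):
    """Return known gene names that start with the typed prefix.

    Assumption: autocomplete is a case-insensitive prefix match.
    """
    prefix = prefix.strip().upper()
    hits = sorted(name for name in known_names if name.upper().startswith(prefix))
    return hits[:limit]


def build_advanced_query(gene_name, cancer_type="all", data_type="both"):
    """Collect the advanced search options into a dictionary.

    The keys are hypothetical and are not the real form field names.
    """
    gene_name = gene_name.strip()
    if not gene_name:
        raise ValueError("a gene name is required")
    if data_type not in VALID_DATA_TYPES:
        raise ValueError(f"unknown data type: {data_type!r}")
    return {
        "gene_name": gene_name,
        "cancer_type": cancer_type,
        "data_type": data_type,
    }


suggestions = suggest_gene_names("eg", ["EGFR", "EGF", "ERBB2"])
query = build_advanced_query(" EGFR ", data_type="label-free")
```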

DATA SETS

This menu summarizes the publications used as data resources, with brief information on proteomics techniques, sample source, etc.

If you are interested in more details on the methodology of a publication, click the hyperlink in the study column to go to the study page, which provides more details on the workflow, experimental design, bioinformatics analysis, and candidate biomarkers.

The result pages

annotation

Provides annotation such as gene name, description, chromosomal location, and cross-reference IDs to Ensembl, Entrez, and UniProt.

label-free studies

For a label-free search, the results across all cancer types are visualized as a table with the cancer type icons in the header. Each row of this table starts with a clickable PubMed ID that directs the user to the publication from which the data was extracted. Under each cancer type column, the protein of interest is marked as detected (green spot), not detected (red spot), or not studied (grey spot). The last column shows the proteomics method used in the study of that row.
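
The three-state spot logic of the label-free table can be sketched as follows. The data structures and names are assumptions made for illustration; only the three states and their colors come from the description above.

```python
from enum import Enum


class Detection(Enum):
    """The three states of a table cell and their spot colors."""
    DETECTED = "green"
    NOT_DETECTED = "red"
    NOT_STUDIED = "grey"


def detection_status(studied_cancer_types, detected_in, cancer_type):
    """Classify one cell of the label-free result table."""
    if cancer_type not in studied_cancer_types:
        return Detection.NOT_STUDIED
    if cancer_type in detected_in:
        return Detection.DETECTED
    return Detection.NOT_DETECTED


def result_row(pubmed_id, method, studied_cancer_types, detected_in, columns):
    """Build one row: PubMed ID, one spot color per cancer type, then the method."""
    spots = [
        detection_status(studied_cancer_types, detected_in, column).value
        for column in columns
    ]
    return [pubmed_id, *spots, method]


# Made-up study for illustration only (placeholder PubMed ID).
columns = ["breast", "colorectal", "lung"]
row = result_row(
    "00000000",
    "spectral counting",
    studied_cancer_types={"breast", "colorectal"},
    detected_in={"breast"},
    columns=columns,
)
```

Treating "not studied" as its own state keeps a study that never examined a cancer type from being read as a negative result.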


Figure: label-free result table

label-based studies

For a label-based search, the protein of interest and the cancer type of each study are given as the title of a table. The table contains information about the stage of the cancer, the fold change, the experimental conditions under which the measurement was obtained, and statistical test data such as confidence and ANOVA scores where the publication provides them.


Figure: label-based result table

secretory pathway features

For both label-free and label-based queries, the second part of the results is devoted to the prediction scores of the secretory features and the visualization of PTMs and secondary structure information.
The secretory features include scores from SignalP (for signal peptides), TMHMM (for transmembrane domains), SecretomeP (for non-classical secretion), and HPPP (for human plasma membrane proteins). The last row of the table shows the subcellular localization data. The PTMs are color-coded, and hovering over a residue shows the position of the PTM. The color-code legend for PTMs and secondary structure information appears below the table.
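
The hover behavior for PTMs can be pictured with a minimal sketch that pairs each residue with its position and PTM. The sequence, the PTM entry, and the tooltip format are made up for illustration; the actual page renders this with D3 and JavaScript, and its data format is not described here.

```python
def annotate_residues(sequence, ptms):
    """Pair each residue with its 1-based position and its PTM, if any.

    `ptms` maps 1-based positions to PTM names (an assumed data format).
    """
    for position in ptms:
        if not 1 <= position <= len(sequence):
            raise ValueError(f"PTM position {position} is outside the sequence")
    return [
        (position, residue, ptms.get(position))
        for position, residue in enumerate(sequence, start=1)
    ]


def tooltip(position, residue, ptm):
    """Text shown when hovering over a residue (hypothetical format)."""
    label = f"{residue}{position}"
    return f"{label}: {ptm}" if ptm else label


# Made-up peptide and PTM, for illustration only.
annotated = annotate_residues("MKSAT", {3: "phosphorylation"})
hover_texts = [tooltip(*entry) for entry in annotated]
```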


Figure: secretory features