Welcome to GlioVis: a user-friendly web application for visualizing and analyzing brain tumor expression datasets.

How does it work?

GlioVis is very easy to use:

  1. Select the "Explore" tab
  2. Choose a dataset
  3. Enter a Gene Symbol
  4. Select one of the available plots (through the dropdown menu or a specific tab)

What type of plots are available?

Each dataset comes with a list of common "pre-defined" plots (see the table below) and a list of dataset-specific "user-defined" box plots. An overview of all the box plots for a given dataset can be found at "Explore/Summary/Plots".

Available datasets:

* multiple datasets combined; ** multiple samples from different tumor features; *** more than one sample for each patient
* includes adult samples

Which gene ID can I use?

Currently only HGNC-approved "Gene Symbols" are supported.

Can I download the plots?

Yes, all the plots can be downloaded in various formats: .pdf, .bmp, .jpeg, .tiff or .jpg.

Can I download the data?

Yes, and doing so is highly recommended for reproducibility. For each plot there is a "Data" tab containing the actual data used to generate the plot. The data can also be downloaded at "Explore/Summary/Data".

What other tools are available?

SubtypeME: Classify tumor samples based on mRNA expression profiles.

EstimateME: Estimate of STromal and Immune cells in MAlignant Tumor samples (Yoshihara K. et al., 2013).

DeconvoluteME: Deconvolute gene expression profiles from heterogeneous tissue samples into cell-type-specific subprofiles.

Can I use GlioVis results for my publication?

Of course! If you do so, please include references for the dataset(s) you used and cite: "GlioVis data portal for visualization and analysis of brain tumor expression datasets" (Bowman R. et al., Neuro-Oncology 2017).

Please adhere to the TCGA publication guidelines when using TCGA data in your publications.

I have more questions, how can I reach you?

You can contact us by email or through the GlioVis user discussion group.


No great discovery was ever made without a bold guess.

Isaac Newton








mRNA expression (log2). Blue lines mark the quartiles (25th, 50th and 75th percentiles). The red line marks the current selection.

Statistic:

Plot options:

Plot size

Points appearance

Axis labels

Miscellaneous




Oncoprint options:

Filter data:

Download:

Set download image dimensions (units are inches for PDF, pixels for all other formats)


Summary statistics


Tukey's Honest Significant Difference (HSD)

The table shows the difference between pairs, the 95% confidence interval and the p-value of the pairwise comparisons. ***p<0.001; **p<0.01; *p<0.05; ns, not significant.

Pairwise t tests

Pairwise comparisons between group levels with corrections for multiple testing (p-values with Bonferroni correction).
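The Bonferroni correction mentioned above can be illustrated with a minimal sketch. This is not the GlioVis code (which runs in R); the function names are hypothetical, and the star labels simply follow the legend given for the Tukey table.

```python
def bonferroni(p_values):
    """Multiply each raw p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def significance_label(p):
    """Map a p-value to the star notation: ***p<0.001; **p<0.01; *p<0.05; ns."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"


# Three hypothetical raw p-values from three pairwise comparisons.
raw = [0.01, 0.04, 0.5]
adjusted = bonferroni(raw)
labels = [significance_label(p) for p in adjusted]
```

Note that a raw p-value of 0.04 is no longer significant at the 0.05 level once it is adjusted for three comparisons.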

Kaplan-Meier estimator survival analysis

Hazard ratio

Note: this is an interactive plot. Click on a specific mRNA expression value to update the survival plot. The blue line represents the current selection.

Determine optimal cutoff for Kaplan-Meier survival analysis


Test for association/Correlation between paired samples

(Note: Correlation cannot be computed for groups with fewer than 4 samples)
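As an illustration of the sample-size rule above, here is a minimal sketch of a Pearson correlation that declines to report a value for small groups. The function name and the choice to return None are assumptions made for this example; GlioVis itself computes correlations in R.

```python
import math

MIN_SAMPLES = 4  # threshold stated in the note above


def pearson_r(x, y):
    """Pearson correlation of two paired vectors, or None if it cannot be computed."""
    if len(x) != len(y):
        raise ValueError("x and y must be paired (same length)")
    n = len(x)
    if n < MIN_SAMPLES:
        return None  # too few samples in this group
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # a constant vector has no defined correlation
    return sxy / math.sqrt(sxx * syy)


too_small = pearson_r([1, 2, 3], [1, 2, 3])        # None: only 3 samples
perfect = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])    # close to 1
inverse = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])    # close to -1
```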

Correlate expression of a gene with all the genes in the dataset

Reverse phase protein array (RPPA) data for TCGA datasets

Mutation data for TCGA datasets





Gene ontology enrichment analysis

Note: you will have to re-run the analysis every time one of the input parameters is changed.


KEGG enrichment analysis

Note: you will have to re-run the analysis every time one of the input parameters is changed.




Upload a .csv file with samples in rows and gene expression values in columns. The first column should contain the sample ID and should be named 'Sample'.
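A minimal sketch of a file in this layout, with a check you could run before uploading. The sample IDs, gene columns and values are made up for illustration, and the validation function is hypothetical; it is not part of GlioVis.

```python
import csv
import io

# Hypothetical file content: samples in rows, genes in columns,
# first column named 'Sample'.
EXAMPLE_CSV = """Sample,EGFR,TP53
S1,10.2,8.1
S2,7.5,9.3
"""


def read_expression_csv(text):
    """Parse the CSV text and return {sample_id: {gene: value}}."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    if not header or header[0] != "Sample":
        raise ValueError("the first column must be named 'Sample'")
    genes = header[1:]
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            raise ValueError(f"row for {row[0]!r} has the wrong number of columns")
        table[row[0]] = {gene: float(value) for gene, value in zip(genes, row[1:])}
    return table


expression = read_expression_csv(EXAMPLE_CSV)
```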

Resampling permutation procedure is used to calculate an empirical p-value for each enrichment score
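The idea of an empirical p-value from permutations can be sketched as follows. This is a generic illustration, not the procedure GlioVis runs: the statistic (a difference in means), the "+1" convention in the p-value, the number of permutations and all names are assumptions.

```python
import random


def empirical_p_value(observed, null_scores):
    """Fraction of permuted scores at least as large as the observed one.

    The +1 in numerator and denominator is a common convention that
    avoids reporting a p-value of exactly zero.
    """
    exceed = sum(1 for score in null_scores if score >= observed)
    return (exceed + 1) / (len(null_scores) + 1)


def mean_difference(values, in_set):
    """Toy score: mean of the flagged values minus mean of the others."""
    inside = [v for v, flag in zip(values, in_set) if flag]
    outside = [v for v, flag in zip(values, in_set) if not flag]
    return sum(inside) / len(inside) - sum(outside) / len(outside)


def permutation_null(values, in_set, n_perm=1000, seed=0):
    """Recompute the score after shuffling the set membership n_perm times."""
    rng = random.Random(seed)
    flags = list(in_set)
    null = []
    for _ in range(n_perm):
        rng.shuffle(flags)
        null.append(mean_difference(values, flags))
    return null


values = [5.1, 4.8, 5.3, 1.0, 1.2, 0.9, 1.1, 1.3]
in_set = [True, True, True, False, False, False, False, False]
observed = mean_difference(values, in_set)
p_value = empirical_p_value(observed, permutation_null(values, in_set))
```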


Upload a .csv file with samples in rows and pData (phenotype data) in columns. The first column should contain the sample ID and should be named 'Sample'.






Classify tumor samples based on mRNA expression profiles

Support vector machine learning

Please be patient: switching to another tab will crash GlioVis.

K-nearest neighbors prediction

Please be patient: switching to another tab will crash GlioVis.

Single sample Gene Set Enrichment Analysis

Please be patient: switching to another tab will crash GlioVis.

Generate and compare subtype calls by SVM, K-NN and ssGSEA

Please be patient: switching to another tab will crash GlioVis.

Estimate of STromal and Immune cells in MAlignant Tumor tissues


Deconvolute gene expression profiles into cell-type-specific subprofiles



Data Processing

mRNA analysis

  • Microarray. For Affymetrix expression arrays (HG-U133_Plus_2, HG-U133A, HG_U95Av2 and HuGene-1_0-st), the raw .CEL files (when available) were downloaded from the respective sources and the analyses were completed in R using the Bioconductor suite. The ‘affy’ package was used for robust multi-array average normalization followed by quantile normalization. For genes with several probe sets, the median of all probe sets was used. When the raw data were not available, and for other expression array platforms (Illumina beadchip, Agilent-4502A), the available pre-processed expression matrices were used.
  • RNA-seq. The normalized read counts from the pre-processed data (sequence alignment and transcript abundance estimation) were log2 transformed after adding a 0.5 pseudocount (to avoid infinite values upon log transformation).
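Two of the steps above, collapsing several probe sets to one value per gene by the median and the log2 transformation with a 0.5 pseudocount, can be sketched in a few lines. The actual processing was done in R; the probe IDs, values and function names below are hypothetical, and the sketch handles a single sample.

```python
import math
from statistics import median

PSEUDOCOUNT = 0.5  # value stated in the RNA-seq description


def collapse_probe_sets(probe_values, probe_to_gene):
    """For one sample, take the median over all probe sets mapping to each gene."""
    by_gene = {}
    for probe, value in probe_values.items():
        by_gene.setdefault(probe_to_gene[probe], []).append(value)
    return {gene: median(values) for gene, values in by_gene.items()}


def log2_counts(counts):
    """log2(count + 0.5), so a count of zero maps to -1 instead of -infinity."""
    return [math.log2(c + PSEUDOCOUNT) for c in counts]


# Hypothetical probe-set values for one sample.
probe_values = {"p1": 7.0, "p2": 9.0, "p3": 8.0, "p4": 5.0}
probe_to_gene = {"p1": "EGFR", "p2": "EGFR", "p3": "EGFR", "p4": "TP53"}
gene_values = collapse_probe_sets(probe_values, probe_to_gene)

transformed = log2_counts([0, 1.5, 7.5])
```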

Copy number

Survival

  • Kaplan–Meier survival analyses are performed using the ‘survival’ package in R. Hazard ratios are determined using the coxph function from the ‘survival’ package. The HR plot partially reuses code previously described in Cutoff Finder.
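For readers unfamiliar with the Kaplan–Meier estimator, the following minimal sketch shows how the survival curve is built from follow-up times and event indicators. It is a textbook illustration, not the ‘survival’ package code, and the function name and toy data are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events: 1 if the event (death) was observed, 0 if the sample was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all samples sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored samples leave the risk set.
        at_risk -= removed
    return curve


# Five hypothetical patients: follow-up in months, 1 = death, 0 = censored.
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
```

Censored samples never lower the curve directly; they only shrink the number at risk for later event times.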

Mutations

  • Mutation data (identified with the MutSig2CV algorithm) are dynamically downloaded from the cBio Portal using the ‘cgdsr’ package.

Classification

CIMP status
CIMP status has been determined by Support Vector Machine (SVM), using TCGA GBM as training dataset.

Subtypes
- GBM
For the GBM subtype classification we use the 3 TCGA expression subtypes (Classical, Mesenchymal, Proneural) as defined in Wang Q. et al. 2017.

- LGG
For the LGG subtype classification we use the 3 TCGA molecular subtypes (IDH mutant with 1p/19q codeletion, IDH mutant without 1p/19q codeletion, IDH wild-type) as defined in NEJM, 2015. Important note: the GlioVis LGG classification identifies “molecular subtypes” using expression data. This approach must be considered very preliminary until further validation.

Classification algorithms:

  • The Support Vector Machine (SVM) subtype calls are generated with a linear kernel and 10-fold cross-validation, using the ksvm function of the ‘kernlab’ package. The training dataset for GBM is the set of TCGA GBM samples described in Wang Q. et al. 2017. The training dataset for LGG is the set of TCGA LGG samples described in NEJM.
  • The K-nearest neighbor (K-NN) subtype calls are obtained using the knn3Train function of the ‘caret’ package, using the same subtypes as in the SVM calls.
  • The single sample Gene Set Enrichment Analysis (ssGSEA) subtype calls are generated using the runSsGSEAwithPermutation function of the ‘ssgsea.GBM.classification’ package. This method is not yet available for LGG samples.
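To give an intuition for the K-NN calls, here is a minimal sketch of nearest-neighbor voting on expression profiles. It is not the ‘caret’ implementation: the Euclidean distance, the value k=3, the two-gene toy profiles and the function name are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_profiles, train_labels, query, k=3):
    """Label a query profile by majority vote among its k nearest training samples."""
    distances = sorted(
        (math.dist(profile, query), label)
        for profile, label in zip(train_profiles, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training set: two-gene profiles with known subtypes.
train_profiles = [
    [1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
    [5.0, 5.0], [5.1, 4.9], [4.8, 5.2],
]
train_labels = ["Classical"] * 3 + ["Mesenchymal"] * 3

call_a = knn_predict(train_profiles, train_labels, [1.1, 1.0])
call_b = knn_predict(train_profiles, train_labels, [5.0, 5.1])
```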

Funding

App info

Version 0.20 (18/03/2016)

By: Massimo Squatrito

Built using Shiny by RStudio

Code available on GitHub

Session info

License

GlioVis is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License, version 3, as published by the Free Software Foundation.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

© Massimo Squatrito (2015-2020) Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

15-01-2020

  • New datasets

  • New features

    • Select gender for survival analysis.
    • Select IDH status for survival analysis. For the adult datasets with IDH status information (“CGGA”, “TCGA_GBM”, “TCGA_GBMLGG”, “Gorovets”, “Gravendeel”, “POLA_Network”, “Kamoun”) you can now use a drop-down list to select the specific group of interest. For the other adult datasets, you can still use the inferred G-CIMP status as an alternative.

26-09-2017

10-2-2017

  • Updates

DeconvoluteMe:

16-12-2016

  • New features
    • Select optimal cutoff point for Kaplan-Meier survival curves.
    • Interactive volcano plot for differential expression analysis.

09-11-2016

  • GlioVis paper has been published! Bowman R. et al. 2017

  • New features

    • Improved downloading options: it's now possible to specify the resolution (ppi) of the plots to be downloaded.

21-09-2016

  • New datasets
    • 2 new datasets for pediatric gliomas: Kool_2014 and Robinson.

24-06-2016

  • Updates

    • For GBM subtyping we have now switched to the latest classification from the Verhaak lab with 3 classes (Classical, Mesenchymal and Proneural). Please check Wang Q. et al. 2017 for details. The 4-class subtyping is still available as “Subtype_Verhaak_2010”.
    • Agilent-4502A and RNA-seq data are now available for the TCGA GBM dataset.
    • Mutations are now available also for the TCGA Pan-Glioma dataset.
  • New datasets

    • We have included 3 new datasets for adult gliomas and 10 for pediatric brain tumors.

18-03-2016

  • New datasets
    • We have included 12 new datasets for pediatric gliomas.

02-02-2016

  • Updates
    • TCGA GBMLGG (Pan-Glioma) subtyping and clustering have been updated according to the recent publication in Cell (Ceccarelli et al. 2016).

15-01-2016

  • New features

    • Gene ontology and KEGG enrichment analysis for the results of differential gene expression analysis.
  • Updates

    • DeconvoluteMe: user can now specify any gene sets in MSigDB.

27-11-2015

  • New features

    • Differential gene expression analysis for Microarray datasets.
    • Oncoprints for the TCGA mutation data.
    • Dataset download tab: it's now possible to download a full dataset as a .txt file.
  • Updates

    • TCGA datasets have been updated to the 21-08-2015 data freeze.
    • RPPA data are now available also for the TCGA GBMLGG dataset.
    • New gene list, the LM22 (Newman AM. et al. 2015), for deconvolution analysis.
    • Mutation data for TCGA studies are now downloaded from the cBio Portal.
    • In survival and correlation analysis it's now possible to specify subtypes, when available, for all histologies and not only for GBMs.

30-09-2015

  • Reformatted box plot options.

14-08-2015

  • User interface update.
  • 4 new datasets included: TCGA GBMLGG, Lee Y, POLA Network, Gleize.
  • New tool, DeconvoluteMe: Deconvolute gene expression profiles from heterogeneous tissue samples into cell-type-specific subprofiles.

07-05-2015

GlioVis 0.1 is live!!