The Pancreatic Analytics Hub hosts four core data sources: The Cancer Genome Atlas (TCGA), the International Cancer Genome Consortium (ICGC), the Genomics Evidence Neoplasia Information Exchange (GENIE) and the Cancer Cell Line Encyclopedia (CCLE).

TCGA: The Cancer Genome Atlas is a consortium dedicated to the systematic study of alterations in a variety of human cancers. It has made mRNA expression, mutation and methylation data from analysed cohorts publicly available, alongside associated clinical data. Currently, mRNA expression and mutation data from sequenced patients with pancreatic adenocarcinoma are available for analysis through the Analytics Hub, alongside associated clinical data.

ICGC: The International Cancer Genome Consortium is focussed on the generation of comprehensive catalogues of genomic abnormalities (somatic mutations, expression of genes, epigenetic modifications) in tumours from 50 different cancer types. It has made mRNA expression, DNA copy number, mutation and methylation data from analysed cohorts publicly available, alongside associated clinical data. Currently, mRNA expression and mutation data from sequenced patients with pancreatic adenocarcinoma or pancreatic endocrine neoplasms are available through the Analytics Hub.

GENIE: Genomics Evidence Neoplasia Information Exchange is a pilot project that seeks to identify and validate genomic biomarkers relevant to cancer treatment by linking tumour genomic data from clinical sequencing efforts with longitudinal clinical outcomes. It has made mutation data publicly available, alongside associated clinical data for a range of cancer types/subtypes. Mutation data from individuals with pancreatic cancer are available for analysis from the Analytics Hub.

CCLE: The Cancer Cell Line Encyclopedia project is an effort to conduct a detailed genetic characterisation of a large panel of human cancer cell lines. mRNA expression and mutation data for pancreatic cancer cell lines are available through the Analytics Hub.

Table 1. Features of the Analytics Hub for Publicly Available Data Sources
Results Tab        Analytical features             TCGA   ICGC   GENIE   CCLE
Genomics           Genomic Summary
Genomics           Somatic Interactions
Genomics           OncoPlot
Genomics           Lolliplot
Genomics           Protein-Protein Networks
Transcriptomics    Principal Component Analysis
Transcriptomics    Expression Profiles
Transcriptomics    Correlation
Transcriptomics    Survival
Transcriptomics    Protein-Protein Networks

1.1 Summary. Genomics data from publicly available sequencing cohorts can be analysed using the integrated Bioconductor package MAFtools, which facilitates the analysis of somatic variants, including single-nucleotide variants (SNVs) and small insertions/deletions (indels), based on variant characteristics, gene interactions and protein changes.

1.2 Somatic Interactions. From this tab, a MAFtools summary plot can be viewed for each cohort, displaying the range of variant classifications, variant types and base substitution profiles as bar plots and/or box plots. The number of variants in each sample can also be viewed as a stacked bar plot, alongside a summary of the top 10 mutated genes for each cohort.
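The counts behind such a summary can be sketched in a few lines. This is only an illustration in plain Python, not the MAFtools implementation; the records, sample IDs and the `summarise` helper are hypothetical, and genes are counted once per sample, which is an assumption about how "top mutated" is defined.

```python
from collections import Counter

# Hypothetical MAF-like records: (sample, gene, variant classification).
variants = [
    ("S1", "KRAS", "Missense_Mutation"),
    ("S1", "TP53", "Nonsense_Mutation"),
    ("S2", "KRAS", "Missense_Mutation"),
    ("S2", "SMAD4", "Frame_Shift_Del"),
    ("S3", "KRAS", "Missense_Mutation"),
    ("S3", "TP53", "Missense_Mutation"),
    ("S3", "CDKN2A", "Frame_Shift_Del"),
]

def summarise(records, top_n=10):
    """Return variants per sample, variants per classification, and top mutated genes."""
    per_sample = Counter(sample for sample, _, _ in records)
    per_class = Counter(cls for _, _, cls in records)
    # Assumption: a gene is counted once per sample, however many variants it carries.
    sample_gene_pairs = {(sample, gene) for sample, gene, _ in records}
    per_gene = Counter(gene for _, gene in sample_gene_pairs)
    return per_sample, per_class, per_gene.most_common(top_n)

per_sample, per_class, top_genes = summarise(variants)
```

`per_sample` corresponds to the stacked bar plot of variants per sample, `per_class` to the variant classification bars, and `top_genes` to the list of most frequently mutated genes.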

1.3 OncoPlot. Mutually exclusive or co-occurring pairs among the top 25 mutated genes can be identified using a pairwise Fisher’s exact test, and the significant gene pairs are visualised as a correlation matrix.
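As a rough sketch of the pairwise test described above, the following computes a two-sided Fisher's exact p-value for every gene pair from a 2x2 table of mutated versus non-mutated samples. It is not the MAFtools code; the gene names, sample IDs and function names are hypothetical, and labelling a pair by whether its odds ratio is above or below 1 is an assumption made for illustration.

```python
from itertools import combinations
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x samples carrying both mutations.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    observed = prob(a)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))
    return min(1.0, total)

def pairwise_interactions(mutated, all_samples):
    """mutated maps gene -> set of mutated samples; returns {(g1, g2): (kind, p)}."""
    results = {}
    for g1, g2 in combinations(sorted(mutated), 2):
        s1, s2 = mutated[g1], mutated[g2]
        a = len(s1 & s2)                    # both genes mutated
        b = len(s1 - s2)                    # only g1 mutated
        c = len(s2 - s1)                    # only g2 mutated
        d = len(all_samples) - a - b - c    # neither mutated
        kind = "co-occurrence" if a * d > b * c else "mutual exclusivity"
        results[(g1, g2)] = (kind, fisher_exact_two_sided(a, b, c, d))
    return results

# Hypothetical cohort of 10 samples.
samples = {f"S{i}" for i in range(10)}
mutated = {
    "GENE_A": {"S0", "S1", "S2", "S3", "S4"},
    "GENE_B": {"S0", "S1", "S2", "S3", "S4"},   # always mutated together with GENE_A
    "GENE_C": {"S5", "S6", "S7", "S8", "S9"},   # never mutated together with GENE_A
}
interactions = pairwise_interactions(mutated, samples)
```

In practice the p-values for all pairs would typically be adjusted for multiple testing before the significant pairs are plotted.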

1.4 Lolliplot. Users can also choose to view amino acid changes within each of the top 50 mutated genes in each cohort as a lollipop plot. The plots display the observed mutation distribution and the protein domains, which are labelled for each selected gene. A summary of the observed somatic mutation rate for each selected gene is also provided alongside each plot.

1.5 Coming soon! Protein-Protein and Drug-Target Interactions. The characterisation of drug-target interaction networks can provide an important tool to identify potential targets amenable to treatment with existing drugs. The networks available are based on protein-protein and protein-drug interactions. Variants within candidate genes of interest can be queried against the DrugBank database for the analysis of potential genotype-driven therapeutic targets.

2.1 Principal Component Analysis. Principal component analysis (PCA) reduces the dimensionality of data while retaining most of the variation in the dataset, making it possible to visually assess similarities and differences between different samples and determine whether groupings can be identified between individual samples. This exploratory analysis facilitates identification of the key factors affecting the variability in the mRNA expression data.

For each dataset, scatterplots representing the first two and the first three principal components (PCs) of the data are presented. Each data point represents the position of a single sample in the transcriptomic space projected onto the PCs, with different colours indicating the biological group to which each sample belongs. The percentage values in brackets on each axis indicate the amount of variance in the data explained by the corresponding PC.

The global variability of the data can also be assessed from the scree plot, which shows the fraction of total variance (y-axis) attributed to each PC (x-axis). The PCs are ordered by decreasing contribution to total variance.
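The quantities shown in the scatterplots and the scree plot can be sketched as follows. This is a minimal PCA via singular value decomposition with NumPy, assuming a samples-by-genes expression matrix; the simulated data and the `pca` helper are hypothetical, and the Analytics Hub may apply different preprocessing such as scaling or gene filtering.

```python
import numpy as np

def pca(expression, n_components=3):
    """PCA of a samples x genes matrix; returns (scores, explained variance ratio)."""
    X = np.asarray(expression, dtype=float)
    centred = X - X.mean(axis=0)                 # centre each gene across samples
    U, S, _ = np.linalg.svd(centred, full_matrices=False)
    variance = S ** 2 / (X.shape[0] - 1)         # variance captured by each PC
    ratio = variance / variance.sum()            # fraction of total variance (scree plot)
    scores = U * S                               # sample coordinates on the PCs
    return scores[:, :n_components], ratio

# Simulated data: 6 samples x 20 genes, with the last 3 samples shifted as a second group.
rng = np.random.default_rng(0)
data = rng.normal(size=(6, 20))
data[3:] += 5.0

scores, ratio = pca(data)
axis_labels = [f"PC{i + 1} ({r:.1%})" for i, r in enumerate(ratio[:3])]
```

Plotting `scores[:, 0]` against `scores[:, 1]` gives the two-component scatterplot, `axis_labels` holds the bracketed percentages for the axes, and `ratio` plotted against the PC index gives the scree plot.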

2.2 Expression Profiles. The distribution of mRNA expression measurements can be visualised across all samples for a user-defined gene (from the top 50 aberrantly expressed genes).

2.3 Correlation. Pairwise comparisons of expression profiles can be performed between multiple user-defined genes in each selected dataset, with Pearson's correlation coefficients and p-values calculated for each comparison.

For a queried set of genes (minimum of three), the Analytics Hub computes Pearson's correlation coefficients and corresponding p-values for all pairwise combinations of genes and displays the correlation coefficients as a pairwise comparison heatmap. The colour of each cell indicates the correlation coefficient between the corresponding genes labelled on the x-axis and y-axis. The heatmap colour key is displayed on the right side of the plot, with red and blue indicating high and low correlation values, respectively.
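The matrix behind such a heatmap can be sketched in plain Python. The gene names and expression values below are hypothetical, and the sketch computes only the correlation coefficients; p-values are omitted here (a routine such as `scipy.stats.pearsonr` returns both, if SciPy is available).

```python
from math import sqrt

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def correlation_matrix(profiles):
    """profiles maps gene -> expression values across the same samples."""
    genes = list(profiles)
    if len(genes) < 3:
        raise ValueError("at least three genes are required")
    return {g1: {g2: pearson(profiles[g1], profiles[g2]) for g2 in genes} for g1 in genes}

# Hypothetical expression profiles across five samples.
profiles = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0, 10.0],   # rises with GENE_A
    "GENE_C": [5.0, 4.0, 3.0, 2.0, 1.0],    # falls as GENE_A rises
}
matrix = correlation_matrix(profiles)
```

Each entry `matrix[g1][g2]` is the value that would colour one heatmap cell, from -1 (blue) to 1 (red).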

2.4 Survival Analysis. From this tab, the relationship between the expression of genes of interest and survival can be assessed. A univariate Cox proportional hazards (PH) regression is applied to the survival data, and the samples are assigned to risk groups based on the median dichotomisation of mRNA expression intensities of the selected gene. Relationships are presented as Kaplan-Meier plots. The hazard ratio (HR) and 95% confidence interval (CI) from the Cox PH model and the associated log-rank p-value are presented in the top right corner of the figure.
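Two of the steps above, median dichotomisation and the Kaplan-Meier estimate, can be sketched in plain Python. The patient IDs, values and function names are hypothetical; assigning samples exactly at the median to the "low" group is an assumption; and the Cox PH model and log-rank test are not implemented here.

```python
from statistics import median

def dichotomise(expression):
    """Split samples into 'high' and 'low' groups at the median expression value."""
    cutoff = median(expression.values())
    # Assumption: samples exactly at the median are placed in the 'low' group.
    return {s: ("high" if v > cutoff else "low") for s, v in expression.items()}

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each event time (1 = death, 0 = censored)."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)   # deaths plus censored at time t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical expression values for one gene and follow-up data for one risk group.
groups = dichotomise({"P1": 2.0, "P2": 5.0, "P3": 7.0, "P4": 9.0})
curve = kaplan_meier(times=[5, 8, 12, 12, 20], events=[1, 0, 1, 1, 0])
```

Running `kaplan_meier` separately on the "high" and "low" groups gives the two step curves of a Kaplan-Meier plot; comparing them formally requires the log-rank test or the Cox model mentioned above.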

2.5 Coming soon! Protein-Protein and Drug-Target Interactions. The characterisation of drug-target interaction networks can provide an important tool to identify potential targets amenable to treatment with existing drugs. The networks available are based on protein-protein and protein-drug interactions. Variants within candidate genes of interest can be queried against the DrugBank database for the analysis of potential genotype-driven therapeutic targets.

To apply for PCRFTB samples, you need to sign up for the Tissue Request System and submit an Expression of Interest. Once the Expression of Interest has been approved by the Tissue Bank Coordinator, you can apply for samples by completing an Application.

To learn more about the application workflow, please visit the PCRFTB Tissue Bank.

For more information on PCRFTB samples, please contact the Tissue Bank Coordinator by email.