Alternative splicing check

From Array Suite Wiki

Check data in ArrayLand

When checking for alternative splicing, we look for a gene whose overall expression is similar in tumor and normal samples but whose transcripts show different, or even opposite, expression patterns. A good example is the FGF12 gene in prostate cancer:

Alternative splicing check1.png

The blue bars represent expression in tumor samples and the green bars represent expression in normal samples. This is a clear example of alternative splicing: the gene has only two transcripts, and they show different expression patterns while the overall gene expression stays similar. Note that if we do not group the data by sample type (i.e. tumor vs. normal) and simply show the expression level of the two transcripts in prostate cancer, we get:

Alternative splicing check2.png

At first glance, transcript uc003fsy.3 (the top one) appears to be expressed at a much higher level than transcript uc003fsx.3 (the bottom one), which seems inconsistent with the first figure. The reason is that the transcript expression level is the mean over the specified sample set (here, all samples in prostate cancer). Because this sample set contains many more tumor samples than normal samples, the overall transcript expression level resembles the blue bars in the first figure. (In other words, if there were as many normal samples as tumor samples, the expression levels of the two transcripts would be similar in the second figure.)
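The effect can be written as a sample-size-weighted average. The formula below is a sketch of that reasoning, not something taken from ArrayLand itself; the symbols are assumptions introduced here: n_T and n_N are the numbers of tumor and normal samples, and \bar{x}_T and \bar{x}_N are the mean expression values of one transcript in each group.

```latex
\bar{x}_{\text{all}} = \frac{n_T\,\bar{x}_T + n_N\,\bar{x}_N}{n_T + n_N}
```

When n_T is much larger than n_N, \bar{x}_{\text{all}} stays close to \bar{x}_T. When n_T = n_N, it reduces to (\bar{x}_T + \bar{x}_N)/2, so two transcripts with opposite patterns end up with similar pooled values.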

Check data in Array Studio

You can always double-check the values shown in ArrayLand by downloading the data and calculating the values yourself.

Download data from ArrayLand to Array Studio

To download the data for local analysis, go to:

Alternative splicing check3.png

Then download the expression data for the gene FGF12:

Alternative splicing check4.png

Analyze data in Array Studio

Since we only want to check the data in prostate cancer, you can either first generate a sample set containing only prostate cancer samples and then download, or first download all the data and do the filtering in Array Studio. Here I will do the latter.

Alternative splicing check5.png

Then generate a list containing the sample IDs in prostate cancer.

Alternative splicing check6.png

Then we can calculate the transcript expression under different grouping conditions:

1. We want a 2x2 table with the values shown in the first figure:

We can do this calculation by:

Alternative splicing check7.png

Alternative splicing check8.png

You will get a result like:

Alternative splicing check9.png

Similarly, we can generate a 2x1 table with the values in the second figure by grouping by Tumor type:

Alternative splicing check10.png

And we will get the result:

Alternative splicing check11.png
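For readers who prefer to reproduce the two tables outside Array Studio, the same grouping can be sketched in plain Python. This is only an illustration of the calculation, not the Array Studio procedure: the row layout, the helper `group_means`, and all expression values are hypothetical and are not the FGF12 measurements.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format rows: (sample_id, sample_type, transcript, expression).
# The values are invented for illustration only.
rows = [
    ("S1", "Tumor", "uc003fsx.3", 0.25),
    ("S1", "Tumor", "uc003fsy.3", 1.0),
    ("S2", "Tumor", "uc003fsx.3", 0.125),
    ("S2", "Tumor", "uc003fsy.3", 1.25),
    ("S3", "Tumor", "uc003fsx.3", 0.375),
    ("S3", "Tumor", "uc003fsy.3", 0.75),
    ("S4", "Normal", "uc003fsx.3", 1.0),
    ("S4", "Normal", "uc003fsy.3", 0.25),
]


def group_means(rows, key):
    """Return the mean expression for each group defined by key(row)."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[key(row)].append(row[3])
    return {group: mean(values) for group, values in buckets.items()}


# 2x2 table: one mean per (transcript, sample type) pair.
by_type_and_transcript = group_means(rows, key=lambda r: (r[2], r[1]))

# 2x1 table: one mean per transcript over all samples pooled together.
by_transcript = group_means(rows, key=lambda r: r[2])

for group, value in sorted(by_type_and_transcript.items()):
    print(group, value)
for transcript, value in sorted(by_transcript.items()):
    print(transcript, value)
```

With three tumor samples and one normal sample in this toy input, the pooled means per transcript sit much closer to the tumor means than to the normal means, which is the same pattern seen when comparing the two tables.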

We can further check the number of samples of each sample type by:

Alternative splicing check12.png

And get the result:

Alternative splicing check13.png

We can see that there are many more tumor samples than normal samples. As a result, the ratio of the mean expression of the two transcripts over all prostate cancer samples (0.28:0.92) is closer to the ratio in the tumor group (0.17:1.03) than to the ratio in the normal group.
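The link between sample counts and the pooled mean can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the function `pooled_mean`, the sample counts, and the group means are all hypothetical and do not come from the prostate cancer data.

```python
from collections import Counter


def pooled_mean(group_means, group_counts):
    """Mean over all samples, computed from per-group means and group sizes."""
    total = sum(group_counts.values())
    return sum(group_means[g] * n for g, n in group_counts.items()) / total


# Hypothetical sample annotations: 8 tumor samples and 2 normal samples.
sample_types = ["Tumor"] * 8 + ["Normal"] * 2
counts = Counter(sample_types)

# Hypothetical per-group mean expression of a single transcript.
group_means = {"Tumor": 0.25, "Normal": 1.0}

# Pooled over the unbalanced sample set: pulled toward the tumor mean.
pooled = pooled_mean(group_means, counts)

# Pooled over a balanced sample set: the plain average of the two group means.
balanced = pooled_mean(group_means, {"Tumor": 5, "Normal": 5})

print(counts["Tumor"], counts["Normal"])
print(pooled, balanced)
```

In this toy input the unbalanced pooled mean lies closer to the tumor mean than the balanced one does, which mirrors why the pooled values in Array Studio track the tumor group.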