
# Two group permutation test

## Overview

The **Two Group Permutation Test** module offers two permutation tests. Each test randomly shuffles the samples between the disease and control groups and recomputes the test statistic. Permutation tests are slow: 100 permutations (the default) can take about 5 minutes for a single dataset with 50,000 probe sets and about 50 chips.

The two test types are COPA and RobustOutlierSum, both of which have been used to detect outliers in differential gene expression. One use of this module would be to detect differences in copy number variation between groups of samples (after generating gene level data).

To run this module, click **MicroArray | Inference | Other Tests | Two group permutation test**.

### Input Data Requirements

This function works on -Omic data types.

## General Options

### Input/Output

**Project & Data**: The window includes a dropdown box to select the Project and Data object to be filtered.

**Variables**: Selections can be made on which variables should be included in the filtering (options include All variables, Selected variables, Visible variables, and Customized variables (select any pre-generated Lists)).

**Observations**: Selections can be made on which observations should be included in the filtering (options include All observations, Selected observations, Visible observations, and Customized observations (select any pre-generated Lists)).

**Output name**: The user can choose to name the output data object.

### Options

**Test Type**: The two tests available are COPA and RobustOutlierSum.

**COPA** has three steps. First, gene expression values are median centered, setting each gene’s median expression value to zero. Second, the median absolute deviation (MAD) is calculated and scaled to 1 by dividing each gene expression value by its MAD. Of note, the median and MAD are used for the transformation, as opposed to the mean and standard deviation (SD), so that outlier expression values do not unduly influence the distribution estimates and are thus preserved after normalization. Third, the 90th percentiles of the transformed expression values are tabulated for each gene, and the genes are then rank-ordered by their percentile scores, providing a prioritized list of outlier profiles. COPA can be thought of as a t-test in which the mean is replaced by the median of all samples and the SD by the MAD.

**RobustOutlierSum**: Several versions of the Outlier Sum statistic have been published. This module uses the implementation of Baolin Wu (“Cancer outlier differential gene expression detection”, Biostatistics 2007). It can be thought of as a t-test in which the mean is replaced by the control group median and the standard deviation by the pooled-group MAD. The statistic is evaluated at the outlier threshold (75th percentile + IQR, the interquartile range).
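The two statistics described above can be sketched for a single gene as follows. This is a minimal illustration under stated assumptions, not the module's actual code: the function names are hypothetical, percentiles use linear interpolation, the MAD is not rescaled by any consistency constant, and the outlier-sum threshold is assumed to be computed on the control samples.

```python
from statistics import median

def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between ranks."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def copa_transform(values):
    """Median-center one gene's values and scale them so the MAD is 1."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; the gene cannot be scaled")
    return [(v - med) / mad for v in values]

def copa_score(values, q=90):
    """COPA score: q-th percentile of the transformed values."""
    return percentile(copa_transform(values), q)

def outlier_sum(control, disease):
    """Outlier-sum style statistic (sketch).

    Values are centered on the control median and scaled by a pooled MAD
    (absolute deviations of every sample from its own group median).
    Disease values above q75 + IQR are summed. Taking q75 and IQR from
    the control group is an assumption of this sketch.
    """
    med_c, med_d = median(control), median(disease)
    pooled_mad = median(
        [abs(v - med_c) for v in control] + [abs(v - med_d) for v in disease]
    )
    if pooled_mad == 0:
        raise ValueError("pooled MAD is zero; the gene cannot be scaled")
    z_control = [(v - med_c) / pooled_mad for v in control]
    z_disease = [(v - med_c) / pooled_mad for v in disease]
    q25, q75 = percentile(z_control, 25), percentile(z_control, 75)
    threshold = q75 + (q75 - q25)
    return sum(z for z in z_disease if z > threshold)
```

Ranking genes by `copa_score` in descending order gives the prioritized list of outlier profiles; a single extreme sample raises the score without inflating the scale, because the median and MAD are insensitive to it.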

**Group**: Select the column containing the sample **Group** from the dropdown box (the choices come from columns in the Design Table).

**Control**: Select the baseline level to compare other sample groups to.

**Hypothesis**: Select "Greater Than" or "Less Than" from the dropdown list. Users interested only in genes that are up-regulated compared to control should choose “Greater Than”; users interested only in genes that are down-regulated compared to control should choose “Less Than”. Currently, this module is one-sided only, so it must be run twice to generate results in both directions.

**Multiplicity**: Select the multiplicity adjustment (None, FDR_BH, FDR_BY, Bonferroni, Sidak, StepDownBonferroni, StepDownSidak, or StepUp). FDR_BH is the default.

**Generate significant list**: Checking this box generates a List of the significant rows based on the test.

**Alpha level**: The user can change this value to generate a different significant list.
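To make the default multiplicity option concrete, here is a small sketch of the standard Benjamini-Hochberg step-up adjustment, followed by selection of significant rows at a chosen alpha level. The function names are hypothetical and this is not taken from the module itself; it only illustrates what an FDR_BH-adjusted p-value and a significant list usually mean.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def significant_rows(pvalues, alpha=0.05):
    """Indices of rows whose adjusted p-value is at or below alpha."""
    return [i for i, p in enumerate(bh_adjust(pvalues)) if p <= alpha]
```

Lowering alpha shrinks the significant list without changing the adjusted p-values themselves.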

**Generate fold change**: Checking this box will result in a column of fold-change values in the output table.

**FC transformation**: The fold change transformation options include Exp2, Exp and Exp10.

**Permutation Number**: Specify the number of permutations (default is 100). Increasing the number of permutations increases the run time of the test.
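The role of the permutation number can be illustrated with a short sketch of a one-sided permutation p-value for a single gene. Everything here is an assumption made for illustration: the function names, the simple median-difference statistic standing in for COPA or RobustOutlierSum, and the common `(count + 1) / (n_perm + 1)` p-value convention. The module's own computation may differ.

```python
import random
from statistics import median

def median_diff(control, disease):
    """Placeholder statistic: disease median minus control median."""
    return median(disease) - median(control)

def permutation_pvalue(control, disease, statistic=median_diff,
                       n_perm=100, greater=True, seed=0):
    """One-sided permutation p-value for a two-group statistic.

    Group labels are shuffled n_perm times; the p-value is the share of
    shuffles whose statistic is at least as extreme as the observed one,
    with 1 added to numerator and denominator (an assumed convention).
    """
    rng = random.Random(seed)
    observed = statistic(control, disease)
    pooled = list(control) + list(disease)
    n_control = len(control)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm = statistic(pooled[:n_control], pooled[n_control:])
        if (perm >= observed) if greater else (perm <= observed):
            extreme += 1
    return (extreme + 1) / (n_perm + 1)
```

Under this convention the smallest attainable p-value is `1 / (n_perm + 1)`, so more permutations give finer p-value resolution at the cost of proportionally longer run time.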

## Output Results

An example Categorical Trait Association report is shown below, containing fold-change, estimate, raw and adjusted p-value: