From Array Suite Wiki

Summary Statistics

Overview

The Summary Statistics command will calculate a set of summary statistics for an -Omic data object's observations or variables, and generate a new Table with the summary information in the Solution Explorer, under the Summary folder. Optionally, the user can choose a "Group" column, so subsets of the data will be independently summarized.

To run this module, select MicroArray | Summarize | Summary Statistics from the menu.

Summary menu.png

Input Data Requirements

This module works on -Omic data types.


General Options

SummStats1.png

Input/Output

  • Project & Data: The window includes a dropdown box to select the Project and Data object to be summarized.
  • Variables: Selections can be made on which variables should be included in the summary (options include All variables, Selected variables, Visible variables, and Customized variables (select any pre-generated Lists)).
  • Observations: Selections can be made on which observations should be included in the summary (options include All observations, Selected observations, Visible observations, and Customized observations (select any pre-generated Lists)).
  • Output name: The user can choose to name the output data object.


Options

  • By: The user can choose to run Summary Statistics on either variables or observations.
    • Summarizing by Variables means that, for each variable, the set of observations will be summarized.
    • Summarizing by Observations means that, for each observation, the set of variables will be summarized.
  • Group: This drop-down box can be used to choose a column (from the Design Table for summarizing variables, or the Annotation Table for summarizing observations) by which to group the data being summarized. For instance, if the experiment contains a Treatment column, the user can select it as the Group to calculate summary statistics separately for each Treatment group.
  • Statistics: This box contains the statistics that the Summary Statistics command can generate. Multiple statistics can be selected (hold Ctrl while clicking); the available statistics are listed in the table below.
  • Quantiles: If Quantiles is selected in the Statistics section, the desired percentage can be entered in the Quantiles box.
  • Append summary statistics to the covariate table: Checking this box will append the newly generated statistics column to the relevant metadata table. This column can then be used in future filtering operations.
  • Generate observation box plots: Checking this box will create a box plot based on the summary statistics options selected above.
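
The difference between the two By settings can be illustrated with a minimal Python sketch. This is not Array Studio code: the toy matrix, the probe and sample names, and both function names are hypothetical, and only N and Mean are computed.

```python
from statistics import mean

# Hypothetical toy data: each key is a variable (e.g. a probe),
# each inner key is an observation (e.g. a sample).
data = {
    "probe_1": {"sample_A": 2.0, "sample_B": 4.0, "sample_C": 6.0},
    "probe_2": {"sample_A": 1.0, "sample_B": 3.0, "sample_C": 8.0},
}

def summarize_by_variable(matrix):
    """For each variable, summarize across its observations."""
    return {
        var: {"N": len(row), "Mean": mean(row.values())}
        for var, row in matrix.items()
    }

def summarize_by_observation(matrix):
    """For each observation, summarize across all variables."""
    observations = list(next(iter(matrix.values())).keys())
    result = {}
    for obs in observations:
        values = [row[obs] for row in matrix.values()]
        result[obs] = {"N": len(values), "Mean": mean(values)}
    return result

by_variable = summarize_by_variable(data)        # one row per probe
by_observation = summarize_by_observation(data)  # one row per sample
print(by_variable)
print(by_observation)
```

In this sketch, summarizing by variable yields one result row per probe (each computed over three samples), while summarizing by observation yields one result row per sample (each computed over two probes).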

Summarization Methods used in Array Studio

Option                  Meaning
N                       number of data points
Mean                    average
StdDev                  standard deviation
Min                     minimum value in that variable or observation
Max                     maximum value in that variable or observation
MinAbs                  minimum absolute value
MaxAbs                  maximum absolute value
Range                   range of values in that variable or observation
NMissing                number of missing values in that variable or observation
NMissingPercentage      percentage of missing values in that variable or observation
NNotMissing             number of non-missing values in that variable or observation
NNotMissingPercentage   percentage of non-missing values in that variable or observation
Sum                     sum of values for that variable or observation
Variance                variance of values for that variable or observation
StdErr                  standard error for that variable or observation
CV                      coefficient of variation
Median                  median for that variable or observation
IQR                     interquartile range for that variable or observation
Skewness                skewness for that variable or observation
Kurtosis                kurtosis for that variable or observation
MAD                     median absolute deviation for that variable or observation
NPositive               number of positive data points for that variable or observation
NNegative               number of negative data points for that variable or observation
PositivePercentage      percentage of positive data points for that variable or observation
NegativePercentage      percentage of negative data points for that variable or observation
PositiveChangeSize      maximal positive value * percentage of positive values for that variable or observation
NegativeChangeSize      minimal negative value * percentage of negative values for that variable or observation
PositiveMean            average of positive values for that variable or observation
NegativeMean            average of negative values for that variable or observation
GeometricMean           geometric mean (the n-th root of the product of n values) for that variable or observation
Quantiles               quantile value for that variable or observation, set according to the Quantiles percentage box
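
For readers who want to cross-check a few of these statistics outside Array Studio, the following Python sketch computes some of them for a single variable or observation. It is an illustration under stated assumptions, not the Array Studio implementation: missing values are represented as None, StdDev uses the sample (n - 1) formula, CV is reported as a ratio rather than a percentage, MAD is unscaled, and percentages are taken over the non-missing values. Array Studio may use different conventions for any of these. The function name `summarize` is hypothetical.

```python
import math
from statistics import geometric_mean, mean, median, stdev

def summarize(values):
    """Compute a few of the listed statistics for one variable or observation.

    `values` may contain None to mark missing data points (an assumption
    of this sketch, not Array Studio's internal representation).
    """
    present = [v for v in values if v is not None]
    positives = [v for v in present if v > 0]
    negatives = [v for v in present if v < 0]
    n = len(present)
    avg = mean(present)
    sd = stdev(present)   # sample standard deviation (n - 1); assumed
    med = median(present)
    return {
        "NNotMissing": n,
        "NMissing": len(values) - n,
        "Mean": avg,
        "StdDev": sd,
        "StdErr": sd / math.sqrt(n),
        "CV": sd / avg,   # as a ratio; assumed
        "MinAbs": min(abs(v) for v in present),
        "MaxAbs": max(abs(v) for v in present),
        "Range": max(present) - min(present),
        "Median": med,
        "MAD": median([abs(v - med) for v in present]),  # unscaled; assumed
        "NPositive": len(positives),
        "PositivePercentage": 100.0 * len(positives) / n,  # of non-missing; assumed
        "PositiveMean": mean(positives) if positives else None,
        "NegativeMean": mean(negatives) if negatives else None,
    }

# Hypothetical input with one missing value.
stats = summarize([-2.0, 1.0, None, 3.0, 6.0])
for name, value in stats.items():
    print(f"{name}: {value}")

# The geometric mean is only defined for positive values,
# so it is shown on a separate, all-positive input.
gm = geometric_mean([1.0, 4.0, 16.0])
print("GeometricMean:", gm)
```

Skewness, Kurtosis, IQR, and Quantiles are left out of the sketch because several competing definitions exist and the source does not say which one Array Studio uses.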


Output Results

This module generates a new Table in the Solution Explorer under the Summary folder. For every Variable or Observation, the selected summaries of Observations or Variables, respectively, will be displayed.

For example, the following table is output from summarizing N (count) and mean By Variable, and grouping the Observations by the design table's Cell Line column.

SummaryStatisticsOutput.png

For each sample group, the N and mean are displayed for each Variable.
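
The shape of such a grouped output can be sketched in Python as follows. The design table, cell line names, gene names, and the "Group.Statistic" column naming are all hypothetical; the actual column labels in Array Studio may differ.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical design table: observation -> Cell Line.
design = {"s1": "HeLa", "s2": "HeLa", "s3": "MCF7", "s4": "MCF7"}

# Hypothetical data: variable -> {observation: value}.
data = {
    "gene_1": {"s1": 1.0, "s2": 3.0, "s3": 10.0, "s4": 20.0},
    "gene_2": {"s1": 5.0, "s2": 7.0, "s3": 2.0, "s4": 4.0},
}

def grouped_summary(matrix, group_of):
    """Return one row per variable with N and Mean columns for each group."""
    table = {}
    for var, row in matrix.items():
        # Collect this variable's values separately for each group.
        buckets = defaultdict(list)
        for obs, value in row.items():
            buckets[group_of[obs]].append(value)
        table[var] = {}
        for group, values in sorted(buckets.items()):
            table[var][f"{group}.N"] = len(values)
            table[var][f"{group}.Mean"] = mean(values)
    return table

summary = grouped_summary(data, design)
for var, columns in summary.items():
    print(var, columns)
```

Each variable gets one row, and each group contributes one column per selected statistic, which mirrors the layout described above.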


OmicScript

Summarize


Related Articles