Table SummarizeMissingPattern.pdf

From Array Suite Wiki

Summarize Missing Pattern

Overview

The Summarize Missing Pattern command summarizes missing data in a Table object, generating a new table that lists each pattern of missing data and its frequency. It is most commonly used to check whether the data contains particular patterns of missing values (for instance, whole observations or whole variables with missing data). The results are generated in the Solution Explorer under the Summarize folder of the Table section. The command can be accessed by going to: Table | Summarize Missing Pattern.

Table SummarizeMissingPattern Menu.png

Input Data Requirements

This function works on all Table data, including design and annotation tables.

Step 1: Select source table

The user will first be asked to choose the table to summarize:

SelectData2.png


Step 2: Select columns for summarization

SumMissPat1.png

  • The user can specify which columns to include in the summarization, or click "Select all" to include every column.


Output Results

Clicking the Submit button generates a new table of missing-data patterns in the Solution Explorer under the Summarize folder.

Each pattern is shown in the Pattern column as a string with one character per selected column, in column order: '0' (data present) or '1' (data missing). The remaining columns are:

  • Count: The number of rows matching this pattern.
  • N Cols Missing: The number of columns with missing data in this pattern.
  • One column per selected input column, showing whether that column is Present or Missing in this pattern.
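The output layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the command's actual implementation: the function name summarize_missing_pattern is hypothetical, the table is modeled as a list of dictionaries, missing values are modeled as None, and the row ordering of the result is an arbitrary choice.

```python
from collections import Counter


def summarize_missing_pattern(rows, columns):
    """Return one summary record per distinct missing-data pattern.

    Hypothetical helper: `rows` is a list of dicts and a value of None
    (or an absent key) is treated as missing.
    """
    # Build a '0'/'1' string per row: '1' where the value is missing.
    counts = Counter(
        "".join("1" if row.get(col) is None else "0" for col in columns)
        for row in rows
    )
    summary = []
    # Assumed ordering: fewest missing columns first, then by pattern string.
    ordered = sorted(counts.items(), key=lambda kv: (kv[0].count("1"), kv[0]))
    for pattern, count in ordered:
        record = {
            "Pattern": pattern,
            "Count": count,
            "N Cols Missing": pattern.count("1"),
        }
        # One Present/Missing status column per selected input column.
        for col, flag in zip(columns, pattern):
            record[col] = "Missing" if flag == "1" else "Present"
        summary.append(record)
    return summary


# Made-up data for illustration only.
rows = [
    {"Gene1": 1.2, "Gene2": 0.4},
    {"Gene1": None, "Gene2": 0.9},
    {"Gene1": 2.1, "Gene2": 3.3},
]
for record in summarize_missing_pattern(rows, ["Gene1", "Gene2"]):
    print(record)
```

With this made-up input, two rows share the pattern "00" and one row has the pattern "10", so the summary contains two records.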

See below for an example.

Example Usage

Given the following starting table and Summarize Missing Pattern options:

SumMissPat2.png SumMissPat3.png

In this table, Sample D and Sample F have missing values for both Gene1 and Gene2, and Sample A has a missing value for Gene4 only. The resulting missing pattern table is:

SumMissPat4.png

Thus, there are 3 distinct patterns across the selected columns. The first row means that 3 rows have no missing values. The second row shows that 1 row has the pattern "000100", which means that the fourth column has a missing value. The third row shows that 2 rows have the pattern "101000", which means that the first and third columns have missing values.
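Reading a pattern string by eye becomes error-prone as the number of columns grows. As a small sketch (the helper name missing_positions is hypothetical and not part of the software), a pattern can be decoded into the 1-based positions of its missing columns:

```python
def missing_positions(pattern):
    """Return the 1-based positions of columns flagged as missing ('1')."""
    return [i for i, flag in enumerate(pattern, start=1) if flag == "1"]


print(missing_positions("000000"))  # []
print(missing_positions("000100"))  # [4]
print(missing_positions("101000"))  # [1, 3]
```

The three calls correspond to the three patterns in the example: no missing columns, the fourth column missing, and the first and third columns missing.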
