From Array Suite Wiki

Impute Missing Values

Overview

The Impute command will replace missing data values based on one of several available imputation methods. It generates a new "Microarray" data type.

Input Data Requirements

The command works on "Microarray" data types that contain missing data.

To run this module, select MicroArray | Preprocess | Impute from the menu.

General Options

Input/Output


  • Project & Data: The window includes a dropdown box to select the Project and Data object to be imputed.
  • Variables: Selects which variables are included in the imputation. Options are All variables, Selected variables, Visible variables, and Customized variables (select any pre-generated List).
  • Observations: Selects which observations are included in the imputation. Options are All observations, Selected observations, Visible observations, and Customized observations (select any pre-generated List).
  • Output name: The user can choose to name the output data object.


Options

  • Imputation method: The user specifies the imputation method. Options include FixedNumber, FixedPercentile, RowPercentile, ColumnPercentile, RowAverage, and KNN.
  • The user also has additional options for Fixed number, Fixed percentile, Row percentile, Column percentile, and Neighbor number, where applicable for the selected Imputation method.

Imputation Methods

FixedNumber

Replaces each missing data point with the specified fixed number.
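
The FixedNumber behavior can be illustrated outside Array Studio with a minimal NumPy sketch. The function name `impute_fixed_number` is hypothetical, and the sketch assumes missing values are represented as NaN; it is not the Array Studio implementation.

```python
import numpy as np

def impute_fixed_number(data, value):
    """Return a copy of data with every NaN replaced by value (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    # Keep observed cells, substitute the fixed number where a cell is missing.
    return np.where(np.isnan(data), value, data)

example = np.array([[1.0, np.nan],
                    [np.nan, 4.0]])
filled_fixed = impute_fixed_number(example, 0.0)
```

The input array is left unchanged, which mirrors supplying an Output name rather than changing the input data.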

FixedPercentile

Replaces each missing data point with a value found by calculating the specified percentile for all cells in the dataset.
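
As a rough illustration of FixedPercentile, the sketch below computes one percentile over the whole matrix and uses it for every missing cell. It assumes missing values are NaN, that they are excluded from the percentile calculation, and that NumPy's default linear interpolation is acceptable; Array Studio may define the percentile differently. The name `impute_fixed_percentile` is hypothetical.

```python
import numpy as np

def impute_fixed_percentile(data, percentile):
    """Fill every NaN with one percentile computed over all observed cells."""
    data = np.asarray(data, dtype=float)
    # nanpercentile ignores NaN cells; percentile is given on a 0-100 scale.
    fill = np.nanpercentile(data, percentile)
    return np.where(np.isnan(data), fill, data)

example = np.array([[1.0, 2.0, np.nan],
                    [3.0, np.nan, 5.0]])
filled_global = impute_fixed_percentile(example, 50)
```

In this example the observed values are 1, 2, 3, and 5, so the 50th percentile (2.5) fills both missing cells.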

RowPercentile

Replaces each missing data point with a value found by calculating the specified percentile for all cells in that particular row.

ColumnPercentile

Replaces each missing data point with a value found by calculating the specified percentile for all cells in that particular column.
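
RowPercentile and ColumnPercentile differ only in the axis along which the percentile is taken, which the following NumPy sketch tries to make concrete. It assumes missing values are NaN and are ignored when computing each percentile; the function name `impute_axis_percentile` and its `axis` parameter are hypothetical and not part of Array Studio.

```python
import numpy as np

def impute_axis_percentile(data, percentile, axis):
    """Fill NaNs with a per-row (axis=1) or per-column (axis=0) percentile."""
    data = np.asarray(data, dtype=float)
    # keepdims=True keeps the result broadcastable against the original matrix.
    fill = np.nanpercentile(data, percentile, axis=axis, keepdims=True)
    return np.where(np.isnan(data), fill, data)

example = np.array([[1.0, np.nan, 3.0],
                    [4.0, 6.0, np.nan],
                    [np.nan, 8.0, 9.0]])
by_row = impute_axis_percentile(example, 50, axis=1)  # RowPercentile analogue
by_col = impute_axis_percentile(example, 50, axis=0)  # ColumnPercentile analogue
```

With the 50th percentile, the missing cell in the first row becomes 2.0 by row (median of 1 and 3) but 7.0 by column (median of 6 and 8). A row or column with no observed values stays missing in this sketch, because there is nothing to compute a percentile from.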

KNN (K nearest neighbor)

Replaces each missing data point by finding the specified number of nearest neighbors in the space of genes (similar to http://rss.acs.unt.edu/Rdoc/library/impute/html/impute.knn.html).
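
The general idea of KNN imputation can be sketched in NumPy as follows. This is a simplified illustration, not the Array Studio or R `impute.knn` implementation: it assumes rows are genes, missing values are NaN, distance is the mean squared difference over the columns observed in both rows, and a missing cell is filled with the plain average of the neighbors' observed values in that column. The name `impute_knn` and the parameter `k` (standing in for Neighbor number) are hypothetical.

```python
import numpy as np

def impute_knn(data, k):
    """Fill NaNs in each row using the average of its k nearest rows."""
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    result = data.copy()
    n_rows = data.shape[0]
    for i in np.flatnonzero(missing.any(axis=1)):
        observed = ~missing[i]
        # Differences to every row, restricted to columns observed in row i.
        diff = data[:, observed] - data[i, observed]
        shared = ~np.isnan(diff)
        counts = shared.sum(axis=1)
        squared = np.where(shared, diff, 0.0) ** 2
        # Rows sharing no observed column with row i get infinite distance.
        dist = np.full(n_rows, np.inf)
        usable = counts > 0
        dist[usable] = squared[usable].sum(axis=1) / counts[usable]
        dist[i] = np.inf  # a row is never its own neighbor
        neighbors = np.argsort(dist)[:k]
        neighbors = neighbors[np.isfinite(dist[neighbors])]
        for j in np.flatnonzero(missing[i]):
            values = data[neighbors, j]
            values = values[~np.isnan(values)]
            if values.size:  # leave the cell missing if no neighbor has a value
                result[i, j] = values.mean()
    return result

example = np.array([[1.0, 2.0, 3.0],
                    [1.0, 2.0, np.nan],
                    [10.0, 20.0, 30.0],
                    [1.1, 2.1, 3.2]])
filled_knn = impute_knn(example, k=2)
```

Here the second row is closest to the first and fourth rows, so its missing value is filled with the average of 3.0 and 3.2 rather than being influenced by the distant third row.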

Output Results

The Output type will either be set to Change input data, in which case the original Data object will be permanently changed, or if the user enters a name in the Output name field, the Output type will switch to Imputed Microarray Data, and a new Data object will be created in the Solution Explorer.

WARNING: If no output name is specified, the original MicroArray data will be overwritten by the new imputed data.

