Impute Missing Values
Overview
The Impute command will replace missing data values based on one of several available imputation methods. It generates a new "Microarray" data type.
Input Data Requirements
The command works on "Microarray" data types that contain missing data.
To run this module, select MicroArray | Preprocess | Impute from the menu.
General Options
Input/Output
- Project & Data: The window includes a dropdown box to select the Project and Data object to be imputed.
- Variables: Selects which variables are included in the imputation. Options include All variables, Selected variables, Visible variables, and Customized variables (select any pre-generated List).
- Observations: Selects which observations are included in the imputation. Options include All observations, Selected observations, Visible observations, and Customized observations (select any pre-generated List).
- Output name: The user can choose to name the output data object.
Options
- Imputation method: Specifies the imputation method. Options include FixedNumber, FixedPercentile, RowPercentile, ColumnPercentile, RowAverage, and KNN.
- Additional options: Fixed number, Fixed percentile, Row percentile, Column percentile, and Neighbor number are available where applicable for the selected Imputation method.
The imputation methods work as follows:
- FixedNumber: replaces each missing data point with the specified fixed number.
- FixedPercentile: replaces each missing data point with the specified percentile calculated over all cells in the dataset.
- RowPercentile: replaces each missing data point with the specified percentile calculated over all cells in that particular row.
- ColumnPercentile: replaces each missing data point with the specified percentile calculated over all cells in that particular column.
- KNN (K nearest neighbor): replaces each missing data point using the specified number of nearest neighbors in the space of genes (similar to http://rss.acs.unt.edu/Rdoc/library/impute/html/impute.knn.html).
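The methods above can be illustrated with a minimal Python/NumPy sketch. This is not Array Studio's implementation. The function names `impute` and `impute_knn`, their parameters, and the layout (rows are genes, columns are observations, missing cells are NaN) are assumptions made for illustration. RowAverage is listed as an option but not described on this page, so the sketch assumes it means the mean of the observed cells in the row. The KNN variant assumes a Euclidean-style distance averaged over the columns observed in both rows, which may differ from what the software does.

```python
import numpy as np

def impute(data, method, fixed_number=0.0, percentile=50.0):
    """Return a copy of `data` with NaN cells filled (hypothetical helper)."""
    x = np.array(data, dtype=float)          # copy, so the input is untouched
    missing = np.isnan(x)
    rows, cols = np.nonzero(missing)         # same row-major order as x[missing]
    if method == "FixedNumber":
        x[missing] = fixed_number
    elif method == "FixedPercentile":
        # Percentile over every observed cell in the dataset.
        x[missing] = np.nanpercentile(x, percentile)
    elif method == "RowPercentile":
        x[missing] = np.nanpercentile(x, percentile, axis=1)[rows]
    elif method == "ColumnPercentile":
        x[missing] = np.nanpercentile(x, percentile, axis=0)[cols]
    elif method == "RowAverage":
        # Assumption: mean of the observed cells in the same row.
        x[missing] = np.nanmean(x, axis=1)[rows]
    else:
        raise ValueError(f"unknown method: {method}")
    return x

def impute_knn(data, neighbor_number=2):
    """Fill each NaN with the mean of that column over the nearest rows.

    Distance between two rows is the root mean squared difference over
    the columns observed in both (an assumption for this sketch).
    """
    x = np.array(data, dtype=float)
    out = x.copy()
    observed = ~np.isnan(x)
    for i in np.nonzero(~observed.all(axis=1))[0]:
        both = observed & observed[i]                 # columns usable per row
        counts = both.sum(axis=1)
        diff = np.where(both, x - x[i], 0.0)
        dist = np.sqrt((diff ** 2).sum(axis=1) / np.maximum(counts, 1))
        dist[counts == 0] = np.inf                    # no shared columns
        dist[i] = np.inf                              # a row is not its own neighbor
        order = np.argsort(dist)
        for j in np.nonzero(~observed[i])[0]:
            nearest = [r for r in order
                       if np.isfinite(dist[r]) and observed[r, j]]
            nearest = nearest[:neighbor_number]
            if nearest:                               # otherwise leave the cell as NaN
                out[i, j] = x[nearest, j].mean()
    return out
```

For example, `impute(matrix, "RowPercentile", percentile=50)` fills each missing cell with the median of the observed values in its row. Both functions return a new array rather than modifying the input.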
Output Results
When no output name is given, the Output type is set to Change input data, and the original Data object is permanently changed. If the user enters a name in the Output name field, the Output type switches to Imputed Microarray Data, and a new Data object is created in the Solution Explorer.
WARNING: If no output name is specified, the original Microarray data will be overwritten by the imputed data.