Impute Missing Values
The Impute command will replace missing data values based on one of several available imputation methods. It generates a new "Microarray" data type.
Input Data Requirements
It works on “Microarray” data types that contain missing data.
To run this module, select MicroArray | Preprocess | Impute from the menu.
- Project & Data: The window includes a dropdown box to select the Project and Data object to be imputed.
- Variables: Selections can be made on which variables should be included in the imputation (options include All variables, Selected variables, Visible variables, and Customized variables (select any pre-generated Lists)).
- Observations: Selections can be made on which observations should be included in the imputation (options include All observations, Selected observations, Visible observations, and Customized observations (select any pre-generated Lists)).
- Output name: The user can choose to name the output data object.
- Imputation method: The user specifies the Imputation method. Options include FixedNumber, FixedPercentile, RowPercentile, ColumnPercentile, RowAverage, and KNN.
- The user also has additional options for Fixed number, Fixed percentile, Row percentile, Column percentile, and Neighbor number, where applicable for the selected Imputation method.
The imputation methods work as follows:
- FixedNumber: replaces each missing data point with the specified fixed number.
- FixedPercentile: replaces each missing data point with a value found by calculating the specified percentile for all cells in the dataset.
- RowPercentile: replaces each missing data point with a value found by calculating the specified percentile for all cells in that particular row.
- ColumnPercentile: replaces each missing data point with a value found by calculating the specified percentile for all cells in that particular column.
- KNN (K nearest neighbor): replaces each missing data point using the specified number of nearest neighbors in the space of genes (similar to http://rss.acs.unt.edu/Rdoc/library/impute/html/impute.knn.html).
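The method descriptions above can be illustrated with a small Python sketch. It is not the module's actual implementation. The names `impute`, `knn_fill`, `percentile`, and `observed` are hypothetical, and missing cells are represented by `None`. The percentile is computed with linear interpolation between ranks, and the KNN step averages the values of the k closest rows using a root-mean-square distance over shared columns; both choices are assumptions, since the text does not specify them. RowAverage is omitted because it is not described above.

```python
import math

MISSING = None  # marker for a missing cell in this sketch (an assumption)


def observed(cells):
    """Return only the non-missing values from an iterable of cells."""
    return [v for v in cells if v is not MISSING]


def percentile(values, pct):
    """Percentile with linear interpolation between closest ranks (assumed convention)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no observed values to compute a percentile from")
    rank = (len(ordered) - 1) * pct / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def knn_fill(data, r, c, k):
    """Average column c over the k rows closest to row r (assumed distance: RMS over shared columns)."""
    target = data[r]
    scored = []
    for i, other in enumerate(data):
        if i == r or other[c] is MISSING:
            continue  # a neighbor must have an observed value in column c
        shared = [(a, b) for a, b in zip(target, other)
                  if a is not MISSING and b is not MISSING]
        if not shared:
            continue  # no common columns, so no distance can be computed
        dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
        scored.append((dist, other[c]))
    if not scored:
        raise ValueError("no usable neighbors for row %d, column %d" % (r, c))
    scored.sort(key=lambda pair: pair[0])
    nearest = scored[:k]
    return sum(value for _, value in nearest) / len(nearest)


def impute(data, method, fixed_number=0.0, pct=50.0, k=2):
    """Return a new matrix (rows = variables, columns = observations) with missing cells filled."""
    n_rows, n_cols = len(data), len(data[0])
    all_cells = observed(v for row in data for v in row)
    columns = [[data[r][c] for r in range(n_rows)] for c in range(n_cols)]
    result = [list(row) for row in data]  # copy, so the input is left unchanged
    for r, row in enumerate(data):
        for c, value in enumerate(row):
            if value is not MISSING:
                continue
            if method == "FixedNumber":
                fill = fixed_number
            elif method == "FixedPercentile":
                fill = percentile(all_cells, pct)
            elif method == "RowPercentile":
                fill = percentile(observed(row), pct)
            elif method == "ColumnPercentile":
                fill = percentile(observed(columns[c]), pct)
            elif method == "KNN":
                fill = knn_fill(data, r, c, k)
            else:
                raise ValueError("unsupported method: " + method)
            result[r][c] = fill
    return result
```

Note that every fill value is computed from the original matrix rather than from cells already imputed, so the order in which missing cells are visited does not affect the result.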
The Output type is set to one of two values. With Change input data, the original Data object is permanently changed. If the user enters a name in the Output name field, the Output type switches to Imputed Microarray Data, and a new Data object is created in the Solution Explorer.