From Array Suite Wiki

Estimate Kernel Density


Kernel density estimation (also called the Parzen window method) is a non-parametric way of estimating the probability density function of a random variable. For example, given data from a sample of a population, kernel density estimation produces a smooth estimate of the underlying distribution, from which inferences about the entire population can be made. For more information on kernel density estimation, see

One common use of this module is to visualize whether the data approximate a normal distribution, before running statistical modules designed for microarray data.

To run this module, select MicroArray | Summarize | Kernel Density from the menu.

Kernel menu.png

Input Data Requirements

This module works on -Omic data types.

General Options



  • Project & Data: The window includes a dropdown box to select the Project and Data object to be analyzed.
  • Variables: Selects which variables are included in the analysis (options include All variables, Selected variables, Visible variables, and Customized variables (select any pre-generated Lists)).
  • Observations: Selects which observations are included in the analysis (options include All observations, Selected observations, Visible observations, and Customized observations (select any pre-generated Lists)).
  • Output name: The user can choose to name the output data object.


  • Kernel type: The kernel function used to fit the density estimate.
    • Users can choose among Gaussian, Epanechnikov, Rectangular, Triangular, Biweight, Cosine, or OptCosine, depending on the expected best-fit distribution for the data.
  • N (estimation precision): The number of bins, that is, the number of X axis positions at which the density is estimated. Intuitively, one wants to choose N as large as possible; however, there is always a trade-off between the bias of the estimator and its variance. The default value is 512.
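
To make the Kernel type and N options more concrete, the following is a minimal Python sketch of a kernel density estimate evaluated at N equally spaced positions. It is not the module's implementation: the function names, the textbook kernel formulas, the normal-reference rule-of-thumb bandwidth, and the padding of the grid by three bandwidths are all assumptions made for illustration.

```python
import math

# Textbook kernel shapes (assumed for illustration; the module's exact
# scaling of each kernel is not documented on this page).
KERNELS = {
    "gaussian": lambda u: math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi),
    "epanechnikov": lambda u: 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0,
    "rectangular": lambda u: 0.5 if abs(u) <= 1.0 else 0.0,
    "triangular": lambda u: 1.0 - abs(u) if abs(u) <= 1.0 else 0.0,
}


def rule_of_thumb_bandwidth(values):
    """Normal-reference bandwidth 1.06 * sd * n^(-1/5) (an assumed default)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return 1.06 * sd * n ** (-0.2)


def kernel_density(values, kernel="gaussian", n_points=512, bandwidth=None):
    """Return (xs, densities) on n_points equally spaced X positions."""
    k = KERNELS[kernel]
    h = bandwidth if bandwidth is not None else rule_of_thumb_bandwidth(values)
    # Pad the grid by 3 bandwidths so the tails of the estimate are covered.
    lo, hi = min(values) - 3.0 * h, max(values) + 3.0 * h
    step = (hi - lo) / (n_points - 1)
    xs = [lo + i * step for i in range(n_points)]
    n = len(values)
    dens = [sum(k((x - v) / h) for v in values) / (n * h) for x in xs]
    return xs, dens


# Hypothetical expression values for one sample.
sample = [6.1, 6.8, 7.2, 7.4, 7.9, 8.3, 8.4, 9.0, 9.6, 10.2]
xs, dens = kernel_density(sample, kernel="gaussian", n_points=512)
area = sum(dens) * (xs[1] - xs[0])  # Riemann sum of the estimated density
print(len(xs), round(area, 3))
```

In this sketch, N only controls how finely the curve is sampled, while the smoothness of the estimate is governed by the bandwidth. Whether the module couples the two differently is not stated here.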

Output Results

An example Kernel Density plot is shown below and appears as a DensityView in the Summary folder of the Tables section in the Project Explorer.


If users add a Table view for the Density result, the table will have twice as many columns as there are samples, because each sample contributes two columns.

Density Table.png

Taking chip '01 A' as an example, the first column '01 A X' shows the X axis position and the column '01 A' shows the corresponding density value. If users keep the default N (estimation precision), there will be 512 rows, corresponding to 512 bins.
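
The table layout described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names, the fixed bandwidth of 0.5, the Gaussian kernel, and the second sample name '01 B' are hypothetical, and each sample is assumed to be evaluated on its own X grid (which would explain why every sample needs its own X column).

```python
import math


def gaussian_kde(values, n_points, bandwidth):
    """Gaussian kernel density on n_points equally spaced X positions."""
    lo = min(values) - 3.0 * bandwidth
    hi = max(values) + 3.0 * bandwidth
    step = (hi - lo) / (n_points - 1)
    xs = [lo + i * step for i in range(n_points)]
    norm = len(values) * bandwidth * math.sqrt(2.0 * math.pi)
    dens = [
        sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values) / norm
        for x in xs
    ]
    return xs, dens


def density_table(samples, n_points=512, bandwidth=0.5):
    """Build {column name: column values} with two columns per sample."""
    table = {}
    for name, values in samples.items():
        xs, dens = gaussian_kde(values, n_points, bandwidth)
        table[name + " X"] = xs   # X axis positions for this sample
        table[name] = dens        # density values at those positions
    return table


# Hypothetical data for two chips.
samples = {
    "01 A": [6.1, 6.8, 7.2, 7.9, 8.4, 9.0],
    "01 B": [5.9, 6.5, 7.0, 7.7, 8.8, 9.5],
}
table = density_table(samples)
print(list(table))            # column names, two per sample
print(len(table["01 A X"]))   # number of rows equals n_points
```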

Users can also calculate other parameters of the data distribution in the Microarray Summary Statistics module, such as skewness and kurtosis.
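
For readers who want to see what skewness and kurtosis measure, here is a short Python sketch using the plain moment-based definitions. The function name is hypothetical, and the Microarray Summary Statistics module may use a different (for example bias-corrected) estimator, so values need not match exactly.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (0 and 0 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis


# Hypothetical right-skewed data: one large value pulls the tail to the right.
skew, kurt = skewness_and_excess_kurtosis([1.0, 1.0, 1.0, 2.0, 10.0])
print(skew > 0)
```

A skewness far from zero, or a strongly positive or negative excess kurtosis, suggests that the density curve departs from a normal shape.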



Related Articles