Kernel Density Estimation (KDE) is a non-parametric statistical method for estimating the probability density function of a random variable. In G2, the KDE data transform applies kernel density estimation to a specified data field and generates probability density function (PDF) data. Under the hood it uses the open-source library pdfast, which employs a triangular kernel function and is optimized to O(N + K) time complexity.
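To make the idea concrete, here is a minimal sketch of a triangular-kernel density estimate on an evenly spaced grid. It is a simplified illustration, not pdfast's actual implementation: the function name `triangularKde`, the grid spacing, and the weighting formula are assumptions, and this naive version runs in roughly O(N × width + K) rather than the optimized O(N + K).

```javascript
// Simplified triangular-kernel density estimate (illustrative only, not pdfast's code).
function triangularKde(values, { min, max, size = 10, width = 2 } = {}) {
  if (values.length === 0) return [];
  const lo = min ?? Math.min(...values);
  const hi = max ?? Math.max(...values);
  // Assumed grid: `size` evenly spaced points covering [lo, hi].
  const step = (hi - lo) / (size - 1) || 1; // fall back to 1 when hi === lo
  const density = new Array(size).fill(0);

  for (const v of values) {
    // Nearest grid index for this observation.
    const center = Math.round((v - lo) / step);
    // Spread a triangular weight over `width` grid points on each side.
    for (let offset = -width; offset <= width; offset++) {
      const i = center + offset;
      if (i < 0 || i >= size) continue;
      density[i] += 1 - Math.abs(offset) / (width + 1);
    }
  }

  // Normalise so that the density values sum to 1.
  const total = density.reduce((a, b) => a + b, 0);
  return density.map((d, i) => ({ x: lo + i * step, y: total ? d / total : 0 }));
}

const pdf = triangularKde([1, 2, 2, 3, 3, 3, 4, 4, 5], { size: 9, width: 2 });
console.log(pdf);
```

Each observation adds weight to its nearest grid point and, with linearly decreasing weight, to its neighbours, which is what makes the resulting curve smoother than a plain histogram.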
After processing, two fields (by default `y` and `size`) are added to the data. Both are arrays: the first holds the points at which the density is evaluated, and the second holds the corresponding density values.
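The shape of the transformed data can be sketched as follows. This is an illustration under stated assumptions rather than G2's source code: `groupToDensityRows` and `flatEstimate` are hypothetical names, the estimator is a placeholder that returns two points of equal density, and the sample rows are made up.

```javascript
// Hypothetical helper showing the *shape* of KDE output: one row per group,
// with two array-valued fields whose names come from `as`.
function groupToDensityRows(data, { field, groupBy, as = ['y', 'size'] }, estimate) {
  const groups = new Map();
  for (const row of data) {
    const key = JSON.stringify(groupBy.map((k) => row[k]));
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  const [pointsField, densityField] = as;
  return [...groups.values()].map((rows) => {
    const pdf = estimate(rows.map((r) => r[field])); // [{ x, y }, ...]
    const out = {};
    for (const k of groupBy) out[k] = rows[0][k]; // keep the grouping fields
    out[pointsField] = pdf.map((p) => p.x); // points where density is evaluated
    out[densityField] = pdf.map((p) => p.y); // density value at each point
    return out;
  });
}

// Placeholder estimator (assumption for illustration only).
const flatEstimate = (values) => [
  { x: Math.min(...values), y: 0.5 },
  { x: Math.max(...values), y: 0.5 },
];

// Made-up sample rows.
const data = [
  { x: 'a', species: 's1', y: 1 },
  { x: 'a', species: 's1', y: 3 },
  { x: 'a', species: 's2', y: 2 },
  { x: 'a', species: 's2', y: 6 },
];

const rows = groupToDensityRows(data, { field: 'y', groupBy: ['x', 'species'] }, flatEstimate);
console.log(rows);
// One row per group, e.g. { x: 'a', species: 's1', y: [1, 3], size: [0.5, 0.5] }
```

Because both new fields are arrays of equal length, a mark can pair each evaluation point with its density value by index.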
Density Plot: Display continuous estimation of data distribution, showing data distribution more smoothly than histograms.
Violin Plot: Combine the characteristics of box plots and density plots, which can display both the distribution shape of data and key statistical information.
Multi-group data distribution comparison: Through the `groupBy` parameter, you can display and compare the data distributions of multiple groups at the same time.
Smooth data visualization: When you need to smooth discrete data points and show their overall trends and distribution.
Density analysis in different coordinate systems: Can be applied in Cartesian or polar coordinate systems to create data distribution visualizations from different perspectives.
Property | Description | Type | Default | Required |
---|---|---|---|---|
field | Data field for kernel density algorithm | string | - | Yes |
groupBy | Grouping fields for data grouping, multiple fields can be specified | string[] | - | Yes |
as | Fields to store after KDE processing | [string, string] | ['y', 'size'] | No |
min | Minimum value of the processing range | number | Data minimum | No |
max | Maximum value of the processing range | number | Data maximum | No |
size | Number of data items generated by the algorithm, larger values result in finer density curves | number | 10 | No |
width | Determines how many points an element affects on the left and right, similar to bandWidth, larger values result in smoother curves | number | 2 | No |
size: This parameter determines the fineness of the generated density curve. Larger values generate more points and therefore a finer curve. The examples increase it from the default of 10 to 20 or 30.
width: This parameter controls the smoothness of the density curve, similar to the bandwidth parameter in kernel density estimation. Larger values result in smoother curves but may lose some details.
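The interaction between the two parameters can be sketched numerically. The helper below is hypothetical and rests on two assumptions that are not confirmed by the text above: that the output grid is evenly spaced and includes both `min` and `max`, and that `width` is counted in grid points, as the table's description suggests.

```javascript
// Hypothetical helper: describe the output grid implied by KDE options.
// Assumes an evenly spaced grid over [min, max] and `width` measured in grid points.
function describeGrid({ min, max, size = 10, width = 2 }) {
  const step = (max - min) / (size - 1); // distance between neighbouring points
  return {
    points: size, // number of generated points
    step,
    pointsPerSide: width, // grid points influenced on each side of an observation
    reach: width * step, // the same influence expressed in data units
  };
}

const coarse = describeGrid({ min: 0, max: 8, size: 10, width: 2 });
const fine = describeGrid({ min: 0, max: 8, size: 30, width: 3 });
console.log(coarse, fine);
```

Under these assumptions, raising `size` shrinks the grid step, so the same `width` covers a narrower range of data values. If you increase `size` and want to keep a similar degree of smoothing, it may be worth increasing `width` as well.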
The following example shows how to create a basic density plot displaying data distribution of different species:

    import { Chart } from '@antv/g2';

    const chart = new Chart({
      container: 'container',
    });

    chart.options({
      type: 'density', // Set chart type to density plot
      data: {
        type: 'fetch', // Specify data type as network fetch
        value: 'https://assets.antv.antgroup.com/g2/species.json', // Set data URL
        transform: [
          {
            type: 'kde', // Use kernel density estimation (KDE) for data transformation
            field: 'y', // Specify KDE calculation field as 'y'
            groupBy: ['x', 'species'], // Group data by 'x' and 'species' fields
            size: 20, // Generate 20 data points to represent probability density function
          },
        ],
      },
      encode: {
        x: 'x', // Map 'x' field to x-axis
        y: 'y', // Map 'y' field to y-axis
        color: 'species', // Map 'species' field to color
        size: 'size', // Map 'size' field to graphic size
      },
      tooltip: false, // Disable chart tooltip
    });

    chart.render();

In this example, we set the `size` parameter to 20, which is larger than the default value of 10, to obtain a finer density curve.
Using KDE in polar coordinates can create circular violin plots, providing new perspectives for data distribution visualization:

    import { Chart } from '@antv/g2';

    const chart = new Chart({
      container: 'container',
    });

    chart.options({
      type: 'view',
      autoFit: true,
      data: {
        type: 'fetch',
        value: 'https://assets.antv.antgroup.com/g2/species.json',
      },
      coordinate: { type: 'polar' }, // Set to polar coordinate system
      children: [
        {
          type: 'density', // Density plot component
          data: {
            transform: [{ type: 'kde', field: 'y', groupBy: ['x', 'species'] }],
          },
          encode: {
            x: 'x',
            y: 'y',
            series: 'species',
            color: 'species',
            size: 'size',
          },
          tooltip: false,
        },
        {
          type: 'boxplot', // Box plot component for displaying violin plot
          encode: {
            x: 'x',
            y: 'y',
            series: 'species',
            color: 'species',
            shape: 'violin', // Set shape to violin
          },
          style: { opacity: 0.5, strokeOpacity: 0.5, point: false },
        },
      ],
    });

    chart.render();

This example shows how to combine KDE with box plots to create violin plots. In polar coordinates, violin plots are distributed in a circular pattern, providing different perspectives to observe data distribution.
By adjusting KDE parameters, you can control the smoothness and accuracy of density estimation:

    import { Chart } from '@antv/g2';

    const chart = new Chart({
      container: 'container',
    });

    chart.options({
      type: 'density',
      data: {
        type: 'fetch',
        value: 'https://assets.antv.antgroup.com/g2/species.json',
        transform: [
          {
            type: 'kde',
            field: 'y',
            groupBy: ['x'],
            size: 30, // Increase sampling points for finer density curves
            width: 3, // Increase bandwidth for smoother curves
            min: 0, // Set minimum value of processing range
            max: 8, // Set maximum value of processing range
            as: ['density_x', 'density_y'], // Custom output field names
          },
        ],
      },
      encode: {
        x: 'x',
        y: 'density_x', // Use custom output field
        color: 'x',
        size: 'density_y', // Use custom output field
      },
      tooltip: false,
    });

    chart.render();

This example shows how to customize various KDE parameters:
`size: 30` - Increase sampling points for finer density curves
`width: 3` - Increase bandwidth for smoother curves
`min: 0` and `max: 8` - Set minimum and maximum values of the processing range
`as: ['density_x', 'density_y']` - Custom output field names
Adjusting these parameters lets you obtain finer or smoother density curves, according to your actual needs.
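When the output fields are renamed with `as`, the `encode` channels have to reference the same names. One way to keep them in sync is a small options builder. The sketch below is a hypothetical convenience helper, not part of G2: the name `densityOptions` and the convention of mapping the first `groupBy` field to `x` and the last one to `color` are assumptions chosen to match the examples above.

```javascript
// Hypothetical helper (not a G2 API): build density-mark options so that
// `encode` always references the field names given in `as`.
function densityOptions({ url, field, groupBy, as = ['y', 'size'], ...kde }) {
  const [pointsField, densityField] = as;
  return {
    type: 'density',
    data: {
      type: 'fetch',
      value: url,
      // Remaining KDE options (size, width, min, max) are passed through unchanged.
      transform: [{ type: 'kde', field, groupBy, as, ...kde }],
    },
    encode: {
      x: groupBy[0], // assumed convention: first grouping field on the x-axis
      y: pointsField,
      color: groupBy[groupBy.length - 1], // assumed convention: last grouping field as color
      size: densityField,
    },
    tooltip: false,
  };
}

const options = densityOptions({
  url: 'https://assets.antv.antgroup.com/g2/species.json',
  field: 'y',
  groupBy: ['x'],
  as: ['density_x', 'density_y'],
  size: 30,
  width: 3,
  min: 0,
  max: 8,
});
console.log(options);
// The result could then be passed to chart.options(options) before chart.render().
```

With the arguments shown, the helper produces the same options object as the parameter customization example, so changing `as` in one place updates both the transform and the encoding.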
KDE data transform is a powerful tool in G2 that can help you create various density-related visualizations, such as density plots and violin plots. By adjusting its parameters, you can control the fineness and smoothness of density curves to meet different visualization needs.
Using KDE in different coordinate systems can provide different perspectives for data distribution. Combined with other chart types such as box plots, you can create richer data visualizations.
For more examples, you can check the Chart Examples - Violin Plot page.