Quartiles, skewness, and kurtosis are statistical measures that provide additional information about the shape, spread, and characteristics of a dataset, beyond what measures of central tendency like the mean, median, and mode can reveal. Let’s explore each of these concepts:
- Quartiles:
- Quartiles are the three cut points that divide a sorted dataset into four parts, each containing about 25% of the data points. They are useful for understanding the spread and distribution of data.
- The three quartiles are:
- First Quartile (Q1): The value below which 25% of the data falls. It is also the 25th percentile of the data.
- Second Quartile (Q2): The same as the median, which is the value below which 50% of the data falls. It is also the 50th percentile of the data.
- Third Quartile (Q3): The value below which 75% of the data falls. It is the 75th percentile of the data.
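- Quartiles are the basis of box plots, where the box spans Q1 to Q3. The width of that box, the interquartile range (IQR = Q3 - Q1), measures the spread of the middle 50% of the data, and points lying far outside it (commonly more than 1.5 × IQR below Q1 or above Q3) are flagged as potential outliers.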
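As a rough illustration of the quartile definitions above, the following sketch uses `statistics.quantiles` from the Python standard library. The sample data is made up for this example, and `method="inclusive"` is only one of several interpolation conventions (others give slightly different Q1 and Q3 values). The 1.5 × IQR fences are a common box-plot convention, assumed here rather than taken from the text.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 30]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Interquartile range: the spread of the middle 50% of the data.
iqr = q3 - q1

# Assumed convention: flag points beyond 1.5 * IQR from the box edges.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("Q1:", q1, "Q2 (median):", q2, "Q3:", q3)
print("IQR:", iqr)
print("Potential outliers:", outliers)
```

Here Q2 matches `statistics.median(data)`, and only the value 30 falls outside the fences.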
- Skewness:
- Skewness measures the asymmetry of the probability distribution of a dataset. In other words, it quantifies the degree to which a dataset’s values are skewed to one side (left or right) of the mean.
- There are three types of skewness:
- Positive Skew (Right-skewed): The tail of the distribution is longer on the right, and the majority of data points are on the left. The mean is typically greater than the median.
- Negative Skew (Left-skewed): The tail of the distribution is longer on the left, and the majority of data points are on the right. The mean is typically less than the median.
- Zero Skew: The distribution is approximately symmetrical, and the mean and median are roughly equal.
- Strong skewness can signal outliers concentrated in one tail. It also affects the choice of statistical tests and models, because many of them assume roughly symmetric (often normal) data.
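The sign conventions above can be checked with a small sketch. It assumes the moment-based (population) definition of skewness, the third central moment divided by the standard deviation cubed. This is one common formula among several, and the helper name `skewness` and the sample lists are hypothetical.

```python
import statistics

def skewness(values):
    """Moment-based (population) skewness: m3 / m2**1.5.

    Assumes the values are not all identical (otherwise m2 is zero).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical datasets, chosen only for illustration.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]      # long tail on the right
left_skewed = [-x for x in right_skewed]      # mirror image: long tail on the left
symmetric = [1, 2, 3, 4, 5]

for name, values in [("right", right_skewed),
                     ("left", left_skewed),
                     ("symmetric", symmetric)]:
    print(name,
          "skewness:", round(skewness(values), 3),
          "mean:", statistics.mean(values),
          "median:", statistics.median(values))
```

In this data the right-skewed list has a positive skewness and a mean above its median, the mirrored list shows the opposite, and the symmetric list has a skewness of zero.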
- Kurtosis:
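- Kurtosis measures the “tailedness” of the probability distribution of a dataset. It quantifies how much of the data lies in the tails (potential outliers) compared to the center of the distribution.
- Kurtosis is usually interpreted relative to a normal distribution, which has a kurtosis of 3. Excess kurtosis is kurtosis minus 3, so a normal distribution has an excess kurtosis of 0. There are three types:
- Leptokurtic: Kurtosis greater than 3 (positive excess kurtosis) indicates heavier tails than a normal distribution, so outliers are more likely. Such distributions often, though not always, have a sharper peak.
- Platykurtic: Kurtosis less than 3 (negative excess kurtosis) indicates lighter tails than a normal distribution, so outliers are less likely. Such distributions often, though not always, have a flatter peak.
- Mesokurtic: Kurtosis of about 3 (excess kurtosis near 0) indicates tails similar to those of a normal distribution. Deviations from 3 therefore show whether a dataset has more or fewer outliers than a normal distribution would.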
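To make the comparison with the normal distribution concrete, here is a minimal sketch that assumes the moment-based (population) definition of kurtosis: the fourth central moment divided by the squared variance. Under that definition a normal distribution scores about 3, and subtracting 3 gives the "excess" kurtosis, which is the quantity that is positive or negative. The helper name `kurtosis` and the sample data are hypothetical, and some libraries report excess kurtosis by default, so it is worth checking which convention a given tool uses.

```python
import random

def kurtosis(values):
    """Moment-based (population) kurtosis: m4 / m2**2 (normal is about 3).

    Assumes the values are not all identical (otherwise m2 is zero).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2

# Hypothetical datasets, chosen only for illustration.
heavy_tailed = [-10, -1, -1, 0, 0, 0, 0, 1, 1, 10]  # a few extreme values
flat = list(range(1, 10))                            # evenly spread, no tails

# A large simulated normal sample; the seed is fixed for repeatability.
random.seed(0)
normal_sample = [random.gauss(0, 1) for _ in range(50_000)]

for name, values in [("heavy-tailed", heavy_tailed),
                     ("flat", flat),
                     ("normal sample", normal_sample)]:
    k = kurtosis(values)
    print(name, "kurtosis:", round(k, 3), "excess:", round(k - 3, 3))
```

The heavy-tailed list comes out above 3 (leptokurtic), the evenly spread list below 3 (platykurtic), and the simulated normal sample close to 3 (mesokurtic).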