Statistics Lab Work 8
Data Exploration Using Python
Histograms give an overview of the central tendency and symmetry of observed data. Another graphical presentation, one that summarizes the distribution of the observed values in more detail, is the box-and-whisker plot, often called simply a boxplot. As the name suggests, it consists of a box, which spans the middle of the data from the first to the third quartile, and whiskers, which extend from the box toward the smallest and largest observations.
A box-plot or boxplot (also known as a box-and-whisker diagram) is a graphical representation of numerical data through five key measures:
- Minimum observation value
- Lower quartile or first quartile (Q1), the value below which the lowest 25% of the data fall
- Median (Q2) or middle value
- Upper quartile or third quartile (Q3), the value above which the highest 25% of the data fall
- Maximum observation value
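The five measures listed above can be computed directly before any plot is drawn. The following is a minimal sketch, assuming NumPy is available; the sample values and the helper name `five_number_summary` are made up for illustration and are not part of the lab material. Note that `np.percentile` interpolates linearly between data points by default, so other quartile conventions may give slightly different Q1 and Q3 values.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration
data = np.array([2, 4, 4, 5, 7, 8, 9, 10, 12, 15, 30], dtype=float)


def five_number_summary(values):
    """Return the minimum, Q1, median, Q3 and maximum of a 1-D sample."""
    values = np.asarray(values, dtype=float)
    # Default NumPy behaviour: linear interpolation between data points
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    return {
        "min": float(values.min()),
        "q1": float(q1),
        "median": float(median),
        "q3": float(q3),
        "max": float(values.max()),
    }


summary = five_number_summary(data)
for name, value in summary.items():
    print(f"{name:>6}: {value}")
```

The difference `summary["q3"] - summary["q1"]` is the interquartile range (IQR), which corresponds to the length of the box.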
The boxplot also marks any outliers among the observations, if present. Boxplots can be used to show differences between populations without making assumptions about the underlying statistical distribution. For this reason, the boxplot is regarded as a non-parametric technique. The spacing between the parts of the box indicates the degree of dispersion and skewness in the data. Boxplots can be drawn either horizontally or vertically.
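A boxplot with its outliers can be sketched in Python as follows. This is an illustrative example, assuming Matplotlib and NumPy are installed; the sample values and the output file name `boxplot.png` are hypothetical. The text above does not say how outliers are identified, so the sketch assumes the common convention of flagging points more than 1.5 times the interquartile range beyond the box, which is also Matplotlib's default (`whis=1.5`).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical sample data, chosen only for illustration
data = np.array([2, 4, 4, 5, 7, 8, 9, 10, 12, 15, 30], dtype=float)

# Quartiles and interquartile range (length of the box)
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1

# Assumed convention: points beyond 1.5 * IQR from the box count as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = data[(data < lower_fence) | (data > upper_fence)]
print("Outliers by the 1.5 * IQR rule:", outliers)

# Draw the boxplot (vertical by default) and save it to a file
fig, ax = plt.subplots()
result = ax.boxplot(data)
ax.set_title("Boxplot of sample data")
ax.set_ylabel("Observed value")
fig.savefig("boxplot.png")

# Values that Matplotlib drew as individual outlier points ("fliers")
plotted_outliers = result["fliers"][0].get_ydata()
print("Outliers drawn by Matplotlib:", plotted_outliers)
```

When outliers are present, the whiskers in this plot end at the most extreme observations that are not flagged as outliers rather than at the overall minimum and maximum. To view the figure interactively, remove the `matplotlib.use("Agg")` line and call `plt.show()` instead of saving to a file.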