Mastering Shapes of Distributions: Key to Statistical Analysis

Shapes of distributions

As we learned in our lesson on frequency distributions and histograms, a frequency distribution is a tool for organizing the information gathered in a statistical study into an efficient model, in which data are summarized and depicted in a way that makes them easy to communicate. A frequency distribution sorts the data by the magnitude of the observations, accounts for all the outcomes of a survey or experiment, and presents the frequency with which each outcome was observed. The data are then presented through a frequency distribution table, a histogram or a frequency polygon.

But data are not only depicted through frequency distributions and their many graphic methods; data can also be presented through probability distributions.

Before we cover this new concept, let us remember that in statistics a distribution generally refers to the way the collected data are presented (a graphic representation of a data set). In other words, a distribution is the way a data set has been arranged to show the spread of its values: the range they cover, how dispersed or close together they are, and so on. Usually a distribution is either a frequency distribution or a probability distribution, and the type depends on the basis of the arrangement (the basis used to graph or depict the data). While a frequency distribution depicts the data based on the specific outcomes obtained from the study or experiment, a probability distribution bases its depiction on the chance of each possible outcome happening.

In simple words, a probability distribution is yet another graphic representation of the values in a data set; the difference is that a probability distribution graph provides the probability of each specific outcome occurring, rather than its frequency. Therefore, the vertical axis in a histogram or frequency polygon is the frequency of each outcome in the study, while the vertical axis in a probability distribution graph is the probability of each outcome happening.
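To make the difference between the two vertical axes concrete, here is a minimal sketch in Python. The die rolls and the function names are hypothetical, and dividing each frequency by the total only gives an estimate of each outcome's probability based on the observed data.

```python
from collections import Counter
from fractions import Fraction

def frequency_distribution(observations):
    """Count how many times each outcome was observed, sorted by outcome."""
    return dict(sorted(Counter(observations).items()))

def relative_frequencies(freq):
    """Divide each frequency by the total to estimate each outcome's probability."""
    total = sum(freq.values())
    return {outcome: Fraction(count, total) for outcome, count in freq.items()}

# Hypothetical data: the results of rolling a die 12 times.
rolls = [1, 3, 3, 6, 2, 3, 5, 6, 1, 4, 3, 6]

freq = frequency_distribution(rolls)   # vertical axis of a histogram
prob = relative_frequencies(freq)      # vertical axis of a probability graph

for outcome in freq:
    print(outcome, freq[outcome], prob[outcome])
```

Both dictionaries describe the same shape; only the scale of the vertical axis changes, and the estimated probabilities always add up to 1.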

In this lesson, we will focus on studying the shapes of data distributions and learning to identify the information that can be obtained just by looking at the shape of the distribution being studied.

How to determine the shape of a distribution

We will go into detail about probability distributions in a later lesson; for now we will focus on the shape of a distribution, no matter what type of distribution you are working with.

Let us learn how to determine the shape of a distribution by looking at the basic shapes one can find in different graphic representations of data:

• $\quad$ Symmetric

A symmetric distribution is a graphic representation of data that is symmetric with respect to an axis. Remember that a symmetric figure is one that stays balanced under reflection, rotation or scaling. For symmetry about an axis (balance by reflection), every point on the figure corresponds to a point on the opposite side of the axis of symmetry; thus, you could fold the figure along the axis and the points on one side would land on the points on the other side:


Figure 1: Symmetry

There are two main types of symmetric distributions: they can be either bell-shaped or u-shaped.

• $\quad$ Bell Shaped

A bell-shaped distribution has the shape of a symmetric mountain (a mound) or bell, and its axis of symmetry lies at the average of the values in its data set. All bell-shaped distribution graphs have their highest value in the center.

The best-known bell-shaped distribution is what math and science call the normal (or Gaussian) distribution. It is the most important probability distribution shape, since it usually arises from sufficiently large data sets of random variables found in nature. We will talk about it more in later lessons, once we have a deeper understanding of what a probability distribution is.

In general, a bell-shaped distribution (also called a mound-shaped distribution) looks like this:


Figure 2: Bell shaped distribution

The distribution in figure 2 could be of any type; the figure just shows what a normal distribution looks like. For a more specific view of a distribution, let us look at an example using a frequency distribution graph: a histogram.


Figure 3: Bell shaped histogram

• $\quad$ U Shaped

In general, a u-shaped distribution has its highest values at the two ends and its lowest values in the center. An example looks like this:


Figure 4: U shaped distribution

Now looking at a u shaped frequency distribution histogram:


Figure 5: u shaped histogram

Although the bell-shaped and the u-shaped distributions (and their corresponding histogram shapes) are the most common symmetric distributions, they are not the only ones that can occur. The next figure is a good example of a symmetric distribution without a particular known shape. Remember: as long as you can trace a vertical line at the center of the graph and use it as an axis of symmetry, so that each side is the mirror image of the other, you have a symmetric data distribution.


Figure 6: Symmetric distribution
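The mirror-image idea can be checked directly on the bar heights of a histogram. The following is a minimal sketch; the function name `is_symmetric`, the `tolerance` parameter and all the bar heights are hypothetical and chosen only for illustration.

```python
def is_symmetric(counts, tolerance=0):
    """Return True if the bars read the same left-to-right and right-to-left.

    `tolerance` is the largest difference allowed between two mirrored bars,
    so that real data, which are rarely perfectly symmetric, can still pass.
    """
    return all(abs(a - b) <= tolerance for a, b in zip(counts, reversed(counts)))

# Hypothetical bar heights for several histograms.
bell = [1, 4, 9, 4, 1]        # highest in the center
u_shape = [9, 4, 1, 4, 9]     # highest at both ends
irregular = [2, 7, 3, 7, 2]   # symmetric, but neither bell nor u shaped
skewed = [1, 2, 4, 9, 5]      # no mirror image

print(is_symmetric(bell), is_symmetric(u_shape),
      is_symmetric(irregular), is_symmetric(skewed))
```

Comparing the list with its reverse is the same as folding the graph along its central vertical line and checking that the bars overlap.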

• $\quad$ Skewed

The concept of skewness when describing a distribution graph is often misunderstood. In general, skewness refers to a distortion; in statistics, skewness refers to how asymmetric the shape of a distribution graph is. In other words, if the graph doesn't have a central axis of symmetry and one of its sides stretches out further than the other, the distribution has a skewed shape.
So what is the shape of this distribution? What does it look like? There are two options:

• $\quad$ Positively skewed

In general, a skewed graph looks as if it has been grabbed by one of its sides and stretched, deforming what could be a bell-shaped distribution into an asymmetric distribution with a tail on one side. A distribution graph is said to be positively skewed when the tail extends towards the right-hand side of the graph. The skew in this case is positive because the tail points in the same direction as the positive horizontal axis of the graph:


Figure 7: Positively skewed distribution (skewed to the right)

A positively skewed distribution is also said to be skewed to the right.

• $\quad$ Negatively skewed

A distribution graph is said to be negatively skewed when the tail extends towards the left-hand side of the graph. The skew in this case is negative because the tail points in the same direction as the negative horizontal axis of the graph:


Figure 8: Negatively skewed distribution (skewed to the left)

Notice that the direction of the skew always names the side on which the tail lies: a distribution skewed to the right has its tail on the right, and a distribution skewed to the left has its tail on the left.
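The sign of the skew can also be computed rather than read off a graph. The sketch below uses one common measure, the moment coefficient of skewness (the average cubed deviation from the mean divided by the cube of the standard deviation). This lesson does not define that measure, so treat it as an assumption; the data sets are hypothetical.

```python
from statistics import fmean

def sample_skewness(data):
    """Moment coefficient of skewness.

    Positive when the long tail is on the right, negative when it is on the
    left, and zero for a perfectly symmetric data set.
    """
    n = len(data)
    m = fmean(data)
    m2 = sum((x - m) ** 2 for x in data) / n   # average squared deviation
    m3 = sum((x - m) ** 3 for x in data) / n   # average cubed deviation
    return m3 / m2 ** 1.5

# Hypothetical data sets.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]       # one large value stretches the right side
left_tailed = [-x for x in right_tailed]       # mirror image: tail on the left
symmetric = [1, 2, 3, 4, 5]

print(sample_skewness(right_tailed) > 0)   # positively skewed
print(sample_skewness(left_tailed) < 0)    # negatively skewed
print(sample_skewness(symmetric))          # no skew
```

Cubing the deviations keeps their sign, so values far out in the tail dominate the sum and decide whether the result is positive or negative.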

Now that you know about the bell-shaped distribution and the skewed distributions, take a look at the next figure where you can compare them:


Figure 9: Comparison between bell shaped and skewed distributions

In our next lesson, about the center of a data set, we will learn about the mean, the median and the mode.

• $\quad$ Uniform

A uniform distribution shape is one that is flat (it may be perfectly flat, but it does not have to be, as long as it is close to flat); this means the data are spread equally (or uniformly) throughout the whole range of the data set. In a frequency distribution graph, this means that all of the outcomes or class intervals have the same, or nearly the same, frequency, producing a graph with no ups or downs: the tops of the bars follow a straight horizontal line.


Figure 10: Uniform distribution
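Since a uniform shape only needs to be close to flat, one simple way to test it is to compare every bar with the average bar height. This is a rough sketch, not a formal statistical test; the function name and the 10% tolerance are arbitrary assumptions made for illustration.

```python
def is_roughly_uniform(counts, rel_tol=0.1):
    """True if every bar is within rel_tol (as a fraction) of the average bar height."""
    average = sum(counts) / len(counts)
    return all(abs(c - average) <= rel_tol * average for c in counts)

# Hypothetical bar heights.
print(is_roughly_uniform([20, 20, 20, 20]))   # perfectly flat
print(is_roughly_uniform([19, 21, 20, 20]))   # close to flat
print(is_roughly_uniform([5, 20, 35, 20]))    # clearly not flat
```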

How to describe the shape of a distribution

For this section, let us go back to figure 6, where the distribution is easily seen to be symmetric but does not have a particular shape. When a distribution has a specific shape, such as the bell shape or the u shape, it is very simple to describe; but what happens when you cannot recognize any of these well-known shapes? How do you describe the shape of a distribution that has all kinds of curves, ups and downs? For that, we need to tell whether there are peaks, gaps or clusters in the distribution.

In a distribution graph, a gap is an interval which contains no data, while a peak is the highest point of a data set. When several data points lie close together we say there is a cluster, which typically means that these data points have similar characteristics. We can usually describe the points belonging to a cluster as homogeneous, because the characteristics they share make them so similar to each other (things which are very diverse or dissimilar are called heterogeneous).
And so, to practice how to determine the shape of a distribution, let us take a look at the next examples:

Example 1

In Mrs. Katsufrakis's class, the scores for a test are given in the graph below:


Figure 11: Mrs. Katsufrakis's class scores

From the options below, what is the average score?

The answer is number three: 70%. The graph is skewed to the left, and the tail pulls the mean (average) towards the left too. The mean is the balance point of the distribution, so the low scores in the left tail drag it away from the peak, as if someone had stretched the graph on that side and dragged what used to be the center along with it.
If you have doubts about it, go back and take a look at the negatively skewed distribution in figure 9, where the mean line can be observed.
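The pull of the tail on the mean can be seen with a few numbers. The scores below are hypothetical (they are not the data behind figure 11) and were chosen only so that they have a tail towards the low end.

```python
from statistics import mean, median

# Hypothetical test scores, skewed to the left: most are high, a few are very low.
scores = [35, 50, 60, 70, 75, 80, 80, 85, 85, 85, 90, 90]

average = mean(scores)    # pulled down by the few low scores in the tail
middle = median(scores)   # the middle score, barely affected by the tail

print(average, middle)
print(average < middle)   # the mean sits to the left of the median
```

The few low scores lower the mean noticeably, while the median stays with the bulk of the class; this is why the mean of a left-skewed distribution lies to the left of where most of the data are.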

Example 2

A bank assures you that one of their tellers will be of assistance to you in 5 minutes or less. Further, the bank tells you that the teller is equally likely to become available at any time throughout those first 5 minutes. What does the distribution of this wait time look like?

In this case we are working with a probability distribution, since we are talking about the probability of a bank teller assisting you within the next five minutes; therefore, the vertical axis of our distribution graph represents the probability, and the horizontal axis represents the time. If the bank says that a teller is equally likely to become available at any time throughout those 5 minutes, then we have an equal chance of being assisted in each minute of the whole range of the distribution; thus, the probability of being assisted during any one of the five minutes is $\frac{1}{5}$. So, the probability distribution histogram looks like:


Figure 12: Uniform probability distribution histogram

Since all of the possible outcomes have the same chance of occurring, the distribution is uniform.
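The same reasoning can be written as a small calculation. This is a sketch under the assumption that the wait time is uniform between 0 and 5 minutes, as the bank states; the function name `prob_wait_between` is hypothetical.

```python
from fractions import Fraction

MAX_WAIT = 5  # minutes, taken from the bank's promise

def prob_wait_between(start, end, max_wait=MAX_WAIT):
    """P(start <= wait <= end) for a wait time that is uniform on [0, max_wait]."""
    start = max(0, start)        # waits cannot be negative
    end = min(max_wait, end)     # waits cannot exceed the promised maximum
    if end <= start:
        return Fraction(0)
    # For a uniform distribution, probability is proportional to interval length.
    return (Fraction(end) - Fraction(start)) / max_wait

# One bar of the histogram per minute: every bar has the same height.
per_minute = [prob_wait_between(m, m + 1) for m in range(MAX_WAIT)]
print(per_minute)
print(prob_wait_between(0, 2))   # chance of being assisted within the first 2 minutes
```

Every one-minute bar has the same height and the bars add up to 1, which is exactly the flat shape of a uniform distribution.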

Example 3

An IQ test has an average score of 100 and has more data clumped closer to the mean than further away. What is the shape of the distribution of an IQ test?

This distribution is bell shaped, since more data are clumped near the mean than further away. This can be seen intuitively in our figures above describing bell-shaped graphs; beyond that, this is a very important topic that will be covered in much more detail in our lesson on the normal distribution and continuous random variables.
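As a preview, Python's standard library can model a bell-shaped distribution with `statistics.NormalDist`. In the sketch below the mean of 100 comes from the example, but the standard deviation of 15 and the score intervals are assumptions made only for illustration.

```python
from statistics import NormalDist

# Mean of 100 from the example; sigma=15 is an assumed value for illustration.
iq = NormalDist(mu=100, sigma=15)

def prob_between(dist, low, high):
    """Probability that a value from `dist` falls between low and high."""
    return dist.cdf(high) - dist.cdf(low)

near_mean = prob_between(iq, 90, 110)    # a 20-point interval around the mean
far_above = prob_between(iq, 130, 150)   # a 20-point interval far above the mean
far_below = prob_between(iq, 50, 70)     # the mirror-image interval far below

print(near_mean > far_above)                 # more data clump near the mean
print(abs(far_above - far_below) < 1e-9)     # the two sides mirror each other
```

Intervals of equal width hold much more of the data near the mean than far from it, and intervals at equal distances on either side of the mean hold the same amount, which is what gives the bell its symmetric mound.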

Example 4

The finishing times of a marathon are given in the distribution below:


Figure 13: Finishing times of a marathon

Identify where all the clusters, gaps and peaks are.

Looking at figure 13, we can immediately see there is a gap (remember, a gap in a distribution graph means an absence of data in a certain interval) from 4.25 to 4.75 hours, meaning no runners arrived within that time frame.
Detecting the gap then helps us distinguish the two clusters in the distribution: the main cluster is the one on the left side, which goes from 2.5 to 4.25 hours, and the second cluster goes from 4.75 to 5.25 hours. Finally, let us look at the peaks: there is only one highest peak among the columns of the histogram, and it belongs to the class interval of 3.5 to 3.75 hours.
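The same reading of the histogram can be sketched in code. The class boundaries below follow the intervals named in the answer, but the bar heights are hypothetical (the real counts are only visible in the figure), and treating each run of non-empty classes as one cluster is a simplifying assumption.

```python
def describe_histogram(edges, counts):
    """Return (gaps, clusters, peak) for a histogram.

    `edges` has one more entry than `counts`; class i covers edges[i] to edges[i + 1].
    Gaps are runs of empty classes, clusters are runs of non-empty classes,
    and the peak is the class with the tallest bar.
    """
    gaps, clusters = [], []
    run_start = 0
    for i in range(1, len(counts) + 1):
        # Close the current run at the end, or when empty/non-empty flips.
        if i == len(counts) or (counts[i] == 0) != (counts[run_start] == 0):
            interval = (edges[run_start], edges[i])
            (gaps if counts[run_start] == 0 else clusters).append(interval)
            run_start = i
    tallest = max(range(len(counts)), key=counts.__getitem__)
    peak = (edges[tallest], edges[tallest + 1])
    return gaps, clusters, peak

# Class boundaries every 0.25 hours from 2.5 to 5.25; bar heights are made up.
edges = [2.5 + 0.25 * i for i in range(12)]
counts = [1, 3, 5, 8, 10, 6, 2, 0, 0, 2, 1]

gaps, clusters, peak = describe_histogram(edges, counts)
print("gaps:", gaps)
print("clusters:", clusters)
print("peak:", peak)
```

With these made-up counts the function reports the same gap, clusters and peak that we read from the figure. Note that empty classes at either end of the range would also be reported as gaps by this sketch.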

***
To conclude our lesson, we recommend that you take a look at this article, where typical histogram shapes are presented; many of them differ from the distribution shapes we have talked about today. Also, this article on the shapes of distributions has useful information that may complement what we saw here today.

So, that is it for today's lesson. See you in the next one!

There are two main types of distribution we are concerned with in statistics:

Frequency Distributions:
A graph representing the frequency of each outcome occurring.

Probability Distributions:
A graph that provides the probability of each outcome occurring.

The most common distribution shapes are:

• $\quad$ Symmetric

• $\quad$ Bell-shaped

• $\quad$ Skewed to the left

• $\quad$ Skewed to the right

• $\quad$ Uniform

Cluster:

On a graph, a cluster occurs when several data points lie close together. Typically this means that these data points have similar characteristics, in which case they are called homogeneous (things which are very diverse or dissimilar are called heterogeneous).

Gap:

A gap is an interval which contains no data.

Peak:

A peak is the highest point of a data set.
