Chi-Squared Confidence Intervals
In today's lesson we will learn what chi square is in statistics and how we use it when constructing confidence intervals.
What is chi square and what is a chi squared distribution?
A chi squared distribution comes from the chi square statistic, which measures how much the observed values differ from the values expected under a true hypothesis in a statistical test. The chi square symbol is $\chi^2$, and this statistic is also called the goodness-of-fit statistic.
In terms of the standard normal distribution: a value drawn at random from such a distribution is called a standard normal deviate. If we square several independent standard normal deviates and add them together, the sum follows a chi squared distribution whose degrees of freedom equal the number of standard normal deviates that were added.
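The construction described above can be tried out with a short simulation. This is only a sketch: the function name `chi_square_draw`, the seed, the choice of 5 deviates and the number of repetitions are illustrative assumptions, not part of the lesson.

```python
import random

def chi_square_draw(k, rng):
    """Add up k squared standard normal deviates (one chi-squared draw with k degrees of freedom)."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

rng = random.Random(0)   # fixed seed so the run is repeatable (assumption)
k = 5                    # number of deviates added = degrees of freedom (assumption)
draws = [chi_square_draw(k, rng) for _ in range(20_000)]

sample_mean = sum(draws) / len(draws)
print(f"average of {len(draws)} draws with {k} degrees of freedom: {sample_mean:.2f}")
```

Every draw is non-negative because it is a sum of squares, and the average of the draws should land close to the number of deviates that were added.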
A chi-square distribution with one degree of freedom, written $\chi^2(1)$, is simply the distribution of a single standard normal deviate squared. The area of a $\chi^2(1)$ distribution below 4 is the same as the area of a standard normal distribution between $-2$ and $2$, since $4 = 2^2 = (-2)^2$. The mean of a chi-square distribution is its degrees of freedom. Chi-square distributions are positively skewed, but as the degrees of freedom increase, the chi-square distribution approaches a normal distribution.
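The link between one degree of freedom and the standard normal curve can be checked numerically. Because squaring sends both $-2$ and $2$ to 4, a squared deviate is below 4 exactly when the deviate itself lies between $-2$ and $2$. The sketch below uses only the Python standard library; the helper name `chi2_cdf_1df` is an assumption made for illustration.

```python
import math
from statistics import NormalDist

def chi2_cdf_1df(x):
    """Area below x for chi-squared with 1 degree of freedom: P(Z^2 < x) = erf(sqrt(x / 2))."""
    return math.erf(math.sqrt(x / 2.0))

z = NormalDist()  # standard normal distribution (mean 0, standard deviation 1)

area_chi2_below_4 = chi2_cdf_1df(4.0)
area_normal_between = z.cdf(2.0) - z.cdf(-2.0)

print(round(area_chi2_below_4, 4), round(area_normal_between, 4))
```

Both areas come out at roughly 0.9545.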
The concept of the chi squared statistic can be a bit complicated so let us answer the following questions to have a little bit more insight on the matter:
- What is chi square distribution in statistics?
- Does chi square assume normal distribution?
- What is the shape of the chi-square distribution?
Notice how the shape changes as the degrees of freedom increase: the more degrees of freedom there are, the more nearly symmetric the chi-square curve looks, like a normal distribution.
- What is chi square critical value?
For example, when looking at what the area under the graph means, we find a confidence level delimited by a confidence interval, with the remaining significance level left over in the tails. This is where the critical values come into play.
In any distribution, a critical value is a point on the horizontal axis of the graph that marks the boundary between one area (or region) and another, usually between the confidence level ($1 - \alpha$) and the significance level ($\alpha$). The chi square distribution follows the same rules, and since its horizontal axis is measured in values of $\chi^2$ itself, we call these critical values chi-square critical values.
Notice in the graph below how the confidence level, the significance level and the critical values $\chi^2_R$ and $\chi^2_L$ are shown. The subscripts R and L denote the right and left sides, since we are looking at a two-tailed distribution. Notice also that the significance level is still divided evenly in halves (one for each tail) even though the distribution is not symmetrical.
To find the area to the right of any critical value on the horizontal axis of this distribution, such as $\chi^2_R$ and $\chi^2_L$ shown above, we use a chi-square distribution table. This table contains three types of values: degrees of freedom, the area to the right of a critical value, and the chi square critical value itself. It looks like this:
Chi-square Table
The leftmost column of this chi squared table holds the degrees of freedom, the top row contains the area values, and the numbers inside this left-column/top-row frame are the chi square critical values. We will learn how the table is used in a later section of the lesson.
A larger, more complete table can be found in the following link, and we recommend using it together with a chi square calculator (found at the end of this lesson) when working through problems on your own.
When is chi square distribution used?
To estimate a population variance, a chi-squared distribution is used.
Chi-Squared:
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$
Where:
$\chi^2$ = chi square
$n$ = sample size
$s$ = sample standard deviation
$\sigma$ = population standard deviation
$\sigma^2$ = population variance
$n - 1$ = degrees of freedom
From the chi square formula we can solve for the value of the variance (the square of the population's standard deviation): $\sigma^2 = \frac{(n-1)s^2}{\chi^2}$.
With this we can calculate the confidence interval for that variance too, similarly to how we have used the t-distribution and the margin of error before to calculate the confidence interval for a population mean. The confidence interval, in this case for the variance or standard deviation squared, is given by:
$$\frac{(n-1)s^2}{\chi^2_R} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_L}$$
In this last formula, the chi square values in the denominators on each side refer to the chi square critical values marking the boundaries between the confidence level and the significance level on the distribution graph. We have a two-tailed significance level because there is a right side ($\chi^2_R$) and a left side ($\chi^2_L$), as shown below:
Another use for the chi-square distribution in statistics is hypothesis testing, in this case with the chi square test. We will talk about the chi-square test in a later lesson.
How to use the chi square distribution table
The chi-square table gives each critical value together with the area to its right:
Chi-square Table
Let me explain the chi square distribution table in a little more detail:
The first column to the left expresses the degrees of freedom for a particular situation, while the top row expresses the area to the right of any specific chi square critical value we are looking at.
Therefore, to find a chi square critical value we look at the point in the table where the row for the specific degrees of freedom intersects the column for the specific area to the right. This process can be seen easily in parts a and b of example problem 2, so if you have any questions about how the chi square critical value table works, we recommend that you skip to that part of the lesson, where you will also read about how to calculate chi square related confidence intervals.
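The row-and-column lookup can be sketched as a small Python dictionary. This is an illustration only: `CHI_SQUARE_TABLE` holds just the handful of entries that appear in the worked examples of this lesson, not a full table, and the function name `critical_values` is an assumption.

```python
# Excerpt of a chi-square table: {degrees of freedom: {area to the right: critical value}}.
# Only the entries used in this lesson's worked examples are included.
CHI_SQUARE_TABLE = {
    8: {0.995: 1.344, 0.975: 2.180, 0.025: 17.535, 0.005: 21.955},
    14: {0.95: 6.571, 0.05: 23.685},
    40: {0.995: 20.707, 0.005: 66.766},
}

def critical_values(df, confidence_level):
    """Return (left, right) critical values for a two-tailed interval by table lookup."""
    alpha = 1.0 - confidence_level
    # Area to the right of the right critical value: half the significance level.
    area_right_of_right = round(alpha / 2.0, 4)
    # Area to the right of the left critical value: confidence level plus the right tail.
    area_right_of_left = round(1.0 - alpha / 2.0, 4)
    row = CHI_SQUARE_TABLE[df]  # pick the row for the degrees of freedom
    return row[area_right_of_left], row[area_right_of_right]

left, right = critical_values(df=8, confidence_level=0.95)
print(left, right)
```

The rounding to four decimals is there so that areas such as `1 - 0.95` match the column labels exactly despite floating-point error. A lookup for a row or column missing from the excerpt raises a `KeyError`, just as a printed table cannot answer for values it does not list.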
How to solve chi square distribution problems
To better understand what chi square is and how we use it, let us take a look at a few example problems.
Example 1:
Determining Degrees of Freedom
How many degrees of freedom does a sample of size
a. 7 have?
The sample size is denoted $n$, and the degrees of freedom come from simply subtracting 1 from it: $n - 1$. Therefore, for this case the degrees of freedom are $n - 1 = 7 - 1 = 6$.
And so we have 6 degrees of freedom in this case.
b. 20 have?
In this case $n = 20$.
Thus $n - 1 = 20 - 1 = 19$.
So we have 19 degrees of freedom in this case.
Example 2:
Determining the Critical Value for a Chi-Square Distribution ($\chi^2_R$ and $\chi^2_L$)
If a chi-squared distribution has 8 degrees of freedom, find $\chi^2_R$ and $\chi^2_L$ with a
a. 95% confidence level
If we have 8 degrees of freedom, then $n - 1 = 8$, thus the sample size $n$ is equal to 9.
A confidence level of 95% means $1 - \alpha = 0.95$, therefore the significance level is $\alpha = 0.05$.
This significance level is split into two equal parts of $\frac{\alpha}{2} = 0.025$, one in each tail, since we are asked for both $\chi^2_R$ and $\chi^2_L$. With this in mind, let us express that information in a chi squared distribution graph as follows:
Using the 8 degrees of freedom and the area value of half the significance level, we use the chi squared table to find both critical values $\chi^2_R$ and $\chi^2_L$.
Remember that the area value in the top row of the chi square table is the area to the right of a critical value, and so we start with $\chi^2_R$, since the area to its right is simply $\frac{\alpha}{2} = 0.025$:
Therefore the right critical value is $\chi^2_R = 17.535$.
Now, the area to the right of the left critical value is equal to the confidence level plus half of alpha, since it includes both the confidence level area and the right tail.
Area to the right of the left critical value: $(1 - \alpha) + \frac{\alpha}{2} = 0.95 + 0.025 = 0.975$
We use this to obtain $\chi^2_L$:
Therefore the left critical value is $\chi^2_L = 2.180$.
b. 99% confidence level
Remember we have 8 degrees of freedom, and in this case a confidence level of 99%, which means $1 - \alpha = 0.99$, therefore the significance level is $\alpha = 0.01$.
This significance level is split into two equal parts of $\frac{\alpha}{2} = 0.005$, one in each tail, bounded by $\chi^2_R$ and $\chi^2_L$, and so we express that information in a chi squared distribution graph as follows:
Using the 8 degrees of freedom and the area value of half the significance level, we use the chi squared table to find both critical values $\chi^2_R$ and $\chi^2_L$.
We start with $\chi^2_R$, since the area to its right is simply $\frac{\alpha}{2} = 0.005$:
Therefore the right critical value is $\chi^2_R = 21.955$.
Now, the area to the right of the left critical value is equal to the confidence level plus half of alpha, since it includes both the confidence level area and the right tail.
Area to the right of the left critical value: $(1 - \alpha) + \frac{\alpha}{2} = 0.99 + 0.005 = 0.995$
We use this to obtain $\chi^2_L$:
Therefore the left critical value is $\chi^2_L = 1.344$.
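Table values like the ones in this example can be cross-checked numerically. The sketch below uses only the Python standard library: it computes the area to the left of a point from the series for the regularized lower incomplete gamma function, then searches for the critical value by bisection. The function names `chi2_cdf` and `chi2_critical_value` are assumptions for illustration; in practice a statistics library would normally do this job.

```python
import math

def chi2_cdf(x, df):
    """Area to the LEFT of x under a chi-squared curve with df degrees of freedom.

    Series for the regularized lower incomplete gamma function P(df/2, x/2).
    """
    if x <= 0:
        return 0.0
    a, t = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= t / (a + n)   # next term: t^n / (a (a+1) ... (a+n))
        total += term
    return total * math.exp(a * math.log(t) - t - math.lgamma(a))

def chi2_critical_value(area_to_right, df):
    """Find x whose right-hand area equals area_to_right, by bisection."""
    target_left = 1.0 - area_to_right
    lo, hi = 0.0, df + 10.0
    while chi2_cdf(hi, df) < target_left:   # widen the bracket if needed
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < target_left:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

for area in (0.025, 0.975, 0.005, 0.995):
    print(area, round(chi2_critical_value(area, df=8), 3))
```

The printed values can be compared with the table entries 17.535, 2.180, 21.955 and 1.344 used above; small differences in the last digit can come from rounding in the printed table.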
Example 3:
Determining the Confidence Interval for Variance
Road and racing bicycles have an average wheel diameter of 622 mm. From a sample of 15 bicycles it was found that the wheel diameters have a variance of 10 mm². With a 90% confidence level, give a range in which the variance of all road and racing bicycle wheels lies.
Let us gather all of the information of the problem:
$n$ = sample size = 15 bicycles
$s$ = sample standard deviation, thus $s^2$ = sample variance = 10 mm²
$\sigma$ = population standard deviation, thus $\sigma^2$ = population variance
$n - 1$ = degrees of freedom = 15 - 1 = 14 degrees of freedom.
Remember that the confidence interval of the variance is given by:
$$\frac{(n-1)s^2}{\chi^2_R} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_L}$$
And so we need to find the critical values $\chi^2_R$ and $\chi^2_L$.
For that we use the 14 degrees of freedom and the confidence level of 90% given.
Since the confidence level is equal to $1 - \alpha = 0.90$, we know that the significance level is $\alpha = 0.10$, and $\frac{\alpha}{2} = 0.05$.
We start with $\chi^2_R$, since the area to its right is simply $\frac{\alpha}{2} = 0.05$:
Therefore the right critical value is $\chi^2_R = 23.685$.
Now, the area to the right of the left critical value is equal to the confidence level plus half of alpha, since it includes both the confidence level area and the right tail.
Area to the right of the left critical value: $(1 - \alpha) + \frac{\alpha}{2} = 0.90 + 0.05 = 0.95$
We use this to obtain $\chi^2_L$:
Therefore the left critical value is $\chi^2_L = 6.571$.
And so finally we have everything we need to calculate the confidence interval for the variance:
$$\frac{(14)(10)}{23.685} < \sigma^2 < \frac{(14)(10)}{6.571}$$
$$5.911 < \sigma^2 < 21.306$$
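The arithmetic for this interval can be sketched in a few lines of Python. The function name `variance_confidence_interval` is an assumption for illustration; the critical values are the ones read from the table in this example, not computed by the code.

```python
def variance_confidence_interval(n, sample_variance, chi2_left, chi2_right):
    """Bounds for the population variance:
    (n-1)s^2 / chi2_right  <  sigma^2  <  (n-1)s^2 / chi2_left
    """
    numerator = (n - 1) * sample_variance
    return numerator / chi2_right, numerator / chi2_left

# Values from the bicycle wheel example: n = 15, s^2 = 10 mm^2, 90% confidence level.
lower, upper = variance_confidence_interval(
    n=15, sample_variance=10.0, chi2_left=6.571, chi2_right=23.685
)
print(f"{lower:.3f} < variance < {upper:.3f}")
```

Note that the larger critical value goes with the lower bound, because it sits in the denominator.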
Example 4:
Determining the Confidence Interval for Standard Deviation
A soda-pop company, "Jim's Old Fashion Soda", is designing its bottling machine. After making 41 bottles, they find that the bottles contain an average of 335 mL of liquid with a standard deviation of 3 mL. With a 99% confidence level, what is the range of the standard deviation that this machine will output per bottle?
Let us gather all of the information of the problem:
$n$ = sample size = 41 bottles
$\bar{x}$ = sample mean = 335 mL
$s$ = sample standard deviation = 3 mL
$s^2$ = sample variance = 9 mL²
$\sigma$ = population standard deviation, thus $\sigma^2$ = population variance
$n - 1$ = degrees of freedom = 41 - 1 = 40 degrees of freedom.
With all of this we are asked to find the confidence interval of the standard deviation of the population, but we only have the formula for the confidence interval of the variance, which is given by:
$$\frac{(n-1)s^2}{\chi^2_R} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_L}$$
In order to obtain the formula needed, we use the fact that the population standard deviation is the square root of the variance, and so we have that:
$$\sqrt{\frac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_L}}$$
We already have the degrees of freedom and the sample standard deviation, so we only need to find the critical values $\chi^2_R$ and $\chi^2_L$.
In this case we have 40 degrees of freedom and a confidence level of 99%, which is $1 - \alpha = 0.99$, and so the significance level is $\alpha = 0.01$ and $\frac{\alpha}{2} = 0.005$.
We start with $\chi^2_R$, since the area to its right is simply $\frac{\alpha}{2} = 0.005$:
Therefore the right critical value is $\chi^2_R = 66.766$.
For the left critical value we add the area for the confidence level plus half the significance level in the following manner: $(1 - \alpha) + \frac{\alpha}{2} = 0.99 + 0.005 = 0.995$
We use this to obtain $\chi^2_L$:
Therefore the left critical value is $\chi^2_L = 20.707$.
And so finally we have everything we need to calculate the confidence interval for the standard deviation:
$$\sqrt{\frac{(40)(3)^2}{66.766}} < \sigma < \sqrt{\frac{(40)(3)^2}{20.707}}$$
$$2.32 < \sigma < 4.17$$
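As a quick check of this last step, the square-root version of the interval can be sketched in Python. The function name `std_dev_confidence_interval` is an assumption for illustration, and the critical values are the table values found in this example.

```python
import math

def std_dev_confidence_interval(n, sample_std, chi2_left, chi2_right):
    """Bounds for the population standard deviation: square roots of the variance bounds."""
    numerator = (n - 1) * sample_std ** 2
    return math.sqrt(numerator / chi2_right), math.sqrt(numerator / chi2_left)

# Values from the soda bottle example: n = 41, s = 3 mL, 99% confidence level.
lower, upper = std_dev_confidence_interval(
    n=41, sample_std=3.0, chi2_left=20.707, chi2_right=66.766
)
print(f"{lower:.2f} mL < standard deviation < {upper:.2f} mL")
```

Taking the square root of each bound is valid here because both variance bounds are positive, so the ordering of the inequality is preserved.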
And so our lesson ends here, but this is not the last time we will talk about chi square. The question that comes to mind next is: Is chi square a test statistic? The answer lies in the hypothesis testing section of this course, where you will learn how to perform a chi squared test.
Before you go, we would like to recommend the following chi square critical value calculator, so you can have a fast way to check your answers when using the methodology we have shown in this lesson.
To estimate a population variance, a chi-squared distribution is used.
• Chi-Squared: $\chi^2 = \frac{(n-1)s^2}{\sigma^2}$
$n$: sample size
$s$: sample standard deviation
$\sigma$: population standard deviation
$n - 1$: is also called "degrees of freedom"
• The chi-square table gives each critical value by the area to its right
The confidence interval for the variance is given by:
• $\frac{(n-1)s^2}{\chi^2_R} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_L}$